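It worked! O(∩_∩)O Haha~ Feeling a little happy (*^▽^*)
The BLAST installed through conda would not run, and I still don't know why. A fresh BLAST install under my own home directory runs without problems.
Building the database: I ran into some problems with grep, sed, awk, and for loops, which I'll write up in a separate post, along with the FASTA format issues. Also, it turns out BLAST can align many queries against many subjects, so a whole batch can be processed in a single run~
# Build the database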
/home/hmguang/biosoft/blast/blast/ncbi-blast-2.9.0+/bin/makeblastdb -in refdata.fasta -dbtype nucl
# Run the alignment
/home/hmguang/biosoft/blast_project/blast/ncbi-blast-2.9.0+/bin/blastn -query testseq -out result.txt -db refdata.fasta -evalue 1e-5
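Since one blastn run handles every record in the query file, a tabular report is often easier to post-process in batch than the default pairwise text. The following is only a rough sketch, not the commands actually used above: the function names `count_fasta_records` and `run_blastn_tabular` and the output name `result.tsv` are made up for illustration, and it assumes `blastn` is on the `PATH` (otherwise substitute the full path to the binary).

```shell
#!/bin/sh
# Hypothetical helper: count the records in a FASTA file.
# Every record starts with a header line beginning with '>'.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Hypothetical wrapper: run blastn with the same E-value cutoff as above,
# but write tab-separated output (-outfmt 6), one line per alignment.
# Assumes blastn is on PATH; $1 = query FASTA, $2 = database, $3 = output.
run_blastn_tabular() {
    blastn -query "$1" -db "$2" -evalue 1e-5 -outfmt 6 -out "$3"
}

# Example usage (not executed here):
#   count_fasta_records testseq
#   run_blastn_tabular testseq refdata.fasta result.tsv
```

With `-outfmt 6` the default columns are qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, and bitscore, so the file can go straight into awk or sort.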
Result:
BLASTN 2.9.0+
Reference: Zheng Zhang, Scott Schwartz, Lukas Wagner, and Webb
Miller (2000), "A greedy algorithm for aligning DNA sequences", J
Comput Biol 2000; 7(1-2):203-14.
Database: refdata.fasta
16 sequences; 411,130 total letters
Query= YW607_F06
Length=524
Score E
Sequences producing significant alignments: (Bits) Value
NC_000017.11:4932277-4935023Homosapienschromosome17_GP1BA,GRCh38.... 350 8e-98
NC_000017.11:4932277-4935023Homosapienschromosome17_GP1BA_core,GR... 350 8e-98
>NC_000017.11:4932277-4935023Homosapienschromosome17_GP1BA,GRCh38.p13PrimaryAssembly
Length=2747
Score = 350 bits (189), Expect = 8e-98
Identities = 194/197 (98%), Gaps = 0/197 (0%)
Strand=Plus/Minus
Query 1 TACAGCGAGTTCTCTTGGAGGAGAAGGGTGTCGAGATTCTCCAGCCCATTCAGGAGCCCA 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct 930 TACAGCGAGTTCTCTTGGAGGAGAAGGGTGTCGAGATTCTCCAGCCCATTCAGGAGCCCA 871
Query 61 GCGGGGAGCTCAGTCAAGTTGTTGTTAGCCAGACTGAGCTTCTCCAGCTTGGGTGTGGGC 120
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct 870 GCGGGGAGCTCAGTCAAGTTGTTGTTAGCCAGACTGAGCTTCTCCAGCTTGGGTGTGGGC 811
Query 121 GTCAGGAGCCCTGGGGGCAGGGTCTTCAGCTCATTGCCTTTCAGGTAGAGCTCTTGGAGT 180
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct 810 GTCAGGAGCCCTGGGGGCAGGGTCTTCAGCTCATTGCCTTTCAGGTAGAGCTCTTGGAGT 751
Query 181 TCGCMAGTACCACGCAG 197
|||| | |||||||||
Sbjct 750 TCGCCAAGACCACGCAG 734
>NC_000017.11:4932277-4935023Homosapienschromosome17_GP1BA_core,GRCh38.p13PrimaryAssembly
Length=357
Score = 350 bits (189), Expect = 8e-98
Identities = 194/197 (98%), Gaps = 0/197 (0%)
Strand=Plus/Minus
Query 1 TACAGCGAGTTCTCTTGGAGGAGAAGGGTGTCGAGATTCTCCAGCCCATTCAGGAGCCCA 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct 295 TACAGCGAGTTCTCTTGGAGGAGAAGGGTGTCGAGATTCTCCAGCCCATTCAGGAGCCCA 236
Query 61 GCGGGGAGCTCAGTCAAGTTGTTGTTAGCCAGACTGAGCTTCTCCAGCTTGGGTGTGGGC 120
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct 235 GCGGGGAGCTCAGTCAAGTTGTTGTTAGCCAGACTGAGCTTCTCCAGCTTGGGTGTGGGC 176
Query 121 GTCAGGAGCCCTGGGGGCAGGGTCTTCAGCTCATTGCCTTTCAGGTAGAGCTCTTGGAGT 180
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct 175 GTCAGGAGCCCTGGGGGCAGGGTCTTCAGCTCATTGCCTTTCAGGTAGAGCTCTTGGAGT 116
Query 181 TCGCMAGTACCACGCAG 197
|||| | |||||||||
Sbjct 115 TCGCCAAGACCACGCAG 99
Lambda K H
1.37 0.632 1.16
Gapped
Lambda K H
1.28 0.460 0.850
Effective search space used: 207467130
Query= YW614_E07
Length=284
Score E
Sequences producing significant alignments: (Bits) Value
NC_000005.10:52989326-53094779Homosapienschromosome5_ITGA2,GRCh38... 291 3e-80
NC_000005.10:52989326-53094779Homosapienschromosome5_ITGA2_4core,... 291 3e-80
>NC_000005.10:52989326-53094779Homosapienschromosome5_ITGA2,GRCh38.p13PrimaryAssembly
Length=105454
Score = 291 bits (157), Expect = 3e-80
Identities = 161/164 (98%), Gaps = 0/164 (0%)
Strand=Plus/Plus
Query 1 TTGTCAGCAACCAAAACAAAARGTTAACATTTTCAGTAACGCTGAAAAATAAAAGGGAAA 60
||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||
Sbjct 83807 TTGTCAGCAACCAAAACAAAAGGTTAACATTTTCAGTAACGCTGAAAAATAAAAGGGAAA 83866
Query 61 GTGCATACAACACTGGAATTGTTGTTGATTTTTCAGAAAACTTGTTTTTTGCATCATTCT 120
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct 83867 GTGCATACAACACTGGAATTGTTGTTGATTTTTCAGAAAACTTGTTTTTTGCATCATTCT 83926
Query 121 CCCTGCCGGTATGTGATGAGACCCTGTACTTAYGTCCACCATGC 164
|||||||||||||||||||||||||||||||| ||||||||||
Sbjct 83927 CCCTGCCGGTATGTGATGAGACCCTGTACTTACTTCCACCATGC 83970
>NC_000005.10:52989326-53094779Homosapienschromosome5_ITGA2_4core,GRCh38.p13PrimaryAssembly
Length=1671
Score = 291 bits (157), Expect = 3e-80
Identities = 161/164 (98%), Gaps = 0/164 (0%)
Strand=Plus/Plus
Query 1 TTGTCAGCAACCAAAACAAAARGTTAACATTTTCAGTAACGCTGAAAAATAAAAGGGAAA 60
||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||
Sbjct 1022 TTGTCAGCAACCAAAACAAAAGGTTAACATTTTCAGTAACGCTGAAAAATAAAAGGGAAA 1081
Query 61 GTGCATACAACACTGGAATTGTTGTTGATTTTTCAGAAAACTTGTTTTTTGCATCATTCT 120
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct 1082 GTGCATACAACACTGGAATTGTTGTTGATTTTTCAGAAAACTTGTTTTTTGCATCATTCT 1141
Query 121 CCCTGCCGGTATGTGATGAGACCCTGTACTTAYGTCCACCATGC 164
|||||||||||||||||||||||||||||||| ||||||||||
Sbjct 1142 CCCTGCCGGTATGTGATGAGACCCTGTACTTACTTCCACCATGC 1185
Lambda K H
1.42 0.646 1.21
Gapped
Lambda K H
1.28 0.460 0.850
Effective search space used: 109283972
Query= YW621_D08
Length=266
Score E
Sequences producing significant alignments: (Bits) Value
NC_000017.11:c44389649-44372181Homosapienschromosome17_ITGA2B_7co... 329 5e-92
NC_000017.11:c44389649-44372181Homosapienschromosome17_ITGA2B_7co... 329 5e-92
>NC_000017.11:c44389649-44372181Homosapienschromosome17_ITGA2B_7core,GRCh38.p13PrimaryAssembly
Length=17469
Score = 329 bits (178), Expect = 5e-92
Identities = 179/180 (99%), Gaps = 0/180 (0%)
Strand=Plus/Minus
Query 1 GCCTTTCTKAGGTCCCAGATCCTTTAAGGCCCATGCCCTCTGCCTCCTCACCAGCTCACG 60
|||||||| |||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct 9315 GCCTTTCTGAGGTCCCAGATCCTTTAAGGCCCATGCCCTCTGCCTCCTCACCAGCTCACG 9256
Query 61 GGTGTCTTGGTCTGAGGTAGGACACAGCTCTTCACAGCAGGATTCAGTGAATCTTGCACC 120
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct 9255 GGTGTCTTGGTCTGAGGTAGGACACAGCTCTTCACAGCAGGATTCAGTGAATCTTGCACC 9196
Query 121 AGTAGCTGGACAGAGGCCTTCACCACTGGCTGAGCTCTGATGGGATAGGGTGATGGGGTA 180
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct 9195 AGTAGCTGGACAGAGGCCTTCACCACTGGCTGAGCTCTGATGGGATAGGGTGATGGGGTA 9136
>NC_000017.11:c44389649-44372181Homosapienschromosome17_ITGA2B_7core,GRCh38.p13PrimaryAssembly
Length=1773
Score = 329 bits (178), Expect = 5e-92
Identities = 179/180 (99%), Gaps = 0/180 (0%)
Strand=Plus/Minus
Query 1 GCCTTTCTKAGGTCCCAGATCCTTTAAGGCCCATGCCCTCTGCCTCCTCACCAGCTCACG 60
|||||||| |||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct 632 GCCTTTCTGAGGTCCCAGATCCTTTAAGGCCCATGCCCTCTGCCTCCTCACCAGCTCACG 573
Query 61 GGTGTCTTGGTCTGAGGTAGGACACAGCTCTTCACAGCAGGATTCAGTGAATCTTGCACC 120
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct 572 GGTGTCTTGGTCTGAGGTAGGACACAGCTCTTCACAGCAGGATTCAGTGAATCTTGCACC 513
Query 121 AGTAGCTGGACAGAGGCCTTCACCACTGGCTGAGCTCTGATGGGATAGGGTGATGGGGTA 180
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct 512 AGTAGCTGGACAGAGGCCTTCACCACTGGCTGAGCTCTGATGGGATAGGGTGATGGGGTA 453
Lambda K H
1.36 0.630 1.15
Gapped
Lambda K H
1.28 0.460 0.850
Effective search space used: 101888816
Query= YW665_C09
Length=307
Score E
Sequences producing significant alignments: (Bits) Value
NC_000007.14:80602207-80679277Homosapienschromosome7_CD36,GRCh38.... 416 5e-118
NC_000007.14:80602207-80679277Homosapienschromosome7_CD36_core,GR... 416 5e-118
>NC_000007.14:80602207-80679277Homosapienschromosome7_CD36,GRCh38.p13PrimaryAssembly
Length=77071
Score = 416 bits (225), Expect = 5e-118
Identities = 228/229 (99%), Gaps = 1/229 (0%)
Strand=Plus/Plus
Query 1 TAGGTCAATCTATGCTGTATTTGAATCCGACGTTAATCTGAAAGGAATCCCTGTGTATAG 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct 68768 TAGGTCAATCTATGCTGTATTTGAATCCGACGTTAATCTGAAAGGAATCCCTGTGTATAG 68827
Query 61 ATTTGTTCTTCCATCCAAGGCCTTTGCCTCTCCAGTTGAAAACCCAGACAACTATTGTTT 120
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct 68828 ATTTGTTCTTCCATCCAAGGCCTTTGCCTCTCCAGTTGAAAACCCAGACAACTATTGTTT 68887
Query 121 CTGCACAGAAAAAATTATCTCAAAAAATTGTACATCATATGGTGTGCTAGACATCAGCAA 180
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct 68888 CTGCACAGAAAAAATTATCTCAAAAAATTGTACATCATATGGTGTGCTAGACATCAGCAA 68947
Query 181 ATGCAAAGAAGGTGAGTAAATAACCTCAGTAGCACAG-CCATACCATAA 228
||||||||||||||||||||||||||||||||||||| |||||||||||
Sbjct 68948 ATGCAAAGAAGGTGAGTAAATAACCTCAGTAGCACAGTCCATACCATAA 68996
>NC_000007.14:80602207-80679277Homosapienschromosome7_CD36_core,GRCh38.p13PrimaryAssembly
Length=1580
Score = 416 bits (225), Expect = 5e-118
Identities = 228/229 (99%), Gaps = 1/229 (0%)
Strand=Plus/Plus
Query 1 TAGGTCAATCTATGCTGTATTTGAATCCGACGTTAATCTGAAAGGAATCCCTGTGTATAG 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct 865 TAGGTCAATCTATGCTGTATTTGAATCCGACGTTAATCTGAAAGGAATCCCTGTGTATAG 924
Query 61 ATTTGTTCTTCCATCCAAGGCCTTTGCCTCTCCAGTTGAAAACCCAGACAACTATTGTTT 120
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct 925 ATTTGTTCTTCCATCCAAGGCCTTTGCCTCTCCAGTTGAAAACCCAGACAACTATTGTTT 984
Query 121 CTGCACAGAAAAAATTATCTCAAAAAATTGTACATCATATGGTGTGCTAGACATCAGCAA 180
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct 985 CTGCACAGAAAAAATTATCTCAAAAAATTGTACATCATATGGTGTGCTAGACATCAGCAA 1044
Query 181 ATGCAAAGAAGGTGAGTAAATAACCTCAGTAGCACAG-CCATACCATAA 228
||||||||||||||||||||||||||||||||||||| |||||||||||
Sbjct 1045 ATGCAAAGAAGGTGAGTAAATAACCTCAGTAGCACAGTCCATACCATAA 1093
Lambda K H
1.35 0.626 1.14
Gapped
Lambda K H
1.28 0.460 0.850
Effective search space used: 118733338
Database: refdata.fasta
Posted date: Nov 12, 2019 3:50 PM
Number of letters in database: 411,130
Number of sequences in database: 16
Matrix: blastn matrix 1 -2
Gap Penalties: Existence: 0, Extension: 2.5
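A report like the one above gets long quickly once there are many queries, so a quick per-query summary can help. Below is a minimal sketch, assuming the default pairwise report layout shown here; the function name `summarize_top_hits` is hypothetical. For each `Query=` block it prints the query name, the first listed hit, its bit score, and its E-value, separated by tabs.

```shell
#!/bin/sh
# Hypothetical helper: print "query<TAB>top hit<TAB>bits<TAB>evalue"
# for every query in a default-format BLAST text report ($1).
summarize_top_hits() {
    awk '
        # Remember the current query name from lines like "Query= YW607_F06".
        /^Query= / { query = $2; want = 0 }
        # The hit table follows this header line (possibly after a blank line).
        /^Sequences producing significant alignments/ { want = 1; next }
        # First non-empty line after the header is the best-scoring hit.
        want && NF { print query "\t" $1 "\t" $(NF-1) "\t" $NF; want = 0 }
    ' "$1"
}

# Example usage (not executed here):
#   summarize_top_hits result.txt
```

Note that the hit name in the summary table is the truncated one BLAST prints (ending in `...`), and queries with no hits produce no line at all. It also takes only the first whitespace-separated token of the hit line as the name, which works for the space-free IDs in this database.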