How to set up a local analysis workflow for the fungal virulence factor database DFVF (Database of Known Fungal Virulence Factors)

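In a genome safety assessment, you typically need to screen the genome for resistance genes and for virulence and pathogenicity genes. For bacterial genomes this is convenient: commonly used resistance gene databases such as CARD, resfinder, and Arg-annot, and commonly used virulence and pathogenicity databases such as VFDB, PathogenFinder, and PAIDB, all provide online analysis tools. For fungal genomes, the currently well-recognized DFVF database offers no online analysis tool, so we have to set up a local analysis ourselves. Below is the detailed procedure I used to build a DFVF analysis with diamond, for your reference.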

1 Get the DFVF data. Website: Introduction - Database of Virulence Factors in Fungal Pathogens

Download on Linux: wget -c http://sysbio.unl.edu/DFVF/Download/AllGenes.txt

AllGenes.txt holds 2048 protein records in a non-standard format, so it has to be converted to FASTA.

2 Write a Python script that converts AllGenes.txt from its non-standard format into a new FASTA file, DFVFdata.fasta

The Python script is as follows:

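# Read the whole DFVF dump into memory
with open(r"C:\desktop\DFVFanalysis\AllGenes.txt") as file1:
    contents1 = file1.readlines()

number = 0  # running record counter, used as a unique prefix in each header

with open(r"C:\desktop\DFVFanalysis\DFVFdata.fasta", "w") as file2:
    for line in contents1:
        s = line.split()

        if "UniProtID" in line:
            # A UniProtID line starts a new record, so it opens a new FASTA header
            number += 1
            file2.write("\n" + ">" + str(number) + "_" + s[0][:-1] + "_" + s[1] + "_")

        if "Gene Symbol" in line:
            # s[1] is "Symbol:"; [:-1] strips the trailing colon
            file2.write(s[0] + "_" + s[1][:-1] + "_" + s[2] + "_")

        if "Organism" in line:
            # Keep the two tokens after the field name (genus and species)
            file2.write(s[0][:-1] + "_" + s[1] + "_" + s[2] + "_")

        if "Disease:" in line:
            # Join every token with underscores so the header contains no spaces
            a = ""
            for i in range(len(s)):
                if i == 0:
                    a += s[i][:-1] + "_"
                else:
                    a += s[i] + "_"
            file2.write(a)

        if "Protein Sequence" in line:
            # End the header line, then write the first chunk of the sequence.
            # Writing inside each branch avoids re-writing a stale string
            # when the line has an unexpected number of tokens.
            if len(s) == 3:
                file2.write(s[0] + "_" + s[1][:-1] + "\n" + s[2] + "\n")
            if len(s) == 4:
                # No trailing newline, so continuation lines extend this sequence line
                file2.write(s[0] + "_" + s[1][:-1] + "\n" + s[2])

        if ":" not in line:
            # Lines without a colon are treated as sequence continuation
            if len(s) > 0:
                file2.write(s[0])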
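Before building the database it may be worth checking the converted file. The following is only a minimal sketch and not part of the original workflow: the function name summarize_fasta is made up for illustration. It counts the FASTA headers and lists any record that ended up with no sequence; the record count should match the 2048 entries in AllGenes.txt.

```python
def summarize_fasta(lines):
    """Count FASTA records and collect headers that have no sequence.

    `lines` can be any iterable of text lines, e.g. an open file handle.
    """
    n_records = 0
    empty = []
    header = None
    seq_len = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # the conversion script leaves blank lines between records
        if line.startswith(">"):
            # Close the previous record before starting a new one
            if header is not None and seq_len == 0:
                empty.append(header)
            header = line[1:]
            n_records += 1
            seq_len = 0
        else:
            seq_len += len(line)
    # Close the final record
    if header is not None and seq_len == 0:
        empty.append(header)
    return n_records, empty
```

To use it, open DFVFdata.fasta (adjust the path to your own setup), pass the file handle to summarize_fasta, and compare the first returned value with 2048.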

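3 Following the CSDN blog post "biopython：基因genbank格式转核酸或氨基酸fasta格式" (biopython: converting GenBank gene records to nucleotide or amino acid FASTA), convert the annotated genome you want to analyze into an .faa file that carries each protein's location, its annotation, and its amino acid sequence.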
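For readers without access to that post, here is a rough sketch of what such a conversion could look like with Biopython. It is my own assumption of the approach, not the code from the referenced post: the function names make_header and genbank_to_faa and the header layout (locus tag, location, product) are hypothetical, and it assumes the GenBank file carries a translation qualifier on its CDS features.

```python
def make_header(locus_tag, product, start, end, strand):
    """Build a FASTA header carrying the protein ID, location and annotation."""
    strand_symbol = {1: "+", -1: "-"}.get(strand, "?")
    # Replace spaces so the whole header is reported as one sequence ID
    product = product.replace(" ", "_")
    return f">{locus_tag}|{start}..{end}({strand_symbol})|{product}"


def genbank_to_faa(gbk_path, faa_path):
    """Write every CDS that has a translation to a protein FASTA file."""
    # Biopython is imported here so make_header can be used without it
    from Bio import SeqIO

    written = 0
    with open(faa_path, "w") as out:
        for record in SeqIO.parse(gbk_path, "genbank"):
            for feature in record.features:
                if feature.type != "CDS":
                    continue
                translation = feature.qualifiers.get("translation", [""])[0]
                if not translation:
                    continue  # e.g. pseudogenes carry no translation
                locus_tag = feature.qualifiers.get("locus_tag", ["unknown"])[0]
                product = feature.qualifiers.get("product", ["unknown"])[0]
                header = make_header(
                    locus_tag,
                    product,
                    int(feature.location.start) + 1,  # Biopython positions are 0-based
                    int(feature.location.end),
                    feature.location.strand,
                )
                out.write(f"{header}\n{translation}\n")
                written += 1
    return written
```

Keeping the header free of spaces matters here because tabular alignment output reports only the part of the header before the first whitespace as the query ID.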

4 Use diamond to build a local database from the DFVF FASTA file

diamond makedb --in DFVFdata.fasta --db ~/DFVFdb

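5 Use diamond to align the local whole-genome protein set against the DFVF database

diamond blastp --db ~/DFVFdb.dmnd --query your_genome_proteins.faa -e 1e-5 --outfmt 6 --id 30 --more-sensitive --out ./reportname.txt

Here the E-value cutoff -e is set to 1e-5, the minimum sequence identity --id is set to 30%, and the result is written as a .txt file with --out ./reportname.txt. The database path points at the file created in step 4, and your_genome_proteins.faa stands for the .faa file from step 3. Adjust all of these as needed.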
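The report produced with --outfmt 6 is a tab-separated table with the 12 standard BLAST tabular columns. As a possible next step, the sketch below keeps only the highest-scoring DFVF hit for each query protein. It is an illustration under my own assumptions, not part of the original procedure; the function name best_hits is hypothetical.

```python
# Default columns of the BLAST tabular format selected by --outfmt 6
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(lines):
    """Return {query_id: row} keeping the highest-bitscore hit per query."""
    best = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != len(COLUMNS):
            continue  # skip blank or malformed lines
        row = dict(zip(COLUMNS, fields))
        # Convert the numeric columns used for filtering and ranking
        row["pident"] = float(row["pident"])
        row["evalue"] = float(row["evalue"])
        row["bitscore"] = float(row["bitscore"])
        current = best.get(row["qseqid"])
        if current is None or row["bitscore"] > current["bitscore"]:
            best[row["qseqid"]] = row
    return best
```

Passing the open reportname.txt file to best_hits returns one row per query protein; the sseqid field of each row is the DFVF header built by the conversion script in step 2, so it already carries the UniProt ID, gene symbol, organism, and disease.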

That's all. Corrections and discussion are welcome. Writing this up took some effort, so a like would be appreciated, hahaha~~
