I. Get the SRA accession numbers from NCBI
II. Download with different methods
1. Download with aspera
# Install
wget http://download.asperasoft.com/download/sw/connect/3.7.4/aspera-connect-3.7.4.147727-linux-64.tar.gz
tar -zxvf aspera-connect-3.7.4.147727-linux-64.tar.gz
sh aspera-connect-3.7.4.147727-linux-64.sh
echo 'export PATH=~/.aspera/connect/bin:$PATH' >> ~/.bashrc
source ~/.bashrc
# Download
ascp -i ~/.aspera/connect/etc/asperaweb_id_dsa.openssh -k 1 -T -l 200m anonftp@ftp-private.ncbi.nlm.nih.gov:/sra/sra-instant/reads/ByRun/sra/SRR/SRR404/SRR4042142/SRR4042142.sra ./SRA_data/
#-i PRIVATE-KEY-FILE Private-key file name (id_rsa)
#-k RESUME-LEVEL Resume criterion: 0,3,2,1
#-T Disable encryption
#-l MAX-RATE Max transfer rate
aspera downloads are relatively fast. The downloaded file is in sra format and needs to be converted to fastq format.
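The conversion mentioned above could be done with fastq-dump from sratoolkit (its installation is covered in the sratoolkit section). The following is a minimal sketch, not a definitive recipe: the function name sra_to_fastq and the default output folder ./fastq are assumptions made for illustration.

```shell
# Hypothetical helper: convert one local .sra file to fastq.
# Arguments: $1 = path to the .sra file, $2 = output folder (default ./fastq, an assumption)
sra_to_fastq() {
    sra_file=$1
    out_dir=${2:-./fastq}
    # Refuse to run if the input file is not there.
    if [ ! -f "$sra_file" ]; then
        echo "missing input: $sra_file" >&2
        return 1
    fi
    mkdir -p "$out_dir"
    # --split-3 separates paired reads; -O sets the output folder.
    fastq-dump --split-3 -O "$out_dir" "$sra_file"
}
```

For example, sra_to_fastq ./SRA_data/SRR4042142.sra would write the fastq files into ./fastq.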
2. Download with sratoolkit
# Install
wget https://ftp-trace.ncbi.nlm.nih.gov/sra/sdk/2.8.2-1/sratoolkit.2.8.2-1-centos_linux64.tar.gz
tar -zxvf sratoolkit.2.8.2-1-centos_linux64.tar.gz
echo 'export PATH=~/sratoolkit/sratoolkit.2.8.2-1-centos_linux64/bin/:$PATH' >> ~/.bashrc
source ~/.bashrc
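Both installs only take effect if the PATH change in ~/.bashrc worked, so it can help to confirm that the tools are found before downloading. A small sketch, where the function name check_tools is an assumption made for illustration:

```shell
# Hypothetical helper: report which of the given commands are on PATH.
# Returns 0 if all are found, 1 if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A typical call would be check_tools ascp prefetch fastq-dump. If a tool is reported missing, the path written to ~/.bashrc probably does not match the folder where the archive was unpacked.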
Download with the prefetch tool
prefetch -c SRR4042142
#-c|--check-all double-check all refseqs
prefetch saves the data in sra format, under the ncbi folder in the home directory.
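To locate the file that prefetch saved, one option is to search the ncbi folder for the accession. This is only a sketch: the function name find_prefetched is hypothetical, and the default search root ~/ncbi is taken from the note above.

```shell
# Hypothetical helper: print the path of <accession>.sra under a search root.
# Arguments: $1 = accession, $2 = search root (default ~/ncbi)
find_prefetched() {
    acc=$1
    root=${2:-"$HOME/ncbi"}
    find "$root" -type f -name "$acc.sra" 2>/dev/null
}
```

For example, find_prefetched SRR4042142 prints the full path if the file exists and prints nothing otherwise.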
Download with the fastq-dump tool
fastq-dump -X 5 -Z SRR390728
#-X|--maxSpotId <rowid> Maximum spot id
#-Z|--stdout Output to stdout, all split data become
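fastq-dump downloads the run and converts it directly to fastq format.
For paired-end sequencing, use fastq-dump --split-3 SRR4042142
This writes the two mates to separate files, SRR4042142_1.fastq and SRR4042142_2.fastq; reads without a mate go to a third file, SRR4042142.fastq.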
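After conversion, a quick sanity check is to count the reads in each fastq file; for paired-end data the two mate files should give the same number. The sketch below assumes the usual layout of four lines per fastq record, and the function name count_fastq_reads is hypothetical.

```shell
# Hypothetical helper: count reads in an uncompressed fastq file,
# assuming each record takes exactly four lines.
count_fastq_reads() {
    awk 'END { print int(NR / 4) }' "$1"
}
```

For example, count_fastq_reads SRR4042142_1.fastq and count_fastq_reads SRR4042142_2.fastq can be compared by eye.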
3. Download with aria2c
aria2c -j 20 ftp://ftp-trace.ncbi.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR404/SRR4042142/SRR4042142.sra
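Parallel download. Note that -j sets the maximum number of concurrent downloads, so it matters when several URLs are given; for a single file, the -x and -s options control how many connections are used.
4. Download with wget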
wget ftp://ftp-trace.ncbi.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR404/SRR4042142/SRR4042142.sra
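III. Batch download
The SRA download links all share the same leading part; only the accession-specific part at the end needs to change.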
#/sra/sra-instant/reads/ByRun/sra/{SRR|ERR|DRR}/<first 6 characters of accession>/<accession>/<accession>.sra
# Requires bash: ${id:0:3} is the SRR/ERR/DRR prefix, ${id:0:6} the first 6 characters of the accession
for id in $(cat SraAccList.txt)
do ascp -i ~/.aspera/connect/etc/asperaweb_id_dsa.openssh -k 1 -T -l 200m anonftp@ftp-private.ncbi.nlm.nih.gov:/sra/sra-instant/reads/ByRun/sra/${id:0:3}/${id:0:6}/$id/$id.sra ./SRA_data/
done
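The path template for batch downloads can also be wrapped in a small helper, which makes it easier to catch malformed lines in SraAccList.txt before any transfer starts. This is a sketch under assumptions: the function name sra_remote_path is hypothetical, and the check only looks at the start of the accession.

```shell
# Hypothetical helper: print the remote path for one run accession, following
# /sra/sra-instant/reads/ByRun/sra/<prefix>/<first 6 characters>/<accession>/<accession>.sra
sra_remote_path() {
    acc=$1
    # Accept only names that start with SRR, ERR or DRR followed by a digit.
    case $acc in
        [SED]RR[0-9]*) ;;
        *) echo "not a run accession: $acc" >&2; return 1 ;;
    esac
    prefix=$(printf '%s' "$acc" | cut -c1-3)
    first6=$(printf '%s' "$acc" | cut -c1-6)
    printf '/sra/sra-instant/reads/ByRun/sra/%s/%s/%s/%s.sra\n' \
        "$prefix" "$first6" "$acc" "$acc"
}
```

Inside the loop, the remote argument to ascp would then be anonftp@ftp-private.ncbi.nlm.nih.gov:$(sra_remote_path "$id"), and an accession that fails the check can be skipped instead of triggering a failed transfer.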
Reference link:
https://www.ncbi.nlm.nih.gov/books/NBK158899/#SRA_download.downloading_sra_data_using