inDrop data analysis

Software: https://github.com/indrops/indrops

First, download the software with git clone https://github.com/indrops/indrops.git
Then install the requirements listed in the instructions: Python, RSEM, Bowtie, samtools, and Java.
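
Before moving on, it can help to confirm that the required executables are reachable. The following is a minimal Python 3 sketch and not part of indrops. It assumes the tools are on PATH (indrops itself takes their install directories from project.yaml), the helper name missing_tools is hypothetical, and rsem-calculate-expression is used as a representative RSEM executable.

```python
import shutil

# Executables to look for; rsem-calculate-expression stands in for RSEM.
REQUIRED_TOOLS = ["python", "bowtie", "samtools", "java", "rsem-calculate-expression"]


def missing_tools(tools):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

A tool reported as missing here may still be usable if its directory is given explicitly in project.yaml.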


Next, build the index following the instructions:

mkdir -pv DOWNLOAD_DIR
cd DOWNLOAD_DIR

# Download the soft-masked, primary assembly Genome Fasta file
wget ftp://ftp.ensembl.org/pub/release-85/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz

# Download the corresponding GTF file.
wget ftp://ftp.ensembl.org/pub/release-85/gtf/homo_sapiens/Homo_sapiens.GRCh38.85.gtf.gz

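# This command will go through all the steps for creating the index.
# Run it from the directory that contains indrops.py and project.yaml; the DOWNLOAD_DIR/ paths are relative to that directory.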
python indrops.py project.yaml build_index \
    --genome-fasta-gz DOWNLOAD_DIR/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz \
    --ensembl-gtf-gz DOWNLOAD_DIR/Homo_sapiens.GRCh38.85.gtf.gz
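
A truncated download would only surface later, as a failure while the index is being built. As an optional check, the Python 3 sketch below reads each archive to the end. The helper name gz_is_intact is hypothetical, and the file paths are the ones used in the commands above.

```python
import gzip
import zlib


def gz_is_intact(path, chunk_size=1 << 20):
    """Return True if the gzip file at `path` decompresses to the end without errors."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read and discard the data in chunks to keep memory use small.
            while handle.read(chunk_size):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        # Missing file, not a gzip file, truncated stream, or corrupt data.
        return False


if __name__ == "__main__":
    for name in (
        "DOWNLOAD_DIR/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz",
        "DOWNLOAD_DIR/Homo_sapiens.GRCh38.85.gtf.gz",
    ):
        print(name, "OK" if gz_is_intact(name) else "missing or corrupt")
```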

This step requires project.yaml.
Here is the file as I configured it:

project_name : "test"
project_dir : "/work/03.indrop_data"

paths : 
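  bowtie_index : "/work/03.indrop_data/DOWNLOAD_DIR"  # Where the bowtie index is built. This must point to DOWNLOAD_DIR, otherwise the run fails with an error that the reference cannot be found.
  bowtie_dir : "/software/biosoftware/bowtie-1.2.2-linux-x86_64" # bowtie install path; downloading and unpacking it is enough
  python_dir : "/root/anaconda2/bin" # Python install path
  samtools_dir : "/software/biosoftware/samtools-1.3.1/bin/samtools" # samtools install path (as a *_dir key, this presumably expects the directory that holds the samtools binary)
  rsem_dir : "/software/biosoftware/RSEM-1.3.1/" # RSEM install path
  java_dir : "/usr/bin/"  # Java install path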

sequencing_runs : 
  - name : "Test_du"  # any name will do
    version : 'v1'
    dir : "/work/03.indrop_data/"  # path to the data
    fastq_path : "{library_prefix}_{split_affix}_{read}_001.fastq.gz"  # {read} takes two values, R1 and R2
    split_affixes : ["L007"]
    libraries : 
      - {library_name: "L007", library_prefix: "WBJPE18020236_HMWMYCCXY_L7_WBJPE18020236_20180818_P_S1"}
# So the FASTQ file name should be WBJPE18020236_HMWMYCCXY_L7_WBJPE18020236_20180818_P_S1_L007_R1_001.fastq.gz
parameters : # OPTIONAL PARAMETERS; these are all default values.
  umi_quantification_arguments:
    m : 10 #Ignore reads with more than M alignments, after filtering on distance from transcript end.
    u : 1 #Ignore counts from UMI that should be split among more than U genes.
    d : 600 #Maximal distance from transcript end, NOT INCLUDING THE POLYA TAIL
    split-ambigs: False #If umi is assigned to m genes, add 1/m to each gene's count (instead of 1)
    min_non_polyA: 15 #Require reads to align to this much non-polyA sequence. (Set to 0 to disable filtering on this parameter.)
  output_arguments:
    output_unaligned_reads_to_other_fastq: False
    filter_alignments_to_softmasked_regions: False
    # low_complexity_mask: False
  bowtie_arguments:
    m : 200
    n : 1
    l : 15
    e : 80
  trimmomatic_arguments:
    LEADING: "28"
    SLIDINGWINDOW: "4:20"
    MINLEN: "16"
    argument_order: ['LEADING','SLIDINGWINDOW','MINLEN']
  low_complexity_filter_arguments:
    max_low_complexity_fraction: 0.50
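
The fastq_path template in the configuration is filled in once per library, split affix, and read. As a rough illustration of the naming rule described in its comments (not indrops' own code), the Python 3 sketch below expands the template with str.format. The function name expected_fastq_names is hypothetical, and the reads R1 and R2 are taken from the note on fastq_path.

```python
from itertools import product


def expected_fastq_names(fastq_path, libraries, split_affixes, reads=("R1", "R2")):
    """Fill the fastq_path template for every library, split affix, and read."""
    return [
        fastq_path.format(
            library_prefix=library["library_prefix"],
            split_affix=affix,
            read=read,
        )
        for library, affix, read in product(libraries, split_affixes, reads)
    ]


# Values copied from the sequencing_runs entry in project.yaml.
names = expected_fastq_names(
    fastq_path="{library_prefix}_{split_affix}_{read}_001.fastq.gz",
    libraries=[{
        "library_name": "L007",
        "library_prefix": "WBJPE18020236_HMWMYCCXY_L7_WBJPE18020236_20180818_P_S1",
    }],
    split_affixes=["L007"],
)
for name in names:
    print(name)
# WBJPE18020236_HMWMYCCXY_L7_WBJPE18020236_20180818_P_S1_L007_R1_001.fastq.gz
# WBJPE18020236_HMWMYCCXY_L7_WBJPE18020236_20180818_P_S1_L007_R2_001.fastq.gz
```

If a file with one of these names is not present under the run's dir, the library_prefix or split_affixes entry most likely needs adjusting.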