Nat Biotech | Single-cell isoform RNA sequencing of cerebellar cells

Single-cell isoform RNA sequencing characterizes isoforms in thousands of cerebellar cells

The authors used microfluidics to amplify full-length cDNA from single cells in a sample. The cDNA from each cell was barcoded to enable cell-of-origin identification and then split into two pools: one for short-read Illumina 3′ sequencing to measure gene expression, and the other for long-read sequencing and isoform identification. Long-read sequencing with Pacific Biosciences (PacBio) or Oxford Nanopore was used to identify full-length RNA isoforms.

pipeline



Retain the short reads that map confidently to genes, then use these reads to cluster the cells.
t-SNE



Performed a second independent replicate (rep2) with threefold sequencing depth
In rep2, 95-100% of the cell clustering matched that of rep1
The Jaccard index shows that the marker genes of matching clusters in rep1 and rep2 are similar
Comparison of single cell biological replicates
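The marker-gene comparison between replicates can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, cluster names, and gene placeholders (`g1`, `g2`, ...) are hypothetical, and it assumes each cluster is summarized by a plain list of marker genes.

```python
def jaccard_index(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)


def best_matches(rep1_markers, rep2_markers):
    """For each rep1 cluster, return (best-matching rep2 cluster, Jaccard index)."""
    result = {}
    for name1, genes1 in rep1_markers.items():
        scores = {name2: jaccard_index(genes1, genes2)
                  for name2, genes2 in rep2_markers.items()}
        best = max(scores, key=scores.get)
        result[name1] = (best, scores[best])
    return result


# Hypothetical marker lists, for illustration only (not data from the paper).
rep1 = {"clusterA": ["g1", "g2", "g3", "g4"], "clusterB": ["g5", "g6", "g7"]}
rep2 = {"x": ["g1", "g2", "g3", "g8"], "y": ["g5", "g6", "g7"]}
matches = best_matches(rep1, rep2)
```

A Jaccard index close to 1 for a pair of clusters means their marker sets largely coincide; a value close to 0 means they share almost no markers.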



Generated ~5.2 million PacBio circular consensus reads (CCS)

  • Cellular barcodes are located close to the polyA tail, so the authors first searched for polyA tails.
  • 61.6% of CCS contained a T9.
  • Error-free sequencing of the theoretical construct (21-bp adaptor sequence, 16-bp cellular barcode, 10-bp UMI, then the polyA tail) would place the T9 at position 48, since 21 + 16 + 10 = 47 bases precede it. ~97% of T9-containing CCS had a T9 starting between positions 45 and 51.
  • CCS with a T9 at an unexpected position had lower T content, whereas CCS with a T9 at the expected position had a 30-bp T content.
  • CCS with a T9 at the expected position showed a higher barcode identification rate than CCS with a T9 at other positions.
  • For 92.7% of barcodes, the minimal (Levenshtein) distance to any other barcode was 3 or greater; for the remaining barcodes it was 2. Thus, for most barcodes, only one specific error pattern (three errors) would result in a misidentified cell. Simulation indicated that all of these false-positive barcodes were discarded.
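The T9 search and the barcode-distance check described in the list above can be sketched roughly as below. This is an illustration under stated assumptions, not the authors' pipeline: it takes "T9" to mean a run of nine consecutive Ts, assumes the read is oriented as adaptor, barcode, UMI, poly-T, and locates the barcode and UMI relative to the T9 start. All function names are hypothetical.

```python
ADAPTOR_LEN, BARCODE_LEN, UMI_LEN = 21, 16, 10
# 1-based expected T9 start: 21 + 16 + 10 = 47 bases precede the tail.
EXPECTED_T9_START = ADAPTOR_LEN + BARCODE_LEN + UMI_LEN + 1


def find_t9(read, k=9):
    """Return the 1-based start of the first run of k Ts, or None if absent."""
    idx = read.find("T" * k)
    return None if idx < 0 else idx + 1


def extract_barcode_umi(read, window=(45, 51)):
    """Return (barcode, umi) if the T9 start lies inside the window, else None.

    Assumption: barcode and UMI sit immediately upstream of the T9.
    """
    start = find_t9(read)
    if start is None or not (window[0] <= start <= window[1]):
        return None
    umi_begin = start - 1 - UMI_LEN       # 0-based index where the UMI begins
    bc_begin = umi_begin - BARCODE_LEN    # 0-based index where the barcode begins
    if bc_begin < 0:
        return None
    return read[bc_begin:umi_begin], read[umi_begin:start - 1]


def levenshtein(s, t):
    """Edit distance (insertions, deletions, substitutions) between s and t."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        cur = [i]
        for j, ct in enumerate(t, start=1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (cs != ct)))   # substitution or match
        prev = cur
    return prev[-1]


def min_distance_to_others(barcodes):
    """Map each barcode to its smallest distance to any other barcode.

    Requires at least two distinct barcodes; quadratic in the number of barcodes.
    """
    return {b: min(levenshtein(b, o) for o in barcodes if o != b)
            for b in barcodes}
```

For example, a synthetic read built as `"A" * 21 + "ACGTACGTACGTACGT" + "CAGCAGCAGC" + "T" * 30` has its T9 at position 48, so `extract_barcode_umi` returns the 16-bp barcode and the 10-bp UMI. The pairwise loop is fine for a toy list, but several thousand barcodes would call for a faster neighbor search.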


    Aligned PacBio reads to the mouse genome (version mm10)
    Tool: STAR
    The authors analyzed novel isoforms with respect to mouse Gencode version 10 to produce a long-read-enhanced, cell-type-resolved annotation. For these isoforms, they required all splice sites to be known in Gencode (version 10) and each junction and internal exon to be either annotated or observed at least twice in ScISOr-Seq. To minimize the effect of PCR artifacts on the improved mouse Gencode annotation while still allowing transcripts expressed at low levels to be added, they produced an enhanced cell-type-resolved annotation with good six-cycle PCR short-read support: for each added isoform, each intron and internal exon had to be annotated in Gencode or supported by two or more six-cycle PCR short reads.
    To validate the correct calling of the individual cell of origin for each isoform, the authors performed immunopanning(????)
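The acceptance rule for novel isoforms described above (known splice sites, plus annotation or at least two observations for each junction and internal exon) can be expressed as a small filter. This is a sketch under assumptions: the data layout (exons as sorted `(start, end)` tuples on a single chromosome and strand, splice sites as plain coordinates) and every name in it are hypothetical, not the authors' implementation.

```python
def introns_of(exons):
    """Introns implied by consecutive exons: (end of exon i, start of exon i+1)."""
    return [(exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)]


def passes_filter(exons, known_sites, annotated_introns, annotated_exons,
                  intron_counts, exon_counts, min_support=2):
    """Return True if a candidate isoform satisfies all three conditions.

    exons             : sorted list of (start, end) tuples for the isoform
    known_sites       : set of splice-site coordinates known in the annotation
    annotated_introns : set of annotated (donor, acceptor) pairs
    annotated_exons   : set of annotated (start, end) exons
    intron_counts     : dict mapping intron -> number of supporting reads
    exon_counts       : dict mapping exon -> number of supporting reads
    """
    introns = introns_of(exons)
    # 1. Every splice site must already be known.
    for donor, acceptor in introns:
        if donor not in known_sites or acceptor not in known_sites:
            return False
    # 2. Every junction must be annotated or seen at least min_support times.
    for intron in introns:
        if intron not in annotated_introns and intron_counts.get(intron, 0) < min_support:
            return False
    # 3. The same requirement applies to every internal exon.
    for exon in exons[1:-1]:
        if exon not in annotated_exons and exon_counts.get(exon, 0) < min_support:
            return False
    return True
```

The same function covers the stricter six-cycle PCR variant if `intron_counts` and `exon_counts` are built from the six-cycle short reads instead of the long reads.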

Examined alternative splicing in the Bin1 gene

  • In addition to the four annotated alternate exons (A1, A3, A4 and A5) in mouse Gencode for Bin1, the authors found two more alternate exons, A2 and A6, in ≥3 reads.
    Single-gene view for the Bin1 gene


  • Coordination of alternate exons is of crucial biological importance, so the authors searched for it in their ScISOr-Seq data. They found 25 genes with coordination of alternate exons that were separated by intermediate exons.

  • Testing all exon pairs, adjacent or separated by intermediate exons, they found 633 genes with coordination, including all 25 with intermediate exons. Thus, most coordinated pairs were adjacent exon pairs.

  • 20% (5 of 25) of coordination events of alternate exons that were separated by constitutive exons were a result of differences in isoform abundance.
    Quantitative isoform analysis
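One way to test whether two alternate exons are coordinated is to tabulate, over the long reads that span both exons, how often each is included or skipped, and then test the resulting 2x2 table for association. The notes above do not say which statistical test was used; the sketch below assumes a two-sided Fisher's exact test, and all names in it are hypothetical.

```python
from math import comb


def contingency(reads):
    """Count reads by inclusion status of two exons.

    reads: iterable of (exon1_included, exon2_included) booleans, one per
    long read spanning both exons. Returns (a, b, c, d) for the table
    [[both included, only exon 1], [only exon 2, both skipped]].
    """
    a = b = c = d = 0
    for inc1, inc2 in reads:
        if inc1 and inc2:
            a += 1
        elif inc1:
            b += 1
        elif inc2:
            c += 1
        else:
            d += 1
    return a, b, c, d


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    # (the small tolerance guards against floating-point ties).
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))
```

With many exon pairs tested per gene, the resulting p-values would also need a multiple-testing correction, which is left out here.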



    Limitation
    Multiple deeply sequenced replicates are needed for precise quantification, and the use of long-read technology in ScISOr-Seq makes accurate quantification expensive for now. The authors' estimates of the specificity and sensitivity of barcode recognition in long reads are based on 16-mer 10x Genomics barcodes for 6,000–7,000 cells. If the number of cells is increased to >1 million while still relying on 16-mer barcodes, the authors advise reassessing specificity and sensitivity, as specificity is likely to drop.