[Grunt work] Computing HRD (first try)

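HRD score = LOH + TAI + LST

Here LOH is the loss of heterozygosity score, TAI is the telomeric allelic imbalance score, and LST is the number of large-scale state transitions.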

Reference: Sztupinszki et al., Migrating the SNP array-based homologous recombination deficiency measures to next generation sequencing data of breast cancer, npj Breast Cancer, https://www.nature.com/articles/s41523-018-0066-6.

R package: scarHRD
https://github.com/sztup/scarHRD#introduction

workflow

Step 1 is the most critical: getting the input file.

I. Trying Sequenza

According to the Sequenza user guide, it needs BAM files, which are hard to get. It also requires Python, which I don't know.


[Figure: TCGA data levels]

Pages that may be useful for reference:

  1. Sequenza User Guide
    https://rdrr.io/cran/sequenza/f/vignettes/sequenza.Rmd
  2. TCGA RNAseq BAM File
    http://seqanswers.com/forums/showthread.php?t=65176
  3. TCGA_bam_splicer
    https://freesoft.dev/program/131953985
  4. BAM file format
    https://blog.csdn.net/qq_36608036/article/details/104630366

II. Trying ASCAT

Reference: ASCAT (Van Loo et al. 2010)
https://github.com/VanLoo-lab/ascat
First, run the ExampleData that ships with the package.

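library(ASCAT)

# Load the tumor and germline LogR/BAF tracks from the ExampleData folder
ascat.bc = ascat.loadData("Tumor_LogR.txt","Tumor_BAF.txt","Germline_LogR.txt","Germline_BAF.txt")
# Plot the raw LogR and BAF tracks
ascat.plotRawData(ascat.bc)
# Segment the tracks with ASPCF, then plot the segmented data
ascat.bc = ascat.aspcf(ascat.bc)
ascat.plotSegmentedData(ascat.bc)
# Fit the ASCAT model (ploidy, aberrant cell fraction, allele-specific copy numbers)
ascat.output = ascat.runAscat(ascat.bc)

# Inspect the results
ascat.output$nA                    # copy number of allele A
ascat.output$nB                    # copy number of allele B
ascat.output$ploidy                # estimated tumor ploidy
ascat.output$aberrantcellfraction  # estimated fraction of aberrant cells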

Goal: produce the data shown in the figure below.


[Figure: ASCAT output]
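If ASCAT does produce a segment table, reshaping it into an input file for scarHRD could look roughly like the sketch below. Several things here are my assumptions and need checking against both READMEs: that `ascat.output$segments` has the columns `sample`, `chr`, `startpos`, `endpos`, `nMajor`, `nMinor`; that scarHRD wants the columns `SampleID`, `Chromosome`, `Start_position`, `End_position`, `total_cn`, `A_cn`, `B_cn`, `ploidy`; and that chromosome names need no extra prefix. The data frame is a made-up stand-in, not real ASCAT output, and `to_scar_input` is a hypothetical helper name.

```r
# Mock stand-in for ascat.output$segments (column names are assumed)
segments <- data.frame(
  sample   = "S1",
  chr      = c("1", "1", "2"),
  startpos = c(1, 50000001, 1),
  endpos   = c(50000000, 120000000, 90000000),
  nMajor   = c(1, 2, 1),
  nMinor   = c(1, 0, 1),
  stringsAsFactors = FALSE
)
# Mock stand-in for ascat.output$ploidy (named by sample)
ploidy <- c(S1 = 2.1)

# Hypothetical helper: rename and combine columns into the assumed scarHRD layout
to_scar_input <- function(segments, ploidy) {
  data.frame(
    SampleID       = segments$sample,
    Chromosome     = segments$chr,
    Start_position = segments$startpos,
    End_position   = segments$endpos,
    total_cn       = segments$nMajor + segments$nMinor,
    A_cn           = segments$nMajor,
    B_cn           = segments$nMinor,
    ploidy         = unname(ploidy[segments$sample]),
    stringsAsFactors = FALSE
  )
}

scar_input <- to_scar_input(segments, ploidy)

# Write a tab-separated file that could be handed to scarHRD
out_file <- tempfile(fileext = ".txt")
write.table(scar_input, out_file, sep = "\t", quote = FALSE, row.names = FALSE)

# The call as I remember it from the scarHRD README (not run here):
# scarHRD::scar_score(out_file, reference = "grch38", seqz = FALSE)
```

The sketch only renames and sums columns, so the real work is still getting ASCAT to run on real data.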

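Unfortunately the README on GitHub is not very detailed and manual.pdf is gone, so the only option is to read the original ASCAT paper (Van Loo et al. 2010) to work out what the parameters mean.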

[Figure: ASCAT profiles]

ASCAT profiles: genome-wide allele-specific copy number profiles
Left panel: ASCAT estimates the tumor ploidy and the fraction of aberrant cells together. It evaluates the goodness of fit over a grid of possible values for both parameters (blue = good solution) and picks the best solution, marked by the green cross. In the left part of panel A, for example, the green cross corresponds to ploidy = 1.77 and fraction of aberrant cells = 80%.
Upper right panel: the x axis is genomic location and the y axis is copy number (green: allele with the lowest copy number; red: allele with the highest copy number).
Lower right panel: an aberration reliability score.

  • What does "fit" mean?
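My reading of the paper is that a (ploidy, aberrant cell fraction) pair "fits" well when the allele-specific copy numbers it implies land close to non-negative whole numbers. The sketch below is a toy version of that idea, not ASCAT's actual code: it uses the LogR/BAF equations from Van Loo et al. 2010, an unweighted squared distance (the paper weights the loci), simulated data, and a coarse grid. The function names `simulate_tracks` and `fit_distance` are hypothetical.

```r
gamma <- 0.55  # array-dependent constant; 0.55 is ASCAT's default

# Forward model: expected LogR (r) and BAF (b) for allele counts nA, nB,
# aberrant cell fraction rho and tumor ploidy psi_t
simulate_tracks <- function(nA, nB, rho, psi_t) {
  psi   <- 2 * (1 - rho) + rho * psi_t        # average ploidy of the mixed sample
  total <- 2 * (1 - rho) + rho * (nA + nB)    # average copies at each locus
  list(r = gamma * log2(total / psi),
       b = (1 - rho + rho * nB) / total)
}

# Invert the model for a candidate (rho, psi_t) and measure how far the
# implied copy numbers are from the nearest non-negative integers
fit_distance <- function(r, b, rho, psi_t) {
  psi   <- 2 * (1 - rho) + rho * psi_t
  total <- psi * 2^(r / gamma)
  nA <- (rho - 1 + (1 - b) * total) / rho
  nB <- (rho - 1 + b * total) / rho
  nearest <- function(x) pmax(round(x), 0)
  sum((nA - nearest(nA))^2 + (nB - nearest(nB))^2)
}

# Simulated observations: four loci, true rho = 0.8, true ploidy = 2
obs <- simulate_tracks(nA = c(1, 2, 1, 2), nB = c(1, 1, 0, 0),
                       rho = 0.8, psi_t = 2)

# Grid search over candidate values, like the blue/red map in the figure
grid <- expand.grid(rho = seq(0.3, 1, by = 0.1), psi_t = seq(1.5, 4, by = 0.5))
grid$distance <- mapply(
  function(rho, psi_t) fit_distance(obs$r, obs$b, rho, psi_t),
  grid$rho, grid$psi_t
)
best <- grid[which.min(grid$distance), ]
print(best)
```

On this toy data the smallest distance sits at the values used for the simulation, which is the role the green cross plays in the figure.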
[Figure: Frequency of LOH and copy number-neutral events]

(A) Frequency of LOH across the genome. Probes are shown in genomic order along the x axis, from chromosome 1 to chromosome X, where different chromosomes are delimited by gray lines.
(B) Frequency of copy number-neutral events across the genome. For diploid tumors, copy number-neutral events correspond to a subset of LOH (copy number-neutral LOH), but in tetraploid tumors, for example, a copy number-neutral event can also be three copies of A and one copy of B.

  • What is LOH?
  • What is a copy number-neutral event?

LOH: the loss of heterozygosity (LOH) score is defined as the number of chromosomal LOH regions that are longer than 15 Mb but shorter than a whole chromosome.
Copy number-neutral event: the copy number is normal, but there is allelic bias.
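To make the two definitions concrete, here is a small sketch on made-up segments. It is a simplification and not how scarHRD itself counts: adjacent LOH segments are not merged, the chromosome lengths are toy values, and "normal copy number" is taken to mean that the total equals the rounded ploidy. `count_loh` and `is_cn_neutral_event` are hypothetical names.

```r
# Toy segments: allele-specific copy numbers per region (all values made up)
segs <- data.frame(
  chr   = c("1", "1", "2", "3"),
  start = c(1, 60000001, 1, 1),
  end   = c(60000000, 90000000, 100000000, 10000000),
  A_cn  = c(2, 1, 1, 1),
  B_cn  = c(0, 1, 0, 0),
  stringsAsFactors = FALSE
)
# Toy chromosome lengths, not real genome coordinates
chr_len <- c("1" = 90000000, "2" = 100000000, "3" = 80000000)

# Count LOH regions longer than 15 Mb but shorter than the whole chromosome
count_loh <- function(segs, chr_len, min_len = 15e6) {
  seg_len   <- segs$end - segs$start + 1
  is_loh    <- segs$B_cn == 0 & segs$A_cn > 0       # one allele lost entirely
  whole_chr <- seg_len >= chr_len[segs$chr]         # region spans the chromosome
  sum(is_loh & seg_len > min_len & !whole_chr)
}

# Copy number-neutral event: total equals the ploidy, but the alleles are unequal
is_cn_neutral_event <- function(A_cn, B_cn, ploidy) {
  (A_cn + B_cn) == round(ploidy) & A_cn != B_cn
}

loh_count <- count_loh(segs, chr_len)
print(loh_count)
print(is_cn_neutral_event(A_cn = c(2, 3, 1), B_cn = c(0, 1, 1), ploidy = c(2, 4, 2)))
```

Only the first segment is counted: the chromosome 2 segment covers the whole toy chromosome and the chromosome 3 segment is shorter than 15 Mb. The second call shows the diploid 2+0 case and the tetraploid 3+1 case both counting as copy number-neutral events, while a balanced 1+1 does not.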

Illumina SNP arrays deliver two output tracks: Log R, a measure of total signal intensity, and B allele frequency (BAF), a measure of allelic contrast.
The Log R track is similar to the output given by common array-CGH platforms and quantifies the (total) copy number of each genomic locus.
The BAF track shows the relative presence of each of the two alternative nucleotides (called “A” and “B”) at each SNP locus profiled.
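A quick way to see why both tracks are needed is to compute what they would look like in an idealized case. The sketch below assumes a pure tumor sample, a diploid reference, and no array-specific compression of the signal, so real values will be noisier and less extreme. `expected_tracks` is a hypothetical helper.

```r
# Idealized LogR and BAF for given allele copy numbers
# (pure sample, diploid reference, no platform compression: all assumptions)
expected_tracks <- function(nA, nB) {
  total <- nA + nB
  data.frame(
    nA   = nA,
    nB   = nB,
    logR = log2(total / 2),   # total copies relative to the normal two
    BAF  = nB / total         # share of the B allele among all copies
  )
}

tracks <- expected_tracks(nA = c(1, 2, 1, 2), nB = c(1, 1, 0, 0))
print(tracks)
```

The last row (two copies of A, none of B) has the same logR of 0 as the normal 1+1 row, and only the BAF of 0 gives it away. That is the copy number-neutral LOH case.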

PennCNV
  • To get LRR and BAF, is there really no way around processing the CEL files?

-end-
