R基础-10.表达矩阵的样本的相关性-2019-07-21

10.表达矩阵的样本的相关性

rm(list = ls()) #一键清空
options(stringsAsFactors = F)


library(airway)#加载airway数据包
data(airway)

exprSet=assay(airway)#表达矩阵
colnames(exprSet)#列名
dim(exprSet)#查看维度

cor(exprSet)#样本间的相关性
#cor(exprSet[,1],exprSet[,2])#第一个样本与第二个样本的相关性
pheatmap::pheatmap(cor(exprSet))#绘制样本间的相关性的热图

Rplot.png

group_list = colData(airway)[,3]
g=group_list
tmp=data.frame(g=group_list)
rownames(tmp)=colnames(exprSet)

pheatmap::pheatmap(cor(exprSet),annotation_col = tmp)
dim(exprSet)

Rplot01.png

选择表达量大于1的，样本量大于5的基因，即在至少在5个样本中出现的基因

#构建function函数是关键
apply(exprSet, 1, function(x) x>3)
x=exprSet[1,]
x
x>1
table(x>1)#统计TRUE的数量
sum(x>1)
sum(x>1)>5

exprSet =exprSet[apply(exprSet,1, function(x) sum(x>1) >5),]#选择表达量大于1的，样本量大于5的基因，即在至少在5个样本中出现的基因
dim(exprSet)

exprSet =log(edgeR::cpm(exprSet)+1)
dim(exprSet)
exprSet = exprSet[names(sort(apply(exprSet, 1, mad), decreasing= T)[1:500]),]
dim(exprSet)

M=cor(log2(exprSet + 1))
tmp=data.frame(g=group_list)
rownames(tmp)=colnames(M)
pheatmap::pheatmap(M,annotation_col = tmp, filename = 'cor.png')

cor.png

dev.off()#关闭画板,重新组织绘画
library(pheatmap)
pheatmap(scale(cor(log2(exprSet+1))))

R基础-10.表达矩阵的样本的相关性-2019-07-21

10.表达矩阵的样本的相关性

选择表达量大于1的，样本量大于5的基因，即在至少在5个样本中出现的基因

推荐阅读更多精彩内容