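1. Showing individual clusters on the same UMAP
Problem: if you subset a single cluster and run DimPlot on it alone, the UMAP axes differ from plot to plot.
Solution: highlight the cells of interest on the DimPlot of the whole object and set the colour of all other cells to white.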
DimPlot(pbmc, cells.highlight=WhichCells(pbmc, idents = c("1")), cols.highlight = c("green"), cols= "white")+labs(title = "Cluster1-4573")
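2. Extracting rows from a data frame
If you know a row's name but not its position among the n rows of a data frame, index by row name:
data_frame_name["row_name", ]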
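As a minimal sketch of row-name indexing, the toy data frame below (its values and column names are made up for illustration) shows extracting one row, several rows, and looking up the numeric position of a row name:

```r
# Toy data frame with row names (hypothetical values)
df <- data.frame(
  expr = c(1.2, 0.0, 3.4),
  n_cells = c(10L, 0L, 25L),
  row.names = c("CD4", "CD8A", "CD3D")
)

# Extract one row by name; the result is a one-row data frame
cd8a_row <- df["CD8A", ]

# Extract several rows by name, in the order requested
two_rows <- df[c("CD3D", "CD4"), ]

# Numeric position of a row name, if the index itself is needed
idx <- which(rownames(df) == "CD8A")
```

Indexing by name keeps working if the rows are later reordered, which positional indexing does not.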
3. If the object cannot be split after integration with NMF, add the following code (adapted from GitHub)
so_fromLiger[['nmf']] <- CreateDimReducObject(
  embeddings = Embeddings(so_fromLiger[['nmf']]),
  loadings = Loadings(so_fromLiger[['nmf']], projected = FALSE),
  projected = Loadings(so_fromLiger[['nmf']], projected = TRUE),
  assay = DefaultAssay(so_fromLiger[['nmf']]),
  stdev = Stdev(so_fromLiger[['nmf']]),
  key = Key(so_fromLiger[['nmf']]),
  global = TRUE,
  misc = Misc(object = so_fromLiger[['nmf']])
)
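4. Adding a categorical metadata column that records whether a gene is expressed in each cell
(Thanks to Kinesin for showing me this code; I had tried all sorts of things for a long time before.)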
pbmc$cd4 <- ifelse(pbmc@assays$RNA@counts["CD4",]>0, "pos", "neg")
To check whether certain genes are co-expressed in the same cells:
pbmc$cd8a <- ifelse(pbmc@assays$RNA@counts["CD8A",]>0, "pos", "neg")
pbmc$cd3d <- ifelse(pbmc@assays$RNA@counts["CD3D",]>0, "pos", "neg")
pbmc$cd8a_cd3d <- paste0(pbmc$cd8a, "_", pbmc$cd3d)
table(pbmc$cd8a_cd3d)
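The same pos/neg logic can be tried without a Seurat object. The sketch below uses a small made-up count matrix (genes in rows, cells in columns) in place of the counts slot; the matrix values and cell names are assumptions for illustration only:

```r
# Toy count matrix: genes in rows, cells in columns (hypothetical values)
counts <- matrix(
  c(0, 2, 1, 0,   # CD8A
    3, 1, 0, 0),  # CD3D
  nrow = 2, byrow = TRUE,
  dimnames = list(c("CD8A", "CD3D"), paste0("cell", 1:4))
)

# Label each cell as positive or negative for each gene
cd8a <- ifelse(counts["CD8A", ] > 0, "pos", "neg")
cd3d <- ifelse(counts["CD3D", ] > 0, "pos", "neg")

# Combine the two labels and count the cells in each combination
cd8a_cd3d <- paste0(cd8a, "_", cd3d)
combo_table <- table(cd8a_cd3d)

# Names of the cells that express both genes
double_pos <- names(cd8a)[cd8a == "pos" & cd3d == "pos"]
```

Here "pos_pos" marks co-expressing cells, so `combo_table` gives the number of cells in each of the four combinations at a glance.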
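5. Computing the expression of several genes across clusters and plotting a heatmap
If several genes belong to the same family, you may want to combine their expression before plotting the heatmap.
TOTAL1 comprises GeneA and GeneB
TOTAL2 comprises GeneC and GeneD
TOTAL3 comprises GeneE and GeneF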
tmp <- AverageExpression(pbmc, assays = "RNA", features = c("GeneA","GeneB"), slot = "counts") # average expression of GeneA and GeneB in each cluster
tmp1 <- tmp[["RNA"]]
tmpTOTAL1 <- apply(tmp1, 2, sum) # sum over the TOTAL1 genes within each cluster
---------
tmp <- AverageExpression(pbmc, assays = "RNA", features = c("GeneC","GeneD"), slot = "counts") # average expression of GeneC and GeneD in each cluster
tmp1 <- tmp[["RNA"]]
tmpTOTAL2 <- apply(tmp1, 2, sum) # sum over the TOTAL2 genes within each cluster
----------
tmp <- AverageExpression(pbmc, assays = "RNA", features = c("GeneE","GeneF"), slot = "counts") # average expression of GeneE and GeneF in each cluster
tmp1 <- tmp[["RNA"]]
tmpTOTAL3 <- apply(tmp1, 2, sum) # sum over the TOTAL3 genes within each cluster
----------
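# Once the totals are computed, combine them into a matrix (one row per family)
Total <- rbind(tmpTOTAL1, tmpTOTAL2, tmpTOTAL3)
# Add row and column names to the matrix
row <- c("TOTAL1","TOTAL2","TOTAL3")
column <- c("Cluster0","Cluster1","Cluster2") # one name per cluster; this example assumes three clusters
dimnames(Total) <- list(row, column)
# Plot the heatmap
pheatmap(Total)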
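The three repeated blocks above could also be collapsed into one step. The sketch below is one possible way, not part of the original recipe: it assumes the averages for all six genes are already in a single genes-by-clusters matrix (here a toy matrix named `avg` with made-up values, standing in for `tmp[["RNA"]]`), and `families` is a hypothetical named list mapping each total to its genes.

```r
# Toy matrix of per-cluster average expression (hypothetical values),
# standing in for tmp[["RNA"]] computed over all six genes at once
avg <- matrix(
  1:18, nrow = 6,
  dimnames = list(paste0("Gene", LETTERS[1:6]), paste0("Cluster", 0:2))
)

# Hypothetical mapping from each family total to its member genes
families <- list(
  TOTAL1 = c("GeneA", "GeneB"),
  TOTAL2 = c("GeneC", "GeneD"),
  TOTAL3 = c("GeneE", "GeneF")
)

# Sum the rows of each family within every cluster;
# drop = FALSE keeps a single-gene family as a matrix so colSums still works
Total <- t(sapply(families, function(genes) colSums(avg[genes, , drop = FALSE])))
```

The result has one row per family and one column per cluster, with names taken from `families` and from the columns of `avg`, so it can be passed to `pheatmap()` without setting `dimnames` by hand.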
6. Dropping specific columns from a data frame
# Drop the columns directly
df[,-which(names(df)%in%c("z","u"))]
subset(df,select=-c(z,u))
# Select the columns you want to keep
df[ , c("x","y")]
subset(df, select=c(x,y))
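One caveat with the `-which()` form is worth a quick check: if none of the names match, `which()` returns an empty vector and the selection drops every column instead of none. The sketch below, on a made-up data frame, contrasts it with a logical-negation variant that is safe in that case:

```r
# Toy data frame (hypothetical values)
df <- data.frame(x = 1:3, y = 4:6, z = 7:9, u = 10:12)

# Logical negation: keeps every column whose name is not in drop_cols
drop_cols <- c("z", "u")
kept <- df[, !(names(df) %in% drop_cols), drop = FALSE]

# With -which(), a name that matches nothing yields an empty selection
none_matched <- df[, -which(names(df) %in% c("not_a_column")), drop = FALSE]

# The logical form leaves the data frame unchanged in the same situation
all_kept <- df[, !(names(df) %in% c("not_a_column")), drop = FALSE]
```

`drop = FALSE` is included so that the result stays a data frame even when only one column remains.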
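7. Handling line breaks on Linux
# Replace the newlines in file.fasta with spaces
awk 'BEGIN{RS=EOF}{gsub(/\n/," ");print}' file.fasta > file.txt
# Remove all whitespace characters within each line of file.txt
sed 's/[[:space:]]//g' file.txt > tmp.txt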