R语言ESTIMATE包报错

在运行ESTIMATE包的时候，需要先把表达矩阵输出成txt，然后运行filterCommonGenes函数，转换成gct文件，在转换这一步总是报错，错误提示是：

> filterCommonGenes(input.f="./step_7_estimate/exp.txt", 
+                   output.f="exp_genes.gct", 
+                   id="GeneSymbol")
[1] "Merged dataset includes 0 genes (10412 mismatched)."
Error in dimnames(x) <- dn : 
  length of 'dimnames' [2] not equal to array extent

排除了变量类型（numric）的问题，打开txt文件看了一眼，发现行名和列名多了很多引号，于是回过头去看输出txt的代码：

write.table(a,file="./step_7_estimate/exp.txt")

突发奇想加上了sep="\t"，于是不报错了，但是依然显示没有merge到基因：

> write.table(a,file="./step_7_estimate/exp.txt",sep="\t")
> filterCommonGenes(input.f="./step_7_estimate/exp.txt", 
+                   output.f="exp_genes.gct", 
+                   id="GeneSymbol")
[1] "Merged dataset includes 0 genes (10412 mismatched)."

但是手动grep一下，发现很多基因其实是可以找到的

> grep("DCN",rownames(a))
[1] 4783
> grep("THBS2",rownames(a))
[1] 1903

于是去看了filterCommonGenes这个函数的源码

> filterCommonGenes
function (input.f, output.f, id = c("GeneSymbol", "EntrezID")) 
{
    stopifnot((is.character(input.f) && length(input.f) == 1 && 
        nzchar(input.f)) || (inherits(input.f, "connection") && 
        isOpen(input.f, "r")))
    stopifnot((is.character(output.f) && length(output.f) == 
        1 && nzchar(output.f)))
    id <- match.arg(id)
    input.df <- read.table(input.f, header = TRUE, row.names = 1, 
        sep = "\t", quote = "", stringsAsFactors = FALSE)
    merged.df <- merge(common_genes, input.df, by.x = id, by.y = "row.names")
    rownames(merged.df) <- merged.df$GeneSymbol
    merged.df <- merged.df[, -1:-ncol(common_genes)]
    print(sprintf("Merged dataset includes %d genes (%d mismatched).", 
        nrow(merged.df), nrow(common_genes) - nrow(merged.df)))
    outputGCT(merged.df, output.f)
}
<bytecode: 0x000002a43c866268>
<environment: namespace:estimate>

把变量名赋值好之后，尝试分行run一下，在读取表达矩阵这一步发现了问题

>   input.df <- read.table(input.f, header = TRUE, row.names = 1, 
+                          sep = "\t", quote = "", stringsAsFactors = FALSE)

可以看到读取的时候其实是有很多参数的，除了sep="\t"之外，还有quete=“”，而且发现读进来的表达矩阵，行名和列名都有引号。所以可能是引号的问题。
在write.table函数里加上quote=“”这个参数看一下

write.table(a,file="./step_7_estimate/exp.txt",sep="\t",
+             quote = F)
> filterCommonGenes(input.f="./step_7_estimate/exp.txt", 
+                   output.f="exp_genes.gct", 
+                   id="GeneSymbol")
[1] "Merged dataset includes 7099 genes (3313 mismatched)."

问题解决了，merge到3313个基因。

总结：write.table的时候，sep="\t"和quote=F最后都加上

©著作权归作者所有,转载或内容合作请联系作者
【社区内容提示】社区部分内容疑似由AI辅助生成，浏览时请结合常识与多方信息审慎甄别。
平台声明：文章内容（如有图片或视频亦包括在内）由作者上传并发布，文章内容仅代表作者本人观点，简书系信息发布平台，仅提供信息存储服务。

R语言ESTIMATE包报错

R语言ESTIMATE包报错

总结：write.table的时候，sep="\t"和quote=F最后都加上

相关阅读更多精彩内容

友情链接更多精彩内容