R语言中绘制热图heatmap的包(package)有很多,除了许多炫酷的第三方包外,系统自带的image函数使用起来也十分方便灵活。
1.主要绘制参数
image(x, y, z, zlim, xlim, ylim, col = heat.colors(12),
xlab, ylab, breaks, ...)
z: 用于画图的数据,矩阵类型
x, y: 数据在x和y轴的坐标
zlim: 数据的范围,默认值是z矩阵中的最小值到最大值
xlim, ylim: 图片中x和y的数据阈值
col: 设定颜色阈值
breaks: 颜色对应的数据断点
xlab, ylab: x轴和y轴的标题
2.绘制常规heatmap
假设我们有RNA-Seq的两个重复(replicates),一共有20个基因
# 产生样本数据
m <- data.frame(
rep1 = sample(1:20),
rep2 = sample(1:20)
)
# 写一个绘图函数
draw_image <- function(data, axis = FALSE, label = FALSE) {
# 设定绘图参数
breaks.frequency <- seq(from=min(data), to=max(data), length.out=10)
myColors <- colorRampPalette(c("white", "#2874A6"))
# 产生图片
image(1:nrow(data), 1:ncol(data), as.matrix(data), breaks=breaks.frequency, col=myColors(length(breaks.frequency)-1), axes = axis, cex = 1.5, xlab = "", ylab = "")
# 自定义axis
if (axis) {
axis(2, 1:ncol(data), colnames(data), cex.axis=2.5)
axis(1, 1:nrow(data), rownames(mxdata), cex.axis=2.5)
}
# 自定义文本
if (label) {
for (x in 1:nrow(data)) {
for (y in 1:ncol(data)) {
text(x, y, data[x, y], cex = 2)
}
}
}
}
# 绘制图形
draw_image(data, FALSE, FALSE)
3.数据分组后再产生heatmap
有时候数据里包含成百上千个基因,我们会想知道一些浓缩的结果,这时可以把表达量转换成离散型的数据,将表达量分成由高到低的不同组别,统计每个组别中的个数,这样可以更清晰的展示结果。
# 产生样本数据
myProbs <- seq(0, 0.95, 0.05)
mx <- data.frame(
sample1 = sample(1:20, 100, replace = T, prob = myProbs),
sample2 = sample(1:20, 100, replace = T, prob = rev(myProbs))
)
# 创建分组函数
calc_type <- function(x, column) {
type <- apply(x, 1, function(y){no=as.numeric(y[column]);
if(no >= 15){"15-20"}
else if(no >= 10){"10-15"}
else if(no >= 5){"5-10"}
else{"0-5"}
})
type <- factor(type, levels = c("0-5", "5-10", "10-15", "15-20"))
}
# 将每个基因在两个sample中的表达量划分到相应组别中
mx$type1 <- calc_type(mx, 1)
mx$type2 <- calc_type(mx, 2)
# 转换数据框为矩阵
# mx是行数和列数相等的矩阵
mx <- as.matrix(as.data.frame.matrix(table(mx$type1, mx$type2)))
# 查看数据,可以看到sample1处于高表达量的基因个数明显高于sample2
mx
# 0-5 5-10 10-15 15-20
# 0-5 1 2 0 0
# 5-10 4 3 6 1
# 10-15 10 11 6 1
# 15-20 17 15 13 10
# 绘制图,可以明显区分sample1和sample2的差别
draw_image(mx, TRUE, TRUE)
4. 添加scale
以下代码可以添加颜色的scale
# 创建颜色scale
# 以下代码来自https://www.r-bloggers.com/new-version-of-image-scale-function/
image.scale <- function(z, zlim, col = heat.colors(12),
breaks, horiz=TRUE, ylim=NULL, xlim=NULL, ...){
if(!missing(breaks)){
if(length(breaks) != (length(col)+1)){stop("must have one more break than colour")}
}
if(missing(breaks) & !missing(zlim)){
breaks <- seq(zlim[1], zlim[2], length.out=(length(col)+1))
}
if(missing(breaks) & missing(zlim)){
zlim <- range(z, na.rm=TRUE)
zlim[2] <- zlim[2]+c(zlim[2]-zlim[1])*(1E-3)#adds a bit to the range in both directions
zlim[1] <- zlim[1]-c(zlim[2]-zlim[1])*(1E-3)
breaks <- seq(zlim[1], zlim[2], length.out=(length(col)+1))
}
poly <- vector(mode="list", length(col))
for(i in seq(poly)){
poly[[i]] <- c(breaks[i], breaks[i+1], breaks[i+1], breaks[i])
}
xaxt <- ifelse(horiz, "s", "n")
yaxt <- ifelse(horiz, "n", "s")
if(horiz){YLIM<-c(0,1); XLIM<-range(breaks)}
if(!horiz){YLIM<-range(breaks); XLIM<-c(0,1)}
if(missing(xlim)) xlim=XLIM
if(missing(ylim)) ylim=YLIM
plot(1,1,t="n",ylim=ylim, xlim=xlim, xaxt=xaxt, yaxt=yaxt, xaxs="i", yaxs="i", ...)
for(i in seq(poly)){
if(horiz){
olygon(poly[[i]], c(0,0,1,1), col=col[i], border=NA)
}
if(!horiz){
polygon(c(0,0,1,1), poly[[i]], col=col[i], border=NA)
}
}
}
# 绘制颜色scale
breaks.frequency <- seq(from=min(mx), to=max(mx), length.out=10)
myColors <- colorRampPalette(c("white", "#2874A6"))
image.scale(mx, mx, myColors(length(breaks.frequency)-1), breaks.frequency)
参考资料
image.scale地址:https://www.r-bloggers.com/new-version-of-image-scale-function/