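Today we look at another bar-chart technique: drawing a plot with two y-axes. A dataset often contains several series whose scales or units differ, and in that case a single axis is not enough.
Take the figure above: to show more information in one plot, it combines a bar chart with a line chart.
When the two series have different value ranges, for example bars showing counts and a line showing cumulative percentage, plotting them against one axis makes the line, whose values lie in [0, 1], hug the x-axis and lose its meaning.
The fix is to use two y-axes so that the bars and the line each get their own scale.
As an example, we generate the following test data: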
library(ggplot2)

data <- data.frame(group = c("<10", "10-15", "15-20", "20-25", "25-30", ">30"),
                   count = c(70, 15, 8, 4, 2, 1),
                   percent = c(0.70, 0.85, 0.93, 0.97, 0.99, 1.00))
# Convert group to a factor so the bars appear in the order we wrote them
data$group <- factor(data$group, levels = as.character(data$group))
ggplot(data) +
  geom_bar(aes(x = group, y = count), stat = "identity", fill = '#168aad')
This produces a very simple bar chart.
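The factor conversion is what keeps the bins in reading order. As a small sketch in base R (the variable names `default_levels` and `ordered_levels` are only illustrative), compare the levels you get with and without an explicit `levels` argument:

```r
groups <- c("<10", "10-15", "15-20", "20-25", "25-30", ">30")

# Default: levels are the sorted unique values, which is typically not
# the intended bin order (the exact sort order depends on the locale).
default_levels <- levels(factor(groups))

# Explicit levels: keep the order in which the bins were written.
ordered_levels <- levels(factor(groups, levels = groups))

print(default_levels)
print(ordered_levels)
```

ggplot2 places discrete x values in level order, so the second form is the one that puts "<10" first and ">30" last.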
However, if we now add a line on top:
ggplot(data) +
geom_bar(aes(x = group, y = count), stat = "identity", fill = '#168aad')+
geom_line(aes(x = group, y = percent), size = 1, color = '#800080') +
geom_point(aes(x = group, y = percent), size = 3, shape = 19, color='#800080')
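ggplot2 then reports the following, and the line is not drawn:
geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?
The usual advice is to set group to 1 so that all points are treated as a single series. This is needed because x is a factor, and ggplot2 otherwise puts each level in its own group, leaving one point per group and nothing to connect.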
ggplot(data) +
geom_bar(aes(x = group, y = count), stat = "identity", fill = '#168aad')+
geom_line(aes(x = group, y = percent), size = 1, color = '#800080',group=1) +
geom_point(aes(x = group, y = percent), size = 3, shape = 19, color='#800080',group=1)
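The result is shown in the figure below. Because count and percent are on very different scales, the trend in percent is completely flattened.
So, to show count and percent each over its own value range in a single plot, we have to project the range of one onto the other so that both share a common range; in other words, rescale it.
Here we project percent onto count and store the result in a new column, percent2. Later we relabel the axis so that it still reads in the original units.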
data$percent2 = data$percent / max(data$percent) * max(data$count)
ggplot(data) +
geom_bar(aes(x = group, y = count), stat = "identity", fill = '#168aad')+
geom_line(aes(x = group, y = percent2), size = 1, color = '#800080',group=1) +
geom_point(aes(x = group, y = percent2), size = 3, shape = 19, color='#800080',group=1)
With count and percent in the same range, the trend in percent is now clearly visible.
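The projection is just a linear rescale, so it can be checked without drawing anything. The sketch below wraps the same formula in a hypothetical helper, `project_onto` (an illustrative name, not part of ggplot2), and applies it to the test data:

```r
# Hypothetical helper: rescale `x` so that its maximum coincides with max(target).
project_onto <- function(x, target) {
  x / max(x) * max(target)
}

count   <- c(70, 15, 8, 4, 2, 1)
percent <- c(0.70, 0.85, 0.93, 0.97, 0.99, 1.00)

percent2 <- project_onto(percent, count)
print(round(percent2, 1))  # 49.0 59.5 65.1 67.9 69.3 70.0

# The shape of the curve is unchanged: every value is still the same
# fraction of its maximum as it was before rescaling.
stopifnot(isTRUE(all.equal(percent2 / max(percent2), percent / max(percent))))
```

Because only the units change, the line keeps its shape; it is simply stretched to fill the height of the count axis.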
Next, we add an axis for percent.
This is done with the sec.axis argument of scale_y_continuous, which creates the second axis.
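data$percent2 <- data$percent / max(data$percent) * max(data$count)
count_max <- max(data$count)
# Note: on ggplot2 >= 3.4.0, line widths are set with linewidth; size still works for lines but gives a deprecation warning
ggplot(data) +
  geom_bar(aes(x = group, y = count), stat = "identity", fill = '#168aad') +
  geom_line(aes(x = group, y = percent2), size = 1, color = '#800080', group = 1) +
  geom_point(aes(x = group, y = percent2), size = 3, shape = 19, color = '#800080', group = 1) +
  #geom_text(aes(x=group,y=count,label=count),size=5,color="red")+
  scale_y_continuous(
    limits = c(0, count_max),
    breaks = seq(0, count_max, 5),
    # secondary value = primary value / count_max * 100, so count_max reads as 100%
    sec.axis = sec_axis(~ . / count_max * 100, name = "percent (%)",
                        breaks = seq(0, 100, 10))
  )
sec.axis adds a secondary y-axis whose values are derived from the primary axis. The formula ~ . / count_max * 100 converts a primary-axis value (a count) into the percentage drawn at the same height, so the top of the count axis reads 100%. This is the inverse of the projection used for percent2, since max(percent) is 1 here.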
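When tuning a secondary axis it helps to check numerically where each tick will land. The sketch below assumes a direct linear mapping in which the top of the count axis (`count_max`) corresponds to 100%; `to_percent` and `to_count` are illustrative names, not ggplot2 functions.

```r
count_max <- 70

# Primary (count) units -> secondary (percent) units:
# the role played by the formula passed to sec_axis().
to_percent <- function(y) y / count_max * 100

# Inverse: the height on the count axis at which a percent value sits.
to_count <- function(p) p / 100 * count_max

pct_breaks <- seq(0, 100, 10)
positions  <- to_count(pct_breaks)
print(positions)  # 0 7 14 21 28 35 42 49 56 63 70

# A point at 93% is drawn at this height on the count axis.
print(to_count(93))

# Round trip: converting to counts and back recovers the percent labels.
stopifnot(isTRUE(all.equal(to_percent(to_count(pct_breaks)), pct_breaks)))
```

If the projection applied to the data and the formula given to the secondary axis are not exact inverses of each other, the line will sit slightly off from the tick labels, so this round-trip check is worth running whenever either one changes.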
Finally, we adjust the appearance.
ggplot(data) +
  geom_bar(aes(x = group, y = count), stat = "identity", fill = '#168aad') +
  geom_line(aes(x = group, y = percent2), size = 1, color = '#800080', group = 1) +
  geom_point(aes(x = group, y = percent2), size = 3, shape = 19, color = '#800080', group = 1) +
  #geom_text(aes(x=group,y=count,label=count),size=5,color="red")+
  scale_y_continuous(
    limits = c(0, count_max),
    breaks = seq(0, count_max, 5),
    # secondary value = primary value / count_max * 100, so count_max reads as 100%
    sec.axis = sec_axis(~ . / count_max * 100, name = "percent (%)",
                        breaks = seq(0, 100, 10))
  ) +
  theme_classic() +
  #theme_minimal() +
  # Remove all grid lines
  theme(panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank(),
        panel.grid.major.y = element_blank(),
        panel.grid.minor.y = element_blank()) +
  # Center the title
  theme(plot.title = element_text(hjust = 0.5)) +
  labs(title = "CV% distribution", x = "group", y = "count") +
  theme(text = element_text(size = 15)) +
  # Dark blue axis lines and ticks
  theme(axis.line.x = element_line(linetype = 1, color = "darkblue", size = 1),
        axis.line.y = element_line(linetype = 1, color = "darkblue", size = 1),
        axis.ticks.x = element_line(color = "darkblue", size = 1),
        axis.ticks.y = element_line(color = "darkblue", size = 1))
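The whole recipe (project, draw, relabel) can be collected into one function. The following is only a sketch under stated assumptions: `plot_count_with_percent` is a hypothetical helper, it expects columns named `group`, `count`, and `percent`, and it assumes `percent` is a fraction in [0, 1] so that 100% lines up with the tallest bar. It uses `geom_col()`, which is equivalent to `geom_bar(stat = "identity")`.

```r
library(ggplot2)

# Hypothetical helper: expects columns `group`, `count`, and `percent` (0-1).
plot_count_with_percent <- function(df) {
  count_max <- max(df$count)
  # Project percent onto the count axis so that 100% sits at count_max.
  df$percent_scaled <- df$percent * count_max

  ggplot(df, aes(x = group)) +
    geom_col(aes(y = count), fill = "#168aad") +
    geom_line(aes(y = percent_scaled, group = 1), color = "#800080") +
    geom_point(aes(y = percent_scaled), color = "#800080", size = 3) +
    scale_y_continuous(
      name = "count",
      # Inverse of the projection above: count units back to percent.
      sec.axis = sec_axis(~ . / count_max * 100, name = "percent (%)")
    ) +
    theme_classic()
}

data <- data.frame(group = c("<10", "10-15", "15-20", "20-25", "25-30", ">30"),
                   count = c(70, 15, 8, 4, 2, 1),
                   percent = c(0.70, 0.85, 0.93, 0.97, 0.99, 1.00))
data$group <- factor(data$group, levels = as.character(data$group))

p <- plot_count_with_percent(data)  # call print(p) to draw the plot
```

Keeping the projection and its inverse next to each other inside one function makes it harder for the two to drift apart when the data or the axis limits change.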