R语言ggplot2
之前下载过Rstudio,这次直接开始学
准备工作
- 安装并打开包
install.packages("tidyverse")
library(tidyverse)
用到了一个mpg数据框,不了解时可以?mpg
ggplot2作图基本
- 模板
ggplot(data = <DATA>) +
<GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))
注意+的位置,geom指图的类型,mapping是加图层,aesthetic是各种显示的属性
- 颜色、大小、透明度、形状
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, color = class))
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, size = class))
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, alpha = class))
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, shape = class))
可以手动设置属性,要放在aes外面:
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy), color = "blue")
- 分面
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy)) +
facet_grid(drv ~ cyl)
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy)) +
facet_wrap(~ class, nrow = 2)
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy)) +
facet_grid(. ~ cyl)
- 分组
ggplot(data = mpg) +
geom_smooth(mapping = aes(x = displ, y = hwy, group = drv))
ggplot(data = mpg) +
geom_smooth(
mapping = aes(x = displ, y = hwy, linetype = drv),
)
ggplot(data = mpg) +
geom_smooth(
mapping = aes(x = displ, y = hwy, color = drv),
show.legend = FALSE #不显示图例
)
- 全局映射
ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
geom_point(mapping = aes(color = class)) +
geom_smooth(data = filter(mpg, class == "subcompact"), se = FALSE)
局部映射与全局映射冲突时,服从局部映射; se默认显示标准差
统计变换
这里用到新的diamond数据
- 统计变换函数和几何对象函数
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut))
ggplot(data = diamonds) +
stat_count(mapping = aes(x = cut))
每个几何对象函数都有一个默认的统计变换,每个统计变换函数都又一个默认的几何对象(绘图时用来计算新数据的算法叫做统计变换stat)
- 修改stat
demo <- tribble(
~cut, ~freq,
"Fair", 1610,
"Good", 4906,
"Very Good", 12082,
"Premium", 13791,
"Ideal", 21551
)
ggplot(data = demo) +
geom_bar(mapping = aes(x = cut, y = freq), stat = "identity")
- 显示比例
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, y = ..prop.., group = 1))
group把所有钻石当成一组
- 从统计变换角度作图
ggplot(data = diamonds) +
stat_summary(
mapping = aes(x = cut, y = depth),
fun.ymin = min,
fun.ymax = max,
fun.y = median
)
R for Data Science原文:stat_summary summarises the y values for each unique x value, to draw attention to the summary that you’re computing
位置调整
- color和fill
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, colour = cut)) #边框
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, fill = cut)) #给柱子上色
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, fill = clarity)) #根据clarity用fill上色
position会调整数据在图上的位置:
ggplot(data = diamonds, mapping = aes(x = cut, colour = clarity)) +
geom_bar(fill = NA, position = "identity") #加上identity会“place each object exactly where it falls in the context of the graph”
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, fill = clarity), position = "dodge") #将数据横着分开
jitter让重合的点抖动开:
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy), position = "jitter")
坐标系
- 翻转坐标系
ggplot(data = mpg, mapping = aes(x = class, y = hwy)) +
geom_boxplot()
ggplot(data = mpg, mapping = aes(x = class, y = hwy)) +
geom_boxplot() +
coord_flip()
- 极坐标
bar <- ggplot(data = diamonds) +
geom_bar(
mapping = aes(x = cut, fill = cut),
show.legend = FALSE,
width = 1
) +
theme(aspect.ratio = 1) +
labs(x = NULL, y = NULL)
bar + coord_flip()
bar + coord_polar()
总结公式
ggplot(data = <DATA>) +
<GEOM_FUNCTION>(
mapping = aes(<MAPPINGS>),
stat = <STAT>,
position = <POSITION>
) +
<COORDINATE_FUNCTION> +
<FACET_FUNCTION>