更好阅读移步至:https://www.yuque.com/docs/share/01fe5958-5f6b-4364-aa39-cb7d3fd602ed?#
参考链接:
- twitter:https://twitter.com/geokaramanis/status/1247586395876741120
- 代码 github 链接:https://github.com/gkaramanis/tidytuesday/tree/master/2020-week15
<br />
<a name="zcTxe"></a>
须知:
rle 函数:计算向量中连续相同字符的个数
> rle(c(1,1,1,2,3,3,3,1,1))
Run Length Encoding
lengths: int [1:4] 3 1 3 2
values : num [1:4] 1 2 3 1
> rle(c(1, 2, 3, 3, 1, 1, 2, 3, 2))
Run Length Encoding
lengths: int [1:7] 1 1 2 2 1 1 1
values : num [1:7] 1 2 3 1 2 3 2
<br />**glue 函数:用大括号 {} **括起来的表达式将被计算为 R 代码。长字符串由行分隔并连接在一起。从第一行和最后一行开始的空白行和空白行被自动裁剪。
> name <- "Fred"
> age <- 50
> anniversary <- as.Date("1991-10-12")
> glue('My name is {name},',
'my age next year is {age + 1},',
'my anniversary is {format(anniversary, "%A, %B %d, %Y")}.')
My name is Fred,my age next year is 51,my anniversary is 星期六, 十月 12, 1991.
> glue("My name is {name}, not {{name}}.")
My name is Fred, not {name}.
> year <- 1940
> glue("**year**")
**year**
> glue("**{year}**")
**1940**
<br />**countrycode **函数:对我不重要
> cowcodes <- c("ALG", "ALB", "UKG", "CAN", "USA")
> countrycode(cowcodes, origin = "cowc", destination = "iso3c")
[1] "DZA" "ALB" "GBR" "CAN" "USA"
<br />**ggtext **包 的 **geom_richtext **函数:加文本标签注释
<br />
<br />annotate:添加注释和以及几何图形特别方便,为所欲为
- https://ggplot2.tidyverse.org/reference/annotate.html
- annotate 添加空白
annotate("rect",
xmin = -2000, ymin = c(13, 38),
xmax = 26000, ymax = c(16, 41),
fill = "#F3F2EE", colour = "black", size = 0.3)
<br />coord_cartesian:放大镜效果不改变图形形状
- https://ggplot2.tidyverse.org/reference/coord_cartesian.html
- 绘图应该裁剪到画板的范围吗 ? 设置为"on" (默认值)表示“是”,设置为"off" 表示“不是”。在大多数情况下,不应该更改默认的 "on",因为设置 clip ="off" 可能会导致意外的结果。它允许在绘图图的任何地方绘制数据点,包括在绘图页边距中。如果通过 xlim 和 ylim 设置了限制,并且一些数据点超出了这些限制,那么这些数据点可能出现在轴、图例、画板标题或画板边距等位置。
p <- ggplot(mtcars, aes(disp, wt)) +
geom_point() +
geom_smooth()
p + scale_x_continuous(limits = c(325, 500))
p + coord_cartesian(xlim = c(325, 500))
<br />coord_cartesian(clip = 'off'):取消画板限制
<br />**scale_x_continuous **函数通过 **limits **和 **expand **函数控制贴 y 轴距离
scale_y_reverse 函数翻转 y 轴左边起始顺序,上下颠倒,并通过 expand = expansion(add = 0) 控制 y 轴顶端和低端间隙为 0<br /><br />theme_void:去除画板,包括轴以及背景**<br />
- ggplot2 Complete themes:https://ggplot2.tidyverse.org/reference/ggtheme.html
plot.margin:控制上下左右图片边距<br />**<br />
<br />
<a name="alM8p"></a>
前期数据获得:
library(tidyverse)
library(lubridate)
library(countrycode)
library(ggtext)
library(glue)
library(here)
#library(skimr)
tdf_winners <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-04-07/tdf_winners.csv')
tdf_table <- tdf_winners %>%
mutate(
# rle: 计算相同值的数目
wins_consecutive = with(rle(winner_name), rep(lengths, times = lengths)),
year = year(start_date), # 提取年数据
# glue() 函数大括号 {} 括起来表示 R 代码 year <- 1940; glue("**{year}**"); > **1940** <
year_labels = ifelse(year %% 10 == 0, glue("**{year}**"), year),
year_group = case_when(
year < 1915 ~ 1,
year > 1915 & year < 1940 ~ 2,
TRUE ~ 3),
avg_speed = distance / time_overall,
country_code = countrycode(nationality, origin = "country.name", destination = "iso3c"),
winner_annot = ifelse(wins_consecutive > 2, glue("**{winner_name} ({country_code})**"), glue("{winner_name} ({country_code})"))
) %>%
# 分组很妙,添加行号
group_by(year_group) %>%
mutate(
n_annot = row_number(),
annot = ifelse((n_annot - 2) %% 3 == 0, TRUE, FALSE)
) %>%
ungroup() %>%
add_row(year = c(1915, 1916, 1917, 1918, 1940, 1941, 1942, 1943)) %>%
arrange(year) %>%
mutate(n = row_number())
<a name="LL9Yf"></a>
分步骤重现图:
<a name="hLyBf"></a>
step1: geom_segment() 标虚线
- 使用 geom_segment() 函数添加虚线, 数据中的 NA 很妙用,如果是 NA 值就不需要加虚线,就是为了图中看到的 1915-1918 添加用。
- 真的妙,利用有多少行非 NA 数值,来保证点添加多少
ggplot(tdf_table) +
# dotted gridlines ---------------------------------------------------
# 使用 geom_segment() 函数添加虚线, 数据中的 NA 很妙用,如果是 NA 值就不需要加虚线,就是为了图中看到的 1915-1918 添加用。
# 真的妙,利用有多少行非 NA 数值,来保证点添加多少
geom_segment(data = subset(tdf_table, !is.na(year_labels)),
aes(x = 0, xend = 24000, y = n, yend = n),
linetype = "dotted", size = 0.2)
<br />
<a name="RQ23D"></a>
step2:加上左右两侧的年份
- 这里的 x = -1000 y = n 以及 x = 25000 用的很妙啊。特别是 -1000,加在坐标轴左侧
geom_richtext(aes(x = -1000, y = n, label = year_labels),
fill = "red", label.color = NA,
label.padding = unit(0.1, "lines"),
family = "JetBrains Mono", size = 2.5) +
geom_richtext(aes(x = 25000, y = n, label = year_labels),
fill = "blue", label.color = NA,
label.padding = unit(0.1, "lines"),
family = "JetBrains Mono", size = 2.5)
<br />
<a name="kUiQk"></a>
step3:geom_area() 加上填充面积
geom_area(aes(x = distance * 0.625, y = n, group = year_group),
fill = "#7DDDB6", alpha = 0.6,
orientation = "y", position = "identity")
<br />
<a name="50aa2"></a>
step4:选择性加上每一个上面对应的点
geom_point(data = subset(tdf_table, annot),
aes(x = distance * 0.625, y = n), size = 0.5)
<br />
<a name="K2rHJ"></a>
step5:给 step4 中的点加上数值
geom_label(data = subset(tdf_table, annot),
aes(x = distance * 0.625 + 100, y = n, label = distance),
fill = "#F3F2EE", label.size = 0,
label.padding = unit(0.1, "lines"),
hjust = 0, family = "JetBrains Mono", size = 2.5)
<br /><br />
<a name="dfy6e"></a>
step6:给每一行加上注释,对应 WINNER
geom_richtext(aes(x = 5300, y = n, label = winner_annot, .na = NULL),
fill = "#F3F2EE", label.size = 0,
label.padding = unit(0.1, "lines"),
hjust = 0, family = "JetBrains Mono", size = 2.5)
<br /><br />
<br />step7:
geom_label(aes(x = 10300, y = n, label = glue("{winner_team}", .na = NULL)),
fill = "#F3F2EE", label.size = 0, label.padding = unit(0.1, "lines"),
hjust = 0, family = "JetBrains Mono", size = 2.5)
<br />
<a name="Usq1r"></a>
step7:geom_segment 函数添加 AVERAGE SPEED 数据
geom_segment(aes(x = 16000, xend = 16000 + avg_speed * 66.67, y = n, yend = n),
size = 2, colour = "#7DDDB6", alpha = 0.6)
<br />
<a name="BJmwo"></a>
step8:选择性添加 AVERAGE SPEED 对应的数值
geom_label(data = subset(tdf_table, annot),
aes(x = 16000 + avg_speed * 66.67 + 100, y = n,
label = round(avg_speed, 1)), fill = "#F3F2EE",
label.size = 0, label.padding = unit(0.1, "lines"),
hjust = 0, family = "JetBrains Mono", size = 2.5)
<br />
<a name="t2Caa"></a>
step9:添加 TOTAL TIME 时间填充(geom_ribbon)、点、标签
geom_ribbon(aes(xmin = 20000, xmax = 20000 + time_overall * 10, y = n, group = year_group),
fill = "#FCDF33", alpha = 0.6, orientation = "y", position = "identity") +
geom_point(data = subset(tdf_table, annot),
aes(x = 20000 + time_overall * 10, y = n), size = 0.5) +
geom_label(data = subset(tdf_table, annot),
aes(x = 20000 + time_overall * 10 + 100, y = n,
label = round(time_overall, 1)),
fill = "#F3F2EE", label.size = 0,
label.padding = unit(0.1, "lines"),
hjust = 0, family = "JetBrains Mono", size = 2.5)
<br />
<a name="psBfU"></a>
step10:annotate 函数添加竖直线
annotate("segment",
x = c(-2000, 0, 5000, 10000, 16000, 20000, 24000, 26000),
xend = c(-2000, 0, 5000, 10000, 16000, 20000, 24000, 26000),
y = -4, yend = 115, size = 0.3)
<br />
<a name="IEOLm"></a>
step11:annotate 函数添加三条横线
annotate("segment",
x = -2000, xend = 26000,
y = c(-4, -1, 115), yend = c(-4, -1, 115), size = 0.3)
<br />
<a name="tIULc"></a>
step12:annotate 添加表头
annotate("text",
x = c(-1000, 2500, 7500, 13000, 18000, 22000, 25000),
y = -2.5,
label = toupper(c("year", "distance", "winner", "team", "average speed", "total time", "year")),
hjust = 0.5, family = "IBM Plex Sans Bold", size = 3.5)
<br />
<a name="qoRiI"></a>
step13:annotate 函数加上空白
annotate("rect",
xmin = -2000, ymin = c(13, 38),
xmax = 26000, ymax = c(16, 41),
fill = "#F3F2EE", colour = "black", size = 0.3)
<br />
<a name="Gfu1v"></a>
step14:annotate 函数参数 richtext 添加中间小表头
annotate("richtext", x = 13000, y = c(14.5, 39.5),
label = c("**1915-1918** Tour suspended because of Word War I",
"**1940-1946** Tour suspended because of Word War II"),
label.color = NA, fill = "#F3F2EE", hjust = 0.5,
family = "IBM Plex Sans", size = 3.5)
<br />
<a name="E0c7U"></a>
step15:annotate 函数参数 text 给 DISTANCE 栏加上单位刻度
annotate("text", x = c(100, 4900), y = 0,
label = c("0", "8000 km"), hjust = c(0, 1),
family = "IBM Plex Mono Light", size = 3)
<br />
<a name="TyUrp"></a>
step16:annotate 函数参数 text 给其他的添加刻度尺和注释
annotate("text", x = c(16100, 19900), y = 0,
label = c("0", "60 km/h"), hjust = c(0, 1),
family = "IBM Plex Mono Light", size = 3) +
annotate("text", x = c(20100, 23900), y = 0,
label = c("0", "300 h"), hjust = c(0, 1),
family = "IBM Plex Mono Light", size = 3) +
annotate("text", x = 26000, y = -6,
label = "Source: alastairrushworth/tdf & kaggle.com/jaminliu | Graphic: Georgios Karamanis",
hjust = 1, family = "IBM Plex Mono Light", size = 3)
<br />
<a name="9CuG9"></a>
step17:coord_cartesian 函数取消画板限制范围
- 图形相对于 step16 没啥变化
coord_cartesian(clip = 'off')
<a name="HBo0W"></a>
step18:scale_x_continuous 函数通过 limits 和 expand 函数控制贴 y 轴距离
scale_x_continuous(limits = c(-2300, 26300), expand = expansion(add = 1))
<br />
<a name="A31Sj"></a>
step19:scale_y_reverse 函数翻转 y 轴左边起始顺序,上下颠倒,并通过 expand = expansion(add = 0) 控制 y 轴顶端和低端间隙为 0
scale_y_reverse(expand = expansion(add = 0))
<br />
<a name="nszRU"></a>
step20:labs 加标题以及 theme_void 去除主题线条背景以及坐标轴
labs(
title = "Tour de France Winners"
) +
theme_void(base_family = "JetBrains Mono")
<br />
<a name="SAvaF"></a>
step21:设置灰色背景,画板大小,以及标题大小
-
F3F2EE 色条为灰白色:https://www.color-hex.com/color/f3f2ee
- plot.margin:控制上下左右边距(上,左,下,右)
theme(
plot.background = element_rect(fill = "#F3F2EE", colour = NA),
plot.margin = margin(20, 20, 20, 20),
plot.title = element_text(hjust = 0.01, size = 28,
family = "IBM Plex Sans Bold", margin = margin(0, 0, -8, 0))
)
<br />
<a name="ofV8b"></a>
step22:here 函数加时间函数命名文件名
ggsave(here::here("2020-week15", "plots", "temp",
paste0("tour-de-france", format(Sys.time(), "%Y%m%d_%H%M%S"), ".png")),
dpi = 320, width = 11, height = 15)
<br />step23:附加 AI 操作视频<br />