最近年中总结,我们学习一些基础知识,还有几天总结就结束了,期待FFPE做空间转录组大放异彩
安装及加载ggpubr包
安装方式有两种:
- 直接从CRAN安装:
install.packages("ggpubr")
- 从GitHub上安装最新版本:
if(!require(devtools)) install.packages("devtools")
devtools::install_github("kassambara/ggpubr")
安装完之后直接加载就行:
library(ggpubr)
ggpubr可绘制图形:
ggpubr可绘制大部分我们常用的图形,下面一一介绍。
分布图(Distribution)
#构建数据集
set.seed(1234)
df <- data.frame( sex=factor(rep(c("f", "M"), each=200)),
weight=c(rnorm(200, 55), rnorm(200, 58)))
head(df)
## sex weight
## 1 f 53.79293
## 2 f 55.27743
## 3 f 56.08444
## 4 f 52.65430
## 5 f 55.42912
## 6 f 55.50606
密度分布图以及边际地毯线并添加平均值线
ggdensity(df, x="weight", add = "mean", rug = TRUE, color = "sex", fill = "sex",
palette = c("#00AFBB", "#E7B800"))
带有均值线和边际地毯线的直方图
gghistogram(df, x="weight", add = "mean", rug = TRUE, color = "sex", fill = "sex",
palette = c("#00AFBB", "#E7B800"))
箱线图与小提琴图
#加载数据集ToothGrowth
data("ToothGrowth")
df1 <- ToothGrowth
head(df1)
## len supp dose
## 1 4.2 VC 0.5
## 2 11.5 VC 0.5
## 3 7.3 VC 0.5
## 4 5.8 VC 0.5
## 5 6.4 VC 0.5
## 6 10.0 VC 0.5
p <- ggboxplot(df1, x="dose", y="len", color = "dose",
palette = c("#00AFBB", "#E7B800", "#FC4E07"),
add = "jitter", shape="dose")#增加了jitter点,点shape由dose映射p
增加不同组间的p-value值,可以自定义需要标注的组间比较
my_comparisons <- list(c("0.5", "1"), c("1", "2"), c("0.5", "2"))
p+stat_compare_means(comparisons = my_comparisons)+#不同组间的比较
stat_compare_means(label.y = 50)
内有箱线图的小提琴图
ggviolin(df1, x="dose", y="len", fill = "dose",
palette = c("#00AFBB", "#E7B800", "#FC4E07"),
add = "boxplot", add.params = list(fill="white"))+
stat_compare_means(comparisons = my_comparisons, label = "p.signif")+#label这里表示选择显著性标记(星号)
stat_compare_means(label.y = 50)
条形图
data("mtcars")
df2 <- mtcars
df2$cyl <- factor(df2$cyl)
df2$name <- rownames(df2)#添加一行name
head(df2[, c("name", "wt", "mpg", "cyl")])
按从小到大顺序绘制条形图(不分组排序)
ggbarplot(df2, x="name", y="mpg", fill = "cyl", color = "white",
palette = "jco",#杂志jco的配色
sort.val = "desc",#下降排序
sort.by.groups=FALSE,#不按组排序
x.text.angle=60)
按组进行排序
ggbarplot(df2, x="name", y="mpg", fill = "cyl", color = "white",
palette = "jco",#杂志jco的配色
sort.val = "asc",#上升排序,区别于desc,具体看图演示
sort.by.groups=TRUE,#按组排序
x.text.angle=90)
偏差图
偏差图展示了与参考值之间的偏差
df2$mpg_z <- (df2$mpg-mean(df2$mpg))/sd(df2$mpg)
df2$mpg_grp <- factor(ifelse(df2$mpg_z<0, "low", "high"), levels = c("low", "high"))
head(df2[, c("name", "wt", "mpg", "mpg_grp", "cyl")])
绘制排序过的条形图
ggbarplot(df2, x="name", y="mpg_z", fill = "mpg_grp", color = "white",
palette = "jco", sort.val = "asc", sort.by.groups = FALSE, x.text.angle=60,
ylab = "MPG z-score", xlab = FALSE, legend.title="MPG Group")
坐标轴变换
ggbarplot(df2, x="name", y="mpg_z", fill = "mpg_grp", color = "white",
palette = "jco", sort.val = "desc", sort.by.groups = FALSE,
x.text.angle=90, ylab = "MPG z-score", xlab = FALSE,
legend.title="MPG Group", rotate=TRUE, ggtheme = theme_minimal())
点图(Dot charts)
棒棒糖图(Lollipop chart)
棒棒图可以代替条形图展示数据
ggdotchart(df2, x="name", y="mpg", color = "cyl",
palette = c("#00AFBB", "#E7B800", "#FC4E07"), sorting = "ascending",
add = "segments", ggtheme = theme_pubr())
可以自设置各种参数
ggdotchart(df2, x="name", y="mpg", color = "cyl",
palette = c("#00AFBB", "#E7B800", "#FC4E07"), sorting = "descending",
add = "segments", rotate = TRUE, group = "cyl", dot.size = 6,
label = round(df2$mpg), font.label = list(color="white", size=9, vjust=0.5),
ggtheme = theme_pubr())
偏差图
ggdotchart(df2, x="name", y="mpg_z", color = "cyl",
palette = c("#00AFBB", "#E7B800", "#FC4E07"), sorting = "descending",
add = "segment", add.params = list(color="lightgray", size=2),
group = "cyl", dot.size = 6, label = round(df2$mpg_z, 1),
font.label = list(color="white", size=9, vjust=0.5), ggtheme = theme_pubr())+
geom_line(yintercept=0, linetype=2, color="lightgray")
Cleveland点图
ggdotchart(df2, x="name", y="mpg", color = "cyl",
palette = c("#00AFBB", "#E7B800", "#FC4E07"), sorting = "descending",
rotate = TRUE, dot.size = 2, y.text.col=TRUE, ggtheme = theme_pubr())+
theme_cleveland()
当然,还有很多其他的图表功能
3. 更多
- ggscatter() 散点图
- stat_cor() 将有P值的相关系数添加到散点图中
- stat_stars()) Add Stars to a Scatter Plot
- ggscatterhist() 绘制具有边际直方图的散点图
-
ggpaired() Plot Paired Data
Make MA-plot which is a scatter plot of log2 fold changes (on the y-axis) versus the mean expression signal (on the x-axis).
MA plot充分展示了基因丰度和表达变化之间的关系。我们可以看到,越靠左下或者右上的点,就是丰度越高而且变化幅度越大的基因。当然了,MA plot就丢了FDR这类统计量。二维图嘛,死活两个参数,顶多用颜色做个假三维。
不过对于终端小白用户来说,如果在volcano plot和MA plot中发现了重叠的靶点(实际上会有不少重叠),那就愉快地拿去做实验吧。
- 基因丰度:基因组中某基因的拷贝数。
- 基因表达丰度:某基因转录的mRNA数量。可以用RT-qPCR来检测。
- 表达变化(fold change):就是倍数变化,假设A基因表达值为1,B表达值为3,那么B的表达就是A的3倍。一般我们都用count、TPM或FPKM来衡量基因表达水平,所以基因表达值肯定是非负数,那么fold change的取值就是(0, +∞).
- 差异的显著性:P-value来衡量。假设检验首先必须要有假设,我们假设A和B的表达没有差异(H0,零假设),然后基于此假设,通过t test(以RT-PCR为例)算出我们观测到的A和B出现的概率,就得到了P-value,如果P-value<0.05,那么说明小概率事件出现了,我们应该拒绝零假设,即A和B的表达不一样,即有显著差异。
基础知识,多多学习