library(ggstatsplot)
library(ggplot2)
library(dplyr)
data("diamonds")
diamonds2 <- diamonds %>%
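  # Keep the diamonds whose color is J, H, or F and whose clarity is SI2, VS1, or IF, and save them as diamonds2.
  # Caution: color == c('J', 'H', 'F') recycles the vector row by row, so it keeps only roughly a third
  # of the J/H/F rows; color %in% c('J', 'H', 'F') would keep all of them.
  filter(color == c('J', 'H', 'F'), clarity %in% c('SI2', 'VS1', 'IF'))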
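The difference between `==` and `%in%` when the right-hand side is a vector can be seen on a toy example. This is only a sketch; the `color` vector below is made up for illustration and is not taken from `diamonds`.

```r
# Hypothetical stand-in for the color column (not real data).
color <- c("J", "H", "F", "F", "J", "H")

# `==` compares position by position, recycling c("J", "H", "F") as J, H, F, J, H, F.
eq_match <- color == c("J", "H", "F")

# `%in%` asks, for each element, whether it appears anywhere in the set.
in_match <- color %in% c("J", "H", "F")

sum(eq_match)  # rows kept by `==`
sum(in_match)  # rows kept by `%in%`
```

Every value in the toy vector is J, H, or F, yet `==` keeps only the rows whose position happens to line up with the recycled vector.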
ggbarstats(diamonds2, color, clarity, palette = 'Set2')
# Statistical output:
Note: 95% CI for effect size estimate was computed with 100 bootstrap samples.
Note: Results from one-sample proportion tests for each level of the variable
clarity testing for equal proportions of the variable color.
# A tibble: 3 x 9
  condition N          F      H      J      `Chi-squared`    df `p-value` significance
  <ord>     <chr>      <chr>  <chr>  <chr>          <dbl> <dbl>     <dbl> <chr>
1 SI2       (n = 1208) 45.20% 41.72% 13.08%         225.      2         0 ***
2 VS1       (n = 966)  46.38% 38.20% 15.42%         149.      2         0 ***
3 IF        (n = 251)  53.39% 39.44% 7.17%           84.6     2         0 ***
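- As the plot shows, the chi-squared statistic is 15.01 and p = 0.005, which is below the 0.05 significance level, so diamond color and clarity can be regarded as not independent, i.e. associated.
- Within each clarity group, the counts of the different colors differ significantly. Each bar is topped with three stars ("***"), and the chi-squared values are 225, 149, and 84.6, all above 5.99, the critical value of the chi-squared distribution with 2 degrees of freedom at α = 0.05, so p < 0.05 in every group.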
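The figures quoted in the two notes above can be cross-checked in base R. This is a sketch under one assumption: the cell counts are reconstructed here from the percentages and group sizes printed in the table (for example 45.20% of 1208 is 546), not read from the original data.

```r
# Counts reconstructed from the printed percentages and group sizes (an assumption).
counts <- matrix(
  c(546, 504, 158,   # SI2: F, H, J
    448, 369, 149,   # VS1: F, H, J
    134,  99,  18),  # IF:  F, H, J
  nrow = 3,
  dimnames = list(color = c("F", "H", "J"), clarity = c("SI2", "VS1", "IF"))
)

# Overall test of independence between color and clarity (df = 4).
overall <- chisq.test(counts)
overall$statistic
overall$p.value

# Critical value of the chi-squared distribution with df = 2 at alpha = 0.05.
crit <- qchisq(0.95, df = 2)

# One-sample test of equal proportions within each clarity group.
within <- apply(counts, 2, function(x) unname(chisq.test(x)$statistic))
within
within > crit
```

With these reconstructed counts, the overall statistic comes out close to the 15.01 shown on the plot, and each within-group statistic exceeds the critical value.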
ggpiestats(diamonds2, color, clarity, palette = 'Set3')
# Statistical output:
Note: 95% CI for effect size estimate was computed with 100 bootstrap samples.
Note: Results from one-sample proportion tests for each level of the variable
clarity testing for equal proportions of the variable color.
# A tibble: 3 x 9
  condition N          F      H      J      `Chi-squared`    df `p-value` significance
  <ord>     <chr>      <chr>  <chr>  <chr>          <dbl> <dbl>     <dbl> <chr>
1 SI2       (n = 1208) 45.20% 41.72% 13.08%         225.      2         0 ***
2 VS1       (n = 966)  46.38% 38.20% 15.42%         149.      2         0 ***
3 IF        (n = 251)  53.39% 39.44% 7.17%           84.6     2         0 ***
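- The statistics in this plot are the same as in the bar chart above; only the bars have been replaced by pies.
- Plots like these make it quick and convenient to visualize the statistics. Besides the chi-squared statistic and the p-value, they show the distribution within each group. For example, within the SI2 group, color J accounts for 13% of the diamonds and color F has the largest share at 45%. Color F also has the largest share within the VS1 and IF groups, at 46% and 53% respectively.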
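The within-group shares read off the plots can also be computed directly with `prop.table()`. As a sketch, the counts below are reconstructed from the printed percentages and group sizes, so they are an assumption rather than values taken from the data.

```r
# Counts reconstructed from the printed percentages and group sizes (an assumption).
counts <- matrix(
  c(546, 504, 158,   # SI2: F, H, J
    448, 369, 149,   # VS1: F, H, J
    134,  99,  18),  # IF:  F, H, J
  nrow = 3,
  dimnames = list(color = c("F", "H", "J"), clarity = c("SI2", "VS1", "IF"))
)

# margin = 2 normalises each column, giving the share of each color within a clarity group.
pct <- round(100 * prop.table(counts, margin = 2), 2)
pct

# Color with the largest share in each clarity group.
top_color <- rownames(pct)[apply(pct, 2, which.max)]
top_color
```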
# diamonds2[diamonds2$cut != 'Very Good', ] drops the rows whose cut is Very Good.
# simulate.p.value = TRUE requests a simulated p-value, because some cells are empty: in the Fair cut, the IF group contains no J or H diamonds.
grouped_ggpiestats(diamonds2[diamonds2$cut != 'Very Good', ], color, clarity, grouping.var = cut, simulate.p.value = TRUE)
# Statistical output:
Note: 95% CI for effect size estimate was computed with 100 bootstrap samples.
Note: Results from one-sample proportion tests for each level of the variable
clarity testing for equal proportions of the variable color.
# A tibble: 3 x 9
  condition N     F     H     J     `Chi-squared`    df `p-value` significance
  <ord>     <chr> <chr> <chr> <chr>         <dbl> <dbl>     <dbl> <chr>
1 SI2       (n =~ 47.7~ 41.7~ 10.4~         16.1      2     0     ***
2 VS1       (n =~ 42.8~ 35.7~ 21.4~          2        2     0.368 ns
3 IF        (n =~ 100.~ NA    NA             6        2     0.05  ns
Note: 95% CI for effect size estimate was computed with 100 bootstrap samples.
Note: Results from one-sample proportion tests for each level of the variable
clarity testing for equal proportions of the variable color.
# A tibble: 3 x 9
  condition N     F     H     J     `Chi-squared`    df `p-value` significance
  <ord>     <chr> <chr> <chr> <chr>         <dbl> <dbl>     <dbl> <chr>
1 SI2       (n =~ 49.6~ 35.7~ 14.6~         25.6      2     0     ***
2 VS1       (n =~ 48.1~ 31.3~ 20.4~          9.71     2     0.008 **
3 IF        (n =~ 69.2~ 15.3~ 15.3~          7.54     2     0.023 *
Note: 95% CI for effect size estimate was computed with 100 bootstrap samples.
Note: Results from one-sample proportion tests for each level of the variable
clarity testing for equal proportions of the variable color.
# A tibble: 3 x 9
  condition N     F     H     J     `Chi-squared`    df `p-value` significance
  <ord>     <chr> <chr> <chr> <chr>         <dbl> <dbl>     <dbl> <chr>
1 SI2       (n =~ 44.5~ 42.0~ 13.3~         71.7      2     0     ***
2 VS1       (n =~ 41.5~ 41.5~ 16.8~         29.6      2     0     ***
3 IF        (n =~ 40.0~ 48.0~ 12.0~          5.36     2     0.069 ns
Note: 95% CI for effect size estimate was computed with 100 bootstrap samples.
Note: Results from one-sample proportion tests for each level of the variable
clarity testing for equal proportions of the variable color.
# A tibble: 3 x 9
  condition N     F     H     J     `Chi-squared`    df `p-value` significance
  <ord>     <chr> <chr> <chr> <chr>         <dbl> <dbl>     <dbl> <chr>
1 SI2       (n =~ 45.4~ 44.6~ 9.91%         84.7      2         0 ***
2 VS1       (n =~ 49.0~ 38.5~ 12.5~         84.7      2         0 ***
3 IF        (n =~ 52.5~ 42.3~ 5.08%         66.3      2         0 ***
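What a simulated p-value means for a sparse table, as in the grouped call, can be sketched with base R's `chisq.test()`. This assumes that the `simulate.p.value` option behaves like the argument of the same name in `stats::chisq.test`. The counts below are hypothetical and chosen only to include empty cells; they are not taken from the output above.

```r
# Hypothetical 3 x 3 table of color (rows) by clarity (columns) with empty cells.
sparse <- matrix(
  c(20, 18, 5,   # SI2: F, H, J
     6,  5, 3,   # VS1: F, H, J
     3,  0, 0),  # IF:  only color F is present
  nrow = 3,
  dimnames = list(color = c("F", "H", "J"), clarity = c("SI2", "VS1", "IF"))
)

set.seed(1)  # make the simulated p-value reproducible

# Monte Carlo p-value based on B simulated tables with the same margins.
sim <- chisq.test(sparse, simulate.p.value = TRUE, B = 2000)
sim$p.value
sim$parameter  # NA: no degrees of freedom are reported for a simulated p-value
```

Without simulation, `chisq.test()` on a table this sparse typically warns that the chi-squared approximation may be incorrect, which is the situation the option is meant to handle.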