Elegantly Converting P-values in R

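In data analysis we routinely run statistical tests, but the results usually come back as long floating-point numbers that are hard to read at a glance. This section explains how to use the lucid function from the lucid package to reformat the output so that p-values are easier to read.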

Installing and loading the R packages

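# Packages used in this post
package.list <- c("tidyverse", "lucid", "broom")

# Load each package, installing it first if it is missing
for (package in package.list) {
  if (!require(package, character.only = TRUE, quietly = TRUE)) {
    install.packages(package)
    library(package, character.only = TRUE)
  }
}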

Viewing the data

Orange %>% group_by(Tree) %>% 
  do(tidy(lm(circumference ~ age, data=.))) %>% as.data.frame

As the output shows, the p-values are returned in scientific notation, which is hard to read at a glance:

   Tree        term    estimate    std.error statistic      p.value
1     3 (Intercept) 19.20353638  5.863410215  3.275148 2.207255e-02
2     3         age  0.08111158  0.005628105 14.411881 2.901046e-05
3     1 (Intercept) 24.43784664  6.543311039  3.734783 1.350409e-02
4     1         age  0.08147716  0.006280721 12.972581 4.851902e-05
5     5 (Intercept)  8.75834459  8.176436207  1.071169 3.330518e-01
6     5         age  0.11102891  0.007848307 14.146861 3.177093e-05
7     2 (Intercept) 19.96090337  9.352361105  2.134317 8.593318e-02
8     2         age  0.12506176  0.008977041 13.931291 3.425041e-05
9     4 (Intercept) 14.63762022 11.233762751  1.303002 2.493507e-01
10    4         age  0.13517222  0.010782940 12.535748 5.733090e-05

Reformatting with lucid

Orange %>% group_by(Tree) %>% 
  do(tidy(lm(circumference ~ age, data=.))) %>% as.data.frame %>% lucid

   Tree  term        estimate  std.error  statistic p.value    
   <ord> <chr>       <chr>     <chr>      <chr>     <chr>      
 1 3     (Intercept) "19.2   " " 5.86   " " 3.28"   "0.0221   "
 2 3     age         " 0.0811" " 0.00563" "14.4 "   "0.000029 "
 3 1     (Intercept) "24.4   " " 6.54   " " 3.73"   "0.0135   "
 4 1     age         " 0.0815" " 0.00628" "13   "   "0.0000485"
 5 5     (Intercept) " 8.76  " " 8.18   " " 1.07"   "0.333    "
 6 5     age         " 0.111 " " 0.00785" "14.1 "   "0.0000318"
 7 2     (Intercept) "20     " " 9.35   " " 2.13"   "0.0859   "
 8 2     age         " 0.125 " " 0.00898" "13.9 "   "0.0000343"
 9 4     (Intercept) "14.6   " "11.2    " " 1.3 "   "0.249    "
10 4     age         " 0.135 " " 0.0108 " "12.5 "   "0.0000573"

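After processing with lucid, the numbers are much easier to read. Note, however, that the columns are now character strings, so they need to be converted back to numeric before any further calculation.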
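The round trip from number to formatted string and back can be sketched in base R alone. This is only an illustration: format() stands in for lucid here so that no extra package is needed, and the three p-values are copied from the regression output above.

```r
# Base-R sketch: format() is a stand-in for lucid, not the lucid function itself.
p <- c(2.207255e-02, 2.901046e-05, 3.330518e-01)

# Formatting produces padded character strings rather than numbers.
p_chr <- format(signif(p, 3), scientific = FALSE)
is.character(p_chr)

# as.numeric() ignores surrounding whitespace and restores a numeric vector.
p_num <- as.numeric(p_chr)
is.numeric(p_num)
```

The same as.numeric() call is what the pipelines in the next section apply to the p.value column after lucid.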

Converting p-values

Using the symnum function to convert p-values to significance stars

Orange %>% group_by(Tree) %>% 
  do(tidy(lm(circumference ~ age, data=.))) %>% as.data.frame %>%
  lucid %>%
  # lucid returns character columns, so convert p.value back to numeric
  mutate(pvalue=as.numeric(p.value),
         signif=as.character(symnum(pvalue, 
                       cutpoints = c(0,0.001,0.01,0.05,1), 
                       symbols = c("***","**","*"," "))))

   Tree        term estimate std.error statistic   p.value   pvalue signif
1     3 (Intercept)  19.2      5.86         3.28 0.0221    2.21e-02      *
2     3         age   0.0811   0.00563     14.4  0.000029  2.90e-05    ***
3     1 (Intercept)  24.4      6.54         3.73 0.0135    1.35e-02      *
4     1         age   0.0815   0.00628     13    0.0000485 4.85e-05    ***
5     5 (Intercept)   8.76     8.18         1.07 0.333     3.33e-01       
6     5         age   0.111    0.00785     14.1  0.0000318 3.18e-05    ***
7     2 (Intercept)  20        9.35         2.13 0.0859    8.59e-02       
8     2         age   0.125    0.00898     13.9  0.0000343 3.43e-05    ***
9     4 (Intercept)  14.6     11.2          1.3  0.249     2.49e-01       
10    4         age   0.135    0.0108      12.5  0.0000573 5.73e-05    ***
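To see how symnum treats values that fall exactly on a cut point, here is a small standalone sketch with the same cutpoints and symbols. The p-values are made up for illustration and are not from the Orange data; corr and legend are set explicitly only to make the behavior obvious.

```r
# Hypothetical p-values chosen to sit on and around the cut points.
p <- c(0.0005, 0.001, 0.005, 0.01, 0.03, 0.05, 0.2)

stars <- symnum(p,
                cutpoints = c(0, 0.001, 0.01, 0.05, 1),
                symbols   = c("***", "**", "*", " "),
                corr = FALSE, legend = FALSE)

# symnum() returns a "noquote" object; as.character() gives a plain vector.
data.frame(p = p, signif = as.character(stars))
```

Each interval is closed on the right, so a p-value of exactly 0.05 still receives one star and exactly 0.001 receives three.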

Converting p-values with a custom function and sapply

# Map a single p-value to significance stars
myfun <- function(pval) {
  if (pval <= 0.001) {
    "***"
  } else if (pval <= 0.01) {
    "**"
  } else if (pval <= 0.05) {
    "*"
  } else {
    ""
  }
}

Orange %>% group_by(Tree) %>% 
  do(tidy(lm(circumference ~ age, data=.))) %>% as.data.frame %>%
  lucid %>%
  mutate(pvalue=as.numeric(p.value)) %>% 
  # pass the numeric pvalue column, not the character p.value column
  mutate(signif = sapply(pvalue, myfun))

   Tree        term estimate std.error statistic   p.value   pvalue signif
1     3 (Intercept)  19.2      5.86         3.28 0.0221    2.21e-02      *
2     3         age   0.0811   0.00563     14.4  0.000029  2.90e-05    ***
3     1 (Intercept)  24.4      6.54         3.73 0.0135    1.35e-02      *
4     1         age   0.0815   0.00628     13    0.0000485 4.85e-05    ***
5     5 (Intercept)   8.76     8.18         1.07 0.333     3.33e-01       
6     5         age   0.111    0.00785     14.1  0.0000318 3.18e-05    ***
7     2 (Intercept)  20        9.35         2.13 0.0859    8.59e-02       
8     2         age   0.125    0.00898     13.9  0.0000343 3.43e-05    ***
9     4 (Intercept)  14.6     11.2          1.3  0.249     2.49e-01       
10    4         age   0.135    0.0108      12.5  0.0000573 5.73e-05    ***
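If the element-by-element sapply call becomes a bottleneck, one possible alternative is a vectorized version built on base R's cut(). This is a sketch and not part of the original workflow: the name p_to_stars is hypothetical, and the thresholds simply mirror those in myfun.

```r
# Vectorized sketch with the same thresholds as myfun:
# (-Inf, 0.001] "***", (0.001, 0.01] "**", (0.01, 0.05] "*", otherwise "".
p_to_stars <- function(pval) {
  as.character(cut(pval,
                   breaks = c(-Inf, 0.001, 0.01, 0.05, Inf),
                   labels = c("***", "**", "*", ""),
                   right = TRUE))
}

# Works on a whole numeric vector at once.
p_to_stars(c(0.0005, 0.001, 0.005, 0.03, 0.05, 0.0859, 0.333))
```

In the pipeline this could replace the sapply step with mutate(signif = p_to_stars(pvalue)). Unlike myfun, it returns NA for a missing p-value instead of raising an error.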
