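Frequency tables
library(vcd) # the Arthritis dataset ships with the vcd package
mytable <- with(Arthritis, table(Improved)) # build a simple one-way frequency table
mytable
Improved
  None   Some Marked
    42     14     28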
prop.table(mytable) # convert the frequencies to proportions
Improved
     None      Some    Marked
0.5000000 0.1666667 0.3333333
prop.table(mytable)*100 # convert the frequencies to percentages
Improved
    None     Some   Marked
50.00000 16.66667 33.33333
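To see what prop.table() is doing here, the short sketch below rebuilds the one-way table from the counts printed above (so it runs without the vcd package) and checks that the proportions are simply each count divided by the total. The object names counts, props and manual are only illustrative.

```r
# Counts copied from the output above; no package is needed for this check
counts <- c(None = 42, Some = 14, Marked = 28)
mytable <- as.table(counts)

# prop.table() on a one-way table divides every count by the grand total
props <- prop.table(mytable)
manual <- mytable / sum(mytable)

all.equal(as.vector(props), as.vector(manual))

# Percentages rounded to one decimal place for reporting
round(props * 100, 1)
```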
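Two-way contingency tables
Basic usage
mytable <- xtabs(~ Treatment+Improved, data=Arthritis) # for a two-way table, use mytable <- table(A, B), where A is the row variable and B is the column variable;
# or use mytable <- xtabs(~ A + B, data=mydata), where mydata is a matrix or a data frame; the variables to be cross-classified go on the right-hand side of the formula (to the right of ~)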
mytable
         Improved
Treatment None Some Marked
  Placebo   29    7      7
  Treated   13    7     21
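The formula interface also works when the data are already counted. As a small sketch, assuming a hypothetical data frame df with a Freq column (built here from the counts printed above rather than from Arthritis itself), putting the count variable on the left-hand side of ~ makes xtabs() sum it within each cell:

```r
# Hypothetical pre-tabulated data: one row per cell, counts taken from the table above
df <- data.frame(
  Treatment = rep(c("Placebo", "Treated"), times = 3),
  Improved  = rep(c("None", "Some", "Marked"), each = 2),
  Freq      = c(29, 13, 7, 7, 7, 21)
)

# Fix the level order so the columns print as None, Some, Marked
df$Improved <- factor(df$Improved, levels = c("None", "Some", "Marked"))

# Left-hand side = counts to sum, right-hand side = classifying variables
tab <- xtabs(Freq ~ Treatment + Improved, data = df)
tab
```

With one row per patient (as in Arthritis) the left-hand side is left empty and xtabs() simply counts rows.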
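prop.table(mytable, 1) # row proportions; "1" refers to the first variable in the table() statement
         Improved
Treatment      None      Some    Marked
  Placebo 0.6744186 0.1627907 0.1627907
  Treated 0.3170732 0.1707317 0.5121951
# Result: 51.22% of the patients in the Treated group showed marked improvement, compared with 16.28% in the Placebo group.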
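The second argument of prop.table() decides which totals the cells are divided by. The sketch below rebuilds the two-way table from the printed counts (an assumption made so the snippet runs without vcd) and compares row and column proportions; row_props and col_props are illustrative names.

```r
# Counts copied from the two-way table above (matrix() fills column by column)
mytable <- as.table(matrix(
  c(29, 13, 7, 7, 7, 21), nrow = 2,
  dimnames = list(Treatment = c("Placebo", "Treated"),
                  Improved  = c("None", "Some", "Marked"))
))

row_props <- prop.table(mytable, 1) # divide by row totals: each row sums to 1
col_props <- prop.table(mytable, 2) # divide by column totals: each column sums to 1

rowSums(row_props)
colSums(col_props)

# Manual check of the figure quoted above: Marked within the Treated row
mytable["Treated", "Marked"] / sum(mytable["Treated", ])
```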
margin.table(mytable, 2) # marginal frequencies; "2" refers to the second variable in the table() statement
Improved
  None   Some Marked
    42     14     28
prop.table(mytable) # with no margin argument, each cell is given as a proportion of the grand total
         Improved
Treatment       None       Some     Marked
  Placebo 0.34523810 0.08333333 0.08333333
  Treated 0.15476190 0.08333333 0.25000000
addmargins(mytable) # add marginal sums to the table
         Improved
Treatment None Some Marked Sum
  Placebo   29    7      7  43
  Treated   13    7     21  41
  Sum       42   14     28  84
addmargins(prop.table(mytable))
         Improved
Treatment       None       Some     Marked        Sum
  Placebo 0.34523810 0.08333333 0.08333333 0.51190476
  Treated 0.15476190 0.08333333 0.25000000 0.48809524
  Sum     0.50000000 0.16666667 0.33333333 1.00000000
addmargins(prop.table(mytable, 1), 2) # by default addmargins() adds sums for every variable in the table; passing 2 adds only the Sum column
         Improved
Treatment      None      Some    Marked       Sum
  Placebo 0.6744186 0.1627907 0.1627907 1.0000000
  Treated 0.3170732 0.1707317 0.5121951 1.0000000
addmargins(prop.table(mytable, 2), 1)
         Improved
Treatment      None      Some    Marked
  Placebo 0.6904762 0.5000000 0.2500000
  Treated 0.3095238 0.5000000 0.7500000
  Sum     1.0000000 1.0000000 1.0000000
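The margin argument of addmargins() is easy to mix up, so here is a small sketch that checks the shape of the result in each case. The table is rebuilt from the printed counts so the snippet is self-contained; with_col, with_row and both are illustrative names.

```r
# Counts copied from the two-way table above
mytable <- as.table(matrix(
  c(29, 13, 7, 7, 7, 21), nrow = 2,
  dimnames = list(Treatment = c("Placebo", "Treated"),
                  Improved  = c("None", "Some", "Marked"))
))

with_col <- addmargins(mytable, 2) # extends dimension 2: adds a Sum column
with_row <- addmargins(mytable, 1) # extends dimension 1: adds a Sum row
both     <- addmargins(mytable)    # default: margins for every dimension

dim(with_col) # 2 rows, 4 columns
dim(with_row) # 3 rows, 3 columns
both["Sum", "Sum"] # grand total
```

A Sum margin only reads naturally when it runs along the dimension whose proportions add up to 1, which is why row proportions are paired with margin 2 and column proportions with margin 1.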
Creating a two-way contingency table with CrossTable
install.packages("gmodels")
library(gmodels)
CrossTable(Arthritis$Treatment, Arthritis$Improved)
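The output shows that each cell reports several statistics: the count, the cell's chi-square contribution, the row and column proportions, and the proportion of the table total.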
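For readers without gmodels installed, the per-cell chi-square value can be reproduced in base R. The sketch below assumes that value is the usual Pearson contribution, (observed - expected)^2 / expected, and rebuilds the table from the counts shown earlier; expected and contrib are illustrative names.

```r
# Counts copied from the two-way table shown earlier
mytable <- as.table(matrix(
  c(29, 13, 7, 7, 7, 21), nrow = 2,
  dimnames = list(Treatment = c("Placebo", "Treated"),
                  Improved  = c("None", "Some", "Marked"))
))

# Expected counts under independence: row total * column total / grand total
expected <- outer(rowSums(mytable), colSums(mytable)) / sum(mytable)

# Contribution of each cell to the Pearson chi-square statistic
contrib <- (mytable - expected)^2 / expected
round(contrib, 3)

# The contributions add up to the statistic reported by chisq.test()
sum(contrib)
chisq.test(mytable)$statistic
```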
Multidimensional contingency tables
> mytable <- xtabs(~ Treatment+Sex+Improved, data=Arthritis)
> mytable
, , Improved = None

         Sex
Treatment Female Male
  Placebo     19   10
  Treated      6    7

, , Improved = Some

         Sex
Treatment Female Male
  Placebo      7    0
  Treated      5    2

, , Improved = Marked

         Sex
Treatment Female Male
  Placebo      6    1
  Treated     16    5
# This step tabulates the cell frequencies for every combination of the three variables
> ftable(mytable)
                 Improved None Some Marked
Treatment Sex
Placebo   Female            19    7      6
          Male              10    0      1
Treated   Female             6    5     16
          Male               7    2      5
# This shows the benefit of ftable(): it prints the table in a compact form that is easy to scan
> margin.table(mytable, 1)
Treatment
Placebo Treated
     43      41
# Marginal totals for the first variable, Treatment: 43 for Placebo and 41 for Treated
> margin.table(mytable, 2)
Sex
Female   Male
    59     25
# Marginal totals for the second variable, Sex: 59 for Female and 25 for Male
> margin.table(mytable, 3)
Improved
  None   Some Marked
    42     14     28
# Marginal totals for the third variable, Improved: 42 for None, 14 for Some and 28 for Marked
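> margin.table(mytable, c(1,3))
         Improved
Treatment None Some Marked
  Placebo   29    7      7
  Treated   13    7     21
# Totals for the cross-classification of the first variable (Treatment) and the third variable (Improved), summed over Sex; for example, 7 patients in the Placebo group showed marked improvement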
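To make it clearer what margin.table() does on a three-way table, the sketch below rebuilds the table from the printed counts (so it runs without vcd) and checks that keeping dimensions 1 and 3 is the same as summing over the remaining dimension, Sex. The name by_apply is only illustrative.

```r
# Counts copied from the three-way table above; array() fills the first dimension fastest
mytable <- as.table(array(
  c(19, 6, 10, 7,   # Improved = None:   Female (Placebo, Treated), then Male
    7, 5, 0, 2,     # Improved = Some
    6, 16, 1, 5),   # Improved = Marked
  dim = c(2, 2, 3),
  dimnames = list(Treatment = c("Placebo", "Treated"),
                  Sex       = c("Female", "Male"),
                  Improved  = c("None", "Some", "Marked"))
))

# margin.table() keeps the listed dimensions and sums over the others
by_margin <- margin.table(mytable, c(1, 3))
by_apply  <- apply(mytable, c(1, 3), sum)

all.equal(as.vector(by_margin), as.vector(by_apply))
by_margin["Placebo", "Marked"] # 6 females + 1 male
```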
> ftable(prop.table(mytable,c(1,2)))
                 Improved       None       Some     Marked
Treatment Sex
Placebo   Female          0.59375000 0.21875000 0.18750000
          Male            0.90909091 0.00000000 0.09090909
Treated   Female          0.22222222 0.18518519 0.59259259
          Male            0.50000000 0.14285714 0.35714286
# Proportion of each Improved level within every Treatment x Sex combination
> ftable(addmargins(prop.table(mytable,c(1,2)),3))
                 Improved       None       Some     Marked        Sum
Treatment Sex
Placebo   Female          0.59375000 0.21875000 0.18750000 1.00000000
          Male            0.90909091 0.00000000 0.09090909 1.00000000
Treated   Female          0.22222222 0.18518519 0.59259259 1.00000000
          Male            0.50000000 0.14285714 0.35714286 1.00000000
# The same proportions within each Treatment x Sex combination, with a marginal sum added over Improved
> ftable(addmargins(prop.table(mytable,c(1,2)),3))*100
                 Improved       None       Some     Marked        Sum
Treatment Sex
Placebo   Female           59.375000  21.875000  18.750000 100.000000
          Male             90.909091   0.000000   9.090909 100.000000
Treated   Female           22.222222  18.518519  59.259259 100.000000
          Male             50.000000  14.285714  35.714286 100.000000
# The same table expressed as percentages, again with the marginal sum over Improved
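As a final check on what prop.table(mytable, c(1,2)) computes, the sketch below rebuilds the three-way table from the printed counts and divides each cell by its Treatment x Sex total using sweep(). It is only an illustration; group_totals and manual are made-up names.

```r
# Counts copied from the three-way table above; array() fills the first dimension fastest
mytable <- as.table(array(
  c(19, 6, 10, 7, 7, 5, 0, 2, 6, 16, 1, 5),
  dim = c(2, 2, 3),
  dimnames = list(Treatment = c("Placebo", "Treated"),
                  Sex       = c("Female", "Male"),
                  Improved  = c("None", "Some", "Marked"))
))

props <- prop.table(mytable, c(1, 2))

# Same calculation by hand: divide every cell by its Treatment x Sex total
group_totals <- margin.table(mytable, c(1, 2))
manual <- sweep(mytable, c(1, 2), group_totals, "/")

all.equal(as.vector(props), as.vector(manual))

# Within each Treatment x Sex combination the proportions add up to 1
apply(props, c(1, 2), sum)

# Treated females with marked improvement: 16 out of 27
props["Treated", "Female", "Marked"]
```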
Alright, folks, that's all for today. See you next time! O(∩_∩)O