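A factor is a vector object that classifies (groups) the elements of another vector of the same length. A factor is therefore itself vector-like: R stores it as a vector of integer codes together with a set of level labels.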
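Creating a factor and viewing its levels
factor(): converts a vector into a factor
levels(): returns the levels of a factor
> gender=c(rep("female",3),rep("male",3))
> genderf=factor(gender)  # convert the vector to a factor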
> gender
[1] "female" "female" "female" "male"   "male"   "male"
> genderf
[1] female female female male   male   male
Levels: female male
> levels(genderf)  # view the levels of the factor
[1] "female" "male"
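The integer codes behind a factor, and the order of its levels, can be inspected directly. The following is a minimal sketch using the same gender vector; codes, n, genderf2, and counts are illustrative names that do not appear in the original example.

```r
gender <- c(rep("female", 3), rep("male", 3))
genderf <- factor(gender)

# Integer codes stored underneath the labels: one code per element
codes <- as.integer(genderf)

# Number of distinct levels
n <- nlevels(genderf)

# By default the levels are the sorted unique values;
# pass levels= to choose the order yourself
genderf2 <- factor(gender, levels = c("male", "female"))
print(levels(genderf2))

# Number of elements in each level
counts <- table(genderf)
print(counts)
```

Setting the level order explicitly matters whenever results are reported per level, because grouped output such as that of tapply() follows the order of the levels.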
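Application: computing group means and standard errors
tapply(): Apply a function to each cell of a ragged array.
tapply(x, index, function): x is an object such as a vector; index is a factor of the same length as x; function is the function to apply, which can be built-in or user-defined. In R's own signature these arguments are named X, INDEX, and FUN. The call applies function separately to each group of elements of x defined by the factor.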
> gender=c(rep("female",3),rep("male",3))
> genderf=factor(gender)
> height=c(165,160,168,170,175,180)
# Compute the mean with a built-in R function
> htmean=tapply(height,genderf,mean)
> htmean
  female     male
164.3333 175.0000
# Compute the mean with a user-defined function
> mymean <- function(x) sum(x)/length(x)
> myhtmean=tapply(height,genderf, mymean)
> myhtmean
  female     male
164.3333 175.0000
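# Compute the standard error with a user-defined function
# (note: defining a function named stderr masks base R's stderr() for this session)
> stderr <- function(x) sqrt(var(x)/length(x))  # equivalent to sd(x)/sqrt(length(x))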
> htster=tapply(height,genderf,stderr)
> htster
  female     male
2.333333 2.886751
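One way to see what tapply() does is to carry out the grouping and the function application as two separate steps. The sketch below assumes the same height and genderf data as above; split() and sapply() are base R functions, while std_error is a hypothetical helper name chosen here so that base R's stderr() is not masked.

```r
height <- c(165, 160, 168, 170, 175, 180)
genderf <- factor(c(rep("female", 3), rep("male", 3)))

# Step 1: split() breaks height into a named list,
# with one numeric vector per factor level
groups <- split(height, genderf)

# Step 2: sapply() applies the function to each group
# and simplifies the result to a named vector
means_split <- sapply(groups, mean)

# The one-step tapply() call, for comparison
means_tapply <- tapply(height, genderf, mean)

# Hypothetical helper: standard error of the mean
std_error <- function(x) sd(x) / sqrt(length(x))
se_split <- sapply(groups, std_error)

print(means_split)
print(means_tapply)
print(se_split)
```

Both routes give the same per-group numbers. tapply() is the more compact form, while the split() version makes it easier to inspect the individual groups before summarizing them.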