There are several main ways to recode a variable:
1. Hard-coding with logical indexing
leadership$age[leadership$age == 99] <- NA  # treat the code 99 as a missing value
leadership$agecat[leadership$age > 75] <- "Elder"
leadership$agecat[leadership$age >= 55 &
leadership$age <= 75] <- "Middle Aged"
leadership$agecat[leadership$age < 55] <- "Young"
This code can be written more compactly:
leadership$age[leadership$age == 99] <- NA  # treat the code 99 as a missing value
leadership <- within(leadership, {
  agecat <- NA
  agecat[age > 75] <- "Elder"
  agecat[age >= 55 & age <= 75] <- "Middle Aged"
  agecat[age < 55] <- "Young"
})
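To check the behaviour at the boundaries, the compact version can be run on a small data frame. This is only a sketch: the manager IDs and ages below are made up for illustration and are not taken from the original leadership data.

```r
# Hypothetical sample data (assumption: ages chosen to hit the 55 and 75 boundaries)
leadership <- data.frame(manager = 1:6,
                         age     = c(32, 55, 75, 80, 99, 54))

# Treat the code 99 as a missing value before categorising
leadership$age[leadership$age == 99] <- NA

# Build the category column inside within(), where age refers to the column
leadership <- within(leadership, {
  agecat <- NA
  agecat[age > 75] <- "Elder"
  agecat[age >= 55 & age <= 75] <- "Middle Aged"
  agecat[age < 55] <- "Young"
})

print(leadership)
```

The row whose age was 99 keeps NA in agecat, because a comparison with NA is itself NA and so selects nothing in the assignment.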
2. The built-in cut function
num <- seq(1, 100)
numcut <- cut(num, c(-Inf, 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, Inf))
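By default cut builds intervals that are open on the left and closed on the right, which matters when choosing break points. The sketch below counts how many values fall into each bin, then reuses cut for the age categories from method 1. The age vector is made up, and the break at 54 assumes ages are whole numbers.

```r
num <- seq(1, 100)
numcut <- cut(num, c(-Inf, seq(0, 100, by = 10), Inf))

# Count the values in each interval: the two outer bins stay empty
counts <- as.integer(table(numcut))
print(counts)

# Hypothetical ages (assumption: whole numbers, NA already marks missing values)
age <- c(32, 55, 75, 80, NA, 54)

# Intervals are (-Inf,54], (54,75], (75,Inf], so 55 and 75 both map to "Middle Aged"
agecat <- cut(age,
              breaks = c(-Inf, 54, 75, Inf),
              labels = c("Young", "Middle Aged", "Elder"))
print(agecat)
```

Unlike the hard-coded version, cut returns a factor, so the categories carry an explicit level order.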
3. The recode function in the car package
# Usage: recode(var, recodes, as.factor.result, as.numeric.result=TRUE, levels)
install.packages("car")
library(car)
x<-c(10:100)
recode(x, "lo:60='C'; 61:80='B'; 81:hi='A'; else='NULL'")  # lo and hi stand for the minimum and maximum values
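For comparison, the same grade mapping can be sketched in base R without the car package. This is a swapped-in technique using findInterval, not what recode does internally, and it assumes whole-number scores so that no value falls between 60 and 61 or between 80 and 81.

```r
x <- c(10:100)

# findInterval returns 0 for x < 61, 1 for 61 <= x < 81, and 2 for x >= 81
bin <- findInterval(x, c(61, 81))

# Map bin 0, 1, 2 to the grades C, B, A
grade <- c("C", "B", "A")[bin + 1]

print(table(grade))
```

Because every whole number from 10 to 100 lands in one of the three ranges, the else branch of the recode specification is never needed here.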
4. The recodeVar function in the doBy package