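Data generalization
The district column has too many distinct values, so Pazhou is recoded as Haizhu District to generalize the data (Pazhou is part of Haizhu).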
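foo <- function(x) {
  x <- as.character(x)
  # Pazhou belongs to Haizhu, so fold it into that district
  if (grepl('Pazhou', x)) x <- 'Haizhu District'
  # Drop anything that follows the word "District"
  sub(pattern = "(District).*", replacement = "\\1", x)
}
# Apply foo to each value, then turn the result back into a factor
data$neighborhood <- lapply(data$neighborhood, foo)
data$neighborhood <- as.factor(unlist(data$neighborhood))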
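The same generalization can also be written without lapply, since grepl and sub are both vectorized. The following is only a sketch: the toy data frame and the name generalize are hypothetical and the values are invented for illustration.

```r
# Hypothetical toy data; the values are invented for illustration only
toy <- data.frame(
  neighborhood = c("Pazhou", "Tianhe District, Guangzhou", "Haizhu District"),
  stringsAsFactors = FALSE
)

generalize <- function(x) {
  x <- as.character(x)
  # Fold every Pazhou entry into its parent district
  x[grepl("Pazhou", x)] <- "Haizhu District"
  # Trim anything that follows the word "District"
  sub("(District).*", "\\1", x)
}

toy$neighborhood <- as.factor(generalize(toy$neighborhood))
levels(toy$neighborhood)
# [1] "Haizhu District" "Tianhe District"
```

Working on the whole column at once avoids the intermediate list and the unlist step.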
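String concatenation
The following statements add the prefix 'M' to every column name of the part2 data frame.
The paste function concatenates 'M' with x.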
foo1 <- function(x) paste('M', x, sep = "")
# paste is vectorized, so foo1 can take the whole vector of names at once
colnames(part2) <- foo1(names(part2))
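To check the renaming on something small, here is a sketch with a made-up data frame. The names part_demo and add_prefix are hypothetical stand-ins, not objects from the original analysis.

```r
# Hypothetical data frame standing in for part2
part_demo <- data.frame(a = 1:2, b = 3:4)

# Prefix each name; paste0(prefix, x) would be equivalent to paste(..., sep = "")
add_prefix <- function(x, prefix = "M") paste(prefix, x, sep = "")

colnames(part_demo) <- add_prefix(names(part_demo))
names(part_demo)
# [1] "Ma" "Mb"
```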
Other uses of paste
paste(c("c", "d"), c("1", "2"), sep = '+')
#[1] "c+1" "d+2"  a character vector with two string elements
paste(c("c", "d"), c("1", "2"), sep = '+', collapse = '=')
#[1] "c+1=d+2"  a single string
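A few more calls may help separate the roles of sep and collapse: sep joins the arguments element by element, while collapse joins the elements of the resulting vector into one string. The variable names below are made up for this sketch.

```r
letters_vec <- c("c", "d")
digits_vec <- c("1", "2")

# collapse alone joins the elements of a single vector
joined <- paste(letters_vec, collapse = "-")
# "c-d"

# A length-one argument is recycled against the longer one
labels <- paste("x", 1:3, sep = "_")
# "x_1" "x_2" "x_3"

# paste0 is shorthand for paste(..., sep = "")
pairs <- paste0(letters_vec, digits_vec)
# "c1" "d2"
```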