【Description】
cut divides the range of x into intervals and codes the values in x according to which interval they fall into. The leftmost interval corresponds to level one, the next leftmost to level two, and so on.
In my view, the most important point is that cut produces a factor that can then be used for grouping.
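To illustrate the grouping-factor point, here is a minimal sketch. The `age` vector and its break points are made-up values for illustration only; `table()` and `tapply()` are base R functions.

```r
# Hypothetical data: ages of eight people (made up for illustration)
age <- c(3, 17, 25, 34, 41, 58, 63, 79)

# Cut the ages into three groups; the result is a factor
age_group <- cut(age, breaks = c(0, 18, 60, 100))

is.factor(age_group)   # TRUE
levels(age_group)      # "(0,18]" "(18,60]" "(60,100]"

# Because the result is a factor, it can be used directly for grouping
table(age_group)              # number of values in each interval
tapply(age, age_group, mean)  # mean age within each interval
```

Any function that accepts a factor as a grouping variable can take the result of cut in the same way.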
【Usage】
cut(x, ...)
## Default S3 method:
cut(x, breaks, labels = NULL,
include.lowest = FALSE, right = TRUE, dig.lab = 3,
ordered_result = FALSE, ...)
【Arguments】
x
a numeric vector which is to be converted to a factor by cutting.
breaks
either a numeric vector of two or more unique cut points or a single number (greater than or equal to 2) giving the number of intervals into which x is to be cut.
labels
labels for the levels of the resulting category. By default, labels are constructed using "(a,b]" interval notation. If labels = FALSE, simple integer codes are returned instead of a factor.
include.lowest
logical, indicating if an ‘x[i]’ equal to the lowest (or highest, for right = FALSE) ‘breaks’ value should be included.
right
logical, indicating if the intervals should be closed on the right (and open on the left) or vice versa.
dig.lab
integer which is used when labels are not given. It determines the number of digits used in formatting the break numbers.
ordered_result
logical: should the result be an ordered factor?
...
further arguments passed to or from other methods.
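The effect of include.lowest, labels = FALSE and ordered_result is easier to see on a small input. The sketch below uses made-up vectors `x` and `b` chosen so that one value sits exactly on the lowest break.

```r
x <- c(0, 2, 5, 10)   # illustrative values; 0 equals the lowest break
b <- c(0, 5, 10)

# Default: intervals are (0,5] and (5,10], so the value 0 falls outside and becomes NA
cut(x, breaks = b)

# include.lowest = TRUE closes the first interval, giving [0,5] and (5,10]
cut(x, breaks = b, include.lowest = TRUE)

# labels = FALSE returns integer codes instead of a factor
cut(x, breaks = b, include.lowest = TRUE, labels = FALSE)   # 1 1 1 2

# ordered_result = TRUE returns an ordered factor, so levels can be compared
g <- cut(x, breaks = b, include.lowest = TRUE, ordered_result = TRUE)
g < "(5,10]"   # TRUE TRUE TRUE FALSE
```

Values outside every interval are returned as NA, which is a common source of surprise when the minimum of x coincides with the first break.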
【Code】
> cut(1:10, breaks = seq(0, 10, 5)) # by default, intervals are open on the left and closed on the right
[1] (0,5] (0,5] (0,5] (0,5] (0,5] (5,10] (5,10] (5,10] (5,10] (5,10]
Levels: (0,5] (5,10]
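# breaks can be either a custom vector of cut points or a single number greater than or equal to 2. If it is a single number, R splits the range of x into that many equal-width intervals; for unequal intervals, supply the cut points yourself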
> cut(1:10, breaks = c(0, 3, 5, 8, 10))
[1] (0,3] (0,3] (0,3] (3,5] (3,5] (5,8] (5,8] (5,8] (8,10] (8,10]
Levels: (0,3] (3,5] (5,8] (8,10]
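# right controls which end of each interval is closed. The default, TRUE, means open on the left and closed on the right; FALSE means closed on the left and open on the right
# labels is a vector of labels, one per interval, which gives each interval a name
> cut(1:10, breaks = c(1, 3, 5, 8, 11), right = FALSE, labels = c('A', 'B', 'C', 'D'))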
[1] A A B B C C C D D D
Levels: A B C D
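The examples above all pass explicit cut points. As a complement, the following sketch shows the case mentioned earlier where breaks is a single number. The variable name `halves` is only illustrative.

```r
# breaks = 2 asks for two equal-width intervals over the range of 1:10
halves <- cut(1:10, breaks = 2)

nlevels(halves)   # 2
table(halves)     # 5 values fall in each interval

# The outer limits are moved slightly beyond range(x), so the minimum
# value is not lost even though include.lowest is FALSE
any(is.na(halves))   # FALSE
```

Because the outer limits are extended a little, the generated labels do not start exactly at the minimum of x. Passing explicit breaks, or a labels vector, gives tidier level names.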