CART is a binary tree (used for classification or regression)
Node splitting in a classification tree
Based on the Gini index
Dataset: predicting marital status (S = single, M = married)
| ID | Occupation | Marital Status |
|---|---|---|
| 1 | Student | S |
| 2 | Student | S |
| 3 | Teacher | M |
| 4 | Officer | M |
| 5 | Officer | M |
| 6 | Teacher | S |
| 7 | Student | M |
Demonstration:

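Choose the split with the lowest size-weighted Gini index
The final choice is the partition {Officer} vs. {Student, Teacher} (weighted Gini ≈ 0.343, against ≈ 0.405 for {Student} vs. the rest and ≈ 0.486 for {Teacher} vs. the rest)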
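The demonstration can be reproduced with a short script. This is a minimal sketch, not a reference implementation: it enumerates every two-way partition of the Occupation values and scores each one by the size-weighted Gini impurity of the two children, where the impurity of a node is 1 minus the sum of squared class proportions. The helper names (`gini`, `weighted_gini`, `binary_partitions`, `best_categorical_split`) are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations

# Rows from the marital-status table: (occupation, marital status)
rows = [
    ("Student", "S"), ("Student", "S"), ("Teacher", "M"), ("Officer", "M"),
    ("Officer", "M"), ("Teacher", "S"), ("Student", "M"),
]

def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions (0 for an empty node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def weighted_gini(left, right):
    """Average of the children's impurities, weighted by child size."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

def binary_partitions(categories):
    """Yield each two-way partition of the categories exactly once."""
    cats = sorted(categories)
    first, rest = cats[0], cats[1:]
    # Pinning `first` to the left side avoids mirrored duplicates,
    # and taking fewer than len(rest) extras keeps the right side non-empty.
    for r in range(len(rest)):
        for extra in combinations(rest, r):
            left = frozenset((first, *extra))
            yield left, frozenset(cats) - left

def best_categorical_split(rows):
    """Score every partition and return (best partition, all scores)."""
    scores = {}
    for left_cats, right_cats in binary_partitions({occ for occ, _ in rows}):
        left = [y for occ, y in rows if occ in left_cats]
        right = [y for occ, y in rows if occ in right_cats]
        scores[(left_cats, right_cats)] = weighted_gini(left, right)
    return min(scores, key=scores.get), scores

best, scores = best_categorical_split(rows)
for (left_cats, right_cats), score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(sorted(left_cats), sorted(right_cats), round(score, 3))
```

On the table above the lowest score belongs to {Officer} vs. {Student, Teacher}, because the Officer child is pure (both rows are M).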
Node splitting in a regression tree
Based on variance
Dataset: predicting age
| ID | Occupation | Age |
|---|---|---|
| 1 | Student | 12 |
| 2 | Student | 18 |
| 3 | Teacher | 26 |
| 4 | Officer | 47 |
| 5 | Officer | 36 |
| 6 | Teacher | 29 |
| 7 | Student | 21 |
Demonstration:

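Choose the split that gives the lowest variance of Age within the two child nodes
The final choice is the partition {Officer} vs. {Student, Teacher}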
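The regression case can be checked the same way. The notes do not say how the variances of the two children are combined, so this sketch assumes a size-weighted average of the population variances, which is equivalent to minimizing the total squared error around each child's mean. The helper names (`weighted_variance`, `binary_partitions`, `best_regression_split`) are hypothetical.

```python
from itertools import combinations
from statistics import pvariance

# Rows from the age table: (occupation, age)
rows = [
    ("Student", 12), ("Student", 18), ("Teacher", 26), ("Officer", 47),
    ("Officer", 36), ("Teacher", 29), ("Student", 21),
]

def weighted_variance(left, right):
    """Size-weighted average of the children's population variances.

    Assumption: this weighting is one common reading of "based on variance";
    multiplying the result by n gives the total squared error of the split.
    """
    n = len(left) + len(right)
    return (len(left) * pvariance(left) + len(right) * pvariance(right)) / n

def binary_partitions(categories):
    """Yield each two-way partition of the categories exactly once."""
    cats = sorted(categories)
    first, rest = cats[0], cats[1:]
    for r in range(len(rest)):  # r < len(rest) keeps the right side non-empty
        for extra in combinations(rest, r):
            left = frozenset((first, *extra))
            yield left, frozenset(cats) - left

def best_regression_split(rows):
    """Score every partition and return (best partition, all scores)."""
    scores = {}
    for left_cats, right_cats in binary_partitions({occ for occ, _ in rows}):
        left = [age for occ, age in rows if occ in left_cats]
        right = [age for occ, age in rows if occ in right_cats]
        scores[(left_cats, right_cats)] = weighted_variance(left, right)
    return min(scores, key=scores.get), scores

best, scores = best_regression_split(rows)
for (left_cats, right_cats), score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(sorted(left_cats), sorted(right_cats), round(score, 2))
```

Under this assumption {Officer} vs. {Student, Teacher} scores lowest, which matches the choice stated in the notes.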
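Continuous variables are split much as in C4.5: sort the values and evaluate a threshold at each candidate cut point
Dataset: predicting occupation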
| ID | Age | Occupation |
|---|---|---|
| 1 | 12 | Student |
| 2 | 18 | Student |
| 7 | 21 | Student |
| 3 | 26 | Teacher |
| 6 | 29 | Teacher |
| 5 | 36 | Officer |
| 4 | 47 | Officer |
Demonstration:

Choose the split with the lowest size-weighted Gini index
The final choice is the partition {<26, >=26} (weighted Gini ≈ 0.286)
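The threshold search can be sketched as follows. To match the {<26, >=26} notation used here, the sketch takes each observed Age value as a candidate threshold; using midpoints between consecutive values is another common convention and picks the same partition of the rows. The helper names (`gini`, `best_threshold`) are hypothetical.

```python
from collections import Counter

# Rows from the occupation table, already sorted by age: (age, occupation)
rows = [
    (12, "Student"), (18, "Student"), (21, "Student"), (26, "Teacher"),
    (29, "Teacher"), (36, "Officer"), (47, "Officer"),
]

def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions (0 for an empty node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_threshold(rows):
    """Try each observed value t as a split `age < t` vs. `age >= t`."""
    n = len(rows)
    values = sorted({age for age, _ in rows})
    scores = {}
    # Skip the smallest value: `age < min` would leave the left child empty.
    for t in values[1:]:
        left = [y for age, y in rows if age < t]
        right = [y for age, y in rows if age >= t]
        scores[t] = len(left) / n * gini(left) + len(right) / n * gini(right)
    return min(scores, key=scores.get), scores

threshold, scores = best_threshold(rows)
for t, score in sorted(scores.items()):
    print(f"age < {t} vs. age >= {t}: weighted Gini = {score:.3f}")
print("best threshold:", threshold)
```

The threshold 26 wins because the left child contains only Students, while the right child is an even mix of Teachers and Officers.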