Supervised learning
Goal: To learn a classification model from the data that can be used to predict the classes of new cases.
Decision tree concepts
A decision tree will include decision nodes and leaf nodes.
All current tree-learning algorithms are heuristic
Each path from the root to a leaf is a rule
A greedy divide-and-conquer algorithm
Tree is constructed in a top-down recursive manner
Key: Which attribute to choose in order to branch
Objective: Reduce impurity or uncertainty in data
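
To make the "each path from the root to a leaf is a rule" point concrete, here is a minimal Python sketch. The nested-dict tree layout, the attribute names ("outlook", "windy"), the class labels, and the helper extract_rules are all hypothetical, chosen only for illustration.

```python
# Hypothetical toy tree: decision nodes are dicts, leaf nodes are class labels.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "windy",
            "branches": {"yes": "stay in", "no": "play"},
        },
        "rainy": "stay in",
    },
}


def extract_rules(node, conditions=()):
    """Return one (conditions, label) rule per root-to-leaf path."""
    if not isinstance(node, dict):
        # Leaf node: the conditions collected so far form a complete rule.
        return [(list(conditions), node)]
    rules = []
    for value, child in node["branches"].items():
        test = (node["attribute"], value)
        rules.extend(extract_rules(child, conditions + (test,)))
    return rules


rules = extract_rules(tree)
for conds, label in rules:
    print(" AND ".join(f"{a} = {v}" for a, v in conds), "->", label)
```

The tree has three leaves, so the sketch yields three rules, one per path.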
Steps and formulas for building a decision tree by hand
The Entropy Formula:
The Entropy of Attribute Ai:
The Information gained by selecting Ai to branch or to partition data:
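
The formulas themselves are not reproduced in these notes. The standard information-theoretic definitions that the three labels above most likely refer to are sketched below. The symbols are assumptions: D is the current data set, c_j ranges over the |C| classes, Pr(c_j) is the fraction of D in class c_j, and D_1, ..., D_v are the subsets of D produced by the v values of attribute A_i.

```latex
\begin{align*}
\mathrm{entropy}(D) &= -\sum_{j=1}^{|C|} \Pr(c_j)\,\log_2 \Pr(c_j) \\
\mathrm{entropy}_{A_i}(D) &= \sum_{j=1}^{v} \frac{|D_j|}{|D|}\,\mathrm{entropy}(D_j) \\
\mathrm{gain}(D, A_i) &= \mathrm{entropy}(D) - \mathrm{entropy}_{A_i}(D)
\end{align*}
```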
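Finally, we choose the attribute with the largest gain to split the current node.
After finding the attribute with the largest information gain, make it the root. Then repeat the process above on the remaining data in each branch.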
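
The procedure above can be sketched in Python as follows. This is an ID3-style illustration under assumptions, not a definitive implementation: it uses the standard entropy and information-gain definitions, and the function names, the row-as-dict layout, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def attribute_entropy(rows, labels, attr):
    """Weighted entropy of the subsets obtained by partitioning on attr."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    return sum(len(g) / len(rows) * entropy(g) for g in groups.values())


def gain(rows, labels, attr):
    """Information gained by partitioning the data on attr."""
    return entropy(labels) - attribute_entropy(rows, labels, attr)


def build_tree(rows, labels, attrs):
    """Top-down, recursive, greedy construction (depth-first by recursion)."""
    if len(set(labels)) == 1:
        return labels[0]  # pure node: make a leaf
    if not attrs:
        return Counter(labels).most_common(1)[0][0]  # no attributes left: majority class
    best = max(attrs, key=lambda a: gain(rows, labels, a))  # greedy choice
    remaining = [a for a in attrs if a != best]
    branches = {}
    for value in sorted({row[best] for row in rows}):
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows = [r for r, _ in pairs]
        sub_labels = [l for _, l in pairs]
        # Each branch is fully built before the next one is started.
        branches[value] = build_tree(sub_rows, sub_labels, remaining)
    return {"attribute": best, "branches": branches}


# Hypothetical toy data: the class depends only on "outlook".
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay in", "stay in"]

tree = build_tree(rows, labels, ["outlook", "windy"])
print(tree)
```

On this toy data, "outlook" has a gain of 1 bit and "windy" a gain of 0, so "outlook" becomes the root and "windy" never appears in the tree. That illustrates why the resulting tree may use only a subset of the available attributes.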
Quiz related:
1. The resulting decision tree will use a subset of the attributes in S
2. It's a recursive algorithm
3. It works in a depth-first fashion
4. Its complexity is n log(n)