机器学习基石课程笔记 Week6 VC Bound

本篇中将完成上篇未完成的推论证明，并更新霍夫丁不等式以满足 $m_H(N)$ 的情形。
To prove the conjecture, we need to generalize a upper bound of all cases with break point $k$ , which is irrelevant to what the actual case is (either the perceptron hypothesis or any other ones) , we call it the bounding function, $B(N,k)$ , $m_H(N)\leq B(N,k)$ .
Firstly, we have the following equations, $B(N,k)=2^k-1, N=k$ $B(N,k)=2^N, N<k$ $B(N,1)=1$
Consider a special case $B(4,3)=2\alpha+\beta$ , $\alpha$ 的意思是在不考虑 $x_4$ 的情况下， $x_1,x_2,x_3$ 的组合出现两次的次数， $\{A,A,A,B\}和\{A,A,A,A\}$ 就是这样的一对。 $\beta$ 是只出现一次的组合次数。
Hence, $\alpha+\beta<B(3,3)$ as we are not able to shatter three points (break point is 3).
$\alpha\leq B(3,2)$ as we are not able to shatter three points(including $x_4$ this time).
Inductively, $B(N,k)\leq B(N-1,k)+B(N-1,k-1)$ . We need to prove $B(N,k)\leq \sum_{i=0}^{k-1}C_N^i$
Assume $B(N-1,k)\leq \sum_{i=0}^{k-1}C_{N-1}^i$ and $B(N-1,k-1)\leq \sum_{i=0}^{k-2}C_{N-1}^i=\sum_{i=1}^{k-1}C_{N-1}^{i-1}$
Then $\begin{align} B(N,k)&\leq \sum_{i=0}^{k-1}C_{N-1}^i+\sum_{i=1}^{k-1}C_{N-1}^{i-1}\\ &=C_{N-1}^0+\sum_{i=1}^{k-1}(C_{N-1}^i+C_{N-1}^{i-1})\\ &=C_N^0+\sum_{i=1}^{k-1}C_N^i\\ &=\sum_{i=0}^{k-1}C_N^i \end{align}$
Hence, $m_H(N)\leq B(N,k)\leq \sum_{i=0}^{k-1}\binom{N}{i}=O(N^{k-1})$ is polynomial, the only next step is to replace $M$ in Hoeffding's inequality with $m_H$ , and the new bound is called Vapnik–Chervonenkis bound, VC bound.
$P[\exists h\in H, \mid E_{in}(h)-E_{out}(h) \mid >\varepsilon]\leq 4m_H(2N)exp(-\frac{1}{8}\varepsilon^2 N)$
有关VC bound的进一步证明可见
https://mostafa-samir.github.io/ml-theory-pt2/