Mistake Notebook - ML @Coursera


Week3_2 Regularization, Question 1

You are training a classification model with logistic regression. Which of the following statements are true? Check all that apply.

        1. Introducing regularization to the model always results in equal or better performance on the training set.

        2. Adding many new features to the model helps prevent overfitting on the training set.

        3. Introducing regularization to the model always results in equal or better performance on examples not in the training set.

        4. Adding a new feature to the model always results in equal or better performance on the training set.

* Answer: 4

The regularized cost function: J(\theta)=\frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_{\theta}(x^{(i)})-y^{(i)}\right)^2+\lambda\sum_{j=1}^{n}\theta_{j}^{2}\right]
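To make the formula concrete, here is a minimal NumPy sketch of this cost (function and variable names are mine, and it assumes the course convention that the bias term θ_0 is excluded from the penalty):

```python
import numpy as np

def regularized_cost(theta, X, y, lam):
    """Regularized squared-error cost J(theta) from the formula above.

    X is (m, n+1) with a leading column of ones; theta is (n+1,).
    theta[0] is the bias term and is not regularized.
    """
    m = len(y)
    residuals = X @ theta - y                 # h_theta(x^(i)) - y^(i)
    penalty = lam * np.sum(theta[1:] ** 2)    # lambda * sum_j theta_j^2
    return (np.sum(residuals ** 2) + penalty) / (2 * m)
```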

* Option 1: Adding regularization does not always improve results; if λ is set too large, the model underfits, which hurts performance on both the training set and unseen examples. Incorrect.

* Option 2: More features make it easier to fit the training set, but they also make overfitting more likely, not less; the claim should be "more likely", not "prevent". Incorrect.

* Option 3: Same as Option 1: regularization does not always help, and too large a λ causes underfitting, hurting both training-set and out-of-sample performance. Incorrect.

* Option 4: A newly added feature improves (or at least preserves) the fit on the training set, since the optimizer can always set its weight to zero; it says nothing about unseen examples (see the sketch after this list). Correct.
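A small self-contained check of Option 4 with synthetic data (all names and data are mine, and it uses an exact least-squares fit rather than gradient descent): setting the new feature's weight to zero reproduces the old fit, so training error can only stay equal or decrease.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 50
X = np.column_stack([np.ones(m), rng.normal(size=m)])  # bias column + one feature
y = rng.normal(size=m)

def train_sse(X, y):
    """Exact least-squares fit; return the sum of squared training errors."""
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((X @ theta - y) ** 2)

X_more = np.column_stack([X, rng.normal(size=m)])      # add one more (even random) feature

# The error with the extra column is never larger, because a zero weight
# on the new column would reproduce the original fit exactly.
print(train_sse(X, y), train_sse(X_more, y))
```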


Which of the following statements are true? Check all that apply.

        A. A two layer (one input layer, one output layer; no hidden layer) neural network can represent the XOR function.

        B. Any logical function over binary-valued (0 or 1) inputs x_1 and x_2 can be (approximately) represented using some neural network.

        C. The activation values of the hidden units in a neural network, with the sigmoid activation function applied at every layer, are always in the range (0, 1).

        D. Suppose you have a multi-class classification problem with three classes, trained with a 3 layer network. Let a^{(3)}_1 = (h_\Theta(x))_1 be the activation of the first output unit, and similarly a^{(3)}_2 = (h_\Theta(x))_2 and a^{(3)}_3 = (h_\Theta(x))_3. Then for any input x, it must be the case that a^{(3)}_1 + a^{(3)}_2 + a^{(3)}_3 = 1.

* Answer: B, C

* Option A: XOR must be built by stacking layers; a network with no hidden layer only draws a linear decision boundary, which cannot separate XOR (see the sketch after this explanation).

        https://blog.csdn.net/oliverkingli/article/details/81131103

        OR and AND are linearly separable, but XOR is not.
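This can be checked numerically: with one hidden layer, XOR decomposes into gates that *are* linearly separable, since x1 XOR x2 = (x1 OR x2) AND NOT (x1 AND x2). A hand-weighted sketch in the spirit of the course's AND/OR examples (the ±10/±20 weights are my choice, not from the quiz):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def xor_net(x1, x2):
    """One hidden layer: h_and = AND(x1, x2), h_or = OR(x1, x2);
    the output unit computes h_or AND NOT h_and, i.e. XOR."""
    h_and = sigmoid(-30 + 20 * x1 + 20 * x2)  # ~1 only when both inputs are 1
    h_or = sigmoid(-10 + 20 * x1 + 20 * x2)   # ~1 when at least one input is 1
    return sigmoid(-10 - 20 * h_and + 20 * h_or)

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, round(xor_net(x1, x2)))     # prints 0, 1, 1, 0
```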

* Option D: Each layer's units are computed independently from the previous layer; units in the same layer do not influence one another, so the three sigmoid outputs need not sum to 1. They would only be coupled that way if a softmax were applied (a quick numerical check follows).
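A short check of Options C and D: sigmoid squashes each output unit independently into (0, 1), so the three activations need not sum to 1; only softmax couples them across units. (The z values below are made up.)

```python
import numpy as np

z = np.array([2.0, -1.0, 0.5])         # hypothetical output-layer inputs

sig = 1 / (1 + np.exp(-z))             # each value in (0, 1), computed per unit
soft = np.exp(z) / np.exp(z).sum()     # normalized across units

print(sig.sum())   # generally != 1 (here ~1.77)
print(soft.sum())  # 1.0 (up to float rounding)
```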

