ML - hw1

1. Machine Learning Problems

(a) 1. BF, 2. C, 3. AD, 4. G, 5. AE, 6. A, 7. BF, 8. AE, 9. BG

(b) False. Although a large amount of data can train a model that performs very well on the training data, what we care about more is performance on the test data, that is, the model's ability to generalize.

  • Maximizing performance on the whole dataset may result in severe overfitting.
  • In addition, using all the data consumes more computation and time.
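
The gap between training performance and generalization can be illustrated with a minimal sketch. The toy data, the split point, and the memorizing "model" below are all hypothetical choices made for illustration, not part of the assignment: the model scores perfectly on the data it has seen and noticeably worse on held-out data.

```python
from collections import Counter

# Hypothetical toy data: the label is 1 when x is a multiple of 3
data = [(x, int(x % 3 == 0)) for x in range(20)]
train, test = data[:14], data[14:]  # simple hold-out split (assumed 14/6)

# A "model" that memorizes training pairs and falls back to the
# majority training label for inputs it has never seen
memory = dict(train)
majority = Counter(y for _, y in train).most_common(1)[0][0]

def predict(x):
    return memory.get(x, majority)

def accuracy(pairs):
    return sum(predict(x) == y for x, y in pairs) / len(pairs)

train_acc = accuracy(train)  # every training point is memorized
test_acc = accuracy(test)    # unseen points only get the majority guess
print(f"train accuracy = {train_acc:.3f}, test accuracy = {test_acc:.3f}")
```

Holding out part of the data is what makes the second number measurable at all; if every example were used for training, only the optimistic first number would be available.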

2. Bayes Decision Rule

Maximum likelihood decision rule
Optimal Bayes decision rule
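
As a rough sketch of how the two rules differ: the maximum likelihood rule picks the class with the largest p(x | class), while the Bayes rule picks the class with the largest posterior, which is proportional to p(x | class) · P(class). The class names `w1` and `w2` and all the numbers below are hypothetical, chosen only so that the two rules disagree.

```python
def ml_decision(likelihoods):
    """Pick the class with the largest likelihood p(x | class)."""
    return max(likelihoods, key=likelihoods.get)

def bayes_decision(likelihoods, priors):
    """Pick the class with the largest p(x | class) * P(class)."""
    return max(likelihoods, key=lambda c: likelihoods[c] * priors[c])

# Hypothetical values for a single observation x
likelihoods = {"w1": 0.6, "w2": 0.3}
priors = {"w1": 0.2, "w2": 0.8}

ml_choice = ml_decision(likelihoods)                # compares 0.6 vs 0.3
bayes_choice = bayes_decision(likelihoods, priors)  # compares 0.12 vs 0.24
print(ml_choice, bayes_choice)
```

With equal priors the two rules would agree; they differ here only because the prior strongly favors `w2`.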

3. Gaussian Discriminant Analysis and MLE

c.

Gaussian Boundaries

4. Text Classification with Naive Bayes

a. 10 words

ooking       9453
voip         9494
computron    13613
nbsp         30033
meds         37568
pills        38176
cialis       45153
sex          56930
php          65398
viagra       75526

b. accuracy = 0.9857315598548972

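c. False. When the ratio of spam to ham is 1:99, the spam filter can easily identify ham emails correctly, but it may also classify spam emails as ham. A filter that labels every email as ham already reaches 99% accuracy, so high accuracy alone does not show that the filter is good.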
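
A small sketch of this effect, using a hypothetical dataset of 1000 emails at the 1:99 ratio and a trivial filter that never flags anything:

```python
# Hypothetical dataset: 10 spam (label 1) and 990 ham (label 0), ratio 1:99
labels = [1] * 10 + [0] * 990

# A trivial "filter" that labels every email as ham
predictions = [0] * len(labels)

correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)

# Fraction of the actual spam emails that the filter caught
spam_caught = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
spam_recall = spam_caught / sum(labels)

print(f"accuracy = {accuracy:.2f}, spam recall = {spam_recall:.2f}")
```

The accuracy is high even though no spam is ever caught, which is why precision and recall are reported next.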

d.
precision = 0.9750223015165032
recall = 0.9724199288256228
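
For reference, both metrics follow from the counts of true positives (TP), false positives (FP), and false negatives (FN): precision = TP / (TP + FP) and recall = TP / (TP + FN). The sketch below uses a hypothetical helper `precision_recall` and made-up labels; it is not the code that produced the numbers above.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    # Guard against division by zero when nothing is predicted or present
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up labels: 1 = spam, 0 = ham
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision = {precision:.3f}, recall = {recall:.3f}")
```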

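e. For a spam filter, precision matters more. Marking a legitimate email as spam (a false positive) is costlier than letting an occasional spam email through, so the emails flagged as spam should almost always be spam.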

To identify drugs and bombs at an airport, I think recall is more important, because we must find every bomb to ensure safety; a missed bomb costs far more than a false alarm.
