Chapter 2 Exercises

EXE 1

(a) better - with a large sample size, a more flexible approach can fit the data more closely without overfitting, so its lower bias gives a better fit

(b) worse - a flexible method would overfit the small number of observations

(c) better - with more degrees of freedom, a flexible method can follow a non-linear relationship and obtain a better fit

(d) worse - a flexible method would fit the noise in the error terms, which increases variance

EXE 2

(a) regression, inference.

(b) classification, prediction.

(c) regression, prediction.

EXE 3

bias - more flexible, smaller bias.

variance - more flexible, larger variance.

training error - more flexible, smaller training error.

test error - more flexible, U-shaped curve.

Bayes (irreducible) error - constant. It defines the lower limit: the test error is bounded below by the irreducible error, which comes from the variance of the error term (epsilon) in the output values (0 <= value). When the training error is lower than the irreducible error, overfitting has taken place. The Bayes error rate is defined for classification problems and is the proportion of data points that lie on the 'wrong' side of the decision boundary (0 <= value < 1).
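The shapes of these curves can be read off the usual decomposition of the expected test MSE at a point. The notation below is an assumption added for illustration: x_0 is a test point, y_0 its response, \hat{f} the fitted model, and \varepsilon the error term.

```latex
\mathbb{E}\!\left[\bigl(y_0 - \hat{f}(x_0)\bigr)^2\right]
  = \operatorname{Var}\!\bigl(\hat{f}(x_0)\bigr)
  + \bigl[\operatorname{Bias}\bigl(\hat{f}(x_0)\bigr)\bigr]^2
  + \operatorname{Var}(\varepsilon)
```

As flexibility grows, the squared-bias term shrinks and the variance term grows, which gives the U shape of the test error. The last term does not depend on the model, so it is the constant floor.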

EXE 4

...

EXE 5

Flexible models fit the data more closely, with smaller bias but larger variance, and obtain a better fit for non-linear data.

If the number of observations is small, flexible models overfit easily (they fit the noise), so they need more data.

We prefer a more flexible model when we are interested in prediction rather than interpretability.

We prefer a less flexible model when we are interested in interpretability and inference.
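The trade-off above can be sketched with a small simulation. Everything here is an illustrative assumption, not part of the exercise: the true function sin(2*pi*x), the noise level, the sample sizes, and the use of k-nearest-neighbour regression as the family of models (small k is the flexible end, large k the rigid end).

```python
import math
import random


def knn_regress(train, x0, k):
    """Average the responses of the k training points closest to x0."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x0))[:k]
    return sum(y for _, y in nearest) / k


def mse(train, data, k):
    """Mean squared error on `data` of the k-NN fit built from `train`."""
    return sum((y - knn_regress(train, x, k)) ** 2 for x, y in data) / len(data)


def simulate(n, rng, noise_sd=0.3):
    """Draw n points from y = sin(2*pi*x) + eps (an assumed non-linear truth)."""
    xs = [rng.random() for _ in range(n)]
    return [(x, math.sin(2 * math.pi * x) + rng.gauss(0, noise_sd)) for x in xs]


rng = random.Random(0)
train = simulate(30, rng)   # small training set
test = simulate(500, rng)   # larger held-out set

# Small k = very flexible fit, large k = very rigid fit.
results = {k: (mse(train, train, k), mse(train, test, k)) for k in (1, 3, 10, 30)}
for k, (train_mse, test_mse) in results.items():
    print(f"k={k:2d}  train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")
```

With k=1 every training point is its own nearest neighbour, so the training error is exactly zero while the test error is not: the most flexible fit has memorised the noise. With k equal to the training-set size the model predicts one constant everywhere, which is the most rigid (and most biased) end of the range.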

EXE 6

A parametric approach reduces the problem of estimating f down to one of estimating a set of parameters because it assumes a form for f.

A non-parametric approach does not assume a functional form for f and so requires a very large number of observations to accurately estimate f.

The advantages of a parametric approach to regression or classification are that modeling f is simplified to estimating a few parameters, and that fewer observations are required than for a non-parametric approach.

The disadvantages of a parametric approach to regression or classification are the potential to estimate f inaccurately if the assumed form of f is wrong, or to overfit the observations if more flexible models are used.
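A minimal sketch of the contrast, under assumptions chosen for illustration: the parametric method is a straight line fitted by least squares (two parameters), the non-parametric method is k-nearest-neighbour averaging, and the two toy data sets are noise-free. The function names are hypothetical.

```python
def fit_line(data):
    """Least-squares (intercept, slope) for the assumed form f(x) = b0 + b1 * x."""
    n = len(data)
    x_bar = sum(x for x, _ in data) / n
    y_bar = sum(y for _, y in data) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in data)
    s_xx = sum((x - x_bar) ** 2 for x, _ in data)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


def knn_regress(data, x0, k):
    """Non-parametric estimate: average the k responses nearest to x0."""
    nearest = sorted(data, key=lambda point: abs(point[0] - x0))[:k]
    return sum(y for _, y in nearest) / k


linear_truth = [(x, 2 + 3 * x) for x in range(5)]    # f(x) = 2 + 3x
curved_truth = [(x, x * x) for x in range(-2, 3)]    # f(x) = x^2

# Right functional form: two numbers describe f completely.
print(fit_line(linear_truth))            # (2.0, 3.0)

# Wrong functional form: the best line is flat and misses the curve.
print(fit_line(curved_truth))            # (2.0, 0.0)

# No assumed form: 1-NN reproduces the curve at an observed point.
print(knn_regress(curved_truth, 2, 1))   # 4.0
```

When the assumed form is right, five observations are enough to pin f down. When it is wrong, no amount of data fixes the line, whereas the nearest-neighbour estimate adapts but can only be trusted close to points it has actually seen.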

EXE 7

Small. As K grows, the decision boundary becomes smoother and closer to linear, so a small K is needed to follow a highly non-linear boundary.
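The effect of K can be sketched on a toy problem. The one-dimensional data set below is hypothetical (it is not the table from the exercise) and is built so that the classes alternate in bands along x, which stands in for a non-linear boundary.

```python
from collections import Counter


def knn_classify(train, x0, k):
    """Majority vote among the k training points nearest to x0 (1-D inputs)."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x0))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical data: classes alternate in bands along the x axis.
train = [(0, "A"), (1, "A"), (2, "B"), (3, "B"), (4, "B"),
         (5, "A"), (6, "A"), (7, "B"), (8, "B")]

# Odd k with two classes means the vote can never tie.
predictions = {
    k: "".join(knn_classify(train, x, k) for x in range(9))
    for k in (1, 5, 9)
}
for k, labels in predictions.items():
    print(f"K={k}: {labels}")
# K=1: AABBBAABB
# K=5: BBBBBBBBB
# K=9: BBBBBBBBB
```

With K=1 the predictions follow every band, while K=5 and K=9 collapse to the overall majority class and the banded structure disappears. That is the same reason a small K is preferred when the true boundary is highly non-linear, at the cost of higher variance.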

Refer to https://raw.githubusercontent.com/asadoughi/stat-learning/master/ch2/answers
