[Machine Learning] Week 6.1: Evaluating a Hypothesis

Once we have done some troubleshooting for errors in our predictions by:

$\bullet$ Getting more training examples

$\bullet$ Trying smaller sets of features

$\bullet$ Trying additional features

$\bullet$ Trying polynomial features

$\bullet$ Increasing or decreasing $\lambda$

We can move on to evaluate our new hypothesis.

A hypothesis may have a low error on the training examples but still be inaccurate on new data (because of overfitting). Thus, to evaluate a hypothesis, given a dataset of training examples, we can split the data into two sets: a training set and a test set. Typically, the training set consists of 70% of your data and the test set is the remaining 30%.
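A minimal sketch of such a 70/30 split, in plain Python. The helper name `train_test_split`, the fixed seed, and the shuffle before cutting are assumptions made for illustration (shuffling guards against data that happens to be stored in some order); they are not prescribed by the course.

```python
import random

def train_test_split(examples, train_fraction=0.7, seed=0):
    """Shuffle a copy of the examples, then cut it into training and test lists.

    Hypothetical helper: the name, the seed and the shuffle are illustrative choices.
    """
    shuffled = list(examples)               # copy, so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)   # seeded, so the split is reproducible
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

# Ten toy (x, y) examples
data = [(x, 2 * x + 1) for x in range(10)]
train_set, test_set = train_test_split(data)
print(len(train_set), len(test_set))
```

With ten examples this yields seven for training and three for testing; every example lands in exactly one of the two sets.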

The new procedure using these two sets is then:

1. Learn $\theta$ by minimizing $J_{train}(\theta )$ using the training set

2. Compute the test set error $J_{test}(\theta )$
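The two steps can be sketched as follows for a one-feature linear regression. This is only an illustration under stated assumptions: the hypothesis is taken to be $h_\theta(x) = \theta_0 + \theta_1 x$, the cost is taken to be the half mean squared error, the closed-form least-squares fit stands in for whatever optimizer you actually use, and the data points and the names `fit_line` and `squared_error_cost` are made up.

```python
def fit_line(points):
    """Step 1: least-squares fit of h(x) = theta0 + theta1 * x.

    The closed-form solution minimizes the squared-error cost on `points`.
    """
    m = len(points)
    mean_x = sum(x for x, _ in points) / m
    mean_y = sum(y for _, y in points) / m
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    theta1 = sxy / sxx
    theta0 = mean_y - theta1 * mean_x
    return theta0, theta1

def squared_error_cost(theta, points):
    """J(theta) = 1/(2m) * sum of squared residuals over `points`."""
    theta0, theta1 = theta
    m = len(points)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in points) / (2 * m)

# Made-up data, already split into a training part and a test part
train_set = [(0.0, 1.1), (1.0, 2.9), (2.0, 5.2), (3.0, 6.8),
             (4.0, 9.1), (5.0, 11.0), (6.0, 12.9)]
test_set = [(7.0, 15.2), (8.0, 16.9), (9.0, 19.1)]

theta = fit_line(train_set)                    # step 1: uses the training set only
j_train = squared_error_cost(theta, train_set)
j_test = squared_error_cost(theta, test_set)   # step 2: uses the test set only
print(theta, j_train, j_test)
```

The point of the design is that `test_set` never touches `fit_line`, so `j_test` measures how well the learned $\theta$ generalizes rather than how well it memorized the training examples.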

The test set error

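1. For linear regression:

$$J_{test}(\theta ) = \frac{1}{2m_{test}} \sum_{i=1}^{m_{test}} \left( h_\theta (x_{test}^{(i)}) - y_{test}^{(i)} \right)^2$$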

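2. For classification, the misclassification error (also known as the 0/1 misclassification error):

$$err(h_\theta (x), y) = \begin{cases} 1 & \text{if } h_\theta (x) \geq 0.5 \text{ and } y = 0, \text{ or } h_\theta (x) < 0.5 \text{ and } y = 1 \\ 0 & \text{otherwise} \end{cases}$$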

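This gives us a binary 0 or 1 error result based on a misclassification. The average test error for the test set is:

$$\text{Test Error} = \frac{1}{m_{test}} \sum_{i=1}^{m_{test}} err(h_\theta (x_{test}^{(i)}), y_{test}^{(i)})$$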

This gives us the proportion of the test data that was misclassified.
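A small sketch of that proportion in plain Python. It assumes the hypothesis outputs a value between 0 and 1 that is turned into a predicted label with the usual 0.5 threshold; the function names and the sample values are made up for illustration.

```python
def misclassification(h, y, threshold=0.5):
    """0/1 error for one example: 1 if the thresholded prediction differs from y."""
    predicted = 1 if h >= threshold else 0
    return 0 if predicted == y else 1

def average_test_error(hypothesis_outputs, labels):
    """Fraction of test examples that were misclassified."""
    errors = [misclassification(h, y) for h, y in zip(hypothesis_outputs, labels)]
    return sum(errors) / len(errors)

# Made-up hypothesis outputs and true labels for five test examples
h_test = [0.9, 0.2, 0.6, 0.4, 0.5]
y_test = [1, 0, 0, 1, 1]
print(average_test_error(h_test, y_test))  # 0.4 (the third and fourth examples are wrong)
```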

Source: Coursera, Stanford Machine Learning course by Andrew Ng
