[Machine Learning] Week 3, 1. Classification

Source: Machine Learning by Andrew Ng (Stanford), on Coursera


Classification

To attempt classification, one method is to use linear regression and map all predictions greater than 0.5 to 1 and all less than 0.5 to 0. However, this method does not work well, because classification is not actually a linear function.
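The failure mode is easy to reproduce: a single extreme example drags the fitted line, which shifts the point where the line crosses 0.5 and flips predictions for examples that were previously classified correctly. The sketch below illustrates this with a made-up one-dimensional dataset (the numbers and the NumPy least-squares fit are my own illustration, not from the course):

```python
import numpy as np

# Toy 1-D data: small x -> class 0, large x -> class 1.
x = np.array([1.0, 2.0, 3.0, 6.0, 7.0, 8.0])
y = np.array([0, 0, 0, 1, 1, 1])

def fit_line(x, y):
    """Ordinary least-squares fit y ~ a*x + b."""
    a, b = np.polyfit(x, y, deg=1)
    return a, b

def predict(x, a, b):
    """Map the linear output to a class: 1 where a*x + b >= 0.5, else 0."""
    return (a * x + b >= 0.5).astype(int)

a, b = fit_line(x, y)
print(predict(x, a, b))        # -> [0 0 0 1 1 1], all correct

# Add one extreme (but clearly positive) example far to the right.
# Refitting tilts the line, the 0.5 crossing moves past x = 6,
# and that previously correct example is now misclassified.
x2 = np.append(x, 50.0)
y2 = np.append(y, 1)
a2, b2 = fit_line(x2, y2)
print(predict(x, a2, b2))      # -> [0 0 0 0 1 1], x = 6 flipped to 0
```

Note that the outlier at x = 50 is not mislabeled; it is simply far from the decision region, yet it still changes a prediction. Logistic regression, introduced next in the course, does not suffer from this because its output is bounded.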

The classification problem is just like the regression problem, except that the values we now want to predict take on only a small number of discrete values. For now, we will focus on the binary classification problem in which y can take on only two values, 0 and 1. (Most of what we say here will also generalize to the multiple-class case.) 

For instance, if we are trying to build a spam classifier for email, then x^{(i)} may be some features of a piece of email, and y may be 1 if it is a piece of spam mail, and 0 otherwise. Hence, y∈{0,1}. 0 is also called the negative class, and 1 the positive class, and they are sometimes also denoted by the symbols “-” and “+.” Given x^{(i)}, the corresponding y^{(i)} is also called the label for the training example.


