ML notes (03/03/20)

video1

Linear regression is one kind of supervised learning.
Definition of supervised learning: we are given n data points (samples/examples), e.g.
\{(x_1,y_1),(x_2,y_2),(x_3,y_3),\dots\}
Remark: each x_i belongs to the input space and each y_i to the output space (the label).
The question is to find a function f such that f(x)=\hat{y}: when you input x, you get a prediction \hat{y} (the goal is for \hat{y} to approximate the true value y).
That means the function f should be a good predictor of y for a future input x (to predict new data, instead of merely fitting the given data).

Statistical Learning Definition

hypothesis:

The product space Z = X \times Y, where X is a compact domain in Euclidean space and Y is a bounded subset of \mathbb{R}.
There is an unknown probability distribution on the product space Z, written \mu(x,y).
The training set S = \{(x_1,y_1),(x_2,y_2),\dots,(x_n,y_n)\}, which lies in Z^n, with the n samples drawn i.i.d. from \mu.
H is the hypothesis space, a space of functions f: X \rightarrow Y.

algorithm:

A learning algorithm is a map L: Z^n \rightarrow H that looks at S and selects from H a function f_S: X \rightarrow Y such that f_S(x) \approx y in a predictive way.
To measure how well a function f approximates y, we introduce a loss function l: Y \times Y \rightarrow \mathbb{R} (one argument is the true value y, the other is the prediction f(x)). Then, averaging the loss over \mu, we define the expected or true error of f as
E(f) = \int_{X \times Y} l(f(x), y) \, d\mu(x, y)

This is the expected loss on a new example drawn at random from $\mu$.

Since \mu is unknown, the expression above cannot be evaluated directly. By the law of large numbers, we can use the sample data to construct an approximation of it:

The empirical error of f_S is
\hat{E}(f_S) = \frac{1}{n} \sum_{i=1}^{n} l(f_S(x_i), y_i)

To ensure that a small empirical error is equivalent to a small expected error, the learning algorithm producing f_S must satisfy additional conditions.
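This law-of-large-numbers approximation can be sketched numerically. The distribution \mu and the predictor f below are hypothetical toy choices (not from the notes): y = 2x + Gaussian noise, f(x) = 2x, squared loss.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n):
    """Draw n i.i.d. pairs (x, y) from the toy distribution mu."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = 2.0 * x + rng.normal(0.0, 0.1, size=n)  # noisy labels
    return x, y

def f(x):
    return 2.0 * x

def empirical_error(f, x, y):
    # (1/n) * sum of l(f(x_i), y_i), with squared loss l(a, b) = (a - b)^2
    return np.mean((f(x) - y) ** 2)

# As n grows, the empirical error approaches the expected error,
# which for this toy distribution is the noise variance 0.1**2 = 0.01.
x, y = sample(100_000)
print(empirical_error(f, x, y))  # close to 0.01
```

Rerunning with a smaller n (say, 10) gives a noticeably noisier estimate, which is why conditions on the algorithm are needed before a small empirical error guarantees a small expected error.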

video2: linear regression

problem settings:

Elements:

n data samples \{(x_1,y_1),\dots,(x_n,y_n)\}, with input x_i \in \mathbb{R}^d and output y_i \in \mathbb{R}

Assumptions:

The hypothesis space consists of linear functions f_\beta(x) = x^T\beta with \beta \in \mathbb{R}^d, and the loss is the squared loss l(f(x), y) = (f(x) - y)^2

So, the empirical error of f_\beta is:

\hat{E}(f_\beta) = \frac{1}{n} \sum_{i=1}^{n} (y_i - x_i^T\beta)^2

Matrix Form:

Let X \in \mathbb{R}^{n \times d} be the matrix whose i-th row is x_i^T, and let y = (y_1,\dots,y_n)^T.

Then the empirical error of f_\beta can be written in matrix form as:

\hat{E}(\beta) = \frac{1}{n} \| y - X\beta \|^2
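The closed-form minimizer can be sketched by setting the gradient of the matrix-form empirical error to zero (assuming X^TX is invertible):

```latex
\nabla_\beta \left( \tfrac{1}{n} \| y - X\beta \|^2 \right)
  = -\tfrac{2}{n} X^T (y - X\beta) = 0
\;\Rightarrow\; X^T X \beta = X^T y
\;\Rightarrow\; \hat{\beta} = (X^T X)^{-1} X^T y
```

The middle line is the system of normal equations; full column rank of X guarantees X^TX is invertible, so the solution is unique.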

Conclusion:

Assuming that X has full column rank, minimizing the empirical error leads to the estimator of the function f: \hat{y} = X\hat{\beta}, where \hat{\beta} = (X^TX)^{-1}X^Ty.
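A minimal numerical check of this estimator on hypothetical synthetic data (the true \beta and all numbers below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 200, 2
X = rng.normal(size=(n, d))                 # full column rank with high probability
beta_true = np.array([1.0, -2.0])
y = X @ beta_true + rng.normal(0.0, 0.1, size=n)

# Solve the normal equations X^T X beta = X^T y rather than forming
# (X^T X)^{-1} explicitly -- numerically safer, same estimator.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check against NumPy's least-squares solver.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat, np.allclose(beta_hat, beta_lstsq))
```

With noise of standard deviation 0.1 and n = 200 samples, beta_hat lands close to beta_true, and both solution routes agree.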
