Linear Regression Learning Notes

    I spent two weeks studying support vector machines, but I didn't get through them thoroughly; they felt too difficult, so I switched to linear regression for a while.

Our goal:

    A linear model makes a prediction by computing a weighted sum of the input feature vector, then adding a bias term:


f(x) = w1*x1 + w2*x2 + ... + wn*xn + b

    We can write this more compactly in vector form:


f(x) = w.T * x + b

    To compute the parameter vector theta (the weights w and the bias b of f(x)), we can use the normal equation:

theta = (X.T * X)**(-1) * X.T * y
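
    As a minimal runnable sketch (the synthetic data, variable names, and sizes below are just for illustration), the normal equation in NumPy:

import numpy as np

m = 100
x = 2 * np.random.rand(m)                 # one input feature
y = 4 + 3 * x + np.random.randn(m)        # noisy linear target, roughly y = 3x + 4
X = np.c_[np.ones(m), x]                  # prepend a column of 1s for the bias term
theta = np.linalg.inv(X.T @ X) @ X.T @ y  # theta = (X^T X)^(-1) X^T y
print(theta)                              # roughly [4, 3]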

    But the noise makes it impossible to recover the exact linear function, so we need a way to fit the parameters anyway. Common approaches are the least squares method, gradient descent, and so on.

The Least Squares Method

    Our goal is to make the predictions f(x) close to the real targets y, like:


f(x(i)) ≈ y(i)

    To do that, we minimize the MSE (mean squared error):


(w*, b*) = argmin_(w,b) SUM_i (f(x(i)) - y(i))**2

    So our task is to minimize the formula above. Taking the derivatives with respect to w and b and setting them to zero (the exact derivation is not shown here) yields the closed-form solution:


w = SUM_i y(i) * (x(i) - mean(x)) / (SUM_i x(i)**2 - (SUM_i x(i))**2 / m)

b = 1/m * SUM_i (y(i) - w * x(i))
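
    A quick sketch checking this closed form on synthetic single-feature data (names and values are illustrative):

import numpy as np

m = 100
x = 2 * np.random.rand(m)
y = 4 + 3 * x + np.random.randn(m)        # noisy linear target

w = np.sum(y * (x - x.mean())) / (np.sum(x**2) - np.sum(x)**2 / m)
b = np.mean(y - w * x)                    # b = (1/m) * SUM(y(i) - w*x(i))
print(w, b)                               # roughly 3 and 4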

Gradient Descent

    It's easy to understand: first we initialize a parameter vector theta, then we pick a learning rate eta. If eta is too big, training may not converge; if eta is too small, we need a long time to adjust theta. We then iterate, moving theta in the direction that decreases the MSE, with the learning rate as the step size, so the cost gets closer to our target. For a general cost function we could get stuck in a local minimum instead of the global one, but the MSE of linear regression is convex (a bowl shape with no local minima), so Batch Gradient Descent (BGD) will find the global minimum.

Batch Gradient Descent

grad_MSE(theta) = 2/m * X.T * (X * theta - y)

# the gradient of the MSE with respect to theta; m is the number of training instances, X is the sample matrix, y is the target vector

    BGD's algorithm:

for each iteration:
    theta = theta - learning_rate * grad_MSE(theta)
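
    A minimal runnable version of this loop (the synthetic data, eta, and iteration count are illustrative choices):

import numpy as np

m = 100
x = 2 * np.random.rand(m)
y = 4 + 3 * x + np.random.randn(m)
X = np.c_[np.ones(m), x]                      # bias column plus the feature

eta = 0.1                                     # learning rate
theta = np.random.randn(2)                    # random initialization
for iteration in range(1000):
    gradient = 2 / m * X.T @ (X @ theta - y)  # gradient over ALL instances
    theta = theta - eta * gradient
print(theta)                                  # roughly [4, 3]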

    Using the update above, we reach the global minimum, but it takes a long time to compute, because every step uses all the training instances. Stochastic Gradient Descent is a way to reduce the computing time.

Stochastic Gradient Descent

    To make up for BGD's long running time, we introduce SGD. SGD trains on one randomly picked instance at a time, so we cannot be sure that every sample gets used during training. Its learning curve does not descend step by step toward the global minimum like BGD's; instead it bounces up and down, while drifting toward the global minimum on average. If the learning rate stays constant, theta keeps bouncing around the optimum, so we decrease the learning rate over time (a learning schedule), which lets it settle close to the optimal solution.
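
    A minimal sketch of SGD with a simple learning schedule (the schedule constants t0 and t1 are illustrative):

import numpy as np

m = 100
x = 2 * np.random.rand(m)
y = 4 + 3 * x + np.random.randn(m)
X = np.c_[np.ones(m), x]

t0, t1 = 5, 50                              # learning-schedule hyperparameters
theta = np.random.randn(2)
for epoch in range(50):
    for i in range(m):
        j = np.random.randint(m)            # pick one instance at random
        xj, yj = X[j], y[j]
        gradient = 2 * xj * (xj @ theta - yj)
        eta = t0 / (epoch * m + i + t1)     # decrease the learning rate over time
        theta = theta - eta * gradient
print(theta)                                # roughly [4, 3]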

Mini-batch Gradient Descent

    If you want the optimal solution without taking a long time, you can use Mini-batch Gradient Descent, which combines BGD and SGD: each step uses a small random subset of instances (a mini-batch), and the learning rate is decreased over time. It avoids both BGD's heavy computation and SGD's instability.
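
    A minimal sketch (the batch size of 16 is an illustrative choice; the learning rate is kept constant here for brevity):

import numpy as np

m = 100
x = 2 * np.random.rand(m)
y = 4 + 3 * x + np.random.randn(m)
X = np.c_[np.ones(m), x]

batch_size, eta, theta = 16, 0.1, np.random.randn(2)
for iteration in range(1000):
    idx = np.random.randint(m, size=batch_size)          # random mini-batch
    Xb, yb = X[idx], y[idx]
    gradient = 2 / batch_size * Xb.T @ (Xb @ theta - yb) # gradient on the batch only
    theta = theta - eta * gradient
print(theta)                                             # roughly [4, 3]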

Regularized Linear Models

    The previous models can easily overfit; a good way to avoid that is regularization, as in Ridge Regression and Lasso Regression.

    Ridge Regression: a regularization term proportional to SUM(theta(i)**2) is added to the cost function, giving the ridge regression cost function below. The hyperparameter alpha controls how much you want to regularize the model:

J(theta) = MSE(theta) + 1/2 * alpha * SUM(theta(i)**2)

    This cost function also has a closed-form solution:

theta = (X.T * X + alpha * A)**(-1) * X.T * y

# A is the identity matrix, except with a 0 in the top-left cell, so the bias term is not regularized
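
    A quick sketch using scikit-learn's Ridge (the alpha value is an illustrative choice):

import numpy as np
from sklearn.linear_model import Ridge

m = 100
x = 2 * np.random.rand(m, 1)
y = 4 + 3 * x[:, 0] + np.random.randn(m)

ridge = Ridge(alpha=1.0)                 # alpha controls the regularization strength
ridge.fit(x, y)
print(ridge.intercept_, ridge.coef_)     # roughly 4 and [3]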

    Lasso Regression uses the L1 norm of the weights instead: J(theta) = MSE(theta) + alpha * SUM(|theta(i)|). Elastic Net is a middle ground between ridge regression and lasso regression. The following is Elastic Net's cost function:

J(theta) = MSE(theta) + ratio * alpha * SUM(|theta(i)|) + (1 - ratio)/2 * alpha * SUM(theta(i)**2)

    The bigger the ratio, the heavier the Lasso term; the smaller the ratio, the lighter it is. In general, Elastic Net is preferred over Lasso, since Lasso may behave erratically when the number of features is greater than the number of training instances or when several features are strongly correlated.

    Separately, if we train for too long the model may overfit; too short, and it may underfit. So we stop training when the validation error (the MSE between predictions and true values on a validation set) reaches its minimum. This is called early stopping.
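
    A quick sketch using scikit-learn's ElasticNet (the alpha and l1_ratio values are illustrative; l1_ratio plays the role of ratio above):

import numpy as np
from sklearn.linear_model import ElasticNet

m = 100
x = 2 * np.random.rand(m, 1)
y = 4 + 3 * x[:, 0] + np.random.randn(m)

elastic = ElasticNet(alpha=0.1, l1_ratio=0.5)  # l1_ratio mixes the L1 and L2 terms
elastic.fit(x, y)
print(elastic.intercept_, elastic.coef_)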

Logistic Regression

The model's function:


y = 1 / (1 + exp(-(w.T * x + b)))

    This is actually the sigmoid function. It is very steep around w.T * x + b = 0, which helps avoid ambiguous classifications near the decision boundary as much as possible. You can simply use this model from sklearn.
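
    For example, a minimal sketch on a toy dataset:

import numpy as np
from sklearn.linear_model import LogisticRegression

# toy 1-D data: class 1 roughly when x > 1
X = np.array([[0.1], [0.5], [0.9], [1.1], [1.5], [1.9]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression()
clf.fit(X, y)
print(clf.predict([[0.3], [1.7]]))       # expect [0, 1]
print(clf.predict_proba([[1.0]]))        # probabilities near 0.5 at the boundary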

Softmax Regression

    Softmax Regression (or multinomial logistic regression) is a generalization of logistic regression to the case where we want to handle multiple classes. Each class k has its own dedicated parameter vector theta(k):

s_k(x) = theta(k).T * x  # softmax score for class k

grad_k = 1/m * SUM_i (p_k(i) - y_k(i)) * x(i)

# cross-entropy gradient vector for class k; p_k(i) is the predicted softmax probability that instance i belongs to class k

    We compute the score of every possible class, convert the scores into probabilities, and compare against the target: y_k(i) is equal to 1 if the target class of the ith instance is k; otherwise it is equal to 0.
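
    A minimal sketch of turning scores into softmax probabilities (the example scores are illustrative):

import numpy as np

def softmax(scores):
    # p_k = exp(s_k) / SUM_j exp(s_j); subtract the max for numerical stability
    exps = np.exp(scores - np.max(scores))
    return exps / exps.sum()

s = np.array([2.0, 1.0, 0.1])   # softmax scores s_k(x) for three classes
print(softmax(s))               # probabilities summing to 1; class 0 is most likely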
