Ordinary least squares linear regression: sklearn.linear_model.LinearRegression(fit_intercept=True, normalize=False, copy_X=True, n_jobs=1)
Main parameters:
fit_intercept: boolean, default True. If True, an intercept term is fitted for the model; if False, no intercept is used.
normalize: boolean, default False. This parameter is ignored when fit_intercept is set to False. If True, each input feature is normalized before fitting by subtracting its mean and dividing by its l2-norm, i.e. (X - mean(X)) / ||X||. If normalize=False, the data can instead be standardized beforehand with sklearn.preprocessing.StandardScaler, as sketched below.
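A quick sketch of the StandardScaler route mentioned above (the toy arrays below are assumed purely for illustration and are not part of this example); note that StandardScaler divides by the standard deviation rather than by the l2-norm:

import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression

# Toy data, assumed for illustration only
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.0, 4.1, 6.0, 8.2])

scaler = StandardScaler()            # subtract mean, divide by standard deviation (per feature)
X_scaled = scaler.fit_transform(X)

LR = LinearRegression()              # normalize left at its default of False
LR.fit(X_scaled, y)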
Attributes:
coef_: regression coefficients (the slope)
intercept_: the intercept term (see the toy sketch below)
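A minimal sketch with assumed toy data showing what coef_ and intercept_ look like after fitting a noiseless line y = 3x + 2:

import numpy as np
from sklearn.linear_model import LinearRegression

X = np.arange(5).reshape(-1, 1)   # [[0], [1], [2], [3], [4]]
y = 3 * X.ravel() + 2             # exact line with slope 3 and intercept 2

LR = LinearRegression().fit(X, y)
print(LR.coef_)        # approximately [3.]  (slope)
print(LR.intercept_)   # approximately 2.0   (intercept)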
Main methods:
①fit(X, y, sample_weight=None): fit the linear model on the training data
②predict(X): return predictions for the samples in X
③score(X, y, sample_weight=None): returns the coefficient of determination R², equal to 1 - (((y_true - y_pred) ** 2).sum() / ((y_true - y_true.mean()) ** 2).sum()); a quick check of this equivalence follows below
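A minimal sketch (toy arrays assumed) verifying that score(X, y) returns the same R² value as the formula above and as r2_score:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 2.2, 2.9, 4.3])

LR = LinearRegression().fit(X, y)
y_pred = LR.predict(X)

# 1 - SS_res / SS_tot, exactly the formula given for score()
manual_r2 = 1 - ((y - y_pred) ** 2).sum() / ((y - y.mean()) ** 2).sum()
print(np.isclose(LR.score(X, y), manual_r2))            # True
print(np.isclose(LR.score(X, y), r2_score(y, y_pred)))  # True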
Using the diabetes dataset bundled with sklearn, we build the simplest univariate (single-feature) regression model:
In [1]: import numpy as np
   ...: import matplotlib.pyplot as plt
   ...: from sklearn import datasets, linear_model
   ...: from sklearn.metrics import mean_squared_error, r2_score
   ...: from sklearn.model_selection import train_test_split
   ...: # Load the diabetes dataset
   ...: diabetes = datasets.load_diabetes()
   ...: # Use only the third feature (BMI); equivalent to diabetes.data[:, 2].reshape(-1, 1)
   ...: X = diabetes.data[:, np.newaxis, 2]
   ...: y = diabetes.target
   ...: X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
   ...: LR = linear_model.LinearRegression()
   ...: LR.fit(X_train, y_train)
   ...: print('intercept_:%.3f' % LR.intercept_)
   ...: print('coef_:%.3f' % LR.coef_[0])
   ...: # Equivalent to ((y_test - LR.predict(X_test)) ** 2).mean()
   ...: print('Mean squared error: %.3f' % mean_squared_error(y_test, LR.predict(X_test)))
   ...: # Equivalent to 1 - ((y_test - LR.predict(X_test)) ** 2).sum() / ((y_test - y_test.mean()) ** 2).sum()
   ...: print('Variance score: %.3f' % r2_score(y_test, LR.predict(X_test)))
   ...: print('score: %.3f' % LR.score(X_test, y_test))
   ...: plt.scatter(X_test, y_test, color='green')
   ...: plt.plot(X_test, LR.predict(X_test), color='red', linewidth=3)
   ...: plt.show()
   ...:
intercept_:152.003
coef_:998.578
Mean squared error: 4061.826
Variance score: 0.233
score: 0.233
The resulting scatter plot with the fitted regression line is shown below: