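In statistics, linear regression is a form of regression analysis that models the relationship between one or more independent variables and a dependent variable using a least-squares function called the linear regression equation. This function is a linear combination of one or more model parameters called regression coefficients. The case with a single independent variable is called simple regression; the case with more than one independent variable is called multiple regression (multivariable linear regression).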
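To make the "least-squares function" and "linear combination of regression coefficients" concrete, here is a minimal sketch that fits such a model with NumPy alone. The helper name `fit_least_squares` and the toy data are assumptions made for illustration and are not part of the original notes; the sketch relies on `np.linalg.lstsq` to minimize the sum of squared residuals.

```python
import numpy as np

def fit_least_squares(X, y):
    """Return (intercept, coefficients) minimizing the sum of squared residuals.

    Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        # A single independent variable (simple regression) becomes one column.
        X = X.reshape(-1, 1)
    # Prepend a column of ones so the intercept is estimated together
    # with the regression coefficients.
    X_b = np.hstack([np.ones((X.shape[0], 1)), X])
    theta, *_ = np.linalg.lstsq(X_b, y, rcond=None)
    return theta[0], theta[1:]

# Toy data generated from y = 1 + 2*x1 - 3*x2 with no noise (an assumption,
# chosen so the fitted parameters can be checked by eye).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 2))
y_demo = 1.0 + 2.0 * X_demo[:, 0] - 3.0 * X_demo[:, 1]

intercept, coefs = fit_least_squares(X_demo, y_demo)
print(intercept, coefs)
```

Because the toy data contains no noise, the fitted intercept and coefficients should match the generating values up to floating-point error. With two columns in `X_demo` this is the multiple-regression case; passing a 1-D array gives simple regression.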
Implementing linear regression with scikit-learn
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
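### playML is a separate package, not part of scikit-learn; its train_test_split takes a seed argument
from playML.model_selection import train_test_split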
from sklearn.linear_model import LinearRegression
### Boston housing dataset (load_boston was removed in scikit-learn 1.2, so this needs an older version)
boston = datasets.load_boston()
X = boston.data
y = boston.target
### Keep only the samples whose target is below 50.0
X = X[y < 50.0]
y = y[y < 50.0]
X_train, X_test, y_train, y_test = train_test_split(X, y, seed=666)
lin_reg = LinearRegression()
lin_reg.fit(X_train, y_train)
lin_reg.coef_
### Output (regression coefficients):
### array([ -1.18919477e-01, 3.63991462e-02, -3.56494193e-02,
###         5.66737830e-02, -1.16195486e+01, 3.42022185e+00,
###        -2.31470282e-02, -1.19509560e+00, 2.59339091e-01,
###        -1.40112724e-02, -8.36521175e-01, 7.92283639e-03,
###        -3.81966137e-01])
lin_reg.score(X_test, y_test)
### Output (R^2 score on the test set, not a regression coefficient): 0.81298026026584758
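The value returned by `score` on a scikit-learn `LinearRegression` is the coefficient of determination (R^2). The sketch below shows how that quantity can be computed by hand. The function name `r2_score_manual` and the small arrays are assumptions for illustration only and do not come from the Boston data above.

```python
import numpy as np

def r2_score_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (illustrative helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Residual sum of squares: the error left over after the model's predictions.
    ss_res = np.sum((y_true - y_pred) ** 2)
    # Total sum of squares: the error of always predicting the mean of y_true.
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Made-up targets, used only to show the two reference points of the metric.
y_true_demo = np.array([3.0, 5.0, 7.0, 9.0])

# Perfect predictions leave no residual error.
perfect = r2_score_manual(y_true_demo, y_true_demo)

# Always predicting the mean explains none of the variance.
baseline = r2_score_manual(y_true_demo, np.full(4, y_true_demo.mean()))

print(perfect, baseline)
```

Read this way, a score of about 0.81 means the fitted model removes roughly 81% of the squared error that a constant mean prediction would make on the test set.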