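What is machine learning
.Existing data (experience)
.A model of some kind (for example, the pattern behind being late)
.Use this model to predict the future (late or not)
.The "data is king" idea in the machine learning community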
Human analogy
1. Many repetitions
2. Good and accurate
==> data cleaning
feature engineering
3. Algorithm
==> tuning
AI
Microsoft CNTK
TensorFlow
Data(x,y)
TrainSet
TestSet
Historical data --algorithm--> model training --optimal solution--> model ----> / new data ----> prediction result
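The flow above can be sketched with scikit-learn roughly as follows. This is a minimal sketch, not a fixed recipe: the synthetic data (y = 3x + 2 plus noise), the 80/20 split and the choice of LinearRegression are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical "historical data": y = 3x + 2 plus a little noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 0.1, size=100)

# Data(x, y) -> TrainSet / TestSet (80% / 20%, an assumed ratio)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Historical data --algorithm--> model training --> model
model = LinearRegression()
model.fit(X_train, y_train)

# New data --> model --> prediction result
X_new = np.array([[4.0]])
y_new = model.predict(X_new)

# R^2 score on the held-out TestSet
test_score = model.score(X_test, y_test)
```

The TestSet is kept out of training so that test_score says something about data the model has not seen.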
Scikit-Learn
http://scikit-learn.org/stable/index.html
numpy
Scipy
matplotlib
pandas
DataFrame
BSD license
Anaconda
Classification
Regression
Algorithms: SVR, ridge regression, Lasso
Clustering
Applications: Customer segmentation, Grouping experiment outcomes
Algorithms: K-Means, spectral clustering, mean-shift, ...
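A small K-Means sketch using sklearn.cluster.KMeans. The two groups of 2-D points are made-up data, and n_clusters=2 is chosen only because the data was generated with two groups.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Two hypothetical, well-separated groups of 2-D points.
group_a = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(50, 2))
group_b = rng.normal(loc=[5.0, 5.0], scale=0.3, size=(50, 2))
X = np.vstack([group_a, group_b])

# No labels are given: K-Means groups the points by distance alone.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)      # cluster index for every point
centers = kmeans.cluster_centers_   # one center per cluster
```

Which group gets index 0 and which gets index 1 is arbitrary, so only the grouping itself is meaningful.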
Dimensionality reduction
Applications: Visualization, Increased efficiency
Algorithms: PCA, feature selection, non-negative matrix factorization
Model selection
Comparing,validating and choosing parameters and models
Goal: Improved accuracy via parameter tuning
Modules: grid search,cross validation,metrics.
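A sketch of the three modules working together: cross validation, grid search and metrics. The iris dataset, the SVC classifier and the parameter grid are assumptions picked for illustration, not values from these notes.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

# Small dataset bundled with scikit-learn (no download needed).
X, y = load_iris(return_X_y=True)

# Cross validation: score one fixed model on 5 different splits.
cv_scores = cross_val_score(SVC(C=1.0), X, y, cv=5)

# Grid search: try every parameter combination, each with 5-fold CV.
param_grid = {"C": [0.1, 1.0, 10.0], "kernel": ["linear", "rbf"]}
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)

best_params = search.best_params_   # combination with the best mean CV score
best_score = search.best_score_     # that mean CV accuracy

# Metrics: accuracy of the refitted best model on the data it was fitted on.
train_accuracy = accuracy_score(y, search.predict(X))
```

best_score is the more honest number here, because train_accuracy is measured on data the model has already seen.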
Preprocessing
Feature extraction and normalization
Application: Transforming input data such as text for use with machine learning algorithms.
Modules: preprocessing,feature extraction
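A sketch of both modules. The numeric matrix and the two short documents are invented examples; MinMaxScaler, StandardScaler and CountVectorizer are just three of the available transformers.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Two hypothetical features on very different scales.
X = np.array([[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]])

# Normalization: rescale each column to the range [0, 1].
X_minmax = MinMaxScaler().fit_transform(X)

# Standardization: each column gets zero mean and unit variance.
X_std = StandardScaler().fit_transform(X)

# Feature extraction: turn text into a matrix of word counts.
docs = ["late again today", "on time today"]
vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(docs)   # sparse matrix, one row per document
vocabulary = sorted(vectorizer.vocabulary_)
```

After scaling, both columns carry equal weight for algorithms that depend on distances.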
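Linear regression
Linear: if two variables are related by a first-degree function, they are said to have a linear relationship
Linear: "linear" refers to a proportional, straight-line relationship between quantities; in space and time it represents regular, smooth motion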
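A first-degree relationship y = w * x + b can be recovered from samples by ordinary least squares. The sketch below uses only plain Python; fit_line is a hypothetical helper name and the sample points are made up so that they lie exactly on y = 2x + 3.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = w * x + b for paired samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b = mean_y - w * mean_x
    return w, b


xs = [1.0, 2.0, 3.0, 4.0]
ys = [5.0, 7.0, 9.0, 11.0]   # generated from y = 2x + 3
w, b = fit_line(xs, ys)
```

With noisy data the same formulas give the line that minimizes the sum of squared vertical distances to the points.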