A neural network, like a random forest, is one of many machine learning methods.
TensorFlow is one framework for implementing neural networks; as its website puts it, "an open-source machine learning framework for everyone". There are other frameworks to choose from as well, such as Caffe, PyTorch, and Keras.
TensorFlow basics:
For the basic concepts behind implementing a neural network in TensorFlow, please refer to the earlier articles in this series:
- Deep learning models: activation functions (Activation Function)
- Deep learning strategies: loss functions (Loss Function / Cost Function)
- Deep learning: algorithms (Algorithm)
- Deep learning: evaluation metrics (F1)
- Deep learning training: Batch
- Deep learning visualization: TensorBoard
Now, on to part three of this rebooted series.
1. Read the data and convert the format
Read the data, one-hot encode the label, and convert the resulting DataFrames to NumPy ndarrays.
import numpy as np
import pandas as pd

# Read the data
data = pd.read_csv("haoma11yue_after_onehot_and_RobustScaler.csv", index_col=0, parse_dates=True)
print(data.shape)  # (43777, 70)
# Split into X and y
from sklearn.model_selection import train_test_split
X = data.loc[:, data.columns != 'yonghuzhuangtai']
y = data.loc[:, data.columns == 'yonghuzhuangtai']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.22, random_state=0)
# One-hot encode the label
y_train_one_hot = pd.get_dummies(y_train['yonghuzhuangtai'], prefix='yonghuzhuangtai')
y_test_one_hot = pd.get_dummies(y_test['yonghuzhuangtai'], prefix='yonghuzhuangtai')
# Convert the DataFrames to ndarrays
y_train_one_hot_n = y_train_one_hot.values
X_train_n = X_train.values
y_test_one_hot_n = y_test_one_hot.values
X_test_n = X_test.values
Next, we build up the flow graph step by step.
2. Placeholders for the inputs
The placeholders only fix the number of columns; the number of rows is left as None, so any batch size can be fed in.
import tensorflow as tf

# ------- Input/output dimensions ----------
features = X_train.shape[1]  # input layer: the number of features
numClasses = 2               # output layer: two classes, one-hot encoded
# Declare the shapes of X (input layer) and Y (output layer) as placeholders
with tf.name_scope("inputs"):
    X_input = tf.placeholder(tf.float32, shape=[None, features], name="X_input")
    y_true = tf.placeholder(tf.float32, shape=[None, numClasses], name="y_true")
3. Model: the neural network
- First define a function that builds a single layer; the multi-layer network is then assembled by calling it repeatedly. It would be worth looking more closely later at how the initial values of the weights and biases affect the result.
# add_layer: adds one fully connected layer to the network
# layoutname: name of the layer
# inputs: input to the layer, i.e. the previous layer's output
# in_size: input dimension, i.e. the number of neurons in the previous layer
# out_size: output dimension, i.e. the number of neurons in this layer
# activation_function: activation function
def add_layer(layoutname, inputs, in_size, out_size, activation_function=None):
    with tf.name_scope(layoutname):
        with tf.name_scope('weights'):
            # stddev=0.1 was stumbled upon by accident and trains noticeably faster;
            # this is really a question of how W should be initialized.
            Weights = tf.Variable(tf.random_normal([in_size, out_size], stddev=0.1), name='W')
            # tf.summary.histogram records a distribution; tf.summary.scalar records a single value
            tf.summary.histogram('Weights', Weights)
        with tf.name_scope('biases'):
            biases = tf.Variable(tf.constant(0.1, shape=[out_size]), name='b')
            tf.summary.histogram('Biases', biases)
        with tf.name_scope('Wx_plus_b'):
            Wx_plus_b = tf.add(tf.matmul(inputs, Weights), biases)
        if activation_function is None:
            outputs = Wx_plus_b
        else:
            outputs = activation_function(Wx_plus_b)
        return outputs
- Build the neural network: three hidden layers, plus the input and output layers, for five layers in total. The activation function is tf.nn.relu for now; later it is worth trying tf.tanh to see the difference.
num_HiddenNeurons1 = 50  # first hidden layer
num_HiddenNeurons2 = 40  # second hidden layer
num_HiddenNeurons3 = 20  # third hidden layer
with tf.name_scope('first_hidden_layer'):
    first_hidden_layer = add_layer("first_hidden_layer", X_input, features, num_HiddenNeurons1, activation_function=tf.nn.relu)
with tf.name_scope('second_hidden_layer'):
    second_hidden_layer = add_layer("second_hidden_layer", first_hidden_layer, num_HiddenNeurons1, num_HiddenNeurons2, activation_function=tf.nn.relu)
with tf.name_scope('third_hidden_layer'):
    third_hidden_layer = add_layer("third_hidden_layer", second_hidden_layer, num_HiddenNeurons2, num_HiddenNeurons3, activation_function=tf.nn.relu)
with tf.name_scope('prediction'):
    y_prediction = add_layer('prediction', third_hidden_layer, num_HiddenNeurons3, numClasses, activation_function=None)
    # y_prediction is not a probability distribution; softmax has not been applied yet, e.g.
    # [[ 84.97052765  47.09545517]
    #  [ 84.97052765  47.09545517]]
with tf.name_scope('prediction_softmax'):
    y_prediction_softmax = tf.nn.softmax(y_prediction)
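# Side note (illustrative, not part of the original run): softmax saturates on logits of
# this magnitude; softmax([84.97, 47.09]) is roughly [1.0, 3.6e-17], so the squared-error
# loss defined in the next section receives almost no gradient from rows like the example above.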
with tf.name_scope('Save'):
    saver = tf.train.Saver(max_to_keep=4)  # keep only the four most recent checkpoints
4. Strategy: the loss function
For now the loss is the mean squared error; later it is worth trying softmax_cross_entropy_with_logits to see the difference (a sketch follows the code below).
One open question: the loss defines what the optimizer actually pursues, and mean squared error does not directly target the final evaluation metric, which is F1. Is there a loss function that directly reflects F1? F1 itself is not differentiable, so it cannot be optimized directly; in practice one would reach for proxies such as class weighting or a soft-F1 surrogate.
with tf.name_scope("loss"):
# 结构风险=经验风险+正则化,经验风险使用交叉熵,正则化使用L2。
# 暂时不使用正则化,效果好像好点,或者说正则化还用得不好啊。
# ------------------------------------------------square--------------------------------------------------------
loss = tf.reduce_mean(tf.square(y_true - y_prediction_softmax))
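A minimal sketch of the cross-entropy alternative mentioned above (not used in this run; the scope name loss_cross_entropy is just a placeholder). Note that softmax_cross_entropy_with_logits must be given the unscaled logits y_prediction, not the softmax output, because it applies softmax internally:
with tf.name_scope("loss_cross_entropy"):
    # alternative empirical risk: cross-entropy on the raw logits
    cross_entropy_loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=y_prediction))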
5. Algorithm
with tf.name_scope("train"):
    # ---------------------------------- Exponentially decaying learning rate ---------------------------------------
    # exponential_decay(learning_rate, global_step, decay_steps, decay_rate, staircase=False, name=None)
    # decayed_learning_rate = learning_rate * decay_rate ^ (global_step / decay_steps)  (integer division when `staircase` is True)
    Iterations = 0
    # Note: Iterations is a plain Python int that is 0 here, so this learning rate never
    # actually decays; see the sketch below for how global_step is usually wired up.
    learning_rate = tf.train.exponential_decay(learning_rate=0.1, global_step=Iterations,
                                               decay_steps=10000, decay_rate=0.99, staircase=True)
    # ---------------------------------- Optimizer ---------------------------------------
    opt = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(loss)
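For reference, a minimal sketch of how the decay is usually wired up so that it actually takes effect: global_step is a TensorFlow variable that minimize() increments on every step (variable names here are illustrative; the run above did not use this):
global_step = tf.Variable(0, trainable=False, name="global_step")
learning_rate_decayed = tf.train.exponential_decay(learning_rate=0.1, global_step=global_step,
                                                   decay_steps=10000, decay_rate=0.99, staircase=True)
opt_decayed = tf.train.GradientDescentOptimizer(learning_rate_decayed).minimize(loss, global_step=global_step)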
6. Evaluation
The metrics are the usual ones: F1, recall, and precision, simply implemented with tensors; F1 is the one that matters most here.
def tf_confusion_metrics(model, actual_classes, session, feed_dict):
    predictions = tf.argmax(model, 1)
    actuals = tf.argmax(actual_classes, 1)
    ones_like_actuals = tf.ones_like(actuals)    # tf.ones_like: a tensor of the same shape with all elements set to 1
    zeros_like_actuals = tf.zeros_like(actuals)
    ones_like_predictions = tf.ones_like(predictions)
    zeros_like_predictions = tf.zeros_like(predictions)
    # true positive: prediction and ground truth are both 1
    tp_op = tf.reduce_sum(              # tf.reduce_sum counts the 1s
        tf.cast(                        # tf.cast converts the booleans back to 1.0/0.0
            tf.logical_and(             # tf.logical_and: element-wise AND of predicted-true and actually-true
                tf.equal(actuals, ones_like_actuals),        # tf.equal turns the 1s into True
                tf.equal(predictions, ones_like_predictions)
            ),
            "float"
        )
    )
    # true negative: prediction and ground truth are both 0
    tn_op = tf.reduce_sum(
        tf.cast(
            tf.logical_and(
                tf.equal(actuals, zeros_like_actuals),
                tf.equal(predictions, zeros_like_predictions)
            ),
            "float"
        )
    )
    # false positive: ground truth is 0, prediction is 1
    fp_op = tf.reduce_sum(
        tf.cast(
            tf.logical_and(
                tf.equal(actuals, zeros_like_actuals),
                tf.equal(predictions, ones_like_predictions)
            ),
            "float"
        )
    )
    # false negative: ground truth is 1, prediction is 0
    fn_op = tf.reduce_sum(
        tf.cast(
            tf.logical_and(
                tf.equal(actuals, ones_like_actuals),
                tf.equal(predictions, zeros_like_predictions)
            ),
            "float"
        )
    )
    tp, tn, fp, fn = \
        session.run(
            [tp_op, tn_op, fp_op, fn_op],
            feed_dict
        )
    with tf.name_scope("confusion_matrix"):
        with tf.name_scope("precision"):
            if (float(tp) + float(fp)) == 0:
                precision = 0
            else:
                precision = float(tp) / (float(tp) + float(fp))
            tf.summary.scalar("Precision", precision)
        with tf.name_scope("recall"):
            if (float(tp) + float(fn)) == 0:
                recall = 0
            else:
                recall = float(tp) / (float(tp) + float(fn))
            tf.summary.scalar("Recall", recall)
        with tf.name_scope("f1_score"):
            if (precision + recall) == 0:
                f1_score = 0
            else:
                f1_score = (2 * (precision * recall)) / (precision + recall)
            tf.summary.scalar("F1_score", f1_score)
        with tf.name_scope("accuracy"):
            accuracy = (float(tp) + float(tn)) / (float(tp) + float(fp) + float(fn) + float(tn))
            tf.summary.scalar("Accuracy", accuracy)
    print('F1 Score = ', f1_score, ', Precision = ', precision, ', Recall = ', recall, ', Accuracy = ', accuracy)
Besides the TensorFlow implementation, the same metrics can be computed with scikit-learn.
import sklearn as sk
import numpy as np
from sklearn.metrics import confusion_matrix

# Print the scores: precision, recall, f1, and so on
# y_pred_onehot_score: the network's predictions after softmax, type <class 'numpy.ndarray'>
# y_true_onehot_score: the one-hot encoded ground truth fed to the network, type <class 'numpy.ndarray'>
def scores_all(y_pred_onehot_score, y_true_onehot_score):
    y_pred_score = np.argmax(y_pred_onehot_score, axis=1)  # undo the one-hot encoding
    y_true_score = np.argmax(y_true_onehot_score, axis=1)  # undo the one-hot encoding
    # print("precision:", sk.metrics.precision_score(y_true_score, y_pred_score), \
    #       "recall:", sk.metrics.recall_score(y_true_score, y_pred_score), \
    #       "f1:", sk.metrics.f1_score(y_true_score, y_pred_score))
    print("f1:", sk.metrics.f1_score(y_true_score, y_pred_score))
7. Batch
The mini-batch generator.
# -------------- Function description -----------------
# sourceData_feature: feature part of the training set
# sourceData_label:   label part of the training set
# batch_size: the thickness of each slice of beef (rows per batch)
# num_epochs: how many times the beef gets re-boiled (passes over the data)
# shuffle: whether to shuffle the data
def batch_iter(sourceData_feature, sourceData_label, batch_size, num_epochs, shuffle=True):
    data_size = len(sourceData_feature)
    num_batches_per_epoch = int(data_size / batch_size)  # samples / batch size; the leftover "tail" is dropped
    for epoch in range(num_epochs):
        # Shuffle the data at each epoch
        if shuffle:
            shuffle_indices = np.random.permutation(np.arange(data_size))
            shuffled_data_feature = sourceData_feature[shuffle_indices]
            shuffled_data_label = sourceData_label[shuffle_indices]
        else:
            shuffled_data_feature = sourceData_feature
            shuffled_data_label = sourceData_label
        for batch_num in range(num_batches_per_epoch):  # batch_num runs from 0 to num_batches_per_epoch - 1
            start_index = batch_num * batch_size
            end_index = min((batch_num + 1) * batch_size, data_size)
            yield (shuffled_data_feature[start_index:end_index], shuffled_data_label[start_index:end_index])
8. Training
Run the iterative training loop.
batchSize = 1000     # the actual slice thickness (batch size)
epoch_count = 200    # number of training epochs
Iterations = 0       # counts the iterations
print("how many steps would train: ", (epoch_count * int((len(X_train_n) / batchSize))))
print('---------------------------start training------------------------------')
# Session
sess = tf.Session()
merged = tf.summary.merge_all()  # merges all summaries collected in the default graph
# Where to write the training-time parameters (loss, weights, biases, ...)
writer_val = tf.summary.FileWriter("logs/val", sess.graph)
# Where to write the run-time and memory profile of each node
writer_timeandplace = tf.summary.FileWriter("logs/timeandplace", sess.graph)
sess.run(tf.global_variables_initializer())
# Iterate; note that batch_iter is a generator (yield), so this single for loop walks through every batch of every epoch
for (batchInput, batchLabels) in batch_iter(X_train_n, y_train_one_hot_n, batchSize, epoch_count, shuffle=True):
    if Iterations % 1000 == 0:
        # -------------------------------- train and record ------------------------------------
        run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)  # what to trace at run time
        run_metadata = tf.RunMetadata()                                    # proto that receives the run-time trace
        # train
        trainingopt, trainingLoss, merged_r, y_prediction_softmax_r, y_prediction_r = \
            sess.run([opt, loss, merged, y_prediction_softmax, y_prediction],
                     feed_dict={X_input: batchInput, y_true: batchLabels},
                     options=run_options, run_metadata=run_metadata)
        # print(batchInput[0:5, :])
        # print(y_prediction_r[0:5, :])
        # print(y_prediction_softmax_r[0:5, :])
        # print(batchLabels[0:5, :])
        # record the summaries
        writer_val.add_summary(merged_r, Iterations)
        # write the run-time node information to the log file
        writer_timeandplace.add_run_metadata(run_metadata, 'Iterations%03d' % Iterations)
        # report progress
        print("step %d, %d people leave in this batch, loss is %g"
              % (Iterations, sum(np.argmax(batchLabels, axis=1)), trainingLoss))
        print('--------------------train scores------------------')
        tf_confusion_metrics(y_prediction_softmax_r, batchLabels, sess,
                             feed_dict={X_input: batchInput, y_true: batchLabels})
        # scores_all(y_prediction_softmax_r, batchLabels)
        # performance on the test set
        trainingLoss, y_prediction_softmax_r = sess.run([loss, y_prediction_softmax],
                                                        feed_dict={X_input: X_test_n, y_true: y_test_one_hot_n})
        print('**********************test score**********************')
        tf_confusion_metrics(y_prediction_softmax_r, y_test_one_hot_n, sess,
                             feed_dict={X_input: X_test_n, y_true: y_test_one_hot_n})
        # scores_all(y_prediction_softmax_r, y_test_one_hot_n)
    else:
        # train
        trainingopt, trainingLoss, merged_r = sess.run([opt, loss, merged],
                                                       feed_dict={X_input: batchInput, y_true: batchLabels})
        # record the summaries
        writer_val.add_summary(merged_r, Iterations)
    if Iterations % 3000 == 0:  # save the model every 3000 iterations
        saver.save(sess, 'tf_model/my_test_model', global_step=Iterations)
    Iterations = Iterations + 1
writer_val.close()
writer_timeandplace.close()
# sess.close()
Training result: after 64000 training steps the F1 is roughly 56.77%; the second-to-last evaluation was 59.31%.
9. Inspecting the parameters in TensorBoard
Loss: it quickly settles into oscillating around 0.0140.
10. Optimization
- Fine-tune the initial values of the weights and biases; the effect is not noticeable.
Weights = tf.Variable(tf.random_normal([in_size, out_size], mean=0, stddev=0.2), name='W')
biases = tf.Variable(tf.random_normal(shape=[out_size], mean=0, stddev=0.2), name='b')
- Try changing the network structure: add layers, or change the activation function.
- Change the loss function (the change I most want to make): tie it to F1, and add regularization (see the sketch after this list).
- Oversample to a 10:1 class ratio.
- Look at the variance and the bias.
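On the regularization idea above, a minimal sketch of adding an L2 penalty on the weights to the existing loss, assuming the weight variables keep the name 'W' given in add_layer (l2_lambda is a hypothetical, untuned value; this was not used in the run above):
l2_lambda = 0.001
l2_penalty = tf.add_n([tf.nn.l2_loss(v) for v in tf.trainable_variables() if v.name.endswith('W:0')])
loss_with_l2 = loss + l2_lambda * l2_penalty  # would replace `loss` in the optimizer above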
Going forward: keep working on feature engineering, gather more features, and keep tuning the various machine learning methods. There is also the question of how to implement and present all of this. More important still is going back to the underlying mathematics: statistics, linear algebra, optimization, and so on. One game to play with the algorithm: sneak the label into the features and see whether the algorithm finds it. And how should the two-dimensional decision boundary of the result be drawn?