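After the earlier exercises, the network can recognize the handwritten digits '0-9'. But what has to change to support more classes?
First, a brief introduction to LeNet: the network consists of 2 convolutional layers, 2 max-pooling layers, 2 fully connected layers, 1 ReLU layer, and 1 softmax layer.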
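A convolutional layer takes an input volume of size $W_1 \times H_1 \times D_1$.
It has 4 hyperparameters:
the number of filters $K$,
the spatial extent of the filters $F$,
the stride $S$,
the amount of zero padding $P$.
It produces an output volume of size $W_2 \times H_2 \times D_2$, where:
$W_2 = (W_1 - F + 2P)/S + 1$
$H_2 = (H_1 - F + 2P)/S + 1$
$D_2 = K$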
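As a quick check of the output-size rule, the sketch below applies (W1 - F + 2P)/S + 1 layer by layer. The layer settings are assumptions taken from the standard Caffe MNIST LeNet example (28x28 single-channel input, 5x5 convolutions with 20 and 50 filters, 2x2 max pooling with stride 2); the network trained in this post may differ, and the function names are only illustrative.

```python
def conv_output_size(w_in, f, s=1, p=0):
    """Spatial output size of a conv/pool layer: (W1 - F + 2P) / S + 1."""
    span = w_in - f + 2 * p
    if span % s != 0:
        raise ValueError("filter does not tile the input evenly")
    return span // s + 1


# (name, filter size F, stride S, padding P, number of filters K or None for pooling)
# Assumed values from the standard Caffe LeNet example, not from this post.
LAYERS = [
    ("conv1", 5, 1, 0, 20),
    ("pool1", 2, 2, 0, None),
    ("conv2", 5, 1, 0, 50),
    ("pool2", 2, 2, 0, None),
]


def trace_shapes(width=28, depth=1):
    """Return (name, W2, H2, D2) for each layer, assuming square inputs."""
    shapes = []
    for name, f, s, p, k in LAYERS:
        width = conv_output_size(width, f, s, p)
        if k is not None:  # pooling keeps the depth unchanged
            depth = k
        shapes.append((name, width, width, depth))
    return shapes


for shape in trace_shapes():
    print(shape)
```

Under these assumptions the volume entering the first fully connected layer is 4 x 4 x 50.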
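To classify both '0-9' and 'a-z', I went through the whole network. The data layer does not need to change; I only changed the last fully connected layer so that it has 36 outputs (10 digits plus 26 letters). Fortunately this worked, but the accuracy after training was only 0.6. The training set may be too small, or the convolutional layers may need to extract more features.
Here is the code. It is mostly the same as before, so I only show the parts I changed.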
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 36
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
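The value num_output: 36 has to equal the number of classes: 10 digits plus 26 lowercase letters. Here is a minimal sketch of that label set, assuming labels are numbered with the digits first and the letters after them. The real order is whatever the training list files and words.txt define, and the helper names are hypothetical.

```python
import string

# Assumed ordering: labels 0-9 are digits, labels 10-35 are 'a'-'z'.
CLASSES = string.digits + string.ascii_lowercase


def label_to_char(label):
    """Map a class index (0..35) to its character."""
    if not 0 <= label < len(CLASSES):
        raise ValueError("label out of range: %r" % (label,))
    return CLASSES[label]


def char_to_label(ch):
    """Map a character to its class index."""
    return CLASSES.index(ch)


print(len(CLASSES), label_to_char(10), char_to_label('z'))
```

If num_output is smaller than the largest label in the training data, the softmax loss layer cannot index that label, so the two must be kept in sync.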
# The train/test net protocol buffer definition
net: "mytest/chinese/lenet_plus_train_test.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# Here test_iter is 10, so test_iter times the test batch size (set in the
# net definition) should cover the full 370 testing images.
test_iter: 10
# Carry out testing every 300 training iterations.
test_interval: 300
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 20 iterations
display: 20
# The maximum number of iterations
max_iter: 10000
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "mytest/chinese/lenet"
# solver mode: CPU or GPU
solver_mode: GPU
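With lr_policy: "inv", Caffe computes the learning rate as base_lr * (1 + gamma * iter) ^ (-power). The sketch below evaluates that formula with the values from this solver to show how the rate decays over the 10000 iterations; the function name is only illustrative.

```python
def inv_lr(iteration, base_lr=0.01, gamma=0.0001, power=0.75):
    """Learning rate under Caffe's "inv" policy at a given iteration."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)


for it in (0, 1000, 5000, 10000):
    print(it, inv_lr(it))
```

At iteration 10000 the factor is 2 ^ (-0.75), so the rate has dropped to roughly 0.0059, a fairly gentle decay from the initial 0.01.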
Create predict_plus.py:
#coding=utf-8
#by yuzefan
from __future__ import print_function  # print() works the same on Python 2 and 3
import os
import sys
from os.path import join, isdir
import numpy as np
import cv2

caffe_root = '/home/ubuntu/caffe-master/'
# The pycaffe directory must be on sys.path before caffe is imported.
sys.path.insert(0, caffe_root + 'python')
import caffe

os.chdir(caffe_root)  # change current dir
DEPLOY_FILE = caffe_root + 'mytest/chinese/classificat_net.prototxt'
MODEL_FILE = caffe_root + 'mytest/chinese/lenet_iter_10000.caffemodel'
caffe.set_mode_gpu()  # select GPU mode before the net is created
net = caffe.Classifier(DEPLOY_FILE, MODEL_FILE)

IMAGE_PATH = caffe_root + 'mytest/chinese/data/train'
font = cv2.FONT_HERSHEY_SIMPLEX  # normal size sans-serif font
# One sub-directory per class; ignore stray files.
sd = sorted(d for d in os.listdir(IMAGE_PATH) if isdir(join(IMAGE_PATH, d)))
print(sd, 'add path done')

# Each line of words.txt is "<index> <name>"; keep only the name.
names = []
with open('/home/ubuntu/caffe-master/mytest/chinese/words.txt', 'r') as f:
    for l in f:
        if l.strip():
            names.append(l.split(' ')[1].strip())

stop = False
for d in sd:
    for fname in sorted(os.listdir(join(IMAGE_PATH, d))):
        img = join(IMAGE_PATH, d, fname)
        gray = cv2.imread(img, cv2.IMREAD_GRAYSCALE)
        if gray is None:  # skip files OpenCV cannot read
            continue
        # Enlarged 8-bit copy for display only; imshow treats float images as [0, 1].
        resized = cv2.resize(gray, (280, 280), interpolation=cv2.INTER_AREA)
        # caffe.Classifier expects H x W x K float arrays.
        input_image = gray.astype(np.float32)[:, :, np.newaxis]
        prediction = net.predict([input_image], oversample=False)
        label = names[prediction[0].argmax()]
        cv2.putText(resized, str(label), (0, 280), font, 2, (0,), 2)
        cv2.imshow("Prediction", resized)
        print('predicted class:', label)
        keycode = cv2.waitKey(50) & 0xFF
        if keycode == 27:  # ESC stops the whole run, not just this directory
            stop = True
            break
    if stop:
        break
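With accuracy only around 0.6, it can help to look at more than the single best guess, for example to see which characters get confused with each other. The sketch below is not part of the original script: it ranks the top-k classes from a probability vector such as prediction[0]. The helper name and the example values are made up for illustration.

```python
def top_k(probs, names, k=3):
    """Return the k most probable (name, probability) pairs, best first."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(names[i], float(probs[i])) for i in ranked[:k]]


# Made-up example: four classes with hypothetical softmax outputs.
example_names = ['0', 'o', 'q', '8']
example_probs = [0.40, 0.35, 0.05, 0.20]
print(top_k(example_probs, example_names, k=2))
```

In the prediction loop this could be called as top_k(prediction[0], names), since the helper only needs len() and indexing on its first argument.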
The earlier make_list.py and the training script only need their paths changed.
Result: