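Contents:
1. Environment requirements
2. About the data
3. Building the convolutional neural network
4. Testing the performance of the convolutional neural network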
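Note: if you only want to read through the tutorial, you do not need to run the code here. If you have a CUDA-capable GPU and want to run the code step by step, here is some guidance:
1. This tutorial assumes that your CUDA-capable GPU and its supporting environment are already set up, with Python 3.6.x, numpy, pandas, matplotlib, scikit-learn, tensorflow-gpu 1.4, and keras 2.x all installed.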
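As a quick sanity check before running anything, the sketch below reports which of the packages listed above can be found by the current Python interpreter. The helper name `check_environment` and the `REQUIRED` list are illustrative, not part of the original tutorial, and the check only tests that a module can be located; it does not verify versions or GPU support.

```python
import importlib.util
import sys

# Import names (not pip names) of the packages listed above;
# "sklearn" is the import name of scikit-learn.
REQUIRED = ["numpy", "pandas", "matplotlib", "sklearn", "tensorflow", "keras"]


def check_environment(modules=REQUIRED):
    """Return {module_name: bool}, True if the module can be located."""
    return {name: importlib.util.find_spec(name) is not None
            for name in modules}


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, found in check_environment().items():
        print("{:<12} {}".format(name, "found" if found else "MISSING"))
```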
For the Facial Keypoint Detection competition, the training set consists of 96*96 grayscale images. (Data link: https://pan.baidu.com/s/1XBswM0Xiz9QwcdddYnB_Hw password: xym7)
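To make the image format concrete, here is a minimal sketch of turning one row's pixel data into a 96*96 array. It assumes, as the loading code later in this section does, that each image is stored as a single string of space-separated pixel values. The helper `parse_image` and the synthetic `fake_row` string are made up for illustration and are not real competition data.

```python
import numpy as np


def parse_image(pixel_string, size=96):
    """Convert a space-separated pixel string into a (size, size) array."""
    pixels = np.array(pixel_string.split(), dtype=np.float32)
    if pixels.size != size * size:
        raise ValueError("expected {} pixels, got {}".format(
            size * size, pixels.size))
    return pixels.reshape(size, size)


# Synthetic stand-in for one value of the Image column (not real data).
fake_row = " ".join(str(i % 256) for i in range(96 * 96))
img = parse_image(fake_row)
print(img.shape, img.dtype)
```

With matplotlib installed, an array like `img` could be displayed with `plt.imshow(img, cmap='gray')`.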
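The 15 keypoints are:
Left and right eye centers: 2
Left and right eye outer corners: 2
Left and right eye inner corners: 2
Left and right eyebrow outer ends: 2
Left and right eyebrow inner ends: 2
Left and right mouth corners: 2
Top and bottom lip centers: 2
Nose tip: 1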
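Each keypoint has an x and a y coordinate, so the 15 keypoints give 30 target columns. The sketch below builds those column names; the keypoint names follow the column names printed in the loading output later in this section, while the variable names `KEYPOINTS` and `TARGET_COLUMNS` are only illustrative.

```python
# The 15 keypoint names, in the order they appear in the CSV columns.
KEYPOINTS = [
    "left_eye_center", "right_eye_center",
    "left_eye_inner_corner", "left_eye_outer_corner",
    "right_eye_inner_corner", "right_eye_outer_corner",
    "left_eyebrow_inner_end", "left_eyebrow_outer_end",
    "right_eyebrow_inner_end", "right_eyebrow_outer_end",
    "nose_tip",
    "mouth_left_corner", "mouth_right_corner",
    "mouth_center_top_lip", "mouth_center_bottom_lip",
]

# Every keypoint contributes an x column followed by a y column.
TARGET_COLUMNS = ["{}_{}".format(name, axis)
                  for name in KEYPOINTS for axis in ("x", "y")]

print(len(KEYPOINTS), len(TARGET_COLUMNS))
```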
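An interesting twist is that for some keypoints the dataset has about 7,000 training samples, while for others it has only about 2,000. Let's load the data: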
import os

import numpy as np
from pandas.io.parsers import read_csv
from sklearn.utils import shuffle

FTRAIN = 'training.csv'
FTEST = 'test.csv'


def load(test=False, cols=None):
    """Loads data from FTEST if *test* is True, otherwise from FTRAIN.
    Pass a list of *cols* if you're only interested in a subset of the
    target columns.
    """
    fname = FTEST if test else FTRAIN
    df = read_csv(os.path.expanduser(fname))  # load pandas dataframe

    # The Image column has pixel values separated by space; convert
    # the values to numpy arrays:
    df['Image'] = df['Image'].apply(lambda im: np.fromstring(im, sep=' '))

    if cols:  # get a subset of columns
        df = df[list(cols) + ['Image']]

    print(df.count())  # prints the number of values for each column
    df = df.dropna()  # drop all rows that have missing values in them

    X = np.vstack(df['Image'].values) / 255.  # scale pixel values to [0, 1]
    X = X.astype(np.float32)

    if not test:  # only FTRAIN has any target columns
        y = df[df.columns[:-1]].values
        y = y / 96  # scale target coordinates to [0, 1]
        X, y = shuffle(X, y, random_state=42)  # shuffle train data
        y = y.astype(np.float32)
    else:
        y = None

    return X, y


X, y = load()
print("X.shape == {}; X.min == {:.3f}; X.max == {:.3f}".format(
    X.shape, X.min(), X.max()))
print("y.shape == {}; y.min == {:.3f}; y.max == {:.3f}".format(
    y.shape, y.min(), y.max()))
The output is as follows:
left_eye_center_x 7039
left_eye_center_y 7039
right_eye_center_x 7036
right_eye_center_y 7036
left_eye_inner_corner_x 2271
left_eye_inner_corner_y 2271
left_eye_outer_corner_x 2267
left_eye_outer_corner_y 2267
right_eye_inner_corner_x 2268
right_eye_inner_corner_y 2268
right_eye_outer_corner_x 2268
right_eye_outer_corner_y 2268
left_eyebrow_inner_end_x 2270
left_eyebrow_inner_end_y 2270
left_eyebrow_outer_end_x 2225
left_eyebrow_outer_end_y 2225
right_eyebrow_inner_end_x 2270
right_eyebrow_inner_end_y 2270
right_eyebrow_outer_end_x 2236
right_eyebrow_outer_end_y 2236
nose_tip_x 7049
nose_tip_y 7049
mouth_left_corner_x 2269
mouth_left_corner_y 2269
mouth_right_corner_x 2270
mouth_right_corner_y 2270
mouth_center_top_lip_x 2275
mouth_center_top_lip_y 2275
mouth_center_bottom_lip_x 7016
mouth_center_bottom_lip_y 7016
Image 7049
dtype: int64
X.shape == (2140, 9216); X.min == 0.000; X.max == 1.000
y.shape == (2140, 30); y.min == 0.040; y.max == 0.998
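This output tells us that many images are missing some keypoints; for example, the right mouth corner (mouth_right_corner_x) has only 2270 values. We drop every image that does not have all 15 keypoints, which is what this line does: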
df = df.dropna() # drop all rows that have missing values in them
We use the remaining 2140 images as the training set for our network.
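The effect of `count()` and `dropna()` is easy to see on a small example. The sketch below uses a made-up four-row table (the values are invented for illustration, not taken from training.csv) in which one keypoint is missing from half of the rows.

```python
import numpy as np
import pandas as pd

# Toy table: nose_tip is always labelled, mouth_right_corner only sometimes.
toy = pd.DataFrame({
    "nose_tip_x": [44.0, 48.0, 50.0, 46.0],
    "nose_tip_y": [57.0, 60.0, 62.0, 58.0],
    "mouth_right_corner_x": [30.0, np.nan, 31.0, np.nan],
    "mouth_right_corner_y": [75.0, np.nan, 77.0, np.nan],
})

counts = toy.count()     # non-missing values per column
complete = toy.dropna()  # keep only rows with no missing values

print(counts)
print("rows before: {}, rows after dropna: {}".format(len(toy), len(complete)))
```

Only rows in which every column is present survive, which is why a dataset can shrink to the size of its most sparsely labelled combination of keypoints.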
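Another point worth noting is that the loading function rescales the pixel values from 0~255 to [0, 1], and also rescales the target values (the keypoint positions) from 0~95 to [0, 1].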
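Because the targets are divided by 96, anything the network predicts has to be multiplied by 96 again to get back to pixel coordinates. The following sketch shows the scaling and its inverse on a few made-up values; the helper names `scale_pixels`, `scale_targets`, and `unscale_targets` are hypothetical and do not appear in the loading function.

```python
import numpy as np

IMAGE_SIZE = 96  # images are 96*96 pixels


def scale_pixels(pixels):
    """Map raw pixel intensities from 0~255 to [0, 1]."""
    return pixels.astype(np.float32) / 255.


def scale_targets(coords):
    """Map pixel coordinates to [0, 1] by dividing by the image size."""
    return coords.astype(np.float32) / IMAGE_SIZE


def unscale_targets(scaled):
    """Invert scale_targets: map [0, 1] values back to pixel coordinates."""
    return scaled * IMAGE_SIZE


pixels = np.array([0, 128, 255])    # made-up intensities
coords = np.array([0.0, 48.0, 95.0])  # made-up keypoint coordinates

scaled_pixels = scale_pixels(pixels)
scaled_coords = scale_targets(coords)
restored = unscale_targets(scaled_coords)
print(scaled_pixels.min(), scaled_pixels.max())
print(scaled_coords, restored)
```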