Section 2-7: k-Nearest Neighbors | Handwriting Recognition System | Machine Learning in Action Study Notes

Original article. Last updated: 2018-08-11

This section covers:
The complete code for project case 2, the handwriting recognition system

1. KNN project case introduction:

Project case 2:

Handwriting recognition system

Project overview:
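  • Build a handwritten digit recognition system, based on a KNN classifier, that recognizes the digits 0 to 9.
  • The digits to recognize are stored in text files as black-and-white images that all share the same color depth and size: 32 pixels wide by 32 pixels high.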
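Development process:
  • Collect data: text files are provided.
  • Prepare data: write a function img2vector() that converts the image format into the vector format used by the classifier.
  • Analyze data: inspect the data at the Python prompt to make sure it meets the requirements.
  • Train the algorithm: this step does not apply to KNN.
  • Test the algorithm: write a function that uses part of the provided dataset as test samples. Test samples differ from the other samples in that their classes are already known; if the predicted class differs from the actual class, count it as an error.
  • Use the algorithm: this example does not complete this step. If you are interested, you could build a complete application that extracts digits from images and recognizes them; the US mail sorting system is a real system of this kind.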
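Dataset introduction

The data comes from Chapter 2 (k-nearest neighbors) of Machine Learning in Action:

  • The trainingDigits folder contains about 2000 examples, roughly 200 samples per digit; the content of one example is shown in the figure below.
    Figure: examples from the handwritten digit dataset
  • The testDigits folder contains about 900 test samples.
  • The data in the trainingDigits folder is used to train the classifier, and the data in the testDigits folder is used to test how well it performs.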

The content of one file in the trainingDigits folder is shown below (files in testDigits have the same format and are not shown):

00000000000001111000000000000000
00000000000011111110000000000000
00000000001111111111000000000000
00000001111111111111100000000000
00000001111111011111100000000000
00000011111110000011110000000000
00000011111110000000111000000000
00000011111110000000111100000000
00000011111110000000011100000000
00000011111110000000011100000000
00000011111100000000011110000000
00000011111100000000001110000000
00000011111100000000001110000000
00000001111110000000000111000000
00000001111110000000000111000000
00000001111110000000000111000000
00000001111110000000000111000000
00000011111110000000001111000000
00000011110110000000001111000000
00000011110000000000011110000000
00000001111000000000001111000000
00000001111000000000011111000000
00000001111000000000111110000000
00000001111000000001111100000000
00000000111000000111111000000000
00000000111100011111110000000000
00000000111111111111110000000000
00000000011111111111110000000000
00000000011111111111100000000000
00000000001111111110000000000000
00000000000111110000000000000000
00000000000011000000000000000000

Each file in the trainingDigits folder is named <digit>_<index>.txt, for example 0_13.txt; the testDigits folder is organized the same way.

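2. Handwriting recognition system project

Chapter 2 describes the k-nearest neighbors algorithm. At its core, the algorithm finds the points closest to a query point. The distance can be the Euclidean distance or some other metric; the book uses the Euclidean distance. KNN is mainly used for classification: given a new data point, it decides which class the point belongs to. It is simple and practical.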
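To make the idea concrete, here is a minimal sketch of the nearest-neighbor vote on a toy 2-D dataset. The function name knn_predict and the sample points are illustrative assumptions and are not part of kNN.py. The distance used is the Euclidean distance d(x, y) = sqrt(sum_i (x_i - y_i)^2).

```python
from collections import Counter

import numpy as np


def knn_predict(query, samples, labels, k):
    """Return the majority label among the k samples closest to query."""
    # Euclidean distance from the query to every sample (NumPy broadcasting).
    distances = np.sqrt(((samples - query) ** 2).sum(axis=1))
    # Indices of the k smallest distances.
    nearest = distances.argsort()[:k]
    # Majority vote over the labels of those k neighbors.
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): two points near (1, 1) and two near (0, 0).
samples = np.array([[1.0, 1.1], [1.0, 1.0], [0.0, 0.0], [0.0, 0.1]])
labels = ["A", "A", "B", "B"]

print(knn_predict(np.array([0.1, 0.2]), samples, labels, k=3))  # B
```

With k=3 the query (0.1, 0.2) has two "B" neighbors and one "A" neighbor, so the vote returns "B".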

2.1 Prepare data: convert images into test vectors

First create a file named kNN.py, then add a function img2vector() to it.

img2vector() converts image data into a vector.

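def img2vector(filename):
    """
    Convert image data into a vector.
    
    filename: path to the image file; each input image is 32*32
    return: a 1*1024 NumPy array
    
    The function creates a 1*1024 NumPy array, opens the given file, reads
    its first 32 lines, stores the first 32 characters of each line in the
    array, and returns the array.
    """
    returnVect = np.zeros((1,1024))
    # Use a with block so the file is closed after reading
    with open(filename) as fr:
        for i in range(32):
            lineStr = fr.readline()
            for j in range(32):
                returnVect[0,32*i+j] = int(lineStr[j])
    return returnVect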
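The same conversion can be written more compactly. The sketch below is one possible alternative, not the book's implementation: the name img2vector_compact is hypothetical, and the example builds a small synthetic file in a temporary directory so that it runs without the dataset.

```python
import os
import tempfile

import numpy as np


def img2vector_compact(filename):
    """Read a 32x32 text image of 0/1 characters into a 1x1024 array."""
    with open(filename) as fr:
        # Keep the first 32 lines and the first 32 characters of each line.
        rows = [[int(ch) for ch in line[:32]] for line in fr.readlines()[:32]]
    # Flatten row by row, matching the index 32*i + j used in img2vector().
    return np.array(rows, dtype=float).reshape(1, 1024)


# Build a synthetic image: all zeros except a single 1 at row 2, column 5.
lines = ["0" * 32 for _ in range(32)]
lines[2] = "0" * 5 + "1" + "0" * 26
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as f:
    f.write("\n".join(lines) + "\n")

vec = img2vector_compact(path)
print(vec.shape)            # (1, 1024)
print(vec[0, 32 * 2 + 5])   # 1.0
```

The pixel at row i, column j ends up at position 32*i + j, which is why the single 1 appears at index 69.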

The test code and its result:

import kNN

testVector=kNN.img2vector("testDigits/0_13.txt")

testVector[0,0:31]
Out[8]: 
array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  1.,  1.,  1.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.])

testVector[0,32:63]
Out[9]: 
array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,
        1.,  1.,  1.,  1.,  1.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.])
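Note that the slice 0:31 returns 31 values, because the end index of a slice is exclusive; a full image row is 0:32. The sketch below shows this and reshapes a vector back to 32x32 as a visual check. The synthetic vector is an assumption that stands in for the result of img2vector(), so the example runs without the dataset.

```python
import numpy as np

# Synthetic stand-in for a vector returned by img2vector(): ones in
# columns 14 to 17 of the first image row, zeros everywhere else.
vec = np.zeros((1, 1024))
vec[0, 14:18] = 1.0

first_row = vec[0, 0:32]        # full first row: 32 values
print(len(vec[0, 0:31]))        # 31, the end index is exclusive
print(len(first_row))           # 32

# Reshape to 32x32 and rebuild the text form of each row.
image = vec.reshape(32, 32)
text_rows = ["".join(str(int(v)) for v in row) for row in image]
print(text_rows[0])             # 00000000000000111100000000000000
```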

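Related knowledge points:
Knowledge point 1:
Reading file contents, illustrated with the file testDigits/0_13.txt from the dataset.

  • <f>.read(size=-1)
    Reads the entire content; if size is given, reads at most the first size characters. Because the first call consumes the whole file, every later call in the loop returns an empty string.
fr=open("testDigits/0_13.txt")
for i in range(32):
    lineStr1 = fr.read()
    print("Line",i,":",lineStr1)

Output:

Line 0 : 00000000000000111100000000000000
00000000000011111110000000000000
 ----------omitted-------------------
00000000000011111111111000000000
00000000000000111111110000000000

Line 1 : 
Line 2 : 
----omitted-----
Line 30 : 
Line 31 : 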
  • <f>.readline(size=-1)
    Reads one line; if size is given, reads at most the first size characters of that line.
fr=open("testDigits/0_13.txt")
for i in range(32):
    lineStr1 = fr.readline()
    print("Line",i,":",lineStr1)

Output:

Line 0 : 00000000000000111100000000000000
Line 1 : 00000000000011111110000000000000
 -------------------omitted---------------------
Line 30 : 00000000000011111111111000000000
Line 31 : 00000000000000111111110000000000
  • <f>.readlines(hint=-1)
    Reads all lines of the file and returns them as a list with one element per line. If hint is given, reading stops once the total size of the lines read so far exceeds hint characters. As with read(), the first call consumes the whole file, so later calls return an empty list.
fr=open("testDigits/0_13.txt")
for i in range(32):
    lineStr1 = fr.readlines()
    print("Line",i,":",lineStr1)

Output:

Line 0 : ['00000000000000111100000000000000\n', '00000000000011111110000000000000\n', '00000000000111111111000000000000\n', '00000000001111111111100000000000\n', '00000000001111111111100000000000\n', '00000000111111111111110000000000\n', '00000000011111111111111000000000\n', '00000000111111110111111000000000\n', '00000001111110000000111100000000\n', '00000001111110000000011100000000\n', '00000011111110000000011100000000\n', '00000011111100000000011100000000\n', '00000011111100000000011100000000\n', '00000001111100000000001111000000\n', '00000001111000000000000111000000\n', '00000011110000000000000111000000\n', '00000011110000000000000111000000\n', '00000011110000000000000111000000\n', '00000011110000000000000111000000\n', '00000011110000000000000111000000\n', '00000000111000000000000011100000\n', '00000000111000000000000011100000\n', '00000000111100000000000111100000\n', '00000000111100000000000111100000\n', '00000000111110000000011111100000\n', '00000000011111000000111111000000\n', '00000000011111111111111110000000\n', '00000000001111111111111111000000\n', '00000000000111111111111111000000\n', '00000000000111111111111110000000\n', '00000000000011111111111000000000\n', '00000000000000111111110000000000\n']
Line 1 : []
Line 2 : []
 ----omitted-------
Line 30 : []
Line 31 : []
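The three methods can also be compared side by side on an in-memory file. This sketch uses io.StringIO with a three-line sample (an assumption, so that it runs without the dataset); a text file object returned by open() behaves the same way for these calls. The last part shows the idiomatic way to iterate over lines.

```python
import io

sample = "0011\n0110\n1100\n"  # hypothetical 3-line stand-in for a digit file

# read(): the first call returns everything; the next call returns ''.
f = io.StringIO(sample)
first, second = f.read(), f.read()
print(repr(first), repr(second))

# readline(): each call returns the next line, including its '\n'.
f = io.StringIO(sample)
line0, line1 = f.readline(), f.readline()
print(repr(line0), repr(line1))

# readlines(): one call returns a list of all remaining lines.
f = io.StringIO(sample)
all_lines = f.readlines()
print(all_lines)

# Iterating over the file object yields one line at a time.
stripped = [line.rstrip("\n") for line in io.StringIO(sample)]
print(stripped)
```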

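2.2 Test the algorithm: recognize handwritten digits with k-nearest neighbors

Add a function handwritingClassTest() to kNN.py.

handwritingClassTest() is the handwriting recognition test routine: it builds the training set from trainingDigits and evaluates the classifier on testDigits.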

def handwritingClassTest():
    """
    Handwriting recognition test: build the training set, then classify the test set
    """
    # Training set: read the data from every file
    
    # Labels of the training set
    hwLabels = []
    # List of all file names in the trainingDigits folder
    trainingFileList = listdir('trainingDigits')
    # Number of files in the trainingDigits folder
    m = len(trainingFileList)
    # Zero matrix with m rows and 1024 columns
    trainingMat = np.zeros((m,1024))
    for i in range(m):
        # Name of the i-th data file in trainingFileList
        fileNameStr = trainingFileList[i]  
        # Split on '.' and keep [0], which drops the extension txt
        fileStr = fileNameStr.split('.')[0] 
        # Split on '_' and keep [0]: for 0_3 this drops the index 3 and keeps the digit 0
        classNumStr = int(fileStr.split('_')[0])
        # Store the label parsed from the file name
        hwLabels.append(classNumStr)
        # Store the data for this sample:
        # img2vector returns a 1x1024 vector, placed in row i
        trainingMat[i,:]=img2vector('trainingDigits/%s'\
                                      % fileNameStr)
        
    # Test set: read each file to obtain the query vector inX
    testFileList = listdir('testDigits')
    errorCount = 0.0
    mTest = len(testFileList)
    for i in range(mTest):
        fileNameStr = testFileList[i]
        fileStr = fileNameStr.split('.')[0]
        classNumStr = int(fileStr.split('_')[0])
        vectorUnderTest = img2vector('testDigits/%s' % fileNameStr)
        # Call the k-nearest neighbors classifier
        classifierResult = classify0(vectorUnderTest, trainingMat, hwLabels, 3)
        print("the classifier came back with: %d,the real answer is: %d" % (classifierResult, classNumStr))
        if classifierResult != classNumStr:
            errorCount += 1.0
    print("\nthe total number of errors is: %d" % errorCount)
    print("\nthe total error rate is: %f" % (errorCount/float(mTest)))

The test code and its result:

>import kNN
>kNN.handwritingClassTest()

the classifier came back with: 0,the real answer is: 0
the classifier came back with: 0,the real answer is: 0
-----------------------omitted-----------------------------
the classifier came back with: 9,the real answer is: 9
the classifier came back with: 9,the real answer is: 9

the total number of errors is: 13

the total error rate is: 0.013742

From the output above:
In this run, the k-nearest neighbors classifier has an error rate of about 1.37% on the handwritten digit dataset (13 errors out of 946 test files).
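The bookkeeping in handwritingClassTest() can be isolated in two small helpers. This is only a sketch: the names label_from_filename and error_rate are hypothetical and are not part of kNN.py. The last line reproduces the arithmetic behind the reported rate, where 13 errors at a rate of 0.013742 correspond to 946 test files.

```python
import os


def label_from_filename(file_name):
    """Extract the digit label from a name such as '0_13.txt'."""
    stem = os.path.splitext(file_name)[0]   # '0_13'
    return int(stem.split("_")[0])          # 0


def error_rate(predicted, actual):
    """Fraction of positions where the prediction differs from the truth."""
    errors = sum(p != a for p, a in zip(predicted, actual))
    return errors / len(actual)


print(label_from_filename("0_13.txt"))          # 0
print(label_from_filename("9_203.txt"))         # 9
print(error_rate([0, 1, 2, 7], [0, 1, 2, 9]))   # 0.25
print(round(13 / 946, 6))                       # 0.013742
```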

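Related knowledge points:

Knowledge point 1: os.listdir(path)
os.listdir() returns a list of the names of the files and folders in the given directory. The list is in arbitrary order, so sort it explicitly if you need a particular order. It does not include the special entries '.' and '..', even if they are present in the directory. Availability: Unix, Windows.

listdir() syntax: os.listdir(path)

  • Parameter: path -- the directory to list
  • Return value: a list of the files and folders under the given path.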
from os import listdir
path='trainingDigits'
dirs = listdir( path )
print(dirs)

Output:

['0_0.txt', '0_1.txt', '0_10.txt', '0_100.txt', '0_101.txt', '0_102.txt', '0_103.txt', '0_104.txt', '0_105.txt', '0_106.txt', '0_107.txt', '0_108.txt', '0_109.txt', '0_11.txt', '0_110.txt', '0_111.txt', '0_112.txt', '0_113.txt', '0_114.txt', '0_115.txt', '0_116.txt', '0_117.txt', '0_118.txt', '0_119.txt', '0_12.txt', '0_120.txt', '0_121.txt', '0_122.txt', '0_123.txt', '0_124.txt', '0_125.txt', '0_126.txt', '0_127.txt', '0_128.txt', '0_129.txt', '0_13.txt', '0_130.txt', '0_131.txt', '0_132.txt', '0_133.txt', '0_134.txt', '0_135.txt', '0_136.txt', '0_137.txt', '0_138.txt', '0_139.txt', '0_14.txt', '0_140.txt', '0_141.txt', '0_142.txt', '0_143.txt', '0_144.txt', '0_145.txt', '0_146.txt', '0_147.txt', '0_148.txt', '0_149.txt', '0_15.txt', '0_150.txt', '0_151.txt', '0_152.txt', '0_153.txt', '0_154.txt', '0_155.txt', '0_156.txt', '0_157.txt', '0_158.txt', '0_159.txt', '0_16.txt', '0_160.txt', '0_161.txt', '0_162.txt', '0_163.txt', '0_164.txt', '0_165.txt', '0_166.txt', '0_167.txt', '0_168.txt', '0_169.txt', '0_17.txt', '0_170.txt', '0_171.txt', '0_172.txt', '0_173.txt', '0_174.txt', '0_175.txt', '0_176.txt', '0_177.txt', '0_178.txt', '0_179.txt', '0_18.txt', '0_180.txt', '0_181.txt', '0_182.txt', '0_183.txt', '0_184.txt', '0_185.txt', '0_186.txt', '0_187.txt', '0_188.txt', '0_19.txt', '0_2.txt', '0_20.txt', '0_21.txt', '0_22.txt', '0_23.txt', '0_24.txt', '0_25.txt', '0_26.txt', '0_27.txt', '0_28.txt', '0_29.txt', '0_3.txt', '0_30.txt', '0_31.txt', '0_32.txt', '0_33.txt', '0_34.txt', '0_35.txt', '0_36.txt', '0_37.txt', '0_38.txt', '0_39.txt', '0_4.txt', '0_40.txt', '0_41.txt', '0_42.txt', '0_43.txt', '0_44.txt', '0_45.txt', '0_46.txt', '0_47.txt', '0_48.txt', '0_49.txt', '0_5.txt', '0_50.txt', '0_51.txt', '0_52.txt', '0_53.txt', '0_54.txt', '0_55.txt', '0_56.txt', '0_57.txt', '0_58.txt', '0_59.txt', '0_6.txt', '0_60.txt', '0_61.txt', '0_62.txt', '0_63.txt', '0_64.txt', '0_65.txt', '0_66.txt', '0_67.txt', '0_68.txt', '0_69.txt', '0_7.txt', '0_70.txt', '0_71.txt', 
'0_72.txt', '0_73.txt', '0_74.txt', '0_75.txt', '0_76.txt', '0_77.txt', '0_78.txt', '0_79.txt', '0_8.txt', '0_80.txt', '0_81.txt', '0_82.txt', '0_83.txt', '0_84.txt', '0_85.txt', '0_86.txt', '0_87.txt', '0_88.txt', '0_89.txt', '0_9.txt', '0_90.txt', '0_91.txt', '0_92.txt', '0_93.txt', '0_94.txt', '0_95.txt', '0_96.txt', '0_97.txt', '0_98.txt', '0_99.txt', '1_0.txt', '1_1.txt', '1_10.txt', '1_100.txt', '1_101.txt', '1_102.txt', '1_103.txt', '1_104.txt', '1_105.txt', '1_106.txt', '1_107.txt', '1_108.txt', '1_109.txt', '1_11.txt', '1_110.txt', '1_111.txt', '1_112.txt', '1_113.txt', '1_114.txt', '1_115.txt', '1_116.txt', '1_117.txt', '1_118.txt', '1_119.txt', '1_12.txt', '1_120.txt', '1_121.txt', '1_122.txt', '1_123.txt', '1_124.txt', '1_125.txt', '1_126.txt', '1_127.txt', '1_128.txt', '1_129.txt', '1_13.txt', '1_130.txt', '1_131.txt', '1_132.txt', '1_133.txt', '1_134.txt', '1_135.txt', '1_136.txt', '1_137.txt', '1_138.txt', '1_139.txt', '1_14.txt', '1_140.txt', '1_141.txt', '1_142.txt', '1_143.txt', '1_144.txt', '1_145.txt', '1_146.txt', '1_147.txt', '1_148.txt', '1_149.txt', '1_15.txt', '1_150.txt', '1_151.txt', '1_152.txt', '1_153.txt', '1_154.txt', '1_155.txt', '1_156.txt', '1_157.txt', '1_158.txt', '1_159.txt', '1_16.txt', '1_160.txt', '1_161.txt', '1_162.txt', '1_163.txt', '1_164.txt', '1_165.txt', '1_166.txt', '1_167.txt', '1_168.txt', '1_169.txt', '1_17.txt', '1_170.txt', '1_171.txt', '1_172.txt', '1_173.txt', '1_174.txt', '1_175.txt', '1_176.txt', '1_177.txt', '1_178.txt', '1_179.txt', '1_18.txt', '1_180.txt', '1_181.txt', '1_182.txt', '1_183.txt', '1_184.txt', '1_185.txt', '1_186.txt', '1_187.txt', '1_188.txt', '1_189.txt', '1_19.txt', '1_190.txt', '1_191.txt', '1_192.txt', '1_193.txt', '1_194.txt', '1_195.txt', '1_196.txt', '1_197.txt', '1_2.txt', '1_20.txt', '1_21.txt', '1_22.txt', '1_23.txt', '1_24.txt', '1_25.txt', '1_26.txt', '1_27.txt', '1_28.txt', '1_29.txt', '1_3.txt', '1_30.txt', '1_31.txt', '1_32.txt', '1_33.txt', '1_34.txt', '1_35.txt', '1_36.txt', 
'1_37.txt', '1_38.txt', '1_39.txt', '1_4.txt', '1_40.txt', '1_41.txt', '1_42.txt', '1_43.txt', '1_44.txt', '1_45.txt', '1_46.txt', '1_47.txt', '1_48.txt', '1_49.txt', '1_5.txt', '1_50.txt', '1_51.txt', '1_52.txt', '1_53.txt', '1_54.txt', '1_55.txt', '1_56.txt', '1_57.txt', '1_58.txt', '1_59.txt', '1_6.txt', '1_60.txt', '1_61.txt', '1_62.txt', '1_63.txt', '1_64.txt', '1_65.txt', '1_66.txt', '1_67.txt', '1_68.txt', '1_69.txt', '1_7.txt', '1_70.txt', '1_71.txt', '1_72.txt', '1_73.txt', '1_74.txt', '1_75.txt', '1_76.txt', '1_77.txt', '1_78.txt', '1_79.txt', '1_8.txt', '1_80.txt', '1_81.txt', '1_82.txt', '1_83.txt', '1_84.txt', '1_85.txt', '1_86.txt', '1_87.txt', '1_88.txt', '1_89.txt', '1_9.txt', '1_90.txt', '1_91.txt', '1_92.txt', '1_93.txt', '1_94.txt', '1_95.txt', '1_96.txt', '1_97.txt', '1_98.txt', '1_99.txt', '2_0.txt', '2_1.txt', '2_10.txt', '2_100.txt', '2_101.txt', '2_102.txt', '2_103.txt', '2_104.txt', '2_105.txt', '2_106.txt', '2_107.txt', '2_108.txt', '2_109.txt', '2_11.txt', '2_110.txt', '2_111.txt', '2_112.txt', '2_113.txt', '2_114.txt', '2_115.txt', '2_116.txt', '2_117.txt', '2_118.txt', '2_119.txt', '2_12.txt', '2_120.txt', '2_121.txt', '2_122.txt', '2_123.txt', '2_124.txt', '2_125.txt', '2_126.txt', '2_127.txt', '2_128.txt', '2_129.txt', '2_13.txt', '2_130.txt', '2_131.txt', '2_132.txt', '2_133.txt', '2_134.txt', '2_135.txt', '2_136.txt', '2_137.txt', '2_138.txt', '2_139.txt', '2_14.txt', '2_140.txt', '2_141.txt', '2_142.txt', '2_143.txt', '2_144.txt', '2_145.txt', '2_146.txt', '2_147.txt', '2_148.txt', '2_149.txt', '2_15.txt', '2_150.txt', '2_151.txt', '2_152.txt', '2_153.txt', '2_154.txt', '2_155.txt', '2_156.txt', '2_157.txt', '2_158.txt', '2_159.txt', '2_16.txt', '2_160.txt', '2_161.txt', '2_162.txt', '2_163.txt', '2_164.txt', '2_165.txt', '2_166.txt', '2_167.txt', '2_168.txt', '2_169.txt', '2_17.txt', '2_170.txt', '2_171.txt', '2_172.txt', '2_173.txt', '2_174.txt', '2_175.txt', '2_176.txt', '2_177.txt', '2_178.txt', '2_179.txt', '2_18.txt', 
'2_180.txt', '2_181.txt', '2_182.txt', '2_183.txt', '2_184.txt', '2_185.txt', '2_186.txt', '2_187.txt', '2_188.txt', '2_189.txt', '2_19.txt', '2_190.txt', '2_191.txt', '2_192.txt', '2_193.txt', '2_194.txt', '2_2.txt', '2_20.txt', '2_21.txt', '2_22.txt', '2_23.txt', '2_24.txt', '2_25.txt', '2_26.txt', '2_27.txt', '2_28.txt', '2_29.txt', '2_3.txt', '2_30.txt', '2_31.txt', '2_32.txt', '2_33.txt', '2_34.txt', '2_35.txt', '2_36.txt', '2_37.txt', '2_38.txt', '2_39.txt', '2_4.txt', '2_40.txt', '2_41.txt', '2_42.txt', '2_43.txt', '2_44.txt', '2_45.txt', '2_46.txt', '2_47.txt', '2_48.txt', '2_49.txt', '2_5.txt', '2_50.txt', '2_51.txt', '2_52.txt', '2_53.txt', '2_54.txt', '2_55.txt', '2_56.txt', '2_57.txt', '2_58.txt', '2_59.txt', '2_6.txt', '2_60.txt', '2_61.txt', '2_62.txt', '2_63.txt', '2_64.txt', '2_65.txt', '2_66.txt', '2_67.txt', '2_68.txt', '2_69.txt', '2_7.txt', '2_70.txt', '2_71.txt', '2_72.txt', '2_73.txt', '2_74.txt', '2_75.txt', '2_76.txt', '2_77.txt', '2_78.txt', '2_79.txt', '2_8.txt', '2_80.txt', '2_81.txt', '2_82.txt', '2_83.txt', '2_84.txt', '2_85.txt', '2_86.txt', '2_87.txt', '2_88.txt', '2_89.txt', '2_9.txt', '2_90.txt', '2_91.txt', '2_92.txt', '2_93.txt', '2_94.txt', '2_95.txt', '2_96.txt', '2_97.txt', '2_98.txt', '2_99.txt', '3_0.txt', '3_1.txt', '3_10.txt', '3_100.txt', '3_101.txt', '3_102.txt', '3_103.txt', '3_104.txt', '3_105.txt', '3_106.txt', '3_107.txt', '3_108.txt', '3_109.txt', '3_11.txt', '3_110.txt', '3_111.txt', '3_112.txt', '3_113.txt', '3_114.txt', '3_115.txt', '3_116.txt', '3_117.txt', '3_118.txt', '3_119.txt', '3_12.txt', '3_120.txt', '3_121.txt', '3_122.txt', '3_123.txt', '3_124.txt', '3_125.txt', '3_126.txt', '3_127.txt', '3_128.txt', '3_129.txt', '3_13.txt', '3_130.txt', '3_131.txt', '3_132.txt', '3_133.txt', '3_134.txt', '3_135.txt', '3_136.txt', '3_137.txt', '3_138.txt', '3_139.txt', '3_14.txt', '3_140.txt', '3_141.txt', '3_142.txt', '3_143.txt', '3_144.txt', '3_145.txt', '3_146.txt', '3_147.txt', '3_148.txt', '3_149.txt', '3_15.txt', 
'3_150.txt', '3_151.txt', '3_152.txt', '3_153.txt', '3_154.txt', '3_155.txt', '3_156.txt', '3_157.txt', '3_158.txt', '3_159.txt', '3_16.txt', '3_160.txt', '3_161.txt', '3_162.txt', '3_163.txt', '3_164.txt', '3_165.txt', '3_166.txt', '3_167.txt', '3_168.txt', '3_169.txt', '3_17.txt', '3_170.txt', '3_171.txt', '3_172.txt', '3_173.txt', '3_174.txt', '3_175.txt', '3_176.txt', '3_177.txt', '3_178.txt', '3_179.txt', '3_18.txt', '3_180.txt', '3_181.txt', '3_182.txt', '3_183.txt', '3_184.txt', '3_185.txt', '3_186.txt', '3_187.txt', '3_188.txt', '3_189.txt', '3_19.txt', '3_190.txt', '3_191.txt', '3_192.txt', '3_193.txt', '3_194.txt', '3_195.txt', '3_196.txt', '3_197.txt', '3_198.txt', '3_2.txt', '3_20.txt', '3_21.txt', '3_22.txt', '3_23.txt', '3_24.txt', '3_25.txt', '3_26.txt', '3_27.txt', '3_28.txt', '3_29.txt', '3_3.txt', '3_30.txt', '3_31.txt', '3_32.txt', '3_33.txt', '3_34.txt', '3_35.txt', '3_36.txt', '3_37.txt', '3_38.txt', '3_39.txt', '3_4.txt', '3_40.txt', '3_41.txt', '3_42.txt', '3_43.txt', '3_44.txt', '3_45.txt', '3_46.txt', '3_47.txt', '3_48.txt', '3_49.txt', '3_5.txt', '3_50.txt', '3_51.txt', '3_52.txt', '3_53.txt', '3_54.txt', '3_55.txt', '3_56.txt', '3_57.txt', '3_58.txt', '3_59.txt', '3_6.txt', '3_60.txt', '3_61.txt', '3_62.txt', '3_63.txt', '3_64.txt', '3_65.txt', '3_66.txt', '3_67.txt', '3_68.txt', '3_69.txt', '3_7.txt', '3_70.txt', '3_71.txt', '3_72.txt', '3_73.txt', '3_74.txt', '3_75.txt', '3_76.txt', '3_77.txt', '3_78.txt', '3_79.txt', '3_8.txt', '3_80.txt', '3_81.txt', '3_82.txt', '3_83.txt', '3_84.txt', '3_85.txt', '3_86.txt', '3_87.txt', '3_88.txt', '3_89.txt', '3_9.txt', '3_90.txt', '3_91.txt', '3_92.txt', '3_93.txt', '3_94.txt', '3_95.txt', '3_96.txt', '3_97.txt', '3_98.txt', '3_99.txt', '4_0.txt', '4_1.txt', '4_10.txt', '4_100.txt', '4_101.txt', '4_102.txt', '4_103.txt', '4_104.txt', '4_105.txt', '4_106.txt', '4_107.txt', '4_108.txt', '4_109.txt', '4_11.txt', '4_110.txt', '4_111.txt', '4_112.txt', '4_113.txt', '4_114.txt', '4_115.txt', '4_116.txt', 
'4_117.txt', '4_118.txt', '4_119.txt', '4_12.txt', '4_120.txt', '4_121.txt', '4_122.txt', '4_123.txt', '4_124.txt', '4_125.txt', '4_126.txt', '4_127.txt', '4_128.txt', '4_129.txt', '4_13.txt', '4_130.txt', '4_131.txt', '4_132.txt', '4_133.txt', '4_134.txt', '4_135.txt', '4_136.txt', '4_137.txt', '4_138.txt', '4_139.txt', '4_14.txt', '4_140.txt', '4_141.txt', '4_142.txt', '4_143.txt', '4_144.txt', '4_145.txt', '4_146.txt', '4_147.txt', '4_148.txt', '4_149.txt', '4_15.txt', '4_150.txt', '4_151.txt', '4_152.txt', '4_153.txt', '4_154.txt', '4_155.txt', '4_156.txt', '4_157.txt', '4_158.txt', '4_159.txt', '4_16.txt', '4_160.txt', '4_161.txt', '4_162.txt', '4_163.txt', '4_164.txt', '4_165.txt', '4_166.txt', '4_167.txt', '4_168.txt', '4_169.txt', '4_17.txt', '4_170.txt', '4_171.txt', '4_172.txt', '4_173.txt', '4_174.txt', '4_175.txt', '4_176.txt', '4_177.txt', '4_178.txt', '4_179.txt', '4_18.txt', '4_180.txt', '4_181.txt', '4_182.txt', '4_183.txt', '4_184.txt', '4_185.txt', '4_19.txt', '4_2.txt', '4_20.txt', '4_21.txt', '4_22.txt', '4_23.txt', '4_24.txt', '4_25.txt', '4_26.txt', '4_27.txt', '4_28.txt', '4_29.txt', '4_3.txt', '4_30.txt', '4_31.txt', '4_32.txt', '4_33.txt', '4_34.txt', '4_35.txt', '4_36.txt', '4_37.txt', '4_38.txt', '4_39.txt', '4_4.txt', '4_40.txt', '4_41.txt', '4_42.txt', '4_43.txt', '4_44.txt', '4_45.txt', '4_46.txt', '4_47.txt', '4_48.txt', '4_49.txt', '4_5.txt', '4_50.txt', '4_51.txt', '4_52.txt', '4_53.txt', '4_54.txt', '4_55.txt', '4_56.txt', '4_57.txt', '4_58.txt', '4_59.txt', '4_6.txt', '4_60.txt', '4_61.txt', '4_62.txt', '4_63.txt', '4_64.txt', '4_65.txt', '4_66.txt', '4_67.txt', '4_68.txt', '4_69.txt', '4_7.txt', '4_70.txt', '4_71.txt', '4_72.txt', '4_73.txt', '4_74.txt', '4_75.txt', '4_76.txt', '4_77.txt', '4_78.txt', '4_79.txt', '4_8.txt', '4_80.txt', '4_81.txt', '4_82.txt', '4_83.txt', '4_84.txt', '4_85.txt', '4_86.txt', '4_87.txt', '4_88.txt', '4_89.txt', '4_9.txt', '4_90.txt', '4_91.txt', '4_92.txt', '4_93.txt', '4_94.txt', '4_95.txt', 
'4_96.txt', '4_97.txt', '4_98.txt', '4_99.txt', '5_0.txt', '5_1.txt', '5_10.txt', '5_100.txt', '5_101.txt', '5_102.txt', '5_103.txt', '5_104.txt', '5_105.txt', '5_106.txt', '5_107.txt', '5_108.txt', '5_109.txt', '5_11.txt', '5_110.txt', '5_111.txt', '5_112.txt', '5_113.txt', '5_114.txt', '5_115.txt', '5_116.txt', '5_117.txt', '5_118.txt', '5_119.txt', '5_12.txt', '5_120.txt', '5_121.txt', '5_122.txt', '5_123.txt', '5_124.txt', '5_125.txt', '5_126.txt', '5_127.txt', '5_128.txt', '5_129.txt', '5_13.txt', '5_130.txt', '5_131.txt', '5_132.txt', '5_133.txt', '5_134.txt', '5_135.txt', '5_136.txt', '5_137.txt', '5_138.txt', '5_139.txt', '5_14.txt', '5_140.txt', '5_141.txt', '5_142.txt', '5_143.txt', '5_144.txt', '5_145.txt', '5_146.txt', '5_147.txt', '5_148.txt', '5_149.txt', '5_15.txt', '5_150.txt', '5_151.txt', '5_152.txt', '5_153.txt', '5_154.txt', '5_155.txt', '5_156.txt', '5_157.txt', '5_158.txt', '5_159.txt', '5_16.txt', '5_160.txt', '5_161.txt', '5_162.txt', '5_163.txt', '5_164.txt', '5_165.txt', '5_166.txt', '5_167.txt', '5_168.txt', '5_169.txt', '5_17.txt', '5_170.txt', '5_171.txt', '5_172.txt', '5_173.txt', '5_174.txt', '5_175.txt', '5_176.txt', '5_177.txt', '5_178.txt', '5_179.txt', '5_18.txt', '5_180.txt', '5_181.txt', '5_182.txt', '5_183.txt', '5_184.txt', '5_185.txt', '5_186.txt', '5_19.txt', '5_2.txt', '5_20.txt', '5_21.txt', '5_22.txt', '5_23.txt', '5_24.txt', '5_25.txt', '5_26.txt', '5_27.txt', '5_28.txt', '5_29.txt', '5_3.txt', '5_30.txt', '5_31.txt', '5_32.txt', '5_33.txt', '5_34.txt', '5_35.txt', '5_36.txt', '5_37.txt', '5_38.txt', '5_39.txt', '5_4.txt', '5_40.txt', '5_41.txt', '5_42.txt', '5_43.txt', '5_44.txt', '5_45.txt', '5_46.txt', '5_47.txt', '5_48.txt', '5_49.txt', '5_5.txt', '5_50.txt', '5_51.txt', '5_52.txt', '5_53.txt', '5_54.txt', '5_55.txt', '5_56.txt', '5_57.txt', '5_58.txt', '5_59.txt', '5_6.txt', '5_60.txt', '5_61.txt', '5_62.txt', '5_63.txt', '5_64.txt', '5_65.txt', '5_66.txt', '5_67.txt', '5_68.txt', '5_69.txt', '5_7.txt', '5_70.txt', 
'5_71.txt', '5_72.txt', '5_73.txt', '5_74.txt', '5_75.txt', '5_76.txt', '5_77.txt', '5_78.txt', '5_79.txt', '5_8.txt', '5_80.txt', '5_81.txt', '5_82.txt', '5_83.txt', '5_84.txt', '5_85.txt', '5_86.txt', '5_87.txt', '5_88.txt', '5_89.txt', '5_9.txt', '5_90.txt', '5_91.txt', '5_92.txt', '5_93.txt', '5_94.txt', '5_95.txt', '5_96.txt', '5_97.txt', '5_98.txt', '5_99.txt', '6_0.txt', '6_1.txt', '6_10.txt', '6_100.txt', '6_101.txt', '6_102.txt', '6_103.txt', '6_104.txt', '6_105.txt', '6_106.txt', '6_107.txt', '6_108.txt', '6_109.txt', '6_11.txt', '6_110.txt', '6_111.txt', '6_112.txt', '6_113.txt', '6_114.txt', '6_115.txt', '6_116.txt', '6_117.txt', '6_118.txt', '6_119.txt', '6_12.txt', '6_120.txt', '6_121.txt', '6_122.txt', '6_123.txt', '6_124.txt', '6_125.txt', '6_126.txt', '6_127.txt', '6_128.txt', '6_129.txt', '6_13.txt', '6_130.txt', '6_131.txt', '6_132.txt', '6_133.txt', '6_134.txt', '6_135.txt', '6_136.txt', '6_137.txt', '6_138.txt', '6_139.txt', '6_14.txt', '6_140.txt', '6_141.txt', '6_142.txt', '6_143.txt', '6_144.txt', '6_145.txt', '6_146.txt', '6_147.txt', '6_148.txt', '6_149.txt', '6_15.txt', '6_150.txt', '6_151.txt', '6_152.txt', '6_153.txt', '6_154.txt', '6_155.txt', '6_156.txt', '6_157.txt', '6_158.txt', '6_159.txt', '6_16.txt', '6_160.txt', '6_161.txt', '6_162.txt', '6_163.txt', '6_164.txt', '6_165.txt', '6_166.txt', '6_167.txt', '6_168.txt', '6_169.txt', '6_17.txt', '6_170.txt', '6_171.txt', '6_172.txt', '6_173.txt', '6_174.txt', '6_175.txt', '6_176.txt', '6_177.txt', '6_178.txt', '6_179.txt', '6_18.txt', '6_180.txt', '6_181.txt', '6_182.txt', '6_183.txt', '6_184.txt', '6_185.txt', '6_186.txt', '6_187.txt', '6_188.txt', '6_189.txt', '6_19.txt', '6_190.txt', '6_191.txt', '6_192.txt', '6_193.txt', '6_194.txt', '6_2.txt', '6_20.txt', '6_21.txt', '6_22.txt', '6_23.txt', '6_24.txt', '6_25.txt', '6_26.txt', '6_27.txt', '6_28.txt', '6_29.txt', '6_3.txt', '6_30.txt', '6_31.txt', '6_32.txt', '6_33.txt', '6_34.txt', '6_35.txt', '6_36.txt', '6_37.txt', '6_38.txt', 
'6_39.txt', '6_4.txt', '6_40.txt', '6_41.txt', '6_42.txt', '6_43.txt', '6_44.txt', '6_45.txt', '6_46.txt', '6_47.txt', '6_48.txt', '6_49.txt', '6_5.txt', '6_50.txt', '6_51.txt', '6_52.txt', '6_53.txt', '6_54.txt', '6_55.txt', '6_56.txt', '6_57.txt', '6_58.txt', '6_59.txt', '6_6.txt', '6_60.txt', '6_61.txt', '6_62.txt', '6_63.txt', '6_64.txt', '6_65.txt', '6_66.txt', '6_67.txt', '6_68.txt', '6_69.txt', '6_7.txt', '6_70.txt', '6_71.txt', '6_72.txt', '6_73.txt', '6_74.txt', '6_75.txt', '6_76.txt', '6_77.txt', '6_78.txt', '6_79.txt', '6_8.txt', '6_80.txt', '6_81.txt', '6_82.txt', '6_83.txt', '6_84.txt', '6_85.txt', '6_86.txt', '6_87.txt', '6_88.txt', '6_89.txt', '6_9.txt', '6_90.txt', '6_91.txt', '6_92.txt', '6_93.txt', '6_94.txt', '6_95.txt', '6_96.txt', '6_97.txt', '6_98.txt', '6_99.txt', '7_0.txt', '7_1.txt', '7_10.txt', '7_100.txt', '7_101.txt', '7_102.txt', '7_103.txt', '7_104.txt', '7_105.txt', '7_106.txt', '7_107.txt', '7_108.txt', '7_109.txt', '7_11.txt', '7_110.txt', '7_111.txt', '7_112.txt', '7_113.txt', '7_114.txt', '7_115.txt', '7_116.txt', '7_117.txt', '7_118.txt', '7_119.txt', '7_12.txt', '7_120.txt', '7_121.txt', '7_122.txt', '7_123.txt', '7_124.txt', '7_125.txt', '7_126.txt', '7_127.txt', '7_128.txt', '7_129.txt', '7_13.txt', '7_130.txt', '7_131.txt', '7_132.txt', '7_133.txt', '7_134.txt', '7_135.txt', '7_136.txt', '7_137.txt', '7_138.txt', '7_139.txt', '7_14.txt', '7_140.txt', '7_141.txt', '7_142.txt', '7_143.txt', '7_144.txt', '7_145.txt', '7_146.txt', '7_147.txt', '7_148.txt', '7_149.txt', '7_15.txt', '7_150.txt', '7_151.txt', '7_152.txt', '7_153.txt', '7_154.txt', '7_155.txt', '7_156.txt', '7_157.txt', '7_158.txt', '7_159.txt', '7_16.txt', '7_160.txt', '7_161.txt', '7_162.txt', '7_163.txt', '7_164.txt', '7_165.txt', '7_166.txt', '7_167.txt', '7_168.txt', '7_169.txt', '7_17.txt', '7_170.txt', '7_171.txt', '7_172.txt', '7_173.txt', '7_174.txt', '7_175.txt', '7_176.txt', '7_177.txt', '7_178.txt', '7_179.txt', '7_18.txt', '7_180.txt', '7_181.txt', 
'7_182.txt', '7_183.txt', '7_184.txt', '7_185.txt', '7_186.txt', '7_187.txt', '7_188.txt', '7_189.txt', '7_19.txt', '7_190.txt', '7_191.txt', '7_192.txt', '7_193.txt', '7_194.txt', '7_195.txt', '7_196.txt', '7_197.txt', '7_198.txt', '7_199.txt', '7_2.txt', '7_20.txt', '7_200.txt', '7_21.txt', '7_22.txt', '7_23.txt', '7_24.txt', '7_25.txt', '7_26.txt', '7_27.txt', '7_28.txt', '7_29.txt', '7_3.txt', '7_30.txt', '7_31.txt', '7_32.txt', '7_33.txt', '7_34.txt', '7_35.txt', '7_36.txt', '7_37.txt', '7_38.txt', '7_39.txt', '7_4.txt', '7_40.txt', '7_41.txt', '7_42.txt', '7_43.txt', '7_44.txt', '7_45.txt', '7_46.txt', '7_47.txt', '7_48.txt', '7_49.txt', '7_5.txt', '7_50.txt', '7_51.txt', '7_52.txt', '7_53.txt', '7_54.txt', '7_55.txt', '7_56.txt', '7_57.txt', '7_58.txt', '7_59.txt', '7_6.txt', '7_60.txt', '7_61.txt', '7_62.txt', '7_63.txt', '7_64.txt', '7_65.txt', '7_66.txt', '7_67.txt', '7_68.txt', '7_69.txt', '7_7.txt', '7_70.txt', '7_71.txt', '7_72.txt', '7_73.txt', '7_74.txt', '7_75.txt', '7_76.txt', '7_77.txt', '7_78.txt', '7_79.txt', '7_8.txt', '7_80.txt', '7_81.txt', '7_82.txt', '7_83.txt', '7_84.txt', '7_85.txt', '7_86.txt', '7_87.txt', '7_88.txt', '7_89.txt', '7_9.txt', '7_90.txt', '7_91.txt', '7_92.txt', '7_93.txt', '7_94.txt', '7_95.txt', '7_96.txt', '7_97.txt', '7_98.txt', '7_99.txt', '8_0.txt', '8_1.txt', '8_10.txt', '8_100.txt', '8_101.txt', '8_102.txt', '8_103.txt', '8_104.txt', '8_105.txt', '8_106.txt', '8_107.txt', '8_108.txt', '8_109.txt', '8_11.txt', '8_110.txt', '8_111.txt', '8_112.txt', '8_113.txt', '8_114.txt', '8_115.txt', '8_116.txt', '8_117.txt', '8_118.txt', '8_119.txt', '8_12.txt', '8_120.txt', '8_121.txt', '8_122.txt', '8_123.txt', '8_124.txt', '8_125.txt', '8_126.txt', '8_127.txt', '8_128.txt', '8_129.txt', '8_13.txt', '8_130.txt', '8_131.txt', '8_132.txt', '8_133.txt', '8_134.txt', '8_135.txt', '8_136.txt', '8_137.txt', '8_138.txt', '8_139.txt', '8_14.txt', '8_140.txt', '8_141.txt', '8_142.txt', '8_143.txt', '8_144.txt', '8_145.txt', '8_146.txt', 
'8_147.txt', '8_148.txt', '8_149.txt', '8_15.txt', '8_150.txt', '8_151.txt', '8_152.txt', '8_153.txt', '8_154.txt', '8_155.txt', '8_156.txt', '8_157.txt', '8_158.txt', '8_159.txt', '8_16.txt', '8_160.txt', '8_161.txt', '8_162.txt', '8_163.txt', '8_164.txt', '8_165.txt', '8_166.txt', '8_167.txt', '8_168.txt', '8_169.txt', '8_17.txt', '8_170.txt', '8_171.txt', '8_172.txt', '8_173.txt', '8_174.txt', '8_175.txt', '8_176.txt', '8_177.txt', '8_178.txt', '8_179.txt', '8_18.txt', '8_19.txt', '8_2.txt', '8_20.txt', '8_21.txt', '8_22.txt', '8_23.txt', '8_24.txt', '8_25.txt', '8_26.txt', '8_27.txt', '8_28.txt', '8_29.txt', '8_3.txt', '8_30.txt', '8_31.txt', '8_32.txt', '8_33.txt', '8_34.txt', '8_35.txt', '8_36.txt', '8_37.txt', '8_38.txt', '8_39.txt', '8_4.txt', '8_40.txt', '8_41.txt', '8_42.txt', '8_43.txt', '8_44.txt', '8_45.txt', '8_46.txt', '8_47.txt', '8_48.txt', '8_49.txt', '8_5.txt', '8_50.txt', '8_51.txt', '8_52.txt', '8_53.txt', '8_54.txt', '8_55.txt', '8_56.txt', '8_57.txt', '8_58.txt', '8_59.txt', '8_6.txt', '8_60.txt', '8_61.txt', '8_62.txt', '8_63.txt', '8_64.txt', '8_65.txt', '8_66.txt', '8_67.txt', '8_68.txt', '8_69.txt', '8_7.txt', '8_70.txt', '8_71.txt', '8_72.txt', '8_73.txt', '8_74.txt', '8_75.txt', '8_76.txt', '8_77.txt', '8_78.txt', '8_79.txt', '8_8.txt', '8_80.txt', '8_81.txt', '8_82.txt', '8_83.txt', '8_84.txt', '8_85.txt', '8_86.txt', '8_87.txt', '8_88.txt', '8_89.txt', '8_9.txt', '8_90.txt', '8_91.txt', '8_92.txt', '8_93.txt', '8_94.txt', '8_95.txt', '8_96.txt', '8_97.txt', '8_98.txt', '8_99.txt', '9_0.txt', '9_1.txt', '9_10.txt', '9_100.txt', '9_101.txt', '9_102.txt', '9_103.txt', '9_104.txt', '9_105.txt', '9_106.txt', '9_107.txt', '9_108.txt', '9_109.txt', '9_11.txt', '9_110.txt', '9_111.txt', '9_112.txt', '9_113.txt', '9_114.txt', '9_115.txt', '9_116.txt', '9_117.txt', '9_118.txt', '9_119.txt', '9_12.txt', '9_120.txt', '9_121.txt', '9_122.txt', '9_123.txt', '9_124.txt', '9_125.txt', '9_126.txt', '9_127.txt', '9_128.txt', '9_129.txt', '9_13.txt', 
'9_130.txt', '9_131.txt', '9_132.txt', '9_133.txt', '9_134.txt', '9_135.txt', '9_136.txt', '9_137.txt', '9_138.txt', '9_139.txt', '9_14.txt', '9_140.txt', '9_141.txt', '9_142.txt', '9_143.txt', '9_144.txt', '9_145.txt', '9_146.txt', '9_147.txt', '9_148.txt', '9_149.txt', '9_15.txt', '9_150.txt', '9_151.txt', '9_152.txt', '9_153.txt', '9_154.txt', '9_155.txt', '9_156.txt', '9_157.txt', '9_158.txt', '9_159.txt', '9_16.txt', '9_160.txt', '9_161.txt', '9_162.txt', '9_163.txt', '9_164.txt', '9_165.txt', '9_166.txt', '9_167.txt', '9_168.txt', '9_169.txt', '9_17.txt', '9_170.txt', '9_171.txt', '9_172.txt', '9_173.txt', '9_174.txt', '9_175.txt', '9_176.txt', '9_177.txt', '9_178.txt', '9_179.txt', '9_18.txt', '9_180.txt', '9_181.txt', '9_182.txt', '9_183.txt', '9_184.txt', '9_185.txt', '9_186.txt', '9_187.txt', '9_188.txt', '9_189.txt', '9_19.txt', '9_190.txt', '9_191.txt', '9_192.txt', '9_193.txt', '9_194.txt', '9_195.txt', '9_196.txt', '9_197.txt', '9_198.txt', '9_199.txt', '9_2.txt', '9_20.txt', '9_200.txt', '9_201.txt', '9_202.txt', '9_203.txt', '9_21.txt', '9_22.txt', '9_23.txt', '9_24.txt', '9_25.txt', '9_26.txt', '9_27.txt', '9_28.txt', '9_29.txt', '9_3.txt', '9_30.txt', '9_31.txt', '9_32.txt', '9_33.txt', '9_34.txt', '9_35.txt', '9_36.txt', '9_37.txt', '9_38.txt', '9_39.txt', '9_4.txt', '9_40.txt', '9_41.txt', '9_42.txt', '9_43.txt', '9_44.txt', '9_45.txt', '9_46.txt', '9_47.txt', '9_48.txt', '9_49.txt', '9_5.txt', '9_50.txt', '9_51.txt', '9_52.txt', '9_53.txt', '9_54.txt', '9_55.txt', '9_56.txt', '9_57.txt', '9_58.txt', '9_59.txt', '9_6.txt', '9_60.txt', '9_61.txt', '9_62.txt', '9_63.txt', '9_64.txt', '9_65.txt', '9_66.txt', '9_67.txt', '9_68.txt', '9_69.txt', '9_7.txt', '9_70.txt', '9_71.txt', '9_72.txt', '9_73.txt', '9_74.txt', '9_75.txt', '9_76.txt', '9_77.txt', '9_78.txt', '9_79.txt', '9_8.txt', '9_80.txt', '9_81.txt', '9_82.txt', '9_83.txt', '9_84.txt', '9_85.txt', '9_86.txt', '9_87.txt', '9_88.txt', '9_89.txt', '9_9.txt', '9_90.txt', '9_91.txt', '9_92.txt', 
'9_93.txt', '9_94.txt', '9_95.txt', '9_96.txt', '9_97.txt', '9_98.txt', '9_99.txt']
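In the listing above the names appear in string order, where '0_10.txt' comes before '0_2.txt'. Since os.listdir() does not guarantee any order, it can help to sort the names explicitly. The sketch below assumes names of the form digit_index.txt; the helper digit_file_key is hypothetical, and the files are created in a temporary directory so that the example runs without the dataset.

```python
import os
import tempfile


def digit_file_key(file_name):
    """Sort key (digit, index) for names such as '0_13.txt'."""
    digit, index = os.path.splitext(file_name)[0].split("_")
    return int(digit), int(index)


# Create a few empty files in a temporary directory for the demonstration.
folder = tempfile.mkdtemp()
for name in ["0_10.txt", "0_2.txt", "1_0.txt", "0_1.txt"]:
    open(os.path.join(folder, name), "w").close()

names = [n for n in os.listdir(folder) if n.endswith(".txt")]
print(sorted(names))                      # string order
print(sorted(names, key=digit_file_key))  # numeric order
```

Sorting does not change the classification result, since kNN does not depend on the order of the training rows; it only makes listings and logs easier to read.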

3. Complete code:

import numpy as np
from os import listdir
import operator

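def img2vector(filename):
    """
    Convert image data into a vector.
    
    filename: path to the image file; each input image is 32*32
    return: a 1*1024 NumPy array
    
    The function creates a 1*1024 NumPy array, opens the given file, reads
    its first 32 lines, stores the first 32 characters of each line in the
    array, and returns the array.
    """
    returnVect = np.zeros((1,1024))
    # Use a with block so the file is closed after reading
    with open(filename) as fr:
        for i in range(32):
            lineStr = fr.readline()
            for j in range(32):
                returnVect[0,32*i+j] = int(lineStr[j])
    return returnVect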

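def classify0(inX,dataSet,labels,k):
    """
    inX: the input vector to classify
    dataSet: the training sample set
    labels: the label vector
    k: the number of nearest neighbors to use

    To predict the class of a data point, run for example:
    kNN.classify0([0,0], group, labels, 3)
    """
    # The shape attribute of an array holds the size of each dimension: for an n*m matrix,
    # dataSet.shape[0] is n, dataSet.shape[1] is m, and dataSet.shape is (n, m)
    dataSetSize = dataSet.shape[0]
    # np.tile repeats an array: tile(A, n) builds a new array from n copies of A.
    # Here tile(inX,(dataSetSize,1)) stacks dataSetSize copies of inX, one per training sample;
    # subtracting dataSet then gives diffmat, the matrix of differences
    diffmat=np.tile(inX,(dataSetSize,1))-dataSet
    # Distance metric: Euclidean distance
    sqdiffmat=diffmat**2
    # Sum along each row (axis=1) to get one squared distance per training sample
    sqdistances=sqdiffmat.sum(axis=1)
    # Take the square root
    distances=sqdistances**0.5
    # Sort distances in ascending order and return the corresponding indices
    sortedDistIndicies=distances.argsort()
    # Count the labels of the k nearest points
    classcount={}

    for i in range(k):
        # Look up the label of this neighbor
        voteIlabel=labels[sortedDistIndicies[i]]
        # dict.get(key, default) returns the value for key if it is in the dict, otherwise default
        classcount[voteIlabel]=classcount.get(voteIlabel,0)+1
    # Sort and return only after all k votes are counted; doing this inside the loop
    # would return after the first neighbor and ignore k.
    # items() yields (key, value) pairs; key=operator.itemgetter(1) sorts by the
    # second element, the vote count
    sortedClasscount = sorted(classcount.items(),key=operator.itemgetter(1),reverse=True)
    # Return the label with the most votes
    return sortedClasscount[0][0]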

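def handwritingClassTest():
    """
    Handwritten digit recognition test: builds the training set, then classifies the test set
    """
    # Training set: load every file and record its label

    # Labels of the training samples
    hwLabels = []
    # List the names of all files in the trainingDigits folder
    trainingFileList = listdir('trainingDigits')
    # Number of files in the trainingDigits folder
    m = len(trainingFileList)
    # Create an m x 1024 matrix of zeros
    trainingMat = np.zeros((m,1024))
    for i in range(m):
        # Name of the i-th training file
        fileNameStr = trainingFileList[i]
        # Split on '.' and keep [0], dropping the 'txt' extension
        fileStr = fileNameStr.split('.')[0]
        # Split on '_' and keep [0]: for '0_3' this drops the index 3 and keeps the digit 0
        classNumStr = int(fileStr.split('_')[0])
        # Store the label parsed from the file name
        hwLabels.append(classNumStr)
        # Store the sample data: img2vector returns a 1x1024 vector,
        # which fills row i of the training matrix
        trainingMat[i,:] = img2vector('trainingDigits/%s' % fileNameStr)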
        
    # Test set: convert each file into the input vector inX and classify it
    testFileList = listdir('testDigits')
    errorCount = 0.0
    mTest = len(testFileList)
    for i in range(mTest):
        fileNameStr = testFileList[i]
        fileStr = fileNameStr.split('.')[0]
        classNumStr = int(fileStr.split('_')[0])
        vectorUnderTest = img2vector('testDigits/%s' % fileNameStr)
        # Run the k-nearest-neighbors classifier
        classifierResult = classify0(vectorUnderTest, trainingMat, hwLabels, 3)
        print("the classifier came back with: %d, the real answer is: %d" % (classifierResult, classNumStr))
        # Count a prediction that differs from the true label as an error
        if classifierResult != classNumStr:
            errorCount += 1.0
    print("\nthe total number of errors is: %d" % errorCount)
    print("\nthe total error rate is: %f" % (errorCount/float(mTest)))
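
As a quick sanity check of the voting logic, the sketch below reimplements the same Euclidean-distance majority vote on a tiny made-up dataset. The function name `knn_predict` and the four sample points are assumptions for illustration, not part of the code above; it also swaps `np.tile` for NumPy broadcasting, which gives the same difference matrix.

```python
from collections import Counter

import numpy as np


def knn_predict(in_x, data_set, labels, k):
    """Hypothetical helper: majority vote among the k nearest training rows."""
    # Broadcasting subtracts in_x from every row, so np.tile is not needed
    diff = np.asarray(data_set, dtype=float) - np.asarray(in_x, dtype=float)
    distances = np.sqrt((diff ** 2).sum(axis=1))
    # Indices of the k smallest distances
    nearest = distances.argsort()[:k]
    votes = Counter(labels[i] for i in nearest)
    # most_common(1) returns [(label, count)] for the top-voted label
    return votes.most_common(1)[0][0]


# Toy data (assumed for illustration): two points near (1, 1), two near (0, 0)
group = np.array([[1.0, 1.1], [1.0, 1.0], [0.0, 0.0], [0.0, 0.1]])
labels = ['A', 'A', 'B', 'B']

near_origin = knn_predict([0, 0], group, labels, 3)
near_one = knn_predict([1.0, 1.2], group, labels, 3)
print(near_origin, near_one)
```

With k=3, each query point has two neighbors from its own cluster and one from the other, so the vote only comes out right if all three neighbors are counted before the winner is chosen.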