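Sunny today, but with a really strong wind. Xiao Ma's hands have stayed ice-cold the whole time; looks like I really do just run cold, hahaha.
Tomorrow is the fourth online meeting. Hoping Xiao Ma's work report goes smoothly! Go, go, go!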
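Today's summary: solving a multi-class classification problem with KNN in Matlab.
KNN algorithm, step by step:
1. Initialize the training set and its class labels;
2. Compute the Euclidean distance between each test sample and every training sample;
3. Sort the training samples in ascending order of Euclidean distance;
4. Take the K training samples with the smallest Euclidean distance and count how often each class occurs among them;
5. Return the most frequent class; the test sample is assigned to that class.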
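The steps above can be sketched for a single test sample as follows. This is only a minimal illustration on made-up toy data (the variable names trainData, trainLabels, query and the values are assumptions, not part of the original post), and it uses the built-in mode for the vote instead of a frequency table:

```matlab
% Toy data (hypothetical): four training points in 2-D, two classes.
trainData   = [0 0; 0 1; 5 5; 6 5];
trainLabels = [1; 1; 2; 2];
query       = [5 4];   % one test sample
k           = 3;

% Step 2: Euclidean distance from the query to every training sample.
diffs = trainData - repmat(query, size(trainData, 1), 1);
dists = sqrt(sum(diffs.^2, 2));

% Step 3: sort the distances in ascending order.
[~, order] = sort(dists);

% Step 4: labels of the k nearest training samples.
nearest = trainLabels(order(1:k));

% Step 5: the most frequent label wins
% (mode returns the smallest label when there is a tie).
predicted = mode(nearest);
```

Here the three nearest neighbours carry the labels 2, 2 and 1, so the query is assigned to class 2.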
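Matlab implementation (KNN wrapped in a function):
Five inputs: the training data, the training labels, the test data, the test labels, and the value of K for KNN.
Two outputs: the predicted class of each test sample, and the classification accuracy of the algorithm.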
function [class_test, Acc] = knn(trainData, sample_label, testData, test_labels, k)
%KNN k-Nearest Neighbors Algorithm.
%
%   INPUT:  trainData:    training sample data, M1-by-N matrix.
%           sample_label: training sample labels, M1-by-1 column vector.
%           testData:     testing sample data, M2-by-N matrix.
%           test_labels:  testing sample labels, M2-by-1 column vector.
%           k:            the k in k-Nearest Neighbors.
%
%   OUTPUT: class_test:   predicted labels, M2-by-1 column vector.
%           Acc:          classification accuracy of the KNN algorithm, in percent.
    M_train = size(trainData, 1);
    M_test = size(testData, 1);
    % calculate the distance between testData and trainData
    Dis = zeros(M_train,1);
    class_test = zeros(M_test,1);
    for n = 1:M_test
        for i = 1:M_train
            % Euclidean distance between test sample n and training sample i
            Dis(i) = sqrt(sum((testData(n,:) - trainData(i,:)).^2));
        end
        % find the k nearest neighbors
        [~, index] = sort(Dis);
        temp = sample_label(index(1:k));
        % tabulate returns one row per value: [value, count, percent]
        freq = tabulate(temp);
        % take the maximum over the count column only, so a label value
        % that happens to equal the top count cannot be picked by mistake
        [~, row] = max(freq(:,2));
        class_test(n) = freq(row,1); % predicted label of test sample n
    end
    Acc = mean(class_test == test_labels(:)) * 100; % classification accuracy in percent
end
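The voting step in the function above relies on tabulate, which belongs to the Statistics and Machine Learning Toolbox. As a hedged alternative, here is a small sketch of the same majority vote using only the base functions unique and accumarray; this is a swapped-in technique, and the neighbour labels are made-up values for illustration:

```matlab
% Labels of the k = 3 nearest neighbours (hypothetical values).
neighbours = [3; 2; 3];

% Distinct labels and, for each neighbour, the index of its label.
[labels, ~, idx] = unique(neighbours);

% Count how often each distinct label occurs.
counts = accumarray(idx(:), 1);

% Pick the label with the highest count; max returns the first maximum,
% so ties go to the smallest label.
[~, best] = max(counts);
winner = labels(best);
```

Label 3 occurs twice and label 2 once, so winner is 3.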
Create a new script to test the KNN function (using the UCI Wine dataset as an example: 178 samples, 13 features, 3 classes):
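clc;
clear;
% load the dataset (the script expects it to provide wine and wine_labels)
load wine_SVM;
% randomly split into training and test sets
% (hold-out; the held-out proportion defaults to 0.5 when omitted)
[train, test] = crossvalind('HoldOut', wine_labels);
train_wine = wine(train,:);
train_wine_labels = wine_labels(train,:);
test_wine = wine(test,:);
test_wine_labels = wine_labels(test,:);
% call the KNN function with k = 10
[class_test, Acc] = knn(train_wine, train_wine_labels, test_wine, test_wine_labels, 10);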
The run returns the predicted class labels of the test-set wines in the output variable class_test, and the classification accuracy in the output variable Acc.
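The script above splits the data with crossvalind, which ships with the Bioinformatics Toolbox. If that toolbox is not available, a random hold-out split can be sketched with randperm alone. The data below is a made-up stand-in (not the wine set), the 50% ratio is an assumption, and unlike a split driven by the label vector this simple version does not balance the classes across the two sets:

```matlab
% Hypothetical stand-in data: 10 samples, 2 features, 2 classes.
data   = [(1:10)', (10:-1:1)'];
labels = [ones(5, 1); 2 * ones(5, 1)];

numSamples = size(data, 1);
perm       = randperm(numSamples);      % random order of the sample indices
numTrain   = round(0.5 * numSamples);   % 50% for training (assumed ratio)

trainIdx = perm(1:numTrain);
testIdx  = perm(numTrain+1:end);

trainData   = data(trainIdx, :);
trainLabels = labels(trainIdx);
testData    = data(testIdx, :);
testLabels  = labels(testIdx);
```

The four resulting variables could then be passed to knn in the same order as in the script above. The split changes on every run unless the random seed is fixed first.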