Classification: Classification aims to divide items into categories. It can be binary (two classes) or multi-class (more than two classes). Classifying new test samples requires correctly labelled training data.
Pattern Recognition: The goal of Pattern Recognition is to sort data or, more generally, patterns into classes or categories. Unlike classification, Pattern Recognition does not necessarily need labelled training data.
PR is generally categorized according to the type of learning procedure used to generate the output value.
1. Supervised Learning (a set of training data labelled with the correct output has to be provided) (classification)
The model underlying the categories is perfectly known, in terms of both the probability density functions (pdf) and the category labels.
The model is known (e.g. a normal density with a mean and covariance matrix is assumed), but some of its parameters are not.
Not even the model is known: there is no prior parameterized knowledge about the form of the underlying probability structure, and all the information for classification comes from the training samples alone.
Classification Analysis: Classification deals with a known number of groups, and the objective is to assign new data points to one of these groups.
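The second supervised case above (model form known, parameters unknown) can be illustrated with a minimal sketch. It assumes one-dimensional data and a normal density per class, and estimates each class's mean, variance and prior from the labelled training samples. The function names (`fit_gaussians`, `classify`) and the toy data are made up for illustration; they do not come from any particular library.

```python
import math
from statistics import mean, pvariance

def fit_gaussians(samples, labels):
    """Estimate (mean, variance, prior) for each class from labelled 1-D data."""
    params = {}
    for c in set(labels):
        xs = [x for x, y in zip(samples, labels) if y == c]
        params[c] = (mean(xs), pvariance(xs), len(xs) / len(samples))
    return params

def log_gaussian(x, mu, var):
    """Log of the normal density N(mu, var) evaluated at x."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)

def classify(x, params):
    """Assign x to the class with the largest log prior + log likelihood."""
    return max(
        params,
        key=lambda c: math.log(params[c][2]) + log_gaussian(x, *params[c][:2]),
    )

# Toy labelled training data (assumed for this sketch).
train_x = [1.0, 1.2, 0.8, 1.1, 3.0, 3.2, 2.9, 3.1]
train_y = ["a", "a", "a", "a", "b", "b", "b", "b"]

params = fit_gaussians(train_x, train_y)
print(classify(1.05, params))  # close to the samples of class "a"
print(classify(2.8, params))   # close to the samples of class "b"
```

The number of groups is fixed in advance by the labels in `train_y`; the classifier only decides which of these known groups a new point belongs to.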
2. Unsupervised Learning (the training data, if there is any, is not labelled) (clustering)
Not even the labels of the input patterns are known, and our classifier needs to determine the cluster structure.
Cluster Analysis: Cluster analysis can be used to partition a large set of data into groups, called clusters, so that the data points within a group are similar to one another, while points in different groups are dissimilar.
PR does not necessarily mean that you finally have to assign the data to a certain class. Clustering is a typical example. Suppose there are 100 samples and you perform clustering on them, i.e., you just form groups of similar objects based on some similarity measure. This is a form of Pattern Recognition.
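As a rough illustration of forming groups from unlabelled samples, here is a minimal k-means sketch. K-means is only one possible clustering method and is chosen here as an assumption; the similarity measure is Euclidean distance, the initial centres are simply the first `k` points, and the six toy points stand in for the 100 samples mentioned above.

```python
import math

def kmeans(points, k, iters=20):
    """Group unlabelled points into k clusters by Euclidean distance."""
    centers = list(points[:k])  # naive initialisation: first k points (assumption)
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: put each point into the group of its nearest centre.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Update step: move each centre to the mean of its group.
        centers = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers, groups

# Unlabelled toy samples (assumed for this sketch).
points = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.3), (4.9, 5.0)]
centers, groups = kmeans(points, k=2)
for center, group in zip(centers, groups):
    print(center, group)
```

No class labels appear anywhere in the input or the output: the result is only a set of groups, which is why clustering counts as Pattern Recognition without being classification.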
Pattern Classification: Suppose a new test sample is obtained and its pattern is matched to a group of training samples or to a cluster of similar samples. As soon as the new test sample is assigned a class label, the task is called Pattern Classification.
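The step from matching a pattern to assigning a label can be sketched as follows. This assumes each group is summarised by a centre (for instance a class mean or a cluster centre) and that the new sample takes the label of the nearest centre. The group names, centre values and the `assign_label` helper are hypothetical.

```python
import math

# Hypothetical group centres, e.g. class means or previously found cluster centres.
centres = {"group_1": (0.1, 0.1), "group_2": (5.0, 5.0)}

def assign_label(sample, centres):
    """Return the label of the group whose centre is closest to the sample."""
    return min(centres, key=lambda name: math.dist(sample, centres[name]))

label = assign_label((4.6, 5.3), centres)
print(label)
```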
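Pattern recognition is mainly about discovering and extracting features from known data samples, e.g. face recognition and radar signal recognition. It emphasizes extracting valuable features from the raw information; in machine learning, good features sometimes contribute far more than the algorithm itself.
Pattern classification can be understood as using a classifier to classify samples that already have given features. Typical pattern classification methods include linear classifiers (perceptron, Fisher discriminant) and nonlinear classifiers (BP neural networks, RBF, SVM); real-world problems are mostly nonlinear. There are also Bayesian decision rules, C4.5, random forests, and many more.
There is one more difference between the two. At present, pattern recognition is mainly unsupervised learning with a large share of hand-designed algorithms (for example, in face recognition, engineers tell the algorithm in advance which features to look for in certain regions). In pattern classification, machine learning has more room to work: given training samples, with appropriate dimensionality reduction and data cleaning, a classifier can discover the features in the samples automatically. This is what is called supervised machine learning.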
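As an illustration of the simplest method in that list, here is a minimal perceptron sketch. It assumes two classes encoded as +1 and -1, linearly separable toy data, and the classic update rule (add the misclassified sample, scaled by its label, to the weights). The function names and the data are made up for illustration.

```python
def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Learn weights w and bias b of a linear classifier; labels are +1 or -1."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(samples, labels):
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:  # misclassified (or on the boundary)
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
                errors += 1
        if errors == 0:  # every training sample is classified correctly
            break
    return w, b

def predict(x, w, b):
    """Return +1 or -1 depending on which side of the hyperplane x lies."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# Linearly separable toy data (assumed for this sketch).
X = [(2.0, 1.0), (3.0, 2.0), (-1.0, -1.5), (-2.0, -0.5)]
y = [1, 1, -1, -1]

w, b = train_perceptron(X, y)
print(w, b)
print([predict(x, w, b) for x in X])
```

A perceptron can only separate classes with a straight line or hyperplane, which is why the nonlinear classifiers listed above are usually needed on real data.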