LIBSVM Guide: Methods

Utility Functions
=================

To use utility functions, type:

>>> from svmutil import *

The above command loads:

svm_train() : train an SVM model

svm_predict() : predict testing data

svm_read_problem() : read the data from a LIBSVM-format file.

svm_load_model() : load a LIBSVM model.

svm_save_model() : save model to a file.

evaluations() : evaluate prediction results.

csr_find_scale_param() : find scaling parameters for data in csr format.

csr_scale() : apply data scaling to data in csr format.

## First function

- ***Function: svm_train***

There are three ways to call svm_train():

>>> model = svm_train(y, x [, 'training_options'])

>>> model = svm_train(prob [, 'training_options'])

>>> model = svm_train(prob, param)

y: a list/tuple/ndarray of l training labels (type must be int/double).

x: 1. a list/tuple of l training instances. Feature vector of each training instance is a list/tuple or dictionary.

2. an l * n numpy ndarray or scipy spmatrix (n: number of features).

training_options: a string in the same form as that for LIBSVM command mode.

prob: an svm_problem instance generated by calling

svm_problem(y, x).

For pre-computed kernel, you should use

svm_problem(y, x, isKernel=True)

param: an svm_parameter instance generated by calling

svm_parameter('training_options')

model: the returned svm_model instance. See svm.h for details of this structure. If '-v' is specified, cross validation is conducted and the returned value is just a scalar: cross-validation accuracy for classification and mean-squared error for regression.

To train on the same data many times with different parameters, the second and third ways should be faster.

Examples:

>>> y, x = svm_read_problem('../heart_scale')

>>> prob = svm_problem(y, x)

>>> param = svm_parameter('-s 3 -c 5 -h 0')

>>> m = svm_train(y, x, '-c 5')

>>> m = svm_train(prob, '-t 2 -c 5')

>>> m = svm_train(prob, param)

>>> CV_ACC = svm_train(y, x, '-v 3')
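The two feature-vector representations accepted for x (a list/tuple or a dictionary) can be related with a small helper. This is only an illustrative sketch: `dense_to_sparse` is a hypothetical name, not part of svmutil, and it assumes the LIBSVM data-format convention that feature indices start at 1 and zero-valued features are omitted.

```python
def dense_to_sparse(features):
    """Convert a dense feature list into an {index: value} dictionary.

    Assumption: indices are 1-based and zero-valued features are dropped,
    following the LIBSVM data-format convention.
    """
    return {i: v for i, v in enumerate(features, start=1) if v != 0}


# Two training instances written densely, then as dictionaries.
dense_x = [[1, 0, 1], [-1, 0, -1]]
sparse_x = [dense_to_sparse(row) for row in dense_x]
print(sparse_x)  # [{1: 1, 3: 1}, {1: -1, 3: -1}]
```

Under that assumption, either `dense_x` or `sparse_x` describes the same instances when passed as x.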

## Second function

- ***Function: svm_predict***

To predict testing data with a model, use

>>> p_labels, p_acc, p_vals = svm_predict(y, x, model [, 'predicting_options'])

y: a list/tuple/ndarray of l true labels (type must be int/double). It is used for calculating the accuracy. Use [] if true labels are unavailable.

x: 1. a list/tuple of l testing instances. Feature vector of each testing instance is a list/tuple or dictionary.

2. an l * n numpy ndarray or scipy spmatrix (n: number of features).

predicting_options: a string of predicting options in the same format as that of LIBSVM.

model: an svm_model instance.

p_labels: a list of predicted labels.

p_acc: a tuple including accuracy (for classification), mean squared error, and squared correlation coefficient (for regression).

p_vals: a list of decision values or probability estimates (if '-b 1' is specified). If k is the number of classes in training data, for decision values, each element includes results of predicting k(k-1)/2 binary-class SVMs. For classification, k = 1 is a special case. Decision value [+1] is returned for each testing instance, instead of an empty list.

For probabilities, each element contains k values indicating the probability that the testing instance is in each class. Note that the order of classes is the same as the 'model.label' field in the model structure.
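To make the k(k-1)/2 pairwise decision values more concrete, here is a minimal sketch of how one element of p_vals could be turned into a label by majority vote. The helper `pairwise_vote` is hypothetical, and the sketch assumes k >= 2, that the values follow the pair order (0,1), (0,2), ..., (k-2,k-1) over positions in the label list, and that a positive value favors the first class of the pair. These are assumptions for illustration, not statements taken from the text above.

```python
from itertools import combinations


def pairwise_vote(labels, dec_values):
    """Pick a label by majority vote over k(k-1)/2 pairwise decision values.

    Assumptions: dec_values is ordered by class pairs (0,1), (0,2), ...,
    (k-2,k-1) over positions in `labels`, and a positive value means the
    first class of the pair wins. Ties go to the earlier label.
    """
    pairs = list(combinations(range(len(labels)), 2))
    if len(dec_values) != len(pairs):
        raise ValueError("expected k(k-1)/2 decision values")
    votes = [0] * len(labels)
    for (i, j), value in zip(pairs, dec_values):
        votes[i if value > 0 else j] += 1
    return labels[votes.index(max(votes))]


# Three classes give 3 * 2 / 2 = 3 pairwise decision values.
print(pairwise_vote([1, 2, 3], [0.8, 0.4, -0.5]))  # 1
```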

Example:

>>> m = svm_train(y, x, '-c 5')

>>> p_labels, p_acc, p_vals = svm_predict(y, x, m)

## Third group of functions

- ***Functions: svm_read_problem/svm_load_model/svm_save_model***

Usage is shown by example:

>>> y, x = svm_read_problem('data.txt')

>>> m = svm_load_model('model_file')

>>> svm_save_model('model_file', m)
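For readers unfamiliar with the LIBSVM data format, the following sketch parses lines of the form `label index:value index:value ...` into a label list and a list of feature dictionaries. It is not the library's own reader: `parse_libsvm_lines` is a hypothetical helper that works on in-memory lines and ignores any extensions to the basic format.

```python
def parse_libsvm_lines(lines):
    """Parse 'label index:value ...' lines into (labels, feature dicts)."""
    y, x = [], []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        y.append(float(parts[0]))
        features = {}
        for item in parts[1:]:
            index, value = item.split(":")
            features[int(index)] = float(value)
        x.append(features)
    return y, x


sample = ["+1 1:0.5 3:-1", "-1 2:0.25"]
y, x = parse_libsvm_lines(sample)
print(y)  # [1.0, -1.0]
print(x)  # [{1: 0.5, 3: -1.0}, {2: 0.25}]
```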

## Fourth function

- ***Function: evaluations***

Calculate some evaluations using the true values (ty) and the predicted values (pv):

>>> (ACC, MSE, SCC) = evaluations(ty, pv, useScipy)

ty: a list/tuple/ndarray of true values.

pv: a list/tuple/ndarray of predicted values.

useScipy: if true, convert ty and pv to ndarray and use scipy functions to do the evaluation.

ACC: accuracy.

MSE: mean squared error.

SCC: squared correlation coefficient.
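The three measures can be sketched in plain Python using their standard definitions. The function name `evaluate` is hypothetical, and reporting accuracy as a percentage and returning NaN when the correlation is undefined are assumptions of this sketch rather than guarantees about evaluations().

```python
def evaluate(ty, pv):
    """Return (ACC, MSE, SCC) for true values ty and predicted values pv.

    Assumptions: ACC is a percentage, and SCC is NaN when either input
    has zero variance (the correlation is then undefined).
    """
    if len(ty) != len(pv) or not ty:
        raise ValueError("ty and pv must be non-empty and of equal length")
    n = len(ty)
    acc = 100.0 * sum(1 for t, p in zip(ty, pv) if t == p) / n
    mse = sum((p - t) ** 2 for t, p in zip(ty, pv)) / n
    sum_t, sum_p = sum(ty), sum(pv)
    sum_tt = sum(t * t for t in ty)
    sum_pp = sum(p * p for p in pv)
    sum_tp = sum(t * p for t, p in zip(ty, pv))
    num = (n * sum_tp - sum_t * sum_p) ** 2
    den = (n * sum_pp - sum_p ** 2) * (n * sum_tt - sum_t ** 2)
    scc = num / den if den != 0 else float("nan")
    return acc, mse, scc


acc, mse, scc = evaluate([1, -1, 1, -1], [1, -1, -1, -1])
print(acc, mse)  # 75.0 1.0
```

In this example three of four predictions match, the single error contributes a squared difference of 4, and the squared correlation works out to 64/192, or about 0.333.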

## Fifth group of functions

- ***Functions: csr_find_scale_param/csr_scale***

Scale data in csr format.

>>> param = csr_find_scale_param(x [, lower=l, upper=u])

>>> x = csr_scale(x, param)

x: a csr_matrix of data.

l: x scaling lower limit; default -1.

u: x scaling upper limit; default 1.

The scaling process is: x * diag(coef) + ones(l, 1) * offset', where l here denotes the number of rows of x (not the lower limit) and ' denotes transpose.

param: a dictionary of scaling parameters, where param['coef'] = coef and param['offset'] = offset.

coef: a scipy array of scaling coefficients.

offset: a scipy array of scaling offsets.
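The formula x * diag(coef) + ones(l, 1) * offset' says that each feature column is multiplied by its own coefficient and then shifted by its own offset. The sketch below applies that to dense lists of rows. It assumes coef and offset come from per-feature min-max scaling into [lower, upper] and that constant features are mapped to 0. Both helpers are hypothetical stand-ins, not the csr-based library functions.

```python
def find_scale_param(rows, lower=-1.0, upper=1.0):
    """Compute per-feature coef/offset for min-max scaling (assumed scheme)."""
    coef, offset = [], []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        lo, hi = min(column), max(column)
        if hi == lo:
            # Constant feature: sketch-level choice to map it to 0.
            coef.append(0.0)
            offset.append(0.0)
        else:
            c = (upper - lower) / (hi - lo)
            coef.append(c)
            offset.append(lower - lo * c)
    return {"coef": coef, "offset": offset}


def scale(rows, param):
    """Apply x * diag(coef) + ones(l, 1) * offset' row by row."""
    return [
        [v * c + o for v, c, o in zip(row, param["coef"], param["offset"])]
        for row in rows
    ]


data = [[0.0, 2.0], [2.0, 4.0], [4.0, 6.0]]
param = find_scale_param(data)
scaled = scale(data, param)
print(scaled)  # [[-1.0, -1.0], [0.0, 0.0], [1.0, 1.0]]
```

Keeping coef and offset in a dictionary mirrors the param structure described above, so parameters found on training data can be reused to scale test data the same way.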

Additional Information
======================

This interface was written by Hsiang-Fu Yu from the Department of Computer Science, National Taiwan University. If you find this tool useful, please cite LIBSVM as follows:

Chih-Chung Chang and Chih-Jen Lin, LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1--27:27, 2011. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm

For any question, please contact Chih-Jen Lin <cjlin@csie.ntu.edu.tw>, or check the FAQ page: [http://www.csie.ntu.edu.tw/~cjlin/libsvm/faq.html](http://www.csie.ntu.edu.tw/~cjlin/libsvm/faq.html).
