
1 http://datamining.xmu.edu.cn Page: 1 of 38 Support Vector Machine 李旭斌 (LI Xubin) xmubingo@gmail.com @Data mining Lab. 6/19/2012

2 No theory, just use. The theory is complicated: structural risk minimization, VC dimension, hyperplanes, maximum margin classifiers, kernel functions, bla bla... Paper: What is a support vector machine?

3 What can it do?
Main usage:
Classification: C-SVC, nu-SVC
Regression: epsilon-SVR, nu-SVR
Distribution estimation: one-class SVM
Other: clustering

4 But we already have plenty of software with friendly interfaces.

5 Who implements SVM?
libSVM: Java, C, R, MATLAB, Python, Perl, C#... CUDA! Hadoop (Mahout)!
WEKA, Weka-Parallel
MATLAB SVM Toolbox
Spider
SVM in R
GPU-accelerated LIBSVM

6 Examples for Machine Learning Algorithms

7 Classification: SVM

8 Regression: SVR

9 Clustering: K-means. (Screenshots from MLDemos.)

10 Let's get back to libSVM.

11 Format of input
The format of training and testing data files is:
<label> <index1>:<value1> <index2>:<value2> ...
Each line contains one instance and is ended by a '\n' character. For classification, <label> is an integer indicating the class label (multi-class is supported). For regression, <label> is the target value, which can be any real number. For one-class SVM, the label is not used, so it can be any number. Each pair <index>:<value> gives a feature (attribute) value: <index> is an integer starting from 1 and <value> is a real number.
Example:
1 1:1 2:4 3:6 4:1
1 1:2 2:6 3:8 4:0
0 1:3 2:1 3:0 4:1
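The input format above is easy to read and write by hand; a minimal sketch of a parser and serializer for it (pure Python, not part of libSVM itself):

```python
def parse_libsvm_line(line):
    """Parse one libSVM-format line into (label, {index: value})."""
    parts = line.strip().split()
    label = float(parts[0])
    features = {}
    for pair in parts[1:]:
        idx, val = pair.split(":")
        features[int(idx)] = float(val)
    return label, features

def format_libsvm_line(label, features):
    """Serialize (label, {index: value}) back to one line; indices must be ascending."""
    pairs = " ".join(f"{i}:{features[i]:g}" for i in sorted(features))
    return f"{label:g} {pairs}"

label, feats = parse_libsvm_line("1 1:2 2:6 3:8 4:0")
print(label, feats)                      # 1.0 {1: 2.0, 2: 6.0, 3: 8.0, 4: 0.0}
print(format_libsvm_line(label, feats))  # 1 1:2 2:6 3:8 4:0
```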

12 Parameters
Usage: svm-train [options] training_set_file [model_file]
options:
-s svm_type : set type of SVM (default 0)
    0 -- C-SVC
    1 -- nu-SVC
    2 -- one-class SVM
    3 -- epsilon-SVR
    4 -- nu-SVR
-t kernel_type : set type of kernel function (default 2)
    0 -- linear: u'*v
    1 -- polynomial: (gamma*u'*v + coef0)^degree
    2 -- radial basis function: exp(-gamma*|u-v|^2)
    3 -- sigmoid: tanh(gamma*u'*v + coef0)
    4 -- precomputed kernel (kernel values in training_set_file)
Attention: the parameters below appear in these kernel formulas.

13 -d degree : set degree in kernel function (default 3)
-g gamma : set gamma in kernel function (default 1/num_features)
-r coef0 : set coef0 in kernel function (default 0)
-c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)
-n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)
-p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)
-m cachesize : set cache memory size in MB (default 100)
-e epsilon : set tolerance of termination criterion (default 0.001)
-h shrinking : whether to use the shrinking heuristics, 0 or 1 (default 1)
-b probability_estimates : whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
-wi weight : set the parameter C of class i to weight*C, for C-SVC (default 1)
-v n : n-fold cross validation mode
-q : quiet mode (no outputs)
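The default kernel (-t 2) is the RBF kernel exp(-gamma*|u-v|^2), so -g directly controls how local the kernel is. A quick sketch of that effect (pure Python; the vectors are made-up examples):

```python
import math

def rbf_kernel(u, v, gamma):
    """Radial basis function kernel: exp(-gamma * |u - v|^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)

u, v = [1.0, 0.0], [0.0, 1.0]        # squared distance = 2
print(rbf_kernel(u, v, gamma=0.5))   # exp(-1) ~ 0.368
print(rbf_kernel(u, v, gamma=10.0))  # exp(-20) ~ 2e-9: large gamma makes the kernel very local
```

This is why gamma is one of the two parameters (with C) that the later slides search over.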

14 nu-SVC & C-SVC
"Basically they are the same thing but with different parameters. The range of C is from zero to infinity but nu is always between [0, 1]. A nice property of nu is that it is related to the ratio of support vectors and the ratio of the training error."

15 one-class SVM: fault diagnosis / anomaly detection.
The training set is always made up of normal instances. Label: 1 (no -1).
The test set contains instances of unknown status. Output label: 1 or -1, where 1 = normal and -1 = anomalous.

16 epsilon-SVR & nu-SVR. Paper: LIBSVM: A Library for Support Vector Machines.

17 Comparison of epsilon-SVR and nu-SVR. (Comparison figures omitted.)

18 Related experience
Usage and grid search
Code analysis
Chinese version of the libSVM FAQ

19 libSVM Guide: http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf

20 Flowchart of Task
train -> svm-scale -> train.scale -> svm-train -> model
test  -> svm-scale -> test.scale  -> svm-predict (with the model) -> result
The train set and test set should both be scaled. Before that, ask: do you really need to scale them?
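svm-scale maps each feature to a fixed range such as [-1, 1] using per-feature min/max taken from the training set, and the test set must be scaled with those same training-set ranges. A minimal sketch of that idea (the real tool also saves/restores the ranges via -s and -r; the numbers here are made up):

```python
def fit_scaling(rows, lower=-1.0, upper=1.0):
    """Compute per-feature (min, max) ranges from training rows (lists of floats)."""
    n = len(rows[0])
    mins = [min(r[j] for r in rows) for j in range(n)]
    maxs = [max(r[j] for r in rows) for j in range(n)]
    return mins, maxs, lower, upper

def apply_scaling(row, params):
    """Linearly map each feature into [lower, upper] using the fitted ranges."""
    mins, maxs, lo, hi = params
    out = []
    for x, mn, mx in zip(row, mins, maxs):
        if mx == mn:                # constant feature: leave it unscaled
            out.append(x)
        else:
            out.append(lo + (hi - lo) * (x - mn) / (mx - mn))
    return out

train = [[1.0, 200.0], [3.0, 600.0], [2.0, 400.0]]
params = fit_scaling(train)                  # fit on the TRAIN set only
print(apply_scaling([2.0, 400.0], params))   # [0.0, 0.0]
print(apply_scaling([3.0, 200.0], params))   # [1.0, -1.0]
```

Scaling matters because the RBF kernel uses Euclidean distance: a feature ranging over [0, 600] would otherwise dominate one ranging over [0, 3].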

21 Parameters are important!
Good parameters will build a good model. How do we get 'good' parameters?
Features are important! The model is also important!

22 Example (train set)
C=2, g=100: Positive 83%, Negative 85%
C=50, g=100: Positive 86%, Negative 91%
ROC?

23 ROC?
Confusion matrix:
                     Predicted 1           Predicted 0           Total
Actual 1 (positive)  True Positive (TP)    False Negative (FN)   TP+FN
Actual 0 (negative)  False Positive (FP)   True Negative (TN)    FP+TN
Total                TP+FP                 FN+TN                 TP+FP+FN+TN
We need TPR and FPR:
X-axis: FPR (1 - specificity)
Y-axis: TPR (sensitivity)
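Turning confusion-matrix counts into one ROC point is a two-line computation; a sketch (the counts below are illustrative, chosen to mirror the 86%/91% rates of the previous slide with 100 positives and 100 negatives assumed):

```python
def roc_point(tp, fn, fp, tn):
    """Return (FPR, TPR) for one classifier threshold from confusion-matrix counts."""
    tpr = tp / (tp + fn)   # sensitivity: fraction of actual positives caught
    fpr = fp / (fp + tn)   # 1 - specificity: fraction of actual negatives flagged
    return fpr, tpr

print(roc_point(tp=86, fn=14, fp=9, tn=91))  # (0.09, 0.86)
```

Sweeping the decision threshold and plotting these (FPR, TPR) points gives the ROC curve.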

24 Parameter Selection
Grid search
Particle swarm optimization
Other algorithms
Manual trial... at random? My God!!
Our task now: type: classification; goal: find the best (C, G).

25 Grid Search
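Grid search just tries every (C, gamma) pair on an exponentially spaced grid and keeps the best cross-validation score. A sketch of the loop; the `evaluate` function here is a hypothetical stand-in (a real one would run `svm-train -v 5` and parse the accuracy):

```python
import itertools

def grid_search(evaluate, c_range, g_range):
    """Try every (C, gamma) pair; `evaluate` returns a score where higher is better."""
    best = None
    for C, gamma in itertools.product(c_range, g_range):
        score = evaluate(C, gamma)
        if best is None or score > best[0]:
            best = (score, C, gamma)
    return best

# The usual libSVM recipe searches exponentially spaced values:
c_range = [2 ** k for k in range(-5, 16, 2)]   # 2^-5 .. 2^15
g_range = [2 ** k for k in range(-15, 4, 2)]   # 2^-15 .. 2^3

# Hypothetical scorer peaking at C=8, gamma=0.5, just to exercise the loop:
toy = lambda C, g: -((C - 8) ** 2 + (g - 0.5) ** 2)
print(grid_search(toy, c_range, g_range))
```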

26 Parallel Grid Search
SSH commands: grid.py
Hadoop-based: training the SVM model with MapReduce
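Every (C, gamma) evaluation is independent, which is why grid search parallelizes so easily; grid.py farms the pairs out to SSH workers, and the same idea can be sketched locally with a thread pool (the scorer is again a hypothetical stand-in):

```python
import itertools
from concurrent.futures import ThreadPoolExecutor

def parallel_grid_search(evaluate, c_range, g_range, workers=4):
    """Evaluate every (C, gamma) pair concurrently; return (best_score, best_pair).
    grid.py dispatches these same independent jobs to SSH/remote workers."""
    pairs = list(itertools.product(c_range, g_range))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda p: evaluate(*p), pairs))
    best_idx = max(range(len(pairs)), key=scores.__getitem__)
    return scores[best_idx], pairs[best_idx]

toy = lambda C, g: -((C - 4) ** 2 + (g - 1) ** 2)   # hypothetical CV-accuracy stand-in
print(parallel_grid_search(toy, [1, 2, 4, 8], [0.25, 1, 4]))  # -> (0, (4, 1))
```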

27 Particle Swarm Optimization (PSO): demo (links in the original slides).

28 Climbing a mountain
The peak is the destination.
The higher you get, the slower you climb.

29 Similar Algorithms
Hill-climbing algorithm
Genetic algorithm
Ant colony optimization
Simulated annealing algorithm

30 Let's get back to PSO. Paper: Development of Particle Swarm Optimization Algorithm.

31 Particle Swarm Optimization: birds hunting for food. (Figure: particles in the (C, G) plane and their distance to (Cbest, Gbest).)

32 PSO and Parameter Selection
PSO: find a point (C, G) that makes the distance between (C, G) and (Cbest, Gbest) shortest.
Parameter selection: find a pair (C, G) that makes the error rate lowest (the estimate/fitness function).

33 Notation: position of particle i: x_i; speed: v_i; particle i's best: pbest_i; global best: gbest.
Update rule (standard PSO):
Update speed: v_i <- w*v_i + c1*r1*(pbest_i - x_i) + c2*r2*(gbest - x_i)
Update position: x_i <- x_i + v_i
Update weight: the inertia weight w decreases over the iterations.
(r1, r2 are uniform random numbers in [0, 1]; c1, c2 are the speedup factors.)

34 Algorithm constants:
Dimension: M = 2
Number of particles: N = 20-50
Space scope: 0 < X[i] < 1024 for each dimension i
Max speed
Speedup factors: c1 = c2 = 2
Stop criterion:
Max iterations: 20
Threshold: 0.03
Max dead-stop times: 10
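The update rule and constants above can be sketched as a minimal PSO (M=2, 20 particles, 0 < X[i] < 1024, speedup factors = 2, 20 iterations). The linearly decreasing inertia weight, the max-speed fraction, and the toy objective are assumptions; a real run would use the cross-validation error of (C, G) as the fitness:

```python
import random

def pso(fitness, dim=2, n_particles=20, bounds=(0.0, 1024.0),
        iters=20, c1=2.0, c2=2.0, w_start=0.9, w_end=0.4, seed=0):
    """Minimal PSO minimizing `fitness` over a box, with inertia weight
    decreasing linearly from w_start to w_end (a common choice; assumption)."""
    rng = random.Random(seed)
    lo, hi = bounds
    vmax = (hi - lo) * 0.2                       # max speed (assumed fraction of range)
    xs = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    pbest = [x[:] for x in xs]                   # each particle's best position so far
    pbest_f = [fitness(x) for x in xs]
    g = min(range(n_particles), key=pbest_f.__getitem__)
    gbest, gbest_f = pbest[g][:], pbest_f[g]     # swarm's global best
    for t in range(iters):
        w = w_start - (w_start - w_end) * t / max(iters - 1, 1)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vs[i][d] = (w * vs[i][d]
                            + c1 * r1 * (pbest[i][d] - xs[i][d])
                            + c2 * r2 * (gbest[d] - xs[i][d]))
                vs[i][d] = max(-vmax, min(vmax, vs[i][d]))      # clamp speed
                xs[i][d] = max(lo, min(hi, xs[i][d] + vs[i][d]))  # clamp position
            f = fitness(xs[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = xs[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = xs[i][:], f
    return gbest, gbest_f

# Stand-in objective: squared distance to a known optimum (in real use, CV error of (C, G)).
target = [512.0, 128.0]
best, best_f = pso(lambda x: sum((a - b) ** 2 for a, b in zip(x, target)))
print(best, best_f)   # the swarm closes in on the target
```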

35 (Figure omitted.)

36 Example. There is a problem.

37 Discussion

38 Thank you for your attention!
