Gist 2.3 John H. Phan MIBLab Summer Workshop June 28th, 2006.


1 Gist 2.3 John H. Phan MIBLab Summer Workshop June 28th, 2006

2 Overview
Gist 2.3 Tools
–Support Vector Machine (SVM) classification
–Kernel Principal Component Analysis (KPCA)

3 Gist 2.3 Overview
Gist is a set of command-line programs written in C
–Primary programs: SVM and KPCA
–Auxiliary programs: ranking and feature selection
–Web interface for the SVM component

4 Support Vector Machines
Supervised classification method
Maximal margin hyperplane
http://www.dtreg.com/svm.htm
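As a hedged aside, the maximal margin hyperplane for linearly separable data is commonly written as the optimization problem below. The symbols (weight vector w, offset b, training points x_i with labels y_i in {-1, +1}) are standard textbook notation and are assumptions here, not notation taken from the slides.

```latex
\min_{w,\,b}\ \tfrac{1}{2}\lVert w\rVert^{2}
\quad\text{subject to}\quad
y_i\,(w^{\top}x_i + b)\ \ge\ 1,\qquad i = 1,\dots,n
```

Under this formulation the margin between the two classes has width 2/||w||, so minimizing ||w|| maximizes the margin.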

5 Primary Gist Programs
gist-train-svm – train support vector machine
gist-classify – classify points with a trained support vector machine
gist-fast-classify – linear optimized classification
gist-kpca – kernel principal component analysis
gist-project – project points onto KPCA components

6 Auxiliary Gist Programs
gist-fselect – linear feature selection
gist-matrix – basic matrix manipulations
gist-score-svm – performance of gist-train-svm and gist-classify
gist-rfe – recursive feature elimination
gist-sigmoid – classification probabilities
gist2html – convert output to HTML
gist-kernel – create a square kernel matrix

7 gist-train-svm
Train a support vector machine
–Input file is tab delimited but transposed
–Output file contains 5 columns: label, binary classification, SVM weights, predicted classification, discriminant value

8 gist-fselect – Feature Selection
Fisher Criterion Score
t-test
Welch t-test
Mann-Whitney
SAM (significance analysis of microarrays)
Threshold number of misclassifications
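Two of the listed scores can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Gist's implementation: the function names, the rows-are-samples layout, and the use of sample variance are all hypothetical choices, and definitions of the Fisher criterion vary slightly between sources.

```python
from statistics import mean, variance


def fisher_score(pos, neg):
    """Fisher criterion for one feature: squared mean difference over summed class variances.

    Assumes each class has at least two values and nonzero total variance.
    """
    return (mean(pos) - mean(neg)) ** 2 / (variance(pos) + variance(neg))


def welch_t(pos, neg):
    """Welch t statistic for one feature (class variances are not pooled)."""
    squared_error = variance(pos) / len(pos) + variance(neg) / len(neg)
    return (mean(pos) - mean(neg)) / squared_error ** 0.5


def rank_features(rows, labels, score=fisher_score):
    """Return feature indices sorted from highest to lowest score.

    rows: one list of feature values per sample (assumed layout).
    labels: +1 / -1 class label per sample.
    """
    scores = []
    for j in range(len(rows[0])):
        pos = [r[j] for r, y in zip(rows, labels) if y > 0]
        neg = [r[j] for r, y in zip(rows, labels) if y <= 0]
        scores.append(score(pos, neg))
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
```

To rank by the Welch statistic instead, pass a score function based on its absolute value, for example `rank_features(rows, labels, score=lambda p, n: abs(welch_t(p, n)))`.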

9 gist-score-svm
Compute false and true positives on training and test sets
Compute area under the ROC curves for training and test sets
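The two kinds of score named on this slide can be sketched as follows. This is an illustrative sketch only: the function names and the decision threshold of 0 on the discriminant value are assumptions, and gist-score-svm may compute or report these quantities differently.

```python
def confusion_counts(labels, discriminants, threshold=0.0):
    """Count true/false positives and negatives.

    A point is predicted positive when its discriminant exceeds the
    threshold (assumed to be 0 here); labels are +1 / -1.
    """
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for y, d in zip(labels, discriminants):
        predicted_positive = d > threshold
        if predicted_positive:
            counts["tp" if y > 0 else "fp"] += 1
        else:
            counts["fn" if y > 0 else "tn"] += 1
    return counts


def roc_auc(labels, discriminants):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the
    positive has the larger discriminant; ties count as half.
    """
    pos = [d for y, d in zip(labels, discriminants) if y > 0]
    neg = [d for y, d in zip(labels, discriminants) if y <= 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

The pairwise form is quadratic in the number of samples, which is fine for small sets; a rank-based formula would be the usual choice for large ones.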

10 gist-rfe
Recursive feature elimination – SVM
–Initialize the data to contain all features
–Train an SVM on the data
–Rank features according to SVM weights
–Eliminate lower 50% of features
–Repeat until 1 feature is left
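The loop described on this slide can be sketched as below. To keep the example self-contained, the default trainer is a stand-in that uses the difference of class means as feature weights; it is NOT an SVM, and a real linear SVM trainer would be passed in as `train_weights`. All names are hypothetical, and rounding the kept half down for odd feature counts is an assumption.

```python
def mean_difference_weights(rows, labels):
    """Stand-in trainer (not an SVM): weight per feature = difference of class means."""
    pos = [r for r, y in zip(rows, labels) if y > 0]
    neg = [r for r, y in zip(rows, labels) if y <= 0]
    return [
        sum(r[j] for r in pos) / len(pos) - sum(r[j] for r in neg) / len(neg)
        for j in range(len(rows[0]))
    ]


def recursive_feature_elimination(rows, labels, train_weights=mean_difference_weights):
    """Halve the feature set each round until one feature remains.

    Returns the list of surviving feature indices after every round,
    starting with the full set.
    """
    active = list(range(len(rows[0])))  # start with all features
    history = [active[:]]
    while len(active) > 1:
        # Train on the currently active features only.
        subset = [[r[j] for j in active] for r in rows]
        weights = train_weights(subset, labels)
        # Rank active features by absolute weight, largest first.
        order = sorted(range(len(active)), key=lambda k: abs(weights[k]), reverse=True)
        keep = len(active) // 2  # drop the lower half (assumption: round down)
        active = sorted(active[k] for k in order[:keep])
        history.append(active[:])
    return history
```

Returning the full history rather than only the last survivor makes it possible to evaluate a classifier at each feature-set size.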

11 Gist SVM Web Interface
SVM Training and Testing
Normalize data by mean centering or z-score
Adjust kernel settings (linear, polynomial, or radial basis)
Demo (http://svm.sdsc.edu/svm-intro.html)
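For reference, the three kernel families mentioned here are commonly defined as in the sketch below. The parameter names (`degree`, `coef0`, `width`) and the exact parameterization are assumptions for illustration; the Gist web interface may expose and scale these settings differently.

```python
import math


def linear_kernel(x, z):
    """Plain dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x, z, degree=2, coef0=1.0):
    """(x . z + coef0) ** degree."""
    return (linear_kernel(x, z) + coef0) ** degree


def rbf_kernel(x, z, width=1.0):
    """Radial basis kernel: exp(-||x - z||^2 / (2 * width^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * width ** 2))


def kernel_matrix(rows, kernel=linear_kernel):
    """Square matrix of kernel values between every pair of samples."""
    return [[kernel(a, b) for b in rows] for a in rows]
```

Because each kernel is symmetric in its arguments, the resulting matrix is symmetric, and the radial basis kernel always has ones on the diagonal.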

12 Comparison to MAGMA
MAGMA
Normalizations
–Row (gene) mean center
–Row (gene) median center
–Column mean center
–Column median center
–Row z-score
–Column z-score
–Quantile
–Handles missing values
Gist (Web)
Normalizations
–Column (sample) mean center
–Column (sample) z-score
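The two column normalizations offered by the Gist web interface can be sketched as follows. This is a minimal illustration, assuming a matrix stored as a list of rows (genes) with one column per sample; the function names are hypothetical, and the use of the population standard deviation is an assumption, since the slides do not say which estimator either tool uses.

```python
from statistics import mean, pstdev


def column_mean_center(matrix):
    """Subtract each column's mean from every value in that column."""
    means = [mean(col) for col in zip(*matrix)]
    return [[v - m for v, m in zip(row, means)] for row in matrix]


def column_z_score(matrix):
    """Center each column and divide by its standard deviation.

    Assumes no column is constant (standard deviation must be nonzero).
    """
    columns = list(zip(*matrix))
    means = [mean(col) for col in columns]
    deviations = [pstdev(col) for col in columns]
    return [
        [(v - m) / s for v, m, s in zip(row, means, deviations)]
        for row in matrix
    ]
```

The row (gene) variants listed for MAGMA follow from the same functions applied to the transposed matrix.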

13 Comparison to MAGMA
MAGMA
Classifiers
–SVM
–Fisher’s Discriminant
–SDF
Data Representation
–Visualization of classifiers
–Database storage
Gist (Web)
Classifiers
–SVM
Data Representation
–Text files
–HTML output

14 Comparison to MAGMA
MAGMA
Ranking Methods
–Resubstitution
–Cross validation
–Bootstrap
–Bolstering
Gist (Web)
Ranking Methods
–Fisher criterion
–t-test
–SAM
–Mann-Whitney
–Welch t-test

