
CLOP: A MATLAB® learning object package



2 CLOP: A MATLAB® learning object package
http://clopinet.com/CLOP/ support@clopinet.com

3 What is CLOP? CLOP stands for Challenge Learning Object Package. It was developed for use in machine learning challenges with hundreds of thousands of features and/or examples.

4 What is CLOP? CLOP is an object-oriented Matlab package using the “Spider” interface.
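
Every CLOP object is driven through the same two verbs, train and test. A minimal sketch of the idiom, assuming (as is usual for Spider-style objects) that kridge called with no arguments uses default hyperparameters; the data, kridge, and train objects are introduced on the following slides:
% Spider-style idiom: wrap data, then train any learning object on it.
dat = data(X, Y);                    % a data object (slide 6); X, Y are any examples/labels
[resu, Model] = train(kridge, dat);  % results plus a trained model (slide 8)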

5 DATA OBJECTS

6 data(X, Y)
% Load data:
X=load([data_dir 'gisette_train.data']);
Y=load([data_dir 'gisette_train.labels']);
% Create a data object and examine it:
dat=data(X, Y);
browse(dat, 2);
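
The data object simply bundles the example matrix X and the label vector Y. A sketch of reading the contents back, assuming the standard Spider accessors get_x and get_y (an assumption; they are not shown elsewhere in this deck):
% Read back the contents of a data object:
Xb = get_x(dat);  % the example matrix (assumed Spider accessor)
Yb = get_y(dat);  % the labels (assumed Spider accessor)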

7 ALGORITHM OBJECTS

8 algo(hyperparam)
% Create data objects:
trainD=data(X,Y);
testD=data(Xt,Yt);
% Define some hyperparameters:
hyper = {'degree=3', 'shrinkage=0.1'};
% Create a kernel ridge regression model:
model = kridge(hyper);
% Train it and test it:
[resu, Model] = train(model, trainD);
tresu = test(Model, testD);
% Visualize the results:
roc(tresu);
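
Because every algorithm object obeys the same interface, swapping learners is a one-line change. A sketch reusing only objects that appear in this deck (naive, the CLOP naive Bayes learner, is used on a later slide):
% Swap in a different learner on the same data objects:
model2 = naive;
[resu2, Model2] = train(model2, trainD);
tresu2 = test(Model2, testD);
roc(tresu2);  % compare with roc(tresu) from the kridge model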

9 COMPOUND MODELS

10 Preprocessing
% For example, create a smoothing kernel:
my_ker=gauss_ker({'dim1=11', 'dim2=11', 'sigma1=2', 'sigma2=2'});
show(my_ker);
% Create a preprocessing object of type convolve:
my_prepro=convolve(my_ker);
% Perform the preprocessing and visualize the results:
d=train(my_prepro, dat);
browse(d, 2);
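
Training a preprocessing object returns the transformed data, which is itself a data object, so it can be fed straight into any learning machine. A sketch, reusing the kridge model and hyper cell array from the previous slide:
% The preprocessed output d is a data object like any other:
[resu, Model] = train(kridge(hyper), d);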

11 chain({model1, model2,…}) ensemble({model1, model2,…})
% Combine preprocessing and kernel ridge regression:
model = chain({my_prepro, kridge(hyper)});
% Combine replicas of a base learner:
for k=1:10
    base_model{k}=chain({my_prepro, naive});
end
my_model=ensemble(base_model);
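
A chain or an ensemble is itself a model object, so it is trained and tested exactly like a single learner (the same commands appear on slide 13):
% Compound models obey the same interface as simple ones:
[resu, Model] = train(my_model, trainD);
tresu = test(Model, testD);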

12 BASIC METHODS

13 train(model, trainD) test(Model, testD)
% After creating your complex model, just one command: train
model=ensemble({chain({standardize,kridge(hyper)}), chain({normalize,naive})});
[resu, Model] = train(model, trainD);
% After training your complex model, just one command: test
tresu = test(Model, testD);
% You can chain with a “cv” object to perform cross-validation:
cv_model=cv(my_model);
% Just call train and test on it!
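
Putting the previous slides together, a full session is only a few lines. A sketch, assuming the Gisette training files from slide 6 and hypothetical validation-file names following the same pattern:
% End-to-end sketch; the *_valid file names are assumptions:
X  = load([data_dir 'gisette_train.data']);
Y  = load([data_dir 'gisette_train.labels']);
Xt = load([data_dir 'gisette_valid.data']);    % assumed file name
Yt = load([data_dir 'gisette_valid.labels']);  % assumed file name
trainD = data(X, Y);
testD  = data(Xt, Yt);
my_model = chain({standardize, kridge({'degree=3', 'shrinkage=0.1'})});
[resu, Model] = train(my_model, trainD);
tresu = test(Model, testD);
roc(tresu);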

14 BASIC OBJECTS

15 Some CLOP objects
- Basic learning machines
- Feature selection, pre- and post-processing
- Compound models
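
For orientation, the object names appearing elsewhere in this deck fall into the three groups above (this grouping is approximate and not an exhaustive list of CLOP):
- Basic learning machines: kridge, naive, svc, neural, rf, gentleboost
- Feature selection, pre- and post-processing: standardize, normalize, s2n, relief, probe, convolve, bias
- Compound models: chain, ensemble, cv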

16 BENCHMARKS

17 NIPS 2003 Feature Selection Challenge

Class taught at ETH, Zurich, winter 2005. Task of the students: a baseline method is provided, with performance BER0 using n0 features. Get BER < BER0, or BER = BER0 with n < n0. Extra credit for beating the best challenge entry.

Dataset   Size     Type            Features  Training ex.  Validation ex.  Test ex.
Arcene    8.7 MB   Dense           10000     100           100             700
Gisette   22.5 MB  Dense           5000      6000          1000            6500
Dexter    0.9 MB   Sparse integer  20000     300           300             2000
Dorothea  4.7 MB   Sparse binary   100000    800           350             800
Madelon   2.9 MB   Dense           500       2000          600             1800

ARCENE: Best BER=11.9 ± 1.2% - n0=1100 (11%) - BER0=14.7%
my_svc=svc({'coef0=1', 'degree=3', 'gamma=0', 'shrinkage=0.1'});
my_model=chain({standardize, s2n('f_max=1100'), normalize, my_svc})

GISETTE: Best BER=1.26 ± 0.14% - n0=1000 (20%) - BER0=1.80%
my_classif=svc({'coef0=1', 'degree=3', 'gamma=0', 'shrinkage=1'});
my_model=chain({normalize, s2n('f_max=1000'), my_classif});

DEXTER: Best BER=3.30 ± 0.40% - n0=300 (1.5%) - BER0=5%
my_classif=svc({'coef0=1', 'degree=1', 'gamma=0', 'shrinkage=0.5'});
my_model=chain({s2n('f_max=300'), normalize, my_classif})

DOROTHEA: Best BER=8.54 ± 0.99% - n0=1000 (1%) - BER0=12.37%
my_model=chain({TP('f_max=1000'), naive, bias});

MADELON: Best BER=6.22 ± 0.57% - n0=20 (4%) - BER0=7.33%
my_classif=svc({'coef0=1', 'degree=0', 'gamma=1', 'shrinkage=1'});
my_model=chain({probe(relief,{'p_num=2000', 'pval_max=0'}), standardize, my_classif})

(The slide also shows a sample DEXTER document: an excerpt of an Instinet press release.)

Reference: Isabelle Guyon, Jiwen Li, Theodor Mader, Patrick A. Pletscher, Georg Schneider and Markus Uhr, Competitive baseline methods set new standards for the NIPS 2003 feature selection benchmark, Pattern Recognition Letters, Volume 28, Issue 12, 1 September 2007, Pages 1438-1444.
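
Each of these models is an ordinary CLOP chain, so a baseline can be reproduced with just the basic methods of slide 13. A sketch for the GISETTE entry, assuming trainD and testD data objects have been built from the Gisette files as on slides 6 and 8:
% Reproduce the GISETTE baseline (data objects built as on earlier slides):
my_classif = svc({'coef0=1', 'degree=3', 'gamma=0', 'shrinkage=1'});
my_model   = chain({normalize, s2n('f_max=1000'), my_classif});
[resu, Model] = train(my_model, trainD);
tresu = test(Model, testD);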

18 NIPS 2006 Model Selection Game

Dataset  Domain               Features  Training ex.  Validation ex.  Test ex.
ADA      Marketing            48        4147          415             41471
GINA     Digit recognition    970       3153          315             31532
HIVA     Drug discovery       1617      3845          384             38449
NOVA     Text classification  16969     1754          175             17537
SYLVA    Ecology              216       13086         1309            130857

First place: Juha Reunanen, cross-indexing-7. CLOP models selected:
ADA: 2*{sns,std,norm,gentleboost(neural),bias}; 2*{std,norm,gentleboost(kridge),bias}; 1*{rf,bias}
GINA: 6*{std,gs,svc(degree=1)}; 3*{std,svc(degree=2)}
HIVA: 3*{norm,svc(degree=1),bias}
NOVA: 5*{norm,gentleboost(kridge),bias}
SYLVA: 4*{std,norm,gentleboost(neural),bias}; 4*{std,neural}; 1*{rf,bias}

Second place: Hugo Jair Escalante Balderas, BRun2311062. CLOP models selected:
ADA: {sns, std, norm, neural(units=5), bias}
GINA: {norm, svc(degree=5, shrinkage=0.01), bias}
HIVA: {std, norm, gentleboost(kridge), bias}
NOVA: {norm, gentleboost(neural), bias}
SYLVA: {std, norm, neural(units=1), bias}
Note: the entry Boosting_1_001_x900 gave better results, but was older.

sns = shift’n’scale, std = standardize, norm = normalize (some details of hyperparameters not shown).

(The slide also shows a sample NOVA document: a newsgroup post about goalie masks.)

References: H. Jair Escalante, Manuel Montes y Gómez, and Luis Enrique Sucar, PSMS for Neural Networks, Proc. IJCNN07, Orlando, FL, Aug. 2007. Juha Reunanen, Model Selection and Assessment Using Cross-indexing.

19 Credits
The Challenge Learning Object Package (CLOP) is based on code to which many people have contributed:
- The developers of CLOP: Isabelle Guyon and Amir Reza Saffari Azar.
- The creators of The Spider: Jason Weston, André Elisseeff, Gökhan Bakır, Fabian Sinz.
- The developers of the packages attached to CLOP: Olivier Chapelle, Hugo Jair Escalante Balderas (PSMS), Gavin Cawley (LSSVM), Chih-Chung Chang and Chih-Jen Lin (LIBSVM), Jun-Cheng Chen, Kuan-Jen Peng, Chih-Yuan Yan, Chih-Huai Cheng, and Rong-En Fan (LIBSVM Matlab interface), Junshui Ma and Yi Zhao (second LIBSVM Matlab interface), Leo Breiman and Adele Cutler (Random Forests), Ting Wang (RF Matlab interface), Ian Nabney and Christopher Bishop (NETLAB).
- The contributors to other Spider functions or packages: Thorsten Joachims (SVMLight), Chih-Chung Chang and Chih-Jen Lin (LIBSVM), Ronan Collobert (SVM Torch II), Jez Hill, Jan Eichhorn, Rodrigo Fernandez, Holger Froehlich, Gorden Jemwa, Kiyoung Yang, Chirag Patel, Sergio Rojas.
- The authors of the Weka package and the R project, who made code available which was interfaced to Matlab and made accessible to CLOP.

20 Book with CLOP and datasets
Feature Extraction: Foundations and Applications, Isabelle Guyon, Steve Gunn, et al., Eds., Springer, 2006. http://clopinet.com/fextract-book/
- CD including CLOP and the data of the NIPS 2003 challenge
- Tutorial chapters
- Invited papers on the best results of the challenge

