Comparing the Parallel Automatic Composition of Inductive Applications with Stacking Methods Hidenao Abe & Takahira Yamaguchi Shizuoka University, JAPAN.

Contents
– Introduction
– Constructive meta-learning based on method repositories
– Case study with common data sets
– Conclusion

Overview of the meta-learning scheme
[Diagram: learning algorithms and a data set are fed to a meta-learning executor, which applies a meta-learning algorithm and meta-knowledge to produce a better result on the given data set than any single learning algorithm achieves on its own.]

Selective meta-learning schemes
Integrating base-level classifiers learned from different training data sets, generated by
– bootstrap sampling (bagging)
– weighting ill-classified instances (boosting)
Integrating base-level classifiers learned with different algorithms, by
– simple voting (voting)
– constructing a meta-classifier from a meta-level data set with a meta-level learning algorithm (stacking, cascading)
These schemes do not work well when no base-level learning algorithm works well on the given data set, because they do not de-compose the base-level learning algorithms.
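For concreteness, here is a minimal sketch of these four schemes in scikit-learn; the paper's own experiments use WEKA, so this is an analogue of the idea rather than the authors' setup.

```python
# Minimal scikit-learn analogue of the selective meta-learning schemes above
# (the paper's experiments use WEKA, not scikit-learn).
from sklearn.datasets import load_iris
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              StackingClassifier, VotingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
base = [("dt", DecisionTreeClassifier()), ("nb", GaussianNB())]
schemes = {
    "bagging": BaggingClassifier(DecisionTreeClassifier(), n_estimators=10),  # bootstrap sampling
    "boosting": AdaBoostClassifier(n_estimators=10),        # re-weights ill-classified instances
    "voting": VotingClassifier(estimators=base),            # simple voting
    "stacking": StackingClassifier(estimators=base,         # meta-classifier on a meta data set
                                   final_estimator=LogisticRegression()),
}
for name, clf in schemes.items():
    print(name, cross_val_score(clf, X, y, cv=10).mean())
```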

Contents
– Introduction
– Constructive meta-learning based on method repositories
  – Basic idea of our constructive meta-learning
  – An implementation of constructive meta-learning, called CAMLET
  – Parallel execution and search for CAMLET
– Case study with common data sets
– Conclusion

Basic idea of our constructive meta-learning
[Diagram: representative inductive learning algorithms (Version Space, AQ15, C4.5, classifier systems, ...) are analyzed and de-composed; the resulting methods are organized into a method repository, from which inductive applications are searched for and composed automatically.]
Organizing inductive learning methods, control structures, and treated objects.

Analysis of representative inductive learning algorithms
We have analyzed the following 8 learning algorithms:
– Version Space
– AQ15
– ID3
– C4.5
– Boosted C4.5
– Bagged C4.5
– Neural Network with back propagation
– Classifier Systems
From them, we have identified 22 specific inductive learning methods.

Inductive learning method repository
[Figure: the repository of the identified inductive learning methods.]

Data type hierarchy
Organization of the input/output/reference data types for inductive learning methods.
[Hierarchy: Objects → data set (training data set, validation data set, test data set) and classifier set (If-Then rules, tree (decision tree), network (neural net)).]
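Rendered as code, the hierarchy might look like the following sketch; the class names are taken from the slide, but the mapping to CAMLET's actual object model is an assumption.

```python
# Illustrative rendering of the slide's data-type hierarchy as Python classes.
class Object: ...
class DataSet(Object): ...                 # data sets
class TrainingDataSet(DataSet): ...
class ValidationDataSet(DataSet): ...
class TestDataSet(DataSet): ...
class ClassifierSet(Object): ...           # classifier sets
class IfThenRules(ClassifierSet): ...
class DecisionTree(ClassifierSet): ...     # "tree"
class NeuralNetwork(ClassifierSet): ...    # "network"
```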

Identifying inductive learning methods and control structures
– Generating training and validation data sets
– Generating a classifier set
– Evaluating classifier sets
– Modifying the training and validation data sets
– Modifying a classifier set
– Evaluating classifier sets on the test data set
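These slots can be read as one generic loop, sketched below with illustrative names (the dict stands in for CAMLET's method repository; this is not the authors' code).

```python
# Schematic sketch of the shared control structure; every name is illustrative.
def run_inductive_application(data, test_data, methods, goal, max_rounds=10):
    train, valid = methods["generate_train_valid"](data)
    classifiers = methods["generate_classifier_set"](train)
    for _ in range(max_rounds):
        if methods["evaluate_classifier_set"](classifiers, valid) >= goal:
            break  # good enough on the validation data
        train, valid = methods["modify_data_sets"](train, valid, classifiers)
        classifiers = methods["modify_classifier_set"](classifiers, train)
    return methods["evaluate_on_test"](classifiers, test_data)
```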

CAMLET: a Computer-Aided Machine Learning Engineering Tool
[Workflow: given data sets and a goal accuracy, CAMLET constructs a specification from the method repository, data type hierarchy, and control structures (Construction), instantiates it (Instantiation), compiles it (Compile), and executes it (Go & Test). If the goal accuracy is not reached, the specification is refined and the loop repeats; otherwise the inductive application and its results are returned to the user.]
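The loop can be sketched as follows; the callables are placeholders for the stages named above, not the authors' code.

```python
# Sketch of CAMLET's construct-instantiate-compile-execute-refine loop.
def camlet(construct, instantiate, compile_spec, go_and_test, refine,
           data_sets, goal_accuracy, max_trials=100):
    spec = instantiate(construct())                # Construction + Instantiation
    best_spec, best_acc = spec, 0.0
    for _ in range(max_trials):
        acc = go_and_test(compile_spec(spec), data_sets)  # Compile + Go & Test
        if acc > best_acc:
            best_spec, best_acc = spec, acc
        if acc >= goal_accuracy:                   # goal accuracy reached?
            break
        spec = refine(spec, acc)                   # Refinement
    return best_spec, best_acc
```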

CAMLET in a parallel environment
[Workflow: the same loop, split into two levels. The Composition Level performs Construction, Instantiation, and Refinement, and sends specifications and data sets to the Execution Level; the Execution Level compiles and executes the inductive applications and sends the results back.]

Parallel executions of inductive applications
– The Composition Level needs exactly one process: it constructs specifications, sends them, receives results, and refines the specifications, sending the refined ones out again.
– The Execution Level needs one or more processing elements (PEs); each PE cycles through receiving a specification, executing it, and sending the result back (R: receiving, Ex: executing, S: sending).
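A minimal sketch of this two-level scheme using Python worker processes; run_spec stands in for compile-and-execute, and everything here is illustrative rather than the authors' implementation.

```python
import random
import time
from multiprocessing import Pool

def run_spec(spec):
    """Execution Level PE: receive a spec, execute it, send back the result."""
    time.sleep(random.random())    # placeholder for running the inductive application
    return spec, random.random()   # placeholder (specification, accuracy) result

def composition_level(specs, n_pes=10):
    """Composition Level: farm specifications out to the Execution Level PEs."""
    with Pool(processes=n_pes) as pool:
        return pool.map(run_spec, specs)

if __name__ == "__main__":
    print(composition_level(["spec-%d" % i for i in range(20)]))
```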

Refinement of inductive applications with a GA
1. Transform the executed specifications into chromosomes.
2. Select parents with the tournament method.
3. Cross over the parents and mutate one of their children.
4. Execute the children's specifications (transforming the chromosomes back into specifications) and add them to generation t+1.
* If CAMLET can execute more than two inductive applications at the same time, some slow inductive applications will be added to generation t+2 or later.
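A minimal sketch of one GA step, assuming a specification is encoded as a fixed-length list of method indices; the slide does not give CAMLET's actual encoding.

```python
import random

def tournament(population, fitness, k=2):
    """Pick the fitter of k randomly sampled chromosomes (tournament method)."""
    return max(random.sample(population, k), key=fitness)

def next_generation(population, fitness, n_children,
                    mutation_rate=0.1, n_methods=22):
    children = []
    while len(children) < n_children:
        p1 = tournament(population, fitness)     # step 2: select parents
        p2 = tournament(population, fitness)
        cut = random.randrange(1, len(p1))       # step 3: one-point crossover...
        c1 = p1[:cut] + p2[cut:]
        c2 = p2[:cut] + p1[cut:]
        c2 = [random.randrange(n_methods)        # ...and mutate one of the children
              if random.random() < mutation_rate else g
              for g in c2]                       # 22 = number of identified methods
        children.extend([c1, c2])
    return children[:n_children]                 # step 4: execute and add to t+1
```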

CAMLET with parallel GA refinement
[Workflow: as on the previous slide, with the GA-based Refinement running at the Composition Level while the Execution Level PEs execute the children's specifications.]

Contents
– Introduction
– Constructive meta-learning based on method repositories
– Case study with common data sets
  – Accuracy comparison with common data sets
  – Evaluation of parallel efficiencies in CAMLET
– Conclusion

Set-up of the accuracy comparison with common data sets
We used two kinds of common data sets:
– 10 StatLog data sets
– 32 UCI data sets (distributed with WEKA)
We gave CAMLET a goal accuracy for each data set as the criterion. For each data set, CAMLET output the specification of the single best inductive application and its performance.
– CAMLET searched a space of about 6,000 inductive applications for the best one, executing up to one hundred of them.
– 10 Execution Level PEs were used for each data set.
The comparison targets are stacking methods implemented in the WEKA data mining suite.

StatLog common data sets
To generate the pairs of 10-fold cross-validation data sets, we used the SplitDatasetFilter implemented in WEKA.

Setting up the stacking methods for the StatLog common data sets
Base-level learning algorithms:
– J4.8 (with pruning, C = 0.25)
– IBk (k = 5)
– Bagged J4.8 (sampling rate = 55%, 5 iterations)
– Bagged J4.8 (sampling rate = 55%, 10 iterations)
– Boosted J4.8 (5 iterations)
– Boosted J4.8 (10 iterations)
– PART (with pruning, C = 0.25)
– Naïve Bayes
Meta-level learning algorithms:
– J4.8 (with pruning, C = 0.25)
– Naïve Bayes
– Classification via Linear Regression (with all features)
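As an illustration, an analogous configuration in scikit-learn might look as follows; the experiments themselves use WEKA, so the learners below (including the pruned tree standing in for PART and logistic regression standing in for Classification via Linear Regression) are only approximations of the WEKA ones.

```python
# scikit-learn sketch of the stacking setup above (an approximation of the
# WEKA configuration used in the paper, not a reproduction of it).
from sklearn.ensemble import StackingClassifier, BaggingClassifier, AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

base_level = [
    ("j48", DecisionTreeClassifier()),               # J4.8 (pruned C4.5-style tree)
    ("ibk5", KNeighborsClassifier(n_neighbors=5)),   # IBk (k = 5)
    ("bag5", BaggingClassifier(DecisionTreeClassifier(),
                               n_estimators=5, max_samples=0.55)),
    ("bag10", BaggingClassifier(DecisionTreeClassifier(),
                                n_estimators=10, max_samples=0.55)),
    ("boost5", AdaBoostClassifier(DecisionTreeClassifier(), n_estimators=5)),
    ("boost10", AdaBoostClassifier(DecisionTreeClassifier(), n_estimators=10)),
    ("part", DecisionTreeClassifier(min_samples_leaf=2)),  # stand-in for PART
    ("nb", GaussianNB()),
]
# One stacking variant per meta-level learner in the table:
meta_level = {
    "j48": DecisionTreeClassifier(),
    "nb": GaussianNB(),
    "lr": LogisticRegression(max_iter=1000),  # stand-in for Classification via LR
}
stackers = {name: StackingClassifier(estimators=base_level, final_estimator=meta)
            for name, meta in meta_level.items()}
```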

Evaluation of accuracies on the StatLog common data sets
The inductive applications composed by CAMLET performed as well as stacking with Classification via Linear Regression.

Setting up the stacking methods for the UCI common data sets
Base-level learning algorithms:
– J4.8 (with pruning, C = 0.25)
– IBk (k = 5)
– Bagged J4.8 (sampling rate = 55%, 10 iterations)
– Boosted J4.8 (10 iterations)
– PART (with pruning, C = 0.25)
– Naïve Bayes
Meta-level learning algorithms:
– J4.8 (with pruning, C = 0.25)
– Naïve Bayes
– Linear Regression (with all features)

Evaluation of accuracies on the UCI common data sets
[Table: average accuracy of stacking with each meta-level algorithm (J4.8, NB, LR), the maximum of the three, and CAMLET; the numeric values did not survive the transcript.]

Analysis of the parallel efficiency of CAMLET
We analyzed the parallel efficiency of the executions in the case study with the StatLog common data sets.
– We used 10 processing elements to execute the inductive applications for each data set.
The parallel efficiency x satisfies 1 ≤ x < 100, where n is the number of folds and e is the execution time of each PE; the formula itself did not survive the transcript.
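Since the formula is lost, the sketch below shows one common definition that is consistent with the reported figures (speedup of the parallel run over a single-PE run); this is an assumption, not the authors' stated formula.

```python
# Assumed reading of the parallel-efficiency figure x: total sequential work
# divided by the makespan of the slowest PE. A guess at the lost formula,
# consistent with "more than 5.93 times" on 10 PEs.
def parallel_speedup(pe_times):
    """pe_times[i] is the list of execution times of the inductive
    applications run on processing element i."""
    sequential = sum(sum(times) for times in pe_times)  # one PE doing everything
    makespan = max(sum(times) for times in pe_times)    # wall clock of parallel run
    return sequential / makespan

# Example: 10 PEs with perfectly balanced load give a speedup of 10.
print(parallel_speedup([[1.0, 2.0, 1.5]] * 10))  # -> 10.0
```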

Parallel efficiency in the case study with the StatLog data sets
[Charts: parallel efficiencies for the 10-fold CV data sets and for the training-test data sets.]

Why was the parallel efficiency on the "satimage" data set so low?
– CAMLET could not find a satisfactory inductive application for this data set within the 100 executions of inductive applications.
– At the end of the search, inductive applications with a large computational cost badly affected the parallel efficiency.

Contents
– Introduction
– Constructive meta-learning based on method repositories
– Case study with common data sets
– Conclusion

Conclusion
CAMLET has been implemented as a tool for the constructive meta-learning scheme based on method repositories.
CAMLET shows significant performance as a meta-learning scheme.
– We are extending the method repository to construct data mining applications covering the whole data mining process.
The parallel efficiency of CAMLET reached more than 5.93 times.
– This should be improved with methods that equalize the execution times, such as load balancing and/or a three-layered architecture, and with a more efficient search method that finds a satisfactory inductive application faster.

Thank you!

Stacking: training and prediction
[Diagram. Training phase: base-level classifiers are learned from the training set; repeating with cross-validation over internal training/test splits, their predictions are transformed into a meta-level training set, from which the meta-level classifier is learned by meta-level learning. Prediction phase: the base-level classifiers' predictions on the test set are transformed into a meta-level test set, and the meta-level classifier produces the final prediction results.]

How to transform base-level attributes into meta-level attributes
1. Get the prediction probability for each class value from the classifier learned with each base-level algorithm.
2. Add the base-level class value of each base-level instance.
With A algorithms and C class values, the number of meta-level attributes = A × C.
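A minimal sketch of this transformation in scikit-learn (the paper's pipeline is WEKA-based; this only illustrates the A × C attribute count):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_predict
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# With A = 2 base-level algorithms and C = 3 class values, each instance gets
# A * C = 6 meta-level attributes (one class probability per algorithm/class
# pair); the base-level class value y is appended as the meta-level class.
X, y = load_iris(return_X_y=True)
algorithms = [DecisionTreeClassifier(), GaussianNB()]
meta_X = np.hstack([cross_val_predict(a, X, y, cv=10, method="predict_proba")
                    for a in algorithms])
print(meta_X.shape)  # (150, 6) -> the meta-level training set, with y as its class
```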

Meta-level classifiers (example)
[Example decision tree over the meta-level attributes: its nodes test the prediction probabilities of class "0" or "1" produced by the classifiers of algorithms 1-3 against thresholds such as 0.1, 0.9, and 0.95, and its leaves give the final prediction "0" or "1".]

Accuracies of the 8 base-level learning algorithms on the StatLog data sets
[Tables: accuracy (%) of each base-level algorithm, for stacking and for CAMLET; the numeric values did not survive the transcript.]

C4.5 decision tree without pruning (reproduction by CAMLET)
1. Select just one control structure.
2. Fill each method slot with specific methods:
– Generating training and validation data sets: with a void validation set
– Generating a classifier set (decision tree): with entropy + information ratio
– Evaluating a classifier set: with set evaluation
– Evaluating classifier sets on the test data set: with single classifier-set evaluation
3. Instantiate this specification.
4. Compile the specification.
5. Execute its executable code.
6. Refine the specification (if needed).