Using Bayesian Network in the Construction of a Bi-level Multi-classifier. A Case Study Using Intensive Care Unit Patients Data

Presentation transcript:

Using Bayesian Network in the Construction of a Bi-level Multi-classifier. A Case Study Using Intensive Care Unit Patients Data. B. Sierra, N. Serrano, P. Larranaga, et al., Artificial Intelligence in Medicine, vol. 22, no. 3, pp. 233-248, June 2001. Presented by Cho, Dong-Yeon. (C) 2001 SNU CSE Biointelligence Lab.

Introduction
- Combining the predictions of a set of classifiers is often more accurate than any of the component classifiers alone.
- The open question is how to create and combine such an ensemble of classifiers.
- The paper proposes a new multi-classifier construction methodology based on the stacked generalization paradigm:
  - a number of classifier layers, where upper-layer classifiers receive the classes predicted by the immediately preceding layer as input;
  - a Bayesian network structure as the combining model.

Multi-classifier Schemata
Multi-classifier structure: stacked generalization
- Each layer of classifiers combines the predictions of the classifiers of its preceding layer.
- A single classifier at the top-most level outputs the ultimate prediction.
Two-level system
- A Bayesian network acts as a consensus voting system over the predictions of the level-0 single classifiers (see the sketch below).
- It can identify the conditional independencies and dependencies existing between the results obtained by the level-0 classifiers.
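A minimal sketch of the two-level idea in Python, assuming scikit-learn and placeholder component classifiers; the paper's exact components and its Bayesian-network combiner are not reproduced here, and logistic regression stands in at level 1:

```python
# Bi-level stacking sketch: level-0 classifiers emit class predictions,
# and a level-1 model is trained on those predictions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

level0 = [DecisionTreeClassifier(random_state=0),
          KNeighborsClassifier(),
          GaussianNB()]
for clf in level0:
    clf.fit(X_tr, y_tr)

# Level-1 input: the class predicted by each level-0 classifier.
meta_tr = np.column_stack([clf.predict(X_tr) for clf in level0])
meta_te = np.column_stack([clf.predict(X_te) for clf in level0])

# The paper uses a Bayesian network here; logistic regression stands in.
level1 = LogisticRegression().fit(meta_tr, y_tr)
print("stacked accuracy:", level1.score(meta_te, y_te))
```

Training the level-1 model on resubstitution predictions is optimistic; the paper avoids this with the leave-one-out construction described below.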


Multi-classifier Construction
Leave-one-out sequence
- Each level-0 classifier is trained on the n-1 remaining training examples.
- The learned model is tested on the jth case.
- The obtained predictions are used as the training set in the Bayesian network construction (see the sketch below).
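A sketch of that construction, assuming scikit-learn-style classifiers (`clone`, `fit`, and `predict` are standard scikit-learn conventions; the function name is mine):

```python
# Leave-one-out construction of the level-1 training set: for each case j,
# every level-0 classifier is trained on the other n-1 cases and its
# prediction on case j becomes one level-1 input feature.
import numpy as np
from sklearn.base import clone
from sklearn.model_selection import LeaveOneOut

def level1_training_set(level0, X, y):
    meta = np.empty((len(y), len(level0)), dtype=y.dtype)
    for train_idx, test_idx in LeaveOneOut().split(X):
        j = test_idx[0]
        for k, clf in enumerate(level0):
            model = clone(clf).fit(X[train_idx], y[train_idx])
            meta[j, k] = model.predict(X[test_idx])[0]
    return meta  # one row per case: the classes predicted for that case
```

Note the cost: n model fits per level-0 classifier.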

Layer-0 Component Classifiers
Decision trees: avoiding overfitting
- Prepruning: weighing the discriminant capability of the selected attribute and, when it is too low, discarding further splitting of the dataset.
- Postpruning: first allowing a large expansion of the tree, then removing branches and leaves.
- ID3 uses only prepruning; C4.5 uses both pruning techniques. Both styles are sketched below.
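A hedged illustration with scikit-learn's CART trees (scikit-learn implements neither ID3 nor C4.5, and the thresholds below are arbitrary): prepruning as an early-stopping criterion, postpruning as cost-complexity pruning over a fully grown tree.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Prepruning: refuse splits whose impurity gain falls below a threshold.
pre = DecisionTreeClassifier(min_impurity_decrease=0.01, random_state=0)
pre.fit(X_tr, y_tr)

# Postpruning: grow the full tree, then apply cost-complexity pruning.
post = DecisionTreeClassifier(ccp_alpha=0.01, random_state=0)
post.fit(X_tr, y_tr)

print(pre.get_n_leaves(), post.get_n_leaves(),
      pre.score(X_te, y_te), post.score(X_te, y_te))
```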

Instance-based learning
- k-nearest neighbor (k-NN) algorithm with a similarity function.
- IB4 keeps a classification performance record for each saved instance and uses a significance test to remove saved instances that are believed to be noisy.
- IB4 also weights attributes: the weight of each attribute reflects the attribute's relative importance for classification. Weights are increased for attributes with similar values on correct classifications or different values on incorrect classifications, and decreased otherwise (sketched below).
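A minimal sketch of attribute-weighted k-NN with a simplified IB4-style weight update; the similarity threshold and learning rate are my assumptions, not Aha's exact rule:

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x, w, k=3):
    # Per-attribute weights w scale each dimension of the distance.
    d = np.sqrt((w * (X_train - x) ** 2).sum(axis=1))
    nearest = np.argsort(d)[:k]
    vals, counts = np.unique(y_train[nearest], return_counts=True)
    return vals[np.argmax(counts)]

def update_weights(w, x, neighbor, correct, lr=0.05):
    # Simplified IB4-style rule: raise a weight when the attribute's
    # values agree and the classification was correct, or when they
    # disagree and it was incorrect; lower it otherwise.
    agree = np.abs(x - neighbor) < 0.1  # assumed similarity threshold
    return np.clip(w + np.where(agree == correct, lr, -lr), 0.0, None)
```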

Rule induction
- CN2 rule induction program: designed to induce short, simple, comprehensible rules in domains where problems of a poor description language and/or noise may be present. Rules are searched in a general-to-specific way and applied by strict matching.
- oneR: a very simple rule inducer that finds and applies only the single best rule in the data file (sketched below).
- Ripper: a fast rule inducer.
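A compact oneR sketch for categorical attributes, my own minimal rendering of the idea rather than Holte's implementation:

```python
import numpy as np
from collections import Counter

def one_r(X, y):
    # For each attribute, map each value to its majority class and count
    # training errors; keep the attribute whose rule errs least.
    best = None
    for a in range(X.shape[1]):
        rule = {v: Counter(y[X[:, a] == v]).most_common(1)[0][0]
                for v in np.unique(X[:, a])}
        errors = sum(rule[v] != c for v, c in zip(X[:, a], y))
        if best is None or errors < best[2]:
            best = (a, rule, errors)
    return best  # (attribute index, value -> class rule, training errors)
```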

Naive Bayes (NB) classifiers
- Assumption of independence between feature values given the class.
- NB classifier (its computation is sketched below).
- NBTree classifier: builds a decision tree and applies a Naive Bayes classifier at each leaf of the tree.
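A sketch of the computation for categorical data; the Laplace smoothing is my addition for robustness, not necessarily the paper's:

```python
import numpy as np

def nb_fit(X, y, alpha=1.0):
    # P(c) and P(x_a = v | c) estimated by (smoothed) relative frequencies.
    classes = np.unique(y)
    priors = {c: np.mean(y == c) for c in classes}
    cond = {}
    for a in range(X.shape[1]):
        values = np.unique(X[:, a])
        for c in classes:
            col = X[y == c, a]
            for v in values:
                cond[(a, v, c)] = ((np.sum(col == v) + alpha)
                                   / (len(col) + alpha * len(values)))
    return classes, priors, cond

def nb_predict(x, classes, priors, cond):
    # argmax_c  log P(c) + sum_a log P(x_a | c)
    score = lambda c: np.log(priors[c]) + sum(
        np.log(cond[(a, v, c)]) for a, v in enumerate(x))
    return max(classes, key=score)
```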

Layer-1 Classifier: Bayesian Network
Bayesian networks
- BNs are directed acyclic graphs (DAGs) built on the concept of conditional independence among variables.
- A BN constitutes an efficient device for probabilistic inference (see the toy example below).
- The problem of building such a network from data remains.
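A toy example of the factorisation and of inference by enumeration, with made-up probability tables for the chain A -> B -> C:

```python
# The DAG A -> B -> C factorises the joint as P(a,b,c) = P(a)P(b|a)P(c|b).
P_A = {0: 0.7, 1: 0.3}
P_B_A = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}  # P_B_A[a][b]
P_C_B = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}  # P_C_B[b][c]

def joint(a, b, c):
    return P_A[a] * P_B_A[a][b] * P_C_B[b][c]

# Inference by enumeration: P(C = 1) by summing out A and B.
p_c1 = sum(joint(a, b, 1) for a in (0, 1) for b in (0, 1))
print(p_c1)
```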

Bayesian networks as classifiers
Naive Bayes approach
- Assumes independence among all the predictor variables given the class.
- The Bayesian network structure is fixed: all predictor variables are children of the variable to be predicted.

Markov Blanket (MB) approach
- In a BN, any variable is influenced only by its Markov blanket: its parent variables, its children variables, and the parents of its children.
- The search is performed in the set of structures that are a MB of the variable to be classified (a helper for reading the blanket off a DAG is sketched below).
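Reading the Markov blanket off a DAG is mechanical; here is a small helper, with a DAG encoding of my choosing:

```python
def markov_blanket(children, node):
    # children: dict mapping each node to the set of its children.
    parents = {p for p, kids in children.items() if node in kids}
    kids = set(children.get(node, ()))
    # Co-parents: the other parents of the node's children.
    spouses = {p for ch in kids
               for p, ks in children.items() if ch in ks and p != node}
    return parents | kids | spouses

# Example: C -> X1, C -> X2, Z -> X2  =>  MB(C) = {X1, X2, Z}
dag = {"C": {"X1", "X2"}, "Z": {"X2"}}
print(markov_blanket(dag, "C"))
```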

Genetic algorithm

    begin AGA
        Make initial population at random
        WHILE NOT stop DO
        BEGIN
            Select parents from the population
            Produce children from the selected parents
            Mutate the individuals
            Extend the population by adding the children to it
            Reduce the extended population
        END
        Output the best individual found
    end AGA
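A generic Python rendering of this loop; in the paper each individual encodes a BN structure and fitness is estimated classification accuracy, but here `fitness`, `init`, `crossover`, and `mutate` are caller-supplied placeholders:

```python
import random

def genetic_algorithm(fitness, init, crossover, mutate,
                      pop_size=20, generations=50, seed=0):
    rng = random.Random(seed)
    population = [init(rng) for _ in range(pop_size)]
    for _ in range(generations):
        parents = rng.sample(population, 2)             # select parents
        child = mutate(crossover(*parents, rng), rng)   # produce + mutate
        population.append(child)                        # extend
        population.sort(key=fitness, reverse=True)
        population = population[:pop_size]              # reduce
    return population[0]                                # best individual
```

As a smoke test, bit-string individuals with `fitness=sum` work; swapping in a BN-structure encoding and an accuracy-based fitness recovers the slide's procedure.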

Notation and representation
- Each GA individual encodes a BN structure, either assuming an ordering between the nodes or without assuming an ordering between the nodes.

Obtained model
- Bayesian networks with a Markov blanket structure with respect to the class variable are induced automatically using GAs.
- Each individual in the GA is a BN structure in which all the predictor variables form the MB of the variable to be classified.

Experimental Results
Datafile
- 1210 ICU patients: survival 996 cases (82.31%), non-survival 214 cases (17.69%).
- Evaluation by 10-fold cross-validation (protocol sketched below).
[Table: ICU datafile variables]
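The evaluation protocol, sketched on synthetic data with a stand-in classifier (the ICU datafile itself is not public; the class imbalance below mimics the reported 82.31/17.69 split):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in: 1210 cases with roughly the paper's class balance.
X, y = make_classification(n_samples=1210, weights=[0.8231], random_state=0)
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(GaussianNB(), X, y, cv=cv)
print(scores.mean(), scores.std())
```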

Results
[Table: accuracy of standard medical methods]
[Table: accuracy of ML standard approaches and the multi-classifier]

Conclusion and Further Work
- A new multi-classifier construction method that outperforms existing standard machine learning methods by combining them for predicting the survival of patients at the ICU.
- As further work, the method will be applied taking the specificity and sensitivity of the data into account, and it will be applied to larger databases.