Institut für Softwarewissenschaft – Universität Wien, P. Brezany
Meta-Learning in Distributed Data Mining Systems
Peter Brezany, Institut für Softwarewissenschaft


Institut für Softwarewissenschaft - Universität WienP.Brezany 1 Meta-Learning in Distributed Datamining Systems Peter Brezany Institut für Softwarewissenschaft Universität Wien Tel. : 01/ Sprechstunde: Dienstag

Slide 2: Introduction
Meta-learning (learning from learned knowledge) is a technique for computing a "global" classifier from large and inherently distributed databases. A number of independent classifiers ("base classifiers") are computed in parallel. The base classifiers are then collected and combined into a "meta-classifier" by another learning process, improving on the predictive accuracy of the base classifiers. Assuming a system of several databases interconnected through an intranet or the internet, the goal is to provide the means for each data site to utilize its own local data and, at the same time, benefit from the data available at other data sites without transferring or directly accessing that data. This concept can be realized by learning agents that execute at remote data sites and generate classifier agents that can subsequently be transferred among the sites.

Slide 3: Meta-Learning Scenario
[Diagram: several training data sets each feed a learning algorithm, producing classifiers; the classifiers generate predictions on a validation data set; the validation data and predictions form the meta-level training data, from which a meta-learning algorithm produces the final classifier.]

Slide 4: Meta-Learning Scenario (2)
1. The classifiers (base classifiers) are trained from the initial (base-level) training sets.
2. Predictions are generated by the learned classifiers on a separate validation set.
3. A meta-level training set is composed from the validation set and the predictions generated by the classifiers on the validation set.
4. The final classifier (meta-classifier) is trained from the meta-level training set.
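The four steps above can be sketched as a small stacking pipeline in pure Python. The toy data, the nearest-class-mean base learner, and the lookup-table meta-learner are illustrative assumptions, not taken from the slides:

```python
from collections import Counter

# Toy base-level training set and validation set: (feature_vector, class).
base_train = [((0.1, 0.9), "A"), ((0.2, 0.8), "A"),
              ((0.9, 0.1), "B"), ((0.8, 0.3), "B")]
validation = [((0.15, 0.85), "A"), ((0.85, 0.2), "B"), ((0.4, 0.6), "A")]

def train_stump(data, feat):
    """Base learner: assign the class whose mean value of one feature
    is closest to the instance's value for that feature."""
    per_class = {}
    for vec, label in data:
        per_class.setdefault(label, []).append(vec[feat])
    centers = {lab: sum(vs) / len(vs) for lab, vs in per_class.items()}
    return lambda vec: min(centers, key=lambda lab: abs(vec[feat] - centers[lab]))

# Step 1: train base classifiers on the base-level training set.
c1, c2 = train_stump(base_train, 0), train_stump(base_train, 1)

# Steps 2-3: predictions on the validation set form the meta-level
# training set T = {(class(x), C1(x), C2(x))}.
T = [(label, c1(vec), c2(vec)) for vec, label in validation]

# Step 4: a trivial meta-classifier -- map each tuple of base predictions
# to the most common true class observed for it in T.
meta_table = {}
for label, p1, p2 in T:
    meta_table.setdefault((p1, p2), Counter())[label] += 1

def final_classifier(vec):
    preds = (c1(vec), c2(vec))
    if preds in meta_table:
        return meta_table[preds].most_common(1)[0][0]
    return Counter(preds).most_common(1)[0][0]  # fall back to majority vote
```

In a distributed setting, each `train_stump` call would run at a remote data site and only the resulting classifiers (not the raw data) would be shipped to the site that builds `T`.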

Slide 5: Strategies for Combining Multiple Predictions from Base Classifiers
1. Voting: each classifier gets one vote; the majority wins.
2. Arbitration: the prediction of an "objective" judge (itself a classifier) is selected when the participating classifiers cannot reach a consensus decision.
3. Combining: exploits knowledge about how the classifiers behave with respect to each other. A combiner is a program generated by a learning algorithm that is trained on the predictions produced by a set of base classifiers on raw data (a hierarchical structure is possible).
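Strategy 1 is the simplest to state in code. A minimal sketch, assuming the base classifiers' outputs arrive as a list of class labels (ties are resolved in first-seen order):

```python
from collections import Counter

def majority_vote(predictions):
    """predictions: list of class labels, one vote per base classifier.
    Returns the label with the most votes."""
    return Counter(predictions).most_common(1)[0][0]
```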

Slide 6: Example: A Combiner with 2 Classifiers
[Diagram: an instance is fed to Classifier 1 and Classifier 2; their outputs (Prediction 1, Prediction 2) are fed to the Combiner.]

Slide 7: Notation
x – an instance (sample) whose classification we seek
C1(x), C2(x), ..., Ck(x) – predicted classifications of x from the k base classifiers Ci, i = 1, 2, ..., k
class(x) – correct classification of x
attrvec(x) – attribute vector of x
E – validation set of examples; x ∈ E
T – set of "meta-level training examples"

Slide 8: Combiner Strategy
The combiner is the meta-classifier (generated by the meta-learner). A composition rule determines the content of the training examples for the meta-learner; it varies across schemes. There are two schemes for the composition rule, according to the strategy used for computing T:
– Class-combiner: the meta-level training instances consist of the correct classification and the predictions, i.e. T = {(class(x), C1(x), C2(x), ..., Ck(x)) | x ∈ E}.
– Class-attribute-combiner: T = {(class(x), C1(x), C2(x), ..., Ck(x), attrvec(x)) | x ∈ E}.
These composition rules are also used in a similar manner during classification, after a combiner has been computed. Given an instance whose classification is sought, we first compute the classifications predicted by each of the base classifiers. The composition rule is then applied to generate a single meta-level test instance, which is then classified by the combiner to produce the final predicted class of the original test datum.
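The two composition rules can be written directly from the set definitions. A sketch in which `classifiers`, `class_of`, and `attrvec` are placeholder callables standing in for the slides' Ci, class(x), and attrvec(x):

```python
def class_combiner(E, classifiers, class_of):
    """T = {(class(x), C1(x), ..., Ck(x)) | x in E}."""
    return [(class_of(x),) + tuple(C(x) for C in classifiers) for x in E]

def class_attribute_combiner(E, classifiers, class_of, attrvec):
    """T = {(class(x), C1(x), ..., Ck(x), attrvec(x)) | x in E}: same as
    class_combiner, but the attribute vector is appended to each instance."""
    return [(class_of(x),) + tuple(C(x) for C in classifiers) + (attrvec(x),)
            for x in E]
```

At classification time the same rules are applied to a single unlabeled instance to build the meta-level test instance that the combiner classifies.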

Slide 9: Combiner Strategy (2)
[Diagram built around the validation set; only the label "Validation Set" is recoverable.]

Slide 10: A Real Medical Application