Cost-Sensitive Learning

Cost-Sensitive Learning
Prepared with Lei Tang
CSE 572: Data Mining by H. Liu

Cost-Sensitive Learning: Motivation
Data come with different misclassification costs.
Objective: minimize the total misclassification cost.
Applications:
- Medical diagnosis
- Fraud detection
- Spam filtering
- Intrusion detection
- ...

Toy Example
A decision tree (T1) built for accuracy, splitting on Body Heat (abnormal / normal) and Tumor (yes / no) to predict Ill or Not Ill.
[Tree diagram of T1 in the original slide]
T1 makes two prediction errors.

Toy Example (cont'd)
Misclassification costs for a false positive and a false negative are 1 and 100 respectively.
Misclassification cost of T1: 1*1 + 100*1 = 101.
T2 is another tree, built with cost in mind, that has a higher error rate but a lower misclassification cost:
- Errors: 3
- Cost: 1*3 = 3
[Tree diagram of T2 in the original slide: Tumor = yes -> Ill; Tumor = no -> Heat = abnormal -> Ill, Heat = normal -> Not Ill]
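
The cost arithmetic above is easy to verify programmatically. Below is a minimal Python sketch (my own illustration, not part of the original slides) that tallies total misclassification cost under the toy example's costs of 1 per false positive and 100 per false negative; the encoding 1 = ill, 0 = not ill is my own.

    COST_FP = 1    # cost of predicting ill when the patient is not ill
    COST_FN = 100  # cost of predicting not ill when the patient is ill

    def total_cost(y_true, y_pred):
        """Sum the misclassification costs over a set of predictions (1 = ill, 0 = not ill)."""
        cost = 0
        for actual, predicted in zip(y_true, y_pred):
            if predicted == 1 and actual == 0:
                cost += COST_FP
            elif predicted == 0 and actual == 1:
                cost += COST_FN
        return cost

    # T1 makes one false positive and one false negative: 1*1 + 100*1 = 101
    print(total_cost([0, 1], [1, 0]))        # 101
    # T2 makes three false positives and no false negatives: 1*3 = 3
    print(total_cost([0, 0, 0], [1, 1, 1]))  # 3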

Cost Matrix
Cost matrix for the toy example (similar in shape to a confusion matrix). Rows give the actual class, columns the predicted class; correct predictions cost 0:

                      Predicted: ill    Predicted: not ill
    Actual: ill             0                  100
    Actual: not ill         1                    0

Quiz: does the absolute value of the entries matter?
How do we get a cost matrix?
- User-defined, or
- Based on the class distribution
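
The standard way to use such a matrix is to predict the class with minimum expected cost. A small Python sketch (my own illustration; the zero diagonal follows the convention that correct predictions cost nothing, which matches the toy example's arithmetic):

    import numpy as np

    # Cost matrix as in the slide: rows = actual class, columns = predicted class,
    # class order [ill, not ill].
    COST = np.array([[0, 100],   # actual ill:     0 if predicted ill, 100 if missed (false negative)
                     [1,   0]])  # actual not ill: 1 for a false alarm, 0 if predicted not ill

    def min_expected_cost_class(class_probs):
        """Predict the class with minimum expected cost.

        class_probs[j] = estimated probability that the actual class is j.
        Expected cost of predicting class i = sum_j class_probs[j] * COST[j, i].
        """
        expected = np.asarray(class_probs) @ COST
        return int(np.argmin(expected))

    # Even with only a 5% chance of illness, predicting "ill" (index 0) is cheaper,
    # because missing an illness costs 100x more than a false alarm.
    print(min_expected_cost_class([0.05, 0.95]))  # 0

Note, regarding the quiz: multiplying the whole matrix by a positive constant rescales every expected cost equally, so the argmin, and hence the prediction, does not change.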

Different Approaches
There exist many techniques:
- Stratification (sampling based on cost)
- Algorithm-specific methods: build or prune a decision tree based on cost; cost-sensitive boosting such as AdaCost; ...
- MetaCost
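
As one concrete illustration of the stratification idea, the sketch below (my own; the exact resampling scheme is an assumption, since the slide only names the idea) oversamples each class in proportion to its misclassification cost before training an ordinary classifier.

    import numpy as np

    def cost_proportional_resample(X, y, class_costs, rng=None):
        """One simple form of stratification: resample so each class appears
        roughly in proportion to the cost of misclassifying it.

        class_costs maps class label -> misclassification cost for that class.
        """
        if rng is None:
            rng = np.random.default_rng(0)
        X, y = np.asarray(X), np.asarray(y)
        base_cost = min(class_costs.values())
        idx = []
        for label, cost in class_costs.items():
            members = np.where(y == label)[0]
            n_draw = int(round(len(members) * cost / base_cost))
            idx.extend(rng.choice(members, size=n_draw, replace=True))
        idx = np.array(idx)
        return X[idx], y[idx]

    # Example: make the rare, expensive "ill" class (label 1) 100x heavier than "not ill" (label 0).
    X = np.arange(10).reshape(-1, 1)
    y = np.array([0] * 8 + [1] * 2)
    X_res, y_res = cost_proportional_resample(X, y, class_costs={0: 1, 1: 100})
    print(np.bincount(y_res))  # [  8 200]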

MetaCost
- Sample the training data multiple times and build a model on each sample.
- For each example, estimate the probability of each class from these models' predictions.
- Relabel the training data based on the probabilities and the cost matrix (each example gets the class with minimum expected cost).
- Build a normal error-based classifier on the relabeled data.
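
A minimal sketch of this procedure in scikit-learn style (my own illustration, not Domingos' exact MetaCost implementation; it assumes integer class labels 0..K-1 and a cost matrix laid out as in the cost-matrix slide, rows = actual class, columns = predicted class):

    import numpy as np
    from sklearn.base import clone
    from sklearn.tree import DecisionTreeClassifier

    def metacost(X, y, cost, base=None, n_bags=10, rng=None):
        """MetaCost-style training: bag models to estimate class probabilities,
        relabel each training example with its minimum-expected-cost class,
        then fit one ordinary error-based classifier on the relabeled data."""
        if base is None:
            base = DecisionTreeClassifier()
        if rng is None:
            rng = np.random.default_rng(0)
        X, y = np.asarray(X), np.asarray(y)
        n, k = len(y), cost.shape[0]

        # 1) Bagging: average class-probability estimates over bootstrap models.
        probs = np.zeros((n, k))
        for _ in range(n_bags):
            idx = rng.integers(0, n, size=n)
            model = clone(base).fit(X[idx], y[idx])
            probs[:, model.classes_.astype(int)] += model.predict_proba(X)
        probs /= n_bags

        # 2) Relabel: expected cost of predicting i is sum_j P(j|x) * cost[j, i].
        y_relabelled = np.argmin(probs @ cost, axis=1)

        # 3) Train a normal error-based classifier on the relabeled data.
        return clone(base).fit(X, y_relabelled)

    # Example usage with the toy example's cost matrix:
    # clf = metacost(X, y, cost=np.array([[0, 100], [1, 0]]))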

Further Issues
- Multiple classes?
- Individual (per-example) costs?
- Different types of cost? Test cost?
(Tasks for the group on cost-sensitive learning!)

Ensemble Learning
Prepared with Surendra

Types of Ensembles
- Homogeneous ensembles: every member uses the same learning algorithm, e.g. bagging, boosting.
- Heterogeneous ensembles: members use different learning algorithms, e.g. a combination of a decision tree, nearest neighbour, K-Star, etc.
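
For concreteness, here is a small scikit-learn sketch (my own, not from the slides) contrasting the two kinds of ensemble on a toy dataset; K-Star is a Weka-specific learner, so a nearest-neighbour classifier stands in for it.

    from sklearn.datasets import load_iris
    from sklearn.ensemble import BaggingClassifier, VotingClassifier
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.naive_bayes import GaussianNB
    from sklearn.model_selection import cross_val_score

    X, y = load_iris(return_X_y=True)

    # Homogeneous ensemble: many copies of the same algorithm (bagged decision trees).
    homogeneous = BaggingClassifier(DecisionTreeClassifier(), n_estimators=25, random_state=0)

    # Heterogeneous ensemble: different algorithms combined by voting
    # (a nearest-neighbour model stands in for Weka's K-Star here).
    heterogeneous = VotingClassifier(
        estimators=[("tree", DecisionTreeClassifier(random_state=0)),
                    ("knn", KNeighborsClassifier()),
                    ("nb", GaussianNB())],
        voting="soft",  # average the predicted class probabilities
    )

    for name, model in [("homogeneous", homogeneous), ("heterogeneous", heterogeneous)]:
        print(name, cross_val_score(model, X, y, cv=5).mean())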

Phases of Building an Ensemble
- Model generation: generate a diverse set of classifiers, via resampling, different learning algorithms, or various other strategies.
- Model combination: decide on a strategy for combining the predictions of the classifiers that make up the ensemble.

Meta-Classification Framework
Classifiers sit at two levels:
- Base-level (low-level) classifiers, generated during the model generation phase.
- A meta-level classifier, created during the model combination phase.

Categorization of Model Combination Strategies
- Voting: as the name implies, take some form of vote over the base classifiers' predictions.
- Stacking: learn a pattern between the predictions of the base classifiers and the actual class label.
- Grading: grade the base classifiers and decide which subset of them should be used.

Voting Techniques
- Majority voting: sum the prediction probabilities that the base classifiers assign to each class and predict in favor of the class with the largest total.
- Weighted voting: assign a weight to each classifier and take a weighted sum of the prediction probabilities; the weights are calculated from the error rates.
- Threshold voting: use majority voting or weighted voting only when the error rate is above a certain threshold.
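
A sketch of weighted voting on top of base-classifier probability estimates (my own illustration; the specific weighting w = 1 - error rate is an assumption, since the slide only says the weight is calculated from the error rate):

    import numpy as np

    def weighted_vote(prob_list, errors):
        """Combine base-classifier class-probability estimates by a weighted sum.

        prob_list: list of arrays of shape (n_samples, n_classes), one per classifier.
        errors:    list of error rates, one per classifier (e.g. from a validation set).
        Weight w = 1 - error is one simple choice; other schemes (e.g. AdaBoost-style
        log-odds weights) are also common.
        """
        weights = np.array([1.0 - e for e in errors])
        weights /= weights.sum()
        combined = sum(w * p for w, p in zip(weights, prob_list))
        return np.argmax(combined, axis=1)

    # Two classifiers disagree on one sample; the more accurate one wins.
    p1 = np.array([[0.9, 0.1]])   # classifier with error rate 0.10
    p2 = np.array([[0.4, 0.6]])   # classifier with error rate 0.40
    print(weighted_vote([p1, p2], [0.10, 0.40]))  # [0]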

Stacking Techniques
Stacking:
- Use the complete class-probability distribution from each base classifier.
- Build a stacking (meta-level) classifier for each class.
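
A minimal stacking sketch (my own illustration, using scikit-learn): the base classifiers' out-of-fold class distributions become the training features for a meta-level classifier. scikit-learn also ships a ready-made StackingClassifier; the manual version below just makes the two levels explicit.

    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.model_selection import cross_val_predict, train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.linear_model import LogisticRegression

    X, y = load_iris(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    bases = [DecisionTreeClassifier(random_state=0), KNeighborsClassifier()]

    # Level 0: out-of-fold class distributions from each base classifier
    # (the full distribution, i.e. plain stacking).
    meta_features = np.hstack(
        [cross_val_predict(b, X_tr, y_tr, cv=5, method="predict_proba") for b in bases]
    )

    # Level 1: the meta classifier is trained on the base classifiers' outputs.
    meta = LogisticRegression(max_iter=1000).fit(meta_features, y_tr)

    # At test time, refit the bases on all training data and stack their outputs.
    test_features = np.hstack([b.fit(X_tr, y_tr).predict_proba(X_te) for b in bases])
    print("stacked accuracy:", meta.score(test_features, y_te))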

StackingC
- Use the class distribution only for the class concerned: the meta model for class k sees only each base classifier's estimated probability of class k.
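
In code terms, the only difference from plain stacking is which columns reach the meta learner. A tiny sketch of the per-class feature selection (my own illustration of the idea; in StackingC one such meta model, typically a linear one, is trained per class):

    import numpy as np

    def stackingc_features(prob_list, class_index):
        """StackingC-style meta features: for the meta model dedicated to class k,
        keep only each base classifier's estimated probability of class k,
        instead of the full class distribution used by plain stacking.

        prob_list: list of (n_samples, n_classes) arrays, one per base classifier.
        """
        return np.column_stack([p[:, class_index] for p in prob_list])

    # Two base classifiers, three classes, two samples:
    p1 = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
    p2 = np.array([[0.6, 0.3, 0.1], [0.2, 0.6, 0.2]])
    print(stackingc_features([p1, p2], class_index=0))
    # [[0.7 0.6]
    #  [0.1 0.2]]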

Grading Techniques
Grading (referee) method: for each base classifier there is a grader classifier that determines whether the base classifier will be correct or not on a given test instance.
[Diagram in the original slide: graders G1, G2, G3 paired with base classifiers C1, C2, C3]
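
A sketch of the grading idea (my own illustration; the choice of learners and the majority-vote fallback are assumptions): each base classifier Ci gets a grader Gi trained, from Ci's out-of-fold mistakes, to predict whether Ci will be correct on a given instance; at prediction time only the base classifiers their graders trust get to vote.

    import numpy as np
    from collections import Counter
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import cross_val_predict, train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.naive_bayes import GaussianNB
    from sklearn.linear_model import LogisticRegression

    X, y = load_breast_cancer(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    bases = [DecisionTreeClassifier(random_state=0), KNeighborsClassifier(), GaussianNB()]
    graders = []
    for clf in bases:
        # Out-of-fold predictions reveal where this base classifier tends to err.
        oof = cross_val_predict(clf, X_tr, y_tr, cv=5)
        was_correct = (oof == y_tr).astype(int)
        # Grader G_i learns to predict whether base classifier C_i will be correct.
        graders.append(LogisticRegression(max_iter=5000).fit(X_tr, was_correct))
        clf.fit(X_tr, y_tr)

    preds = np.array([clf.predict(X_te) for clf in bases])   # shape (n_bases, n_test)
    trust = np.array([g.predict(X_te) for g in graders])     # 1 = grader expects a correct answer

    final = []
    for j in range(preds.shape[1]):
        trusted = preds[trust[:, j] == 1, j]
        pool = trusted if len(trusted) else preds[:, j]      # fall back to all bases if none trusted
        final.append(Counter(pool).most_common(1)[0][0])     # majority vote among trusted predictions
    print("graded-ensemble accuracy:", np.mean(np.array(final) == y_te))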