Evaluating Hypotheses
Reading: Coursepack: Learning From Examples, Section 4 (pp. 16-21)


Evaluating Hypotheses
What we want: a hypothesis that best predicts unseen data.
Assumption: the data are "i.i.d." (independent and identically distributed).

Accuracy and Error
Accuracy = fraction of correct classifications on unseen data (the test set).
Error rate = 1 − Accuracy.
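To make the definitions concrete, here is a minimal Python sketch that computes accuracy and error rate on a test set; the labels are made up for illustration.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of test examples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

# Illustrative test-set labels and predictions (not from the slides).
y_true = ["spam", "ham", "spam", "ham", "spam"]
y_pred = ["spam", "ham", "ham",  "ham", "spam"]

acc = accuracy(y_true, y_pred)
print(f"Accuracy:   {acc:.2f}")      # 0.80
print(f"Error rate: {1 - acc:.2f}")  # 0.20
```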

How do we use the available data to best measure accuracy?
Split the data into training and test sets. But how should we split?
Too little training data: we don't get the optimal classifier.
Too little test data: the measured accuracy is unreliable.
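A minimal sketch of a random train/test split, using only the Python standard library; the 80/20 ratio and the toy dataset are illustrative assumptions.

```python
import random

def train_test_split(examples, test_fraction=0.2, seed=0):
    """Shuffle the examples and hold out a fraction of them for testing."""
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]   # (training set, test set)

data = [(i, i % 2) for i in range(100)]           # toy (feature, label) pairs
train_set, test_set = train_test_split(data, test_fraction=0.2)
print(len(train_set), len(test_set))              # 80 20
```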

One solution: "k-fold cross-validation"
Each example is used both as a training instance and as a test instance.
Split the data into k disjoint parts: S_1, S_2, ..., S_k.
For i = 1 to k: select S_i to be the test set; train on the remaining data and test on S_i to obtain accuracy A_i.
Report the average accuracy, (A_1 + ... + A_k) / k, as the final accuracy.
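The procedure above can be sketched directly in Python. The trivial majority-class "learner" used at the end is only a placeholder so the example runs; any training and evaluation functions could be plugged in.

```python
import random

def k_fold_cross_validation(examples, k, train, evaluate, seed=0):
    """Return the per-fold accuracies A_1..A_k and their average."""
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]        # k disjoint parts S_1..S_k
    accuracies = []
    for i in range(k):
        test_set = folds[i]                           # S_i is the test set
        train_set = [x for j, fold in enumerate(folds) if j != i for x in fold]
        model = train(train_set)
        accuracies.append(evaluate(model, test_set))  # accuracy A_i
    return accuracies, sum(accuracies) / k

# Example usage with a trivial majority-class "learner" (a placeholder only):
def train_majority(train_set):
    labels = [y for _, y in train_set]
    return max(set(labels), key=labels.count)

def evaluate_majority(model, test_set):
    return sum(y == model for _, y in test_set) / len(test_set)

data = [(i, i % 3 == 0) for i in range(90)]
per_fold, average = k_fold_cross_validation(data, k=5,
                                             train=train_majority,
                                             evaluate=evaluate_majority)
print(per_fold, average)
```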

Avoid "peeking" at the test data when training
Example from the readings: split the data into training and test sets. Train a model with one learning parameter (e.g., "gain" vs. "gain ratio") and test it on the test set. Repeat with the other learning parameter and test on the test set. Return the accuracy of the model with the best performance.
What's wrong with this procedure?
Problem: you used the test set to select the best model, but model selection is part of the learning process! There is a risk of overfitting to a particular test set. The final learned model needs to be evaluated on previously unseen data.

We can also solve this problem by using k-fold cross-validation to select model parameters, and then evaluating the resulting model on unseen test data that was set aside prior to training.
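A minimal sketch of that workflow, assuming a toy 1-D threshold classifier whose single parameter (the threshold) plays the role of the learning parameter being selected. Because this toy classifier has no training step, the cross-validation loop simply averages held-out accuracy over the folds; with a real learner you would retrain on the remaining folds each time.

```python
import random

# Toy 1-D data: the true rule is "x > 5.0". The feature values and the
# candidate thresholds below are illustrative assumptions, not from the slides.
rng = random.Random(0)
data = [(x, x > 5.0) for x in (rng.uniform(0, 10) for _ in range(200))]
rng.shuffle(data)

# 1. Set the test data aside BEFORE any model selection.
train_set, test_set = data[:160], data[160:]

def accuracy(threshold, examples):
    return sum((x > threshold) == label for x, label in examples) / len(examples)

# 2. Select the parameter by k-fold cross-validation on the training data only.
def cross_val_accuracy(threshold, examples, k=5):
    folds = [examples[i::k] for i in range(k)]
    return sum(accuracy(threshold, fold) for fold in folds) / k

candidates = [3.0, 4.0, 5.0, 6.0, 7.0]
best = max(candidates, key=lambda t: cross_val_accuracy(t, train_set))

# 3. Evaluate the chosen model once, on the held-out test set.
print("chosen threshold:", best)
print("test accuracy:", accuracy(best, test_set))
```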

Evaluating classification algorithms
"Confusion matrix" for a given class c:

                                    Predicted True (in class c)      Predicted False (not in class c)
  Actual True  (in class c)         True Positive                    False Negative (Type 2 error)
  Actual False (not in class c)     False Positive (Type 1 error)    True Negative
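A minimal sketch of computing the four confusion-matrix counts for a class c; the labels are illustrative.

```python
from collections import Counter

def confusion_counts(true_labels, predicted_labels, c):
    """Return (TP, FN, FP, TN) for class c."""
    counts = Counter()
    for t, p in zip(true_labels, predicted_labels):
        actual_pos = (t == c)
        pred_pos = (p == c)
        if actual_pos and pred_pos:
            counts["TP"] += 1
        elif actual_pos and not pred_pos:
            counts["FN"] += 1          # Type 2 error
        elif not actual_pos and pred_pos:
            counts["FP"] += 1          # Type 1 error
        else:
            counts["TN"] += 1
    return counts["TP"], counts["FN"], counts["FP"], counts["TN"]

y_true = ["c", "c", "not-c", "not-c", "c"]
y_pred = ["c", "not-c", "c", "not-c", "c"]
print(confusion_counts(y_true, y_pred, "c"))   # (2, 1, 1, 1)
```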

Precision: fraction of true positives out of all predicted positives:
    Precision = TP / (TP + FP)
Recall: fraction of true positives out of all actual positives:
    Recall = TP / (TP + FN)
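A minimal sketch computing both quantities from the confusion-matrix counts; the counts continue the small example above.

```python
def precision(tp, fp):
    return tp / (tp + fp) if (tp + fp) else 0.0   # no predicted positives

def recall(tp, fn):
    return tp / (tp + fn) if (tp + fn) else 0.0   # no actual positives

tp, fn, fp, tn = 2, 1, 1, 1
print(f"Precision: {precision(tp, fp):.2f}")      # 0.67
print(f"Recall:    {recall(tp, fn):.2f}")         # 0.67
```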

Error vs. Loss
Loss functions
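The slide only names the topic; as a hedged illustration, here is a sketch of two commonly used loss functions: 0/1 loss, whose average over a test set is the error rate, and squared loss for numeric predictions.

```python
def zero_one_loss(y_true, y_pred):
    """1 if the prediction is wrong, 0 if it is right."""
    return 0.0 if y_true == y_pred else 1.0

def squared_loss(y_true, y_pred):
    """Penalty grows with how far a numeric prediction is from the target."""
    return (y_true - y_pred) ** 2

labels      = [1, 0, 1, 1]
predictions = [1, 1, 1, 0]
avg_01 = sum(zero_one_loss(t, p) for t, p in zip(labels, predictions)) / len(labels)
print(avg_01)   # 0.5, i.e. the error rate on these four examples
```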

Regularization
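Again the slide only names the topic; as a hedged illustration, here is a sketch of the usual shape of a regularized training objective: average loss on the data plus a penalty on the weights (an L2 penalty here), with lambda controlling the trade-off between fitting the data and keeping the model simple.

```python
def regularized_objective(weights, examples, loss, lam=0.1):
    """Average loss over the data plus an L2 penalty on the weights."""
    def predict(x):                                   # simple linear prediction
        return sum(w * xi for w, xi in zip(weights, x))
    data_term = sum(loss(y, predict(x)) for x, y in examples) / len(examples)
    penalty = lam * sum(w * w for w in weights)       # L2 penalty, scaled by lambda
    return data_term + penalty

squared = lambda y, yhat: (y - yhat) ** 2
examples = [([1.0, 2.0], 5.0), ([2.0, 0.5], 3.0)]     # illustrative (x, y) pairs
print(regularized_objective([1.0, 2.0], examples, squared, lam=0.1))   # 0.5
```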