
1 Assessing and Comparing Classification Algorithms: Introduction, Resampling and Cross-Validation, Measuring Error, Interval Estimation and Hypothesis Testing, Assessing and Comparing Performance. (Based on lecture notes for E. Alpaydın, Introduction to Machine Learning, © 2004 The MIT Press, V1.1.)

2 Introduction. Two kinds of questions: (1) assessing the expected error of a learning algorithm, e.g., is the error rate of 1-NN less than 2%?; (2) comparing the expected errors of two algorithms, e.g., is k-NN more accurate than MLP? Answering them requires training/validation/test sets and resampling methods such as K-fold cross-validation.

3 Algorithm Preference. Criteria (application-dependent): misclassification error, or risk (loss functions); training time/space complexity; testing time/space complexity; interpretability; ease of programming. Taking such costs into account is called cost-sensitive learning.

4 Assessing and Comparing Classification Algorithms: Introduction, Resampling and Cross-Validation, Measuring Error, Interval Estimation and Hypothesis Testing, Assessing and Comparing Performance

5 Resampling and K-Fold Cross-Validation. We need multiple training/validation sets {T_i, V_i}, i = 1, ..., K, where T_i and V_i are the training and validation sets of fold i. K-fold cross-validation: divide X into K equal parts X_i, i = 1, ..., K; fold i uses V_i = X_i for validation and T_i, the union of the remaining K − 1 parts, for training. Any two training sets T_i and T_j therefore share K − 2 of the K parts. A minimal code sketch follows.
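A minimal sketch of the K-fold split just described, in Python (the train_and_test function is a placeholder for whatever learner is being assessed; it is not part of the lecture notes):

```python
import numpy as np

def k_fold_indices(n_samples, K, seed=0):
    """Shuffle the indices 0..n_samples-1 and split them into K folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    return np.array_split(idx, K)

def k_fold_error(X, y, train_and_test, K=10):
    """Return the validation error of each fold.

    train_and_test(X_train, y_train, X_val, y_val) -> error rate;
    it stands in for whatever learning algorithm is being assessed."""
    folds = k_fold_indices(len(X), K)
    errors = []
    for i in range(K):
        val_idx = folds[i]                                                   # V_i = X_i
        train_idx = np.concatenate([folds[j] for j in range(K) if j != i])   # T_i = X \ X_i
        errors.append(train_and_test(X[train_idx], y[train_idx], X[val_idx], y[val_idx]))
    return np.array(errors)   # p_1, ..., p_K, used later in the t tests
```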

6 5×2 Cross-Validation: five replications of 2-fold cross-validation (Dietterich, 1998). Each replication splits the data into two halves; each half is used once for training and once for validation, giving 10 error estimates in total.
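A small sketch of the 5×2 scheme under the same placeholder train_and_test assumption:

```python
import numpy as np

def five_by_two_errors(X, y, train_and_test, seed=0):
    """5 replications of 2-fold CV (Dietterich, 1998): each replication splits
    the data in half; each half is used once for training, once for validation."""
    errors = []
    for r in range(5):
        rng = np.random.default_rng(seed + r)
        idx = rng.permutation(len(X))
        a, b = idx[: len(X) // 2], idx[len(X) // 2:]
        errors.append(train_and_test(X[a], y[a], X[b], y[b]))  # train on first half
        errors.append(train_and_test(X[b], y[b], X[a], y[a]))  # roles swapped
    return np.array(errors)  # 10 error estimates in total
```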

7 Bootstrapping. Draw N instances from a dataset of size N with replacement. The probability that a particular instance is not picked in any of the N draws is (1 − 1/N)^N ≈ e^{−1} ≈ 0.368, so a bootstrap sample contains about 63.2% of the original instances; testing on the original dataset, only about 36.8% of it is new.
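A quick numerical check of the 36.8% figure (an illustrative sketch, not part of the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10_000
boot = rng.integers(0, N, size=N)             # N draws with replacement
frac_unseen = 1.0 - len(np.unique(boot)) / N  # fraction of instances never picked
print(frac_unseen, np.exp(-1))                # both close to 0.368
```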

8 Assessing and Comparing Classification Algorithms: Introduction, Resampling and Cross-Validation, Measuring Error, Interval Estimation and Hypothesis Testing, Assessing and Comparing Performance

9 Measuring Error (in terms of the confusion-matrix counts TP, FP, TN, FN and the total number of instances N):
Error rate = # of errors / # of instances = (FN + FP) / N
Recall = # of found positives / # of positives = TP / (TP + FN) = sensitivity = hit rate
Precision = # of found positives / # of found = TP / (TP + FP)
Specificity = TN / (TN + FP)
False alarm rate = FP / (FP + TN) = 1 − specificity
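These definitions map directly to code; a minimal sketch (the function name and example counts are illustrative only):

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the slide's measures from the confusion-matrix counts."""
    n = tp + fp + tn + fn
    return {
        "error_rate":  (fp + fn) / n,
        "recall":      tp / (tp + fn),   # sensitivity, hit rate
        "precision":   tp / (tp + fp),
        "specificity": tn / (tn + fp),
        "false_alarm": fp / (fp + tn),   # = 1 - specificity
    }

print(classification_metrics(tp=40, fp=10, tn=45, fn=5))
```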

10 Methods for Performance Evaluation. How can we obtain a reliable estimate of performance? A model's performance may depend on factors other than the learning algorithm: the class distribution, the cost of misclassification, and the sizes of the training and test sets.

11 Learning Curve. A learning curve shows how accuracy changes with the size of the training sample. Producing one requires a sampling schedule: arithmetic sampling (Langley et al.) or geometric sampling (Provost et al.); see the sketch below. Effects of a small sample size: bias in the estimate and variance of the estimate.
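A small sketch of the two sampling schedules (the start, step, and ratio values are arbitrary placeholders); each size would be used to train on a random subset and record validation accuracy, giving one point on the learning curve:

```python
def arithmetic_schedule(n_total, start=100, step=100):
    """Sample sizes start, start+step, start+2*step, ... up to n_total."""
    return list(range(start, n_total + 1, step))

def geometric_schedule(n_total, start=100, ratio=2):
    """Sample sizes start, start*ratio, start*ratio^2, ... up to n_total."""
    sizes, n = [], start
    while n <= n_total:
        sizes.append(n)
        n *= ratio
    return sizes

print(arithmetic_schedule(1000))  # [100, 200, ..., 1000]
print(geometric_schedule(1000))   # [100, 200, 400, 800]
```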

12 ROC (Receiver Operating Characteristic). Developed in the 1950s within signal detection theory to analyze noisy signals; it characterizes the trade-off between positive hits and false alarms. An ROC curve plots the TP rate (y-axis) against the FP rate (x-axis). The performance of a classifier at a given operating point corresponds to one point on the ROC curve; changing the algorithm's threshold, the sample distribution, or the cost matrix changes the location of that point. See http://en.wikipedia.org/wiki/Receiver_operating_characteristic and http://www.childrensmercy.org/stats/ask/roc.asp

13 ROC Curve. Example: a 1-dimensional data set containing two classes (positive and negative); any point with x > t is classified as positive. At threshold t: TP = 0.5, FN = 0.5, FP = 0.12, TN = 0.88 (rates within each true class).

14 ROC Curve. Key points, written as (TP, FP): (0,0) declares everything to be the negative class; (1,1) declares everything to be the positive class; (1,0) is ideal. The diagonal line corresponds to random guessing; below the diagonal, the prediction is the opposite of the true class.

15 Using ROC for Model Comparison. In the example shown, no model consistently outperforms the other: M1 is better for small FPR, M2 is better for large FPR. The Area Under the ROC Curve (AUC) gives a single summary: ideal classifier, area = 1; random guessing, area = 0.5.

16 How to Construct an ROC Curve. Use a classifier that produces a posterior probability P(+|A) for each test instance A. Sort the instances by P(+|A) in decreasing order, apply a threshold at each unique value of P(+|A), and count TP, FP, TN, FN at each threshold; then TP rate TPR = TP/(TP+FN) and FP rate FPR = FP/(FP+TN). Example instances:

Instance  P(+|A)  True class
1         0.95    +
2         0.93    +
3         0.87    -
4         0.85    -
5                 -
6                 +
7         0.76    -
8         0.53    +
9         0.43    -
10        0.25    +

17 How to Construct an ROC Curve (continued). The slide tabulates, for each threshold on P(+|A), the counts TP, FP, TN, FN and the resulting TPR and FPR, and plots the (FPR, TPR) pairs to obtain the ROC curve; a code sketch is given below.
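A minimal sketch of the construction in slides 16-17, also computing the AUC of slide 15 by the trapezoidal rule (the scores and labels below are illustrative placeholders, not the table values above, and tied scores are not handled):

```python
import numpy as np

def roc_points(scores, labels):
    """Sort by decreasing P(+|A); lowering the threshold one instance at a time,
    count TP and FP and return the (FPR, TPR) points of the ROC curve."""
    order = np.argsort(-np.asarray(scores))
    labels = np.asarray(labels)[order]
    P, N = labels.sum(), len(labels) - labels.sum()
    tpr, fpr = [0.0], [0.0]        # threshold above the largest score: everything negative
    tp = fp = 0
    for y in labels:
        tp += y
        fp += 1 - y
        tpr.append(tp / P)
        fpr.append(fp / N)
    return np.array(fpr), np.array(tpr)

# labels: 1 = positive, 0 = negative (illustrative values only)
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]
fpr, tpr = roc_points(scores, labels)
auc = np.trapz(tpr, fpr)           # area under the curve (slide 15)
print(auc)
```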

18 Assessing and Comparing Classification Algorithms: Introduction, Resampling and Cross-Validation, Measuring Error, Interval Estimation and Hypothesis Testing, Assessing and Comparing Performance

19 Interval Estimation. Given a sample X = {x^t}, t = 1, ..., N, with x^t ~ N(μ, σ²), the sample mean m = (1/N) Σ_t x^t satisfies m ~ N(μ, σ²/N). With σ known, the 100(1 − α) percent confidence interval for μ is m ± z_{α/2} · σ/√N, i.e., P(m − z_{α/2} σ/√N < μ < m + z_{α/2} σ/√N) = 1 − α.

20 When σ² is not known: estimate it by the sample variance s² = Σ_t (x^t − m)² / (N − 1). Then √N(m − μ)/s is t-distributed with N − 1 degrees of freedom, and the 100(1 − α) percent confidence interval is m ± t_{α/2, N−1} · s/√N.
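A sketch computing both intervals with SciPy (the sample and the "known" σ are synthetic placeholders):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=0.2, scale=0.1, size=30)   # placeholder sample
N, alpha = len(x), 0.05
m, s = x.mean(), x.std(ddof=1)

# sigma known (pretend sigma = 0.1): z-based interval
sigma = 0.1
z = stats.norm.ppf(1 - alpha / 2)
ci_z = (m - z * sigma / np.sqrt(N), m + z * sigma / np.sqrt(N))

# sigma unknown: t-based interval with N-1 degrees of freedom
t = stats.t.ppf(1 - alpha / 2, df=N - 1)
ci_t = (m - t * s / np.sqrt(N), m + t * s / np.sqrt(N))

print(ci_z, ci_t)
```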

21 Hypothesis Testing. Reject a null hypothesis if it is not supported by the sample with enough confidence. Given X = {x^t} with x^t ~ N(μ, σ²), test H0: μ = μ0 vs. H1: μ ≠ μ0. Accept H0 with level of significance α if μ0 lies in the 100(1 − α) percent confidence interval, i.e., if √N(m − μ0)/σ ∈ (−z_{α/2}, z_{α/2}). This is a two-sided test.

22 One-sided test: H0: μ ≤ μ0 vs. H1: μ > μ0. Accept H0 if √N(m − μ0)/σ ∈ (−∞, z_α]. Variance unknown: use t instead of z; accept H0: μ = μ0 (two-sided) if √N(m − μ0)/s ∈ (−t_{α/2, N−1}, t_{α/2, N−1}).
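A sketch of the two-sided test with unknown variance on placeholder data (scipy.stats.ttest_1samp performs the same computation):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(loc=0.22, scale=0.1, size=30)   # placeholder sample
mu0, alpha = 0.2, 0.05
N = len(x)

t_stat = np.sqrt(N) * (x.mean() - mu0) / x.std(ddof=1)
critical = stats.t.ppf(1 - alpha / 2, df=N - 1)
accept_h0 = abs(t_stat) < critical             # two-sided acceptance region
print(t_stat, critical, accept_h0)

# cross-check with SciPy's built-in one-sample t test
print(stats.ttest_1samp(x, popmean=mu0))
```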

23 Assessing and Comparing Classification Algorithms: Introduction, Resampling and Cross-Validation, Measuring Error, Interval Estimation and Hypothesis Testing, Assessing and Comparing Performance

24 Assessing Error: H0: p ≤ p0 vs. H1: p > p0. With a single training/validation set, use the binomial test. If the error probability is p0, the probability of observing e errors or fewer in N validation instances is P{X ≤ e} = Σ_{x=0}^{e} C(N, x) p0^x (1 − p0)^{N−x}. Accept H0 if this probability is less than 1 − α. Example: N = 100, e = 20.

25 Normal Approximation to the Binomial. The number of errors X is approximately normal with mean Np0 and variance Np0(1 − p0), so (X − Np0)/√(Np0(1 − p0)) is approximately unit normal. Accept H0 if this statistic evaluated at X = e is less than z_{1−α}.
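A sketch of both tests for the slide's N = 100, e = 20 example (p0 and α are assumed values, not given on the slide):

```python
import numpy as np
from scipy import stats

N, e = 100, 20          # from the slide
p0, alpha = 0.25, 0.05  # assumed: hypothesized error rate and significance level

# exact binomial test: accept H0: p <= p0 if P{X <= e} < 1 - alpha
prob_e_or_fewer = stats.binom.cdf(e, N, p0)
accept_exact = prob_e_or_fewer < 1 - alpha

# normal approximation: accept if the standardized statistic < z_{1-alpha}
z_stat = (e - N * p0) / np.sqrt(N * p0 * (1 - p0))
accept_approx = z_stat < stats.norm.ppf(1 - alpha)

print(prob_e_or_fewer, accept_exact, z_stat, accept_approx)
```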

26 Paired t Test. With multiple (K) training/validation sets, let x_i^t = 1 if instance t is misclassified on fold i, and 0 otherwise. The error rate of fold i is p_i = (1/N) Σ_{t=1}^{N} x_i^t. With m and s² the average and variance of the p_i, we accept the hypothesis of p0 or less error if √K(m − p0)/s is less than t_{α, K−1}.
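A sketch of this test applied to placeholder per-fold error rates p_i (the rates, p0, and α are illustrative assumptions):

```python
import numpy as np
from scipy import stats

p = np.array([0.18, 0.22, 0.17, 0.20, 0.19,
              0.23, 0.21, 0.16, 0.20, 0.18])   # placeholder fold error rates, K = 10
p0, alpha = 0.25, 0.05                         # assumed claimed error rate and level
K = len(p)

m, s = p.mean(), p.std(ddof=1)
t_stat = np.sqrt(K) * (m - p0) / s
accept = t_stat < stats.t.ppf(1 - alpha, df=K - 1)   # one-sided: error <= p0
print(t_stat, accept)
```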

27 K-Fold CV Paired t Test. Use K-fold cross-validation to get K training/validation folds. Let p_i^1 and p_i^2 be the errors of classifiers 1 and 2 on fold i, and p_i = p_i^1 − p_i^2 the paired difference on fold i. The null hypothesis is that p_i has mean 0; with m and s² the average and variance of the p_i, we accept H0 at significance level α if √K · m / s ∈ (−t_{α/2, K−1}, t_{α/2, K−1}).
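A sketch of the paired test on placeholder per-fold errors for the two classifiers (scipy.stats.ttest_rel gives the same two-sided test):

```python
import numpy as np
from scipy import stats

p1 = np.array([0.20, 0.22, 0.19, 0.21, 0.23, 0.18, 0.20, 0.24, 0.21, 0.19])  # classifier 1
p2 = np.array([0.18, 0.21, 0.17, 0.22, 0.20, 0.16, 0.19, 0.22, 0.20, 0.18])  # classifier 2
alpha = 0.05
d = p1 - p2                      # paired differences p_i
K = len(d)

t_stat = np.sqrt(K) * d.mean() / d.std(ddof=1)
critical = stats.t.ppf(1 - alpha / 2, df=K - 1)
accept_equal = abs(t_stat) < critical    # H0: the two classifiers have the same error
print(t_stat, critical, accept_equal)

print(stats.ttest_rel(p1, p2))   # SciPy's paired t test for comparison
```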

