Slide 1: Evaluation

Slide 2: Be a classifier!
- Interactive decision tree construction
  - Load segment-challenge.arff; look at the dataset
  - Select UserClassifier (tree classifier)
  - Use the test set segment-test.arff
  - Examine the data visualizer and the tree visualizer
  - Plot region-centroid-row vs. intensity-mean
  - Rectangle, Polygon and Polyline selection tools; … several selections …
  - Right-click in the Tree visualizer and Accept the tree
- Over to you: how well can you do?

Slide 3: Be a classifier!
- Build a tree: what strategy did you use?
- Given enough time, you could produce a "perfect" tree for the dataset
  - but would it perform well on the test set?

Slide 4: Training and Testing
[Diagram: training data is fed to an ML algorithm to build a classifier; the classifier is evaluated on test data to produce evaluation results, and is then deployed.]

Slide 5: Training and Testing
[Same diagram as Slide 4]
- Basic assumption: training and test sets are produced by independent sampling from an infinite population

Slide 6: Training and Testing
- Use J48 to analyze the segment dataset
  - Open file segment-challenge.arff
  - Choose the J48 decision tree learner (trees > J48)
  - Supplied test set: segment-test.arff
  - Run it: 96% accuracy
  - Evaluate on the training set: 99% accuracy
  - Evaluate on a percentage split: 95% accuracy
  - Do it again: you get exactly the same result!
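The same experiment can be scripted with the Weka Java API. A minimal sketch, assuming segment-challenge.arff and segment-test.arff sit in the working directory (class and method names come from the standard weka.core and weka.classifiers packages):

```java
import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class SuppliedTestSetDemo {
    public static void main(String[] args) throws Exception {
        // Load training and supplied test sets; the class is the last attribute
        Instances train = DataSource.read("segment-challenge.arff");
        Instances test  = DataSource.read("segment-test.arff");
        train.setClassIndex(train.numAttributes() - 1);
        test.setClassIndex(test.numAttributes() - 1);

        // Build a J48 decision tree on the training data
        J48 tree = new J48();
        tree.buildClassifier(train);

        // Evaluate on the supplied test set
        Evaluation eval = new Evaluation(train);
        eval.evaluateModel(tree, test);
        System.out.printf("Test-set accuracy: %.1f%%%n", eval.pctCorrect());
    }
}
```

The reported accuracy should be close to the 96% seen in the Explorer, since this is the same train-on-one-file, test-on-the-other evaluation.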

Slide 7: Training and Testing
- Basic assumption: training and test sets are sampled independently from an infinite population
- Just one dataset? Hold some out for testing
- Expect slight variation in results… but Weka produces the same results each time. Why?
  - E.g. J48 on the segment-challenge dataset

Slide 8: Repeated Training and Testing
- Evaluate J48 on segment-challenge
  - With segment-challenge and J48 (trees > J48)
  - Set the percentage split to 90%
  - Run it: 96.7% accuracy
  - [More options] Repeat with a different random-number seed
  - Use seeds 2, 3, 4, 5, 6, 7, 8, 9, 10
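The Explorer's percentage-split option can be approximated in code to see the seed-to-seed variation. A sketch, assuming the file is local (this is my own shuffle-and-split approximation; the Explorer's internal split is not guaranteed to be bit-identical):

```java
import java.util.Random;
import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class RepeatedSplitDemo {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("segment-challenge.arff");
        data.setClassIndex(data.numAttributes() - 1);

        for (int seed = 1; seed <= 10; seed++) {
            // Shuffle with the given seed, then take a 90%/10% train/test split
            Instances shuffled = new Instances(data);
            shuffled.randomize(new Random(seed));
            int trainSize = (int) Math.round(shuffled.numInstances() * 0.90);
            Instances train = new Instances(shuffled, 0, trainSize);
            Instances test  = new Instances(shuffled, trainSize,
                                            shuffled.numInstances() - trainSize);

            J48 tree = new J48();
            tree.buildClassifier(train);
            Evaluation eval = new Evaluation(train);
            eval.evaluateModel(tree, test);
            System.out.printf("seed %d: %.1f%%%n", seed, eval.pctCorrect());
        }
    }
}
```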

Slide 9: Repeated Training and Testing
- Evaluate J48 on segment-challenge (the ten runs above)
- Sample mean: $\bar{x} = \frac{\sum_i x_i}{n}$
- Variance: $\sigma^2 = \frac{\sum_i (x_i - \bar{x})^2}{n - 1}$
- Standard deviation: $\sigma$
- Here: $\bar{x} = 0.949$, $\sigma = \ldots$
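A small helper implementing the slide's formulas, which could be applied to the ten accuracies printed by the loop after Slide 8 (plain Java, no Weka dependency; the names are illustrative):

```java
public class RunStats {
    /** Returns {sample mean, standard deviation}, using the n - 1 (sample) variance. */
    static double[] meanAndStdDev(double[] xs) {
        int n = xs.length;
        double sum = 0.0;
        for (double x : xs) sum += x;            // sum of the observations
        double mean = sum / n;                   // sample mean
        double ss = 0.0;
        for (double x : xs) ss += (x - mean) * (x - mean);
        double variance = ss / (n - 1);          // sample variance, n - 1 denominator
        return new double[] { mean, Math.sqrt(variance) };
    }
}
```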

Slide 10: Repeated Training and Testing
- Basic assumption: training and test sets are sampled independently from an infinite population
- Expect slight variation in results… get it by setting the random-number seed
- Can calculate the mean and standard deviation experimentally

Slide 11: Baseline Accuracy
- Use the diabetes dataset and the default holdout
- Open file diabetes.arff
- Test option: Percentage split
- Try these classifiers:
  - trees > J48: 76%
  - bayes > NaiveBayes: 77%
  - lazy > IBk: 73%
  - rules > PART: 74%
- 768 instances (500 negative, 268 positive)
- Always guess "negative": 500/768 = 65%
  - rules > ZeroR: predicts the most likely class!
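The majority-class baseline can also be read straight off the data. A sketch that reproduces the 500/768 arithmetic using Weka's AttributeStats (assuming diabetes.arff is available locally):

```java
import weka.core.AttributeStats;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class MajorityBaseline {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("diabetes.arff");
        data.setClassIndex(data.numAttributes() - 1);

        // Count instances of each class value and take the largest
        AttributeStats stats = data.attributeStats(data.classIndex());
        int max = 0;
        for (int count : stats.nominalCounts) max = Math.max(max, count);

        // Same arithmetic as on the slide: 500/768 = 65% for diabetes
        System.out.printf("Majority-class baseline: %d/%d = %.1f%%%n",
                max, data.numInstances(), 100.0 * max / data.numInstances());
    }
}
```

This is exactly what rules > ZeroR achieves, so it is the number any real classifier has to beat.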

Slide 12: Baseline Accuracy
- Sometimes the baseline is best!
  - Open supermarket.arff and blindly apply:
    - rules > ZeroR: 64%
    - trees > J48: 63%
    - bayes > NaiveBayes: 63%
    - lazy > IBk: 38%
    - rules > PART: 63%
  - The attributes are not informative
  - Caution: don't just apply Weka to a dataset; you need to understand what's going on

Slide 13: Baseline Accuracy
- Consider whether differences are significant
- Always try a simple baseline, e.g. rules > ZeroR
- Caution: don't just apply Weka to a dataset; you need to understand what's going on

Slide 14: Cross-Validation
- Can we improve on repeated holdout (i.e. reduce the variance)?
- Cross-validation
- Stratified cross-validation

Slide 15: Cross-Validation
- Repeated holdout: hold out 10% for testing, repeat 10 times

Slide 16: Cross-Validation
- 10-fold cross-validation
  - Divide the dataset into 10 parts (folds)
  - Hold out each part in turn
  - Average the results
  - Each data point is used once for testing and 9 times for training
- Stratified cross-validation
  - Ensure that each fold has the right proportion of each class value
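In the Weka API this is what Evaluation.crossValidateModel performs; as far as I recall it randomizes the data and then stratifies the folds when the class is nominal. A minimal sketch, assuming diabetes.arff is available locally:

```java
import java.util.Random;
import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class CrossValidationDemo {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("diabetes.arff");
        data.setClassIndex(data.numAttributes() - 1);

        // 10-fold cross-validation of J48, seeded so the run is reproducible
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(new J48(), data, 10, new Random(1));
        System.out.println(eval.toSummaryString("\n10-fold CV results:\n", false));
    }
}
```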

Slide 17: Cross-Validation
- Cross-validation is better than repeated holdout
- Stratified cross-validation is even better
- Practical rule of thumb:
  - Lots of data? Use a percentage split
  - Otherwise, use stratified 10-fold cross-validation

Slide 18: Cross-Validation Results
- Is cross-validation really better than repeated holdout?
- Diabetes dataset
  - Baseline accuracy (rules > ZeroR): 65.1%
  - trees > J48, 10-fold cross-validation: 73.8%
  - … repeat with different random-number seeds

Slide 19: Cross-Validation Results
- Compare ten runs of holdout (10%) with ten runs of 10-fold cross-validation, each with a different seed
- Sample mean $\bar{x} = \frac{\sum_i x_i}{n}$, variance $\sigma^2 = \frac{\sum_i (x_i - \bar{x})^2}{n - 1}$, standard deviation $\sigma$
- Holdout (10%): $\bar{x} = 74.5$, $\sigma = \ldots$; cross-validation (10-fold): $\bar{x} = 74.8$, $\sigma = \ldots$

Slide 20: Cross-Validation Results
- Why 10-fold? E.g. 20-fold: 75.1%
- Cross-validation really is better than repeated holdout
  - It reduces the variance of the estimate

Slide 21: Evaluation Methods Exercises

Slide 22: Plan
- Evaluate the performance of machine learning algorithms for classifying Tic-Tac-Toe games.

Slide 23: Classification on Tic-Tac-Toe
- Download the Tic-Tac-Toe dataset tic-tac-toe.zip from the course page.
- Work as a team to evaluate the performance of machine learning algorithms for classifying Tic-Tac-Toe games.

Slide 24: Evaluation Methods
- Using the training set (use 100% of the instances to train and 100% of the instances to test performance)
- 10-fold cross-validation
- 70% split (use 70% of the instances to train and the remaining 30% to test performance)
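For the exercise, the three test options can also be reproduced programmatically. A hedged sketch using J48 on tic-tac-toe.arff (the file name and the shuffle-based 70/30 split are assumptions; the Explorer's own split may differ in detail):

```java
import java.util.Random;
import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class EvaluationMethodsDemo {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("tic-tac-toe.arff");
        data.setClassIndex(data.numAttributes() - 1);

        // (1) Use training set: train and test on all instances
        J48 tree = new J48();
        tree.buildClassifier(data);
        Evaluation trainEval = new Evaluation(data);
        trainEval.evaluateModel(tree, data);
        System.out.printf("Training set: %.1f%%%n", trainEval.pctCorrect());

        // (2) 10-fold cross-validation
        Evaluation cvEval = new Evaluation(data);
        cvEval.crossValidateModel(new J48(), data, 10, new Random(1));
        System.out.printf("10-fold CV:   %.1f%%%n", cvEval.pctCorrect());

        // (3) 70/30 percentage split
        Instances shuffled = new Instances(data);
        shuffled.randomize(new Random(1));
        int trainSize = (int) Math.round(shuffled.numInstances() * 0.70);
        Instances train = new Instances(shuffled, 0, trainSize);
        Instances test  = new Instances(shuffled, trainSize,
                                        shuffled.numInstances() - trainSize);
        J48 splitTree = new J48();
        splitTree.buildClassifier(train);
        Evaluation splitEval = new Evaluation(train);
        splitEval.evaluateModel(splitTree, test);
        System.out.printf("70/30 split:  %.1f%%%n", splitEval.pctCorrect());
    }
}
```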

Slide 25: Classifiers Being Used
- Decision tree: trees → J48
- Neural network: functions → MultilayerPerceptron (trainingTime = 50)
- Bayes network: bayes → NaiveBayes
- Nearest neighbor: lazy → IBk (k = 3)
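A sketch of how these four classifiers and their options might be configured through the Weka API (setter names as I recall them from the weka.classifiers packages; treat this as illustrative rather than definitive):

```java
import weka.classifiers.Classifier;
import weka.classifiers.bayes.NaiveBayes;
import weka.classifiers.functions.MultilayerPerceptron;
import weka.classifiers.lazy.IBk;
import weka.classifiers.trees.J48;

public class ClassifierSetup {
    public static Classifier[] classifiers() {
        J48 tree = new J48();                    // decision tree, default options

        MultilayerPerceptron nn = new MultilayerPerceptron();
        nn.setTrainingTime(50);                  // number of training epochs = 50

        NaiveBayes nb = new NaiveBayes();        // naive Bayes

        IBk knn = new IBk();
        knn.setKNN(3);                           // nearest neighbour with k = 3

        return new Classifier[] { tree, nn, nb, knn };
    }
}
```

The returned array can be fed to the evaluation sketch shown after Slide 24, so every classifier is scored under the same three test options.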

Slide 26: Using Weka
- Extract Tic-Tac-Toe.zip into the Weka folder
- Start the Weka program
- Open Tic-Tac-Toe.arff
- Choose Explorer

Slide 27: Using Weka (cont.)
- Click the Classify tab
- Choose the J48 classifier under trees
- Set the test options to "Use training set"
- Enable "Output predictions" under More options
- Click Start to run

Slide 28: Using Weka (cont.)
- Accuracy rate

Slide 29: Reporting
- Download Tic-tac-toe-report.docx
- Complete the table evaluating the performance of the different learning methods in Q1.
- Find the best performer in Q2, Q3, and Q4.