Machine Learning in Practice, Lecture 3. Carolyn Penstein Rosé, Language Technologies Institute / Human-Computer Interaction Institute.

Plan for Today
Announcements: Assignment 2, Quiz 1
Weka helpful hints
Topic of the day: Input and Output
More on cross-validation
ARFF format

Weka Helpful Hints

Increase Heap Size

Weka Helpful Hint: Documentation! Click on the More button.

Output Predictions Option

Important note: because of the way Weka randomizes the data for cross-validation, the only circumstance under which you can match instance numbers to positions in your data is when you have separate train and test sets, so that the order is preserved.
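The point above can be illustrated with a minimal sketch in plain Python (the ten numbered instances and the fixed random seed are invented for illustration):

```python
import random

# Ten hypothetical instances, identified by their original row numbers.
instances = list(range(10))

# Cross-validation first randomizes the data, much as Weka does
# internally, so the i-th prediction no longer refers to row i.
shuffled = instances[:]
random.Random(42).shuffle(shuffled)
print(shuffled)  # a permutation of the original row numbers

# With a separate, untouched test set, order is preserved:
test_set = instances[7:]
print(test_set)  # [7, 8, 9] -- instance numbers line up with the file
```

So predictions written out during cross-validation cannot be matched back to rows of the original file, while predictions on a supplied test set can.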

View Classifier Errors

Input and Output

Representations
Concept: the rule you want to learn.
Instance: one data point from your training or testing data (a row in the table).
Attribute: one of the features an instance is composed of (a column in the table).

Numeric versus Nominal Attributes
What kind of reasoning does your representation enable?
Numeric attributes allow instances to be ordered.
Numeric attributes allow you to measure distance between instances.
Sometimes numeric attributes make too fine-grained a distinction.

Numeric versus Nominal Attributes
Numeric attributes can be discretized into nominal values:
You then lose ordering and distance.
Another option is applying a function that maps a range of values into a single numeric attribute.
Nominal attributes can be mapped onto numbers:
e.g., decide that blue=1 and green=2.
But are inferences made based on this valid?
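A minimal sketch of both directions in plain Python (the age bins and color codes are invented for illustration):

```python
# Discretize a numeric attribute into nominal bins: the ordering and
# the distances between the original values are lost.
def discretize_age(age):
    if age < 18:
        return "minor"
    elif age < 65:
        return "adult"
    return "senior"

print(discretize_age(42))  # adult

# Map a nominal attribute onto numbers: blue=1, green=2.
color_code = {"blue": 1, "green": 2}

# But inferences drawn from these numbers may not be valid:
# green - blue == 1 is an artifact of the encoding, not a fact
# about colors.
print(color_code["green"] - color_code["blue"])  # 1
```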


Example!
Problem: learn a rule that predicts how much time a person spends doing math problems each day.
Attributes: you know gender, age, socio-economic status of parents, and chosen field (if any).
How would you represent age, and why?
What would you expect the target rule to look like?
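The attributes in this example map directly onto Weka's ARFF format, one of the topics of the day. A minimal, hypothetical ARFF file for the problem above (the relation name, value sets, and data rows are invented for illustration):

```
% Hypothetical ARFF file for the math-practice example.
% The relation is the concept; each @attribute line is a column;
% each line after @data is one instance.
@relation math_practice_time

@attribute gender {male, female}
@attribute age numeric
@attribute parent_ses {low, middle, high}
@attribute field {science, arts, none}
@attribute minutes_per_day numeric

@data
female,21,middle,science,45
male,34,low,none,10
```

Whether age is kept numeric (preserving ordering and distance) or discretized into ranges is exactly the representation choice the question asks about.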

Styles of Learning
Classification: learn rules from labeled instances that allow you to assign new instances to a class.
Association: look for relationships between features, not just rules that predict a class from an instance (more general).
Clustering: look for instances that are similar (involves comparisons of multiple features).
Numeric prediction: regression models.

Food Web

Food Web What else would be affected if wheat were to disappear?

Food Web How would you represent this data?

Food Web What would the learned rule look like?



Food Web What if you wanted a more general rule: i.e., Affects(Entity1, Entity2)


Food Web: What if you wanted a more general rule, i.e., Affects(Entity1, Entity2)? Many rows altogether! Now let's look at the learned rule…


Food Web: What if you wanted a more general rule, i.e., Affects(Entity1, Entity2)? Many rows altogether! Now let's look at the learned rule… Does it have to be this complicated?

Food Web What would your representation for Affects(Entity1, Entity2) look like?


More on Cross-Validation

Cross Validation Exercise
What is the same? What is different? What surprises you?
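A sketch of how the folds in this exercise are formed, in plain Python, under two simplifying assumptions: the data are not shuffled first, and the number of instances divides evenly by k (real cross-validation randomizes, and usually stratifies, before splitting):

```python
def k_fold_splits(n_instances, k):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_instances))
    fold_size = n_instances // k
    for fold in range(k):
        # Each fold holds out one contiguous slice for testing...
        test = indices[fold * fold_size:(fold + 1) * fold_size]
        # ...and trains on everything else.
        train = [i for i in indices if i not in test]
        yield train, test

# Ten hypothetical instances, five folds: every instance is held out
# exactly once, and each model trains on the remaining 8 instances.
for train, test in k_fold_splits(10, 5):
    print(test)
```

This is why the five trees in the exercise differ from each other and from the tree trained on the whole set: each fold's model sees a slightly different 80% of the data.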

Compare Folds with Tree Trained on Whole Set

Train Versus Test: Performance on Training Data vs. Performance on Testing Data

Which Model Do You Think Will Perform Best on Test Set?

Fold 1

Fold 2

Fold 3

Fold 4

Fold 5

Total Performance: What do you notice?

Total Performance: Average Kappa = 0.5
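Kappa corrects raw accuracy for chance agreement. A sketch of the computation from a confusion matrix, in plain Python (the counts are invented for illustration, chosen so that kappa comes out to 0.5):

```python
def kappa(confusion):
    """Cohen's kappa from a square confusion matrix
    (rows = actual class, columns = predicted class)."""
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of instances on the diagonal.
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    # Expected agreement: product of the marginals for each class.
    expected = sum(
        (sum(confusion[i]) / total)                  # actual marginal
        * (sum(row[i] for row in confusion) / total)  # predicted marginal
        for i in range(len(confusion))
    )
    return (observed - expected) / (1 - expected)

# Hypothetical counts: 75 of 100 instances classified correctly.
matrix = [[40, 10],
          [15, 35]]
print(round(kappa(matrix), 3))  # 0.5
```

Here accuracy is 0.75, but since chance agreement is 0.5, kappa is only 0.5.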

Starting to Think About Error Analyses
Step 1: look at the confusion matrix.
Where are most of the errors occurring?
What are possible explanations for the systematic errors you see?
Are the instances in the confusable classes too similar to each other? If so, how can we distinguish them?
Are we paying attention to the wrong features?
Are we missing features that would allow us to see commonalities within classes?

What went wrong on Fold 3?

Training Set Performance / Testing Set Performance. Hypotheses?


What’s the difference?

Hypothesis: Problem with first cut

Some Examples

What do you conclude?

The problem with Fold 3 was probably just a sampling fluke: the distribution of classes differed between train and test.
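This hypothesis can be checked directly by comparing class proportions between a fold's training and testing splits. A sketch in plain Python (the labels are invented for illustration):

```python
from collections import Counter

def class_distribution(labels):
    """Proportion of each class in a list of labels."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}

# Hypothetical fold: the test split's class mix differs sharply
# from the training split's.
train_labels = ["a"] * 40 + ["b"] * 40
test_labels = ["a"] * 18 + ["b"] * 2

print(class_distribution(train_labels))  # {'a': 0.5, 'b': 0.5}
print(class_distribution(test_labels))   # {'a': 0.9, 'b': 0.1}
```

A mismatch like this suggests the fold was an unlucky sample. Stratified cross-validation, which Weka performs by default for nominal classes, keeps these proportions similar across folds and makes this kind of fluke less likely.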