Today's Topics
- Chapter 2 in One Slide
- Chapter 18: Machine Learning (ML)
- Creating an ML Dataset
  – "Fixed-length feature vectors"
  – Relational/graph-based examples
- HW0 (due in one week)
- Getting 'Labeled' Training Examples
- Train/Tune/Test Sets
- N-fold Cross Validation

9/8/15 – cs540, Fall 2015 (Shavlik ©), Lecture 2, Week 1

The Big AI Picture – Chapter 2

The study of 'agents' that exist in an environment and perceive, act, and learn.

[Diagram: an AI "Agent" interacting with its Environment in a loop – 1: Sense, 2: Reason, 3: Act, 4: Get Feedback, 5: Learn]

What Do You Think Machine Learning Means?

Given:
Do:

Throughout the semester, think of what is missing in current ML, compared to human learning.

What is Learning?

"Learning denotes changes in the system that … enable the system to do the same task … more effectively the next time." – Herbert Simon

"Learning is making useful changes in our minds." – Marvin Minsky

But remember: cheese and wine get better over time, but they don't learn!

Supervised Machine Learning: Task Overview

[Diagram: Real World → Feature "Design" (usually done by humans) → Feature Space → Classifier Construction (done by the learning algorithm) → Concepts/Classes/Decisions]

Standard Approach for Constructing an ML Dataset for a Task

Step 1: Choose a feature space (this defines the space of possible examples)
- We will use fixed-length feature vectors
  – Choose N features (eg, color, weight, shape)
  – Each feature has Vi possible values
  – Each example is represented by a vector of N feature values (ie, is a point in the N-dimensional feature space)

Step 2: Collect examples ("I/O" pairs)
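The two steps above can be sketched in code. This is illustrative only: the feature names and values below are made up, not from any real dataset.

```python
# Step 1: choose a feature space -- here N = 3 fixed-length features.
FEATURES = ("color", "weight", "shape")

# Step 2: collect examples ("I/O" pairs): each input is a vector of
# N feature values, paired with an output label.
examples = [
    # feature vector                output label
    (("red",   0.9, "circle"),     True),
    (("blue",  8.2, "square"),     False),
    (("green", 5.7, "triangle"),   True),
]

# Every example has exactly N values, in the same feature order, so each
# example is one point in the N-dimensional feature space.
assert all(len(vector) == len(FEATURES) for vector, _ in examples)
```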

Another View of Standard ML Datasets – a Single Table (2D Array)

            Feature 1   Feature 2   ...   Feature N   Output Category
Example 1   0.0         small             red         true
Example 2   9.3         medium            red         false
Example 3   8.2         small             blue        false
...
Example M   5.7         medium            green       true

Standard Feature Types for Representing Training Examples – a Source of "Domain Knowledge"

Nominal (including Boolean)
– No ordering among possible values
– eg, color ∈ {red, blue, green} (vs. color = 1000 Hertz)

Linear (or Ordered)
– Possible values of the feature are totally ordered
– eg, size ∈ {small, medium, large} ← discrete
      weight ∈ [0…500] ← continuous

Hierarchical (not commonly used)
– Possible values are partially ordered in an ISA hierarchy
– eg, shape: closed at the root, splitting into polygon (triangle, square) and continuous (circle, ellipse)

Keep your eye out for places where domain knowledge is (or should be) used in ML.
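A minimal sketch of why the nominal/linear distinction matters when encoding features for a learner (the value sets are assumed for illustration): nominal values get one-hot indicators so no spurious ordering is implied, while linear values keep their rank.

```python
COLORS = ["red", "blue", "green"]       # nominal: no ordering among values
SIZES  = ["small", "medium", "large"]   # linear, discrete: totally ordered

def encode_nominal(value):
    """One-hot encode: a separate 0/1 indicator per possible value,
    so the learner sees no artificial order among colors."""
    return [1 if value == v else 0 for v in COLORS]

def encode_linear(value):
    """Map an ordered value to its rank, preserving the ordering."""
    return SIZES.index(value)

print(encode_nominal("blue"))                            # [0, 1, 0]
print(encode_linear("small") < encode_linear("large"))   # True
```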

A Richer Testbed: The Internet Movie Database (IMDB)

IMDB richly represents data – note that each movie is potentially represented by a graph of a different size.

[Figure from David Jensen of UMass]

Learning with Data in Multiple Tables (Relational ML) – not covered in cs540

[Diagram: a Patients table linked by key to tables of Previous Mammograms, Previous Blood Tests, and Prev. Rx]

Key challenge: a different amount of data for each patient.

HW0 – Reading in a Dataset

Due in one week (most HWs will have two weeks between when assigned and when due).

The Thoracic Surgery Dataset (original version)

Getting Labeled Examples – the 'Achilles Heel' of ML

- Often 'experts' label
  – eg, 'books I like' or 'patients that should get drug X'
- 'Time will tell' concepts
  – wait a month and see if the medical treatment worked, or if the stock appreciated over a year
- Use of Amazon Mechanical Turk
  – 'the crowd'
- Need representative examples, especially good 'negative' (counter) examples

If it is Free, You are the Product

Google is using authentication (as a human) as a way to get labeled data for their ML algorithms!

IID and Other Assumptions

- We are assuming examples are IID: independent and identically distributed
- We are ignoring temporal dependencies (covered in time-series learning)
- We assume the ML algorithm has no say in which examples it gets (covered in active learning); data arrives in any order

Train/Tune/Test Sets: A Pictorial Overview

[Diagram: a collection of classified examples (each column is an example) is split into training examples and testing examples; the training examples are further split into a train' set and a tune set. The ML algorithm generates candidate solutions from the train' set and selects the best one on the tune set, yielding a classifier; the testing examples then estimate that classifier's expected accuracy on future examples.]
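The splitting step of the workflow above can be sketched as follows; the function name and the 60/20/20 proportions are assumptions for illustration, not prescribed by the lecture.

```python
import random

def split_examples(examples, train_frac=0.6, tune_frac=0.2, seed=0):
    """Shuffle once, then carve out the train', tune, and test subsets.
    The test set is held out and used only for the final accuracy estimate."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(train_frac * n)
    n_tune = int(tune_frac * n)
    return (shuffled[:n_train],                      # train' set
            shuffled[n_train:n_train + n_tune],      # tune set
            shuffled[n_train + n_tune:])             # test set

train, tune, test = split_examples(range(100))
print(len(train), len(tune), len(test))   # 60 20 20
```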

N-fold Cross Validation

Can be used to
1) estimate future accuracy (via test sets)
2) choose parameter settings (via tuning sets)

Method
1) Randomly permute the examples
2) Divide them into N bins
3) Train on N – 1 bins; measure accuracy on the bin 'left out'
4) Compute the average accuracy on the held-out sets

[Diagram: the examples partitioned into Fold 1 … Fold 5]
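The four method steps above can be sketched directly; `accuracy` here is an assumed stand-in for any function that trains on one set of examples and reports accuracy on another.

```python
import random

def n_fold_cross_validation(examples, n, accuracy, seed=0):
    """1) randomly permute, 2) divide into n bins, 3) hold each bin out
    once while training on the rest, 4) average the held-out accuracies."""
    rng = random.Random(seed)
    permuted = list(examples)
    rng.shuffle(permuted)
    folds = [permuted[i::n] for i in range(n)]   # n roughly equal bins
    scores = []
    for i in range(n):
        held_out = folds[i]
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        scores.append(accuracy(train, held_out))
    return sum(scores) / n
```

For instance, with a trivial `accuracy` that always returns 1.0, `n_fold_cross_validation(range(10), 5, lambda tr, te: 1.0)` averages to 1.0.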

Dealing with Data that Comes from Larger Objects

Assume the examples are sentences contained in books, or web pages from computer science depts, or short DNA sequences from genes.

(Usually) need to cross validate on the LARGER objects. Eg, first partition the books into N folds, then collect the sentences from each fold's books.

[Diagram: Sentences in Books, split into Fold 1 and Fold 2]
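A minimal sketch of partitioning on the larger objects (the round-robin assignment of books to folds is an assumption for illustration): books are assigned to folds first, and every sentence then follows its book, so no book's sentences straddle two folds.

```python
def group_folds(sentences, n):
    """sentences: list of (book_id, sentence) pairs.
    Partition the BOOKS into n folds, then place each sentence
    in its book's fold."""
    books = sorted({book for book, _ in sentences})
    book_fold = {book: i % n for i, book in enumerate(books)}  # round-robin
    folds = [[] for _ in range(n)]
    for book, sentence in sentences:
        folds[book_fold[book]].append((book, sentence))
    return folds

data = [("b1", "s1"), ("b1", "s2"), ("b2", "s3"), ("b3", "s4")]
folds = group_folds(data, 2)
# Both sentences of book b1 land in the same fold.
assert all(any(b == "b1" for b, _ in f) <= 1 for f in folds)
```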