Instance Based Learning
Bob Durrant, School of Computer Science, University of Birmingham
(Slides: Dr Ata Kabán)

Outline
Today we learn:
– K-Nearest Neighbours
– Case-based reasoning
– Lazy and eager learning

Instance-based learning
One way of solving tasks of approximating discrete or real-valued target functions.
Have training examples: (x_n, f(x_n)), n = 1..N.
Key idea:
– just store the training examples
– when a test example is given, find the closest matches

"Nearest Neighbours"
1-Nearest neighbour:
– given a query instance x_q
– locate the nearest training example x_n
– then f(x_q) := f(x_n)
K-Nearest neighbour:
– given a query instance x_q
– locate the k nearest training examples x_1, …, x_k
– if the target function is discrete-valued, take a vote among the k nearest neighbours
– if the target function is real-valued, take the mean of their f values:
  f(x_q) := (1/k) · Σ_{i=1..k} f(x_i)
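A minimal Python sketch of both cases (the function name, array shapes and toy data are my own, not from the slides):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_q, k=3, classification=True):
    """Predict f(x_q) from the k nearest training examples."""
    # Euclidean distance from the query to every stored example
    dists = np.linalg.norm(X_train - x_q, axis=1)
    # Indices of the k closest training examples
    nearest = np.argsort(dists)[:k]
    if classification:
        # Discrete-valued target: majority vote among the neighbours
        return Counter(y_train[nearest]).most_common(1)[0][0]
    # Real-valued target: mean of the neighbours' f values
    return y_train[nearest].mean()

# Toy usage (values are made up for illustration)
X = np.array([[1.0, 2.0], [2.0, 1.0], [8.0, 9.0]])
y = np.array([0, 0, 1])
print(knn_predict(X, y, np.array([1.5, 1.5]), k=1))  # -> 0
```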

The distance between examples
We need a measure of distance in order to know which examples are the neighbours.
Assume we have T attributes for the learning problem. Then one example point x has elements x_t ∈ R, t = 1,…,T.
The distance between two points x_i and x_j is usually defined to be the Euclidean distance:
  d(x_i, x_j) = sqrt( Σ_{t=1..T} (x_{i,t} − x_{j,t})² )
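For example, with T = 2 attributes and two made-up points x_i = (1, 2) and x_j = (4, 6):

  d(x_i, x_j) = sqrt((1 − 4)² + (2 − 6)²) = sqrt(9 + 16) = 5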

Voronoi Diagram
[Figure: Voronoi diagram — the decision regions that the 1-NN rule induces around each training example]

Characteristics of Instance-Based Learning
An instance-based learner is a so-called lazy learner, which does all the work when the test example is presented. This is as opposed to eager learners, which build a parameterised compact model of the target.
It produces a local approximation to the target function (a different one for each test instance).

When to consider Nearest Neighbour algorithms?
– Instances map to points in R^n
– No more than (say) 20 attributes per instance
– Lots of training data
Advantages:
– Training is very fast
– Can learn complex target functions
– Don't lose information
Disadvantages:
– ? (we will see them shortly…)

[Figure: eight example points labelled "one" to "eight", with an unlabelled query point marked "?"]

Training data
[Table: training examples with their attribute values and a Yes/No class label, plus the test instance to be classified]

Keep data in normalised form
One way to normalise the data a_r(x) to a'_r(x) is to standardise each attribute:
  a'_r(x) = (a_r(x) − ā_r) / σ_r
where ā_r and σ_r are the mean and standard deviation of attribute r over the training data.
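A minimal NumPy sketch of this per-attribute standardisation (the array contents are made up; note that a test instance must be scaled with the training set's statistics, not its own):

```python
import numpy as np

def normalise(X):
    """Standardise each attribute (column) to zero mean, unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

# Made-up data: two attributes on very different scales (e.g. age, income)
X = np.array([[20.0, 45000.0],
              [35.0, 60000.0],
              [50.0, 52000.0]])
print(normalise(X))  # each column now has mean 0 and std 1
```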

Normalised training data
[Table: the same training examples and test instance after normalisation]

Distances of test instance from training data
[Table: the distance from the test instance to each training example]
Classification:
– 1-NN: Yes
– 3-NN: Yes
– 5-NN: No
– 7-NN: No
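To see how the vote can flip with k, here is a hypothetical sequence of neighbour labels, ordered by increasing distance, chosen only to be consistent with the table above:

```python
from collections import Counter

labels_by_distance = ['Yes', 'No', 'Yes', 'No', 'No', 'No', 'No']
for k in (1, 3, 5, 7):
    vote = Counter(labels_by_distance[:k]).most_common(1)[0][0]
    print(f"{k}-NN: {vote}")  # 1-NN: Yes, 3-NN: Yes, 5-NN: No, 7-NN: No
```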

What if the target function is real valued?
The k-nearest neighbour algorithm would just calculate the mean of the f values of the k nearest neighbours.
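Continuing the knn_predict sketch from earlier with classification=False (the target values are made up):

```python
y_reg = np.array([100.0, 120.0, 300.0])
# Mean of the f values of the 2 nearest neighbours: (100 + 120) / 2
print(knn_predict(X, y_reg, np.array([1.5, 1.5]), k=2, classification=False))  # -> 110.0
```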

Variant of kNN: Distance-Weighted kNN
We might want to weight nearer neighbours more heavily:
  f(x_q) := ( Σ_{i=1..k} w_i · f(x_i) ) / ( Σ_{i=1..k} w_i ), where w_i = 1 / d(x_q, x_i)²
Then it makes sense to use all training examples instead of just k (Shepard's method).
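A sketch of the distance-weighted variant for a real-valued target, using the inverse-square weights above (names are mine; a query that coincides with a stored example returns that example's value to avoid division by zero):

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x_q, k=None):
    """Distance-weighted kNN for a real-valued target.
    k=None uses all training examples (Shepard's method)."""
    dists = np.linalg.norm(X_train - x_q, axis=1)
    if np.any(dists == 0):
        # The query coincides with a stored example: return its value
        return y_train[dists == 0][0]
    order = np.argsort(dists)[:k] if k is not None else np.arange(len(dists))
    w = 1.0 / dists[order] ** 2          # w_i = 1 / d(x_q, x_i)^2
    return np.dot(w, y_train[order]) / w.sum()
```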

Difficulties with k-nearest neighbour algorithms
– Have to calculate the distance of the test case from all training cases
– There may be irrelevant attributes amongst the attributes – the curse of dimensionality

Case-based reasoning (CBR)
CBR is an advanced form of instance-based learning, applied to more complex instance objects.
Objects may include complex structural descriptions of cases & adaptation rules.

Case-based Reasoning (CBR)
CBR cannot use Euclidean distance measures; distance measures must instead be defined for these complex objects (e.g. semantic nets).
CBR tries to model human problem-solving:
– uses past experience (cases) to solve new problems
– retains solutions to new problems
CBR is an ongoing area of machine learning research with many applications.

Applications of CBR
– Design: landscape, building, mechanical, conceptual design of aircraft sub-systems
– Planning: repair schedules
– Diagnosis: medical
– Adversarial reasoning: legal

CBR process
[Flowchart: a new case is matched against the case base to retrieve the closest matching case; if needed, the case is adapted using knowledge and adaptation rules; a solution is suggested, and the solved case is retained in the case base (retrieve – reuse – revise – retain)]
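A structural sketch of this cycle in Python. The dict-based case representation and the rule format are assumptions for illustration, not from the slides:

```python
def matches(a, b):
    """Similarity = number of attribute values two cases share."""
    return sum(a[k] == b[k] for k in a if k != 'solution' and k in b)

def cbr_solve(new_case, case_base, adaptation_rules):
    # Retrieve: find the stored case most similar to the new one
    closest = max(case_base, key=lambda c: matches(new_case, c))
    # Reuse: start from the retrieved case's solution
    solution = closest['solution']
    # Revise: adapt for differences between the matched and new case
    for applies, adapt in adaptation_rules:
        if applies(closest, new_case):
            solution = adapt(solution)
    # Retain: learn by storing the solved case in the case base
    case_base.append(dict(new_case, solution=solution))
    return solution
```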

CBR example: Property pricing
[Table: four stored cases describing houses – attributes include location, reception rooms and type (semi/terraced) – together with their sale prices, plus a test instance (case 5) whose price is to be estimated]

How rules are generated
There is no unique way of doing it. Here is one possibility: examine cases and look for pairs that are almost identical.
– case 1 and case 2:
  R1: if recep-rooms changes from 2 to 1, then reduce price by £5,000
– case 3 and case 4:
  R2: if Type changes from semi to terraced, then reduce price by £7,000

Matching
Comparing the test instance with each stored case:
– matches(5,1) = 3
– matches(5,2) = 3
– matches(5,3) = 2
– matches(5,4) = 1
Estimate: the price of case 5 is £25,000.
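Using the matches function from the CBR sketch above, with hypothetical attribute values standing in for the original table (only the match count is taken from the slide):

```python
case_5 = {'location': 7, 'bedrooms': 2, 'recep_rooms': 1, 'type': 'terraced'}
case_2 = {'location': 8, 'bedrooms': 2, 'recep_rooms': 1, 'type': 'terraced',
          'solution': 25000}   # 'solution' here is the sale price
print(matches(case_5, case_2))  # -> 3: bedrooms, recep_rooms and type agree
```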

Adapting
Reverse rule R2:
– if Type changes from terraced to semi, then increase price by £7,000
Apply the reversed rule R2:
– the new estimate of the price of property 5 is £32,000

Learning
So far we have a new case and an estimated price – nothing has been added to the case base yet.
If we later find the house sold for £35,000, then the case would be added.
– We could also add a new rule: if location changes from 8 to 7, increase price by £3,000.

Problems with CBR
– How should cases be represented?
– How should cases be indexed for fast retrieval?
– How can good adaptation heuristics be developed?
– When should old cases be removed?

Advantages
– A local approximation is found for each test case
– Knowledge is in a form understandable to human beings
– Fast to train

Summary
– K-Nearest Neighbours
– Case-based reasoning
– Lazy and eager learning

Lazy and Eager Learning
Lazy: wait for the query before generalising
– k-Nearest Neighbour, case-based reasoning
Eager: generalise before seeing the query
– Radial Basis Function Networks, ID3, …
Does it matter?
– An eager learner must create a global approximation
– A lazy learner can create many local approximations