Instance-Based Learning: K-Nearest Neighbor, Locally Weighted Regression, Radial Basis Functions

When to Consider Nearest Neighbors
- Instances map to points in R^N
- Fewer than 20 attributes per instance
- Lots of training data
Advantages:
- Training is very fast
- Can learn complex target functions
- Does not lose information
Disadvantages:
- Slow at query time
- Easily fooled by irrelevant attributes

Instance Based Learning
Key idea: just store all training examples.
- Nearest neighbor: given query instance x_q, first locate the nearest training example x_n, then estimate f(x_q) = f(x_n).
- k-nearest neighbor: given x_q, take a vote among its k nearest neighbors (if the target function is discrete-valued), or take the mean of the f values of the k nearest neighbors (if it is real-valued):
  \hat{f}(x_q) = \frac{1}{k} \sum_{i=1}^{k} f(x_i)
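A minimal sketch of both variants, assuming NumPy arrays and a Euclidean metric; the function name knn_predict and its parameters are illustrative choices, not from the slides:

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_q, k=3, classify=True):
    """Predict f(x_q) from the k nearest stored training examples."""
    # Euclidean distance from the query to every stored example
    dists = np.linalg.norm(X_train - x_q, axis=1)
    nearest = np.argsort(dists)[:k]   # indices of the k closest points
    if classify:
        # Discrete-valued target: majority vote among the k neighbors
        return Counter(y_train[nearest]).most_common(1)[0][0]
    # Real-valued target: mean of the neighbors' f values
    return y_train[nearest].mean()
```

With k=1 and classify=True this reduces to the plain nearest-neighbor rule above.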

Voronoi Diagram
[Figure: Voronoi diagram of the training set; a query point q is assigned the f value of its single nearest neighbor.]

3-Nearest Neighbors
[Figure: query point q with its 3 nearest neighbors (2 x, 1 o).]

7-Nearest Neighbors
[Figure: query point q with its 7 nearest neighbors (3 x, 4 o).]

Nearest Neighbor (continuous)
[Figure: 1-nearest-neighbor approximation of a real-valued target function.]

Nearest Neighbor (continuous)
[Figure: 3-nearest-neighbor approximation of the same function.]

Nearest Neighbor (continuous)
[Figure: 5-nearest-neighbor approximation of the same function.]

Locally Weighted Regression
- Regression means approximating a real-valued target function.
- The residual is the error in approximating the target function.
- The kernel function is the function of distance used to determine the weight of each training example; in other words, the kernel function is the function K such that w_i = K(d(x_i, x_q)).
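A minimal sketch of locally weighted linear regression, assuming NumPy and a Gaussian kernel; the bandwidth parameter and the name lwr_predict are illustrative:

```python
import numpy as np

def lwr_predict(X_train, y_train, x_q, bandwidth=1.0):
    """Fit a weighted least-squares line around x_q and evaluate it there."""
    # Kernel weight w_i = K(d(x_i, x_q)) with a Gaussian kernel
    d = np.linalg.norm(X_train - x_q, axis=1)
    w = np.exp(-(d / bandwidth) ** 2)
    # Design matrix with an intercept column
    A = np.hstack([np.ones((len(X_train), 1)), X_train])
    # Solve the weighted normal equations A^T W A beta = A^T W y
    W = np.diag(w)
    beta = np.linalg.solve(A.T @ W @ A, A.T @ W @ y_train)
    return np.r_[1.0, x_q] @ beta
```

Because the weights depend on x_q, a fresh local fit is computed for every query, which is what makes the method "lazy".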

Distance Weighted k-NN
- Give more weight to neighbors closer to the query point:
  \hat{f}(x_q) = \frac{\sum_{i=1}^{k} w_i f(x_i)}{\sum_{i=1}^{k} w_i}
  where w_i = K(d(x_q, x_i)) and d(x_q, x_i) is the distance between x_q and x_i.
- Instead of only the k nearest neighbors, all training examples can be used (Shepard's method).
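A sketch of the weighted estimate, using the inverse-square kernel from a later slide; the eps guard against division by zero is an added assumption:

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x_q, k=5, eps=1e-12):
    """Distance-weighted k-NN with kernel K(d) = 1 / d^2."""
    d = np.linalg.norm(X_train - x_q, axis=1)
    nearest = np.argsort(d)[:k]
    w = 1.0 / (d[nearest] ** 2 + eps)   # closer neighbors get larger weight
    # Shepard's method: pass k = len(X_train) to use all training examples
    return np.sum(w * y_train[nearest]) / np.sum(w)
```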

Distance Weighted Average
- Weighting the data:
  \hat{f}(x_q) = \frac{\sum_i f(x_i) K(d(x_i, x_q))}{\sum_i K(d(x_i, x_q))}
  The relevance of a data point (x_i, f(x_i)) is measured by the distance d(x_i, x_q) between the query x_q and the input vector x_i.
- Weighting the error criterion:
  E(x_q) = \sum_i \left( \hat{f}(x_q) - f(x_i) \right)^2 K(d(x_i, x_q))
  The best estimate \hat{f}(x_q) minimizes the cost E(x_q); therefore \partial E(x_q) / \partial \hat{f}(x_q) = 0, which recovers the weighted average above.
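Carrying out that differentiation makes the equivalence of the two views explicit:

```latex
\frac{\partial E(x_q)}{\partial \hat{f}(x_q)}
  = 2 \sum_i \left( \hat{f}(x_q) - f(x_i) \right) K(d(x_i, x_q)) = 0
\quad\Longrightarrow\quad
\hat{f}(x_q) = \frac{\sum_i f(x_i)\, K(d(x_i, x_q))}{\sum_i K(d(x_i, x_q))}
```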

Kernel Functions
[Figure: plots of the kernel functions shown on the following slides.]

Distance Weighted NN
K(d(x_q, x_i)) = \frac{1}{d(x_q, x_i)^2}

Distance Weighted NN
K(d(x_q, x_i)) = \frac{1}{(d_0 + d(x_q, x_i))^2}

Distance Weighted NN
K(d(x_q, x_i)) = \exp\left( -\left( d(x_q, x_i) / \sigma_0 \right)^2 \right)
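The three kernels above as NumPy functions; the default values for d_0 and sigma_0 are illustrative, not from the slides:

```python
import numpy as np

def inverse_square(d):
    # K(d) = 1 / d^2 (diverges as d -> 0, so exact matches dominate)
    return 1.0 / d ** 2

def shifted_inverse_square(d, d0=0.1):
    # K(d) = 1 / (d0 + d)^2 (d0 bounds the weight of very close points)
    return 1.0 / (d0 + d) ** 2

def gaussian(d, sigma0=1.0):
    # K(d) = exp(-(d / sigma0)^2) (smooth decay, scale set by sigma0)
    return np.exp(-(d / sigma0) ** 2)
```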

Curse of Dimensionality
- Imagine instances described by 20 attributes, of which only 3 are relevant to the target function.
- Curse of dimensionality: nearest neighbor is easily misled when the instance space is high-dimensional.
- One approach: stretch the j-th axis by weight z_j, where z_1, ..., z_n are chosen to minimize prediction error.
- Use cross-validation to choose the weights z_1, ..., z_n automatically; see the sketch after this list.
- Note that setting z_j to zero eliminates that dimension altogether (feature subset selection).
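A minimal sketch of choosing axis weights by leave-one-out error over a small grid, assuming integer class labels; the grid values, k, and the exhaustive search (exponential in the number of attributes, so only feasible for a handful of them) are illustrative assumptions, not the slides' prescription:

```python
import numpy as np
from itertools import product

def loo_error(X, y, z, k=3):
    """Leave-one-out k-NN classification error with axis weights z."""
    Xw = X * z                        # stretch axis j by weight z_j
    errors = 0
    for i in range(len(X)):
        d = np.linalg.norm(Xw - Xw[i], axis=1)
        d[i] = np.inf                 # exclude the held-out point itself
        nearest = np.argsort(d)[:k]
        pred = np.bincount(y[nearest]).argmax()   # majority vote
        errors += int(pred != y[i])
    return errors / len(X)

def choose_axis_weights(X, y, grid=(0.0, 0.5, 1.0), k=3):
    """Pick z_1..z_n minimizing leave-one-out error; z_j = 0
    drops feature j entirely (feature subset selection)."""
    candidates = product(grid, repeat=X.shape[1])
    best = min(candidates, key=lambda z: loo_error(X, y, np.array(z), k))
    return np.array(best)
```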