
1 Instance-Based Learning: K-Nearest Neighbor, Locally Weighted Regression, Radial Basis Functions

2 When to Consider Nearest Neighbors
- Instances map to points in R^N
- Fewer than 20 attributes per instance
- Lots of training data
Advantages:
- Training is very fast
- Can learn complex target functions
- Does not lose information
Disadvantages:
- Slow at query time
- Easily fooled by irrelevant attributes

3 Instance-Based Learning
Key idea: just store all training examples.
- Nearest neighbor: given query instance x_q, first locate the nearest training example x_n, then estimate f^(x_q) = f(x_n).
- k-nearest neighbor: given x_q, take a vote among its k nearest neighbors (if the target function is discrete-valued), or take the mean of the f values of the k nearest neighbors (if real-valued):
    f^(x_q) = (1/k) Σ_{i=1}^{k} f(x_i)
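A minimal sketch of both cases in Python (NumPy assumed; the function name and arguments are illustrative, not from the slides):

```python
import numpy as np

def knn_predict(X_train, y_train, x_q, k=3, classify=True):
    """Predict f(x_q) from the k nearest training examples."""
    # Euclidean distance from the query to every training instance
    dists = np.linalg.norm(X_train - x_q, axis=1)
    nearest = np.argsort(dists)[:k]  # indices of the k closest points
    if classify:
        # Discrete-valued target: majority vote among the k neighbors
        values, counts = np.unique(y_train[nearest], return_counts=True)
        return values[np.argmax(counts)]
    # Real-valued target: mean of the neighbors' f values
    return y_train[nearest].mean()
```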

4 Voronoi Diagram
[Figure: Voronoi diagram of the training set, showing a query point x_q and its nearest neighbor]

5 3-Nearest Neighbors
[Figure: query point x_q and its 3 nearest neighbors: 2 x's, 1 o]

6 7-Nearest Neighbors
[Figure: query point x_q and its 7 nearest neighbors: 3 x's, 4 o's]

7 Nearest Neighbor (continuous)
[Figure: 1-nearest-neighbor fit to a continuous target function]

8 Nearest Neighbor (continuous)
[Figure: 3-nearest-neighbor fit]

9 Nearest Neighbor (continuous)
[Figure: 5-nearest-neighbor fit]
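The effect shown in slides 7-9 can be reproduced with a short sketch (the noisy sine target, sample size, and query points below are my own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 2 * np.pi, 40)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.1, 40)  # noisy continuous target

def knn_regress(X, y, x_q, k):
    """k-NN regression in one dimension: mean of the k nearest f values."""
    d = np.abs(X.ravel() - x_q)
    return y[np.argsort(d)[:k]].mean()

for k in (1, 3, 5):  # larger k -> smoother, more averaged fit
    print(k, [round(knn_regress(X, y, x_q, k), 2) for x_q in (1.0, 3.0, 5.0)])
```

As k grows, the prediction averages over a wider neighborhood, so the fit becomes smoother but flattens sharp features of the target.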

10 Locally Weighted Regression
- Regression means approximating a real-valued target function.
- The residual is the error in approximating the target function.
- The kernel function is the function of distance used to determine the weight of each training example; that is, the kernel is the function K such that w_i = K(d(x_i, x_q)).

11 Distance-Weighted k-NN
Give more weight to neighbors closer to the query point:
    f^(x_q) = Σ_{i=1}^{k} w_i f(x_i) / Σ_{i=1}^{k} w_i
where w_i = K(d(x_q, x_i)) and d(x_q, x_i) is the distance between x_q and x_i.
Instead of only the k nearest neighbors, all training examples can be used (Shepard's method).
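A sketch of the distance-weighted version (the inverse-square kernel here is one of the kernels shown on the later slides; the small eps guard against division by zero is my own addition):

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x_q, k=5, eps=1e-12):
    """Distance-weighted k-NN regression: closer neighbors count more."""
    dists = np.linalg.norm(X_train - x_q, axis=1)
    nearest = np.argsort(dists)[:k]
    # Inverse-square kernel w_i = 1 / d(x_q, x_i)^2; eps avoids division
    # by zero when the query coincides with a training point.
    w = 1.0 / (dists[nearest] ** 2 + eps)
    return np.sum(w * y_train[nearest]) / np.sum(w)
```

Setting k = len(X_train) uses all training examples, i.e. Shepard's method.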

12 Distance-Weighted Average
Weighting the data:
    f^(x_q) = Σ_i f(x_i) K(d(x_i, x_q)) / Σ_i K(d(x_i, x_q))
The relevance of a data point (x_i, f(x_i)) is measured by the distance d(x_i, x_q) between the query x_q and the input vector x_i.
Weighting the error criterion:
    E(x_q) = Σ_i (f^(x_q) - f(x_i))^2 K(d(x_i, x_q))
The best estimate f^(x_q) minimizes the cost E(x_q), therefore ∂E(x_q)/∂f^(x_q) = 0.
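The step connecting the two views, written out (a short derivation using only the formulas above):

```latex
\frac{\partial E(x_q)}{\partial \hat{f}(x_q)}
  = \sum_i 2\left(\hat{f}(x_q) - f(x_i)\right) K\!\left(d(x_i, x_q)\right) = 0
\quad\Longrightarrow\quad
\hat{f}(x_q) = \frac{\sum_i f(x_i)\, K\!\left(d(x_i, x_q)\right)}
                    {\sum_i K\!\left(d(x_i, x_q)\right)}
```

So minimizing the weighted error criterion recovers exactly the distance-weighted average used to weight the data.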

13 Kernel Functions

14 Distance-Weighted NN: K(d(x_q, x_i)) = 1 / d(x_q, x_i)^2

15 Distance-Weighted NN: K(d(x_q, x_i)) = 1 / (d_0 + d(x_q, x_i))^2

16 Distance-Weighted NN: K(d(x_q, x_i)) = exp(-(d(x_q, x_i) / σ_0)^2)
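The three kernels of slides 14-16 side by side (a sketch; the d_0 and sigma_0 values are arbitrary illustrations, not from the slides):

```python
import numpy as np

def inverse_square(d):                  # slide 14: diverges as d -> 0
    return 1.0 / d**2

def shifted_inverse_square(d, d0=0.5):  # slide 15: finite at d = 0
    return 1.0 / (d0 + d)**2

def gaussian(d, sigma0=1.0):            # slide 16: smooth, bounded weights
    return np.exp(-(d / sigma0)**2)

d = np.linspace(0.1, 3.0, 30)
for K in (inverse_square, shifted_inverse_square, gaussian):
    print(K.__name__, K(d)[:3])  # weights fall off with distance
```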

17 Curse of Dimensionality
Imagine instances described by 20 attributes, of which only 3 are relevant to the target function.
Curse of dimensionality: nearest neighbor is easily misled when the instance space is high-dimensional.
One approach:
- Stretch the j-th axis by weight z_j, where z_1, ..., z_n are chosen to minimize prediction error.
- Use cross-validation to choose the weights z_1, ..., z_n automatically.
- Note that setting z_j to zero eliminates that dimension altogether (feature subset selection).
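A minimal sketch of the axis-stretching idea, assuming leave-one-out 1-NN classification error as the cross-validation criterion (the coarse candidate grid and both function names are illustrative):

```python
import numpy as np
from itertools import product

def loo_1nn_error(X, y):
    """Leave-one-out error of 1-NN classification on (X, y)."""
    errors = 0
    for i in range(len(X)):
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf  # exclude the held-out point itself
        errors += y[np.argmin(d)] != y[i]
    return errors / len(X)

def best_axis_weights(X, y, grid=(0.0, 0.5, 1.0)):
    """Search a small grid of per-axis weights z; z_j = 0 drops feature j."""
    best_z, best_err = None, np.inf
    for z in product(grid, repeat=X.shape[1]):
        err = loo_1nn_error(X * np.array(z), y)
        if err < best_err:
            best_z, best_err = np.array(z), err
    return best_z, best_err
```

Exhaustive search over the grid grows exponentially with the number of features; the slide's point is the selection criterion (cross-validated prediction error), not this particular search strategy.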
