Download presentation

Presentation is loading. Please wait.

Published byShanon Stephens Modified about 1 year ago

1
1er. Escuela Red ProTIC - Tandil, 18-28 de Abril, 2006 4. Instance-Based Learning 4.1 Introduction Instance-Based Learning: Local approximation to the target function that applies in the neighborhood of the query instance –Cost of classifying new instances can be high: Nearly all computations take place at classification time –Examples: k-Nearest Neighbors –Radial Basis Functions: Bridge between instance-based learning and artificial neural networks

2
1er. Escuela Red ProTIC - Tandil, 18-28 de Abril, 2006 4. Instance-Based Learning K-Nearest Neighbors

3
1er. Escuela Red ProTIC - Tandil, 18-28 de Abril, 2006 4. Instance-Based Learning Most plausible hypothesis

4
1er. Escuela Red ProTIC - Tandil, 18-28 de Abril, 2006 4. Instance-Based Learning Now? Or maybe…

5
1er. Escuela Red ProTIC - Tandil, 18-28 de Abril, 2006 4. Instance-Based Learning

6
1er. Escuela Red ProTIC - Tandil, 18-28 de Abril, 2006 4. Instance-Based Learning

7
1er. Escuela Red ProTIC - Tandil, 18-28 de Abril, 2006 4. Instance-Based Learning Is the simplest hypothesis always the best one?

8
1er. Escuela Red ProTIC - Tandil, 18-28 de Abril, 2006 4. Instance-Based Learning 4.2 k-Nearest Neighbor Learning Instance x = [ a 1 (x), a 2 (x),..., a n (x) ] n d(x i,x j ) = [ (x i -x j ).(x i -x j ) ] ½ = Euclidean Distance –Discrete-Valued Target Functions ƒ : n V = {v 1, v 2,.., v s )

9
1er. Escuela Red ProTIC - Tandil, 18-28 de Abril, 2006 4. Instance-Based Learning Prediction for a new query x: (k nearest neighbors of x) ƒ(x) = argmax v V i=1,k [v,ƒ(x i )] [v,ƒ(x i )] = 1 if v =ƒ(x i ), [v,ƒ(x i )] = 0 otherwise –Continuous-Valued Target Functions ƒ(x) = (1/k) i=1,k ƒ(x i )

10
1er. Escuela Red ProTIC - Tandil, 18-28 de Abril, 2006 4. Instance-Based Learning

11
1er. Escuela Red ProTIC - Tandil, 18-28 de Abril, 2006 4. Instance-Based Learning Distance-Weighted k-NN ƒ(x) = argmax v V i=1,k w i [v,ƒ(x i )] ƒ(x) = i=1,k w i ƒ(x i ) / k i=1,k w i w i = [d(x i,x)] -2 Weights more heavily closest neighbors

12
1er. Escuela Red ProTIC - Tandil, 18-28 de Abril, 2006 4. Instance-Based Learning Remarks for k-NN –Robust to noise –Quite effective for large training sets –Inductive bias: The classification of an instance will be most similar to the classification of instances that are nearby in Euclidean distance –Especially sensitive to the curse of dimensionality –Elimination of irrelevant attributes by suitably chosen the metric: d(x i,x j ) = [ (x i -x j ).G.(x i -x j ) ] ½

13
1er. Escuela Red ProTIC - Tandil, 18-28 de Abril, 2006 4. Instance-Based Learning 4.3 Locally Weighted Regression Builds an explicit approximation to ƒ(x) over a local region surrounding x (usually a linear or quadratic fit to training examples nearest to x) Locally Weighted Linear Regression: ƒ L (x) = w 0 + w 1 x 1 +...+ w n x n E(x) = i=1,k [ƒ L (x i )-ƒ(x i )] 2 (x i nn of x)

14
1er. Escuela Red ProTIC - Tandil, 18-28 de Abril, 2006 4. Instance-Based Learning Generalization: ƒ L (x) = w 0 + w 1 x 1 +...+ w n x n E(x) = i=1,N K[ d(x i,x)] [ƒ L (x i )-ƒ(x i )] 2 K[ d(x i,x)] = kernel function Other possibility: ƒ Q (x) = quadratic function of x j

15
1er. Escuela Red ProTIC - Tandil, 18-28 de Abril, 2006 4. Instance-Based Learning 4.4 Radial Basis Functions Approach closely related to distance-weighted regression and artificial neural network learning ƒ RBF (x) = w 0 + µ=1,k w µ K[ d(x µ,x)] K[ d(x µ,x)] = exp[-d 2 (x µ,x)/ 2 2 µ ] = Gaussian kernel function

16
1er. Escuela Red ProTIC - Tandil, 18-28 de Abril, 2006 4. Instance-Based Learning

17
1er. Escuela Red ProTIC - Tandil, 18-28 de Abril, 2006 4. Instance-Based Learning Training RBF Networks 1 st Stage: Determination of k (=number of basis functions) x µ and µ (kernel parameters) Expectation-Maximization (EM) algorithm 2 nd Stage: Determination of weights w µ Linear Problem

18
1er. Escuela Red ProTIC - Tandil, 18-28 de Abril, 2006 4. Instance-Based Learning 4.6 Remarks on Lazy and Eager Learning Lazy Learning: stores data and postpones decisions until a new query is presented Eager Learning: generalizes beyond the training data before a new query is presented Lazy methods may consider the query instance x when deciding how to generalize beyond the training data D (local approximation) Eager methods cannot (they have already chosen their global approximation to the target function)

Similar presentations

© 2017 SlidePlayer.com Inc.

All rights reserved.

Ads by Google