Presentation on theme: "Instance Based Learning. Nearest Neighbor Remember all your data When someone asks a question –Find the nearest old data point –Return the answer associated."— Presentation transcript:

1 Instance Based Learning

2 Nearest Neighbor Remember all your data. When someone asks a question –Find the nearest old data point –Return the answer associated with it In order to say which point is nearest, we have to define what we mean by "near". Typically, we use the Euclidean distance between two points. Nominal attributes: distance is 1 if the values differ, 0 if they are equal.
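The scheme above can be sketched in a few lines of Python; the toy data points are hypothetical, but the mixed distance (Euclidean for numeric attributes, 0/1 mismatch for nominal ones) follows the slide.

```python
from math import sqrt

def distance(a, b):
    """Mixed distance: squared difference for numeric attributes,
    0/1 mismatch for nominal (string) attributes."""
    total = 0.0
    for x, y in zip(a, b):
        if isinstance(x, str):           # nominal attribute
            total += 0.0 if x == y else 1.0
        else:                            # numeric attribute
            total += (x - y) ** 2
    return sqrt(total)

def nearest_neighbor(train, query):
    """train is a list of (features, label) pairs; return the label
    of the training point closest to the query."""
    best = min(train, key=lambda pair: distance(pair[0], query))
    return best[1]

# hypothetical training set with two numeric features
train = [((0.2, 1.0), "no"), ((0.9, 4.0), "yes")]
print(nearest_neighbor(train, (0.3, 2.0)))  # -> no
```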

3 Predicting Bankruptcy

4 Now, let's say we have a new person with R equal to 0.3 and L equal to 2. What y value should we predict? The nearest training point is labeled "no", and so our answer would be "no".

5 Scaling The naïve Euclidean distance isn't always appropriate. Consider the case where we have two features describing a car. –f1 = weight in pounds –f2 = number of cylinders Any effect of f2 will be completely lost because of the relative scales: a difference of two cylinders is dwarfed by a difference of a thousand pounds. So, rescale the inputs to put all of the features on about equal footing.
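One common way to put features on equal footing is min-max rescaling of each column to [0, 1]; a minimal sketch, with hypothetical car data for the two features from the slide:

```python
def rescale(data):
    """Min-max rescale each column of a list of feature vectors to [0, 1]."""
    cols = list(zip(*data))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        tuple((v - l) / (h - l) if h > l else 0.0
              for v, l, h in zip(row, lo, hi))
        for row in data
    ]

# weight in pounds vs. number of cylinders (hypothetical cars)
cars = [(2500, 4), (3500, 6), (4500, 8)]
print(rescale(cars))  # each feature now spans [0, 1]
```

After rescaling, a one-unit difference means the same thing in every dimension, so no feature dominates the Euclidean distance.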

6 Time and Space Learning is fast –We just have to remember the training data. Space is O(n). What takes longer is answering a query. If we do it naively, we have to compute, for each point in our training set (and there are n of them), the distance to the query point (which takes about m computations, since there are m features to compare). So, overall, a query takes about m * n time.

7 Noise Someone with an apparently healthy financial record goes bankrupt. A single such noisy point changes the answer for every query that lands near it.

8 Remedy: K-Nearest Neighbors k-nearest neighbor algorithm: –Just like the old algorithm, except that when we get a query, we search for the k closest points to the query point and output what the majority says. –In this case, we've chosen k to be 3. –The three closest points consist of two "no"s and a "yes", so our answer would be "no". Find the optimal k using cross-validation.
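The majority-vote step can be sketched as follows; the toy training set is hypothetical and arranged so that, as on the slide, two "no" neighbors outvote one "yes".

```python
from collections import Counter
from math import dist  # Euclidean distance between two points (Python 3.8+)

def knn(train, query, k=3):
    """Return the majority label among the k training points
    closest to the query. train: list of (features, label) pairs."""
    neighbors = sorted(train, key=lambda p: dist(p[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# two "no" neighbors and one "yes" among the 3 closest -> majority says "no"
train = [((1, 1), "no"), ((1, 2), "no"), ((2, 1), "yes"), ((9, 9), "yes")]
print(knn(train, (1.2, 1.2), k=3))  # -> no
```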

9 Other Variants IB2: save memory, speed up classification –Works incrementally –Only incorporates misclassified instances –Problem: noisy data gets incorporated IB3: deal with noise –Discard instances that don't perform well –Keep a record of the number of correct and incorrect classification decisions that each exemplar makes. –Two predetermined thresholds are set on the success ratio. If an exemplar's performance falls below the lower threshold, it is deleted. If its performance exceeds the upper threshold, it is used for prediction.

10 Instance-based learning: IB2 IB2: save memory, speed up classification –Work incrementally –Only incorporate misclassified instances –Problem: noisy data gets incorporated Data: “Who buys gold jewelry” (25,60,no) (45,60,no) (50,75,no) (50,100,no) (50,120,no) (70,110,yes) (85,140,yes) (30,260,yes) (25,400,yes) (45,350,yes) (50,275,yes) (60,260,yes)

11 Instance-based learning: IB2 Data: –(25,60,no) –(85,140,yes) –(45,60,no) –(30,260,yes) –(50,75,no) –(50,120,no) –(70,110,yes) –(25,400,yes) –(50,100,no) –(45,350,yes) –(50,275,yes) –(60,260,yes) Only 5 of these points (highlighted on the original slide) end up memorized; that is the final answer. However, let's compute the classifier gradually.

12 Instance-based learning: IB2 Data: –(25,60,no)

13 Instance-based learning: IB2 Data: –(25,60,no) –(85,140,yes) Since so far the model has only the first instance memorized, this second instance gets wrongly classified. So, we memorize it as well.

14 Instance-based learning: IB2 Data: –(25,60,no) –(85,140,yes) –(45,60,no) So far the model has the first two instances memorized. The third instance gets properly classified, since it happens to be closer to the first. So, we don't memorize it.

15 Instance-based learning: IB2 Data: –(25,60,no) –(85,140,yes) –(45,60,no) –(30,260,yes) So far the model has the first two instances memorized. The fourth instance gets properly classified, since it happens to be closer to the second. So, we don't memorize it.

16 Instance-based learning: IB2 Data: –(25,60,no) –(85,140,yes) –(45,60,no) –(30,260,yes) –(50,75,no) So far the model has the first two instances memorized. The fifth instance gets properly classified, since it happens to be closer to the first. So, we don't memorize it.

17 Instance-based learning: IB2 Data: –(25,60,no) –(85,140,yes) –(45,60,no) –(30,260,yes) –(50,75,no) –(50,120,no) So far the model has the first two instances memorized. The sixth instance gets wrongly classified, since it happens to be closer to the second. So, we memorize it.

18 Instance-based learning: IB2 Continuing in a similar way, we finally get the figure on the right. –The colored points are the ones that get memorized. This is the final answer, i.e. we memorize only these 5 points.

19 Instance-based learning: IB3 IB3: deal with noise –Discard instances that don't perform well –Keep a record of the number of correct and incorrect classification decisions that each exemplar makes. –Two predetermined thresholds are set on the success ratio. –An instance is used for training: If the number of incorrect classifications is ≤ the first (lower) threshold, and If the number of correct classifications is ≥ the second (upper) threshold.

20 Instance-based learning: IB3 Suppose the lower threshold is 0 and the upper threshold is 1. Shuffle the data first: –(25,60,no) –(85,140,yes) –(45,60,no) –(30,260,yes) –(50,75,no) –(50,120,no) –(70,110,yes) –(25,400,yes) –(50,100,no) –(45,350,yes) –(50,275,yes) –(60,260,yes)

21 Instance-based learning: IB3 Suppose the lower threshold is 0 and the upper threshold is 1. Each pair [i,c] records the number of incorrect (i) and correct (c) classification decisions the exemplar has made: –(25,60,no) [1,1] –(85,140,yes) [1,1] –(45,60,no) [0,1] –(30,260,yes) [0,2] –(50,75,no) [0,1] –(50,120,no) [0,1] –(70,110,yes) [0,0] –(25,400,yes) [0,1] –(50,100,no) [0,0] –(45,350,yes) [0,0] –(50,275,yes) [0,1] –(60,260,yes) [0,0]

22 Instance-based learning: IB3 The points that will be used in classification are: –(45,60,no) [0,1] –(30,260,yes) [0,2] –(50,75,no) [0,1] –(50,120,no) [0,1] –(25,400,yes) [0,1] –(50,275,yes) [0,1]
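The selection step from slides 21-22 can be sketched as a simple filter; the [incorrect, correct] counts are copied from the slide, and the thresholds (0 and 1) are the ones assumed there. With this rule the six points listed above are exactly the ones kept.

```python
def ib3_filter(exemplars, low=0, high=1):
    """Keep exemplars whose record satisfies: incorrect <= low
    and correct >= high (the thresholds assumed on the slide).
    exemplars: list of (point, label, (incorrect, correct))."""
    return [(pt, label) for pt, label, (wrong, right) in exemplars
            if wrong <= low and right >= high]

# (point, label, (incorrect, correct)) -- counts copied from slide 21
exemplars = [((25, 60), "no", (1, 1)), ((85, 140), "yes", (1, 1)),
             ((45, 60), "no", (0, 1)), ((30, 260), "yes", (0, 2)),
             ((50, 75), "no", (0, 1)), ((50, 120), "no", (0, 1)),
             ((70, 110), "yes", (0, 0)), ((25, 400), "yes", (0, 1)),
             ((50, 100), "no", (0, 0)), ((45, 350), "yes", (0, 0)),
             ((50, 275), "yes", (0, 1)), ((60, 260), "yes", (0, 0))]
kept = ib3_filter(exemplars)
print([pt for pt, _ in kept])  # the 6 points used in classification
```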

23 Rectangular generalizations When a new exemplar is classified correctly, it is generalized by simply merging it with the nearest exemplar. The nearest exemplar may be either a single instance or a hyperrectangle.

24 Rectangular generalizations Data: –(25,60,no) –(85,140,yes) –(45,60,no) –(30,260,yes) –(50,75,no) –(50,120,no) –(70,110,yes) –(25,400,yes) –(50,100,no) –(45,350,yes) –(50,275,yes) –(60,260,yes)

25 Classification [Figure: rectangles of Class 1 and Class 2 with a separation line] If the new instance lies within a rectangle, output that rectangle's class. If the new instance lies in the overlap of several rectangles, output the class of the rectangle whose center is closest to the new data instance. If the new instance lies outside every rectangle, output the class of the rectangle that is closest to the data instance. The distance of a point from a rectangle is: 1. If the instance lies within the rectangle, d = 0. 2. If outside, d = the distance from the closest part of the rectangle, i.e. from some point on the rectangle boundary.
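The point-to-rectangle distance defined in steps 1-2 has a compact closed form for axis-aligned rectangles; a minimal sketch, with the rectangle representation ((xmin, ymin), (xmax, ymax)) chosen here for illustration:

```python
def rect_distance(point, rect):
    """Euclidean distance from a point to an axis-aligned rectangle
    given as ((xmin, ymin), (xmax, ymax)); 0 if the point lies inside."""
    (xmin, ymin), (xmax, ymax) = rect
    # per-axis gap to the rectangle; 0 when inside the interval
    dx = max(xmin - point[0], 0, point[0] - xmax)
    dy = max(ymin - point[1], 0, point[1] - ymax)
    return (dx * dx + dy * dy) ** 0.5

print(rect_distance((5, 5), ((0, 0), (10, 10))))    # inside -> 0.0
print(rect_distance((13, 14), ((0, 0), (10, 10))))  # outside -> 5.0
```

The max(…, 0, …) terms clamp each axis gap to zero inside the rectangle, so the formula covers both cases on the slide at once.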

