Presentation transcript (these slides are based on Tom Mitchell's book "Machine Learning")

Slide 1: Lazy learning vs. eager learning
In lazy learning, processing is delayed until a new instance must be classified.
Pros:
– The classification hypothesis is developed locally for each instance to be classified.
Cons:
– Running time: no model is built in advance, so each classification effectively builds a local model from scratch.

Slide 2: K-Nearest Neighbors
Classification of a new instance is based on the classifications of the (one or more) known instances nearest to it.
– K = 1 gives 1-NN (using a single nearest neighbor).
– Frequently, K > 1.
Assumption: all instances correspond to points in the n-dimensional space R^n.
– Dimensions = features (a.k.a. attributes).

Slide 3: Metrics
Nearest neighbors are identified using a metric defined on this high-dimensional space.
Let x be an arbitrary instance with feature vector <a_1(x), a_2(x), ..., a_n(x)>.
The Euclidean metric is frequently used for real-valued features:
d(x_i, x_j) = \sqrt{ \sum_{r=1}^{n} ( a_r(x_i) - a_r(x_j) )^2 }
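As a concrete illustration (not part of the original slides), the Euclidean metric takes a few lines of Python; the function name and the plain-list vector representation are assumptions made for the example:

    import math

    def euclidean_distance(x_i, x_j):
        """Euclidean distance between two feature vectors of equal length."""
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_i, x_j)))

    # Example: two points in R^3 at distance 5 (a 3-4-5 triangle in the first two dimensions).
    print(euclidean_distance([1.0, 2.0, 3.0], [4.0, 6.0, 3.0]))  # 5.0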

Slide 4: Pseudo-code for KNN
Training algorithm:
– For each training example, add the example to the list Training.
Classification algorithm (R^n → V):
– Let V = {v_1, ..., v_l} be the set of classes.
– Given a query instance x_q to be classified:
  – Let X = {x_1, ..., x_k} denote the k instances from Training that are nearest to x_q.
  – Let vote_i be the number of instances in X belonging to class v_i.
  – Return the class v_i for which vote_i is largest.
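A minimal runnable sketch of the classification algorithm above, assuming real-valued feature vectors stored as Python lists and illustrative names (knn_classify, training); it illustrates the voting scheme and is not the original course code:

    import math
    from collections import Counter

    def knn_classify(training, x_q, k):
        """Return the majority class among the k training instances nearest to x_q.

        training: list of (feature_vector, class_label) pairs.
        """
        # Sort by Euclidean distance to the query instance and keep the k nearest.
        neighbors = sorted(training, key=lambda ex: math.dist(ex[0], x_q))[:k]
        # One vote per neighbor; return the class with the most votes.
        votes = Counter(label for _, label in neighbors)
        return votes.most_common(1)[0][0]

    training = [([1.0, 1.0], "pos"), ([1.2, 0.9], "pos"), ([5.0, 5.0], "neg")]
    print(knn_classify(training, [1.1, 1.0], k=3))  # "pos" (2 votes to 1)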

Slide 5: Distance-weighted KNN
Weight the contribution of each of the k neighbors according to its distance to the query point x_q:
– Give greater weight to closer neighbors, e.g. w_i = 1 / d(x_q, x_i)^2.
– Sum the weights per class and return the class v_i whose total weight w_i is largest.
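A sketch of this weighting scheme under the same assumptions as the previous example (inverse-square weights, list-based vectors, illustrative names):

    import math
    from collections import defaultdict

    def weighted_knn_classify(training, x_q, k):
        """Distance-weighted KNN: each of the k nearest neighbors votes with weight 1/d^2."""
        neighbors = sorted(training, key=lambda ex: math.dist(ex[0], x_q))[:k]
        weights = defaultdict(float)
        for features, label in neighbors:
            d = math.dist(features, x_q)
            if d == 0.0:
                # Exact match with a training instance: take its class (see the next slide).
                return label
            weights[label] += 1.0 / d ** 2
        # Return the class with the largest total weight.
        return max(weights, key=weights.get)

    training = [([1.0, 1.0], "pos"), ([1.2, 0.9], "pos"), ([5.0, 5.0], "neg")]
    print(weighted_knn_classify(training, [4.9, 5.1], k=3))  # "neg": the single close neighbor dominates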

Slide 6: Distance-weighted KNN (cont'd)
If x_q exactly matches one of the training instances x_i, then d(x_q, x_i) = 0 and the weight is undefined; in this case we simply take class(x_i) to be the classification of x_q.

Slide 7: Remarks on KNN
KNN is a highly effective learning algorithm.
The distance between instances is calculated over all features:
– If some features are irrelevant, redundant, or noisy, then KNN suffers from the curse of dimensionality.
– In such a case, feature selection must be performed prior to invoking KNN.

Slide 8: Home assignment #4: Feature selection
Compare the following algorithms:
1. ID3 – regular ID3, with internal feature selection
2. KNN.all – KNN that uses all the available features
3. KNN.FS – KNN with a priori feature selection (by information gain, IG)
Two datasets:
– Spam email
– Handwritten digits
You don't have to understand the physical meaning of all the coefficients involved!
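The a priori feature selection in KNN.FS ranks features by information gain before KNN is run. A minimal sketch of such a ranking, assuming discrete (e.g. binarized) feature values and illustrative function names:

    import math
    from collections import Counter

    def entropy(labels):
        """Entropy of a list of class labels."""
        total = len(labels)
        return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

    def information_gain(examples, feature_index):
        """Information gain of one discrete feature over (feature_vector, label) examples."""
        labels = [label for _, label in examples]
        remainder = 0.0
        for value in {x[feature_index] for x, _ in examples}:
            subset = [label for x, label in examples if x[feature_index] == value]
            remainder += (len(subset) / len(examples)) * entropy(subset)
        return entropy(labels) - remainder

    def top_features(examples, m):
        """Indices of the m features with the highest information gain."""
        n_features = len(examples[0][0])
        return sorted(range(n_features), key=lambda i: information_gain(examples, i), reverse=True)[:m]

KNN.FS would then run KNN restricted to the feature indices returned by such a ranking.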

Slide 9: Cross-validation
Cross-validation averages the accuracy of a learning algorithm over a number of experiments.
N-fold cross-validation:
– Partition the available data D into N disjoint subsets T_1, ..., T_N of equal size (|D| / N).
– For i from 1 to N:
  – Training = D \ T_i, Testing = T_i.
  – Induce a classifier from Training, test it on Testing, and measure its accuracy A_i.
– Return (Σ_i A_i) / N (the cross-validated accuracy).
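A minimal N-fold cross-validation sketch following the procedure above; the classifier interface (a train_fn that returns a classify function) is an assumption made for the example:

    def cross_validated_accuracy(data, n_folds, train_fn):
        """Average accuracy over n_folds disjoint train/test splits.

        data: list of (feature_vector, label) pairs.
        train_fn: takes a training list and returns a function classify(x) -> label.
        """
        fold_size = len(data) // n_folds
        accuracies = []
        for i in range(n_folds):
            testing = data[i * fold_size:(i + 1) * fold_size]                 # T_i
            training = data[:i * fold_size] + data[(i + 1) * fold_size:]      # D \ T_i
            classify = train_fn(training)
            correct = sum(1 for x, label in testing if classify(x) == label)
            accuracies.append(correct / len(testing))
        return sum(accuracies) / n_folds

For example, train_fn could wrap the earlier KNN sketch: train_fn = lambda training: (lambda x: knn_classify(training, x, k=3)).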

