
Nearest Neighbor Classifiers
other names:
– instance-based learning
– case-based learning (CBL)
– non-parametric learning
– model-free learning

1-NN
save all training data
to classify a test example:
– compute the distance to each training example (Euclidean distance metric)
– report the class of the nearest training example
– for binary attributes, use Hamming distance
– for nominal attributes, use equality (0 if equal, else 1) or VDM (Value-Difference Metric; Stanfill and Waltz, 1986): the squared differences of conditional probabilities, summed over classes
Result: often surprisingly good accuracy, comparable with decision trees & neural nets
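A minimal 1-NN sketch, assuming continuous attributes stored as numpy arrays; the variable and function names are illustrative, not from the lecture:

```python
import numpy as np

def predict_1nn(X_train, y_train, x):
    """Label of the training example nearest to x under Euclidean distance."""
    dists = np.sqrt(((X_train - x) ** 2).sum(axis=1))  # distance to every stored example
    return y_train[np.argmin(dists)]                    # class of the nearest one

# illustrative usage
X_train = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 1.0]])
y_train = np.array([0, 1, 0])
print(predict_1nn(X_train, y_train, np.array([2.8, 3.5])))   # prints 1
```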

k-NN
– 1-NN is sensitive to noise
– so take a majority vote over the k closest neighbors
– optimizing k: use a validation set
distance-weighting
– weight each neighbor’s vote by its closeness (e.g., inverse distance)
– with weighting, all training examples can be used
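A hedged sketch of k-NN with an optional inverse-distance weighting scheme (one common choice; the lecture does not fix a particular weighting):

```python
import numpy as np
from collections import Counter

def predict_knn(X_train, y_train, x, k=3, weighted=False):
    """Majority vote over the k nearest neighbors; optionally distance-weighted."""
    dists = np.sqrt(((X_train - x) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]                    # indices of the k closest examples
    if not weighted:
        return Counter(y_train[nearest]).most_common(1)[0][0]
    votes = {}                                         # inverse-distance weighted vote
    for i in nearest:
        w = 1.0 / (dists[i] + 1e-12)                   # small constant avoids division by zero
        votes[y_train[i]] = votes.get(y_train[i], 0.0) + w
    return max(votes, key=votes.get)
```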

strengths of k-NN
– simple, accurate
– Theorem: in the limit (large N), the error of 1-NN is at most twice the error of the Bayes-optimal classifier (Cover & Hart, 1967)
weaknesses of k-NN
– memory needed to store examples
– classification speed (indexing can help)
– no comprehensibility
– (noise, curse of dimensionality, lack of adequate training examples)
basis for generalization
– bias: similarity bias
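The bound quoted above, written out as a formula (asymptotic in the training-set size N; this is the simple form of the Cover & Hart result):

```latex
\lim_{N \to \infty} \operatorname{err}(\text{1-NN}) \;\le\; 2 \cdot \operatorname{err}(\text{Bayes-optimal})
```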

NTGrowth (Aha and Kibler)
– during training, save only those examples on which mistakes are made
– also throw out examples that appear noisy
– reduces memory requirements, increases accuracy
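A rough sketch of the "store only the examples the current store misclassifies" idea; this is a simplification in the spirit of Aha & Kibler's instance-reduction algorithms, not their exact NTGrowth procedure (the noise-filtering bookkeeping is omitted):

```python
import numpy as np

def grow_instance_store(X, y):
    """Keep a training example only if the examples kept so far misclassify it (1-NN)."""
    kept_X, kept_y = [X[0]], [y[0]]                       # seed the store with the first example
    for xi, yi in zip(X[1:], y[1:]):
        dists = [np.linalg.norm(xi - xk) for xk in kept_X]
        if kept_y[int(np.argmin(dists))] != yi:           # current store gets this example wrong
            kept_X.append(xi)
            kept_y.append(yi)
    return np.array(kept_X), np.array(kept_y)
```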

Scaling of attributes
– for fairness, don’t want large values to dominate
– pre-whiten the data: for continuous values, replace x with its z-score, z = (x − μ) / σ
– binary and nominal attributes are already on a 0–1 scale
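A minimal sketch of the z-score pre-whitening step, using per-attribute training-set means and standard deviations (the function names are illustrative):

```python
import numpy as np

def zscore_fit(X_train):
    """Per-attribute mean and standard deviation estimated from the training data."""
    mu = X_train.mean(axis=0)
    sigma = X_train.std(axis=0)
    sigma[sigma == 0] = 1.0           # constant attributes: avoid division by zero
    return mu, sigma

def zscore_apply(X, mu, sigma):
    return (X - mu) / sigma           # z = (x - mu) / sigma, applied attribute-wise
```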

Feature Weighting
weighted Euclidean distance metric
– want to weight features by “relevance”
– candidate relevance measures: conditional probability, negEntropy, chi-squared
Mahalanobis metric
– inverse of the covariance matrix: d_xy = (x − y)^T Σ^{-1} (x − y)
– captures skewing of the data distribution
– con: class-independent
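A small sketch of the Mahalanobis quadratic form as written on the slide (take the square root for the usual Mahalanobis distance); the covariance estimate here is class-independent, matching the "con" noted above:

```python
import numpy as np

def mahalanobis_sq(x, y, cov_inv):
    """(x - y)^T Sigma^{-1} (x - y), with cov_inv the inverse covariance matrix."""
    d = x - y
    return float(d @ cov_inv @ d)

# illustrative usage with synthetic data
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 3))
cov_inv = np.linalg.inv(np.cov(X_train, rowvar=False))
print(mahalanobis_sq(X_train[0], X_train[1], cov_inv))
```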

Feature Selection
curse of dimensionality – many attributes often lead to lower accuracy
PCA – principal component analysis
– based on manipulation of the covariance matrix
– choose new orthogonal dimensions as linear combinations of the original attributes, in order of most variance explained
filter methods: try to estimate relevance
– negEntropy; RELIEF: hits vs. misses of neighbors
wrapper methods (use accuracy on training data to pick the best features)
– SFS: stepwise-forward selection
– SBE: stepwise-backward elimination
– DIET: optimize the weights of one feature at a time by searching a grid
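A minimal sketch of the wrapper idea in the SFS style; score_fn (e.g., hold-out accuracy of a k-NN classifier restricted to the candidate feature subset) is an assumed callable, not something defined in the lecture:

```python
def forward_select(n_features, score_fn, max_features=None):
    """Greedily add the single feature that most improves score_fn; stop when nothing helps."""
    selected, remaining = [], list(range(n_features))
    best_score = float("-inf")
    while remaining and (max_features is None or len(selected) < max_features):
        top_score, top_f = max((score_fn(selected + [f]), f) for f in remaining)
        if top_score <= best_score:        # no remaining feature improves the score
            break
        selected.append(top_f)
        remaining.remove(top_f)
        best_score = top_score
    return selected
```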