Non-Parametric Classifiers
Alexandros Potamianos, Dept. of ECE, Tech. Univ. of Crete, Fall 2004-2005



Histograms - Parzen Windows

- Main idea: instead of selecting a parametric distribution (e.g., a Gaussian) to describe the properties of the features of a class, compute the empirical distribution directly: the class feature histogram.

Feature Histogram Example

[Figure: histogram of the number of samples in each bin vs. the feature value X]

- Normalize the histogram curve to get the feature PDF (see the sketch below).
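A minimal sketch of this normalization with NumPy; the bin count and the synthetic Gaussian samples are illustrative assumptions, not values from the slides:

```python
import numpy as np

def histogram_pdf(samples, n_bins=20):
    """Estimate a 1-D feature PDF as a normalized histogram."""
    counts, edges = np.histogram(samples, bins=n_bins)
    widths = np.diff(edges)
    # Normalize so the histogram integrates to one (an empirical PDF).
    pdf = counts / (counts.sum() * widths)
    return pdf, edges

# Example: estimate the feature PDF of one class from 500 samples.
samples = np.random.normal(loc=0.0, scale=1.0, size=500)
pdf, edges = histogram_pdf(samples)
print((pdf * np.diff(edges)).sum())  # ~1.0: the estimate integrates to one
```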

Parzen Windows: Issues

- Compared to parametric methods, empirical distributions are:
  - Better, because no specific form of the PDF is assumed
  - Worse, because over-fitting can easily occur (histogram bins that are too small)
- Parzen proposed rules for adapting the bin size, based on the number of samples in each bin, to avoid over-fitting.
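As a rough illustration of the windowing idea (a fixed-width Gaussian kernel, not the adaptive bin-size rules mentioned above), here is a minimal Parzen-window density estimate; the bandwidth h is an assumed free parameter:

```python
import numpy as np

def parzen_pdf(x, samples, h=0.5):
    """Parzen-window density estimate at points x (Gaussian kernel).

    A smaller window h follows the data more closely but over-fits,
    which is exactly the trade-off discussed above.
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))[:, None]  # (n_points, 1)
    s = np.asarray(samples, dtype=float)[None, :]           # (1, n_samples)
    kernels = np.exp(-0.5 * ((x - s) / h) ** 2) / (h * np.sqrt(2 * np.pi))
    return kernels.mean(axis=1)  # average the per-sample kernel contributions
```

For classification, one such density can be estimated per class and a test point assigned to the class with the largest (prior-weighted) estimated density.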

Nearest Neighbor Rule

- Main idea (1-NNR):
  - No explicit model (i.e., no training)
  - For each test sample x, the "nearest" sample x' in the training set D is found, i.e., x' = argmin_{z in D} d(x, z), and x is classified to the class to which x' belongs
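A minimal 1-NNR sketch in plain Python; the Euclidean distance and the (sample, label) pair representation are assumptions (any metric d would do):

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors (Python 3.8+)."""
    return math.dist(a, b)

def nn_classify(x, train_set):
    """1-NNR: train_set is a list of (feature_vector, class_label) pairs."""
    _, label = min(train_set, key=lambda pair: euclidean(x, pair[0]))
    return label
```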

Generalizations

- k-NNR: instead of finding the single nearest neighbor, the k nearest neighbors in the training set are found; the sample x is classified to the class to which most of the k neighbors belong
- k-l-NNR: like k-NNR, but at least l of the k nearest neighbors must belong to the same class for a classification decision to be taken (else no decision)
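Both rules in one sketch, reusing the euclidean helper from the previous block; returning None to signal "no decision" is an assumed convention:

```python
from collections import Counter

def knn_classify(x, train_set, k=3, l=None):
    """k-NNR, or k-l-NNR when l is given.

    Returns the majority class among the k nearest neighbors, or None
    (no decision) when l is given and no class gets at least l votes.
    """
    neighbors = sorted(train_set, key=lambda pair: euclidean(x, pair[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    label, count = votes.most_common(1)[0]
    if l is not None and count < l:
        return None  # k-l-NNR: not enough neighbors agree -> no decision
    return label
```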

Example

- Training sets: D1 = {0, -1, -2} (class 1) and D2 = {1, 1, 1} (class 2)

[Figure: the NNR decision boundary, the 3-NNR decision boundary, and the 3-3-NNR no-decision region on the real line]
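Running the sketches above on this training set reproduces what the figure showed; the expected outputs in the comments follow from the distances (e.g., the 1-NNR boundary lies midway between the closest opposing samples, 0 and 1):

```python
train = [((0.0,), 1), ((-1.0,), 1), ((-2.0,), 1),   # D1, class 1
         ((1.0,), 2), ((1.0,), 2), ((1.0,), 2)]     # D2, class 2

print(nn_classify((0.4,), train))             # 1: nearest sample is 0 (NNR boundary at 0.5)
print(knn_classify((0.2,), train, k=3))       # 2: 3 nearest are {0, 1, 1}, majority is class 2
print(knn_classify((0.2,), train, k=3, l=3))  # None: the 3 neighbors disagree -> no decision
```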

Computational Efficiency

- To speed up NNR classification, the training set size can be reduced using the condensing algorithm (sketched below):
  - The training set is classified using the NNR rule
  - Misclassified samples are added to the new (condensed) training set one by one, until all training samples are correctly classified
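A sketch of the condensing loop, reusing nn_classify from above; seeding the condensed set with the first training sample is an assumed initialization detail that the slide does not specify:

```python
def condense(train_set):
    """Condensing: keep only the samples needed for correct NNR classification.

    Repeatedly classify all training samples against the condensed set and
    add each misclassified one, until a full pass produces no errors.
    """
    condensed = [train_set[0]]  # assumed seed: the first training sample
    changed = True
    while changed:
        changed = False
        for sample, label in train_set:
            if nn_classify(sample, condensed) != label:
                condensed.append((sample, label))
                changed = True
    return condensed
```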

Conclusions  Non parametric classification algorithms are easy to implement are computationally efficient (in training) don’t make any assumptions are prone to over-fitting are hard to adapt (no detailed model)