CSC2515 Fall 2008 Introduction to Machine Learning
Lecture 10a: Kernel density estimators and nearest neighbors

CSC2515 Fall 2008 Introduction to Machine Learning
Lecture 10a: Kernel density estimators and nearest neighbors
All lecture slides will be available as .ppt, .ps, and .htm.
Many of the figures are provided by Chris Bishop from his textbook "Pattern Recognition and Machine Learning".

Histograms as density models
For low dimensional data we can use a histogram as a density model.
–How wide should the bins be? (width = regulariser)
–Do we want the same bin-width everywhere?
–Do we believe the density is zero for empty bins?
[Figure: histogram estimates with a bin width that is too narrow and one that is too wide; the green curve is the true density.]

Some good and bad properties of histograms as density estimators
There is no need to fit a model to the data.
–We just compute some very simple statistics (the number of datapoints in each bin) and store them.
The number of bins is exponential in the dimensionality of the dataspace, so high-dimensional data is tricky:
–We must either use big bins or get lots of zero counts (or adapt the local bin-width to the density).
The density has silly discontinuities at the bin boundaries.
–We must be able to do better by some kind of smoothing.
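A minimal sketch of a one-dimensional histogram density estimate (the function name, bin count, and sample data are illustrative assumptions, not from the slides): each bin's density is its count divided by N times the bin width, so the estimate integrates to one.

```python
import numpy as np

def hist_density(data, num_bins=20):
    """Histogram density estimate: per-bin count / (N * bin width)."""
    counts, edges = np.histogram(data, bins=num_bins)
    widths = np.diff(edges)
    density = counts / (len(data) * widths)   # integrates to 1 over the bins
    return density, edges

# Example: estimate the density of 200 points drawn from a unit Gaussian
rng = np.random.default_rng(0)
density, edges = hist_density(rng.normal(size=200), num_bins=20)
```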

Local density estimators
Estimate the density in a small region to be p(x) ≈ K / (N V), where K is the number of points in the region, N is the total number of points, and V is the volume of the region.
–Problem 1: the estimate has high variance if K is small.
–Problem 2: variation of the true density across the region goes unmodelled if V is big compared with the smoothness of the true density.
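As a hedged illustration of the K/(NV) estimate with a fixed region, the sketch below counts the points falling in a small hypercube of side h around a query point; the cube shape, helper name, and parameter values are assumptions made for the example. Shrinking h illustrates Problem 1 (few points, high variance), while growing it illustrates Problem 2 (the true density varies across the region).

```python
import numpy as np

def local_density(data, x, h=0.5):
    """Estimate p(x) as K / (N * V) using a hypercube of side h centred at x."""
    data, x = np.asarray(data, float), np.asarray(x, float)
    inside = np.all(np.abs(data - x) <= h / 2, axis=1)   # which points fall in the region
    K = inside.sum()                                      # points in region
    N = len(data)                                         # total points
    V = h ** data.shape[1]                                # volume of region
    return K / (N * V)

# Example in two dimensions
rng = np.random.default_rng(1)
print(local_density(rng.normal(size=(500, 2)), x=[0.0, 0.0], h=0.5))
```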

Kernel density estimators
Use regions centered on the datapoints:
–Allow the regions to overlap.
–Let each individual region contribute a total density of 1/N.
–Use regions with soft edges to avoid discontinuities (e.g. isotropic Gaussians).
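A minimal kernel density estimator with isotropic Gaussian kernels, in the spirit of this slide: each of the N datapoints contributes a Gaussian bump that integrates to 1/N. The function name, bandwidth value, and test data are assumptions for illustration; the bandwidth plays the same smoothing role as the histogram's bin width, which is what the next slide's figure shows.

```python
import numpy as np

def gaussian_kde(query, data, bandwidth=0.3):
    """Sum of isotropic Gaussian kernels, one per datapoint, each contributing 1/N."""
    query, data = np.atleast_2d(query), np.atleast_2d(data)
    N, d = data.shape
    # squared distance from every query point to every datapoint
    sq_dist = ((query[:, None, :] - data[None, :, :]) ** 2).sum(axis=-1)
    norm = (2 * np.pi * bandwidth ** 2) ** (d / 2)        # makes each kernel integrate to 1
    kernels = np.exp(-sq_dist / (2 * bandwidth ** 2)) / norm
    return kernels.mean(axis=1)                           # mean = sum of 1/N contributions

# Example: evaluate the estimate at a few points for 1-D data
rng = np.random.default_rng(2)
data = rng.normal(size=(100, 1))
print(gaussian_kde(np.linspace(-3, 3, 5).reshape(-1, 1), data))
```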

The density modeled by a kernel density estimator
[Figure: kernel density estimates with a kernel width that is too narrow and one that is too wide, compared with the true density.]

Nearest neighbor methods for density estimation
Vary the size of a hyper-sphere around each test point so that exactly K training datapoints fall inside the hyper-sphere, then estimate the density as p(x) ≈ K / (N V), where K is the number of points in the region, N is the total number of points, and V is the volume of the region (here, the hyper-sphere).
–Does this give a fair estimate of the density?
Nearest neighbors is usually used for classification or regression:
–For regression, average the predictions of the K nearest neighbors.
–For classification, pick the class with the most votes. How should we break ties?
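A sketch of the K-nearest-neighbour density estimate described above: grow a hypersphere around each query point until it contains exactly K training points, then apply p ≈ K/(NV) with V the sphere's volume. The helper name and parameter values are assumptions; the volume of a d-dimensional ball uses the standard gamma-function formula. One answer to the slide's question: the resulting estimate does not integrate to one over all space, so it is not a proper density.

```python
import numpy as np
from math import gamma, pi

def knn_density(query, data, K=5):
    """p(x) ~ K / (N * V), with V the volume of the smallest hypersphere
    around x that contains the K nearest training points."""
    query, data = np.atleast_2d(query), np.atleast_2d(data)
    N, d = data.shape
    dists = np.sqrt(((query[:, None, :] - data[None, :, :]) ** 2).sum(axis=-1))
    radius = np.sort(dists, axis=1)[:, K - 1]             # distance to the K-th neighbour
    volume = (pi ** (d / 2) / gamma(d / 2 + 1)) * radius ** d
    return K / (N * volume)

# Example: two query points against 500 Gaussian-distributed training points
rng = np.random.default_rng(3)
data = rng.normal(size=(500, 2))
print(knn_density([[0.0, 0.0], [2.0, 2.0]], data, K=10))
```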

Nearest neighbor methods for classification and regression
Nearest neighbors is usually used for classification or regression:
–For regression, average the predictions of the K nearest neighbors. How should we pick K?
–For classification, pick the class with the most votes. How should we break ties? One option is to let the k'th nearest neighbor contribute a count that falls off with k (for example, a weight that decreases with the rank k).
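A minimal sketch of weighted K-nearest-neighbour prediction in the spirit of the last bullet: the k'th closest neighbour contributes a weight that falls off with its rank. The specific 1/k weight, the function name, and the toy data are assumptions, since the slide leaves the exact form open; because later neighbours get smaller weights, exact ties in the vote become much less likely.

```python
import numpy as np

def knn_predict(x, data, targets, K=3, classify=True):
    """Weighted K-NN: the k-th nearest neighbour gets weight 1/k."""
    data, targets = np.asarray(data), np.asarray(targets)
    dists = np.sqrt(((data - np.asarray(x)) ** 2).sum(axis=1))
    order = np.argsort(dists)[:K]                # indices of the K nearest neighbours
    weights = 1.0 / np.arange(1, K + 1)          # contribution falls off with rank k
    if classify:
        votes = {}                               # weighted vote per class (soft tie-breaking)
        for idx, w in zip(order, weights):
            votes[targets[idx]] = votes.get(targets[idx], 0.0) + w
        return max(votes, key=votes.get)
    return np.average(targets[order], weights=weights)   # regression: weighted average

# Example: classify a point, then regress on the same neighbours
rng = np.random.default_rng(4)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
print(knn_predict([0.5, 0.5], X, y, K=5))
print(knn_predict([0.5, 0.5], X, X.sum(axis=1), K=5, classify=False))
```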

The decision boundary implemented by 3NN
Each piece of the boundary is a segment of the perpendicular bisector of the line between two training points (a Voronoi tessellation).
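One practical way to see this boundary (a judgment-call addition, not from the slides): label every point of a dense grid with its 3-NN vote and look where the predicted class changes. The function name, grid resolution, and the assumption of integer class labels are all illustrative.

```python
import numpy as np

def knn_grid_labels(X, y, K=3, resolution=100):
    """Label a 2-D grid by majority vote of the K nearest training points,
    so the piecewise-linear class boundary can be visualised."""
    xs = np.linspace(X[:, 0].min(), X[:, 0].max(), resolution)
    ys = np.linspace(X[:, 1].min(), X[:, 1].max(), resolution)
    grid = np.stack(np.meshgrid(xs, ys), axis=-1).reshape(-1, 2)
    dists = ((grid[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    nearest = np.argsort(dists, axis=1)[:, :K]            # K nearest training points per cell
    labels = np.array([np.bincount(y[row]).argmax() for row in nearest])
    return labels.reshape(resolution, resolution)          # plot with e.g. plt.contourf
```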

Regions defined by using various numbers of neighbors