# MACHINE LEARNING 9. Nonparametric Methods

Lecture Notes for E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1)

## Introduction

- Estimating distributions for
  - Classification
  - Regression
  - Clustering
- Parametric: assume a model, find the optimum parameters from data
  - ML, MAP, least squares
- Semi-parametric: assume a mixture of distributions; use EM/clustering to find the parameters

## Non-parametric estimation

- We cannot always assume a model for the distribution densities
  - The true model might be very complicated, with a large number of parameters
  - Assuming the wrong model leads to large error
- Nonparametric estimation principle: “Similar inputs have similar outputs”
- Find similar data instances in the training data and interpolate/average their outputs

## Nonparametric estimation

- Parametric estimation: all data instances affect the final global estimate
  - Global methods
- Non-parametric estimation
  - No single global model
  - Local models are created as needed
  - Affected only by nearby instances
- Also called instance-based or memory-based methods

## Memory-based method

- Lazy method
  - Store training data of size N
  - O(N) memory
  - O(N) search for similar data
- Eager method: parametric methods
  - d parameters, d < N
  - O(d) memory and processing

## Local methods and “curse of dimensionality”

- 500 2D points in the unit square give a fairly good picture of the density
- A single 1000-dimensional vector does not carry enough information about the joint distribution of 1000 random variables
- Many more samples are needed in higher-dimensional spaces

## Density Estimator

- Given the training set $X = \{x^t\}_t$ drawn iid from $p(x)$
- Cumulative distribution
- For density estimation, select a length h
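The formulas on this slide did not survive extraction. As a sketch, assuming the usual counting notation (the symbol $\#\{\cdot\}$ for "number of training points satisfying the condition" is an assumption here, as is the exact form used on the original slide), the cumulative distribution and the density can be estimated as:

$$
\hat{F}(x) = \frac{\#\{x^t \le x\}}{N}, \qquad
\hat{p}(x) = \frac{1}{h}\left[\frac{\#\{x^t \le x + h\} - \#\{x^t \le x\}}{N}\right]
$$

Read this way, the density estimate is a finite difference of the estimated cumulative distribution over an interval of length h.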

## Histogram Estimator

- Divide the input space into equal-sized bins, given an origin $x_0$
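A minimal sketch of a histogram estimate for one-dimensional data, assuming the standard definition (number of training points in the bin that contains x, divided by N times the bin width h). The function name `histogram_density` and the sample values are illustrative, not taken from the slides.

```python
import math

def histogram_density(x, sample, h, x0=0.0):
    """Histogram estimate at x: count in the bin containing x, over N * h."""
    # Bins are [x0 + m*h, x0 + (m+1)*h); find the index m of the bin holding x.
    bin_index = math.floor((x - x0) / h)
    count = sum(1 for xt in sample if math.floor((xt - x0) / h) == bin_index)
    return count / (len(sample) * h)

sample = [0.1, 0.2, 0.3, 0.6, 1.2, 1.4]   # made-up training points
print(histogram_density(0.25, sample, h=0.5))
print(histogram_density(1.30, sample, h=0.5))
```

The estimate is constant inside each bin and jumps at the bin boundaries, so both the origin `x0` and the width `h` change the result.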

## Naïve Estimator

- Each training sample has a symmetric region of influence of size h
- It contributes 1 to any x that falls into its region of influence
- This region of influence is “hard”, not continuous
- Soft influence
  - Contribute as a function of distance
  - Training samples that are close to the input contribute more
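The hard region of influence can be sketched as follows. This assumes a window that extends h on each side of x (total width 2h), which is one common convention; other presentations use a total width of h, and the slide's own formula was not preserved. The name `naive_density` and the data are illustrative.

```python
def naive_density(x, sample, h):
    """Naive estimate at x: each point within h of x contributes exactly 1."""
    count = sum(1 for xt in sample if x - h < xt <= x + h)
    # Divide by N and by the window width (2h under the assumption above).
    return count / (2 * len(sample) * h)

sample = [0.1, 0.2, 0.3, 0.6, 1.2, 1.4]   # made-up training points
print(naive_density(0.25, sample, h=0.25))
```

Unlike the histogram, the window is centered on the query point, so no origin needs to be chosen; the estimate still jumps whenever a training point enters or leaves the window.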

## Kernel Estimator

- Kernel function, e.g., Gaussian kernel
- Kernel estimator (Parzen windows)
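A small sketch of a Parzen-window estimate with a Gaussian kernel, assuming the usual one-dimensional form: the average of kernel values at (x - x^t)/h, divided by h. The function names and sample values are illustrative.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used as a smooth weighting function."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kernel_density(x, sample, h):
    """Kernel (Parzen window) estimate at x with bandwidth h."""
    total = sum(gaussian_kernel((x - xt) / h) for xt in sample)
    return total / (len(sample) * h)

sample = [0.1, 0.2, 0.3, 0.6, 1.2, 1.4]   # made-up training points
print(kernel_density(0.25, sample, h=0.3))
```

Because every training point contributes a smooth bump, the resulting estimate is itself smooth; a larger `h` widens the bumps and flattens the estimate.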

## K-Nearest Neighbors

- Histogram/kernel methods use a uniform bin size
- We actually want the bin size to be small where there are many samples in the neighborhood
- Sort the distances to the training samples

## k-Nearest Neighbor Estimator

- Instead of fixing the bin width h and counting the number of instances, fix the number of instances (neighbors) k and compute the bin width $d_k(x)$, the distance from x to its kth closest instance
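A minimal sketch of the k-nearest neighbor density estimate in one dimension, assuming the common form k / (2 N d_k(x)), where 2 d_k(x) plays the role of the bin width. The name `knn_density` and the data are illustrative.

```python
def knn_density(x, sample, k):
    """k-NN density estimate at x: k over (N times the adaptive width 2*d_k)."""
    distances = sorted(abs(x - xt) for xt in sample)
    d_k = distances[k - 1]            # distance to the k-th closest instance
    return k / (2 * len(sample) * d_k)

sample = [0.0, 1.0, 2.0, 4.0]         # made-up training points
print(knn_density(1.5, sample, k=2))
```

The width adapts to the data: it is small where points are dense and large where they are sparse. Note that this sketch divides by zero if x coincides with k or more training points.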

## Multivariate Data

- Kernel density estimator
- Multivariate Gaussian kernel
  - Spheric
  - Ellipsoid
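As a sketch of the spheric case only, assuming the standard d-dimensional Gaussian kernel with a single bandwidth h shared by all dimensions (the ellipsoid case would replace the Euclidean distance with one based on a covariance matrix and is not shown). The function name and data are illustrative.

```python
import math

def multivariate_kernel_density(x, sample, h):
    """Kernel estimate at x in d dimensions with a spheric Gaussian kernel."""
    d = len(x)
    norm = (2.0 * math.pi) ** (-d / 2.0)      # Gaussian normalizing constant
    total = sum(
        norm * math.exp(-0.5 * (math.dist(x, xt) / h) ** 2) for xt in sample
    )
    return total / (len(sample) * h ** d)

sample = [(0.0, 0.0), (1.0, 0.5), (0.5, 1.5)]  # made-up 2D training points
print(multivariate_kernel_density((0.5, 0.5), sample, h=0.5))
```

The division by `h ** d` shows why bandwidth choice becomes more delicate as the dimension grows: the volume of each window changes with the d-th power of h.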

## Nonparametric Classification

- Training data “vote” for the input label
  - Closer points get more influence
- Kernel
  - Weight votes according to distance
- k-NN
  - Weight the k closest points
- 1-NN
  - Find the closest point
  - Assign the label of this point to the input

## Nonparametric Classification

- Estimate $p(x \mid C_i)$ and use Bayes’ rule
- Kernel estimator
- k-NN estimator
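One way to sketch the k-NN variant is a plain majority vote among the k closest training points, which is what the k-NN density estimate combined with Bayes' rule reduces to under the usual derivation. The function name `knn_classify`, the unweighted vote, and the toy data are assumptions for illustration.

```python
import math
from collections import Counter

def knn_classify(x, points, labels, k):
    """Assign x the most common label among its k nearest training points."""
    order = sorted(range(len(points)), key=lambda i: math.dist(x, points[i]))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]   # toy data
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_classify((0.2, 0.3), points, labels, k=3))
print(knn_classify((5.5, 5.5), points, labels, k=1))        # 1-NN
```

With `k=1` this is the nearest-neighbor rule from the previous slide; a kernel version would replace the equal votes with weights that decay with distance.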

## Condensed Nearest Neighbor

- Time/space complexity of k-NN is O(N)
- Find a subset Z of X that is small and still accurate in classifying X
- Use 1-NN

## Condensed Nearest Neighbor

- Incremental algorithm: add an instance if needed
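A sketch of one greedy way to build the subset Z incrementally: pass over the data in random order and store an instance only when the current Z misclassifies it with 1-NN, repeating until a full pass adds nothing. The function names, the random ordering, and the toy data are assumptions; the slide's own pseudocode was not preserved.

```python
import math
import random

def nearest_label(x, points, labels, stored):
    """Label of the stored instance closest to x (1-NN over the subset Z)."""
    j = min(stored, key=lambda i: math.dist(x, points[i]))
    return labels[j]

def condense(points, labels, seed=0):
    """Greedy condensing: add an instance to Z only if Z misclassifies it."""
    rng = random.Random(seed)
    order = list(range(len(points)))
    stored = []                      # indices of the instances kept in Z
    changed = True
    while changed:                   # repeat until a full pass adds nothing
        changed = False
        rng.shuffle(order)
        for i in order:
            if i in stored:
                continue
            if not stored or nearest_label(points[i], points, labels, stored) != labels[i]:
                stored.append(i)
                changed = True
    return stored

points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
Z = condense(points, labels)
print(len(Z), "of", len(points), "instances kept")
```

The result depends on the visiting order, and the procedure is a local search: it yields a consistent subset, not necessarily the smallest one.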

## Nonparametric Regression

- Also known as smoothing models
- Take several closest points and weight/average their outputs
- Parametric model: find polynomial coefficients and evaluate the input on the fitted function
- Regressogram: take the average of the outputs in the same bin
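A minimal regressogram sketch for one-dimensional inputs: the prediction at x is the mean of the training outputs whose inputs fall in the same bin as x. The origin `x0`, the choice to return `None` for an empty bin, and the data are assumptions made for illustration.

```python
import math

def regressogram(x, xs, rs, h, x0=0.0):
    """Average the outputs r^t of training points in the bin containing x."""
    bin_index = math.floor((x - x0) / h)
    in_bin = [r for xt, r in zip(xs, rs) if math.floor((xt - x0) / h) == bin_index]
    if not in_bin:
        return None                  # no training data in this bin
    return sum(in_bin) / len(in_bin)

xs = [0.1, 0.2, 0.6, 0.7]            # made-up inputs
rs = [1.0, 3.0, 10.0, 20.0]          # made-up outputs
print(regressogram(0.3, xs, rs, h=0.5))
print(regressogram(0.9, xs, rs, h=0.5))
```

As with the histogram density estimate, the fitted function is piecewise constant and discontinuous at the bin boundaries.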

## Kernel Smoother
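The body of this slide was not preserved. As a sketch, a kernel smoother is commonly written as a weighted average of the training outputs, with weights given by a kernel of the scaled distance; the Gaussian kernel, the function name, and the data below are assumptions.

```python
import math

def kernel_smoother(x, xs, rs, h):
    """Weighted average of outputs; nearer training inputs get larger weights."""
    weights = [math.exp(-0.5 * ((x - xt) / h) ** 2) for xt in xs]
    total = sum(weights)
    return sum(w * r for w, r in zip(weights, rs)) / total

xs = [0.0, 1.0, 2.0, 3.0]            # made-up inputs
rs = [0.0, 1.0, 4.0, 9.0]            # made-up outputs
print(kernel_smoother(1.5, xs, rs, h=0.5))
```

The kernel's normalizing constant cancels between numerator and denominator, so it is omitted. If x is very far from every training input, all weights can underflow to zero and the division fails; a production version would need to guard against that.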

## Running Line Smoother

- Fit a line locally
- Can take distances into account (kernel)
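A sketch of the kernel-weighted variant: at each query point, fit a straight line by weighted least squares, with Gaussian weights that decay with distance from x, then evaluate that line at x. The function name, the Gaussian weighting, and the data are illustrative assumptions.

```python
import math

def running_line(x, xs, rs, h):
    """Fit a weighted least-squares line around x and evaluate it at x."""
    w = [math.exp(-0.5 * ((x - xt) / h) ** 2) for xt in xs]
    sw = sum(w)
    mean_x = sum(wi * xt for wi, xt in zip(w, xs)) / sw
    mean_r = sum(wi * r for wi, r in zip(w, rs)) / sw
    sxx = sum(wi * (xt - mean_x) ** 2 for wi, xt in zip(w, xs))
    sxr = sum(wi * (xt - mean_x) * (r - mean_r) for wi, xt, r in zip(w, xs, rs))
    slope = sxr / sxx if sxx > 0 else 0.0   # fall back to a constant fit
    return mean_r + slope * (x - mean_x)

xs = [0.0, 1.0, 2.0, 3.0, 4.0]       # made-up inputs
rs = [1.0, 3.0, 5.0, 7.0, 9.0]       # outputs on the line r = 2x + 1
print(running_line(2.5, xs, rs, h=1.0))
```

Compared with a local average, a local line can follow a trend inside the window, which reduces the bias near the edges of the data.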

## How to Choose k or h?

- When k or h is small, single instances matter; bias is small and variance is large (undersmoothing): high complexity
- As k or h increases, we average over more instances; variance decreases but bias increases (oversmoothing): low complexity
- Cross-validation is used to fine-tune k or h
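One possible way to carry out the cross-validation step is leave-one-out: predict each training output from all the other points and pick the bandwidth with the lowest mean squared error. The sketch below does this for a Gaussian kernel smoother, defined inline so the block is self-contained; the candidate values, the fallback for zero weights, and the synthetic data are assumptions.

```python
import math

def kernel_smoother(x, xs, rs, h):
    """Gaussian-weighted average of outputs (falls back to the plain mean)."""
    weights = [math.exp(-0.5 * ((x - xt) / h) ** 2) for xt in xs]
    total = sum(weights)
    if total == 0.0:                 # all weights underflowed
        return sum(rs) / len(rs)
    return sum(w * r for w, r in zip(weights, rs)) / total

def loo_error(xs, rs, h):
    """Leave-one-out mean squared error of the smoother for bandwidth h."""
    total = 0.0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_r = rs[:i] + rs[i + 1:]
        total += (rs[i] - kernel_smoother(xs[i], rest_x, rest_r, h)) ** 2
    return total / len(xs)

def choose_h(xs, rs, candidates):
    """Return the candidate bandwidth with the smallest leave-one-out error."""
    return min(candidates, key=lambda h: loo_error(xs, rs, h))

xs = [i / 10 for i in range(21)]     # synthetic inputs on [0, 2]
rs = [math.sin(x) for x in xs]       # synthetic noise-free outputs
candidates = [0.05, 0.2, 1.0, 5.0]
print(choose_h(xs, rs, candidates))
```

The same loop works for choosing k in a k-NN method by swapping the predictor and the candidate list. For large N, k-fold cross-validation is a cheaper alternative to leave-one-out.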

## Classifications
