Published by Kaleb Snipes. Modified over 2 years ago.

1
Curse of Dimensionality
Prof. Navneet Goyal
Dept. of Computer Science & Information Systems
BITS - Pilani

2
Curse of Dimensionality!!
– Poses serious challenges!
– An important factor influencing the design of pattern recognition techniques
– Example data: a mixture of oil, water & gas in one of three flow configurations (homogeneous, annular & laminar)
– Each data point is a point in a 12-dimensional space
– The plot shows 100 points along only two dimensions, x6 & x7
– A new point is marked x: can we predict its class?
Reference: Christopher M. Bishop, Pattern Recognition & Machine Learning, Springer, 2006

3
Curse of Dimensionality!!
– It is unlikely that x belongs to the blue class!
– It is surrounded by a lot of red points
– There are also many green points nearby
– Intuition: the identity of x should be determined strongly by nearby points and less strongly by more distant points
– How can we turn this intuition into a learning algorithm?

4
Curse of Dimensionality!!
– Make grid lines: divide the input space into regular cells
– Use majority voting: assign a new point the most common class among the training points in its cell
– Problems??
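
The grid-and-vote idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names (`fit_grid`, `cell_of`, `predict`), the unit-square range and the toy data are hypothetical and do not come from the slides or from Bishop.

```python
from collections import Counter, defaultdict

def cell_of(point, bins, lo, width):
    # Index of the grid cell containing `point`; clamp so that a
    # coordinate equal to the upper bound falls in the last cell.
    return tuple(min(bins - 1, max(0, int((x - lo) / width))) for x in point)

def fit_grid(points, labels, bins, lo=0.0, hi=1.0):
    """Count class labels per cell; each axis of [lo, hi] is cut into `bins` intervals."""
    width = (hi - lo) / bins
    cells = defaultdict(Counter)
    for p, y in zip(points, labels):
        cells[cell_of(p, bins, lo, width)][y] += 1
    return cells

def predict(cells, point, bins, lo=0.0, hi=1.0):
    """Majority class of the cell containing `point`, or None if that cell is empty."""
    width = (hi - lo) / bins
    votes = cells.get(cell_of(point, bins, lo, width))
    return votes.most_common(1)[0][0] if votes else None

# Hypothetical toy data in the unit square (not the oil-flow data).
train = [(0.1, 0.1), (0.15, 0.2), (0.2, 0.05), (0.8, 0.9), (0.85, 0.8)]
labels = ["red", "red", "green", "blue", "blue"]
cells = fit_grid(train, labels, bins=4)

print(predict(cells, (0.12, 0.12), bins=4))  # red
print(predict(cells, (0.5, 0.5), bins=4))    # None: no training point in this cell
```

The second query shows the weakness the next slide discusses: a cell with no training points gives no vote at all.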

5
Curse of Dimensionality
– The number of cells grows exponentially with the number of dimensions D
– An exponentially large number of training data points is needed so that the cells are not empty
– Not a good approach for more than a few dimensions!
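
A quick calculation makes the exponential growth concrete. The sketch below assumes 10 intervals per axis, which is an illustrative choice and not a figure from the slides; the 100 points and the 12 dimensions are taken from the example data. The helper names are hypothetical.

```python
def num_cells(bins_per_axis, dims):
    # A regular grid with `bins_per_axis` intervals on each of `dims` axes.
    return bins_per_axis ** dims

def max_occupied_fraction(n_points, bins_per_axis, dims):
    # Upper bound on the fraction of cells that can contain a training
    # point: each point occupies at most one cell.
    return min(1.0, n_points / num_cells(bins_per_axis, dims))

for dims in (1, 2, 3, 12):
    print(dims, num_cells(10, dims), max_occupied_fraction(100, 10, dims))
```

With 100 points, every cell of a 2-dimensional grid could in principle be occupied, but in 12 dimensions at most one cell in ten billion can be.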

6
Curse of Dimensionality
Solutions??
– Dimensionality reduction
– Develop algorithms that are not affected by the curse of dimensionality

7
Curse of Dimensionality
Problems:
– running time
– over-fitting
– number of samples required
Reference: CS434a/541a: Pattern Recognition - Prof. Olga Veksler

8
Running Time
– Complexity (running time) increases with dimension d!
– A lot of methods have at least O(nd^2) complexity (n = number of samples), e.g. estimation of the covariance matrix
– With large d, O(nd^2) complexity may be too costly
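
To see where the O(nd^2) cost of covariance estimation comes from, here is a minimal pure-Python sketch that counts the multiply-add steps. The function name `covariance` and the operation counter are illustrative assumptions; a real implementation would normally use a numerical library.

```python
def covariance(samples):
    """Sample covariance of `samples` (n rows, d columns) and a multiply-add count."""
    n, d = len(samples), len(samples[0])
    mean = [sum(row[j] for row in samples) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    ops = 0
    for row in samples:                      # n samples
        centered = [row[j] - mean[j] for j in range(d)]
        for i in range(d):                   # d rows of the matrix
            for j in range(d):               # d columns of the matrix
                cov[i][j] += centered[i] * centered[j]
                ops += 1
    cov = [[v / (n - 1) for v in r] for r in cov]  # unbiased estimate
    return cov, ops

cov, ops = covariance([[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]])
print(cov)  # [[4.0, 8.0], [8.0, 16.0]]
print(ops)  # 12, i.e. n * d * d with n = 3, d = 2
```

Exploiting the symmetry of the matrix would roughly halve the work, but the count still grows as n times d squared.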

9
Number of Samples
– Suppose we want to use the nearest neighbor approach with k = 1 (1NN)
– Suppose we start with only one feature
– This feature is not discriminative, i.e. it does not separate the classes well
– So we use 2 features
– The 1NN method needs a lot of samples, i.e. the samples have to be dense
– To maintain the same density as in 1D (9 samples per unit length), how many samples do we need?

10
Number of Samples
– We need 9^2 = 81 samples to maintain the same density as in 1D

11
Number of Samples
– When we go from 1 feature to 2, no one gives us more samples; we still have 9
– This is way too sparse for 1NN to work well
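
The sparsity can be checked with a small simulation: keep 9 samples and measure how far each one is, on average, from its nearest neighbor as features are added. This is a sketch under assumptions that are not in the slides: samples are drawn uniformly from the unit cube, and the function name, trial count and seed are arbitrary.

```python
import math
import random

def mean_nn_distance(n, d, trials=200, seed=0):
    """Average distance from a sample to its nearest neighbor,
    for n points drawn uniformly from the d-dimensional unit cube."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        pts = [[rng.random() for _ in range(d)] for _ in range(n)]
        for i, p in enumerate(pts):
            total += min(math.dist(p, q) for j, q in enumerate(pts) if j != i)
    return total / (trials * n)

for d in (1, 2, 3):
    print(d, round(mean_nn_distance(9, d), 3))
```

With the number of samples fixed at 9, the average nearest-neighbor distance grows with each added feature, so the "nearest" neighbor says less and less about the local class.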

12
Number of Samples
– Things go from bad to worse if we decide to use 3 features
– If 9 was dense enough in 1D, in 3D we need 9^3 = 729 samples!

13
Number of Samples
– In general, if n samples are dense enough in 1D, then in d dimensions we need n^d samples!
– n^d grows really, really fast as a function of d
– Common pitfall: if we can't solve a problem with a few features, adding more features seems like a good idea
– However, the number of samples usually stays the same
– The method with more features is then likely to perform worse, not better as expected
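
The n^d rule can be read in both directions, as the short sketch below illustrates. The helper names are hypothetical; the value 9 is the per-unit-length density used in the slides.

```python
def samples_needed(n_per_axis, d):
    # Samples required in d dimensions to keep the 1D density of n_per_axis.
    return n_per_axis ** d

def effective_per_axis(n_samples, d):
    # Density per axis that a fixed budget of n_samples buys in d dimensions.
    return n_samples ** (1.0 / d)

for d in (1, 2, 3, 10):
    print(d, samples_needed(9, d))
# 1 9
# 2 81
# 3 729
# 10 3486784401

print(effective_per_axis(9, 2))  # 3.0: nine samples in 2D are only as dense as three in 1D
```

The second function captures the pitfall: adding features without adding samples lowers the effective density along every axis.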

14
Number of Samples
– For a fixed number of samples, consider the graph of classification error as we add features
[Figure: classification error versus number of features]
– Thus for each fixed sample size n, there is an optimal number of features to use
