Presentation on theme: "No Free Lunch (NFL) Theorem Many slides are based on a presentation of Y.C. Ho Presentation by Kristian Nolde."— Presentation transcript:

1 No Free Lunch (NFL) Theorem. Many slides are based on a presentation by Y.C. Ho. Presentation by Kristian Nolde.

2 General notes
Goal: give an intuitive feeling for the NFL and present some mathematical background.
To keep in mind: NFL is an impossibility theorem, like
–Gödel's proof in mathematics (roughly: some facts cannot be proved or disproved within a given mathematical system)
–Arrow's theorem in economics (in principle, perfect democracy is not realizable)
Thus, practical use is limited ?!?

3 The No Free Lunch Theorem
Without specific structural assumptions, no optimization scheme can perform better than blind search on average.
But blind search is very inefficient!
Prob(at least one out of N samples is in the top-n of a search space of size |X|) ~ nN/|X|
e.g. Prob ~ 0.001 for |X| = 10^9, n = 1000, N = 1000
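The example can be checked numerically. Below is a minimal Python sketch (not part of the original slides; names are illustrative) comparing the exact hit probability, assuming N independent uniform samples drawn with replacement, against the nN/|X| approximation.

# Sketch: probability that N uniform random samples hit the top-n
# of a search space of size |X| at least once (sampling with replacement).

def hit_probability(space_size: int, top_n: int, samples: int) -> float:
    """Exact: 1 - P(every sample misses the top-n)."""
    return 1.0 - (1.0 - top_n / space_size) ** samples

def hit_probability_approx(space_size: int, top_n: int, samples: int) -> float:
    """First-order approximation n*N/|X|, valid when n*N << |X|."""
    return top_n * samples / space_size

if __name__ == "__main__":
    X, n, N = 10**9, 1000, 1000
    print(hit_probability(X, n, N))         # ~0.0009995
    print(hit_probability_approx(X, n, N))  # 0.001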

4 Assume a finite World
A finite number of input symbols (x's) and a finite number of output symbols (y's) => a finite number of possible mappings f from inputs to outputs, namely |Y|^|X| of them.

5 The Fundamental Matrix F
[Matrix F: one row per input x_1, ..., x_|X|, one column per mapping f_1, ..., f_|F|; entry (i, j) is f_j(x_i), drawn as 0's and 1's for a binary Y in the original figure.]
FACT: equal number of 0's and 1's in each row!
In each row, each value of Y appears |Y|^(|X|-1) times!
Averaged over all f, the value is independent of x!
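A small enumeration makes the row property concrete. The Python sketch below (illustrative, not from the slides) builds every mapping f: X -> Y for a toy X and Y, counts how often each y-value appears in each row of F, and checks that the row average is the same for every x.

# Sketch: columns of F are all mappings f: X -> Y; verify that each
# y-value appears |Y|**(|X|-1) times in every row of F.
from collections import Counter
from itertools import product

X = [0, 1, 2]   # toy input space, |X| = 3
Y = [0, 1]      # toy output space, |Y| = 2

# every possible mapping f, stored as the tuple (f(x_1), ..., f(x_|X|))
all_f = list(product(Y, repeat=len(X)))   # |Y|**|X| = 8 columns

for i, x in enumerate(X):
    row = [f[i] for f in all_f]           # row of F belonging to this x
    counts = Counter(row)
    print(f"x={x}: counts={dict(counts)}, row average={sum(row) / len(row)}")
    assert all(c == len(Y) ** (len(X) - 1) for c in counts.values())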

6 Compare Algorithms
Think of two algorithms, a_1 and a_2, e.g.
a_1 always selects from x_1 to x_(|X|/2)
a_2 always selects from x_(|X|/2) to x_|X|
For a specific f, a_1 or a_2 may be better. However, if f is not known, the average performance of both is equal:
Σ_f P(d_y | f, m, a_1) = Σ_f P(d_y | f, m, a_2)
where d is a sample of m points and d_y is the cost value associated with d.
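The equality can be verified by brute force on a toy problem. In the Python sketch below (illustrative, not from the slides), a_1 samples only the first half of X and a_2 only the second half; averaged over every possible cost function f, the best cost found is identical for both.

# Sketch: averaged over all f, two non-adaptive algorithms that sample
# disjoint halves of X achieve exactly the same average best cost.
from itertools import product

X_SIZE = 4           # toy search space x_1 .. x_4
Y = [0, 1, 2]        # toy cost values

a1_points = [0, 1]   # a_1 samples the first half of X
a2_points = [2, 3]   # a_2 samples the second half of X

def avg_best_cost(points):
    """Average, over all f: X -> Y, of the lowest cost among the sampled points."""
    costs = [min(f[i] for i in points) for f in product(Y, repeat=X_SIZE)]
    return sum(costs) / len(costs)

print(avg_best_cost(a1_points))   # 0.555... for both algorithms
print(avg_best_cost(a2_points))   # 0.555... for both algorithms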

7 Comparing Algorithms Continued
Case 1: Algorithms can be more specific, e.g. a_1 assumes a certain realization f_k.
Case 2: Or they can be more general, e.g. a_2 assumes a more uniform distribution over possible f.
Then the performance of a_1 will be excellent for f_k but catastrophic in all other cases (great performance, no robustness).
By contrast, a_2 performs only moderately in all cases, but it doesn't fail (poor performance, high robustness).
Common sense says: Robustness * Efficiency = Constant, or Generality * Depth = Constant.

8 Implication 1
Let x be the optimization variable, f the performance function, and y the performance, i.e., y = f(x). Then, averaged over all possible optimization problems, the result is choice-independent.
If you don't know the structure of f (which column you are dealing with), blind choice is as good as any!

9 Implications 2
Let X be the strategy (control law, decision rule) space, of size |decisions|^|information|, f the performance function, and y the performance, i.e., y = f(x).
The same conclusion holds for stochastic optimal control, adaptive control, decision theory, game theory, learning control, etc.
A "good" algorithm must be qualified!

10 Implications 2
Let X be the space of all possible representations (as in genetic algorithms), or the space of all possible algorithms to apply to a class of problems.
Without understanding of the problem, blind choice is as good as any.
"Understanding" means you know which column of the F matrix you are dealing with.

11 Implications 3
Even if you know which column or group of columns you are dealing with => you can specialize your choice of rows.
But you must accept that you will suffer LOSSES should other columns occur due to uncertainties or disturbances.

12 The Fundamental Matrix F
[Same matrix F as on slide 5: rows x_1, ..., x_|X|, columns f_1, ..., f_|F|, entries f_j(x_i).]
Assume a distribution of the columns, then pick a row that results in minimal expected losses or maximal performance. This is stochastic optimization.
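With an assumed prior over the columns, each row x has an expected cost and stochastic optimization simply picks the best row. A minimal Python sketch with illustrative values (not from the slides):

# Sketch: given a prior over the columns f_j, compute the expected cost
# of every row x_i and pick the row with minimal expected cost.
F = [
    [0, 1, 1, 0],   # costs of x_1 under f_1 .. f_4
    [1, 0, 1, 1],   # costs of x_2 under f_1 .. f_4
    [0, 0, 1, 1],   # costs of x_3 under f_1 .. f_4
]
prior = [0.4, 0.3, 0.2, 0.1]   # assumed probabilities of the columns

expected_cost = [sum(p * c for p, c in zip(prior, row)) for row in F]
best_row = min(range(len(F)), key=lambda i: expected_cost[i])

print("expected costs:", expected_cost)                      # [0.5, 0.7, 0.3]
print("stochastically optimal row: x_%d" % (best_row + 1))   # x_3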

13 Implications 5
Worse, if you estimate the probabilities incorrectly, then your stochastically optimized solution may suffer catastrophically bad outcomes more frequently than you would like.
Reason: you have already used up more of the good outcomes in your "optimal" choice. What is left are bad ones that were not supposed to occur! (HOT design & power laws, Doyle)
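The effect of misestimated probabilities can be illustrated with the same kind of toy matrix. In the Python sketch below (illustrative values, not from the slides), the row chosen under an assumed prior turns out to be worse than the row it displaced once the true prior applies.

# Sketch: picking a row under a wrong prior over the columns can yield a
# worse true expected cost than the row it displaced.
F = [
    [0, 0, 3],   # x_1: cheap under f_1 and f_2, very bad under f_3
    [1, 1, 1],   # x_2: mediocre but robust
]
assumed_prior = [0.5, 0.4, 0.1]   # f_3 believed to be rare
true_prior    = [0.2, 0.2, 0.6]   # f_3 is actually common

def expected_cost(row, prior):
    return sum(p * c for p, c in zip(prior, row))

pick = min(range(len(F)), key=lambda i: expected_cost(F[i], assumed_prior))
print("picked row: x_%d" % (pick + 1))                          # x_1
print("believed cost:", expected_cost(F[pick], assumed_prior))  # 0.3
print("true cost:", expected_cost(F[pick], true_prior))         # 1.8
print("true cost of the displaced row:", expected_cost(F[1 - pick], true_prior))  # 1.0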

14 Implications 6
Generality for generality's sake is not very fruitful. Working on a specific problem can be rewarding, because:
–the insight can be generalized
–the problem is practically important
–the 80-20 effect

