
1 Stochastic Optimization and Simulated Annealing
Psychology 85-419/719
January 25, 2001

2 In Previous Lecture...
Discussed constraint satisfaction networks, having:
–Units, weights, and a “goodness” function
Updating states involves computing input from other units
–Guaranteed to locally increase goodness
–Not guaranteed to globally increase goodness

3 The General Problem: Local Optima
[Figure: goodness as a function of activation state, with local optima and the true optimum marked.]

4 How To Solve the Problem of Local Optima?
Exhaustive search?
–Nah. Takes too long. n binary units have 2 to the nth power possible states.
Random re-starts?
–Seems wasteful.
How about something that generally goes in the right direction, with some randomness?

5 Sometimes It Isn’t Best To Always Go Straight Towards The Goal
Rubik’s Cube: undo some moves in order to make progress
Baseball: sacrifice fly
Navigation: move away from the goal to get around obstacles

6 Randomness Can Help Us Escape Bad Solutions
[Figure: a goodness landscape over activation states.]

7 So, How Random Do We Want to Be?
We can take a cue from physical systems.
In metallurgy, a metal can reach a very strong (stable) state by:
–Melting it, which scrambles its molecular structure
–Gradually cooling it
–The resulting molecular structure is very stable
New terminology: we reduce energy (which is roughly the negative of goodness).

8 Simulated Annealing
The probability that a unit is on is a function of:
–The input to the unit, net
–The temperature, T
Specifically, p(on) = 1 / (1 + e^(-net/T)).

9 Picking it Apart...
As net increases, the probability that the output is 1 increases.
–e is raised to the negative of net/T, so as net gets big, e^(-net/T) goes to zero, and the probability goes to 1/(1+0) = 1.

10 The Temperature Term
When T is big, the exponent of e goes to zero.
e (or anything) to the zero power is 1.
So the probability that the output is 1 goes to 1/(1+1) = 0.5.

11 The Temperature Term (2)
When T gets small, the exponent gets big.
The effect of net becomes amplified.

12 Different Temperatures...
[Figure: probability that the output is 1 (axis marked 0, .5, 1) as a function of net input, plotted at high, medium, and low temperatures.]
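To make the three curves on the slide concrete, here is a minimal Python sketch (not from the original lecture) that evaluates the unit's on-probability at a few temperatures; the function name p_on and the specific temperature and net-input values are illustrative assumptions.

```python
import math

def p_on(net, T):
    """Probability that a unit turns on, given its net input and temperature T."""
    return 1.0 / (1.0 + math.exp(-net / T))

# At high T the curve stays near 0.5; at low T it approaches a step function.
for T in (10.0, 1.0, 0.1):
    probs = [round(p_on(net, T), 3) for net in (-2, -1, 0, 1, 2)]
    print(f"T={T}: {probs}")
```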

13 Ok, So At What Rate Do We Reduce Temperature?
In general, we must decrease it very slowly to guarantee convergence to the global optimum.
[Figure: an annealing schedule, with temperature T falling gradually over time (axis marked 0, 50, 100).]
In practice, we can get away with a more aggressive annealing schedule.
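As an illustration of a "more aggressive" schedule, here is a small hedged sketch in Python using geometric cooling; the starting temperature, floor, and decay factor alpha are assumptions chosen for demonstration, not values from the course.

```python
def geometric_schedule(T_start=10.0, T_min=0.1, alpha=0.9):
    """Yield temperatures, multiplying by alpha each step.

    An alpha closer to 1 gives the slower cooling that convergence
    guarantees require; this value is deliberately aggressive.
    """
    T = T_start
    while T > T_min:
        yield T
        T *= alpha

temps = list(geometric_schedule())
print(len(temps), "steps, from", round(temps[0], 2), "down to", round(temps[-1], 3))
```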

14 Putting it Together...
We can represent facts, etc. as units.
Knowledge about these facts is encoded in the weights.
Network processing fills in gaps, makes inferences, and forms interpretations.
Stable attractors form; the weights and input sculpt these attractors.
Stability (and goodness) is enhanced by randomness in the updating process.
A minimal sketch of such a network appears below.
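The following is a minimal, self-contained sketch of the whole idea, assuming a tiny hand-made weight matrix, external inputs, and a geometric cooling schedule; it is illustrative only and not the lab software mentioned later in the slides.

```python
import math, random

# Symmetric weights among 4 binary units, plus external inputs (illustrative values).
weights = [[0, 2, -1, 0],
           [2, 0, 0, -1],
           [-1, 0, 0, 2],
           [0, -1, 2, 0]]
external = [0.5, 0.0, 0.0, 0.5]
state = [random.choice([0, 1]) for _ in range(4)]

def net_input(i):
    """Weighted sum of the other units' states plus the external input to unit i."""
    return sum(weights[i][j] * state[j] for j in range(4)) + external[i]

def goodness():
    """Goodness = sum over pairs of w_ij * s_i * s_j, plus external contributions."""
    pairs = sum(weights[i][j] * state[i] * state[j]
                for i in range(4) for j in range(i + 1, 4))
    return pairs + sum(external[i] * state[i] for i in range(4))

T = 5.0
while T > 0.05:
    i = random.randrange(4)                        # pick a unit at random
    p = 1.0 / (1.0 + math.exp(-net_input(i) / T))  # probability it should be on
    state[i] = 1 if random.random() < p else 0     # stochastic update
    T *= 0.95                                      # gradually cool

print("final state:", state, "goodness:", goodness())
```

Because the updates are stochastic, different runs can settle into different attractors; cooling more slowly makes ending up in the highest-goodness state more likely.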

15 Stable Attractors Can Be Thought Of As Memories
How many stable patterns can be remembered by a network with N units?
There are 2 to the N possible patterns...
...but only about 0.15*N will be stable.
To remember 100 things, we need about 100/0.15 ≈ 667 units!
(Then again, the brain has about 10 to the 12th power neurons...)
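The unit count quoted above can be checked directly; the 0.15 capacity factor is simply the rough estimate from the slide.

```python
import math

capacity_per_unit = 0.15           # rough estimate from the slide: ~0.15*N stable patterns
patterns_to_store = 100
units_needed = math.ceil(patterns_to_store / capacity_per_unit)
print(units_needed)                # 667
```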

16 Human Performance, When Damaged (some examples)
Category coordinate errors
–Naming a CAT as a DOG
Superordinate errors
–Naming a CAT as an ANIMAL
Visual errors (deep dyslexics)
–Naming SYMPATHY as SYMPHONY
–or naming SYMPATHY as ORCHESTRA

17 The Attractors We’ve Talked About Can Be Useful In Understanding This
[Figure: two panels showing the words CAT and COT and the response “CAT”; one panel labeled Normal Performance, the other A Visual Error.]
(See Plaut, Hinton, and Shallice.)

18 Properties of Human Memory
Details tend to go first, more general things next.
Not all-or-nothing forgetting.
Things tend to be forgotten based on:
–Salience
–Recency
–Complexity
–Age of acquisition?

19 Do These Networks Have These Properties?
Sort of.
Graceful degradation: features vanish as a function of the strength of the input to them.
Complexity: more complex / arbitrary patterns can be more difficult to retain.
Salience, recency, age of acquisition?
–Depends on the learning rule. Stay tuned.

20 Next Time: Psychological Implications: The IAC Model of Word Perception
Optional reading: McClelland and Rumelhart ’81 (handout)
Rest of this class: lab session. Help installing software, help with homework.

