
B. Stochastic Neural Networks


1 B. Stochastic Neural Networks
(in particular, the stochastic Hopfield network)

2 Trapping in Local Minimum

3 Escape from Local Minimum

4 Escape from Local Minimum

5 Motivation
- Idea: with low probability, go against the local field; move up the energy surface; make the "wrong" microdecision.
- Potential value for optimization: escape from local optima.
- Potential value for associative memory: escape from spurious states, because they have higher energy than imprinted states.

6 The Stochastic Neuron
The deterministic neuron fires according to the unit step function, s' = Θ(h). The stochastic neuron instead fires with a probability given by the logistic sigmoid of the local field: Pr{s' = 1} = s(h) = 1 / (1 + exp(-2h/T)). (Flake)
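A minimal sketch of this update rule in Python (the function name and the ±1 state convention are our assumptions; the exponent 2h/T is the form consistent with the slope 1/(2T) quoted on the following slides):

```python
import math
import random

def stochastic_neuron(h, T):
    """Sample the next state (+1 or -1) of a stochastic neuron.

    Fires (s' = +1) with probability s(h) = 1 / (1 + exp(-2h/T));
    as T -> 0 this approaches the deterministic step rule s' = sign(h).
    """
    # s(h) = 1/(1 + exp(-2h/T)) = (1 + tanh(h/T))/2; tanh avoids overflow
    p_on = 0.5 * (1.0 + math.tanh(h / T))
    return 1 if random.random() < p_on else -1
```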

7 Properties of Logistic Sigmoid
- As h → +∞, s(h) → 1
- As h → −∞, s(h) → 0
- s(0) = 1/2

8 Logistic Sigmoid With Varying T
T varying from 0.05 to ∞ (1/T = β = 0, 1, 2, …, 20)
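A quick numeric check of this behavior (a sketch; the test field h = 0.5 is an arbitrary choice of ours):

```python
import math

def s(h, T):
    """Logistic sigmoid at pseudo-temperature T."""
    return 1.0 / (1.0 + math.exp(-2.0 * h / T))

# The same modest positive field, pushed through the sigmoid at the
# temperatures shown on the following slides:
for T in (0.01, 0.1, 0.5, 1.0, 10.0, 100.0):
    print(f"T = {T:6.2f}:  Pr{{s' = 1 | h = 0.5}} = {s(0.5, T):.3f}")
# Low T: virtually a hard threshold (probability ~1 for h > 0).
# High T: nearly independent of the field (probability ~0.5).
```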

9 Logistic Sigmoid, T = 0.5
Slope at origin = 1/(2T).

10 Logistic Sigmoid, T = 0.01
Note that this is virtually a threshold (step) function.

11 Logistic Sigmoid, T = 0.1
Slightly softened threshold.

12 Logistic Sigmoid, T = 1

13 Logistic Sigmoid, T = 10
Nearly random state assignment, with a slight bias determined by the local field.

14 Logistic Sigmoid, T = 100
Virtually independent of the local field.

15 Pseudo-Temperature
- Temperature: a measure of thermal energy (heat).
- Thermal energy: the vibrational energy of molecules; a source of random motion.
- Pseudo-temperature: a measure of nondirected (random) change.
- The logistic sigmoid gives the same equilibrium probabilities as the Boltzmann-Gibbs distribution.
- Other rules (such as the Metropolis algorithm) also give the B-G distribution (HKP App.).
- At high temperature, the Boltzmann distribution has a uniform preference for all states, regardless of energy; at low temperature, only the minimum-energy states have nonzero probability (Haykin 316).
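A small sketch making the last point concrete (the state energies here are hypothetical):

```python
import math

def boltzmann(energies, T):
    """Boltzmann-Gibbs probabilities: P(i) proportional to exp(-E_i / T)."""
    weights = [math.exp(-E / T) for E in energies]
    Z = sum(weights)                        # partition function
    return [w / Z for w in weights]

E = [0.0, 1.0, 2.0, 5.0]                    # hypothetical state energies
print([round(p, 3) for p in boltzmann(E, 100.0)])  # high T: nearly uniform
print([round(p, 3) for p in boltzmann(E, 0.1)])    # low T: all mass on E = 0
```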

16 Transition Probability
(The transition-probability formula on this slide was a figure; a reconstruction of what it expresses appears below.)
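Under the sigmoid update rule of the earlier slide, the flip probability can be written as follows (a reconstruction in our notation, not the slide's own formula; the energy change of flipping a ±1 unit is ΔE = 2·s_i·h_i):

```python
import math

def flip_probability(s_i, h_i, T):
    """Probability that unit i flips state (s_i -> -s_i) in one update.

    Flipping a +/-1 unit changes the network energy by dE = 2*s_i*h_i;
    accepting the flip with probability 1/(1 + exp(dE/T)) is equivalent
    to drawing the new state from the logistic sigmoid s(h).
    """
    dE = 2.0 * s_i * h_i
    return 1.0 / (1.0 + math.exp(dE / T))
```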

17 Stability
- Are stochastic Hopfield nets stable? Thermal noise prevents absolute stability: no single state is ever a permanent fixed point.
- But with symmetric weights, the energy function is still well defined, and the network settles into a stationary equilibrium distribution over states. (The formula that followed on the slide was a figure.)
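For reference, a sketch of that energy function (the function name is ours):

```python
import numpy as np

def hopfield_energy(s, W):
    """Hopfield energy E(s) = -(1/2) s^T W s (symmetric W, zero diagonal).

    With symmetric weights this is well defined for every state, and the
    stochastic dynamics visit states with equilibrium (Boltzmann)
    probability proportional to exp(-E(s) / T).
    """
    return -0.5 * float(s @ W @ s)
```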

18 Does “Thermal Noise” Improve Memory Performance?
Experiments by Bar-Yam:
- n = 100, p = 8
- Random initial state
- To allow convergence, after 20 cycles set T = 0
- How often does it converge to an imprinted pattern? (A simulation sketch follows.)
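A sketch of this protocol in Python/NumPy (parameter names, the seed, and the 0.9 overlap threshold for counting a retrieval are our assumptions; Bar-Yam's exact procedure may differ in detail):

```python
import numpy as np

rng = np.random.default_rng(0)

def run_trial(n=100, p=8, T=1.0, hot_cycles=20, cold_cycles=5):
    """One trial of the experiment sketched above."""
    # Imprint p random +/-1 patterns with the Hebbian rule.
    patterns = rng.choice([-1, 1], size=(p, n))
    W = (patterns.T @ patterns).astype(float) / n
    np.fill_diagonal(W, 0.0)

    s = rng.choice([-1, 1], size=n)            # random initial state
    for cycle in range(hot_cycles + cold_cycles):
        cold = cycle >= hot_cycles             # after 20 cycles, set T = 0
        for i in rng.permutation(n):           # asynchronous updates
            h = W[i] @ s
            if cold:
                s[i] = 1 if h >= 0 else -1     # deterministic update
            else:
                p_on = 1.0 / (1.0 + np.exp(-2.0 * h / T))
                s[i] = 1 if rng.random() < p_on else -1

    # Did it converge to an imprinted pattern (or its inverse)?
    overlaps = np.abs(patterns @ s) / n
    return overlaps.max() > 0.9

rate = np.mean([run_trial(T=1.0) for _ in range(100)])
print(f"fraction of trials retrieving an imprinted pattern: {rate:.2f}")
```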

19 Probability of Random State Converging on Imprinted State (n=100, p=8)
Since the network will never reach a stable state when T > 0, it was evolved for 20 steps at T = 1/β and then 5 steps at T = 0. (fig. from Bar-Yam)

20 Probability of Random State Converging on Imprinted State (n=100, p=8)
(fig. from Bar-Yam)

21 Analysis of Stochastic Hopfield Network
- Complete analysis by Daniel J. Amit & colleagues in the mid-80s.
- See D. J. Amit, Modeling Brain Function: The World of Attractor Neural Networks, Cambridge Univ. Press, 1989.
- The analysis is beyond the scope of this course.

22 Phase Diagram
- (A) Imprinted patterns are global minima: the desired retrieval states.
- (B) Retrieval states still exist (with the possibility of some wrong bits), but spin-glass states are lower in energy (more stable).
- (C) Spin-glass phase: the net has many stable states, but they are all spin-glass states.
- (D) All states melt: the spin-glass states melt, and the only solution of the mean-field equations is <Si> = 0.
- Outside of regions A and B, the net is not useful as a memory.
- The change in properties is discontinuous (a first-order phase transition): m (the overlap of the network's state with an imprinted pattern) drops from 0.97 to 0, except on the T axis (at α = 0, T = 1), where the transition is continuous (second-order).
- The mixture states exist in a region shaped like A and B but smaller in area, because they are higher in energy; the most stable mixtures (3-mixtures) extend to T = 0.46, α = 0.03.
(HKP 39-41; fig. from Domany & al. 1991; cf. Haykin fig. 8.14)

23 Conceptual Diagrams of Energy Landscape
(HKP fig. 2.18; fig. from Hertz & al., Introduction to the Theory of Neural Computation)

24 Phase Diagram Detail
- Below the TR curve: replica-symmetry breakdown; within the retrieval states, a fraction of the neurons freeze in a spin-glass fashion.
- This has little practical effect, but it illustrates the intricacies of the network's behavior.
(Haykin 313; fig. from Domany & al. 1991, based on Haykin fig. 8.14)

25 Simulated Annealing
(Kirkpatrick, Gelatt & Vecchi, 1983, Science 220)

26 Dilemma
- In the early stages of the search, we want a high temperature, so that we explore the space and find the basins of the global minimum.
- In the later stages, we want a low temperature, so that we relax into the global minimum and do not wander away from it.
- Solution: decrease the temperature gradually during the search.

27 Quenching vs. Annealing
- Quenching: rapid cooling of a hot material; may result in defects & brittleness; local order but global disorder; locally low-energy, globally frustrated.
- Annealing: slow cooling (or alternate heating & cooling); reaches equilibrium at each temperature; allows global order to emerge; achieves a global low-energy state.
- Examples: metal or glass, heated to melting point.

28 Multiple Domains
- Global incoherence, local coherence.
- The arrows (in the figure) might represent crystallization axes.
- The lowest-energy state is the most stable; it occurs when the axes are aligned.
- Red lines indicate mismatched/weak binding, and therefore likely defects.

29 Moving Domain Boundaries
Note that the elevated temperature melts the domain boundaries and allows them to move.

30 Effect of Moderate Temperature
(fig. from Anderson, Intr. Neur. Comp.)

31 Effect of High Temperature
(fig. from Anderson, Intr. Neur. Comp.)

32 Effect of Low Temperature
(fig. from Anderson, Intr. Neur. Comp.)

33 Annealing Schedule
- Controlled decrease of temperature.
- Should be sufficiently slow to allow equilibrium to be reached at each temperature.
- With sufficiently slow annealing, the global minimum will be found with probability 1.
- The design of annealing schedules is a topic of research.
- Asymptotic convergence was proved by Geman & Geman (1984); they give a sufficient bound on the cooling rate, T_k ≥ T0 / log(1 + k), but it is too slow to be practical (Haykin 317). A quick calculation below shows why.
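To see why the bound is impractical (T0 = 10 and the target temperature below are arbitrary illustrative choices of ours):

```python
import math

T0 = 10.0
# Geman & Geman's sufficient condition: T_k >= T0 / log(1 + k).
# Cooling to T = 0.1 then requires log(1 + k) >= T0 / 0.1 = 100,
# i.e. about e^100 steps; hopelessly slow in practice.
k = math.exp(T0 / 0.1) - 1
print(f"steps to cool from T0 = {T0} to T = 0.1: about {k:.2e}")
```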

34 Typical Practical Annealing Schedule
- Initial temperature T0 sufficiently high that all transitions are allowed.
- Exponential cooling: T_{k+1} = αT_k, typically 0.8 < α < 0.99, with at least 10 accepted transitions at each temperature.
- Final temperature: three successive temperatures without the required number of accepted transitions.
- Accepted transitions (this is the Metropolis-algorithm approach, which Kirkpatrick & al. used): consider a random transition; if ΔE ≤ 0, accept it; if ΔE > 0, accept it with probability exp(−ΔE / T).
(Haykin 318)
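A sketch of this schedule as a generic optimizer (the signature, the attempt budget per temperature, and the toy objective are our assumptions; the acceptance test is the Metropolis rule stated above):

```python
import math
import random

def simulated_annealing(initial, energy, neighbor,
                        T0=10.0, alpha=0.9, min_accepted=10):
    """Generic simulated-annealing loop following the schedule above.

    Exponential cooling T_{k+1} = alpha * T_k with Metropolis acceptance;
    stops after three successive temperatures fail to reach the required
    number of accepted transitions.
    """
    state, T, failures = initial, T0, 0
    while failures < 3:
        accepted = 0
        for _ in range(100 * min_accepted):   # attempt budget per temperature
            candidate = neighbor(state)
            dE = energy(candidate) - energy(state)
            # Metropolis rule: downhill always; uphill with prob exp(-dE/T)
            if dE <= 0 or random.random() < math.exp(-dE / T):
                state = candidate
                accepted += 1
            if accepted >= min_accepted:
                break
        failures = failures + 1 if accepted < min_accepted else 0
        T *= alpha                            # exponential cooling
    return state

# Toy usage: a double-well objective with its global minimum near x = -1.
f = lambda x: (x**2 - 1)**2 + 0.3 * x
print(simulated_annealing(2.0, f, lambda x: x + random.gauss(0, 0.5)))
```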

35 Summary
- Non-directed change (random motion) permits escape from local optima and spurious states.
- The pseudo-temperature can be controlled to adjust the relative degree of exploration and exploitation.

36 Hopfield Network for Task Assignment Problem
- Six tasks to be done (I, II, …, VI).
- Six agents to do the tasks (A, B, …, F).
- The agents do the tasks at various rates: A (10, 5, 4, 6, 5, 1), B (6, 4, 9, 7, 3, 2), etc.
- What is the optimal assignment of tasks to agents? (A brute-force sketch follows.)
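For scale, a brute-force baseline for this instance (only the A and B rates appear on the slide; rows C-F below are made-up stand-ins, and we assume the objective is to maximize the summed rates):

```python
from itertools import permutations

# rates[a][t] = rate at which agent a performs task t.  Rows A and B are
# from the slide; rows C-F are hypothetical stand-ins for the elided "etc".
rates = [
    [10, 5, 4, 6, 5, 1],   # A
    [ 6, 4, 9, 7, 3, 2],   # B
    [ 7, 2, 6, 3, 8, 4],   # C (assumed)
    [ 3, 9, 5, 2, 6, 7],   # D (assumed)
    [ 5, 6, 3, 8, 2, 9],   # E (assumed)
    [ 4, 3, 8, 5, 7, 6],   # F (assumed)
]

# Only 6! = 720 one-to-one assignments, so exhaustive search is trivial;
# the NetLogo model instead lets a Hopfield-style net relax to a solution.
best = max(permutations(range(6)),
           key=lambda p: sum(rates[a][t] for a, t in enumerate(p)))
total = sum(rates[a][t] for a, t in enumerate(best))
print("optimal assignment (agent -> task):", best, "| total rate:", total)
```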

37 NetLogo Implementation of Task Assignment Problem
Run TaskAssignment.nlogo

38 Additional Bibliography
- Anderson, J. A. An Introduction to Neural Networks, MIT Press, 1995.
- Arbib, M. (ed.) The Handbook of Brain Theory and Neural Networks, MIT Press, 1995.
- Hertz, J., Krogh, A., & Palmer, R. G. Introduction to the Theory of Neural Computation, Addison-Wesley, 1991.

