Adaptive annealing: a near-optimal connection between sampling and counting Daniel Štefankovič (University of Rochester) Santosh Vempala Eric Vigoda (Georgia.

1 Adaptive annealing: a near-optimal connection between sampling and counting. Daniel Štefankovič (University of Rochester), Santosh Vempala, Eric Vigoda (Georgia Tech)

2 Counting: independent sets, spanning trees, matchings, perfect matchings, k-colorings

3 (approx) counting $\Leftarrow$ sampling: Valleau, Card '72 (physical chemistry), Babai '79 (for matchings and colorings), Jerrum, Valiant, V. Vazirani '86. The outcome of the JVV reduction: random variables $X_1, X_2, \ldots, X_t$ such that 1) $E[X_1 X_2 \cdots X_t] =$ "WANTED", and 2) the $X_i$ are easy to estimate: $V[X_i]/E[X_i]^2 = O(1)$ (squared coefficient of variation, SCV).

4 (approx) counting $\Leftarrow$ sampling. Given 1) $E[X_1 X_2 \cdots X_t] =$ "WANTED" and 2) the $X_i$ are easy to estimate, $V[X_i]/E[X_i]^2 = O(1)$: Theorem (Dyer-Frieze '91): $O(t^2/\varepsilon^2)$ samples ($O(t/\varepsilon^2)$ from each $X_i$) give a $(1 \pm \varepsilon)$-estimator of "WANTED" with probability $\ge 3/4$.

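5 JVV for independent sets. GOAL: given a graph G, estimate the number of independent sets of G. Key identity: # independent sets $= 1 / P(\emptyset)$, where $P(\emptyset)$ is the probability that a uniformly random independent set of G is empty.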

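6 JVV for independent sets. $P(\emptyset)$ is written as a product of conditional probabilities, one per vertex, using $P(A \cap B) = P(A)P(B|A)$; the factors are estimated by $X_1, X_2, X_3, X_4$. $X_i \in [0,1]$ and $E[X_i] \ge 1/2$ $\Rightarrow$ $V[X_i]/E[X_i]^2 = O(1)$. [Figure: the example graph with its vertices decided one at a time]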
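The telescoping identity can be checked by brute force on a tiny graph. The sketch below is an illustration under assumptions: the 4-cycle is a hypothetical example, and the conditional probabilities are computed exactly by enumeration, whereas the reduction would estimate them from a sampler oracle.

```python
from itertools import combinations

def independent_sets(vertices, edges):
    """Enumerate all independent sets of a small graph by brute force."""
    result = []
    for r in range(len(vertices) + 1):
        for subset in combinations(vertices, r):
            chosen = set(subset)
            if not any(u in chosen and v in chosen for u, v in edges):
                result.append(chosen)
    return result

# Hypothetical example graph: the 4-cycle 0-1-2-3-0.
vertices = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
sets = independent_sets(vertices, edges)

# factors[i] = P(vertex i unoccupied | vertices 0..i-1 unoccupied) under the
# uniform distribution on independent sets, computed exactly here.
factors = []
remaining = sets
for v in vertices:
    without_v = [s for s in remaining if v not in s]
    factors.append(len(without_v) / len(remaining))
    remaining = without_v

p_empty = 1.0
for x in factors:
    p_empty *= x            # P(A and B) = P(A) P(B | A), applied repeatedly
count = 1.0 / p_empty       # number of independent sets = 1 / P(empty set)
```

Every factor is at least 1/2 because removing a vertex from an independent set that contains it yields a distinct independent set that does not.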

7 SAMPLER ORACLE: graph G $\to$ random independent set of G. JVV: if we have a sampler oracle, then FPRAS using $O(n^2)$ samples.

8 JVV: SAMPLER ORACLE: graph G $\to$ random independent set of G; if we have a sampler oracle, then FPRAS using $O(n^2)$ samples. ŠVV: SAMPLER ORACLE: graph G, $\beta$ $\to$ set from gas-model Gibbs at $\beta$; if we have a sampler oracle, then FPRAS using $O^*(n)$ samples.

9 Application: independent sets. $O^*(|V|)$ samples suffice for counting. Cost per sample (Vigoda '01, Dyer-Greenhill '01): time $= O^*(|V|)$ for graphs of degree $\le 4$. Total running time: $O^*(|V|^2)$.

10 Other applications (total running time): matchings, $O^*(n^2 m)$ (using Jerrum, Sinclair '89); spin systems: Ising model, $O^*(n^2)$ for $\beta < \beta_C$ (using Marinelli, Olivieri '95); k-colorings, $O^*(n^2)$ for $k > 2\Delta$ (using Jerrum '95).

11 easy = hot hard = cold

12 [Figure: Hamiltonian; example configurations with values 0, 1, 2, 4]

13 $H: \Omega \to \{0, \ldots, n\}$. Big set $= \Omega$. Goal: estimate $|H^{-1}(0)|$. $|H^{-1}(0)| = E[X_1] \cdots E[X_t]$.

14 Distributions between hot and cold (Gibbs distributions): $\mu_\beta(x) \propto \exp(-H(x)\beta)$, where $\beta$ = inverse temperature. $\beta = 0$ $\Rightarrow$ hot $\Rightarrow$ uniform on $\Omega$; $\beta = \infty$ $\Rightarrow$ cold $\Rightarrow$ uniform on $H^{-1}(0)$.

15 Distributions between hot and cold: $\mu_\beta(x) = \frac{\exp(-H(x)\beta)}{Z(\beta)}$. Normalizing factor = partition function: $Z(\beta) = \sum_{x \in \Omega} \exp(-H(x)\beta)$.

16 Partition function $Z(\beta) = \sum_{x \in \Omega} \exp(-H(x)\beta)$. Have: $Z(0) = |\Omega|$. Want: $Z(\infty) = |H^{-1}(0)|$.
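The two endpoints of the partition function can be seen on a small instance. The sketch below assumes the gas-model Hamiltonian for independent sets is the number of edges with both endpoints occupied (consistent with the parameters $A = 2^V$, $n = E$ given for independent sets); the 4-cycle and the finite value 50 standing in for $\beta = \infty$ are illustrative choices.

```python
import math
from itertools import product

def hamiltonian(config, edges):
    """Number of edges with both endpoints occupied (0 iff independent set)."""
    return sum(1 for u, v in edges if config[u] and config[v])

def partition_function(beta, n_vertices, edges):
    """Z(beta) = sum over all 0/1 configurations x of exp(-H(x) * beta)."""
    return sum(math.exp(-hamiltonian(x, edges) * beta)
               for x in product((0, 1), repeat=n_vertices))

edges = [(0, 1), (1, 2), (2, 3), (3, 0)]       # hypothetical 4-cycle
z_hot = partition_function(0.0, 4, edges)      # Z(0) = |Omega| = 2^4
z_cold = partition_function(50.0, 4, edges)    # approaches |H^{-1}(0)|
```

At $\beta = 0$ every configuration has weight 1, while at large $\beta$ only configurations with $H = 0$ (the independent sets) contribute noticeably.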

17 $\mu_\beta(x) = \frac{\exp(-H(x)\beta)}{Z(\beta)}$. Assumption: we have a sampler oracle for $\mu_\beta$. SAMPLER ORACLE: graph G, $\beta$ $\to$ subset of V from $\mu_\beta$.

18 Assumption: we have a sampler oracle for $\mu_\beta$. Draw $W \sim \mu_\beta$.

19 Assumption: we have a sampler oracle for $\mu_\beta$. Draw $W \sim \mu_\beta$ and set $X = \exp(H(W)(\beta - \alpha))$.

20 Assumption: we have a sampler oracle for $\mu_\beta$. With $W \sim \mu_\beta$ and $X = \exp(H(W)(\beta - \alpha))$ we can obtain the following ratio: $E[X] = \sum_{s \in \Omega} \mu_\beta(s) X(s) = \frac{Z(\alpha)}{Z(\beta)}$.
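The identity $E[X] = Z(\alpha)/Z(\beta)$ can be checked empirically. In this sketch the list `h_values` is a hypothetical toy state space (one $H$ value per element of $\Omega$), and `rng.choices` with Gibbs weights stands in for the sampler oracle; neither comes from the talk.

```python
import math
import random

h_values = [0, 0, 1, 2, 2, 3]  # hypothetical H(x) for each x in Omega

def Z(beta):
    """Partition function of the toy system."""
    return sum(math.exp(-h * beta) for h in h_values)

def sample_X(beta, alpha, rng):
    """Draw W ~ mu_beta (stand-in for the oracle); return exp(H(W)(beta - alpha))."""
    weights = [math.exp(-h * beta) for h in h_values]
    h = rng.choices(h_values, weights=weights, k=1)[0]
    return math.exp(h * (beta - alpha))

beta, alpha = 0.5, 0.8
rng = random.Random(1)
num_samples = 20000
empirical = sum(sample_X(beta, alpha, rng) for _ in range(num_samples)) / num_samples
exact = Z(alpha) / Z(beta)
```

The sample mean of $X$ approximates the ratio of partition functions without ever computing either one, which is the point of the estimator.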

21 Our goal restated. Partition function $Z(\beta) = \sum_{x \in \Omega} \exp(-H(x)\beta)$. Goal: estimate $Z(\infty) = |H^{-1}(0)|$. $Z(\infty) = \frac{Z(\beta_1)}{Z(\beta_0)} \cdot \frac{Z(\beta_2)}{Z(\beta_1)} \cdots \frac{Z(\beta_t)}{Z(\beta_{t-1})} \cdot Z(0)$, with $\beta_0 = 0 < \beta_1 < \beta_2 < \cdots < \beta_t = \infty$.

22 Our goal restated: $Z(\infty) = \frac{Z(\beta_1)}{Z(\beta_0)} \cdot \frac{Z(\beta_2)}{Z(\beta_1)} \cdots \frac{Z(\beta_t)}{Z(\beta_{t-1})} \cdot Z(0)$. How to choose the cooling schedule? Cooling schedule: $\beta_0 = 0 < \beta_1 < \beta_2 < \cdots < \beta_t = \infty$. Minimize its length while satisfying $V[X_i]/E[X_i]^2 \le O(1)$, where $E[X_i] = \frac{Z(\beta_i)}{Z(\beta_{i-1})}$.
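A fixed schedule can be evaluated exactly on a toy system to see both requirements at once: the ratios telescope to $Z(\infty)$, and each step should have a bounded second-moment ratio $E[X_i^2]/E[X_i]^2$ (which equals the SCV plus one). The state space `h_values`, the schedule, and the value 40 standing in for $\beta_t = \infty$ are assumptions for illustration.

```python
import math

h_values = [0, 0, 1, 2, 2, 3]  # hypothetical H(x) for each x in Omega

def Z(beta):
    return sum(math.exp(-h * beta) for h in h_values)

def moments(beta, alpha):
    """Exact E[X] and E[X^2] for X = exp(H(W)(beta - alpha)), W ~ mu_beta."""
    zb = Z(beta)
    m1 = sum(math.exp(-h * beta) / zb * math.exp(h * (beta - alpha))
             for h in h_values)
    m2 = sum(math.exp(-h * beta) / zb * math.exp(2 * h * (beta - alpha))
             for h in h_values)
    return m1, m2

# Hypothetical schedule; 40.0 stands in for beta_t = infinity.
schedule = [0.0, 0.5, 1.0, 2.0, 4.0, 8.0, 40.0]

estimate = Z(schedule[0])      # Z(0) = |Omega| is known
worst_ratio = 0.0
for beta, alpha in zip(schedule, schedule[1:]):
    m1, m2 = moments(beta, alpha)
    estimate *= m1             # E[X_i] = Z(beta_i) / Z(beta_{i-1})
    worst_ratio = max(worst_ratio, m2 / m1**2)
```

Here `estimate` recovers the number of states with $H = 0$, and `worst_ratio` reports how hard the hardest step of the schedule is to estimate.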

23 Parameters: A and n. $Z(0) = A$, $H: \Omega \to \{0, \ldots, n\}$. $Z(\beta) = \sum_{x \in \Omega} \exp(-H(x)\beta) = \sum_{k=0}^{n} a_k e^{-\beta k}$, where $a_k = |H^{-1}(k)|$.

24 Parameters: $Z(0) = A$, $H: \Omega \to \{0, \ldots, n\}$. Independent sets: $A = 2^V$, $n = E$. Matchings: $A \approx V!$, $n = V$. Perfect matchings: $A = V!$, $n = V$. k-colorings: $A = k^V$, $n = E$.

25 Previous cooling schedules. $Z(0) = A$, $H: \Omega \to \{0, \ldots, n\}$, $\beta_0 = 0 < \beta_1 < \beta_2 < \cdots < \beta_t = \infty$. "Safe steps": $\beta \to \beta + 1/n$; $\beta \to \beta(1 + 1/\ln A)$; $\ln A \to \infty$. Cooling schedules of length $O(n \ln A)$ and $O((\ln n)(\ln A))$ (Bezáková, Štefankovič, Vigoda, V. Vazirani '06).

26 No better fixed schedule possible. $Z(0) = A$, $H: \Omega \to \{0, \ldots, n\}$. $Z_a(\beta) = \frac{A}{1+a}\,(1 + a e^{-\beta n})$. A schedule that works for all $Z_a$ (with $a \in [0, A-1]$) has LENGTH $\ge \Omega((\ln n)(\ln A))$.

27 Parameters: $Z(0) = A$, $H: \Omega \to \{0, \ldots, n\}$. Previously: non-adaptive schedules of length $\Omega^*(\ln A)$. Our main result: can get adaptive schedule of length $O^*((\ln A)^{1/2})$.

28 Existential part. Lemma: for every partition function there exists a cooling schedule of length $O^*((\ln A)^{1/2})$.

29 Express SCV using partition function (going from $\beta$ to $\alpha$): $W \sim \mu_\beta$, $X = \exp(H(W)(\beta - \alpha))$, $E[X] = \frac{Z(\alpha)}{Z(\beta)}$, $\frac{E[X^2]}{E[X]^2} = \frac{Z(2\alpha - \beta)\,Z(\beta)}{Z(\alpha)^2} \le C$.
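The second-moment identity on this slide follows by direct summation. A short derivation, using only the definitions $W \sim \mu_\beta$ and $X = \exp(H(W)(\beta - \alpha))$ from the earlier slides:

```latex
\begin{align*}
E[X^2] &= \sum_{s\in\Omega} \mu_\beta(s)\, e^{2H(s)(\beta-\alpha)}
        = \frac{1}{Z(\beta)} \sum_{s\in\Omega} e^{-H(s)(2\alpha-\beta)}
        = \frac{Z(2\alpha-\beta)}{Z(\beta)}, \\
\frac{E[X^2]}{E[X]^2}
       &= \frac{Z(2\alpha-\beta)}{Z(\beta)} \cdot \frac{Z(\beta)^2}{Z(\alpha)^2}
        = \frac{Z(2\alpha-\beta)\,Z(\beta)}{Z(\alpha)^2}.
\end{align*}
```

The exponent simplifies because $-H\beta + 2H(\beta - \alpha) = -H(2\alpha - \beta)$, which is why the partition function at $2\alpha - \beta$ appears.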

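30 $f(\gamma) = \ln Z(\gamma)$. Proof: $\frac{E[X^2]}{E[X]^2} = \frac{Z(2\alpha - \beta)\,Z(\beta)}{Z(\alpha)^2} \le C$. Taking logarithms, with $C' = (\ln C)/2$, this reads $\frac{f(2\alpha - \beta) + f(\beta)}{2} - f(\alpha) \le C'$. [Figure: graph of f with the points $\beta$, $\alpha$, $2\alpha - \beta$ marked]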

31 $f(\gamma) = \ln Z(\gamma)$. f is decreasing, f is convex, $f'(0) \ge -n$, $f(0) \le \ln A$. Proof: either f or f' changes a lot. Let $K := \Delta f$. Then $\Delta(\ln |f'|) \ge \frac{1}{K}$.

32 $f: [a,b] \to \mathbb{R}$, convex, decreasing, can be "approximated" using $\sqrt{(f(a) - f(b)) \ln \frac{f'(a)}{f'(b)}}$ segments.

33 Proof. Technicality: getting to $2\alpha - \beta$. [Figure: $\beta$, $\alpha$, $2\alpha - \beta$ on the temperature axis]

34 Proof. Technicality: getting to $2\alpha - \beta$. [Figure: adds $\beta_i$, $\beta_{i+1}$]

35 Proof. Technicality: getting to $2\alpha - \beta$. [Figure: adds $\beta_{i+2}$]

36 Proof. Technicality: getting to $2\alpha - \beta$. [Figure: adds $\beta_{i+3}$] $\ln \ln A$ extra steps.

37 Existential $\to$ Algorithmic: "there exists adaptive schedule of length $O^*((\ln A)^{1/2})$" becomes "can get adaptive schedule of length $O^*((\ln A)^{1/2})$".

38 Algorithmic construction. $\mu_\beta(x) = \frac{\exp(-H(x)\beta)}{Z(\beta)}$. Our main result: using a sampler oracle for $\mu_\beta$ we can construct a cooling schedule of length $\le 38\,(\ln A)^{1/2} (\ln \ln A)(\ln n)$. Total number of oracle calls $\le 10^7 (\ln A)(\ln \ln A + \ln n)^7 \ln(1/\delta)$.

39 Algorithmic construction: from the current inverse temperature $\beta$, ideally move to $\alpha$ such that $B_1 \le \frac{E[X^2]}{E[X]^2} \le B_2$, where $E[X] = \frac{Z(\alpha)}{Z(\beta)}$.

40 Algorithmic construction: upper bound $\frac{E[X^2]}{E[X]^2} \le B_2$: X is "easy to estimate".

41 Algorithmic construction: lower bound $B_1 \le \frac{E[X^2]}{E[X]^2}$: we make progress (assuming $B_1 > 1$).

42 Algorithmic construction: need to construct a "feeler" for $\frac{E[X^2]}{E[X]^2}$.

43 Algorithmic construction: need to construct a "feeler" for $\frac{E[X^2]}{E[X]^2} = \frac{Z(\beta)}{Z(\alpha)} \cdot \frac{Z(2\alpha - \beta)}{Z(\alpha)}$.

44 Algorithmic construction: using $\frac{Z(\beta)}{Z(\alpha)} \cdot \frac{Z(2\alpha - \beta)}{Z(\alpha)}$ directly: bad "feeler".
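The "ideal move" can be illustrated on a toy system where $Z$ is known exactly. The sketch below is not the algorithm from the talk: it replaces the sampling-based "feeler" with exact evaluation of $Z(2\alpha - \beta)Z(\beta)/Z(\alpha)^2$, and the coefficients `coeffs`, the bound $e^2$, and the cutoff `beta_max` standing in for $\infty$ are all assumptions. Bisection is valid because, for convex $f = \ln Z$, the ratio is nondecreasing in $\alpha$.

```python
import math

def Z(beta, coeffs):
    """Z(beta) = sum_k a_k exp(-beta k), with coeffs[k] = a_k."""
    return sum(a * math.exp(-beta * k) for k, a in enumerate(coeffs))

def second_moment_ratio(beta, alpha, coeffs):
    """E[X^2]/E[X]^2 = Z(2 alpha - beta) Z(beta) / Z(alpha)^2."""
    return Z(2 * alpha - beta, coeffs) * Z(beta, coeffs) / Z(alpha, coeffs) ** 2

def next_temperature(beta, coeffs, bound, beta_max, iters=60):
    """Largest alpha in (beta, beta_max] with ratio <= bound, by bisection."""
    if second_moment_ratio(beta, beta_max, coeffs) <= bound:
        return beta_max
    lo, hi = beta, beta_max     # ratio is 1 at alpha = beta, so lo is feasible
    for _ in range(iters):
        mid = (lo + hi) / 2
        if second_moment_ratio(beta, mid, coeffs) <= bound:
            lo = mid
        else:
            hi = mid
    return lo

def adaptive_schedule(coeffs, bound=math.e ** 2, beta_max=50.0, max_steps=1000):
    """Greedy schedule from 0 to beta_max using exact Z (illustration only)."""
    schedule = [0.0]
    while schedule[-1] < beta_max and len(schedule) < max_steps:
        schedule.append(next_temperature(schedule[-1], coeffs, bound, beta_max))
    return schedule

coeffs = [1, 10, 100, 1000, 10000]   # hypothetical a_k = |H^{-1}(k)|
schedule = adaptive_schedule(coeffs)
```

Each step is as long as the bound allows, so the schedule adapts to the particular partition function instead of using fixed step sizes.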

45 Rough estimator for $\frac{Z(\beta)}{Z(\alpha)}$. $Z(\beta) = \sum_{k=0}^{n} a_k e^{-\beta k}$. For $W \sim \mu_\beta$ we have $P(H(W) = k) = \frac{a_k e^{-\beta k}}{Z(\beta)}$.

46 Rough estimator for $\frac{Z(\beta)}{Z(\alpha)}$. For $W \sim \mu_\beta$ we have $P(H(W) = k) = \frac{a_k e^{-\beta k}}{Z(\beta)}$. For $U \sim \mu_\alpha$ we have $P(H(U) = k) = \frac{a_k e^{-\alpha k}}{Z(\alpha)}$. If $H(X) = k$ is likely at both $\alpha$ and $\beta$ $\Rightarrow$ rough estimator.

47 Rough estimator for $\frac{Z(\beta)}{Z(\alpha)}$. With $W \sim \mu_\beta$ and $U \sim \mu_\alpha$ as above: $\frac{P(H(U) = k)}{P(H(W) = k)}\, e^{k(\alpha - \beta)} = \frac{Z(\beta)}{Z(\alpha)}$.

48 Rough estimator for $\frac{Z(\beta)}{Z(\alpha)}$. $Z(\beta) = \sum_{k=0}^{n} a_k e^{-\beta k}$. For $W \sim \mu_\beta$ we have $P(H(W) \in [c,d]) = \frac{\sum_{k=c}^{d} a_k e^{-\beta k}}{Z(\beta)}$.

49 Rough estimator for $\frac{Z(\beta)}{Z(\alpha)}$. If $|\alpha - \beta| \cdot |d - c| \le 1$ then $\frac{1}{e} \cdot \frac{Z(\beta)}{Z(\alpha)} \le \frac{P(H(U) \in [c,d])}{P(H(W) \in [c,d])}\, e^{c(\alpha - \beta)} \le e \cdot \frac{Z(\beta)}{Z(\alpha)}$. We also need $P(H(U) \in [c,d])$ and $P(H(W) \in [c,d])$ to be large.
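The interval bound can be checked numerically. In this sketch the coefficients `coeffs` and the chosen temperatures and interval are hypothetical, and the interval probabilities are computed exactly; in the actual setting they would be estimated from oracle samples at $\alpha$ and $\beta$.

```python
import math

coeffs = [1, 10, 100, 1000, 10000]   # hypothetical a_k = |H^{-1}(k)|

def Z(beta):
    return sum(a * math.exp(-beta * k) for k, a in enumerate(coeffs))

def interval_probability(beta, c, d):
    """P(H(W) in [c, d]) for W ~ mu_beta, computed exactly."""
    return sum(coeffs[k] * math.exp(-beta * k) for k in range(c, d + 1)) / Z(beta)

def rough_estimate(beta, alpha, c, d):
    """P(H(U) in [c,d]) / P(H(W) in [c,d]) * exp(c (alpha - beta))."""
    p_u = interval_probability(alpha, c, d)   # U ~ mu_alpha
    p_w = interval_probability(beta, c, d)    # W ~ mu_beta
    return p_u / p_w * math.exp(c * (alpha - beta))

beta, alpha, c, d = 1.0, 1.4, 2, 4            # |alpha - beta| * |d - c| = 0.8 <= 1
true_ratio = Z(beta) / Z(alpha)
rough = rough_estimate(beta, alpha, c, d)     # within a factor e of true_ratio
point = rough_estimate(beta, alpha, 3, 3)     # single value k = 3: exact identity
```

Widening the interval makes it easier to find one that is likely at both temperatures, at the cost of the factor-$e$ slack in the estimate.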

50 Split $\{0, 1, \ldots, n\}$ into $h \le 4(\ln n) \ln A$ intervals $[0], [1], [2], \ldots, [c, c(1 + 1/\ln A)], \ldots$ For any inverse temperature $\beta$ there exists an interval I with $P(H(W) \in I) \ge \frac{1}{8h}$. We say that I is HEAVY for $\beta$.
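The interval partition and the notion of a heavy interval can be sketched as follows. The rounding rule in `split_intervals` (floor of $c(1 + 1/\ln A)$, next interval starting one past the previous end) is an assumption about how the continuous description is discretized, and the binomial coefficients are a hypothetical choice of $a_k$ giving $A = 2^n$.

```python
import math

def split_intervals(n, ln_a):
    """Partition {0,...,n} into intervals [c, floor(c * (1 + 1/ln_a))]."""
    intervals, c = [], 0
    while c <= n:
        d = min(n, max(c, math.floor(c * (1 + 1 / ln_a))))
        intervals.append((c, d))
        c = d + 1
    return intervals

def heavy_intervals(coeffs, beta, intervals):
    """Intervals I with P(H(W) in I) >= 1/(8h) for W ~ mu_beta (exact)."""
    weights = [a * math.exp(-beta * k) for k, a in enumerate(coeffs)]
    z = sum(weights)
    h = len(intervals)
    return [(c, d) for c, d in intervals
            if sum(weights[c:d + 1]) / z >= 1 / (8 * h)]

n = 200
coeffs = [math.comb(n, k) for k in range(n + 1)]   # hypothetical a_k, A = 2^n
ln_a = n * math.log(2)
intervals = split_intervals(n, ln_a)
heavy = heavy_intervals(coeffs, beta=1.0, intervals=intervals)
```

Because the intervals partition $\{0, \ldots, n\}$, their probabilities sum to 1, so at least one has probability at least $1/h$; the weaker threshold $1/(8h)$ leaves room for sampling error.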

51 Algorithm. Repeat: find an interval I which is heavy for the current inverse temperature $\beta$; see how far I is heavy (until some $\beta^*$); use the interval I for the feeler $\frac{Z(\beta)}{Z(\alpha)} \cdot \frac{Z(2\alpha - \beta)}{Z(\alpha)}$. Either * make progress, or * eliminate the interval I.

52 Algorithm. Repeat: find an interval I which is heavy for the current inverse temperature $\beta$; see how far I is heavy (until some $\beta^*$); use the interval I for the feeler $\frac{Z(\beta)}{Z(\alpha)} \cdot \frac{Z(2\alpha - \beta)}{Z(\alpha)}$. Either * make progress, or * eliminate the interval I, * or make a "long move".

53 If we have sampler oracles for $\mu_\beta$ then we can get adaptive schedule of length $t = O^*((\ln A)^{1/2})$. Applications: independent sets, $O^*(n^2)$ (using Vigoda '01, Dyer-Greenhill '01); matchings, $O^*(n^2 m)$ (using Jerrum, Sinclair '89); spin systems: Ising model, $O^*(n^2)$ for $\beta < \beta_C$ (using Marinelli, Olivieri '95); k-colorings, $O^*(n^2)$ for $k > 2\Delta$ (using Jerrum '95).
