Path Coupling And Approximate Counting


1 Path Coupling And Approximate Counting
Based on:
- Faster random generation of linear extensions, R. Bubley and M. Dyer
- Mathematical foundations of the Markov chain Monte Carlo method, M. Jerrum
- Counting Linear Extensions of a Partial Order, S. Harris
Presented by Tomer Lauterman

2 Review – transportation metric
We have a state space $\Omega$ and a metric $\rho$ between states, with $\rho(x,y) \ge 1$ for $x \ne y$.
The transportation metric (or Wasserstein metric) is defined on distributions over $\Omega$:
$\rho_K(\mu,\nu) = \inf\{\,\mathbb{E}[\rho(X,Y)] : (X,Y) \text{ is a coupling of } \mu,\nu\,\}$
$\|\mu - \nu\|_{TV} \le \rho_K(\mu,\nu)$
There exists an optimal coupling $(X^*, Y^*)$ of $\mu,\nu$ such that $\rho_K(\mu,\nu) = \mathbb{E}[\rho(X^*, Y^*)]$.

3 Review – path coupling
Define a graph $G(\Omega, E_0)$ with edge lengths $\ell(e)$, and let $\rho(x,y)$ := shortest-path distance in $G$.
Assume for each edge $(x,y) \in E_0$ we have a coupling $(X_1, Y_1)$ of the distributions $P(x,\cdot)$, $P(y,\cdot)$ such that
$\mathbb{E}_{x,y}[\rho(X_1, Y_1)] \le e^{-\alpha}\,\ell(x,y)$ for some $\alpha > 0$.
Then for any two distributions $\mu,\nu$ on $\Omega$:
$\rho_K(\mu P, \nu P) \le e^{-\alpha}\,\rho_K(\mu,\nu)$
$\|P^t(x,\cdot) - \pi\|_{TV} \le \rho_K(P^t(x,\cdot), \pi) \le e^{-\alpha t}\,\mathrm{diam}(\Omega)$

4 Approximate Counting
Let $\Omega$ be a very large (but finite) set of combinatorial structures (e.g., the possible configurations of a physical system, or the feasible solutions of a combinatorial problem).
Goal: estimate $|\Omega|$, the cardinality of $\Omega$.
The estimation should run in time polynomial in both the instance size $n$ and the inverse error tolerated, $\epsilon^{-1}$ (FPRAS).

5 Goals
- Study another use case for path coupling
- Learn an FPRAS technique for approximate counting using Markov chain Monte Carlo
- See how all the course subjects so far play together:
  - Create a Markov chain with a target stationary distribution (e.g., a Glauber chain)
  - Use the property of rapid mixing (via path coupling) for algorithm design

6 Path Coupling Markov Chain Of Linear Extensions
Faster random generation of linear extensions, R. Bubley and M. Dyer

7 Linear Extensions - Definitions
Partially ordered set $(P, \prec)$ – a set $P$ with an irreflexive, transitive relation. Example: $a \prec b$, $c \prec d$.
Total order $(P, <)$ – a relation such that for all $a, b \in P$, either $a < b$ or $b < a$. Example: $a < b < c < d$.
A linear extension of a partial order $(P, \prec)$ is a total order $(P, <)$ that preserves $(P, \prec)$.
$a \prec b$, $c \prec d$ $\Rightarrow$ $\{abcd, cdab, acbd, \ldots\}$
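As a quick sanity check of this example, one can enumerate the linear extensions by brute force (a small Python sketch written for this presentation; the helper name is ours, not from the papers):

    from itertools import permutations

    # The partial order a ≺ b, c ≺ d, as a set of required precedences.
    precedences = {("a", "b"), ("c", "d")}

    def is_linear_extension(order, precedences):
        # A total order extends (P, ≺) iff every required pair (x, y)
        # places x before y.
        pos = {elem: idx for idx, elem in enumerate(order)}
        return all(pos[x] < pos[y] for x, y in precedences)

    extensions = ["".join(p) for p in permutations("abcd")
                  if is_linear_extension(p, precedences)]
    print(extensions)  # ['abcd', 'acbd', 'acdb', 'cabd', 'cadb', 'cdab']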

8 Partial Order - Definitions
[Diagram: a Hasse diagram of a partial order on $\{a, \ldots, g\}$, next to one of its full (total) orders.]

9 Notations
Transposition operator $\sigma_{i,j}$: $Y = \sigma_{i,j} X$, where
$X = (a_1\, a_2 \cdots a_{i-1}\, a_i \cdots a_j\, a_{j+1} \cdots a_n)$
$Y = (a_1\, a_2 \cdots a_{i-1}\, a_j \cdots a_i\, a_{j+1} \cdots a_n)$
$f$ – a concave probability distribution on $\{1, 2, \ldots, n-1\}$
$\Omega$ – all linear extensions of $(P, \prec)$

10 The Markov Chain
Given $(P, \prec)$, we want to sample a linear extension uniformly at random.
We will design a Markov chain such that:
- Each state is a linear extension
- The stationary distribution is uniform
- The chain is rapidly mixing
[Diagram: transition graph on the linear extensions of $a \prec b$, $c \prec d$ (abcd, acbd, cabd, acdb), with transition probabilities such as 1/6, 5/6, 1/2.]

11 The Markov Chain
- The chain is aperiodic
- The chain is irreducible
- The chain is symmetric: $Y = \sigma_{i,j} X \iff X = \sigma_{i,j} Y$, so $P(X,Y) = P(Y,X)$
$\Rightarrow$ the stationary distribution $\pi$ is the uniform distribution.
[Diagram: the same transition graph as on the previous slide.]

12 The Markov Chain
Denote by $X_t \in \Omega$ the state of the Markov chain at time $t$.
- Choose $p \in \{1, 2, \ldots, n-1\}$ according to $f$
- With probability 1/2: $X_{t+1} = X_t$ ("lazy walk")
- With probability 1/2: if $\sigma_{p,p+1} X_t \notin \Omega$ then $X_{t+1} = X_t$, otherwise $X_{t+1} = \sigma_{p,p+1} X_t$
A sketch of one step follows below.
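A minimal Python sketch of one step of this chain; the membership test is_extension and the distribution f are assumed inputs supplied by the caller, not names from the paper:

    import random

    def chain_step(X, f, is_extension):
        # One step of the lazy transposition chain.
        # X: current linear extension (a list); f: dict p -> probability
        # on {1, ..., n-1}; is_extension: predicate for membership in Omega.
        n = len(X)
        positions = list(range(1, n))
        p = random.choices(positions, weights=[f[q] for q in positions])[0]
        if random.random() < 0.5:       # lazy step: stay put
            return X
        Y = X[:]                        # attempt the adjacent transposition
        Y[p - 1], Y[p] = Y[p], Y[p - 1]
        return Y if is_extension(Y) else X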

13 The Markov Chain
[Diagram: the transition graph on the linear extensions of $a \prec b$, $c \prec d$, now with the chain's actual probabilities: edge weights $\frac{1}{2}f(1)$, $\frac{1}{2}f(2)$, $\frac{1}{2}f(3)$, and self-loops such as $\frac{1}{2}\big(1 + f(1) + f(3)\big)$ at abcd.]

14 Transposition Distance – define $G(\Omega, E_0)$, $\rho(X,Y)$
- $(X,Y) \in E_0$ if there exist $i, j$ such that $Y = \sigma_{i,j} X$
- $\ell(X,Y) = j - i$
- $\rho(X,Y)$ = shortest-path distance from $X$ to $Y$ in $G$
[Diagram: the graph $G(\Omega, E_0)$ for $a \prec b$, $a \prec c$, on the states abcd, acbd, abdc, acdb, adbc, adcb, dabc, dacb, with edge lengths 1 and 2.]

15 The Coupling
Assume $Y = \sigma_{i,j} X$.
- Choose $p \in \{1, 2, \ldots, n-1\}$ according to $f$, and a "coin" $c_x \in \{0, 1\}$ uniformly
- If $j - i = 1$ and $p = i$, then $c_y = 1 - c_x$; otherwise $c_y = c_x$
- If $c_x = 0$ or $\sigma_{p,p+1} X \notin \Omega$, then $X' = X$; otherwise $X' = \sigma_{p,p+1} X$ (similarly for $Y$, using $c_y$)
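The coupled update, sketched in the same style with the same assumed helpers; the only asymmetry is the anti-coupled coin when $j - i = 1$ and $p = i$:

    import random

    def coupled_step(X, Y, i, j, f, is_extension):
        # One coupled step for Y = sigma_{i,j} X: both chains share the
        # position p; the coin is flipped for Y exactly when j - i = 1, p = i.
        n = len(X)
        positions = list(range(1, n))
        p = random.choices(positions, weights=[f[q] for q in positions])[0]
        c_x = random.randint(0, 1)
        c_y = 1 - c_x if (j - i == 1 and p == i) else c_x

        def move(Z, coin):
            if coin == 0:
                return Z
            W = Z[:]
            W[p - 1], W[p] = W[p], W[p - 1]
            return W if is_extension(W) else Z

        return move(X, c_x), move(Y, c_y)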

16 Bounding Mixing Time - overview
$\|P^t(x,\cdot) - \pi\|_{TV} \le e^{-\alpha t}\,\mathrm{diam}(\Omega)$
We need to calculate two parameters:
- $\alpha$ := the convergence rate of $\rho_K$, i.e. the expected $\rho$-distance "shrink" ratio in one step
- $\mathrm{diam}(\Omega)$ := the maximum $\rho$-distance in $G(\Omega, E_0)$

17 Calculate The Shrink Ratio
We bound $\mathbb{E}[\rho(X', Y')]$ for an edge $(X, Y) \in E_0$:
$X = (x_1\, x_2 \ldots x_{i-1}\, x_i \ldots x_j\, x_{j+1} \ldots x_n)$
$Y = (x_1\, x_2 \ldots x_{i-1}\, x_j \ldots x_i\, x_{j+1} \ldots x_n)$
Three cases, by the chosen position $p$:
(a) $p \notin \{i-1, i, j-1, j\}$
(b) $p \in \{i-1, j\}$
(c) $p \in \{i, j-1\}$, split into the sub-cases $i < j-1$ and $i = j-1$

18 Calculate The Shrink Ratio
Case (a): $p \notin \{i-1, i, j-1, j\}$
The two states change in the same manner, so $\rho(X', Y') = \ell(X,Y) = j - i$.
$X = (x_1\, x_2 \ldots x_{i-1}\, x_i \ldots x_{j-1}\, x_j\, x_{j+1} \ldots x_n)$
$Y = (x_1\, x_2 \ldots x_{i-1}\, x_j \ldots x_{j-1}\, x_i\, x_{j+1} \ldots x_n)$

19 Calculate The Shrink Ratio
Case (b): $p \in \{i-1, j\}$ (shown for $p = i-1$; $p = j$ is symmetric)
$X = (x_1\, x_2 \ldots x_{i-1}\, \mathbf{x_i} \ldots x_{j-1}\, \mathbf{x_j}\, x_{j+1} \ldots x_n)$
$Y = (x_1\, x_2 \ldots x_{i-1}\, \mathbf{x_j} \ldots x_{j-1}\, \mathbf{x_i}\, x_{j+1} \ldots x_n)$
- If $\sigma_{p,p+1} X, \sigma_{p,p+1} Y \notin \Omega$: neither chain moves, so $\rho(X', Y') = \ell(X,Y)$.
- If both $\sigma_{p,p+1} X, \sigma_{p,p+1} Y \in \Omega$: $\rho(X', Y') \le \ell(X,Y) + 1$, since
  $X' = (x_1\, x_2 \ldots \mathbf{x_i}\, x_{i-1} \ldots x_{j-1}\, \mathbf{x_j}\, x_{j+1} \ldots x_n)$
  $Y' = (x_1\, x_2 \ldots \mathbf{x_j}\, x_{i-1} \ldots x_{j-1}\, \mathbf{x_i}\, x_{j+1} \ldots x_n)$
- If only one of $\sigma_{p,p+1} X, \sigma_{p,p+1} Y \in \Omega$: $\rho(X', Y') \le \ell(X,Y) + 1$, e.g.
  $X' = (x_1\, x_2 \ldots x_i\, x_{i-1} \ldots x_{j-1}\, x_j\, x_{j+1} \ldots x_n)$
  $Y' = (x_1\, x_2 \ldots x_{i-1}\, x_j \ldots x_{j-1}\, x_i\, x_{j+1} \ldots x_n)$
  (so $Y' = \sigma_{i,j}\,\sigma_{i-1,i}\, X'$)
In all sub-cases $\rho(X', Y') \le \ell(X,Y) + 1$, and the swap is attempted only when the coin shows 1.

20 Calculate The Shrink Ratio
Case (c): $p \in \{i, j-1\}$
Note $x_i, x_j$ are incomparable to $x_{i+1}, \ldots, x_{j-1}$ in $(P, \prec)$ (both $X$ and $Y = \sigma_{i,j} X$ are linear extensions), so the proposed swap is always legal.
Sub-case $i < j-1$:
$X = (x_1\, x_2 \ldots x_{i-1}\, x_i \ldots x_{j-1}\, x_j\, x_{j+1} \ldots x_n)$
$Y = (x_1\, x_2 \ldots x_{i-1}\, x_j \ldots x_{j-1}\, x_i\, x_{j+1} \ldots x_n)$
Sub-case $i = j-1$ (remember the coupling! $c_y = 1 - c_x$, so exactly one chain moves and the states coalesce):
$X = (x_1\, x_2 \ldots x_{i-1}\, x_i\, x_j\, x_{j+1} \ldots x_n)$
$Y = (x_1\, x_2 \ldots x_{i-1}\, x_j\, x_i\, x_{j+1} \ldots x_n)$
In both sub-cases, $\rho(X', Y') = \ell(X,Y) - 1$ whenever a move happens.

21 Calculate The Shrink Ratio
Combining the cases (each swap is attempted with probability 1/2, and position $p$ is chosen with probability $f(p)$):
(a) $p \notin \{i-1, i, j-1, j\}$: $\mathbb{E}[\rho(X', Y')] = \ell(X,Y)$
(b) $p \in \{i-1, j\}$: $\mathbb{E}[\rho(X', Y')] \le \ell(X,Y) + \frac{1}{2}$
(c) $p \in \{i, j-1\}$, $i < j-1$: $\mathbb{E}[\rho(X', Y')] = \ell(X,Y) - \frac{1}{2}$
(c) $p \in \{i, j-1\}$, $i = j-1$: $\mathbb{E}[\rho(X', Y')] = \ell(X,Y) - 1$
Altogether:
$\mathbb{E}[\rho(X', Y')] - \ell(X,Y) \le \frac{1}{2}\big(f(i-1) - f(i) - f(j-1) + f(j)\big)$

22 Calculate The Shrink Ratio
$\mathbb{E}[\rho(X', Y')] - \ell(X,Y) \le \frac{1}{2}\big(f(i-1) - f(i) - f(j-1) + f(j)\big) \le 0$
The right-hand side is $\le 0$ for any concave $f$: since $i < j$, concavity gives $f(i) - f(i-1) \ge f(j) - f(j-1)$.
We will use $f(i) = \frac{i(n-i)}{K}$ with $K = \frac{1}{6}(n^3 - n)$, the normalizing constant $\sum_{i=1}^{n-1} i(n-i)$.

23 Calculate The Shrink Ratio
With $f(i) = \frac{i(n-i)}{K}$, $K = \frac{n^3 - n}{6}$:
$f(i) - f(i-1) = \frac{i(n-i)}{K} - \frac{(i-1)(n-i+1)}{K} = \frac{n+1-2i}{K}$
$\mathbb{E}[\rho(X', Y')] \le \ell(X,Y) + \frac{1}{2}\big(f(i-1) - f(i) - f(j-1) + f(j)\big) = \ell(X,Y) + \frac{1}{2}\Big(\frac{n+1-2j}{K} - \frac{n+1-2i}{K}\Big) = \ell(X,Y) - \frac{j-i}{K} = \Big(1 - \frac{1}{K}\Big)\ell(X,Y)$
$\mathbb{E}[\rho(X', Y')] \le \Big(1 - \frac{1}{K}\Big)\ell(X,Y) \le e^{-1/K}\,\ell(X,Y)$, so $\alpha = \frac{1}{K}$.
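A quick numeric check of the normalization and the contraction factor (assuming $K = (n^3 - n)/6$, which makes $f$ sum to 1):

    n = 10
    K = (n**3 - n) / 6                          # = sum_{i=1}^{n-1} i*(n-i)
    f = {i: i * (n - i) / K for i in range(1, n)}

    assert abs(sum(f.values()) - 1.0) < 1e-12   # f is a distribution
    print(1 - 1 / K)    # 0.9939..., i.e. alpha = 1/K = 6/(n^3 - n)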

24 Calculate Diameter
Definition: suppose $X, Y$ are total orders. Spearman's footrule is defined to be
$\delta_s(X,Y) = \frac{1}{2}\sum_{i=1}^{n} |X(i) - Y(i)|$
Lemma 1: the transposition distance $\rho$ equals $\delta_s$.
Lemma 2: $\max_{X,Y} \delta_s(X,Y) \le \frac{n^2}{4}$
Corollary: $\mathrm{diam}(\Omega) = \max_{X,Y} \rho(X,Y) = \max_{X,Y} \delta_s(X,Y) \le \frac{n^2}{4}$
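Lemma 2 is tight up to rounding: the footrule is maximized by reversing the order. A short check (the helper is written for this presentation, not from the paper):

    def footrule(X, Y):
        # Spearman's footrule: half the total positional displacement
        # between two orderings of the same elements.
        pos_x = {e: i for i, e in enumerate(X)}
        pos_y = {e: i for i, e in enumerate(Y)}
        return sum(abs(pos_x[e] - pos_y[e]) for e in pos_x) / 2

    n = 8
    X = list(range(n))
    print(footrule(X, X[::-1]), n * n / 4)   # 16.0 16.0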

25 Bounding The Mixing Time
$d(t) = \|P^t(x,\cdot) - \pi\|_{TV} \le e^{-t/K}\,\mathrm{diam}(\Omega)$
$t_{mix}(\epsilon) = \min\{t : d(t) \le \epsilon\} \;\Rightarrow\; t_{mix}(\epsilon) \le K \ln\big(\mathrm{diam}(\Omega)\,\epsilon^{-1}\big)$
Assigning the values $K = \frac{n^3 - n}{6}$ and $\mathrm{diam}(\Omega) \le \frac{n^2}{4}$:
$t_{mix}(\epsilon) \le \frac{n^3 - n}{6} \ln\Big(\frac{n^2}{4}\,\epsilon^{-1}\Big) \le (n^3 - n)\,\ln(n\,\epsilon^{-1})$

26 Approximate Counting
Mathematical foundations of the Markov chain Monte Carlo method, M. Jerrum (q-coloring)
Counting Linear Extensions of a Partial Order, S. Harris

27 FPRAS
Fully polynomial randomized approximation scheme: approximate the value of the function $n \mapsto |\Omega_n|$ with a run-time polynomial in both the instance size $n$ and the inverse error tolerated, $\epsilon^{-1}$, so that the output $W$ satisfies
$P\{(1 - \epsilon)\,|\Omega| \le W \le (1 + \epsilon)\,|\Omega|\} \ge 1 - \eta$

28 Two approaches for counting
Direct Monte Carlo ("dart throwing"):
- Bound $S$ by a simpler set $S'$ such that $|S'|$ can be estimated easily
Markov chain Monte Carlo:
- Create a sequence of subproblems for $|S|$: $|S_0|, \ldots, |S_n| = |S|$
- Estimate the ratios $\frac{|S_{k+1}|}{|S_k|}$ using Markov chain sampling
- $|S| = |S_0| \cdot \prod_{k=0}^{n-1} \frac{|S_{k+1}|}{|S_k|}$
[Diagram: nested sets $S = S_n \subseteq S_{n-1} \subseteq \cdots \subseteq S_0$.]

29 Q-Coloring Problem
Proper q-colorings of a graph $G = (V, E)$ are the elements $x \in \Omega = \{1, 2, \ldots, q\}^V$ such that $x(v) \ne x(w)$ for all $\{v, w\} \in E$.
Theorem 14.8: Consider the Glauber dynamics chain for random proper q-colorings of a graph with $n$ vertices and maximum degree $\Delta$. If $q > 2\Delta$, then the mixing time satisfies
$t_{mix}(\varepsilon) \le \frac{q - \Delta}{q - 2\Delta}\, n\,\big(\log n + \log \varepsilon^{-1}\big)$
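One Glauber update for q-colorings, as a minimal sketch; the adjacency-list dict is our choice of representation, not fixed by the source:

    import random

    def glauber_step(coloring, adj, q):
        # Pick a uniform vertex and recolor it with a uniform color among
        # those not used by its neighbors (nonempty whenever q > max degree).
        v = random.choice(list(adj))
        forbidden = {coloring[w] for w in adj[v]}
        coloring[v] = random.choice([c for c in range(q) if c not in forbidden])
        return coloring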

30 Q-Coloring Problem
[Diagram: a graph whose edges $e_1, e_2, e_3, e_4$ are added one at a time.]

31 Q-Coloring Approximate Counting
Theorem 14.12: Let $\Omega$ be the set of all proper q-colorings of a graph $G$ with $n$ vertices and maximal degree $\Delta$. Fix $q > 2\Delta$, and set
$c(q, \Delta) = 1 - \frac{\Delta}{q - \Delta}$
Given a graph with $m$ edges and $\varepsilon$, there is a random variable $W$ which can be simulated using no more than
$\frac{n \log n + n \log(6m/\varepsilon)}{c(q, \Delta)} \cdot \frac{m}{\varepsilon^2} \cdot m$
uniform random variables and which satisfies
$P\{(1 - \epsilon)\,|\Omega| \le W \le (1 + \epsilon)\,|\Omega|\} \ge \frac{3}{4}$

32 Theorem 14.12 – Proof(1/8)
Definitions:
- $G_0(V, \emptyset)$ – the graph on the vertices of $G$ with no edges
- $G_k(V, E_k)$ – where $E_k = E_{k-1} \cup \{e_k\}$, $e_k \in E \setminus E_{k-1}$
- $\Omega_k$ – the set of proper colorings of $G_k$
- $|\Omega_0| = q^n$, and $G = G_m$
$|\Omega| = |\Omega_0| \times \frac{|\Omega_1|}{|\Omega_0|} \times \cdots \times \frac{|\Omega_m|}{|\Omega_{m-1}|}$
[Diagram: nested sets $\Omega \subseteq \Omega_{m-1} \subseteq \cdots \subseteq \Omega_1 \subseteq \Omega_0$ as the edges $e_1, \ldots, e_4$ are added.]

33 Theorem 14.12 – Proof(2/8)
Algorithm: for $k = 1, \ldots, m$:
- Sample $a_m$ random q-colorings from $\Omega_{k-1}$, each obtained by running $t(m, \varepsilon)$ steps of the Glauber chain
- Count $z_k$ – how many samples belong to $\Omega_k$
- Define $w_k = \frac{z_k}{a_m}$ as the estimate of $p_k = \frac{|\Omega_k|}{|\Omega_{k-1}|}$
$W = |\Omega_0| \cdot \prod_{k=1}^{m} w_k$ is an estimator of $|\Omega_m|$.
The error has two sources: the bias of each $w_k$ (the samples are only approximately uniform) and its variance. A sketch combining these steps appears below.
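A sketch of the whole estimator under the assumptions above; together with glauber_step from the sketch on slide 29 it runs as-is. The greedy starting coloring is an implementation choice (valid whenever $q > \Delta$), not part of the theorem:

    def greedy_coloring(adj, q):
        # Any proper coloring of the current graph works as a chain start;
        # greedy succeeds because q exceeds the maximum degree.
        col = {}
        for v in adj:
            used = {col[w] for w in adj[v] if w in col}
            col[v] = min(c for c in range(q) if c not in used)
        return col

    def estimate_colorings(vertices, edges, q, t_steps, a_m):
        # W = q^n * prod_k w_k, where w_k estimates |Omega_k| / |Omega_{k-1}|.
        W = float(q) ** len(vertices)            # |Omega_0| = q^n
        adj = {v: [] for v in vertices}          # G_0 has no edges
        for (u, v) in edges:                     # add e_k = {u, v}
            z = 0
            for _ in range(a_m):
                col = greedy_coloring(adj, q)
                for _ in range(t_steps):         # approx. uniform on Omega_{k-1}
                    col = glauber_step(col, adj, q)
                z += (col[u] != col[v])          # does the sample lie in Omega_k?
            W *= z / a_m                         # w_k
            adj[u].append(v)                     # move on to G_k
            adj[v].append(u)
        return W

For example, estimate_colorings(list("abcd"), [("a","b"), ("b","c"), ("c","d")], q=5, t_steps=200, a_m=500) approximates the number of proper 5-colorings of a path on four vertices, which is exactly $5 \cdot 4^3 = 320$.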

34 Theorem 14.12 – Proof(2/8)
Lemma 1: for each $k$, $\frac{1}{2} \le \frac{|\Omega_k|}{|\Omega_{k-1}|} \le 1$.
Suppose the graphs $G_k$ and $G_{k-1}$ differ in the edge $\{u, v\}$.
Any coloring in $\Omega_{k-1} \setminus \Omega_k$ assigns the same color to $u$ and $v$.
We can map each such coloring to a coloring of $G_k$ by recoloring vertex $u$ with one of at least $q - \Delta$ available colors; the map is injective, since the original color of $u$ can be read off from $v$.
$|\Omega_{k-1}| - |\Omega_k| \le |\Omega_k| \;\Rightarrow\; \frac{1}{2} \le \frac{|\Omega_k|}{|\Omega_{k-1}|} \le 1$

35 Theorem 14.12 – Proof(3/8)
[Diagram: the graphs $G_3$ and $G_4$, differing in the edge $\{u, v\}$; a coloring that assigns $u$ and $v$ the same color is repaired by recoloring $u$.]

36 Theorem 14.12 – Proof(4/8)
Reminder, Theorem 14.8: consider the Glauber dynamics chain for random proper q-colorings of a graph with $n$ vertices and maximum degree $\Delta$. If $q > 2\Delta$, then
$t_{mix}(\epsilon) \le \frac{n(\log n - \log \epsilon)}{c(q, \Delta)}$
Let $t(m, \varepsilon) := t_{mix}\big(\frac{\varepsilon}{6m}\big)$; then
$\|P^{t(m,\varepsilon)}(x_0, \cdot) - \pi_k\|_{TV} < \frac{\varepsilon}{6m}$

37 Theorem 14.12 – Proof(5/8)
Let $Z_{k,i}$ be the indicator that the $i$-th sample is an element of $\Omega_{k+1}$.
$\Big|\mathbb{E}[Z_{k,i}] - \frac{|\Omega_{k+1}|}{|\Omega_k|}\Big| = \big|P^{t(m,\varepsilon)}(x_{k,i}, \Omega_{k+1}) - \pi_k(\Omega_{k+1})\big| \le \|P^{t(m,\varepsilon)}(x_{k,i}, \cdot) - \pi_k\|_{TV} \le \frac{\varepsilon}{6m}$
(using $\|\mu - \nu\|_{TV} = \max_{A \subseteq \Omega} |\mu(A) - \nu(A)|$)
Since $\frac{|\Omega_{k+1}|}{|\Omega_k|} \ge \frac{1}{2}$:
$\Big|\mathbb{E}[Z_{k,i}] - \frac{|\Omega_{k+1}|}{|\Omega_k|}\Big| \le \frac{\epsilon}{6m} \le \frac{\epsilon}{3m} \cdot \frac{|\Omega_{k+1}|}{|\Omega_k|}$

38 Theorem 14.12 – Proof(6/8)
Let $W_k := \frac{1}{a_m} \sum_{i=1}^{a_m} Z_{k,i}$. Then:
$\Big|\mathbb{E}[W_k] - \frac{|\Omega_{k+1}|}{|\Omega_k|}\Big| = \big|\mathbb{E}[W_k] - \pi_k(\Omega_{k+1})\big| \le \frac{\varepsilon}{3m} \cdot \frac{|\Omega_{k+1}|}{|\Omega_k|}$
$\mathbb{E}[W_k] \ge \Big(1 - \frac{\epsilon}{3m}\Big) \cdot \frac{|\Omega_{k+1}|}{|\Omega_k|} \ge \frac{1}{3}$ (since $\frac{|\Omega_{k+1}|}{|\Omega_k|} \ge \frac{1}{2}$)
$\frac{\mathrm{Var}(Z_{k,i})}{\mathbb{E}^2[Z_{k,i}]} = \mathbb{E}^{-1}[Z_{k,i}] - 1 \le 2$ (for an indicator, $\mathrm{Var}(Z) = \mathbb{E}[Z](1 - \mathbb{E}[Z])$)
$\frac{\mathrm{Var}(W_k)}{\mathbb{E}^2[W_k]} \le \frac{2}{a_m}$

39 Hope you are not approximate counting sheep!

40 Theorem 14.12 – Proof(7/8)
Let $W = \prod_{i=1}^{m} W_i$ (the $W_i$ are independent across stages). Then:
$\mathbb{E}[W] = \prod_{i=1}^{m} \mathbb{E}[W_i] \le \Big(1 + \frac{\epsilon}{3m}\Big)^m q^{-n} |\Omega| \le e^{\epsilon/3}\, q^{-n} |\Omega| \le \Big(1 + \frac{\epsilon}{2}\Big) q^{-n} |\Omega|$
$\frac{\mathrm{Var}(W)}{\mathbb{E}^2[W]} = \frac{\mathbb{E}[W^2]}{\mathbb{E}^2[W]} - 1 = \prod_{i=1}^{m} \frac{\mathbb{E}[W_i^2]}{\mathbb{E}^2[W_i]} - 1 = \prod_{i=1}^{m} \Big(1 + \frac{\mathrm{Var}(W_i)}{\mathbb{E}^2[W_i]}\Big) - 1 \le \Big(1 + \frac{2}{a_m}\Big)^m - 1 \le e^{2m/a_m} - 1$
Take $a_m = 74\,\epsilon^{-2} m$; then $\frac{\mathrm{Var}(W)}{\mathbb{E}^2[W]} \le \frac{\epsilon^2}{36}$.
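To verify that this choice of $a_m$ suffices, use $e^x - 1 \le x\,e^x$ with $x = \frac{2m}{a_m} = \frac{\epsilon^2}{37}$ and $\epsilon \le 1$:

    $e^{2m/a_m} - 1 = e^{\epsilon^2/37} - 1 \le \frac{\epsilon^2}{37}\, e^{1/37} < \frac{\epsilon^2}{36}$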

41 Theorem 14.12 – Proof(8/8)
We have:
$\mathbb{E}[q^n W] \le \Big(1 + \frac{\epsilon}{2}\Big) |\Omega|$ and $\frac{\mathrm{Var}(W)}{\mathbb{E}^2[W]} \le \frac{\epsilon^2}{36}$
Chebyshev's inequality: $\Pr\{|X - \mu| \ge k\sigma\} \le \frac{1}{k^2}$
Applying it with $k = 2$ (so that $k\sigma \le \frac{\epsilon}{3}\,\mathbb{E}[W]$) gives $\Pr\{|W - \mathbb{E}[W]| \ge \frac{\epsilon}{3}\,\mathbb{E}[W]\} \le \frac{1}{4}$; combined with the bound on the bias of $\mathbb{E}[W]$, this yields
$P\{(1 - \epsilon)\,|\Omega| \le q^n W \le (1 + \epsilon)\,|\Omega|\} \ge \frac{3}{4}$

42 Counting Linear Extensions
Define a sequence of subproblems:
- Let $P = P_0$ be our partial order
- $P_1, P_2, \ldots, P_k$ are partial orders such that $P_{j+1} = P_j \cup \{a_j \prec b_j\}$, where $(a_j, b_j)$ are incomparable in $P_j$
- $P_k$ is a total order (so $|\Omega_{P_k}| = 1$)
$|\Omega_P| = \prod_{i=0}^{k-1} \frac{|\Omega_{P_i}|}{|\Omega_{P_{i+1}}|}$
[Diagram: nested sets $\Omega_{P_k} \subseteq \cdots \subseteq \Omega_{P_1} \subseteq \Omega_P$.]

43 Counting Linear Extensions
[Diagram: the Hasse diagram on $\{a, \ldots, g\}$ from slide 8, refined step by step by adding relations such as $c \prec g$ and $b \prec c$ until a total order remains.]

44 Counting Linear Extensions
For $i = 0, 1, \ldots, k-1$:
- Find incomparable elements $(a_i, b_i)$ using a sorting algorithm
- Create $N$ samples from $\Omega_{P_i}$ by running $t(n, \varepsilon)$ steps of the linear-extension chain, and count in how many samples $a_i < b_i$
- If $N[a_i < b_i] \ge N/2$: set $Z_i = N[a_i < b_i]$ and $P_{i+1} = P_i \cup \{a_i \prec b_i\}$
- Otherwise: set $Z_i = N - N[a_i < b_i]$ and $P_{i+1} = P_i \cup \{b_i \prec a_i\}$
- Define $w_i = \frac{Z_i}{N}$ as the estimator of $\frac{|\Omega_{P_{i+1}}|}{|\Omega_{P_i}|}$
$W = \prod_{i=0}^{k-1} w_i$ is an estimator of $|\Omega_P|^{-1}$. A sketch of this procedure follows below.
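A sketch of this estimator, reusing chain_step from the sketch on slide 12; find_incomparable (which must respect the transitive closure of $P$) and topological_sort are hypothetical helpers standing in for the sorting-based search and the chain's start state:

    def estimate_extensions(elements, P, f, N, t_steps):
        # Sketch: W = prod_i w_i estimates 1/|Omega_P|, so return 1/W.
        # P is a set of precedence pairs (x, y) meaning x ≺ y.
        P = set(P)
        W = 1.0
        while True:
            pair = find_incomparable(elements, P)   # hypothetical helper
            if pair is None:                        # P is now a total order
                return 1.0 / W
            a, b = pair
            is_ext = lambda X: all(X.index(x) < X.index(y) for x, y in P)
            count_ab = 0
            for _ in range(N):
                X = topological_sort(elements, P)   # any extension of P (hypothetical)
                for _ in range(t_steps):
                    X = chain_step(X, f, is_ext)
                count_ab += X.index(a) < X.index(b)
            if count_ab >= N / 2:                   # keep the majority side
                W *= count_ab / N
                P.add((a, b))
            else:
                W *= (N - count_ab) / N
                P.add((b, a))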

45 Proof Outline
- $k \le 2n \log n$ (the maximum number of comparisons in a sorting algorithm)
- Finding an incomparable pair: $O(n \log n)$

46 Proof Outline
Estimate the bias in each $w_k$:
$\Big|\mathbb{E}[W_k] - \frac{|\Omega_{P_{k+1}}|}{|\Omega_{P_k}|}\Big| \le d\big(t(n, \epsilon)\big)$
Unlike the coloring case, we cannot bound $\frac{|\Omega_{P_{k+1}}|}{|\Omega_{P_k}|}$ from below deterministically, only with high probability (the algorithm keeps the majority side), which makes the calculation more difficult.

47 Questions?

