Path Coupling And Approximate Counting

Path Coupling And Approximate Counting
Based on: Faster random generation of linear extensions, R. Bubley and M. Dyer; Mathematical foundations of the Markov chain Monte Carlo method, M. Jerrum; Counting Linear Extensions of a Partial Order, S. Harris
Presented by Tomer Lauterman

Review – transportation metric
We have a state space $\Omega$ and a metric $\rho$ between states, with $\rho(X,Y) \ge 1$ for all $X \ne Y$.
The transportation metric (or Wasserstein metric) is defined on distributions over $\Omega$:
$\rho_K(\mu,\nu) = \inf\{\mathbb{E}[\rho(X,Y)] : (X,Y) \text{ is a coupling of } \mu,\nu\}$
$\|\mu - \nu\|_{TV} \le \rho_K(\mu,\nu)$
There exists an optimal coupling $(X^*, Y^*)$ of $\mu,\nu$ such that $\rho_K(\mu,\nu) = \mathbb{E}[\rho(X^*, Y^*)]$.
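As a concrete check on the inequality $\|\mu - \nu\|_{TV} \le \rho_K(\mu,\nu)$, here is a minimal Python sketch (my own illustration, not from the source papers) of the maximal coupling of two distributions on a small $\Omega$: it makes $X = Y$ as often as possible, and since $\rho(X,Y) \ge 1$ whenever $X \ne Y$, every coupling satisfies $\mathbb{E}[\rho(X,Y)] \ge P(X \ne Y) \ge \|\mu - \nu\|_{TV}$.

```python
import random

def maximal_coupling(mu, nu):
    """Sampler for the maximal coupling of two dicts state -> probability.

    The coupling achieves P(X == Y) = 1 - d_TV(mu, nu), the best possible.
    """
    states = sorted(set(mu) | set(nu))
    overlap = {s: min(mu.get(s, 0.0), nu.get(s, 0.0)) for s in states}
    p_eq = sum(overlap.values())                       # = 1 - d_TV(mu, nu)
    mu_res = {s: mu.get(s, 0.0) - overlap[s] for s in states}  # mu above nu
    nu_res = {s: nu.get(s, 0.0) - overlap[s] for s in states}  # nu above mu

    def draw(weights):
        r = random.uniform(0, sum(weights.values()))
        for s, w in weights.items():
            r -= w
            if r <= 0:
                return s
        return s                                       # guard against rounding

    def sample():
        if random.random() < p_eq:
            x = draw(overlap)                          # X and Y move together
            return x, x
        return draw(mu_res), draw(nu_res)              # X and Y forced apart

    return sample, 1.0 - p_eq

sample, tv = maximal_coupling({'a': 0.5, 'b': 0.3, 'c': 0.2},
                              {'a': 0.2, 'b': 0.3, 'c': 0.5})
disagree = sum(x != y for x, y in (sample() for _ in range(100_000)))
print(tv, disagree / 100_000)   # P(X != Y) concentrates near d_TV = 0.3
```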

Review – path coupling
Define $G(\Omega, E_0)$ with edge lengths $\ell(e)$, and let $\rho(x,y) :=$ the shortest-path distance in $G$.
Assume that for each edge $\{x,y\} \in E_0$ we have a coupling $(X_1, Y_1)$ of the distributions $P(x,\cdot), P(y,\cdot)$ such that $\mathbb{E}_{x,y}[\rho(X_1, Y_1)] \le e^{-\alpha}\,\ell(x,y)$ for some $\alpha > 0$.
Then for any two distributions $\mu,\nu$ on $\Omega$: $\rho_K(\mu P, \nu P) \le e^{-\alpha}\,\rho_K(\mu,\nu)$, and hence
$\|P^t(x,\cdot) - \pi\|_{TV} \le \rho_K(P^t(x,\cdot), \pi) \le e^{-\alpha t}\, diam(\Omega)$
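Why checking only the edges of $E_0$ suffices (a one-line reconstruction of the standard telescoping argument, sketched here for completeness): contraction on edges propagates to arbitrary pairs because $\rho_K$ satisfies the triangle inequality.

```latex
% Let x = z_0, z_1, \dots, z_r = y be a shortest path in G(\Omega, E_0),
% so that \rho(x,y) = \sum_{k=1}^{r} \ell(z_{k-1}, z_k). Then
\rho_K\bigl(P(x,\cdot),\, P(y,\cdot)\bigr)
  \;\le\; \sum_{k=1}^{r} \rho_K\bigl(P(z_{k-1},\cdot),\, P(z_k,\cdot)\bigr)
  \;\le\; \sum_{k=1}^{r} e^{-\alpha}\, \ell(z_{k-1}, z_k)
  \;=\; e^{-\alpha}\, \rho(x,y).
```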

Approximate Counting
Let $\Omega$ be a very large (but finite) set of combinatorial structures (e.g., the possible configurations of a physical system, or the feasible solutions of a combinatorial problem).
Goal: estimate $|\Omega|$, the cardinality of $\Omega$.
The estimation should take time polynomial in both the instance size $n$ and the inverse error tolerance $\epsilon^{-1}$ (an FPRAS).

Goals
Study another use case for path coupling.
Learn an FPRAS technique for approximate counting using Markov chain Monte Carlo.
See how all the course subjects so far play together:
- Create a Markov chain with a target stationary distribution (e.g., a Glauber chain)
- Use rapid mixing (via path coupling) for algorithm design

Path Coupling: A Markov Chain On Linear Extensions
Faster random generation of linear extensions, R. Bubley, M. Dyer

Linear Extensions - Definitions
Partially ordered set $(P, \prec)$ - a set $P$ with an irreflexive, transitive relation. Example: $a \prec b$, $c \prec d$.
Total order $(P, <)$ - a relation such that for all distinct $a, b \in P$ either $a < b$ or $b < a$. Example: $a < b < c < d$.
A linear extension of a partial order $(P, \prec)$ is a total order $(P, <)$ that preserves $(P, \prec)$. Example: $a \prec b$, $c \prec d$ $\Rightarrow$ $\{abcd, cdab, acbd, \dots\}$.

Partial Order - Definitions
[Figure: Hasse diagram of a partial order on $a, \dots, g$, next to a full (total) order on the same elements]

Notations
Transposition operator $\sigma_{i,j}$: if $X = (a_1\, a_2 \cdots a_{i-1}\, a_i \cdots a_j\, a_{j+1} \cdots a_n)$, then $Y = \sigma_{i,j} X = (a_1\, a_2 \cdots a_{i-1}\, a_j \cdots a_i\, a_{j+1} \cdots a_n)$, i.e., $\sigma_{i,j}$ swaps the elements in positions $i$ and $j$.
$f$ - a concave probability distribution on $\{1, 2, \dots, n-1\}$
$\Omega$ - all possible linear extensions of $(P, \prec)$

The Markov Chain
Given $(P, \prec)$, we want to sample a linear extension uniformly at random. We will design a Markov chain such that:
- each state is a linear extension;
- the stationary distribution is uniform;
- the chain is rapidly mixing.
[Figure: part of the transition diagram on the linear extensions of $a \prec b$, $c \prec d$ (states $abcd$, $acbd$, $cabd$, $acdb$): legal moves have probability $\frac{1}{6}$, self-loops $\frac{5}{6}$ or $\frac{1}{2}$]

The Markov Chain
- The chain is aperiodic.
- The chain is irreducible.
- The chain is symmetric: $Y = \sigma_{i,j} X \iff X = \sigma_{i,j} Y$.
$\Rightarrow$ the stationary distribution $\pi$ is the uniform distribution.
[Figure: the same transition diagram for $a \prec b$, $c \prec d$]

The Markov Chain
Denote by $X_t \in \Omega$ the state of the Markov chain at time $t$. One step:
- Choose $p \in \{1, 2, \dots, n-1\}$ according to $f$.
- With probability $\frac{1}{2}$: $X_{t+1} = X_t$ ("lazy walk").
- With probability $\frac{1}{2}$: if $\sigma_{p,p+1} X_t \notin \Omega$ then $X_{t+1} = X_t$; otherwise $X_{t+1} = \sigma_{p,p+1} X_t$.
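A minimal Python sketch of one step of this chain (the tuple representation and the helper name `precedes` are my own choices, not from Bubley-Dyer):

```python
import random

def chain_step(x, precedes, f):
    """One step of the lazy adjacent-transposition chain on linear extensions.

    x        -- current linear extension, as a tuple of elements
    precedes -- precedes(a, b) is True iff a comes before b in (P, <)
    f        -- weights f[0], ..., f[n-2] for choosing position p = index + 1
    """
    p = random.choices(range(len(x) - 1), weights=f)[0]  # 0-indexed position
    if random.random() < 0.5:                            # lazy: hold w.p. 1/2
        return x
    a, b = x[p], x[p + 1]
    if precedes(a, b):                                   # swap would leave Omega
        return x
    return x[:p] + (b, a) + x[p + 1:]                    # legal transposition
```

Since $x$ is already a linear extension, the swap of $x_p, x_{p+1}$ is illegal exactly when $x_p \prec x_{p+1}$ in the partial order, which is all the legality check needs.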

The Markov Chain
[Figure: transition diagram for $a \prec b$, $c \prec d$: e.g., from $abcd$ the only legal move is to $acbd$, with probability $\frac{1}{2} f(2)$, leaving a self-loop of $\frac{1}{2}(1 + f(1) + f(3))$; from $acbd$ the moves to $cabd$, $abcd$, $acdb$ have probabilities $\frac{1}{2}f(1)$, $\frac{1}{2}f(2)$, $\frac{1}{2}f(3)$]

Transposition Distance
Define $G(\Omega, E_0)$ and $\rho(X,Y)$:
- $\{X,Y\} \in E_0$ if there exist $i,j$ such that $Y = \sigma_{i,j} X$
- $\ell(X,Y) = j - i$
- $\rho(X,Y) =$ length of the shortest path from $X$ to $Y$ in $G$
[Figure: the graph $G(\Omega, E_0)$ for $a \prec b$, $a \prec c$ on the states $abcd$, $acbd$, $abdc$, $acdb$, $adbc$, $adcb$, $dabc$, $dacb$, with edge lengths 1 and 2]

The Coupling
Assume $Y = \sigma_{i,j} X$.
- Choose $p \in \{1, 2, \dots, n-1\}$ according to $f$, and a fair coin $c_x \in \{0,1\}$.
- If $j - i = 1$ and $p = i$, then $c_y = 1 - c_x$; otherwise $c_y = c_x$.
- If $c_x = 0$ or $\sigma_{p,p+1} X \notin \Omega$, then $X' = X$; otherwise $X' = \sigma_{p,p+1} X$ (and similarly for $Y$, using $c_y$).
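A sketch of the coupled update under the same representation as before (again my own naming, not the paper's); the anti-correlated coin when $j - i = 1$ and $p = i$ is what lets adjacent states merge:

```python
import random

def coupled_step(x, y, i, j, precedes, f):
    """One coupled step for X and Y = sigma_{i,j} X (i, j are 1-based)."""
    p = random.choices(range(1, len(x)), weights=f)[0]   # p in {1,...,n-1}
    c_x = random.random() < 0.5                          # X's fair coin
    c_y = (not c_x) if (j - i == 1 and p == i) else c_x  # anti-correlate

    def move(z, coin):
        a, b = z[p - 1], z[p]                            # positions p, p+1
        if not coin or precedes(a, b):                   # lazy, or illegal
            return z
        return z[:p - 1] + (b, a) + z[p + 1:]

    return move(x, c_x), move(y, c_y)
```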

Bounding Mixing Time - Overview
$\|P^t(x,\cdot) - \pi\|_{TV} \le e^{-\alpha t}\, diam(\Omega)$
We need to calculate two parameters:
- $\alpha$ := the contraction rate of $\rho_K$ = the expected one-step "shrink" ratio of the $\rho$-distance.
- $diam(\Omega)$ := the maximum $\rho$-distance in $G(\Omega, E_0)$.

Calculate The Shrink Ratio $\mathbb{E}[\rho(X', Y')]$
$X = x_1\, x_2 \dots x_{i-1}\, x_i \dots x_j\, x_{j+1} \dots x_n$
$Y = x_1\, x_2 \dots x_{i-1}\, x_j \dots x_i\, x_{j+1} \dots x_n$
Three cases for $p$:
(a) $p \notin \{i-1, i, j-1, j\}$
(b) $p \in \{i-1, j\}$
(c) $p \in \{i, j-1\}$, split into $i < j-1$ and $i = j-1$

Calculate The Shrink Ratio
Case (a): $p \notin \{i-1, i, j-1, j\}$. The two states change in the same manner, so $\rho(X', Y') = \ell(X,Y) = j - i$.
$X = x_1\, x_2 \dots x_{i-1}\, x_i \dots x_{j-1}\, x_j\, x_{j+1} \dots x_n$
$Y = x_1\, x_2 \dots x_{i-1}\, x_j \dots x_{j-1}\, x_i\, x_{j+1} \dots x_n$

Calculate The Shrink Ratio
Case (b): $p \in \{i-1, j\}$ (shown for $p = i-1$).
$X = x_1\, x_2 \dots x_{i-1}\, x_i \dots x_{j-1}\, x_j\, x_{j+1} \dots x_n$
$Y = x_1\, x_2 \dots x_{i-1}\, x_j \dots x_{j-1}\, x_i\, x_{j+1} \dots x_n$
- If $\sigma_{p,p+1} X, \sigma_{p,p+1} Y \notin \Omega$: neither state moves, so $\rho(X', Y') = \ell(X,Y)$.
- If both $\sigma_{p,p+1} X, \sigma_{p,p+1} Y \in \Omega$: both move together,
$X' = x_1\, x_2 \dots x_i\, x_{i-1} \dots x_{j-1}\, x_j\, x_{j+1} \dots x_n$
$Y' = x_1\, x_2 \dots x_j\, x_{i-1} \dots x_{j-1}\, x_i\, x_{j+1} \dots x_n$
so $Y' = \sigma_{i-1,j} X'$ and $\rho(X', Y') \le \ell(X,Y) + 1$.
- If only one of $\sigma_{p,p+1} X, \sigma_{p,p+1} Y \in \Omega$: only one state moves (e.g., $X' = \sigma_{i-1,i} X$, $Y' = Y$, so $Y' = \sigma_{i,j}\, \sigma_{i-1,i}\, X'$) and again $\rho(X', Y') \le \ell(X,Y) + 1$.

Calculate The Shrink Ratio
Case (c): $p \in \{i, j-1\}$. Note that $x_i$ and $x_j$ are incomparable to each of $x_{i+1}, \dots, x_{j-1}$ in $(P, \prec)$ (both $X$ and $Y$ are linear extensions, and $x_i, x_j$ appear on opposite sides of these elements in the two orders), so the swap is always legal.
Sub-case $i < j-1$:
$X = x_1\, x_2 \dots x_{i-1}\, x_i \dots x_{j-1}\, x_j\, x_{j+1} \dots x_n$
$Y = x_1\, x_2 \dots x_{i-1}\, x_j \dots x_{j-1}\, x_i\, x_{j+1} \dots x_n$
When the coin allows the swap, the states move one transposition closer: $\rho(X', Y') = \ell(X,Y) - 1$.
Sub-case $i = j-1$:
$X = x_1\, x_2 \dots x_{i-1}\, x_i\, x_j\, x_{j+1} \dots x_n$
$Y = x_1\, x_2 \dots x_{i-1}\, x_j\, x_i\, x_{j+1} \dots x_n$
Remember the coupling: $c_y = 1 - c_x$, so exactly one of the two states swaps and $X' = Y'$, i.e., $\rho(X', Y') = \ell(X,Y) - 1 = 0$.

Calculate The Shrink Ratio $\mathbb{E}[\rho(X', Y')]$
Collecting the cases (conditioned on the chosen $p$):
(a) $p \notin \{i-1, i, j-1, j\}$: $\mathbb{E}[\rho(X', Y')] = \ell(X,Y)$
(b) $p \in \{i-1, j\}$: $\mathbb{E}[\rho(X', Y')] \le \ell(X,Y) + \frac{1}{2}$
(c) $p \in \{i, j-1\}$, $i < j-1$: $\mathbb{E}[\rho(X', Y')] = \ell(X,Y) - \frac{1}{2}$
(c) $p = i = j-1$: $\mathbb{E}[\rho(X', Y')] = \ell(X,Y) - 1$
Averaging over $p \sim f$:
$\mathbb{E}[\rho(X', Y')] - \ell(X,Y) \le \frac{1}{2}\bigl(f(i-1) - f(i) - f(j-1) + f(j)\bigr)$

Calculate The Shrink Ratio 𝐸 𝜌 𝑋,𝑌 −ℓ 𝑋,𝑌 ≤ 1 2 𝑓 𝑖−1 −𝑓 𝑖 −𝑓 𝑗−1 +𝑓 𝑗 ≤0 We will use 𝑓 𝑖 = 𝑖 𝑛−𝑖 𝐾 , 𝐾= 1 6 𝑛 3 −𝑛 <− 𝒇 𝒊−𝟏 +𝒇 𝒋 𝑓(𝑗)) 𝑓(𝑗−1) 𝑓(𝑖) 𝑓(𝑖−1) 𝑓 𝑗 −𝑎 𝑓 𝑖−1 +𝑎

Calculate The Shrink Ratio
$f(i) = \frac{i(n-i)}{K}$, $K = \frac{1}{6}(n^3 - n)$
$f(i) - f(i-1) = \frac{i(n-i)}{K} - \frac{(i-1)(n-i+1)}{K} = \frac{n+1-2i}{K}$
$\mathbb{E}[\rho(X', Y')] \le \ell(X,Y) + \frac{1}{2}\bigl(f(i-1) - f(i) - f(j-1) + f(j)\bigr) = \ell(X,Y) + \frac{1}{2}\Bigl(\frac{n+1-2j}{K} - \frac{n+1-2i}{K}\Bigr) = \ell(X,Y) - \frac{j-i}{K} = \Bigl(1 - \frac{1}{K}\Bigr)\ell(X,Y)$
$\mathbb{E}[\rho(X', Y')] \le \bigl(1 - \frac{1}{K}\bigr)\ell(X,Y) \le e^{-\frac{1}{K}}\,\ell(X,Y)$, so $\alpha = \frac{1}{K}$.

Calculate The Diameter
Definition: For total orders $X, Y$ (writing $X(i)$ for the position of element $i$), Spearman's footrule is
$\delta_s(X,Y) = \frac{1}{2}\sum_{i=1}^n |X(i) - Y(i)|$
Lemma 1: The transposition distance equals $\delta_s$.
Lemma 2: $\max_{X,Y} \delta_s(X,Y) \le \frac{n^2}{4}$
Corollary: $diam(\Omega) = \max_{X,Y} \rho(X,Y) = \max_{X,Y} \delta_s(X,Y) \le \frac{n^2}{4}$
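A small helper (mine, for illustration) computing $\delta_s$ for two total orders given as tuples; by Lemma 1 this also computes the transposition distance $\rho$:

```python
def footrule(x, y):
    """Spearman's footrule between two total orders on the same elements."""
    pos_y = {elem: k for k, elem in enumerate(y)}
    return sum(abs(k - pos_y[elem]) for k, elem in enumerate(x)) // 2

# The reversal attains the maximum: for n = 4 this prints 4 = n^2 / 4.
print(footrule(('a', 'b', 'c', 'd'), ('d', 'c', 'b', 'a')))
```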

Bounding The Mixing Time
$d(t) = \|P^t(x,\cdot) - \pi\|_{TV} \le e^{-\frac{t}{K}}\, diam(\Omega)$
$t_{mix}(\epsilon) = \min\{t : d(t) \le \epsilon\} \;\Rightarrow\; t_{mix}(\epsilon) \le K \ln\bigl(diam(\Omega)\,\epsilon^{-1}\bigr)$
Assigning the values $K = \frac{1}{6}(n^3 - n)$ and $diam(\Omega) \le \frac{n^2}{4}$:
$t_{mix}(\epsilon) \le \frac{1}{6}(n^3 - n)\,\ln\Bigl(\frac{n^2}{4}\,\epsilon^{-1}\Bigr)$

Approximate Counting
Mathematical foundations of the Markov chain Monte Carlo method, M. Jerrum (q-coloring)
Counting Linear Extensions of a Partial Order, S. Harris

FPRAS
A fully polynomial randomized approximation scheme approximates the value of the function $n \mapsto |\Omega_n|$ with run time polynomial in both the instance size $n$ and the inverse error tolerance $\epsilon^{-1}$: it outputs a random variable $W$ with
$P\{(1-\epsilon)|\Omega| \le W \le (1+\epsilon)|\Omega|\} \ge 1 - \eta$

Two approaches for counting
Direct Monte Carlo ("dart throwing"):
- Bound $S$ by a simpler set $S'$ such that $|S'|$ can be computed easily, sample uniformly from $S'$, and count how often the sample lands in $S$ (a sketch follows this list).
Markov chain Monte Carlo:
- Create a sequence of subproblems $|S_0|, \dots, |S_n| = |S|$.
- Estimate the ratios $\frac{|S_{k+1}|}{|S_k|}$ using Markov chain sampling.
- $|S| = \prod_{k=0}^{n-1} \frac{|S_{k+1}|}{|S_k|} \times |S_0|$
[Figure: nested sets $S_0 \supseteq \dots \supseteq S_{n-1} \supseteq S$]
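A toy dart-throwing estimator (illustrative sketch only; the example set and helper names are mine): embed $S$ in a superset $S'$ of known size, sample uniformly from $S'$, and scale the hit rate by $|S'|$. This works well only when $|S|/|S'|$ is not too small, which is exactly why the MCMC route is needed for colorings and linear extensions.

```python
import random

def dart_throwing(size_sprime, sample_sprime, in_s, trials):
    """Estimate |S| as |S'| times the fraction of uniform samples landing in S."""
    hits = sum(in_s(sample_sprime()) for _ in range(trials))
    return size_sprime * hits / trials

# Toy example: S = length-10 binary strings with no two adjacent ones,
# inside S' = all length-10 binary strings (|S'| = 1024, |S| = 144).
n = 10
estimate = dart_throwing(
    2 ** n,
    lambda: tuple(random.randint(0, 1) for _ in range(n)),
    lambda s: all(not (a and b) for a, b in zip(s, s[1:])),
    trials=200_000,
)
print(estimate)   # concentrates near 144
```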

Q-Coloring Problem
Proper q-colorings of a graph $G = (V, E)$ are elements $x \in \Omega = \{1, 2, \dots, q\}^V$ such that $x(v) \ne x(w)$ for every $\{v, w\} \in E$.
Theorem 14.8: Consider the Glauber dynamics chain for random proper q-colorings of a graph with $n$ vertices and maximum degree $\Delta$. If $q > 2\Delta$, then the mixing time satisfies
$t_{mix}(\epsilon) \le \frac{q - \Delta}{q - 2\Delta}\, n\, (\log n + \log \epsilon^{-1})$
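A minimal sketch of one Glauber update for proper q-colorings (the adjacency-dict representation is my choice, not the book's):

```python
import random

def glauber_step(coloring, adj, q):
    """One Glauber dynamics step for proper q-colorings.

    coloring -- dict vertex -> color in {0, ..., q-1}, assumed proper
    adj      -- dict vertex -> list of neighbors
    """
    v = random.choice(list(coloring))                 # uniform random vertex
    used = {coloring[w] for w in adj[v]}              # colors of neighbors
    allowed = [c for c in range(q) if c not in used]  # >= q - deg(v) options
    new = dict(coloring)
    new[v] = random.choice(allowed)                   # uniform allowed color
    return new
```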

Q-Coloring Problem
[Figure: a graph whose edges $e_1, e_2, e_3, e_4$ will be added one at a time]

Q-Coloring Approximate Counting
Theorem 14.12: Let $\Omega$ be the set of all proper q-colorings of a graph $G$ with $n$ vertices and maximal degree $\Delta$. Fix $q > 2\Delta$, and set $c(q, \Delta) = 1 - \frac{\Delta}{q - \Delta}$. Given a graph with $m$ edges and $\epsilon$, there is a random variable $W$ which can be simulated using no more than
$\frac{n \log n + n \log(6m/\epsilon)}{c(q, \Delta)} \cdot \frac{74m}{\epsilon^2} \cdot m$
uniform random variables and which satisfies
$P\{(1-\epsilon)|\Omega| \le W \le (1+\epsilon)|\Omega|\} \ge \frac{3}{4}$

Theorem 14.12 – Proof (1/8)
Definitions:
- $G_0 = (V, \emptyset)$ - the graph on the vertices of $G$ with no edges.
- $G_k = (V, E_k)$, where $E_k = E_{k-1} \cup \{e_k\}$ for some $e_k \in E \setminus E_{k-1}$ (add the edges of $G$ one at a time, so $G = G_m$).
- $\Omega_k$ - the set of proper colorings of $G_k$; $|\Omega_0| = q^n$.
$|\Omega| = |\Omega_0| \times \frac{|\Omega_1|}{|\Omega_0|} \times \dots \times \frac{|\Omega_m|}{|\Omega_{m-1}|}$
[Figure: nested sets $\Omega_0 \supseteq \Omega_1 \supseteq \dots \supseteq \Omega_{m-1} \supseteq \Omega$]

Theorem 14.12 – Proof (2/8)
Algorithm: for $k = 1 \dots m$:
- Sample $a_m$ random q-colorings from $\Omega_{k-1}$, each obtained by running $t(m, \epsilon)$ steps of the Glauber chain.
- Count $z_k$ - how many samples belong to $\Omega_k$.
- Define $w_k = \frac{z_k}{a_m}$ as the estimate of $p_k = \frac{|\Omega_k|}{|\Omega_{k-1}|}$.
$W = |\Omega_0| \cdot \prod_{k=1}^m w_k$ is an estimator of $|\Omega|$ (a code sketch follows).
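A sketch of the whole estimator, reusing `glauber_step` from above (the greedy initial coloring and the parameter names `t`, `a_m` are my own scaffolding; the theorem sets $t = t(m,\epsilon)$ and $a_m = 74 m / \epsilon^2$):

```python
def greedy_coloring(vertices, adj, q):
    """Some proper coloring, found greedily; exists whenever q > max degree."""
    col = {}
    for v in vertices:
        used = {col[w] for w in adj[v] if w in col}
        col[v] = next(c for c in range(q) if c not in used)
    return col

def count_colorings(vertices, edges, q, t, a_m):
    """MCMC estimator W = q^n * prod_k w_k of the number of proper q-colorings."""
    adj = {v: [] for v in vertices}                  # current graph G_{k-1}
    w = float(q ** len(vertices))                    # |Omega_0| = q^n
    for (u, v) in edges:                             # add e_k = {u, v}
        start = greedy_coloring(vertices, adj, q)
        hits = 0
        for _ in range(a_m):                         # a_m independent samples
            col = start
            for _ in range(t):                       # burn-in on Omega_{k-1}
                col = glauber_step(col, adj, q)
            hits += (col[u] != col[v])               # still proper in G_k?
        w *= hits / a_m                              # w_k ~ |Omega_k|/|Omega_{k-1}|
        adj[u].append(v)
        adj[v].append(u)
    return w
```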

Theorem 14.12 – Proof (2/8)
Lemma 1: for each $k$, $\frac{1}{2} \le \frac{|\Omega_k|}{|\Omega_{k-1}|} \le 1$.
Suppose the graphs $G_k$ and $G_{k-1}$ differ in the edge $\{u, v\}$. Any coloring in $\Omega_{k-1} \setminus \Omega_k$ assigns the same color to $u$ and $v$. We can map each such coloring to a coloring in $\Omega_k$ by recoloring vertex $u$ with one of its at least $q - \Delta$ allowed colors (say, the smallest one); the map is injective, since the original color of $u$ can be read off from $v$. Hence
$|\Omega_{k-1}| - |\Omega_k| \le |\Omega_k| \;\Rightarrow\; \frac{1}{2} \le \frac{|\Omega_k|}{|\Omega_{k-1}|} \le 1$

Theorem 14.12 – Proof (3/8)
[Figure: $G_3$ vs. $G_4$, differing in the edge $\{u, v\}$; colorings with $u, v$ equal are recolored at $u$]

Theorem 14.12 – Proof (4/8)
Reminder (Theorem 14.8): for the Glauber dynamics chain on proper q-colorings of a graph with $n$ vertices and maximum degree $\Delta$, if $q > 2\Delta$ then
$t_{mix}(\epsilon) \le \frac{n(\log n - \log \epsilon)}{c(q, \Delta)}$
Let $t(m, \epsilon) := t_{mix}\bigl(\frac{\epsilon}{6m}\bigr)$; then $\|P^{t(m,\epsilon)}(x_0, \cdot) - \pi_k\|_{TV} < \frac{\epsilon}{6m}$.

Theorem 14.12 – Proof (5/8)
Let $Z_{k,i}$ be the indicator that the $i$-th sample (from $\Omega_k$) is an element of $\Omega_{k+1}$.
$\bigl|\mathbb{E}[Z_{k,i}] - \frac{|\Omega_{k+1}|}{|\Omega_k|}\bigr| = \bigl|P^{t(m,\epsilon)}(x_{k,i}, \Omega_{k+1}) - \pi_k(\Omega_{k+1})\bigr| \le \|P^{t(m,\epsilon)}(x_{k,i}, \cdot) - \pi_k\|_{TV} \le \frac{\epsilon}{6m}$
(recall $\|\mu - \nu\|_{TV} = \max_{A \subseteq \Omega} |\mu(A) - \nu(A)|$). Since $\frac{|\Omega_{k+1}|}{|\Omega_k|} \ge \frac{1}{2}$:
$\bigl|\mathbb{E}[Z_{k,i}] - \frac{|\Omega_{k+1}|}{|\Omega_k|}\bigr| \le \frac{\epsilon}{6m} \le \frac{\epsilon}{3m} \cdot \frac{|\Omega_{k+1}|}{|\Omega_k|}$

Theorem 14.12 – Proof (6/8)
Let $W_k := \frac{1}{a_m} \sum_{i=1}^{a_m} Z_{k,i}$. By the bias bound,
$\bigl|\mathbb{E}[W_k] - \frac{|\Omega_{k+1}|}{|\Omega_k|}\bigr| \le \frac{\epsilon}{3m} \cdot \frac{|\Omega_{k+1}|}{|\Omega_k|}$
so $\mathbb{E}[W_k] \ge \bigl(1 - \frac{\epsilon}{3m}\bigr) \cdot \frac{|\Omega_{k+1}|}{|\Omega_k|} \ge \frac{1}{3}$ (using $\frac{|\Omega_{k+1}|}{|\Omega_k|} \ge \frac{1}{2}$).
Since $Z_{k,i}$ is an indicator, $\frac{Var(Z_{k,i})}{\mathbb{E}^2[Z_{k,i}]} = \mathbb{E}^{-1}[Z_{k,i}] - 1 \le 2$, and averaging $a_m$ independent samples gives
$\frac{Var(W_k)}{\mathbb{E}^2[W_k]} \le \frac{2}{a_m}$

Hope you are not approximate counting sheep!

Theorem 14.12 – Proof (7/8)
Let $W = \prod_{i=1}^m W_i$. Since the $W_i$ are independent,
$\mathbb{E}[W] = \prod_{i=1}^m \mathbb{E}[W_i] \le \bigl(1 + \frac{\epsilon}{3m}\bigr)^m q^{-n} |\Omega| \le e^{\epsilon/3}\, q^{-n} |\Omega| \le \bigl(1 + \frac{\epsilon}{2}\bigr) q^{-n} |\Omega|$
$\frac{Var(W)}{\mathbb{E}^2[W]} = \frac{\mathbb{E}[W^2]}{\mathbb{E}^2[W]} - 1 = \prod_{i=1}^m \frac{\mathbb{E}[W_i^2]}{\mathbb{E}^2[W_i]} - 1 = \prod_{i=1}^m \Bigl(1 + \frac{Var(W_i)}{\mathbb{E}^2[W_i]}\Bigr) - 1 \le \Bigl(1 + \frac{2}{a_m}\Bigr)^m - 1 \le e^{2m/a_m} - 1$
Take $a_m = 74\,\epsilon^{-2} m$; then $\frac{Var(W)}{\mathbb{E}^2[W]} \le \frac{\epsilon^2}{36}$.

Theorem 14.12 – Proof (8/8)
We have $\mathbb{E}[q^n W] \le \bigl(1 + \frac{\epsilon}{2}\bigr)|\Omega|$ (and symmetrically $\mathbb{E}[q^n W] \ge \bigl(1 - \frac{\epsilon}{2}\bigr)|\Omega|$), and $\frac{Var(W)}{\mathbb{E}^2[W]} \le \frac{\epsilon^2}{36}$.
Chebyshev's inequality: $\Pr(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}$.
With $k = 2$: with probability at least $\frac{3}{4}$, $|W - \mathbb{E}[W]| \le 2\sigma \le \frac{\epsilon}{3}\mathbb{E}[W]$; combined with the bounds on $\mathbb{E}[q^n W]$ this yields
$P\{(1-\epsilon)|\Omega| \le q^n W \le (1+\epsilon)|\Omega|\} \ge \frac{3}{4}$ ∎

Counting Linear Extensions
Define a sequence of subproblems:
- Let $P = P_0$ be our partial order.
- $P_1, P_2, \dots, P_k$ - partial orders such that $P_{j+1} = P_j \cup \{a_j \prec b_j\}$, where $a_j, b_j$ are incomparable in $P_j$.
- $P_k$ is a full (total) order, so $|\Omega_{P_k}| = 1$.
$|\Omega_P| = \prod_{i=0}^{k-1} \frac{|\Omega_{P_i}|}{|\Omega_{P_{i+1}}|}$
[Figure: nested sets $\Omega_{P_k} \subseteq \dots \subseteq \Omega_{P_1} \subseteq \Omega_P$]

Counting Linear Extensions
[Figure: Hasse diagrams on $a, \dots, g$, showing relations such as $b < c$ and $c < g$ being added to the partial order]

Counting Linear Extensions
For $i = 0, 1, \dots, k-1$:
- Find an incomparable pair $(a_i, b_i)$ (via a sorting algorithm).
- Create $N$ samples from $\Omega_{P_i}$, each by running $t(n, \epsilon)$ steps of the chain, and count in how many samples $a_i < b_i$.
- If $N[a_i < b_i] \ge N/2$: set $Z_i = N[a_i < b_i]$ and $P_{i+1} = P_i \cup \{a_i \prec b_i\}$.
- Otherwise: set $Z_i = N - N[a_i < b_i]$ and $P_{i+1} = P_i \cup \{b_i \prec a_i\}$.
- Define $w_i = \frac{Z_i}{N}$ as the estimator of $\frac{|\Omega_{P_{i+1}}|}{|\Omega_{P_i}|}$.
$W = \prod_{i=0}^{k-1} w_i$ is an estimator of $|\Omega_P|^{-1}$ (a code sketch follows).
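A sketch of this loop, reusing `chain_step` from Part 1. The helpers `find_incomparable` (returns an incomparable pair of the current order, or None once it is total) and `topological_order` (any linear extension of the current order) are hypothetical, and for the chain's legality check to stay correct `prec` would have to return the transitive closure of $P_i$, which is elided here:

```python
def count_linear_extensions(elements, precedes, f, t, N):
    """Estimate |Omega_P| via W = prod_i w_i, an estimator of 1/|Omega_P|.

    precedes(a, b) -- True iff a < b in the partial order P
    t, N           -- burn-in steps per sample, samples per ratio
    """
    extra = set()                                    # relations added so far
    def prec(a, b):                                  # NOTE: should really be
        return precedes(a, b) or (a, b) in extra     # the transitive closure
    inv_w = 1.0                                      # running prod of N / Z_i
    while (pair := find_incomparable(elements, prec)) is not None:
        a, b = pair
        count_ab = 0
        for _ in range(N):
            x = topological_order(elements, prec)    # start state in Omega_{P_i}
            for _ in range(t):
                x = chain_step(x, prec, f)
            count_ab += x.index(a) < x.index(b)
        if count_ab >= N / 2:                        # keep the majority relation
            z_i, extra = count_ab, extra | {(a, b)}
        else:
            z_i, extra = N - count_ab, extra | {(b, a)}
        inv_w *= N / z_i                             # 1 / w_i = N / Z_i
    return inv_w                                     # estimates |Omega_P|
```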

Proof Outline
- $k \le 2n \log n$ (the maximum number of comparisons made by a sorting algorithm).
- Finding an incomparable pair: $O(n \log n)$.

Proof Outline
Estimate the bias in each $w_k$:
$\bigl|\mathbb{E}[W_k] - \frac{|\Omega_{P_{k+1}}|}{|\Omega_{P_k}|}\bigr| \le d(t(n, \epsilon))$
Unlike the coloring case, we cannot bound $\frac{|\Omega_{P_{k+1}}|}{|\Omega_{P_k}|}$ from below deterministically, only with high probability (the algorithm picks the majority direction), which makes the calculation more difficult.

Questions?