# Model Counting of Query Expressions: Limitations of Propositional Methods


Paul Beame¹, Jerry Li², Sudeepa Roy¹, Dan Suciu¹

¹ University of Washington, ² MIT

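Probabilistic Databases

Example database D, with one Boolean variable per tuple:
▫ AsthmaPatient: Ann ($x_1$), Bob ($x_2$)
▫ Friend: (Ann, Joe) ($y_1$), (Ann, Tom) ($y_2$), (Bob, Tom) ($y_3$)
▫ Smoker: Joe ($z_1$), Tom ($z_2$)

Boolean query Q: $\exists x\, \exists y\; \mathrm{AsthmaPatient}(x) \land \mathrm{Friend}(x, y) \land \mathrm{Smoker}(y)$

Tuples are probabilistic (and independent)
▫ "Ann" is present with probability 0.3, i.e. $\Pr(x_1) = 0.3$
▫ Probabilities shown on the slide, in order of appearance: labels $x_1, x_2, z_1, z_2, y_1, y_2, y_3$; values 0.3, 0.1, 0.5, 1.0, 0.9, 0.5, 0.7

Lineage: $F_{Q,D} = (x_1 \land y_1 \land z_1) \lor (x_1 \land y_2 \land z_2) \lor (x_2 \land y_3 \land z_2)$
▫ Q is true on D $\Leftrightarrow$ $F_{Q,D}$ is true

What is the probability that Q is true on D?

Two main evaluation techniques: lifted vs. grounded inference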
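The probability of the lineage can be computed directly by summing the weights of its satisfying assignments. The sketch below does this by brute force for the example. The pairing of probability values to variables is an assumption read off the order of the labels on the slide (only $\Pr(x_1) = 0.3$ is stated explicitly), and the names `lineage` and `probability` are illustrative.

```python
from itertools import product

# Tuple probabilities. ASSUMPTION: values are paired with variables in the
# order the labels appear on the slide; only Pr(x1) = 0.3 is stated outright.
prob = {"x1": 0.3, "x2": 0.1, "z1": 0.5, "z2": 1.0,
        "y1": 0.9, "y2": 0.5, "y3": 0.7}

def lineage(a):
    """F_{Q,D} = (x1 & y1 & z1) | (x1 & y2 & z2) | (x2 & y3 & z2)."""
    return ((a["x1"] and a["y1"] and a["z1"])
            or (a["x1"] and a["y2"] and a["z2"])
            or (a["x2"] and a["y3"] and a["z2"]))

def probability(formula, prob):
    """Sum the weights of all satisfying assignments (exponential time)."""
    names = list(prob)
    total = 0.0
    for values in product([False, True], repeat=len(names)):
        a = dict(zip(names, values))
        if formula(a):
            weight = 1.0
            for name in names:
                weight *= prob[name] if a[name] else 1.0 - prob[name]
            total += weight
    return total

pr_q = probability(lineage, prob)
print(pr_q)
```

This enumeration is grounded inference in its most naive form: it touches all $2^7$ assignments and ignores the query structure entirely.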

Lifted Inference

Q: $\exists x\, \exists y\; \mathrm{AsthmaPatient}(x) \land \mathrm{Friend}(x, y) \land \mathrm{Smoker}(y)$

Work with the explicit query structure, i.e. the first-order logic.

Dichotomy Theorem [Dalvi, Suciu 12]: for any UCQ, evaluating it is either
▫ #P-hard, or
▫ polynomial-time computable using lifted inference,
▫ and there is a simple condition to tell which case holds.

Grounded Inference

$F_{Q,D} = (x_1 \land y_1 \land z_1) \lor (x_1 \land y_2 \land z_2) \lor (x_2 \land y_3 \land z_2)$

Work with the Boolean formula.

Folklore sentiment: lifted inference is strictly stronger than grounded inference.

We give the first clear proof of this.

Outline

Background: Model Counting, DPLL algorithms
▫ Extensions (Caching & Component Analysis)
▫ Knowledge Compilation (FBDDs & Decision-DNNF)
Our Contributions
▫ Statement of separation
▫ Sketch of FBDD lower bound
Conclusions

Model Counting

Probability Computation Problem: given F and independent Pr(x), Pr(y), Pr(z), …, compute Pr(F).

Model Counting Problem: given a Boolean formula F, compute #F = the number of models (satisfying assignments) of F.

e.g. $F = (x \lor y) \land (x \lor u \lor w) \land (\lnot x \lor u \lor w \lor z)$; #F is the number of assignments to x, y, u, z, w which make F true.
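For a formula this small, #F can be checked by enumerating all assignments. The following is a minimal sketch, assuming the example formula is $F = (x \lor y) \land (x \lor u \lor w) \land (\lnot x \lor u \lor w \lor z)$ as used on the DPLL slides; the helper name `count_models` is illustrative.

```python
from itertools import product

VARIABLES = ["x", "y", "u", "z", "w"]

def f(x, y, u, z, w):
    # F = (x | y) & (x | u | w) & (~x | u | w | z)  (assumed reading of the slide)
    return (x or y) and (x or u or w) and ((not x) or u or w or z)

def count_models(formula, num_vars):
    """Count satisfying assignments by trying all 2^num_vars of them."""
    return sum(1 for bits in product([False, True], repeat=num_vars)
               if formula(*bits))

num_models = count_models(f, len(VARIABLES))
# Under the uniform distribution, Pr(F) = #F / 2^n.
uniform_probability = num_models / 2 ** len(VARIABLES)
print(num_models, uniform_probability)
```

The last line makes the link between the two problems explicit: with every variable at probability ½, the probability computation problem reduces to model counting.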

Known Model Counting Algorithms

Search-based/DPLL-based (explore the assignment space and count the satisfying assignments):
▫ CDP [Birnbaum et al. ’99]
▫ Relsat [Bayardo Jr. et al. ’97, ’00]
▫ Cachet [Sang et al. ’05]
▫ SharpSAT [Thurley ’06]

Knowledge compilation-based (compile F into a "computation-friendly" form):
▫ c2d [Darwiche ’04]
▫ Dsharp [Muise et al. ’12]
▫ …

[Survey by Gomes et al. ’09]

Both techniques explicitly or implicitly
▫ use DPLL-based algorithms
▫ produce FBDD or Decision-DNNF compiled forms (output or trace) [Huang-Darwiche ’05, ’07]

DPLL Algorithms

Davis, Putnam, Logemann, Loveland [Davis et al. ’60, ’62]

F: $(x \lor y) \land (x \lor u \lor w) \land (\lnot x \lor u \lor w \lor z)$

Assume uniform distribution for simplicity.

    // basic DPLL:
    Function Pr(F):
      if F = false then return 0
      if F = true then return 1
      select a variable x, return ½ Pr(F_{X=0}) + ½ Pr(F_{X=1})

[Figure: decision tree for F, branching on x, then y or z, then u and w. Residual subformulas and their probabilities: $y \land (u \lor w)$ with 3/8, $u \lor w \lor z$ with 7/8, $u \lor w$ with ¾, $w$ with ½; the root F has probability 5/8.]
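The basic DPLL recursion can be sketched in a few lines. This is an illustrative implementation, not the code of any of the tools listed earlier: it assumes the formula is given in CNF as a set of clauses over integer literals, and it always branches on the lowest-numbered remaining variable (the numbering is chosen so that the branching order resembles the one in the figure).

```python
from fractions import Fraction

# Variables are numbered so that branching on the smallest index follows
# the order x, then y or z, then u, then w. A negative integer is a negation.
X, Y, Z, U, W = 1, 2, 3, 4, 5
F = frozenset({frozenset({X, Y}),
               frozenset({X, U, W}),
               frozenset({-X, U, W, Z})})

def condition(cnf, var, value):
    """Return cnf with var fixed to value, or None if a clause becomes false."""
    true_lit = var if value else -var
    result = set()
    for clause in cnf:
        if true_lit in clause:
            continue                      # clause is satisfied, drop it
        reduced = clause - {-true_lit}    # remove the falsified literal
        if not reduced:
            return None                   # empty clause: formula is false
        result.add(reduced)
    return frozenset(result)

def pr(cnf):
    """Basic DPLL under the uniform distribution."""
    if cnf is None:                       # F = false
        return Fraction(0)
    if not cnf:                           # F = true (no clauses left)
        return Fraction(1)
    var = min(abs(lit) for clause in cnf for lit in clause)
    return (Fraction(1, 2) * pr(condition(cnf, var, False))
            + Fraction(1, 2) * pr(condition(cnf, var, True)))

print(pr(F))  # probability of F under the uniform distribution
```

Using `Fraction` keeps the intermediate values exact, so they can be compared with the fractions annotated on the decision tree.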

DPLL Algorithms (continued)

F: $(x \lor y) \land (x \lor u \lor w) \land (\lnot x \lor u \lor w \lor z)$

The trace is a decision tree for F.

Extensions to DPLL

Caching subformulas
Component analysis
Conflict-directed clause learning
▫ Affects the efficiency of the algorithm, but not the final "form" of the trace

Extensions to DPLL: Caching

F: $(x \lor y) \land (x \lor u \lor w) \land (\lnot x \lor u \lor w \lor z)$

The basic DPLL function Pr(F) is extended as follows:

    // DPLL with caching:
    Cache F and Pr(F); look it up before computing

[Figure: the DPLL decision tree for F, in which the subformula $u \lor w$ occurs in more than one branch.]
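Caching can be sketched by memoizing on the residual formula. The code below is an illustration under the same assumptions as a plain DPLL counter (CNF over integer literals, smallest-index branching); the `hits` list is a hypothetical instrumentation device that records which subformulas were answered from the cache.

```python
from fractions import Fraction

X, Y, Z, U, W = 1, 2, 3, 4, 5
F = frozenset({frozenset({X, Y}),
               frozenset({X, U, W}),
               frozenset({-X, U, W, Z})})

def condition(cnf, var, value):
    """Return cnf with var fixed to value, or None if a clause becomes false."""
    true_lit = var if value else -var
    result = set()
    for clause in cnf:
        if true_lit in clause:
            continue
        reduced = clause - {-true_lit}
        if not reduced:
            return None
        result.add(reduced)
    return frozenset(result)

def pr_cached(cnf, cache, hits):
    """DPLL with caching: look the subformula up before computing it."""
    if cnf is None:
        return Fraction(0)
    if not cnf:
        return Fraction(1)
    if cnf in cache:
        hits.append(cnf)                  # record the cache hit
        return cache[cnf]
    var = min(abs(lit) for clause in cnf for lit in clause)
    value = (pr_cached(condition(cnf, var, False), cache, hits)
             + pr_cached(condition(cnf, var, True), cache, hits)) / 2
    cache[cnf] = value
    return value

cache, hits = {}, []
result = pr_cached(F, cache, hits)
print(result, len(cache), hits)
```

On this formula the branches x = 0, y = 1 and x = 1, z = 0 both leave the residual formula u ∨ w, so the second visit is a cache hit; merging those two nodes is what turns the decision tree into a DAG.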

Caching & FBDDs

F: $(x \lor y) \land (x \lor u \lor w) \land (\lnot x \lor u \lor w \lor z)$

The trace is a decision-DAG for F.
▫ FBDD (Free Binary Decision Diagram) or ROBP (Read-Once Branching Program)
▫ Every variable is tested at most once on any path

Extensions to DPLL: Component Analysis

F: $(x \lor y) \land (x \lor u \lor w) \land (\lnot x \lor u \lor w \lor z)$

The basic DPLL function Pr(F) is extended as follows:

    // DPLL with component analysis (and caching):
    if F = G ∧ H where G and H have disjoint sets of variables
      Pr(F) = Pr(G) × Pr(H)

[Figure: at x = 0 the residual formula $y \land (u \lor w)$ splits into the components $y$ and $u \lor w$.]
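Component analysis can be sketched by grouping clauses that share variables and multiplying the probabilities of the groups. As before, this is an illustrative implementation over CNF with integer literals; the function name `components` and the grouping strategy are assumptions, not taken from any particular model counter.

```python
from fractions import Fraction

X, Y, Z, U, W = 1, 2, 3, 4, 5
F = frozenset({frozenset({X, Y}),
               frozenset({X, U, W}),
               frozenset({-X, U, W, Z})})

def condition(cnf, var, value):
    """Return cnf with var fixed to value, or None if a clause becomes false."""
    true_lit = var if value else -var
    result = set()
    for clause in cnf:
        if true_lit in clause:
            continue
        reduced = clause - {-true_lit}
        if not reduced:
            return None
        result.add(reduced)
    return frozenset(result)

def components(cnf):
    """Partition the clauses into groups with pairwise disjoint variables."""
    remaining = set(cnf)
    groups = []
    while remaining:
        group = {remaining.pop()}
        variables = {abs(lit) for clause in group for lit in clause}
        grew = True
        while grew:                       # absorb clauses sharing a variable
            grew = False
            for clause in list(remaining):
                clause_vars = {abs(lit) for lit in clause}
                if variables & clause_vars:
                    remaining.remove(clause)
                    group.add(clause)
                    variables |= clause_vars
                    grew = True
        groups.append(frozenset(group))
    return groups

def pr(cnf):
    """DPLL with component analysis under the uniform distribution."""
    if cnf is None:
        return Fraction(0)
    if not cnf:
        return Fraction(1)
    parts = components(cnf)
    if len(parts) > 1:                    # F = G AND H on disjoint variables
        result = Fraction(1)
        for part in parts:
            result *= pr(part)
        return result
    var = min(abs(lit) for clause in cnf for lit in clause)
    return (pr(condition(cnf, var, False)) + pr(condition(cnf, var, True))) / 2

print(pr(F))
```

Each multiplication step corresponds to a decomposable AND node in the trace, which is why the resulting compiled form is a decision-DNNF rather than an FBDD.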

Components & Decision-DNNF

F: $(x \lor y) \land (x \lor u \lor w) \land (\lnot x \lor u \lor w \lor z)$

The trace is a Decision-DNNF [Huang-Darwiche ’05, ’07]:
▫ FBDD + "decomposable" AND-nodes (the two sub-DAGs do not share variables)

[Figure: the decision-DAG for F with an AND node at $y \land (u \lor w)$ whose children are the sub-DAGs for $y$ and $u \lor w$.]

How much power does component analysis add?

Theorem [BLRS]: decision-DNNF for F of size N $\Rightarrow$ FBDD for F of size $N^{\log N + 1}$ [UAI ’13]

The conversion works even when we allow negation and arbitrary decomposable binary gates. [ICDT ’14]

Corollary: exponential lower bound for FBDD(F) $\Rightarrow$ exponential lower bound for decision-DNNF(F)
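The step from the theorem to the corollary can be spelled out as a short calculation. This is a sketch: logarithms are base 2, and the constant $c$ and the form $2^{cn}$ of the assumed FBDD lower bound are chosen for illustration.

```latex
\begin{align*}
\mathrm{FBDD}(F) &\le N^{\log N + 1} = 2^{\log^2 N + \log N}
  && \text{(conversion from a decision-DNNF of size } N\text{)} \\
\mathrm{FBDD}(F) &\ge 2^{cn}
  && \text{(assumed exponential FBDD lower bound)} \\
\Rightarrow\quad \log^2 N + \log N &\ge cn \\
\Rightarrow\quad N &\ge 2^{\Omega(\sqrt{n})}
\end{align*}
```

Because the conversion is quasi-polynomial rather than polynomial, the exponent drops from $n$ to $\sqrt{n}$, but the bound on decision-DNNF size remains exponential.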

Implications for Lower Bounds?

All real-world exact model counters compile into FBDDs or decision-DNNFs.

By the conversion, an exponential size lower bound for FBDDs implies an exponential lower bound for decision-DNNFs.

Thus it suffices to consider FBDDs.


An important class of queries

$H_1 = R(x)S(x,y) \lor S(x,y)T(y)$

$H_k = \underbrace{R(x)S_1(x,y)}_{h_{k0}} \lor \dots \lor \underbrace{S_i(x,y)S_{i+1}(x,y)}_{h_{ki}} \lor \dots \lor \underbrace{S_k(x,y)T(y)}_{h_{kk}}$

▫ [Dalvi, Suciu 12]: $H_k$ is #P-hard to evaluate
▫ Known to "capture" hardness for probabilistic DB queries
▫ But some functions of the $h_{ki}$ are poly-time computable using lifted inference, e.g. $(h_{30} \lor h_{32}) \land (h_{30} \lor h_{33}) \land (h_{31} \lor h_{33})$
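To make the definitions concrete, the sketch below evaluates the components $h_{ki}$, the query $H_k$, and the example combination on an ordinary (deterministic) database instance. It assumes the implicit quantifiers are existential over x and y; the dictionary layout of `db` and the function names are hypothetical.

```python
def h(k, i, db):
    """Evaluate h_{k,i} on db = {'R': set, 'T': set, 'S': {index: set of pairs}}."""
    if i == 0:                            # h_{k0} = R(x) S_1(x, y)
        return any(x in db["R"] for (x, y) in db["S"][1])
    if i == k:                            # h_{kk} = S_k(x, y) T(y)
        return any(y in db["T"] for (x, y) in db["S"][k])
    # h_{ki} = S_i(x, y) S_{i+1}(x, y): some pair occurs in both relations
    return bool(db["S"][i] & db["S"][i + 1])

def H(k, db):
    """H_k is the disjunction of h_{k0}, ..., h_{kk}."""
    return any(h(k, i, db) for i in range(k + 1))

def example_combination(db):
    """(h30 | h32) & (h30 | h33) & (h31 | h33), the example on the slide."""
    return ((h(3, 0, db) or h(3, 2, db))
            and (h(3, 0, db) or h(3, 3, db))
            and (h(3, 1, db) or h(3, 3, db)))

db = {"R": {1}, "T": {4},
      "S": {1: {(1, 2)}, 2: set(), 3: {(3, 4)}}}
empty = {"R": set(), "T": set(), "S": {1: set(), 2: set(), 3: set()}}
print(H(3, db), example_combination(db), H(3, empty))
```

In the probabilistic setting each tuple would carry a Boolean variable and the same definitions would yield a lineage formula instead of a truth value.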

New Lower Bounds

Theorem: For all k, $\mathrm{FBDD}(H_k) = 2^{\Omega(n)}$, which implies $\text{Decision-DNNF}(H_k) = 2^{\Omega(\sqrt{n})}$.

Theorem: Any Boolean function f of $h_{k0}, \dots, h_{kk}$ that depends on all of them requires $\mathrm{FBDD}(f) = 2^{\Omega(n)}$, which implies $\text{Decision-DNNF}(f) = 2^{\Omega(\sqrt{n})}$.

Corollary: Grounded inference requires $2^{\Omega(\sqrt{n})}$ time even on probabilistic DB instances with poly(n) time algorithms using lifted inference.

This implies a separation between grounded and lifted inference.

Proof for $H_1$

$H_1 = R(x)S(x,y) \lor S(x,y)T(y)$

Over the complete database of size n:

$H_1 = \bigvee_{i,j \in [n]} R(i)S(i,j) \;\lor\; \bigvee_{i,j \in [n]} S(i,j)T(j)$

Q: why is $H_1$ hard for FBDDs?
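The grounded formula can be generated mechanically. The sketch below builds the terms of $H_1$ over the complete database and evaluates them on a set of true variables. It assumes that "size n" refers to a domain $\{1, \dots, n\}$ for both x and y; the tuple encoding of variables and the function names are illustrative.

```python
from itertools import product

def h1_terms(n):
    """Terms of H_1 over the complete database with domain {1, ..., n}."""
    terms = []
    for i, j in product(range(1, n + 1), repeat=2):
        terms.append((("R", i), ("S", i, j)))   # R(i) S(i,j)
        terms.append((("S", i, j), ("T", j)))   # S(i,j) T(j)
    return terms

def evaluate(terms, true_vars):
    """A monotone DNF is true iff all variables of some term are true."""
    return any(all(v in true_vars for v in term) for term in terms)

terms = h1_terms(5)
print(len(terms))
print(evaluate(terms, {("R", 1), ("S", 1, 2)}))
print(evaluate(terms, {("R", 1), ("T", 2)}))
```

The formula has $2n^2$ terms over $n^2 + 2n$ variables, so it is small; the difficulty for FBDDs lies in how the row variables R(i) and column variables T(j) interact through the S(i,j).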

Matrix view

$H_1 = \bigvee_{i,j \in [n]} R(i)S(i,j) \;\lor\; \bigvee_{i,j \in [n]} S(i,j)T(j)$

|      | T(1)   | T(2)   | T(3)   | T(4)   | T(5)   |
|------|--------|--------|--------|--------|--------|
| R(1) | S(1,1) | S(1,2) | S(1,3) | S(1,4) | S(1,5) |
| R(2) | S(2,1) | S(2,2) | S(2,3) | S(2,4) | S(2,5) |
| R(3) | S(3,1) | S(3,2) | S(3,3) | S(3,4) | S(3,5) |
| R(4) | S(4,1) | S(4,2) | S(4,3) | S(4,4) | S(4,5) |
| R(5) | S(5,1) | S(5,2) | S(5,3) | S(5,4) | S(5,5) |

[Figure: successive builds of the slide branch on R(1) (edges 0 and 1) and then test S(1,1), …, S(1,5).]

Can't cache!

A "unit rule" for FBDDs

Variable x in a formula Φ is a unit if $\Phi = x \lor G$.

An FBDD follows the unit rule if each node tests a unit variable whenever possible.

Can we assume that FBDDs follow the unit rule?

[Figure: a unit node testing x, with one edge to the 1-leaf and the other edge to G.]
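For a monotone DNF written as a set of terms, units can be detected syntactically: x is a unit exactly when the single-variable term {x} is present. The sketch below applies a partial assignment to $H_1$ and lists the resulting units. The representation (terms as frozensets of variable tuples, a term with no variables meaning "true") and the function names are assumptions made for illustration.

```python
from itertools import product

def h1_dnf(n):
    """H_1 over domain {1..n} as a set of terms (frozensets of variables)."""
    dnf = set()
    for i, j in product(range(1, n + 1), repeat=2):
        dnf.add(frozenset({("R", i), ("S", i, j)}))
        dnf.add(frozenset({("S", i, j), ("T", j)}))
    return dnf

def restrict(dnf, assignment):
    """Apply a partial assignment {variable: bool} to a monotone DNF.

    An empty result means false; a result containing the empty term means true.
    """
    result = set()
    for term in dnf:
        if any(assignment.get(v) is False for v in term):
            continue                      # term is falsified
        result.add(frozenset(v for v in term if v not in assignment))
    return result

def units(dnf):
    """Variables x such that the formula has the form x OR G."""
    return {next(iter(term)) for term in dnf if len(term) == 1}

after_r1 = restrict(h1_dnf(2), {("R", 1): True})
print(units(after_r1))
```

Setting R(1) = 1 turns every term R(1)S(1,j) into the single variable S(1,j), so the whole first row of the matrix becomes units that a unit-rule FBDD must test next.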

A "unit rule" for FBDDs

Lemma: Given an FBDD of size N for a monotone DNF formula Φ, there exists an FBDD for Φ that follows the unit rule and has size at most $|\mathrm{var}(\Phi)| \cdot N$.

Proof: Alter the FBDD to test units whenever possible, then restore the read-once property.

Bound for $H_1$

Idea: specify a set A of "admissible" partial paths so that:
1. None of them cache
2. Each takes $n - 1$ degrees of freedom to specify

Given this set A:
▫ Each partial path in A must end at a unique node (they don't cache)
▫ There are $2^{n-1}$ such paths ($n - 1$ degrees of freedom)

$\Rightarrow$ The FBDD has at least $2^{n-1}$ nodes, one for each path in A.

Admissible Paths

Let A be the set of partial paths P which
1. Don't end at a leaf node
2. Touch $n - 1$ rows and/or columns, but not more
3. Never set R(i) = S(i,j) = T(j) = 0, for any i, j

Bound for $H_1$

Proposition: If P, Q are paths in A which end at the same node v, then they test the same set of R and T variables, and assign them the same values.

Proof: Suppose P sets R(i) and Q does not.
▫ The subformula at v cannot contain any term R(i)S(i,j)
▫ $\Rightarrow$ Q sets every S(i,j) = 0 or every T(j) = 1 (unit rule)
▫ $\Rightarrow$ #Col(Q) = n, a contradiction

Paths don't cache

Intuition: given that R(i), T(j) are set, S(i,j) is determined.

| R(i) | S(i,j) | T(j) |
|------|--------|------|
| 0    | 1      | 0    |
| 1    | 0      | 0    |
| 0    | 0      | 1    |
| 1    | 0      | 1    |

Since two paths that end at the same node v set the same R, T variables, they set the same S variables.

$n - 1$ degrees of freedom

Let P be an admissible path; w.l.o.g. $|\mathrm{Row}(P)| = n - 1$.

At each node where we first visit a row, we could have chosen either edge and still been admissible!
▫ $\Rightarrow$ $n - 1$ degrees of freedom
▫ $\Rightarrow$ $2^{n-1}$ distinct admissible paths
▫ $\Rightarrow$ $\mathrm{FBDD}(H_1) = 2^{\Omega(n)}$

[Figure: example path in the matrix view testing R(2), S(1,4), S(5,4).]
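One way to assemble the pieces into the final bound is sketched below. It assumes that the admissible-path count is applied to the unit-rule FBDD produced by the lemma, and that N denotes the size of an arbitrary FBDD for $H_1$; the exact polynomial factor is only what these slides' statements give when combined.

```latex
\begin{align*}
|\mathrm{var}(H_1)| &= n^2 + 2n
  && \text{($n$ variables $R(i)$, $n$ variables $T(j)$, $n^2$ variables $S(i,j)$)} \\
|\mathrm{var}(H_1)| \cdot N &\ge 2^{n-1}
  && \text{(the unit-rule FBDD has a distinct node per admissible path)} \\
N &\ge \frac{2^{n-1}}{n^2 + 2n} = 2^{\Omega(n)}
\end{align*}
```

The polynomial loss from enforcing the unit rule is absorbed by the $\Omega$ in the exponent.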

Proof for $H_k$

Same basic structure. We only need to change the definition of admissible path.

Let A be the set of partial paths P which
1. Don't end at a leaf node
2. Touch $n - 1$ rows and/or columns, but not more
3. Always set i, j consistent with the following table:

| R(i) | S_1(i,j) | S_2(i,j) | S_3(i,j) | T(j) |
|------|----------|----------|----------|------|
| 0    | 1        | 0        | 1        | 0    |
| 1    | 0        | 1        | 0        | 1    |
| 0    | 0        | 1        | 0        | 1    |
| 1    | 0        | 1        | 0        | 1    |


Boolean combinations of $h_{k0}, \dots, h_{kk}$

f is a Boolean function that depends on all its inputs; $\Psi = f(h_{k0}, h_{k1}, \dots, h_{kk})$.

We give a reduction from any FBDD for Ψ into an FBDD for $H_k$.

Intuitively: to compute Ψ using an FBDD, you must compute $h_{k0}, h_{k1}, \dots, h_{kk}$, so that FBDD can also compute $H_k$.

Summary

FBDDs and decision-DNNFs bound the power of known model counting algorithms.

We show exponential lower bounds on FBDDs and decision-DNNFs, which imply a separation between lifted and grounded inference.

Open Problems

▫ A polynomial conversion of decision-DNNFs to FBDDs?
▫ A general dichotomy theorem for grounded inference?
▫ Approximate model counting?

Thank You

Questions?
