1 Model Counting of Query Expressions: Limitations of Propositional Methods. Paul Beame (1), Jerry Li (2), Sudeepa Roy (1), Dan Suciu (1). (1) University of Washington, (2) MIT

2 Probabilistic Databases
AsthmaPatient = {Ann, Bob}; Friend = {(Ann, Joe), (Ann, Tom), (Bob, Tom)}; Smoker = {Joe, Tom}
Boolean query Q: ∃x ∃y AsthmaPatient(x) ∧ Friend(x, y) ∧ Smoker(y)
Tuples are probabilistic (and independent)
▫ "Ann" is present with probability 0.3, i.e. Pr(x1) = 0.3
Ground each tuple to a Boolean variable: x1, x2 for AsthmaPatient; y1, y2, y3 for Friend; z1, z2 for Smoker. The variables x1, x2, z1, z2, y1, y2, y3 carry probabilities 0.3, 0.1, 0.5, 1.0, 0.9, 0.5, 0.7
Boolean formula F_Q,D = (x1 ∧ y1 ∧ z1) ∨ (x1 ∧ y2 ∧ z2) ∨ (x2 ∧ y3 ∧ z2)
▫ Q is true on D ⟺ F_Q,D is true
What is the probability that Q is true on D?
Two main evaluation techniques: lifted vs. grounded inference
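For concreteness, here is a minimal Python sketch (not from the talk) of the grounded approach on this example: build the lineage F_Q,D over the tuple variables and sum the probabilities of its satisfying assignments. The probability values follow the slide's layout; only Pr(x1) = 0.3 is labeled explicitly there, so the remaining assignments of values to variables are assumptions.

from itertools import product

pr = {"x1": 0.3, "x2": 0.1, "z1": 0.5, "z2": 1.0,
      "y1": 0.9, "y2": 0.5, "y3": 0.7}

def F(a):
    # F_Q,D = (x1 ∧ y1 ∧ z1) ∨ (x1 ∧ y2 ∧ z2) ∨ (x2 ∧ y3 ∧ z2)
    return (a["x1"] and a["y1"] and a["z1"]) or \
           (a["x1"] and a["y2"] and a["z2"]) or \
           (a["x2"] and a["y3"] and a["z2"])

names = sorted(pr)
answer = 0.0
for bits in product([False, True], repeat=len(names)):
    a = dict(zip(names, bits))
    if F(a):
        weight = 1.0
        for v in names:                        # tuple independence
            weight *= pr[v] if a[v] else 1.0 - pr[v]
        answer += weight
print(answer)  # Pr(Q is true on D)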

3 Lifted Inference
Q: ∃x ∃y AsthmaPatient(x) ∧ Friend(x, y) ∧ Smoker(y)
Dichotomy Theorem [Dalvi, Suciu 12]: For any union of conjunctive queries (UCQ), evaluation is either
▫ #P-hard, or
▫ polynomial-time computable using lifted inference,
and there is a simple condition to tell which case holds

4 Grounded Inference
F_Q,D = (x1 ∧ y1 ∧ z1) ∨ (x1 ∧ y2 ∧ z2) ∨ (x2 ∧ y3 ∧ z2)
Equivalent to the problem of model counting: counting the number of satisfying assignments of F_Q,D
Folklore sentiment: lifted inference is strictly stronger than grounded inference
Our examples give a first clear proof of this

5 Outline
DPLL algorithms
▫ Extensions (Caching & Component Analysis)
▫ Knowledge Compilation (FBDDs & Decision-DNNFs)
Our Contributions
▫ DLDD to FBDD conversion
▫ FBDD lower bounds
Sketch of FBDD lower bound
Conclusions

6 Model Counting
Probability Computation Problem: Given F and independent Pr(x), Pr(y), Pr(z), …, compute Pr(F)
Model Counting Problem: Given a Boolean formula F, compute #F = the number of models (satisfying assignments) of F
e.g. F = (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ w ∨ z)
#F = the number of assignments to x, y, u, w, z that make F true
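A brute-force sketch of both problems on this example formula (our own encoding; for Pr(F) every variable is given probability ½):

from itertools import product

def F(x, y, u, w, z):
    return (x or y) and (x or u or w) and ((not x) or u or w or z)

models = sum(F(*bits) for bits in product([False, True], repeat=5))
print(models)           # #F = 20 satisfying assignments out of 2^5
print(models / 2 ** 5)  # Pr(F) = 5/8 under the uniform distribution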

7 Exact Model Counters [Survey by Gomes et al. ’09]
Search-based / DPLL-based (explore the assignment space and count the satisfying ones):
▫ CDP [Birnbaum et al. ’99], Relsat [Bayardo Jr. et al. ’97, ’00], Cachet [Sang et al. ’05], SharpSAT [Thurley ’06]
Knowledge compilation-based (compile F into a “computation-friendly” form):
▫ c2d [Darwiche ’04], Dsharp [Muise et al. ’12]
Both techniques explicitly or implicitly
▫ use DPLL-based algorithms
▫ produce FBDD or Decision-DNNF compiled forms (as output or as the trace) [Huang-Darwiche ’05, ’07]

8 DPLL Algorithms [Davis, Putnam, Logemann, Loveland: Davis et al. ’60, ’62]
F: (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ w ∨ z)
// basic DPLL:
Function Pr(F):
 if F = false then return 0
 if F = true then return 1
 select a variable x,
 return ½ Pr(F_{x=0}) + ½ Pr(F_{x=1})
Assume a uniform distribution for simplicity
[Figure: the search tree for F. The x = 1 branch leaves u ∨ w ∨ z (probability 7/8); the x = 0 branch leaves y ∧ (u ∨ w) (probability 3/8); the residual formulas u ∨ w (¾) and w (½) appear below them, giving Pr(F) = ½ · 7/8 + ½ · 3/8 = 5/8 at the root]
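The pseudocode above can be made runnable with a small CNF encoding (ours, not the talk's): a clause is a frozenset of integer literals, a positive integer is a variable, and its negation is the negative integer.

# basic DPLL for Pr(F), assuming Pr = 1/2 for every variable
F = [frozenset({+1, +2}),          # (x ∨ y)           x=1, y=2
     frozenset({+1, +3, +4}),      # (x ∨ u ∨ w)       u=3, w=4
     frozenset({-1, +3, +4, +5})]  # (¬x ∨ u ∨ w ∨ z)  z=5

def condition(clauses, lit):
    """Residual clause set after making literal `lit` true (None = false)."""
    out = []
    for c in clauses:
        if lit in c:          # clause satisfied: drop it
            continue
        c = c - {-lit}        # the falsified literal disappears
        if not c:
            return None       # empty clause: the formula is false
        out.append(c)
    return out                # empty list = true

def pr(clauses):
    if clauses is None:
        return 0.0
    if not clauses:
        return 1.0
    x = abs(next(iter(clauses[0])))   # select a variable
    return 0.5 * pr(condition(clauses, -x)) + 0.5 * pr(condition(clauses, x))

print(pr(F))  # 0.625 = 5/8, matching the trace above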

9 DPLL Algorithms
F: (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ w ∨ z)
[Figure: the same search tree as on the previous slide]
The trace of the algorithm is a Decision-Tree for F

10 Extensions to DPLL
Caching Subformulas
Component Analysis
Conflict-Directed Clause Learning
▫ affects the efficiency of the algorithm, but not the final “form” of the trace
Traces:
▫ DPLL + caching (+ clause learning) → FBDD
▫ DPLL + caching + components (+ clause learning) → Decision-DNNF

11 Caching
F: (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ w ∨ z)
// DPLL with caching: cache F and Pr(F); look it up before computing
[Figure: the search tree from before; the subformula u ∨ w is reached on both the x = 1, z = 0 branch and the x = 0, y = 1 branch, so with caching it is computed only once]
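A sketch of the caching extension on top of the DPLL code above; as a simplification we key the cache on the residual clause set itself, so only syntactically identical subformulas (such as the two occurrences of u ∨ w in the figure) are shared.

cache = {}

def pr_cached(clauses):
    if clauses is None:
        return 0.0
    if not clauses:
        return 1.0
    key = frozenset(clauses)          # look it up before computing
    if key not in cache:
        x = abs(next(iter(clauses[0])))
        cache[key] = 0.5 * pr_cached(condition(clauses, -x)) \
                   + 0.5 * pr_cached(condition(clauses, x))
    return cache[key]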

12 Caching & FBDDs
F: (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ w ∨ z)
[Figure: the trace with caching, now a DAG; the two copies of the u ∨ w subtree are merged]
The trace is a decision-DAG for F: an FBDD (Free Binary Decision Diagram), also known as a ROBP (Read-Once Branching Program)
▫ every variable is tested at most once on any path
▫ all internal nodes are decision nodes

13 Component Analysis
F: (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ w ∨ z)
// DPLL with component analysis (and caching):
if F = G ∧ H where G and H have disjoint sets of variables, then Pr(F) = Pr(G) × Pr(H)
[Figure: on the x = 0 branch the residual formula y ∧ (u ∨ w) splits into the variable-disjoint components y and u ∨ w]
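A sketch of component analysis in the same clause encoding (reusing `condition` from the DPLL sketch): group the residual clauses into variable-disjoint components and multiply their probabilities, e.g. y ∧ (u ∨ w) splits into y and u ∨ w.

def components(clauses):
    """Partition the clauses into groups with pairwise disjoint variables."""
    comps = []                                 # list of (variable set, clauses)
    for c in clauses:
        vs, group, rest = {abs(l) for l in c}, [c], []
        for cvs, ccls in comps:
            if vs & cvs:                       # shares a variable: merge
                vs |= cvs
                group += ccls
            else:
                rest.append((cvs, ccls))
        comps = rest + [(vs, group)]
    return [cls for _, cls in comps]

def pr_comp(clauses):
    if clauses is None:
        return 0.0
    if not clauses:
        return 1.0
    comps = components(clauses)
    if len(comps) > 1:                         # Pr(G ∧ H) = Pr(G) × Pr(H)
        p = 1.0
        for comp in comps:
            p *= pr_comp(comp)
        return p
    x = abs(next(iter(clauses[0])))
    return 0.5 * pr_comp(condition(clauses, -x)) + 0.5 * pr_comp(condition(clauses, x))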

14 Components & Decision-DNNF
F: (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ w ∨ z)
[Figure: the trace where the x = 0 branch ends in a decomposable AND node whose children are the components y and u ∨ w]
The trace is a Decision-DNNF [Huang-Darwiche ’05, ’07]:
FBDD + “decomposable” AND nodes (the two sub-DAGs do not share variables)

15 Decomposable Logic Decision Diagrams (DLDDs)
Generalization of Decision-DNNFs:
▫ not just decomposable AND nodes
▫ also NOT nodes and decomposable binary OR, XOR, etc. (the sub-DAGs of each node are labelled by disjoint sets of variables)

16 Outline
DPLL algorithms
▫ Extensions (Caching & Component Analysis)
▫ Knowledge Compilation (FBDDs & Decision-DNNFs)
Our Contributions
▫ DLDD to FBDD conversion
▫ FBDD lower bounds
Sketch of FBDD lower bound
Conclusions

17 How much power does component analysis add?
Theorem [UAI 2013]: A decision-DNNF for F of size N yields an FBDD for F of size N^(log N + 1); if F is a k-DNF or k-CNF, the FBDD has size N^k
▫ The conversion algorithm runs in time linear in the size of its output
Theorem [ICDT 2014]: The conversion works even for DLDDs
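To see why the N^(log N + 1) bound is quasi-polynomial rather than exponential, note (a one-line calculation, not on the slide) that N^{\log N + 1} = 2^{(\log N)(\log N + 1)} = 2^{O(\log^2 N)}: the blow-up is 2^{polylog(N)}, which grows more slowly than 2^{N^ε} for every ε > 0.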

18 An important class of queries
H_1 = ∃x ∃y R(x)S(x,y) ∨ ∃x ∃y S(x,y)T(y)
H_k = ∃x ∃y [ R(x)S_1(x,y) ∨ … ∨ S_i(x,y)S_{i+1}(x,y) ∨ … ∨ S_k(x,y)T(y) ]
Write h_k0, …, h_ki, …, h_kk for the individual disjuncts
▫ [Dalvi, Suciu 12]: H_k is #P-hard to evaluate
▫ However, some Boolean combinations of the h_ki are poly-time computable using lifted inference, e.g. (h_30 ∨ h_32)(h_30 ∨ h_33)(h_31 ∨ h_33)

19 New Lower Bounds
Theorem: Any Boolean function f of h_k0, …, h_kk that depends on all of them requires FBDDs of size 2^Ω(n), which implies DLDDs of size 2^Ω(√n), and of size 2^Ω(n/k) if f is monotone
Corollary: Grounded inference requires exponential (2^Ω(√n)) time even on probabilistic DB instances with poly(n)-time algorithms using lifted inference
Implies a separation between grounded and lifted inference

20 Outline
DPLL algorithms
▫ Extensions (Caching & Component Analysis)
▫ Knowledge Compilation (FBDDs & Decision-DNNFs)
Our Contributions
▫ DLDD to FBDD conversion
▫ FBDD lower bounds
Sketch of FBDD lower bound
Conclusions

21 Outline of Proof
FBDD → FBDD with unit rule
Prove hardness of H_k for FBDDs with the unit rule
Then reduce FBDDs for functions over h_k0, …, h_kk to FBDDs for H_k

22 A “unit rule” for FBDDs
Definition: A variable x in a Boolean formula Φ is a unit for Φ if Φ = x ∨ G, for some G
Definition: An FBDD for a formula F follows the unit rule if each node tests a unit variable whenever a unit variable exists in the corresponding subformula
[Figure: a unit node for x, with 1-edge to the sink 1 and 0-edge to G]
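Since the formulas here are monotone DNFs, a unit is simply a variable that occurs as a singleton term. A tiny sketch in our own term-set encoding (a DNF as a set of terms, each term a frozenset of variable names):

def units(dnf):
    """Units of a monotone DNF: variables occurring as singleton terms."""
    return {next(iter(term)) for term in dnf if len(term) == 1}

phi = {frozenset({"x"}), frozenset({"y", "z"})}   # Φ = x ∨ yz
print(units(phi))  # {'x'}: an FBDD following the unit rule must test x here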

23 A “unit rule” for FBDDs
For any variable X in a DNF formula Φ, let deg(X) be the number of variables that co-occur with X in some term, and let ∆(Φ) = max_X deg(X)
Note: ∆(H_k) = n
Lemma: Given an FBDD for a monotone DNF formula Φ of size N, there exists an FBDD for Φ that follows the unit rule, of size at most ∆(Φ) · N

24 Proof of Lemma: A Local Transform
[Figure: on the left, a decision node for y with subformula y(x_1 ∨ x_2 ∨ … ∨ x_n) ∨ H, whose 1-branch carries x_1 ∨ x_2 ∨ … ∨ x_n ∨ H[y = 1]. On the right, the transformed FBDD: the 1-branch of y is replaced by a chain of unit nodes x_1, …, x_n, each with 1-edge to the sink 1, ending in H[y = 1, x_1 = x_2 = … = x_n = 0]]
Note: the size increases by a factor of at most ∆(Φ)

25 Proof of Lemma: A Local Transform
But this might cause us to test a variable twice along a path, violating the read-once property!
In that case, simply remove the second test
▫ point all edges that point to it to its 0-child
▫ this does not increase the size!
The resulting structure will then be read-once

26 Proof of Lemma
Apply these transformations globally. Then it suffices to show the following:
Claim: Let v be any node in the original FBDD, with corresponding subformula Φ_v. In the new FBDD, its corresponding subformula is Φ_v[X = 0], where X is the set of units of Φ_v
Proof: Every unit of Φ_v became a unit somewhere along each path to v, where we set it to 0

27 Back to H_k
H_1 = ∃x ∃y R(x)S(x,y) ∨ ∃x ∃y S(x,y)T(y)
Over the complete database of size n:
H_1 = ⋁_{i,j ≤ n} R(i)S(i,j) ∨ ⋁_{i,j ≤ n} S(i,j)T(j)
Key idea: if a subformula of H_1 is unit-free, then all conjuncts clearly come from either h_10 or h_11

28 Bound for H_1
Let F be an FBDD for H_1 that follows the unit rule
For any partial path P in F starting at the root, let
▫ Row(P) = {i : P tests R(i) or S(i,j) at a decision node, for some j}
▫ Col(P) = {j : P tests T(j) or S(i,j) at a decision node, for some i}
Let P be the set of partial paths P such that the subformula reached is not 0 or 1, |Row(P)| < n and |Col(P)| < n, but no extension of P has both |Row(P)| < n and |Col(P)| < n

29 Bound for H_1
Proposition: If P, Q are paths in P that end at the same node v, then they must test the same set of R and T variables and assign them the same values
Proof: Suppose P sets R(i) and Q does not (the other cases are similar)
▫ the subformula at v cannot contain any term R(i)S(i,j)
▫ ⇒ Q sets every S(i,j) = 0 or every T(j) = 1 at some decision node
▫ ⇒ |Col(Q)| = n, contradiction

30 Admissible Paths
Definition: A path P in P is admissible if there do not exist i, j such that P is inconsistent with the following table:
R(i)  S(i,j)  T(j)
0     1       0
1     0       0
0     0       1
1     0       1
Let A denote the set of admissible paths
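A sketch of the admissibility test, in our own encoding (`assign` maps keys such as ("R", i), ("S", i, j), ("T", j) to 0/1; unassigned variables are simply absent): we treat a partially assigned triple as consistent if some row of the table extends it.

ALLOWED = {(0, 1, 0), (1, 0, 0), (0, 0, 1), (1, 0, 1)}

def consistent(triple):
    """True if some allowed row agrees with the assigned positions."""
    return any(all(t is None or t == r for t, r in zip(triple, row))
               for row in ALLOWED)

def admissible(assign, n):
    return all(consistent((assign.get(("R", i)),
                           assign.get(("S", i, j)),
                           assign.get(("T", j))))
               for i in range(1, n + 1) for j in range(1, n + 1))

print(admissible({("R", 1): 0, ("S", 1, 2): 0, ("T", 2): 0}, 2))  # False: (0,0,0) is not in the table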

31 Admissible Paths
Theorem: Two distinct admissible paths P, Q end at different vertices
Proof: By contradiction. Assume P and Q end at the same vertex (i.e. their subformulas are the same)
Let v be the first (decision) node where P and Q differ, with variable x
▫ w.l.o.g. assume P sets x = 0, Q sets x = 1
If x is an R or T variable, we are done by the Proposition, so assume x = S(i,j)
Q sets x to 1 ⇒ Q sets R(i) = T(j) = 0 (unit rule)
By the Proposition, P also sets R(i) = T(j) = 0, so P realizes the pattern R(i) = S(i,j) = T(j) = 0, which is not in the table: contradiction

32 Proof of Lower Bound
Thus, it suffices to bound the number of admissible paths
Let P be an admissible path with |Row(P)| = n − 1
▫ for each i ∈ Row(P), consider the first of the variables R(i), S(i,1), S(i,2), …, S(i,n) that we encounter along P
▫ we could have set it to either 0 or 1 and still maintained admissibility up to that decision
There are always at least n − 1 such “unforced” decisions
Any different choice for these decisions leads to a different admissible path
⇒ # of admissible paths ≥ 2^(n−1)

33 Proof for H_k
Same basic structure. We only need to change the definition of admissible paths (shown here for k = 3):
R(i)  S1(i,j)  S2(i,j)  S3(i,j)  T(j)
0     1        0        1        0
1     0        1        0        1
0     0        1        0        1
1     0        1        0        1

34 Boolean combinations of h_k0, …, h_kk
f is a Boolean function that depends on all its inputs; Ψ = f(h_k0, h_k1, …, h_kk)
We want to reduce any FBDD F for Ψ to an FBDD for H_k
Intuitively: to compute Ψ using an FBDD, you must compute h_30, h_31, h_32, h_33 (here k = 3), so that FBDD can also compute H_k

35 Transparent Subformulas
Definition: A formula Φ that is a restriction of Ψ is called transparent if for any two partial assignments θ_1, θ_2 with Φ = Ψ[θ_1] = Ψ[θ_2], we have h_ki[θ_1] = h_ki[θ_2] for all i
From a transparent Φ, we can read off the values of the h_ki
When will a subformula be transparent?

36 When is a subformula easy?
Definition: Let θ be a partial assignment to the variables of the h_k0, …, h_kk. A transversal in θ is a pair of indices (i, j) such that R(i)S_1(i,j) is a prime implicant of h_k0[θ], S_l(i,j)S_{l+1}(i,j) is a prime implicant of h_kl[θ] for each l, and S_k(i,j)T(j) is a prime implicant of h_kk[θ]
We say a formula Φ is transversal-free if there exists θ such that Φ = Ψ[θ] and θ has no transversals

37 Transversal-free Subformulas are easy
[Figure: an n × n grid of the S(i,j) variables, with R(i) labelling the rows and T(j) labelling the columns]

38 Transversal-free Subformulas are easy
[Figure: the same grid as on the previous slide, animation step]

39 Transversal-free Subformulas are easy
[Figure: an FBDD that tests R(1), R(2), …, R(n) and then T(1), T(2), …, T(n)]

40 Subformulas with few transversals are easy
Two transversals (i, j) and (i′, j′) are independent if i ≠ i′ and j ≠ j′
If a subformula has few independent transversals, then we can test the variables shared by the transversals and make the formula transversal-free
▫ i.e. if all transversals went through R(i), then first test R(i); the resulting formula is transversal-free

41 Subformulas with few transversals are easy
[Figure: suppose all of Φ’s transversals involve the variables R(1) and T(2); the FBDD for Φ first tests R(1) and then T(2), reaching transversal-free subformulas F and G]

42 A Pseudo-Unit Rule
Transversal-free ⇒ easy for FBDDs ⇒ useless for proving lower bounds
So we need control over when a subformula becomes transversal-free
This is just like the units for H_1!
Definition: A variable X in a subformula Φ is an H_k-unit if Φ is not transversal-free but Φ[X = 1] is

43 Transparent Subformula Lemma
Theorem: If a subformula is H_k-unit-free and has at least 4 independent transversals, then it is transparent
Proof: See paper

44 Putting it all together
1. Do the unit-rule conversion with H_k-units
2. If a node has < 4 independent transversals, transform it as above
3. Now the FBDD is transparent except at nodes at which we control ingress, so we can deduce the values of h_k0, …, h_kk at every node

45 Summary
Quasi-polynomial conversion of any decision-DNNF into an FBDD (polynomial for k-DNFs)
Exponential lower bounds on model counting algorithms
d-DNNFs and AND-FBDDs are exponentially more powerful than decision-DNNFs
Applications in probabilistic databases

46 Open Problems
A polynomial conversion of decision-DNNFs to FBDDs?
A more powerful syntactic subclass of d-DNNFs than decision-DNNFs?
▫ d-DNNF is a semantic concept
▫ no efficient algorithm to test whether two sub-DAGs of an OR node are simultaneously satisfiable
Approximate model counting?

47 Thank You. Questions?

