Published by Reanna Overfield. Modified over 2 years ago.

1
A Dichotomy on the Complexity of Consistent Query Answering for Atoms with Simple Keys
Paris Koutris, Dan Suciu
University of Washington

2
Repairs
I: an uncertain instance for a schema with key constraints
A repair r of I is a subinstance of I that satisfies the key constraints and is maximal
Example (the first attribute x is the key of R):
R(x, y): (a1, b1), (a1, b2), (a2, b2), (a3, b3), (a3, b4), (a4, b4)
The 4 possible repairs:
– {(a1, b1), (a2, b2), (a3, b4), (a4, b4)}
– {(a1, b1), (a2, b2), (a3, b3), (a4, b4)}
– {(a1, b2), (a2, b2), (a3, b4), (a4, b4)}
– {(a1, b2), (a2, b2), (a3, b3), (a4, b4)}
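The repair definition can be illustrated with a small brute-force sketch. It assumes a single relation whose key is one attribute; the function name `repairs` and the `key_index` parameter are illustrative and do not come from the slides.

```python
from collections import defaultdict
from itertools import product


def repairs(relation, key_index=0):
    """Enumerate all repairs of one relation whose key is a single attribute.

    A repair keeps exactly one tuple from every group of tuples sharing a
    key value, so it satisfies the key constraint and is maximal.
    """
    groups = defaultdict(list)
    for tup in relation:
        groups[tup[key_index]].append(tup)
    # Choose one tuple per key group, in every possible way.
    for choice in product(*(groups[k] for k in sorted(groups))):
        yield frozenset(choice)


# The relation R(x, y) from the slide, with x as the key.
R = [("a1", "b1"), ("a1", "b2"), ("a2", "b2"),
     ("a3", "b3"), ("a3", "b4"), ("a4", "b4")]

all_repairs = list(repairs(R))
print(len(all_repairs))  # 4
```

The number of repairs is the product of the group sizes, so this enumeration is exponential in general and is only meant for tiny examples.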

3
Consistent Query Answering
If Q is boolean, we say that I is certain for Q, written I |= Q, if Q(r) is true for every repair r of I
R(x, y): (a1, b1), (a1, b2), (a2, b2), (a3, b3), (a3, b4), (a4, b4)
S(y, z): (b1, c1), (b2, c1), (b2, c2), (b3, c3)
Q() = R(x, y), S(y, z)
I |= Q
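A brute-force check of the claim I |= Q for this instance might look as follows. This is a sketch that enumerates every repair, assuming the first attribute of each relation is its key; the helper names `repairs`, `q` and `is_certain` are hypothetical.

```python
from collections import defaultdict
from itertools import product


def repairs(relation):
    """All repairs of a relation whose key is its first attribute."""
    groups = defaultdict(list)
    for tup in relation:
        groups[tup[0]].append(tup)
    for choice in product(*groups.values()):
        yield set(choice)


def q(r_rep, s_rep):
    """Q() = R(x, y), S(y, z): some y value of R is also a key value of S."""
    return any(y == y2 for (_, y) in r_rep for (y2, _) in s_rep)


def is_certain(R, S):
    """I |= Q iff Q holds in every combination of repairs of R and S."""
    return all(q(r, s) for r in repairs(R) for s in repairs(S))


R = [("a1", "b1"), ("a1", "b2"), ("a2", "b2"),
     ("a3", "b3"), ("a3", "b4"), ("a4", "b4")]
S = [("b1", "c1"), ("b2", "c1"), ("b2", "c2"), ("b3", "c3")]

print(is_certain(R, S))  # True
```

Here certainty holds because every repair of R keeps (a2, b2) and every repair of S keeps some tuple with key b2.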

4
Problem Statement
CERTAINTY(Q): given as input an instance I, does I |= Q hold, where Q is a boolean CQ?
In general, CERTAINTY(Q) is in coNP
– Q1 = R(x, y), S(y, z): expressible as a first-order query
– Q2 = R(x, y), S(z, y): coNP-complete
– Q3 = R(x, y), S(y, x): PTIME but not first-order expressible
Conjecture: for every boolean conjunctive query Q, CERTAINTY(Q) is either in PTIME or coNP-complete

5
Progress So Far
[Wijsen, 2010]
– Syntactic characterization of FO-expressible acyclic CQs w/o self-joins
[Kolaitis and Pema, 2012]
– A trichotomy for CQs with 2 atoms and no self-joins
[Wijsen, 2010 & 2013]
– PTIME algorithm for cyclic queries: Ck = R1(x1, x2), …, Rk(xk, x1)
– Further classification of acyclic CQs w/o self-joins

6
Our Contribution
A dichotomy for CQs w/o self-joins where atoms have either
– simple keys: R(x, y, z), or
– keys that consist of all attributes: S(x, y, z)
Theorem: for every boolean CQ Q w/o self-joins where, for each atom, the key consists of either one attribute or all attributes, CERTAINTY(Q) is either in PTIME or coNP-complete

7
Outline
1. The Dichotomy Condition
2. Frugal Repairs & Representable Answers
3. Strongly Connected Graphs

8
The Query Graph
We equivalently study boolean CQs consisting only of binary relations where one attribute is the key: R(x, y)
Relations can be consistent (R^c) or inconsistent (R^i)
Query graph: a directed edge (u, v) for each atom R(u, v)
For an edge R, u_R denotes its source node and v_R its end node
Example: Q = R^i(x, y), S^i(z, w), T^c(y, w)
G[Q] has the edges R: x -> y, S: z -> w, T: y -> w
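As a rough sketch, the query graph of the example query can be built as a dictionary of named edges. The `Atom` record and the `query_graph` function are illustrative names, not part of the presentation.

```python
from collections import namedtuple

# One binary atom: relation name, source variable (the key), end variable,
# and whether the relation is consistent (R^c) or inconsistent (R^i).
Atom = namedtuple("Atom", ["name", "source", "end", "consistent"])


def query_graph(atoms):
    """Return (nodes, edges); each atom R(u, v) becomes a directed edge u -> v."""
    nodes = sorted({a.source for a in atoms} | {a.end for a in atoms})
    edges = {a.name: (a.source, a.end) for a in atoms}
    return nodes, edges


# Q = R^i(x, y), S^i(z, w), T^c(y, w)
Q = [
    Atom("R", "x", "y", consistent=False),
    Atom("S", "z", "w", consistent=False),
    Atom("T", "y", "w", consistent=True),
]

nodes, edges = query_graph(Q)
print(nodes)  # ['w', 'x', 'y', 'z']
print(edges)  # {'R': ('x', 'y'), 'S': ('z', 'w'), 'T': ('y', 'w')}
```

Keying the edges by relation name works because the queries considered here have no self-joins, so each relation name appears in exactly one atom.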

9
Definitions
x^{+,R}: the set of nodes reachable from node x through a directed path once we remove the edge R
R ~ S [source-equivalent]: the source nodes u_R, u_S are in the same SCC
[R]: the equivalence class of R w.r.t. ~
Example (graph on nodes x, y, z, u, v, w with edges R, S, T, U, V):
x^{+,R} = {x, v, w}
R ~ T and [R] = {R, T}
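These definitions can be sketched directly on an edge dictionary. The graph below is a made-up example, not the one drawn on the slide (whose edges are not recoverable from this transcript), and the function names are illustrative.

```python
def reachable(edges, start, removed=None):
    """Nodes reachable from `start` by directed paths, ignoring the edge named `removed`.

    With removed="R" and start=x this is the set x^{+,R}; it always contains x itself.
    """
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for name, (u, v) in edges.items():
            if name != removed and u == node and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def source_equivalent(edges, r, s):
    """R ~ S iff the source nodes of R and S lie in the same SCC."""
    u_r, u_s = edges[r][0], edges[s][0]
    return u_s in reachable(edges, u_r) and u_r in reachable(edges, u_s)


def equivalence_class(edges, r):
    """[R]: all edges that are source-equivalent to R."""
    return {s for s in edges if source_equivalent(edges, r, s)}


# Hypothetical graph: x and y form a cycle, then a tail y -> z -> w.
edges = {"R": ("x", "y"), "S": ("y", "x"), "T": ("y", "z"), "U": ("z", "w")}

print(reachable(edges, "x", removed="R"))        # {'x'}
print(sorted(reachable(edges, "y", removed="T")))  # ['x', 'y']
print(sorted(equivalence_class(edges, "R")))     # ['R', 'S', 'T']
```

Testing mutual reachability is quadratic but keeps the sketch short; a linear-time SCC algorithm would be the natural choice for larger graphs.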

10
Coupled Edges
coupled+(R) = the edges in [R], plus any inconsistent edge S such that the source node u_S is connected to the end node v_R through an undirected path that does not intersect the set u_R^{+,R}
Example (same graph, with x = u_R, y = v_R, u = u_V): coupled+(R)
– contains R, T: [R] = {R, T}
– contains V: path from y (= v_R) to u (= u_V)
– does not contain U

11
Splittable Graphs
Two inconsistent edges R, S are coupled if S is in coupled+(R) and R is in coupled+(S)
A graph G[Q] is:
– unsplittable if it contains a pair of coupled edges that are not source-equivalent
– splittable otherwise
Example (same graph):
coupled+(R) = {R, T, V}
coupled+(T) = {R, T, V}
coupled+(V) = {V}
coupled+(U) = {U, V, R, T}
Only R, T are coupled, and they are source-equivalent (R ~ T), so the graph is splittable

12
The Dichotomy Condition
Dichotomy Theorem:
– If G[Q] is splittable, CERTAINTY(Q) is in PTIME
– If G[Q] is unsplittable, CERTAINTY(Q) is coNP-complete
The example graph is splittable, so CERTAINTY(Q) is in PTIME

13
Examples
– R(x, y), S(y, z): PTIME
– R(x, y), S(y, z), T^c(x, z): coNP-complete
– R(x, y), S(y, z), U^c(z, y): PTIME
– R(x, y), S(z, y), U^c(y, z): coNP-complete

14
Outline
1. The Dichotomy Condition
2. Frugal Repairs & Representable Answers
3. Strongly Connected Graphs

15
Frugal Repairs (1)
Definition: a repair r of an instance I is frugal for a boolean query Q if for any other repair r' of I, Q^f(r') is not strictly contained in Q^f(r)
Q^f = the full query (all body variables moved to the head)
R(x, y): (a1, b1), (a1, b2), (a2, b3), (a3, b4), (a4, b4)
S(y, x): (b1, a1), (b3, a2), (b4, a3), (b4, a4)
Repair r1 = { R(a1, b1), R(a2, b3), R(a3, b4), R(a4, b4), S(b1, a1), S(b3, a2), S(b4, a3) }
Q^f(r1) = { (a1, b1), (a2, b3), (a3, b4) }: not frugal
Repair r2 = { R(a1, b2), R(a2, b3), R(a3, b4), R(a4, b4), S(b1, a1), S(b3, a2), S(b4, a3) }
Q^f(r2) = { (a2, b3), (a3, b4) }: frugal

16
Frugal Repairs (2)
R(x, y): (a1, b1), (a1, b2), (a2, b3), (a3, b4), (a4, b4)
S(y, x): (b1, a1), (b3, a2), (b4, a3), (b4, a4)
I |= Q if and only if every frugal repair satisfies Q
We lose no generality if we study only frugal repairs!
Only two frugal repairs:
Q^f(r2) = {(a2, b3), (a3, b4)}
Q^f(r3) = {(a2, b3), (a4, b4)}
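The frugal answer sets of this example can be recomputed by brute force. The sketch below assumes the query is Q = R(x, y), S(y, x), which the slide does not state explicitly but which matches the answer sets shown, and that the first attribute of each relation is its key. All function names are illustrative.

```python
from collections import defaultdict
from itertools import product


def repairs(relation):
    """All repairs of a relation whose key is its first attribute."""
    groups = defaultdict(list)
    for tup in relation:
        groups[tup[0]].append(tup)
    for choice in product(*groups.values()):
        yield frozenset(choice)


def full_answers(r_rep, s_rep):
    """Q^f(x, y) = R(x, y), S(y, x), evaluated on one repair."""
    return frozenset((x, y) for (x, y) in r_rep if (y, x) in s_rep)


def frugal_answer_sets(R, S):
    """Answer sets Q^f(r) of the frugal repairs r."""
    answers = {full_answers(r, s) for r in repairs(R) for s in repairs(S)}
    # A repair is frugal when no other repair has a strictly smaller answer set.
    return {a for a in answers if not any(b < a for b in answers)}


R = [("a1", "b1"), ("a1", "b2"), ("a2", "b3"), ("a3", "b4"), ("a4", "b4")]
S = [("b1", "a1"), ("b3", "a2"), ("b4", "a3"), ("b4", "a4")]

for answer_set in sorted(sorted(a) for a in frugal_answer_sets(R, S)):
    print(answer_set)
# [('a2', 'b3'), ('a3', 'b4')]
# [('a2', 'b3'), ('a4', 'b4')]
```

The function returns answer sets rather than repairs, since several repairs can produce the same answer set.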

17
Or-Sets
Efficiently represent all answer sets of frugal repairs
We use or-sets: ⟨1, 2, 3⟩ means 1 or 2 or 3
– A = ⟨…⟩
– We can "compress" A as B = {⟨…⟩, ⟨…⟩}
– [Libkin and Wong, '93] "decompression" α operator: α(B) = A
The or-set of answer sets for frugal repairs of I for Q:
– M_Q(I) = ⟨…⟩
Compressed form (set of or-sets):
– A_Q(I) = {⟨…⟩, ⟨…⟩}
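A minimal sketch of the decompression operator α follows, modelling an or-set as a Python list of alternatives and an or-set of sets as a set of frozensets. The concrete value of `B` is an assumption, chosen so that α(B) reproduces the two frugal answer sets {(a2, b3), (a3, b4)} and {(a2, b3), (a4, b4)} from the frugal-repairs example. The or-sets printed on the original slide did not survive extraction.

```python
from itertools import product


def alpha(set_of_or_sets):
    """Decompress a set of or-sets into an or-set of sets.

    Picking one alternative from every or-set yields one possible set;
    the result collects all such sets.
    """
    return {frozenset(choice) for choice in product(*set_of_or_sets)}


# Assumed compressed form: {<(a2, b3)>, <(a3, b4), (a4, b4)>}
B = [
    [("a2", "b3")],
    [("a3", "b4"), ("a4", "b4")],
]

A = alpha(B)
for possible_set in sorted(sorted(s) for s in A):
    print(possible_set)
# [('a2', 'b3'), ('a3', 'b4')]
# [('a2', 'b3'), ('a4', 'b4')]
```

The compressed form stores 3 tuples while the decompressed form can grow as the product of the or-set sizes, which is why the compressed representation is the one worth computing.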

18
Representability (1)
An or-set-of-sets S is representable if there exists a set-of-or-sets S0 (compression) such that:
– α(S0) = S
– For any distinct or-sets A, B in S0, the tuples in A and B use distinct constants in all coordinates
The compression of a representable set with active domain of size n has size polynomial in n
[Figure: two example or-sets of sets, one shown with its compression {⟨…⟩, ⟨…⟩} and one that is not representable]

19
Representability (2)
I |= Q iff the compression A_Q(I) is not empty
If we can compute A_Q(I) in polynomial time, deciding whether I |= Q is in PTIME
Theorem: if G[Q] is a strongly connected graph, M_Q(I) is representable and its compression can be computed in time polynomial in the size of I

20
Outline
1. The Dichotomy Condition
2. Frugal Repairs & Representable Answers
3. Strongly Connected Graphs

21
Cycles
Ck = R1(x1, x2), R2(x2, x3), …, Rk(xk, x1)
The purified instance contains a collection of disjoint SCCs
Algorithm FrugalC:
– Find the SCCs that contain no directed cycle of length > k
– For each such SCC i, create an or-set Ai that contains all cycles of length k
– Output A_Ck(I) = {A1, A2, …}
Example:
R(x, y): (a1, b1), (a2, b2), (a2, b3)
S(y, z): (b1, c1), (b2, c2), (b3, c2)
T(z, x): (c1, a1), (c2, a2)
A_C3(I) = {⟨(a1, b1, c1)⟩, ⟨(a2, b2, c2), (a2, b3, c2)⟩}
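The three steps of FrugalC can be sketched with a brute-force implementation for small instances. This is one reading of the slide, not the authors' code. It assumes the input is already purified, interprets "cycle" as a simple directed cycle, skips components that contain no cycle at all, and enumerates cycles exhaustively (exponential in the worst case, unlike the polynomial-time algorithm the talk refers to). The name `frugal_c` and the list-of-relations input format are hypothetical.

```python
def frugal_c(relations):
    """relations[i] holds the tuples of R_{i+1}(x_{i+1}, x_{i+2}); k = len(relations)."""
    k = len(relations)
    succ = {}
    for i, rel in enumerate(relations):
        for (u, v) in rel:
            # Tag each constant with its variable position so that values of
            # different variables never collide.
            succ.setdefault((i, u), set()).add(((i + 1) % k, v))
    nodes = set(succ) | {w for targets in succ.values() for w in targets}

    def reach(start):
        seen, stack = {start}, [start]
        while stack:
            for w in succ.get(stack.pop(), ()):
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

    # Two nodes share an SCC iff each one reaches the other.
    reach_of = {n: reach(n) for n in nodes}
    sccs = {frozenset(m for m in reach_of[n] if n in reach_of[m]) for n in nodes}

    def simple_cycles(scc):
        # Each simple cycle is reported once, rooted at its smallest node.
        # Position-0 nodes sort first, so every cycle is listed as (x1, ..., xk, ...).
        cycles = []

        def extend(path):
            for w in succ.get(path[-1], ()):
                if w == path[0]:
                    cycles.append(tuple(path))
                elif w in scc and w > path[0] and w not in path:
                    extend(path + [w])

        for root in sorted(scc):
            extend([root])
        return cycles

    result = []
    for scc in sccs:
        cycles = simple_cycles(scc)
        # Keep the SCC only if it has cycles and none is longer than k.
        if cycles and all(len(c) == k for c in cycles):
            result.append(sorted(tuple(v for _, v in c) for c in cycles))
    return sorted(result)


R = [("a1", "b1"), ("a2", "b2"), ("a2", "b3")]
S = [("b1", "c1"), ("b2", "c2"), ("b3", "c2")]
T = [("c1", "a1"), ("c2", "a2")]

print(frugal_c([R, S, T]))
# [[('a1', 'b1', 'c1')], [('a2', 'b2', 'c2'), ('a2', 'b3', 'c2')]]
```

Each inner list plays the role of one or-set Ai, so the printed value corresponds to the set of or-sets A_C3(I) for the slide's instance.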

22
General Case: SCCs (1)
Recursively split an SCC G into an SCC G' and a directed path P that intersects G' only at its start and end nodes
The set A_G'(I) can be recursively computed
Example (graph on nodes x, y, z, t with edges R, S, T, U, V):
The path P = y -> t -> z
A_G'(I) = {A1, A2}

23
General Case: SCCs (2)
A_G'(I) = {A1, A2}
Replacement of G' by new nodes a, b and relations B, B1^c, B2^c, B0^c:
B(a, b): (A1, [a1 b1 c1]), (A2, [a2 b2 c2]), (A2, [a2 b3 c2])
B1^c(b, y): ([a1 b1 c1], b1), ([a2 b2 c2], b2), ([a2 b3 c2], b3)
B2^c(b, z): ([a1 b1 c1], c1), ([a2 b2 c2], c2), ([a2 b3 c2], c2)
B0^c(z, a): (c1, A1), (c2, A2)
Any value belongs to a unique or-set
Result: a cycle C = a -> b -> y -> t -> z -> a, plus a chord B2 that is a consistent relation

24
Rest of the Proof
PTIME algorithm for splittable graphs:
– Find a separator in G[Q] (one always exists if the graph is splittable)
– The separator splits G[Q] into cases with fewer inconsistent edges, which are solved recursively
– Base case: all edges are consistent (check whether Q(I) is true)
coNP-hardness:
– Reduction from the Monotone-3SAT problem

25
Conclusions
Significant progress towards proving the dichotomy for the complexity of Certain Query Answering for Conjunctive Queries
Open problem: settle the dichotomy (or trichotomy), even for queries with self-joins!

26
Thank you!

27
General Case: SCCs (3)
Replacement of G': a cycle C = a -> b -> y -> t -> z -> a, plus a chord B2 that is a consistent relation
– Compute A_C for the modified input
– Throw away any or-sets that have a tuple that does not agree with B2
B(a, b): (A1, [a1 b1 c1]), (A2, [a2 b2 c2]), (A2, [a2 b3 c2])
B1^c(b, y): ([a1 b1 c1], b1), ([a2 b2 c2], b2), ([a2 b3 c2], b3)
B2^c(b, z): ([a1 b1 c1], c1), ([a2 b2 c2], c2), ([a2 b3 c2], c2)
B0^c(z, a): (c1, A1), (c2, A2)

28
Overview
A query graph G[Q] is associated with query Q
The condition for PTIME (splittability) is defined on G[Q]
PTIME case:
– We introduce the notion of frugal repairs & representable answers
– Algorithm for strongly connected graphs
– Use the notion of separators to recursively split the query graph (self-reducibility)
coNP-complete case:
– Reduction from the Monotone-3SAT problem
