CS848: Topics in Databases: Foundations of Query Optimization
Topics covered
- Introduction to description logic: single column QL
- The ALC family of dialects
- Terminologies
- Language extensions

Single column QL
D ::= THING
    | C
Q ::= D as x
    | (empty x)
    | (THING as x minus C as x)
    | (from Q₁, Q₂)
    | (elim x from x.A = y, elim y from y = x, Q)
    | (x.Pf₁ = x.Pf₂)
    | (THING as x minus x.Pf₁ = x.Pf₂)
    | (elim x x.R = y)
    | (THING as x minus elim x from x.R = y, elim y from y = x, THING as x minus Q)
    |

Initial analysis
The language L² consists of all formulae of FOPC with equality and constant functions that use at most two distinct variables.
Theorem: The satisfiability problem for L² is NEXPTIME-complete.
Corollary: The query containment problem for single column QL is decidable for queries that are attribute free.

New syntax (cont’d)
D ::= THING
    | C
Q ::= D as x
    | ⊥,
    | (empty x)
    | (THING as x minus C as x)
    | (from Q₁, Q₂)
    | (elim x from x.A = y, elim y from y = x, Q)
    | (x.Pf₁ = x.Pf₂)
    | (THING as x minus x.Pf₁ = x.Pf₂)
    | (elim x x.R = y)
    | (THING as x minus elim x from x.R = y, elim y from y = x, THING as x minus Q)
    |

New syntax (cont’d)
D ::= THING
    | C
Q ::= D as x
    | ⊥
    | ¬C,
    | (THING as x minus C as x)
    | (from Q₁, Q₂)
    | (elim x from x.A = y, elim y from y = x, Q)
    | (x.Pf₁ = x.Pf₂)
    | (THING as x minus x.Pf₁ = x.Pf₂)
    | (elim x x.R = y)
    | (THING as x minus elim x from x.R = y, elim y from y = x, THING as x minus Q)
    |

New syntax (cont’d)
D ::= THING
    | C
Q ::= D as x
    | ⊥
    | ¬C
    | C₁ ⊓ C₂,
    | (from Q₁, Q₂)
    | (elim x from x.A = y, elim y from y = x, Q)
    | (x.Pf₁ = x.Pf₂)
    | (THING as x minus x.Pf₁ = x.Pf₂)
    | (elim x x.R = y)
    | (THING as x minus elim x from x.R = y, elim y from y = x, THING as x minus Q)
    |

New syntax (cont’d)
D ::= THING
    | C
Q ::= D as x
    | ⊥
    | ¬C
    | C₁ ⊓ C₂
    | ∀A.D,
    | (elim x from x.A = y, elim y from y = x, Q)
    | (x.Pf₁ = x.Pf₂)
    | (THING as x minus x.Pf₁ = x.Pf₂)
    | (elim x x.R = y)
    | (THING as x minus elim x from x.R = y, elim y from y = x, THING as x minus Q)
    |

New syntax (cont’d)
D ::= THING
    | C
Q ::= D as x
    | ⊥
    | ¬C
    | C₁ ⊓ C₂
    | ∀A.D
    | Pf₁ = Pf₂,
    | (x.Pf₁ = x.Pf₂)
    | (THING as x minus x.Pf₁ = x.Pf₂)
    | (elim x x.R = y)
    | (THING as x minus elim x from x.R = y, elim y from y = x, THING as x minus Q)
    |

New syntax (cont’d)
D ::= THING
    | C
Q ::= D as x
    | ⊥
    | ¬C
    | C₁ ⊓ C₂
    | ∀A.D
    | Pf₁ = Pf₂
    | Pf₁ ≠ Pf₂,
    | (THING as x minus x.Pf₁ = x.Pf₂)
    | (elim x x.R = y)
    | (THING as x minus elim x from x.R = y, elim y from y = x, THING as x minus Q)
    |

New syntax (cont’d)
D ::= THING
    | C
Q ::= D as x
    | ⊥
    | ¬C
    | C₁ ⊓ C₂
    | ∀A.D
    | Pf₁ = Pf₂
    | Pf₁ ≠ Pf₂
    | ∃R.THING,
    | (elim x x.R = y)
    | (THING as x minus elim x from x.R = y, elim y from y = x, THING as x minus Q)
    |

New syntax (cont’d)
D ::= THING
    | C
Q ::= D as x
    | ⊥
    | ¬C
    | C₁ ⊓ C₂
    | ∀A.D
    | Pf₁ = Pf₂
    | Pf₁ ≠ Pf₂
    | ∃R.THING
    | ∀R.D,
    | (THING as x minus elim x from x.R = y, elim y from y = x, THING as x minus Q)
    |

New syntax (cont’d)
Q ::= D as x
    |
D ::= THING
    | C
    | ⊥
    | ¬C
    | C₁ ⊓ C₂
    | ∀A.D
    | Pf₁ = Pf₂
    | Pf₁ ≠ Pf₂
    | ∃R.THING
    | ∀R.D
    | (D)

New syntax (cont’d)
Q ::= D as x
    |
D ::= ⊤
    | C
    | ⊥
    | ¬C
    | C₁ ⊓ C₂
    | ∀A.D
    | Pf₁ = Pf₂
    | Pf₁ ≠ Pf₂
    | ∃R.⊤
    | ∀R.D
    | (D)

Concept dependencies
On terminology and notation:
We call an instance of the language generated by D for a given DL a concept.
A concept inclusion dependency 𝒞 for a given DL is written D₁ ⊑ D₂ and corresponds to the query containment dependency (D₁ as x) ⊑ (D₂ as x).
A concept definition 𝒞 for a given DL is written C ≡ D and corresponds to the query equivalence dependency (C as x) ≡ (D as x).

CLASSIC† (our first DL)
                                  (syntax)        (semantics)
                                  D ::=
(universal concept)               | ⊤             Δ
(primitive concept)               | C             (C)^I
(bottom concept)                  | ⊥             ∅
(atomic negation)                 | ¬C            Δ - (C)^I
(intersection)                    | D₁ ⊓ D₂       (D₁)^I ∩ (D₂)^I
(attribute value restriction)     | ∀A.D          {e : (A)^I(e) ∈ (D)^I}
(path agreement)                  | Pf₁ = Pf₂     {e : (Pf₁)^I(e) = (Pf₂)^I(e)}
(path disagreement)               | Pf₁ ≠ Pf₂     {e : (Pf₁)^I(e) ≠ (Pf₂)^I(e)}
(existential quantification)      | ∃R.D          {e₁ : ∃e₂ : (e₁, e₂) ∈ (R)^I ∧ e₂ ∈ (D)^I}
(role value restriction)          | ∀R.D          {e₁ : ∀(e₁, e₂) ∈ (R)^I : e₂ ∈ (D)^I}
                                  | (D)
† [Borgida and Patel-Schneider, 1994]

Concept dependencies (cont’d)
The concept inclusion problem for a given DL is to determine if a concept inclusion dependency in the DL, D₁ ⊑ D₂, is an axiom; that is, to determine if (D₁)^I ⊆ (D₂)^I for any database I.
Theorem: The concept inclusion problem for CLASSIC is solvable in low order polynomial time.

An efficient decision procedure
Theorem: The following procedure decides if 𝒞 = (D₁ ⊑ D₂) is an axiom for CLASSIC, and can be implemented in low order polynomial time.
1. Create a partial database I₁ consisting of a single individual e in concept D₁. Perform a simple chase of I₁ to obtain a partial database I₂.
2. Return true if the domain of I₂ is empty, or if the tuple ⟨x : e, cnt : 1⟩ occurs in ⟦D₂ as x⟧(I₂)†; otherwise return false.
† Use forced semantics for agreements and disagreements.

The simple chase
[Diagram: graph rewrite rules on labeled nodes; only node labels and arc names survive in the transcript.]
n : {D₁ ⊓ D₂} ∪ L   becomes   n : {D₁, D₂} ∪ L
n₁ : {∀A.D} ∪ L   becomes   n₁ : L with an A arc to a node n₂ : {D}
n₁ : {∃R.D} ∪ L   becomes   n₁ : L with an R arc to a node n₂ : {D}

The simple chase (cont’d)
[Diagram: graph rewrite rules on labeled nodes; only node labels and arc names survive in the transcript.]
n₁ : {∀R.D} ∪ L₁ with an R arc to n₂ : L₂   becomes   n₁ : L₁ with an R arc to n₂ : {D} ∪ L₂
n : {A₁.A₂.⋯.A_r = B₁.B₂.⋯.B_s} ∪ L   becomes   n : L with arcs A₁, A₂, …, A_r through nodes u₁ : ∅, …, u_r : ∅ and arcs B₁, B₂, …, B_s through nodes v₁ : ∅, …, v_s : ∅

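The simple chase (cont’d)
[Diagram: graph rewrite rules on labeled nodes; only node labels and arc names survive in the transcript.]
n : {A₁.A₂.⋯.A_r ≠ B₁.B₂.⋯.B_s} ∪ L   becomes   n : L with arcs A₁, A₂, …, A_r through nodes u₁ : ∅, …, u_r : ∅ and arcs B₁, B₂, …, B_s through nodes v₁ : ∅, …, v_s : ∅
Nodes w : L, u : L₁, v : L₂ joined by A arcs (node labels are the same before and after the rewrite).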

The simple chase (cont’d)
[Diagram: graph rewrite rules on labeled nodes; only node labels and arc names survive in the transcript.]
n₁ : L₁, n₂ : L₂   becomes   n₁ : L₁ ∪ L₂, n₂ : L₁ ∪ L₂
Nodes n₁ : L₁, n₂ : L₂, n₃ : L₃ (node labels are the same before and after the rewrite).
Nodes u : L₁, v : L₃, x : L₄, w : L₂ joined by A arcs (node labels are the same before and after the rewrite).

The simple chase (cont’d)
[Diagram: graph rewrite rules on labeled nodes; only node labels and arc names survive in the transcript.]
w : L, u : L₁, v : L₂ joined by A arcs   becomes   w : {⊥}, u : L₁, v : L₂
Nodes u : L₁, v : L₃, x : L₄, w : L₂ joined by A arcs (node labels are the same before and after the rewrite).

The simple chase (cont’d)
[Diagram: three patterns, each of which triggers the rewrite "remove all nodes and incident arcs"; arcs are not recoverable from the transcript.]
n : {⊥} ∪ L
or m : L₁, n : L₂
or n : {C, ¬C} ∪ L

Evaluating agreements and disagreements
Note that agreements and disagreements can navigate missing attribute values. In such cases, assume a forced semantics. In particular:
- a node n satisfies an agreement iff the agreement has the form Pf₁.Pf = Pf₂.Pf, where (Pf₁)^I(n) and (Pf₂)^I(n) are defined and lead to nodes connected by an equality arc;
- a node n satisfies a disagreement iff the disagreement has the form Pf₁ ≠ Pf₂, where (Pf₁)^I(n) and (Pf₂)^I(n) are defined and lead to nodes connected by an inequality arc.

Example
Observation: The chase decision procedure for CLASSIC can be implemented in O(n log n) time, where n is the length of the component descriptions.
select e from EMP as e where e = e.b.b.b and e = e.b.b.b.b.b
→ (from (EMP as x), (from (x = x.b.b.b), (x = x.b.b.b.b.b))) ≡ EMP ⊓ (id = b.b.b) ⊓ (id = b.b.b.b.b) as x
→ EMP ⊓ (id = b.b.b) ⊓ (id = b.b.b.b.b) ≡ EMP ⊓ (id = id.b)
→ EMP ⊓ (id = b) as x
→ select e from EMP as e where e = e.b
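The simplification in this example rests on a small fact about a functional attribute b: if b³(e) = e and b⁵(e) = e, then b²(e) = b²(b³(e)) = b⁵(e) = e, and so b(e) = b(b²(e)) = b³(e) = e. The following is only a brute-force sanity check of that fact, not the chase procedure itself; the function names `iterate` and `agreements_collapse` are made up for this sketch, and b is modeled as a total function on a small finite domain.

```python
from itertools import product


def iterate(b, e, k):
    """Apply the attribute function b to e exactly k times."""
    for _ in range(k):
        e = b[e]
    return e


def agreements_collapse(domain_size):
    """Check, for every total function b on {0, ..., domain_size - 1},
    that e = b^3(e) and e = b^5(e) together force e = b(e)."""
    for b in product(range(domain_size), repeat=domain_size):
        for e in range(domain_size):
            fixed_by_3 = iterate(b, e, 3) == e
            fixed_by_5 = iterate(b, e, 5) == e
            if fixed_by_3 and fixed_by_5 and b[e] != e:
                return False  # counterexample found
    return True


if __name__ == "__main__":
    print(agreements_collapse(4))
```

The check enumerates all 4⁴ = 256 functions on a four-element domain, which is enough to exercise cycles of length 1 through 4.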

The ALC family of DLs
                                      (syntax)        (semantics)
                                      D ::=
(primitive concept)                   | C             (C)^I
(universal concept)                   | ⊤             Δ
(bottom concept)                      | ⊥             ∅
(atomic negation)                     | ¬C            Δ - (C)^I
(intersection)                        | D₁ ⊓ D₂       (D₁)^I ∩ (D₂)^I
(role value restriction)              | ∀R.D          {e₁ : ∀(e₁, e₂) ∈ (R)^I : e₂ ∈ (D)^I}
(limited existential quantification)  | ∃R.⊤          {e₁ : ∃e₂ : (e₁, e₂) ∈ (R)^I}
(union)                               | D₁ ⊔ D₂       (D₁)^I ∪ (D₂)^I
(full existential quantification)     | ∃R.D          {e₁ : ∃e₂ : (e₁, e₂) ∈ (R)^I ∧ e₂ ∈ (D)^I}
(quantified number restriction)       | (≥ n R)       {e₁ : |{e₂ : (e₁, e₂) ∈ (R)^I}| ≥ n}
(quantified number restriction)       | (≤ n R)       {e₁ : n ≥ |{e₂ : (e₁, e₂) ∈ (R)^I}|}
(full negation)                       | ¬D            Δ - (D)^I
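The semantics column can be read as a recursive evaluator over a finite database. The following is a minimal sketch under assumptions of this example only: concepts are encoded as nested tuples with made-up tags ("prim", "and", "some", "atleast", and so on), and a database is given by a domain, a dict `prim` from primitive concept names to sets of individuals, and a dict `roles` from role names to sets of pairs.

```python
def extension(concept, domain, prim, roles):
    """Return the set of individuals in (concept)^I for a finite database."""
    op = concept[0]

    def ext(c):
        return extension(c, domain, prim, roles)

    def successors(e, role):
        return {b for (a, b) in roles.get(role, ()) if a == e}

    if op == "top":
        return set(domain)
    if op == "bottom":
        return set()
    if op == "prim":
        return set(prim.get(concept[1], ()))
    if op == "not":                      # atomic or full negation
        return set(domain) - ext(concept[1])
    if op == "and":
        return ext(concept[1]) & ext(concept[2])
    if op == "or":
        return ext(concept[1]) | ext(concept[2])
    if op == "all":                      # role value restriction
        filler = ext(concept[2])
        return {e for e in domain if successors(e, concept[1]) <= filler}
    if op == "some":                     # existential quantification
        filler = ext(concept[2])
        return {e for e in domain if successors(e, concept[1]) & filler}
    if op == "atleast":                  # (>= n R)
        return {e for e in domain if len(successors(e, concept[2])) >= concept[1]}
    if op == "atmost":                   # (<= n R)
        return {e for e in domain if len(successors(e, concept[2])) <= concept[1]}
    raise ValueError(f"unknown constructor: {op}")


# A tiny illustrative database (the individuals are invented for this sketch).
DOMAIN = {"mary", "tom", "ann"}
PRIM = {"PERSON": DOMAIN, "FEMALE": {"mary", "ann"}}
ROLES = {"hasChild": {("mary", "tom")}}
WOMAN = ("and", ("prim", "PERSON"), ("prim", "FEMALE"))
MOTHER = ("and", WOMAN, ("some", "hasChild", ("prim", "PERSON")))
```

With this database, `extension(MOTHER, DOMAIN, PRIM, ROLES)` picks out the individuals that are women with at least one child who is a person. Note that a role value restriction holds vacuously for individuals with no successors.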

The ALC family of DLs (cont’d)
                 FL₀    FL⁻    AL     ALN
D ::= C           ✓      ✓      ✓      ✓
    | ⊤                  ✓      ✓      ✓
    | ⊥                  ✓      ✓      ✓
    | ¬C                        ✓      ✓
    | D₁ ⊓ D₂     ✓      ✓      ✓      ✓
    | ∀R.D        ✓      ✓      ✓      ✓
    | ∃R.⊤               ✓      ✓      ✓
    | D₁ ⊔ D₂
    | ∃R.D
    | (≥ n R)                          ✓
    | (≤ n R)                          ✓
    | ¬D

The ALC family of DLs (cont’d)
                 ALU    ALE    ALUE   ALC    ALCN
D ::= C           ✓      ✓      ✓      ✓      ✓
    | ⊤           ✓      ✓      ✓      ✓      ✓
    | ⊥           ✓      ✓      ✓      ✓      ✓
    | ¬C          ✓      ✓      ✓      ✓      ✓
    | D₁ ⊓ D₂     ✓      ✓      ✓      ✓      ✓
    | ∀R.D        ✓      ✓      ✓      ✓      ✓
    | ∃R.⊤        ✓      ✓      ✓      ✓      ✓
    | D₁ ⊔ D₂     ✓             ✓      ∘      ✓
    | ∃R.D               ✓      ✓      ∘      ✓
    | (≥ n R)                                 ✓
    | (≤ n R)                                 ✓
    | ¬D                        ∘      ✓      ✓

Some complexity results
Theorem: The concept inclusion problems for ALC and ALCN are PSPACE-complete.
A consistency problem for a given set of concepts is to determine if there exists a database that interprets a given member of the set as nonempty.
Observation: The consistency problem for ALC (resp. ALCN) coincides with the concept inclusion problem for ALC (resp. ALCN). In particular, D₁ ⊑ D₂ is an axiom iff the concept (D₁ ⊓ ¬D₂) is not consistent.

Testing consistency in ALC
Theorem: The following procedure decides if a given concept D in ALC is consistent.
1. Create a singleton set S₁ = {I} of partial databases in which I consists of a single individual e in concept D. Perform a union generalized chase of S₁ to obtain a set of partial databases S₂ = {I₁, …, Iₙ}.
2. Return true if the domain of any database in S₂ is nonempty; otherwise return false.

Union generalized chase
Repeatedly do the following to a given set of partial databases S until no changes occur.
1. Apply the simple chase augmented with the negation rule to a member of S.
2. If S contains a partial database I that in turn contains a node n with the first form below, then replace I with two partial databases I₁ and I₂ in S in which the labeling of node n is revised to the second and third forms below.
   e : {D₁ ⊔ D₂} ∪ L   (old node n in I)
   e : {D₁} ∪ L        (new node n in I₁)
   e : {D₂} ∪ L        (new node n in I₂)
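The branching step can be sketched as a small recursive consistency test. This is a compact variant written for illustration, not the set-of-partial-databases bookkeeping described on the slides: it works on one node label at a time, tries each disjunct of a union in turn, and checks role successors recursively. It assumes the input is already in negation normal form, covers only ⊤, ⊥, primitive concepts and their negations, ⊓, ⊔, ∃R.D and ∀R.D, and uses a made-up tuple encoding ("prim", "not", "and", "or", "some", "all").

```python
def consistent(label):
    """Decide whether a set of NNF concepts can hold together at one node."""
    label = set(label)

    # Intersection rule: add both conjuncts to the node label.
    changed = True
    while changed:
        changed = False
        for c in list(label):
            if c[0] == "and":
                for part in c[1:]:
                    if part not in label:
                        label.add(part)
                        changed = True

    # Clash: bottom, or a concept together with its negation.
    if ("bottom",) in label:
        return False
    if any(c[0] == "not" and c[1] in label for c in label):
        return False

    # Union rule: split into two alternatives, one per disjunct.
    for c in label:
        if c[0] == "or" and c[1] not in label and c[2] not in label:
            return consistent(label | {c[1]}) or consistent(label | {c[2]})

    # Existential rule: each successor must satisfy the filler
    # plus every value restriction on the same role.
    for c in label:
        if c[0] == "some":
            successor = {c[2]}
            successor |= {d[2] for d in label if d[0] == "all" and d[1] == c[1]}
            if not consistent(successor):
                return False
    return True


A = ("prim", "A")
NOT_A = ("not", A)
print(consistent({("and", ("some", "R", A), ("all", "R", NOT_A))}))
```

Recursion terminates because each successor label contains only concepts nested strictly deeper inside the current label.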

The negation rule
Exhaustively apply the following rewrites to the concept labeling for any given node:†
¬⊤ ⇒ ⊥
¬⊥ ⇒ ⊤
¬¬D ⇒ D
¬(D₁ ⊓ D₂) ⇒ (¬D₁) ⊔ (¬D₂)
¬∀A.D ⇒ ∀A.¬D
¬∀R.D ⇒ ∃R.¬D
¬∃R.D ⇒ ∀R.¬D
¬(D₁ ⊔ D₂) ⇒ (¬D₁) ⊓ (¬D₂)
† Obtains negation normal form for concept descriptions.
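The rewrites translate directly into a recursive function that pushes negation inward. The following is a minimal sketch assuming a made-up tuple encoding of concepts, with the tag "all_attr" standing for an attribute value restriction ∀A.D and "all" / "some" for the role forms.

```python
def nnf(c):
    """Rewrite a concept into negation normal form."""
    op = c[0]
    if op in ("top", "bottom", "prim"):
        return c
    if op in ("and", "or"):
        return (op, nnf(c[1]), nnf(c[2]))
    if op in ("all", "some", "all_attr"):
        return (op, c[1], nnf(c[2]))
    if op != "not":
        raise ValueError(f"unknown constructor: {op}")

    d = c[1]  # the negated concept
    inner = d[0]
    if inner == "top":
        return ("bottom",)
    if inner == "bottom":
        return ("top",)
    if inner == "prim":
        return c  # atomic negation is already normal
    if inner == "not":
        return nnf(d[1])
    if inner == "and":
        return ("or", nnf(("not", d[1])), nnf(("not", d[2])))
    if inner == "or":
        return ("and", nnf(("not", d[1])), nnf(("not", d[2])))
    if inner == "all_attr":
        return ("all_attr", d[1], nnf(("not", d[2])))
    if inner == "all":
        return ("some", d[1], nnf(("not", d[2])))
    if inner == "some":
        return ("all", d[1], nnf(("not", d[2])))
    raise ValueError(f"unknown constructor: {inner}")


A, B = ("prim", "A"), ("prim", "B")
print(nnf(("not", ("and", A, ("all", "R", B)))))
```

After normalization, negation appears only directly in front of primitive concepts.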

A general membership problem
A database schema T that consists of concept dependencies in which no primitive concept occurs more than once on the left-hand side of a concept definition is called a terminology.
The membership problem for a DL dialect is to determine, given a set {𝒞₁, …, 𝒞ₙ, 𝒞} of concept dependencies in the DL, if {𝒞₁, …, 𝒞ₙ} ⊨ 𝒞; that is, if every database I that models each 𝒞ᵢ also models 𝒞.
Theorem: The membership problem for CLASSIC is undecidable.
Theorem: The membership problem for ALCN is DEXPTIME-complete.

Varieties of terminologies
A terminology T with only concept definitions is definitional.
For each C₁ ≡ D occurring in a terminology T and each primitive concept C₂ occurring in D, C₁ has a direct use of C₂. The use relation is the transitive closure of direct use.
T is cyclic iff there exists an atomic concept in T that has a use of itself.
T is acyclic iff it is definitional and is not cyclic.

An acyclic terminology in ALC
WOMAN ≡ PERSON ⊓ FEMALE
MAN ≡ PERSON ⊓ ¬WOMAN
MOTHER ≡ WOMAN ⊓ ∃hasChild.PERSON
FATHER ≡ MAN ⊓ ∃hasChild.PERSON
PARENT ≡ FATHER ⊔ MOTHER
GRANDMOTHER ≡ MOTHER ⊓ ∃hasChild.PARENT
MOTHERWITHMANYCHILDREN ≡ MOTHER ⊓ ≥ 3 hasChild
MOTHERWITHOUTDAUGHTER ≡ MOTHER ⊓ ∀hasChild.¬WOMAN
WIFE ≡ WOMAN ⊓ ∃hasHusband.MAN
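Acyclicity of a definitional terminology can be checked by computing the use relation as the transitive closure of direct use. The following is a minimal sketch: the dict `FAMILY` lists, for each defined concept above, the concept names occurring on its right-hand side, and the names `use_relation` and `is_cyclic` are made up for this example.

```python
def use_relation(direct):
    """Transitive closure of direct use.

    direct maps each defined concept to the set of concept names
    occurring on the right-hand side of its definition.
    """
    uses = {c: set(rhs) for c, rhs in direct.items()}
    changed = True
    while changed:
        changed = False
        for c in uses:
            for d in list(uses[c]):
                new = uses.get(d, set()) - uses[c]
                if new:
                    uses[c] |= new
                    changed = True
    return uses


def is_cyclic(direct):
    """A terminology is cyclic iff some concept has a use of itself."""
    uses = use_relation(direct)
    return any(c in uses[c] for c in uses)


FAMILY = {
    "WOMAN": {"PERSON", "FEMALE"},
    "MAN": {"PERSON", "WOMAN"},
    "MOTHER": {"WOMAN", "PERSON"},
    "FATHER": {"MAN", "PERSON"},
    "PARENT": {"FATHER", "MOTHER"},
    "GRANDMOTHER": {"MOTHER", "PARENT"},
    "MOTHERWITHMANYCHILDREN": {"MOTHER"},
    "MOTHERWITHOUTDAUGHTER": {"MOTHER", "WOMAN"},
    "WIFE": {"WOMAN", "MAN"},
}

print(is_cyclic(FAMILY))
```

A definition such as HUMAN ≡ ANIMAL ⊓ ∀hasParent.HUMAN (an invented example, not from the slides) would make `is_cyclic` return True, since HUMAN has a direct use of itself.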

More complexity results
Theorem: The membership problem for FL₀ with acyclic terminologies is CoNP-complete.
Theorem: The membership problem for ALC with acyclic terminologies is PSPACE-complete.
The DL ALCF extends ALC with agreements and disagreements of path functions.
Theorem: The concept inclusion problem for ALCF is PSPACE-complete.
Theorem: The membership problem for ALCF with acyclic terminologies is NEXPTIME-complete.

Blocking
Theorem: The membership problem for ALCN is DEXPTIME-complete.
The membership problem for ALCN can be solved by a refinement of the consistency checking algorithm for concepts in ALC. There are two important tricks to note.
1. Each concept dependency occurring in the terminology, e.g. D₁ ⊑ D₂, is internalized to each new node by adding a corresponding concept, e.g. (¬D₁ ⊔ D₂), to the node’s label.
2. To ensure termination, no chasing is performed on blocked nodes. A node is blocked if its concepts are included in an older node.
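Both tricks are small enough to sketch in isolation. The following assumes a made-up tuple encoding of concepts and represents node labels as sets; the helper names `internalize` and `is_blocked` are hypothetical, and the sketch does not include the chase loop that would call them.

```python
def internalize(d1, d2):
    """Turn the inclusion dependency d1 ⊑ d2 into the concept (¬d1 ⊔ d2)."""
    return ("or", ("not", d1), d2)


def is_blocked(label, older_labels):
    """A node is blocked if its label is included in some older node's label."""
    return any(set(label) <= set(older) for older in older_labels)


PERSON = ("prim", "PERSON")
HAS_PARENT = ("some", "hasParent", PERSON)
DEPENDENCY = internalize(PERSON, HAS_PARENT)  # PERSON ⊑ ∃hasParent.PERSON

root = {PERSON, DEPENDENCY, HAS_PARENT}
child = {PERSON, DEPENDENCY}
print(is_blocked(child, [root]))
```

In this invented example every person has a parent who is a person, so an unblocked chase would keep creating new parent nodes; the child node carries no concept that the root lacks, so chasing stops there.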

Language extensions
- Role constructors
- Role value maps
- Uniqueness constraints