1 Time to Leave the Trees: From Syntactic to Conceptual Querying of XML
Bertram Ludäscher, Ilkay Altintas, Amarnath Gupta
San Diego Supercomputer Center, U.C. San Diego
XMLDM'02, Prague

2 Overview
Motivating Example:
– querying XML w/o and w/ conceptual-level information
– “syntactic” vs. “conceptual” querying of XML
Distilling conceptual-level information:
– MXS (abstract Model for XML Schema)
XPathT:
– Incorporating conceptual-level information in XPath

3 Motivating Example
Example: “Books DB” (yes, more complex examples exist... ;)
– elements:..............
Sample Queries:
– Q1: Which <book>s have a <price> below $80?
– Q2: What’s the count and average of <...>s?
(Nice) Try:
– Q1: myDB//book[price<80]
– Q2: N := count(myDB//book); S := sum(myDB//book/price); Avg := S/N;
But what about...
– ... <...>s with multiple <...>s?
– ... <awe> (award-winning-exemplars) elements (= subtype of book having subelement <...>): we forgot those!
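The pitfall on this slide can be reproduced with a minimal Python sketch. The sample document, its titles and prices, and the helper name `cheap` are assumptions made for illustration; only the tag names `book`, `price` and `awe` come from the slides. ElementTree's limited XPath dialect does not support the `[price<80]` comparison, so that predicate is applied in Python instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance: two <book> elements and one <awe> element.
DOC = """
<myDB>
  <book><title>A</title><price>50</price></book>
  <book><title>B</title><price>120</price></book>
  <awe><title>C</title><price>70</price></awe>
</myDB>
"""
root = ET.fromstring(DOC)


def cheap(elements, limit=80.0):
    """Keep elements with at least one <price> child below the limit."""
    return [e for e in elements
            if any(float(p.text) < limit for p in e.findall("price"))]


# Q1: myDB//book[price<80]
books = root.findall(".//book")
q1 = [b.findtext("title") for b in cheap(books)]

# Q2: N := count(myDB//book); S := sum(myDB//book/price); Avg := S/N
prices = [float(p.text) for b in books for p in b.findall("price")]
q2_count = len(books)
q2_avg = sum(prices) / q2_count

print(q1, q2_count, q2_avg)
```

Both answers silently ignore the `<awe>` element with title C, although it costs less than $80: the purely syntactic query only sees the tag name `book`.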

4 Schema Information to the Rescue!
XML & Semistructured Data Model:
– labeled ordered trees
– “instance contains its own schema information”
– XML instances and DTDs have very little “schema info”:
  tag names (aka element “types”) = attribute names
  element nesting = object (“slot”) structure
  => no data types, constraints, classes, class hierarchy,...
Schemas are Good for You!
– link to conceptual models/DB design, query formulation,
– validation, storage layout (optimization),
– query processing (optimization),...
=> XML Schema

5 Motivating Example (Cont’d)
Q1 after studying <...> and/or its XML Schema:
=> there is a type hierarchy below type bookT
=> tag names are bound to those types
=> but XPath doesn’t know this => use Syntactic Queries:
//*[book OR tbook OR cbook OR...OR awe] [price<80]
=> tedious and error-prone (do-it-yourself: Appendix A)
– e.g. you overlooked <...>! (usually schema info not contained in the XML instance)
=> small changes in the schema (adding a new subtype) require rewriting of your query...

6 From Syntactic to Conceptual XML Queries
1. Distill conceptual information from the XML Schema
=> Abstract Model of XML Schema (MXS)
2. Incorporate MXS information into the query language
=> XPathT (“XPath with types/classes”)
=> turn Syntactic XML Query
//*[book OR tbook OR cbook OR... OR awe] [price<80]
=> into a more adequate Conceptual XML Query:
//*[ts(bookT)][price<80] /* works for any subtype of bookT */
=> more robust w.r.t. schema changes
=> new opportunities for semantic query optimization
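One way to picture the relationship between the two query forms is a small rewriter that expands the conceptual qualifier back into a syntactic one. This is only a sketch: the function name `to_syntactic` and the lookup table `SEM_TS` are hypothetical, the tag set is an illustrative subset (the slides elide the full list), and the output uses XPath 1.0 `self::` name tests rather than the informal `OR` notation of the slide.

```python
import re

# Hypothetical table: type name -> tag names bound to that type or a subtype.
SEM_TS = {"bookT": {"book", "tbook", "cbook", "awe"}}


def to_syntactic(query, sem_ts):
    """Replace each [ts(T)] qualifier by a disjunction of self:: name tests."""
    def expand(match):
        tags = sorted(sem_ts[match.group(1)])
        return "[" + " or ".join(f"self::{tag}" for tag in tags) + "]"
    return re.sub(r"\[ts\((\w+)\)\]", expand, query)


syntactic = to_syntactic("//*[ts(bookT)][price<80]", SEM_TS)
print(syntactic)
```

If a new subtype is added to the table, the conceptual query text stays the same and only the generated syntactic query changes, which is the robustness argument made on the slide.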

7 Abstract Model of XML Schema (MXS)
Basic Ideas:
– Formal abstract model (never mind the XML Schema syntax!), inspired by Model Schema Language (MSL) [Brown-Fuchs-Robie-Wadler-WWW10-2001]
– “Types as Classes”
XML Schema Names:
– T: Type Names
– E: Element Names
– A: Attribute Names
XML Instances...
– ... usually contain only element names (tags) E and attributes A (exception: “xsi:type =...”)

8 Abstract Model of XML Schema (MXS)
MXS Names
– T: Types, E: Elements, A: Attributes
Kinds of Types
– simple vs. complex: T_s, T_c
– abstract vs. concrete: T_a, T_na
Type Hierarchy
– restrict ⊆ (T_s × T_s) ∪ (T_c × T_c)
  restricts possible instances, keeping structure
– extend ⊆ (T_s ∪ T_c) × T_c
  adds “slots” (elements and attributes)
– subtype = extend ∪ restrict
  extend and restrict are subtyping mechanisms
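The type hierarchy relations can be written down as plain sets of pairs. The following Python sketch assumes that restrict relates types of the same kind, that extend always yields a complex type, and that each pair is ordered (supertype, subtype). The concrete pairs, the type name `priceT`, and the function name `well_formed` are illustrative assumptions, not taken from the slides.

```python
SIMPLE = {"priceT", "expPriceT"}                        # T_s (priceT is hypothetical)
COMPLEX = {"publicationT", "bookT", "textBookT"}        # T_c

# Pairs are (supertype, subtype); the specific pairs are assumptions.
RESTRICT = {("priceT", "expPriceT")}
EXTEND = {("publicationT", "bookT"), ("bookT", "textBookT")}


def well_formed(restrict, extend, simple, complex_):
    """Check the kind constraints on restrict and extend."""
    # restrict stays within one kind: simple-to-simple or complex-to-complex.
    ok_restrict = all(
        (a in simple and b in simple) or (a in complex_ and b in complex_)
        for a, b in restrict
    )
    # extend starts from any known type and always produces a complex type.
    ok_extend = all(
        (a in simple or a in complex_) and b in complex_
        for a, b in extend
    )
    return ok_restrict and ok_extend


# subtype is the union of the two derivation mechanisms.
SUBTYPE = EXTEND | RESTRICT

print(well_formed(RESTRICT, EXTEND, SIMPLE, COMPLEX), sorted(SUBTYPE))
```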

9 Type (Class) Hierarchy in XML Schema
Convention: user-defined type names end with “T”
– authorT, publicationT, bookT,...

10 Inheritance in XML Schema (I)
expTextBookT ::= SUBTYPE (bookT) that RESTRICTs to expPriceT and EXTENDs with <...>
[Figure: type hierarchy diagram with edges labeled EXTEND, RESTRICT, SUBTYPE]

11 Inheritance in XML Schema (II)
19th centuryTextBookType ::= SUBTYPE {textBookT, c19bookT}
[Figure: two diagrams, labeled “multiple inheritance” and “single inheritance”]
The XML Schema type system does not know that the two are equivalent!

12 Framework for Conceptual Queries in XML
Binding Types to Elements
– bind ⊆ (E × (T_s ∪ T_c)) ∪ (A × T_s)
  binds element names to simple or complex types
  binds attribute names to simple types
Syntactic XML Instance: D
– root(NodeId), child(NodeId,Integer,NodeId), tag(NodeId,Tagname), data(NodeId,Data)
Conceptual XML Instance: D+
– restrict(T, T), extend(T, T), subtype(T, T),
– bind(E ∪ A, T)
– ...
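The syntactic instance D can be pictured as a set of facts extracted from a parsed document. The sketch below is one possible reading, not the authors' implementation: it assumes that node ids are assigned in document order starting at 1, that child(P, I, C) means "C is the I-th child of P", and that data holds the stripped leading text of an element. Attributes and mixed content are ignored, and the function name `syntactic_instance` is hypothetical.

```python
import xml.etree.ElementTree as ET
from itertools import count


def syntactic_instance(xml_text):
    """Return the relations root, child, tag and data as sets of tuples."""
    facts = {"root": set(), "child": set(), "tag": set(), "data": set()}
    ids = count(1)

    def visit(element):
        node = next(ids)                       # ids follow document order
        facts["tag"].add((node, element.tag))
        text = (element.text or "").strip()
        if text:
            facts["data"].add((node, text))
        for position, sub in enumerate(element, start=1):
            facts["child"].add((node, position, visit(sub)))
        return node

    facts["root"].add((visit(ET.fromstring(xml_text)),))
    return facts


D = syntactic_instance("<myDB><book><price>50</price></book></myDB>")
print(D)
```

The conceptual instance D+ would add the schema-level relations (restrict, extend, subtype, bind) to these facts.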

13 XPathT: Incorporating Type (Class) Information in XPath
XPath patterns p and qualifiers q: p[q] returns matches of p which qualify according to q
New XPathT patterns:
– r(t), e(t), s(t): restrict, extend, subtype type t
– tr(t), te(t), ts(t): transitive versions

14 Semantics of XPathT
Example:
“transitive subtype”: SEM( ts(t) ) := { t’ | subtype*(t,t’) }
from types to element names: SEM( [T] ) := { e | bind(e,t), t ∈ T }
SEM( [ts(bookT)] ) := {book,ebook,tbook,...}
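These two definitions translate almost directly into code. The sketch below reads subtype* as the reflexive-transitive closure (the slide's example includes `book` itself), stores subtype as (supertype, subtype) pairs, and stores bind as (element name, type) pairs. The type names `ebookT` and `journalT`, the tag `journal`, and the function names are assumptions for illustration.

```python
# (supertype, subtype) pairs; ebookT is a hypothetical type name.
SUBTYPE = {("bookT", "ebookT"), ("bookT", "textBookT")}

# (element name, type) pairs; journal/journalT are hypothetical.
BIND = {("book", "bookT"), ("ebook", "ebookT"),
        ("tbook", "textBookT"), ("journal", "journalT")}


def sem_ts(t, subtype):
    """SEM(ts(t)): all t' reachable from t via zero or more subtype steps."""
    seen, stack = {t}, [t]
    while stack:
        current = stack.pop()
        for sup, sub in subtype:
            if sup == current and sub not in seen:
                seen.add(sub)
                stack.append(sub)
    return seen


def sem_types(types, bind):
    """SEM([T]): element names bound to some type in the set T."""
    return {e for e, t in bind if t in types}


book_tags = sem_types(sem_ts("bookT", SUBTYPE), BIND)
print(sorted(book_tags))
```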

15 Conceptual(-level) XML Queries in XPathT
Which books have price below $80?
//*[ts(bookT)][price<80]
Semantic-aware equivalent rewriting:
//*[ts(bookT)][NOT ts(expTextBookT)][price<80]
Logic XPathT Query Plan:
[Figure: query plan combining tree structure information and conceptual information]
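The rewriting can be checked on a toy instance. The sketch below assumes, hypothetically, that expPriceT only admits prices of 80 or more, so no element of type expTextBookT can satisfy `price<80` and excluding such elements up front cannot change the answer. The tag names `etbook` and `journal`, the sample document, and the precomputed tag sets are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

BOOK_TAGS = {"book", "tbook", "etbook"}   # assumed SEM([ts(bookT)])
EXP_TAGS = {"etbook"}                     # assumed SEM([ts(expTextBookT)])

DOC = """<myDB>
  <book><title>A</title><price>50</price></book>
  <tbook><title>B</title><price>70</price></tbook>
  <etbook><title>C</title><price>150</price></etbook>
  <journal><title>D</title><price>10</price></journal>
</myDB>"""


def select(root, tags, limit):
    """Titles of elements whose tag is in tags and that have a price below limit."""
    return [el.findtext("title") for el in root.iter()
            if el.tag in tags
            and any(float(p.text) < limit for p in el.findall("price"))]


root = ET.fromstring(DOC)
original = select(root, BOOK_TAGS, 80)               # //*[ts(bookT)][price<80]
rewritten = select(root, BOOK_TAGS - EXP_TAGS, 80)   # with [NOT ts(expTextBookT)]
print(original, rewritten)
```

Under the stated assumption both queries return the same titles, but the rewritten one never has to inspect the prices of `etbook` elements.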

16 Summary
Complex domains require conceptual-level modeling and querying capabilities beyond just tree structure
Status Quo: XML Schema: simple “conceptual model” with many ad-hoc “design decisions”/restrictions
=> Abstract Model of XML Schema (MXS)
=> XPathT: first step towards “conceptual” or “semantic” XML query language extensions
=> more concise, intuitive, flexible, and robust queries
=> the system maps conceptual to syntactic queries, not the programmer/query designer!

17 Next Steps & Outlook
extend MXS to include more conceptual information
develop formal semantics
– XPathT, extensions: XPathC, XQueryC
research problems:
– mapping: XPathC queries => equivalent XPath queries
– formalize equivalence, always possible? Then, conventional XML query processors can be used!
– “proxy XML Schema doc”: instead of rewriting into XPath over the original instance, can one materialize some conceptual info as a “proxy XML doc” such that conceptual queries become conventional queries against the proxy...
– semantic query optimization: equivalent rewritings given the conceptual level constraints

