
On Wrapping Query Languages and Efficient XML Integration (SIGMOD 2000). V. Christophides, S. Cluet, J. Simeon


1 SIGMOD 2000, Christophides Vassilis
On Wrapping Query Languages and Efficient XML Integration
V. Christophides, S. Cluet, J. Simeon
Computer Science Department, University of Crete, and Institute for Computer Science - FORTH, Heraklion, Crete
INRIA Rocquencourt, Domaine de Voluceau, Paris, France
Bell Laboratories, Murray Hill, NJ, USA

2 An Integration Scenario
Figure: a Middleware Server on top of two sources.
- ODBC Server: SQL queries on tables with trading info about artifacts
- Z39.50 Server: full-text queries on well-formed XML docs with descriptive info about artifacts

3 XML-based Middleware is Cool!
Figure: the Middleware Server reaches the ODBC Server through an RDBMS-XML Wrapper and the Z39.50 Server through a Wais-XML Wrapper (sources S1 and S2). Views combine source queries: V1=(Q1,Q2), V2=...
Query Q: What are the artifacts created in Giverny?
Relational source:
  Title     Creator  Price
  Nympheas  Monet    10M$
  Waitress  Manet    38M$
XML source: Monet, Nympheas, Impressionism, 21 x 61, Giverny
Integrated XML result: Monet, Nympheas, 10M$, Impressionism, 21 x 61, Giverny

4 But XML is not a Panacea!!!
Figure: the Middleware Server decomposes Q into Q=Q'(Q1',Q2'). The XML Wrapper has to turn Q1' and Q2' into native queries for S1 and S2: an SQL query (select... from... where...) and a full-text query (contains word1 or/and ...).
- Wrapping queries is hard
- Optimization for XML queries is poor
- What about type information?

5 The YAT Approach to Efficient XML Integration
Figure: the YAT Mediator Server rewrites Q into Q' and sends Q1' and Q2' through an SQL-XML Generic Wrapper and a Full Text-XML Generic Wrapper to the ODBC Server and the Z39.50 Server (S1, S2).
- An algebra for XML
- Generic wrapping of query languages and data structures
- New optimization opportunities

6 Outline
- Brief recall
  - YAT data model (wrappers' structural metadata)
  - YATL integration language (XML view definition)
- The YAT operational model
  - XML algebra
- Generic wrapping of source query capabilities
  - Wrappers' operational metadata
- Optimization opportunities
- Summary and related work

7 Generic vs Specific XML Data Representation
Figure: the same data described at four levels.
- YAT model: Yat: Any, with Symbol labels, Yat & Yat, and atomic types Int v String v Float v Bool
- Relational model: Relation: table * tuple * Symbol
- Artifacts schema: Rel_artifacts: table * tuple [title: String, creator: String, price: Float, owner: String]
- Artifacts database: rel_artifacts with tuples (title "Nympheas", creator "Monet", price 10M$) and (title "Waitress", creator "Manet", price 38M$)

8 Mixing Valid & Well-formed XML Data
Figure: type trees for the XML source and for the integrated view.
- XML Artwork source (root works): works: docs * Work; Work: work [artist, title, style, dims, * Field]; Field: Symbol with String or nested Field content
- Integrated Artifact Schema (root artwork): artwork: collection * &Artifact; Artifact: artifact [title, artist, price, style, dims, owner, misc]

9 Integrating Heterogeneous XML Data with YATL
MAKE  collection * Artifact($t,$a) := artifact [title:$t, artist:$a, price:$p, style:$s, dims:$d, owner:$o, misc:$f]
MATCH rel_artifacts WITH table * tuple * { title:$t, creator:$c, price:$p, owner:$o }
      works WITH works * work [ artist:$a, title:$t', style:$s, dims:$d, *($f) ]
WHERE $t = $t' and $c = $a
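Read operationally, the rule joins each relational tuple with each work whose title and artist agree, and collects the work's remaining fields under misc. A minimal Python sketch of that reading, assuming both sources are already loaded as plain dictionaries; the function name `integrate_artifacts`, the dictionary layout, and the placeholder owner values are hypothetical and not part of YAT:

```python
def integrate_artifacts(rel_artifacts, works):
    """Join tuples and works on title and creator/artist, as in the WHERE clause."""
    named = {"artist", "title", "style", "dims"}
    artifacts = []
    for tup in rel_artifacts:
        for work in works:
            if tup["title"] == work["title"] and tup["creator"] == work["artist"]:
                artifacts.append({
                    "title": tup["title"],
                    "artist": work["artist"],
                    "price": tup["price"],
                    "style": work["style"],
                    "dims": work["dims"],
                    "owner": tup["owner"],
                    # *($f): every work field the filter does not name explicitly
                    "misc": {k: v for k, v in work.items() if k not in named},
                })
    return artifacts


# Sample data taken from the slides; owner values are placeholders.
rel_artifacts = [
    {"title": "Nympheas", "creator": "Monet", "price": 10, "owner": "unknown"},
    {"title": "Waitress", "creator": "Manet", "price": 38, "owner": "unknown"},
]
works = [
    {"artist": "Monet", "title": "Nympheas", "style": "Impressionism",
     "dims": "21x61", "crplace": "Giverny"},
]
artwork = integrate_artifacts(rel_artifacts, works)
```

Only Nympheas appears in both sources here, so the integrated collection holds a single artifact whose misc field carries the crplace element.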

10 The XML Algebra
- What do we need?
  - capture the query language
  - support optimization
  - wrap source query languages
- Our XML algebra
  - relational operators: Select, Project, Join, and the set operators
  - core object operators: Map, Djoin, Group, Sort
    => standard relational & object rewritings
  - two XML operators: Bind and Tree
    => new XML rewritings

11 Bind Operator & Tab Structure
Figure: Bind applies the filter docs * work [artist: $a, style: $s, title: $t, dims: $d, *($f)] to works and produces a Tab structure with one row of variable bindings per work:
  $a     $t        $s             $d       $f
  Monet  Nympheas  Impressionism  21x61    crplace "Giverny"
  Manet  Waitress  Impressionism  37.5x51  theme "Folies Bergere"
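The effect of Bind can be imitated in a few lines of Python. This is only a sketch under simplifying assumptions: works are flat dictionaries rather than trees, and the function `bind` and its `named_fields` argument are hypothetical names, not the YAT API.

```python
def bind(docs, named_fields):
    """Return one row of variable bindings per work element.

    named_fields maps an element name to the variable it binds;
    all remaining elements are collected under $f, mimicking *($f).
    """
    rows = []
    for work in docs:
        row = {var: work[name] for name, var in named_fields.items()}
        row["$f"] = {k: v for k, v in work.items() if k not in named_fields}
        rows.append(row)
    return rows


works = [
    {"artist": "Monet", "title": "Nympheas", "style": "Impressionism",
     "dims": "21x61", "crplace": "Giverny"},
    {"artist": "Manet", "title": "Waitress", "style": "Impressionism",
     "dims": "37.5x51", "theme": "Folies Bergere"},
]
tab = bind(works, {"artist": "$a", "style": "$s", "title": "$t", "dims": "$d"})
```

Each row of `tab` corresponds to one row of the Tab structure on the slide, with the unnamed elements (crplace, theme) ending up in the `$f` column.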

12 Tree & Restructuring
Figure: Tree is applied to Bind (works, ...) with the pattern Style($s): $s * $a, which groups the bound rows by $s and builds one tree per style:
- s1: "Cubism": Pablo Picasso, Georges Braque, ...
- s2: "Impressionism": Edouard Manet, Claude Monet, ...
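The restructuring step amounts to grouping rows on the key variable and nesting the rest. A minimal Python sketch, assuming the Tab structure is a list of dictionaries; `tree_by_style` and the output layout are hypothetical and only approximate the Style($s) pattern:

```python
def tree_by_style(tab):
    """Group rows by $s and nest the distinct $a values under each style."""
    styles = {}
    for row in tab:
        artists = styles.setdefault(row["$s"], [])
        if row["$a"] not in artists:  # one node per distinct artist
            artists.append(row["$a"])
    return [{"style": s, "artists": a} for s, a in styles.items()]


tab = [
    {"$s": "Cubism", "$a": "Pablo Picasso"},
    {"$s": "Cubism", "$a": "Georges Braque"},
    {"$s": "Impressionism", "$a": "Edouard Manet"},
    {"$s": "Impressionism", "$a": "Claude Monet"},
    {"$s": "Impressionism", "$a": "Claude Monet"},  # duplicate binding
]
result = tree_by_style(tab)
```

The duplicate Monet row collapses into a single node, which is the grouping behavior the Style($s) identifier suggests.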

13 Algebraization of Queries
Figure: the algebraic plan of the integration view, from top to bottom.
- Tree: artwork := collection * Artifact($t,$a) := artifact [title: $t, artist: $a, price: $p, style: $s, dims: $d, owner: $o, misc: $f]
- Join: $t = $t' and $c = $a
- Bind on rel_artifacts: table * tuple [title: $t, creator: $c, price: $p, owner: $o]
- Bind on works: docs * work [artist: $a, title: $t', style: $s, dims: $d, *($f)]

14 The Core YAT Operations
Figure: a hierarchy of functions, split into basic predicates (<, ..., =) and algebra operations (Bind, Group, Select, Tree, Join, ...). Each entry records which interfaces support it and its signature:
- Predicate (<, ..., =): supported by { YAT }, Sig: Yat x Yat -> Bool
- Bind: supported by { YAT }, Sig: Yat x FYat -> Tab
- Select: supported by { YAT }, Sig: Tab x Pred -> Tab
- Tree: supported by { YAT }, Sig: Yat x FYat -> YAT
- ...

15 Generic Wrapping of Source Query Capabilities
Figure: the same hierarchy, now annotated with what each wrapper supports. Wrappers contribute by refinement (a more specific signature for an existing operation) and by extension (a new function).
- Predicate (<, ..., =): supported by { YAT, Rel }, Sig: Yat x Yat -> Bool
- Bind: supported by { YAT }, Sig: Yat x FYat -> Tab
  - Refinement: supported by { Rel }, Sig: Rel x FRel -> Tab
  - Refinement: supported by { Wais }, Sig: Works x FWork -> Tab
- Select: supported by { YAT, Rel, Wais }, Sig: Tab x Pred -> Tab
- Tree: supported by { YAT }, Sig: Yat x FYat -> YAT
- Extension: contains, supported by { Wais }, Sig: String x Work -> Bool
- ...
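One way to picture how a mediator could use this operational metadata is a lookup table from operation to supporting interfaces. The Python sketch below is an illustration only: the table contents are one reading of the slide, and `SUPPORTED_BY`, `can_push`, and `split_plan` are hypothetical names, not the YAT wrapper interface.

```python
# Operation -> interfaces that declare support for it (as read from the slide).
SUPPORTED_BY = {
    "=": {"YAT", "Rel"},
    "Bind": {"YAT"},
    "Bind[Rel]": {"Rel"},      # refinement with signature Rel x FRel -> Tab
    "Bind[Works]": {"Wais"},   # refinement with signature Works x FWork -> Tab
    "Select": {"YAT", "Rel", "Wais"},
    "Tree": {"YAT"},
    "contains": {"Wais"},      # extension contributed by the Wais wrapper
}


def can_push(operation, source):
    """True if the given source interface declares support for the operation."""
    return source in SUPPORTED_BY.get(operation, set())


def split_plan(operations, source):
    """Separate operations a source can evaluate from those left to the mediator."""
    pushed = [op for op in operations if can_push(op, source)]
    local = [op for op in operations if not can_push(op, source)]
    return pushed, local


pushed, local = split_plan(["Select", "contains", "Tree"], "Wais")
```

With this table, a selection using contains can be delegated to the Wais source, while Tree stays at the mediator.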

16 Query Processing in YAT
- Query: What are the artifacts created in Giverny and sold for less than 10M$?
    MAKE  * answer [title: $t, artist: $a, price: $p]
    MATCH artwork WITH collection * artifact [title: $t, artist: $a, price: $p, misc.crplace: $cp]
    WHERE $cp = "Giverny" and $p < 10
- Three-phase query optimization:
  1. Simplification of algebraic expressions: Bind-Tree rewritings, pushing selections and projections, ...
  2. Pushing operations to external sources: filter simplification, source-supplied equivalences, ...
  3. Information passing between sources: reordering join arguments, ...

17 Query Preprocessing
Figure: the query plan stacked on top of the view plan.
- Query: Tree (* answer [title: $t, artist: $a, price: $p]) over Select ($cp = "Giverny" and $p < 10) over Bind (artwork: collection * artifact [title: $t, artist: $a, price: $p, misc: crplace: $cp])
- View: Tree (artwork: collection * Artifact($t,$a): artifact [title: $t, artist: $a, price: $p, style: $s, dims: $d, owner: $o, misc: $f]) over Join ($t = $t' and $c = $a) over Bind (rel_artifacts: table * tuple [title: $t, creator: $c, price: $p, owner: $o]) and Bind (works: docs * work [artist: $a, title: $t', style: $s, dims: $d, *($f)])

18 Query Optimization: Phase 1
Figure: Bind-Tree rewriting. The query's Bind is split into a Bind on collection * artifact [title: $t, artist: $a, price: $p, misc: $m] followed by a Bind on $m for crplace: $cp. The first Bind sits directly on the view's Tree (artwork: collection * Artifact($t,$a): artifact [title: $t, artist: $a, price: $p, style: $s, dims: $d, owner: $o, misc: $f]), and the pair is replaced by Project $t, $a, $p, $m:f.

19 Query Optimization: Phase 1
Figure: the plan after pushing selections and projections below the join.
- Tree: * answer [title: $t, artist: $a, price: $p]
- Join: $t = $t' and $c = $a
- Relational branch: Bind (rel_artifacts: table * tuple [title: $t, creator: $c, price: $p, owner: $o]), Select $p < 10, Project $t, $c, $p
- XML branch: Bind (works: docs * work [artist: $a, title: $t, style: $s, dims: $d, *($f)]), Project $t, $a, $m:f, Bind on $m for crplace: $cp, Select $cp = "Giverny"
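Pushing selections below a join is the classical relational rewriting reused here: each conjunct is applied to the branch that binds its variable, and the result is unchanged. A small Python sketch, assuming rows are dictionaries keyed by variable name; `join` and `select` are hypothetical helpers, and the Haystacks row with its price is made up so that the answer is not empty:

```python
def join(left, right, pred):
    """Nested-loop join producing the merged bindings of matching rows."""
    return [{**l, **r} for l in left for r in right if pred(l, r)]


def select(rows, pred):
    return [row for row in rows if pred(row)]


rel = [
    {"$t": "Nympheas", "$c": "Monet", "$p": 10},
    {"$t": "Waitress", "$c": "Manet", "$p": 38},
    {"$t": "Haystacks", "$c": "Monet", "$p": 8},   # hypothetical row
]
works = [
    {"$t'": "Nympheas", "$a": "Monet", "$cp": "Giverny"},
    {"$t'": "Waitress", "$a": "Manet", "$cp": None},
    {"$t'": "Haystacks", "$a": "Monet", "$cp": "Giverny"},  # hypothetical row
]


def on_title_and_artist(l, r):
    return l["$t"] == r["$t'"] and l["$c"] == r["$a"]


# Unoptimized plan: join first, then apply the whole WHERE clause.
naive = select(join(rel, works, on_title_and_artist),
               lambda row: row["$cp"] == "Giverny" and row["$p"] < 10)

# Rewritten plan: each conjunct is pushed to the branch that binds its variable.
pushed = join(select(rel, lambda r: r["$p"] < 10),
              select(works, lambda w: w["$cp"] == "Giverny"),
              on_title_and_artist)
```

Both plans return the same rows, but the second one joins one tuple with two works instead of three with three.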

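20 Query Optimization: Phase 2
Figure: operations are pushed to the sources.
- Tree: * answer [title: $t, artist: $a, price: $p]
- Join: $t = $t' and $c = $a
- Relational branch: Bind (rel_artifacts: table * tuple [title: $t, creator: $c, price: $p]) with Select $p < 10, both evaluated by the relational source (labelled rel_artists on the slide)
- XML branch: Bind (works: docs * $w) with Select contains("Giverny", $w) evaluated by the full-text source, followed at the mediator by Bind on $w (work [title: $t, artist: $a, crplace: $cp]) and Select $cp = "Giverny"
- Source-supplied equivalence used: = (X,Work) => contains(X,Work)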
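The equivalence says that whenever some field of a work equals X, the work also contains X, so contains can be sent to the full-text source as a coarse prefilter while the exact equality is still checked afterwards. A minimal Python sketch of that idea; `contains`, `wais_source`, and the third sample work are hypothetical stand-ins, not the real Wais or YAT interfaces:

```python
def contains(word, work):
    """Stand-in for the full-text predicate: word occurs in some field value."""
    return any(word in str(value) for value in work.values())


def wais_source(works, word):
    """Hypothetical wrapper call: the source evaluates contains itself."""
    return [w for w in works if contains(word, w)]


works = [
    {"artist": "Monet", "title": "Nympheas", "crplace": "Giverny"},
    {"artist": "Manet", "title": "Waitress", "theme": "Folies Bergere"},
    # Made-up work that mentions Giverny in its title but was created elsewhere.
    {"artist": "Anonymous", "title": "Giverny Garden", "crplace": "Paris"},
]

# Pushed to the source: a superset of the answer.
candidates = wais_source(works, "Giverny")
# Kept at the mediator: the exact predicate $cp = "Giverny".
answer = [w for w in candidates if w.get("crplace") == "Giverny"]
# Reference result without any pushing.
direct = [w for w in works if w.get("crplace") == "Giverny"]
```

The prefilter lets one false positive through, which is why the mediator keeps the equality selection on top of the pushed contains.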

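21 Query Optimization: Phase 3
Figure: the Join is replaced by a DJoin so that information passes from one source to the other.
- Tree: * answer [title: $t, artist: $a, price: $p]
- DJoin
- Outer branch: Bind (works: docs * $w) with Select contains("Giverny", $w), then Bind on $w (work [title: $t, artist: $a, crplace: $cp]) and Select $cp = "Giverny"
- Inner branch: Bind (rel_artifacts: table * tuple [title: t, creator: a, price: $p]) with Select $p < 10, where t and a are the values passed from the outer branch (the source is labelled rel_artists on the slide)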
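A dependent join evaluates its inner branch once per outer row, with the outer bindings passed in as constants. The Python sketch below illustrates the principle only; `djoin`, `rel_lookup`, the in-memory `REL` table, and the Haystacks row are hypothetical and do not reflect how the YAT evaluator or wrappers are implemented:

```python
def djoin(outer_rows, inner_source):
    """Dependent join: call the inner source once per outer row."""
    results = []
    for outer in outer_rows:
        for inner in inner_source(outer):
            results.append({**outer, **inner})
    return results


REL = [
    {"title": "Nympheas", "creator": "Monet", "price": 10},
    {"title": "Waitress", "creator": "Manet", "price": 38},
    {"title": "Haystacks", "creator": "Monet", "price": 8},  # hypothetical row
]
calls = []  # records the parameterized queries sent to the relational source


def rel_lookup(outer):
    """Hypothetical wrapper call with title and creator already bound."""
    calls.append((outer["$t"], outer["$a"]))
    return [{"$p": r["price"]} for r in REL
            if r["title"] == outer["$t"]
            and r["creator"] == outer["$a"]
            and r["price"] < 10]


# Outer rows: works already filtered on crplace = "Giverny".
giverny_works = [
    {"$t": "Nympheas", "$a": "Monet"},
    {"$t": "Haystacks", "$a": "Monet"},
]
answer = djoin(giverny_works, rel_lookup)
```

Only two narrow lookups reach the relational source, one per Giverny work, instead of a scan of every tuple with a price under 10.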

22 Summary & Related Work
- Wrapping query languages requires understanding the semantics of those languages
  - Ad hoc solution proposed by Garlic (IBM)
  - Untyped solution proposed by DISCO (INRIA)
  - Query-template-based solution proposed by TSIMMIS (Stanford)
  => Generic solution introduced for the YAT system (INRIA + Bell Labs)
- XML query optimization requires exploiting XML typing information
  => YAT relies on a general-purpose algebra that makes it possible
  1. to reuse optimization techniques proposed in the relational and object contexts (pushing selections and projections, join reordering, ...)
  2. to introduce new ones that take advantage of type information in order to prune navigation in XML trees, push query evaluation to the sources, etc.

23 The YAT Architecture
Figure: the two kinds of YAT components, each exposing a server side and a client side.
- MEDIATOR (Server, Client): YAT API, View Interface, Structural Information, Data, Information Module, Query Module, Optimizer, Evaluator
- WRAPPER (Server, Client to the Source): YAT API, Data Conversion, Structural Extraction, Data, Information Module, Query Module, Query Translation, Operational Extraction

