Order-sensitive XML Query Processing over Relational Sources: An Algebraic Approach Authors: Ling Wang, Song Wang, Brian Murphy and Elke A. Rundensteiner.


1 Order-sensitive XML Query Processing over Relational Sources: An Algebraic Approach Authors: Ling Wang, Song Wang, Brian Murphy and Elke A. Rundensteiner Institute: Database Systems Research Group, Worcester Polytechnic Institute (WPI) IDEAS’2005

2 IDEAS’05 2 Order in XML
• Order is important to XML
• Document order
• XML view can be ordered (OrderBy) …
• User query can be order-sensitive (OrderBy, position(), range() …)
Example document (record.xml), text content: Misfits: She; Back Street Boy: Bullet, We Are 138; Project X: SXE Revenge, Shutdown
Query: FOR $play in document("record.xml")/PLAY OrderBy $play/band RETURN $play[3]/SONG[range 1 to 2]/text()
1. Sort PLAY by its band's name
2. Find third PLAY
3. Extract its first and second SONG
Result: SXE Revenge, Shutdown
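The three evaluation steps of the example query can be traced with a minimal Python sketch. The in-memory list `plays` is a hypothetical stand-in for record.xml, built from the band and song values shown in the slides. The function name is illustrative and is not part of XSOT.

```python
# Hypothetical stand-in for record.xml: (band, songs) pairs in document order.
plays = [
    ("Misfits", ["She"]),
    ("Back Street Boy", ["Bullet", "We Are 138"]),
    ("Project X", ["SXE Revenge", "Shutdown"]),
]

def third_play_first_two_songs(plays):
    # 1. Sort PLAY by band name (sorted() is stable, so ties keep document order).
    ordered = sorted(plays, key=lambda play: play[0])
    # 2. Find the third PLAY (XQuery positions are 1-based, Python's are 0-based).
    if len(ordered) < 3:
        return []
    _, songs = ordered[2]
    # 3. Extract its first and second SONG.
    return songs[:2]

result = third_play_first_two_songs(plays)
print(result)
```

Every step depends on order: the sort defines which PLAY is third, and document order inside that PLAY defines which SONGs are first and second.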

3 IDEAS’05 3 Why XML-to-SQL?
• XML is stored in relational databases to …
  • provide reliable persistent storage
  • exploit mature technologies
• XML-to-SQL systems
  • SilkRoute (AT&T), XPERANTO (IBM), RAINBOW (WPI), Rolex (Bell Labs), Agora, MARS
  • Oracle XML DB, Microsoft SQL Server 2000 SQLXML, IBM DB2 XML Extender

4 IDEAS’05 4 XML Views
• XML views
  • support XML view mechanism for XML data publishing
  • support queries (updates) over XML views
• XML publishing scenario
  • Relational model is not order-sensitive
  • Order in XML views over RDB has no meaning
• XML storage scenario
  • Order is essential!
  • Order-preserving loading
    – XML document → relational database
    – Implicit order in XML document → explicit order code in RDB
  • Order-restoring in extraction views
    – Explicit order code in RDB → implicit order in XML view through view query
[Figure labels: XML, RDB, XML View, User query, Order encoding]

5 IDEAS’05 5 Order-specific loading
• Order-specific loading:
  • Loading strategies: inline, edge, …
  • Order encoding methods: global, local, Dewey, …
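To make the three order encodings concrete, here is a minimal sketch that assigns all of them in one traversal. It follows the usual definitions: global order is the position in a document-order (preorder) walk, local order is the position among siblings, and a Dewey code is the path of local positions from the root. The sample document and the element name BAND are assumptions, since the slides do not preserve the original tags.

```python
import xml.etree.ElementTree as ET

# Assumed sample document; element names are illustrative.
DOC = """<RECORDLIST>
  <PLAY><BAND>Misfits</BAND><SONG>She</SONG></PLAY>
  <PLAY><BAND>Back Street Boy</BAND><SONG>Bullet</SONG><SONG>We Are 138</SONG></PLAY>
</RECORDLIST>"""

def encode(root):
    """Return (tag, global, local, dewey) for every element in document order."""
    rows = []
    counter = 0

    def visit(node, local, dewey):
        nonlocal counter
        counter += 1  # global order: running count over the preorder walk
        rows.append((node.tag, counter, local, ".".join(map(str, dewey))))
        for i, child in enumerate(node, start=1):
            # local order: 1-based position among siblings;
            # Dewey: the parent's code extended with that position
            visit(child, i, dewey + [i])

    visit(root, 1, [1])
    return rows

codes = encode(ET.fromstring(DOC))
for row in codes:
    print(row)
```

The trade-off named on the Motivation slide shows up here. Inserting a node renumbers every later node under global order. Under local or Dewey order, only the following siblings (and, for Dewey, their descendants) are renumbered.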

6 IDEAS’05 6 Example
XML schema: <xs:element name="PLAY" minOccurs="1" maxOccurs="unbounded"> <xs:element name="SONG" type="xs:string" minOccurs="1"/>
XML document (text content): Misfits: She; Back Street Boy: Bullet, We Are 138; Project X: SXE Revenge, Shutdown

7 IDEAS’05 7 Example
Relational database (inline loading + local order encoding)
RECORDLIST
IID  PID  POSITION
1    0    1
PLAY
IID  PID  POSITION  BAND_PCDATA
2    1    1         Misfits
3    1    2         Back Street Boy
4    1    3         Project X
SONG
IID  PID  POSITION  SONG_PCDATA
5    2    1         She
6    3    1         Bullet
7    3    2         We Are 138
8    4    1         SXE Revenge
9    4    2         Shutdown
View query:
FOR $play IN document("dxv.xml")/PLAY/ROW
ORDER BY $play/POSITION/text()
RETURN $play/BAND_PCDATA
  FOR $song IN document("dxv.xml")/SONG/ROW [PID/text() = $play/IID/text()]
  ORDER BY $song/POSITION/text()
  RETURN $song/SONG_PCDATA/text()
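The view query restores document order from the POSITION columns. A minimal sketch of the same logic over in-memory copies of the PLAY and SONG tables follows. The tuples mirror the rows on this slide. The function name `restore_view` and the nested-list output shape are illustrative assumptions, not the Rainbow implementation.

```python
# Rows copied from the slide's tables.
PLAY = [  # (IID, PID, POSITION, BAND_PCDATA)
    (2, 1, 1, "Misfits"),
    (3, 1, 2, "Back Street Boy"),
    (4, 1, 3, "Project X"),
]
SONG = [  # (IID, PID, POSITION, SONG_PCDATA)
    (5, 2, 1, "She"),
    (6, 3, 1, "Bullet"),
    (7, 3, 2, "We Are 138"),
    (8, 4, 1, "SXE Revenge"),
    (9, 4, 2, "Shutdown"),
]

def restore_view(play_rows, song_rows):
    """Rebuild (band, [songs]) pairs in document order from unordered rows."""
    view = []
    # Outer FOR: order plays by their local order code.
    for p_iid, _, _, band in sorted(play_rows, key=lambda r: r[2]):
        # Inner FOR: join on SONG.PID = PLAY.IID, then order by POSITION.
        songs = sorted((s for s in song_rows if s[1] == p_iid), key=lambda r: r[2])
        view.append((band, [s[3] for s in songs]))
    return view

# Relational tables carry no inherent order, so feed the rows in reverse.
view = restore_view(PLAY[::-1], SONG[::-1])
print(view)
```

Because the order code is local, each level needs its own sort. Under global order encoding, a single sort key would be enough.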

8 IDEAS’05 8 Motivation
• Many loading + encoding combinations are possible …
  • {inline, edge, …} × {local, global, Dewey, …}
• Hybrids of multiple loading and encoding methods may occur:
  • Loading:
    – Schema is available: inline
    – Schema is not available: edge
  • Order encoding:
    – Heavy update workload: Dewey
    – Query workload: global
• Multiple XML documents are loaded into the RDB
• Other loading and encoding methods may emerge in the future
• Conclusion: a general approach to XQuery-to-SQL translation is needed

9 IDEAS’05 9 XSOT
• XML-to-SQL Order-sensitive Translation (XSOT):
  • Step 1: Encode XML document with explicit order code (order-exposing)
  • Step 2: Load XML into relational database (order-preserving)
  • Step 3: Extract XML view from relational database (order-restoring)
  • Step 4: Query via XML view with order predicates (order-sensitive)

10 IDEAS’05 10 XSOT Framework
[Figure: XSOT framework architecture diagram. Labels: Web/Intranet Application; XQuery Engine with XQuery Parser, XAT Generator, View Composer, XAT Optimizer, SQL Generator, XML Generator, Mapping Manager; XML Source Wrapper with Schema Generation, Data Loading, Order Encoding, XQuery Data Extracting; XML Schema, XML Data, Default XML Schema, Default XML View, View Query, Order-Sensitive User Query, User XAT, View XAT, Composed XAT, Optimized XAT, SQL, Ordered Tuple Streams, XML Result, Order Code Comparison Function; RDBMS: DB2, Oracle, SQL Server, Sybase. Legend: User, Sub-System, Process, Data, Query flow, Data flow.]

11 IDEAS’05 11 Running Example
Relational database (inline loading + local order encoding)
RECORDLIST
IID  PID  POSITION
1    0    1
PLAY
IID  PID  POSITION  BAND_PCDATA
2    1    1         Misfits
3    1    2         Back Street Boy
4    1    3         Project X
SONG
IID  PID  POSITION  SONG_PCDATA
5    2    1         She
6    3    1         Bullet
7    3    2         We Are 138
8    4    1         SXE Revenge
9    4    2         Shutdown
User query (find second song of each play):
FOR $record in document("record.xml")
RETURN $record/PLAY/SONG[2]/text()
Result: We Are 138, Shutdown
View query:
FOR $play IN document("dxv.xml")/PLAY/ROW
ORDER BY $play/POSITION/text()
RETURN $play/BAND_PCDATA
  FOR $song IN document("dxv.xml")/SONG/ROW [PID/text() = $play/IID/text()]
  ORDER BY $song/POSITION/text()
  RETURN $song/SONG_PCDATA/text()

12 IDEAS’05 12 Order-sensitive XML Algebra Tree
• XSOT methodology:
  • An algebraic approach
• XML Algebra Tree (XAT)
  • XAT operators
    – Select, CartesianProduct, ThetaJoin, LeftOuterJoin, Distinct, GroupBy, OrderBy
    – Source, Navigate, Combine, Tagger
  • XAT order extension
    – Position()
    – Range()
• Composition of the view and user XAT

13 IDEAS’05 13 View Query XAT FOR $play IN document("dxv.xml")/PLAY/ROW ORDER BY $play/POSITION/text() RETURN $play/BAND_PCDATA FOR $song IN document("dxv.xml")/SONG/ROW [PID/text() = $play/IID/text()] ORDER BY $song/POSITION/text() RETURN $song/SONG_PCDATA/text() Combine $dataPlayTag Tagger $dataSongTag $dataPlayTag GroupBy $play Combine $dataSongTag Navigate $song, SONG_PCDATA/text() $sData Tagger $sData $dataSongTag Navigate $song, POSITION/text() $sPos OrderBy $sPos GroupBy $play Source “dxv.xml” $S Navigate $S, SONG/ROW $song Navigate $song, PID/text() $sPID ThetaJoin $pIID=$sPID Source “dxv.xml” $P Navigate $P, PLAY/ROW $play Navigate $play, POSITION/text() $pPos OrderBy $pPos Tagger $dataPlayTag $record Navigate $play, IID/text() $pIID 1 2 3 4 5 6 7 8 9 11 10 12 13 14 15 16 17 18 19

14 IDEAS’05 14 User Query XAT FOR $record in document(“record.xml") RETURN $record/PLAY/SONG[2]/text() GroupBy $record, $uPlay Combine $uDataSongTag Tagger $uDataSongTag $result Tagger $uSongData $uDataSongTag Navigate $uRecord, PLAY $uPlay Navigate $uPlay, SONG $uSong Navigate $uSong, text() $uSongData Select $uNumPos=2 Source “record.xml” $P 20 21 22 23 24 25 26 27 28 POS $uSong $uNumPos

15 IDEAS’05 15 GroupBy $record, $uPlay Combine $uDataSongTag Tagger $uDataSongTag $result Tagger $uSongData $uDataSongTag Navigate $uRecord, PLAY $uPlay Navigate $uPlay, SONG $uSong Navigate $uSong, text() $uSongData Select $uNumPos=2 Source “record.xml” $P 20 21 22 23 24 25 26 27 28 POS $uSong $uNumPos User XAT $P=$record Composed XAT Combine $dataPlayTag Tagger $dataSongTag $dataPlayTag GroupBy $play Combine $dataSongTag Navigate $song, SONG_PCDATA/text() $sData Tagger $sData $dataSongTag Navigate $song, POSITION/text() $sPos OrderBy $sPos GroupBy $play Source “dxv.xml” $S Navigate $S, SONG/ROW $song Navigate $song, PID/text() $sPID ThetaJoin $pIID=$sPID Source “dxv.xml” $P Navigate $P, PLAY/ROW $play Navigate $play, POSITION/text() $pPos OrderBy $pPos Tagger $dataPlayTag $record Navigate $play, IID/text() $pIID 1 2 3 4 5 6 7 8 9 11 10 12 13 14 15 16 17 18 19 View XAT

16 IDEAS’05 16 XAT Optimization – Order Explicit
• Why?
  • Order in the user XAT depends on the implicit order in the view
  • This dependency blocks further optimization: computation push-down

17 IDEAS’05 17 XAT Optimization – Order Explicit
View XAT (construct SONG, construct PLAY): Tagger $dataSongTag $dataPlayTag; GroupBy $play; Combine $dataSongTag; Tagger $sData $dataSongTag
User XAT (for each PLAY, sort SONGs, pick second song): GroupBy $record, $uPlay; Select $uNumPos=2; POS $uSong $uNumPos
The user XAT depends on the view XAT: it cannot be pushed down and cannot be translated into SQL!

18 IDEAS’05 18 XAT Optimization – Order Explicit
• Goal: convert user query order
  • FROM: implicit order in the XML view
  • TO: explicit order-code column in relational encoding
• Rewrite: POS $uSong = POS $sPos
Before: user portion XAT: POS $uSong $uNumPos; GroupBy $record, $uPlay (22, 23). View portion XAT: Navigate $song, POSITION/text() $sPos; OrderBy $sPos; GroupBy $play (10, 11, 12)
After: user portion XAT: POS $sPos $uNumPos; GroupBy $play (22, 23). View portion XAT: Navigate $song, POSITION/text() $sPos; OrderBy $sPos; GroupBy $play (10, 11, 12)

19 IDEAS’05 19 SQL-oriented XAT optimization
• Goal:
  • Optimize XAT for efficient order-sensitive SQL generation
• Rules:
  • Computation push-down
    – Push as much as possible to the RDB
  • Order pull-up
    – Sort as late as possible
    – Avoid re-sorting!
  • Order-step rewrite
    – Match RDB order template

20 IDEAS’05 20 Optimized XAT Navigate $song, POSITION/text() $sPos OrderBy $sPos GroupBy $play Source “dxv.xml” $S Navigate $S, SONG/ROW $song Navigate $song, PID/text() $sPID ThetaJoin $pIID=$sPID Source “dxv.xml” $P Navigate $P, PLAY/ROW $play Navigate $play, POSITION/text() $pPos Navigate $play, IID/text() $pIID 1 2 3 5 6 7 8 911 10 12 OrderBy $sPos, $pPos 4 GroupBy $pPos Combine $uDataSongTag Tagger $uDataSongTag $result Tagger $sData $uDataSongTag Select $uNumPos=2 22 23 24 26 27 28 POS $sPos $uNumPos Navigate $song, SONG_PCDATA/text() $sData 13 Computation push down Order pull up OrderStep rewrite OrderStep [$pPos], [$pPos, $sPos] $uNumPos

21 IDEAS’05 21 Navigate $song, POSITION/text() $sPos OrderStep [$pPos], [$pPos, $sPos] $uNumPos Source “dxv.xml” $S Navigate $S, SONG/ROW $song Navigate $song, PID/text() $sPID ThetaJoin $pIID=$sPID Source “dxv.xml” $P Navigate $P, PLAY/ROW $play Navigate $play, POSITION/text() $pPos Navigate $play, IID/text() $pIID 1 2 3 5 6 7 8 9 10 29 OrderBy $sPos, $pPos 4 Combine $uDataSongTag Tagger $uDataSongTag $result Tagger $sData $uDataSongTag Select $uNumPos=2 24 26 27 28 Navigate $song, SONG_PCDATA/text() $sData 13 Optimized XAT

22 IDEAS’05 22 Order Template
TEMPLATE: SELECT row_number() over ( ? ) $pos_func_binding FROM +
PARTITION: partition by
ORDERBY: order by |
TONUMBER: to_number( )
ELEMENT: element name
TABLE: table name | TEMPLATE
• SQL-99 standard
• Oracle, DB2 …
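As a rough illustration of how an order template might be instantiated, the sketch below builds the `row_number() OVER (...)` expression from a partition list, an ordering list, and a binding name. The grammar on the slide is only partly legible, so the function `order_step_sql` and its parameters are hypothetical. It reproduces only the shape of the expression used in the deep-push query Q5, not the full template.

```python
def order_step_sql(partition_cols, order_cols, binding):
    """Build a row_number() window expression (illustrative helper, not XSOT's API)."""
    if not order_cols:
        raise ValueError("an order template needs at least one ORDER BY column")
    clauses = []
    if partition_cols:
        # PARTITION part of the template: restart numbering for each group.
        clauses.append("PARTITION BY " + ", ".join(partition_cols))
    # ORDERBY part of the template: the order-code columns.
    clauses.append("ORDER BY " + ", ".join(order_cols))
    return f"row_number() OVER ({' '.join(clauses)}) {binding}"

expr = order_step_sql(["Q1.pPos"], ["Q1.sPos"], "uNumPos")
print(expr)
```

For the running example, the partition column is the PLAY order code and the ordering column is the SONG order code, so numbering restarts for each PLAY.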

23 IDEAS’05 23 Order-sensitive SQL generation
• About push-down strategies
  • In general: push as much computation as possible into the relational engine
  • In the order scenario: trade-off
• Deep push:
  – Push OrderStep into the relational engine
  – Relational engine has to support the order template (SQL99)
Q5 =
SELECT Q2.sData
FROM (SELECT Q1.pPos, Q1.sPos, Q1.sData,
             row_number() OVER (PARTITION BY Q1.pPos ORDER BY Q1.sPos) uNumPos
      FROM (SELECT P.POSITION AS pPos, S.POSITION AS sPos, S.SONG_PCDATA AS sData
            FROM PLAY P, SONG S
            WHERE P.IID = S.PID) Q1) Q2
WHERE Q2.uNumPos = 2
ORDER BY Q2.pPos, Q2.sPos
XAT operators above SQL Q5 (binding $sData, node 32): Combine $uDataSongTag, Tagger $uDataSongTag $result, Tagger $sData $uDataSongTag (26, 27, 28)
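The deep-push query can be tried against the running example's tables. The sketch below uses Python's built-in `sqlite3` module as a stand-in for the relational engine. This is an assumption: the slides name Oracle and DB2, and SQLite needs version 3.25 or later for window functions. Table contents are copied from the running example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PLAY (IID INTEGER, PID INTEGER, POSITION INTEGER, BAND_PCDATA TEXT);
CREATE TABLE SONG (IID INTEGER, PID INTEGER, POSITION INTEGER, SONG_PCDATA TEXT);
INSERT INTO PLAY VALUES (2,1,1,'Misfits'), (3,1,2,'Back Street Boy'), (4,1,3,'Project X');
INSERT INTO SONG VALUES (5,2,1,'She'), (6,3,1,'Bullet'), (7,3,2,'We Are 138'),
                        (8,4,1,'SXE Revenge'), (9,4,2,'Shutdown');
""")

# Q5 from the slide: the OrderStep is evaluated inside the engine via row_number().
Q5 = """
SELECT Q2.sData
FROM (SELECT Q1.pPos, Q1.sPos, Q1.sData,
             row_number() OVER (PARTITION BY Q1.pPos ORDER BY Q1.sPos) AS uNumPos
      FROM (SELECT P.POSITION AS pPos, S.POSITION AS sPos, S.SONG_PCDATA AS sData
            FROM PLAY P, SONG S
            WHERE P.IID = S.PID) Q1) Q2
WHERE Q2.uNumPos = 2
ORDER BY Q2.pPos, Q2.sPos
"""

deep_result = [row[0] for row in conn.execute(Q5)]
print(deep_result)
```

The first PLAY (Misfits) has only one SONG, so it contributes no row. The remaining rows come back in PLAY order, matching the result shown for the running example.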

24 IDEAS’05 24 Order-sensitive SQL generation
• Shallow push (otherwise):
  – Leave OrderStep outside the RDB
  – The relational engine does not need to support the order template (SQL99)
SQL Q1 =
SELECT P.POSITION AS pPos, S.POSITION AS sPos, S.SONG_PCDATA AS sData
FROM PLAY P, SONG S
WHERE P.IID = S.PID
XAT operators above SQL Q1 (binding $sData): OrderStep [$pPos], [$pPos, $sPos] $uNumPos (29); OrderBy $sPos, $pPos (4); Combine $uDataSongTag, Tagger $uDataSongTag $result, Tagger $sData $uDataSongTag, Select $uNumPos=2 (24, 26, 27, 28)
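For comparison, a shallow-push sketch: only the join Q1 runs in the database, and the OrderStep and Select are applied to the returned tuples by the middleware. As before, `sqlite3` stands in for the relational engine. The generator `order_step` is a hypothetical illustration of OrderStep [$pPos], [$pPos, $sPos], not Rainbow's actual operator code.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PLAY (IID INTEGER, PID INTEGER, POSITION INTEGER, BAND_PCDATA TEXT);
CREATE TABLE SONG (IID INTEGER, PID INTEGER, POSITION INTEGER, SONG_PCDATA TEXT);
INSERT INTO PLAY VALUES (2,1,1,'Misfits'), (3,1,2,'Back Street Boy'), (4,1,3,'Project X');
INSERT INTO SONG VALUES (5,2,1,'She'), (6,3,1,'Bullet'), (7,3,2,'We Are 138'),
                        (8,4,1,'SXE Revenge'), (9,4,2,'Shutdown');
""")

# Q1 from the slide: a plain join, no window function required.
Q1 = """
SELECT P.POSITION AS pPos, S.POSITION AS sPos, S.SONG_PCDATA AS sData
FROM PLAY P, SONG S
WHERE P.IID = S.PID
"""

def order_step(rows):
    """Sort once on (pPos, sPos), then number rows within each pPos group."""
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    for _, group in groupby(ordered, key=lambda r: r[0]):
        for u_num_pos, (p_pos, s_pos, s_data) in enumerate(group, start=1):
            yield p_pos, s_pos, s_data, u_num_pos

rows = conn.execute(Q1).fetchall()
# Select $uNumPos = 2, applied outside the database.
shallow_result = [s_data for _, _, s_data, n in order_step(rows) if n == 2]
print(shallow_result)
```

The single sort on (pPos, sPos) serves both the numbering and the final output order, which is the re-sorting that the order pull-up rule aims to avoid.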

25 IDEAS’05 25 Deep Push vs. Shallow Push
• Low selectivity: similar performance
• High selectivity: shallow push is better
  • Repeated sorting in deep push is expensive!

26 IDEAS’05 26 Experimental Study SQL Execution time --- Global vs. Local order encoding

27 IDEAS’05 27 Discussion: Further SQL optimization
• General SQL optimization can be applied …
  • Cost-based SQL translation (SilkRoute)
  • Any other SQL optimization …
• When the order encoding is assumed …
  • SQL statements can be simplified by avoiding re-ordering
• When the relational database schema is known …
  • Schema-specific SQL optimization [KKN2002]

28 IDEAS’05 28 Related Work
• XQuery-to-SQL translation systems: XPERANTO, SilkRoute, …
• [TVB2002] I. Tatarinov, S. D. Viglas, K. Beyer, J. Shanmugasundaram, E. Shekita, and C. Zhang. Storing and Querying Ordered XML Using a Relational Database System. In SIGMOD, 2002.
  • Three order encoding methods are utilized
  • Algorithms for translating ordered XPath expressions into SQL
  • But …
• [KKN2002] R. Krishnamurthy, R. Kaushik, and J. F. Naughton. Optimizing Fixed-Schema XML to SQL Query Translation. In VLDB, 2002.

29 IDEAS’05 29 Conclusion
• Propose a general framework for order-sensitive XQuery-to-SQL translation (XSOT)
• Propose order-sensitive XML Algebra Tree (XAT)
• SQL-oriented order-sensitive XAT optimization
• Efficient order-sensitive SQL statement generation and optimization techniques
• Implementation using the Rainbow query engine
• Experiments to verify the generality and SQL performance

30 IDEAS’05 30 Rainbow XML Management System
• Rainbow website: http://davis.wpi.edu/dsrg/rainbow/index.html
• Software download: http://davis.wpi.edu/dsrg/rainbow/RainbowCore/release.htm
• Thank you!

