MonetDB/XQuery Technology Preview 1
Stefan Manegold, CWI Amsterdam

1 MonetDB/XQuery Technology Preview 1
Stefan Manegold, CWI Amsterdam
http://monetdb.cwi.nl/ - http://pathfinder-xquery.org/

2 European Pathfinder Team
University of Konstanz (Germany)
  – Torsten Grust, Jens Teubner, Jan Rittinger
University of Twente (Netherlands)
  – Maurice van Keulen, Jan Flokstra
CWI, Amsterdam (Netherlands)
  – Peter Boncz, Stefan Manegold, Sjoerd Mullender
Stefan Manegold, HollandOpen, Amsterdam, 31-5-2005, MonetDB/XQuery

4 Results: Performance (1)

5 Results: Performance (2)
[Chart label: did not finish]

6 Story
XQuery Example
Relational XQuery
  – System Architecture
  – XML Encoding
Science & Research
Scalability
Outlook
  – Conclusions
  – Roadmaps
  – Release & References

7 XQuery Example
For each author, return the number of books and the receipts for books published in the past 2 years, ordered by name
let $cat := fn:doc("www.bn.com/catalog.xml"), (: Documents :)
    $sales := fn:doc("www.publishersweekly.com/sales.xml")
for $author in distinct-values($cat//author) (: Grouping :)
let $books := $cat//book[@year >= 2003 and author = $author], (: Selection :)
    $receipts := $sales/book[@isbn = $books/@isbn]/receipts (: Join :)
order by $author (: Ordering :)
return (: XML Construction :)
  { $author }
  { fn:count($books) } (: Aggregation :)
  { fn:sum($receipts) }
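
To make the annotated clauses concrete, here is a minimal Python sketch of the same computation. It is an analogy rather than what MonetDB/XQuery executes: the in-memory `CATALOG` and `SALES` lists and the `report` function are hypothetical stand-ins for the two XML documents and the FLWOR expression, and the result is returned as tuples instead of constructed XML.

```python
# Hypothetical stand-ins for catalog.xml and sales.xml (not data from the talk).
CATALOG = [
    {"isbn": "111", "year": 2004, "authors": ["Jordan"]},
    {"isbn": "222", "year": 2001, "authors": ["Jordan"]},
    {"isbn": "333", "year": 2003, "authors": ["Adams", "Jordan"]},
]
SALES = [
    {"isbn": "111", "receipts": 100.0},
    {"isbn": "333", "receipts": 40.0},
    {"isbn": "333", "receipts": 10.0},
    {"isbn": "222", "receipts": 999.0},
]


def report(catalog, sales, min_year=2003):
    """Return (author, number of books, total receipts) ordered by author."""
    # Grouping + Ordering: distinct author names, sorted.
    authors = sorted({a for book in catalog for a in book["authors"]})
    rows = []
    for author in authors:
        # Selection: recent books by this author.
        books = [b for b in catalog
                 if b["year"] >= min_year and author in b["authors"]]
        # Join: sales records whose ISBN matches one of the selected books.
        isbns = {b["isbn"] for b in books}
        receipts = [s["receipts"] for s in sales if s["isbn"] in isbns]
        # Aggregation: count and sum.
        rows.append((author, len(books), sum(receipts)))
    return rows
```

Calling `report(CATALOG, SALES)` yields one row per distinct author, which corresponds to one constructed result element per iteration of the `for` clause.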

9 XQuery Systems: 2 Approaches
Existing “native” XML/XQuery systems are built from scratch
  – Galax, Saxon, …
  – X-Hive, Tamino, …
  – (Still have to) re-invent optimization technology
Our approach:
  – Build XQuery system on top of an RDBMS
  – Leverage mature relational technology to achieve efficient XQuery processing

10 Architecture

11 XML in an RDBMS: XPath Accelerator
Node-based relational encoding of XQuery's data model
1. f/descendant: SELECT * FROM pre_post WHERE pre > f.pre AND post < f.post
2. f/ancestor: SELECT * FROM pre_post WHERE pre < f.pre AND post > f.post
3. f/preceding: SELECT * FROM pre_post WHERE pre < f.pre AND post < f.post
4. f/following: SELECT * FROM pre_post WHERE pre > f.pre AND post > f.post
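
The four axis queries can be tried out with a small, self-contained sketch. It assumes SQLite (via Python's `sqlite3` module) in place of MonetDB, a hypothetical six-element sample document, and helper names `encode` and `axis` that are not from the talk; only the `pre_post` table name and the four WHERE conditions come from the slide.

```python
import sqlite3
import xml.etree.ElementTree as ET


def encode(xml_text):
    """Assign (pre, post, tag) to every element in one depth-first walk."""
    rows = []
    counters = {"pre": 0, "post": 0}

    def walk(elem):
        my_pre = counters["pre"]          # rank on entering the element
        counters["pre"] += 1
        for child in elem:
            walk(child)
        rows.append((my_pre, counters["post"], elem.tag))  # rank on leaving
        counters["post"] += 1

    walk(ET.fromstring(xml_text))
    return sorted(rows)


# One region of the pre/post plane per axis, relative to context node f.
AXES = {
    "descendant": "pre > :pre AND post < :post",
    "ancestor":   "pre < :pre AND post > :post",
    "preceding":  "pre < :pre AND post < :post",
    "following":  "pre > :pre AND post > :post",
}


def axis(db, name, context_tag):
    """Evaluate one axis step for the (single) element named context_tag."""
    f = db.execute("SELECT pre, post FROM pre_post WHERE tag = ?",
                   (context_tag,)).fetchone()
    sql = f"SELECT tag FROM pre_post WHERE {AXES[name]} ORDER BY pre"
    return [row[0] for row in db.execute(sql, {"pre": f[0], "post": f[1]})]


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE pre_post (pre INTEGER PRIMARY KEY, post INTEGER, tag TEXT)")
# Hypothetical sample document: a(b(c, d), e(f)).
db.executemany("INSERT INTO pre_post VALUES (?, ?, ?)",
               encode("<a><b><c/><d/></b><e><f/></e></a>"))
```

For example, `axis(db, "descendant", "b")` selects the nodes that start after `b` and end before it, which are exactly its descendants `c` and `d`.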

12 Science & Research
More research led to more optimizations
  – Join Recognition
  – Embedded XPath processing
  – Order Awareness
Various scientific publications

13 Results: Scalability (3)
Unsurpassed scalability
Standard Opteron PC, 8GB RAM, 64-bit Linux
Can process 11GB documents!
Mostly linear scaling with document size

14 Conclusions
Relational approach
  – Works
  – Is fast
  – Is scalable
Crucial Optimizations
  – Join recognition
  – Embedded XPath processing
  – Order awareness

15 Roadmap
30-05-05: MonetDB/XQuery 4.8/0.8 “Mercurius”
  – Developers Release / Technology Preview 1
30-09-05: MonetDB/XQuery 4.10/0.10 “Venus”
  – Student Release / Technology Preview 2
  – XUpdate, Algebraic Query Optimization
30-12-05: MonetDB/XQuery 4.12/1.12 “Mars”
  – Final Release
  – Application Programming Interfaces
  – End-User Front-Ends

16 Open Source Release
MonetDB + Pathfinder on SourceForge
  – Mozilla-like License
MonetDB homepage
  – http://monetdb.cwi.nl/
Pathfinder homepage
  – http://pathfinder-xquery.org/
Developers website
  – http://sf.net/projects/monetdb/

23 Outline
Basic XML / XQuery
Introduction of Pathfinder and MonetDB projects
Relational XQuery
  – XPath steps in the pre/post plane
  – Translating for-loops, and beyond
Optimizations
  – Order prevention
  – Loop-Lifted Staircase join
  – Join recognition
Outlook
  – Conclusions
  – Roadmaps

25 XML
Standard, flexible syntax for data exchange
  – Regular, structured data
      Database content of all kinds: inventory, billing, orders, …
      “Small” typed values
  – Irregular, unstructured text
      Documents of all kinds: transcripts, books, legal briefs, …
      “Large” untyped values
Lingua franca of B2B applications…
  – Increase access to products & services
  – Integrate disparate data sources
  – Automate business processes
… and numerous other application domains
  – Bio-informatics, library science, …

26 XML: A First Look
XML document describing a catalog of books
No Such Thing as a Bad Day
Hamilton Jordan
Longstreet Press, Inc.
17.60
Publisher: This book is the moving account of one man's successful battles against three cancers... No Such Thing as a Bad Day is warmly recommended.

27 XQuery 1.0
Functional, strongly-typed query language
XQuery 1.0 = XPath 2.0 for navigation, selection, extraction
  + A few more expressions
      For-Let-Where-Order By-Return (FLWOR)
      XML construction
      Operators on types
  + User-defined functions & modules
  + Strong typing

28 XSLT vs. XQuery
XSLT 1.0: XML → XML, HTML, Text
  – Loosely-typed scripting language
  – Format XML in HTML for display in browser
  – Must be highly tolerant of variability/errors in data
XQuery 1.0: XML → XML
  – Strongly-typed query language
  – Large-scale database access
  – Must guarantee safety/correctness of operations on data
Over time, XSLT & XQuery may both serve needs of many application domains
XQuery will become a hidden, commodity language

29 XQuery Example
For each author, return the number of books and the receipts for books published in the past 2 years, ordered by name
let $cat := fn:doc("www.bn.com/catalog.xml"), (: Join :)
    $sales := fn:doc("www.publishersweekly.com/sales.xml")
for $author in distinct-values($cat//author) (: Grouping :)
let $books := $cat//book[@year >= 2000 and author = $author], (: S.J. :)
    $receipts := $sales/book[@isbn = $books/@isbn]/receipts
order by $author (: Ordering :)
return (: XML Construction :)
  { $author }
  { fn:count($books) } (: Aggregation :)
  { fn:sum($receipts) }

31 XQuery Systems: 2 Approaches
Tree-based
  – Tree is basic data structure
      Also on disk (if an XQuery DBMS)
  – Navigational Approach
      Galax [Simeon..], Flux [Koch..], X-Hive
  – Tree Algebra Approach
      TIMBER [Jagadish..]
Relational
  – Data shredded in relational tables
  – XQuery translated into database query (e.g. SQL)

32 The Pathfinder Project
Challenge / Goal:
  – Turn RDBMSs into efficient XQuery engines
People:
  – Torsten Grust, Jens Teubner
      University of Konstanz (June 2005: Technical University of Munich)
  – Maurice van Keulen
      University of Twente
  – Jan Rittinger
      University of Konstanz & CWI
      Task: generate code for MonetDB

33 The Pathfinder Project
Challenge / Goal:
  – Turn RDBMSs into efficient XQuery engines
People:
  – Torsten Grust, Jens Teubner, ...
      University of Konstanz (June 2005: Technical University of Munich)
  – Maurice van Keulen, Jan Flokstra, ...
      University of Twente
  – Jan Rittinger
      University of Konstanz & CWI
      Task: generate code for MonetDB
  – Peter Boncz, Stefan Manegold, Sjoerd Mullender, ...
      CWI, Amsterdam

34 MonetDB: Applied CS Research at CWI
A decade of “query-intensive” application experience
image retrieval: Peter Bosch → ImageSpotter
audio/video retrieval: Alex van Ballegooij → RAM
XML text retrieval: de Vries / Hiemstra → TIJAH
biological sequences: Arno Siebes → BRICKS
XML databases: Albrecht Schmidt → XMark; Grust / vKeulen → Pathfinder
GIS: Wilco Quak → MAGNUM
data warehousing / OLAP / data mining: SPSS → DataDistilleries; Univ. Massachusetts → PROXIMITY
CWI research group successfully spun off DataDistilleries (now SPSS)

35 Pathfinder - MonetDB
[Diagram labels. Pathfinder: Parser, Sem. Analysis, Core Translation, Typechecking, Relational Algebra. MonetDB: MIL, SQL, Database]

36 Pathfinder - MonetDB
[Diagram labels. Pathfinder: Parser, Sem. Analysis, Core Translation, Typechecking, Relational Algebra, Core to MIL Translation. MonetDB: MIL, SQL, Database]

37 Open Source
MonetDB + Pathfinder on SourceForge
  – Mozilla License
MonetDB Homepage
  – http://monetdb.cwi.nl/
Pathfinder Homepage
  – http://pathfinder-xquery.org/
Developers website
  – http://sf.net/projects/monetdb/
RoadMap
14-apr-04: initial Beta release MonetDB/SQL
30-sep-04: first official release MonetDB/SQL
30-may-05: Developer release of MonetDB/XQuery (i.e. Pathfinder)
30-sep-05: Student release of MonetDB/XQuery (incl. XUpdate)
30-dec-05: Users release of MonetDB/XQuery (?)

38 MonetDB: extensible architecture
Front-end/back-end:
  – support multiple data models
  – support multiple end-user languages
  – support diverse application domains

39 MonetDB: extensible architecture
Front-end/back-end:
  – support multiple data models
  – support multiple end-user languages
  – support diverse application domains
Pathfinder XQuery Frontend

40 Architecture

41 The Architecture

43 XPath on an RDBMS
Node-based relational encoding of XQuery's data model

44 Pre/Post → Pre/Level/Size
Done for better skipping and updates
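
A minimal sketch of what a pre/level/size encoding can look like, and why it helps with skipping. The slide gives no column definitions, so this assumes the common convention that `size` is the number of descendants and `level` is the depth below the root; the function names and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def encode(xml_text):
    """Return rows (pre, size, level, tag) in document order."""
    rows = []

    def walk(elem, level):
        pre = len(rows)
        rows.append(None)                 # reserve the slot for this node
        for child in elem:
            walk(child, level + 1)
        size = len(rows) - pre - 1        # number of descendants
        rows[pre] = (pre, size, level, elem.tag)

    walk(ET.fromstring(xml_text), 0)
    return rows


def descendants(rows, pre):
    """Descendants occupy the contiguous range pre+1 .. pre+size."""
    size = rows[pre][1]
    return [r[3] for r in rows[pre + 1: pre + size + 1]]


def children(rows, pre):
    """Visit children only, skipping each child's whole subtree via size."""
    size = rows[pre][1]
    out, i, end = [], pre + 1, pre + size
    while i <= end:
        out.append(rows[i][3])
        i += rows[i][1] + 1               # jump over this child's descendants
    return out


def post_rank(row):
    """The postorder rank is recoverable: post = pre + size - level."""
    pre, size, level, _ = row
    return pre + size - level


# Hypothetical sample document: a(b(c, d), e(f)).
ROWS = encode("<a><b><c/><d/></b><e><f/></e></a>")
```

With `size` stored explicitly, the end of a subtree is known without comparing post ranks, so a scan can jump straight past it. One plausible reading of the "updates" remark is that `size` and `level` are local to a subtree, whereas post ranks shift across the document on insertion; the slide itself does not spell this out.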

46 Order Prevention
[VLDB03 Wang&Cherniack] define:
  – Order properties of relations
  – Order propagation rules for relational operators
Decoration of physical plans with order properties → eliminate sort
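
The idea of decorating a plan with order properties and then dropping redundant sorts can be illustrated with a toy sketch. This is not the framework of the cited paper nor Pathfinder's implementation: the `Plan` class, the four operator names, and the propagation rules below are simplified assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    op: str                      # "scan", "select", "project" or "sort"
    child: "Plan | None" = None
    cols: tuple = ()             # scan: stored order; sort: keys; project: kept columns


def order_of(plan):
    """Propagate the order property bottom-up: columns the output is sorted on."""
    if plan.op in ("scan", "sort"):
        return plan.cols
    child_order = order_of(plan.child)
    if plan.op == "select":
        return child_order       # filtering rows preserves their order
    if plan.op == "project":
        kept = []
        for col in child_order:  # order survives for the leading kept columns
            if col not in plan.cols:
                break
            kept.append(col)
        return tuple(kept)
    raise ValueError(f"unknown operator: {plan.op}")


def eliminate_sorts(plan):
    """Drop every sort whose input is already ordered on the sort keys."""
    if plan.child is None:
        return plan
    child = eliminate_sorts(plan.child)
    if plan.op == "sort" and order_of(child)[:len(plan.cols)] == plan.cols:
        return child
    return Plan(plan.op, child, plan.cols)
```

For instance, a sort on `pre` above a selection over a table stored in `pre` order is removed, while a sort on a column the input is not ordered by stays in the plan.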

47 Join Recognition
  – For loop → map with all combinations → O(N*N)
  – If a `simple' condition exists on two loop variables → join
  – Only make a map with the matching combinations
  – E.g. with a hash table → O(N)
Performed on the XCore tree
Recognize if-then expressions
Open question: where to optimize best?
for $p in $auction/site/people/person
for $t in $auction/site/closed_auctions/closed_auction
where $t/buyer/@person = $p/@id
return $t
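
The difference between the two evaluation strategies can be sketched in a few lines of Python. The dictionaries below are hypothetical stand-ins for `person` and `closed_auction` elements, and the function names are illustrative; the sketch only shows why recognizing the equality condition lets the nested loops be replaced by a hash join.

```python
from collections import defaultdict


def nested_loop(people, auctions):
    """Literal reading of the two for-loops: test every (p, t) combination."""
    return [t for p in people for t in auctions if t["buyer"] == p["id"]]


def hash_join(people, auctions):
    """Build a hash table on the join key once, then probe it per person."""
    by_buyer = defaultdict(list)
    for t in auctions:
        by_buyer[t["buyer"]].append(t)
    # Probing in people order, with auctions kept in input order per key,
    # reproduces the result order of the nested loops.
    return [t for p in people for t in by_buyer.get(p["id"], [])]


# Hypothetical sample data (not from the XMark documents used in the talk).
PEOPLE = [{"id": "p1"}, {"id": "p2"}, {"id": "p3"}]
AUCTIONS = [
    {"buyer": "p2", "item": "i1"},
    {"buyer": "p1", "item": "i2"},
    {"buyer": "p2", "item": "i3"},
]
```

Both functions return the same auctions in the same order. The nested loop performs one comparison per (person, auction) pair, while the hash join touches each input once plus the matching pairs.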

48 Loop-lifted staircase join
Staircase join [VLDB03]:
  – Single pass for a *set* of context nodes
Loop-lifting → multiple iters → multiple sets of context nodes
  – elaborate skipping!
  – Loop-Lifted Staircase Join
In a single pass: process multiple input context node lists
  – Use a stack
  – Exploit axis properties for pruning
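
The following is a simplified sketch of the idea for the descendant axis only, written from the bullet points above rather than from the MonetDB source. It assumes a document given as a list of (pre, post) pairs indexed by pre rank and a context given as (iter, pre) pairs; the function name and data layout are hypothetical.

```python
def loop_lifted_descendant(doc, context):
    """Descendant step for many iterations in one pass over the document.

    doc:     list of (pre, post) pairs with doc[i][0] == i
    context: iterable of (iter, pre) pairs
    returns: sorted list of (iter, pre) result pairs, without duplicates
    """
    iters_at = {}
    for it, pre in context:
        iters_at.setdefault(pre, set()).add(it)
    starts = sorted(iters_at)

    result = []
    stack = []        # (post, iters) of active context nodes, outermost first
    active = set()    # union of the iters on the stack
    i, next_ctx = 0, 0
    while i < len(doc):
        post = doc[i][1]
        # Context nodes that end before node i cannot contain it: pop them.
        while stack and stack[-1][0] < post:
            active -= stack.pop()[1]
        if not stack:
            # Skipping: no active context node, jump to the next one.
            while next_ctx < len(starts) and starts[next_ctx] < i:
                next_ctx += 1
            if next_ctx == len(starts):
                break
            i = starts[next_ctx]
            post = doc[i][1]
        else:
            # Node i is a descendant of every node on the stack.
            result.extend((it, i) for it in active)
        # Pruning: an iter that is already active gains nothing from a
        # nested context node, so only new iters are pushed.
        new = iters_at.get(i, set()) - active
        if new:
            stack.append((post, new))
            active |= new
        i += 1
    return sorted(result)


# Hypothetical document a(b(c, d), e(f)) as (pre, post) pairs.
DOC = [(0, 5), (1, 2), (2, 0), (3, 1), (4, 4), (5, 3)]
```

Each document node is visited at most once regardless of how many iterations share it, and regions of the document that lie outside every context subtree are skipped entirely.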

49 Scalability
Test platform: Opteron 1.6GHz, 8GB RAM, 64-bit Linux (Fedora Core 3)
Can process 11GB document!
Mostly linear scaling with document size

50 Scalability
Test platform: Opteron 1.6GHz, 8GB RAM, 64-bit Linux (Fedora Core 3)
Can process 11GB document!
Mostly linear scaling with document size
Some swapping in the join queries

51 Scalability
Test platform: Opteron 1.6GHz, 8GB RAM, 64-bit Linux (Fedora Core 3)
Can process 11GB document!
Mostly linear scaling with document size
Some swapping in the join queries
Q11 + Q12 generate quadratic result

52 XMark 10MB: Pathfinder vs X-Hive & Galax

53 XMark 1GB: Pathfinder vs X-Hive
[Chart label: did not finish]

55 Conclusions
Relational approach can be scalable & fast
Crucial Optimizations
  – Join recognition
  – Loop-lifted XPath steps
  – Order awareness

56 Conclusions
Relational approach can be scalable & fast
Crucial Optimizations
  – Join recognition
  – Loop-lifted XPath steps
  – Order awareness
Future Roadmap (Scientific/Research):
  – Algebraic Query Optimization
  – Updates

57 Product Roadmap
30-05-05: MonetDB/XQuery 4.8/0.8 “Mercurius”
  – Developers Release / Technology Preview 1
30-09-05: MonetDB/XQuery 4.10/0.10 “Venus”
  – Student Release / Technology Preview 2
  – XUpdate, Algebraic Query Optimization
30-12-05: MonetDB/XQuery 4.12/1.12 “Mars”
  – Final Release
  – Application Programming Interfaces
  – End-User Front-Ends

58 Loop-lifted staircase join
[Figure labels: document; list of context nodes; active stack; multiple lists of context nodes]

59 Staircase join
[Figure labels: document; list of context nodes]

