11.02.08 Benchmarking XML storage systems Information Systems Lab HS 2007 Final Presentation © ETH Zürich | Benchmarking XML.


1 11.02.08 Benchmarking XML storage systems Information Systems Lab HS 2007 Final Presentation © ETH Zürich | Benchmarking XML

2 Agenda
- Project Overview
  - Motivation
  - Goal of the Project
- Benchmark Overview
- Results
  - RDBMS1
  - Sedna
  - MonetDB

3 Motivation
- Traditional DBMSs use the relational data model
- Vendors extend their systems to process XML or build new native stores
- XML processing is perceived to be slow
- Benchmarks for XML are just being developed

4 Goal of the Project
- Analyse and compare the performance of different systems at processing XML
- Systems tested:
  - RDBMS1: big player in the relational DBMS market, extended their product with XML capabilities
  - Sedna: free native XML DB designed to be a universal system for a wide range of XML applications
  - MonetDB: very fast compared to other XML DBs, but only supports a small part of the XQuery functions

5 Benchmark
- Benchmark used: TPC-X
  - currently under development at ETH
  - models an Amazon-like online store in XML
  - complete database is one XML file
  - e.g.: users with history, products with comments
  - complex queries that put stress on the query engine

6 RDBMS1

7 Impression of the System
- almost all queries work with few changes
- update queries were surprisingly easy to adapt

8 Impression of the System (contd.)
- not supported:
  - typeswitch (limited schema support)
  - user-defined functions

9 Current Performance
- data mining
  - about one order of magnitude slower than Sedna
- update and search
  - seem a bit faster (but still slower than the others)

10 Tuning possibilities
- any XPath expression can be indexed
- indexes seem to be based on rows rather than on trees

11 Issue with Indexing
- Indexes help only with "split" tables, but they are slower in general

12 Issues
"When the only tool you own is a hammer, every problem begins to resemble a nail." (Abraham Maslow)

13 Issues with Joins
- the only join algorithm available is the nested-loops join
- indexes are no longer used as soon as a join is needed
- joins are needed for almost anything
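To illustrate why a nested-loops-only join that ignores indexes hurts, here is a minimal Python sketch. It is not RDBMS1's implementation: the row layout and field names (`id`, `user_id`, `item`) are hypothetical, and the lookup table merely stands in for an index on the join key.

```python
from collections import defaultdict

# Hypothetical rows; the field names are illustrative, not taken from TPC-X.
users = [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}, {"id": 3, "name": "eve"}]
orders = [
    {"user_id": 1, "item": "book"},
    {"user_id": 3, "item": "cd"},
    {"user_id": 1, "item": "dvd"},
]


def nested_loops_join(left, right):
    """Compare every left row with every right row: len(left) * len(right) checks."""
    comparisons = 0
    result = []
    for l in left:
        for r in right:
            comparisons += 1
            if l["id"] == r["user_id"]:
                result.append((l["name"], r["item"]))
    return result, comparisons


def index_join(left, right):
    """Build a lookup table on the join key once, then probe it once per left row."""
    index = defaultdict(list)
    for r in right:
        index[r["user_id"]].append(r)
    probes = 0
    result = []
    for l in left:
        probes += 1
        for r in index.get(l["id"], []):
            result.append((l["name"], r["item"]))
    return result, probes


nl_rows, nl_cost = nested_loops_join(users, orders)
ix_rows, ix_cost = index_join(users, orders)
print(nl_cost, ix_cost)
```

Both functions return the same pairs, but the nested-loops variant performs one comparison per pair of rows, so its cost grows with the product of the input sizes.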

14 Summary
- almost anything works (even the adapter for XCheck!)
- everything is slow

15 Conclusion
- RDBMS1 is not suited for the TPC-X benchmark
- XML storage works as an enhancement for relational data, but not as a stand-alone system

16 Sedna

17 Overview
- Free native XML database
- No schema support
- Bulk load (native XML data storage)
- Document collections
- Indexing
- Full-text indexing (dtSearch)

18 Impression
- Good introductory example
- Little reference material
- Active development team

19 XQuery Support
- Most of the queries worked with a few changes
- Not supported:
  - Schema import
  - FLWOR expressions combined with an update statement

20 Indexing (Value Indices)
- Based on B-trees
- For element and attribute values
- Managing:
  - Create index on nodes by keys
  - The query executor does not use indexes automatically; call the "index-scan" function explicitly in XQuery
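The idea of a value index that the query has to invoke explicitly can be sketched in Python. This is only an analogy and not Sedna's API: the `ValueIndex` class, its `scan_eq` method and the `products` data are hypothetical, and a sorted list searched with `bisect` stands in for the B-tree.

```python
import bisect


class ValueIndex:
    """Sorted keys with their nodes, standing in for a B-tree value index."""

    def __init__(self, nodes, key):
        # Sort by key; the position breaks ties so nodes are never compared.
        pairs = sorted((key(n), i) for i, n in enumerate(nodes))
        self._keys = [k for k, _ in pairs]
        self._nodes = [nodes[i] for _, i in pairs]

    def scan_eq(self, value):
        """Return all nodes whose key equals value, using binary search."""
        lo = bisect.bisect_left(self._keys, value)
        hi = bisect.bisect_right(self._keys, value)
        return self._nodes[lo:hi]


def full_scan(nodes, value):
    """What a query does when it never asks for the index: visit every node."""
    return [n for n in nodes if n["price"] == value]


# Hypothetical data; element names are not taken from the benchmark.
products = [
    {"id": "p1", "price": 10},
    {"id": "p2", "price": 25},
    {"id": "p3", "price": 10},
]
price_index = ValueIndex(products, key=lambda n: n["price"])

print(price_index.scan_eq(10))
print(full_scan(products, 10))
```

Both calls return the same nodes. As on the slide, the index is used only when the caller asks for it by name.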

21 Indexing (cont.)
gainsPerMonth    100     1'000   10'000   50'000   100'000
With Indices     0.08    0.52    5.14     25.71    65.58
Normal: 0.36, 27.53 (only two values survive in the transcript; their columns are unclear)
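One way to read the "With Indices" timings is to look at how they scale with dataset size. The Python sketch below computes the slope between neighbouring points on a log-log scale, where a slope near 1 means roughly linear growth. The numbers are parsed from the flattened slide table, so the pairing of sizes and timings is an assumption. The slide does not state the units.

```python
import math

# Values as parsed from the slide's table (assumed column alignment, units unknown).
sizes = [100, 1_000, 10_000, 50_000, 100_000]
with_indices = [0.08, 0.52, 5.14, 25.71, 65.58]


def loglog_slopes(xs, ys):
    """Slope of log(y) against log(x) between each pair of neighbouring points."""
    points = list(zip(xs, ys))
    return [
        math.log(y2 / y1) / math.log(x2 / x1)
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    ]


slopes = loglog_slopes(sizes, with_indices)
for (lo, hi), s in zip(zip(sizes, sizes[1:]), slopes):
    print(f"{lo:>7} -> {hi:>7}: slope {s:.2f}")
```

Under these assumptions the slope is below 1 for the smallest step, close to 1 in the middle range, and above 1 for the last step, from 50'000 to 100'000.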

22 Indexing (Full-Text Indices)
- Sedna provides full-text indices with dtSearch
- dtSearch: commercial text retrieval engine
- No free download

23 Conclusion
- Easy to get started with the system
- Little reference material
- Most of the queries work with a few changes
- Execution time grows exponentially with larger datasets
- Value indices deliver better execution times

24 MonetDB

25 Overview & impression of the system
- well-documented installation and usage
- many XQuery features not supported
- good performance
- XML Schema support, but no noticeable effect on performance or functionality
- no support for user-defined indexing ("automatic and self-tuning indexes")

26 Architecture
- MonetDB: open-source database system for high-performance applications in data mining, OLAP, XML Query, text and multimedia retrieval. Provides the database functionality through the MIL interface (MonetDB Interpreter Language).
- Pathfinder: XQuery compiler that translates XQuery expressions into relational algebra and calls MIL functions.
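To give a feel for what evaluating XQuery with relational algebra can mean, the Python sketch below stores an XML tree as rows of (pre, size, level, tag) and answers a descendant step with a plain range selection. The slides do not say which encoding Pathfinder uses, so this encoding is an assumption chosen for illustration. The sample document and function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def encode(root):
    """Flatten a tree into (pre, size, level, tag) rows in document order."""
    rows = []

    def visit(node, level):
        pre = len(rows)
        rows.append(None)  # reserve the slot; size is known after the children
        for child in node:
            visit(child, level + 1)
        size = len(rows) - pre - 1  # number of descendants
        rows[pre] = (pre, size, level, node.tag)

    visit(root, 0)
    return rows


def descendants(rows, pre):
    """Descendant axis as a range selection: pre < p <= pre + size."""
    size = rows[pre][1]
    return [r for r in rows if pre < r[0] <= pre + size]


# Hypothetical document; element names are not taken from TPC-X.
doc = ET.fromstring(
    "<store><user><order/><order/></user><product><comment/></product></store>"
)
table = encode(doc)
for row in table:
    print(row)
print([r[3] for r in descendants(table, 1)])
```

Once the tree is a table, a path step becomes a selection on integer columns, which is the kind of operation a relational engine can handle well.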

27 XQuery support
"… quite complete support for XQuery language …" (monetdb.cwi.nl)
Not supported functions:
- Date/Time functions (0/76)
- String functions (21/32): fn:contains, fn:tokenize
- Sequence functions (11/19): fn:insert-before
- …

28 XML data import
- pf:add-doc("url", "file", x%)
- need x > 0 for update queries
  - need to adapt XCheck
  - influence on performance not clear

29 Performance
"... often achieves a 10-fold raw speed improvement for SQL and XQuery over competitor RDBMSs ..." (monetdb.cwi.nl)

30 Scalability

31 Conclusions
- Very fast, good for large documents and expensive queries
- Small documents: no drawback compared to other DBMSs
- Big problem: lack of function support
If XQuery function support improves, it is probably the database of our choice!

32 Project Summary

33 Project Summary
- RDBMS1
  - slow, but can process almost anything
  - XML as a feature
- Sedna
  - quite fast, can process a reasonable part of XML
- MonetDB
  - very fast, but only limited capabilities

