The Connection Factory Jeroen van Rotterdam, CTO May 19th, WWW9.


1 The Connection Factory Jeroen van Rotterdam, CTO May 19th, WWW9

2 Contents - Xhive setup - XPath - XPath performance issues within XML collections

3 Xhive - OO-XML database - Highly scalable - High granularity - W3C DOM Level 2 compliant - XPath 1.0 compliant

4 Architecture


6 Why XPath? Competing solutions: - XML-QL: Where-In constructs - XQL: limited - SQL: no alternative. XPath is a complete pattern-matching language.

7 XPath Advantages: - fairly complete - multiple axes - supported by W3C - basis for XPointer, XLink - basis for the XML Query WG - user-defined functions. Disadvantages: - document oriented - slightly different tree model - no updates

8 Extending DOM Collection setup: Every document is a Bastard Node

9 Library Node Advantages - Natural extension of DOM - extensible - closely related to directory structures - searchable with XPath

10 Library Node Disadvantages - potential bottleneck

11 XPath - XPath in a large PDOM (persistent DOM) collection environment: 1. Address memory issues 2. Resolve differences between the specs 3. Address performance issues

12 Memory issues - Avoid recursion - Make intermediate results capable of being stored persistently
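As an illustration of the "avoid recursion" point, here is a minimal sketch that walks a DOM tree with an explicit stack instead of recursive calls. It uses only the standard Java DOM API; the class and method names (`IterativeWalk`, `countElements`) are made up for this example and are not taken from Xhive.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class IterativeWalk {
    // Parses an XML string and returns its document element (checked exceptions wrapped for brevity).
    static Node parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Counts descendant-or-self elements with the given name without recursive calls,
    // so the depth of the tree cannot overflow the call stack.
    static int countElements(Node root, String name) {
        Deque<Node> pending = new ArrayDeque<>();
        pending.push(root);
        int count = 0;
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (n.getNodeType() == Node.ELEMENT_NODE && name.equals(n.getNodeName())) {
                count++;
            }
            // Push children last-to-first so they are popped in document order.
            for (Node c = n.getLastChild(); c != null; c = c.getPreviousSibling()) {
                pending.push(c);
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Node root = parseRoot("<book><chapter><paragraph/></chapter><chapter><paragraph/></chapter></book>");
        System.out.println(countElements(root, "paragraph"));
    }
}
```

In a persistent store the explicit stack could itself be paged out, which a call stack cannot; that extension is not shown here.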

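13 Resolve differences Differences between the XPath and DOM specs include, for instance: - the parent of an attribute: XPath's parent axis vs. DOM's ownerElement (getParentNode returns null for attributes) - namespace nodes, which exist in the XPath model but not in the DOM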
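The attribute-parent difference can be observed with the standard Java DOM and `javax.xml.xpath` APIs. The sketch below is only an illustration on a tiny in-memory document; the class name `AttrParentDemo` and the sample XML are assumptions for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class AttrParentDemo {
    // Parses a small XML string into a DOM tree (checked exceptions wrapped for brevity).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // XPath view: the parent of an attribute node is the element that carries it.
    static Node xpathParentOfId(Document doc) {
        try {
            return (Node) XPathFactory.newInstance().newXPath()
                    .evaluate("/book/@id/parent::*", doc, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<book id='b1'><title>XPath</title></book>");
        Attr id = doc.getDocumentElement().getAttributeNode("id");
        System.out.println(id.getParentNode());                 // DOM: an attribute has no parent node
        System.out.println(id.getOwnerElement().getNodeName()); // DOM Level 2: ownerElement instead
        System.out.println(xpathParentOfId(doc).getNodeName()); // XPath: the parent axis reaches the element
    }
}
```

An XPath engine layered on a DOM therefore has to map the parent axis onto `getOwnerElement()` whenever the context node is an attribute.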

14 Performance Increase XPath performance: - Query analysis - Avoid reparsing - Lazy evaluation - Index structures - Cache strategy - DTD analysis - Statistical data

15 Performance 1. Query analysis: a. Can I simplify my query? E.g. /child::chapter[5+5] can be rewritten as /child::chapter[10].
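The simplification on this slide amounts to constant folding. The sketch below is a deliberately naive, string-level version that only handles a predicate consisting of two integer literals added together; a real engine would presumably fold constants on the parsed expression tree. `PredicateFolder` is a hypothetical name for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PredicateFolder {
    // Matches a predicate that is nothing but the sum of two integer literals, e.g. [5+5].
    private static final Pattern CONSTANT_SUM =
            Pattern.compile("\\[\\s*(\\d+)\\s*\\+\\s*(\\d+)\\s*\\]");

    // Rewrites every such predicate to its folded value; everything else is left untouched.
    static String fold(String query) {
        Matcher m = CONSTANT_SUM.matcher(query);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            long sum = Long.parseLong(m.group(1)) + Long.parseLong(m.group(2));
            m.appendReplacement(out, "[" + sum + "]");
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(fold("/child::chapter[5+5]"));
    }
}
```

The folded predicate is a plain position test, so the engine no longer has to evaluate the arithmetic once per candidate node.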

16 Performance 1. Query analysis: b. Does your query depend on the context node? Absolute queries are context independent. Example: give me all chapters whose title is the same as the book title: //chapter[title=string(/book/title)] Evaluate string(/book/title) only once.
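One way to picture "evaluate only once" with the standard `javax.xml.xpath` API is to compute the absolute subexpression up front and pass the result into the predicate as a variable. This is a hand-written sketch of the idea, not how Xhive does it internally; the class name `HoistedSubquery` and the variable name `bookTitle` are assumptions for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class HoistedSubquery {
    // Counts chapters whose title equals the book title, evaluating the book title once.
    static int countChaptersTitledLikeBook(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // Context-independent part: evaluated exactly once, outside the predicate.
            String bookTitle = xpath.evaluate("string(/book/title)", doc);

            // The predicate now compares against the precomputed value via a variable.
            xpath.setXPathVariableResolver(name ->
                    "bookTitle".equals(name.getLocalPart()) ? bookTitle : null);
            NodeList hits = (NodeList) xpath.evaluate(
                    "//chapter[title=$bookTitle]", doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countChaptersTitledLikeBook(
                "<book><title>Dune</title><chapter><title>Dune</title></chapter></book>"));
    }
}
```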

17 Performance 2. Storing parsed queries: compile and optimize each query only once
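A minimal sketch of storing parsed queries, using the standard `XPath.compile` call and a map keyed by the query text. `CompiledQueryCache` and its `compilations` counter are hypothetical names added for illustration; the sketch is single-threaded, since `XPath` and `XPathExpression` objects are not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

class CompiledQueryCache {
    private final XPath xpath = XPathFactory.newInstance().newXPath();
    private final Map<String, XPathExpression> cache = new HashMap<>();
    private int compilations = 0; // for illustration: how often a query was actually parsed

    // Returns the compiled form of the query, parsing it only on first use.
    XPathExpression get(String query) {
        return cache.computeIfAbsent(query, q -> {
            try {
                compilations++;
                return xpath.compile(q);
            } catch (XPathExpressionException e) {
                throw new IllegalArgumentException("Invalid XPath: " + q, e);
            }
        });
    }

    int compilations() {
        return compilations;
    }

    public static void main(String[] args) {
        CompiledQueryCache queries = new CompiledQueryCache();
        queries.get("//chapter[title]");
        queries.get("//chapter[title]"); // served from the cache, not reparsed
        System.out.println(queries.compilations());
    }
}
```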

18 Performance 3. Lazy evaluation: e.g. when a node-set is converted to another type: - boolean (only the first node needs to be found) - string (string-value of the first node in document order) - number (string-value converted to a number). Example: give me all chapters that have paragraphs: /chapter[paragraph] Finding one paragraph is enough.
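The paragraph example can be sketched as an existence test that stops at the first match rather than materializing the whole node-set. The code uses only the standard DOM API; `LazyExistence`, `hasChild` and the `visited` counter are hypothetical names used to make the early exit visible.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class LazyExistence {
    static int visited = 0; // counts child nodes inspected, for illustration only

    // Parses an XML string and returns its document element (checked exceptions wrapped for brevity).
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Boolean value of the node-set child::name: true as soon as one match is found.
    static boolean hasChild(Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            visited++;
            if (n.getNodeType() == Node.ELEMENT_NODE && name.equals(n.getNodeName())) {
                return true; // stop here: the remaining siblings are never touched
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Element chapter = parseRoot("<chapter><paragraph/><paragraph/><paragraph/></chapter>");
        System.out.println(hasChild(chapter, "paragraph") + " after inspecting " + visited + " node(s)");
    }
}
```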

19 Performance 4. Indexing: - getFirstChildElementByName(String name) - getNextSiblingElementBySameName() - getFirstChildByType( short type ) - getNextSiblingByType( short type )
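To give a feel for what name-based navigation calls like these could be backed by, here is a small sketch that indexes the element children of one parent by name. It borrows two method names from the slide but is a hypothetical in-memory illustration over plain DOM, not the Xhive implementation; `ChildNameIndex` and `parseRoot` are invented for this example.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class ChildNameIndex {
    private final Map<String, Element> first = new HashMap<>();          // name -> first child with that name
    private final Map<Element, Element> next = new IdentityHashMap<>();  // child -> next sibling with the same name

    // Builds the index in a single pass over the children of `parent`.
    ChildNameIndex(Element parent) {
        Map<String, Element> last = new HashMap<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE) {
                continue;
            }
            Element e = (Element) n;
            Element prev = last.put(e.getNodeName(), e);
            if (prev == null) {
                first.put(e.getNodeName(), e);
            } else {
                next.put(prev, e);
            }
        }
    }

    // Constant-time lookups instead of scanning the sibling list.
    Element getFirstChildElementByName(String name) {
        return first.get(name);
    }

    Element getNextSiblingElementBySameName(Element current) {
        return next.get(current);
    }

    // Parses an XML string and returns its document element (checked exceptions wrapped for brevity).
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With such an index a step like child::chapter can jump from one chapter to the next without visiting the siblings in between.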

20 Performance 5. Caching strategy: top level paging/cluster strategy

21 Performance 6. Use DTD information: e.g. /child::chapter/child::book[4] can be answered with an empty result straight away if the DTDs in use show that a chapter can never contain a book.
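A sketch of this kind of pruning, assuming the DTD has already been distilled into a map from element name to the child element names it allows. That map, the class `DtdPruner` and the method `canMatch` are all hypothetical; only a chain of child steps is handled.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

class DtdPruner {
    // Hypothetical content model taken from a DTD: element name -> allowed child element names.
    private final Map<String, Set<String>> allowedChildren;

    DtdPruner(Map<String, Set<String>> allowedChildren) {
        this.allowedChildren = allowedChildren;
    }

    // Returns false when some child step can never match, so the result is known to be
    // empty without reading any document. Elements unknown to the DTD are treated
    // conservatively: the query is assumed to be able to match.
    boolean canMatch(List<String> childSteps) {
        for (int i = 1; i < childSteps.size(); i++) {
            Set<String> allowed = allowedChildren.get(childSteps.get(i - 1));
            if (allowed != null && !allowed.contains(childSteps.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        DtdPruner pruner = new DtdPruner(Map.of(
                "book", Set.of("title", "chapter"),
                "chapter", Set.of("title", "paragraph")));
        // /child::chapter/child::book[4]: a chapter may not contain a book.
        System.out.println(pruner.canMatch(List.of("chapter", "book")));
    }
}
```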

22 Performance 7. Gather statistical info: DTDs or XML Schemas specify structures that may occur, not what is actually in your collection.

23 Conclusion - DOM within database environments - XPath on top of a PDOM - XPath is fairly complete - Focus on performance

24 WWW9 Beta testers and developers wanted. Email: Have fun...
