Presentation is loading. Please wait.

Presentation is loading. Please wait.

Storing and Querying Ordered XML Using a Relational Database System By Khang Nguyen Based on the paper of Igor Tatarinov and Statis Viglas.

Similar presentations


Presentation on theme: "Storing and Querying Ordered XML Using a Relational Database System By Khang Nguyen Based on the paper of Igor Tatarinov and Statis Viglas."— Presentation transcript:

1 Storing and Querying Ordered XML Using a Relational Database System By Khang Nguyen Based on the paper of Igor Tatarinov and Statis Viglas

2 Introduction Researchers proposed using relational database system to decompose XML documents into relations and translate XML queries into SQL queries over these relation. The paper will answer the question of how (and whether) the ordered XML data model can be effectively supported by the unordered relational data model.

3 Introduction (Cont.) The paper proposes three order encoding methods (Global Order, Local Order, and Dewey Order) to represent XML order in the relational data model. The paper will also answer the question of when and why to use which encoding method.

4 1. Ordered XML: Data Model, Query Language and Query Dimensions 1.1 The XML Data Model An XML can be viewed as a tree. Leaf Nodes = Data Values (text). Internal Nodes = XML elements. Document Order = the orders of the elements in the XML document.

5 1. Ordered XML: Data Model, Query Language and Query Dimensions 1.2 Order in XML Query Language 1.2.1 XPath is a language for specifying navigation within an XML document. An XPath expression’s syntax: Path ::= /Step1/Step2/…/Stepn Path ::= /films/film/writers/writer/Murray Burnett An XPath expression is evaluated sequentially, “step” by “step”.

6 1. Ordered XML: Data Model, Query Language and Query Dimensions (Cont.) 1.2.2 XQuery Is a more complex language based on XPath. Has all the functionalities of XPath. Includes Before and After operators that take two node sequences (XPath expressions) and Return the nodes from the first sequence that are before or after some node in the second sequence.

7 1. Ordered XML: Data Model, Query Language and Query Dimensions (Cont.) 1.3 Evaluation Modes for XML Queries 1.3.1 Select Mode: The nodes in an input XML document are assumed to have unique identifiers (IDs). The results of executing an XPath expression is an ordered set of node IDs. 1.3.2 Reconstruct Mode: Combines selection and extraction. The result of evaluating an XPath expression in reconstruct mode is an ordered set of XML elements.

8 2. XML Order Encoding Methods To capture the document order in the relational data model is accomplished by encoding each node’s position in an XML document as a data value. Unfortunately, there is no single encoding method, which is optimal for both queries and updates.

9 2. XML Order Encoding Methods (Cont.) 2.1 Global Order Encoding: Each node is assigned a number that represents the node’s absolute position in the document. Poor insertion performance is the primary weakness. To improve updates’ performance, use sparse numbering which does not require the remaining nodes be numbered when deleting XML fragments.

10 Global Order Encoding play title act 1 2 4 #texttitlescen e #text 35 6 7 Global Order

11 2. XML Order Encoding Methods (Cont.) 2.3 Local (Sibling) Order Encoding: Each node is assigned a number that represents its relative position among its siblings. Combining a node’s position with that of its ancestors yields a path vector that uniquely identifies the absolute position of the node within the document. The advantage is the low overhead incurred by updates.

12 Local Order Encoding play title act 1 1 2 #texttitlescen e #text 11 1 2 Local Order 3

13 2. XML Order Encoding Methods (Cont.) 2.3 Dewey Order Encoding: Each node is assigned a vector that represents the path from the document’s root to the node. Is “lossless” because each path uniquely identifies the absolute position of the node within the document. One potential disadvantage is the extra space required to store paths from the root to each node.

14 Dewey Order Encoding play title act 1 1.1 1.2 #texttitlescen e #text 1.1.11.2.1 1.2.1.1 1.2.2 Dewey Order 1.3

15 3. Shredding Ordered XML into Relations 3.1 The Schema-less Case: The schema of input document is unknown. The Edge shredding approach is proposed. A single relation, the Edge table, is used to store an entire document. The Edge table is defined Edge(id, parent, name, value) To reduce storage, a separate relation (the Path table) can be used to store paths and their identifiers.

16 3. Shredding Ordered XML into Relations (Cont.) 3.1.1 Storing Order Information Global Order Edge(id, parent_id, end_des_id, path_id, value) Local Order Edge(id, parent_id, sIndex, path_id, value) Dewey Order Edge(dewey, path_id, value)

17 3. Shredding Ordered XML into Relations (Cont.) 3.2 The Schema-aware Case: Inlining shredding technique can be used when an XML Schema (DTD) is known. One advantage is the possibility of more efficient navigation from an element to its sub-elements. Another advantage is Inlining shreds and stores XML documents into a set of tables. Queries tend to access less data and perform better.

18 Summary Relational database systems can support most ordered XML queries efficiently. Global Order is best for query-mostly workloads. Dewey Order is best for a mix of queries and updates. Local Order is best for update-intensive environments. Edge and Inlining methods are well known for translating ordered XML queries into SQL queries.


Download ppt "Storing and Querying Ordered XML Using a Relational Database System By Khang Nguyen Based on the paper of Igor Tatarinov and Statis Viglas."

Similar presentations


Ads by Google