A TREE BASED ALGEBRA FRAMEWORK FOR XML DATA SYSTEMS

1 A TREE BASED ALGEBRA FRAMEWORK FOR XML DATA SYSTEMS
Ali El bekai, Nick Rossiter
School of Informatics, Northumbria University

2 Overview
Framework in algebra for processing XML data
Review related work
Develop a simple algebra, called TA (Tree Algebra), for processing, storing and manipulating XML data as trees
Describe input and output of the algebraic operators
Define the syntax of relationships/operators and their semantics in terms of algorithms
  Examples are given in the domain specific XML query language
Discuss closure and application

3 Related Work
IBM (Beech & Rys, 1999)
Lore (McHugh et al., 1997)
YATL (Christophides et al., 2000)
Niagara (Galanis et al., 2001)
AT&T (W3C)
TAX (Jagadish et al., 2001)
Problems identified in complexity and generality

4 Tree Algebra
True tree
  Each node has one parent but may have many children
  Root node
Leaves of tree
  Correspond to different sources – object relational
Two types of operators
  Algebraic operators
  Relational operators

5 Concepts in Tree Model
Root (ultimate ancestor or parent)
Node (parent or child)
Edge (link from a parent to a child)
Leaf (atomic values, nodes with no children)
Path (sequence of edges between nodes)
Descendants (all successor nodes for a node)
Ancestors (all parent nodes for a node)
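
The concepts listed above can be illustrated with a small sketch. This is not the authors' implementation (the slides state theirs was written in Java); the `Node` class, its method names, and the `title`/`Vase` sample data are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # identity-based equality avoids recursing through parent links
class Node:
    label: str
    value: Optional[str] = None  # atomic value, only meaningful for leaves
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Create an edge from this node (parent) to child."""
        child.parent = self
        self.children.append(child)
        return child

    def is_leaf(self) -> bool:
        return not self.children

    def ancestors(self) -> List["Node"]:
        """All nodes on the way up to the root, nearest first."""
        out, node = [], self.parent
        while node is not None:
            out.append(node)
            node = node.parent
        return out

    def descendants(self) -> List["Node"]:
        """All successor nodes, in depth-first order."""
        out = []
        for child in self.children:
            out.append(child)
            out.extend(child.descendants())
        return out

    def path_from_root(self) -> List[str]:
        """Labels along the path from the root down to this node."""
        return [n.label for n in reversed(self.ancestors())] + [self.label]


# Hypothetical sample tree: a collection with two objects.
root = Node("collection")
obj1 = root.add(Node("object1"))
title = obj1.add(Node("title", value="Vase"))
obj3 = root.add(Node("object3"))
```

Here `root` has no parent, `title` is a leaf carrying an atomic value, and `title.path_from_root()` walks the edges from the root down to the leaf.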

6 Mappings
XML Document → Tree
Element → Node (root, parent, child)
Leaf → child node, atomic values
Attribute → function, values
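
As a rough illustration of these mappings, the sketch below parses a small XML document with Python's standard `xml.etree.ElementTree` module and prints one line per node. The document content (`id` attributes, `title` elements) and the `outline` helper are assumptions made for this example, not part of the original slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document; element names follow the slide's example tree.
XML = """
<collection>
  <object1 id="o1"><title>Vase</title></object1>
  <object3 id="o3"><title>Coin</title></object3>
</collection>
"""

# XML document -> tree: the parsed root element is the root node.
root = ET.fromstring(XML.strip())


def outline(node, depth=0):
    """Return one text line per node, depth-first, showing the mappings."""
    kind = "leaf" if len(node) == 0 else "node"  # a leaf has no child elements
    line = "  " * depth + f"{kind} {node.tag} attrs={dict(node.attrib)}"
    if kind == "leaf":
        # Leaves carry the atomic value of the element.
        line += f" value={(node.text or '').strip()!r}"
    lines = [line]
    for child in node:
        lines.extend(outline(child, depth + 1))
    return lines


for text_line in outline(root):
    print(text_line)

# Attribute -> function: look up a value by attribute name on a node.
first_id = root[0].get("id")
```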

7 Example XML Tree
Root – collection element; object1, object3 – sub-elements

8 Algebraic Relationships
Comparison of two trees
Universal (unary)
  Defines tree containing all information
Similarity (binary)
  Two trees have the same structure
Equivalence (binary)
  Two trees are indistinguishable
Subsumption (binary)
  One tree is subsumed in another
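
One possible reading of the three binary relationships is sketched below on `xml.etree.ElementTree` elements. The exact definitions are assumptions: similarity compares tags and shape only, equivalence additionally compares text and attributes, and subsumption requires every node of the first tree to have a matching node in the second. The universal relationship is not covered. The function names and sample trees are hypothetical.

```python
import xml.etree.ElementTree as ET


def _text(node):
    """Element text with surrounding whitespace ignored."""
    return (node.text or "").strip()


def similar(a, b):
    """Same structure: matching tags and the same shape, content ignored."""
    return (a.tag == b.tag and len(a) == len(b)
            and all(similar(x, y) for x, y in zip(a, b)))


def equivalent(a, b):
    """Indistinguishable: same structure and no mismatch in content."""
    return (a.tag == b.tag and _text(a) == _text(b) and a.attrib == b.attrib
            and len(a) == len(b)
            and all(equivalent(x, y) for x, y in zip(a, b)))


def subsumed(a, b):
    """True when tree a is part of tree b (structure and content)."""
    if a.tag != b.tag or _text(a) != _text(b):
        return False
    if any(b.get(k) != v for k, v in a.attrib.items()):
        return False
    # Every child of a needs some child of b that subsumes it.
    return all(any(subsumed(x, y) for y in b) for x in a)


# Hypothetical sample trees.
c3 = ET.fromstring(
    "<collection><object1><title>Vase</title></object1></collection>")
c4 = ET.fromstring(
    "<collection><object1><title>Vase</title></object1>"
    "<object3><title>Coin</title></object3></collection>")
c5 = ET.fromstring(
    "<collection><object1><title>Urn</title></object1></collection>")
```

With this data, `c3` and `c5` are similar but not equivalent, and `c3` is subsumed in `c4` but not the other way round.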

9 Example Equivalence Relationship
XML Tree Collection3 is equivalent to Collection4: Same node structure, no mismatch in content

10 Example Subsumption Relationship
Collection3 is part of Collection4 (structure and content)

11 Algebraic Operators for Trees
Join (binary, input two trees, output one tree, commutative, associative)
  Joined on a predicate
Union (binary, input two trees, output one tree, commutative, associative, disjoint)
  Summing trees together
Complement (binary, input two trees, output one tree, not commutative, not associative)
  Nodes in one tree not found in another
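
The slides give no algorithm for union, so the following is only a guess at what "summing trees together" could look like: the children of both input trees are copied under one new root. The function name `union`, the requirement that both roots share a tag, and the sample data are assumptions. Under this reading the operator is commutative only up to sibling order.

```python
import copy
import xml.etree.ElementTree as ET


def union(a, b):
    """Return a new tree holding the children of both a and b.

    Assumption: both trees share the same root tag and are disjoint,
    so no duplicate elimination is attempted.
    """
    if a.tag != b.tag:
        raise ValueError("this sketch only unions trees with the same root tag")
    out = ET.Element(a.tag)
    for child in list(a) + list(b):
        out.append(copy.deepcopy(child))  # inputs are left untouched
    return out


# Hypothetical sample trees.
t1 = ET.fromstring("<collection><object1/></collection>")
t2 = ET.fromstring("<collection><object3/></collection>")
both = union(t1, t2)
```

The result is again a single tree, which matches the closure property the presentation claims for all operators.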

12 Algorithm for Complement Operator
// Input: two XML documents or two DOC trees (DOCn Tree, DOCm Tree)
// Output: DOCnm Tree = (DOCn Tree - DOCm Tree)
1 Start from root node DOCn
2 If root node DOCn Tree and root node DOCm Tree have parent/child nodes
2.1 Perform depth-first algorithm
2.2 If DOCn Tree has parent node not existing in DOCm Tree
2.2.1 set parent node DOCn Tree to the new DOCnm Tree
2.2.2 while parent node DOCn Tree has child node not existing in DOCm Tree
        set child node DOCn Tree to DOCnm Tree
        if child node DOCn Tree has leaf node not existing in DOCm Tree
          set leaf node DOCn Tree to DOCnm Tree
        set null to DOCnm Tree
2.2.3 repeat
2.3 set null to DOCnm Tree
3 Set root node to DOCnm Tree and terminate
end/terminate
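
A minimal recursive sketch of the complement idea (nodes of DOCn not found in DOCm) follows. It is an interpretation rather than a transcription of the steps above: nodes are matched by tag and text, the first matching sibling is used, and `None` stands in for the "set null" case. The names `complement` and `_match` and the sample documents are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET


def _text(node):
    return (node.text or "").strip()


def _match(node, candidates):
    """First element among candidates with the same tag and text, else None."""
    for c in candidates:
        if c.tag == node.tag and _text(c) == _text(node):
            return c
    return None


def complement(n, m):
    """Return the part of tree n not found in tree m, or None if nothing is left."""
    if m is None or n.tag != m.tag or _text(n) != _text(m):
        return copy.deepcopy(n)  # whole subtree is missing from m
    kept = []
    for child in n:  # depth-first descent into matched nodes
        diff = complement(child, _match(child, m))
        if diff is not None:
            kept.append(diff)
    if not kept:
        return None
    out = ET.Element(n.tag, n.attrib)  # keep the ancestor as context
    out.extend(kept)
    return out


# Hypothetical sample documents.
doc_n = ET.fromstring(
    "<collection><object1><title>Vase</title></object1>"
    "<object3><title>Coin</title></object3></collection>")
doc_m = ET.fromstring(
    "<collection><object1><title>Vase</title></object1></collection>")
doc_nm = complement(doc_n, doc_m)
```

Swapping the arguments gives a different result (`complement(doc_m, doc_n)` is `None` here), which is consistent with the operator being described as not commutative.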

13 Projection Algebra Operator (unary, input one tree, output one tree): Example
Eliminates nodes other than those specified
Projection of object3
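
A projection of `object3` could be sketched as below. The assumptions here are that the specified nodes are named by tag, that their whole subtrees are kept, and that ancestors are retained so the result is still a tree rooted at the original root. The function name `project` and the sample document are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET


def project(node, keep):
    """Return a copy of the tree containing only subtrees whose tag is in keep.

    Ancestors of kept nodes are retained; None is returned when nothing matches.
    """
    if node.tag in keep:
        return copy.deepcopy(node)
    kept = [p for p in (project(c, keep) for c in node) if p is not None]
    if not kept:
        return None
    out = ET.Element(node.tag, node.attrib)
    out.extend(kept)
    return out


# Hypothetical sample document.
tree = ET.fromstring(
    "<collection><object1><title>Vase</title></object1>"
    "<object3><title>Coin</title></object3></collection>")
projected = project(tree, {"object3"})
```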

14 Algebra Operators (continued)
Select (unary, input one tree, output one tree)
  Filters nodes according to a predicate
Expose (unary, input one tree, output one tree)
  Retrieves specific elements/nodes given by parent/child boundaries
Vertex (unary, input one tree, output one tree)
  Creates the vertex encompassing all nodes created by the expose operator
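
For select, a minimal sketch under assumed semantics: the predicate is a Python callable applied to each direct child of the root, and the children that satisfy it are copied under a new root. The function name `select`, the predicate, and the sample data are hypothetical; the slides do not specify at which level the filter applies.

```python
import copy
import xml.etree.ElementTree as ET


def select(root, predicate):
    """Return a new tree with only the root's children that satisfy predicate."""
    out = ET.Element(root.tag, root.attrib)
    for child in root:
        if predicate(child):
            out.append(copy.deepcopy(child))
    return out  # always a tree, possibly with no children


# Hypothetical sample document and predicate.
tree = ET.fromstring(
    "<collection><object1><title>Vase</title></object1>"
    "<object3><title>Coin</title></object3></collection>")
coins = select(tree, lambda node: node.findtext("title") == "Coin")
```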

15 Algorithm for Expose Operator
// Input: one DOC tree or one XML document
// Output: one DOC tree or one XML document
1 start with entry point, it is the root node
2 perform depth-first algorithm
2.1 if parameter is equal to the specific node needed to expose
2.1.1 return the specific node
2.1.2 set specific node in the new tree
2.2 if exposed element does not exist then terminate
3 end/terminate
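
The depth-first search for the node to expose can be sketched as follows. The assumptions are that the parameter is an element tag, that the first match in document order is returned as a new tree, and that `None` signals that the element does not exist. The function name `expose` and the sample document are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET


def expose(root, tag):
    """Depth-first search for the first element named tag; return a copy or None."""
    stack = [root]  # the entry point is the root node
    while stack:
        node = stack.pop()
        if node.tag == tag:
            return copy.deepcopy(node)  # the exposed node becomes a new tree
        # Push children in reverse so the leftmost child is visited first.
        stack.extend(reversed(list(node)))
    return None  # exposed element does not exist


# Hypothetical sample document.
tree = ET.fromstring(
    "<collection><object1><title>Vase</title></object1>"
    "<object3><title>Coin</title></object3></collection>")
exposed = expose(tree, "object3")
```

Returning a deep copy keeps the input tree unchanged, so the exposed subtree can be passed on to further operators.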

16 Results
Developed
  Domain specific algebra
  Tree algebra
Algebraic relationships
  Universal, similarity, equivalence, subsumption
Algebraic operators
  Join, union, complement, project, select, expose, vertex
Closure – output is always a tree

17 Verification
All operators:
  Presented as algorithms
  Implemented in Java
Case study:
  Virtual museum application
  Implemented code employed for satisfaction of museum requirements

18 Further Work
Investigate
  Extent to which limitations in operators affect usability
  Does domain need extending?
Further experimentation
  Examine feedback from museum study
  Look at further areas

