Bottom-up Evaluation of XPath Queries Stephanie H. Li Zhiping Zou.



Outline: overview of XPath; motivation; algorithms (bottom-up evaluation); design and implementation.

Introduction: Overview of XPath. XPath is a query language designed for addressing nodes of XML documents. Topics: data model; syntax; expressions (location paths, operators, functions); evaluation (context).

Data Model: An XML document is a tree of nodes. There are 7 kinds of nodes: element, attribute, text, namespace, processing-instruction, comment, and document (root) nodes.

Data Model (Example): [Figure: example document tree with the root node r, the root element a, and four b elements.]

Expressions: XPath uses expressions to select nodes from XML documents. The main types of expressions are location paths, functions, and operators.

Location Paths: Although there are many different kinds of XPath expressions, the one that is of primary use in Java programs is the location path. Example: /child::movies/child::movie[position()=5]. The whole expression is the location path; child::movie[position()=5] is one step, with axis child, node test movie, and predicate [position()=5].
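To make the step anatomy concrete, here is a minimal sketch in Python using the standard-library `xml.etree.ElementTree`. The seven-movie document is invented for illustration. ElementTree supports only a subset of XPath and does not accept the unabbreviated `child::` syntax, so the sketch uses the abbreviated form `movie[5]` and then spells the same step out by hand as axis, node test, and predicate.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: a <movies> root element with seven <movie> children.
xml_text = "<movies>" + "".join(
    f"<movie>Film {i}</movie>" for i in range(1, 8)
) + "</movies>"
root = ET.fromstring(xml_text)  # the <movies> element

# Abbreviated equivalent of child::movie[position()=5], relative to <movies>.
fifth = root.findall("movie[5]")

# The same step by hand: axis, then node test, then predicate.
children = list(root)                                # axis: child
movies = [c for c in children if c.tag == "movie"]   # node test: movie
by_hand = [m for pos, m in enumerate(movies, start=1) if pos == 5]  # predicate

print([m.text for m in fifth])
```

Both routes select the same single element, which illustrates that the predicate filters the node-set produced by the axis and node test.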

Location Step: Axis::Nodetest[predicates]. Axis: chooses the direction to move from the context node. Node test: determines what kinds of nodes will be selected along that axis. Predicates: further filter the node-set.

XPath Axis: the axis is the main navigator for an XML document.
    ancestor: nodes along the path to the root
    ancestor-or-self: same, but including the context node
    child: children of the context node
    descendant: descendants of the context node
    descendant-or-self: same, but including the context node
    following: nodes after the context node in document order, excluding descendants
    following-sibling: following siblings of the context node
    parent: the parent of the context node
    preceding: nodes before the context node in document order, excluding ancestors
    preceding-sibling: preceding siblings of the context node
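The axis definitions above can be sketched over a small in-memory tree. This is an illustrative Python sketch, not the presenters' implementation: the `Node` class and helper names are hypothetical, the tree shape (r, a, b1 to b4, with all b nodes as children of a) is an assumption, and every axis returns its nodes in document order.

```python
class Node:
    """Hypothetical tree node with a parent pointer and ordered children."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

def document_order(root):
    """Pre-order traversal: a node comes before its descendants."""
    out = [root]
    for child in root.children:
        out.extend(document_order(child))
    return out

def ancestors(node):
    out = []
    while node.parent is not None:
        node = node.parent
        out.append(node)
    return out

def descendants(node):
    return document_order(node)[1:]

def following(node, root):
    """Nodes after `node` in document order, excluding its descendants."""
    order = document_order(root)
    skip = set(descendants(node))
    return [n for n in order[order.index(node) + 1:] if n not in skip]

def preceding(node, root):
    """Nodes before `node` in document order, excluding its ancestors."""
    order = document_order(root)
    skip = set(ancestors(node))
    return [n for n in order[:order.index(node)] if n not in skip]

def following_siblings(node):
    if node.parent is None:
        return []
    siblings = node.parent.children
    return siblings[siblings.index(node) + 1:]

def names(nodes):
    return [n.name for n in nodes]

# Assumed example tree: root node r, root element a, children b1..b4.
r = Node("r")
a = Node("a", r)
b1, b2, b3, b4 = (Node(f"b{i}", a) for i in range(1, 5))

print(names(following(b2, r)), names(preceding(b3, r)))
```

In this flat tree the following and following-sibling axes coincide for the b nodes; they differ as soon as a later sibling has children of its own.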

Node Test: Node type test. Example: T(root()) = {r}, T(element()) = {a, b1, ..., b4}, T(element(a)) = {a}, T(element(b)) = {b1, ..., b4}. Node name test: element node name.

Operators and Functions: Arithmetic operators. Operators for comparisons and boolean logic: {<, <=, >, >=, =, !=} and {or, and}. Functions: position(), last().

XPath Query Evaluation: Query evaluation is a major algorithmic problem. The main construct is the expression. Each expression is evaluated to yield an object of one of these four types: node-set (an unordered collection of nodes without duplicates), boolean (true or false), number (a floating-point number), string.

Context: All XPath expressions are evaluated with respect to a context, which consists of a context node, a context position (an integer), and a context size (an integer). The input context for query evaluation is chosen by the user.

Motivation: Claim: the way XPath is defined in the W3C XPath recommendation motivates an inefficient (exponential-time) implementation. This paper proposes a more efficient (polynomial-time) way.

Basic query evaluation strategy
    procedure process-location-step(n0, Q)
    /* n0 is the context node; query Q is a list of location steps */
    begin
        node set S := apply Q.first to node n0;
        if (Q.tail is not empty) then
            for each node n ∈ S do
                process-location-step(n, Q.tail);
    end
The algorithm recursively evaluates each remaining step for each matching node of the current step, so its running time satisfies
    $$\mathrm{Time}(|Q|) = \begin{cases} |D| \cdot \mathrm{Time}(|Q|-1) & \text{when } |Q| > 0 \\ 1 & \text{when } |Q| = 0 \end{cases}$$
that is, $\mathrm{Time}(|Q|) = |D|^{|Q|}$.
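The blow-up can be observed directly with a small Python sketch of the recursive procedure. The `Node` class, the two supported axes, the call counter, and the test query are assumptions made for illustration. The document is a single a element with two b children, and the query is child::b followed by k repetitions of parent::a/child::b, so each repetition roughly doubles the number of recursive calls while the final result stays the same two nodes.

```python
class Node:
    """Hypothetical tree node with a parent pointer and ordered children."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

def apply_step(step, node):
    """Apply one location step (axis, name test) to a single context node."""
    axis, name = step
    if axis == "child":
        candidates = node.children
    elif axis == "parent":
        candidates = [node.parent] if node.parent is not None else []
    else:
        raise ValueError(f"unsupported axis: {axis}")
    return [c for c in candidates if c.name == name]

def process_location_step(n0, query, stats, result):
    """Naive strategy: recurse on the tail once per node selected by the head."""
    stats["calls"] += 1
    selected = apply_step(query[0], n0)
    tail = query[1:]
    if tail:
        for n in selected:
            process_location_step(n, tail, stats, result)
    else:
        result.update(selected)  # last step: collect the selected nodes

def count_calls(k):
    """Return (number of calls, result size) for k parent/child round trips."""
    a = Node("a")
    Node("b", a)
    Node("b", a)
    query = [("child", "b")] + [("parent", "a"), ("child", "b")] * k
    stats = {"calls": 0}
    result = set()
    process_location_step(a, query, stats, result)
    return stats["calls"], len(result)

for k in range(4):
    print(k, count_calls(k))
```

The same nodes are revisited many times because nothing is shared between recursive calls; this repeated work is what the context-value tables are meant to avoid.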

XPath Evaluation in PTime: Theorem: Let e be an arbitrary XPath expression. Then, for context node x, position k, and size n, the value of e is v, where v is the unique value such that ⟨x, k, n, v⟩ ∈ E↑[e]. The main principle that the paper proposes to obtain an XPath evaluation algorithm with PTime complexity is the notion of a context-value table (CVT).

Context-Value Table Principle: Given an expression e, the CVT of e specifies all valid combinations of contexts c and values v such that e evaluates to v in context c. Such a table for expression e is obtained by first computing the CVTs of the direct subexpressions of e and then combining them into the CVT for e. The size of each CVT has a polynomial bound, and each combination step can be carried out in PTime. Thus, query evaluation under this principle also has a PTime bound overall.
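As a small illustration of combining tables, the following Python sketch builds the CVTs of position() and last() and combines them into the CVT of position() != last(). The four-node domain, the dictionary representation of a table (context triple mapped to value), and the helper `combine` are assumptions made for this sketch; the paper's tables also cover node-set-valued expressions, which are omitted here.

```python
# Hypothetical document domain with four nodes.
nodes = ["r", "a", "b1", "b2"]

# All contexts (x, k, n) with 1 <= k <= n <= |dom|.
contexts = [
    (x, k, n)
    for x in nodes
    for n in range(1, len(nodes) + 1)
    for k in range(1, n + 1)
]

# CVTs of the two atomic subexpressions: one row per context.
cvt_position = {c: c[1] for c in contexts}  # position() returns k
cvt_last = {c: c[2] for c in contexts}      # last() returns n

def combine(op, left, right):
    """Combine two CVTs row by row into the CVT of the parent expression."""
    return {c: op(left[c], right[c]) for c in left}

cvt_neq = combine(lambda u, v: u != v, cvt_position, cvt_last)

print(len(cvt_neq), cvt_neq[("a", 1, 3)], cvt_neq[("a", 3, 3)])
```

The table has one row per context, so its size is bounded by |dom| · |dom|(|dom|+1)/2, a polynomial in the document size, and the combination step touches each row once.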

Bottom-up evaluation of XPath

Algorithm (Bottom-up algorithm for XPath)
By a bottom-up algorithm we mean a method of processing XPath while traversing the parse tree of the query from its leaves up to its root.
    Input: an XPath query Q
    Output: E↑[Q]
    Method:
        let Tree(Q) be the parse tree of query Q;
        R := Ø;
        for each atomic expression l ∈ leaves(Tree(Q)) do
            compute table E↑[l] and add it to R;   [Note: we use JDom to do this]
        while E↑[root(Tree(Q))] ∉ R do
        begin
            take an Op(l1,…,ln) ∈ nodes(Tree(Q)) s.t. E↑[l1],…,E↑[ln] ∈ R;
            compute E↑[Op(l1,…,ln)] using E↑[l1],…,E↑[ln];
            add E↑[Op(l1,…,ln)] to R;
        end;
        return E↑[root(Tree(Q))]
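The control flow of the algorithm can be sketched in Python for a deliberately tiny expression language. Everything here is an assumption made for illustration: parse trees are nested tuples, the only leaves are position(), last(), and numeric constants, the only operators are +, =, and !=, and numbers are plain integers rather than XPath floating-point numbers. Location paths and node-set values, which the full algorithm handles, are left out.

```python
import operator

BINARY_OPS = {"+": operator.add, "=": operator.eq, "!=": operator.ne}
LEAF_KINDS = ("position", "last", "num")

def is_leaf(node):
    return node[0] in LEAF_KINDS

def walk(node):
    """Yield every node of the parse tree, parents before children."""
    yield node
    if not is_leaf(node):
        for child in node[1:]:
            yield from walk(child)

def leaf_table(node, contexts):
    """Context-value table of an atomic expression."""
    kind = node[0]
    if kind == "position":
        return {c: c[1] for c in contexts}
    if kind == "last":
        return {c: c[2] for c in contexts}
    return {c: node[1] for c in contexts}  # numeric constant

def bottom_up(tree, contexts):
    all_nodes = list(walk(tree))
    R = {}  # maps a parse-tree node to its context-value table
    for node in all_nodes:
        if is_leaf(node):
            R[node] = leaf_table(node, contexts)
    while tree not in R:
        # Take an operator node whose operand tables are all in R.
        for node in all_nodes:
            if node in R:
                continue
            children = node[1:]
            if all(child in R for child in children):
                op = BINARY_OPS[node[0]]
                left, right = (R[child] for child in children)
                R[node] = {c: op(left[c], right[c]) for c in contexts}
                break
    return R[tree]

contexts = [
    (x, k, n)
    for x in ("a", "b1", "b2")
    for n in range(1, 4)
    for k in range(1, n + 1)
]
# Parse tree of: position() + 1 = last()
tree = ("=", ("+", ("position",), ("num", 1)), ("last",))
table = bottom_up(tree, contexts)
print(table[("a", 2, 3)], table[("a", 1, 3)])
```

Each pass of the while loop adds at least one table, because a finite parse tree always contains an uncomputed node whose operands are already computed, so the loop terminates after at most one pass per operator node.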

Bottom-up evaluation of XPath: Example XML document (element markup not preserved in this transcript). Text content, in document order: Alan Turing; computer scientist; mathematician; cryptographer; Richard M. Feynman; physicist; Playing the bongoes.

Example: XML Doc Tree

Example: XPath query tree. XPath query: descendant::profession/following-sibling::*[position() != last()]. [Figure: parse tree of the query.]
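The example query can be traced by hand with a short Python sketch. The markup below is hypothetical: only the text content and the element name profession appear in the slides, so the people, person, name, and hobby element names and the nesting are assumptions. The context node is assumed to be the root element. ElementTree is used only for parsing, since it does not support the following-sibling axis; the two steps and the predicate are evaluated explicitly.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup; only the text content and <profession> are from the slides.
DOC = """
<people>
  <person>
    <name>Alan Turing</name>
    <profession>computer scientist</profession>
    <profession>mathematician</profession>
    <profession>cryptographer</profession>
  </person>
  <person>
    <name>Richard M. Feynman</name>
    <profession>physicist</profession>
    <hobby>Playing the bongoes</hobby>
  </person>
</people>
"""

root = ET.fromstring(DOC)
# ElementTree elements have no parent pointer, so build a child-to-parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}

def following_siblings(node):
    """Element siblings after `node`, in document order."""
    siblings = list(parent_of[node])
    index = next(i for i, s in enumerate(siblings) if s is node)
    return siblings[index + 1:]

def evaluate(context_node):
    result = []
    # Step 1: descendant::profession (the context node itself is excluded).
    for prof in context_node.iter("profession"):
        if prof is context_node:
            continue
        # Step 2: following-sibling::* filtered by [position() != last()].
        candidates = following_siblings(prof)
        size = len(candidates)  # context size seen by last()
        for position, node in enumerate(candidates, start=1):
            if position != size and node not in result:
                result.append(node)
    return result

selected = evaluate(root)
print([e.text for e in selected])
```

Under these assumptions only the second profession of the first person is selected: it is the only following sibling of a profession element that is not last among those siblings.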

Example: Evaluate subexpressions

Design and Implementation: Environment: Java (JDK 1.5.0), JDom 1.0, XPath 1.0. Features: only element nodes are queried; abbreviated XPath expressions are not supported; location steps inside predicates are not supported.

System Structure
    User input (MyDriver.java): XML file, query, context node
    Query parser (Parser.java, BinaryTree.java, Node.java): produces the query tree
    JDom XML parser (org.jdom.input.SAXBuilder): produces the XML document tree
    Context-value tables (ContextValTable.java and others)
    Evaluator (QueryEval.java): produces the result for the full XPath query

Conclusion: An XPath query evaluation algorithm that runs in polynomial time with respect to the size of both the data and the query (linear in the size of queries and quadratic in the size of data). No optimization; the implementation adheres strictly to the specification given in the paper.

References
    G. Gottlob, C. Koch, and R. Pichler. "XPath Processing in a Nutshell". In Proceedings of the 19th IEEE International Conference on Data Engineering (ICDE'03), Bangalore, India, Mar
    G. Gottlob, C. Koch, and R. Pichler. "Efficient Algorithms for Processing XPath Queries". In Proceedings of the 28th International Conference on Very Large Data Bases (VLDB'02), Hong Kong, China, Aug
    G. Gottlob, C. Koch, and R. Pichler. "XPath Query Evaluation: Improving Time and Space Efficiency". In Proceedings of the 19th IEEE International Conference on Data Engineering (ICDE'03), Bangalore, India, Mar