

XPath Aug ’10 – Dec ’10

XPath
- XML Path Language
- A technology for selecting one or more parts of an XML document for processing
- Designed specifically for use with Extensible Stylesheet Language Transformations (XSLT) and XML Pointer (XPointer)
- A query language for locating nodes and fragments in XML trees

Why query XML?
- Need to extract parts of XML documents
- Need to transform documents into different forms
  - Another XML form
  - HTML (for display in a web browser)
- Need to relate or join parts of the same or different XML documents

XPath
- XPath uses path expressions to navigate in XML documents
- XPath contains a library of standard functions
- XPath is a major element in XSLT
- XPath is a W3C Recommendation

XML Document
- Nested structure of start tags and end tags
- A document also contains processing instructions, comments, attributes, namespace declarations, and the text content of elements
- Stored as a sequence of Unicode characters, which is said to be serialized
- It is more useful to model the logical structure of an XML document in a way that
  - describes the logical components that make up the document
  - exposes those components for programmatic manipulation
- Hence the need for a formal model of the logical content of an XML document

Modeling XML Documents
- XPath Data Model
  - represents most parts of an XML document as a tree of nodes
  - root node, element node, attribute node, text node, etc.
  - XML declarations and DOCTYPE declarations are not represented
- The Document Object Model (DOM)
  - hierarchical tree of nodes
  - the node types in the DOM differ from those in XPath
- XML Information Set
  - the infoset represents an XML document as a tree of information items
  - each information item has properties
  - represents a pure version of the information held in an XML document

Visualizing XPath
- XPath gives directions around the hierarchical tree of nodes that makes up the XPath data model
- All legal XPath code is called an expression
- An XPath expression that returns a node-set is called a location path
- A direction in XPath is called an axis
- An absolute XPath expression starts at the root node (it begins with /)
- A relative XPath expression starts at the context node
- The part of an expression in square brackets is a predicate

Context
- The context indicates the location of the node where the processor is currently situated
- That node is called the context node
- The context also includes the context position and the context size

Book.xml
    This is the first chapter
    This is the second chapter
    This is the third chapter
    This is the fourth chapter
    This is the fifth chapter

XPath Expression:
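To make context position and context size concrete, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The markup of Book.xml is an assumption (a `Book` element holding five `Chapter` elements with a `number` attribute), and `describe_context` is a hypothetical helper, not part of any XPath API.

```python
import xml.etree.ElementTree as ET

# Assumed structure for Book.xml; element and attribute names are illustrative.
BOOK_XML = """<Book>
  <Chapter number="1">This is the first chapter</Chapter>
  <Chapter number="2">This is the second chapter</Chapter>
  <Chapter number="3">This is the third chapter</Chapter>
  <Chapter number="4">This is the fourth chapter</Chapter>
  <Chapter number="5">This is the fifth chapter</Chapter>
</Book>"""


def describe_context(xml_text, path):
    """Return (context position, context size, text) for each selected node."""
    root = ET.fromstring(xml_text)
    selected = root.findall(path)   # the node-set being processed
    size = len(selected)            # context size: number of nodes in the set
    # Context position is 1-based, like XPath's position().
    return [(pos, size, node.text) for pos, node in enumerate(selected, start=1)]


contexts = describe_context(BOOK_XML, "Chapter")
for position, size, text in contexts:
    print(position, size, text)
```

When the second `Chapter` is the context node, the position is 2 and the size is 5, which is what an XSLT processor would report through `position()` and `last()`.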

XSLT
- A declarative programming language, written in XML, for converting XML to some other output
- The output is often XML or HTML
- XSLT uses XPath to select the parts of the source XML document that are used in the result document
- An XSLT file consists of a number of templates that match specific nodes in the XML being processed
- The standard way to select nodes for processing is the xsl:apply-templates instruction

XSLT with XPath
    <xsl:stylesheet version="1.0" xmlns:xsl=" >
    This shows the context position and context size.
    Context position and context size demo.
    When the context node is the second Chapter element node then the context position is and the context size is.
    The text the Chapter element node contains is ‘ ’.

HTML document created by stylesheet

XSLT Example: XML Document
    Dark Side of the Moon   Pink Floyd
    Space Oddity            David Bowie       9.90
    Aretha: Lady Soul       Aretha Franklin   9.90

XSLT Example: Stylesheet
    <xsl:stylesheet version="1.0" xmlns:xsl="
    A CD Catalog
    Title   Artist

XSLT Example: HTML Output

Node
- A node is a representation in the XPath data model of a logical part of an XML document.
- There are 7 types of nodes:
  - Root node
  - Element node
  - Attribute node
  - Text node
  - Namespace node
  - Comment node
  - Processing instruction node

Root Node
- The root node represents the document itself, independent of any content
- It is the apex of the hierarchy of nodes that represents the XML document
- The element node that represents the document element is a child of the root node
- The root node can have only one child element node, the document element
- The root node can also have child nodes that are processing instruction nodes or comment nodes
- The string value of the root node is the concatenation of the values of all descendant text nodes of the root node, in document order

    Mary had a little lamb.

  The string value of the document is: Mary had a little lamb.
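The string-value rule can be tried out with a small Python sketch. The markup below is hypothetical (the element names `Rhyme`, `Line`, and `Emphasis` are assumptions), and note that `xml.etree.ElementTree` hands back the document element rather than an XPath root node; for this document the two have the same string value.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup; only the text content comes from the slide.
DOC = "<Rhyme><Line>Mary had a <Emphasis>little</Emphasis> lamb.</Line></Rhyme>"


def string_value(element):
    """Concatenate all descendant text nodes in document order."""
    return "".join(element.itertext())


root = ET.fromstring(DOC)
text_value = string_value(root)
print(text_value)
```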

Element Node
- Represents an element in the XML document
- Its name consists of a namespace part (written as a prefix in the document) and a local name
- The string value of an element node is the concatenation of the values of all its descendant text nodes, in document order

Attribute Node
- Attributes in an XML document are represented as attribute nodes
- The element node with which an attribute node is associated is called the parent node of the attribute node
- Attribute nodes have a name and a value
- In XPath, attribute nodes are not children of their parent element
- They are reached through the attribute axis, not the default child axis

Text Node
- The text content of an element node is represented as a text node
- The string value of a text node is its character data
- A text node does not have a name

Namespace Node
- The in-scope namespaces of an element node are represented as namespace nodes
- The name() function returns the namespace prefix associated with the namespace node
- The string value of a namespace node, for example as returned by self::node(), is its namespace URI

Comment Node
- Represents a comment in an XML document
- Comments in the DOCTYPE declaration are not represented in the XPath data model

Processing Instruction Node
- Represents a processing instruction in the XML document
- Processing instructions in the DOCTYPE declaration are not represented in the XPath data model
- The name of a processing instruction node is its target
- The string value of a processing instruction node is its content, excluding the target

XPath 1.0 Types
- Four expression types:
  - boolean
  - node-set
  - number
  - string

Abbreviated and Unabbreviated Syntax
- When using XPath, prefer abbreviated syntax where possible; it is more concise and legible
- The syntax used in XPath is similar to the path syntax used for UNIX and Linux directories
- Both an abbreviated and an unabbreviated syntax exist; an example in unabbreviated syntax:
    /child::Book/child::Chapter/attribute::number
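The relationship between the two syntaxes is mechanical enough to sketch as a string rewrite. The helper below is a hypothetical illustration, not a full XPath parser: it only covers the abbreviations listed in XPath 1.0 (`child::`, `attribute::`, `self::node()`, `parent::node()`, `/descendant-or-self::node()/`) plus the `[position()=N]` shorthand, and it assumes these tokens do not occur inside string literals.

```python
import re


def abbreviate(path):
    """Rewrite common unabbreviated XPath 1.0 forms into abbreviated syntax."""
    # '//' stands for '/descendant-or-self::node()/'
    path = path.replace("/descendant-or-self::node()/", "//")
    # '..' stands for 'parent::node()'
    path = path.replace("parent::node()", "..")
    # '.' stands for 'self::node()'; the lookbehind avoids touching
    # a leftover 'descendant-or-self::node()'.
    path = re.sub(r"(?<![\w-])self::node\(\)", ".", path)
    # '@' stands for 'attribute::'
    path = path.replace("attribute::", "@")
    # child is the default axis, so 'child::' can simply be dropped
    path = path.replace("child::", "")
    # '[position()=N]' can be written '[N]'
    path = re.sub(r"\[\s*position\(\)\s*=\s*(\d+)\s*\]", r"[\1]", path)
    return path


short_form = abbreviate("/child::Book/child::Chapter/attribute::number")
print(short_form)
```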

XPath 1.0 Axes
XPath 1.0 has a total of 13 axes, which are used to navigate the node tree of the XPath data model:
- child axis
- attribute axis
- ancestor axis
- ancestor-or-self axis
- descendant axis
- descendant-or-self axis
- following axis
- following-sibling axis
- namespace axis (not used in XQuery, and deprecated in XPath 2.0)
- parent axis
- preceding axis
- preceding-sibling axis
- self axis

Child Axis
- The child axis is the default axis in XPath.
- The child axis selects nodes that are immediate child nodes of the context node.

Example:
    <Invoice>
      <Date> </Date>
      <Item>QD123</Item>
      <Item>AC345</Item>
    </Invoice>

- With the Invoice element as the context node, the location path child::Item (or simply Item) returns a node-set containing both Item element nodes, which are child nodes of the Invoice element.

Child axis cont…
- To select both the Date element node and the Item element nodes, use child::* or, in abbreviated syntax, *
- The * indicates any name, and the only nodes in the child axis that have names are element nodes.
- To select all child nodes, including comment nodes, processing instruction nodes, and text nodes, use child::node() or node()
- Since child is the default axis, /child::Book/child::Chapter/child::Section and /Book/Chapter/Section mean the same thing.

Child axis cont…
- To specifically select the text node children of a context node, use child::text() or text()
- Because child is the default axis, it is not necessary to write the child axis when using abbreviated syntax
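The child-axis selections can be tried with Python's `xml.etree.ElementTree`, which supports a limited XPath subset evaluated relative to an element. This is a sketch under assumptions: the `Date` value is invented (it is blank in the example), and ElementTree has no `text()` node test, so the text child is read through the `.text` attribute instead.

```python
import xml.etree.ElementTree as ET

# The Date value is an assumption for illustration only.
INVOICE_XML = (
    "<Invoice>"
    "<Date>2010-08-01</Date>"
    "<Item>QD123</Item>"
    "<Item>AC345</Item>"
    "</Invoice>"
)

invoice = ET.fromstring(INVOICE_XML)   # context node: the Invoice element

items = invoice.findall("Item")        # child::Item
all_children = invoice.findall("*")    # child::* (element children only)
item_texts = [item.text for item in items]   # text content of each Item

print([child.tag for child in all_children])
print(item_texts)
```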

attribute Axis (notice that all axis names begin with lowercase letters)
- With an element node as the context node, the location paths attribute::* and @* each return all the attribute nodes associated with that element node
- To select a specific attribute node named security, use attribute::security or @security
- The @ character is an abbreviation for the attribute axis.
- If the context node is not an element node, the attribute axis returns an empty node-set.
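A short Python sketch of the same idea, using `xml.etree.ElementTree`. The `Report` document is hypothetical; only the attribute name `security` comes from the slide. ElementTree does not return attribute nodes from a path, so the attribute axis is approximated with the element's `attrib` mapping.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; only the attribute name "security" is from the text.
DOC = '<Report security="high" version="2"><Title>Q3</Title></Report>'

report = ET.fromstring(DOC)

all_attributes = dict(report.attrib)           # roughly attribute::* or @*
security = report.get("security")              # roughly attribute::security or @security
child_tags = [child.tag for child in report]   # attributes are not children

print(all_attributes, security, child_tags)
```

The last line shows the point made in the data model: the attributes belong to `Report`, yet they do not appear among its children.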

Using Child and Attribute Axes

PersonData.xml
    <PersonData>
      <Name>
        <FirstName>Jack</FirstName>
        <LastName>Slack</LastName>
      </Name>
    </PersonData>

PersonData.xslt
    <xsl:stylesheet xmlns:xsl=" version="1.0" >
    <html><head>
    Information about
    </title></head><body>
    <xsl:value-of select="/PersonData/Name/FirstName"/>
    </xsl:text>
    was born on
    </body></html></xsl:template></xsl:stylesheet>

PersonData.html
    <html><head>
    Information about Jack Slack
    </head><body>
    Jack Slack was born on 1920/11/25
    </body></html>

ancestor Axis
- The ancestor axis selects the parent node of the context node, the parent of that node, and so on, up to and including the root node of the document.
- If the context node is the root node, the ancestor axis returns an empty node-set.

    <Book>
    This is the first section.
    This is the second section.
    </Chapter>
    </Book>

- With a Section element node as the context node, ancestor::* returns the Chapter element node (which has a number attribute node with a value of 1) and the Book element node. Because * matches only element nodes, the root node is returned only by ancestor::node().
- There is no way to express the ancestor axis using abbreviated syntax.

ancestor-or-self Axis
- The ancestor-or-self axis includes all nodes in the ancestor axis plus the context node (which is in the self axis).
- Using the document from the ancestor axis section and the same context node, the location path ancestor::Section returns an empty node-set, because no ancestor element node is named Section.
- The location path ancestor-or-self::Section returns the Section element node, which is the context node.
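Python's `xml.etree.ElementTree` keeps no parent pointers and has no ancestor axis, so the sketch below emulates both axes with a child-to-parent map. The document markup is an assumed reconstruction (`Book`, `Chapter` with `number="1"`, two `Section` elements), and `ancestors` is a hypothetical helper. It covers element nodes only, so the XPath root node is not included.

```python
import xml.etree.ElementTree as ET

# Assumed markup for the Book example.
BOOK_XML = (
    '<Book><Chapter number="1">'
    "<Section>This is the first section.</Section>"
    "<Section>This is the second section.</Section>"
    "</Chapter></Book>"
)


def ancestors(root, node, include_self=False):
    """Element ancestors of node, nearest first (ancestor / ancestor-or-self)."""
    parent_of = {child: parent for parent in root.iter() for child in parent}
    result = [node] if include_self else []
    while node in parent_of:
        node = parent_of[node]
        result.append(node)
    return result


book = ET.fromstring(BOOK_XML)
section = book.find("Chapter/Section")   # context node: the first Section

ancestor_tags = [e.tag for e in ancestors(book, section)]
# ancestor::Section
ancestor_sections = [e for e in ancestors(book, section) if e.tag == "Section"]
# ancestor-or-self::Section
ancestor_or_self_sections = [
    e for e in ancestors(book, section, include_self=True) if e.tag == "Section"
]

print(ancestor_tags)
```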

descendant Axis
- The descendant axis selects the child nodes of the context node, the child nodes of those child nodes, and so on.

Example:
    <Invoices>
      <Invoice>
        <Date> </Date>
        <Item>KDH987</Item>
        <Item>DSE355</Item>
      </Invoice>
      <Invoice>
        <Date> </Date>
        <Item>RAH198</Item>
        <Item>DJE385</Item>
      </Invoice>
    </Invoices>

descendant Axis
- If the Invoices element node is the context node, the location path descendant::* selects both Invoice element nodes, both Date element nodes, and all Item element nodes.
- Location paths that use the descendant axis can be expressed only in unabbreviated syntax.
- Only the root node and element nodes have descendants; for any other context node the descendant axis is empty.
- The descendant axis in an absolute location path: /descendant::Item

descendant-or-self Axis
- The descendant-or-self axis includes all the nodes in the descendant axis plus the context node (which is contained in the self axis).
- The abbreviation // is short for /descendant-or-self::node()/.
- This enables you to find nodes irrespective of their position.
- This flexibility comes at a price, as the processor may need to do an extensive recursive search of the document tree.
- Use this form only when the exact path is unknown.
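The two descendant axes map fairly directly onto `xml.etree.ElementTree`: `.//*` behaves like `descendant::*` from the element it is called on, and `Element.iter()` walks the element itself plus all descendant elements. A minimal sketch, with invented `Date` values because the example leaves them blank:

```python
import xml.etree.ElementTree as ET

# Date values are assumptions; the Item codes come from the example.
INVOICES_XML = (
    "<Invoices>"
    "<Invoice><Date>2010-08-01</Date><Item>KDH987</Item><Item>DSE355</Item></Invoice>"
    "<Invoice><Date>2010-08-02</Date><Item>RAH198</Item><Item>DJE385</Item></Invoice>"
    "</Invoices>"
)

invoices = ET.fromstring(INVOICES_XML)   # context node: the Invoices element

# descendant::* (every element below the context node, in document order)
descendant_tags = [e.tag for e in invoices.findall(".//*")]

# descendant-or-self::* (the context element plus its descendant elements)
descendant_or_self_tags = [e.tag for e in invoices.iter()]

# .//Item, the abbreviated descendant-or-self::node()/child::Item
item_codes = [e.text for e in invoices.findall(".//Item")]

print(descendant_tags)
print(item_codes)
```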

following Axis
- The following axis contains all nodes that come after the context node in document order, but excludes all descendant nodes and any attribute nodes and namespace nodes associated with the context node.

Employees.xml:
    <Employees>
      <Person>
        <FirstName>Lara</FirstName>
        <LastName>Farmer</LastName>
        <DateOfBirth> </DateOfBirth>
      </Person>
      <Person>
        <FirstName>Patrick</FirstName>
        <LastName>Stepfoot</LastName>
        <DateOfBirth> </DateOfBirth>
      </Person>
      <Person>
        <FirstName>Angela</FirstName>
        <LastName>Paris</LastName>
        <DateOfBirth> </DateOfBirth>
      </Person>
    </Employees>

Employees.xslt
    <html><head>
    This demonstrates the following axis.
    </head><body>
    Following axis demo.
    </body></html></xsl:template>
    which contains the text “ ”.
    </xsl:for-each></xsl:template></xsl:stylesheet>

following-sibling Axis
- The following-sibling axis includes any nodes in the following axis that share their parent node with the context node.

    <html><head>
    This demonstrates the following-sibling axis.
    </head><body>
    Following-sibling axis demo.
    </body></html></xsl:template>
    which contains the text “ ”.
    </xsl:for-each></xsl:template></xsl:stylesheet>
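Neither the following axis nor the following-sibling axis is available in `xml.etree.ElementTree`, so this Python sketch computes them by hand for element nodes. The helpers `following` and `following_siblings` are hypothetical, and the `DateOfBirth` elements are left empty as in the Employees.xml listing.

```python
import xml.etree.ElementTree as ET

EMPLOYEES_XML = (
    "<Employees>"
    "<Person><FirstName>Lara</FirstName><LastName>Farmer</LastName><DateOfBirth/></Person>"
    "<Person><FirstName>Patrick</FirstName><LastName>Stepfoot</LastName><DateOfBirth/></Person>"
    "<Person><FirstName>Angela</FirstName><LastName>Paris</LastName><DateOfBirth/></Person>"
    "</Employees>"
)


def following(root, node):
    """Elements after node in document order, excluding its descendants."""
    in_order = list(root.iter())
    own_subtree = set(node.iter())        # node itself and its descendants
    start = in_order.index(node)
    return [e for e in in_order[start + 1:] if e not in own_subtree]


def following_siblings(root, node):
    """Elements after node that share its parent."""
    for parent in root.iter():
        children = list(parent)
        if node in children:
            return children[children.index(node) + 1:]
    return []                             # node is the root element


employees = ET.fromstring(EMPLOYEES_XML)
first_person = employees.find("Person")  # context node: Lara Farmer's Person

following_tags = [e.tag for e in following(employees, first_person)]
sibling_names = [p.findtext("FirstName") for p in following_siblings(employees, first_person)]

print(following_tags)
print(sibling_names)
```

The following axis picks up the later `Person` elements and everything inside them, while following-sibling stops at the two sibling `Person` elements.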

namespace Axis
- The namespace axis is used to select namespace nodes.
- An element node has a separate namespace node for each in-scope namespace.

    Some text.
    Some more text.
    </xmml:Book>

- Two namespace nodes are associated with the xmml:Book element node.

xmmlBook.xslt
    <xsl:stylesheet version="1.0" xmlns:xsl=" xmlns:xmml=" >
    <html><head>
    This shows namespace nodes.
    </head><body>
    Namespace nodes of the xmml:Book element.
    </body></html></xsl:template>
    The namespace prefix has the namespace URI <xsl:value-of select="." />.
    </xsl:for-each></xsl:template></xsl:stylesheet>

parent Axis
- The parent axis is used to select the parent node of the context node.

    <Parts> </Parts>

- If the context node is a Part element node, the location path parent::node() selects its parent node, which is the Parts element node.
- The abbreviated syntax for the parent axis is ..
- The root node is the only node without a parent.
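`xml.etree.ElementTree` understands the `..` abbreviation inside a path, which allows a small Python sketch of the parent axis. The content of the `Part` elements is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical Part contents; only the element names come from the example.
PARTS_XML = "<Parts><Part>Bolt</Part><Part>Nut</Part></Parts>"

parts = ET.fromstring(PARTS_XML)

# Part/.. : step to each Part child, then to its parent (duplicates are merged)
parents = parts.findall("Part/..")
parent_tags = [p.tag for p in parents]

# ElementTree cannot climb above the element the search started from,
# so '..' evaluated at the starting element finds nothing.
above_start = parts.find("..")

print(parent_tags, above_start)
```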

preceding Axis
- The preceding axis contains all nodes that come before the context node in document order, excluding nodes in the ancestor axis and attribute and namespace nodes.
- The example uses the same Employees.xml document as the following axis.

Employees.xslt
    This demonstrates the preceding axis.
    Preceding axis demo.
    which contains the text “ ”.

preceding-sibling Axis
- The preceding-sibling axis includes those nodes that are in the preceding axis and that also share a parent node with the context node.
- The example uses the same Employees.xml document as the following axis.

Employees.xslt
    This demonstrates the preceding axis.
    Preceding axis demo.
    which contains the text “ ”.

self Axis
- Selects the context node
- The abbreviated syntax for the context node is . (a single period)
- To select the value of the context node using the xsl:value-of element: <xsl:value-of select="." />
- The unabbreviated syntax: self::node()

XPath 1.0 Functions
- Boolean functions: boolean(), false(), lang(), not(), true()
- Node-set functions: count(), id(), last(), local-name(), name(), namespace-uri(), position()
- Numeric functions: ceiling(), floor(), number(), round(), sum()
- String functions: concat(), contains(), normalize-space(), starts-with(), string(), string-length(), substring(), substring-after(), substring-before(), translate()
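The behaviour of a few of the string functions is easy to misremember, so here is a Python sketch of what they compute. These are plain-Python analogues written for illustration (the function names are hypothetical), not an XPath implementation: they follow the XPath 1.0 descriptions of substring-before(), substring-after(), normalize-space(), and translate().

```python
import re


def substring_before(s, t):
    """Part of s before the first occurrence of t, or '' if t is absent."""
    if not t:
        return ""
    head, sep, _ = s.partition(t)
    return head if sep else ""


def substring_after(s, t):
    """Part of s after the first occurrence of t, or '' if t is absent."""
    if not t:
        return s
    _, sep, tail = s.partition(t)
    return tail if sep else ""


def normalize_space(s):
    """Strip leading/trailing whitespace and collapse internal runs to one space."""
    return re.sub(r"[ \t\r\n]+", " ", s).strip(" ")


def translate(s, src, dst):
    """Replace characters of src with the character at the same index in dst.

    Characters of src with no counterpart in dst are removed.
    """
    mapping = {}
    for i, ch in enumerate(src):
        if ch not in mapping:               # the first occurrence in src wins
            mapping[ch] = dst[i] if i < len(dst) else ""
    return "".join(mapping.get(ch, ch) for ch in s)


print(substring_before("1999/04/01", "/"))
print(translate("--aaa--", "abc-", "ABC"))
```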

Predicates
- Predicates are used to filter the node-sets selected by the axis and node test of a location step
- A predicate is optional in each location step of an XPath expression
- There can be more than one predicate in one location step
- Each predicate selects only from nodes that are already selected

Structure of XPath Expressions
- XPath expressions that select node-sets are called location paths
- A location path is made up of location steps
- Each location step is made up of three parts:
  - An axis: present in every location step (the child axis is implied when none is written)
  - A node test: specifies which nodes on the axis are to be selected
  - An optional predicate: used for filtering node-sets

Structure of XPath Expressions
- child::Paragraph[position()=2]
  - the axis is child, the node test is Paragraph, and the predicate is [position()=2]
- /child::Book/child::Chapter
  - this location path has two location steps
- /child::Book/child::Chapter/child::Section
  - this location path has three location steps
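The axis / node test / predicate anatomy can be illustrated with a small Python sketch that splits a location path into its parts. `parse_step` and `parse_path` are hypothetical helpers, and the approach is deliberately naive: it assumes predicates contain no `/`, no nested brackets, and no string literals, so it is not a substitute for a real XPath parser.

```python
import re

# axis:: (optional), then the node test, then zero or more [predicates]
STEP = re.compile(
    r"^(?:(?P<axis>[a-z-]+)::)?(?P<test>[^\[\]]+)(?P<preds>(?:\[[^\[\]]*\])*)$"
)


def parse_step(step):
    """Split one location step into (axis, node test, list of predicates)."""
    match = STEP.match(step)
    if match is None:
        raise ValueError(f"cannot parse location step: {step!r}")
    axis = match.group("axis") or "child"      # child is the default axis
    node_test = match.group("test")
    if node_test.startswith("@"):              # '@' abbreviates attribute::
        axis, node_test = "attribute", node_test[1:]
    predicates = re.findall(r"\[([^\[\]]*)\]", match.group("preds"))
    return axis, node_test, predicates


def parse_path(path):
    """Split a simple location path on '/' and parse each step."""
    return [parse_step(step) for step in path.strip("/").split("/") if step]


step = parse_step("child::Paragraph[position()=2]")
steps = parse_path("/child::Book/child::Chapter/child::Section")
print(step)
print(len(steps))
```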

Structure of XPath Expressions
- /child::Book/child::Chapter[position()=3]/child::Section
- /Book/node()
  - selects all child nodes of the Book element node
- /Book/Chapter[1]/Section[2]/Paragraph[3]
  - selects the 3rd paragraph in the 2nd section of the 1st chapter
- /child::Book/child::Chapter[position()=1]/child::Section[position()=2]/child::Paragraph[position()=3]
  - the same location path in unabbreviated syntax
- Either syntax can be used
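Positional predicates can be checked with `xml.etree.ElementTree`, which supports `[N]` and `[last()]`. The document below is invented for illustration, with paragraph text that encodes chapter, section, and paragraph numbers. ElementTree evaluates paths relative to the element they are called on, so the leading `/Book` step is dropped when searching from the `Book` element.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: paragraph text is "chapter.section.paragraph".
BOOK_XML = (
    "<Book>"
    "<Chapter>"
    "<Section><Paragraph>1.1.1</Paragraph></Section>"
    "<Section>"
    "<Paragraph>1.2.1</Paragraph><Paragraph>1.2.2</Paragraph><Paragraph>1.2.3</Paragraph>"
    "</Section>"
    "</Chapter>"
    "<Chapter><Section><Paragraph>2.1.1</Paragraph></Section></Chapter>"
    "</Book>"
)

book = ET.fromstring(BOOK_XML)

# /Book/Chapter[1]/Section[2]/Paragraph[3], relative to the Book element
paragraph = book.find("Chapter[1]/Section[2]/Paragraph[3]")

# /Book/Chapter[last()]/Section
last_chapter_sections = book.findall("Chapter[last()]/Section")

print(paragraph.text)
print(len(last_chapter_sections))
```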