
1 XPath Aug ’10 – Dec ‘10

2 XPath
- XPath stands for XML Path Language.
- It is a technology for selecting a part or parts of an XML document to process.
- XPath was designed specifically for use with Extensible Stylesheet Language Transformations (XSLT) and with XML Pointer (XPointer).
- It is a query language for locating nodes and fragments in XML trees.

3 Why query XML?
- To extract parts of XML documents
- To transform documents into different forms:
  - another XML form
  - HTML (to display in a web browser)
- To relate or join parts of the same or different XML documents

4 XPath
- XPath uses path expressions to navigate in XML documents
- XPath contains a library of standard functions
- XPath is a major element in XSLT
- XPath is a W3C Recommendation

5 XML Document
- An XML document is a nested structure of start tags and end tags
- The document also contains processing instructions, comments, attributes, namespace declarations and the text content of elements
- Physically it is a sequence of Unicode characters; in that form it is said to be serialized
- It is more useful to model the logical structure of the XML document in a way that:
  - describes the logical components that make up the XML document
  - exposes those components for programmatic manipulation
- Hence the need for a formal model of the logical content of an XML document

6 Modeling XML Documents
- XPath Data Model
  - represents most parts of an XML document as a tree of nodes
  - root node, element node, attribute node, text node etc.
  - XML declarations and DOCTYPE declarations are not represented
- The Document Object Model (DOM)
  - hierarchical tree of nodes
  - the types of nodes in the DOM are different from those in XPath
- XML Information Set
  - the infoset represents an XML document as a tree of information items
  - each information item has properties
  - represents a pure version of the information held in an XML document

7 Visualizing XPath
- XPath can be thought of as directions around the hierarchical tree of nodes that makes up the XPath Data Model
- All legal XPath code is called an expression
- An XPath expression that returns a node-set is called a location path
- A direction in XPath is called an axis
- Absolute XPath expression: /Book/Chapter[@number=2]
- Relative XPath expression: Chapter[@number=2]
- The part of an expression in square brackets is a predicate
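As a rough sketch of the relative expression and predicate on this slide, the following uses Python's standard xml.etree.ElementTree module, which implements only a subset of XPath 1.0. The Book document is an assumed sample, not the original file. ElementTree evaluates paths relative to an element and needs attribute values quoted, so /Book/Chapter[@number=2] is written here as Chapter[@number='2'] with the Book element as the context node.

```python
import xml.etree.ElementTree as ET

# Assumed sample document, modeled on the Book/Chapter example in the slides.
BOOK_XML = """
<Book>
  <Chapter number="1">This is the first chapter</Chapter>
  <Chapter number="2">This is the second chapter</Chapter>
  <Chapter number="3">This is the third chapter</Chapter>
</Book>
"""

book = ET.fromstring(BOOK_XML)  # the Book document element acts as the context node

# Relative location path: child step "Chapter" filtered by the predicate in brackets.
matches = book.findall("Chapter[@number='2']")

for chapter in matches:
    print(chapter.tag, chapter.get("number"), chapter.text)
```

Changing the element passed to findall changes the context node, which is why the same relative expression can select different nodes in different places.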

8 Context
- The context indicates the location of the node where the processor is currently situated
- That node is called the context node
- The context also includes the context position and the context size
Book.xml: <Book><Chapter number="1">This is the first chapter</Chapter><Chapter number="2">This is the second chapter</Chapter><Chapter number="3">This is the third chapter</Chapter><Chapter number="4">This is the fourth chapter</Chapter><Chapter number="5">This is the fifth chapter</Chapter></Book>
XPath Expression: /Book/Chapter[@number=2]
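The context position and context size can be illustrated with a small sketch. ElementTree has no position() or last() functions, so this example emulates them with enumerate and len; the five-chapter document is an assumed sample built from the chapter text on this slide.

```python
import xml.etree.ElementTree as ET

# Assumed sample document with five Chapter elements.
BOOK_XML = """
<Book>
  <Chapter number="1">This is the first chapter</Chapter>
  <Chapter number="2">This is the second chapter</Chapter>
  <Chapter number="3">This is the third chapter</Chapter>
  <Chapter number="4">This is the fourth chapter</Chapter>
  <Chapter number="5">This is the fifth chapter</Chapter>
</Book>
"""

book = ET.fromstring(BOOK_XML)
chapters = book.findall("Chapter")   # the node-set being processed
context_size = len(chapters)         # what last() would report

report = []
# enumerate(..., start=1) plays the role of position(): XPath positions start at 1.
for context_position, context_node in enumerate(chapters, start=1):
    report.append((context_position, context_size, context_node.text))

for position, size, text in report:
    print(f"position {position} of {size}: {text}")
```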

9 XSLT
- A declarative programming language, written in XML, for converting XML to some other output
- The output is often XML or HTML
- XSLT uses XPath to select the parts of the source XML document that are used in the result document
- An XSLT file consists of a number of templates that match specific nodes in the XML being processed
- The standard way to select nodes for processing is the apply-templates instruction

10 XSLT with XPath
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
This shows the context position and context size.
Context position and context size demo.
When the context node is the second Chapter element node then the context position is and the context size is. The text the Chapter element node contains is ‘ ’.

11 HTML document created by stylesheet

12 XSLT Example XML Document
- Dark Side of the Moon, Pink Floyd, 10.90
- Space Oddity, David Bowie, 9.90
- Aretha: Lady Soul, Aretha Franklin, 9.90

13 XSLT Example Stylesheet
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
A CD Catalog
Title Artist

14 XSLT Example HTML Output

15 Node
- A node is a representation in the XPath data model of a logical part of an XML document.
- There are 7 types of nodes:
  - Root Node
  - Element Node
  - Attribute Node
  - Text Node
  - Namespace Node
  - Comment Node
  - Processing Instruction Node

16 Root Node
- The root node represents the document itself, independent of any content
- It is the apex of the hierarchy of nodes that represents an XML document
- The element node that represents the document element (the root element) is a child of the root node
- The root node can have only one child element node, the document element
- The root node can also have child nodes that are processing instruction nodes or comment nodes
- The string value of the root node is the concatenation of the values of all descendant text nodes of the root node, in document order
Example: for a document whose text content is "Mary had a little lamb.", the string value of the root node is "Mary had a little lamb."
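The string-value rule above can be sketched in Python. The markup around the sentence is hypothetical, since only the text survives in the slide. ElementTree does not expose a separate root node, so the sketch computes the string value of the document element, which is the same string because comments and processing instructions contribute no text nodes.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup around the sentence quoted in the slide.
RHYME_XML = "<Rhyme><Line>Mary had a <Em>little</Em> lamb.</Line></Rhyme>"

document_element = ET.fromstring(RHYME_XML)


def string_value(element):
    """Concatenate all descendant text nodes of element, in document order."""
    return "".join(element.itertext())


value = string_value(document_element)
print(value)
```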

17 Element Node
- Represents an element in the XML document
- Its name consists of a namespace URI (which may be null) and a local name; the namespace prefix stands in for the URI
- The string value of an element node is the concatenation of the values of all its descendant text nodes, in document order
Attribute Node
- Attributes in the XML document are represented as attribute nodes
- The element node with which an attribute node is associated is called the parent node of the attribute node
- Attribute nodes have a name and a value
- In XPath, attribute nodes are not children of their parent element
- They are accessed through the attribute axis, not the default child axis

18 Text Node
- The text content of an element node is represented as a text node
- The string value of a text node is its character data
- A text node does not have a name
Namespace Node
- The in-scope namespaces of an element node are represented as namespace nodes
- The name() function returns the namespace prefix associated with the namespace node
- The string value of the namespace node, obtained for example by selecting self::node(), is the namespace URI

19 Comment Node
- Represents a comment in an XML document
- Comments in the DOCTYPE declaration are not represented in the XPath Data Model
Processing Instruction Node
- Represents a processing instruction in the XML document
- Processing instructions in the DOCTYPE declaration are not represented in the XPath Data Model
- The name of the processing instruction node is its target
- The string value of the processing instruction node is its content, excluding the target

20 XPath 1.0 Types
- Four expression types:
  - boolean
  - node-set
  - number
  - string

21 Abbreviated and Unabbreviated Syntax
- Use the abbreviated syntax where possible; it makes expressions more concise and legible
- The syntax used in XPath is similar to the path syntax used for UNIX and Linux directories
- Abbreviated syntax: /Book/Chapter/@number
- Unabbreviated syntax: /child::Book/child::Chapter/attribute::number

22 XPath 1.0 Axes
XPath 1.0 has a total of 13 axes, which are used to navigate the node tree of the XPath data model:
- child axis
- attribute axis
- ancestor axis
- ancestor-or-self axis
- descendant axis
- descendant-or-self axis
- following axis
- following-sibling axis
- namespace axis (not used in XQuery, and deprecated in XPath 2.0)
- parent axis
- preceding axis
- preceding-sibling axis
- self axis

23 Child Axis
- The child axis is the default axis in XPath.
- The child axis selects nodes that are immediate child nodes of the context node.
Eg: <Invoice><Date>2004-01-02</Date><Item>QD123</Item><Item>AC345</Item></Invoice>
With the Invoice element as the context node, the location path child::Item, or Item in abbreviated syntax, returns a node-set containing both Item element nodes, which are child nodes of the Invoice element.

24 Child axis cont…
- To select both the Date element node and the Item element nodes: child::* or, in abbreviated syntax, *
- The * matches a node of any name; on the child axis it selects element nodes only.
- To select all child nodes, including comment nodes, processing instruction nodes, and text nodes: child::node() or node()
- Since child is the default axis, /child::Book/child::Chapter/child::Section and /Book/Chapter/Section both mean the same thing!

25 Child axis cont…
- To specifically select the text node children of a context node: child::text() or text()
- Because child is the default axis, it is not necessary to write the child axis when using abbreviated syntax
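The child-axis selections on these slides can be approximated with ElementTree, as a sketch only. The Invoice document below is an assumed reconstruction of the slide's example. ElementTree models text as a property of an element, not as a separate node, so text() is approximated by reading each element's .text.

```python
import xml.etree.ElementTree as ET

# Assumed reconstruction of the Invoice example from the slides.
INVOICE_XML = (
    "<Invoice><Date>2004-01-02</Date>"
    "<Item>QD123</Item><Item>AC345</Item></Invoice>"
)

invoice = ET.fromstring(INVOICE_XML)  # context node: the Invoice element

items = invoice.findall("Item")       # like child::Item (abbreviated: Item)
all_children = invoice.findall("*")   # like child::* (abbreviated: *)

# Rough stand-in for Item/text(): the text content of each Item child.
item_texts = [item.text for item in items]

print([child.tag for child in all_children])
print(item_texts)
```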

26 attribute Axis (notice that all axis names begin with lowercase letters!)
- With an element node as the context node, the location path attribute::* or @* returns all the attribute nodes associated with that element node
- To select a specific attribute node named security: attribute::security or @security
- The @ character is an abbreviation for the attribute axis.
- If the context node is not an element node, the attribute axis returns an empty node-set.
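A minimal sketch of the same idea in Python: ElementTree does not return attribute nodes from a path, so the attribute axis is approximated with the element's attrib mapping and get method. The Section element and its attribute values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical element with two attributes.
section = ET.fromstring(
    '<Section security="confidential" version="final">Some text.</Section>'
)

all_attributes = dict(section.attrib)  # roughly attribute::* or @*
security = section.get("security")     # roughly attribute::security or @security
missing = section.get("author")        # no such attribute: None, like an empty node-set

# Attributes are not children: the child axis of this element holds no elements.
children = list(section)

print(all_attributes, security, missing, children)
```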

27 Using Child and Attribute Axes
PersonData.xml: <PersonData> <FirstName>Jack</FirstName><LastName>Slack</LastName></Name></PersonData>

28 PersonData.xslt
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"> <html><head> Information about </title></head><body> <xsl:value-of select="/PersonData/Name/FirstName"/> </xsl:text> was born on </body></html></xsl:template></xsl:stylesheet>

29 PersonData.html
<html><head> Information about Jack Slack </head><body> Jack Slack was born on 1920/11/25 </body></html>

30 ancestor Axis
- The ancestor axis selects the parent node of the context node, the parent of that node, and so on up to and including the root node of the document.
- If the context node is the root node, the ancestor axis returns an empty node-set.
Eg: <Book><Chapter number="1"><Section>This is the first section.</Section><Section>This is the second section.</Section></Chapter></Book>
- With a Section element node as the context node, ancestor::node() returns the Chapter element node (which has a number attribute node with a value of 1), the Book element node, and the root node. ancestor::* returns only the two element nodes, because * does not match the root node.
- There is no way to express the ancestor axis using abbreviated syntax.

31 ancestor-or-self Axis
- The ancestor-or-self axis includes all nodes in the ancestor axis plus the context node (which is in the self axis).
- Using the document from the ancestor axis slide and the same context node, the location path ancestor::Section returns an empty node-set because no ancestor element node is named Section.
- The location path ancestor-or-self::Section returns the Section element node, which is the context node.
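ElementTree's path language has no ancestor axes, so the sketch below emulates ancestor::* and ancestor-or-self::* with a child-to-parent map. The document is an assumed reconstruction of the Book example, and the helper names are illustrative, not part of any library.

```python
import xml.etree.ElementTree as ET

# Assumed reconstruction of the Book/Chapter/Section example.
BOOK_XML = """
<Book>
  <Chapter number="1">
    <Section>This is the first section.</Section>
    <Section>This is the second section.</Section>
  </Chapter>
</Book>
"""

book = ET.fromstring(BOOK_XML)

# Elements keep no parent pointer, so build a child -> parent lookup once.
parent_of = {child: parent for parent in book.iter() for child in parent}


def ancestors(node):
    """Element ancestors of node, nearest first (emulates ancestor::*)."""
    found = []
    while node in parent_of:
        node = parent_of[node]
        found.append(node)
    return found


def ancestors_or_self(node):
    """The node itself followed by its ancestors (emulates ancestor-or-self::*)."""
    return [node] + ancestors(node)


context = book.find("Chapter/Section")  # first Section element is the context node

ancestor_tags = [a.tag for a in ancestors(context)]
ancestor_sections = [a for a in ancestors(context) if a.tag == "Section"]
self_sections = [a for a in ancestors_or_self(context) if a.tag == "Section"]

print(ancestor_tags, len(ancestor_sections), len(self_sections))
```

The root node has no counterpart in this emulation, so only element ancestors are listed.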

32 descendant Axis
- The descendant axis selects the child nodes of the context node, the child nodes of those child nodes, and so on.
Eg: <Invoices><Invoice><Date>2004-01-01</Date><Item>KDH987</Item><Item>DSE355</Item></Invoice><Invoice><Date>2004-01-01</Date><Item>RAH198</Item><Item>DJE385</Item></Invoice></Invoices>

33 descendant Axis
- If the Invoices element node is the context node, the location path descendant::* selects both Invoice element nodes, both Date element nodes, and all Item element nodes.
- The descendant axis itself can be written only in unabbreviated syntax.
- Only element nodes and the root node have descendants; for any other context node the descendant axis is empty.
- Descendant axis in an absolute location path: /descendant::Item

34 descendant-or-self Axis
- The descendant-or-self axis includes all the nodes in the descendant axis plus the context node (which is contained in the self axis).
- The abbreviation // is short for /descendant-or-self::node()/.
- This enables you to find nodes irrespective of their position.
- This flexibility comes at a price: the processor may need to search the whole document tree recursively.
- Use this form of XPath only when the exact path is unknown.
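As an illustration of the descendant and descendant-or-self axes, the sketch below runs against the Invoices document from the descendant axis slide. In ElementTree, .//* reaches every descendant element of the context element, while Element.iter() also yields the context element itself, which resembles descendant-or-self::*.

```python
import xml.etree.ElementTree as ET

# The Invoices document from the descendant axis slide.
INVOICES_XML = (
    "<Invoices>"
    "<Invoice><Date>2004-01-01</Date><Item>KDH987</Item><Item>DSE355</Item></Invoice>"
    "<Invoice><Date>2004-01-01</Date><Item>RAH198</Item><Item>DJE385</Item></Invoice>"
    "</Invoices>"
)

invoices = ET.fromstring(INVOICES_XML)  # context node: the Invoices element

descendants = invoices.findall(".//*")      # like descendant::*
descendant_or_self = list(invoices.iter())  # like descendant-or-self::*
items = invoices.findall(".//Item")         # the // abbreviation, then child Item

item_codes = [item.text for item in items]

print(len(descendants), len(descendant_or_self), item_codes)
```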

35 following Axis
- The following axis contains all nodes that come after the context node in document order, but excludes all descendant nodes and any attribute nodes and namespace nodes associated with the context node.
Employees.xml: <Employees><Person><FirstName>Lara</FirstName><LastName>Farmer</LastName><DateOfBirth>1944-12-12</DateOfBirth></Person><Person><FirstName>Patrick</FirstName><LastName>Stepfoot</LastName><DateOfBirth>1955-11-11</DateOfBirth></Person><Person><FirstName>Angela</FirstName><LastName>Paris</LastName><DateOfBirth>1980-10-10</DateOfBirth></Person></Employees>
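ElementTree does not implement the following axis, so this is only an emulation of the definition above, restricted to element nodes: take the elements in document order, keep those after the context node, and drop the context node's own descendants. The helper name following is illustrative.

```python
import xml.etree.ElementTree as ET

# Employees.xml as shown on the slide.
EMPLOYEES_XML = (
    "<Employees>"
    "<Person><FirstName>Lara</FirstName><LastName>Farmer</LastName>"
    "<DateOfBirth>1944-12-12</DateOfBirth></Person>"
    "<Person><FirstName>Patrick</FirstName><LastName>Stepfoot</LastName>"
    "<DateOfBirth>1955-11-11</DateOfBirth></Person>"
    "<Person><FirstName>Angela</FirstName><LastName>Paris</LastName>"
    "<DateOfBirth>1980-10-10</DateOfBirth></Person>"
    "</Employees>"
)

employees = ET.fromstring(EMPLOYEES_XML)


def following(root, context):
    """Elements after context in document order, excluding its descendants."""
    in_order = list(root.iter())       # iter() walks the tree in document order
    own_subtree = set(context.iter())  # the context node plus its descendants
    start = in_order.index(context)
    return [node for node in in_order[start + 1:] if node not in own_subtree]


context = employees.findall("Person")[1]  # the second Person element
after = following(employees, context)

print([(node.tag, node.text) for node in after])
```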

36 Employees.xslt
<html><head> This demonstrates the following axis. </head><body> Following axis demo. </body></html></xsl:template> which contains the text “ ”. </xsl:for-each></xsl:template></xsl:stylesheet>


38 following-sibling Axis
- The following-sibling axis includes any nodes in the following axis that share their parent node with the context node.
<html><head> This demonstrates the following-sibling axis. </head><body> Following-sibling axis demo. </body></html></xsl:template> which contains the text “ ”. </xsl:for-each></xsl:template></xsl:stylesheet>


40 namespace Axis
- The namespace axis is used to select namespace nodes.
- An element node has a separate namespace node for each in-scope namespace.
Eg: Some text. Some more text. </xmml:Book>
- Two namespace nodes are associated with the xmml:Book element node:
  - http://www.XMML.com/namespaces
  - http://www.w3.org/XML/1998/namespace

41 xmmlBook.xslt
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xmml="http://www.XMML.com/namespaces"> <html><head> This shows namespace nodes. </head><body> Namespace nodes of the xmml:Book element. </body></html></xsl:template> The namespace prefix has the namespace URI <xsl:value-of select="." />. </xsl:for-each></xsl:template></xsl:stylesheet>


43 parent Axis
- The parent axis is used to select the parent node of the context node.
Eg: <Parts> </Parts>
- If the context node is a Part element node, the location path parent::node() selects its parent node, which is the Parts element node.
- Abbreviated syntax for the parent axis: ..
- The root node is the only node without a parent
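ElementTree supports the .. abbreviation, so the parent step can be sketched directly. The Part elements and their number attributes below are hypothetical, since the slide's markup did not survive. Note that ElementTree cannot step above the element on which the search starts, which loosely mirrors the rule that the root has no parent.

```python
import xml.etree.ElementTree as ET

# Hypothetical Parts document; the attribute values are made up for illustration.
PARTS_XML = '<Parts><Part number="A1"/><Part number="B2"/></Parts>'

parts = ET.fromstring(PARTS_XML)

# Step down to each Part child, then back up with "..": both share one parent,
# and ElementTree reports that parent once.
parents = parts.findall("Part/..")

# Starting from the top element there is nothing above it to select.
above_top = parts.findall("..")

print([p.tag for p in parents], above_top)
```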

44 preceding Axis
- The preceding axis contains all nodes that come before the context node in document order, excluding nodes in the ancestor axis and attribute and namespace nodes.
Employees.xml: <Employees><Person><FirstName>Lara</FirstName><LastName>Farmer</LastName><DateOfBirth>1944-12-12</DateOfBirth></Person><Person><FirstName>Patrick</FirstName><LastName>Stepfoot</LastName><DateOfBirth>1955-11-11</DateOfBirth></Person><Person><FirstName>Angela</FirstName><LastName>Paris</LastName><DateOfBirth>1980-10-10</DateOfBirth></Person></Employees>

45 Employees.xslt
This demonstrates the preceding axis. Preceding axis demo. which contains the text “ ”.


47 preceding-sibling Axis
- The preceding-sibling axis includes those nodes that are in the preceding axis and that also share a parent node with the context node.
Employees.xml: <Employees><Person><FirstName>Lara</FirstName><LastName>Farmer</LastName><DateOfBirth>1944-12-12</DateOfBirth></Person><Person><FirstName>Patrick</FirstName><LastName>Stepfoot</LastName><DateOfBirth>1955-11-11</DateOfBirth></Person><Person><FirstName>Angela</FirstName><LastName>Paris</LastName><DateOfBirth>1980-10-10</DateOfBirth></Person></Employees>
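The difference between the preceding and preceding-sibling axes can be sketched by emulating both over element nodes, since ElementTree implements neither. The helper names are illustrative, and results are returned in document order for readability.

```python
import xml.etree.ElementTree as ET

# Employees.xml as shown on the slide.
EMPLOYEES_XML = (
    "<Employees>"
    "<Person><FirstName>Lara</FirstName><LastName>Farmer</LastName>"
    "<DateOfBirth>1944-12-12</DateOfBirth></Person>"
    "<Person><FirstName>Patrick</FirstName><LastName>Stepfoot</LastName>"
    "<DateOfBirth>1955-11-11</DateOfBirth></Person>"
    "<Person><FirstName>Angela</FirstName><LastName>Paris</LastName>"
    "<DateOfBirth>1980-10-10</DateOfBirth></Person>"
    "</Employees>"
)

employees = ET.fromstring(EMPLOYEES_XML)
parent_of = {child: parent for parent in employees.iter() for child in parent}


def preceding_siblings(context):
    """Earlier element children of the same parent (emulates preceding-sibling::*)."""
    parent = parent_of.get(context)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[:siblings.index(context)]


def preceding(root, context):
    """Elements before context in document order, minus its ancestors."""
    ancestors = set()
    node = context
    while node in parent_of:
        node = parent_of[node]
        ancestors.add(node)
    in_order = list(root.iter())
    return [n for n in in_order[:in_order.index(context)] if n not in ancestors]


# Context node: the LastName element of the second Person.
context = employees.findall("Person")[1].find("LastName")

sibling_tags = [n.tag for n in preceding_siblings(context)]
preceding_tags = [n.tag for n in preceding(employees, context)]

print(sibling_tags)
print(preceding_tags)
```

The preceding list is longer because it also reaches into the first Person element, whose nodes come earlier in the document but do not share the context node's parent.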

48 Employees.xslt
This demonstrates the preceding axis. Preceding axis demo. which contains the text “ ”.


50 self Axis
- Selects the context node
- The abbreviated syntax for the context node is . (a period)
- To select the value of the context node using the xsl:value-of element: <xsl:value-of select="."/>
- The unabbreviated syntax: self::node()

51 XPath 1.0 Functions
- Boolean Functions
- Node-Set Functions
- Numeric Functions
- String Functions

52 XPath 1.0 Functions
- Boolean Functions
  - boolean()
  - false()
  - lang()
  - not()
  - true()

53 XPath 1.0 Functions
- Node-Set Functions
  - count()
  - id()
  - last()
  - local-name()
  - name()
  - namespace-uri()
  - position()

54 XPath 1.0 Functions
- Numeric Functions
  - ceiling()
  - floor()
  - number()
  - round()
  - sum()

55 XPath 1.0 Functions
- String Functions
  - concat()
  - contains()
  - normalize-space()
  - starts-with()
  - string()
  - string-length()
  - substring()
  - substring-after()
  - substring-before()
  - translate()

56 Predicates
- Predicates are used to filter the node-set selected by the axis and node test of a location step
- A predicate is optional in each location step of an XPath expression
- There can be more than one predicate in one location step:
  //Section[@security="confidential"]
  //Section[@security="public"][@version="final"]
- Each predicate selects only from the nodes that are already selected
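The two predicate expressions above can be tried with ElementTree, which accepts several predicates on one step. The document and its attribute values are hypothetical. The leading . is needed because ElementTree evaluates paths relative to an element.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with security and version attributes on each Section.
BOOK_XML = """
<Book>
  <Section security="confidential" version="draft">A</Section>
  <Section security="public" version="final">B</Section>
  <Section security="public" version="draft">C</Section>
</Book>
"""

book = ET.fromstring(BOOK_XML)

# One predicate: keep only the confidential sections.
confidential = book.findall(".//Section[@security='confidential']")

# Two predicates: the second filters what the first one already selected.
public_final = book.findall(".//Section[@security='public'][@version='final']")

confidential_texts = [s.text for s in confidential]
public_final_texts = [s.text for s in public_final]

print(confidential_texts, public_final_texts)
```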

57 Structure of XPath Expressions
- XPath expressions that select node-sets are called location paths
- A location path is made up of location steps
- Each location step is made up of three parts:
  - An axis: present in every location step (the child axis is implied when none is written)
  - A node test: specifies which nodes in the axis are to be selected
  - An optional predicate: used for filtering node-sets

58 Structure of XPath Expressions
- child::Paragraph[position()=2]: the axis is child, the node test is Paragraph, the predicate is [position()=2]
- /child::Book/child::Chapter: this location path has two location steps
- /child::Book/child::Chapter/child::Section: this location path has three location steps

59 Structure of XPath Expressions
- /child::Book/child::Chapter[position()=3]/child::Section
- /Book/node() selects all child nodes of the Book element node
- /Book/Chapter[1]/Section[2]/Paragraph[3] selects the 3rd paragraph in the 2nd section of the 1st chapter
- /child::Book/child::Chapter[position()=1]/child::Section[position()=2]/child::Paragraph[position()=3] is the same location path in unabbreviated syntax
- Both syntaxes can be used
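The positional path above can be checked with a small sketch. ElementTree supports numeric predicates such as [3] but not the position() function, so only the abbreviated form is used. The document is a hypothetical one whose paragraph text records where each paragraph sits, and the Book element is the context node.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; each Paragraph's text names its chapter, section, position.
BOOK_XML = """
<Book>
  <Chapter>
    <Section><Paragraph>c1 s1 p1</Paragraph></Section>
    <Section>
      <Paragraph>c1 s2 p1</Paragraph>
      <Paragraph>c1 s2 p2</Paragraph>
      <Paragraph>c1 s2 p3</Paragraph>
    </Section>
  </Chapter>
  <Chapter>
    <Section><Paragraph>c2 s1 p1</Paragraph></Section>
  </Chapter>
</Book>
"""

book = ET.fromstring(BOOK_XML)

# Three location steps, each narrowed by a positional predicate (1-based).
target = book.find("Chapter[1]/Section[2]/Paragraph[3]")

# A step with no matching node yields nothing: chapter 2 has no second section.
absent = book.find("Chapter[2]/Section[2]/Paragraph[1]")

print(target.text, absent)
```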

