XML for .NET Session 1 Introduction to XML Introduction to XSLT

Programmatically Reading XML Documents Introduction to XPATH

XML Documents Can be Read Programmatically
The .NET Framework includes many classes that help with programmatically iterating through and navigating XML documents. These classes are found in the System.Xml namespace. The various classes in the System.Xml namespace are highlighted in Chapter 6 of the text, XML and ASP.NET (starting on p. 261).

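Accessing XML Content
XML documents can be accessed in one of two ways: with a push model or a pull model. In this course’s terminology, the pull model loads the entire XML document into memory and works with it only once it has been completely loaded, while the push model accesses only small pieces of the XML document as they are needed. (Be aware that Microsoft’s documentation uses these terms differently: it describes XmlReader as a pull-model parser, in contrast to push-model APIs such as SAX.)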

Comparing and Contrasting Push and Pull Approaches
Pull Model
Pluses: Quickly iterate and navigate through XML content once it’s fully loaded.
Minuses: Requires that the entire XML document be loaded into memory; does not scale to large XML content or a large number of users.

Push Model
Pluses: Allows for navigation and iteration of very large XML files.
Minuses: Difficult to add and update elements in the XML document.

How to use the Two Methods
The .NET Framework provides developers both methods:
Pull Method – use the DOM classes in the .NET Framework.
Push Method – use the XmlReader and XmlWriter classes.

Using the Pull Method
The System.Xml namespace contains a number of classes to work with XML documents in the DOM paradigm:
XmlDocument – represents an XML document.
XmlElement – represents an individual element in the DOM.
XmlAttribute – represents an attribute.
XmlText – represents text content.

Using the Push Method
The XmlReader reads one node at a time from a specified XML source, and it can only read in a FORWARD direction. The XmlReader class cannot be used directly; one of its derived classes must be used instead:
XmlNodeReader – reads one node at a time from an XML DOM.
XmlTextReader – reads one node at a time from an XML source, such as a file with XML content.
XmlValidatingReader – a reader that performs DTD or schema validation (more on this next week!).

Iterating through an XML Document using XmlTextReader
To iterate through the contents of an XML document with the XmlTextReader we need to:
1. Specify the XML document to iterate through when creating the XmlTextReader.
2. Call the Read() method, which reads in the next node.
3. Access the properties of the XmlTextReader to determine the name, value, and other information about the node just read.

Iterating through an XML Document using XmlTextReader
We can programmatically read through the contents of an XML file like so:

    // create an XmlTextReader to read the specified XML file
    XmlTextReader reader = new XmlTextReader(filepath);

    // now, display the information of each node in the TextBox
    while (reader.Read())
    {
        // access the properties of the XmlTextReader class...
        // like reader.Name, reader.NodeType, reader.Value, etc.
    }

    // close the XmlTextReader
    reader.Close();

What is a Node?
Recall that the XmlReader classes read XML nodes. What constitutes a node? Can you identify the nodes in the following XML fragment?

    <?xml version="1.0" encoding="utf-8" ?>
    <books>
      <book price="34.95">
        <title>Animal Farm</title>
        <authors>
          <author>Orwell</author>
        </authors>
      </book>
    </books>

What is a Node?

    <?xml version="1.0" encoding="utf-8" ?>
    <books>
      <book price="34.95">
        <title>Animal Farm</title>
        <authors>
          <author>Orwell</author>
        </authors>
      </book>
    </books>

The whitespace between each element (if present) is also considered a node! (Although, you can set the XmlTextReader’s WhitespaceHandling property to specify whether or not the reader should read whitespace nodes.)

What is a Node?

    <?xml version="1.0" encoding="utf-8" ?>
    <books>
      <book price="34.95">
        <title>Animal Farm</title>
        <authors>
          <author>Orwell</author>
        </authors>
      </book>
    </books>

Notice that the attributes of an element are not considered nodes...
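To make the node question concrete, here is a minimal C# sketch that reads the fragment above from an in-memory string and records every node the reader reports. The class and method names (NodeLister, ListNodes) are made up for illustration; only XmlTextReader, Read(), and WhitespaceHandling come from the slides.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class NodeLister
{
    // The books fragment from the slide, held in memory for the demo.
    public const string BooksXml =
        "<?xml version=\"1.0\" encoding=\"utf-8\" ?>\n" +
        "<books>\n" +
        "  <book price=\"34.95\">\n" +
        "    <title>Animal Farm</title>\n" +
        "    <authors>\n" +
        "      <author>Orwell</author>\n" +
        "    </authors>\n" +
        "  </book>\n" +
        "</books>";

    // Returns one "NodeType: Name = Value" line per node the reader visits.
    public static List<string> ListNodes(string xml, WhitespaceHandling handling)
    {
        var lines = new List<string>();
        using (var reader = new XmlTextReader(new StringReader(xml)))
        {
            // Controls whether whitespace between elements is reported as nodes.
            reader.WhitespaceHandling = handling;
            while (reader.Read())
                lines.Add(reader.NodeType + ": " + reader.Name + " = " + reader.Value);
        }
        return lines;
    }

    public static void Main()
    {
        foreach (string line in ListNodes(BooksXml, WhitespaceHandling.None))
            Console.WriteLine(line);
    }
}
```

Calling ListNodes with WhitespaceHandling.All instead should produce a longer list, because the line breaks and indentation between the elements are then reported as nodes of their own. In both cases the price attribute never shows up as a separate entry.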

Creating a Program to View the Content Read by an XmlTextReader
We can create a program that allows the user to select an XML file; then, the contents of the XML file are read by an XmlTextReader, with each read node’s name, type, and value displayed. (Run demo!)

Reading the Attributes
As we saw in the demo, the attributes are not read as a separate node. We can determine whether or not a given node has attributes by the HasAttributes property. In order to programmatically access the attributes of a node, we must use the MoveToNextAttribute() method of the XmlTextReader.

Reading the Attributes
    // C#
    while (reader.Read())
    {
        if (reader.HasAttributes)
        {
            while (reader.MoveToNextAttribute())
            {
                // Access the attribute name/value via
                // reader.Name/reader.Value
            }
        }
    }

    ' VB.NET
    While reader.Read()
        If reader.HasAttributes Then
            While reader.MoveToNextAttribute()
                ' Access the attribute name/value via
                ' reader.Name/reader.Value
            End While
        End If
    End While
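A runnable version of the C# loop might look like the sketch below. The AttributeLister class and its output format are assumptions made for the example. One detail the slide code does not show: the XML declaration node also reports attributes (version, encoding), so this sketch checks for element nodes first.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class AttributeLister
{
    // Returns "element/@attribute = value" for every attribute on every element.
    public static List<string> ListAttributes(string xml)
    {
        var found = new List<string>();
        using (var reader = new XmlTextReader(new StringReader(xml)))
        {
            while (reader.Read())
            {
                // Skip anything that is not an element with attributes.
                if (reader.NodeType != XmlNodeType.Element || !reader.HasAttributes)
                    continue;

                string elementName = reader.Name;
                while (reader.MoveToNextAttribute())
                    found.Add(elementName + "/@" + reader.Name + " = " + reader.Value);

                // Step back from the last attribute to the element that owns it.
                reader.MoveToElement();
            }
        }
        return found;
    }

    public static void Main()
    {
        string xml =
            "<?xml version=\"1.0\" encoding=\"utf-8\" ?>" +
            "<books><book price=\"34.95\"><title>Animal Farm</title></book></books>";
        foreach (string line in ListAttributes(xml))
            Console.WriteLine(line);
    }
}
```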

The XmlTextReader Properties and Methods
The properties and methods of the XmlTextReader are listed starting on p. 272 of the text. Some of the more germane methods include:
ReadInnerXml() – returns a string with the complete content (including XML markup) of the current node’s content (child nodes, text content, etc.).
ReadOuterXml() – returns a string containing the node’s own XML markup along with the XML markup of its content.
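The difference between the two methods is easiest to see side by side. The following is a small sketch, not the course demo: the InnerOuterDemo class and its ReaderAt helper are hypothetical, and the XML is a one-line version of the books fragment.

```csharp
using System;
using System.IO;
using System.Xml;

public static class InnerOuterDemo
{
    const string Xml =
        "<book price=\"34.95\"><title>Animal Farm</title>" +
        "<authors><author>Orwell</author></authors></book>";

    // Hypothetical helper: advances a fresh reader to the first element with the given name.
    static XmlTextReader ReaderAt(string elementName)
    {
        var reader = new XmlTextReader(new StringReader(Xml));
        while (reader.Read())
            if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                return reader;
        throw new ArgumentException("No such element: " + elementName);
    }

    // Markup of the element's content only.
    public static string Inner(string elementName)
    {
        using (XmlTextReader reader = ReaderAt(elementName))
            return reader.ReadInnerXml();
    }

    // Markup of the element itself plus its content.
    public static string Outer(string elementName)
    {
        using (XmlTextReader reader = ReaderAt(elementName))
            return reader.ReadOuterXml();
    }

    public static void Main()
    {
        Console.WriteLine(Inner("authors"));
        Console.WriteLine(Outer("title"));
    }
}
```

Both methods also advance the reader past the element they return, so a Read() loop that calls them will not visit that element's children again.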

Run ReadInnerOutterXml-ForXmlTextReader demo…
When reading an XML document, the XmlTextReader class will throw an XmlException if it encounters an error while parsing the XML. Such an error occurs, for example, when the XML is malformed (that is, not well-formed).
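As a rough illustration of the exception behavior, the sketch below reads a document to the end and reports the parser's complaint if there is one. WellFormedCheck and FindParseError are names invented for this example; XmlException and its LineNumber, LinePosition, and Message properties are part of System.Xml.

```csharp
using System;
using System.IO;
using System.Xml;

public static class WellFormedCheck
{
    // Returns null when the XML parses cleanly, otherwise a description of the parse error.
    public static string FindParseError(string xml)
    {
        try
        {
            using (var reader = new XmlTextReader(new StringReader(xml)))
            {
                // Reading every node forces the whole document to be parsed.
                while (reader.Read()) { }
            }
            return null;
        }
        catch (XmlException ex)
        {
            return "Line " + ex.LineNumber + ", position " + ex.LinePosition + ": " + ex.Message;
        }
    }

    public static void Main()
    {
        Console.WriteLine(FindParseError("<a><b/></a>") ?? "well-formed");
        Console.WriteLine(FindParseError("<a><b></a>") ?? "well-formed");
    }
}
```

Note that the exception only surfaces when Read() reaches the offending markup, so nodes before the error are still delivered normally.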

Run the XmlException demo
We will examine the XmlNodeReader and XmlValidatingReader (the other two XmlReader classes) later in this course.

Using the DOM to Iterate through an XML Document
In contrast to the Push method (XmlReader/XmlWriter), the .NET Framework offers a Pull method. Recall that the Pull method reads the entire XML document into memory and then works with it from there. For this model, XML documents are represented in the Document Object Model (DOM).

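What is the DOM?
DOM stands for Document Object Model, and it’s a model that can be used to describe an XML document. The DOM expresses the XML document as a hierarchy of nodes, where each element can have zero or more child elements. The text content and attributes of an element are expressed as nodes as well: text content appears as a child node, while attributes are attached to their element and, in the .NET classes, are reached through the element’s Attributes collection rather than its ChildNodes.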

Example XML File

    <?xml version="1.0" encoding="UTF-8" ?>
    <books>
      <book price="34.95">
        <title>TYASP 3.0</title>
        <authors>
          <author>Mitchell</author>
        </authors>
      </book>
      <book price="29.95">
        <title>ASP.NET Tips</title>
        <authors>
          <author>Mitchell</author>
          <author>Walther</author>
          <author>Seven</author>
        </authors>
      </book>
    </books>

The DOM View of the XML Document

The DOM Classes - XmlNode
There are a number of classes in the System.Xml namespace that represent the DOM. Each “box” in the DOM model is represented in the .NET Framework by the XmlNode class. This means that elements, attributes, and text values are all represented by the XmlNode class. The XmlNode class is discussed on p. 287.

Extending the XmlNode Class
There are a number of classes that are derived from the XmlNode class:
XmlAttribute
XmlElement
XmlDocument
And so on…

The XmlNode Properties
The XmlNode class has many properties, the most germane ones being:
Name – the name of the node. For elements and attributes, the name is the name of the element or attribute. For text content, the name is #text.
Value – the value of the node. For elements, there is no value. For attributes, it’s the value of the attribute; for text nodes, it’s the text in the node.
NodeType – indicates the type of the node (element, text, attribute, etc.).

More XmlNode Properties
InnerXml – the string content of the XML markup of the node’s children.
OuterXml – the string content of the XML markup of the node itself and its children.
InnerText – the string content of the value of the node and all its child nodes.
HasChildNodes – a Boolean, indicating if the node has any children.

The XmlNodeList Class
The XmlNodeList class represents an arbitrary collection of XmlNodes. For example, the XmlNode class has a ChildNodes property, which returns an XmlNodeList instance. This instance is a collection of nodes representing the DOM element’s children.

Loading an XML Document into a DOM Representation
The XmlDocument’s Load() method has four variations:
Load(Stream)
Load(string)
Load(TextReader)
Load(XmlReader)
In the Load(string) variation, the input string is a file path (or URL) to the XML file to load into the DOM representation.

The XmlDocument Properties
The XmlDocument is derived from the XmlNode class, meaning it has all of the properties and methods available to the XmlNode class. Once an XML file has been loaded into an XmlDocument instance, we can access the root element through the DocumentElement property.

The XmlElement and XmlAttribute Classes
The XmlElement and XmlAttribute classes are also derived from the XmlNode class. They represent, respectively, an element and an attribute.

Example
The following loads an XML document and retrieves the name of the root element.

    Dim xmlDoc As New XmlDocument()
    xmlDoc.Load(filepath)

    Dim rootElementName As String
    rootElementName = xmlDoc.DocumentElement.Name

Example Iterating through the root element’s children:
    Dim xmlDoc As New XmlDocument()
    xmlDoc.Load(filepath)

    Dim n As XmlNode
    For Each n In xmlDoc.DocumentElement.ChildNodes
        ' Display the name of the node using n.Name
    Next

An Example of Iterating through an XML Document
Let’s create an application that displays an XML document in a TreeView control. Each node in the TreeView represents a node in the DOM.

An Example of Iterating through an XML Document
We can recursively iterate through the DOM, ensuring that we’ll visit each node. (Explain recursion?) Examine application code... Questions on the program?
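The TreeView application itself is not reproduced in this transcript. As a stand-in, here is a minimal sketch of the same recursive idea that builds an indented text outline instead of TreeView nodes. The DomOutline class and its method names are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class DomOutline
{
    // Visits the node, then recurses into each of its children one level deeper.
    public static void Walk(XmlNode node, int depth, List<string> lines)
    {
        lines.Add(new string(' ', depth * 2) + node.NodeType + " " + node.Name);
        foreach (XmlNode child in node.ChildNodes)
            Walk(child, depth + 1, lines);
    }

    // Loads the XML into a DOM and outlines it, starting at the root element.
    public static List<string> Outline(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        var lines = new List<string>();
        Walk(doc.DocumentElement, 0, lines);
        return lines;
    }

    public static void Main()
    {
        string xml = "<books><book price=\"34.95\"><title>Animal Farm</title></book></books>";
        foreach (string line in Outline(xml))
            Console.WriteLine(line);
    }
}
```

The recursion stops on its own at leaf nodes, whose ChildNodes collection is empty. Attributes such as price are not part of ChildNodes, so a walk like this one would need to inspect node.Attributes separately to show them.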

Navigating through an XML Document
So far, all we have seen is how to iterate through an XML document, one node at a time. With the pull method (DOM), however, we can navigate through the document as well. For example, we might want to access just the elements in the document that have a certain name (such as elements with the name <author>).

Accessing Elements with a Certain Name
The XmlDocument class contains a GetElementsByTagName() method, which returns an XmlNodeList containing elements that have the specified tag name.

    Dim xmlDoc As New XmlDocument()
    xmlDoc.Load(filepath)

    Dim n As XmlNode
    For Each n In xmlDoc.GetElementsByTagName("author")
        ' Display n.Value
    Next

What would be the output of the above code?
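One way to explore the question is to compare Value with InnerText for the returned nodes. The sketch below is in C# rather than VB.NET and uses the example books document held in a string; the AuthorNames class and its methods are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class AuthorNames
{
    public const string BooksXml =
        "<books>" +
        "<book price=\"34.95\"><title>TYASP 3.0</title>" +
        "<authors><author>Mitchell</author></authors></book>" +
        "<book price=\"29.95\"><title>ASP.NET Tips</title>" +
        "<authors><author>Mitchell</author><author>Walther</author><author>Seven</author></authors></book>" +
        "</books>";

    // Collects the text inside each <author> element via InnerText.
    public static List<string> Authors(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        var names = new List<string>();
        foreach (XmlNode n in doc.GetElementsByTagName("author"))
            names.Add(n.InnerText);
        return names;
    }

    // Reports whether Value is null for every <author> element node.
    public static bool ElementValuesAreNull(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        foreach (XmlNode n in doc.GetElementsByTagName("author"))
            if (n.Value != null)
                return false;
        return true;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Authors(BooksXml)));
        Console.WriteLine(ElementValuesAreNull(BooksXml));
    }
}
```

This matches the earlier description of the Value property: element nodes have no value of their own, because the author's name lives in a text node underneath the element. InnerText (or the child text node's Value) is what carries the name.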

Navigating through an XML Document
However, what if we want to access nodes based on more complex criteria, such as: “Access all <book> elements with a price attribute value less than 30,” or, “Access the name of the authors who have written more than one book.” To accomplish this we need something more powerful – enter XPath!

A Quick Examination of XPath
XPath is used to define particular sections of an XML document. XPath is named XPath because its syntax is similar to the syntax for a file path. For example, in our books XML document, we could use the following XPath statement to access all of the author elements: /books/book/authors/author

Why We Might Want to Access Certain XML Document Portions
When using XSLT to display an XML file, typically we want to display only a subset of the XML document. For example, we might want to display a listing of flights, displaying the date, the departure city and the destination city. When working with XML data, we might want to retrieve only a certain subset of the data, or only the data that meets a certain set of criteria. All of these tasks can be accomplished with XPath.

XPath Components – Steps
To access the root element of the XML document, we use the following syntax: /RootElementName
Then, to access immediate descendants (children) of a given element, we use /, followed by the name of the child element. The / operator is referred to as the step operator.

XPath Components – Steps
The step operator has parallels to the \ operator in file paths. With file systems (which can be modeled as XML documents), you navigate the directory structure by using \. For example, the file path C:\Games\Quake\SavedGames takes you to the SavedGames directory.

The file system can be represented as an XML document…
    <?xml version="1.0" encoding="UTF-8" ?>
    <filesystem>
      <drive letter="C">
        <folder name="Program Files" />
        <folder name="Games">
          <folder name="Quake">
            <folder name="SavedGames" />
            <file>Quake.exe</file>
            <file>README.txt</file>
          </folder>
        </folder>
        <folder name="Windows">
          <file>README.txt</file>
        </folder>
      </drive>
      <drive letter="D">
        <folder name="Backup">
          <file> bak</file>
          <file> bak</file>
        </folder>
      </drive>
    </filesystem>

The DOM Model of the FileSystem XML Document

XPath Components - Steps
Using XPath we can access the root element using: /filesystem

To access all of the <drive> elements, we’d use: /filesystem/drive

To access all of the folder elements that were children of <drive> elements, we’d use: /filesystem/drive/folder

What about /filesystem/drive/folder/folder/folder?

Descendant Steps
Using elementName/elementName2, we get all of the elements named elementName2 that are children of elementName. But what if we want all elements that are descendants of elementName, regardless of whether the element is a child, grandchild, great-grandchild, etc.? Here, we use the // operator.

Descendant Steps
As we saw earlier, /filesystem/drive/folder will return the folders that are immediate children of a <drive> element (Program Files, Games, Windows, and Backup). If we want to get all folders, regardless of their depth in the hierarchy, we can use: /filesystem/drive//folder

Descendant Steps - Example
What will /filesystem//file return?
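The step expressions above can be tried out directly against the file system document. The following sketch counts the nodes each expression selects, using the SelectNodes() method of XmlNode. The XPathSteps class is hypothetical, and the two .bak file names are placeholders, since the real names did not survive in the XML listing.

```csharp
using System;
using System.Xml;

public static class XPathSteps
{
    // The file system document; a.bak and b.bak are placeholder names.
    const string FileSystemXml =
        "<filesystem>" +
        "<drive letter='C'>" +
        "<folder name='Program Files' />" +
        "<folder name='Games'><folder name='Quake'><folder name='SavedGames' />" +
        "<file>Quake.exe</file><file>README.txt</file></folder></folder>" +
        "<folder name='Windows'><file>README.txt</file></folder>" +
        "</drive>" +
        "<drive letter='D'>" +
        "<folder name='Backup'><file>a.bak</file><file>b.bak</file></folder>" +
        "</drive>" +
        "</filesystem>";

    // Returns how many nodes the XPath expression selects.
    public static int Count(string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(FileSystemXml);
        return doc.SelectNodes(xpath).Count;
    }

    public static void Main()
    {
        string[] expressions =
        {
            "/filesystem",
            "/filesystem/drive",
            "/filesystem/drive/folder",
            "/filesystem/drive/folder/folder/folder",
            "/filesystem/drive//folder",
            "/filesystem//file"
        };
        foreach (string xpath in expressions)
            Console.WriteLine(xpath + " -> " + Count(xpath) + " node(s)");
    }
}
```

Comparing the counts for /filesystem/drive/folder and /filesystem/drive//folder shows the difference between the child step and the descendant step: the second also picks up the folders nested inside Games.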

Accessing an Element’s Text Value
If an element has a text value (such as the <file> element), you can access it using the text() XPath function. For example, to return the contents of the <file> elements, we could use: /filesystem/drive//file/text()

Accessing an Element’s Text Value - Example
/filesystem/drive//file/text()

Accessing an Element’s Attribute Value
To access an attribute value for all elements matching a particular XPath expression, append /@attributeName to the expression. So, to access the values of the name attribute in the <folder> elements that are children of the <drive> element, you would use: /filesystem/drive/folder/@name

Accessing Element Attribute Values - Example

Example What if you wanted to retrieve the names of subdirectories? That is, you wanted to get the name attribute for all <folder> elements that were not children of the <drive> elements? What XPath expression would you use???
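A short sketch can show what text() and @attribute expressions return. Everything named XPathValues below is hypothetical, the two .bak file names are placeholders, and the last expression is only one possible answer to the subdirectory question, not necessarily the one used in class.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class XPathValues
{
    // The file system document; a.bak and b.bak are placeholder names.
    const string FileSystemXml =
        "<filesystem>" +
        "<drive letter='C'>" +
        "<folder name='Program Files' />" +
        "<folder name='Games'><folder name='Quake'><folder name='SavedGames' />" +
        "<file>Quake.exe</file><file>README.txt</file></folder></folder>" +
        "<folder name='Windows'><file>README.txt</file></folder>" +
        "</drive>" +
        "<drive letter='D'>" +
        "<folder name='Backup'><file>a.bak</file><file>b.bak</file></folder>" +
        "</drive>" +
        "</filesystem>";

    // Returns the Value of each selected node (meaningful for text and attribute nodes).
    public static List<string> Values(string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(FileSystemXml);
        var values = new List<string>();
        foreach (XmlNode n in doc.SelectNodes(xpath))
            values.Add(n.Value);
        return values;
    }

    public static void Main()
    {
        // Text content of every <file> element.
        Console.WriteLine(string.Join(", ", Values("/filesystem/drive//file/text()")));
        // Names of the folders directly under a drive.
        Console.WriteLine(string.Join(", ", Values("/filesystem/drive/folder/@name")));
        // One possible way to get only subdirectory names.
        Console.WriteLine(string.Join(", ", Values("/filesystem/drive/folder//folder/@name")));
    }
}
```

The third expression works by first stepping to the folders directly under a drive and only then using // to collect folders below them, which leaves the top-level folders out of the result.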

Filtering
Imagine that you wanted to return only those folders that contain files. Would the following XPath work? /filesystem/drive//folder/file
No! The above would return <file> elements. If you want to return folder elements, filtered to only those that contain files, you can use the following syntax: /filesystem/drive//folder[file]

Filtering Example
/filesystem/drive//folder[file]

Filtering
Similarly, you can return only elements that contain a certain attribute by using a filter of the form [@attributeName], as in: /filesystem/drive//folder[@name]

XPath Components - Predicates
Realize that when using steps, all matching elements are returned. From the file system example, /filesystem/drive/folder will return all four <folder> elements (Program Files, Games, Windows, and Backup). Predicates allow you to return only those elements that meet a certain set of criteria. Predicate syntax: [boolean expression]

XPath Components - Predicates
For example, to return all <folder> elements with the name attribute equal to Games, we could use:

Predicate Example

Predicate Example Predicates can also appear in earlier step expressions, like:

Predicates A number of operators can be used within predicates: =, !=, <, >, <=, >=, and, or, not(), +, -, div, *, mod Example: to get all of the files in folders named either Windows or Quake, you could do:
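The specific expressions for these filtering and predicate slides are missing from this transcript, so the ones in the sketch below are illustrations written to match the descriptions, not the originals. XPathPredicates is a hypothetical class name, and the two .bak file names are placeholders.

```csharp
using System;
using System.Xml;

public static class XPathPredicates
{
    // The file system document; a.bak and b.bak are placeholder names.
    const string FileSystemXml =
        "<filesystem>" +
        "<drive letter='C'>" +
        "<folder name='Program Files' />" +
        "<folder name='Games'><folder name='Quake'><folder name='SavedGames' />" +
        "<file>Quake.exe</file><file>README.txt</file></folder></folder>" +
        "<folder name='Windows'><file>README.txt</file></folder>" +
        "</drive>" +
        "<drive letter='D'>" +
        "<folder name='Backup'><file>a.bak</file><file>b.bak</file></folder>" +
        "</drive>" +
        "</filesystem>";

    // Returns how many nodes the XPath expression selects.
    public static int Count(string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(FileSystemXml);
        return doc.SelectNodes(xpath).Count;
    }

    public static void Main()
    {
        string[] expressions =
        {
            "/filesystem/drive//folder[file]",                          // folders that contain files
            "/filesystem/drive/folder[@name='Games']",                  // attribute equals a value
            "/filesystem/drive/folder[@name='Quake']",                  // Quake is not a child of a drive
            "/filesystem/drive[@letter='C']/folder",                    // predicate on an earlier step
            "//folder[@name='Windows' or @name='Quake']/file"           // combining tests with 'or'
        };
        foreach (string xpath in expressions)
            Console.WriteLine(xpath + " -> " + Count(xpath) + " node(s)");
    }
}
```

The third expression selects nothing, because the Quake folder sits inside Games rather than directly under a drive. Replacing the child step with a descendant step (//folder[@name='Quake']) would find it.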

XPath Components – Predicates - Examples
Here are some predicates – what elements would be returned for each? NOTHING IS RETURNED! This is because there is no folder that is a child of the <drive> element that has its name attribute equal to Quake.

What about the following XPath expression? <folder name="Quake"> <folder name="SavedGames" /> <file>Quake.exe</file> <file>README.txt</file> </folder> </folder>

What about the following XPath expression? name="My Programs" name="Games" name="Windows"

What about the following XPath expression? <file>Quake.exe</file> <file>README.txt</file>

More on XPath There are many more features and much more functionality available with XPath, which we’ll examine in Session 3. For a good tutorial on XPath, see:

Navigating through the DOM using XPath
The XmlNode class contains two methods for navigating the DOM:
SelectSingleNode(string)
SelectNodes(string)
The string input parameter for both of these methods is an XPath expression. SelectSingleNode() returns at most one node, the first node to match the XPath expression. SelectNodes() returns all of the nodes that match the XPath expression.

An Example
The following code displays the titles of books whose price is less than $30.00.

    Dim xmlDoc As New XmlDocument()
    xmlDoc.Load(filepath)

    Dim n As XmlNode
    For Each n In _
        ' Display n.Value
    Next

An Example
What does the following code output?

    Dim xmlDoc As New XmlDocument()
    xmlDoc.Load(filepath)

    Dim n As XmlNode
    n = xmlDoc.SelectSingleNode("//author/text()")
    ' Display n.Value

Answer: the name of the first author found in the XML document.
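Both methods can be exercised against the example books document. The sketch below is in C#, with the XML held in a string; the BookQueries class is hypothetical, and the price expression is an assumption about how the "less than $30.00" query could be written, since the original expression is not shown.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class BookQueries
{
    const string BooksXml =
        "<books>" +
        "<book price=\"34.95\"><title>TYASP 3.0</title>" +
        "<authors><author>Mitchell</author></authors></book>" +
        "<book price=\"29.95\"><title>ASP.NET Tips</title>" +
        "<authors><author>Mitchell</author><author>Walther</author><author>Seven</author></authors></book>" +
        "</books>";

    static XmlDocument LoadBooks()
    {
        var doc = new XmlDocument();
        doc.LoadXml(BooksXml);
        return doc;
    }

    // Titles of books whose price attribute is below 30 (assumed expression).
    public static List<string> CheapTitles()
    {
        var titles = new List<string>();
        foreach (XmlNode n in LoadBooks().SelectNodes("/books/book[@price < 30]/title/text()"))
            titles.Add(n.Value);
        return titles;
    }

    // The first author text node in document order, as in the slide example.
    public static string FirstAuthor()
    {
        XmlNode n = LoadBooks().SelectSingleNode("//author/text()");
        return n == null ? null : n.Value;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", CheapTitles()));
        Console.WriteLine(FirstAuthor());
    }
}
```

Because both expressions end in text(), the selected nodes are text nodes, so their Value property holds the text. Selecting the <title> or <author> elements themselves would require InnerText instead.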

Summary In this presentation, we saw how to programmatically iterate through XML documents. We examined the differences between the push and pull methods. The pull method uses the DOM, while the push method uses XmlTextReaders and XmlTextWriters.

Summary We studied the syntax of XPath, a technology designed to allow for XML document navigation. We saw how to use the SelectSingleNode() and SelectNodes() methods of the XmlNode class to navigate an XML document. XML document navigation is only possible in the DOM world.

Questions?
