Jackson, Web Technologies: A Computer Science Perspective, © 2007 Prentice-Hall, Inc. All rights reserved. 0-13-185603-0 Chapter 7 Representing Web Data: XML

Web Technologies: A Computer Science Perspective, by Jeffrey C. Jackson
Chapter 7: Representing Web Data: XML

XML
Example XML document: (not preserved)
An XML document is one that follows certain syntax rules (most of which we followed for XHTML).

XML Syntax
An XML document consists of
– Markup
  Tags, which begin with < and end with >
  References, which begin with & and end with ;
    – Character, e.g. &#x20;
    – Entity, e.g. &lt;
      » The entities lt, gt, amp, apos, and quot are recognized in every XML document.
      » Other XHTML entities, such as nbsp, are only recognized in other XML documents if they are defined in the DTD.
– Character data: everything not markup
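
The behaviour of references can be checked with a short sketch. It uses Python's standard xml.etree.ElementTree parser, which is not part of the slides, and the note element is a made-up example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: &#x41; is a character reference,
# &lt; and &amp; are references to predefined entities.
root = ET.fromstring("<note>&#x41; &lt; B &amp; C</note>")
text = root.text
print(text)

# nbsp is not predefined in XML; with no DTD defining it,
# the parser reports an error instead of inserting a space.
try:
    ET.fromstring("<note>a&nbsp;b</note>")
    nbsp_accepted = True
except ET.ParseError:
    nbsp_accepted = False
print(nbsp_accepted)
```

The parser hands the application the replaced characters, so the program never sees the references themselves.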

XML Syntax
Comments
– Begin with <!--
– End with -->
– Must not contain --
CDATA section
– Special section the entire content of which is interpreted as character data, even if it appears to be markup
– Begins with <![CDATA[
– Ends with ]]> (a sequence that is illegal in character data except when ending a CDATA section)

XML Syntax
A CDATA section is equivalent to the same text written as ordinary character data with each < and & replaced by the references &lt; and &amp;.
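
A minimal sketch of that equivalence, assuming Python's standard xml.etree.ElementTree parser; the code element and its content are illustrative only.

```python
import xml.etree.ElementTree as ET

# The same character data written two ways (hypothetical content).
with_cdata = ET.fromstring("<code><![CDATA[if (a < b && c) { run(); }]]></code>")
with_refs = ET.fromstring("<code>if (a &lt; b &amp;&amp; c) { run(); }</code>")

# After parsing, an application cannot tell the two forms apart.
same = with_cdata.text == with_refs.text
print(with_cdata.text)
print(same)
```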

XML Syntax
< and & must be represented by references except
– When beginning markup
– Within comments
– Within CDATA sections

XML Syntax
Element tags and elements
– Three types of tag
  Start, of the form <name>
  End, of the form </name>
  Empty element, of the form <name />
– Start and end tags must properly nest
– A corresponding pair of start and end element tags plus everything in between them defines an element
– Character data may only appear within an element

XML Syntax
Start and empty-element tags may contain attribute specifications separated by white space
– Syntax: name = quoted value
– The quoted value must not contain <, and can contain & only as the start of a reference
– The quoted value must begin and end with matching quote characters (' or ")

XML Syntax
Element and attribute names are case sensitive.
XML white space characters are space, carriage return, line feed, and tab.
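
The attribute and case-sensitivity rules can be illustrated as follows. This is a sketch using Python's standard xml.etree.ElementTree; the element name Item and the attribute names are invented for the example.

```python
import xml.etree.ElementTree as ET

def parses(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Either quote character may delimit a value; id and ID are different names.
root = ET.fromstring("""<Item id='a1' ID="b2" label="5 &lt; 6 &amp; it's fine"/>""")
attrs = dict(root.attrib)
print(attrs)

case_mismatch_ok = parses("<Item></item>")       # end tag differs in case
raw_lt_in_value_ok = parses('<a v="1 < 2"/>')    # literal < inside a value
print(case_mismatch_ok, raw_lt_in_value_ok)
```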

XML Documents
A well-formed XML document
– follows the XML syntax rules and
– has a single root element
Well-formed documents have a tree structure.
Many XML parsers (software for reading/writing XML documents) use a tree representation internally.
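
As a rough illustration of well-formedness checking, the sketch below feeds a few hypothetical snippets to Python's standard xml.etree.ElementTree parser (a non-validating parser, so it checks well-formedness only).

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

results = {
    "nested": is_well_formed("<a><b>ok</b></a>"),       # properly nested
    "crossed": is_well_formed("<a><b>bad</a></b>"),     # tags overlap
    "two_roots": is_well_formed("<a/><b/>"),            # more than one root
    "no_root": is_well_formed("text only"),             # character data outside an element
}
print(results)

# The accepted document is available as a tree: root a with one child b.
tree_root = ET.fromstring("<a><b>ok</b></a>")
child_tags = [child.tag for child in tree_root]
print(child_tags)
```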

XML Documents
An XML document is written according to an XML vocabulary that defines
– Recognized element and attribute names
– Allowable element content
– Semantics of elements and attributes
XHTML is one widely used XML vocabulary.
Another example: RSS (rich site summary)

Sibling elements, with an example (figures not preserved)

XML Documents
Valid names and content for an XML vocabulary can be specified using
– Natural language
– XML DTDs (Chapter 2)
– XML Schema (Chapter 9)
If a DTD is used, then the XML document can include a document type declaration (<!DOCTYPE ...>) that identifies the DTD.

XML Documents
Two types of XML parsers:
– Validating
  Requires a document type declaration
  Generates an error if the document does not
    – Conform with the DTD and
    – Meet XML validity constraints
      » Example: every attribute value of type ID must be unique within the document
– Non-validating
  Checks for well-formedness
  Can ignore an external DTD
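
The difference shows up with a document that is well-formed but not valid. The sketch below assumes Python's standard xml.etree.ElementTree, which is non-validating; the list and item names and the small DTD are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the DTD declares id to be of type ID,
# yet both items carry the same value, which violates a validity constraint.
doc = """<!DOCTYPE list [
  <!ELEMENT list (item*)>
  <!ELEMENT item EMPTY>
  <!ATTLIST item id ID #REQUIRED>
]>
<list><item id="x"/><item id="x"/></list>"""

# A non-validating parser only checks well-formedness, so this succeeds.
root = ET.fromstring(doc)
ids = [item.get("id") for item in root]
print(ids)
```

A validating parser would be expected to reject the same document because of the duplicate ID value.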

XML Documents
Good practice to begin XML documents with an XML declaration
– Minimal example: <?xml version="1.0"?>
– If included, < must be the very first character of the document
– To override the default UTF-8/UTF-16 character encoding, include an encoding declaration following the version, in the form encoding="encoding-name"
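
A small sketch of the encoding declaration at work, assuming Python's standard xml.etree.ElementTree; the city element and the choice of ISO-8859-1 are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document stored as ISO-8859-1 bytes; the declaration says so.
text = '<?xml version="1.0" encoding="ISO-8859-1"?><city>M\u00e1laga</city>'
data = text.encode("iso-8859-1")
city = ET.fromstring(data).text
print(city)

# The declaration must start at the very first character: a leading space is an error.
try:
    ET.fromstring(b' <?xml version="1.0"?><a/>')
    late_declaration_ok = True
except ET.ParseError:
    late_declaration_ok = False
print(late_declaration_ok)
```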

XML building blocks
Aside from the directives, an XML document is built from:
– elements, e.g. an element named high with the content 103
– tags, in pairs: a start tag and an end tag around the content 103
– attributes, written inside the start tag
– entities, e.g. &amp; in Sunny &amp; hot
– character data, which may be:
  parsed (processed as XML); this is the default
  unparsed (all characters stand for themselves)

Elements and attributes
Attributes and elements are somewhat interchangeable.
– Example using just elements: the name David Matuszek held in child elements
– Example using attributes: the same values held in attributes of a single element
You will find that elements are easier to use in your programs; this is a good reason to prefer them.
Attributes often contain metadata, such as unique IDs.
Generally speaking, browsers display only elements (values enclosed by tags), not tags and attributes.
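
The two styles can be compared side by side. The markup below is a guess at the kind of example meant here, not the slide's own; the names name, first, and last are assumptions, and the parser is Python's standard xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for the same data in both styles.
as_elements = ET.fromstring("<name><first>David</first><last>Matuszek</last></name>")
as_attributes = ET.fromstring('<name first="David" last="Matuszek"/>')

# Child elements are read with findtext, attributes with get.
full_from_elements = f"{as_elements.findtext('first')} {as_elements.findtext('last')}"
full_from_attributes = f"{as_attributes.get('first')} {as_attributes.get('last')}"
print(full_from_elements)
print(full_from_attributes)
```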

Entities
Five special characters must be written as entities:
– &amp; for & (almost always necessary)
– &lt; for < (almost always necessary)
– &gt; for > (not usually necessary)
– &quot; for " (necessary inside double quotes)
– &apos; for ' (necessary inside single quotes)
These entities can be used even in places where they are not absolutely required.
These are the only predefined entities in XML.
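
When XML is generated by a program, these replacements are usually left to a library. A minimal sketch using Python's standard xml.sax.saxutils module (not mentioned in the slides), with made-up sample strings:

```python
from xml.sax.saxutils import escape, quoteattr

# escape() replaces &, < and > in character data.
escaped_text = escape("Sunny & hot, 5 < 6")
print(escaped_text)

# quoteattr() picks a quote character and escapes where it has to.
only_double = quoteattr('5" wide')        # contains " only, so ' delimits it
both_quotes = quoteattr("it's \"ok\"")    # contains both, so " is escaped
print(only_double)
print(both_quotes)
```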

Processing instructions
PIs (Processing Instructions) may occur anywhere in the XML document (but usually first).
A PI is a command to the program processing the XML document to handle it in a certain way.
XML documents are typically processed by more than one program.
Programs that do not recognize a given PI should just ignore it.
General format of a PI: <?target instructions?>
A common example is the xml-stylesheet PI, which tells a browser which stylesheet to apply.
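
A program sees a PI as a target plus an uninterpreted data string. The sketch below uses Python's standard xml.dom.minidom; the report element and the stylesheet name style.css are hypothetical.

```python
from xml.dom import minidom
from xml.dom import Node

# Hypothetical document: an xml-stylesheet PI followed by the root element.
doc = minidom.parseString('<?xml-stylesheet type="text/css" href="style.css"?><report/>')

pi = doc.firstChild
is_pi = pi.nodeType == Node.PROCESSING_INSTRUCTION_NODE
print(is_pi, pi.target)
print(pi.data)   # the parser does not interpret this part; the application does
```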

Comments
Comments can be put almost anywhere in an XML document (but not inside a tag or before the XML declaration).
Comments are useful for:
– Explaining the structure of an XML document
– Commenting out parts of the XML during development and testing
Comments are not elements and do not have an end tag.
The blanks after <!-- and before --> are optional.
The character sequence -- cannot occur in the comment.
The closing bracket must be -->
Comments are not displayed by browsers, but can be seen by anyone who looks at the source code.
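
The rule about -- can be checked directly. This sketch assumes Python's standard xml.etree.ElementTree, which by default discards comments while parsing; the sample documents are invented.

```python
import xml.etree.ElementTree as ET

def parses(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good_comment = parses("<a><!-- a comment --><b/></a>")
bad_comment = parses("<a><!-- not -- allowed --><b/></a>")
print(good_comment, bad_comment)

# The comment is not an element: the root has a single child, b.
child_count = len(ET.fromstring("<a><!-- a comment --><b/></a>"))
print(child_count)
```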

CDATA
By default, all text inside an XML document is parsed.
You can force text to be treated as unparsed character data by enclosing it in <![CDATA[ and ]]>
Any characters, even & and <, can occur inside a CDATA section.
Whitespace inside a CDATA section is (usually) preserved.
The only real restriction is that the character sequence ]]> cannot occur inside a CDATA section.
CDATA is useful when your text has a lot of illegal characters (for example, if your XML document contains some HTML text).

XML Namespaces
Namespaces are used to resolve name conflicts.
Syntax: xmlns:prefix="URI"
When using prefixes in XML, a so-called namespace for the prefix must be defined.
The namespace is defined by the xmlns attribute in the start tag of an element.

Example: two different elements that share a name, one a table holding the cells Apples and Bananas and the other describing an African Coffee Table, can appear in the same document when each is given its own namespace prefix (markup not preserved).
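
A sketch of how prefixes keep two same-named elements apart, assuming Python's standard xml.etree.ElementTree. The prefixes h and f, the element names, and the example.org URIs are all assumptions, since the original markup is not available.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace names; any distinct URIs would do.
HTML_NS = "http://example.org/html"
FURNITURE_NS = "http://example.org/furniture"

doc = f"""
<root xmlns:h="{HTML_NS}" xmlns:f="{FURNITURE_NS}">
  <h:table><h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr></h:table>
  <f:table><f:name>African Coffee Table</f:name></f:table>
</root>
"""
root = ET.fromstring(doc)

# ElementTree reports each name as {namespace-name}local-name.
table_tags = [child.tag for child in root]
cells = [td.text for td in root.iter(f"{{{HTML_NS}}}td")]
print(table_tags)
print(cells)
```

Both children have the local name table, yet a program can tell them apart by namespace name.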

Uniform Resource Identifier (URI)
A Uniform Resource Identifier (URI) is a string of characters which identifies an Internet resource.
The most common URI is the Uniform Resource Locator (URL), which identifies an Internet domain address.
Another, less common type of URI is the Uniform Resource Name (URN).

Default Namespaces
Syntax: xmlns="namespaceURI"

XML Namespaces
XML Namespace: Collection of element and attribute names associated with an XML vocabulary
Namespace Name: Absolute URI that is the name of the namespace
– Ex: the namespace name of XHTML 1.0 is a URI beginning http:// (truncated in the source)
The default namespace for elements of a document is specified using a form of the xmlns attribute, xmlns="namespace name", typically on the root element.
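
The effect of a default namespace can be sketched with Python's standard xml.etree.ElementTree. The URI used is the XHTML namespace name published by the W3C; it is supplied here as outside knowledge rather than taken from the slide, and the tiny document is invented.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# No prefixes are written, yet every element belongs to the default namespace.
doc = f'<html xmlns="{XHTML_NS}"><body><p>Hi</p></body></html>'
root = ET.fromstring(doc)
print(root.tag)

# For searching, a prefix of our own choosing (x) is mapped to the namespace name.
paragraph = root.find("x:body/x:p", {"x": XHTML_NS})
print(paragraph.text)
```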

XML Namespaces
Another form of xmlns attribute, known as a namespace declaration, can be used to associate a namespace prefix with a namespace name. In xmlns:prefix="namespace name", the part after xmlns: is the namespace prefix and the attribute as a whole is the namespace declaration.

XML Namespaces
Example use of namespace prefix: (not preserved)

XML Namespaces
In a namespace-aware XML application, all element and attribute names are considered qualified names
– A qualified name has an associated expanded name that consists of a namespace name and a local name
– Ex: item is a qualified name with no prefix; its local name is item
– Ex: xhtml:a is a qualified name whose expanded name pairs the namespace name bound to the prefix xhtml with the local name a
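
Expanded names can be made visible with Python's standard xml.etree.ElementTree, which stores them as {namespace-name}local-name. The helper expanded_name and the sample document are illustrative; the XHTML namespace URI is the W3C-published one, supplied as outside knowledge.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
doc = f'<item xmlns:xhtml="{XHTML_NS}"><xhtml:a href="#top">top</xhtml:a></item>'
root = ET.fromstring(doc)

def expanded_name(tag: str):
    """Split ElementTree's {uri}local notation into (namespace name, local name)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return (uri, local)
    return (None, tag)  # the name is in no namespace

item_name = expanded_name(root.tag)
link_name = expanded_name(root[0].tag)
print(item_name)
print(link_name)
```

The prefix itself does not survive parsing; only the namespace name it was bound to matters.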

XML Namespaces
Other namespace usage: A namespace can be declared and used on the same element.

XML Namespaces
Other namespace usage: A namespace prefix can be redefined for an element and its content. Within that element, names using the prefix belong to the newly declared namespace; outside it, the earlier binding still applies.
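
Both usages can be seen in one small document. This is a sketch with Python's standard xml.etree.ElementTree; the prefix p, the element names, and the example.org URIs are hypothetical.

```python
import xml.etree.ElementTree as ET

# The prefix p is declared and used on outer, then rebound on inner.
doc = """<p:outer xmlns:p="http://example.org/one">
  <p:inner xmlns:p="http://example.org/two"><p:leaf/></p:inner>
  <p:sibling/>
</p:outer>"""
root = ET.fromstring(doc)

# iter() walks the tree in document order.
tags = [element.tag for element in root.iter()]
for tag in tags:
    print(tag)
```

The rebinding covers inner and its content only; sibling, which follows inner, falls back to the first namespace.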

JavaScript and XML
JavaScript DOM can be used to process XML documents.
JavaScript XML DOM processing is often used with XMLHttpRequest
– Host object that is a constructor for other host objects
– Sends an HTTP request to a server, receives back an XML document

JavaScript and XML
Example use:
– Previous visit count servlet: must reload the document to see the updated count
– Visit count with XMLHttpRequest: the browser will automatically update the visit count periodically without reloading the entire page

JavaScript and XML
Ajax: Asynchronous JavaScript and XML
Combination of
– (X)HTML
– XML
– CSS
– JavaScript
– JavaScript DOM (HTML and XML)
– XMLHttpRequest in asynchronous mode

Java-based DOM
Java DOM API defined by the org.w3c.dom package
Semantically similar to the JavaScript DOM API, but with many small syntactic differences
– Nodes of the DOM tree belong to classes such as Node, Document, Element, Text
– Non-method properties are accessed via methods
  Ex: parentNode is accessed by calling getParentNode()

Java-based DOM
Methods such as getElementsByTagName() return an instance of NodeList
– getLength() method returns the number of items
– item() method returns an item
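
The same DOM shapes can be tried out in Python's standard xml.dom.minidom, used here only as a stand-in for the Java API. As in the JavaScript DOM, minidom exposes length and parentNode as properties where Java uses getLength() and getParentNode(). The list and item names are made up.

```python
from xml.dom import minidom

doc = minidom.parseString("<list><item>a</item><item>b</item></list>")

# getElementsByTagName returns a NodeList.
items = doc.getElementsByTagName("item")
count = items.length                       # Java: items.getLength()
second_text = items.item(1).firstChild.data
parent_name = items.item(0).parentNode.tagName   # Java: getParentNode()
print(count, second_text, parent_name)
```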

Java-based DOM
The default parser is non-validating and non-namespace-aware.

XSL
The Extensible Stylesheet Language (XSL) is an XML vocabulary typically used to transform XML documents from one form to another.
An XSLT processor takes an XSL document and an input XML document and produces an output XML document.

XSL
Components of XSL:
– XSL Transformations (XSLT): defines XSL namespace elements and attributes
– XML Path Language (XPath): used in many XSL attribute values (ex: child::message)
– XSL Formatting Objects (XSL-FO): XML vocabulary for defining document style (print-oriented)
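
To give a feel for what a path such as child::message selects, the sketch below uses Python's standard xml.etree.ElementTree. That module supports only a small subset of XPath and does not accept the child:: axis syntax, so the abbreviated form message is used instead; the log document is invented.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<log><message>one</message>"
    "<note><message>nested</message></note>"
    "<message>two</message></log>"
)

# "message" is the abbreviated form of child::message: direct children only.
children = [m.text for m in root.findall("message")]

# ".//message" also reaches message elements deeper in the tree.
descendants = [m.text for m in root.findall(".//message")]
print(children)
print(descendants)
```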

XML and Browsers
An XML document can contain an xml-stylesheet processing instruction telling a browser to:
– Apply XSLT to create an XHTML document (type="text/xsl")
– Apply CSS to style the XML document (type="text/css")