Introduction to XML Extensible Markup Language

Slides:



Advertisements
Similar presentations
XML-XSL Introduction SHIJU RAJAN SHIJU RAJAN Outline Brief Overview Brief Overview What is XML? What is XML? Well Formed XML Well Formed XML Tag Name.
Advertisements

CSCI N241: Fundamentals of Web Design Copyright ©2004 Department of Computer & Information Science Introducing XHTML: Module B: HTML to XHTML.
XML: Extensible Markup Language
CS 898N – Advanced World Wide Web Technologies Lecture 21: XML Chin-Chih Chang
Tutorial 9 Working with XHTML. XP Objectives Describe the history and theory of XHTML Understand the rules for creating valid XHTML documents Apply a.
XML(EXtensible Markup Language). XML XML stands for EXtensible Markup Language. XML is a markup language much like HTML. XML was designed to describe.
XML Primer. 2 History: SGML vs. HTML vs. XML SGML (1960) XML(1996) HTML(1990) XHTML(2000)
Introduction to XML Extensible Markup Language Carol Wolf Computer Science Department.
Introducing XHTML: Module B: HTML to XHTML. Goals Understand how XHTML evolved as a language for Web delivery Understand the importance of DTDs Understand.
Introduction to XML This material is based heavily on the tutorial by the same name at
Introducing HTML & XHTML:. Goals  Understand hyperlinking  Understand how tags are formed and used.  Understand HTML as a markup language  Understand.
Topics The "bigger picture" –The "XML sales pitch" –XML/XHTML vs. SGML/HTML –XML in electronic publishing –XML and the future, web 2.0 XML basics: –Building.
Chapter 12 Creating and Using XML Documents HTML5 AND CSS Seventh Edition.
JavaScript, Fifth Edition Chapter 1 Introduction to JavaScript.
XP The University of Akron Summit College Business Technology Department Computer Information Systems 2440: 140 Internet Tools Instructor: Enoch E. Damson.
XML Anisha K J Jerrin Thomas. Outline  Introduction  Structure of an XML Page  Well-formed & Valid XML Documents  DTD – Elements, Attributes, Entities.
Introduction to XML cs3505. References –I got most of this presentation from this site –O’reilly tutorials.
Why XML ? Problems with HTML HTML design - HTML is intended for presentation of information as Web pages. - HTML contains a fixed set of markup tags. This.
XML eXtensible Markup Language by Darrell Payne. Experience Logicon / Sterling Federal C, C++, JavaScript/Jscript, Shell Script, Perl XML Training XML.
CREATED BY ChanoknanChinnanon PanissaraUsanachote
XML 1 Enterprise Applications CE00465-M XML. 2 Enterprise Applications CE00465-M XML Overview Extensible Mark-up Language (XML) is a meta-language that.
XHTML. Introduction to XHTML What Is XHTML? – XHTML stands for EXtensible HyperText Markup Language – XHTML is almost identical to HTML 4.01 – XHTML is.
What is XML?  XML stands for EXtensible Markup Language  XML is a markup language much like HTML  XML was designed to carry data, not to display data.
Introduction to XML Extensible Markup Language. What is XML XML stands for eXtensible Markup Language. A markup language is used to provide information.
 XML is designed to describe data and to focus on what data is. HTML is designed to display data and to focus on how data looks.  XML is created to structure,
Processing of structured documents Spring 2002, Part 2 Helena Ahonen-Myka.
Tutorial 1: XML Creating an XML Document. 2 Introducing XML XML stands for Extensible Markup Language. A markup language specifies the structure and content.
XML TUTORIAL Portions from w3 schools By Dr. John Abraham.
Electronic Commerce COMP3210 Session 4: Designing, Building and Evaluating e-Commerce Initiatives – Part II Dr. Paul Walcott Department of Computer Science,
Avoid using attributes? Some of the problems using attributes: Attributes cannot contain multiple values (child elements can) Attributes are not easily.
Softsmith Infotech XML. Softsmith Infotech XML EXtensible Markup Language XML is a markup language much like HTML Designed to carry data, not to display.
E0262 – MIS – Multimedia Storage Techniques XML (Extensible Markup Language  XML is a markup language for creating documents containing structured information.
XML 2nd EDITION Tutorial 1 Creating An Xml Document.
Introduction to XML This presentation covers introductory features of XML. What XML is and what it is not? What does it do? Put different related technologies.
XML Instructor: Charles Moen CSCI/CINF XML  Extensible Markup Language  A set of rules that allow you to create your own markup language  Designed.
XP 1 Creating an XML Document Developing an XML Document for the Jazz Warehouse XML Tutorial.
Lecture 16 Introduction to XML Boriana Koleva Room: C54
Web Technologies COMP6115 Session 4: Adding a Database to a Web Site Dr. Paul Walcott Department of Computer Science, Mathematics and Physics University.
1 Introduction to XML XML stands for Extensible Markup Language. Because it is extensible, XML has been used to create a wide variety of different markup.
1 Credits Prepared by: Rajendra P. Srivastava Ernst & Young Professor University of Kansas Sponsored by: Ernst & Young, LLP (August 2005) XBRL Module Part.
XML Introduction. What is XML? XML stands for eXtensible Markup Language XML stands for eXtensible Markup Language XML is a markup language much like.
XML Design Goals 1.XML must be easily usable over the Internet 2.XML must support a wide variety of applications 3.XML must be compatible with SGML 4.It.
XML Introduction. Markup Language A markup language must specify What markup is allowed What markup is required How markup is to be distinguished from.
XML Basics A brief introduction to XML in general 1XML Basics.
1 Tutorial 11 Creating an XML Document Developing a Document for a Cooking Web Site.
Java and XML. What is XML XML stands for eXtensible Markup Language. A markup language is used to provide information about a document. Tags are added.
Internet & World Wide Web How to Program, 5/e. © by Pearson Education, Inc. All Rights Reserved.2.
1 herbert van de sompel CS 502 Computing Methods for Digital Libraries Cornell University – Computer Science Herbert Van de Sompel
COMP9321 Web Application Engineering Semester 2, 2015 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 4 1COMP9321, 15s2, Week.
Introduction to HTML Year 8. What is HTML O Hyper Text Mark-up Language O The language that all the elements of a web page are written in. O It describes.
XML A Language Presentation. Outline 1. Introduction 2. XML 2.1 Background 2.2 Structure 2.3 Advantages 3. Related Technologies 3.1 DTD 3.2 Schemas and.
Well Formed XML The basics. A Simple XML Document Smith Alice.
What is XML? eXtensible Markup Language eXtensible Markup Language A subset of SGML (Standard Generalized Markup Language) A subset of SGML (Standard Generalized.
Introduction to XML XML – Extensible Markup Language.
XML CSC1310 Fall HTML (TIM BERNERS-LEE) HyperText Markup Language  HTML (HyperText Markup Language): December  Markup  Markup is a symbol.
XP Tutorial 9New Perspectives on HTML and XHTML, Comprehensive 1 Working with XHTML Creating a Well-Formed Valid Document Tutorial 9.
Tutorial 9 Working with XHTML. New Perspectives on HTML, XHTML, and XML, Comprehensive, 3rd Edition 2 Objectives Describe the history and theory of XHTML.
Jackson, Web Technologies: A Computer Science Perspective, © 2007 Prentice-Hall, Inc. All rights reserved Chapter 7 Representing Web Data:
Tutorial 9 Working with XHTML. XP Objectives Describe the history and theory of XHTML Understand the rules for creating valid XHTML documents Apply a.
XML Introduction to XML Extensible Markup Language.
XML Notes taken from w3schools. What is XML? XML stands for EXtensible Markup Language. XML was designed to store and transport data. XML was designed.
Extensible Markup Language (XML) Pat Morin COMP 2405.
XML BASICS and more…. What is XML? In common:  XML is a standard, simple, self-describing way of encoding both text and data so that content can be processed.
Unit 4 Representing Web Data: XML
Introduction to XML Extensible Markup Language
Chapter 7 Representing Web Data: XML
Introduction to XML Extensible Markup Language
Introduction to XML Extensible Markup Language
14 XML.
Lecture 4 Introduction to XML Extensible Markup Language
Presentation transcript:

Introduction to XML Extensible Markup Language Carol Wolf Computer Science Department

What is XML XML stands for eXtensible Markup Language. A markup language is used to provide information about a document. Tags are added to the document to provide the extra information. HTML tags tell a browser how to display the document. XML tags give a reader some idea what some of the data means.

What is XML Used For? XML documents are used to transfer data from one place to another often over the Internet. XML subsets are designed for particular applications. One is RSS (Rich Site Summary or Really Simple Syndication ). It is used to send breaking news bulletins from one web site to another. A number of fields have their own subsets. These include chemistry, mathematics, and books publishing. Most of these subsets are registered with the W3Consortium and are available for anyone’s use.

Advantages of XML XML is text (Unicode) based. Takes up less space. Can be transmitted efficiently. One XML document can be displayed differently in different media. Html, video, CD, DVD, You only have to change the XML document in order to change all the rest. XML documents can be modularized. Parts can be reused.

Example of an HTML Document <head><title>Example</title></head. <body> <h1>This is an example of a page.</h1> <h2>Some information goes here.</h2> </body> </html>

Example of an XML Document <?xml version=“1.0”/> <address> <name>Alice Lee</name> <email>alee@aol.com</email> <phone>212-346-1234</phone> <birthday>1985-03-22</birthday> </address>

Difference Between HTML and XML HTML tags have a fixed meaning and browsers know what it is. XML tags are different for different applications, and users know what they mean. HTML tags are used for display. XML tags are used to describe documents and data.

XML Rules Tags are enclosed in angle brackets. Tags come in pairs with start-tags and end-tags. Tags must be properly nested. <name><email>…</name></email> is not allowed. <name><email>…</email><name> is. Tags that do not have end-tags must be terminated by a ‘/’. <br /> is an html example.

More XML Rules Tags are case sensitive. <address> is not the same as <Address> XML in any combination of cases is not allowed as part of a tag. Tags may not contain ‘<‘ or ‘&’. Tags follow Java naming conventions, except that a single colon and other characters are allowed. They must begin with a letter and may not contain white space. Documents must have a single root tag that begins the document.

Encoding XML (like Java) uses Unicode to encode characters. Unicode comes in many flavors. The most common one used in the West is UTF-8. UTF-8 is a variable length code. Characters are encoded in 1 byte, 2 bytes, or 4 bytes. The first 128 characters in Unicode are ASCII. In UTF-8, the numbers between 128 and 255 code for some of the more common characters used in western Europe, such as ã, á, å, or ç. Two byte codes are used for some characters not listed in the first 256 and some Asian ideographs. Four byte codes can handle any ideographs that are left. Those using non-western languages should investigate other versions of Unicode.

Well-Formed Documents An XML document is said to be well-formed if it follows all the rules. An XML parser is used to check that all the rules have been obeyed. Recent browsers such as Internet Explorer 5 and Netscape 7 come with XML parsers. Parsers are also available for free download over the Internet. One is Xerces, from the Apache open-source project. Java 1.4 also supports an open-source parser.

XML Example Revisited <?xml version=“1.0”/> <address> <name>Alice Lee</name> <email>alee@aol.com</email> <phone>212-346-1234</phone> <birthday>1985-03-22</birthday> </address> Markup for the data aids understanding of its purpose. A flat text file is not nearly so clear. Alice Lee alee@aol.com 212-346-1234 1985-03-22 The last line looks like a date, but what is it for?

Expanded Example <?xml version = “1.0” ?> <address> <name> <first>Alice</first> <last>Lee</last> </name> <email>alee@aol.com</email> <phone>123-45-6789</phone> <birthday> <year>1983</year> <month>07</month> <day>15</day> </birthday> </address>

XML Files are Trees address name email phone birthday first last year month day

XML Trees An XML document has a single root node. The tree is a general ordered tree. A parent node may have any number of children. Child nodes are ordered, and may have siblings. Preorder traversals are usually used for getting information out of the tree.

Validity A well-formed document has a tree structure and obeys all the XML rules. A particular application may add more rules in either a DTD (document type definition) or in a schema. Many specialized DTDs and schemas have been created to describe particular areas. These range from disseminating news bulletins (RSS) to chemical formulas. DTDs were developed first, so they are not as comprehensive as schema.

Document Type Definitions A DTD describes the tree structure of a document and something about its data. There are two data types, PCDATA and CDATA. PCDATA is parsed character data. CDATA is character data, not usually parsed. A DTD determines how many times a node may appear, and how child nodes are ordered.

DTD for address Example <!ELEMENT address (name, email, phone, birthday)> <!ELEMENT name (first, last)> <!ELEMENT first (#PCDATA)> <!ELEMENT last (#PCDATA)> <!ELEMENT email (#PCDATA)> <!ELEMENT phone (#PCDATA)> <!ELEMENT birthday (year, month, day)> <!ELEMENT year (#PCDATA)> <!ELEMENT month (#PCDATA)> <!ELEMENT day (#PCDATA)>

Schemas Schemas are themselves XML documents. They were standardized after DTDs and provide more information about the document. They have a number of data types including string, decimal, integer, boolean, date, and time. They divide elements into simple and complex types. They also determine the tree structure and how many children a node may have.

Schema for First address Example <?xml version="1.0" encoding="ISO-8859-1" ?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="address"> <xs:complexType> <xs:sequence> <xs:element name="name" type="xs:string"/> <xs:element name="email" type="xs:string"/> <xs:element name="phone" type="xs:string"/> <xs:element name="birthday" type="xs:date"/> </xs:sequence> </xs:complexType> </xs:element> </xs:schema>

Explanation of Example Schema <?xml version="1.0" encoding="ISO-8859-1" ?> ISO-8859-1, Latin-1, is the same as UTF-8 in the first 128 characters. <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"> www.w3.org/2001/XMLSchema contains the schema standards. <xs:element name="address"> <xs:complexType> This states that address is a complex type element. <xs:sequence> This states that the following elements form a sequence and must come in the order shown. <xs:element name="name" type="xs:string"/> This says that the element, name, must be a string. <xs:element name="birthday" type="xs:date"/> This states that the element, birthday, is a date. Dates are always of the form yyyy-mm-dd.

XSLT Extensible Stylesheet Language Transformations XSLT is used to transform one xml document into another, often an html document. The Transform classes are now part of Java 1.4. A program is used that takes as input one xml document and produces as output another. If the resulting document is in html, it can be viewed by a web browser. This is a good way to display xml data.

A Style Sheet to Transform address.xml <?xml version="1.0" encoding="ISO-8859-1"?> <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:template match="address"> <html><head><title>Address Book</title></head> <body> <xsl:value-of select="name"/> <br/><xsl:value-of select="email"/> <br/><xsl:value-of select="phone"/> <br/><xsl:value-of select="birthday"/> </body> </html> </xsl:template> </xsl:stylesheet>

The Result of the Transformation Alice Lee alee@aol.com 123-45-6789 1983-7-15

Parsers There are two principal models for parsers. SAX – Simple API for XML Uses a call-back method Similar to javax listeners DOM – Document Object Model Creates a parse tree Requires a tree traversal

References Elliotte Rusty Harold, Processing XML with Java, Addison Wesley, 2002. Elliotte Rusty Harold and Scott Means, XML Programming, O’Reilly & Associates, Inc., 2002. W3Schools Online Web Tutorials, http://www.w3schools.com.