Introduction to XML: Part I

Presentation transcript:

Introduction to XML: Part I
By Sandeep Jangity
CS 157B, Section 2
Dr. Lee

Overview
- What is XML?
- Why is XML popular?
- How to write an XML document
- How to write XML DTDs and Schemas

What is XML?
- eXtensible Markup Language
- XML is a standard developed by the W3C
- XML is a syntax for expressing structured data in a text format
- XML is not a language on its own; instead, it is used to build markup languages
- XML resembles HTML, but unlike HTML tags, XML tags convey the meaning of the data they enclose

Structured Data
- Structured data is data that is tagged for its content, meaning, or use
- Includes: spreadsheets, address books, databases, PDF documents, …
- Stored in binary or text format

XML Technology Model
- Data is modeled in XML
- Structure and constraints are modeled using DTDs or Schemas
- The document format can be modeled using XSL (Extensible Stylesheet Language)
- REMEMBER: XML allows us to separate data from presentation!

Why use XML?
- Interoperability: XML is independent of operating system, platform, and language
- Separates content from presentation
- Well supported by most browsers
- Simple: XML documents are human-readable and can also be easily parsed by machines
- Easily converted to other formats (XML -> PDF, Microsoft CHM, etc.)
- Can represent almost any kind of data
- Many, many applications: math, science, etc. (continued on the next slides)

MyMathML
The expression 4 + (5 * 3):

    <expression>
      <operator> add </operator>
      <expression> <number> 4 </number> </expression>
      <expression>
        <operator> mult </operator>
        <expression> <number> 5 </number> </expression>
        <expression> <number> 3 </number> </expression>
      </expression>
    </expression>
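To see why tagging data by meaning is useful, here is a minimal Python sketch that walks the MyMathML tree above and computes its value. The function name `evaluate` and the assumption that only the `add` and `mult` operators exist are illustrative choices, not part of the original slides.

```python
import xml.etree.ElementTree as ET

# The MyMathML document from the slide: 4 + (5 * 3)
MY_MATH_ML = """
<expression>
  <operator> add </operator>
  <expression><number> 4 </number></expression>
  <expression>
    <operator> mult </operator>
    <expression><number> 5 </number></expression>
    <expression><number> 3 </number></expression>
  </expression>
</expression>
"""


def evaluate(expr):
    """Recursively evaluate an <expression> element (hypothetical helper)."""
    number = expr.find("number")
    if number is not None:
        # Leaf node: the text carries surrounding whitespace, so strip it.
        return int(number.text.strip())

    op = expr.find("operator").text.strip()
    # Direct <expression> children are the operands of this operator.
    operands = [evaluate(child) for child in expr.findall("expression")]
    if op == "add":
        return sum(operands)
    if op == "mult":
        product = 1
        for value in operands:
            product *= value
        return product
    raise ValueError(f"unknown operator: {op}")


result = evaluate(ET.fromstring(MY_MATH_ML))
print(result)  # 19
```

Because the tags name what each piece of data is, the program never has to guess which text is an operator and which is a number.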

MyChemML
ChemML (tracking experiments)

    <experiment date="03-15-2003">
      <introduction>
        The compound under investigation is common water:
        <molecule>
          <atom symbol="H" number="2"/>
          <atom symbol="O" number="1"/>
        </molecule>
        It boils at 100 degrees and freezes at 0 degrees!
        For more information about this amazing compound see the March 2003 issue of:
        <reference type="simple" href="http://www.ww.com">Water World</reference>
      </introduction>
      <!-- etc -->
    </experiment>

(Now the technical stuff)

XML document syntax
- A document has exactly one root element
- Element and attribute names are case sensitive
- Elements must be correctly nested
- Attribute values must be in quotes
- Tags must be closed
- Spaces are not allowed in element and attribute names
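The syntax rules above can be tried out with any XML parser. The following is a small sketch using Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` and the sample snippets are made up for illustration, one per rule.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text (hypothetical helper)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One illustrative snippet per syntax rule.
samples = {
    "well-formed": '<note id="1"><to>Sue</to></note>',
    "two root elements": "<a/><b/>",
    "case mismatch": "<Note></note>",
    "bad nesting": "<a><b></a></b>",
    "unquoted attribute": "<note id=1/>",
    "unclosed tag": "<note><to>Sue</note>",
    "space in name": "<my note/>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```

Only the first snippet is accepted; each of the others breaks exactly one of the rules and is rejected by the parser.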

XML Example

    <?xml version="1.0"?>
    <Bookstore>
      <Book ID="101">
        <Author>John Doe</Author>
        <Title>Introduction to XML</Title>
        <Date>12 June 2001</Date>
        <ISBN>121232323</ISBN>
        <Publisher>XYZ</Publisher>
      </Book>
      <Book ID="102">
        <Author>Foo Bar</Author>
        <Title>Introduction to XSL</Title>
        <ISBN>12323573</ISBN>
        <Publisher>ABC</Publisher>
      </Book>
    </Bookstore>

Well-formed vs Valid
Syntax and semantic checking
- Well-formed (syntax): (1) every start tag has a matching end tag, and (2) elements are properly nested
- Valid (semantic): a valid XML document conforms to the vocabulary constraints defined in a DTD or Schema
- An XML document can be well-formed without being valid, but a valid document is always well-formed

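Well-Formed (cont'd)
Well formed?

    <?xml version="1.0"?>
    <memo>
      <from> Bill <to> Sue </from> </to>
      Dinner tonight?
    </nemo>

No: <from> and <to> overlap instead of nesting, and the end tag </nemo> does not match the start tag <memo>.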

Definition and Validation
- Two ways to define the structure of an XML document:
  - DTDs
  - Schemas
- Each set of rules specifies an XML vocabulary

What is a DTD?
- Document Type Definition (DTD)
- Emphasis on the structure of the XML: which elements and attributes can appear, and their relationships
- Difficult to work with
- No support for data types
- Not extensible

Bookstore Example
Document:

    <Bookstore>
      <Book ID="101">
        <Author>John Doe</Author>
        <Title>Introduction to XML</Title>
        <Date>12 June 2001</Date>
        <ISBN>121232323</ISBN>
        <Publisher>XYZ</Publisher>
      </Book>
      <Book ID="102">
        <Author>Foo Bar</Author>
        <Title>Introduction to XSL</Title>
        <ISBN>12323573</ISBN>
        <Publisher>ABC</Publisher>
      </Book>
    </Bookstore>

DTD (element order follows the document; Date is optional because the second Book has none):

    <!ELEMENT Bookstore (Book)*>
    <!ELEMENT Book (Author+, Title, Date?, ISBN, Publisher)>
    <!ATTLIST Book ID CDATA #REQUIRED>
    <!ELEMENT Title (#PCDATA)>
    <!ELEMENT Author (#PCDATA)>
    <!ELEMENT Date (#PCDATA)>
    <!ELEMENT ISBN (#PCDATA)>
    <!ELEMENT Publisher (#PCDATA)>

Problems with DTDs
- Not XML syntax: you write your XML document using one syntax and the DTD using another, which is inconsistent and means more work for parsers
- Limited set of primitive datatypes:
  - We want a set of datatypes compatible with those found in databases
  - One of the main weaknesses of DTDs is the lack of support for data types beyond character strings (PCDATA)
- Limited support for applying constraints:
  - Only occurrence constraints are available: "+" (1 or more occurrences), "?" (0 or 1 occurrences), "*" (0 or more occurrences), etc.
  - No facility for constraints like those found in databases (enumerations, ranges, string length, etc.)

What are Schemas?
- More complex than DTDs
- Specify structure
- Support precise data type constraints
- Allow user-defined data types (complex/simple types)
- Enhanced datatypes (unlike PCDATA in DTDs):
  - Wider range of primitive data types, supporting those found in databases (string, boolean, decimal, integer, date, etc.)
  - Can create your own datatypes (complexType)
- Support namespaces for extensibility

Schema Example
Document (the root element declares the schema's target namespace):

    <Bookstore xmlns="http://www.books.org">
      <Book ID="101">
        <Author>John Doe</Author>
        <Title>Introduction to XML</Title>
        <Date>12 June 2001</Date>
        <ISBN>121232323</ISBN>
        <Publisher>XYZ</Publisher>
      </Book>
      <Book ID="102">
        <Author>Foo Bar</Author>
        <Title>Introduction to XSL</Title>
        <ISBN>12323573</ISBN>
        <Publisher>ABC</Publisher>
      </Book>
    </Bookstore>

Schema:

    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                targetNamespace="http://www.books.org"
                xmlns="http://www.books.org">
      <xsd:element name="Bookstore">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element ref="Book" minOccurs="1" maxOccurs="unbounded"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
      <xsd:element name="Book">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element ref="Author" minOccurs="1" maxOccurs="unbounded"/>
            <xsd:element ref="Title" minOccurs="1" maxOccurs="1"/>
            <xsd:element ref="Date" minOccurs="0" maxOccurs="1"/>
            <xsd:element ref="ISBN" minOccurs="1" maxOccurs="1"/>
            <xsd:element ref="Publisher" minOccurs="1" maxOccurs="1"/>
          </xsd:sequence>
          <xsd:attribute name="ID" type="xsd:string" use="required"/>
        </xsd:complexType>
      </xsd:element>
      <xsd:element name="Title" type="xsd:string"/>
      <xsd:element name="Author" type="xsd:string"/>
      <xsd:element name="Date" type="xsd:date"/>
      <xsd:element name="ISBN" type="xsd:integer"/>
      <xsd:element name="Publisher" type="xsd:string"/>
    </xsd:schema>

Note: xsd:date requires the ISO 8601 form YYYY-MM-DD, so <Date>12 June 2001</Date> would have to be written as <Date>2001-06-12</Date> to validate against this schema.

XML Namespaces: Code reuse
- A namespace identifies an XML vocabulary and is named by a URI (Uniform Resource Identifier)
- Allows reuse of XML markup
- Resolves problems with recognition and collision of tags that have the same name, which can happen when you're combining elements from multiple documents (see previous slide)
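As a rough illustration of how namespaces keep same-named tags apart, the sketch below parses a document with two different Title elements using Python's standard library. The `order` element, the `bk` and `cd` prefixes, and the music namespace URI are invented for this example; only http://www.books.org comes from the schema slide.

```python
import xml.etree.ElementTree as ET

# Two vocabularies that both define a <Title> element.
# The music namespace URI is a made-up placeholder.
DOC = """
<order xmlns:bk="http://www.books.org"
       xmlns:cd="http://www.example.org/music">
  <bk:Title>Introduction to XML</bk:Title>
  <cd:Title>Greatest Hits</cd:Title>
</order>
"""

root = ET.fromstring(DOC)

# ElementTree expands each prefix to {namespace-uri}local-name,
# so the two Title elements end up with different tag names.
tags = [child.tag for child in root]
print(tags)

# Look up one specific Title by mapping a prefix to its namespace URI.
ns = {"bk": "http://www.books.org"}
book_title = root.find("bk:Title", ns).text
print(book_title)
```

The prefix itself is only shorthand; what distinguishes the two elements is the URI it is bound to.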

Cool XML Application: RSS

    <rss version="0.91">
      <channel>
        <title>XML.com</title>
        <link>http://www.xml.com/</link>
        <description>XML.com features a rich mix of information and services for the XML community.</description>
        <language>en-us</language>
        <item>
          <title>Normalizing XML, Part 2</title>
          <link>http://www.xml.com/pub/a/2002/12/04/normalizing.html</link>
          <description>In this second and final look at applying relational normalization techniques to W3C XML Schema data modeling, Will Provost discusses when not to normalize, the scope of uniqueness and the fourth and fifth normal forms.</description>
        </item>
        <item>
          <title>The .NET Schema Object Model</title>
          <link>http://www.xml.com/pub/a/2002/12/04/som.html</link>
          <description>Priya Lakshminarayanan describes in detail the use of the .NET Schema Object Model for programmatic manipulation of W3C XML Schemas.</description>
        </item>
      </channel>
    </rss>
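A feed like this is easy for a program to consume. Here is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` and a shortened copy of the feed above (descriptions omitted); the function name `list_items` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the RSS 0.91 feed from the slide (descriptions omitted).
RSS = """<rss version="0.91">
  <channel>
    <title>XML.com</title>
    <link>http://www.xml.com/</link>
    <item>
      <title>Normalizing XML, Part 2</title>
      <link>http://www.xml.com/pub/a/2002/12/04/normalizing.html</link>
    </item>
    <item>
      <title>The .NET Schema Object Model</title>
      <link>http://www.xml.com/pub/a/2002/12/04/som.html</link>
    </item>
  </channel>
</rss>"""


def list_items(rss_text):
    """Return (title, link) pairs for every <item> in the channel."""
    channel = ET.fromstring(rss_text).find("channel")
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in channel.findall("item")
    ]


items = list_items(RSS)
for title, link in items:
    print(f"{title} -> {link}")
```

A news reader does essentially this: it fetches the feed, pulls out each item's title and link, and renders them however it likes, which is the separation of data from presentation described earlier.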

Almost done …

Tools/Software
- XML Spy: by far the most comprehensive editor. Handles XML files, DTDs, XSL files, as well as XSD (XML Schema). Unfortunately only a 30-day trial version. http://www.xmlspy.com/download.html
- XML Notepad: Microsoft XML Notepad is a simple application for building and editing small sets of XML-based data. Freeware. http://msdn.microsoft.com/xml/notepad/download.asp
- XML Pro: a top-notch XML editor, but it doesn't include as many features as XML Spy. Shareware. http://www.vervet.com/demo.html
- You can also check your XML files by just opening them with IE 5.0 or above. It checks whether the XML file is well-formed, and also validates against a DTD (if one is specified in the DOCTYPE declaration).

Great links:
- www.w3schools.com
- http://www.cs.sjsu.edu/faculty/pearce/web/front.htm

Conclusion
- You thought HTML was easy? XML just got easier!
- Get XML certified before you graduate! Visit: http://www.whizlabs.com/articles/xml-article.html

Questions: skumarjang@hotmail.com