CSCE 315 – Programming Studio Spring 2013


XML

Consistent Data Transfer Transfer of data has become increasingly important Can’t assume control of all ways data is created and used Cross-platform, cross-system, etc. People will want to access data for their own purposes People will want to use data from several sources Data may be more complicated than “traditional” formats would support E.g. ASCII text only good for some text documents Need a more universal means of transferring data

“Universal” Data Transfer Languages Several different ones developed, but 2 are very common Usually text-based Makes it (mostly) platform independent XML (we will discuss here) JSON JavaScript Object Notation Common, looks like and can be read like JavaScript Less extensible than XML, but somewhat easier to use More compact form
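The "more compact form" point can be sketched by serializing one record both ways. This is a minimal illustration using Python's standard library; the record and its field names are invented for the example and are not from the slides.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record, invented purely for illustration.
record = {"title": "Dune", "author": "Frank Herbert", "year": 1965}

# JSON form: keys and values map directly onto the dictionary.
as_json = json.dumps(record)

# XML form: one child element per field under a single root element.
root = ET.Element("book")
for key, value in record.items():
    child = ET.SubElement(root, key)
    child.text = str(value)
as_xml = ET.tostring(root, encoding="unicode")

print(as_json)
print(as_xml)
print(len(as_json), len(as_xml))
```

For this record the XML form is longer because every field name appears twice, once in the start tag and once in the end tag.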

Markup Languages Idea is to “tag” information to give a sense of its meaning/semantics How that is handled is up to reader Usually separates presentation from structure Examples: HTML: standard web page information, interpreted by browsers TeX/LaTeX: document specification, style descriptions determine how it is laid out

XML eXtensible Markup Language Extensible: able to define additional “tags” Specific tags and the semantics associated with them allow specifications of different languages Developed by the World Wide Web Consortium (W3C) to help standardize internet information transfer Now used as the basis for many specialized languages Each has its own semantic requirements

XML Characteristics Straightforward to use on internet Easily processed/parsed Human-readable Capable of expressing wide range of applications Including hierarchies, tables Can be very large/verbose

XML Document Text Intermingled character data and markups Markups: Start/End tags (and empty element tags) Entity/Character references Comments CDATA delimiters Processing Instructions XML/Text declarations Document type declarations

Basic XML Syntax Some prolog/header Possibly describing/referring to type of XML Single root element More elements forming a tree Elements fully “nest” inside each other Can have any number of child elements Elements begin with a start tag, end with an end tag <Elem>Stuff in element</Elem>

Tag Format Starting Tags can declare attributes <TagName Attr1="…" Attr2='…'> Note that attributes can use " or ' Ending Tags match starting tag name, but with a / preceding </TagName> Character data (and maybe other elements) in between start/end tags Empty element: <Elem/> Equivalent to <Elem></Elem>
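The two quoting styles and the empty-element shorthand can be checked with a short sketch. The attribute values `one` and `two` are placeholders chosen for the example.

```python
import xml.etree.ElementTree as ET

# Double and single quotes are interchangeable around attribute values.
elem = ET.fromstring("""<TagName Attr1="one" Attr2='two'>text</TagName>""")
print(elem.attrib)

# The empty-element form and an explicit start/end pair parse identically.
short_form = ET.fromstring("<Elem/>")
long_form = ET.fromstring("<Elem></Elem>")
print(ET.tostring(short_form) == ET.tostring(long_form))
```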

Entity/Character References Note: Some character patterns are “reserved” <, >, &, ', " An entity reference is a name given to a character or set of characters Used for any other things to be repeated General entity form: &Whatever; Used for the “reserved” characters &lt; <, &gt; >, &amp; &, &quot; ", &apos; '

Character References Character References are specialized Use the form &#…; where the … is a reference to a character in an ISO standard (ISO/IEC 10646) &#38; is an &
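A minimal sketch of the five predefined entity references (`&lt;`, `&gt;`, `&amp;`, `&quot;`, `&apos;`) and of numeric character references, using `xml.sax.saxutils.escape` from the Python standard library. The sample string is invented for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = 'if a < b && name == "O\'Brien"'

# escape() always handles &, < and >; the extra mapping covers the quotes.
quote_entities = {'"': "&quot;", "'": "&apos;"}
escaped = escape(raw, quote_entities)
print(escaped)

# A parser turns the references back into the original characters.
parsed = ET.fromstring("<code>" + escaped + "</code>").text

# Character references name a code point in decimal (&#38;) or hex (&#x26;).
amp = ET.fromstring("<c>&#38; and &#x26;</c>").text
print(amp)
```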

Comments Begin with <!-- End with --> Everything in between is ignored <!-- This is a comment -->

CDATA sections Used to note a section that would otherwise be viewed as markup data <![CDATA[ … ]]> <![CDATA[ <b>This <a>is</b>not</a>bad ]]>
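As a quick check of the CDATA example above, and of the rule that comments are ignored, the sketch below wraps both in a hypothetical `note` element and parses it with the standard library.

```python
import xml.etree.ElementTree as ET

# Without CDATA the overlapping <b>/<a> tags would be a well-formedness error.
doc = "<note><![CDATA[<b>This <a>is</b>not</a>bad ]]><!-- a comment --></note>"
note = ET.fromstring(doc)

# The CDATA content arrives as plain character data, markup characters intact.
print(note.text)

# Neither the pseudo-tags nor the comment produced child elements.
print(len(note))
```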

Processing Instructions Allow documents to contain instructions for applications reading them “Outside” the main document <?Target … ?> Target is the target application name (it must follow <? immediately, with no space) Any other instructions follow <?MyReader -o3 -f input.dat ?>
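One way to see how a reader receives a processing instruction is through the DOM, which keeps it as a separate node with a target and a data string. This sketch uses `xml.dom.minidom`; the `root` element is a placeholder, and the target is written directly after `<?` as the XML grammar requires.

```python
from xml.dom import minidom

doc = minidom.parseString("<?MyReader -o3 -f input.dat?><root/>")

# The processing instruction sits outside (before) the root element.
pi = doc.firstChild
print(pi.target, "|", pi.data)
```

What to do with the data string is entirely up to the application named by the target; the parser only hands it over.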

XML/Text Declarations Documents should start with declaration of XML type used, in a prolog: <?xml version="1.0" ?> Other documents “included” should also have such a prolog, as the first line

XML Semantics Semantics must be declared to determine what is valid syntax Tags allowed and their attributes, entities Does not say how it is processed Can be located in XML document itself Can be contained in separate Document Type Definition (DTD) Newer XML Schema definitions, which capture semantics in an XML-like document But drawbacks, including difficulty of use, not as universally implemented, large size, etc.

Document Type Declaration Defines constraints on the structure of the XML Comes before first element Either defines or points to external definition of Document Type Definition (DTD) External: <!DOCTYPE Name SYSTEM "url"> Internal: <!DOCTYPE Name […]> The DTD can be standalone (no further external references) or not
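The external and internal forms can be compared with `xml.dom.minidom`, which records the declaration on the document object. This is only a sketch: the `note` element and the `note.dtd` file name are hypothetical, and the standard-library parser records the system identifier without fetching or validating against it.

```python
from xml.dom import minidom

# External form: the declaration only points at a DTD by system identifier.
external = minidom.parseString('<!DOCTYPE note SYSTEM "note.dtd"><note/>')

# Internal form: the declarations are written inside the square brackets.
internal = minidom.parseString('<!DOCTYPE note [<!ELEMENT note EMPTY>]><note/>')

print(external.doctype.name, external.doctype.systemId)
print(internal.doctype.name, internal.doctype.systemId)
```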

Element Declarations Define elements and allowed content (character data, subelements, attributes, etc.) <!ELEMENT Name Content> Name is the unique name Content describes that type of element Options for Content: EMPTY – nothing allowed in the element ANY – no restrictions Children elements only Mixed character and children elements

Element Declarations: Child element content When an element has (only) child elements within it Specify using: Parentheses () for grouping The , for sequencing The | for “choice of” The + (one or more), * (zero or more), or ? (zero or one) modifiers. If no modifier, means “exactly once”

Example of Child elements <!ELEMENT book ( title, coverpage, tableofcontents?, editionnote*, preface?, (chapternumber, chaptertitle, chaptertext)+, index? )>
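Because a child-content model is built from sequence, choice, and repetition, it behaves much like a regular expression over child element names. The sketch below hand-translates the `book` model into a Python regex; it is an illustration of the operators, not how a validating parser is implemented, and `matches_book_model` is a name invented for the example.

```python
import re

# The book content model rewritten as a regular expression over a
# space-terminated sequence of child element names.
BOOK_MODEL = re.compile(
    r"title coverpage (tableofcontents )?(editionnote )*(preface )?"
    r"(chapternumber chaptertitle chaptertext )+(index )?"
)

def matches_book_model(child_names):
    """Return True if the ordered child names satisfy the book content model."""
    sequence = "".join(name + " " for name in child_names)
    return BOOK_MODEL.fullmatch(sequence) is not None

chapter = ["chapternumber", "chaptertitle", "chaptertext"]
print(matches_book_model(["title", "coverpage"] + chapter))
print(matches_book_model(["title"] + chapter))
```

The first call satisfies the model (all optional parts omitted, one chapter group); the second fails because `coverpage` has no modifier and so must appear exactly once.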

Element Declarations: Mixed element content When an element can contain both character and child elements The character text is denoted as a kind of special element name: #PCDATA <!ELEMENT story (#PCDATA|a|b|c)*>
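A sketch of what mixed content looks like to a program, assuming a `story` element declared as above. Python's ElementTree stores the character data around child elements in `.text` and `.tail`; the story text is invented for the example.

```python
import xml.etree.ElementTree as ET

story = ET.fromstring("<story>Once <a>upon</a> a <b>time</b>.</story>")

# Text before the first child lives in .text; text after a child in its .tail.
pieces = [story.text] + [(child.tag, child.text, child.tail) for child in story]
print(pieces)

# itertext() walks character data and child elements in document order.
full_text = "".join(story.itertext())
print(full_text)
```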

Attribute Declarations Define allowed attribute names, their types, and default values <!ATTLIST ElementName Attribute*> ElementName is the name of the element those attributes belong to Repeat attribute definition as many times as needed

Attribute Declaration: Types Name Type DefaultValue Name is the attribute name Type: CDATA : string Enumerated: specified via a |-separated list in parentheses Tokenized: a limited form, specified by some other rule defined in the DTD Several variations

Attribute Declaration: Defaults Specify a default value Also specify whether attribute is needed in the element #REQUIRED This attribute must be specified each time (no default) #IMPLIED No default is specified Otherwise, use the default value given Precede by #FIXED if it must always take that default

Attribute Declaration Example <!ATTLIST Book title CDATA #REQUIRED author CDATA "anonymous" publisher CDATA #IMPLIED category (fiction|nonfiction) "fiction" language CDATA #FIXED 'English' >
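The default rules can be observed by placing the declaration in an internal DTD subset, written here with the `|` separator that DTD syntax uses for enumerated values. This is a sketch under one assumption worth stating: Python's standard parser (Expat) is non-validating, so it fills in declared defaults but would not report a missing `#REQUIRED` attribute. The `title` value is a placeholder.

```python
import xml.etree.ElementTree as ET

doc = """<!DOCTYPE Book [
  <!ELEMENT Book EMPTY>
  <!ATTLIST Book
    title     CDATA #REQUIRED
    author    CDATA "anonymous"
    publisher CDATA #IMPLIED
    category  (fiction|nonfiction) "fiction"
    language  CDATA #FIXED 'English'>
]>
<Book title="Example"/>"""

book = ET.fromstring(doc)

# Only title was written in the document; the rest come from the DTD defaults.
for name in ("title", "author", "publisher", "category", "language"):
    print(name, "=", book.get(name))
```

`publisher` stays absent because `#IMPLIED` supplies no default value.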

Entity Declarations Entity References should be declared Internal Entity: <!ENTITY Name ReplacementText > <!ENTITY CR "Copyright 2008"> … &CR; External Entity: <!ENTITY Name SYSTEM url > <!ENTITY BP SYSTEM "http://this.com/BP.xml"> &BP; There are also other variations on external entities
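A minimal sketch of an internal entity being expanded, using the `CR` declaration from the slide inside a hypothetical `page` document. External entities are left out because the standard-library parser does not fetch them.

```python
import xml.etree.ElementTree as ET

doc = """<!DOCTYPE page [
  <!ENTITY CR "Copyright 2008">
]>
<page><footer>&CR; Example Press</footer></page>"""

# The parser substitutes the replacement text wherever &CR; appears.
footer = ET.fromstring(doc).find("footer").text
print(footer)
```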

Parameter Entities Like general entities, but refer to entities to be used in the Document Type Declaration Use a % instead of an & <!ENTITY % newdef SYSTEM "http://this.com/newdef-xml.entities"> … %newdef;

Conditionals (in the DTD) Used in the DTD to apply different rules <![Condition[…]]> If Condition is INCLUDE then keep If Condition is IGNORE then skip Combine with parameter entities: <!ENTITY % addborder 'INCLUDE'> … <![%addborder;[ … (stuff to draw border) … ]]>

XML Namespaces Different XML definitions could define the same element name. If we want to use both, could have conflict. Can distinguish using namespaces. <a:book>…</a:book> <b:book>…</b:book>

Defining XML Namespaces xmlns attribute in definition of element xmlns:prefixname="URL" <a:book xmlns:a="http://this.com/adef"> Can be defined in first use of element or in XML root element. Can define a “default” No prefix needed, leave off : also
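How the two `book` elements stay distinct can be sketched with ElementTree, which replaces each prefix with its namespace name in braces. The `shelf` wrapper and the second URL (`http://this.com/bdef`) are assumptions added for the example; only `http://this.com/adef` appears in the slides.

```python
import xml.etree.ElementTree as ET

doc = """<shelf xmlns:a="http://this.com/adef" xmlns:b="http://this.com/bdef">
  <a:book>from vocabulary a</a:book>
  <b:book>from vocabulary b</b:book>
</shelf>"""

shelf = ET.fromstring(doc)

# The prefix is only shorthand; the parser keys each tag on the namespace name.
tags = [child.tag for child in shelf]
print(tags)

# Lookups may use any local prefix, as long as it maps to the same name.
ns = {"first": "http://this.com/adef"}
a_books = shelf.findall("first:book", ns)

# A default namespace applies to the element and its unprefixed descendants.
default = ET.fromstring('<book xmlns="http://this.com/adef"><title/></book>')
print(default.tag, default[0].tag)
```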

Summary/More Information XML has become a standard way of transferring information, especially over the internet Provides flexibility to represent a wide range of data. Many texts/online tutorials about XML W3C “official” pages: http://www.w3.org/XML/ See in particular the XML 1.0 specs (more than the 1.1 specs)