XML: Slides courtesy of Prof. Ellis, USC

Presentation transcript:

1 XML Slides courtesy of Prof. Ellis, USC

2 What is XML XML stands for Extensible Markup Language –the World Wide Web Consortium (W3C) directs the effort. XML isn't a markup language like HTML, but rather a system for defining other markup languages. XML is a common syntax for expressing structure in data and, as a result, a way for others to define new tags –whereas a tag such as <b> in HTML specifies text to be presented in a certain typeface and weight, an XML tag explicitly identifies the kind of information it surrounds: an <author> tag might identify the author of a document, and a <price> tag could contain an item's cost in an inventory list.

3 SGML, XML and HTML The parent of HTML and XML is Standard Generalized Markup Language (SGML), an ISO standard for electronic document exchange. SGML competes with other standards, mainly de facto standards such as Adobe PDF (Acrobat), Microsoft RTF (Rich Text Format), and popular word processor file formats like Microsoft Word. Both XML and HTML are document formats derived from SGML. –Thus they all share certain characteristics, such as a similar syntax and the use of bracketed tags. –But HTML is an application of SGML, whereas XML is a subset of SGML. –XML documents can be read by any SGML authoring or viewing tool. –XML is less complex than SGML, and it is designed to work across a limited-bandwidth network such as the Internet.

4 Why Are Developers Excited about XML?
Domain-Specific Markup Languages
–A DTD precisely describes the format
–DTDs verify that documents adhere to the format
–Ensures interoperability of unrelated tools
Self-Describing Data
–DTDs explain the format, so reverse engineering isn't as necessary
–Comments in DTDs can go even further
Interchange of Data Among Applications
–E-commerce and syndication
–DTDs make sure that two independent applications speak the same language
–DTDs detect malformed data
–DTDs verify correct data
Structured and Integrated Data
–Can specify relationships between elements using element declarations
–Can assemble data from multiple sources using external entity references declared in the DTD

5 XML Applications
Chemical Markup Language (CML)
–Jumbo: the first general-purpose XML browser
–Assigns each XML element to a Java class that knows how to render that element
Mathematical Markup Language (MathML)
–The Amaya browser
Synchronized Multimedia Integration Language (SMIL)
Scalable Vector Graphics (SVG)
MusicML
FoodWebML, GuiML

6 A Song Description in HTML Hot Cop by Jacques Morali, Henri Belolo, and Victor Willis Producer: Jacques Morali Publisher: PolyGram Records Length: 6:20 Written: 1978 Artist: Village People

7 A Song Description in XML Hot Cop Jacques Morali Henri Belolo Victor Willis Jacques Morali PolyGram Records 1978 Village People
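
A minimal sketch of how a song description like this one might be marked up and read back. The tag names (SONG, TITLE, COMPOSER and so on) are assumptions chosen to mirror the labels on the HTML version of the slide, and the class and method names are hypothetical. The point is that each tag names the kind of information it surrounds, so a program can ask for "the composers" directly.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SongReader {
    // Assumed markup: the tag names are illustrative, not taken from the slide.
    static final String SONG_XML =
        "<SONG>"
        + "<TITLE>Hot Cop</TITLE>"
        + "<COMPOSER>Jacques Morali</COMPOSER>"
        + "<COMPOSER>Henri Belolo</COMPOSER>"
        + "<COMPOSER>Victor Willis</COMPOSER>"
        + "<PRODUCER>Jacques Morali</PRODUCER>"
        + "<PUBLISHER>PolyGram Records</PUBLISHER>"
        + "<YEAR>1978</YEAR>"
        + "<ARTIST>Village People</ARTIST>"
        + "</SONG>";

    // Returns the text of every element with the given tag name, in document order.
    static List<String> textOf(String xml, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName(tagName);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Title:     " + textOf(SONG_XML, "TITLE"));
        System.out.println("Composers: " + textOf(SONG_XML, "COMPOSER"));
    }
}
```

The same query against the HTML version would have to guess from wording such as "by" or "Producer:", which is the contrast the two slides draw.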

8 Using XSLT Attaching style sheets to documents

9 Using CSS – simpler, but limited Hot Cop Jacques Morali Henri Belolo Victor Willis Jacques Morali PolyGram Records 1978 Village People

10 Well-formedness
All XML documents must be well-formed.
Well-formedness rules:
–Open and close all tags
–Empty tags end with />
–There is a unique root element
–Elements may not overlap
–Attribute values are quoted
–< and & are only used to start tags and entities
Parsers are required to reject malformed documents. This improves compatibility and interoperability.

11 Well-formedness Rules
–Open and close all tags
–Empty tags end with />
–There is a unique root element
–Elements may not overlap
–Attribute values are quoted
–< and & are only used to start tags and entities
–Only the five predefined entity references are used
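
The rules above can be tried out with the JAXP parser that ships with the JDK: a conforming parser rejects any document that breaks one of them. This is only a sketch; the class name, method name, and sample documents are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {
    // Returns true if the parser accepts the text as a well-formed XML document.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the document
        } catch (Exception e) {
            throw new IllegalStateException("parser setup or I/O failed", e);
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "<a><b/></a>",          // well-formed
            "<a><b></a>",           // b is never closed
            "<a><b></a></b>",       // elements overlap
            "<a/><b/>",             // two root elements
            "<a x=1/>",             // attribute value not quoted
            "<a>1 < 2</a>"          // raw < in character data
        };
        for (String s : samples) {
            System.out.println(isWellFormed(s) + "  " + s);
        }
    }
}
```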

12 What is a Document Type Definition A Document Type Definition (DTD) is a set of syntax rules for tags. It tells you –what tags you can use in a document, –what order they should appear in, –which tags can appear inside other ones, –which tags have attributes, and so on. Originally developed for use with SGML, a DTD can be part of an XML document, but it's usually a separate document or series of documents. Because XML is not a language itself, but rather a system for defining languages, it doesn't have a universal DTD the way HTML does. Instead, each industry or organization that wants to use XML for data exchange can define its own DTDs. If an organization uses XML to tag documents for internal use only, it can create its own private DTD.

13 Validity
To be valid, an XML document must:
1. Be well-formed
2. Have a Document Type Definition (DTD)
3. Comply with the constraints specified in the DTD
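
A sketch of the three conditions using the JDK's validating parser. The DTD here is an assumption built from the "a date element must have a year child" example elsewhere in the deck; the class and method names are hypothetical. Validity problems are reported through the error handler, while well-formedness problems abort the parse.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class DtdValidation {
    // Assumed internal DTD: a date must contain exactly one year, which holds text.
    static final String DOCTYPE =
        "<!DOCTYPE date [\n"
        + "<!ELEMENT date (year)>\n"
        + "<!ELEMENT year (#PCDATA)>\n"
        + "]>\n";

    // Returns the list of problems found; an empty list means the document is valid.
    static List<String> validate(String xml) {
        List<String> problems = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    problems.add(e.getMessage()); // a validity error; parsing continues
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            problems.add(e.getMessage()); // fatal: not even well-formed
        } catch (Exception e) {
            throw new IllegalStateException("parser setup or I/O failed", e);
        }
        return problems;
    }

    public static void main(String[] args) {
        System.out.println(validate(DOCTYPE + "<date><year>1994</year></date>")); // valid
        System.out.println(validate(DOCTYPE + "<date>1994</date>"));              // breaks the DTD
        System.out.println(validate("<date><year>1994</year></date>"));          // no DTD at all
    }
}
```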

14 Validity is not always sufficient
DTDs cannot constrain the text content of an element. For example, they cannot say:
–That an element must contain a number
–That an element must contain a date
–That a date must be between 1970 and 2001
–etc.
Custom validation layers can sit on top of XML validation.
Schemas will add this capability.

15 XML Schemas An XML-based syntax, or schema, for defining how an XML document is marked up. Recommended by Microsoft. An alternative to the Document Type Definition (DTD). DTDs have many drawbacks, including the use of non-XML syntax, no support for data typing, and non-extensibility. XML Schema improves upon DTDs in several ways, including the use of XML syntax and support for data typing and namespaces. For example, an XML Schema allows you to specify an element as an integer, a float, a boolean, a URL, etc. The XML parser in Internet Explorer 5 can validate an XML document with both a DTD and an XML Schema.
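
A sketch of the data-typing point, assuming the W3C XML Schema language (XSD) as supported by the JDK's javax.xml.validation package; the slide may have had a different schema dialect in mind. The schema, element name, and class name are hypothetical. It expresses the "must be a number between 1970 and 2001" kind of constraint that a DTD cannot.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class SchemaTyping {
    // Assumed schema: a single year element holding an integer from 1970 to 2001.
    static final String YEAR_XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"year\">"
        + "<xs:simpleType>"
        + "<xs:restriction base=\"xs:integer\">"
        + "<xs:minInclusive value=\"1970\"/>"
        + "<xs:maxInclusive value=\"2001\"/>"
        + "</xs:restriction>"
        + "</xs:simpleType>"
        + "</xs:element>"
        + "</xs:schema>";

    // Returns true if the document conforms to YEAR_XSD.
    static boolean isValid(String xml) {
        try {
            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(YEAR_XSD)));
            // With no error handler set, the validator throws on the first error.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException("I/O failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<year>1984</year>"));     // an integer in range
        System.out.println(isValid("<year>nineteen</year>")); // not an integer
        System.out.println(isValid("<year>2024</year>"));     // out of range
    }
}
```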

16 How to process XML?
Java Parsers
–DOM Parser – tree structure
–SAX Parser – event-driven approach
A DOM parser can use a SAX parser to do the parsing and then build a tree structure from the events it reports.
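
A small sketch contrasting the two approaches with the parsers bundled in the JDK. Both methods count elements with a given name: the DOM version builds the whole tree and then queries it, while the SAX version reacts to start-tag events as they stream past. The sample document and all names are illustrative.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class DomVersusSax {
    static final String CATALOG =
        "<catalog><composer>A</composer><composer>B</composer><category>C</category></catalog>";

    // DOM: parse everything into a tree, then ask the tree.
    static int countWithDom(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName(tag).getLength();
        } catch (Exception e) {
            throw new IllegalStateException("DOM parse failed", e);
        }
    }

    // SAX: no tree is built; the handler is called once per start tag.
    static int countWithSax(String xml, String tag) {
        int[] count = {0};
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes attributes) {
                        if (qName.equals(tag)) {
                            count[0]++;
                        }
                    }
                });
        } catch (Exception e) {
            throw new IllegalStateException("SAX parse failed", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println("DOM: " + countWithDom(CATALOG, "composer"));
        System.out.println("SAX: " + countWithSax(CATALOG, "composer"));
    }
}
```

DOM is convenient when the program needs to revisit or modify the document; SAX keeps memory use low for large inputs because nothing is retained unless the handler stores it.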

17 DTDs – Content Definitions
Content model definitions describe what may be contained in an instance of an element:
–names of allowed or forbidden elements
–DTD entities
–document text
The syntax for expressing content is a form of regular expression:
–(…) delimits a group
–A | B: either A or B
–A, B: A followed by B
–A & B: A and B in any order (SGML only; XML DTDs do not support the & connector)
–A?: A occurs zero or one time
–A*: A occurs zero or more times
–A+: A occurs one or more times

18 Element Declarations Each tag must be declared in an <!ELEMENT> declaration. An <!ELEMENT> declaration gives the name and content model of the element. The content model uses a simple regular expression-like grammar to precisely specify what is and isn't allowed in an element.

19 Content Specifications
ANY
–<!ELEMENT catalog ANY>
–A catalog can contain any child element and/or raw text (parsed character data)
#PCDATA
–Parsed Character Data; i.e. raw text, no markup. For example,
–<year>1984</year>
–<!ELEMENT year (#PCDATA)>
Sequences
Choices
Mixed Content
Modifiers
EMPTY

20 #PCDATA There are a number of elements in the example document that only contain PCDATA:

21 Comments in DTDs DTDs seem fundamentally more obfuscated than C. Comments can improve this by giving example elements. Comments are the same as in HTML; e.g. <!-- This is a comment -->

22 Child Elements <date><year>1994</year></date> To declare that a date element must have a year child: <!ELEMENT date (year)>

23 Child Elements You only have to declare the immediate children Elliotte Rusty Harold Julie Mandel To declare that an element must have exactly one name child:

24 Sequences Elliotte Rusty Harold Separate multiple required child elements with commas; e.g. A list of child elements separated by commas is called a sequence

25 More Sequences To use a sequence in an ELEMENT declaration: –The element being described must have only child elements, no mixed content –You must know the order of the child elements –You must know the type of each child element –You must know the number of child elements –The number can be relaxed with wild cards

26 One or More Children + Compositions by the members of New York Women Composers music publishing scores women composers New York The + suffix indicates that one or more of that element is required at that point

27 A DTD for Songs

28 Internal DTDs
<!DOCTYPE GREETING [
  <!ELEMENT GREETING (#PCDATA)>
]>
<GREETING>Hello XML!</GREETING>

29 Complete Example – Mail Message Suppose we describe a mail message as consisting of: a title; a header made of: the sender; the recipient; a subject; the body text made of: four paragraphs; quoted material. In the DTD: <!-- starts a comment; (head,body) is a group with body following head; TO is followed by FR and both must appear; ? means SB is optional; P may occur zero or more times.
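
A sketch of what a DTD along these lines might look like, checked with the JDK's validating parser. The element names head, body, TO, FR, SB and P come from the notes above; the root name MAIL and the exact nesting are assumptions, and the title and quoted-material parts are left out to keep the example short.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class MailMessageDtd {
    // Assumed DTD; MAIL is a hypothetical root element name.
    static final String DOCTYPE =
        "<!DOCTYPE MAIL [\n"
        + "<!-- a mail message: header first, then body -->\n"
        + "<!ELEMENT MAIL (head, body)>\n"
        + "<!ELEMENT head (TO, FR, SB?)>\n" // TO then FR, both required; SB optional
        + "<!ELEMENT body (P*)>\n"          // zero or more paragraphs
        + "<!ELEMENT TO (#PCDATA)>\n"
        + "<!ELEMENT FR (#PCDATA)>\n"
        + "<!ELEMENT SB (#PCDATA)>\n"
        + "<!ELEMENT P (#PCDATA)>\n"
        + "]>\n";

    // Counts the problems reported while validating DOCTYPE + message.
    static int countProblems(String message) {
        int[] problems = {0};
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    problems[0]++; // validity error; parsing continues
                }
            });
            builder.parse(new InputSource(new StringReader(DOCTYPE + message)));
        } catch (SAXException e) {
            problems[0]++; // not well-formed
        } catch (Exception e) {
            throw new IllegalStateException("parser setup or I/O failed", e);
        }
        return problems[0];
    }

    public static void main(String[] args) {
        String ok = "<MAIL><head><TO>ann</TO><FR>bob</FR><SB>hi</SB></head>"
                  + "<body><P>one</P><P>two</P></body></MAIL>";
        String noSubject = "<MAIL><head><TO>ann</TO><FR>bob</FR></head><body/></MAIL>";
        String wrongOrder = "<MAIL><head><FR>bob</FR><TO>ann</TO></head><body/></MAIL>";
        System.out.println(countProblems(ok));
        System.out.println(countProblems(noSubject));
        System.out.println(countProblems(wrongOrder));
    }
}
```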

32 Open and close all tags Good: – The quick brown fox jumped over the lazy dog – A very important point –Copyright 1999 Ellis Horowitz Bad: –The quick brown fox jumped over the lazy dog – A very important point –Copyright 1999 Ellis Horowitz

33 Empty tags end with /> Write <BR/>, <HR/>, and <IMG/> instead of <BR>, <HR>, and <IMG>. Web browsers deal inconsistently with these. Can use <BR></BR>, <HR></HR>, and <IMG></IMG> instead.

34 There is a unique root element One element completely contains all other elements of the document This is HTML in HTML files The XML declaration and xml-stylesheet processing instruction are not elements

35 Elements may not overlap If an element contains a start tag for an element, it must also contain the corresponding end tag Empty elements may appear anywhere Every non root element has a parent element

36 Attribute values are quoted Good: – Bad: –

37 < and & are only used to start tags and entities
Good: O'Reilly &amp; Associates
Bad: O'Reilly & Associates
Good: for (int i = 0; i &lt;= args.length; i++ ) {
Bad: for (int i = 0; i <= args.length; i++ ) {

38 Only the five predefined entity references are used
Good: –&amp; –&lt; –&gt; –&quot; –&apos;
Bad: –&copy; –&reg; –&tm; –&alpha; –&eacute; –&nbsp; –etc.
DTDs loosen this restriction by allowing you to define new entities, even in an invalid document.
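
A sketch of how the JDK's parser treats entity references. The predefined five resolve with no DTD; an undeclared name such as copy is a well-formedness error; declaring it in an internal DTD subset makes the same reference legal, even though the document is not valid. The class name, method name, and sample documents are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class EntityReferences {
    // Returns the root element's text with entities resolved,
    // or null if the parser rejects the document.
    static String textOrNull(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler()); // keep stderr quiet
            return builder.parse(new InputSource(new StringReader(xml)))
                .getDocumentElement()
                .getTextContent();
        } catch (SAXException e) {
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("parser setup or I/O failed", e);
        }
    }

    public static void main(String[] args) {
        // Predefined entity: resolved without any DTD.
        System.out.println(textOrNull("<t>O'Reilly &amp; Associates</t>"));
        // Undeclared entity: rejected.
        System.out.println(textOrNull("<t>&copy; 1999</t>"));
        // Declared in the internal subset: accepted (non-validating parse).
        System.out.println(textOrNull(
            "<!DOCTYPE t [<!ENTITY copy \"&#169;\">]><t>&copy; 1999</t>"));
    }
}
```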

42 Compare DTD & Schema

44 A DTD for Songs

45 A Valid Song Document Hot Cop Jacques Morali Henri Belolo Victor Willis Jacques Morali PolyGram Records 1978 Village People

46 XSLT - XSL Transformations XSL (eXtensible Stylesheet Language) consists of two parts: XSL Transformations and XSL Formatting Objects. An XSLT stylesheet is an XML document defining a transformation for a class of XML documents. A stylesheet separates content and logical structure from presentation. XSLT was not intended as a completely general-purpose XML transformation language; it was designed for use with XSL Formatting Objects. Nevertheless, XSLT is generally useful. The basic design: XSLT is declarative and based on pattern matching and templates.

47 Song.xml processed with song2HTML.xsl

48 song2HTML.xsl

49 song2HTML.xsl

50 Transformer.java

51 Processing model
template rule = pattern + template
Construction of the result tree fragment:
–the source tree is processed by processing the root
–a single node is processed by 1. finding the template rule with the best matching pattern, 2. instantiating its template (creates a fragment and continues processing recursively)
–a node list is processed by processing each node in order
current node: the node currently being processed
current node list: the node list currently being processed (used for evaluation context later)
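
A sketch of the processing model using the XSLT 1.0 processor in the JDK (javax.xml.transform). It is not the deck's song2HTML.xsl or Transformer.java; the stylesheet, tag names, and class name are assumptions. Each xsl:template is one template rule (pattern in match, template in the body), and xsl:apply-templates continues processing recursively with the child nodes.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class SongTransform {
    // Assumed source document; tag names are illustrative.
    static final String SONG_XML =
        "<SONG><TITLE>Hot Cop</TITLE>"
        + "<COMPOSER>Jacques Morali</COMPOSER>"
        + "<COMPOSER>Henri Belolo</COMPOSER></SONG>";

    // Three template rules. Text nodes have no rule of their own here, so
    // XSLT's built-in rule copies their text to the result.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:template match=\"SONG\">"
        + "<html><body><xsl:apply-templates/></body></html>"
        + "</xsl:template>"
        + "<xsl:template match=\"TITLE\">"
        + "<h1><xsl:apply-templates/></h1>"
        + "</xsl:template>"
        + "<xsl:template match=\"COMPOSER\">"
        + "<p><xsl:apply-templates/></p>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies STYLESHEET to the given document and returns the serialized result.
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(SONG_XML));
    }
}
```

Processing starts at the root node, whose built-in rule applies templates to its children; SONG then matches the first rule, and its children are processed in document order.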

60 CSS Examples – self study

61 A Blank Style Sheet...

62 The Default Rule
Not every element needs a rule.
The root element should be at least display: block.
catalog { font-family: New York, Times New Roman, serif; font-size: 14pt; background-color: white; color: black; display: block }

63 A style rule for the category element
Make it look like an H1 heading.
category { display: block; font-family: Helvetica, Arial, sans-serif; font-size: 32pt; font-weight: bold; text-align: center }
catalog { font-family: New York, Times New Roman, serif; font-size: 14pt; background-color: white; color: black; display: block }

64 A style rule for the composer element
Make it look like a level 2 head.
No need to stylize the first, middle, and last names separately.
composer { display: block; font-family: Helvetica, Arial, sans-serif; font-size: 24pt; font-weight: bold; text-align: left }

65 A style rule for the title element
composition title { display: block; font-family: Helvetica, Arial, sans-serif; font-size: 18pt; font-weight: bold; text-align: left }

66 Style Rules for composition children composition * {display:list-item} description {display: block}

67 Finished Style Sheet
category { display: block; font-family: Helvetica, Arial, sans-serif; font-size: 32pt; font-weight: bold; text-align: center }
catalog { font-family: New York, Times New Roman, serif; font-size: 14pt; background-color: white; color: black; display: block }
composer { display: block; font-family: Helvetica, Arial, sans-serif; font-size: 24pt; font-weight: bold; text-align: left }
composition title { display: block; font-family: Helvetica, Arial, sans-serif; font-size: 18pt; font-weight: bold; text-align: left }
composition * { display: list-item }
description { display: block }
/* cataloging_info is only for search engines */
cataloging_info { display: none; color: #FFFFFF }
last_updated, copyright, maintainer { display: block; font-size: small }
copyright:before { content: "Copyright " }
last_updated:before { content: "Last Modified " }
last_updated { margin-top: 2ex }

68 Java Parsers
–DOM Parser – tree structure
–SAX Parser – event-driven approach
A DOM parser can use a SAX parser to do the parsing and then build a tree structure from the events it reports.

69 Day Planner – example DTD

70 Planner Application
