Copyright © William G. Cafiero, 2000 GE Global eXchange Services Page 1 Bill Cafiero 972-231-2180 A short on-line XML Tutorial may be.


1 Copyright © William G. Cafiero, 2000 GE Global eXchange Services Page 1 Bill Cafiero A short on-line XML Tutorial may be found at

2 Where Does XML Fit? The Internet creates a need for platform-independent technology. [Diagram labels: XML, HTML, Java, Internet; presentation, data, processing, platform]

3 eXtensible Markup Language. XML is designed to transfer structured text and data among systems in multiple organizations. XML and HTML both evolved from SGML: XML focuses on document data content; HTML focuses on document display. All markup languages use start and end tags to mark up the text and data, providing information about the information.

4 HTML: HyperText Markup Language. Non-proprietary document formatting standard. Displayable on any web browser. Examples (the HTML tags were lost in extraction, so source and result read the same): HTML: When my dog jumped over the lazy fox, I didn't know what to do! Result: When my dog jumped over the lazy fox, I didn't know what to do! HTML: My Red Text! Result: My Red Text!

5 History of XML. XML: eXtensible Markup Language. Conceived in 1996 by a team chaired by Jon Bosak. W3C (World Wide Web Consortium) recommended for standard, January. Derived from its parent, SGML (Standard Generalized Markup Language). HTML (HyperText Markup Language) is an earlier cousin. This is what XML looks like: February 29, :30 am 249 Cedar Elm Road Lots of high quality junk for sale

6 An XML Tutorial

7 This is an XML “Instance” document: The Catcher in the Rye, J.D. Salinger. This sample contains a “prolog”, a reference to the accompanying “rules” for the document, element and attribute names, and “content”. These are all identified by “markup syntax”.

8 Tags, Elements and Attributes. Tags are labels that tell an application or other agent to do something to whatever is encased in the tags. A tag in angle brackets is a “start” tag; the same tag with a leading slash is the closing or “end” tag. An Element refers to both the tags plus the content (the stuff between the tags), e.g. The Catcher in the Rye. The outermost element in the hierarchy is called the “Root Element” (Book in our example). Any start tag can have an Attribute, which takes the form of a name/value pair. Note: XML is case sensitive, so tags that differ only in letter case are all different.
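The terms on this slide can be sketched with Python's standard-library parser. This is only an illustration: the root element name Book and the text "The Catcher in the Rye" come from the slides, while the Title element name and the category attribute are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: "Title" and the "category" attribute are assumed names.
doc = '<Book category="fiction"><Title>The Catcher in the Rye</Title></Book>'

root = ET.fromstring(doc)      # the root element
title = root.find("Title")     # a child element: tags plus content

print(root.tag)                # name of the root element
print(root.attrib)             # attributes as name/value pairs
print(title.text)              # content between the start and end tags

# Case sensitivity: <Title> closed by </title> is a mismatched tag,
# so the parser rejects the document.
mismatched = "<Book><Title>The Catcher in the Rye</title></Book>"
try:
    ET.fromstring(mismatched)
    case_mismatch_accepted = True
except ET.ParseError:
    case_mismatch_accepted = False
```

Looking up "title" in lower case on the first document finds nothing, for the same reason.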

9 The XML Character Set. Unicode is the character set for XML: the Universal Multiple-Octet Coded Character Set (UCS); 16-bit encoding for the world's principal languages, including ancient languages; ISO/IEC :1993. Description of the whole set is available from

10 XML Characters. Defined by UNICODE (ISO 10646), supported by NT, Win95/98 and Java platforms. [Figure: Unicode code-space chart with regions labeled General scripts, Symbols, CJK Misc, CJK Ideographs, Hangul, Surrogates, Private Use, Compatibility]

11 Names. Names begin with a letter or one of a few punctuation characters (“_”, “:”) and continue with letters, digits, hyphens, underscores, colons or periods. Spaces are not allowed in names! Names beginning with the string “XML” (in any letter case) are reserved.
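As a rough sketch, the naming rules can be written as a regular expression. The pattern below is an ASCII-only approximation chosen for brevity (the full XML 1.0 rule also admits many non-ASCII letters), and the function names are hypothetical.

```python
import re

# ASCII-only approximation of an XML name (an assumption for brevity):
# first character is a letter, "_" or ":"; the rest may also be digits, "-" or ".".
NAME_RE = re.compile(r"[A-Za-z_:][A-Za-z0-9._:\-]*")

def is_valid_name(name):
    """Check the start-character and following-character rules."""
    return NAME_RE.fullmatch(name) is not None

def is_reserved(name):
    """Names starting with 'xml' in any letter case are reserved."""
    return name.lower().startswith("xml")

for candidate in ["Order_Number", "_id", "2ndLine", "-dash", "Line Number"]:
    print(candidate, is_valid_name(candidate))
```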

12 Elements. There are two kinds of elements: those that have content and those that don't (empty elements). Example content: This is the title
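A small sketch of the two kinds of element, using Python's standard library. The text "This is the title" is from the slide; the element names chapter, title and pagebreak are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# "pagebreak" is a hypothetical empty element, written in both allowed forms:
# the shorthand <pagebreak/> and the start/end pair <pagebreak></pagebreak>.
doc = ("<chapter><title>This is the title</title>"
       "<pagebreak/><pagebreak></pagebreak></chapter>")

chapter = ET.fromstring(doc)
title = chapter.find("title")
breaks = chapter.findall("pagebreak")

print(title.text)                       # element with content
print([b.text for b in breaks])         # empty elements carry no text
```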

13 Attributes. Attributes are a way of attaching characteristics or properties to elements of a document. Attributes have names and values. Example content: Bill Smith, John Doe

14 XML Prolog. Callouts on the example: Encoding declaration. Must always precede the XML content. Written in processing-instruction syntax, so no closing tag. The Prolog is made up of an XML declaration and a document type declaration (both optional). We will look at the DOCTYPE declaration in more detail later.
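The role of the encoding declaration can be sketched as follows. The city element and its content are made up for the example; the point is only that the parser reads the declaration to decide how to decode the bytes, and that the declaration has to come first.

```python
import xml.etree.ElementTree as ET

# Bytes encoded as ISO-8859-1; the XML declaration tells the parser so.
raw = ('<?xml version="1.0" encoding="ISO-8859-1"?>\n'
       '<city>Montr\u00e9al</city>').encode("iso-8859-1")
city = ET.fromstring(raw)
print(city.text)

# The declaration must be the very first thing in the document:
# even a newline in front of it makes the document not well-formed.
try:
    ET.fromstring(b'\n<?xml version="1.0"?><city/>')
    declaration_first_enforced = False
except ET.ParseError:
    declaration_first_enforced = True
```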

15 Comments. Adding Comments to XML
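The slide's own example did not survive in this transcript, so here is a minimal sketch of a comment, with assumed element names. Comments start with `<!--` and end with `-->`; Python's default ElementTree parser skips them, so they never show up as data.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the comment is for human readers only.
doc = """<Book>
  <!-- This comment is ignored by applications reading the data -->
  <Title>The Catcher in the Rye</Title>
</Book>"""

root = ET.fromstring(doc)
children = [child.tag for child in root]
print(children)
```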

16 Hierarchy and Navigation

17 Structure. Structure in XML documents resembles storage containers. Each storage container fits inside a larger one, which fits inside another, and so on. The storage containers make up the physical structure, and the way they fit inside one another makes up the logical structure of the document.

18 An Order in XML. [Diagram element groups: Orders (Order_Number, Order_Date, Customer); Detail (Line_Number, Item, Quantity, Price); Shipment (Shipment_Number, Shipment_Date); Shipment_Detail (Line_Number, Quantity)] Values surviving from the XML listing: /24/00, Bill's Supply Company, 1, A, B, /15/
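To show what navigating such a hierarchy looks like, here is a sketch using the element names from the diagram. The nesting (Detail and Shipment as children of Orders, Shipment_Detail inside Shipment) and every value except the customer name are assumptions, since the original listing is not preserved.

```python
import xml.etree.ElementTree as ET

# Element names follow the slide; nesting and values are illustrative only.
order_xml = """
<Orders>
  <Order_Number>1001</Order_Number>
  <Order_Date>2000-03-01</Order_Date>
  <Customer>Bill's Supply Company</Customer>
  <Detail>
    <Line_Number>1</Line_Number><Item>A</Item>
    <Quantity>10</Quantity><Price>2.50</Price>
  </Detail>
  <Detail>
    <Line_Number>2</Line_Number><Item>B</Item>
    <Quantity>4</Quantity><Price>5.00</Price>
  </Detail>
  <Shipment>
    <Shipment_Number>1</Shipment_Number>
    <Shipment_Date>2000-03-05</Shipment_Date>
    <Shipment_Detail>
      <Line_Number>1</Line_Number><Quantity>10</Quantity>
    </Shipment_Detail>
  </Shipment>
</Orders>
"""

orders = ET.fromstring(order_xml)

# Walk down the hierarchy: direct children first, then a nested path.
customer = orders.findtext("Customer")
details = orders.findall("Detail")
items = [d.findtext("Item") for d in details]
total = sum(int(d.findtext("Quantity")) * float(d.findtext("Price"))
            for d in details)
shipped = [int(q.text)
           for q in orders.findall("Shipment/Shipment_Detail/Quantity")]

print(customer, items, total, shipped)
```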

19 Structure: Well Formed. Well-formed documents are tightly constructed -- no “loose ends”. Well-formed documents use complete storage containers. No missing end tags in well-formed XML documents.
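A parser makes the "no loose ends" rule concrete: it refuses anything that is not well-formed. The helper name and the sample documents below are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

samples = {
    "complete containers": "<chapter><title>Intro</title></chapter>",
    "missing end tag": "<chapter><title>Intro</chapter>",
    "overlapping containers": "<a><b></a></b>",
    "two root elements": "<a/><b/>",
}
results = {label: is_well_formed(text) for label, text in samples.items()}
print(results)
```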

20 Document Type Definitions (DTDs)

21 Document Type Definitions. Because we can create our own tags and document structures using XML, we need a mechanism for defining the tags and what the valid structure is. A DTD is where we declare our specific element tags. The DTD is where we declare the attributes of each tag. The DTD specifies the “occurrence indicators” for the child elements: ? zero or one; * zero or more; + one or more; (none) exactly one.

22 DTD Syntax. Declarative markup consists of: Markup open delimiter:

23 Document Type Declaration. Must precede all markup and character data. Links document and declarations. Can be an external reference. Not required for a well-formed, non-validating XML document. Name must match the root element's tag name.

24 Internal or External DTDs. Sample data (markup lost in extraction): Rock N. Robyn, Jay Bird Street, Baltimore, MD, USA. Here the DTD is part of the same data file as the XML data.
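Since the slide's markup is lost, here is a hedged sketch of a document that carries its DTD internally. The data values are from the slide; the element names (address, name, street, city, state, country) are guesses. Python's standard-library parsers read the DOCTYPE but do not validate against it, so the code only checks that the DOCTYPE name matches the root element.

```python
from xml.dom import minidom

# Element names are assumptions; only the data values come from the slide.
doc_text = """<?xml version="1.0"?>
<!DOCTYPE address [
  <!ELEMENT address (name, street, city, state, country)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT street (#PCDATA)>
  <!ELEMENT city (#PCDATA)>
  <!ELEMENT state (#PCDATA)>
  <!ELEMENT country (#PCDATA)>
]>
<address>
  <name>Rock N. Robyn</name>
  <street>Jay Bird Street</street>
  <city>Baltimore</city>
  <state>MD</state>
  <country>USA</country>
</address>"""

dom = minidom.parseString(doc_text)
doctype_name = dom.doctype.name            # name given in <!DOCTYPE ...>
root_name = dom.documentElement.tagName    # name of the root element
print(doctype_name, root_name)
```

Checking the content against the declarations would need a validating parser, which is outside the standard library.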

25 DTD in an External File. Sample data (markup lost in extraction): Rock N. Robyn, Jay Bird Street, Baltimore, MD, USA. Here the DTD is stored in a local file.

26 DTD on the Web. Sample data (markup lost in extraction): Rock N. Robyn, Jay Bird Street, Baltimore, MD, USA. Here the DTD is on a remote web server.

27 XML Elements. Elements are the "building blocks" within the logical structure of a document, like paragraph, title, section... Elements have unique names, and name lengths are not restricted. The first NAME character must be a letter, “_” or “:” (Note: numerics are not allowed). Declaration keywords must be all uppercase; names may be mixed case and are case sensitive.

28 Element Declarations. Callouts on the example: keyword, element name, content model. ELEMENT is the keyword; “sender” is the element name. Element names specify the name of the declared element; element names are sometimes called “generic identifiers” (gi).

29 XML Connectors. | OR (only one of the alternatives); , THEN (in sequence); ( ) GROUP. Connectors provide the rules for the sequence or order in which the items in the content model may appear.

30 Sequence Connector. Elements separated by the sequence connector must appear in the order they are listed. Example: a chapter consists of a title followed by a paragraph.

31 OR Connector. The OR connector means only one of these elements may appear. An Item consists of a Product, OR an Item consists of a Service. But not both!

32 XML Occurrence Indicators. ? ZERO or ONE (optional); * ZERO or MORE (optional repeatable); + ONE or MORE (required repeatable); (null) ONE ONLY (required). Occurrence indicators provide the rules that show how many times items in the content model may appear.

33 Required and Repeatable. Element appears 1 or more times. A chapter could consist of one of these: a title followed by a paragraph; a title followed by two paragraphs; a title followed by three paragraphs; a title followed by thirty-seven paragraphs; many other options. NOTE: You would not be allowed to have a chapter with only a title.

34 Optional. The optional occurrence indicator means the element may appear 0 or 1 times. A chapter could consist of one of these: a title followed by a paragraph; a paragraph. NOTE: You could not have a chapter with 2 titles followed by a paragraph.

35 Nested Model Groups. Model groups can be nested inside one another. A chapter could consist of one of these: a title followed by at least one paragraph; a title followed by an illustration; an illustration; a paragraph; many paragraphs.
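Connectors and occurrence indicators behave much like regular-expression operators, which gives a quick way to experiment with content models. The sketch below is not a DTD validator: it handles only content models made of element names, and the content models themselves (the declarations did not survive in this transcript) are guesses that fit the lists on the slides.

```python
import re

def model_to_regex(model):
    """Turn a content model of element names into a regex that matches a
    sequence of child names, each followed by one space."""
    compact = "".join(model.split())                      # drop whitespace
    named = re.sub(r"[A-Za-z_][\w.\-]*",
                   lambda m: "(?:" + re.escape(m.group(0)) + " )",
                   compact)                               # name -> group
    return named.replace(",", "")   # sequence is plain concatenation

def allowed(model, children):
    """True if the list of child element names fits the content model."""
    sequence = "".join(name + " " for name in children)
    return re.fullmatch(model_to_regex(model), sequence) is not None

# Assumed content models matching the slide descriptions.
REQUIRED_REPEATABLE = "(title, para+)"
OPTIONAL = "(title?, para)"
CHOICE = "(Product | Service)"
NESTED = "(title?, (para+ | illustration))"

print(allowed(REQUIRED_REPEATABLE, ["title", "para", "para"]))
print(allowed(NESTED, ["title", "illustration"]))
```

The operators `?`, `*`, `+`, `|` and parentheses pass through unchanged because they mean the same thing in both notations; only the comma needs translating.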

36 Attributes with a Choice of Values. Attributes describe special conditions associated with individual elements; they are often used as the “adjectives” of XML.

37 Attribute Defaults. An attribute may have a default value specified in the DTD.
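A sketch of an attribute with a choice of values and a default, using hypothetical names (shipment, method) because the slide's declaration is not preserved. Python's expat-based parser does not validate, but it does apply defaults declared in an internal DTD subset, which is enough to show the effect.

```python
import xml.etree.ElementTree as ET

# Hypothetical declaration: "method" may be air, ground or sea;
# "ground" is the default when the attribute is omitted.
DTD = """<!DOCTYPE shipment [
  <!ELEMENT shipment (#PCDATA)>
  <!ATTLIST shipment method (air | ground | sea) "ground">
]>
"""

omitted = ET.fromstring(DTD + "<shipment>Order 1</shipment>")
explicit = ET.fromstring(DTD + '<shipment method="air">Order 2</shipment>')

print(omitted.get("method"))    # value supplied by the DTD default
print(explicit.get("method"))   # value written in the document
```

Rejecting a value outside the declared choices is the job of a validating parser, which this sketch does not use.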

38 An XML Document and its DTD. Files shown: Book.dtd and Book.xml. Sample content: The Catcher in the Rye, J.D. Salinger.

39 Good Style for DTDs. Nesting: elements may contain more than one other element: Buyer(Company,Contact). Elements that have a single element in their models are ones where that element is repeatable: Street(Line+). Data modelling and logical naming of elements ensure accurate representation of the relationships between components. Keep the role of the DTD simple; don't overload it.

40 Namespaces

41 Why Namespaces? The appeal of XML lies in the ability to invent tags that convey meaningful information. For example, XML allows you to represent information about a book as: A Suitable Boy. Similarly, you can represent information about an author as: Mr Vikram Seth. This example illustrates a problem. While the human reader can distinguish between the different interpretations of the "TITLE" element, a computer program does not have the context to tell them apart.

42 Namespaces. Namespaces solve this problem by associating a vocabulary (or namespace) with a tag name. For example, the titles can be written as: A Suitable Boy; Mr. The name preceding the colon, the prefix, refers to a namespace, which is identified by a Uniform Resource Identifier (URI). The URI ensures global uniqueness when merging XML sources, while the associated prefix, a short name that substitutes for the namespace, need only be unique in the tightly scoped context of the document. With this scheme, there are no conflicts in tags and attributes, and two tags can be the same only if they are from the same namespace and have the same tag name. This allows a document to contain both book and author information without confusion about whether the "TITLE" element refers to the book or the author.

43 Namespaces - Examples. An XML namespace is a collection of names, identified by a URI reference, which are used in XML documents as element types and attribute names. This example shows both an element (publisher) and an attribute (category) qualified by the prefix “pubspace”: Numerical Analysis of Partial Differential Equations, Addison Wesley. The attribute "xmlns" is an XML keyword for a namespace declaration.
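The two TITLE elements and the qualified attribute can be sketched as below. The namespace URIs, the prefixes book and author, the wrapper element record and the attribute value are all made up for illustration; only TITLE, publisher, category, pubspace and the text values appear in the slides.

```python
import xml.etree.ElementTree as ET

# Made-up namespace URIs: they are identifiers only and are never fetched.
doc = """<record xmlns:book="http://example.com/ns/book"
        xmlns:author="http://example.com/ns/author"
        xmlns:pubspace="http://example.com/ns/publisher">
  <book:TITLE>A Suitable Boy</book:TITLE>
  <author:TITLE>Mr</author:TITLE>
  <pubspace:publisher pubspace:category="textbooks">Addison Wesley</pubspace:publisher>
</record>"""

root = ET.fromstring(doc)

# The parser expands each prefix to its URI, so the two TITLEs differ.
tags = [child.tag for child in root]

# Prefixes used for lookup are local to this code and need not match the document.
ns = {"b": "http://example.com/ns/book",
      "a": "http://example.com/ns/author",
      "p": "http://example.com/ns/publisher"}
book_title = root.find("b:TITLE", ns).text
author_title = root.find("a:TITLE", ns).text
category = root.find("p:publisher", ns).get(
    "{http://example.com/ns/publisher}category")

print(tags)
```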

44 What XML namespaces are Not. Two things that XML namespaces are not have caused a lot of confusion, so we'll mention them here: XML namespaces are not a technology for joining XML documents that use different DTDs. Although they might be used in such a technology, they don't provide it themselves. The URIs used as XML namespace names do not point to schemas, information about the namespace, or anything else -- they're just identifiers. URIs were used simply because they're a well-known system for creating unique identifiers. Don't even think about trying to resolve these URIs.

45 XML Schemas

46 Why Schemas? Although the DTD may have been powerful enough in many instances, it is inadequate to meet the needs of many applications that have been envisaged to use XML. The DTD does not support data types beyond character data, which is a severe limitation for describing standards and exposing database schemas. The DTD is not integrated with new XML technologies like Namespaces, so it is not possible to import constructs from external schemas to enable code reuse. Applications simply need a more flexible mechanism to specify constraints on document structure than a context-free grammar.

47 Additional Features of XML Schema. One of the main weaknesses of the DTD was its lack of support for data types beyond character strings. For example: “A few years ago” is correct using the previous DTD. XML Schema supports the following additional data types: string, boolean, real, decimal, integer, non-negative integer, positive integer, non-positive integer, negative integer, dateTime, date, time, timePeriod, binary, uri, language
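The data-type gap can be illustrated with a hand-written check of the kind a schema validator would otherwise perform. The element names book and year and the value 1951 are assumptions; a DTD that declares year as character data accepts both documents below, while an integer type would reject the second. Python's int() is only a rough stand-in for a schema integer type.

```python
import xml.etree.ElementTree as ET

def year_is_integer(book_xml):
    """Rough stand-in for an integer data-type check on <year>."""
    year_text = ET.fromstring(book_xml).findtext("year", default="")
    try:
        int(year_text.strip())
        return True
    except ValueError:
        return False

typed_ok = year_is_integer("<book><year>1951</year></book>")
typed_bad = year_is_integer("<book><year>A few years ago</year></book>")
print(typed_ok, typed_bad)
```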

48 User Defined Data Types. Further constraints can be placed on the range of possible data values by creating new data types that extend built-in data types. For example, if our book list covered Twentieth Century literature, in XML Schema, we can limit the values of the year element to be between 1900 and [...]. Note the Schema itself is written in XML!

49 Save effort by using XML Schemas. In a typical program, up to 60% of the code is spent checking the data! [Diagram labels: Code to check the structure and content of the data; Code to actually do the work]

50 Save effort using XML Schemas (cont.). If your data is structured as XML, and there is a schema, then you can hand the data-checking task off to a schema validator. Thus, your code is reduced by up to 60%!!! Big $$ savings! [Diagram labels: Code to check the structure and content of the data; Code to actually do the work]

51 A Final Example

52 A Payment Transaction

53 Payment Transaction Hierarchy. [Diagram node labels, in extraction order: Total_Payment_Amount, Payment_Date, Name, Address, Bank, Bank_Code, Account_Number, Payee, Name, Address, Bank, Bank_Code, Account_Number, Payor, Funds_Tranfer, Amount, Reason, Adjustment, Invoice_Number, Date, Amount_Invoiced, Amount_Paid, Amount, Reason, Adjustment, Invoice, Remittance_Data, Payment]

54 The Payment in XML

55 The DTD for the Payment

56 Thank you

