XML, XSLT. Discussion on Markup Languages, Trends.


1 XML, XSLT

2 Discussion on Markup Languages, Trends

3 What is XML? XML is the Extensible Markup Language It is designed to enable the use of SGML on the World Wide Web. It defines ‘an extremely simple dialect of SGML’ XML is not a single, predefined markup language: it's a metalanguage -- a language for describing other languages XML lets you define your own customized markup languages XML is a markup language for structured documentation

4 What is XML for? To make it easy and straightforward to use SGML on the Web –easy to define document types –easy to author and manage SGML-defined documents –easy to transmit and share them across the Web It defines ‘an extremely simple dialect of SGML’ which is completely described in the XML Specification The goal is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML XML has been designed for ease of implementation, and for interoperability with both SGML and HTML

5 What is SGML? SGML is the Standard Generalized Markup Language (ISO 8879) It is the international standard for defining descriptions of the structure and content of different types of electronic document SGML is quite complex to implement and contains a lot of features that are very rarely used SGML parsers and browsers are complex and difficult to write

6 Why XML? Extensibility; Structure; Validation; Media independence; Platform independence; Meaningful markup. Strict hierarchies ensure clean structures. Define your own tags and schema

7 Why XML? Structure modeling (DTD); UNICODE - multilingual support; XML replaces ASCII CSV for data interchange; XML provides the means for structured information exchange

8 A bridge between SGML and HTML Simplified version of SGML –SGML too complex and heavy to use –Difficult to interpret an SGML document –A new standard which would take the best features of SGML, yet keep it SIMPLE HTML is too rigid. It has served its purpose. XML is more extensible By defining your own markup language (DTD), you can encode the information of your documents much more precisely

9 Why XML? It removes two constraints which are holding back Web development –dependence on a single, inflexible document type (HTML) –the complexity of full SGML, whose syntax allows many powerful but hard-to-program options Authors and providers can design their own document types using XML, instead of being stuck with HTML Document types can be explicitly tailored to an audience, so the cumbersome fudging that has to take place with HTML to achieve special effects can become a thing of the past

10 Why XML? Information content can be richer and easier to use, because the hypertext linking abilities of XML are much greater than those of HTML XML can provide more and better facilities for browser presentation and performance, using CSS and XSL stylesheets Information will be more accessible and reusable, because the more flexible markup of XML can be used by any XML software instead of being restricted to specific manufacturers as has become the case with HTML

11 SGML, XML and HTML… SGML is the `mother tongue', used for describing thousands of different document types HTML is just one of these document types, the one most frequently used in the Web –It defines a simple, fixed type of document with markup designed for a common class of documents XML is an abbreviated version of SGML –Makes it easier for you to define your own document types –Omits the more complex and less-used parts of SGML

12 SGML, XML and HTML… XML itself does not replace HTML: instead, it provides an alternative which allows you to define your own set of markup elements HTML is expected to remain in common use for some time to come –Document Type Definitions for HTML are available in XML versions (XHTML)

13 HTML vs. XML HTML: Presentation + unstructured data; it is an application of SGML; predefined tags; inability to nest components properly; one view. XML: Data structure divorced from presentation (more OO-like); it is a subset of SGML and a metalanguage; user-defined tags; ability to nest components using a DTD; one document, multiple views.

14 HTML vs. XML - Example: A Resume HTML Representation: RESUME, Name: Amit Rekhi, Age: 23yrs XML Representation: Amit Rekhi, 23yrs, Male DTD XML
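The markup of the resume example did not survive in the transcript, so the following is only a sketch of the "one document, multiple views" idea. The element names (resume, name, age, sex) are assumptions chosen for illustration, not the tags used on the original slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: these element names are illustrative assumptions.
RESUME_XML = """<resume>
  <name>Amit Rekhi</name>
  <age>23yrs</age>
  <sex>Male</sex>
</resume>"""


def as_text(resume):
    """View 1: render each child element as a 'field: value' line."""
    return "\n".join(f"{child.tag}: {child.text}" for child in resume)


def as_html(resume):
    """View 2: render the same data as a minimal HTML table."""
    rows = "".join(
        f"<tr><th>{child.tag}</th><td>{child.text}</td></tr>" for child in resume
    )
    return f"<table>{rows}</table>"


resume = ET.fromstring(RESUME_XML)
print(as_text(resume))
print(as_html(resume))
```

The data is stored once, with tags that say what each value means; how it is displayed is decided separately by whichever view function is applied.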

15 What happened to HTML? HTML is already overburdened with dozens of interesting but incompatible inventions from different manufacturers, because it provides only one way of describing your information. It is too rigid and fixed. No easy, generic and standard way to extend HTML It mixes structure with presentation It does not allow groups of people or organizations to create their own customized, standardized markup applications for exchanging information in their domain HTML has served its purpose of making the web popular. Now something better and more extensible is needed

16 What happened to HTML? HTML is broken. A few start tags do not have end tags. Can I make my existing HTML files work in XML? If so how?

17 Why XML and not Word or Notes? Public information cannot afford to be restricted to one make or model or manufacturer It is helpful for such information to be in a standard form that can be reused in many different ways, as this can minimize wasted time and effort Proprietary data formats, no matter how well documented or publicized, are simply not an option XML gives a standard format for document interchange which is easily understood programmatically

18 Is XML the same as C/C++/Java? XML is a markup language C/C++/Java are programming languages XML does not have programming constructs. –No if, for...next etc. –No compiler, ONLY a parser If no constructs, then how to represent logic? Do I need to do it? If so where is it done?

19 How to control presentation in XML? In XML, you can define your own tagset, consequently browsers cannot know anything about the names/elements you use so the use of a stylesheet is required XML deals only with structure. Style and presentation are taken care of separately Concept of presentation is similar to Document View Architecture How should presentation be taken care of? Any reuse of XML here?

20 Can I use Java/C++/Scripts in XML? XML is ONLY about describing information Scripting languages and other programming languages provide embedded functionality that enables information to be manipulated at the user's end; they are not used to represent structure No place for PLs in XML Do I need to represent PL logic in XML? How?

21 Can I use PLs to create XML files? Any programming language can be used to output data from any source in XML format XML is an interchange format. It is an input/output format. It only represents structure Implementations are available to manipulate XML Should you have APIs to access XML? What types? How would APIs relate to PLs?
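As a minimal sketch of a program that outputs data in XML format, the following uses Python's standard xml.etree.ElementTree API. The record contents and the element names (order, line, item, quantity) are assumptions made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical source data; in practice this could come from a database or file.
records = [{"item": "Pencil", "quantity": "12"}]


def to_xml(rows):
    """Build an <order> tree with one <line> per record and serialize it."""
    root = ET.Element("order")
    for row in rows:
        line = ET.SubElement(root, "line")
        for key, value in row.items():
            # Each dictionary key becomes a child element holding the value.
            ET.SubElement(line, key).text = value
    # encoding="unicode" returns a str rather than bytes.
    return ET.tostring(root, encoding="unicode")


print(to_xml(records))
```

Going through an API rather than string concatenation means reserved characters in the data (such as & or <) are escaped by the serializer.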

22 How to execute XML files? You can't and you don't XML is not a programming language, so XML files don't ‘run’ or ‘execute’ XML files are data: You have to –Run a program which displays them (like a browser) –Write a program that does some work with them (like a converter which writes the data in another format) –Create a program that creates them

23 What does XML look like? Hello, world! Stop the planet, I want to get off!

24 What does XML look like? <!DOCTYPE titlepage SYSTEM "http://www.frisket.org/dtds/typo.dtd" [ ]> <title font="Baskerville" size="24/30" alignment="centered">Hello, world!

25 XML - A Simple Example Structure Definition DTD - defines STRUCTURE of XML documents DTD - used for VALIDATION of XML documents

26 XML - Instance Data <!DOCTYPE order SYSTEM "http://www.something.org/messages/xml/message1.xml"> 0000123 A. B. Infosys Private Limited B-102, Gulmohar Park New Delhi India 110049 Pencil 12

27 Benefits of XML Separation of presentation (XSL) from structure (XML-DTD). All XML based languages parsed using a single browser. Easy development of 3-tier Web applications. Data integration from disparate sources. XML data is self-describing. Interchange format for a variety of applications.

28 Benefits of XML Local computation and manipulation. Multiple views of data. XML based on Open Standards

29 EXERCISE 1 XML

30 What do you see here? XML Lars Marius Garshol larsga@ifi.uio.no 1.0 20.jun.97 What is XML? SGML light.

31 … And here? <!ATTLIST PART NO CDATA #IMPLIED TITLE CDATA #IMPLIED>

32 XML Document Well-formed: obeys XML syntax Valid: conforms to DTD

33 Concepts Document Type Definition (DTD) Validity Well-formedness

34 Concepts - DTD A Document Type Definition (DTD) is a file written in XML's declaration syntax It contains a formal description of the syntax and structure of a particular type of document –It sets out what names can be used for element types –It sets out where elements may occur –It also shows how all elements and other constructs fit together The relationship between a DTD and an XML document is similar to that between a class and an object

35 DTD Sample A DTD fragment:... An XML instance:... Chocolate Music Surfing

36 Concepts - Validity Valid XML files are those which have a Document Type Definition (DTD) and adhere to it They must already be well-formed A valid file begins with a Document Type Declaration (DOCTYPE) An XML version of the specified DTD must be accessible to the XML processor. This can be specified by supplying the URL for the DTD in a System Identifier Sample XML file: –

37 Concepts - Well-formedness All tags must be balanced: that is, all elements which may contain character data must have both start- and end-tags present All attribute values must be in quotes Any EMPTY element tags must either end with ‘/>’ or you have to make them appear non-EMPTY by adding a real end-tag There must not be any isolated markup-start characters (< or &) in your text data. If present it should be escaped. Elements must nest inside each other properly
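The rules above can be checked mechanically, since any XML parser must reject input that is not well-formed. The sketch below uses Python's standard xml.etree.ElementTree parser; the helper name is_well_formed and the sample strings are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


samples = [
    "<a><b>text</b></a>",    # balanced and properly nested
    "<a><b>text</a></b>",    # elements overlap instead of nesting
    "<a href=x></a>",        # attribute value not quoted
    "<br>",                  # start-tag with no end-tag
    "<br/>",                 # EMPTY element written with '/>'
    "<p>a & b</p>",          # isolated '&' in text
    "<p>a &amp; b</p>",      # the same text, escaped
]
for sample in samples:
    print(f"{sample!r}: {is_well_formed(sample)}")
```

This only tests well-formedness. Checking validity against a DTD needs a validating parser, which the standard library parser used here is not.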

38 Sample well-formed XML file?......

39 Structure of an XML file Word Document

40 XML Structure: Elements Elements are the most common form of markup Delimited by angle brackets (start tags, end tags), most elements identify the nature of the content they surround. Some elements may be empty, in which case they have no content and are shown as <element/> If an element is not empty, it begins with a start-tag, <element>, and ends with an end-tag, </element> Element Sample – Sample Content

41 XML Structure: Attributes Attributes are name-value pairs that occur inside tags after the element name All attribute values must be quoted Attribute Sample – Sample Content

42 XML Structure: Entity References An entity reference refers to the content of a named entity They are used to insert reserved markup characters (<, &, ") into your document as content Entities are also used to refer to often repeated or varying text and to include the content of external files In order to use an entity, you simply reference it by name References to parsed general entities use ampersand (&) and semicolon (;) as delimiters. Parameter-entity references use percent-sign (%) and semicolon (;) as delimiters

43 XML Structure: Char. References Entity References Sample – &amp; and %pe1; A character reference is a special form of entity reference It can be used to insert arbitrary Unicode characters into your document This is a mechanism for inserting characters that cannot be directly typed Character references take one of two forms: –decimal references, e.g. &#8478; –hexadecimal references, e.g. &#x211E; (both denote the character ℞)
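A short check of the two character-reference forms, using Python's standard parser. The element name rx is an arbitrary assumption; the code point is U+211E, which is 8478 in decimal.

```python
import xml.etree.ElementTree as ET

# One decimal and one hexadecimal reference to the same code point, U+211E.
doc = ET.fromstring("<rx>&#8478; &#x211E;</rx>")

# The parser replaces each reference with the character it denotes.
decimal_char, hex_char = doc.text.split()
print(decimal_char == hex_char)      # both forms yield the same character
print(hex(ord(decimal_char)))        # the code point as a hexadecimal number
```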

44 XML Structure: Comments Comments begin with "<!--" and end with "-->" They can contain any data except the literal string "--" You can place comments between markup anywhere in your document They are not part of the textual content of an XML document XML Comment Sample –

45 XML Struc.: Processing Instructions Processing instructions (PIs) are an escape hatch to provide information to an application They are not textually part of the XML document They have the form: <?name pidata?> –The name, called the PI target, identifies the PI to the application The names used in PIs may be declared as notations in order to formally identify them Any data that follows the PI target is optional; it is for the application that recognizes the target XML PI Sample –

46 XML Structure: CDATA Sections CDATA sections may occur anywhere character data may occur They are used to escape blocks of text containing characters which would otherwise be recognized as markup They begin with the string "<![CDATA[" and end with the string "]]>" The only string that cannot occur in a CDATA section is "]]>" Sample XML CDATA Section: –
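To illustrate what a CDATA section buys you, the sketch below parses the same text twice with Python's standard parser: once escaped with entity references and once wrapped in a CDATA section. The element name code and the sample text are assumptions for the example.

```python
import xml.etree.ElementTree as ET

# The same content, first escaped character by character...
escaped = ET.fromstring("<code>if (a &lt; b &amp;&amp; b &lt; c) run();</code>")

# ...then written literally inside a CDATA section.
cdata = ET.fromstring("<code><![CDATA[if (a < b && b < c) run();]]></code>")

# After parsing, the application sees identical character data either way.
print(escaped.text)
print(cdata.text == escaped.text)
```

CDATA is purely an authoring convenience: once parsed, the application cannot tell which form was used.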

47 XML Structure Document Type Declarations

48 XML Structure:Doc.Type Decl. For any XML document to have meaning there must be some constraint on the sequence and nesting of tags Declarations are where these constraints can be expressed Declarations allow a document to communicate meta-information to the parser about its content Meta-information includes –Allowed sequence and nesting of tags –Attribute values and their types and defaults –the names of external files that may be referenced and whether or not they contain XML –the formats of some external (non-XML) data, and entities

49 XML Structure:Doc.Type Decl. There are four kinds of declarations in XML : –element declarations –attribute list declarations –entity declarations –notation declarations

50 XML Structure:Element Decl. The element structure of an XML document may, for validation purposes, be constrained using element type and attribute-list declarations. Element type declarations often constrain which element types can appear as children of the element Element declarations identify the names of elements and the nature of their content (content model) In addition to element names, the special symbol #PCDATA is reserved to indicate character data Elements with both element content and PCDATA content are said to have “mixed content”

51 XML Structure:Element Decl. Three other content models are possible: –EMPTY indicates that the element has no content (and consequently no end-tag) –ANY indicates that any content is allowed –ELEMENT content indicates only elements within an element. Significance of *, +, ?, | and , XML Element Decl. Sample

52 XML Structure:Attribute Decl. Attribute declarations identify –Which elements may have attributes –What attributes they may have –What values the attributes may hold –What default value each attribute has Each attribute in a declaration has three parts: a name, a type, and a default value There are six possible types: –CDATA: CDATA attributes are strings, any text is allowed

53 XML Structure:Attribute Decl. –ID: The value of an ID attribute must be a name. All of the ID values used in a document must be different IDs uniquely identify individual elements in a document Elements can have only a single ID attribute –IDREF/IDREFS:An IDREF attribute's value must be the value of a single ID attribute on some element in the document The value of an IDREFS attribute may contain multiple IDREF values separated by white space. –

54 XML Structure:Attribute Decl. –ENTITY or ENTITIES: An ENTITY attribute's value must be the name of a single entity. The value of an ENTITIES attribute may contain multiple ENTITY values separated by white space –NMTOKEN or NMTOKENS: Name token attributes are a restricted form of string attribute The value of an NMTOKENS attribute may contain multiple NMTOKEN values separated by white space. –A list of names: The value of an attribute must be taken from a specific list of names This is frequently called an enumerated type

55 XML Structure:Attribute Decl. There are four possible default values: –#REQUIRED: The attribute must have an explicitly specified value on every occurrence of the element in the document –#IMPLIED: The attribute value is not required, and no default value is provided. If a value is not specified, the XML processor must proceed without one. –"value": An attribute can be given any legal value as a default. The attribute value is not required on each element, but if it is not present, it will appear to be the default. –

56 XML Structure:Attribute Decl. –#FIXED "value": An attribute declaration may specify that an attribute has a fixed value. The attribute is not required, but if it occurs, it must have the specified value. One use for fixed attributes is to associate semantics with an element XML Attribute Declaration sample: –
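The four kinds of default can be seen in a small document with an internal DTD subset. This is a sketch only: the element and attribute names are made up for illustration, and the standard-library parser used here is non-validating, so it supplies declared defaults but would not report a missing #REQUIRED attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: names are illustrative assumptions.
DOC = """<!DOCTYPE parts [
  <!ELEMENT parts (part*)>
  <!ELEMENT part EMPTY>
  <!ATTLIST part
      no     CDATA      #REQUIRED
      title  CDATA      #IMPLIED
      status (new|used) "new"
      unit   CDATA      #FIXED "each">
]>
<parts>
  <part no="A1"/>
  <part no="B2" title="Pencil" status="used"/>
</parts>"""

parts = ET.fromstring(DOC)
first, second = parts.findall("part")

# 'status' and 'unit' are filled in from the declarations when omitted;
# the #IMPLIED attribute 'title' simply stays absent on the first part.
print(first.attrib)
print(second.attrib)
```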

57 XML Structure:Entity Decl. Entity declarations allow you to associate a name with some other fragment of the document. The other fragment could be: –A chunk of regular text –A chunk of the document type declaration –A reference to an external file containing either text or binary data There are three kinds of entities: –Internal Entities: The replacement text is stored in the declaration. Internal entities allow you to define shortcuts for frequent text or text that is expected to change –

58 XML Structure:Entity Decl. The XML specification predefines five internal entities: –&lt; produces the left angle bracket, < –&gt; produces the right angle bracket, > –&amp; produces the ampersand, & –&apos; produces a single quote character (an apostrophe), ' –&quot; produces a double quote character, " Internal Entity Declaration Sample: –
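A minimal sketch of an internal entity declaration alongside the five predefined entities, parsed with Python's standard library. The entity name company, its replacement text, and the element name memo are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document declaring one internal entity in its internal subset.
DOC = """<!DOCTYPE memo [
  <!ENTITY company "Example Corp">
]>
<memo>&company; says: 1 &lt; 2 &amp; &quot;quoted&quot; &apos;text&apos; &gt; 0</memo>"""

memo = ET.fromstring(DOC)
# Declared and predefined entity references are replaced during parsing.
print(memo.text)

# A reference to an entity that was never declared is a parse error.
try:
    ET.fromstring("<memo>&undeclared;</memo>")
    undeclared_accepted = True
except ET.ParseError:
    undeclared_accepted = False
print(undeclared_accepted)
```

Changing the replacement text in the single declaration changes every place the entity is referenced, which is the "text that is expected to change" use case.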

59 XML Structure:Entity Decl. –External Entities: External entities allow an XML document to refer to an external file External entities contain either text or binary data If they contain text, the content of the external file is inserted at the point of reference and parsed as part of the referring document (Parsed) Binary data is not parsed and may only be referenced in an attribute (Unparsed) –External Entity Declaration Sample: –

60 XML Structure:Entity Decl. –Parameter Entities: Parameter entities can only occur in the document type declaration A parameter entity is identified by placing “% ” (percent-space) in front of its name in the declaration Parameter entity references are immediately expanded in the document type declaration and their replacement text is part of the declaration –Parameter Entity Declaration Sample: –

61 XML Structure:Notation Decl. Notation declarations identify specific types of external binary data –This information is passed to the processing application, which may make whatever use of it it wishes –Notation Declaration Sample: –

62 EXERCISE 2 DTD

63

64 XSLT - Introduction Has evolved from the early Extensible Stylesheet Language (XSL) standard An XSLT style sheet is an XML document Specifies a language definition for –XML data presentation (XSL-FO): Future uncertain! –XML data transformations (XSLT) XSLT is a programming language for transforming XML documents

65 XSLT - Introduction Adopts the XPath language syntax for expressions Supports a small, flexible set of data types: –Boolean, number, string, node-set Supports a full set of operations: –<xsl:template>, <xsl:apply-templates>, <xsl:sort>, <xsl:output> Supports programming flow-control: –<xsl:if>, <xsl:for-each>, <xsl:choose>

66 XSLT – Hello World! An XSLT Programmer Hello, World!

67 XSLT - Introduction <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"> from

68 XSLT – Introduction XSLT is rule-based and declarative Rules are called template rules –A template rule is an instruction to transform a specified source element in a particular way XSLT processors: –Put XML document and XSLT style sheets together as data and code –Invoke template rules to produce the desired output

69 XSLT – Behind the Scenes XSLT processors: –Put XML, XSLT together as data, code –Invoke template rules to produce the desired output XSLT Processing Steps: –Reading XML, associated XSLT style sheets –Parsing XML files, associated XSLT files, into trees of nodes

70 XSLT – Behind the Scenes XSLT Processing Steps: –Applying XSLT transformation to the source trees –Producing result trees according to XSLT specification –Serializing the result trees as output files Formats such as: XML, HTML, Text
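The processing steps can be imitated in a few lines of Python. To be clear, this is a toy emulation of rule-based transformation and not an XSLT processor: the "template rules" are plain Python functions keyed by element name instead of XPath match patterns, and the source element names (hello-world, greeter, greeting) are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document.
SOURCE = """<hello-world>
  <greeter>An XSLT Programmer</greeter>
  <greeting>Hello, World!</greeting>
</hello-world>"""


def make(tag, text=None, children=()):
    """Build one result-tree node."""
    node = ET.Element(tag)
    node.text = text
    node.extend(children)
    return node


def apply_templates(node):
    """Invoke the matching rule for each child element, in document order."""
    result = []
    for child in node:
        rule = RULES.get(child.tag)
        if rule is not None:
            result.extend(rule(child))
    return result


# Each toy "template rule" maps a source element name to result-tree nodes.
RULES = {
    "hello-world": lambda n: [make("html", children=[make("body", children=apply_templates(n))])],
    "greeter": lambda n: [make("p", "from " + n.text)],
    "greeting": lambda n: [make("h1", n.text)],
}


def transform(source_xml):
    source_root = ET.fromstring(source_xml)              # parse into a tree of nodes
    result_root = RULES[source_root.tag](source_root)[0]  # apply rules, build result tree
    return ET.tostring(result_root, encoding="unicode")   # serialize the result tree


print(transform(SOURCE))
```

Note the declarative flavor: no rule says when it runs. The driver walks the source tree and each rule fires when its element is encountered.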

71 XSLT – Behind the Scenes The XSLT processor does not work on raw XML data Raw XML data is first parsed into an XML DOM object (tree of nodes) Tree representation allows building expressions in the XPath language A template rule is invoked when the specified element, as expressed in an XPath pattern, is matched

72 XSLT – Linking to XML Embed an XSLT style sheet inside the source XML document Explicitly call: –transformNode()/ transformNodeToObject() method on the source XML DOM

73 XSLT – Linking to XML Embed an XSLT style sheet inside the source XML document Explicitly call: –transformNode()/ transformNodeToObject() method on the source XML DOM

74 XSLT: Behind the Scenes

75 XSLT Components XML Declaration –An XSLT file is an XML file –An XML declaration could be used: XSLT Stylesheet Declaration –The style sheet declaration is <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

76 XSLT Components Namespace Declaration Prefix –The following namespace declaration is required: xmlns:xsl="http://www.w3.org/1999/XSL/Transform" –Additional namespace prefixes can also be added xmlns:msxsl="urn:schemas-microsoft-com:xslt" –Default namespace has no prefix, usually used for literal result elements
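What matters to a processor is the namespace URI, not the xsl prefix itself. The sketch below parses a minimal stylesheet skeleton with Python's standard library, which only reads it as XML and does not execute it. The skeleton's contents are an assumption for illustration.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# Minimal stylesheet skeleton, treated here as an ordinary XML document.
STYLESHEET = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
</xsl:stylesheet>"""

root = ET.fromstring(STYLESHEET)

# The parser resolves the prefix: the tag is reported as {namespace-URI}local-name.
print(root.tag)

# Lookups map a prefix of our choosing to the same URI.
output = root.find("xsl:output", {"xsl": XSL_NS})
print(output.get("method"))
```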

77 XSLT Components Output Method Declaration –<xsl:output> is used by the processor to determine how to serialize the result tree –Must be a child of the <xsl:stylesheet> element

78 XSLT Components XSLT Variables –Identify generically named objects, from which a value can be retrieved as needed –Can be used as a cache for a value –Provide mechanism for Setting a value Referring to result tree fragment variable_val

79 XSLT Components XSLT Parameters –Used to pass data –Local parameters pass data from one template rule to another –Global parameters pass a value into a style sheet from outside

80 XSLT Components External Document References –document() XSLT function allows you to incorporate another XML file –<xsl:import> element imports another style sheet into the current style sheet –<xsl:include> element includes another style sheet into the current style sheet

81 XSLT Components Match Rules –Consists of two components Match pattern Template itself –Match Pattern consists of Match Attribute XML Path Language (XPath) expression [template]

82 XSLT Components Match Rules –When a template rule finds a node in the source tree that matches its match pattern, it places content in the result tree –The template rule instantiates the template for each match

83 XSLT Components Literal Result Elements –Templates can contain 2 kinds of elements: Elements in the XSLT namespace, such as Elements not in the XSLT namespace, such as, –Such elements are treated as literal result elements –Such elements are copied to the output with minimal processing –HTML elements in the XSLT file must be closed

84 XSLT Components Comments –Comments in XSLT work the same as comments in XML XPath Expressions –XPath expression selects a node or nodes that match specified pattern –Expression is run against node tree representation of XML document
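A small sketch of selecting nodes from the parsed tree with path expressions. Python's standard xml.etree.ElementTree implements only a limited subset of XPath, which is enough here; the document and its element and attribute names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical order document.
ORDER = """<order id="0000123">
  <item qty="12">Pencil</item>
  <item qty="3">Eraser</item>
  <note>Deliver to New Delhi</note>
</order>"""

root = ET.fromstring(ORDER)

# Child axis: every <item> directly under the root.
all_items = [e.text for e in root.findall("./item")]

# Descendant axis with an attribute predicate.
qty_twelve = [e.text for e in root.findall(".//item[@qty='12']")]

# Positional predicate (1-based, as in XPath).
second_item = root.find("./item[2]").text

print(all_items, qty_twelve, second_item)
```

The expressions are evaluated against the tree of nodes, never against the raw text of the file, which is why the document must be parsed first.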

85 XSLT Components Namespace Prefixes –Commonly used namespace prefixes in an XSLT style sheet are as follows: <xsl:stylesheet version="1.0" xmlns="urn:schemas-mycompany-com:xslt" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:msxsl="urn:schemas-microsoft-com:xslt" xmlns:fo="http://www.w3.org/1999/XSL/Format">

