The Semantic Blessings of XSLT
Diederik Gerth van Wijk, firstname.lastname@example.org
XML Holland 2008, Planetarium Gaasperplas, Amsterdam, 20 November
DOXATRIX
Intended audience
  Understands English
  Knows what XML is about
  Cares about meaning, processing and validation
  Does not need to know about XSLT
  Does not need to be a programmer
  But might be aware that computers need to be programmed
Semantic? Blessings? XSLT?
  XML is about the structure of a document
  Semantics are about “meaning”
  A schema can say that a document should have a title (structure)
  The documentation might add that a title is used for identification (unique within a set of documents), and give a clue about what the document is about (semantics)
  The words used in the title are really semantics
  Blessings are good, helpful, you want them
  What is XSLT?
  How can XSLT help you in adding, verifying and using semantic markup?
Why bother marking up explicitly?
NLP is good, Explicit Markup is better
  “Plein 26 Den Haag” = Plein 26 Den Haag
  “Plein 1813 Den Haag” = Plein 1813 Den Haag
  XML is about tagging structure
  A schema adds semantics
  Quattro Stagioni: pizza by Mario or piece by Vivaldi?
  I don’t care (in this presentation)
eXtensible Stylesheet Language - Transformations
  XSL: the eXtensible Stylesheet Language
  Family of three W3C recommendations for transformation and presentation
    XML Path Language (XPath)
    XSL Transformations (XSLT)
    XSL Formatting Objects (XSL-FO)
  [Diagram labels: XML source document(s), XSLT stylesheet 1, XSLT stylesheet 2, XSLT processor, HTML pages, XSL-FO document, XSL-FO processor, PDF]
XSLT characteristics
  An XSLT style sheet is an XML document
  Input is one or more XML documents
  Output is one or more XML (XSLT!), HTML, XSL-FO or plain text (CSS!) documents
  Style sheet can look like template of the result document (data pull)
  Or be event driven (data push)
  Elements and attributes are “events”
  Functional programming language
  Rule based
  Declarative
  No side effects
  Statements can be executed in any order
  Embeds XPath
  XSLT 2.0 and XPath 2.0 know XML Schema types
  XSLT 2.0 can compute from implicit structure
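
The difference between the pull and push styles can be illustrated with a small sketch. It is not taken from the talk: the input vocabulary (doc, title, para) is invented for illustration, and the stylesheet only assumes a standard XSLT 1.0 processor.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only. Assumed (hypothetical) input:
     <doc><title>...</title><para>...</para><para>...</para></doc> -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Pull: this template looks like the result document and
       fetches the title from the source wherever it is needed. -->
  <xsl:template match="/doc">
    <html>
      <head>
        <title><xsl:value-of select="title"/></title>
      </head>
      <body>
        <h1><xsl:value-of select="title"/></h1>
        <!-- Push: hand the paragraphs over to whichever rule matches them. -->
        <xsl:apply-templates select="para"/>
      </body>
    </html>
  </xsl:template>

  <!-- Rule fired by each para "event". -->
  <xsl:template match="para">
    <p><xsl:apply-templates/></p>
  </xsl:template>

</xsl:stylesheet>
```

The stylesheet is itself a well-formed XML document, and neither template assigns to anything, which is what the "no side effects" point refers to.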
XSLT engines
  Stand alone:
    Saxon (open source, Michael Kay)
    Altova (free, XML Spy)
    MSXML
  On server:
    Saxon + .NET
    Altova + .NET
    MSXML + ASP
  Built into the browser:
    Internet Explorer 6 and higher
    Firefox 1 and higher
    Opera 9 and higher
XSLT and semantics...
  XML elements describe what the content is (semantics)
  XSLT stylesheets describe what to do with them (processing)
  How can a processing stylesheet be a semantic blessing?
Blessing 3: XSLT 2.0 may be schema aware
  A schema defines the semantics of a document type
  XSLT 2.0 is based on XPath 2.0
  XSLT 2.0 may use schemas
  Then, XPath 2.0 can use the types of elements and attributes
  So it can know whether to treat an attribute as string or as integer ("12" sorts before "3" as strings, but 12 comes after 3 if the type is integer)
  But will it sort roman numerals correctly? (Yes, if the roman numbers were coded as Ⅷ and Ⅸ)
  With the “instance of” operator you can use information that is not in the document, but is in the schema
  Therefore, XSLT 2.0 discourages stand-alone processing
  From a semantic point of view, that’s a blessing
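
The string-versus-integer point can be made concrete with a minimal sketch. It is not from the talk: the book and chapter elements and the nr attribute are hypothetical, and the stylesheet is plain XSLT 1.0 without a schema, so the type has to be stated by hand with data-type.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only. Assumed (hypothetical) input:
     <book><chapter nr="12"/><chapter nr="3"/></book> -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/book">
    <!-- Without type information @nr is sorted as text,
         so "12" comes before "3". -->
    <xsl:text>Sorted as text:   </xsl:text>
    <xsl:for-each select="chapter">
      <xsl:sort select="@nr"/>
      <xsl:value-of select="@nr"/>
      <xsl:text> </xsl:text>
    </xsl:for-each>
    <xsl:text>&#10;</xsl:text>

    <!-- Stating the type explicitly gives the numeric order:
         3 comes before 12. -->
    <xsl:text>Sorted as number: </xsl:text>
    <xsl:for-each select="chapter">
      <xsl:sort select="@nr" data-type="number"/>
      <xsl:value-of select="@nr"/>
      <xsl:text> </xsl:text>
    </xsl:for-each>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

With a schema-aware XSLT 2.0 processor and a validated source in which nr is declared as xs:integer, the sort key already carries its type, so the explicit data-type should no longer be needed. In such a setup a test like `data(@nr) instance of xs:integer` could be used to branch on the schema type; whether that is available depends on the processor edition.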
Blessing 4: Schema independent processing (1)
  In a sequence group, the order contains no information:
    (title, abbreviated-title?)   (1)
  is equivalent to
    (abbreviated-title?, title)   (2)
  Suppose you want to print the abbreviated title if one is coded, and otherwise the full title
  In stream processing, the quick-and-dirty solution might be as simple as:
    temp = getNextElement;
    if existsNextElement then write(getNextElement) else write(temp);   (1)
  or
    write(getNextElement);   (2)
  But what if you decide to change from order (1) to (2)? Or add an optional element toc-title?
    (title, abbreviated-title?, toc-title?)   (1)
    (toc-title?, abbreviated-title?, title)   (2)
  The simple program breaks
Blessing 4: Schema independent processing (2)
  In XSLT, you have access to the elements by name, in arbitrary order
  The style sheet fragment looks like: [fragment not preserved in the transcript]
  If the schema (and documents) change order, the stylesheet remains the same
  If an optional toc-title is added, the stylesheet remains the same
  Verbosity turns out to be simpler, in the long run
  By the way, if sequence matters in the document, it shouldn’t in the schema
  Reasons to prescribe sequence:
    to ease input
    to enforce cardinality
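
The fragment shown on this slide is missing from the transcript. As a stand-in, here is a minimal sketch of what name-based access might look like. The parent element name document is an assumption; title and abbreviated-title are the names used on the slides.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only, not the author's original fragment.
     Assumed (hypothetical) input:
     <document><title>...</title><abbreviated-title>...</abbreviated-title></document> -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="document">
    <xsl:choose>
      <!-- Elements are selected by name, so their position among
           the children of document does not matter. -->
      <xsl:when test="abbreviated-title">
        <xsl:value-of select="abbreviated-title"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="title"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Swapping title and abbreviated-title in the input, or adding a toc-title child, does not affect this template, because nothing in it depends on child order.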
Blessing 5: functional programming
  No variables
  Suppose you want to sort items alphabetically and act on each new letter
  First idea: [code not preserved in the transcript]
  No good: the value of the variable PrevLetter is reset in every iteration of the for-each loop
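
The code for this first idea is not in the transcript. The following is a hypothetical reconstruction of the kind of attempt the slide describes, deliberately kept in its non-working form. Only the variable name PrevLetter comes from the slide; the titles and title elements and the HTML output are assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch of the naive attempt; it runs, but does not do what was intended.
     Assumed (hypothetical) input:
     <titles><title>Banana</title><title>Apple</title><title>Avocado</title></titles> -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <xsl:template match="titles">
    <div>
      <xsl:for-each select="title">
        <xsl:sort select="."/>
        <!-- Meant to hold the letter seen in the previous iteration,
             but a variable declared here is created anew for every
             title and cannot be assigned to afterwards. -->
        <xsl:variable name="PrevLetter" select="''"/>
        <xsl:variable name="Letter" select="substring(., 1, 1)"/>
        <!-- Always true for a non-empty title, so every title
             gets its own heading. -->
        <xsl:if test="$Letter != $PrevLetter">
          <h2><xsl:value-of select="$Letter"/></h2>
        </xsl:if>
        <p><xsl:value-of select="."/></p>
      </xsl:for-each>
    </div>
  </xsl:template>

</xsl:stylesheet>
```

XSLT has no assignment statement, so there is no way to carry the last letter over from one iteration to the next.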
Would this work? [code not preserved in the transcript]
  Better, but the preceding-sibling axis operates on the original document order, not on the sorted order...
  Is that a bug or a feature?
  It’s a blessing!
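
The second attempt is also missing from the transcript. A hypothetical sketch of an attempt based on preceding-sibling, again assuming titles and title elements that are not from the talk, might look like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch of the second attempt; it is only correct when the source
     happens to be in sorted order already.
     Assumed (hypothetical) input:
     <titles><title>Banana</title><title>Apple</title><title>Avocado</title></titles> -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <xsl:template match="titles">
    <div>
      <xsl:for-each select="title">
        <xsl:sort select="."/>
        <xsl:variable name="Letter" select="substring(., 1, 1)"/>
        <!-- preceding-sibling::title[1] is the previous title in the
             source document, not the previous one in the sorted list. -->
        <xsl:if test="not(substring(preceding-sibling::title[1], 1, 1) = $Letter)">
          <h2><xsl:value-of select="$Letter"/></h2>
        </xsl:if>
        <p><xsl:value-of select="."/></p>
      </xsl:for-each>
    </div>
  </xsl:template>

</xsl:stylesheet>
```

With the assumed input, Apple follows Banana in the source, so Apple gets a heading, and Avocado follows Apple, so it gets none. That happens to look right here, but a source order of Apple, Banana, Avocado would produce two "A" headings.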
The solution
  Think XML
  Think in terms of creating hierarchies: groups of titles starting with the same letter
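
One way to build such a hierarchy is the xsl:for-each-group instruction of XSLT 2.0. The sketch below is an assumption about how the idea could be expressed, not the code from the slide. The element names titles, title, index and group are invented for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only. Requires an XSLT 2.0 processor.
     Assumed (hypothetical) input:
     <titles><title>Banana</title><title>Apple</title><title>Avocado</title></titles> -->
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="titles">
    <index>
      <!-- One group per initial letter; the grouping key is the
          upper-cased first character of each title. -->
      <xsl:for-each-group select="title"
                          group-by="upper-case(substring(., 1, 1))">
        <xsl:sort select="current-grouping-key()"/>
        <group letter="{current-grouping-key()}">
          <!-- Inside a group, the titles are sorted alphabetically. -->
          <xsl:for-each select="current-group()">
            <xsl:sort select="."/>
            <xsl:copy-of select="."/>
          </xsl:for-each>
        </group>
      </xsl:for-each-group>
    </index>
  </xsl:template>

</xsl:stylesheet>
```

Acting on each new letter then becomes acting on each group element, with no need to remember a previous value. XSLT 1.0 has no xsl:for-each-group; there, a comparable hierarchy is usually built with xsl:key and the key() function (Muenchian grouping).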
The ultimate semantic normalisation
  “PCDATA considered harmful” (Han Nonnekes, Shell Oil)
  Text is the outer structure in a specific language of a deeper meaning
  You should encode a text as that deeper tree
  With references to abstract words (concepts)
  For each language (“English, upper class, around 1850”) give dictionary and transformation rules
  Then generate the text
Questions?
  Ask me now
  Ask me during lunch or tea break
  Ask me during buffet
  Mail email@example.com
  Presentation can be downloaded from
    www.xmlholland2008.nl
    www.doxatrix.nl/dg