XQuery and XSLT, Felix Sasaki, World Wide Web Consortium

1 1 XQuery and XSLT Felix Sasaki World Wide Web Consortium

2 2 Purpose General overview of XSLT 2.0 and XQuery 1.0: Common features, differences, future perspectives

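3 3 Background: Who am I? Topic 1: Japanese and Linguistics: 私 ワタシ 私 名詞-代名詞-一般 (noun, pronoun, general); は ハ は 助詞-係助詞 (particle, binding particle); 鰻 ウナギ 鰻 名詞-一般 (noun, general); です デス です 助動詞 特殊・デス 基本形 (auxiliary verb, special "desu" conjugation, base form); 。 。 。 記号-句点 (symbol, full stop) Topic 2: Representation and processing of multilingual data based on standard technologies & formats: 私...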

4 4 Topics Introduction Interplay between the components The common underpinning: XPath 2.0 Path expressions General processing of XQuery / XSLT Future: Full text search

5 5 Introduction 17 (!) specifications about "XQuery" and "XSLT", abbreviated as "QT" A complex architecture QT describes input, processing and output of XML data

6 6 The different pieces of the cake 1. The common underpinning of XQuery and XSLT: XPath 2.0 data model & formal semantics 2. How to select information in XML documents: XPath 3. Manipulating information: XPath functions and operators 4. Generating output: Serialization 5. The XQuery 1.0 and XSLT 2.0 specifications, which deploy 1-4

7 7 The different pieces of the cake [Figure: diagram relating XPath 1.0, XPath 2.0, XQuery 1.0 and XSLT 2.0. Labels: Path Expressions, Comparison Expressions, Some Built-In Functions; Conditional Expressions, Arithmetic Expressions, Quantified Expressions, Built-In Functions and Operators, Data Model; Stylesheets, Templates, Formatting; FLWOR Expressions, XML Constructors, Query Prolog, User-Defined Functions.] Graphic based on XQuery tutorial from Priscilla Walmsley (Datypic)

8 8 Attention! Basis of this presentation: A set of WORKING DRAFTS! Things might still change!

9 9 Topics Introduction Interplay between the components The common underpinning: XPath 2.0 Path expressions General processing of XQuery / XSLT Future: Full text search

10 10 Information processed by XQuery / XSLT Input (XML documents, XML database, XML Schema) → QT processing → Serialization (XML documents, XML database, …) QT processing: defined in terms of the XPath 2.0 data model

11 11 Information processed by XQuery / XSLT Input: XML documents:... Input: XML Schema documents …

12 12 Information processed by XQuery / XSLT Input: Type information based on user defined XML Schema data types: element(*,myns:myType) … Type information can be deployed for XQuery / XSLT processing:

13 13 Information processed by XQuery / XSLT Predefined XML Schema data types Examples: built-in primitive data types like anyURI, dateTime, gYearMonth, gYear, … Specific to XPath 2.0: xdt:dayTimeDuration Good for: URI processing, time-related processing Type casting for built-in data types: xs:date(" :00")

14 14 Topics Introduction Interplay between the components The common underpinning: XPath 2.0 Path expressions General processing of XQuery / XSLT Future: Full text search

15 15 XPath 2.0 data model: sequences of items, i.e. nodes … –document node –element nodes: … –attribute nodes: –namespace nodes: … –text nodes: My yellow (and small) flower. –comment node: –processing instruction: and / or atomic values (see below)

16 16 Visualization of nodes [Figure: node tree for mydoc.xml with a document() node, element() myDoc, two element() myEl nodes and two attribute() myAttr nodes] Order of nodes is defined by document order:

17 17 Atomic values Nodes in XPath 2.0 have string values and typed values, i.e. a sequence of atomic values "string" function: returns a string value, e.g. –string(doc("mydoc.xml"))
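
The string value of a document or element node is the concatenation of its descendant text nodes in document order. A minimal sketch of that idea, written in Python (not XQuery) with the standard library; the sample document is a hypothetical stand-in, since the content of mydoc.xml is not shown in the slides:

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for mydoc.xml; the real file is not shown in the slides.
SAMPLE = "<myDoc><myEl myAttr='a'>My yellow </myEl><myEl myAttr='b'>flower.</myEl></myDoc>"


def string_value(element):
    """Concatenate all descendant text nodes in document order."""
    return "".join(element.itertext())


root = ET.fromstring(SAMPLE)
doc_string_value = string_value(root)
print(doc_string_value)
```

Attribute values do not contribute to the string value of an element, which is why myAttr does not appear in the result.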

18 18 Deployment of types: Example for time related types Extracting the timezone from a date value: timezone-from-date (xs:date(" :00")) output: PT7H

19 19 Not in the data model: –character encoding scheme –CDATA section boundaries –entity references –DOCTYPE declaration and internal DTD subset All this information may be lost during XQuery / XSLT processing. Mainly XSLT allows the user to parameterize the output, i.e. the serialization of the data model

20 20 Topics Introduction Interplay between the components The common underpinning: XPath 2.0 Path expressions General processing of XQuery / XSLT Future: Full text search

21 21 Path expressions …... xquery version "1.0"; { let $input := doc("mydoc.xml") for $elements in $input//myEl return

22 22 Path steps: child axis [Figure: node tree for mydoc.xml with the selected nodes highlighted] child::* or child::myEl

23 23 Path steps: parent axis [Figure: node tree for mydoc.xml with the selected nodes highlighted] parent::document-node()

24 24 Path steps: sibling axis [Figure: node tree for mydoc.xml with the selected nodes highlighted] preceding-sibling::myEl

25 25 Predicate expressions [Figure: node tree for mydoc.xml with the selected nodes highlighted] child::*[position()>1]
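
The child axis and the positional predicate can be mimicked outside XPath. The following is a rough Python sketch (not XQuery) using the standard library; the document is hypothetical and only loosely follows the tree shown on the slides:

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the <other/> element is added only to make the
# difference between child::* and child::myEl visible.
root = ET.fromstring(
    "<myDoc><myEl myAttr='1'/><myEl myAttr='2'/><other/><myEl myAttr='3'/></myDoc>"
)

# child::* selects all element children of the context node.
all_children = list(root)

# child::myEl selects only the children named myEl.
my_els = root.findall("myEl")

# child::*[position() > 1] selects every child except the first.
# XPath positions are 1-based, Python indices are 0-based.
after_first = all_children[1:]

print([e.tag for e in all_children])
print([e.get("myAttr") for e in my_els])
print([e.tag for e in after_first])
```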

26 26 Topics Introduction Interplay between the components The common underpinning: XPath 2.0 Path expressions General processing of XQuery / XSLT Future: Full text search

27 27 General processing of XQuery / XSLT XQuery: –Input: zero or more source documents –Output: zero or more result documents XSLT: –Input: zero or more source documents –Output: zero or more result documents What is the difference?

28 28 An example Processing input "mydoc.xml": Desired processing output "yourdoc.xml":

29 29 XSLT Template based processing Traversal of input document, match of templates "Push processing": Nodes from the input are pushed to matching templates...

30 30 Templates and matching nodes [Figure: node tree for mydoc.xml with a document() node, element() myDoc, two element() myEl nodes and two attribute() myAttr nodes, annotated with the templates that match each node]

31 31 XQuery "Pull processing": XPath expressions pull information out of document(s) xquery version "1.0"; { let $input := doc("mydoc.xml") for $elements in $input//myEl return }

32 32 document() mydoc.xml element() myDoc element() myEl element() myEl attribute() myAttr attribute() myAttr xquery version "1.0"; { let $input := doc("mydoc.xml") for $elements in $input//myEl return } XQuery

33 33 When to use XSLT Good for processing of mixed content, e.g. text with markup. Example task: My yellow and small flower. should become My yellow (and small) flower. Solution: push processing of the content: … …
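
Push processing of mixed content can be pictured as a table of rules keyed by node name, with a default rule that simply walks on. The sketch below is Python (not XSLT) and only illustrates the idea; the element names `text` and `aside` are assumptions, because the markup on the slide was lost:

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: <text> and <aside> are assumed names.
SOURCE = "<text>My yellow <aside>and small</aside> flower.</text>"


def render_default(el):
    # Default rule: emit the element's text, then process children in order.
    return (el.text or "") + "".join(apply_templates(child) for child in el)


def render_aside(el):
    # Rule for <aside>: wrap the default result in parentheses.
    return "(" + render_default(el) + ")"


TEMPLATES = {"aside": render_aside}


def apply_templates(el):
    """Pick the rule matching this node, then append the text that follows it."""
    rule = TEMPLATES.get(el.tag, render_default)
    return rule(el) + (el.tail or "")


result = apply_templates(ET.fromstring(SOURCE))
print(result)
```

The input drives the traversal: each node is handed to whichever rule matches it, which is what makes this style convenient for text with embedded markup.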

34 34 When to use XQuery Good for processing of multiple data sources in a single or multiple documents via For Let Where Order-by Return (FLWOR) expressions Example: creation of a citation index for $mybibl in doc("my-bibl.xml")//entry for $citations in doc("mytext.xml") //cite where return
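
The nested `for` clauses with a `where` filter behave like a join over two sources. A loose Python analogue (not XQuery) is sketched here; the join condition, the `id` and `ref` attributes and the `title` element are all assumptions, since the `where` and `return` clauses on the slide are elided:

```python
import xml.etree.ElementTree as ET

# Hypothetical data standing in for my-bibl.xml and mytext.xml.
bibl = ET.fromstring(
    "<bibl><entry id='b1'><title>First</title></entry>"
    "<entry id='b2'><title>Second</title></entry></bibl>"
)
text = ET.fromstring(
    "<doc><p><cite ref='b2'/> and <cite ref='b1'/>, then <cite ref='b2'/></p></doc>"
)

# for each entry, for each cite, keep the pairs whose identifiers match
# (the matching rule is an assumption, not taken from the slide).
citation_index = [
    (entry.get("id"), entry.findtext("title"))
    for entry in bibl.iter("entry")
    for cite in text.iter("cite")
    if cite.get("ref") == entry.get("id")
]
print(citation_index)
```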

35 35 Topics Introduction Interplay between the components The common underpinning: XPath 2.0 Path expressions General processing of XQuery / XSLT Future: Full text search

36 36 Full Text Search: Objectives Search for phrases, not substrings Language based search (e.g. using morphological information) Token-based search Application of stemming / thesauri

37 37 Full Text Search: Basics "Word": character, n-gram, or sequence of characters returned by a tokenizer "Phrase": Sequence of words "Sentence" and "Paragraph": Defined by the tokenizer

38 38 Full Text Search: Example Applying stemming in a Query: for $b in /books/book where $b/title ftcontains ("dog" with stemming) && "cat" return $b/author

39 39 Full Text Search: Example Language specification: ftcontains "salon de the" with default stop words language "fr"

40 40 Full Text Search: Example Score specification: for $b score $s in /books/book[content ftcontains "web site" && "usability"] where $s > 0.5 order by $s descending return …

41 41 Topics Introduction Interplay between the components The common underpinning: XPath 2.0 Path expressions General processing of XQuery / XSLT Future: Full text search

42 42 XQuery and XSLT Felix Sasaki World Wide Web Consortium

43 43 Topics Introduction The common underpinning: XPath 2.0 data model General processing of XQuery / XSLT String and number processing IRI processing Dates, timezones, language information Generating output: serialization

44 44 Aspects of string processing What is the scope: characters (code points) String counting Codepoint conversion String comparison: collations String comparison: regular expressions Normalization The role of schemas e.g. in the case of white space handling

45 45 Scope of string processing Basic operation: Counting 'characters' Good news: QT counts code points, not bytes or code units Attention: All string processing uses string values, not typed values!
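
The difference between code points, bytes and code units is easy to see with a short experiment. This is a Python sketch (not XQuery); Python's `len` on a string also counts code points, so it serves as a stand-in for `string-length`:

```python
word = "su\u00e7on"  # "suçon" with ç as the single code point U+00E7

code_points = len(word)                           # counts code points
utf8_bytes = len(word.encode("utf-8"))            # ç takes two bytes in UTF-8
utf16_units = len(word.encode("utf-16-le")) // 2  # 16-bit code units

# A character outside the Basic Multilingual Plane needs two UTF-16 code units.
clef = "\U0001D11E"  # MUSICAL SYMBOL G CLEF
clef_counts = (
    len(clef),
    len(clef.encode("utf-8")),
    len(clef.encode("utf-16-le")) // 2,
)

print(code_points, utf8_bytes, utf16_units)
print(clef_counts)
```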

46 46 String values versus typed values With a schema: type = xs:date Works / does not work: string-length(xs:string($myDoc/myEl/revision-

47 47 String values versus typed values Difference: second example uses adequate type casting Type casting is not always possible: functions/#casting-from-primitive-to-primitive

48 48 Codepoints versus strings: XQuery {"string to code points: suçon becomes ", string-to-codepoints("suçon"), "code points to string: becomes ", codepoints-to-string((115, 117, 231, 111, 110)) } Output: string to code points: suçon becomes 115 117 231 111 110 code points to string: becomes suçon
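
For readers without a QT processor at hand, the same round trip can be reproduced in Python (not XQuery). The two helper names mirror the XPath functions but are plain local functions written for this sketch:

```python
def string_to_codepoints(s):
    """Return the code point of each character, in order."""
    return [ord(ch) for ch in s]


def codepoints_to_string(cps):
    """Build a string from a sequence of code points."""
    return "".join(chr(cp) for cp in cps)


cps = string_to_codepoints("su\u00e7on")  # "suçon"
back = codepoints_to_string([115, 117, 231, 111, 110])

print(cps)
print(back)
```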

49 49 Codepoints versus strings: XSLT string to code points: suçon becomes . code points to string: becomes

50 50 Collation functions: compare() compare("abc", "abc") Returns "0": Returns "-1": Returns "1":
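
The three possible results can be illustrated with a code-point comparison. This Python sketch (not XQuery) assumes the codepoint collation; the argument pairs for the -1 and 1 cases are hypothetical, since the slide's own examples were lost:

```python
def compare(a, b):
    """Return -1, 0 or 1, comparing the strings code point by code point."""
    return (a > b) - (a < b)


equal = compare("abc", "abc")
less = compare("abc", "abd")     # hypothetical example pair
greater = compare("abd", "abc")  # hypothetical example pair

# Under code-point order "s" (115) sorts before "ß" (223).
strasse = compare("Strasse", "Stra\u00dfe")

print(equal, less, greater, strasse)
```

A language-specific collation may order or equate the last pair differently, which is exactly why the collation is a parameter.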

51 51 Collation-based function compare() compare("Strasse", "Straße", "myCollation") Example: returns "1" if 'myCollation' sorts "Strasse" after "Straße". Identification of collation via a URI.

52 52 Collation identification Identification via a URI. Codepoint-based collation: functions/collation/codepoint Parameterization via a URI: lang=de;strength=primary

53 53 String comparison: regular expressions Based on regular expressions for XML Schema datatypes, with some additions Flags for case mapping based on Unicode case mapping tables:

54 54 Normalization XML documents: not always with early Unicode normalization Unicode collation algorithm ensures equivalent results Normalization can be ensured for NFC, NFD, NFKC, NFKD: suçon Output:
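
What normalization changes can be checked directly on the example word. The sketch below is Python (not XQuery) and uses the standard `unicodedata` module in place of `normalize-unicode`:

```python
import unicodedata

composed = "su\u00e7on"     # ç as one code point (U+00E7)
decomposed = "suc\u0327on"  # c followed by COMBINING CEDILLA (U+0327)

nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", composed)

# The two spellings look identical but differ as code point sequences
# until they are brought to the same normalization form.
print(composed == decomposed, nfc == composed, nfd == decomposed)
print(len(nfc), len(nfd))
```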

55 55 White space and typed values Assuming a type Comparison of typed values via eq Collation might also affect white space handling

56 56 White space and typed values Result: "false" or "true": –"false" if type collapses whitespace –"true" if type does not collapse whitespace

57 57 Number processing: rounding Number / currency formatting: round(2.5) returns 3. round(2.4999) returns 2. round(-2.5) returns -2. round() does not apply culture-specific rounding conventions, e.g. –round 3rd digit less than 3 to 0 or drop it (Argentina)
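
The three results above are consistent with rounding ties toward positive infinity. A small Python sketch (not XQuery) of that rule follows; the function name `xpath_round` is made up for this illustration, and the floor-based formula is a simplification that ignores floating-point edge cases:

```python
import math


def xpath_round(x):
    """Round to the nearest integer, with ties going toward positive infinity."""
    return math.floor(x + 0.5)


results = [xpath_round(2.5), xpath_round(2.4999), xpath_round(-2.5)]

# Python's built-in round uses round-half-to-even, so it differs on 2.5.
builtin = [round(2.5), round(-2.5)]

print(results)
print(builtin)
```

The contrast with Python's built-in `round` shows that even the tie-breaking rule is a convention, before any culture-specific rules come into play.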

58 58 XSLT-specific: Numbering Conversion of numbers into a string, controlled by various attributes:

59 59 XSLT-specific: Numbering Erste ア๑ Zweite イ๒ Dritte ウ๓ Output for a sequence of three items:

60 60 XSLT-specific: Numbering format-number(): designed for numeric quantities (not necessarily whole numbers)

61 61 Topics Introduction The common underpinning: XPath 2.0 data model General processing of XQuery / XSLT String and number processing IRI processing Dates, timezones, language information Generating output: serialization

62 62 Status of IRI in QT In the data model: Support for IRI will be normative. Data type xs:anyURI: relies on XML Schema anyURI, still defined in terms of URI

63 63 Functions for IRI / URI processing casting to xs:anyURI: from untyped values or string: xs:anyURI("http://example.müller.com")

64 64 Functions for IRI / URI processing escaping URI via escape-uri, escaped-reserved="false": escape-uri("http://example.dürst.com", false()) output:

65 65 Functions for IRI / URI processing output with escaped-reserved="true": http%3A%2F%2Fexample.d%C3%BCrst.com
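
The effect of the second argument can be reproduced with ordinary percent-encoding. This is a Python sketch (not XQuery) using `urllib.parse.quote`; the set of characters treated as reserved is an assumption based on RFC 3986 and may not match the working draft exactly:

```python
from urllib.parse import quote

uri = "http://example.d\u00fcrst.com"  # http://example.dürst.com

# Reserved characters escaped as well: ":" and "/" are percent-encoded.
escaped_all = quote(uri, safe="")

# Reserved characters left alone: only the non-ASCII ü is percent-encoded.
# The safe set (RFC 3986 reserved characters) is an assumption for this sketch.
escaped_non_reserved = quote(uri, safe=":/?#[]@!$&'()*+,;=")

print(escaped_all)
print(escaped_non_reserved)
```

In both cases ü is first encoded as UTF-8 (bytes C3 BC) and each byte is then written as a percent escape.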

66 66 Topics Introduction The common underpinning: XPath 2.0 data model General processing of XQuery / XSLT String and number processing IRI processing Dates, timezones, language information Generating output: serialization

67 67 Dates and time types Basis: –date and time types from XML Schema –QT specific extensions: xdt:yearMonthDuration, xdt:dayTimeDuration Operations: time comparison, time adjustment, timezone sensitive operations

68 68 Comparison of date types: xdt:yearMonthDuration("P1Y6M") eq xdt:yearMonthDuration("P1Y7M") output: false

69 69 Component extraction Extracting the timezone from a date value: timezone-from-date (xs:date(" :00")) output: PT7H
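
Extracting a timezone amounts to reading the UTC offset and writing it as a duration. A Python sketch (not XQuery) of that step follows; the timestamp is hypothetical because the date literal on the slide is truncated, and `to_day_time_duration` is a helper written only for this illustration:

```python
from datetime import datetime, timedelta


def to_day_time_duration(delta):
    """Format an offset of whole minutes as an ISO 8601 duration such as PT7H."""
    total_minutes = int(delta.total_seconds() // 60)
    sign = "-" if total_minutes < 0 else ""
    hours, minutes = divmod(abs(total_minutes), 60)
    if hours == 0 and minutes == 0:
        return "PT0S"
    return sign + "PT" + (f"{hours}H" if hours else "") + (f"{minutes}M" if minutes else "")


# Hypothetical value with a +07:00 offset; not the literal from the slide.
stamp = datetime.fromisoformat("2005-09-07T00:00:00+07:00")
offset = to_day_time_duration(stamp.utcoffset())

print(offset)
print(to_day_time_duration(timedelta(hours=-5)))
```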

70 70 Arithmetic functions on dates and times Subtract dayTimeDurations: xdt:dayTimeDuration("P2DT12H") - xdt:dayTimeDuration("P2DT12H30M") output: -PT30M
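
The subtraction can be checked with any duration type. Here is a Python sketch (not XQuery) in which `timedelta` stands in for xdt:dayTimeDuration:

```python
from datetime import timedelta

a = timedelta(days=2, hours=12)              # P2DT12H
b = timedelta(days=2, hours=12, minutes=30)  # P2DT12H30M

difference = a - b
minutes = difference.total_seconds() / 60

print(minutes)
```

Python stores the negative result in a normalized form (minus one day plus 23.5 hours), so converting to total seconds or minutes is the simplest way to compare it with -PT30M.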

71 71 XSLT: Formatting Dates / Times Some parameters for formatting conventions: picture string with [components]; presentation modifier; language

72 72 XSLT: Formatting Dates / Times Output: September 7th September 2005

73 73 Processing of language information function lang: /myRoot/myEl/text()[lang("de")] returns the content of, assuming the document: Some German text.

74 74 Processing of language information no value for xml:lang: lang("de") returns "false"
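
The behaviour of `lang` rests on two rules: xml:lang is inherited from the nearest ancestor that declares it, and a test for "de" also accepts sub-languages such as "de-CH". The following Python sketch (not XQuery) imitates both rules; the document is hypothetical because the markup on the slides was lost, and the helper functions are written only for this illustration:

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def lang_matches(test, actual):
    """True if actual equals test or is a sub-language of it, ignoring case."""
    if not actual:
        return False
    actual, test = actual.lower(), test.lower()
    return actual == test or actual.startswith(test + "-")


def texts_in_lang(el, test, inherited=None):
    """Collect element text whose nearest xml:lang (own or inherited) matches."""
    current = el.get(XML_LANG, inherited)
    found = []
    if el.text and el.text.strip() and lang_matches(test, current):
        found.append(el.text.strip())
    for child in el:
        found.extend(texts_in_lang(child, test, current))
    return found


# Hypothetical document.
doc = ET.fromstring(
    "<myRoot>"
    '<myEl xml:lang="de">Some German text.</myEl>'
    '<myEl xml:lang="de-CH">Swiss German text.</myEl>'
    "<myEl>No language given.</myEl>"
    "</myRoot>"
)
german = texts_in_lang(doc, "de")
print(german)
```

The third element has no xml:lang in scope, so it is never selected, matching the "false" result described above.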

75 75 Topics Introduction The common underpinning: XPath 2.0 data model General processing of XQuery / XSLT String and number processing IRI processing Dates, timezones, language information Generating output: serialization

76 76 Serialization – basic concept XQuery / XSLT: process XML in terms of the XPath 2.0 data model Output: described in terms of serialization parameters

77 77 Some serialization parameters byte-order-mark cdata-section-elements encoding escape-uri-attributes media-type normalization-form use-character-maps

78 78 Output methods Pre-configuration of various serialization parameters for: –XML –XHTML –HTML –Text XQuery: –Mandatory output method: XML, version="1.0" –No need for implementations to support further serialization parameters

79 79 Output methods in XSLT Provides support for serialization parameters and output methods via –xsl:output Support is also not mandatory

80 80 XSLT character maps Mapping characters to other characters Desired output: '/>

81 81 XSLT character maps Character map:

82 82 Regular expressions with XSLT

83 83 Regular expressions with XQuery xquery version "1.0"; declare function local:expandPUAChar($string as xs:string, $char as xs:string) as item()* { if (contains($string, $char)) then (substring-before($string, $char), element myChar { attribute code {string-to-codepoints($char)} }, local:expandPUAChar(substring-after($string, $char), $char)) else $string }; for $input in doc("replace-characters.xml")//text() return local:expandPUAChar($input,"")

84 84 Topics – finally! Introduction The common underpinning: XPath 2.0 data model General processing of XQuery / XSLT String and number processing IRI processing Dates, timezones, language information Generating output: serialization

85 85 Wrap up: Is it useful? Yes! QT: a power tool for i18n sensitive XML processing Quite hard to digest, but very tasty Some aspects of i18n related processing might be improved Remember: It's still a set of working drafts...

86 86 I18n Sensitive Processing with XQuery and XSLT Felix Sasaki World Wide Web Consortium

