Typing XQuery
WANG Zhen (Selina)
About the Internship
Group: PROTHEO, Inria, France
  Research: rewriting and strategies, constraints, automated deduction
  A member of REWERSE (Reasoning on the Web with Rules and Semantics), a research network within the EU
  Aim: develop reasoning languages for Web applications
In progress: Xcerpt
  A deductive, rule-based query language for graph-structured data, including XML data
  More suitable for reasoning than XQuery
  Work on Xcerpt and its typing system is still ongoing
Question: how to build a type system for Xcerpt?
  Refer to the typing systems of other query languages
My internship: analyze the typing system of XQuery
Outline
  Background
  Related Work
  XQuery Typing System
  Conclusion and Future Work
Background
XQuery: an XML query language
E.g.:
A simple path expression (over Catalogue.xml):
  doc("Catalogue.xml")/catalogue/cd
Path expressions with a predicate:
  doc("Catalogue.xml")/catalogue/cd[1]/title
  doc("Catalogue.xml")/catalogue/cd[price>=30]/title
  doc("Catalogue.xml")/catalogue/cd[keyword]/title
Background
A predicate [pre] filters a sequence, retaining some items and discarding others.
For …/x[pre]…: compute the predicate truth value of pre for each item x. If it is true, the item x is retained; otherwise the item x is discarded.
Three typical predicates [pre]:
  pre is numeric → the predicate truth value is true if the position of the item equals pre
    doc("Catalogue.xml")/catalogue/cd[1]/title
  pre is boolean → the predicate truth value is the value of pre
    doc("Catalogue.xml")/catalogue/cd[price>30]/title
  pre is a typed path → the predicate truth value is true if pre exists
    doc("Catalogue.xml")/catalogue/cd[keyword]/title
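The three predicate cases can be sketched in a few lines of Python. This is only an illustration of the filtering behaviour described on the slide, not an XQuery engine: items are modelled as plain dictionaries, and the function name filter_items and the sample data (titles A, B, C and their prices) are hypothetical.

```python
def filter_items(items, pre):
    """Keep the items for which the predicate truth value of pre is true."""
    kept = []
    for position, item in enumerate(items, start=1):
        if isinstance(pre, int):
            # numeric predicate: true when the item's position equals pre
            truth = position == pre
        elif callable(pre):
            # boolean predicate: evaluate the expression against the item
            truth = bool(pre(item))
        else:
            # path predicate (a child name): true when the child exists
            truth = pre in item
        if truth:
            kept.append(item)
    return kept


# Hypothetical data standing in for the cd elements of a catalogue.
cds = [
    {"title": "A", "price": 25.0},
    {"title": "B", "price": 35.0, "keyword": "rock"},
    {"title": "C", "price": 40.0},
]

first = [cd["title"] for cd in filter_items(cds, 1)]                             # cd[1]
expensive = [cd["title"] for cd in filter_items(cds, lambda cd: cd["price"] > 30)]  # cd[price>30]
tagged = [cd["title"] for cd in filter_items(cds, "keyword")]                    # cd[keyword]
print(first, expensive, tagged)
```

The same sequence is filtered three different ways depending only on the kind of expression inside the brackets, which is why the type of pre matters.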
Typing XQuery
An important aspect of the XQuery formal semantics
E.g.:
  Given: Catalogue.xml
  A query: extract the titles of the CDs whose price is equal to or greater than 30
  XQuery expression: doc("Catalogue.xml")/catalogue/cd[price>=30]/title
  Result: Hide your heart, Stop
Background
Problem: without type information for the XML data, similar queries give different results.
  Query1: doc("Catalogue.xml")/catalogue/cd[price>=30]/title
  Query2: doc("Catalogue.xml")/catalogue/cd[price>="30.0"]/title
  Query3: doc("Catalogue.xml")/catalogue/cd[price>="30.00"]/title
Result of the comparison price >= "30.0":
  Price           Result
  > 30 or < 30    correct
  = 30            incorrect
Possible reason: in some cases the comparison takes into account not only the number but also the length of the value.
Background
However, there is no error message or warning, and the mistake is too subtle to be located easily.
If we provide type information (e.g., define price as a float) and type checking, we may find the mistake during compilation:
  >= expects numeric operands (numeric: decimal, float, double, etc.)
  price >= "30.0"  →  typing error!
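The length effect can be reproduced outside XQuery. The sketch below assumes that an untyped price is compared with a string literal as a string, character by character; the helper names and the sample price values are hypothetical.

```python
def compare_as_strings(price_text, literal):
    """price >= literal, comparing both sides as strings (code-point order)."""
    return price_text >= literal


def compare_as_numbers(price_text, literal):
    """price >= literal, comparing both sides as numbers."""
    return float(price_text) >= float(literal)


for price in ("25", "30", "35"):
    as_string = compare_as_strings(price, "30.0")
    as_number = compare_as_numbers(price, "30.0")
    verdict = "correct" if as_string == as_number else "incorrect"
    print(price, as_string, as_number, verdict)
```

For the price 30 the string comparison answers false, because "30" is a shorter prefix of "30.0", while the numeric comparison answers true. The other two sample prices happen to give the same answer either way, which is what makes the mistake hard to notice.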
Related Work
XQuery 1.0 and XPath 2.0 Formal Semantics
  A W3C Candidate Recommendation
  Describes the formal semantics, including some details of the static analysis phase and the dynamic evaluation phase
  Provides some generic typing rules
Too general to guide the implementation of the detailed typing procedure
  E.g.: only a single rule for typing path expressions
Some inconsistency between the summarized formal semantics and the rules
  E.g.:
    Formal semantics: three kinds of predicate (numeric, boolean, typed path)
    Rules: ---
Related Work
Besides numeric, boolean, and typed path, other expressions pre may appear in a predicate [pre]:
  If pre is a string or a sequence, the predicate truth value is true if pre is not empty, and false otherwise.
  In all other cases, a typing error is raised.
Problem: any expression can be used in a predicate. Some of them pass compilation but do not give reasonable results:
  doc("Catalogue.xml")/catalogue/cd["1"]
  doc("Catalogue.xml")/catalogue/cd["price>=30"]
  doc("Catalogue.xml")/catalogue/cd["keyword"]
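A small Python sketch shows why the quoted predicates above are unhelpful. It assumes only the rule stated on this slide (a string or sequence predicate is true when it is not empty); the function name predicate_truth and the placeholder items are hypothetical.

```python
def predicate_truth(pre):
    """Truth value of a string or sequence predicate: true when not empty."""
    if isinstance(pre, (str, list, tuple)):
        return len(pre) > 0
    raise TypeError(f"unsupported predicate value: {pre!r}")


cds = ["cd-1", "cd-2", "cd-3"]  # placeholder items

results = {}
for pre in ("1", "price>=30", "keyword", ""):
    # The string is never evaluated as a query; only its emptiness matters.
    results[pre] = [cd for cd in cds if predicate_truth(pre)]
    print(repr(pre), len(results[pre]))
```

Every non-empty string keeps all three items, so cd["1"] does not select the first CD and cd["price>=30"] does not filter by price, yet no error is reported.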
XQuery Typing System
This system consists of typing rules that describe the detailed typing procedure for XQuery.
An extension of the W3C work:
  Adopt and modify some basic notations to focus on typing
  Try to resolve the inconsistency problem
So far, we have mainly extended the typing rules for path expressions, including predicates.
  Definitions and Notations
  Typing Rules
  Example
  Implementation
Definitions and Notations
A basic type is one of:
  A built-in datatype defined in XML Schema, either primitive or derived. E.g.: string, integer, etc.
  A user-defined simple type. E.g.: "myInteger" defines the integers with a value between 1000 and 2000.
Definitions and Notations
A type is:
  1. A type constant, e.g.: DocumentType, predicate, or
  2. A basic type, or
  3. A type symbol (e.g., a type called "CD"), or
  4. A functional type (n ≥ 0), where {…} are the types of the attributes and τi are the types of the children, or
  5. A disjunction type, built from types τi (n ≥ 0), or
  6. A type with an occurrence indicator
Definitions and Notations
A typing judgement exp : τ
  exp is a typed expression, τ is a type.
  If the type of exp is τ, the typing judgement is true.
A typing rule
  All the premises and the conclusion are typing judgements.
  The conclusion is true, given that all the premises are true.
  If there is no premise, the conclusion is always true.
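Rules of this kind are conventionally written with the premises above a horizontal line and the conclusion below it. The following is a sketch in that common notation, not a reproduction of the slide's own figure; the indexed names exp_i and τ_i are assumptions introduced for illustration.

```latex
% A typing rule with n premises (n >= 0) and one conclusion.
\[
\frac{exp_1 : \tau_1 \qquad \cdots \qquad exp_n : \tau_n}
     {exp : \tau}
\]
% With n = 0 there is nothing above the line, so the conclusion always holds.
\[
\frac{\;}{exp : \tau}
\]
```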
Definitions and Notations
Notations
Typing Rules
Typing rules used for:
  Query1: doc("Catalogue.xml")/catalogue/cd[price>=30]/title
  Query2: doc("Catalogue.xml")/catalogue/cd[price>="30.0"]/title
  Query3: doc("Catalogue.xml")/catalogue/cd[price>="30.00"]/title
Typing doc(f)
  doc(f) is a document function, used to extract data from the XML file f.
  Typing rule: suppose that the type of the root element of the XML file f is τ
Typing Paths
Typing Predicate (numeric, boolean, typed path)
τ1 <: τ2 means that type τ1 is a subtype of type τ2
If exp : τ1 and τ1 <: τ2, then exp : τ2
Use a type called "numeric" (from W3C), which covers decimal, float, double, etc.
Typing rules
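The subtype relation and the rule "if exp : τ1 and τ1 <: τ2, then exp : τ2" can be sketched as a walk up a table of direct supertypes. The table below is a hypothetical fragment chosen for illustration (integer under decimal as in XML Schema, and decimal, float, double under the slide's "numeric"); it is not the full hierarchy.

```python
# Hypothetical fragment of the hierarchy: type -> direct supertype.
SUPERTYPE = {
    "integer": "decimal",
    "decimal": "numeric",
    "float": "numeric",
    "double": "numeric",
}


def is_subtype(t1, t2):
    """Return True when t1 <: t2 (reflexive and transitive)."""
    current = t1
    while current is not None:
        if current == t2:
            return True
        current = SUPERTYPE.get(current)  # None once the top is reached
    return False


# An expression of type integer may be used where numeric is expected,
# while an expression of type string may not.
print(is_subtype("integer", "numeric"))
print(is_subtype("string", "numeric"))
```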
Example
Typing Query 1 with a schema (Catalogue.xsd):
  doc("Catalogue.xml")/catalogue/cd[price>=30]/title
Example
Typing Query 1:
  doc("Catalogue.xml")/catalogue/cd[price>=30]/title
Example
Typing Query 2:
  doc("Catalogue.xml")/catalogue/cd[price>="30.0"]/title
Predicate (numeric, boolean, typed path): check whether price >= "30.0" : boolean
Typing rule (from W3C) for the operator ">=", where τ is numeric
A typing error is generated
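The check that rejects Query 2 can be sketched as follows. It follows the slide's reading of the rule (both operands of ">=" must be numeric, and the result is boolean) and assumes that the schema declares price as a float; the names NUMERIC, TypingError and type_of_ge are hypothetical.

```python
NUMERIC = {"integer", "decimal", "float", "double"}


class TypingError(Exception):
    """Raised when an expression cannot be given a type."""


def type_of_ge(left_type, right_type):
    """Type the expression 'left >= right' from the types of its operands."""
    for operand_type in (left_type, right_type):
        if operand_type not in NUMERIC:
            raise TypingError(f"'>=' expects numeric operands, got {operand_type}")
    return "boolean"


# Query 1: price >= 30, with price : float and 30 : integer
print(type_of_ge("float", "integer"))

# Query 2: price >= "30.0", with price : float and "30.0" : string
try:
    type_of_ge("float", "string")
except TypingError as err:
    print("typing error:", err)
```

Because the predicate of Query 2 cannot be typed as boolean, the error is reported at compile time instead of silently producing a wrong result.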
Implementation
In order to apply those typing rules, we need to:
  parse an XQuery expression into an abstract syntax tree
  apply the rules by navigating through the tree, adding type information to the nodes
Our implementation:
  XQueryX: an XML representation of the XQuery syntax
  TOM: an extension of Java designed to manipulate tree structures and XML documents, using pattern-matching facilities
Framework
Example: Query 1: doc("Catalogue.xml")/catalogue/cd[price>=30]/title
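The two steps (build a tree, then walk it and attach types) can be sketched in Python. This is not the TOM implementation: the node classes, the SCHEMA and DOC_TYPES tables, and the type names in them are all hypothetical, predicates are left out, and occurrence indicators are ignored to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical schema: type name -> {child element name: child type}.
SCHEMA = {
    "CatalogueDoc": {"catalogue": "Catalogue"},
    "Catalogue": {"cd": "CD"},
    "CD": {"title": "string", "price": "float"},
}
# Hypothetical mapping from a file name to the type of its document.
DOC_TYPES = {"Catalogue.xml": "CatalogueDoc"}


@dataclass
class Doc:
    file: str
    ty: Optional[str] = None  # filled in by annotate


@dataclass
class Child:
    source: object  # the Doc or Child node this step starts from
    name: str
    ty: Optional[str] = None  # filled in by annotate


def annotate(node):
    """Walk the tree bottom-up, store a type on each node, return the node's type."""
    if isinstance(node, Doc):
        node.ty = DOC_TYPES[node.file]
    else:
        context = annotate(node.source)
        children = SCHEMA.get(context, {})
        if node.name not in children:
            raise TypeError(f"type {context} has no child named {node.name}")
        node.ty = children[node.name]
    return node.ty


# doc("Catalogue.xml")/catalogue/cd/title, written as nested steps
query = Child(Child(Child(Doc("Catalogue.xml"), "catalogue"), "cd"), "title")
print(annotate(query))
```

After the call, every node in the tree carries its type, so a later step such as checking a predicate on cd can read the context type directly from the node.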
XQueryX expression for Query 1 (element content, in order):
  function call: doc, argument Catalogue.xml
  step: child catalogue
  step: child cd, with predicate operands price and 30
  step: child title
Apply the Rules by Using TOM
  Rules for typing doc(f)
  Rules for typing each step in a path expression
Conclusion and Future Work
Conclusion
  We analyzed the related work on typing XQuery and resolved some inconsistencies by extending the typing rules.
  A prototype of the XQuery Typing System has been implemented, including detailed typing rules for the path expressions in XQuery.
Future Work
  Implement all the typing rules in the W3C work; find and resolve potential inconsistencies
  Design a typing system for Xcerpt
  Find a polymorphic typing system for Web query languages
Thank You
Source: Catalogue.xml (element content)
  "Empire Burlesque"  Bob Dylan  Empire  Bob
  Hide your heart  Bonnie Tyler
  Stop  Sam Brown
Query: doc("Catalogue.xml")/catalogue/cd[price>=30]/title
Result:
  Hide your heart
  Stop
Result Hide your heart Stop Source: Catalogue.xml Query 1 Query 2 Query 3 Result Stop 30 Incorrect Correct Query1: doc("Catalogue.xml")/catalogue/cd[price>=30]/title Query2: doc("Catalogue.xml")/catalogue/cd[price>="30.0"]/title Query3: doc("Catalogue.xml")/catalogue/cd[price>="30.00"]/title
Result Hide your heart Stop Source: Catalogue.xml Query 1 Query 2 Query 3 Result Stop Incorrect Correct 30.0 Query1: doc("Catalogue.xml")/catalogue/cd[price>=30]/title Query2: doc("Catalogue.xml")/catalogue/cd[price>="30.0"]/title Query3: doc("Catalogue.xml")/catalogue/cd[price>="30.00"]/title
Result Hide your heart Stop Source: Catalogue.xml Query 1 Query 2 Query 3 Correct Query1: doc("Catalogue.xml")/catalogue/cd[price>=30]/title Query2: doc("Catalogue.xml")/catalogue/cd[price>="30.0"]/title Query3: doc("Catalogue.xml")/catalogue/cd[price>="30.00"]/title
Example: Schema file “Catalogue.xsd”