XML-QL: A Query Language for XML
Charuta Nakhe
Querying XML documents
What is a query language?
Why not adapt SQL or OQL to query XML data?
What is an XML query?
  What is the database? XML documents.
  What is the input to the query? An XML document.
  What is the output of the query? An XML document.
Requirements of an XML query language
Query operations:
  Selection, e.g. find books with "S. Sudarshan" as author
  Extraction, e.g. extract the publisher field of the above books
  Restructuring: restructuring of elements
  Combination: queries over more than one document
Must be able to transform and create XML structures
Must be able to query even in the absence of a schema
The XML-QL language
XML-QL is designed with the following features:
  It is declarative, like SQL.
  It is relationally complete, e.g. it can express joins.
  It can be implemented with known database techniques.
  It can extract data from existing XML documents and construct new XML documents.
XML-QL is implemented as a prototype and is freely available in a Java version.
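Example XML document
  <bib>
    <book>
      <title>Inside COM</title>
      <author>Dale Rogerson</author>
      <publisher>Microsoft</publisher>
    </book>
    <book>
      <title>Database system concepts</title>
      <author>S. Sudarshan</author>
      <author>H. Korth</author>
      <publisher>McGraw Hill</publisher>
    </book>
  </bib>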
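Matching data using patterns
Find those authors who have published books for McGraw Hill:
  WHERE <book>
          <publisher>McGraw Hill</publisher>
          <title>$t</title>
          <author>$a</author>
        </book> IN "bib.xml"
  CONSTRUCT <result>
              <title>$t</title>
              <author>$a</author>
            </result>
$t and $a are variables that pick out element contents.
The output is a collection of title and author pairs, one for each author of a matching book.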
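As a rough illustration of what this pattern match does, the following Python sketch emulates it with the standard library's xml.etree.ElementTree. It is not the XML-QL prototype. The element names (bib, book, title, author, publisher, result), the sample data, and the helper names match_books and construct_results are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Assumed stand-in for bib.xml; the element names are illustrative.
BIB_XML = """
<bib>
  <book>
    <title>Inside COM</title>
    <author>Dale Rogerson</author>
    <publisher>Microsoft</publisher>
  </book>
  <book>
    <title>Database system concepts</title>
    <author>S. Sudarshan</author>
    <author>H. Korth</author>
    <publisher>McGraw Hill</publisher>
  </book>
</bib>
"""


def match_books(root, publisher):
    """Yield one (title, author) binding per author of each matching book."""
    for book in root.iter("book"):
        if book.findtext("publisher") != publisher:
            continue
        title = book.findtext("title")
        for author in book.findall("author"):
            yield title, author.text


def construct_results(bindings):
    """Build one <result> element for every (title, author) binding."""
    results = []
    for title, author in bindings:
        result = ET.Element("result")
        ET.SubElement(result, "title").text = title
        ET.SubElement(result, "author").text = author
        results.append(result)
    return results


root = ET.fromstring(BIB_XML)
pairs = list(match_books(root, "McGraw Hill"))
for element in construct_results(pairs):
    print(ET.tostring(element, encoding="unicode"))
```

The generator plays the role of the WHERE clause (it produces variable bindings) and construct_results plays the role of CONSTRUCT (it builds one output element per binding).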
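Result XML document
  <result>
    <title>Database system concepts</title>
    <author>S. Sudarshan</author>
  </result>
  <result>
    <title>Database system concepts</title>
    <author>H. Korth</author>
  </result>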
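Grouping with Nested Queries
Group results by book title:
  WHERE <book>$p</book> IN "bib.xml",
        <title>$t</title>,
        <publisher>McGraw Hill</publisher> IN $p
  CONSTRUCT <result>
              <title>$t</title>
              WHERE <author>$a</author> IN $p
              CONSTRUCT <author>$a</author>
            </result>
Produces one result for each title, containing a list of all its authors.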
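The effect of the nested query can be sketched in Python with xml.etree.ElementTree: the outer loop binds each matching book, and the inner loop collects its authors into the same result. This is only an emulation under assumptions; the element names, the sample data, and the function name group_by_title are made up for the example.

```python
import xml.etree.ElementTree as ET

# Assumed stand-in for bib.xml; the element names are illustrative.
BIB_XML = """
<bib>
  <book>
    <title>Inside COM</title>
    <author>Dale Rogerson</author>
    <publisher>Microsoft</publisher>
  </book>
  <book>
    <title>Database system concepts</title>
    <author>S. Sudarshan</author>
    <author>H. Korth</author>
    <publisher>McGraw Hill</publisher>
  </book>
</bib>
"""


def group_by_title(root, publisher):
    """Return one <result> per matching book, listing all of its authors."""
    results = []
    for book in root.iter("book"):            # outer WHERE: bind the book
        if book.findtext("publisher") != publisher:
            continue
        result = ET.Element("result")
        ET.SubElement(result, "title").text = book.findtext("title")
        for author in book.findall("author"):  # nested WHERE: bind each author
            ET.SubElement(result, "author").text = author.text
        results.append(result)
    return results


root = ET.fromstring(BIB_XML)
grouped = group_by_title(root, "McGraw Hill")
for element in grouped:
    print(ET.tostring(element, encoding="unicode"))
```

Because the author loop runs inside the per-book result, each title appears once, in contrast to the flat pattern, which yields one result per author.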
Result XML document
  <result>
    <title>Database system concepts</title>
    <author>S. Sudarshan</author>
    <author>H. Korth</author>
  </result>
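Constructing XML data
Results of a query can be wrapped in XML:
  WHERE <book>
          <publisher>McGraw Hill</publisher>
          <title>$t</title>
          <author>$a</author>
        </book> IN "bib.xml"
  CONSTRUCT <result>
              <author>$a</author>
              <title>$t</title>
            </result>
Results are grouped in <result> elements.
The pattern matches once for each author, so a book with several authors appears in several results.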
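Joining elements by value
Find all articles that have at least one author who has also written a book since 1995:
  WHERE <article>
          <author>$n</author>
        </article> CONTENT_AS $a IN "bib.xml",
        <book year=$y>
          <author>$n</author>
        </book> IN "bib.xml",
        $y > 1995
  CONSTRUCT <article>$a</article>
CONTENT_AS $a following a pattern binds the content of the matching element to the variable $a.
Using the same variable $n in both patterns joins articles and books on the author's name.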
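A value join of this kind can be sketched in Python as follows. The sample data, the year attribute on book, and the function name articles_by_recent_book_authors are assumptions for illustration. Unlike a binding-by-binding evaluation, this sketch returns each matching article only once.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: books carry a year attribute, articles do not.
BIB_XML = """
<bib>
  <book year="1997"><title>Book A</title><author>Smith</author></book>
  <book year="1993"><title>Book B</title><author>Jones</author></book>
  <article><title>Article X</title><author>Smith</author><author>Lee</author></article>
  <article><title>Article Y</title><author>Jones</author></article>
</bib>
"""


def articles_by_recent_book_authors(root, after_year):
    """Return articles sharing at least one author with a book newer than after_year."""
    # First side of the join: authors of sufficiently recent books.
    recent_authors = {
        author.text
        for book in root.iter("book")
        if int(book.get("year", "0")) > after_year
        for author in book.findall("author")
    }
    # Second side: keep articles whose author list intersects that set.
    return [
        article
        for article in root.iter("article")
        if any(author.text in recent_authors for author in article.findall("author"))
    ]


root = ET.fromstring(BIB_XML)
matches = articles_by_recent_book_authors(root, 1995)
titles = [article.findtext("title") for article in matches]
print(titles)
```

Building the set of author names first turns the join into a simple membership test, which is one conventional way to evaluate an equality join.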
Tag variables
Find all publications in 1995 where Smith is either an author or editor:
  WHERE <$p>
          <title>$t</title>
          <year>1995</year>
          <$e>Smith</$e>
        </$p> IN "bib.xml",
        $e IN {author, editor}
  CONSTRUCT <$p>
              <title>$t</title>
              <$e>Smith</$e>
            </$p>
$p matches book and article. $e matches author and editor.
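Tag variables can be pictured as variables that range over element names instead of element contents. The Python sketch below emulates this with xml.etree.ElementTree; the sample data, the year child element, and the function name publications_by are assumptions for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical data mixing books and articles, authors and editors.
BIB_XML = """
<bib>
  <book><title>Book A</title><year>1995</year><editor>Smith</editor></book>
  <article><title>Article X</title><year>1995</year><author>Smith</author></article>
  <book><title>Book B</title><year>1996</year><author>Smith</author></book>
  <book><title>Book C</title><year>1995</year><author>Jones</author></book>
</bib>
"""


def publications_by(root, name, year, roles=("author", "editor")):
    """Return publications of any tag where `name` holds one of `roles` in `year`."""
    results = []
    for publication in root:                  # tag variable: any element name
        if publication.findtext("year") != year:
            continue
        for role in roles:                    # tag variable restricted to roles
            if any(e.text == name for e in publication.findall(role)):
                out = ET.Element(publication.tag)   # reuse the matched tag
                ET.SubElement(out, "title").text = publication.findtext("title")
                ET.SubElement(out, role).text = name
                results.append(out)
    return results


root = ET.fromstring(BIB_XML)
found = publications_by(root, "Smith", "1995")
for element in found:
    print(ET.tostring(element, encoding="unicode"))
```

The output elements reuse whichever tag names were matched, so a book edited by Smith comes back as a book with an editor child, and an article written by Smith as an article with an author child.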
Regular-path expressions
Find the name of every part element that contains a brand element equal to "Ford", regardless of the nesting level at which $r occurs:
  WHERE <part*>
          <name>$r</name>
          <brand>Ford</brand>
        </> IN "bib.xml"
  CONSTRUCT <result>$r</result>
Regular path expressions can specify element paths of arbitrary depth.
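Matching at arbitrary nesting depth can be sketched as a recursive walk over nested part elements. The sample data and the function name names_of_ford_parts are assumptions, and the sketch simplifies matters by following chains of one or more part elements only.

```python
import xml.etree.ElementTree as ET

# Hypothetical parts catalogue with parts nested inside parts.
PARTS_XML = """
<parts>
  <part>
    <name>engine</name><brand>Ford</brand>
    <part><name>piston</name><brand>Ford</brand></part>
    <part>
      <name>valve</name><brand>Bosch</brand>
      <part><name>spring</name><brand>Ford</brand></part>
    </part>
  </part>
  <part><name>tyre</name><brand>Michelin</brand></part>
</parts>
"""


def names_of_ford_parts(element, brand="Ford"):
    """Collect names of matching parts reachable through nested <part> elements."""
    names = []
    for part in element.findall("part"):      # one more step along the part path
        if part.findtext("brand") == brand:
            names.append(part.findtext("name"))
        names.extend(names_of_ford_parts(part, brand))  # descend to any depth
    return names


root = ET.fromstring(PARTS_XML)
ford_names = names_of_ford_parts(root)
print(ford_names)
```

A part is reported whenever its own brand matches, even when an enclosing part has a different brand, which is why spring is found underneath valve.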
Other interesting features
  Constructing an explicit root element
  Grouping of data
  Transforming XML data
  Integrating data from different XML sources
Links for more information
  The XML-QL W3C Note
  The XML-QL home page
  The XML Query Working Group
  XML Query Requirements (W3C Working Draft)
  Robin Cover's page on XML query languages
Example DTD
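Creating an explicit root element
Every XML document must have a single root. XML-QL supplies a default root element, but another may be specified:
  CONSTRUCT <results> {
    WHERE <book>
            <publisher>McGraw Hill</publisher>
            <title>$t</title>
            <author>$a</author>
          </book> IN "bib.xml"
    CONSTRUCT <result>
                <author>$a</author>
                <title>$t</title>
              </result>
  } </results>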
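Wrapping query results in a single root can be sketched in Python by building the per-match elements first and then attaching them to one parent. The root tag results, the element names, the sample data, and the function names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Assumed stand-in for bib.xml; the element names are illustrative.
BIB_XML = """
<bib>
  <book>
    <title>Database system concepts</title>
    <author>S. Sudarshan</author>
    <author>H. Korth</author>
    <publisher>McGraw Hill</publisher>
  </book>
</bib>
"""


def author_title_results(root, publisher):
    """Build one <result> (author, then title) per author of each matching book."""
    results = []
    for book in root.iter("book"):
        if book.findtext("publisher") != publisher:
            continue
        for author in book.findall("author"):
            result = ET.Element("result")
            ET.SubElement(result, "author").text = author.text
            ET.SubElement(result, "title").text = book.findtext("title")
            results.append(result)
    return results


def wrap_in_root(elements, root_tag="results"):
    """Attach all result elements to one explicit root element."""
    wrapper = ET.Element(root_tag)
    wrapper.extend(elements)
    return wrapper


source = ET.fromstring(BIB_XML)
document = wrap_in_root(author_title_results(source, "McGraw Hill"))
print(ET.tostring(document, encoding="unicode"))
```

With the wrapper in place, the output is a well-formed document with exactly one root, whatever the number of matches, including zero.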