XML: Query Languages, Extracting from Relational Databases
ADVANCED DATABASES
Khawaja Mohiuddin, Assistant Professor
Department of Computer Sciences, Bahria University (Karachi Campus)
Content for this lecture is taken from Chapter 11 of “Database Systems: Models, Languages …”, 6th Ed., by Elmasri and Navathe (Chapter 12 of “Fundamentals of Database Systems”, 6th Ed., by Elmasri and Navathe)
Topics to Cover
XML Languages
  XPath
  XQuery
Extracting XML Documents from Relational Databases
XML Languages
Two query language standards have emerged:
XPath
  Provides language constructs for specifying path expressions that identify nodes (elements) or attributes within an XML document that match specific patterns
XQuery
  Uses XPath expressions but has additional constructs
XPath: Specifying Path Expressions in XML
XPath expression
  Returns a sequence of items that satisfy a certain pattern as specified by the expression
  Returned items are values (from leaf nodes), elements, or attributes
Qualifier conditions
  Further restrict the nodes that satisfy the pattern
Separators used when specifying a path:
  Single slash (/) - the tag must appear as a direct child of the previous (parent) tag
  Double slash (//) - the tag can appear as a descendant of the previous tag at any level
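The difference between the two separators can be sketched with Python's standard xml.etree.ElementTree module, which supports a limited subset of XPath. This is only an illustration: the sample COMPANY document below is hypothetical (element names follow the slides, values are made up), and paths are written relative to the root element, so /company/department becomes ./department.

```python
import xml.etree.ElementTree as ET

# Hypothetical COMPANY document; values are invented for illustration.
COMPANY_XML = """
<company>
  <department>
    <departmentName>Research</departmentName>
    <employee>
      <employeeName>Smith</employeeName>
      <employeeSalary>30000</employeeSalary>
    </employee>
  </department>
  <department>
    <departmentName>Administration</departmentName>
    <employee>
      <employeeName>Wallace</employeeName>
      <employeeSalary>43000</employeeSalary>
    </employee>
  </department>
</company>
"""

root = ET.fromstring(COMPANY_XML)  # root is the <company> element

# Single slash: department must be a direct child of company.
departments = root.findall("./department")

# Single slash: employeeName is not a direct child of company, so nothing matches.
direct_names = root.findall("./employeeName")

# Double slash: employeeName may appear at any depth below company.
all_names = [e.text for e in root.findall(".//employeeName")]

print(len(departments), len(direct_names), all_names)
```

The double-slash form finds both names without spelling out the department/employee steps, which is the convenience the slide describes.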
XPath: Specifying Path Expressions in XML (cont’d.)
1. Returns the company root node and all its descendant nodes, i.e., the whole XML document.
2. Returns all department nodes (elements) and their descendant subtrees.
3. // is convenient when we do not know the full path name we are searching for, but do know the names of some tags of interest within the XML document. Returns all employeeName nodes that are direct children of an employee node, such that the employee node has employeeSalary value greater than
XPath: Specifying Path Expressions in XML (cont’d.)
XPath has a number of comparison operations for use in qualifier conditions, including standard arithmetic, string, and set comparison operations.
4. Returns the same result as the previous example, except that the full path name is specified.
5. Returns all projectWorker nodes and their descendant nodes that are children under the path /company/project and have a child node hours with a value greater than or equal to 20.0.
Note: For COMPANY XML document, stored at the first XPath expression should be written as: doc(
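A qualifier condition such as "hours greater than or equal to 20.0" can be imitated in Python as follows. This is a sketch under assumptions: the sample document and the ssn child element are hypothetical, and because the XPath subset in xml.etree.ElementTree does not support numeric comparisons in predicates, the path selects the candidate nodes and the condition is applied in plain Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical project data; element names other than projectWorker and
# hours are assumptions made for this example.
PROJECTS_XML = """
<company>
  <project>
    <projectName>ProductX</projectName>
    <projectWorker><ssn>123456789</ssn><hours>32.5</hours></projectWorker>
    <projectWorker><ssn>453453453</ssn><hours>20.0</hours></projectWorker>
    <projectWorker><ssn>333445555</ssn><hours>10.0</hours></projectWorker>
  </project>
</company>
"""

root = ET.fromstring(PROJECTS_XML)

# Path part: projectWorker nodes that are children under company/project.
workers = root.findall("./project/projectWorker")

# Qualifier part: keep only workers whose hours child is >= 20.0.
selected = [w for w in workers if float(w.findtext("hours")) >= 20.0]

selected_ssns = [w.findtext("ssn") for w in selected]
print(selected_ssns)
```

A full XPath processor would evaluate the condition inside the square brackets of the path itself; splitting it out here is a limitation of the chosen library, not of XPath.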
XPath: Specifying Path Expressions in XML (cont’d.)
To include attributes in an XPath expression, the attribute name is prefixed by the @ symbol.
Wildcard symbol *
  Stands for any element
  Example: /company/*
  The result can be a sequence of different types of items
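The wildcard and the attribute prefix can be tried out with xml.etree.ElementTree. The document and the attribute names deptNo and projNo below are hypothetical. Note that ElementTree paths return elements only, so the attribute value is read with get() rather than selected as a node of its own.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; deptNo and projNo are made-up attribute names.
XML = """
<company>
  <department deptNo="5"><departmentName>Research</departmentName></department>
  <department><departmentName>Headquarters</departmentName></department>
  <project projNo="1"><projectName>ProductX</projectName></project>
</company>
"""

root = ET.fromstring(XML)

# Wildcard: /company/* matches every child element, whatever its tag.
child_tags = [child.tag for child in root.findall("./*")]

# Attribute test: only department elements that carry a deptNo attribute.
with_dept_no = root.findall("./department[@deptNo]")
dept_numbers = [d.get("deptNo") for d in with_dept_no]

print(child_tags, dept_numbers)
```

The wildcard result mixes department and project elements, which is what the slide means by a sequence of different types of items.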
XPath: Specifying Path Expressions in XML (cont’d.)
Axes
Our previous examples included only three:
1. Child of the current node (/)
2. Descendant or self at any level of the current node (//)
3. Attribute of the current node (@)
A more general model for path expressions, called axes, has been proposed. It makes it possible to move in multiple directions from the current node in a path expression.
Includes self, child, descendant, attribute, parent, ancestor, previous sibling, and next sibling
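A few of these directions can be demonstrated with the abbreviated steps that xml.etree.ElementTree understands: "." for self, a tag name for child, "//" for descendant, and ".." for parent. This is a partial sketch on a hypothetical document; ElementTree does not implement the ancestor or sibling axes, so those are not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-employee document, for illustration only.
XML = """
<company>
  <employee>
    <employeeName>Smith</employeeName>
    <employeeSalary>30000</employeeSalary>
  </employee>
</company>
"""

root = ET.fromstring(XML)

self_step = root.findall(".")                       # self: the company node
child_step = root.findall("./employee")             # child of company
descendant_step = root.findall(".//employeeName")   # descendant at any level
parent_step = root.findall(".//employeeName/..")    # parent of employeeName

steps = [[e.tag for e in result]
         for result in (self_step, child_step, descendant_step, parent_step)]
print(steps)
```

The last path moves down to employeeName and then back up, which is the kind of multi-directional navigation the axes model allows.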
XPath: Specifying Path Expressions in XML (cont’d.)
Main restriction of XPath path expressions
  The path that specifies the pattern also specifies the items to be retrieved
  It is difficult to specify certain conditions on the pattern while separately specifying which result items should be retrieved
The XQuery language separates these two concerns and provides more powerful constructs for specifying queries
XQuery: Specifying Queries in XML
The typical form of a query in XQuery is known as a FLWR expression
FLWR stands for the four main clauses of XQuery:
1. FOR
2. LET
3. WHERE
4. RETURN
Zero or more instances of the FOR and LET clauses are allowed
The WHERE clause is optional, but can appear at most once
The RETURN clause must appear exactly once
1. Variables are prefixed with the $ sign.
2. The LET clause assigns a variable to a particular expression for the rest of the query.
3. The FOR clause assigns a variable to range over each of the individual items in a sequence.
4. The WHERE clause specifies additional conditions on the selection of items.
5. The RETURN clause specifies which elements or attributes should be retrieved from the items that satisfy the query conditions.
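The roles of the four clauses can be mirrored in ordinary Python. The code below is an analogy, not XQuery: the sample document, the function name high_earners, and the res wrapper element are all assumptions made for this sketch. Each statement is annotated with the clause it corresponds to.

```python
import xml.etree.ElementTree as ET

# Hypothetical employee data, for illustration only.
XML = """
<company>
  <employee><employeeName>Smith</employeeName><employeeSalary>30000</employeeSalary></employee>
  <employee><employeeName>Wong</employeeName><employeeSalary>40000</employeeSalary></employee>
  <employee><employeeName>Borg</employeeName><employeeSalary>55000</employeeSalary></employee>
</company>
"""

def high_earners(xml_text, threshold):
    """Return one <res> element per employee earning more than threshold."""
    d = ET.fromstring(xml_text)                  # LET: bind d for the rest of the query
    results = []
    for x in d.findall("./employee"):            # FOR: x ranges over each employee
        salary = float(x.findtext("employeeSalary"))
        if salary > threshold:                   # WHERE: extra selection condition
            res = ET.Element("res")              # RETURN: build the result item
            ET.SubElement(res, "employeeName").text = x.findtext("employeeName")
            results.append(res)
    return results

names = [r.findtext("employeeName") for r in high_earners(XML, 35000)]
print(names)
```

The loop body decides what is returned independently of the path that selected the employees, which is the separation of concerns that plain XPath lacks.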
XQuery: Specifying Queries in XML (cont’d.)
Other examples
XQuery: Specifying Queries in XML (cont’d.)
XQuery contains powerful constructs for specifying complex queries:
  Universal and existential quantifiers in query conditions, aggregate functions, ordering of query results, selection based on position in a sequence, and even conditional branching
  Hence, in some ways, it qualifies as a full-fledged programming language
The W3C Web site contains documents describing the latest standards related to XML and XQuery
Other Languages and Protocols Related to XML
The goal is to provide the technology for realizing the Semantic Web, where all information on the Web can be intelligently located and processed.
Extensible Stylesheet Language (XSL)
  Defines how a document should be rendered for display by a Web browser
Extensible Stylesheet Language for Transformations (XSLT)
  Transforms one structure into a different structure
Other Languages and Protocols Related to XML (cont’d.)
Web Services Description Language (WSDL)
  Allows Web Services to be described in XML, making the Web Service available to users and programs over the Web
Simple Object Access Protocol (SOAP)
  Platform-independent and programming language-independent protocol for messaging and remote procedure calls
Resource Description Framework (RDF)
  Provides languages and tools for exchanging and processing metadata (schema) descriptions and specifications over the Web
Extracting XML Documents from Relational Databases
Creating hierarchical XML views over flat or graph-based data
Representational issues arise when converting data from a relational database system into XML documents
UNIVERSITY database example
In this hierarchy, the combined COURSE/SECTION information is replicated under each student who completed the section.
Breaking Cycles to Convert Graphs into Trees
A more complex subset may contain one or more cycles
  Cycles indicate multiple relationships among the entities
  They make it even more difficult to decide how to create the document hierarchies
Can replicate the entity types involved to break the cycles and convert the graph into a hierarchy
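One possible way to picture replication is to expand a graph of entity types into a tree starting from a chosen root. The sketch below makes several assumptions: the function to_tree and the edge list are hypothetical and are not taken from the lecture's figures, and the rule "do not revisit an entity type already on the current path" is just one simple way to stop a cycle.

```python
def to_tree(graph, root, path=()):
    """Expand a graph of entity types into a nested (name, children) tree.

    An entity type reachable along several paths is replicated under each
    parent; an entity type already on the current path is not expanded
    again, which breaks directed cycles.
    """
    path = path + (root,)
    children = [to_tree(graph, nxt, path)
                for nxt in graph.get(root, [])
                if nxt not in path]
    return (root, children)

# Hypothetical relationships: DEPARTMENT is reachable through both COURSE
# and INSTRUCTOR, and the DEPARTMENT -> STUDENT edge closes a cycle.
graph = {
    "STUDENT": ["SECTION"],
    "SECTION": ["COURSE", "INSTRUCTOR"],
    "COURSE": ["DEPARTMENT"],
    "INSTRUCTOR": ["DEPARTMENT"],
    "DEPARTMENT": ["STUDENT"],
}

tree = to_tree(graph, "STUDENT")
print(tree)
```

In the resulting tree DEPARTMENT appears twice, once under COURSE and once under INSTRUCTOR, so the hierarchy is obtained at the cost of duplicated entities.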
Other Steps for Extracting XML Documents from Databases
1. Create the correct query in SQL to extract the desired information for the XML document
2. Restructure the query result from flat relational form to the XML tree structure
3. Customize the query to select either a single object or multiple objects into the document
For example: In Figure 12.13, the query can select a single student entity and create a document corresponding to that single student, or it may select several, or even all, of the students and create a document with multiple students.
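The three steps can be sketched end to end with Python's sqlite3 and xml.etree.ElementTree modules. Everything specific here is an assumption made for illustration: the table and column names, the sample rows, the function students_to_xml, and the output element names are hypothetical and do not reproduce the UNIVERSITY schema or the figure.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical miniature schema and data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (ssn TEXT PRIMARY KEY, name TEXT);
CREATE TABLE section (section_id INTEGER PRIMARY KEY, course_name TEXT);
CREATE TABLE grade (ssn TEXT, section_id INTEGER, grade TEXT);
INSERT INTO student VALUES ('111', 'Smith'), ('222', 'Brown');
INSERT INTO section VALUES (85, 'Databases'), (92, 'Data Structures');
INSERT INTO grade VALUES ('111', 85, 'A'), ('222', 85, 'B'), ('222', 92, 'A');
""")

def students_to_xml(conn, ssn=None):
    # Step 1: SQL query that extracts the desired information as flat rows.
    sql = """SELECT s.ssn, s.name, sec.section_id, sec.course_name, g.grade
             FROM student s
             JOIN grade g ON g.ssn = s.ssn
             JOIN section sec ON sec.section_id = g.section_id"""
    params = ()
    if ssn is not None:
        # Step 3: customize the query to select a single student.
        sql += " WHERE s.ssn = ?"
        params = (ssn,)
    sql += " ORDER BY s.ssn, sec.section_id"

    # Step 2: restructure the flat rows into a student -> section tree.
    root = ET.Element("students")
    by_ssn = {}
    for s_ssn, name, section_id, course_name, grade in conn.execute(sql, params):
        student = by_ssn.get(s_ssn)
        if student is None:
            student = ET.SubElement(root, "student")
            ET.SubElement(student, "ssn").text = s_ssn
            ET.SubElement(student, "name").text = name
            by_ssn[s_ssn] = student
        section = ET.SubElement(student, "section")
        ET.SubElement(section, "number").text = str(section_id)
        ET.SubElement(section, "course").text = course_name
        ET.SubElement(section, "grade").text = grade
    return root

all_students = students_to_xml(conn)
one_student = students_to_xml(conn, "222")
print(ET.tostring(one_student, encoding="unicode"))
```

In the document for all students, the Databases section appears once under each student who has a grade in it, so course and section information is replicated in the hierarchy.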
Conclusion
XPath and XQuery languages
  Query XML data
Extracting XML documents from relational databases involves converting from a flat to a hierarchical structure.
There may be cycles, which need to be broken by replicating some entities.
Beyond the conversion itself, some additional steps need to be taken.