About XML/Xquery/RDF 4/1. Why XML XML is the confluence of several factors: –The Web needed a more declarative format for data, trying to describe the.

About XML/XQuery/RDF 4/1

Why XML
XML is the confluence of several factors:
– The Web needed a more declarative format for data, one that tries to describe the meaning of the data
– Documents needed a mechanism for extended tags
– Database people needed a more flexible interchange format
Original expectation:
– The whole web would go to XML instead of HTML
Today’s reality:
– Not so… But XML is used all over “under the covers”

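[Figure: a spectrum from less structure to more structure, with TEXT at the less-structured end, structured (relational) data at the more-structured end, and XML in between]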

An XML Document Example
Fugitive, The
Roger Ebert gives two thumbs up! A fun action movie, Harrison Ford at his best. The standard &hollywood; summer movie strikes back.
183,752,965
X Files, The
4
Labels: Start Tag, End Tag, Attribute, Element, Mixed Content

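XML Terminology
tags: book, title, author, …
start tag: <book>, end tag: </book>
elements: <book>…</book>, <author>…</author>
elements are nested
empty element: an element with no content, e.g. <title></title>, abbreviated <title/>
an XML document: has a single root element
well-formed XML document: one whose tags all match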
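
The terminology above can be checked mechanically. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` parser; the helper name `is_well_formed` and the four sample documents are hypothetical and do not come from the slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as one root element with matching tags."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents, for illustration only.
nested = "<book><title>Some Title</title><author>Some Author</author></book>"
empty_abbrev = "<book><title/></book>"          # <title/> abbreviates <title></title>
mismatched = "<book><title>Some Title</book>"   # <title> is never closed
two_roots = "<book/><book/>"                    # more than one root element

for doc in (nested, empty_abbrev, mismatched, two_roots):
    print(is_well_formed(doc), doc)
```

Well-formedness is purely syntactic: nothing here checks the document against a DTD or schema.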

More XML: Attributes
Foundations of Databases, Abiteboul, …, 1995
Attributes are single-valued
– No guidance on when to use them

More XML: Oids and References
Jane, Mary, John
Object identifiers
oids and references in XML are just syntax
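
"Just syntax" means the parser does not follow references; the application has to resolve them. A minimal sketch of that resolution step follows, assuming Python's `xml.etree.ElementTree`. The attribute names `id` and `idref`, the id values, and the parent/child relationships are all assumptions made for illustration; only the three names appear on the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: attribute names, id values and family links are assumed.
DOC = """
<people>
  <person id="p1"><name>Jane</name></person>
  <person id="p2"><name>Mary</name><children idref="p1 p3"/></person>
  <person id="p3"><name>John</name></person>
</people>
"""

root = ET.fromstring(DOC)

# The parser treats ids as ordinary attribute strings, so build the index ourselves.
by_id = {p.get("id"): p for p in root.iter("person")}


def children_names(person):
    """Resolve a person's space-separated idref list into the referenced names."""
    refs = person.find("children")
    if refs is None:
        return []
    return [by_id[r].findtext("name") for r in refs.get("idref", "").split()]


print(children_names(by_id["p2"]))
```

A dangling reference would raise `KeyError` here, which is the point: nothing in the XML syntax itself guarantees that a reference has a target.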

HTML vs. XML
HTML:
Bibliography
Foundations of Databases. Abiteboul, Hull, Vianu. Addison Wesley, 1995
Data on the Web. Abiteboul, Buneman, Suciu. Morgan Kaufmann, 1999
XML:
Foundations… / Abiteboul / Hull / Vianu / Addison Wesley / 1995 / …
“Self-describing”
– Schema info is part of the data
– Good for data exchange (albeit baroque for storage)

HTML:
Bibliography
Foundations of Databases. Abiteboul, Hull, Vianu. Addison Wesley, 1995
Data on the Web. Abiteboul, Buneman, Suciu. Morgan Kaufmann, 1999
XML:
Foundations… / Abiteboul / Hull / Vianu / Addison Wesley / 1995 / …
HTML describes presentation
XML describes content

Why are Database folks so excited about XML?
XML is just a syntax for (self-describing) data
This is still exciting because
– There is no standard syntax for relational data
– With XML, we can
   translate any legacy data to XML
   exchange data in XML format: ship it over the web, input it to any application

XML ≠ machine accessible meaning
This is what a web page in natural language looks like to a machine
Jim Hendler

XML ≠ machine accessible meaning
Tags on the example CV: name, education, work, private
XML allows “meaningful tags” to be added to parts of the text
Jim Hendler

XML ≠ machine accessible meaning
Tags on the example CV: name, education, work, private
But to your machine, the tags look like this…
Jim Hendler

XML ≠ machine accessible meaning
Schemas help…
…by relating common terms between documents
Jim Hendler

But other people use other schemas
Tags on the example CV: name, education, work, private
Someone else has one like this…
Jim Hendler

But other people use other schemas
…which don’t fit in
Moral: There is still a need for ontology mapping.
Jim Hendler

The X-standards…
XML: an on-the-wire representation for data
– XQuery: a query language for XML
– XSchema: a schema description language for XML data
RDF: a language for metadata description
WSDL/SOAP/UDDI: languages for describing services

XML Dialect “pot pourri” Extensible Financial Reporting Markup Language (XFRML), eXtensible Business Reporting Language (XBRL), MusicXML, Spacecraft Markup Language (SML), Bank Internet Payment System (BIPS), Bioinformatic Sequence Markup Language (BSML), Biopolymer Markup Language (BIOML), Open Catalog Format (OCF), Chemical Markup Language (CML), Electronic Business XML Initiative (ebXML), Open Trading Protocol (OTP), FinXML, Financial Information eXchange protocol (FIX), RecipeML, CVML, XML Bookmark Exchange Language (XBEL), Scalable Vector Graphics (SVG), NewsML, DocBook, Real Estate Listing Markup Language (RELML),...

XML vs. Relational Data
XML is meant as a language that supports both text and structured data
– Conflicting demands...
XML supports semi-structured data
– In essence, the schema can be the union of multiple schemas
– Easy to represent books with or without prices, books with any number of authors, etc.
XML supports free mixing of text and data
– using the #PCDATA type
XML is ordered (while relational data is unordered)
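
To make the semi-structured point concrete, here is a minimal sketch, assuming Python's `xml.etree.ElementTree`. The element names (`bib`, `book`, `title`, `author`, `price`) and all values are hypothetical sample data. It reads books that differ in shape: some with a price, some without, and with zero, one, or several authors.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: element names and values are assumptions.
DOC = """
<bib>
  <book><title>A</title><author>X</author><author>Y</author><price>10</price></book>
  <book><title>B</title><author>Z</author></book>
  <book><title>C</title></book>
</bib>
"""


def summarize(book):
    """Flatten one <book> into a dict, tolerating missing and repeated children."""
    price = book.findtext("price")  # None when the book has no <price>
    return {
        "title": book.findtext("title"),
        "authors": [a.text for a in book.findall("author")],  # kept in document order
        "price": float(price) if price is not None else None,
    }


books = [summarize(b) for b in ET.fromstring(DOC).findall("book")]
for b in books:
    print(b)
```

A single relational table would need a NULL-able price column and a separate authors table to hold the same data; the XML simply omits or repeats elements.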

XML Data Model
[Figure: tree for the example document, with nodes imdb, show, title, review, suntimes, reviewer, rating, nyt and values “Fugitive, The”, “Roger Ebert”, “two...”, “1993”, …]
Check for more details: http://

DTDs
<!DOCTYPE paper [ ]>
<!DOCTYPE paper [ ]>
…
Notice that a DTD is not in XML syntax…
Semi-structured

XML Schemas
A more recent proposal (with XML syntax)
unifies previous schema proposals
generalizes DTDs
uses XML syntax
two documents: structure and datatypes

XML Schema

RDF: Meta-data Standard for Web
birds, butterflies, snakes
John Smith
Good ol’ semantic networks..?

XQuery Resources
XQuery 1.0: An XML Query Language
– W3C Working Draft, 20 December 2001
XML Query Use Cases
– W3C Working Draft, 20 December 2001
Microsoft .NET XQuery Language Demo
– …hive.com/xquery/index.html
– Supports querying on the documents described in the W3C Use Cases
XQuery Tutorial by Fankhauser & Wadler
– …user/wadler/papers/xquery-tutorial/xquery-tutorial.pdf

10/24
-- Exam 1 returned (both versions)
-- Project 2 due on Wednesday
-- Homework 3 started (will be closed shortly)
-- Approximate schedule of topics put up
Today:
XQuery discussion
Semantic Web standards

Exam 1 Stats
In-class: Avg: 44; Max: 62; Min: 32; Stdev: 12.7
Grads: 49/62/33/9.8
UG: 34/53/16/12.6
At-home: Avg: 53; Max: 63; Min: 32.5; Stdev: 8.18
Grads: 56.8/63/49/4.75
UG: 48.4/59/32.5/9.69
All happy families are happy alike; each unhappy family is unhappy in its own way
All correct answers are correct alike; each incorrect answer is incorrect in its own way

Querying XML
Requirements:
– Need to handle lack of schema. We may not know much about the data, so we need to navigate the XML.
– Need to support both “information retrieval” and “SQL-style” queries.
   Ordered vs. unordered XML
– “Human readable” like SQL?
Candidates
– Many… based on conflicting requirements
   XSL: makes IR folks happy
   XML-QL: makes DB folks happy
   XQuery: W3C’s attempt to make everybody (un)happy

You will be asked to play with it in homework 3, question 4

FLoWeR Expressions
XQuery queries are made up of FLWR expressions that work on “paths”
For binds variables to nodes
Let computes aggregates
Where applies a formula to find matching elements
Return constructs the output elements
Path expressions are of the form:
element//element/element[attrib=value]
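
The four clauses map closely onto an ordinary loop. The sketch below emulates one FLWR expression in Python with `xml.etree.ElementTree`; it is an illustration of the clause roles, not an XQuery engine. The input document, the output element names `results` and `result`, and the function name `flwr` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical input document, for illustration only.
DOC = """
<bib>
  <book><title>A</title><author>X</author><author>Y</author></book>
  <book><title>B</title><author>Z</author></book>
</bib>
"""


def flwr(root):
    """Emulate, roughly:
         for $b in //book
         let $n := count($b/author)
         where $n > 1
         return <result count="{$n}">{ $b/title }</result>
    """
    results = ET.Element("results")
    for b in root.iter("book"):                # For: bind $b to each book node
        n = len(b.findall("author"))           # Let: compute an aggregate
        if n > 1:                              # Where: keep only matching bindings
            out = ET.SubElement(results, "result", count=str(n))  # Return: construct
            out.append(b.find("title"))
    return results


print(ET.tostring(flwr(ET.fromstring(DOC)), encoding="unicode"))
```

`ElementTree` also accepts a limited path syntax of the same flavor as the form shown above, for example `root.findall(".//book[@year='1994']")`, though it covers only a small subset of XPath.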

Comparison to SQL
Look at the use case descriptions in the XQuery manual
Supports all (?) SQL-style queries (with different syntax, of course) [default queries in the demo]
Has support for
– “construction”: outputting the answers in arbitrary XML formats (use case “XMP”)
– “path expressions”: navigating the XML tree (use case “seq”)
– Simple text queries (use case “text”)
– Queries on “Tag” elements
   Removes the “data/meta-data” barrier in queries
– For each book that has at least one author, list the title and first two authors, and an empty "et-al" element if the book has additional authors. [XMP use case 6]

DTD for

Example Query
Query:
{ for $b in /bib/book
  where $b/publisher = "Addison-Wesley" and > 1991
  return { $b/title } }
“For all books after 1991, return with Year changed from a tag to an attribute”
Result:
TCP/IP Illustrated
Advanced Programming in the Unix environment
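
The same selection can be sketched in Python with `xml.etree.ElementTree`, which may help when checking what the query is expected to return. This is an illustration under assumptions, not the query from the slide: the input below stores the year as a `<year>` child element (following the slide's description of year moving from a tag to an attribute), and the publication years, the two non-matching books, and the function name `recent_books` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the years and the <year> child element are assumptions.
BIB = """
<bib>
  <book><title>TCP/IP Illustrated</title><publisher>Addison-Wesley</publisher><year>1994</year></book>
  <book><title>Advanced Programming in the Unix environment</title><publisher>Addison-Wesley</publisher><year>1992</year></book>
  <book><title>Data on the Web</title><publisher>Morgan Kaufmann</publisher><year>1999</year></book>
  <book><title>An Older Book</title><publisher>Addison-Wesley</publisher><year>1990</year></book>
</bib>
"""


def recent_books(root, publisher="Addison-Wesley", after=1991):
    """Select books by publisher and year; emit the year as an attribute."""
    out = ET.Element("bib")
    for b in root.findall("book"):                       # for $b in /bib/book
        year = int(b.findtext("year"))
        if b.findtext("publisher") == publisher and year > after:   # where ...
            book = ET.SubElement(out, "book", year=str(year))        # return <book year=...>
            ET.SubElement(book, "title").text = b.findtext("title")
    return out


result = recent_books(ET.fromstring(BIB))
print(ET.tostring(result, encoding="unicode"))
```

The "construction" step is the `ET.SubElement` calls: the output shape is chosen by the query, independent of the input shape.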

Example Query (2)
Return the books that cost more at amazon than fatbrain
let $amazon := document(
let $fatbrain := document(
for $am in $amazon/books/book, $fat in $fatbrain/books/book
where $am/isbn = $fat/isbn and $am/price > $fat/price
return { $am/title, $am/price, $fat/price }
Join
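
The join in this query can be sketched in Python as follows, assuming `xml.etree.ElementTree`. The two catalogs, their ISBNs, titles and prices are made up for illustration, and the function name `pricier_at_first` is hypothetical. Two further assumptions: prices are compared as numbers, and the second catalog is indexed by ISBN (a hash join) rather than scanned in the nested loop that the two `for` variables suggest.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalogs: ISBNs, titles and prices are made up for illustration.
AMAZON = """<books>
  <book><isbn>1</isbn><title>A</title><price>30</price></book>
  <book><isbn>2</isbn><title>B</title><price>20</price></book>
  <book><isbn>3</isbn><title>C</title><price>15</price></book>
</books>"""
FATBRAIN = """<books>
  <book><isbn>1</isbn><title>A</title><price>25</price></book>
  <book><isbn>2</isbn><title>B</title><price>22</price></book>
</books>"""


def pricier_at_first(first, second):
    """Return (title, first_price, second_price) for books costing more in `first`."""
    # Index the second catalog by ISBN so each lookup is constant time.
    second_price = {
        b.findtext("isbn"): float(b.findtext("price")) for b in second.findall("book")
    }
    rows = []
    for b in first.findall("book"):
        isbn = b.findtext("isbn")
        price = float(b.findtext("price"))
        # Join condition (same isbn) plus the price comparison from the where clause.
        if isbn in second_price and price > second_price[isbn]:
            rows.append((b.findtext("title"), price, second_price[isbn]))
    return rows


rows = pricier_at_first(ET.fromstring(AMAZON), ET.fromstring(FATBRAIN))
print(rows)
```

Book C appears in only one catalog and is dropped, as in an inner join.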

XML frenzy in the DB Community
Now that XML is there, what can we do with it?
– Convert all databases from relational to XML?
   Or provide XML views of relational databases?
– Develop a theory of native XML databases?
   Or assume that XML data will be stored in relational databases...
– Issues:
   What sort of storage mechanisms?
   What sort of indices?

XML middleware for Databases
XML adapters (middleware) received significant attention in the DB community
– SilkRoute (AT&T)
– Xperanto (IBM)
Issues:
– Need to convert relational data into XML
   Tagging (easy)
– Need to convert XQuery queries into equivalent SQL queries
   Trickier, as XQuery supports schema querying
On the internet, nobody needs to know that you are a dog
RDBMS
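
The "tagging" step is the easy half, and a few lines are enough to show why. The sketch below uses Python's `sqlite3` and `xml.etree.ElementTree`; it is not how SilkRoute or Xperanto work, only an illustration of wrapping rows in tags. The table, its columns, the sample rows, and the function name `table_to_xml` are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def table_to_xml(conn, table, row_tag):
    """Tag every row of `table` as <row_tag>, with one child element per column."""
    # Sketch only: `table` is interpolated into SQL, so it must be a trusted name.
    cur = conn.execute(f"SELECT * FROM {table} ORDER BY rowid")
    columns = [d[0] for d in cur.description]
    root = ET.Element(table)
    for row in cur:
        elem = ET.SubElement(root, row_tag)
        for col, value in zip(columns, row):
            if value is not None:          # a NULL becomes a missing element
                ET.SubElement(elem, col).text = str(value)
    return root


# Hypothetical table and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT, year INTEGER, price REAL)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?)",
    [("A", 1995, 10.0), ("B", 1999, None)],
)
xml_root = table_to_xml(conn, "books", "book")
print(ET.tostring(xml_root, encoding="unicode"))
```

The hard half, translating an XQuery query over this XML view back into SQL, is not attempted here.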