Presentation is loading. Please wait.

Presentation is loading. Please wait.

II. XML Data Management A : XML refresher using material from A. Silverschatz and M. Sapossnek B: - XML-Data Management (1) Query languages: XPATH, XQuery,

Similar presentations


Presentation on theme: "II. XML Data Management A : XML refresher using material from A. Silverschatz and M. Sapossnek B: - XML-Data Management (1) Query languages: XPATH, XQuery,"— Presentation transcript:

1 II. XML Data Management A : XML refresher using material from A. Silverschatz and M. Sapossnek B: - XML-Data Management (1) Query languages: XPATH, XQuery, SQLX C:- Mapping XML data to databases - Native XML Data management

2 HS / DBSII-03-XML-1 2 What is XML?  Acronym for eXtensible Markup Language  Syntax for structuring data and documents in human-readable form  THE "Syntax of the WEB"  Meta language for defining languages  Bases of many extensions  Namespaces  Stylesheets  Hyperlinks  Schemata  Standardized by W3C http://www.w3.org/TR/REC-xml http://www.w3.org/TR/REC-xml

3 HS / DBSII-03-XML-1 3 What XML is Not..  No protocol  Language for describing data  Used as data format in protocols  Protocols may be syntactically defined by XML  No programming language but  XML documents may contain code fragments  New languages allow for XML – code as part of the language (Xen, a MS extension of C# )  Some XML extensions with superimposed PL semantics, rule semantics in XSLT  No magic semantics  Interpretation by humans, applications, standards derived from XML

4 HS / DBSII-03-XML-1 4 Why XML?  … not a question any more, since widely adopted  Simple  Extensible  Easy to process  Easy to generate  Data interchange critical for networked applications "XML will be the ASCII of the Web: basic, essential, unexciting" Tim Bray... it is already

5 HS / DBSII-03-XML-1 5 XML example  Pre-XML representation of data:  XML representation of the same data: “PO-1234”,”CUST001”,”X9876”,”5”,”14.98” Elements PO-1234 CUST001 2 14.53 Attribute Prologue

6 HS / DBSII-03-XML-1 6 XML example  Graphical representation PO_NUM PURCHASE_ORDER Cust:_ID QUNTY PO-1234 CUST001 XML documents - tree structured - Data an metadata in the same document (as opposed to RDBS) 2 { ItemNum=X9876 } ITEM PRICE 14.53

7 HS / DBSII-03-XML-1 7 XML Usage  Two basic types of XML usage Document centric (document oriented)  structuring a digital document, including logical layout  primary focus of SGML -predecessor of XML  Data centric  Description of data in a self describing form for later processing  Distinction not totally clear  See purchase order example: If typical document characteristic included (company addr., customer addr, date, …, company logo) it would be a document oriented usage of XML

8 HS / DBSII-03-XML-1 8 Document centric XML documents: example Variabler Maulschlüssel Full Fabrication Labs, Inc. Großer, verstellbarer Schraubenschlüssel Der Engländer besteht aus erstklassigem Stahl und besitzt einen gummierten Handgriff. Die Maulgröße liegt zwischen 0 und 32 mm. Sie können..... Bestellen Andere Werkzeuge ansehen Den Katalog herunterladen Der Schraubenschlüssel kostet 15.33 Euro inkl. MWSt. Wenn Sie jetzt bestellen, erhalten Sie zusätzlich unsere wertlose Hobbybastler-Fibel. Typical:Long text elements

9 HS / DBSII-03-XML-1 9 Data centric XML documents: example ABC Industries 123 Main St. Chicago.... Turkey wrench: Stainless steel, one-piece construction, lifetime guarantee. 9.95 10.......

10 HS / DBSII-03-XML-1 10 XML Syntax  One, and only one, root element  Sub-elements must be properly nested  A tag must end within the tag in which it was started  Attributes are optional  Attribute values must be enclosed in “” or ‘’  No data type but 'string'  Processing instructions optional  XML is case-sensitive  and are not the same type of element

11 HS / DBSII-03-XML-1 11 Why hierarchical "data model"?  Hierachies (nesting) in data bases? Why not?  REDUNDANCY! Multiple items, customers, … occur multiple times in different orders Normalization replaces redundancies by foreign keys OO / OR – Data bases??  Nesting useful in data transfer  External application does not have access to foreign key / to database.

12 HS / DBSII-03-XML-1 12 XML Attributes vs Elements  Distinction between subelement and attribute  In the context of documents: attributes are part of markup subelement contents part of the basic document contents  In the context of data representation: difference not clear, but confusing Same information can be represented in two ways  ….  A-101 …  Suggestion: use attributes for identifiers of elements use subelements for contents

13 HS / DBSII-03-XML-1 13 How to use XML data?  Basic Idea Application with XML-Generator XML-Parser XML-Parser Receiving application DBMS DBMS DOMSAX Standard-Interfaces How does application know about - syntactical correctness - data semantics ?

14 HS / DBSII-03-XML-1 14 Correct or not correct ? Different encodings Different encodings specified by encoding attribute specified by encoding attribute

15 HS / DBSII-03-XML-1 15 Correctness of XML documents  Syntactic correctness  Conformance to XML syntax  Document structured according to XML syntax is well- formed  Compare Syntax checker for program  Semantic correctness  Given Meta level description of XML documents: Document Type Definition (DTD) or XML Schema  Document is valid with respect to DTD (Schema) if all definitions and restrictions have been fulfilled  No DTD allowed, applications must know, what is meant  What is semantics??  Interpretation of tags is a matter of humans and/or the application program: could mean "book title" or "first name" or…

16 HS / DBSII-03-XML-1 16 XML Namespaces  Part of XML’s extensibility  Allow autonomous users to differentiate between tags of the same name (using a prefix)  Frees author to focus on the data and decide how to best describe it  Allows multiple XML documents from multiple authors to be merged Namespace declarationPrefix URI (URL) xmlns: bk = “http://www.example.com/bookinfo/”

17 HS / DBSII-03-XML-1 17 Namespace  Examples  No prefix: all elements belong to same namespace All About XML Joe Developer 19.99 All About XML Joe Developer

18 HS / DBSII-03-XML-1 18 DTD and XML schema  Type of XML document defined as  DTD - not expressible in XML syntax  XML schema  Document Type Definition (DTD)  Does not constrain types: all values are strings in XML  Syntax

19 HS / DBSII-03-XML-1 19 DTD: elements and attributes  Example (element decl)  Subelements  names of elements  #PCDATA (parsed character data), i.e., character strings  EMPTY (no subelements) or ANY (anything can be a subelement)  Subelement specification may have regular expressions Notation:  “ | ” : alternatives  “ + ” : 1 or more occurrences "?" 0 or one  “ * ” : 0 or more occurrences

20 HS / DBSII-03-XML-1 20 DTD example <!DOCTYPE bank [ ]>

21 HS / DBSII-03-XML-1 21 DTD attributes  Attribute specification : for each attribute  Name  Type of attribute CDATA ID (identifier) or IDREF (ID reference) or IDREFS  more on this later  Whether mandatory (#REQUIRED) has a default value (value), or neither (#IMPLIED)  Examples   <!ATTLIST customer customer-id ID # REQUIRED accounts IDREFS # REQUIRED >

22 HS / DBSII-03-XML-1 22 DTD attribute ID  At most one attribute of type ID per element  ID attribute value of each element in an XML document must be distinct  ID attribute value is object identifier  attribute of type IDREF must contain the ID value of an element in the same document  attribute of type IDREFS contains a set of (0 or more) ID values. ID value must contain the ID value of an element in the same document  ID, IDREF, IDREFS do not designate a particular domain (no type!)

23 HS / DBSII-03-XML-1 23 DTD declaration External DTD-declaration... Internal DTD-declaration ]> consumer rights protagonist Mixed usage ]>...

24 HS / DBSII-03-XML-1 24 DTD limits  No typing of text elements and attributes  All values are strings, no integers, reals, etc.  Difficult to specify unordered sets of subelements  Order is usually irrelevant in databases  (A | B)* allows specification of an unordered set, but Cannot ensure that each of A and B occurs only once How to express: a, b and c in arbitrary order?  IDs and IDREFs are untyped  The owners attribute of an account may contain a reference to another account, which is meaningless owners attribute should ideally be constrained to refer to customer elements

25 HS / DBSII-03-XML-1 25 XML Schema  XML Schema (XSD): much more expressible Schema language compared to DTD schemas  Typing of values E.g. integer, string, etc constraints on min/max values  User defined types  specified in XML syntax, unlike DTDs More standard representation, but verbose  namespace support  Many more features List types, uniqueness and foreign key constraints, inheritance Ability to map to RDB, …  significantly more complicated than DTD syntax  Use of XSD recommended

26 http://www.w3.org/2001/XMLSchema ….. definitions of customer and depositor …. XSD example (from Silverschatz)

27 HS / DBSII-03-XML-1 27 Using XML  Data exchange   Data management:  Store, retrieve, query large document sets efficiently Today's solutions:  Mapping to RDB / ORDB / OODB  "Native" XML data management (not necessarily very different from storing in conventional DB)  Standardized data description: different extensions and applications  Bioinformatic Sequence Markup Language (BSML)  MathML  Scalable Vector Graphics (SVG).. And many, many more  Ressource Description in the web (RDF) …

28 HS / DBSII-03-XML-1 28 Using XML: RDF with XML syntax Creator www.me.de/~fritz Homepage Fritz Müller RDF-Modell <RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:s="http://description.org/schema/"> xmlns:s="http://description.org/schema/"> Fritz Müller Fritz Müller Fritz Müller Fritz Müller about=fritz@web.de </RDF> Encoded in XML: Many of these triples form a graph fritz@web.defritz@web.de emailOf

29 HS / DBSII-03-XML-1 29 Using XML  Layout of documents?  XML documents have logical structure  Layout structure needed for output Use transformation language to describe device specific transformations XML-Doc. (Layout- transf.) Standard- Software (XSL-Processor) XML-Doc. (device spec. Layout) Standard Software (HTML-Browser) XML-Doc. (Daten) Transformation into all kinds of languages (HTML, pdf, …) on all kinds of devices

30 HS / DBSII-03-XML-1 30 XML transformation  XSLT: The language used for converting XML documents into other forms  Describes how the document is transformed  Expressed as an XML document (.xsl)  Template rules  Patterns match nodes in source document  Templates instantiated to form part of result document  XPath for querying, sorting, etc.  XSL-FO language for describing layout XSL = XSLT + XPATH + XSL-FO

31 HS / DBSII-03-XML-1 31 XML transformation: example (1)  Document Scootney Publishing Regional Sales Report Sales Report West Coast...

32 HS / DBSII-03-XML-1 32 XML transformation: example (2)  XSL style sheet - mapping to HTML... Region\Quarter Q... color:red; color:green;... <xsl:value-of select="format-number(sum(quarter/@books_sold), '###,###')"/> XPath expression XPath: query language on doc trees

33 HS / DBSII-03-XML-1 33 XML transformation: example (2)  The result


Download ppt "II. XML Data Management A : XML refresher using material from A. Silverschatz and M. Sapossnek B: - XML-Data Management (1) Query languages: XPATH, XQuery,"

Similar presentations


Ads by Google