10/14/2001 Coping with Semantics in XML Document Management Thomas Kudrass Leipzig University of Applied Sciences Department of Computer Science and Mathematics.



Overview
• Introduction
  – Motivation
  – XML: A Semantic Perspective
  – XML Document Types
• XML Semantic Problems
  – XML: A Database Perspective
  – Common Mapping Problems
• RM-ODP Viewpoints on XML Documents
  – Content View vs. Logical Layout View
  – Example
• Realization of XML Document Management: Nesting of Viewpoints
• Conclusions

Motivation
• Aim: XML Document Management using Database Systems
• Problem: Map XML Documents to Databases
  – different approaches
  – no mapping rules
  – many open issues
• Reason: Semantics of XML not well understood
  – XML: only syntax, no predefined semantics

XML: A Semantic Perspective
• User-Defined Markup
  – structures the character data of a document
  – explains the document through the use of names
• Naming
  – RM-ODP: "A name is a term that refers to an entity in a given naming context."
  – XML namespaces are no solution
  – possible improvement: shared ontologies
• No Standard Behavior of Tags
  – XSL processors: flexible presentation of the XML document
  – XML processor: checks well-formedness and validity of the XML document
  – open issue: document object semantics

XML Document Types
• Data-Centric Documents
  – designed for machine consumption (XML for data transport)
  – examples: sales orders, stock quotes, flight schedules
  – fairly regular structure
  – fine-grained data
• Document-Centric Documents
  – designed for human readers
  – examples: books, journal articles, e-mails
  – less regular structure
  – coarse-grained data
• Hybrid Documents
  – composition of documents of different types
  – example: medical documents = patient data + findings + prescriptions + procedures
• The document type determines the requirements on the document management system

XML: A Database Perspective
• Round-Trip Problem
  – store an XML document in a database and retrieve the "same" document back again
  – vital to applications required by law to keep exact copies of documents
  – less important for data-centric documents, which focus on the document content and can ignore the order of sibling elements
  – many XML-to-DB algorithms don't preserve the whole document; typically lost are:
      CDATA sections
      character entities
      comments
      processing instructions
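
The round-trip problem can be reproduced without any database. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree as a stand-in for a mapping layer (the slides name no tool); the sample document and the function name round_trip are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document containing the constructs the slide lists.
SOURCE = (
    '<?xml version="1.0"?>'
    '<order id="42">'
    '<!-- rush delivery -->'
    '<?print page-break?>'
    '<note><![CDATA[5 < 7]]></note>'
    '</order>'
)

def round_trip(xml_text: str) -> str:
    """Parse a document into a tree and serialize it again."""
    root = ET.fromstring(xml_text)  # default parser drops comments and PIs
    return ET.tostring(root, encoding="unicode")

result = round_trip(SOURCE)
print(result)
```

The content survives (the note still reads "5 < 7" after re-parsing), but the comment, the processing instruction, and the CDATA section are gone, so the retrieved document is not the "same" document in the strict sense.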

Common Mapping Problems (1)
• Attributes vs. Element Text
  – where should the data of a document be stored?
  – both alternatives are possible; the choice is influenced by the implementation
• Meaning of Attributes
  – ambiguities when interpreting attributes
  – example: the order of a customer has an attribute expiry date = "11/2001", which can have different meanings:
      "The order will expire in Nov. 2001"
      "The information about the order can be thrown away in Nov. 2001"
      "The expiry date is information about the credit card used for the purchase"
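
To illustrate the attribute vs. element text choice, here is a small sketch, assuming Python's xml.etree.ElementTree; the element and attribute names (order, id, expiryDate) and the helper to_record are hypothetical. It shows that both encodings carry the same data and can be normalized to the same flat record.

```python
import xml.etree.ElementTree as ET

# The same (hypothetical) order, once with attributes, once with child elements.
AS_ATTRIBUTES = '<order id="42" expiryDate="11/2001"/>'
AS_ELEMENTS = '<order><id>42</id><expiryDate>11/2001</expiryDate></order>'

def to_record(xml_text: str) -> dict:
    """Flatten attributes and simple child elements into one dictionary."""
    root = ET.fromstring(xml_text)
    record = dict(root.attrib)
    for child in root:
        record[child.tag] = (child.text or "").strip()
    return record

print(to_record(AS_ATTRIBUTES))
print(to_record(AS_ELEMENTS))
```

Neither encoding says what expiryDate means; that ambiguity remains whichever alternative is chosen.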

Common Mapping Problems (2)
• Null Values
  – different semantics of null values
  – database null values have to be reflected in XML documents
  – XML Schema: null values in an element's text can be expressed; there is no concept of null for attributes
  – DTD: optional elements and attributes
• Comments, Processing Instructions
  – not considered content of the document in many algorithms
• Markup
  – visible in the logical document layout (e.g., character entities)
  – substituted in the physical representation of the document
  – example: &lt;foo/&gt; stored in a database; non-XML-aware databases don't recognize markup
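
The different null semantics can be made concrete with a short sketch. It assumes the xsi:nil attribute of XML Schema as the way to mark a null element; the customer document, the MISSING sentinel, and the helper read_field are hypothetical.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical customer: phone is explicitly null, fax is empty, email is absent.
DOC = (
    f'<customer xmlns:xsi="{XSI}">'
    '<name>Smith</name>'
    '<phone xsi:nil="true"/>'
    '<fax></fax>'
    '</customer>'
)

MISSING = object()  # sentinel for "element not present at all"

def read_field(root, tag):
    """Return MISSING, None (explicit null), or the element text."""
    elem = root.find(tag)
    if elem is None:
        return MISSING
    if elem.get(f"{{{XSI}}}nil") in ("true", "1"):
        return None
    return elem.text or ""

root = ET.fromstring(DOC)
for tag in ("name", "phone", "fax", "email"):
    print(tag, repr(read_field(root, tag)))
```

A mapping to a database has to decide which of the three cases (absent, nil, empty) becomes a database NULL, and how to restore the distinction on retrieval.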

Common Mapping Problems (3)
• Links
  – links were originally designed for documents and document fragments (e.g., XPointers point to document subtrees using XPath)
  – not adequate to express semantic relationships among document elements (e.g., ID: identifier value, comparable to a primary key; IDREF: comparable to a foreign key). Behavioral semantics?
  – another language is more appropriate to specify the invariants
• Sibling Order
  – particularly important for document-centric documents
  – can be arbitrary in data-centric documents
• Other Invariants (e.g., identity constraints)
  – specified on the level of instances, not of the schema
  – construct the set of all concerned objects (using XPath) before
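
The ID/IDREF analogy to primary and foreign keys can be sketched as an instance-level check. This is only an illustration: xml.etree.ElementTree does not read DTD attribute types, so the attribute names id and customer, the sample document, and the function dangling_references are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: order o2 refers to a customer that does not exist.
DOC = """
<shop>
  <customer id="c1"/>
  <customer id="c2"/>
  <order id="o1" customer="c1"/>
  <order id="o2" customer="c9"/>
</shop>
"""

def dangling_references(xml_text, id_attr="id", ref_attr="customer"):
    """Return every reference value that matches no identifier in the document."""
    root = ET.fromstring(xml_text)
    # First construct the set of all identifiers, then check each reference.
    ids = {e.get(id_attr) for e in root.iter() if e.get(id_attr) is not None}
    return [
        e.get(ref_attr)
        for e in root.iter()
        if e.get(ref_attr) is not None and e.get(ref_attr) not in ids
    ]

print(dangling_references(DOC))
```

The check detects a broken reference, but it says nothing about behavior, for example what should happen to the orders when a customer is deleted.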

RM-ODP Viewpoints on Documents
• Physical Presentation View
  – dependent on media, screen size / paper size
  – document = composition of characters with attributes (font, size, style)
  – XML character entities replaced
• Logical Layout View
  – composition of prose components (paragraphs, sections, lists, list items) and other objects (e.g., frames, code sections)
  – mostly ordered composition in document-centric documents
  – many possible physical presentation views
• Content View
  – composition of information objects (title, author, abstract, body, bibliography)
  – can be organized in a hierarchical structure or can be flat
  – mapped to several logical layouts

Content View vs. Logical Layout
• Content View
  – document-centric documents
      information viewpoint in DTD or XML Schema
      some constructs to specify structural constraints (e.g., cardinality constraints in XML Schema)
  – data-centric documents
      structure not very relevant
      many invariants among content elements cannot be adequately expressed in DTD / XML Schema
      possible abuse of XLink / XPointers to specify relationships among content elements
• Logical Layout
  – document-centric documents: may follow the structure of the content
  – data-centric documents: often arbitrary

Data-Centric Documents: Content View
• Example:
  [Figure: entity-relationship diagram; recoverable labels: Customer, Order Header, Line Item, Product, CD; cardinalities (1,1) and (1,N)]
• Integrity Constraints:
  – The overall value of an order must exceed a certain minimum.
  – A customer can submit at most 5 orders.
  – If a customer is deleted, all of his orders have to be cancelled.
• How to map to an XML document? Or: how to map to the logical layout view?
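
The first two integrity constraints can be checked on a document instance, as in the following sketch. The XML layout, the attribute names (customer, price, qty), the thresholds, and the function check_constraints are assumptions made for illustration; the third constraint (cancelling orders when a customer is deleted) is behavioral and is not covered.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical order document; the first order has a total value of only 8.00.
ORDERS = (
    '<orders>'
    '<order id="O1" customer="C1"><item price="4.00" qty="2"/></order>'
    '<order id="O2" customer="C1"><item price="15.00" qty="1"/></order>'
    '</orders>'
)

def check_constraints(xml_text, min_value=10.0, max_orders=5):
    """Return a list of human-readable constraint violations."""
    root = ET.fromstring(xml_text)
    violations = []
    per_customer = Counter()
    for order in root.iter("order"):
        per_customer[order.get("customer")] += 1
        total = sum(
            float(item.get("price")) * int(item.get("qty"))
            for item in order.iter("item")
        )
        if total <= min_value:  # the value must exceed the minimum
            violations.append(
                f"order {order.get('id')}: value {total:.2f} "
                f"does not exceed {min_value:.2f}"
            )
    for customer, count in per_customer.items():
        if count > max_orders:
            violations.append(
                f"customer {customer}: {count} orders, at most {max_orders} allowed"
            )
    return violations

violations = check_constraints(ORDERS)
print(violations)
```

Both checks aggregate over sets of elements, which is the kind of invariant that DTD and XML Schema cannot adequately express.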

Data-Centric Documents: Logical Layout View
[Figure: three alternative nestings of customers (C) and orders (O) in an XML document]
• Alternative 1: C1 ... O O, C2 ... O O (orders nested inside each customer)
• Alternative 2: O1 ... C, O2 ... C, O3 ... C, O4 ... C (customer nested inside each order)
• Alternative 3: ... O1 ... C, O2 ... C, O3 ... C
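
That different logical layouts can carry the same content can be sketched as follows, assuming that Alternative 1 nests orders inside customers and Alternative 2 nests the customer inside each order. The element names, identifiers, and function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Layout assumed for Alternative 1: orders nested inside their customer.
NESTED_BY_CUSTOMER = (
    '<customers>'
    '<customer id="C1"><order id="O1"/><order id="O2"/></customer>'
    '<customer id="C2"><order id="O3"/></customer>'
    '</customers>'
)

# Layout assumed for Alternative 2: the customer nested inside each order.
NESTED_BY_ORDER = (
    '<orders>'
    '<order id="O1"><customer id="C1"/></order>'
    '<order id="O2"><customer id="C1"/></order>'
    '<order id="O3"><customer id="C2"/></order>'
    '</orders>'
)

def pairs_from_customers(xml_text):
    """Extract the (customer, order) relationship from the customer-first layout."""
    root = ET.fromstring(xml_text)
    return {
        (c.get("id"), o.get("id"))
        for c in root.iter("customer")
        for o in c.iter("order")
    }

def pairs_from_orders(xml_text):
    """Extract the same relationship from the order-first layout."""
    root = ET.fromstring(xml_text)
    return {
        (o.find("customer").get("id"), o.get("id"))
        for o in root.iter("order")
    }

print(pairs_from_customers(NESTED_BY_CUSTOMER) == pairs_from_orders(NESTED_BY_ORDER))
```

In the content view both documents describe one set of customer/order pairs; only the logical layout differs, which is why the layout of data-centric documents is often arbitrary.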

Operations
• Operations are viewpoint-specific
• XML APIs: DOM / XPath
  – based on a tree model
  – although powerful, not appropriate for set-oriented operations
• Viewpoints vs. Operations
  – content view: set-oriented operations
  – logical layout view: navigating operations (on a tree)
• Another language is needed to express operations in the content view of data-centric documents!
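
The contrast between navigating and set-oriented operations can be sketched with the same aggregate computed twice. SQL over an in-memory SQLite table is used here only as one example of a set-oriented language; the slides do not name one. The document, table layout, and function names are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

DOC = (
    '<orders>'
    '<order id="O1" customer="C1" value="30"/>'
    '<order id="O2" customer="C1" value="20"/>'
    '<order id="O3" customer="C2" value="5"/>'
    '</orders>'
)

def totals_by_navigation(xml_text):
    """Logical layout view: walk the tree and accumulate by hand."""
    totals = {}
    for order in ET.fromstring(xml_text).iter("order"):
        customer = order.get("customer")
        totals[customer] = totals.get(customer, 0.0) + float(order.get("value"))
    return totals

def totals_by_sql(xml_text):
    """Content view: load the orders into a table and state the result as a set query."""
    rows = [
        (o.get("id"), o.get("customer"), float(o.get("value")))
        for o in ET.fromstring(xml_text).iter("order")
    ]
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, customer TEXT, value REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    result = dict(conn.execute("SELECT customer, SUM(value) FROM orders GROUP BY customer"))
    conn.close()
    return result

print(totals_by_navigation(DOC))
print(totals_by_sql(DOC))
```

The navigating version encodes how to traverse the tree, while the query only states which set is wanted; this is the gap the slide points to for data-centric documents.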

Realization: Nesting of Viewpoints
[Figure: an XML document seen through three nested views; each view is annotated with the RM-ODP viewpoints Enterprise, Information, Computational, Engineering, Technology]
• Content View
  – operations: "Store", "Retrieve"
  – Semantic Model
  – RDBMS, native XML-DB
• Logical Layout View
  – operations: "Store", "Retrieve"
  – XML Schema, DTD
  – File (Template), Large Object
• Presentation View
  – operations: "Browse", "Store"
  – SVG, PDF
  – Media: Screen, Paper

Conclusions
• Analyze the requirements first before building an XML system
  – data-centric vs. document-centric documents
  – huge impact on the choice of technology (storage platform)
• Think in viewpoints to understand the semantics
  – mixed occurrence of content view and logical layout in XML documents
  – expand viewpoints into the specification of a new system
• Use generic relationships for constraint modelling
• Beware of the difference between specification and realization