The Utility of XML. Martin Doerr, Foundation for Research and Technology - Hellas, Institute of Computer Science (ICS-FORTH). Heraklion, May 25, 2001.


1

2 ICS-FORTH, May 25, 2001. The Utility of XML. Martin Doerr, Foundation for Research and Technology - Hellas, Institute of Computer Science, Center for Cultural Informatics. Heraklion, May 25, 2001.

3 XML is
XML is a compromise between databases and free text. It takes the best from both sides without being perfect on either. It is readable. It allows meaning to be disambiguated. It is simple. It is rich enough to open a new systems paradigm.

4 What is a Document?
- A composite statement: a unit relating known facts, items and categories to new knowledge, expressed linguistically or through other media.
- It has an inner logic: the pure rendered knowledge, independent of language and form.
- It has a meaningful structure: the sequence, arrangement or linking used to render the inner logic.
- It has a presentation: structure and style to assist perception and impression.

5 A document

6 The statements...
Diego Velazquez is Spanish.
Diego Velazquez lived 1599-1660.
Diego Velazquez painted "Juan de Pareja".
"Juan de Pareja" is a painting.
"Juan de Pareja" has dimensions 81.3 x 69.9 cm.
Juan de Pareja is Moorish.
Juan de Pareja is a painter.
Philip IV sent Velazquez to Italy.
...

7 Another document

8 What's Wrong with HTML
MONET, Claude
Haystacks at Chailly at Sunrise
1865
Oil on canvas
30 x 60 cm (11 7/8 x 23 3/4 in.)
San Diego Museum of Art
- If written properly, normal HTML may reflect document presentation, but it cannot adequately represent the semantics and structure of the data.
(Annotations on the example: Artist Name, Date, Artifact Title, Dimensions, Material, Museum, Image Reference)

9 User Problems / Design Reasons
- Preserving info units: who said that / self-contained
- Entering data:
  - what can I say,
  - what should I say,
  - how can I say it.
- Rendering data: how to tell my child, the public...
- Accessing data: querying, mediation
- Reusing data: transmission to other environments, merging, evolution of the local system, preservation for future use.

10 In Technical Terms
- Transformation that preserves meaning
- Correct adaptation of presentation without knowing the meaning
- Packaging information for presentation: "1 document"
- Sequencing categories for data input
- Interpretation of intended meaning: searching
- Automatic relating of common meaning: merging of different statements

11 What's wrong with
- Free texts: clear packaging, but rendered for one target only; not machine processable (poor querying, categories uncomprehensive); poorly reusable; no help in entering or transforming data.
- HTML: solves platform independence of presentation, but the connection between meaning and presentation structure is weak; not much better than free text.
- Databases: clear logical structure, categorization, machine processable, excellent querying; but presentation, transformation, merging and evolution are difficult, and there are no information units.
- XML: clear packaging and logical structure; machine processable if correctly used; clear separation and relation of meaningful structure and presentation. Helpful for entering data; easy to extend, transform and present. Can be queried, but the structure is not independent of the user view.

12 XML and databases
- Databases:
  - Schema first: prior to data, a complete, inflexible analysis of all categories and their relations.
  - Table structures: indexes prepared, excellent consistency enforcement.
- XML:
  - Data first: structure is explanatory, can come second, need not be formalized, is extensible; DTDs can be combined.
  - Semi-structured: flexible, but with a reduced guarantee that a question can be answered, and reduced consistency enforcement.
  - Embedded schema: each instance carries the schema it uses; querying works by parsing, without index structures; an ideal transport format.
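The contrast between "schema first" and "data first" can be sketched roughly as follows. This is only an illustration using Python's standard `sqlite3` and `xml.etree.ElementTree` modules; the table, column and element names (`painting`, `artist`, `title`, `material`) are assumptions made for the example and do not come from the slides.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Schema first: the table, with all its columns, must exist before any data.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE painting (artist TEXT NOT NULL, title TEXT NOT NULL)")
db.execute("INSERT INTO painting VALUES (?, ?)", ("Diego Velazquez", "Juan de Pareja"))

try:
    # A category that the schema did not foresee is rejected outright.
    db.execute(
        "INSERT INTO painting (artist, title, material) VALUES (?, ?, ?)",
        ("Claude Monet", "Haystacks at Chailly at Sunrise", "Oil on canvas"),
    )
    schema_rejected = False
except sqlite3.OperationalError:
    schema_rejected = True

# Data first: the second record simply carries an extra element (hypothetical tags).
docs = ET.fromstring("""
<paintings>
  <painting><artist>Diego Velazquez</artist><title>Juan de Pareja</title></painting>
  <painting><artist>Claude Monet</artist><title>Haystacks at Chailly at Sunrise</title>
            <material>Oil on canvas</material></painting>
</paintings>
""")

# Querying by parsing: no index, and no guarantee that every record can answer.
materials = [p.findtext("material") for p in docs.findall("painting")]
print(schema_rejected, materials)
```

The database refuses the unforeseen column, while the XML document accepts it; the price is that a query for `material` gets no answer from the first record.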

13 Data First, Embedded Schema
- This document carries its interpretation with it. It is readable without knowledge of the schema.
Example values: Claude Monet; Haystacks at Chailly at Sunrise; 1865; Oil on canvas; 30; 60; 11 7/8; 23 3/4; San Diego Museum of Art
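Only the text values of this slide's example survive in the transcript; the tags are not shown. A minimal sketch of what such a self-describing record might look like, and of how it can be read without any external schema, is given below. All element and attribute names (`artwork`, `artist`, `dimensions`, `unit`, and so on), as well as the reading of the number pairs as height and width, are assumptions for illustration, not the markup of the original slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: tag and attribute names are illustrative assumptions.
RECORD = """
<artwork>
  <artist>
    <first_name>Claude</first_name>
    <last_name>Monet</last_name>
  </artist>
  <title>Haystacks at Chailly at Sunrise</title>
  <date>1865</date>
  <material>Oil on canvas</material>
  <dimensions unit="cm"><height>30</height><width>60</width></dimensions>
  <dimensions unit="in"><height>11 7/8</height><width>23 3/4</width></dimensions>
  <museum>San Diego Museum of Art</museum>
</artwork>
"""

def describe(element, path=""):
    """Yield (path, text) for every leaf element, using only tags found in the data."""
    here = f"{path}/{element.tag}"
    children = list(element)
    if not children:
        yield here, (element.text or "").strip()
    for child in children:
        yield from describe(child, here)

root = ET.fromstring(RECORD)
fields = list(describe(root))
for path, value in fields:
    print(path, "=", value)
```

The function knows nothing about artworks: every label it prints comes from the instance itself, which is the sense in which the schema is "embedded".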

14 What's important
- Data first: delayed analysis, preserves data.
- Embedded schema: facilitates data transport, readable in the future.
- Separation of semantics and presentation: enables information reuse.
- Guides and controls data entry.
- The same meaning can be encoded in multiple formats:
  - DTD design depends on purpose: transport, presentation, data entry...
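The point about separating semantics from presentation can be illustrated with a small sketch in which one record feeds two different renderings. The element names and the two output formats are assumptions chosen for the example; a real system might use XSLT or a template engine instead of hand-written functions.

```python
import xml.etree.ElementTree as ET
from html import escape

# One semantic record (hypothetical element names).
record = ET.fromstring(
    "<artwork><artist>Claude Monet</artist>"
    "<title>Haystacks at Chailly at Sunrise</title>"
    "<date>1865</date><museum>San Diego Museum of Art</museum></artwork>"
)

def as_html(art):
    """Web presentation: styling is chosen from the meaning of each field."""
    return (
        f"<p><b>{escape(art.findtext('artist'))}</b><br>"
        f"<i>{escape(art.findtext('title'))}</i>, {escape(art.findtext('date'))}<br>"
        f"{escape(art.findtext('museum'))}</p>"
    )

def as_caption(art):
    """One-line plain-text caption built from the same record."""
    return f"{art.findtext('artist')}: {art.findtext('title')} ({art.findtext('date')})"

print(as_html(record))
print(as_caption(record))
```

Going the other way, from the HTML output back to the fields, would require guessing which bold or italic run means what.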

15 Useful Applications
- Prescription for documentation / input
- Data transfer between systems ("middleware")
- Document bases with full query access
- Combining a database with XML documents: mission-critical data in tables and DTD, rich extensible structures in DTD only.
- Creating data for long-term use: machine readable even from paper!
- Creating information sets for multiple presentations

16 Final Remark
- How to encode meaning without structural ambiguities? => Use RDF / RDFS.
- How to standardize the meaning of element types (tags)? => Use ontologies, e.g. formulated in RDFS!
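One way to picture the structural ambiguity mentioned here: the same statement can be nested in XML in more than one way, whereas a subject-predicate-object form has a single shape. The sketch below uses plain Python tuples, not actual RDF syntax or an RDF library, and the two XML encodings and the predicate name `painted` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Two hypothetical XML encodings of one statement from the "statements" slide.
BY_ARTIST = "<painter name='Diego Velazquez'><painted>Juan de Pareja</painted></painter>"
BY_WORK = "<painting title='Juan de Pareja'><painted_by>Diego Velazquez</painted_by></painting>"

def triples_by_artist(xml_text):
    """Read the artist-centred structure into (subject, predicate, object) tuples."""
    root = ET.fromstring(xml_text)
    return {(root.get("name"), "painted", work.text) for work in root.findall("painted")}

def triples_by_work(xml_text):
    """Read the work-centred structure into the same tuple form."""
    root = ET.fromstring(xml_text)
    return {(artist.text, "painted", root.get("title")) for artist in root.findall("painted_by")}

a = triples_by_artist(BY_ARTIST)
b = triples_by_work(BY_WORK)
print(a == b, a)
```

Each XML structure needs its own reader, but both reduce to the same statement, which is what makes merging data from different sources tractable.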

