Published by Friederike Fleischer. Modified over 6 years ago.
1
Introduction to Semantic Metadata & Semantic Web
Structured Web Documents in XML
2
Semantic Metadata & Semantic Web
Lecture Outline
HTML vs. XML
Detailed Description of XML
Structuring: DTD, XML Schema
XML Namespaces
Navigating XML documents: XPath
3
HyperText Markup Language (HTML) vs. eXtensible Markup Language (XML)
4
HTML
5
The Same Example in XML
XML is for structuring and publishing data for machines
6
HTML versus XML: Similarities
Both use tags (e.g. <h2> and <year>)
Tags may be nested (tags within tags)
Human users can read and interpret both HTML and XML representations quite easily
… But how about machines?
7
Problems with Automated Interpretation of HTML Documents
Consider an intelligent agent trying to retrieve the names of the authors of the book:
  Authors’ names could appear immediately after the title
  or immediately after the word "by"
  Are there three authors?
8
HTML vs XML: Structural Information (1)
HTML documents do not contain structural information about content: pieces of the document and their relationships.
XML is more easily accessible to machines because
  every piece of information is described
  relations are also defined through the nesting structure
  e.g., the <author> tags appear within the <book> tags, so they describe properties of the particular book
9
HTML vs XML: Structural Information (2)
A machine processing the XML document would be able to deduce that the author element refers to the enclosing book element
XML allows the definition of constraints on values
  e.g. a year must be a number of four digits
10
HTML vs XML: Formatting
The HTML representation provides more than the XML representation:
  the formatting of the document is also described
But this is also a weakness of HTML
XML: separation of content from display
  the same information can be displayed in different ways (using XSLT style sheets)
11
HTML vs XML: Another Example
In HTML
  <h2>Relationship force-mass</h2>
  <i> F = M A </i>
In XML
  <equation>
    <description>Relationship force-mass</description>
    <leftside> F </leftside>
    <rightside> M A </rightside>
  </equation>
12
HTML vs XML: Different Use of Tags
Both HTML documents use the same tags
  HTML tags define display: color, lists, …
XML is a meta markup language: a language for defining markup languages
  XML tags are user-definable
13
Lecture Outline
HTML vs. XML
Detailed Description of XML
Structuring: DTD, XML Schema
Namespaces
Navigating XML documents: XPath
14
XML Elements
The “things” the XML document talks about
  e.g. books, authors, publishers
An element consists of:
  an opening tag
  the content
  a closing tag
  <lecturer>David Billington</lecturer>
15
XML Elements (2)
Tag names can be chosen almost freely.
The first character must be a letter, an underscore, or a colon
No name may begin with the string “xml” in any combination of cases
  e.g. “Xml”, “xML”
16
Content of XML Elements
Content may be text, or other elements, or nothing
  <lecturer>
    <name>David Billington</name>
    <phone> +61 − 7 − </phone>
  </lecturer>
If there is no content, then the element is called empty; it is abbreviated as follows:
  <lecturer/> for <lecturer></lecturer>
17
XML Attributes
An empty element is not necessarily meaningless
  it may have some properties in terms of attributes
An attribute is a name-value pair inside the opening tag of an element
  <lecturer name="David Billington" phone="+61 − 7 − "/>
18
XML Attributes: An Example
<order orderNo="23456" customer="John Smith" date="October 15, 2002">
  <item itemNo="a528" quantity="1"/>
  <item itemNo="c817" quantity="3"/>
</order>
19
The Same Example without Attributes
<order>
  <orderNo>23456</orderNo>
  <customer>John Smith</customer>
  <date>October 15, 2002</date>
  <item>
    <itemNo>a528</itemNo>
    <quantity>1</quantity>
  </item>
  <item>
    <itemNo>c817</itemNo>
    <quantity>3</quantity>
  </item>
</order>
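The two encodings of the order carry the same information. As a minimal sketch (Python's standard xml.etree.ElementTree; the function and constant names are illustrative and not from the slides, and each item is assumed to be wrapped in its own <item> element), both forms can be read into the same list of items:

```python
import xml.etree.ElementTree as ET

# Order data stored in attributes.
ATTRIBUTE_STYLE = """
<order orderNo="23456" customer="John Smith" date="October 15, 2002">
  <item itemNo="a528" quantity="1"/>
  <item itemNo="c817" quantity="3"/>
</order>
"""

# The same order data stored in child elements.
ELEMENT_STYLE = """
<order>
  <orderNo>23456</orderNo>
  <customer>John Smith</customer>
  <date>October 15, 2002</date>
  <item><itemNo>a528</itemNo><quantity>1</quantity></item>
  <item><itemNo>c817</itemNo><quantity>3</quantity></item>
</order>
"""


def items_from_attributes(xml_text):
    """Return (itemNo, quantity) pairs stored as attributes of <item>."""
    order = ET.fromstring(xml_text)
    return [(item.get("itemNo"), int(item.get("quantity")))
            for item in order.findall("item")]


def items_from_elements(xml_text):
    """Return (itemNo, quantity) pairs stored as child elements of <item>."""
    order = ET.fromstring(xml_text)
    return [(item.findtext("itemNo"), int(item.findtext("quantity")))
            for item in order.findall("item")]


attr_items = items_from_attributes(ATTRIBUTE_STYLE)
elem_items = items_from_elements(ELEMENT_STYLE)
print(attr_items == elem_items)
```

Only the access pattern differs: attributes are read with get(), child elements with findtext().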
20
XML Elements vs Attributes
Attributes can be replaced by elements
When to use elements and when attributes is a matter of design taste
But attributes cannot be nested
21
Well-Formed XML Documents
Syntactically correct documents
Some syntactic rules:
  Only one outermost element (called the root element)
  Each element contains an opening and a corresponding closing tag
  Tags may not overlap; the following is not well-formed:
    <author><name>Lee Hong</author></name>
  Attributes within an element have unique names
  Element and tag names must be permissible
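Any XML parser enforces these rules. The sketch below (Python's standard xml.etree.ElementTree; the helper name is_well_formed and the test strings other than the overlapping example are illustrative assumptions) treats a document as well-formed exactly when the parser accepts it:

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


overlapping = "<author><name>Lee Hong</author></name>"   # tags overlap
nested = "<author><name>Lee Hong</name></author>"        # properly nested
two_roots = "<author/><author/>"                         # more than one root
duplicate_attribute = '<author name="a" name="b"/>'      # attribute repeated

for label, text in [("overlapping", overlapping),
                    ("nested", nested),
                    ("two roots", two_roots),
                    ("duplicate attribute", duplicate_attribute)]:
    print(label, is_well_formed(text))
```

This checks well-formedness only; it says nothing about whether a document respects a DTD or schema.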
22
The Tree Model of XML Documents: An Example
<email>
  <head>
    <from name="Michael Maher" address="…"/>
    <to name="Grigoris Antoniou" address="…"/>
    <subject>Where is your draft?</subject>
  </head>
  <body>
    Grigoris, where is the draft of the paper you promised me last week?
  </body>
</email>
23
The Tree Model of XML Documents: An Example (2)
24
Lecture Outline
Introduction
Detailed Description of XML
Structuring
  DTDs
  XML Schema
Namespaces
Navigating XML documents: XPath
Transformations: XSLT
25
Structuring XML Documents
Define the structure
Define all the element and attribute names that may be used
  what values an attribute may take
  which elements may or must occur within other elements, etc.
If such structuring information exists, the document can be validated
26
Structuring XML Documents (2)
An XML document is valid if
  it is well-formed
  it respects the structuring information it uses
There are two ways of defining the structure of XML documents:
  DTDs (the older and more restricted way)
  XML Schema (offers extended possibilities)
27
Lecture Outline
Introduction
Detailed Description of XML
Structuring
  DTDs
  XML Schema
Namespaces
Navigating XML documents: XPath
28
XML Document Prolog
The prolog (declaration header) consists of
  an XML declaration
  a reference to external schema documents
A DTD can also be put in the XML document itself
29
DTD: Element Type Definition
  <lecturer>
    <name>David Billington</name>
    <phone> +61−7− </phone>
  </lecturer>
DTD for the above element (and all lecturer elements):
  <!ELEMENT lecturer (name, phone)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT phone (#PCDATA)>
30
The Meaning of the DTD
The element types lecturer, name, and phone may be used in the document
A lecturer element contains a name element and a phone element, in that order (sequence)
A name element and a phone element may have any content
In DTDs, #PCDATA is the only atomic type for elements
31
DTD: Disjunction in Element Type Definitions
We express that a lecturer element contains either a name element or a phone element as follows:
  <!ELEMENT lecturer (name|phone)>
A lecturer element that contains a name element and a phone element in any order:
  <!ELEMENT lecturer ((name,phone)|(phone,name))>
32
Example of an XML Element
<order orderNo="23456" customer="John Smith" date="October 15, 2002">
  <item itemNo="a528" quantity="1"/>
  <item itemNo="c817" quantity="3"/>
</order>
33
The Corresponding DTD
<!ELEMENT order (item+)>
<!ATTLIST order
  orderNo  ID    #REQUIRED
  customer CDATA #REQUIRED
  date     CDATA #REQUIRED>
<!ELEMENT item EMPTY>
<!ATTLIST item
  itemNo   ID    #REQUIRED
  quantity CDATA #REQUIRED
  comments CDATA #IMPLIED>
34
Comments on the DTD
The item element type is defined to be empty
+ (after item) is a cardinality operator:
  ?: appears zero times or once
  *: appears zero or more times
  +: appears one or more times
  No cardinality operator means exactly once
35
Comments on the DTD (2)
In addition to defining elements, we define attributes
This is done in an attribute list containing:
  the name of the element type to which the list applies
  a list of triplets of attribute name, attribute type, and value type
Attribute name: a name that may be used in an XML document using the DTD
36
DTD: Attribute Types
Similar to predefined data types, but with a limited selection
The most important types are
  CDATA, a string (sequence of characters)
  ID, a name that is unique across the entire XML document
  IDREF, a reference to another element with an ID attribute carrying the same value as the IDREF attribute
  IDREFS, a series of IDREFs
Limitations: no dates, number ranges etc.
37
Referencing with IDREF and IDREFS
<!ELEMENT family (person*)>
<!ELEMENT person (name)>
<!ELEMENT name (#PCDATA)>
<!ATTLIST person
  id       ID     #REQUIRED
  mother   IDREF  #IMPLIED
  father   IDREF  #IMPLIED
  children IDREFS #IMPLIED>
#REQUIRED: the attribute must appear in every occurrence of the element type in the XML document
#IMPLIED: the appearance of the attribute is optional
38
An XML Document Respecting the DTD
<family>
  <person id="kalsoom" mother="khalida" father="yonus">
    <name>Kalsoom Yonus</name>
  </person>
  <person id="ali" mother="khalida" father="yonus">
    <name>Muhammad Ali</name>
  </person>
  <person id="khalida" children="ali kalsoom">
    <name>Khalida Yonus</name>
  </person>
  <person id="yonus" children="ali kalsoom">
    <name>Muhammad Yonus</name>
  </person>
</family>
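The ID/IDREF constraints can be checked by hand. This is a sketch only: Python's standard xml.etree.ElementTree does not validate against a DTD, so the function dangling_references below (a hypothetical name) re-implements just two rules for this one document type: id values are unique, and every mother, father, and children value refers to an existing id.

```python
import xml.etree.ElementTree as ET

FAMILY = """
<family>
  <person id="kalsoom" mother="khalida" father="yonus"><name>Kalsoom Yonus</name></person>
  <person id="ali" mother="khalida" father="yonus"><name>Muhammad Ali</name></person>
  <person id="khalida" children="ali kalsoom"><name>Khalida Yonus</name></person>
  <person id="yonus" children="ali kalsoom"><name>Muhammad Yonus</name></person>
</family>
"""


def dangling_references(xml_text):
    """Return (person id, missing reference) pairs; empty if all references resolve."""
    root = ET.fromstring(xml_text)
    persons = root.findall("person")
    ids = [p.get("id") for p in persons]
    if len(ids) != len(set(ids)):
        raise ValueError("ID values must be unique within the document")
    known = set(ids)
    dangling = []
    for p in persons:
        refs = [p.get("mother"), p.get("father")]    # IDREF attributes
        refs += (p.get("children") or "").split()    # IDREFS: space-separated list
        dangling += [(p.get("id"), r) for r in refs if r and r not in known]
    return dangling


print(dangling_references(FAMILY))
```

A validating parser would perform these checks for any DTD, not just this one.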
39
XML Entities
An XML entity can play the role of a placeholder for
  repeatable characters
  a section of external data
We can use the entity reference &thisyear; instead of the value "2007"
  <!ENTITY thisyear "2007">
40
A DTD for an Email Element
<!ELEMENT email (head,body)>
<!ELEMENT head (from,to+,cc*,subject)>
<!ELEMENT from EMPTY>
<!ATTLIST from
  name    CDATA #IMPLIED
  address CDATA #REQUIRED>
<!ELEMENT to EMPTY>
<!ATTLIST to
  name    CDATA #IMPLIED
  address CDATA #REQUIRED>
41
A DTD for an Email Element (2)
<!ELEMENT cc EMPTY>
<!ATTLIST cc
  name    CDATA #IMPLIED
  address CDATA #REQUIRED>
<!ELEMENT subject (#PCDATA)>
<!ELEMENT body (text,attachment*)>
<!ELEMENT text (#PCDATA)>
<!ELEMENT attachment EMPTY>
<!ATTLIST attachment file CDATA #REQUIRED>
42
Interesting Parts of the DTD
A head element contains:
  a from element
  at least one to element
  zero or more cc elements
  a subject element
In from, to, and cc elements
  the name attribute is not required
  the address attribute is always required
A body element contains
  a text element
  possibly followed by a number of attachment elements
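The content model (from,to+,cc*,subject) reads much like a regular expression over child element names. The sketch below makes that analogy concrete; it is a hand translation for this one model, not a DTD validator, and the names HEAD_MODEL and head_matches_model as well as the placeholder address values are assumptions for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hand-written regular expression mirroring (from,to+,cc*,subject).
# Each child tag name is followed by a space so names cannot run together.
HEAD_MODEL = re.compile(r"from (to )+(cc )*subject ")


def head_matches_model(head_xml):
    """Check the order and cardinality of the children of a <head> element."""
    head = ET.fromstring(head_xml)
    child_names = "".join(child.tag + " " for child in head)
    return HEAD_MODEL.fullmatch(child_names) is not None


# Placeholder address values; they only need to be present for the example.
ok = ('<head><from address="a"/><to address="b"/><to address="c"/>'
      '<subject>Where is your draft?</subject></head>')
no_to = '<head><from address="a"/><subject>Hi</subject></head>'
wrong_order = ('<head><to address="b"/><from address="a"/>'
               '<subject>Hi</subject></head>')

for label, text in [("ok", ok), ("no to", no_to), ("wrong order", wrong_order)]:
    print(label, head_matches_model(text))
```

The DTD operators map directly: a comma becomes concatenation, + and * keep their meaning, and a bare name means exactly once.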
43
Lecture Outline
Introduction
Detailed Description of XML
Structuring
  DTDs
  XML Schema
Namespaces
Navigating XML documents: XPath
44
XML Schema
Richer language for structuring XML documents
Its syntax is based on XML itself
Reuse and refinement of schemas
  expand or delete already existing schemas
Sophisticated set of data types, compared to DTDs (which only support strings)
45
XML Schema (2)
An XML schema is an element with an opening tag like
  <?xml version="1.0"?>
  <xs:schema xmlns:xs="…">
    ...
  </xs:schema>
A schema consists of element and attribute types
46
Element Types
  <element name="head" type="headType"/>
  <element name="to" type="nameAddress" minOccurs="1"/>
Cardinality constraints:
  minOccurs="x" (default value 1)
  maxOccurs="x" (default value 1)
Generalizations of *, ?, + offered by DTDs
47
Attribute Types
  <attribute name="id" type="ID" use="required"/>
  <attribute name="speaks" type="Language" use="default" value="en"/>
Existence: use="x", where x may be optional or required
Default value: use="x" value="...", where x may be default or fixed
48
Data Types
Built-in data types
  Numerical data types: integer, short etc.
  String types: string, ID, IDREF, CDATA etc.
  Date and time data types: time, month etc.
User-defined data types
  simple data types, which cannot use elements or attributes
  complex data types, which can use elements and attributes
49
Data Types (2)
Complex data types are defined from existing data types by defining some attributes (if any) and using indicators:
Order indicators
  sequence, a sequence of existing data type elements (order is important)
  all, a collection of elements that must appear (order is not important)
  choice, a collection of elements, of which one will be chosen
Occurrence indicators
  maxOccurs
  minOccurs
50
A Data Type Example
<complexType name="lecturerType">
  <sequence>
    <element name="firstname" type="string" minOccurs="0" maxOccurs="unbounded"/>
    <element name="lastname" type="string"/>
  </sequence>
  <attribute name="title" type="string" use="optional"/>
</complexType>
51
Mixed Content Example
<letter>
  Dear Mr.<name>John Smith</name>.
  Your order <orderid>1032</orderid>
  will be shipped on <shipdate> </shipdate>.
</letter>

<xs:element name="letter">
  <xs:complexType mixed="true">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="orderid" type="xs:positiveInteger"/>
      <xs:element name="shipdate" type="xs:date"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
52
Simple Data Types
<simpleType name="dayOfMonth">
  <restriction base="integer">
    <minInclusive value="1"/>
    <maxInclusive value="31"/>
  </restriction>
</simpleType>
53
Data Type: Enumeration
<simpleType name="dayOfWeek">
  <restriction base="string">
    <enumeration value="Mon"/>
    <enumeration value="Tue"/>
    <enumeration value="Wed"/>
    <enumeration value="Thu"/>
    <enumeration value="Fri"/>
    <enumeration value="Sat"/>
    <enumeration value="Sun"/>
  </restriction>
</simpleType>
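To make the effect of such restrictions concrete, the two simple types dayOfMonth and dayOfWeek can be mimicked as plain predicates. This is an approximation written for illustration (the function names and the integer pattern are assumptions); a schema processor performs these checks automatically during validation.

```python
import re

# Values allowed by the dayOfWeek enumeration.
DAYS_OF_WEEK = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")

# Optional sign followed by ASCII digits: an approximation of the integer type.
INTEGER_PATTERN = re.compile(r"[+-]?[0-9]+")


def is_day_of_month(text):
    """Mimic restriction base="integer" with minInclusive 1 and maxInclusive 31."""
    text = text.strip()
    if INTEGER_PATTERN.fullmatch(text) is None:
        return False
    return 1 <= int(text) <= 31


def is_day_of_week(text):
    """Mimic restriction base="string" with an enumeration of allowed values."""
    return text in DAYS_OF_WEEK


print(is_day_of_month("31"), is_day_of_month("32"))
print(is_day_of_week("Tue"), is_day_of_week("Tuesday"))
```

A DTD cannot express either constraint, since its attribute and element types offer no numeric ranges.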
54
XML Schema: The Email Example
<element name="email" type="emailType"/>
<complexType name="emailType">
  <sequence>
    <element name="head" type="headType"/>
    <element name="body" type="bodyType"/>
  </sequence>
</complexType>
55
XML Schema: The Email Example (2)
<complexType name="headType">
  <sequence>
    <element name="from" type="nameAddress"/>
    <element name="to" type="nameAddress" minOccurs="1" maxOccurs="unbounded"/>
    <element name="cc" type="nameAddress" minOccurs="0" maxOccurs="unbounded"/>
    <element name="subject" type="string"/>
  </sequence>
</complexType>
56
XML Schema: The Email Example (3)
<complexType name="nameAddress">
  <attribute name="name" type="string" use="optional"/>
  <attribute name="address" type="string" use="required"/>
</complexType>
Similar for bodyType
57
Lecture Outline
HTML vs. XML
Detailed Description of XML
Structuring: DTD, XML Schema
Namespaces
Navigating XML documents: XPath
58
Namespaces
Namespaces make it possible to uniquely identify XML vocabularies by using a uniform resource identifier (URI)
Different independent groups can define the same objects differently in their schemas/vocabularies.
This may lead to name clashes in XML documents that use several such schemas
A solution to this heterogeneity problem is namespaces
In XML documents, qualified names for elements and attributes are used
59
An Example
<instructors xmlns="…" xmlns:gu="…" xmlns:uky="…">
  <uky:faculty uky:title="assistant professor"
               uky:name="John Smith"
               uky:department="Computer Science"/>
  <gu:academicStaff gu:title="lecturer"
                    gu:name="Mate Jones"
                    gu:school="Information Technology"/>
</instructors>
60
Namespace Declarations
This way, an XML document may use more than one DTD or schema, each having a different prefix
Namespaces are declared within an element and can be used in that element and any of its children (elements and attributes)
A namespace declaration has the form:
  xmlns:prefix="location"
  location is the address of the DTD or schema
If a prefix is not specified:
  xmlns="location"
  then the location is used by default
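A parser resolves each prefix to its declared URI, so two vocabularies can use the same local names without clashing. The sketch below uses Python's standard xml.etree.ElementTree; the http://example.org/... URIs are hypothetical stand-ins, since the original namespace values are not preserved in the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, chosen only for this illustration.
DOC = """
<instructors xmlns="http://example.org/default"
             xmlns:gu="http://example.org/gu"
             xmlns:uky="http://example.org/uky">
  <uky:faculty uky:title="assistant professor" uky:name="John Smith"/>
  <gu:academicStaff gu:title="lecturer" gu:name="Mate Jones"/>
</instructors>
"""

root = ET.fromstring(DOC)

# ElementTree expands every qualified name to the form {namespace-uri}local-name.
root_tag = root.tag
child_tags = [child.tag for child in root]

# When searching, prefixes are mapped to URIs through a dictionary.
ns = {"gu": "http://example.org/gu", "uky": "http://example.org/uky"}
staff = root.find("gu:academicStaff", ns)
staff_title = staff.get("{http://example.org/gu}title")

print(root_tag)
print(child_tags)
print(staff_title)
```

The prefix itself carries no meaning; only the URI it is bound to identifies the vocabulary.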
61
XML Vocabularies/Applications
Web applications must agree on common vocabularies to communicate and collaborate
Communities and business sectors are defining their specialized vocabularies
  XHTML
  Dublin Core (DC)
  mathematics (MathML)
  bioinformatics (BSML)
  …
62
Lecture Outline
Introduction
Detailed Description of XML
Structuring: XML Schema
Namespaces
Navigating XML documents: XPath, XQuery
63
Addressing and Querying XML Documents
In relational databases, parts of a database can be selected and retrieved using SQL
The same is necessary for XML documents
  Query languages: XQuery, XQL, XML-QL
The central concept of XML query languages is a path expression
  It specifies how a node, or a set of nodes, in the tree representation of the XML document can be reached
64
XPath
XPath is the core of XML query languages
A language for addressing parts of an XML document
  It operates on the tree data model of XML
  It has a non-XML syntax
65
Tree Structure of an XML Document
The root node
Element nodes
Text nodes
Attribute nodes
Comment nodes
…
66
An XML Example
<library location="Bremen">
  <author name="Henry Wise">
    <book title="Artificial Intelligence"/>
    <book title="Modern Web Services"/>
    <book title="Theory of Computation"/>
  </author>
  <author name="William Smart">
    <book title="Artificial Intelligence" price="30"/>
  </author>
  <author name="Cynthia Singleton">
    <book title="The Semantic Web" price="40.99"/>
    <book title="Browser Technology Revised"/>
  </author>
</library>
67
Tree Representation
68
Examples of Path Expressions in XPath
Address all author elements
  /library/author
Addresses all author elements that are children of the library element node
Absolute path
69
Examples of Path Expressions in XPath (2)
Address all author elements
  //author
This path expression addresses all author elements anywhere in the document
70
Examples of Path Expressions in XPath (3)
Address the location attribute nodes within library element nodes
The @ symbol is used to denote attribute nodes
71
Examples of Path Expressions in XPath (4)
Address all books with title “Artificial Intelligence”
  … Intelligence"]
Test within square brackets: a filter expression
  It restricts the set of addressed nodes.
Query 4 addresses book elements, the title of which satisfies a certain condition.
72
Examples of Filter Expressions (Predicates)
Address the first author element node in the XML document
  //author[1]
Address the title of the last book element within the first author element node in the document
Address the title of all book elements having a price greater than 30
Address the title of all book elements having no price
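These queries can be tried out on the library document with Python's standard xml.etree.ElementTree. Note the assumptions: ElementTree implements only a subset of XPath (paths are written relative to the root element, position and attribute-equality predicates work, numeric comparisons and attribute steps do not), so the price conditions are filtered in Python, and attribute values are read with get(). The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

LIBRARY = """
<library location="Bremen">
  <author name="Henry Wise">
    <book title="Artificial Intelligence"/>
    <book title="Modern Web Services"/>
    <book title="Theory of Computation"/>
  </author>
  <author name="William Smart">
    <book title="Artificial Intelligence" price="30"/>
  </author>
  <author name="Cynthia Singleton">
    <book title="The Semantic Web" price="40.99"/>
    <book title="Browser Technology Revised"/>
  </author>
</library>
"""

root = ET.fromstring(LIBRARY)  # root is the <library> element

# The location attribute of the library element.
location = root.get("location")

# All author elements anywhere below the root (compare //author).
author_names = [a.get("name") for a in root.findall(".//author")]

# Filter on an attribute value: books titled "Artificial Intelligence".
ai_books = root.findall(".//book[@title='Artificial Intelligence']")

# Position predicates: first author, and the last book of the first author.
first_author = root.find(".//author[1]").get("name")
last_title_of_first = root.find("./author[1]/book[last()]").get("title")

# Conditions ElementTree cannot express in the path are filtered in Python.
expensive = [b.get("title") for b in root.iter("book")
             if b.get("price") is not None and float(b.get("price")) > 30]
unpriced = [b.get("title") for b in root.iter("book") if b.get("price") is None]

print(location, author_names, first_author, last_title_of_first)
print(len(ai_books), expensive, unpriced)
```

A full XPath engine would evaluate the price comparisons inside the predicate itself.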
73
A Few More Queries
Select the titles of books having “Modern” in the title
Selecting several paths: gives the names of all authors and the titles of their books
  |
Returns the attribute values of title and price of all book elements
  |
Path expression with wildcard: selects all the elements in the document
  //*
74
XQuery
Do it yourself
75
Review
XML is a meta-language that allows users to define markup.
Nesting of tags introduces structure.
The structure of documents can be enforced using XML schemas or DTDs.
XML separates content and structure from formatting.
XML supports the exchange of structured information across different applications through markup, structure, and transformations.
XML is supported by query languages.
76
Literature
For XML, XML Schema, XPath, XQuery
  Book: XML Databases and the Semantic Web, ch 8
  “EditiX - XML basics_xPath_xQuery.pdf”