Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. XML Design (A Gentle Transition from XML to RDF) Roger L. Costello David B. Jacobs.

Similar presentations


Presentation on theme: "1 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. XML Design (A Gentle Transition from XML to RDF) Roger L. Costello David B. Jacobs."— Presentation transcript:

1 1 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. XML Design (A Gentle Transition from XML to RDF) Roger L. Costello David B. Jacobs The MITRE Corporation (The creation of this tutorial was sponsored by DARPA)

2 2 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Acknowledgments We are very grateful to the Defense Agency Research Projects Agency (DARPA) for funding the creation of this tutorial. We are especially grateful to Murray Burke (DARPA) and John Flynn (BBN) for making it all happen. Special thanks to Frank Manola for answering our many questions.

3 3 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. What is the Purpose of RDF? The purpose of RDF (Resource Description Framework) is to give a standard way of specifying data "about" something. Here's an example of an XML document that specifies data about China's Yangtze river : <River id="Yangtze" xmlns="http://www.geodesy.org/river"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea "Here is data about the Yangtze River. It has a length of 6300 kilometers. Its startingLocation is western China's Qinghai-Tibet Plateau. Its endingLocation is the East China Sea."

4 4 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. XML --> RDF <River id="Yangtze" xmlns="http://www.geodesy.org/river"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea XML Modify the following XML document so that it is also a valid RDF document: <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea RDF Yangtze.xml Yangtze.rdf "convert to"

5 5 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. The RDF Format <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea RDF provides an ID attribute for identifying the resource being described. The ID attribute is in the RDF namespace. Add the "fragment identifier symbol" to the namespace. 1 2 3

6 6 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. The RDF Format (cont.) <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea Identifies the type (class) of the resource being described. Identifies the resource being described. This resource is an instance of River. These are properties, or attributes, of the type (class). Values of the properties 1 2 3 4

7 7 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Namespace Convention xmlns="http://www.geodesy.org/river#" Question: Why was "#" placed onto the end of the namespace? E.g., Answer: RDF is very concerned about uniquely identifying things - uniquely identifying the type (class) and uniquely identifying the properties. If we concatenate the namespace with the type then we get a unique identifier for the type, e.g., http://www.geodesy.org/river#River If we concatenate the namespace with a property then we get a unique identifier for the property, e.g., http://www.geodesy.org/river#length http://www.geodesy.org/river#startingLocation http://www.geodesy.org/river#endingLocation Thus, the "#" symbol is simply a mechanism for separating the namespace from the type name and the property name. Best Practice

8 8 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. The RDF Format <Class rdf:ID="Resource" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="uri"> value...

9 9 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Advantage of using the RDF Format You may ask: "Why should I bother designing my XML to be in the RDF format?" Answer: there are numerous benefits: –The RDF format, if widely used, will help to make XML more interoperable: Tools can instantly characterize the structure, "this element is a type (class), and here are its properties”. RDF promotes the use of standardized vocabularies... standardized types (classes) and standardized properties. –The RDF format gives you a structured approach to designing your XML documents. The RDF format is a regular, recurring pattern. –It enables you to quickly identify weaknesses and inconsistencies of non- RDF-compliant XML designs. It helps you to better understand your data! –You reap the benefits of both worlds: You can use standard XML editors and validators to create, edit, and validate your XML. You can use the RDF tools to apply inferencing to the data. –It positions your data for the Semantic Web! Network effect Interoperability

10 10 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Disadvantage of using the RDF Format Constrained: the RDF format constrains you on how you design your XML (i.e., you can't design your XML in any arbitrary fashion). RDF uses namespaces to uniquely identify types (classes), properties, and resources. Thus, you must have a solid understanding of namespaces. Another XML vocabulary to learn: to use the RDF format you must learn the RDF vocabulary.

11 11 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Uniquely Identify the Resource Earlier we said that RDF is very concerned about uniquely identifying the type (class) and the properties. RDF is also very concerned about uniquely identifying the resource, e.g., <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea This is the resource being described. We want to uniquely identify this resource.

12 12 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. rdf:ID The value of rdf:ID is a "relative URI". The "complete URI" is obtained by concatenating the URL of the XML document with "#" and then the value of rdf:ID, e.g., <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea Suppose that this RDF/XML document is located at this URL: http://www.china.org/geography/rivers. Thus, the complete URI for this resource is: Yangtze.rdf http://www.china.org/geography/rivers#Yangtze

13 13 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. xml:base On the previous slide we showed how the URL of the document provided the base URI. Depending on the location of the document is brittle: it will break if the document is moved, or is copied to another location. A more robust solution is to specify the base URI in the document, e.g., <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#" xml:base="http://www.china.org/geography/rivers"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea Resource URI = concatenation(xml:base, '#', rdf:ID) = concatenation(http://www.china.org/geography/rivers, '#', "Yangtze") = http://www.china.org/geography/rivers#Yangtze

14 14 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. rdf:about Instead of identifying a resource with a relative URI (which then requires a base URI to be prepended), we can give the complete identity of a resource. However, we use rdf:about, rather than rdf:ID, e.g., <River rdf:about="http://www.china.org/geography/rivers#Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea

15 15 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Triple -> resource/property/value http://www.china.org/geography/rivers#Yangtze has a http://www.geodesy.org/river#length of 6300 kilometers resource property value http://www.china.org/geography/rivers#Yangtze has a http://www.geodesy.org/river#startingLocation of western China's... resource property value http://www.china.org/geography/rivers#Yangtze has a http://www.geodesy.org/river#endingLocation of East China Sea resource property value

16 16 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. The RDF Format = triples! The fundamental design pattern of RDF is to structure your XML data as resource/property/value triples! The value of a property can be a literal (e.g., length has a value of 6300 kilometers). Also, the value of a property can be a resource, as shown above (e.g., property-A has a value of Resource-B, property-B has a value of Resource-C). We will see examples of properties having a resource value in a little bit. Value-C value of property-A value of property-B Notice that the RDF design pattern is an alternating sequence of resource-property. This pattern is known as "striping".

17 17 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Naming Convention The convention is to use a capital letter to start a type (class) name, and use a lowercase letter to start a property name. –This helps the eye quickly discern the striping pattern. <River rdf:about="http://www.china.org/geography/rivers#Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea uppercase lowercase

18 18 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. RDF Model (graph) Legend: Ellipse indicates "Resource" Rectangle indicates "literal string value"

19 19 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. rdf:Description + rdf:type There is still another way of representing the XML. This way makes it very clear that you are describing something, and it makes it very clear what the type (class) is of the thing you are describing: <rdf:Description rdf:about="http://www.china.org/geography/rivers#Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea This is read as: "This is a Description about the resource http://www.china.org/geography/rivers#Yangtze. This resource is an instance of the River type (class). The http://www.china.org/geography/rivers#Yangtze resource has a length of 6300 kilometers, a startingLocation of western China's Qinghai-Tibet Plateau, and an endingLocation of the East China Sea." Note: this form of describing a resource is called the "long form". The form we have seen previously is an abbreviation of this long form. An RDF Parser interprets the abbreviated form as if it were this long form.

20 20 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Alternative Alternatively we can use rdf:ID rather than rdf:about, as shown here: <rdf:Description rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#" xml:base="http://www.china.org/geography/rivers"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea

21 21 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Equivalent Representations! <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#" xml:base="http://www.china.org/geography/rivers"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea <River rdf:about="http://www.china.org/geography/rivers#Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea <rdf:Description rdf:about="http://www.china.org/geography/rivers#Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea Note: In the RDF literature the examples are typically shown in this form.

22 22 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. RDF Namespace http://www.w3.org/1999/02/22-rdf-syntax-ns# ID about type resource Description

23 23 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Terminology As you read the RDF literature you may see the following terminology: –Subject: this term refers to the item that is playing the role of the resource. –predicate: this term refers to the item that is playing the role of the property. –Object: this term refers to the item that is playing the role of the value. Subject Object predicate Resource Value property Equivalent!

24 24 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. RDF Parser There is a nice RDF parser at the W3 Web site: http://www.w3.org/RDF/Validator/ This RDF parser will tell you if your XML is in the proper RDF format. Do Lab1

25 25 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Example #2 <River id="Yangtze" xmlns="http://www.geodesy.org/river"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea <Dam id="ThreeGorges" xmlns="http://www.geodesy.org/dam"> The Three Gorges Dam 1.5 miles 610 feet $30 billion Yangtze2.xml Modify the following XML document so that it is RDF-compliant:

26 26 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Note the two types (classes) River Dam Instance: Yangtze Properties: length startingLocation endingLocation Instance: ThreeGorges Properties: name width height cost

27 27 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Dam - out of place <River id="Yangtze" xmlns="http://www.geodesy.org/river"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea <Dam id="ThreeGorges" xmlns="http://www.geodesy.org/dam"> The Three Gorges Dam 1.5 miles 610 feet $30 billion Dam Types (classes) contain properties. Here we see the River type containing the properties - length, startingLocation, and endingLocation. It also shows River containing a type - Dam. Thus, there is a Resource that contains another Resource. This is inconsistent with RDF design pattern. (We are seeing one of the benefits of using the RDF format - to identify inconsistencies in an XML design.)

28 28 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Property value must be a Literal or a Resource 6300 kilometers property Value is a Literal <Dam id="ThreeGorges" xmlns="http://www.geodesy.org/dam"> The Three Gorges Dam 1.5 miles 610 feet $30 billion property Value is a Resource

29 29 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Modified XML (to make it consistent) <River id="Yangtze" xmlns="http://www.geodesy.org/river"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea <Dam id="ThreeGorges" xmlns="http://www.geodesy.org/dam"> The Three Gorges Dam 1.5 miles 610 feet $30 billion Yangtze2,v2.xml "The Yangtze River has an obstacle that is the ThreeGorges Dam. The Dam has a name - The Three Gorges Dam. It has a width of 1.5 miles, a height of 610 feet, and a cost of $30 billion."

30 30 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. RDF Format <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#" xml:base="http://www.china.org/geography/rivers"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea <Dam rdf:ID="ThreeGorges" xmlns="http://www.geodesy.org/dam#"> The Three Gorges Dam 1.5 miles 610 feet $30 billion Changed id to rdf:ID Added the '#' symbol As always, the other representations using rdf:about and rdf:Description are available.

31 31 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. RDF Model (graph)

32 32 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. <Dam rdf:ID="ThreeGorges" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/dam#" xml:base="http://www.china.org/geography/rivers"> The Three Gorges Dam 1.5 miles 610 feet $30 billion <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#" xml:base="http://www.china.org/geography/rivers"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea Three-Gorges-Dam.rdf Alternatively, suppose that someone has already created a document containing information about the Three Gorges Dam: Yangtze.rdf Then we can simply reference the Three Gorges Dam resource using rdf:resource, as shown here:

33 33 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Note: reference is to a resource, not to a file Why was this the reference: and not this: That is, why wasn't the reference to a "file"? Answer: 1. What if the file moved? Then the reference would break. 2. By using an identifier of the Three Gorges Dam, and keeping a particular file unspecified, then an "aggregator tool" will be able to collect information from all the files that talk about the Three Gorges Dam resource (see next slide). Do Lab2

34 34 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Anyone, Anywhere, Anytime Can Talk About a Resource In all of our examples we have provided a unique identifier to resources, e.g., http://www.china.org/geography/rivers#Yangtze Consequently, if another RDF document identifies the same resource then the data that it specifies gives additional data about that resource. An aggregator tool will be able to collect all data about a resource and present a consolidated set of data for the resource. That's powerful!

35 35 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. rdf:ID versus rdf:about When should rdf:ID be used? When should rdf:about be used? –When you want to introduce a resource, and provide an initial set of information about a resource use rdf:ID –When you want to extend the information about a resource use rdf:about The RDF philosophy is akin to the Web philosophy. That is, anyone, anywhere, anytime can provide information about a resource.

36 36 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. <River rdf:about="http://www.china.org/geography/rivers#Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea http://www.china.org/geography/rivers/yangtze.rdf <River rdf:about="http://www.china.org/geography/rivers#Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> Dri Chu - Female Yak River Tongtian He, Travelling-Through-the-Heavens River Jinsha Jiang, River of Golden Sand http://www.encyclopedia.org/yangtze-alternate-names.rdf <River rdf:about="http://www.china.org/geography/rivers#Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea Dri Chu - Female Yak River Tongtian He, Travelling-Through-the-Heavens River Jinsha Jiang, River of Golden Sand Aggregated Data! Aggregator tool collects data about the Yangtze A distributed network of data!

37 37 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. <Dam rdf:ID="ThreeGorges" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/dam#" xml:base="http://www.china.org/geography/rivers"> The Three Gorges Dam 1.5 miles 610 feet $30 billion <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#" xml:base="http://www.china.org/geography/rivers"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea http://www.china.org/geography/rivers/yangtze.rdf http://www.encyclopedia.org/three-gorges-dam.rdf <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#" xml:base="http://www.china.org/geography/rivers"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea <Dam rdf:ID="ThreeGorges" xmlns="http://www.geodesy.org/dam#"> The Three Gorges Dam 1.5 miles 610 feet $30 billion Aggregate! Note that the reference to the ThreeGorges Dam resource has been replaced by whatever information the aggregator could find on this resource! Another Example of Aggregation

38 38 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Example #3 Yangtze 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea Notice that in this XML document there is no unique identifier: Yangtze3.xml XML Yangtze 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea Yangtze3.rdf RDF The RDF is identical to the XML!

39 39 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Interpreting the RDF Yangtze 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea Yangtze3.rdf This is read as: "This is an instance of the River type (class). The River has a name of Yangtze, a length of 6300 kilometers, a startingLocation of western China's Qinghai-Tibet Plateau, and an endingLocation of the East China Sea." In this document the resource is anonymous - it has no identifier.

40 40 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Disadvantage of anonymous resources Yangtze 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea http://www.china.org/geography/rivers/yangtze.rdf Yangtze Dri Chu - Female Yak River Tongtian He, Travelling-Through-the-Heavens River Jinsha Jiang, River of Golden Sand http://www.encyclopedia.org/yangtze-alternate-names.rdf An aggregator tool will not be able to determine if these documents are talking about the same resource. Aggregate

41 41 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Example #4 <River id="Yangtze" xmlns="http://www.geodesy.org/river" xmlns:uom="http://www.measurements.org/units-of-measure"> 6300 western China's Qinghai-Tibet Plateau East China Sea XML Yangtze4.xml Yangtze4.rdf <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#" xmlns:uom="http://www.measurements.org/units-of-measure#"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea RDF

42 42 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. <River id="Yangtze" xmlns="http://www.geodesy.org/river" xmlns:uom="http://www.measurements.org/units-of-measure#"> 6300 western China's Qinghai-Tibet Plateau East China Sea Yangtze4.xml RDF does not allow attributes on the properties (except for special RDF attributes such as rdf:resource). So we need to make the uom:units attribute a child element. Your first instinct might be to modify length to have two child elements: <River id="Yangtze" xmlns="http://www.geodesy.org/river" xmlns:uom="http://www.measurements.org/units-of-measure#"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea However, now the length property has as its value two values. RDF only binary relations i.e., a single value for a property.

43 43 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. rdf:value length 6300 kilometers length has two values - 6300 and kilometers. RDF provides a special property, rdf:value, to be used for specifying the "primary" value. In this example, 6300 is the primary value, and kilometers is a value which provides additional information about the primary value.

44 44 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. RDF Format <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#" xmlns:uom="http://www.measurements.org/units-of-measure#"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea Yangtze4.rdf An anonymous resource Read this as: "The Yangtze River has a length whose value is a resource which has a value of 6300 and whose units is kilometers.

45 45 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Advantage of anonymous resources 6300 kilometers This is an anonymous resource. Its purpose is solely to provide a context for the two properties. Other RDF documents will have no need to amplify this resource. So, in this case, there is no reason for giving the resource an identifier. In this case it makes good sense to use an anonymous resource.

46 46 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. RDF Model (graph) An anonymous resource (also called a "blank node"). That is, a resource with no identifier. (Note: RDF Parsers will typically generate a unique identifier for anonymous resources, to distinguish one anonymous resource from another.) Legend:

47 47 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. rdf:parseType="Resource" <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#" xmlns:uom="http://www.measurements.org/units-of-measure#"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea Yangtze4,v2.rdf If the value of a property is comprised of several values then one option is to create an anonymous resource, as we saw. RDF provides a shorthand, so that you don't need to create an rdf:Description element, by using rdf:parseType="Resource", as shown here: The meaning of this is identical to that shown on the previous slide.

48 48 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Equivalent! 6300 kilometers 6300 kilometers Do Lab3

49 49 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Summary Modify the following XML document so that it is also a valid RDF document: <River id="Yangtze" xmlns="http://www.geodesy.org/river" xmlns:uom="http://www.measurements.org/units-of-measure#"> 6300 western China's Qinghai-Tibet Plateau East China Sea <Dam id="ThreeGorges" xmlns="http://www.geodesy.org/dam"> The Three Gorges Dam 1.5 miles 610 feet $30 billion Yangtze.xml See next slide -->

50 50 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. RDF Format! <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#" xmlns:uom="http://www.measurements.org/units-of-measure#" xml:base="http://www.china.org/geography/rivers"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea <Dam rdf:ID="ThreeGorges" xmlns="http://www.geodesy.org/dam#"> The Three Gorges Dam 1.5 miles 610 feet $30 billion Yangtze.rdf With relatively few changes the XML document is now usable by both XML tools and RDF tools!

51 51 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Example #5 <River id="Yangtze" xmlns="http://www.geodesy.org/river" xmlns:uom="http://www.measurements.org/units-of-measure#"> 6300 175 55 Yangtze5.xml Modify the following XML document so that it is also a valid RDF document: <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#" xmlns:uom="http://www.measurements.org/units-of-measure#"> 6300 kilometers 175 meters 55 kilometers Yangtze5.rdf This is one way of doing it. Now we will see a better way - using "typed literals". (See next slide)

52 52 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Alternate RDF Format <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> 6300 175 55 Yangtze5.rdf With rdf:datatype you can give a property's value a datatype label. The rdf:datatype value acts as a semantic label for the datatype of the value. This is called a typed literal. For this example there must be a namespace, http://www.uom.org/distance#, which defines two datatypes - kilometer and meter. On the next slide is shown how to do this using XML Schemas.

53 53 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Defining the kilometer and meter datatypes using XML Schemas <schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.uom.org/distance#"> uom.xsd

54 54 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Another example using rdf:datatype <Person rdf:ID="JohnSmith" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.person.org#"> 30 In this example we are specifying that the value (30) of age is a nonNegativeInteger (which is defined in the XML Schema namespace).

55 55 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Example #6 -2 degrees Celsius 30.4 (rising) USA Massachusetts Boston January 22, 2003, 11:15 EST WMUR_TV_WeatherReading.xml Modify the following XML document so that it is also a valid RDF document:

56 56 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Things to note 1. The XML document is using three types (classes): WeatherReading Weather Location 2. The 3 type instances are anonymous. Consequently, we cannot benefit from others, and others cannot benefit from us. Where does it make sense to give an identifier to an instance? - In general, the root element should have an identifier. - There is lots of information about Boston. Let's give the Location instance an identifier.

57 57 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Modification 1: Identifiers added <WeatherReading id="BOS-012203-1115" xmlns="http://www.meteorology.org#"> -2 degrees Celsius 30.4 (rising) <Location id="BOS" xmlns="http://www.geodesy.org#"> USA Massachusetts Boston January 22, 2003, 11:15 EST WMUR_TV_WeatherReading,v2.xml

58 58 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Modification 2: Create a property for the Weather, Location types Types (classes) contain properties. We need to wrap the Weather and Location types within a property: <WeatherReading id="BOS-012203-1115" xmlns="http://www.meteorology.org#"> -2 degrees Celsius 30.4 (rising) <Location id="BOS" xmlns="http://www.geodesy.org#"> USA Massachusetts Boston January 22, 2003, 11:15 EST WMUR_TV_WeatherReading,v3.xml

59 59 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. RDF Format! <WeatherReading rdf:ID="BOS-012203-1115" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.meteorology.org#" xml:base="http://www.wmur-tv.com/weather"> -2 degrees Celsius 30.4 (rising) <Location rdf:ID="BOS" xmlns="http://www.geodesy.org#" xml:base="http://www.airports.org/icao"> USA Massachusetts Boston January 22, 2003, 11:15 EST WMUR_TV_WeatherReading.rdf

60 60 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. The rdf:Bag type (class) The rdf:Bag type is used to represent an unordered collection.

61 61 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Example #7 <Meeting id="XML-Design-Patterns" xmlns="http://www.business.org"> John Smith Sally Jones Modify the following XML document so that it is also a valid RDF document: DesignMeeting.xml rdf:Bag makes it clear that this is an unordered collection of names. <Meeting rdf:ID="XML-Design-Pattern" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.business.org#"> John Smith Sally Jones DesignMeeting.rdf

62 62 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. The rdf:Alt type (class) The rdf:Alt type is used to represent a set of alternate properties.

63 63 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Example #8 <Retailer id="BarnesAndNoble" xmlns="http://www.retailers.org"> http://www.bn.com http://www.barnesandnoble.com Modify the following XML document so that it is also a valid RDF document: BarnesAndNoble.xml rdf:Alt makes it clear that the urls listed are alternates, i.e., choose one of them. <Retailer rdf:ID="BarnesAndNoble" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.retailers.org#"> http://www.bn.com http://www.barnesandnoble.com BarnesAndNoble.rdf

64 64 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. The rdf:Seq type (class) The rdf:Seq type is used to represent a sequence of properties.

65 65 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Example #9 <ToDoList id="MondayMeetings" xmlns="http://www.reminders.org"> Meet with CEO at 10am Luncheon at The Eatery Flight at 3pm Modify the following XML document so that it is also a valid RDF document: MyDaysActivities.xml rdf:Seq makes it clear that the activities listed are to be done in the sequence listed. <ToDoList rdf:ID="MondayMeetings" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.reminders.org#"> Meet with CEO at 10am Luncheon at The Eatery Flight at 3pm MyDaysActivities.rdf

66 66 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. rdf:li Property The property, rdf:li ("list item"), is provided by RDF for use with either rdf:Bag, rdf:Alt, or rdf:Seq. The rdf:li property is provided for you to specify an item in a Bag/Alt/Seq. An RDF Parser will replace each rdf:li with rdf:_1, rdf:_2, rdf:_3, etc. The following slide recasts the previous examples using the rdf:li property.

67 67 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. <ToDoList rdf:ID="MondayMeetings" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.reminders.org#"> Meet with CEO at 10am Luncheon at The Eatery Flight at 3pm MyDaysActivities.rdf <Retailer rdf:ID="BarnesAndNoble" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.retailers.org#"> http://www.bn.com http://www.barnesandnoble.com BarnesAndNoble.rdf <Meeting rdf:ID="XML-Design-Pattern" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.business.org#"> John Smith Sally Jones DesignMeeting.rdf

68 68 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Example #10 Modify the following XML document so that it is also a valid RDF document: <Catalogue xmlns="http://www.publishing.org#" xmlns:dc="http://pur1.org/metadata/dublin-core#"> Lateral Thinking Edward de Bono 1973 0-06-099325-2 Harper & Row Illusions: The Adventures of a Reluctant Messiah Richard Bach 1977 0-440-34319-4 Dell Publishing Co. The First and Last Freedom J. Krishnamurti 1954 0-06-064831-7 Harper & Row Barnes_and_Noble _BookCatalogue.xml Note: regrettably, the Dublin Core does not conform to the RDF naming conventions. Hence, you see properties with the first letter capitalized.

69 69 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Things to note 1. The XML document uses two types (classes): Catalogue Book 2. All type instances are anonymous. Consequently, we cannot benefit from others, and others cannot benefit from us. Where does it make sense to give the instance an identifier? - In general, the root element should have an identifier. - There is lots of information about each book instance. Let's give each book instance an identifier (the ISBN).

70 70 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Modification 1: Identifiers added <Catalogue id="BookCatalogue" xmlns="http://www.publishing.org#" xmlns:dc="http://pur1.org/metadata/dublin-core#"> Lateral Thinking Edward de Bono 1973 Harper & Row Illusions: The Adventures of a Reluctant Messiah Richard Bach 1977 Dell Publishing Co. The First and Last Freedom J. Krishnamurti 1954 Harper & Row Barnes_and_Noble _BookCatalogue,v2.xml Notice that the ISBN elements were deleted and their values used as identifiers. Why was an underscore placed in front of the ISBN? Answer: The ID datatype does not allow an identifier to begin with a digit. So, we (arbitrarily) decided to use an underscore.

71 71 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Modification 2: Create a property for the Book type Barnes_and_Noble _BookCatalogue,v3.xml Types (classes) contain properties. We need to wrap the Book type within a property: <Catalogue id="BookCatalogue" xmlns="http://www.publishing.org#" xmlns:dc="http://pur1.org/metadata/dublin-core#"> Lateral Thinking Edward de Bono 1973 Harper & Row Illusions: The Adventures of a Reluctant Messiah Richard Bach 1977 Dell Publishing Co. The First and Last Freedom J. Krishnamurti 1954 Harper & Row

72 72 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. RDF Format! <Catalogue rdf:ID="BookCatalogue" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.publishing.org#" xmlns:dc="http://pur1.org/metadata/dublin-core#" xml:base="http://www.bn.com"> <Book rdf:ID="_0-06-099325-2" xml:base="http://www.publishing.org/book"> Lateral Thinking Edward de Bono 1973 Harper & Row <Book rdf:ID="_0-440-34319-4" xml:base="http://www.publishing.org/book"> Illusions: The Adventures of a Reluctant Messiah Richard Bach 1977 Dell Publishing Co.... Barnes_and_Noble_BookCatalogue.rdf

73 73 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Dublin Core (dc:) The Dublin Core is a standard set of properties: ContentIntellectual Property Instance Title Subject Description Language Relation Coverage Source Creator Publisher Contributor Rights Date Type Format Identifier Note: many people use these properties in their HTML today. For example:

74 74 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. rdf:parseType="Collection" This may be added as an attribute of a property to indicate that the contents of the property is a list of resources. The following slide recasts the BookCatalogue example to use this list type.

75 75 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. <Catalogue rdf:ID="BookCatalogue" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.publishing.org#" xmlns:dc="http://pur1.org/metadata/dublin-core#" xml:base="http://www.bn.com"> <Book rdf:ID="_0-06-099325-2" xml:base="http://www.publishing.org/book"> Lateral Thinking Edward de Bono 1973 Harper & Row <Book rdf:ID="_0-440-34319-4" xml:base="http://www.publishing.org/book"> Illusions: The Adventures of a Reluctant Messiah Richard Bach 1977 Dell Publishing Co.... Barnes_and_Noble_BookCatalogue.rdf Do Lab4

76 76 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Example #11 <Article rdf:ID="QuickBrownFox" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.publishing.org#"> The quick brown fox jumped over the lazy dog. An important person once said " Now is the time for all good men to come to the aid of their country " QuickBrownFox.rdf rdf:parseType="Literal" indicates that the content of paragraph is to be treated simply as literal XML, i.e., tools shouldn't try to parse the content into resource/property/value triples. <Article id="QuickBrownFox" xmlns="http://www.publishing.org"> The quick brown fox jumped over the lazy dog. An important person once said " Now is the time for all good men to come to the aid of their country " Modify the following XML document so that it is also a valid RDF document: QuickBrownFox.xml The paragraph contains "mixed content".

77 77 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. rdf:parseType="Literal" In all of the previous examples the data was structured as resource/property/value triples. Sometimes it doesn't make sense to do such structuring –Example: with mixed content In those cases we can simply indicate "hey, the content of this property is okay. Treat it as a literal XML string."

78 78 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Dangers of rdf:parseType="Literal" The advantage of structuring your XML as resource/property/value triples is enhanced interoperability. When you use rdf:parseType="Literal" you lose the ability for a tool to instantly take advantage of the resource/property/value structure (since you are, by definition, saying that the data doesn't have this structure). Lesson Learned: use rdf:parseType="Literal" sparingly !

79 79 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Example 12 <River id="Yangtze" xmlns="http://www.geodesy.org/river" length="6300 kilometers" startingLocation="western China's Qinghai-Tibet Plateau" endingLocation="East China Sea"/> <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#" xmlns:r="http://www.geodesy.org/river#" r:length="6300 kilometers" r:startingLocation="western China's Qinghai-Tibet Plateau" r:endingLocation="East China Sea"/> XML RDF Yangtze.xml Yangtze.rdf Modify the following XML document so that it is also a valid RDF document: Note that attributes are being used, not child elements! The RDF format allows you to use attributes as well!

80 80 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Attributes Recall that at the very beginning of this tutorial we said that a resource has properties (attributes). Thus, a property can be represented either as a child element, or as an attribute. (Of course, a property can only be represented as an attribute if it has a literal value, not a structured value.)

81 81 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Equivalent! <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#" xmlns:r="http://www.geodesy.org/river#" r:length="6300 kilometers" r:startingLocation="western China's Qinghai-Tibet Plateau" r:endingLocation="East China Sea"/> <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> 6300 kilometers western China's Qinghai-Tibet Plateau East China Sea Do Lab5

82 82 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Example #13 John Smith 555-1212 abc@def.com 1995-01-01 1999-01-01 researcher Some Corp http://www.company.org Cool stuff Modify the following XML document so that it is also a valid RDF document:

83 83 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. <Resume rdf:ID="JSmith-2003" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.resume.org#" xml:base="http://www.jsmith.com/resume"> John Smith 555-1212 abc@def.com 1995-01-01 1999-01-01 researcher Some Corp http://www.company.org Cool stuff Oracle 13 Java 3 BS RPI 1987

84 84 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Conclusion In this tutorial we demonstrated that, with oftentimes slight modifications, "traditional XML" documents can be made RDF-compliant. We recommend using the RDF format for all your XML documents. We feel that the advantages accrued by formatting your XML far outweigh any possible disadvantages (having to structure your XML in a specific format and having to learn a new vocabulary)

85 85 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. Related Articles http://www.xml.com/pub/a/2002/10/30/rdf-friendly.html "Make Your XML RDF-Friendly" by Bob DuCharme, John Cowan


Download ppt "1 Roger L. Costello, David B. Jacobs. © 2003 The MITRE Corporation. XML Design (A Gentle Transition from XML to RDF) Roger L. Costello David B. Jacobs."

Similar presentations


Ads by Google