1 Encoding DC in (X)HTML, XML and RDF
Andy Powell, a.powell@ukoln.ac.uk
UKOLN, University of Bath, UK, http://www.ukoln.ac.uk/
Tutorial at ECDL 2004, Bath, September 2004

2 ECDL 2004 tutorial - Bath, Sept 2004
Contents
- an abstract model for DC (30 mins)
- encoding DC in XHTML (30 mins)
- encoding DC in XML (30 mins)
- encoding DC in RDF/XML (30 mins)
- practical examples: OAI Protocol for Metadata Harvesting and RSS (20 mins)
- assigning identifiers (20 mins)
Note: you are going to see lots of angle-brackets – but no XML schemas!

3 Important
- this is a tutorial…
- …please feel free to ask questions as we go through!

4 Important DCMI documents…
- DCMI Abstract Model (DRAFT): http://www.ukoln.ac.uk/metadata/dcmi/abstract-model/
- Expressing Dublin Core in HTML/XHTML meta and link elements: http://dublincore.org/documents/dcq-html/
- Guidelines for implementing Dublin Core in XML: http://dublincore.org/documents/dc-xml-guidelines/
- Expressing Simple Dublin Core in RDF/XML: http://dublincore.org/documents/dcmes-xml/
- Expressing Qualified Dublin Core in RDF/XML: http://dublincore.org/documents/dcq-rdf-xml/
- Namespace Policy for the DCMI: http://dublincore.org/documents/dcmi-namespace/
- DCMI Metadata Terms: http://dublincore.org/documents/dcmi-terms/

5 Implementing DC
- this tutorial is about the mechanics of implementing DC in HTML, XML and RDF
- it doesn’t really consider which implementation strategy is the best!
- ask yourself two questions…
  - what am I trying to achieve, and does using HTML, XML or RDF help me achieve it?
  - do software and services exist that will support the creation and use of my metadata?

6 Abstract models for DC

7 Why abstract models?
- the first part of this tutorial isn’t going to show any angle brackets!
- why? because before we start creating DCMI descriptions we need to understand
  - the DCMI view of the world/resources we want to describe (the DCMI resource model)
  - the DCMI view of the descriptions we make about that world (the DCMI description model)

8 DCMI resource model
- each resource that we want to describe has zero or more properties
- a property is a specific aspect, characteristic, attribute or relation used to describe a resource
- each property has one or more values
- each value is a resource (the physical or conceptual entity that is associated with a property when it is used to describe a resource)

9 Err… but what is a resource?
- W3C/IETF definition of resource is “…anything that has identity. Familiar examples include an electronic document, an image, a service (e.g., "today's weather report for Los Angeles"), and a collection of other resources. Not all resources are network "retrievable"; e.g., human beings, corporations, and bound books in a library can also be considered resources.”
- i.e. a resource is “anything”
  - physical things (books, cars, people)
  - digital things (Web pages, digital images)
  - conceptual things (colours, points in time)

10 Yeah, but… no, but…
- but… this seems to be too wide for the things we can describe with DC!
- can we really describe people using DC? do people have titles and subjects?
- no… in general we only use DC to describe a sub-set of all resources
- anything covered by the DCMIType list… Collection, Dataset, Event, Image (Still or Moving), Interactive Resource, Service, Software, Sound, Text, Physical Object

11 DCMI resource model (2)
- each resource may be a member of one or more classes
- each class may be related to one or more other classes by a refines (sub-class) relationship
  - the two classes share some semantics such that all resources that are members of the sub-class are also members of the related class
- where the resource is the value of a property, the class is referred to as a vocabulary encoding scheme

12 DCMI resource model (3)
- each property may be related to exactly one other property by a refines (sub-property) relationship
  - the two properties share some semantics such that all valid values of the sub-property are also valid values of the related property

13 DCMI description model
- a description is made up of one or more statements (about one, and only one, resource) and zero or one resource URI (a URI reference that identifies the resource being described)
- each statement is made up of a property URI (that identifies a property), zero or one value URI (that identifies a value of the property), zero or one encoding scheme URI (that identifies the class of the value) and zero or more value representations of the value

14 DCMI description model (2)
- each property is an attribute of the resource being described
- each property URI may be repeated in multiple statements
- the value representation may take the form of a value string, a rich value or a related description

15 DCMI description model (3)
- each value string is a simple, human-readable string that represents the value of the property
- each value string may have an associated encoding scheme URI that identifies a syntax encoding scheme
- each value string may have an associated value string language that is an ISO language tag (e.g. en-GB)

16 DCMI description model (4)
- each rich value is some marked-up text, an image, a video, some audio, etc. or some combination thereof that represents the resource that is the value of the property
- each related description is a description of (i.e. some metadata about) the resource that is the value of the property
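The description model on the preceding slides can be sketched as plain data types. This is only an illustrative Python sketch and was not part of the tutorial: the class and field names (ValueString, Statement, Description) are assumptions chosen for readability, and rich values and related descriptions are left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ValueString:
    """A simple, human-readable string that represents a value."""
    text: str
    language: Optional[str] = None            # value string language, e.g. "en-GB"
    syntax_scheme_uri: Optional[str] = None   # syntax encoding scheme, if any


@dataclass
class Statement:
    """One property and the representations of its value."""
    property_uri: str                          # identifies the property
    value_uri: Optional[str] = None            # zero or one value URI
    encoding_scheme_uri: Optional[str] = None  # class of the value, if stated
    value_strings: List[ValueString] = field(default_factory=list)


@dataclass
class Description:
    """Statements about one, and only one, resource."""
    statements: List[Statement]
    resource_uri: Optional[str] = None         # zero or one resource URI

    def __post_init__(self):
        # The model asks for one or more statements per description.
        if not self.statements:
            raise ValueError("a description needs at least one statement")


description = Description(
    resource_uri="http://example.org/",
    statements=[
        Statement(
            property_uri="http://purl.org/dc/elements/1.1/creator",
            value_strings=[ValueString("Andy Powell", language="en-GB")],
        )
    ],
)
```

Keeping the resource URI optional mirrors the "zero or one resource URI" wording of the model.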

17 The 1:1 principle
- notice that the model indicates that each property used in a description must be an attribute of the resource being described
- this is commonly referred to as the 1:1 principle - the principle that a DCMI metadata description describes one, and only one, resource
- however…

18 Description sets
- real-world metadata applications tend to be based on loosely grouped sets of descriptions (where the described resources are typically related in some way)
- known here as description sets
- for example, a description set might comprise descriptions of both a painting and the artist

19 DCMI records
- description sets are instantiated, for the purposes of exchange between software applications, in the form of metadata records
- each record conforms to one of the DCMI encoding guidelines (XHTML meta tags, XML, RDF/XML, etc.)
- [Figure labels: a document; andy powell]

20 Values (again!)
- a value is the physical or conceptual entity that is associated with a property when it is used to describe a resource
  - the value of the DC Creator property is a person, organisation or service - a physical entity
  - the value of the DC Date property is a point in time - a conceptual entity
  - the value of the DC Coverage property may be a geographic region or country - a physical entity
  - the value of the DC Subject property may be a concept - a conceptual entity - or a physical object or person - a physical entity
- each of these entities is a resource

21 Simple vs. qualified DC?
- the notions of “simple DC” and “qualified DC” are often referred to in DCMI discussions
- a view of what these two terms mean is presented here…
- note that not everyone agrees with my definitions!

22 Simple DC record
- a simple DC record is a record that:
  - conforms to the abstract model,
  - comprises only a single description,
  - uses only the 15 properties in the Dublin Core Metadata Element Set,
  - makes no use of value URIs, encoding schemes, rich values or related descriptions.

23 A couple of notes…
- there is no guaranteed linkage between a simple DC record and the resource being described because the resource URI is optional
- such a linkage may be made by encoding the URI of the resource as the value string of the DC Identifier element, however this is not mandatory – everything in DC is optional
- while the value string of a property may look like a URI, there is nothing in the simple DC model that indicates this is the case
- …at their own risk, implementations may choose to guess which value strings are URIs and which are not…

24 Qualified DC model
- a qualified DC record is a record that:
  - conforms to the DCMI abstract model,
  - contains at least one property taken from the DCMI Metadata Terms recommendation

25 A couple of notes…
- it is still the case that there is no guaranteed linkage between a qualified DC record and the resource being described!
- a linkage may be made by encoding the URI of the resource as the value string of the DC Identifier element, however this is not mandatory – everything in DC is optional
- …where the value of a property is a URI, we can now indicate that it is a URI by using the ‘URI’ encoding scheme…

26 Dumb-down
- the process of translating a qualified DC metadata record into a simple DC metadata record is normally referred to as ‘dumbing-down’
- can be separated into two parts: property dumb-down and value dumb-down
- each of these processes can be approached in one of two ways
  - informed dumb-down
  - uninformed dumb-down

27 Dumb and dumberer
- informed dumb-down takes place where the software performing the dumb-down algorithm has knowledge built into it about the property relationships and values being used within a specific DCMI metadata application
- uninformed dumb-down takes place where the software performing the dumb-down algorithm has no prior knowledge about the properties and values being used

28 Dumb-down algorithm
- uninformed, element: ignore any property that isn't in the Dublin Core Metadata Element Set
- uninformed, value: use value URI (if present) or value string as new value string
- informed, element: recursively resolve sub-property relationships until one of the 15 properties in the DCMES is reached, otherwise ignore
- informed, value: use knowledge of the related descriptions or the value string to create a new value string
- …and in all cases: ignore any related descriptions and rich values, ignore any encoding scheme URIs.
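The property side of the dumb-down algorithm, plus the uninformed value rule, might be sketched as follows. This is an illustrative Python sketch rather than a reference implementation: the dictionary-based statement layout, the function names and the `refines` mapping are assumptions made for the example. Passing no `refines` mapping gives uninformed dumb-down; passing one gives informed property dumb-down.

```python
DC_NS = "http://purl.org/dc/elements/1.1/"
DCTERMS_NS = "http://purl.org/dc/terms/"

# URIs of the 15 properties in the Dublin Core Metadata Element Set.
DCMES_URIS = {DC_NS + name for name in (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights")}


def resolve_property(uri, refines):
    """Follow sub-property links until a DCMES property is reached."""
    seen = set()
    while uri not in DCMES_URIS:
        if uri in seen or uri not in refines:
            return None            # unknown property (or a loop): ignore it
        seen.add(uri)
        uri = refines[uri]
    return uri


def dumb_down(statements, refines=None):
    """Return (property URI, value string) pairs for a simple DC record."""
    refines = refines or {}
    simple = []
    for statement in statements:
        prop = resolve_property(statement["property_uri"], refines)
        if prop is None:
            continue
        # Use the value URI if present, otherwise the value string.
        # Encoding schemes, rich values and related descriptions are dropped.
        value = statement.get("value_uri") or statement.get("value_string")
        if value is not None:
            simple.append((prop, value))
    return simple


qualified = [
    {"property_uri": DC_NS + "title", "value_string": "Encoding DC"},
    {"property_uri": DCTERMS_NS + "modified", "value_string": "2004-09-12",
     "encoding_scheme_uri": DCTERMS_NS + "W3CDTF"},
    {"property_uri": DC_NS + "relation", "value_uri": "http://www.ukoln.ac.uk/"},
]

# Application knowledge used for the informed case (an example mapping).
known_refinements = {DCTERMS_NS + "modified": DC_NS + "date"}

uninformed = dumb_down(qualified)
informed = dumb_down(qualified, known_refinements)
```

In the uninformed run the `modified` statement is simply dropped; in the informed run it survives as a `date` statement.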

29 Encoding DC in XHTML (and HTML!)

30 What is being described?
- a DC record embedded in an (X)HTML document describes that document
- if you want to describe something else, don’t embed it in the (X)HTML document!
- …not everyone would agree with this…

31 Simple DC elements
- use the ‘name’ and ‘content’ attributes of the XHTML <meta> element to encode the DC element (one of the 15 DCMES elements) and its value string
- use the following pattern: [markup not preserved in this transcript]
- for example: [markup not preserved in this transcript]
- …the element names of the 15 DCMES elements always have a lower-case first letter…

32 Simple DC values
- value strings go in the XHTML <meta> element ‘content’ attribute…
- the string in the ‘content’ attribute is defined to be CDATA, i.e. a sequence of characters from the document character set which may include character entities
- …long value strings may be wrapped across multiple lines as necessary…
- …will need to escape some characters, &, <, >, etc…
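Since the example markup for these slides did not survive, here is a small Python sketch of the idea: building a `<meta>` element for a DC element and escaping the characters mentioned above. The helper name `dc_meta` and the sample values are assumptions for illustration; the `DC.` prefix follows the convention described later in the tutorial.

```python
from html import escape


def dc_meta(element, value, prefix="DC"):
    """Return an XHTML <meta> element carrying one DC element and its value string."""
    # escape() turns &, <, > and quote characters into character entities
    # so that the value can sit safely inside the content attribute.
    return '<meta name="{}.{}" content="{}" />'.format(
        prefix, element, escape(value, quote=True))


title_meta = dc_meta("title", "Fish & Chips")
creator_meta = dc_meta("creator", "Andy Powell")
```

With these inputs `title_meta` is `<meta name="DC.title" content="Fish &amp; Chips" />`.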

33 Language of the value
- where the language of the value string is indicated, it should be encoded using the ‘xml:lang’ attribute of the XHTML <meta> element
- for example: [markup not preserved in this transcript]

34 Repeated elements
- multiple property values should be encoded by repeating the XHTML <meta> element for that property
- for example: [markup not preserved in this transcript]
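To illustrate repeated elements and the `xml:lang` attribute together, the following Python sketch parses a small XHTML document and collects the DC subject values with their languages. The XHTML document is a made-up example (reusing the "seafood" values that appear later in the XML section), not the markup from the original slides.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

doc = """<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>Seafood</title>
  <meta name="DC.subject" xml:lang="en-GB" content="seafood" />
  <meta name="DC.subject" xml:lang="fr" content="fruits de mer" />
</head>
<body />
</html>"""

root = ET.fromstring(doc)

# One (value string, language) pair per repeated <meta> element.
subjects = [
    (meta.get("content"), meta.get(XML_LANG))
    for meta in root.iter("{%s}meta" % XHTML)
    if meta.get("name") == "DC.subject"
]
```

Each repeated `<meta>` element yields its own pair, so nothing has to be packed into a single delimited string.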

35 Other DC elements
- use the ‘name’ and ‘content’ attributes of the XHTML <meta> element to encode the DC element (e.g. audience) and its value
- use the following patterns: [markup not preserved in this transcript]
- for example: [markup not preserved in this transcript]
- …element names may be mixed-case but should always have a lower-case first letter…

36 Element refinements
- use the ‘name’ and ‘content’ attributes of the XHTML <meta> element to encode the element refinement and its value
- use the following pattern: [markup not preserved in this transcript]
- for example: [markup not preserved in this transcript]

37 Element refinements (2)
- element refinements should use the names specified in: DCMI Metadata Terms http://dublincore.org/documents/dcmi-terms/
- …as a general rule, element refinement names may be mixed-case but should always have a lower-case first letter…

38 Encoding schemes
- encoding schemes are encoded using the ‘scheme’ attribute of the XHTML <meta> element
- using the following pattern: [markup not preserved in this transcript]
- for example: [markup not preserved in this transcript]

39 Encoding schemes (2)
- encoding schemes should use the names specified in: DCMI Metadata Terms http://dublincore.org/documents/dcmi-terms/
- …as a general rule, encoding scheme names may be mixed-case but should always start with an upper-case letter. Encoding scheme names are often all upper-case…

40 Handling namespaces…
- the ‘DC.’ and ‘DCTERMS.’ prefixes are used to indicate the namespace from which the property is taken
- put the namespace URI in an XHTML <link> element: [markup not preserved in this transcript]
- while any string is allowable as the prefix, current practice is to use ‘DC.’ and ‘DCTERMS.’

41 Value URIs
- where the value of a property is the URI of another resource (e.g. DC.relation) an alternative form of encoding using the XHTML <link> element is preferred
- use the following pattern: [markup not preserved in this transcript]
- for example: [markup not preserved in this transcript]

42 HTML profile
- to give recipient software applications an indication of the XHTML profile that was used to encode the DCMI metadata, the ‘profile’ attribute of the XHTML <head> element should be used: [markup not preserved in this transcript]
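The pieces covered so far (namespace links, encoding schemes, value URIs and the profile) can be seen working together in one hedged Python sketch. The XHTML below is an assumed example written to follow the conventions described in the DCMI (X)HTML encoding document listed on slide 4, including its `schema.PREFIX` link convention and its profile URI; it is not the markup from the original slides, and the parsing code is only one possible way to read such a page.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

doc = """<html xmlns="http://www.w3.org/1999/xhtml">
<head profile="http://dublincore.org/documents/dcq-html/">
  <title>Example page</title>
  <link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />
  <link rel="schema.DCTERMS" href="http://purl.org/dc/terms/" />
  <meta name="DC.title" content="Example page" />
  <meta name="DCTERMS.modified" scheme="DCTERMS.W3CDTF" content="2004-09-12" />
  <link rel="DC.relation" href="http://www.ukoln.ac.uk/" />
</head>
<body />
</html>"""

head = ET.fromstring(doc).find("{%s}head" % XHTML)

# First pass: collect prefix -> namespace URI from the schema.* links.
prefixes = {}
for link in head.iter("{%s}link" % XHTML):
    rel = link.get("rel", "")
    if rel.lower().startswith("schema."):
        prefixes[rel[len("schema."):]] = link.get("href")


def expand(name):
    """Turn 'DC.title' into a full property URI, or None if the prefix is unknown."""
    prefix, _, local = name.partition(".")
    namespace = prefixes.get(prefix)
    return namespace + local if namespace and local else None


# Second pass: (property URI, value, encoding scheme URI or None).
statements = []
for element in head:
    tag = element.tag.split("}")[1]
    if tag == "meta" and expand(element.get("name", "")):
        scheme = element.get("scheme")
        statements.append((expand(element.get("name")), element.get("content"),
                           expand(scheme) if scheme else None))
    elif tag == "link" and not element.get("rel", "").lower().startswith("schema."):
        prop = expand(element.get("rel", ""))
        if prop:
            statements.append((prop, element.get("href"), None))
```

The profile attribute is not needed by the code above; a consumer could check it first to decide whether the page follows this encoding at all.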

43 The case of names
- note that some of the old DCMI documents recommend(ed) using an uppercase first letter for the names of DCMES elements and element refinements, e.g. using ‘Title’ rather than ‘title’
- this form of DCMES element naming is acceptable but is no longer considered the preferred form

44 The case of names (2)
- in general, any software applications that consume DC records embedded into XHTML Web pages should ignore the case of names and should treat both ‘.’ (full-stop) and ‘:’ (colon) as valid characters after the prefix
- i.e. all the following forms should be treated as being equivalent: [list not preserved in this transcript]
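A consuming application could apply that advice with a small normalisation step before comparing names. The Python sketch below is one assumed way to do it; the function name and the sample name forms are illustrative, since the list from the original slide was not preserved.

```python
import re

# A prefix, then either '.' or ':' as the separator, then the local name.
_NAME = re.compile(r"^([^.:]+)[.:](.+)$")


def normalise_name(name):
    """Return (prefix, local name) in lower case, or None if there is no prefix."""
    match = _NAME.match(name.strip())
    if match is None:
        return None
    prefix, local = match.groups()
    return prefix.lower(), local.lower()


forms = ["DC.title", "DC.Title", "dc.title", "DC:title", "dc:Title"]
normalised = {normalise_name(form) for form in forms}
```

All five forms collapse to the single pair `("dc", "title")`.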

45 Older versions of HTML
- all the examples in this tutorial conform to XHTML 1.0
- most of the recommendations in this tutorial can be applied to older versions of HTML (e.g. HTML 4.01) but
  - use ‘lang’ rather than ‘xml:lang’
  - older versions of HTML do not require the trailing ‘/’ before the closing ‘>’ in the HTML <meta> element
  - very old versions of HTML have no ‘scheme’ attribute

46 Mixing DC and non-DC
- DC metadata can be mixed with non-DC metadata in XHTML elements
- the following example embeds DC, AGLS and unspecified metadata properties in the same XHTML Web page: [markup not preserved in this transcript]

47 A couple of examples
- Simple DC: example 1
- Qualified DC: example 2
- ScreenCam of using DC-dot: http://www.ukoln.ac.uk/metadata/dcdot/

48 Encoding DC in XML

49 Introduction to XML
- this section is based on: Guidelines for implementing Dublin Core in XML http://dublincore.org/documents/dc-xml-guidelines/
- nine recommendations for encoding DC in XML
- Note: these recommendations do not create RDF/XML. This is not intended to imply that plain XML is better than RDF/XML…
- …RDF is covered in the next section!

50 Recommendation 1
- implementers should base their XML applications on XML Schemas rather than XML DTDs
- approaches based on XML Schemas are more flexible and are more easily re-used within other XML applications
- …in some cases it may be sensible to provide both an XML Schema and a DTD for the application. Where XML Schemas are not used, a DTD should be provided instead…

51 Recommendation 1 (2)
- the DCMI maintains a list of XML schemas that are in use in projects or products using DCMI metadata
- DCMI Metadata expressed in XML Schema Language: http://dublincore.org/schemas/xmls/

52 Recommendation 2
- implementers should use URI references (see later) to uniquely identify DC elements, element refinements and encoding schemes
- …DC namespaces are defined in the DCMI Namespace Recommendation...

53 Container elements
- note that it is anticipated that records will be encoded within one or more container XML element(s) of some kind
- this tutorial makes no recommendations for the name of any container element, nor for the namespace that the element should be taken from
- candidate container element names include: [five element names not preserved in this transcript]

54 Recommendation 3
- implementers should encode properties as XML elements and values as the content of those elements
- the name of the XML element should be an XML qualified name (QName) of the property
- example value: Dublin Core in XML [element markup not preserved in this transcript]
- do not use constructs like: [markup not preserved in this transcript]

55 Recommendation 4
- the property names for the 15 DC elements should be all lower-case
- example value: Dublin Core in XML [element markup not preserved in this transcript]
- do not use: [markup not preserved in this transcript]

56 Recommendation 5
- multiple property values should be encoded by repeating the XML element for that property
- example values: First title, Second title [element markup not preserved in this transcript]
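Recommendations 3 to 5 can be illustrated with a short Python sketch that builds a record using the standard library. The container element name `record` is purely an assumption (the guidelines leave the container open), and the output is only one serialisation that fits the recommendations: lower-case QNames, values as element content, and one element per value.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC)   # serialise the namespace with the usual dc: prefix

# Hypothetical container element; the guidelines do not prescribe a name.
record = ET.Element("record")

# Recommendation 5: repeat the element once per value.
for title in ["First title", "Second title"]:
    ET.SubElement(record, "{%s}title" % DC).text = title

xml_text = ET.tostring(record, encoding="unicode")

# Reading it back gives the two values in document order.
titles = [element.text for element in ET.fromstring(xml_text).iter("{%s}title" % DC)]
```

The serialised record contains `<dc:title>First title</dc:title>` followed by `<dc:title>Second title</dc:title>`.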

57 Simple DC example
- example 3

58 Recommendation 6
- element refinements should be treated in the same way as other properties
- the name of the XML element should be an XML qualified name (QName)
- example value: 2002-06 [element markup not preserved in this transcript]
- do not use any of the following: [three alternative encodings of 2002-06, markup not preserved in this transcript]

59 Recommendation 6 (2)
- element refinements are properties in their own right and are therefore best encoded in a similar way to other DC elements
- in particular, it should be noted that element refinements may have further refinements of their own (e.g. ‘format’ is refined by ‘extent’ which might be further refined by ‘duration’)
- …nesting does not mean refinement; nesting might be used for other purposes…

60 Recommendation 7
- encoding schemes should be implemented using the ‘xsi:type’ attribute of the XML element for the property
- the name of the encoding scheme should be given as the attribute value, and should be in the form of an XML qualified name (QName)
- example value: http://www.ukoln.ac.uk/ [element markup not preserved in this transcript]
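Because the `xsi:type` value is a QName, a consumer has to resolve its prefix against the namespace declarations in the document. The Python sketch below shows one way to do that with the standard library. The XML document, including the `record` container and the choice of `dc:identifier` with the `dcterms:URI` scheme, is an assumed example built around the URL that survives on the slide.

```python
import io
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

doc = """<record xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:dcterms="http://purl.org/dc/terms/"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <dc:identifier xsi:type="dcterms:URI">http://www.ukoln.ac.uk/</dc:identifier>
</record>"""

prefixes = {}   # prefix -> namespace URI (simplified: ignores prefix scoping)
typed_values = []

# "start-ns" events report namespace declarations; "end" events give complete elements.
events = ET.iterparse(io.BytesIO(doc.encode("utf-8")), events=("start-ns", "end"))
for event, payload in events:
    if event == "start-ns":
        prefix, uri = payload
        prefixes[prefix] = uri
    elif XSI_TYPE in payload.attrib:
        prefix, _, local = payload.get(XSI_TYPE).partition(":")
        scheme_uri = prefixes[prefix] + local
        typed_values.append((payload.tag, payload.text, scheme_uri))
```

A document that redeclares the same prefix in a nested element would need proper scope tracking, which this sketch leaves out.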

61 Recommendation 7 (2)
- it should be noted that there may be existing DC XML applications that use other conventions to support encoding schemes, notably the use of a ‘scheme’ attribute of the XML element for the property
- therefore, it may be sensible for software applications that consume DC XML to be fairly liberal in what they accept

62 Recommendation 8
- elements, element refinements and encoding schemes should use the names specified in DCMI Metadata Terms: http://dublincore.org/documents/dcmi-terms/
- …note, the 15 DCMES element names all start with a lowercase letter…

63 Recommendation 8 (2)
- element and element refinement names may be mixed-case but should always have a lower-case first letter
- encoding scheme names may be mixed-case but should always start with an upper-case letter
- example values [element markup not preserved in this transcript]:
  - http://www.bbc.co.uk/
  - name=The Great Depression; start=1929; end=1939;

64 Recommendation 9
- where the language of the value is indicated, it should be encoded using the ‘xml:lang’ attribute
- example values: seafood, fruits de mer [element markup not preserved in this transcript]

65 Some examples
- Qualified DC: example 4
- DC and IMS: example 5
- DC, IMS and ODRL: example 6
- HEALTH WARNING: Examples 5 and 6 may seriously damage your interoperability!

66 Encoding DC in RDF

67 What is RDF?
- Resource Description Framework
- W3C recommendation for metadata model and syntax(es)
- XML is most common syntax in use on the Web
- underpins the ‘semantic Web’
- W3C - Resource Description Framework (RDF): http://www.w3.org/RDF/

68 Why use RDF?
- RDF provides shared metadata ‘model’…
- …shared ‘meaning’
- metadata can be shared between applications that have little or no knowledge about each other
- e.g. an RDF-based bibliographic application can consume RDF-based geospatial metadata and have 'some' knowledge of what it means
- …with (X)HTML and XML encodings, software applications must have ‘understanding’ hard-coded into them…

69 DC in RDF
- DC abstract models map easily onto the RDF model (because RDF was the basis for them!)
- DC in RDF/XML syntax is an encoding of the RDF model in XML
- simple DC is similar to the non-RDF XML we've seen already…
- …but with the addition of <rdf:RDF> and <rdf:Description> container elements
- example 7

70 RDF basics… the model
- model based on triples
- a resource has a property which has a value
- often represented as an arc-node diagram (or “graph”)
- resources and properties are identified using URI references
- [Figure: a resource node linked to a “value” node by a property arc]

71 A more concrete example
- The graph below approximately translates into English as… the resource identified by the URI http://example.org/ has a dc:creator that is represented by the string “Andy Powell”
- [Figure: node http://example.org/ linked to the literal “Andy Powell” by the arc http://purl.org/dc/elements/1.1/creator]
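The single triple in this example can be written down directly and then serialised as RDF/XML. The Python sketch below uses only the standard library; the helper name `to_rdfxml` and the way the property URI is split into namespace and local name are assumptions for illustration, and a real application would more likely use an RDF library.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF)
ET.register_namespace("dc", DC)

# (resource, property, value): the triple from the graph described above.
triple = ("http://example.org/", DC + "creator", "Andy Powell")


def to_rdfxml(triples):
    """Serialise triples with literal values as RDF/XML (one Description per triple)."""
    root = ET.Element("{%s}RDF" % RDF)
    for subject, predicate, literal in triples:
        description = ET.SubElement(
            root, "{%s}Description" % RDF, {"{%s}about" % RDF: subject})
        # Split the property URI after its last '/' or '#'.
        cut = max(predicate.rfind("/"), predicate.rfind("#")) + 1
        prop = ET.SubElement(description, "{%s}%s" % (predicate[:cut], predicate[cut:]))
        prop.text = literal
    return ET.tostring(root, encoding="unicode")


rdfxml = to_rdfxml([triple])
```

The result wraps a `dc:creator` element in the `rdf:RDF` and `rdf:Description` containers, with the resource URI carried by `rdf:about`.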

72 Values as resources
- values can be resources too
- means that we can then attach properties to the value as well as to the original resource
- build up quite complex graphs
- [Figure: http://example.org/ linked by dc:creator to a value node, which has my:name “Andy Powell” and my:phoneNumber “01225 383933”]

73 Typed and blank nodes
- nodes can be blank (to represent resources that have not been assigned a URI)
- can also indicate the class of a resource using the rdf:type property
- [Figure: as before, with the value node also linked to my:Person by rdf:type]
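The graph with a typed blank node can be sketched as a list of triples. In this Python sketch the blank node label `_:b1` and the namespace used for the `my:` prefix are assumptions made for the example, and literals and node identifiers are both held as plain strings for brevity.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"
MY = "http://example.org/my/"   # hypothetical namespace for the my: prefix

triples = [
    ("http://example.org/", DC_CREATOR, "_:b1"),   # the value is a blank node
    ("_:b1", RDF_TYPE, MY + "Person"),             # the class of the value
    ("_:b1", MY + "name", "Andy Powell"),
    ("_:b1", MY + "phoneNumber", "01225 383933"),
]


def objects(subject, predicate):
    """Return every value of the given property on the given node."""
    return [o for s, p, o in triples if s == subject and p == predicate]


# Walk from the document to its creator, then read the creator's properties.
creator = objects("http://example.org/", DC_CREATOR)[0]
creator_name = objects(creator, MY + "name")[0]
creator_class = objects(creator, RDF_TYPE)[0]
```

The blank node has no URI of its own, so other documents cannot refer to it; that limitation is what the case studies that follow set out to address.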

74 Qualified DC in RDF
- now ready to look at some more complex examples
- for full details about how to encode DC in RDF see…
  - Expressing Simple Dublin Core in RDF/XML: http://dublincore.org/documents/dcmes-xml/
  - Expressing Qualified Dublin Core in RDF/XML: http://dublincore.org/documents/dcq-rdf-xml/

75 Case study 1 – dc:creator
- Example RDF description using dc:creator…
- [RDF/XML markup not preserved in this transcript; literal values: Andy Powell, a.powell@ukoln.ac.uk]

76 Case study 1 – dc:creator
- …and the RDF model it represents.
- [Figure: the RDF/XML fragment beside its graph; arc labels dc:creator, rdfs:label, my:name, my:affiliation, my:email; literal values are truncated in the original (Andy Powell…, UKOLN, Univ…, a.powell@uko…)]

77 Case study 1 – dc:creator
- But… we don’t want to embed all this information into every instance metadata record do we?
- [Figure: the same graph, with the creator details marked as relatedMetadata]

78 Case study 1 – dc:creator
- Need to separate part of the information out and store it in a single place – in this case in a directory service…
- [Figure: the graph split into the description (dc:creator, rdfs:label) and the separately stored details (my:name, my:affiliation, my:email)]

79 Case study 1 – dc:creator
- To do this we need to assign a URI (the ‘valueURI’) to the anonymous ‘value’ node…
- [Figure: both parts of the graph now share a node labelled valueURI]

80 Case study 1 – dc:creator
- The document containing this information is itself an RDF resource (the ‘relatedMetadata’) and has a URI
- [Figure: as before, with the separately stored document labelled relatedMetadataURI]

81 Case study 1 – dc:creator
- Use rdfs:seeAlso to form linkage between description and relatedMetadata…
- [Figure: as before, with an rdfs:seeAlso arc from the valueURI node to relatedMetadataURI]

82 Case study 2 – dc:subject
- Example RDF description using dc:subject (taken from Qualified DC in RDF recommendation)…
- [RDF/XML markup not preserved in this transcript; literal values: D08.586.682.075.400, Formate Dehydrogenase]

83 Case study 2 – dc:subject
- …and the RDF model it represents.
- [Figure: the RDF/XML fragment beside its graph; arc labels dc:subject, rdf:type (to dcterms:MESH), rdfs:label, rdfs:value; literal values D08.586.682.075.400 and Formate Dehydrogenase]

84 Case study 2 – dc:subject
- But… we don’t want to embed all this information into every instance metadata record do we?
- [Figure: the same graph, with part of the subject details marked as relatedMetadata]

85 Case study 2 – dc:subject
- Need to separate part of the information out and store it in a single place – in this case with the terminology owner…
- [Figure: the graph split into the description and the separately stored term details]

86 Case study 2 – dc:subject
- To do this we need to assign a URI (the ‘valueURI’) to the anonymous ‘value’ node…
- [Figure: both parts of the graph now share a node labelled valueURI]

87 Case study 2 – dc:subject
- The document containing this information is itself an RDF resource (the ‘relatedMetadata’) and has a URI
- [Figure: as before, with the separately stored document labelled relatedMetadataURI]

88 Case study 2 – dc:subject
- Use rdfs:seeAlso to form linkage between description and relatedMetadata…
- [Figure: as before, with an rdfs:seeAlso arc from the valueURI node to relatedMetadataURI]

89 Abstract DC model
- In terms of abstract DC model we now have: resource, property, value URI, value string (and value string language), encoding scheme, related description
- [Figure: the case study 2 graph annotated with these abstract model terms]

90 Practical examples – OAI and RSS

91 OAI-PMH
- OAI Protocol for Metadata Harvesting
- simple protocol for sharing metadata records between applications
- currently at version 2.0
- based on HTTP, XML, XML Schema and XML namespaces
- allows a harvester to ask a remote repository for some or all of its metadata records

92 OAI-PMH (2)
- simple DC is default (mandatory) record format
- supports any record format provided it can be encoded using XML (e.g. DC, IMS, MARC, ODRL, …)
- Open Archives Initiative: http://www.openarchives.org/

93 OAI-PMH example
- record from the American Memory repository at the Library of Congress: http://memory.loc.gov/cgi-bin/oai2_0
- example 8
- ScreenCam of using the ‘repository explorer’
- GetRecord for record identifier oai:lcoa1.loc.gov:loc.gmd/g3701p.rr003570
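A harvester's GetRecord request is an ordinary HTTP request with three arguments. The Python sketch below only builds the request URL from the base URL and record identifier quoted on the slide; the helper name is an assumption, `oai_dc` is the metadata prefix OAI-PMH uses for simple DC, and no claim is made that the 2004 endpoint still answers.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE_URL = "http://memory.loc.gov/cgi-bin/oai2_0"


def get_record_url(identifier, metadata_prefix="oai_dc", base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL (no network access is made)."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return base_url + "?" + query


url = get_record_url("oai:lcoa1.loc.gov:loc.gmd/g3701p.rr003570")

# Decoding the query again shows the three arguments the repository receives.
arguments = parse_qs(urlsplit(url).query)
```

The response to such a request is an XML document in which the simple DC record sits inside the protocol's own envelope.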

94 RSS
- RDF Site Summary or Rich Site Summary (or even Really Simple Syndication)
- at least 3 different versions (0.91, 1.0 and 2.0), all based on XML but not compatible
- simple format for sharing news feeds on the Web
- RSS ‘channels’ – list of ‘items’
- channels updated by updating XML file
- RSS clients gather XML on regular basis

95 RSS 1.0 and DC example
- RSS 1.0 based on RDF
- most flexible and extensible of the RSS ‘family’ - not necessarily the most widely deployed
- can include DC in both ‘channel’ and ‘item’ descriptions
- example 9
- full documentation at: RDF Site Summary 1.0 Modules: Qualified Dublin Core http://web.resource.org/rss/1.0/modules/dcterms/
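To show DC appearing in both the channel and the item descriptions, the following Python sketch parses a small RSS 1.0 feed and pulls out the DC properties. The feed is a made-up example (its URLs and values are hypothetical, and it is not the "example 9" from the tutorial); the helper name is also an assumption.

```python
import xml.etree.ElementTree as ET

RSS = "http://purl.org/rss/1.0/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

feed = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel rdf:about="http://example.org/news.rss">
    <title>Example news</title>
    <link>http://example.org/</link>
    <description>A hypothetical feed</description>
    <dc:publisher>Example Org</dc:publisher>
    <items><rdf:Seq><rdf:li rdf:resource="http://example.org/item1" /></rdf:Seq></items>
  </channel>
  <item rdf:about="http://example.org/item1">
    <title>First item</title>
    <link>http://example.org/item1</link>
    <dc:creator>Andy Powell</dc:creator>
    <dc:date>2004-09-12</dc:date>
  </item>
</rdf:RDF>"""

root = ET.fromstring(feed)


def dc_properties(element):
    """Return (DC element name, value) pairs for the direct children of element."""
    prefix = "{%s}" % DC
    return [(child.tag[len(prefix):], child.text)
            for child in element if child.tag.startswith(prefix)]


channel_dc = dc_properties(root.find("{%s}channel" % RSS))
items_dc = {item.get("{%s}about" % RDF): dc_properties(item)
            for item in root.findall("{%s}item" % RSS)}
```

Because RSS 1.0 is RDF, an RDF-aware client could read the same feed as triples instead of walking the XML tree as done here.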

96 Assigning identifiers to metadata terms

97 What’s the problem?
- the terms used in DCMI metadata records must be assigned a URI reference before they can be used
- qualified DC application profiles generally use ‘local’ additions to DCMI terms
- therefore these additional terms must be assigned a URI reference
- …a URI reference is a URI with an optional fragment identifier…

98 DCMI terms URI references
- all DCMI terms have already been assigned URI references
- for example…
  - http://purl.org/dc/elements/1.1/title
  - http://purl.org/dc/terms/dateCopyrighted

99 Namespace-name issues
- encoding syntaxes split the term URI reference into two parts: a namespace and a name
- the namespace is shortened to a namespace prefix
- for example: DC.title (XHTML), dc:title (XML, RDF/XML)
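The split between namespace and name can be sketched in a few lines of Python. The prefix table below lists only the two DCMI namespaces used in this tutorial, and the function names are assumptions for illustration.

```python
# Namespace URI -> conventional prefix (lower-case form, as used in XML and RDF/XML).
KNOWN_NAMESPACES = {
    "http://purl.org/dc/elements/1.1/": "dc",
    "http://purl.org/dc/terms/": "dcterms",
}


def split_term(term_uri):
    """Split a term URI reference into (namespace URI, name)."""
    # Try longer namespaces first in case one is a prefix of another.
    for namespace in sorted(KNOWN_NAMESPACES, key=len, reverse=True):
        if term_uri.startswith(namespace):
            return namespace, term_uri[len(namespace):]
    raise ValueError("no known namespace for %s" % term_uri)


def xml_qname(term_uri):
    """Form used in XML and RDF/XML, e.g. dc:title."""
    namespace, name = split_term(term_uri)
    return KNOWN_NAMESPACES[namespace] + ":" + name


def xhtml_name(term_uri):
    """Form used in XHTML meta elements, e.g. DC.title."""
    namespace, name = split_term(term_uri)
    return KNOWN_NAMESPACES[namespace].upper() + "." + name


title_uri = "http://purl.org/dc/elements/1.1/title"
copyrighted_uri = "http://purl.org/dc/terms/dateCopyrighted"
```

For the two term URIs above this gives `dc:title` / `DC.title` and `dcterms:dateCopyrighted` / `DCTERMS.dateCopyrighted`.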

100 Guidelines
- for groups of related terms, URI references are typically assigned such that they can share a namespace prefix
- all term URI references should resolve to a human and/or machine readable description of the term
- term URI references should use a registered URI scheme
- term URI references should be assigned with the intention of them being as persistent as the Internet

101 A note on namespaces
- DCMI namespace: a DCMI namespace is a collection of DCMI terms (a collection of names)
- DCMI term: a DCMI term is a DCMI element, a DCMI qualifier or a term from a DCMI-maintained controlled vocabulary
- each DCMI namespace is identified by a URI – each name in the namespace is also a URI
- DCMI namespace: a mechanism for making DCMI terms unique

102 How do I assign URIs?
- no clear recommended best practice in this area yet!
- four strategies for assigning URIs are presented here…
- …there are other strategies!

103 Using project/service URIs
- simple to do…
- …but danger of lack of persistence
- examples:
  - http://myservice.org/terms/price (XML, RDF/XML)
  - “MYSERVICE.price” (XHTML)
  - http://myproject.org/metadata/vocabs/color#Red (RDF/XML)

104 Using PURLs
- PURLs are persistent URLs (under the purl.org domain)
- used by DCMI, RSS and others to provide persistent term URI references
- examples:
  - http://purl.org/dc/elements/1.1/title (XML, RDF/XML)
  - “DC.title” (XHTML)
  - http://purl.org/rss/1.0/link (XML, RDF/XML)

105 Using xmlns.com
- domain registered explicitly for use for XML namespaces
- but… persistence policy a little unclear
- used for FOAF terms
- example: http://xmlns.com/foaf/0.1/firstName (RDF/XML)

106 Using “info” URIs
- “info” URIs specifically designed for identifying vocabulary terms
- but… not a registered scheme yet
- and there is currently some “discussion” (i.e. argument!) on various lists about whether they are a good idea
- example: info:ddc/22/eng//004.678

107 What have we learned?
- an abstract model for DC
- encoding DC in XHTML
- encoding DC in XML
- encoding DC in RDF/XML
- two practical examples
  - OAI Protocol for Metadata Harvesting
  - RSS
- how to assign identifiers to new metadata terms

108 Questions?

