Metadata: an overview
XML and Educational Metadata, SBU, London, 10 July 2001
Pete Johnston
UKOLN, University of Bath, Bath, BA2 7AY

Metadata: an overview
What is metadata?
An introduction to the Dublin Core
An introduction to XML for metadata
An introduction to RDF
RDF, XML and interoperability

What is metadata?
“Data about data”
“Data associated with objects which relieves their potential users of having to have full advance knowledge of their existence or characteristics. A user might be a program or a person.”
  –Dempsey and Heery, 1998
“Machine understandable information about web resources or other things.”
  –Berners-Lee, 1997

Resources, objects, things?
HTML documents
digital images
databases
books
museum objects
archival records
metadata records
collections
services
physical places
people
abstract “works”
concepts
events

What operations?
User wants to
  –find
  –identify
  –select
  –obtain / use
(based on IFLA, Functional Requirements for Bibliographic Records)

What operations? (2)
Owner / manager / provider wants to
  –describe
  –classify
  –link, relate
  –enable and control access and use
    –commerce
    –property rights
    –content rating
    –authenticity
    –privacy
  –manage
  –administer
  –preserve

Metadata in practice
Where is metadata created?
  –embedded in resource
  –separate entity linked to/from resource
  –remote database entry
Where is metadata used?
  –harvested/aggregated to
    –central database?
    –multiple distributed databases?
  –queried by user
  –used by software agents in service of user

Metadata for a purpose
Different “flavours” of metadata serve different purposes
Simple, generic vs. rich, specific
Automatic generation vs. human creation
Standards and specifications available...
...but need to choose appropriate standard for context

Standards for metadata
Benefit of others’ experience, expertise
Provide basis for good practice
Reflect consensus, so facilitate exchange, access, interoperability
May have support in software tools
Standards for
  –semantics
  –syntax
  –structure

Introducing the Dublin Core
Initiative to improve resource discovery on Web
  –not for complex resource description
  –simple “document-like objects”
  –extended to other classes of resource
Interdisciplinary consensus on simple element set
  –15 elements
  –all optional
  –all repeatable

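The "15 elements, all optional, all repeatable" rule can be illustrated with a minimal Python sketch. The element names are those of the Dublin Core Metadata Element Set; the `make_record` helper and the dictionary-of-lists layout are assumptions made for illustration, not part of any Dublin Core specification.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = frozenset({
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
})


def make_record(**elements):
    """Build a simple record (hypothetical helper).

    Every element is optional and may carry several values.
    """
    record = {}
    for name, values in elements.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        # Accept one string or several; always store a list of values.
        record[name] = [values] if isinstance(values, str) else list(values)
    return record


record = make_record(
    title="Metadata: an overview",
    creator="Pete Johnston",
    subject=["metadata", "Dublin Core", "RDF"],  # a repeated element
    date="2001-07-10",
)
empty = make_record()  # still valid, because no element is mandatory
```

Storing every value as a list keeps single and repeated elements uniform, which is one simple way to honour "all repeatable" without special cases.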
Introducing the Dublin Core
Provides basic semantic interoperability
  –across domains
  –across language communities
  –may disclose rich description in simple, commonly understood form
Allows for extensibility
  –but tension between extending DC and choosing other, richer schema

Introducing the Dublin Core
Simplicity of semantics, ease of use
Requires clarity about what resource is being described
  –e.g. work, expression, manifestation, item
Real resources more complex than (stable) “document-like object”?
  –characteristics of resources change through time
  –agents perform actions which produce changes

Introducing XML
Extensible Markup Language
W3C Recommendation, 1998 (second edition 2000)
Defines means of describing tree-structured data in text-based format
Subset of SGML
  –embedded markup delimits and describes data
Platform-independent syntax
Support for validation against structural model (DTD, XML Schema)

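As a small illustration of "tree-structured data in text-based format", the following sketch parses an XML string with Python's standard `xml.etree.ElementTree` module and walks the resulting tree. The element names (`record`, `creator`, and so on) are invented for this example. Note that ElementTree checks well-formedness only; it does not validate against a DTD or XML Schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the markup delimits the data and names its parts.
DOC = """<record>
  <title>Metadata: an overview</title>
  <creator>
    <name>Pete Johnston</name>
    <affiliation>UKOLN</affiliation>
  </creator>
</record>"""


def outline(element, depth=0):
    """Return (depth, tag, text) for each element, in document order."""
    text = (element.text or "").strip()
    rows = [(depth, element.tag, text)]
    for child in element:
        rows.extend(outline(child, depth + 1))
    return rows


root = ET.fromstring(DOC)
rows = outline(root)
for depth, tag, text in rows:
    # Indentation mirrors the nesting of the markup.
    print("  " * depth + tag + (": " + text if text else ""))
```

A document whose tags do not nest properly, such as `<a><b></a>`, is rejected with `ET.ParseError` before any application logic runs.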
Introducing XML (2)
Initially addressing HTML’s limitations for describing document structure
Now widely adopted syntax for transferring data between programs, systems
Standard programming interfaces
  –reusable software components
Support from major software vendors
Foundation for “Web services”
  –distributed applications invoked over Web

Introducing XML (3)
“XML allows users to add arbitrary structure to their documents but says nothing about what the structures mean.”
  –Berners-Lee, 2001

Introducing RDF
Resource Description Framework
Model & Syntax Recommendation of W3C, 1999
Generic “architecture” for metadata
  –set of conventions for applications exchanging metadata
  –allow semantics to be defined by different resource description communities
  –accommodate mixing of metadata from diverse sources

Introducing RDF (2)
Defines
  –model for making statements about resources
  –conventions for encoding statements using XML syntax
Object types
  –Resource: any object identified by URI
    –not necessarily accessible via Web
  –Property: attribute to describe resource
    –properties also uniquely identified by URI
  –Statement: triple of specific resource, named property, and value

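The statement-as-triple idea can be sketched in a few lines of Python. This is an illustrative data structure, not an RDF library: the `Statement` type, the `values` helper and the document URI are assumptions; only the Dublin Core namespace URI is real.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """One RDF statement: a resource, a named property, and a value."""
    resource: str  # URI identifying the thing described
    prop: str      # URI identifying the property
    value: str     # the property's value


DC = "http://purl.org/dc/elements/1.1/"        # Dublin Core namespace
DOC_URI = "http://example.org/docs/overview"   # hypothetical resource URI

statements = {
    Statement(DOC_URI, DC + "creator", "Pete Johnston"),
    Statement(DOC_URI, DC + "title", "Metadata: an overview"),
}


def values(stmts, resource, prop):
    """All values that a resource has for a given property, sorted."""
    return sorted(s.value for s in stmts
                  if s.resource == resource and s.prop == prop)
```

Because both the resource and the property are identified by URI, two applications that have never seen each other's data can still tell when they are talking about the same thing.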
The RDF model
[Diagram: a resource with a property “author” whose value is the literal “Pete”]
A resource has some property whose value is either (i) a simple string value (literal)...
  –The resource identified by the URI has a property “author” whose value is “Pete”
  –Or, “Pete” is the “author” of the resource identified by

The RDF model (2)
... or (ii) another resource
[Diagram: a resource with a property “author” whose value is a second resource, which has a property “name”]
  –The value of property “author” is another resource which has a property “name” with value “Pete” and a property “ ” with value

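The two cases, a literal value and a value that is itself a resource, can be contrasted in a short Python sketch. All URIs, the `Resource` and `Literal` types and the `author_names` helper are hypothetical; the property names “author” and “name” follow the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    uri: str


@dataclass(frozen=True)
class Literal:
    text: str


EX = "http://example.org/terms/"                      # hypothetical vocabulary
doc = Resource("http://example.org/docs/overview")    # hypothetical URI
person = Resource("http://example.org/people/pete")   # hypothetical URI

# Case (i): the value of "author" is a simple string.
simple = [(doc, EX + "author", Literal("Pete"))]

# Case (ii): the value of "author" is another resource with its own properties.
structured = [
    (doc, EX + "author", person),
    (person, EX + "name", Literal("Pete")),
]


def author_names(triples, document):
    """Find author names whichever of the two modelling styles was used."""
    names = []
    for s, p, o in triples:
        if s != document or p != EX + "author":
            continue
        if isinstance(o, Literal):
            names.append(o.text)
        else:
            # Follow the link to the author resource and read its "name".
            names.extend(o2.text for s2, p2, o2 in triples
                         if s2 == o and p2 == EX + "name"
                         and isinstance(o2, Literal))
    return names
```

The second style costs one extra statement but gives the author an identity of its own, so further properties can be attached to it later.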
The RDF XML syntax
XML representation of model
  –store/exchange descriptions
Property names made unique through use of XML namespaces
Variant syntaxes
<rdf:Description about=” Pete

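To show how namespaces make property names unique, here is a minimal sketch that reads one basic form of RDF/XML with Python's standard library. The sample document is written for this illustration and is not a reconstruction of the slide's example; the resource URI is hypothetical, and `literal_triples` handles only `rdf:Description` elements with literal-valued child elements, not the variant syntaxes.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Illustrative RDF/XML; the described resource's URI is hypothetical.
RDF_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/docs/overview">
    <dc:creator>Pete</dc:creator>
  </rdf:Description>
</rdf:RDF>"""


def literal_triples(rdf_xml):
    """Extract (resource, property URI, literal) triples from basic RDF/XML."""
    triples = []
    root = ET.fromstring(rdf_xml)
    for description in root.iter("{%s}Description" % RDF_NS):
        subject = description.get("{%s}about" % RDF_NS)
        for prop in description:
            # ElementTree reports tags as "{namespace}local";
            # joining the two parts gives the full property URI.
            namespace, local = prop.tag[1:].split("}")
            triples.append((subject, namespace + local,
                            (prop.text or "").strip()))
    return triples


triples = literal_triples(RDF_XML)
```

The prefix `dc:` is only a local abbreviation; it is the namespace URI behind it that makes `creator` unambiguous.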
The power of RDF
Extensible model
Supports arbitrary complexity of description
URIs as unique fixed points to identify
  –resources
  –properties
Descriptions created independently can be “merged” using URIs as “anchors”

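Merging on URIs can be sketched with plain Python sets of triples. The two "sources", all `example.org` URIs and the `creator_names` helper are invented for illustration; the point is only that a shared URI lets independently created descriptions join up.

```python
DC = "http://purl.org/dc/elements/1.1/"      # Dublin Core namespace
EX = "http://example.org/terms/"             # hypothetical vocabulary
DOC = "http://example.org/docs/overview"     # hypothetical URI
PERSON = "http://example.org/people/pete"    # hypothetical URI

# Description created by one source: what the document is and who made it.
library = {
    (DOC, DC + "title", "Metadata: an overview"),
    (DOC, DC + "creator", PERSON),
}

# Description created independently by another source: who the person is.
directory = {
    (PERSON, EX + "name", "Pete Johnston"),
    (PERSON, EX + "affiliation", "UKOLN"),
}

# Merging is a set union; the shared URI PERSON acts as the anchor.
merged = library | directory


def creator_names(graph, document):
    """Names of a document's creators, following the creator URI."""
    creators = {o for s, p, o in graph
                if s == document and p == DC + "creator"}
    return sorted(o for s, p, o in graph
                  if s in creators and p == EX + "name")
```

Neither source alone can answer "what is the name of this document's creator?"; the merged set can, with no schema negotiation between the two.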
RDF Schema
Resource Description Framework Schema
Candidate Recommendation of W3C, 2000
Provides mechanisms to define vocabularies used in RDF statements
  –e.g. Dublin Core metadata element set defined using RDF(S)

RDF Schema (2)
Defines type system
  –resources grouped into classes
  –classes related hierarchically (subClassOf)
  –properties related hierarchically (subPropertyOf)
  –use of properties constrained (domain, range)
RDF Schema employs RDF model
  –expressible using RDF/XML syntax

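The class and property hierarchies can be illustrated with a small inference sketch in Python. It covers only `subClassOf` and `subPropertyOf` (not `domain` or `range`), and it is a naive fixed-point loop rather than a real RDFS reasoner. The `example.org` classes and the `presenter` property are hypothetical; the RDF, RDFS and Dublin Core namespace URIs are real.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
SUBCLASS = RDFS + "subClassOf"
SUBPROP = RDFS + "subPropertyOf"
DC = "http://purl.org/dc/elements/1.1/"
EX = "http://example.org/terms/"             # hypothetical vocabulary
DOC = "http://example.org/docs/overview"     # hypothetical URI

graph = {
    # Schema statements (hypothetical classes and property).
    (EX + "LectureSlides", SUBCLASS, EX + "LearningResource"),
    (EX + "LearningResource", SUBCLASS, EX + "Document"),
    (EX + "presenter", SUBPROP, DC + "creator"),
    # Instance statements.
    (DOC, RDF_TYPE, EX + "LectureSlides"),
    (DOC, EX + "presenter", "Pete Johnston"),
}


def infer(graph):
    """Add statements implied by subClassOf / subPropertyOf until stable."""
    inferred = set(graph)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                # An instance of a class is an instance of its superclasses.
                if p == RDF_TYPE and p2 == SUBCLASS and s2 == o:
                    new.add((s, RDF_TYPE, o2))
                # A statement using a property also holds for its superproperty.
                if p2 == SUBPROP and s2 == p:
                    new.add((s, o2, o))
        if new <= inferred:
            return inferred
        inferred |= new


closure = infer(graph)
```

An application that knows only Dublin Core can now find a `creator` for the document, even though the original description used a more specific, locally defined property.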
RDF, XML & interoperability
Why isn’t XML enough?
  –simple statement could be expressed in XML in many different ways
  –human reader makes interpretation/guess
  –application program requires prior knowledge of schema/DTD design
  –RDF imposes extra syntactic constraints on how statement expressed
  –with RDF/XML, both human and program can interpret description consistently
Less flexibility, greater interoperability

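The first point, that one statement can be written in XML in many ways, can be made concrete. The three documents below all say that “Pete” is the author of the same (hypothetical) resource, yet each needs its own extraction code. The element and attribute names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Three plausible XML designs for the same statement (all hypothetical).
as_child = ('<document uri="http://example.org/docs/overview">'
            '<author>Pete</author></document>')
as_attribute = '<document uri="http://example.org/docs/overview" author="Pete"/>'
as_inverse = ('<author name="Pete">'
              '<document uri="http://example.org/docs/overview"/></author>')


def author_from_child(xml):
    """Works only if the author is a child element of the document."""
    return ET.fromstring(xml).findtext("author")


def author_from_attribute(xml):
    """Works only if the author is an attribute of the document."""
    return ET.fromstring(xml).get("author")


def author_from_inverse(xml):
    """Works only if the document is nested inside the author."""
    return ET.fromstring(xml).get("name")
```

Applying the wrong extractor, for example `author_from_child(as_attribute)`, returns `None`: without prior knowledge of the design, a program cannot recover the statement, which is the gap RDF's fixed resource-property-value pattern is meant to close.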
RDF, XML & interoperability
Use XML for exchange when
  –applications both “know” semantics conveyed by structure of (meta)data
Use RDF/XML for exchange when
  –(meta)data potentially used by applications without prior “knowledge” of specific schema
  –(meta)data incorporates overlapping structures from different domains

The Semantic Web
Project of W3C
  –Present: info on Web for human reader, navigated by simple link
  –Future: data processed by programs designed independently of data
Requires machine-readable statements about resources and their relationships
  –using common model
  –using vocabulary terms tied to unique definitions
  –definitions available to programs

The Semantic Web (2)
Vision
  –software agents navigating web of descriptions and “ontologies” (including unknown vocabularies)
  –making inferences about data collected
  –communicating via partial understanding
But...
  –A vision (only?)
  –Mistrust of the “hype”?
  –XML (Schema) vs. RDF (Schema)?
  –Doubts about RDF from KR community?

Conclusions
Meaningful discussion of interoperability requires scope, context
Syntactic interoperability - XML
Structural interoperability - RDF
Semantic interoperability
  –adoption of standard schema
  –terminological control
  –access to RDFS representation of schema

Acknowledgements / further reading
UKOLN metadata pages
Dublin Core Metadata Initiative
IFLA, Functional Requirements for Bibliographic Records
W3C RDF
W3C Semantic Web