Using Dublin Core in Museums: Introduction to the CIMI Guide to Best Practice: Dublin Core. Dr. Paul Miller, UK Office for Library & Information Networking; Thomas Hofmann, Australian Museums On-Line.
Overview: What is metadata? Introducing the Dublin Core. Introducing the CIMI testbed project. CIMI DC Guidelines (Guide to Best Practice: Dublin Core). Break. Practical session. Implementation. Discussion.
What is Metadata? Meaningless jargon, or a fashionable term for what we've always done, or a means of turning data into information; and "data about data"; and the name of a film director (Luc Besson); and the title of a book (Lord of the Flies).
What is Metadata? Metadata exists for almost anything: people, places, periods, objects, concepts. The trick lies in making descriptions suitably generic to be meaningful to the majority, whilst suitably controlled to aid location.
What is Metadata? Metadata fulfils three main functions: description of resource content (What is it?); description of resource form (How is it constructed?); description of issues behind resource use (Can I afford it?).
What is Metadata? Libraries: MARC, AACR2. A resource description community is characterised by agreed semantic, structural and syntactic conventions for exchange of descriptive information. (Based on a slide by Stu Weibel)
What is Metadata? Scientific Databases, Museums, GeoLibraries, Internet Commons, Home Pages, Commerce, Whatever... (Based on a slide by Stu Weibel)
What is Metadata? Many structures have evolved at different levels, and to meet different requirements...
What is Metadata? Semantic Interoperability ("Let's talk English"; standardisation of content): cat sat on mat drank milk. Structural Interoperability ("Here's how to make a sentence"; standardisation of form): Cat sat on mat. Drank milk. Syntactic Interoperability ("These are the rules of grammar"; standardisation of expression): The cat sat on the mat. It drank some milk.
Approaches to Metadata (I): Search Engines. Easy to build; cheap; cover large areas of the Internet. But pretty stupid, really: minimal contextualisation of data. Is "Miller" the person who made this, the person whom it is about, the profession of a differently named individual, or something else entirely? E.g. Alta Vista, Lycos, MetaCrawler, HotBot, Excite, InfoSeek, LookSmart, UK Max...
Approaches to Metadata (II): Specialist Resource Description. Extremely detailed; accurate finding aids; comprehensive. But expensive; domain specific; only likely to be worthwhile for valuable resources. E.g. MARC, FGDC CSDGM, EAD, SPECTRUM...
Approaches to Metadata (III): Resource Discovery. Relatively easy to build; relatively cheap? Contextualises information; enables semantic mapping across community boundaries. But insufficient to meet specialist requirements within a community? E.g. Dublin Core...
Challenges: Many flavours of metadata: which one do I use? Managing change: new varieties, and evolution of existing forms. Tension between functionality and simplicity, extensibility and interoperability (functions, features, and cool stuff versus simplicity and interoperability). Opportunities.
Introducing the Dublin Core: An attempt to improve resource discovery on the Web, now adopted more broadly. Building an interdisciplinary consensus about a core element set for resource discovery: simple and intuitive; cross-disciplinary; international; flexible.
Introducing the Dublin Core: 15 elements of descriptive metadata. All elements optional. All elements repeatable. The whole is extensible, offering a starting point for semantically richer descriptions. Interdisciplinary: libraries, museums, government, education... International: available in 20 languages, with more on the way.
Introducing the Dublin Core: Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage, Rights.
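As a minimal sketch of the element set above (not part of the original slides), the following Python models a Dublin Core record in which every element is optional and repeatable. The class name `DCRecord` and its methods are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only: DCRecord and its methods are hypothetical names.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


class DCRecord:
    """A simple Dublin Core record: every element is optional and repeatable."""

    def __init__(self):
        # Maps element name -> list of values, in the order they were added.
        self._values = {}

    def add(self, element, value):
        """Append a value to an element, rejecting names outside the 15 elements."""
        name = element.lower()
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {element!r}")
        self._values.setdefault(name, []).append(value)

    def get(self, element):
        """Return a copy of the element's values (empty if the element is unused)."""
        return list(self._values.get(element.lower(), []))


record = DCRecord()
record.add("Title", "Untitled")
record.add("Creator", "Miller, Paul")      # repeated element, first value
record.add("Creator", "Hofmann, Thomas")   # repeated element, second value
```

An unused element simply returns an empty list, which reflects the "all elements optional" rule without any special casing.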
Extending DC (semantic refinement): Creator: First Name, Surname, Contact Info, Affiliation. Improve descriptive precision by adding sub-structure (subelements and schemes). Greater precision = lesser interoperability. Should dumb down gracefully. Element qualifier; value qualifier. (Based on a slide by Stu Weibel)
Extending DC (a modular approach): Modular extensibility... additional elements to support local needs; complementary packages of metadata... but only if we get the building blocks right. Description; Archival Management; Terms & Conditions. (Based on a slide by Stu Weibel)
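One way to picture "dumbing down gracefully" is to drop the qualifier and keep the value, so that a consumer who only understands the 15 simple elements can still use the record. The sketch below assumes qualified statements are held as (element, qualifier, value) triples; that representation and the qualifier names are hypothetical, not taken from the slides.

```python
# Illustrative sketch only: the triple layout and qualifier names are assumptions.
def dumb_down(qualified_statements):
    """Reduce (element, qualifier, value) triples to simple (element, value) pairs.

    The qualifier is discarded, so the value survives under its base element.
    """
    return [(element, value) for element, _qualifier, value in qualified_statements]


qualified = [
    ("creator", "surname", "Miller"),
    ("creator", "affiliation", "UK Office for Library & Information Networking"),
    ("title", None, "Using Dublin Core in Museums"),
]
simple = dumb_down(qualified)
```

The trade-off named on the slide is visible here: after dumbing down, the two creator values can no longer be told apart as a surname and an affiliation.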
Extending DC? DC offers a semantic framework; through use of further substructure, meaning can often be clarified. Paul: Paul Inc.? Paul xyz? xyz Paul? Paul; Paul Inc.; Paul xyz; xyz Paul.
Extending DC? DC offers a semantic framework. Use of domain-specific schemes greatly increases precision. Washington: Washington State? Washington DC? Washington monument? Washington; Washington State; Washington DC; Washington monument; North and Central America, United States, Washington.
Dublin Core in 1999: Formalise the process (TAC, PAC, Directorate and Working Groups). Refinement of definitions (DC 1.1). Qualification semantics (consensus and streamlining). RDF. Common understandings (INDECS/DOI, IMS, ?).
Formalisation: Dublin Core Web Site (purl.org/dc/); Dublin Core Directorate; DC Policy Advisory Committee; DC Technical Advisory Committee; Working Groups; Stakeholder Communities; DC-General; Dublin Core Mail Server. (Based on a slide by Stu Weibel)
Introducing the CIMI testbed project (I): Testbed Phase I. Goals: evaluate feasibility of DC for the museum community; identify and resolve operational, technical and intellectual issues; promote international consensus on DC practices in the museum community. Milestones: involvement of over 18 participants (software vendors, museums, consultants, cultural heritage gateways); a repository of over 300,000 records (museums, collections, artefacts) using DC Simple, both created from scratch and exported from legacy systems; Guide to Best Practice: Dublin Core. Outcomes: DC is easy to use; DC Simple is a machete, not a scalpel; all elements depend on Resource Type; DC can be applied to both physical and electronic resources; further user evaluation necessary.
Introducing the CIMI testbed project (II): Testbed Phase II. Goals: finalisation and publication of Guide to Best Practice: Dublin Core; identification of proposed qualified elements (sub-structure); examination of RDF; initial effort in mapping DC elements to CIMI Access Points; user evaluation. Milestones: There are four meetings scheduled for [...]. Please see [...] for updates on testbed phase II. Outcomes: The schedule for 1999 sees the following deadlines: Guide publication (April); DC recommendation (December); DC to CIMI Access Points mapping (November); RDF examination (July); choreographed demonstration/user evaluation (October); final report and recommendations (December). For updates please see [...].
Dublin Core and the museum community. Challenges for museums: emphasis on attributes of the physical object (artefact); need to associate the physical object with persons, places and events; need to account for collections; need to account for surrogates such as photographs; historical lack of content standards. Assumptions regarding DC: DC is useful to describe artefacts and associated information resources in the museum community; DC is simple to use and learn; adequate technical infrastructure exists to support use of resource discovery.
The 1:1 Principle: What does it mean? Definition: only one object, resource or instantiation is described within one single record. Interpretation: an object and its surrogate each get their own DC record. Conclusion: makes describing originals and surrogates easier.
DC record relationships: collection (DC record); story (DC record); artefact (DC record); slide (DC record); institution (DC record); artefact (DC record).
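The 1:1 principle can be sketched with two separate records, one for an artefact and one for a slide of it, linked through RELATION. All identifiers and values below are invented for the example, and the plain-dictionary layout and the helper `surrogates_of` are assumptions made for illustration.

```python
# Illustrative sketch only: identifiers and values are invented for the example.
artefact = {
    "type": ["physical object", "original", "item", "cultural"],
    "title": ["Untitled"],
    "identifier": ["ACC-1999-001"],
}

slide = {
    "type": ["image", "surrogate", "item", "cultural"],
    "title": ["Untitled (35mm slide)"],
    "identifier": ["SLIDE-0042"],
    # The surrogate points back to the original through RELATION.
    "relation": ["ACC-1999-001"],
}


def surrogates_of(original, records):
    """Return the records whose RELATION values include the original's identifier."""
    ids = set(original.get("identifier", []))
    return [r for r in records if ids & set(r.get("relation", []))]
```

Keeping the two descriptions apart means the slide's FORMAT or RIGHTS values never get mixed into the record for the artefact itself.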
Reality Check: Criteria for DC creation. Ask yourself: Is the record itself (and each element within that record) useful for resource discovery? Is the value of the element known with certainty? Is it readily available from existing databases or information sources? If not, leave it out. Have you selected values from enumerated lists recommended to assist in cross-domain searching? If not, interoperability is degraded and records are harder to maintain.
About the Guide to Best Practice: Dublin Core. Basis for the Guide: based on Dublin Core 1.0 (RFC 2413); recommendations based on testbed experience, not large-scale production efforts; syntax used in examples and testbed based on XML. Document structure: the 15 DC Simple elements, starting with TYPE to assist in following the 1:1 rule (original vs. surrogate). Each element is introduced with the standard DC Definition (RFC 2413), explained with a CIMI Interpretation, manifested with a CIMI Guideline, and illustrated through Examples. Appendices contain sample records for different types of museum describing a variety of resource types.
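Since the Guide's examples use an XML syntax, a record held as a dictionary could be serialised along the following lines. This is a sketch under stated assumptions: the root element name `dc-record` and the flat child layout are invented here and are not the DTD used in the CIMI testbed.

```python
# Illustrative sketch only: the element layout is an assumption,
# not the DTD used in the CIMI testbed.
import xml.etree.ElementTree as ET


def to_xml(record):
    """Serialise a dict of element name -> list of values as a flat XML record."""
    root = ET.Element("dc-record")
    for element, values in record.items():
        for value in values:
            # One child per value, so repeated elements stay separate.
            child = ET.SubElement(root, element)
            child.text = value
    return ET.tostring(root, encoding="unicode")


xml_text = to_xml({
    "type": ["image", "surrogate"],
    "title": ["Untitled"],
})
```

Writing TYPE first mirrors the Guide's document structure, although element order carries no meaning in Dublin Core.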
Element: Type. Interpretation: The nature of the resource, including such aspects as originality, aggregation and manifestation. Guideline: Helps to decide the values of other elements. To aid in searching across collections and across different disciplines among museums, specify TYPE from: 1. the list of controlled values maintained by the DC community (text, image, sound, dataset, software, interactive, physical object, event); and from the following list of museum-related values: 2. original or surrogate; 3. item or collection; 4. natural or cultural. List the values in the order above for consistency (note: element order is irrelevant in Dublin Core).
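The four-part TYPE guideline can be sketched as a small validation helper that checks each value against its list and returns them in the recommended order. The function name `build_type_values` is hypothetical; the value lists are those given on the slide.

```python
# Illustrative sketch only: build_type_values is a hypothetical helper.
DC_TYPES = ("text", "image", "sound", "dataset", "software",
            "interactive", "physical object", "event")
ORIGINALITY = ("original", "surrogate")
AGGREGATION = ("item", "collection")
NATURE = ("natural", "cultural")


def build_type_values(dc_type, originality, aggregation, nature):
    """Validate the four TYPE values and return them in the recommended order."""
    choices = (
        (dc_type, DC_TYPES),
        (originality, ORIGINALITY),
        (aggregation, AGGREGATION),
        (nature, NATURE),
    )
    for value, allowed in choices:
        if value not in allowed:
            raise ValueError(f"{value!r} is not one of {allowed}")
    return [dc_type, originality, aggregation, nature]
```

Each returned value would populate one repetition of the TYPE element.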
Element: Format. Interpretation: The properties of the resource that impose the use of tools for access, display, or operation; not the tools themselves. Do not use FORMAT if no tools are required. Guideline: Populate the element with MIME type information for digital resources, and with details of technique, material and media for analogue resources. Don't use it to describe limitations to access or restrictions against usage (use RIGHTS) or dimensions (use DESCRIPTION).
Element: Title. Interpretation: Name(s) given to the resource, regardless of whose they are, so long as they are useful for resource discovery. Guideline: Repeat the TITLE element as required. Untitled works of fine art: use whatever value you would use on the wall label copy, exhibition catalog, or other promotional material, i.e., if the work is known as "Untitled", specify this in TITLE. Cultural items and collections with no known title or name: use a term or phrase that is sufficiently descriptive to permit a user to judge relevancy. If your existing database does not contain title information, concatenate other descriptive field values as appropriate to name the resource. Natural specimens: use the Latin binomial name of the animal, plant, or mineral; TITLE should contain the name that is given to the object in hand by the person identified in CREATOR. These two elements plus DC.DATE thus give a full citation for the specimen, and allow for the possibility that the same specimen can have different names allocated by the same, or different, taxonomists at different times.
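The fallback for records without title information can be sketched as follows. The field names (`object_name`, `material`, `place`) and the comma separator are hypothetical stand-ins for whatever a legacy collections database actually holds.

```python
# Illustrative sketch only: the legacy field names are hypothetical.
def derive_title(row, fallback_fields=("object_name", "material", "place")):
    """Return the existing title, or a name concatenated from other fields."""
    title = (row.get("title") or "").strip()
    if title:
        return title
    # Keep only fallback fields that are present and non-blank.
    parts = [str(row.get(field) or "").strip() for field in fallback_fields]
    return ", ".join(part for part in parts if part)
```

For example, a row holding only an object name, a material and a place would yield a title built from those three values, in that order.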
Element: Description. Interpretation: A textual, narrative description of the resource, including abstracts for documents or content characterizations in the case of visual resources. Guideline: Use this element whenever possible, as it is a rich source of indexable vocabulary. Emphasize the contextual information and popular associations (people, places, and events) of the resource. If a single description field does not exist in your current database, values from other fields or wall label copy, exhibition catalogs, didactic copy, etc. may be concatenated to populate DESCRIPTION. Because DESCRIPTION is likely to be displayed with the resource in the search result set, we recommend brevity, but not so as to sacrifice richness.
Element: Subject. Interpretation: Keywords about the theme and/or concept of the resource, as well as terms signifying significant associations of people, places, and events or other contextual information. Guideline: Do not strictly interpret the element name "Subject", which tends to lock our thinking into formal subject terms such as those used in bibliographic metadata. "Keywords" is a more appropriate interpretation of the kind of values that are useful for this element: index terms, or descriptors, rather than specific-to-broad categorizations of intellectual content.
Element: Creator. Interpretation: The person(s) or organization that conceived or initiated the resource. For example, author of a written document; artist, photographer, or illustrator of a visual resource; or founder of an institution. For natural specimens, CREATOR specifies the determiner: the person who created the name that is present in the TITLE element.
Element: Contributor Interpretation: A person or organization not specified in a Creator element because their contributions to the resource are less direct or conceptual (for example, editor or translator). Also used for patrons, benefactors, and sponsors. For natural specimens, the collector and preparator are example values.
Element: Publisher Interpretation: The person(s) or organizations responsible for making the resource available or for presenting it, such as a repository, an archive, or a museum. Also includes major financial supporters and legislative entities without whose support the resource would not be continuously available, such as a municipal historical council or a board of trustees. (Note: benefactors of the actual resources are listed under CONTRIBUTOR.) In addition, list distributors and other important agents of delivery in PUBLISHER.
Element: Date. Interpretation: The date associated with the creation or availability of the resource. This is not necessarily the same as the date in the Coverage element, which refers to the date or period of the resource's intellectual content. For natural specimens, the value should be the date that the name in TITLE was given by CREATOR. Guideline: Repeat DATE to express both the circa value and the range it represents, according to your organization's policy. Repeat DATE to express both the time period during which the resource was brought into being and the specific date when it was [thought to be] first cataloged or collected.
Element: Identifier. Interpretation: A text and/or number string used to effectively identify the resource. Guideline: Use URLs, or URNs, or DOIs (when implemented) for internet resources. For realia, use widely recognized means of identifying items and collections such as accession numbers, International Standard Book Numbers (ISBN), catalogue raisonné numbers, and Köchel numbers.
Element: Source. Interpretation: Information about a resource from which the present resource is directly derived. Guideline: SOURCE is distinguished from a RELATION value of IsBasedOn by degree or strength of the connection. The CIMI testbed group used SOURCE as a kludge element pending clarification of the IsBasedOn definition by the DC Directorate.
Element: Relation Interpretation: Used to describe significant points in the hierarchy of surrogacy, including the immediate parent and the original item. Recommended values are CREATOR, TITLE, IDENTIFIER and any and all progenitors/children including (repeating) SOURCE value(s).
Element: Language. Interpretation: The language of the intellectual content of the resource, not the language of the DC record nor necessarily the language of the TITLE value. Intellectual content may be represented as text or as vocal sound. CIMI's interpretation of this element reflects a potential application of scheme in DC Qualified. Guideline: Re-use terms from the list of language abbreviations defined in RFC 1766 at ftp://ds.internic.net/rfc/rfc1766.txt. If the language is not included in that reference, spell it out completely. Use repeated elements to express multiple values. LANGUAGE is not applicable to natural objects or those lacking words.
Element: Coverage. Interpretation: Requires no interpretation. Guideline: Repeat DC.COVERAGE values as appropriate in DC.SUBJECT, e.g., "colonial America" or "Baroque dance" as an intellectual access point or keyword. Temporal characteristics: Recommended best practice for dates is defined in a profile of ISO 8601 [Date and Time Formats (based on ISO 8601), W3C Technical Note, http://www.w3.org/TR/NOTE-datetime], which specifies the format YYYY-MM-DD. If the full date is unknown, month and year (YYYY-MM) or just year (YYYY) may be used. Repeat DC.COVERAGE to express both the time period during which the resource was brought into being and the specific date when it was [thought to be] first cataloged or collected. Spatial characteristics: Where possible, use the Getty Thesaurus of Geographic Names, specifying at a sufficient granularity to unambiguously identify the location. Concatenate place names as one string of values separated by semicolons. Start with the broadest term and work down to the narrowest. Do not use latitude and longitude unless your audience is accustomed to associating resources to places in this manner (e.g., maritime items or events).
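The two COVERAGE conventions above can be sketched in a few lines: a check for the YYYY, YYYY-MM and YYYY-MM-DD date forms, and a helper that joins place names from broadest to narrowest with semicolons. The function names are hypothetical, and the check covers only these three date forms, not the full W3C profile (which also allows times).

```python
# Illustrative sketch only: function names are hypothetical, and only the
# three date forms named in the guideline are handled.
import re
from datetime import date

_DATE_FORM = re.compile(r"^(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?$")


def is_coverage_date(value):
    """Return True for a valid YYYY, YYYY-MM or YYYY-MM-DD string."""
    match = _DATE_FORM.match(value)
    if not match:
        return False
    year, month, day = match.groups()
    try:
        # Missing month or day defaults to 1, so only supplied parts are checked.
        date(int(year), int(month or 1), int(day or 1))
    except ValueError:
        return False
    return True


def spatial_coverage(places):
    """Join place names, given broadest first, into one semicolon-separated string."""
    return "; ".join(place.strip() for place in places)


place_string = spatial_coverage(
    ["North and Central America", "United States", "Washington"]
)
```

Here `place_string` holds the three names in broad-to-narrow order, separated by semicolons, as the guideline recommends.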
Element: Rights. Interpretation: A rights management or a usage statement, an identifier that links to a rights management or usage statement, or an identifier that links to a service providing information about rights management for and/or usage of the resource. A statement concerning accessibility, reproduction constraints, copyright holder, and/or inclusion of credit lines. Absence of RIGHTS in a record does not imply that the resource is not protected. Guideline: Use a pointer to Terms and Conditions or copyright statements for Internet resources. Ensure proper agreement between the RIGHTS value and the resource in hand: do not, for example, link reproduction notices for digital surrogates to analog objects.