Presentation on theme: "Dublin Core for Museums Day 1"— Presentation transcript:
1 Dublin Core for Museums Day 1
- John Perkins, CIMI
- Paul Miller, UK Office for Library & Information Networking
- Thomas Hofmann, Australian Museums On-Line
2 Overview for Thursday March 25
- Introduction to Metadata
- Introducing the Dublin Core
- CIMI DC Guidelines - Dublin Core for Museums
- Break
- DC for museums continued...
- Lunch
- Practicalities of Implementing DC
- Introduction to MICI
3 What’s the Problem?
- Need to serve a Web audience
  - Demand for content
  - Uncertain quality
  - Expectations for rapid easy access
- Need to be visible on the Web
  - Two million web sites
  - Half a billion addressable pages
- Many communities with the same problem
4 What’s the Problem?
- Manage and organise interconnected data
  - Different types
  - Different repositories
  - Packages
- Interoperate with other communities
- Interoperate with other applications
- Need a way to:
  - Express meanings in rich and complex data
  - Express the structure of our data
  - Encode the transfer of data
5 What’s the Solution?
- Communities address their own needs
- Do so in a way that works across communities
  - Standards based
  - Collaborative
6 What is a Community?
- A resource description community is characterised by agreed semantic, structural and syntactic conventions for exchange of descriptive information
- Libraries: MARC, AACR2
- Museums: SPECTRUM, MICI
Based on a slide by Stu Weibel
7 Communities working together
[Diagram labels: Home Pages, Museums, Geo, Libraries, ‘Internet Commons’, Commerce, Scientific Databases, Whatever...]
Based on a slide by Stu Weibel
8 Communities working together
[Diagram labels: Museums, Metadata]
Based on a slide by Stu Weibel
9 What is Metadata?
- Meaningless jargon
- or a fashionable term for what we’ve always done
- or “a means of turning data into information”
- and “data about data”
- and the name of a film director (‘Luc Besson’)
- and the title of a book (‘The Lord of the Flies’).
10 What is Metadata?
- Metadata exists for almost anything
  - People
  - Places
  - Objects
  - Concepts
  - Databases
  - Web pages
11 What is Metadata?
- Metadata fulfils three main functions:
  - description of resource content: “What is it?”
  - description of resource form: “How is it constructed?”
  - description of issues behind resource use: “Can I afford it?”
12 What is Metadata?
- Many structures have evolved at different levels, and to meet different requirements...
[Diagram label: MICI]
13 For human communication we need...
- Semantic Interoperability
  - Standardisation of content
  - “Let’s talk English”
  - “cat milk sat drank mat”
- Structural Interoperability
  - Standardisation of form
  - “Here’s how to make a sentence”
  - “Cat sat on mat. Drank milk.”
- Syntactic Interoperability
  - Standardisation of expression
  - “These are the rules of grammar”
  - “The cat sat on the mat. It drank some milk.”
14 Many flavours of metadata
- Challenges, opportunities
- Many flavours of metadata
  - which one do I use?
- Managing change
  - new varieties, and evolution of existing forms
- Tension between functionality and simplicity, extensibility and interoperability
  - Functions, features, and cool stuff
  - Simplicity and interoperability
15 Introducing the Dublin Core
- An attempt to improve resource discovery on the Web
  - now adopted more broadly
- Building an interdisciplinary consensus about a core element set for resource discovery
  - simple and intuitive
  - cross–disciplinary
  - international
  - flexible.
16 Introducing the Dublin Core
- 15 elements of descriptive metadata
- All elements optional
- All elements repeatable
- The whole is extensible
  - offering a starting point for semantically richer descriptions
- Interdisciplinary
  - libraries, museums, government, education...
- International
  - available in 20 languages, with more on the way.
17 Introducing the Dublin Core
- Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage, Rights
18 Extending DC (semantic refinement)
- Improve descriptive precision by adding sub–structure (subelements and schemes)
  - Element qualifier
  - Value qualifier
- Greater precision = lesser interoperability
- Should ‘dumb down’ gracefully
[Diagram labels: Creator; First Name, Surname, Affiliation, Contact Info]
Based on a slide by Stu Weibel
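The ‘dumb down’ idea can be sketched in a few lines of Python. This is only an illustration: the dotted "Element.Qualifier" naming, the function name dumb_down and the sample values are assumptions made for the sketch, not something the slides specify. A consumer that does not understand a qualifier keeps the value under the plain element and drops anything that is not one of the 15 elements.

```python
# The 15 Dublin Core elements listed on the earlier slide.
DC_ELEMENTS = {
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
}


def dumb_down(qualified_pairs):
    """Collapse (name, value) pairs to simple, unqualified Dublin Core.

    Names are assumed to look like "Element" or "Element.Qualifier"
    (a hypothetical convention chosen for this sketch).
    """
    simple = {}
    for name, value in qualified_pairs:
        element = name.split(".", 1)[0]  # strip the qualifier, if any
        if element not in DC_ELEMENTS:
            continue  # local extension elements are ignored
        simple.setdefault(element, []).append(value)  # elements are repeatable
    return simple


# Hypothetical qualified description using the Creator sub-structure.
qualified = [
    ("Creator.Surname", "Example"),
    ("Creator.Affiliation", "Example Museum"),
    ("Title", "An example object"),
    ("LocalNote", "not part of Dublin Core"),
]
simple_record = dumb_down(qualified)
print(simple_record)
```

The precision carried by the qualifiers is lost, but the values stay usable by any system that only knows the 15 elements.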
19 Extending DC (a modular approach)
- Modular extensibility...
  - additional elements to support local needs
  - complementary packages of metadata
- …but only if we get the building blocks right
[Diagram labels: Terms & Conditions, Description, Archival Management]
Based on a slide by Stu Weibel
20 Extending DC?
- DC offers a semantic framework
- Through use of further substructure, meaning can often be clarified
- <Creator> “John”
  - John Inc.? John xyz? xyz John?
- <Creator> <fore name> “John”
[Diagram labels: John Inc., John xyz, xyz John]
21 Extending DC?
- DC offers a semantic framework
- Use of domain–specific schemes greatly increases precision
- <Coverage> “Washington”
  - Washington State? Washington DC? Washington monument?
- <Coverage> <TGN> “Washington”
  - “North and Central America, United States, Washington”
[Diagram labels: Washington State, Washington DC, Washington monument]
22 Dublin Core in the physical world
- Dublin Core originally designed with electronic resources in mind
- Physical resources are fundamentally different
- Issues of surrogacy become more important
- Genre, Type, and Format models vary greatly
- Difficult to remember what is being described, and which characteristics of the resource and its surrogates are ‘correct’.
23 Introducing Physical Objects
- Aspects of the real world are key to much of what museums do
- Physical objects have dimensions
  - 23 x 46 cm
  - 12 x 52 x 18 in
  - 18.6 cm³
  - 823 pages
- Physical objects have a form
  - oil on canvas
  - Tadcaster limestone
  - stainless steel.
24 Introducing Physical Objects
- Physical objects change over time
  - constructed between AD 524 and 873
  - repaired in AD 1270
  - incorporated into ornamental arch in AD 1320
- Physical objects move
  - cast in Beijing
  - used in Shanghai
  - taken to Hong Kong
  - on display in Macau.
25 Introducing Physical Objects
- Physical objects are associated with people
  - written by William Shakespeare
  - acquired by Lord Elgin
  - decreed by the Emperor Hadrian
  - associated with Prince Charles Edward Stuart
- Physical objects are contextualised
  - fired at the Battle of Trafalgar
  - carried on Apollo 11 from the moon
  - printed on the first printing press
  - salvaged from the Titanic.
26 Introducing Collections
- Museum objects, whether original or surrogate, are normally part of a collection
- Collections may be ‘real’...
  - the Sutton Hoo hoard
  - the Terracotta Warriors
- ...an aspect of the process by which objects enter the museum...
  - the Burrell Collection
  - Solomon Guggenheim’s art collection
- …or simply practical
  - coins at the British Museum
  - the Tate Gallery’s collection of works by Da Vinci.
27 Introducing Surrogacy
- Many of the resources we describe are, in reality, surrogates for something else
  - a photograph of King Tutankhamen’s death mask
  - a photograph of a statue of George Washington
  - a film of President Kennedy’s assassination
  - a sound recording of Neil Armstrong’s “One small step for man…” speech on the moon
  - a copy of the Mona Lisa
  - a model of the Great Wall of China
  - a reproduction of the Terracotta warriors.
28 Issues of Surrogacy
- Many of the resources we describe are, in reality, surrogates for something else
  - we need to be clear whether we are describing the resource or its surrogate
  - the sculptor of a statue is often not the person who made its photographic surrogate
  - the model of the Forbidden City is unlikely to have been created at the same date as the Forbidden City itself
  - the format of a computer image of the Mona Lisa (image/jpeg?) is not the same as the format of the original painting (oil on canvas?).
29 Other Museum Issues
- Museums need to describe real objects and surrogates in a similar manner
  - guidelines/standards therefore need to encompass both, despite their differences
- Resource descriptions will often be drawn from existing collection management systems in the first instance, rather than created afresh
  - guidelines therefore need to respect existing practices within established systems
- There is often no ‘right’ answer
  - so practices need to allow for approximate dates, multiple possible creators, etc.
30 Introducing the 1:1 Principle
- The broader Dublin Core community is tackling some of the problems relevant to museums
- Their work on the ‘1:1 Principle’ is especially useful in resolving museum issues over original versus surrogate and item versus collection:
  - each Dublin Core ‘record’ should describe only one resource, whether surrogate or original. Associated resources should be linked together by means of the Relation element in Dublin Core.
31 Introducing the 1:1 Principle
- In a record describing a photo of the Mona Lisa on a web page, for example…
  - Leonardo da Vinci is not the creator of the image
  - The image was not created during the Renaissance
  - …but you might include these as Subject terms, and you could usefully provide a link to the record describing the real painting via Dublin Core’s Relation element
- Equally, in describing the painting itself…
  - … is not the Identifier of the painting
  - but you might link to this image via Relation, just to show people what the painting looks like.
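A minimal sketch of the 1:1 Principle, assuming records are held as Python dictionaries of repeatable elements. The identifiers (urn:example:...), the Type values, the photographer value and the helper related_records are hypothetical, invented only for this illustration. The painting and its photographic surrogate each get their own record, and the two are tied together through Relation.

```python
# One record per resource: the original painting...
painting = {
    "Identifier": ["urn:example:painting"],        # hypothetical identifier
    "Title": ["Mona Lisa"],
    "Creator": ["Leonardo da Vinci"],
    "Type": ["physical object"],                   # illustrative value only
    "Format": ["oil on canvas"],
    "Relation": ["urn:example:image"],             # points at the surrogate
}

# ...and, separately, the web image that is a surrogate for it.
image = {
    "Identifier": ["urn:example:image"],           # hypothetical identifier
    "Title": ["Photograph of the Mona Lisa"],
    "Creator": ["(whoever made the photograph)"],  # not Leonardo da Vinci
    "Subject": ["Leonardo da Vinci", "Renaissance"],
    "Type": ["image"],                             # illustrative value only
    "Format": ["image/jpeg"],
    "Relation": ["urn:example:painting"],          # points at the original
}

# Index records by their first Identifier value.
records = {r["Identifier"][0]: r for r in (painting, image)}


def related_records(record, index):
    """Follow Relation values that resolve to records in the index."""
    return [index[i] for i in record.get("Relation", []) if i in index]


for other in related_records(image, records):
    print(other["Title"][0], "-", other["Format"][0])
```

Keeping the two descriptions apart means Creator, Date and Format never have to hold values for two different things at once.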
32 The primacy of ‘Type’
- In describing museum objects, it is often most useful to first decide what you are describing and why, rather than beginning with ‘who made it’ and ‘what is it called’, as is often the case with books
  - if you know you’re describing a surrogate of the Mona Lisa, then you know Leonardo da Vinci is not the Creator; whoever made the surrogate is
  - if you know you’re describing a collection of 20th century paintings, then you know that Picasso, Hockney et al are not the Creators; the collector is.
33 The primacy of ‘Type’
- if you know you’re describing the Sutton Hoo helmet, then the fact that it was added to a particular museum collection in 1939 perhaps doesn’t matter; that information is better placed in the collection record
- if you know you’re describing a natural specimen, then perhaps it has no Creator; there may be a ‘creator’ associated with its identification or collection, though.
34 Dublin Core for Museums: Assumptions
- In applying Dublin Core to museums, we are making certain basic assumptions, many of which were tested by CIMI
  - DC is appropriate for use in describing both physical and digital resources
  - DC is easy to learn and simple to use
  - Information can be meaningfully and efficiently extracted from existing museum systems in order to populate DC records
  - the creation of a DC record to describe a museum object is cost–effective, and aids the discovery of resources more than simply allowing access to the underlying Collection Management system might.
35 Practicalities of Implementing Dublin Core
- Paul Miller, UK Office for Library & Information Networking
- Thomas Hofmann, Australian Museums On-Line
36 Overview
- Creation and Maintenance
- Harvesting and Distribution
- Retrieval
- Implementation Models
- Case Study
37 Dublin Core - Refresher
- 15 simple elements
- Focus on Resource Discovery, not Resource Description
- One Dublin Core record per resource
- Interoperable across communities
- Can be easily populated from existing databases
- Can be formatted in XML/RDF or HTML
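As an illustration of the HTML option, the sketch below renders a record as META tags using the widely used "DC.<Element>" naming for the name attribute. The function name to_meta_tags and the sample record are assumptions made for this example; a real deployment would follow whatever encoding guidelines its community has agreed.

```python
from html import escape


def to_meta_tags(record):
    """Render a dict of repeatable DC elements as HTML META tags."""
    lines = []
    for element, values in record.items():
        for value in values:  # every element may repeat
            lines.append(
                '<meta name="DC.%s" content="%s">'
                % (escape(element, quote=True), escape(value, quote=True))
            )
    return "\n".join(lines)


# Hypothetical record, e.g. exported from an existing database.
sample_record = {
    "Title": ["Mona Lisa"],
    "Creator": ["Leonardo da Vinci"],
    "Format": ["oil on canvas"],
    "Subject": ["portraits", "art & history"],
}
meta_html = to_meta_tags(sample_record)
print(meta_html)
```

Escaping the values matters because collection data routinely contains quotes and ampersands that would otherwise break the page.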
38 When should I use Dublin Core?
- You have a rich standard and need a simpler one
- You want to disclose your data to other communities using commonly understood semantics
- You want to provide unified access to databases with different underlying schemas
- You need core description semantics and don’t feel compelled to invent them anew
39 Considerations
- Creation and Maintenance: tools, educate
- Harvesting/Distribution: tools
- Retrieval: tools, consensus, interface design
49 Example Tool: DC Dot
[Screenshots of DC Dot output]
50 Example Tool: Reggie
- http://metadata.net
- Generic creation tool for any metadata schema published to metadata.net
- Currently supports: Dublin Core in 5 languages
- Syntax: HTML META tags (V3.2 and 4.0), RDF
52 Example Tool: Site Generator
- Tool which parses a local web site and automatically creates Dublin Core metadata
- Syntax: HTML
- Java-based tool which requires JDK 1.1
53 Further Information - Creation and Maintenance
- Metadata Creation Tools
  - General metadata page at UKOLN
  - METAWEB
  - TagGen SE
- User Guides
  - Official User Guide for Simple Dublin Core
  - CIMI Guide to Best Practice: Dublin Core
54 Harvesting and Distributing Dublin Core Metadata
55 Harvesting / Distribution Tools
- Indexing, harvesting tools
[Diagram labels: Z39.50 Gateway, Metadata Harvester, Full-text Search Engine, Resources]
57 Retrieval Tools
- HTML - search forms
- HTML - predefined queries
- Z39.50 clients / Java applets
- Standalone applications
- Interface design
  - Assist users:
    - help them to understand what they are looking for
    - give them an idea what terminologies you are using
    - use commonly understood design language
58 Bringing it all together: Implementation Models
59 Implementation Models
- Harvesting DC into a repository (database)
- Distributed Database Search
- Full-text indexing with metadata extraction
60 Implementation Models
- Harvesting DC into a repository (database)
[Diagram labels: HTML, XML, Other types, Dynamic document creation from database; Harvester; Repository; Query; retrieve resource]
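The harvesting model can be sketched with Python's standard html.parser module. This is a toy under stated assumptions: pages are passed in as strings rather than fetched over the network, the repository is an in-memory dict instead of a database, metadata is assumed to be embedded as "DC.<Element>" META tags, and the names DCMetaParser and harvest are invented for the example.

```python
from html.parser import HTMLParser


class DCMetaParser(HTMLParser):
    """Collect <meta name="DC.X" content="..."> tags from one HTML page."""

    def __init__(self):
        super().__init__()
        self.record = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        content = attributes.get("content")
        if name.lower().startswith("dc.") and content is not None:
            element = name[3:]  # drop the "DC." prefix
            self.record.setdefault(element, []).append(content)


def harvest(pages):
    """Build a repository {url: record} from {url: html_text}."""
    repository = {}
    for url, html_text in pages.items():
        parser = DCMetaParser()
        parser.feed(html_text)
        parser.close()
        if parser.record:  # skip pages that carry no DC metadata
            repository[url] = parser.record
    return repository


# Hypothetical pages standing in for documents on remote web servers.
sample_pages = {
    "http://example.org/a": (
        '<html><head><meta name="DC.Title" content="A">'
        '<meta name="DC.Creator" content="B">'
        '<meta name="keywords" content="x"></head></html>'
    ),
    "http://example.org/b": "<html><head></head></html>",
}
repository = harvest(sample_pages)
print(repository)
```

A query interface would then search the repository and send the user back to the original URL to retrieve the resource itself.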
62 Implementation Models
- Full-text indexing with metadata extraction
[Diagram labels: HTML, XML, Other types, Dynamic document creation from database; Indexer; Index DB; Query; retrieve resource]
63 Questions before implementation
- Do I really need Dublin Core?
- What is my budget?
- What type of resources do I want to describe?
- Which encoding format for which resource?
- Do I have community support?
- Can I provide creation tools?
64 Challenges of implementing Dublin Core
- Intellectual
  - Education of information creators
  - Community consensus
  - Resistance against sharing information
- Technical
  - Efficient tools
  - Infrastructure
- Economic
  - Automatic generation vs. manual creation
  - Cost of training
  - Cost of tools
66 Dublin Core for the masses
- Why Dublin Core hasn’t hit the consumer market yet
  - No killer application
  - Lack of standardisation
  - No support in public search engines
  - No support in mass market applications
  - Non-transparent applications
  - Inefficient handling in HTML
67 Further Information
- Projects
  - Official Dublin Core web site
- Mailing lists
  - Dublin Core Implementors workgroup mailing list
69 Case Study AMOL (1)
- Gateway to Australian Museums and Galleries
- Initial idea: One central access point for all Australian collections
- Creation of AMOL standard record for object data due to lack of common standards
- 8 basic fields with focus on resource discovery and easy deployment from within existing databases
- Fields: Object Title, Object Name, Creator, Description, Item ID, KeySearchTerms, Date/DateRange, Associated Places
70 Case Study AMOL (2)
- AMOL search/system architecture - current system
[Diagram labels: Legacy DB; Mapped metadata exported; HTML documents; Remote web server storing HTML documents; AMOL index server; User queries search engine and gets records delivered to web browser]
71 Case Study AMOL (3)
- Lessons Learned
- Data and technology related
  - Lack of consistent use of controlled vocabularies, quality of data recorded
  - Performance of indexing software, lack of metadata support in public search engines
  - High administration efforts
- Intellectual
  - Users have problems with “empty text box” approach
  - Limited information in record to see context with larger picture
- General
  - Large institutions: bureaucratic machinery, complex collection systems designed without interoperability in mind
  - Small institutions: concerned about security issues, fear of larger institutions
72 Case Study AMOL (4)
- New perspectives
  - New resource types: Information about institutions, Images, Video, Audio, general HTML pages - goes beyond capabilities of standard AMOL record
  - Need to provide easier access for users
  - New cross community projects require interoperable metadata standards for cross domain searching
  - Strong move in Australia towards Dublin Core based metadata schemas driven by government
  - Strong move towards interpretation of objects through stories
- Search Architecture and extended AMOL metadata standard
73 Case Study AMOL (5)
- New AMOL search/system architecture
[Diagram labels: Legacy databases; AV resources; Textual resources; Remote web server providing dynamic access to ODBC databases; Information mapped to DC based metadata plus index text, images; AMOL index server; User queries search engine and gets records delivered to web browser]
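Mapping a legacy record to DC based metadata can be as simple as a lookup table. The sketch below uses the eight AMOL field names from the earlier slide, but the target Dublin Core elements are guesses made for illustration (AMOL's actual mapping is not given in these slides), and the function name amol_to_dc and the sample record are likewise hypothetical.

```python
# Hypothetical field-to-element mapping; NOT AMOL's published mapping.
AMOL_TO_DC = {
    "Object Title": "Title",
    "Object Name": "Type",
    "Creator": "Creator",
    "Description": "Description",
    "Item ID": "Identifier",
    "KeySearchTerms": "Subject",
    "Date/DateRange": "Date",
    "Associated Places": "Coverage",
}


def amol_to_dc(amol_record):
    """Map one AMOL-style record (field -> str or list) to DC elements."""
    dc = {}
    for field, value in amol_record.items():
        element = AMOL_TO_DC.get(field)
        if element is None or not value:
            continue  # unknown field or empty value: nothing to export
        values = value if isinstance(value, list) else [value]
        dc.setdefault(element, []).extend(values)
    return dc


# Invented sample data standing in for a row from a legacy database.
sample_amol = {
    "Object Title": "Example title",
    "Item ID": "EX-0001",
    "KeySearchTerms": ["example", "sample"],
    "Date/DateRange": "",
}
dc_sample = amol_to_dc(sample_amol)
print(dc_sample)
```

Keeping the mapping in data rather than code makes it easier to revise once the community has agreed which element each field belongs to.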
74 Case Study AMOL (6)
- Future Directions
  - Implementation of RDF for dynamically served databases and text style resources
  - Consensus of community: Metadata Forum
  - Further education of users: Metadata Workshops
  - Creation of multi-type metadata schema based on Dublin Core
  - Creation of mapping tools for easier database implementation
75 Case Study AMOL (7)
- Recommendations
  - Prepare good user guides
  - Run workshops and educate museum professionals
  - Get consensus from community
  - Plan with interoperability in mind
  - Evaluate tools and plan for future additions
- Biggest problem still remaining:
  - what is the benefit to the individual institution other than being interoperable for networked resources?
80 For machine communication we need...
- Semantic Interoperability
  - Standardisation of content
  - “Let’s talk Resource Description”
  - “Creator, Publisher...”
- Structural Interoperability
  - Standardisation of form
  - “Let’s use MICI”
  - “Field # 1 Element Name”
- Syntactic Interoperability
  - Standardisation of expression
  - “Here’s how to say it in HTML”
  - “<Meta name= Element Name= “….”>”