The Role Of Metadata
Brian Kelly, UKOLN, University of Bath, Bath, BA2 7AY
A centre of expertise in digital information management: www.ukoln.ac.uk


1 The Role Of Metadata
Brian Kelly, UKOLN, University of Bath, Bath, BA2 7AY
Email: B.Kelly@ukoln.ac.uk
URL: http://www.ukoln.ac.uk/web-focus/presentations
UKOLN is supported by:

2 Contents
- Introduction
- Background To Metadata
- Metadata Standards
- Metadata Management
- Metadata And Quality
- Conclusions
The Brief: "I know from conversations … I have had with customers, that metadata poses some really difficult questions …"
The talk addresses the questions:
- What is metadata and why is it important?
- What's this Dublin Core I've heard about (and why Dublin?)
- What benefits will I get if I use metadata?
- How should I do it?
- What will it cost me?
Section: Introduction

3 About UKOLN / Web Focus
UKOLN:
- A national centre of expertise in digital information management (including metadata)
- Based at University of Bath
- Funded by JISC and Resource to support the Higher & Further / cultural heritage sectors
UK Web Focus:
- Provides advice and support on Web issues, especially standards and best practices
- Provided by Brian Kelly
- Funded by JISC from Nov 1996 - August 2003. Now jointly funded by JISC & Resource
QA Focus:
- Developing QA methodology to support JISC digital library programmes
Section: Introduction

4 About You
How many are:
- Librarians
- Software / systems developers (techies)
- Commercial vendors
- Others
What is the extent of your knowledge of metadata?
- Expert / Novice / Average
- RDF, OAI, CLD, … / MARC, Dublin Core, … / ???
Section: Introduction

5 What is Metadata?
"This metadata you've been talking about … isn't it just catalogue records?" (Question at metadata seminar, 1998)
Metadata can be regarded as:
- Catalogue records for the Web
- Data about data
- Structured information suitable for automated processing
Metadata Demystified: http://www.niso.org/standards/resources/Metadata_Demystified.pdf
In current practice, the term has come to mean structured information that feeds into automated processes, and this is currently the most useful way to think about metadata.
Section: Background

6 The Problem
Back in the mid-1990s:
- Size of Web growing exponentially
- Web being used for both scholarly and non-scholarly (!) purposes
- Need for better searching mechanisms
- Search engines seemed promising, but there were concerns over abuse (e.g. porn index spammers) and difficulties in finding quality information
- Various sectors came together to develop a core set of metadata attributes for resource discovery
Section: Background

7 Dublin Core
In the mid-1990s:
- Meeting held in Dublin, Ohio in 1995
- Involvement from several sectors (libraries, museums, science, IT, …)
- Agreement reached on a core set of metadata attributes for resource discovery
- Given the name Dublin Core (DC)
- DCMI organisation later formed
- DC working parties established to coordinate development of DC
- Regular annual conferences held
Section: Dublin Core

8 Why So Complex?
Why is there a need for working groups, annual events, etc. for developing a standard for catalogue records?
- It's not just documents: an Author record is inappropriate for a painting, a piece of music, etc.
- It's not just for humans: the DC records will be processed by software, for which unambiguity is essential
- It needs to be integrated: with a rapidly developing Web architecture
- It needs to be future-proofed: so we don't have to do it all again when a new technology emerges
Section: Dublin Core

9 Using Dublin Core
Note that DCMI defined a core set of elements:
- Title: A name given to the resource.
- Creator: An entity primarily responsible for making the content of the resource.
- Publisher: An entity responsible for making the resource available.
- Date: A date of an event in the lifecycle of the resource.
- …
How this format could be represented was not defined initially.
Section: Dublin Core
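As a rough illustration of the element set above, the sketch below holds a record as a mapping from element name to a list of values and rejects names outside the 15 elements of simple Dublin Core. The helper name `make_record` is an assumption made for this example; it is not a DCMI-defined interface.

```python
# The 15 elements of simple (unqualified) Dublin Core.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def make_record(statements):
    """Build a record from (element, value) pairs; elements may repeat.

    make_record is a hypothetical helper used only for illustration.
    """
    record = {}
    for element, value in statements:
        key = element.lower()
        if key not in DC_ELEMENTS:
            raise ValueError(f"not a simple Dublin Core element: {element!r}")
        # Every element is repeatable, so values are kept in a list.
        record.setdefault(key, []).append(value)
    return record


record = make_record([
    ("Title", "The Role Of Metadata"),
    ("Creator", "Brian Kelly"),
    ("Publisher", "UKOLN"),
])
print(record["creator"])
```

Keeping every element as a list is one possible design choice: it leaves room for repeated values without changing the structure later.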

10 Representing Dublin Core
Initially many people thought that DC would be embedded in HTML pages: [example not preserved]
But how are multiple authors represented: [example not preserved] or [example not preserved]
It is not possible to describe the potential complexities of DC in the HTML language.
Section: Dublin Core

11 Dublin Core Is Too Simple!
Dublin Core was designed as a core set of metadata elements for resource discovery. However:
- The benefits of the standard became apparent and DC came to be used in many areas
- There was a need to be able to represent richer metadata content and relationships, e.g.
  - Multiple authors and contact details
  - Alternative titles
  - Use of controlled vocabularies from particular schemes
A mechanism known as Qualified Dublin Core was developed to address this.
Section: Dublin Core

12 Use In HTML
Dublin Core's potential was recognised, and the W3C's release of HTML 4.0 included a mechanism for defining schemes in the <meta> element:
  <meta name="DC.Subject" content="heart attack">
  <meta name="DC.Subject" scheme="MeSH" content="Myocardial Infarction; Pericardial Effusion">
See <http://dublincore.org/documents/2001/04/12/usageguide/qualified-html.shtml>
Section: Dublin Core
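To show how software might read such embedded metadata, here is a minimal sketch that uses Python's standard `html.parser` module to collect the element, scheme and content of each DC `<meta>` tag. The class name `DCMetaParser` and the sample page are assumptions for this example, and a real harvester would need to handle more cases (character encodings, malformed markup, other naming conventions).

```python
from html.parser import HTMLParser


class DCMetaParser(HTMLParser):
    """Collect (element, scheme, content) from <meta name="DC.*"> tags."""

    def __init__(self):
        super().__init__()
        self.statements = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        if name.lower().startswith("dc."):
            element = name[3:]  # strip the "DC." prefix
            self.statements.append(
                (element, attributes.get("scheme"), attributes.get("content"))
            )


# Sample page built from the two <meta> examples on the slide.
page = """
<head>
<meta name="DC.Subject" content="heart attack">
<meta name="DC.Subject" scheme="MeSH" content="Myocardial Infarction; Pericardial Effusion">
</head>
"""

parser = DCMetaParser()
parser.feed(page)
for element, scheme, content in parser.statements:
    print(element, scheme, content)
```

The scheme is `None` for the free-text subject and `"MeSH"` for the controlled-vocabulary one, which is the distinction a consuming application would rely on.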

13 XML
XML (Extensible Markup Language):
- Developed by W3C
- A meta-language used to create other languages
- Addresses HTML's lack of extensibility
- A family of standards which form the foundations for a richer and more interoperable Web: XML, XML Namespaces, XSLT, XML Schemas, …
- A proven success
Rather than slowly tweaking HTML to allow rich DC to be embedded, XML allows new metadata applications to be developed which can be integrated with existing Web services.
Section: W3C Developments

14 Beyond Use In HTML
In parallel with the release of HTML 4.0, W3C was working on:
- A rich metadata framework which could be used for any metadata application:
  - Content filtering (this resource contains nudity)
  - Defining collections of related resources (Web site maps)
  - Digital signatures
  - …
- Development of the Semantic Web: an ambitious attempt to allow data from distributed services to be integrated
RDF (Resource Description Framework) was developed as W3C's solution to both problems.
Section: W3C Developments

15 RDF
RDF:
- An XML application
- Richer than conventional XML applications: a mathematical model which describes relationships is embedded in the RDF
- This richness comes with a price: increased complexity
RDF applications are being developed. However, at present it may be advisable to leave RDF to the research community or well-funded pilot studies to prove its benefits before committing to use in a service environment. (However, note that metadata in PDF documents is stored as RDF.)
Section: W3C Developments
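For readers who have not seen RDF, the following sketch shows roughly what a Dublin Core description looks like when serialised as RDF/XML, built with Python's standard `xml.etree.ElementTree`. The function name `dc_to_rdf` and the described resource are assumptions for illustration; the two namespace URIs are the standard RDF and Dublin Core element-set namespaces.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def dc_to_rdf(about, statements):
    """Serialise (element, value) pairs about one resource as RDF/XML.

    dc_to_rdf is a hypothetical helper, not part of any RDF toolkit.
    """
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": about}
    )
    for element, value in statements:
        ET.SubElement(description, f"{{{DC_NS}}}{element}").text = value
    return ET.tostring(root, encoding="unicode")


rdf_xml = dc_to_rdf(
    "http://www.ukoln.ac.uk/",  # example resource URI
    [("title", "The Role Of Metadata"), ("creator", "Brian Kelly")],
)
print(rdf_xml)
```

Even this small example needs namespaces and a resource identifier, which hints at the extra complexity the slide warns about.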

16 Beyond Resource Discovery
Metadata has a role to play beyond item-level resource discovery. Other metadata applications include:
- Metadata for digitised objects: about the object and about the digitisation process
- Management / administrative metadata: review this resource by xx; delete this resource on …; this resource is managed by the XYZ group; …
- Metadata about collections (physical and online)
- …
Section: Using Metadata

17 Metadata Modelling (1)
You want to use Dublin Core metadata. How do you choose how to model your metadata?
- Do you use simple Dublin Core (the basic 15 elements)?
- Do you use qualified Dublin Core to enable richer metadata to be described?
- If the latter, how do you decide which qualified DC metadata to use?
These are key issues to address. In some cases answers may be provided for you. In other cases, you must answer these questions for yourself.
Section: Using Metadata

18 Metadata Modelling (2)
Why do you wish to use metadata?
- Because it is fashionable?
- Because you're a librarian and librarians 'do' metadata?
- Because you want your Web site to be no. 1 in Google?
- Because you are developing an application which requires use of metadata?
Please remember:
- Developing applications which make use of metadata can be expensive
- Creating and managing metadata can be expensive
- Search engines such as Google typically make little or no use of metadata
Section: Using Metadata

19 Metadata Modelling (3)
Exploit Interactive case study:
- EU-funded ejournal
- Requirement to provide local searching better than simple free text searching:
  - Search by title, author and keywords
  - Search by funding stream
  - Search by issue and article type
- The end-user interface is illustrated
Section: Using Metadata

20 Metadata Modelling (4)
How did we manage and model the metadata?
Article metadata:
  doc_title = "The XHTML Interview"
  author = "Kelly, B."
  title = "WebWatching National Node Sites"
  description = "In this issue's Web Technologies column we ask Brian Kelly to tell us more about XHTML."
  article_type = "regular"
Issue metadata:
  issue_num = "6"
  pub_date = "25 Oct 2002"
Site metadata:
  name = "Exploit Interactive"
  publisher = "UKOLN"
Processed by server-side script
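The three levels of metadata above suggest one way a server-side script could work: combine site, issue and article values into a full record, then filter on named fields. The sketch below is an assumption about the general approach, not the script used for Exploit Interactive; the function names and the second article are hypothetical.

```python
SITE = {"name": "Exploit Interactive", "publisher": "UKOLN"}

ISSUES = {
    "6": {"issue_num": "6", "pub_date": "25 Oct 2002"},
}

ARTICLES = [
    {
        "issue": "6",
        "doc_title": "The XHTML Interview",
        "author": "Kelly, B.",
        "article_type": "regular",
    },
    {
        # Hypothetical second article, added only to show filtering.
        "issue": "6",
        "doc_title": "An Example Feature",
        "author": "Example, A.",
        "article_type": "feature",
    },
]


def full_record(article):
    """Merge site, issue and article metadata; the most specific level wins."""
    return {**SITE, **ISSUES[article["issue"]], **article}


def search(**criteria):
    """Return full records whose fields exactly match every criterion."""
    return [
        record
        for record in map(full_record, ARTICLES)
        if all(record.get(field) == value for field, value in criteria.items())
    ]


for hit in search(article_type="regular", issue_num="6"):
    print(hit["doc_title"], "-", hit["publisher"], "-", hit["pub_date"])
```

Storing site and issue values once and merging them on demand avoids repeating them in every article, so a change such as a new publisher name is made in one place.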

21 Storing DC Metadata
You may wish to:
- Embed HTML metadata in HTML pages
- Link to HTML metadata from HTML
- Embed RDF
- Store metadata in an application (home-grown scripts, CMS, metadata repository, image management system, …)
It is up to you how you store your metadata. Your choice will be affected by the use which will be made of your metadata and how it will be created and managed. You may wish to store your metadata in a database and make it available according to its use.
Diagram labels: Metadata management tool; HTML; RDF. Example records:
  Author    | Book              | Pub. Date
  G. Orwell | 1984              | 1948
  I. Rankin | Question Of Blood | 2003
Section: Metadata Management
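The idea of storing metadata once and exporting it according to use can be sketched with Python's standard `sqlite3` module. The table layout, the column-to-element mapping and the function names are assumptions for illustration, and JSON stands in here for whatever second output format a service needs.

```python
import json
import sqlite3
from html import escape

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE books (author TEXT, title TEXT, pub_date TEXT)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?)",
    [("G. Orwell", "1984", "1948"), ("I. Rankin", "Question Of Blood", "2003")],
)

# Assumed mapping from local column names to Dublin Core elements.
COLUMN_TO_DC = {"author": "Creator", "title": "Title", "pub_date": "Date"}


def fetch_records():
    """Read every row as a plain dict, in insertion order."""
    rows = conn.execute("SELECT author, title, pub_date FROM books ORDER BY rowid")
    return [dict(row) for row in rows]


def as_html(record):
    """Render one record as DC <meta> tags for embedding in a page."""
    return "\n".join(
        f'<meta name="DC.{COLUMN_TO_DC[column]}" content="{escape(value)}">'
        for column, value in record.items()
    )


def as_json(record):
    """Render the same record as JSON keyed by DC element name."""
    return json.dumps({COLUMN_TO_DC[c]: v for c, v in record.items()})


records = fetch_records()
print(as_html(records[0]))
print(as_json(records[1]))
```

Because the export functions only see plain dicts, adding another output format later does not require touching how the metadata is stored.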

22 A Simple DC Management Tool
DC-dot:
- Simple Web-based DC creation and management tool
- Output in range of formats (HTML, XHTML, RDF, …)
- Provides validation
- Useful for small-scale metadata creation
But:
- Not ideal for large-scale usage
- Doesn't provide rich management capabilities
http://www.ukoln.ac.uk/metadata/dcdot/
Section: Metadata Management

23 Management Tools
Many types of metadata tools:
- Type the metadata by hand
- Use File -> Properties menu in MS Office applications and export data
- Home-grown database systems
- Home-grown scripting solutions
- Use of commercial systems:
  - Library management systems
  - Image management systems
  - …
There is no single ideal solution. The solution you choose should reflect your needs, expertise, organisational culture, …
Section: Metadata Management

24 Quality Assurance
The Need for QA:
- Metadata is the 'glue' for integration of services
- If the metadata quality is poor, services will not be interoperable
- There is therefore a need for quality assurance procedures to ensure fitness for purpose
What Can Go Wrong? Things that can go wrong include:
- Metadata is out-of-date or incorrect
- Metadata is used inconsistently within a service
- Metadata is used inconsistently across services
- Metadata is not modelled correctly
- Metadata is not compliant with storage standard
- …
Section: Quality Assurance

25 Think About The Implementation
It is important that when you deploy metadata systems you can manage and maintain the metadata. For example:
- Details of the person maintaining the data change (name change due to marriage, person leaves, …)
- Organisational details change (mergers, takeovers, …)
- Technology changes
Prepare for change! People change, organisations change, responsibilities change, technologies change, … Ensure that you can manage the metadata which reflects such changes.
Section: Quality Assurance

26 Need For Cataloguing Rules
Your Cataloguing Rules:
- You will need cataloguing rules to support your metadata creation
- You will need to provide necessary training and support (especially if you are dependent on cataloguing by non-professionals)
Interoperability:
- How will you interoperate with services which deploy different cataloguing rules?
  - 04/07/03: what date is this?
  - LSC: what does this stand for?
- Humans use context; software products don't
- There is a need to define the standards you're applying (in a machine-understandable way)
Section: Metadata Management

27 Need For QA Procedures
So we have:
- Tools for managing metadata
- Cataloguing rules
But:
- People make mistakes
- Software may have bugs
- Our rules may be ambiguous
- The standards may be ambiguous
- The metadata may be correct but confusing in other contexts, …
Although humans can adapt to errors and ambiguities, software typically can't. We therefore need quality assurance procedures to ensure that metadata applications will be interoperable.
Section: Quality Assurance

28 Approaches To QA
We may wish to consider:
- Systematic checking at data creation
- Systematic checking of output
- Semi-automated checking (e.g. duplication, common misspellings, out-of-range checks, …)
- Automated checking
- …
Worst Case Scenario: Your service is fine, and quality metadata is provided. Your data is integrated with other services to provide an international portal to quality resources. However, the other service providers have poor quality metadata. The poor quality of the final service brings your contribution into disrepute.
Section: Quality Assurance
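A few of the semi-automated checks listed above (missing values, out-of-range dates, duplicates) can be sketched in a short script. The field names, the default year range and the sample records are all assumptions for illustration; real checks would follow your own cataloguing rules.

```python
def check_records(records, required=("title", "creator", "date"),
                  year_range=(1000, 2100)):
    """Return (record index, problem description) pairs.

    The required fields and year_range defaults are illustrative only.
    """
    problems = []
    seen = {}
    for index, record in enumerate(records):
        # Missing or blank required fields.
        for field in required:
            if not str(record.get(field, "")).strip():
                problems.append((index, f"missing {field}"))
        # Out-of-range year, assuming dates start with a 4-digit year.
        year = str(record.get("date", ""))[:4]
        if year.isdigit() and not year_range[0] <= int(year) <= year_range[1]:
            problems.append((index, f"date out of range: {record['date']}"))
        # Duplicates, compared on normalised title and creator.
        key = (
            str(record.get("title", "")).strip().lower(),
            str(record.get("creator", "")).strip().lower(),
        )
        if key in seen:
            problems.append((index, f"duplicate of record {seen[key]}"))
        else:
            seen[key] = index
    return problems


# Hypothetical sample data with deliberate errors.
sample = [
    {"title": "The XHTML Interview", "creator": "Kelly, B.", "date": "2002-10-25"},
    {"title": "the xhtml interview ", "creator": "Kelly, B.", "date": "2002-10-25"},
    {"title": "Untitled", "creator": "", "date": "2202-10-25"},
]

for index, problem in check_records(sample):
    print(index, problem)
```

Checks like these only flag candidates; a person still has to decide whether a flagged record is really wrong, which is why the slide describes them as semi-automated.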

29 Pulling It Together

30 Conclusions
To conclude:
- Metadata can provide richer searching and other services within a service, and the glue for integration across several services
- There are several key standards: Dublin Core, HTML, XML, …
- You will need to select the standards appropriate to your service requirements
- You will need to choose the metadata according to your service requirements
- You will need to choose the architectural framework and applications for managing your metadata according to your service requirements
- You will need to ensure that you have appropriate quality assurance mechanisms in place; otherwise the above work will have been wasted!
It can be worth it!

