March 15, 2000 Howard Rosenbaum Metadata: Information Access for the New Century Indiana Library Federation Annual Meeting Metadata:

Presentation transcript:

Metadata: Information Access for the New Century
Howard Rosenbaum
Indiana Library Federation Annual Meeting, March 15, 2000

Metadata revealed!
I. Introduction
  The state of the net today
  What is metadata and why do we need it?
II. What different metadata schemes are available?
  Dublin Core and Warwick Framework
  Digital Object Identifier
  Resource Description Framework
  Persistent URL
III. What does metadata mean for information management in libraries?

Metadata revealed! I. Introduction The state of the net today “...At some point, the Internet has to stop looking like the world’s biggest rummage sale. For taming this particular frontier, the right people are librarians, not cowboys. The Internet is made of information, and nobody knows more about how to order information than librarians, who have been pondering that problem for thousands of years.” Rennie, J. (1997). Civilizing the Internet. Scientific American. 6.

Metadata revealed! How many are online?
  World total: […] million
  Africa: 2.46 million
  Asia/Pacific: […] million
  Europe: […] million
  Middle East: 1.29 million
  Canada & USA: […] million
  South America: 8.79 million

Metadata revealed! According to the Internet Software Consortium, in January 2000 there were 72,398,092 hosts on the net. “The publicly indexable World Wide Web now contains about 800 million pages, encompassing about 6 terabytes of text data.” “Our results show that search engines are increasingly falling behind in their efforts to index the web.” Lawrence and Giles (1999). Accessibility of information on the web. Nature 400 (July 8). 107, 108.

Metadata revealed!
Coverage
  Search engine coverage of the publicly indexable web has decreased substantially, “with no engine indexing more than about 16% of web pages.”
Unequal access
  Search engines are typically more likely to index sites that have more links to them (more ‘popular’ sites).
  They are also more likely to index US sites than non-US sites, and commercial sites rather than educational sites.
Out of date
  Indexing of new or modified pages by just one of the major search engines can take months.

Metadata revealed!
Information distribution
  83% of sites contain commercial content and 6% contain scientific or educational content.
  1.5% of sites contain pornographic content.
Low metadata use
  The simple HTML “keywords” and “description” metatags are used on the homepages of only 34% of sites.
  Only 0.3% of sites use the Dublin Core metadata standard.

Metadata revealed! Some observations: The Internet was never intended to be a tool for information organization and retrieval. Network resources are proliferating rapidly, so some organization and method of access (beyond browsing) is needed. These resources are increasing at an increasing rate (we are helping). Material on the net is quirky, transient, and chaotically archived. Because of the decentralized nature of the net, it is clear that an imposed scheme is unworkable.

Metadata revealed! So what is metadata and why do we need it? The Internet is full. Go away. Metadata may be one way for us to find what we need, when we need it, and in the form we want. “The concept of metadata predates the Web, having … been coined... in the 1960s to describe datasets effectively. Metadata is data about data, and... provides basic information such as the author of a work, the date of creation, links to any related works, etc.” Miller, P. (1996). Metadata for the Masses. Ariadne.

Metadata revealed! When we search, we find many irrelevant hits on a typical search engine results page. What good is a search that returns 47,000 documents for the phrase “dublin core”? “Metadata is information that describes other information sources. [It is] a potential remedy to the problem of finding relevant information on the Internet.” Thomas, C.T. and Griffin, L.S. (1999). Who will create metadata for the Internet. First Monday. 3(12).

Metadata revealed! In addition, there is the interesting question of the type of metadata that is appropriate for the web. “There is an obvious requirement for metadata, [it] must be of a form suitable for interpretation both by the search engines and by human beings, and it must also be simple to create so that any web page author may easily describe the contents of their page and make it immediately both more accessible and more useful.” Miller, P. (1996). Metadata for the Masses. Ariadne.

Metadata revealed! Metadata is the information necessary to identify, locate, organize, and access an electronic resource. It describes what can be said about something and what people can do with it (rights). It describes datasets concisely using a standard format. For this reason it has the unique ability of making all metadata records equal in worth. Metadata records provide information about data in a similar way that library catalogues provide information about books. A catalogue facilitates searching for particular topics or author(s); metadata is searchable in a comparable way.

Metadata revealed! There are two levels of this problem.
Organizing an existing collection so that it is accessible over the Internet
  Examples: the American Memory Project and the Berkeley Digital Sunsite collections
Developing schemes to organize directories of networked information and search tools
  This is being done with search engines and metadata

Metadata revealed! Who uses metadata? Business uses of metadata External: advertising and search engine placement Internal: management of internal digital documents Academic uses of metadata To provide a scheme for organizing digital information For extending access to these materials

Metadata revealed! Types of metadata
Descriptive (access)
  Description: captions, keywords or categories
  Access points, location, identifier
  Relationship to other objects
  File type, size or creation date
Administrative
  Management information
  Provenance: authentication, document conversion info
  Rights, terms and conditions
Structural
  Putting the object together from its logical components

Metadata revealed! Benefits of metadata
For the producer
  Ability to provide relevant details about the resource
  Ability to provide information which is not in the resource (e.g. descriptive text for images or executable files)
  Ability to highlight the most important aspects of the resource
For the indexing service
  No need to guess about resource content
  Highly structured data to index
  Less bandwidth, more efficient, easier to maintain

Metadata revealed! For the user More precise results via retrieval on surrogate content Field-based searching Access to non-textual resources Less information overload

Metadata revealed! Metadata can support many potential applications: Resource discovery Content ratings E-commerce Authentication Data management Intellectual property rights management Digital preservation Searching, location Authentication Quality/rating Semantic interoperability Resource management

Metadata revealed! There are two levels at which the problem can be attacked Classifying and organizing a core collection of digital materials The questions: what to collect, how to organize it, how to maintain it, and how to provide access to it Creating directories, search tools, metadata schemes and other means of access to digital materials outside the core collection The questions: what to include, why, maintenance, and the provision of access These questions are becoming increasingly important in the design of digital libraries

Metadata revealed!
I. Introduction
  The state of the net today
  What is metadata and why do we need it?
II. What different metadata schemes are available?
  Dublin Core and Warwick Framework
  Digital Object Identifier
  Resource Description Framework
  Persistent URL
III. What does metadata mean for information management in libraries?

Metadata revealed! One suggestion is to use “metadata” The “Dublin Core Metadata Program” is one example What are the necessary elements that should be used to describe networked information? This was discussed at an OCLC workshop in 1995 Goals Fostering a common understanding of the needs, strengths, shortcomings, and solutions of stakeholders Reaching consensus on a core set of metadata elements to describe networked resources core_report.html

Metadata revealed! A small set of metadata elements would be valuable It would encourage authors and publishers to provide metadata, in a form that automated resource discovery tools could collect It would encourage the creation of network publishing tools containing a template for metadata elements, simplifying the task of creating metadata records This type of record could serve as the basis for a more detailed cataloging record if the need arises If something like the Dublin Core becomes a standard, metadata records will be able to be understood across user communities

Metadata revealed! Defined Universal Bibliographic Language for INternet and Coherent Online REsource It is a minimal information resource description set It is intended for organization and resource discovery on the web It will improve searching with simple resource description semantics Researchers have built a consensus around a core element set that is Simple and intuitive Cross-disciplinary International Flexible

Metadata revealed! The Dublin core metadata element set supports resource discovery because it is: Easy for authors and content managers to create and maintain Interoperable, extensible, and platform independent Syntax-independent Intended for, but not limited to, network resources Intended to be embedded, but needn’t be Not intended to meet complete metadata needs of any given community

Metadata revealed! These are the elements in the Dublin Core:
Title: The name of the object
Author: The person(s) primarily responsible for the intellectual content of the object
Subject/keywords: The topic addressed by the work
  Typically expressed as keywords, key phrases or classification codes that describe a topic of the resource
Description: An account of the content of the resource
  May include but is not limited to an abstract, table of contents, reference to a graphical representation of content or a free-text account of the content

Metadata revealed!
Publisher: The agent or agency responsible for making the object available
Date: A date associated with an event in the life cycle of the resource (YYYY-MM-DD)
ObjectType: The genre of the object, such as novel, poem, or dictionary
Format: The data representation of the object, or the physical or digital manifestation of the resource
  Typically the media-type or dimensions of the resource
  May be used to determine the software, hardware or other equipment needed to display or operate it

Metadata revealed!
Resource Identifier: An unambiguous reference to the resource within a given context, using a string or number conforming to a formal identification system (URI, URL, DOI, ISBN)
Relation: Relationship to other objects
Source: Objects, either print or electronic, from which this object is derived, if applicable
Coverage: The extent or scope of the content of the resource
  Will include spatial location (geographic coordinates or a place name), temporal period (a period label, date, or date range) or jurisdiction (a named administrative entity)

Metadata revealed!
OtherAgent/contributor: The person(s), such as editors and transcribers, who have made other significant intellectual contributions to the work
Language: Language of the intellectual content
Rights Management: Information about rights held in and over the resource
Using the Dublin Core:
  dc.title=The Book of Me
  dc.creator=Me
  dc.subject=My life
  dc.subject=All about me
  dc.publisher=The Press of Me
  dc.contributor=Only me

Metadata revealed! Here’s what it might look like embedded in an HTML document: The Home Page of Me
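A minimal sketch of how the "Book of Me" record could be embedded in the head of an HTML page as meta tags. The `dc.` naming of the tags follows the element names used on the slides; the helper name `dc_meta_tags` and the page skeleton are assumptions made for illustration, not the original slide's markup.

```python
from html import escape

def dc_meta_tags(pairs):
    """Render (element, value) pairs as HTML <meta> tags, one per line."""
    return "\n".join(
        f'<meta name="dc.{escape(element, quote=True)}" '
        f'content="{escape(value, quote=True)}">'
        for element, value in pairs
    )

# The record from the slides; repeated elements (subject) are allowed.
record = [
    ("title", "The Book of Me"),
    ("creator", "Me"),
    ("subject", "My life"),
    ("subject", "All about me"),
    ("publisher", "The Press of Me"),
    ("contributor", "Only me"),
]

# Hypothetical page skeleton with the metadata placed in <head>.
page = (
    "<html>\n<head>\n<title>The Home Page of Me</title>\n"
    + dc_meta_tags(record)
    + "\n</head>\n<body></body>\n</html>"
)
print(page)
```

Because the tags sit in the document head, a harvester can read them without parsing the body, and a browser simply ignores them.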

Metadata revealed! The Warwick Framework At the Warwick Workshop, researchers developed a “container architecture” known as the Warwick Framework The goal was to create an architecture that associates diverse types of metadata with a resource It is a mechanism for logically and physically aggregating distinct “packages” of metadata. The Framework is an advance because: It allows the designers of individual metadata sets to focus on specific requirements without concerns for generalization

Metadata revealed! The syntax of each metadata set can vary in conformance with semantic requirements, community practices, and functional processing requirements The management of and responsibility for specific metadata sets is left to respective “communities of expertise” It promotes interoperability by allowing tools and agents to selectively access and manipulate individual packages and ignore others It permits access to different metadata sets that are related to the same object to be separately controlled It flexibly accommodates future metadata sets by not requiring changes to existing sets or the programs that make use of them
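The container idea can be sketched in a few lines. This is an illustration under stated assumptions, not the Warwick Framework specification: the class names `Package` and `Container`, the `kind` labels, and the example URL are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Package:
    """One metadata set, kept in whatever syntax its community prefers."""
    kind: str        # hypothetical label, e.g. "dublin-core" or "rights"
    payload: object  # opaque to the container

@dataclass
class Container:
    """Aggregates distinct packages of metadata about one resource."""
    resource: str
    packages: list = field(default_factory=list)

    def add(self, package):
        # New metadata sets are appended without touching existing ones.
        self.packages.append(package)

    def select(self, kind):
        """Return only the packages a tool understands and ignore the rest."""
        return [p for p in self.packages if p.kind == kind]

c = Container("http://example.org/book-of-me")
c.add(Package("dublin-core", {"title": "The Book of Me", "creator": "Me"}))
c.add(Package("rights", "All rights reserved"))
print(c.select("dublin-core")[0].payload["title"])
```

A discovery tool would call `select("dublin-core")` and never need to understand the rights package, which is the selective access the slides describe.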

Metadata revealed! Digital object identifiers This is an initiative from international book and journal publishers It is a new identification system to be used for all digital content The DOI system provides a unique identification for that content, protecting intellectual property It also provides a way to link users of the materials to the rights holders themselves to facilitate automated digital commerce in the new digital environment

Metadata revealed! Developed and tested over the last year, the DOI system is now being used by more than a dozen U.S. and European publishers in a pilot program that has been running since July. Participation in Phase Two of the Prototype was extended to all publishers at the Frankfurt Book Fair in October 1997. DOI will be a persistent means to authenticate content, to ensure that what the customer is requesting is what is being sent.

Metadata revealed! The DOI System has three parts: the identifier, the directory, and the database. The identifier is made up of two components. The first element, the prefix, is assigned to the publisher by the Directory Manager. However, at this phase of the Prototype, the prefixes all begin with 10 to designate the Directory Manager making the assignment of the prefix. This is followed by a number designating the publisher who will be depositing the individual DOIs. Publishers may choose to request a prefix for each imprint or product line, or may use a single prefix.

Metadata revealed! The second element, following a slash mark, is the suffix. This is the designation assigned by the publisher to the specific content being identified. Many use recognized international standards for their suffixes. If they do so, they are encouraged to indicate the standard being used by preceding it with a code. The suffix can follow any system of the publisher’s choosing, and be assigned to objects of any size (book, article, abstract, chart) or any file type (text, audio, video, image or software).

Metadata revealed! An object (book) may have one DOI, and a component within that object (chapter) may have another DOI. The publisher decides the level of identification based on the nature of objects sold and distributed over the Internet. The suffix can be as simple as a sequential number or a publisher's own internal numbering system.
DOI structure (diagram): Prefix = Directory + Registrant Code; slash; Suffix = optional standard prefix such as [ISBN] + Item #
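The two-part identifier can be illustrated with a short parser. This is a sketch under stated assumptions: the function name `split_doi` and the sample DOI `10.9999/example.123` are made up for illustration, and the dot between the directory code and the registrant code follows common DOI usage rather than anything spelled out on the slides.

```python
def split_doi(doi):
    """Split a DOI into its directory code, registrant code, and suffix."""
    # Everything before the first slash is the prefix; the rest is the suffix.
    prefix, sep, suffix = doi.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a DOI: {doi!r}")
    # In the prototype described on the slides, every prefix begins with 10.
    if not prefix.startswith("10."):
        raise ValueError(f"prefix does not begin with '10.': {doi!r}")
    directory, _, registrant = prefix.partition(".")
    return {"directory": directory, "registrant": registrant, "suffix": suffix}

# Hypothetical DOI; the publisher chooses the suffix scheme.
print(split_doi("10.9999/example.123"))
```

Splitting only at the first slash leaves the suffix untouched, so a publisher's own numbering scheme may itself contain slashes.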

Metadata revealed! The Directory: The DOI system acts as a routing system. Digital content may change ownership or location over time, so the DOI system uses a central directory. When a user clicks on a DOI, a message is sent to the central directory, where the current web address associated with that DOI appears. This location is sent back to the user’s browser with a message telling it to “go to this particular net address.” In a second the user sees a “response screen” (a Web page) on which the publisher offers the reader either the content itself or further information about the object and how to obtain it.

Metadata revealed! When the object moves to a new server or the copyright holder sells the product, one change is recorded in the directory and all subsequent readers are sent to the new site The DOI remains reliable and accurate because the link to the associated information or source of the content is easily and efficiently changed The database Information about the object, beyond simply the response screen is maintained by the publisher It might include the content or the information on where and how to obtain the content or other related data The information that the user has access to in response to a DOI query is the third component of the DOI system
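The routing role of the directory can be sketched as a lookup table that is updated in one place when content moves. This is a toy model, not the DOI system's actual implementation; the class name `Directory`, the sample DOI, and the example URLs are hypothetical.

```python
class Directory:
    """Toy central directory: maps each DOI to its current web address."""

    def __init__(self):
        self._locations = {}

    def register(self, doi, url):
        # Registering a DOI again records a move; the DOI itself never changes.
        self._locations[doi] = url

    def resolve(self, doi):
        # Stand-in for the "go to this address" message sent to the browser.
        return self._locations[doi]

directory = Directory()
directory.register("10.9999/example.123", "http://old-publisher.example/item123")
before = directory.resolve("10.9999/example.123")

# The content is sold or moved: one change in the directory is enough.
directory.register("10.9999/example.123", "http://new-publisher.example/catalog/123")
after = directory.resolve("10.9999/example.123")
print(before, "->", after)
```

Every reader who follows the DOI after the update lands on the new address, while citations that contain the DOI stay unchanged.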

Metadata revealed! The DOI is being developed to conform to, and take advantage of, all relevant international standards. The syntax of the DOI is being proposed as NISO standard Z39.84. DOI metadata will be expressed in RDF using XML. DOI conforms to the syntax for URNs laid down by the IETF. DOI has aligned with Interoperability of Data in ECommerce Systems (INDECS). INDECS uses the current major initiatives of structured metadata (including Dublin Core and IFLA) to define a common metadata model for ecommerce. The DOI Foundation is a member of the W3C and is in close contact with standardisation activities from ISO and others, as well as initiatives from WIPO and other major bodies.

Metadata revealed! Examples of DOI usage: An article reference found on the net is linked to an abstract and information about the availability of full text. A reader of one article is linked to related material, including similar articles or books. A reader using DOIs sees the full text of an article and the table of contents of the journal in which the article appeared; she can subscribe to the journal, purchase a book, or order the content for later delivery. A user can use the DOI to automatically contact a help service or download the current driver for a software product.

Metadata revealed! RDF: Resource Description Framework. An initiative of the W3C (World Wide Web Consortium), 2/99. RDF provides a generic metadata architecture. It is a specification currently being developed to support the definition of metadata across the web. It describes how metadata for content is defined in web documents. This metadata is descriptive information about the structure and content of information in a document. RDF is useful for describing information about indexing, navigating, and searching a site, as well as push channel definitions and digital signatures.

Metadata revealed! RDF is the instantiation of the Warwick Framework for the Web The basic RDF data model consists of three object types: Resource: anything that can be specified by a URI, such as a web page, an entire web site, a specific newsgroup message Properties: characteristics or attributes of a resource, along with some notion of meaning, valid values, etc. Statements: resource + named property + value of property This is expressed as a “tuple” {subject predicate object}
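The statement model can be written down directly as a three-field tuple. This is a minimal sketch: the `Statement` type, the `values` helper, the `dc:` property labels, and the example URL are assumptions for illustration, not part of the RDF specification.

```python
from collections import namedtuple

# One RDF statement: resource + named property + value of that property.
Statement = namedtuple("Statement", ["subject", "predicate", "object"])

statements = [
    Statement("http://example.org/home", "dc:title", "The Home Page of Me"),
    Statement("http://example.org/home", "dc:creator", "Me"),
]

def values(statements, subject, predicate):
    """Return every value asserted for one property of one resource."""
    return [
        s.object
        for s in statements
        if s.subject == subject and s.predicate == predicate
    ]

print(values(statements, "http://example.org/home", "dc:creator"))
```

Because each statement is self-contained, statements from different sources about the same URI can be pooled into one list and queried together.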

Metadata revealed! It will be the foundation for an architecture for metadata on the Web Resource description Electronic commerce Site mapping Third party rating Digital signatures Search engine data collection (web crawling) Digital library collections Distributed authoring

Metadata revealed! Using XML, it might look like this: The W3C Folio 1999 W3C Communications Team Web development, World Wide Web Consortium, Interoperability of the Web
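A minimal sketch of how the three values on this slide could be serialized as RDF in XML, using Python's standard `xml.etree.ElementTree`. The resource address `http://example.org/folio` is hypothetical, the mapping of the values to Dublin Core title, creator, and subject is an assumption, and the namespace URIs are the ones in common use today, which may differ from those on the original slide.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF)
ET.register_namespace("dc", DC)

def describe(about, properties):
    """Build an rdf:RDF element holding one rdf:Description of a resource."""
    root = ET.Element(f"{{{RDF}}}RDF")
    desc = ET.SubElement(
        root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": about}
    )
    # Each (name, value) pair becomes one Dublin Core property element.
    for name, value in properties:
        ET.SubElement(desc, f"{{{DC}}}{name}").text = value
    return root

doc = describe(
    "http://example.org/folio",  # hypothetical address of the resource
    [
        ("title", "The W3C Folio 1999"),
        ("creator", "W3C Communications Team"),
        ("subject", "Web development, World Wide Web Consortium, "
                    "Interoperability of the Web"),
    ],
)
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

Each child element of the description is one statement about the resource: the element name is the property and the element text is its value.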

Metadata revealed! MARBI is an ALA committee that advises LOC on changes to USMARC record formats. Proposal 93-4 set out major changes to bibliographic formats to account for networked information. They suggested a new set of data elements to add to the record and forced people to think about the definition of an online resource. The key element was remote access. They suggested changes to Field 256 “File Characteristics,” and lost. They recommended creating Field 856 “Electronic Location and Access,” and won.

Metadata revealed! What's here now: MARC initiatives for the identification and description of networked resources
Data elements: name of resource, acronym/initialism, producer, distributor, location, contact name and address, network access, network address, hours of service, telephone, fax, network access instructions, terminal emulation, logon instructions, logoff instructions, type of resource, size of resource, frequency of update, language of resource, profile of resource, audience, restrictions on access, authorization, source machine, databases available, other providers of database, responsibility for record maintenance, date/time of last update of directory information, local access information and guidelines, cost for use, coverage, indexing terms

Metadata revealed! What's new? Field 856 is an embedded holdings field within the bibliographic record It contains the information needed to locate digital information The information identifies the location containing the item or from which it is available It also contains information to retrieve the item by the access method identified in the first indicator This information is sufficient to allow for the electronic transfer of a file, subscription to an electronic journal, or logon to a library catalog

Metadata revealed! ELECTRONIC LOCATION AND ACCESS
Indicators (of which there are always two)
  First (access method): 0 Email; 1 FTP; 2 Remote login (Telnet); 8 Other
  Second (undefined): # Undefined
Subfield codes (R = repeatable, NR = not repeatable)
  $a  Host name (R)
  $b  IP address (NR)
  $c  Compression information (NR)
  $d  Path (R)
  $f  Filename (R)
  $g  Name of publication or conference (NR)
  $h  Processor of request (NR)
  $i  Instruction (R)
  $k  Password (NR)
  $l  Logon/login (NR)
  $m  Contact person for information, assistance (R)
  $n  Name of location of host in $a (NR)
  $p  Port (NR)
  $q  File mode (NR)
  $s  File size (R)
  $t  Terminal emulation (R)
  $x  Non-public note (R)
  $z  Public note (R)
  $2  Source of access (NR)
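To make the field layout concrete, here is a minimal sketch that assembles an 856 field as a line of text from an access-method indicator and a list of subfields. The function name `format_856`, the textual display convention, and the FTP host and path are assumptions for illustration; this is not the output of any cataloging system.

```python
def format_856(access_method, subfields):
    """Render an 856 field as text: tag, two indicators, then subfields."""
    # The second indicator is undefined and is shown here as '#'.
    body = " ".join(f"${code} {value}" for code, value in subfields)
    return f"856 {access_method}# {body}"

# Hypothetical FTP location: first indicator 1 = FTP.
ftp_field = format_856(
    "1",
    [("a", "ftp.example.org"), ("d", "/pub/reports"), ("f", "report.txt")],
)
print(ftp_field)
```

The host, path, and filename subfields together give a client enough information to fetch the file by the access method named in the first indicator.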

Metadata revealed! OCLC’s Persistent URL (PURL) project. Functionally, a PURL is a URL with three parts:
Protocol: this is used to access the PURL resolver
  This protocol may differ from that used to access the resource associated with the PURL
Resolver address: the IP address or domain name of the PURL resolver
  This portion of the PURL is resolved by the Domain Name Server (DNS)
Name: user-assigned name
  Note: This may differ from the name of the resource in the associated URL

Metadata revealed! Instead of pointing directly to the location of an Internet resource, a PURL points to an intermediate resolution service. The PURL resolution service associates the PURL with the actual URL and returns the URL to the client. The client can then complete the URL transaction in the normal fashion. The advantage of PURLs is that they persist over time no matter where the page moves. (Diagram labels: protocol, resolver address, filename)

Metadata revealed! The model works something like this: the client sends the PURL to the PURL server, which returns the URL; the client then sends the URL to the resource server, which returns the resource.
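The three-part structure and the resolution step can be sketched with Python's standard `urllib.parse`. This is a toy model, not OCLC's resolver: the host `purl.example.org`, the class `PurlResolver`, and the target URLs are hypothetical.

```python
from urllib.parse import urlsplit

def purl_parts(purl):
    """Split a PURL into protocol, resolver address, and user-assigned name."""
    parts = urlsplit(purl)
    return {
        "protocol": parts.scheme,
        "resolver": parts.netloc,
        "name": parts.path,
    }

class PurlResolver:
    """Toy resolution service: associates each PURL name with a current URL."""

    def __init__(self):
        self._table = {}

    def assign(self, name, url):
        # Re-assigning a name records that the resource has moved.
        self._table[name] = url

    def resolve(self, purl):
        # Return the actual URL; the client then fetches it as usual.
        return self._table[purl_parts(purl)["name"]]

resolver = PurlResolver()
resolver.assign("/example/home", "http://www.example.org/users/me/index.html")
target = resolver.resolve("http://purl.example.org/example/home")
print(purl_parts("http://purl.example.org/example/home"))
print(target)
```

When the page moves, only the resolver's table changes; the PURL that readers have bookmarked or cited stays the same.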

Metadata revealed!
I. Introduction
  The state of the net today
  What is metadata and why do we need it?
II. What different metadata schemes are available?
  Dublin Core and Warwick Framework
  Digital Object Identifier
  Resource Description Framework
  Persistent URL
III. What does metadata mean for information management in libraries?

Metadata revealed! III. What does metadata mean for information management in libraries? There are social and technical issues in the use of metadata Metadata use requires collaboration because it provides little benefit if authors simply add whatever metadata they like to their resources People have to agree on the metadata schemes to use (purposes and values of the scheme) To reach agreement, people must be willing to abandon old procedures and adopt new methods It is critical to address the social aspects of metadata early and often

Metadata revealed! Issues: Implementing new ways of storing and retrieving information requires the cooperation of various stakeholders This means that education is necessary for those who may never see a metadata record up close and personal It requires attention to staffing and work flow In a practical sense, it is difficult to find people with the specialized skills needed to evaluate, implement, and maintain systems that exploit metadata The administration should understand that involvement with metadata will require commitment of time and resources for staff training and education

Metadata revealed! Convincing creators of digital information to use metadata: For a library to use metadata, the creators of digital documents must embed it in the document. This is not trivial, because adding metadata is an investment of time and effort. Librarians can help creators work with metadata but should not take responsibility for putting it in documents and files. This is because metadata is embedded in the work itself and will rarely if ever be directly controlled by the librarian.

Metadata revealed! Convincing librarians to understand metadata and tools which exploit it Metadata is a tool, not a solution Librarians must understand metadata and possess certain skills in order to make it useful for library patrons Introducing entirely new systems for information access may undermine the goal of providing integrated access to all the library's holdings It may require librarians and patrons to learn new skills For these reasons and others, some librarians may feel that it is not worthwhile to work with metadata

Metadata revealed! Who will be responsible for creating and maintaining metadata? Publisher side Author Webmaster Institution Service side Search service Third party creators How will it be done? Automatically generated Hand crafted

Metadata revealed! Technical Issues Compatibility with present access mechanisms and data One issue when introducing metadata is to retain compatibility with existing access mechanisms The catalog is the primary access point to the vast majority of library resources MARC has become the accepted standard for exchange of library information and has heavily influenced storage and display of information It is not wise to become dependent on a technology that is incompatible with MARC There should be a compelling case to believe that a technology will become dominant or that migration will be possible

Metadata revealed! There are no widely accepted metadata standards yet Some efforts have attracted interest, but use is infrequent and inconsistent Librarians are interested in the Dublin Core because its elements transfer relatively easily to MARC But the DC does not work well with resources that don't behave like paper documents Another problem with the DC is that it defines a minimal set of elements and further development seems to have stopped There is no guarantee that metadata generated today will be useful for providing access to documents in the future Banerjee, K. (1999). Practical Applications of Metadata at Oregon State University

Metadata revealed! Libraries Working Group Working Group Chair: Rebecca Guenther Working Group Charter: Foster increased interoperability between DC and library metadata by identifying issues and solutions; Keep the library community informed on DC developments; Consider reasons to experiment more widely with Dublin Core in libraries; Build a library Implementors community; Explore the need for a cross domain namespace(s) to register non-DC elements and qualifiers needed by the library community

Metadata revealed! What will information professionals have to learn? The range of applicable metadata schemes, their strengths and weaknesses How to apply appropriate schemes to digital information The ways in which various metadata schemes facilitate resource discovery How they affect the administration of digital information, information security, documentation, data mining … The relationship between standards and metadata How to test and evaluate various metadata schemes

Metadata revealed! A challenge for information professionals is to work with different metadata schemes This involves developing metadata “crosswalks” Fluid capability to work with same data in different metadata structures Requires agreement on semantics Requires standardized mappings for interoperability It will involve working with metadata in two forms Embedded with data Independent of items
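The two forms named above, embedded with the data and independent of the item, can be illustrated with one small record. This is only a sketch: the `DC.` prefix on `<meta>` names follows the common convention for Dublin Core in HTML, while the function names, the record contents, and the shape of the standalone record are assumptions made for the example.

```python
from html import escape


def embed_as_meta_tags(record):
    """Embedded form: render a flat DC record as HTML <meta> tags for the document head."""
    return "\n".join(
        f'<meta name="DC.{element}" content="{escape(value, quote=True)}">'
        for element, value in record.items()
    )


def as_standalone_record(record, resource_url):
    """Independent form: keep the same record apart from the item, pointing back at it."""
    return {"describes": resource_url, "metadata": dict(record)}


# Hypothetical record used only for illustration.
record = {"title": "Metadata revealed!", "creator": "Rosenbaum, Howard"}
embedded = embed_as_meta_tags(record)
standalone = as_standalone_record(record, "http://example.org/slides.html")
```

The embedded form travels with the document and is controlled by its creator; the standalone form can be stored and maintained by a third party such as a catalog or search service.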

Metadata revealed! Here is an example of a metadata crosswalk from Dublin Core to MARC The conversion of a DC-style record involves: Skeletal record for enhancement; Incorporating DC record into MARC database Subject and Keywords: Simple: 653$a (Index term--Uncontrolled) Complex: If scheme=LCSH: 650$a; If scheme=LCC: 050$a; If scheme=DDC: 082$a; If scheme=(other): 650$a with $2 (code) This enables communication of the DC record in MARC
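The Subject and Keywords rules on this slide translate directly into a lookup. The following is a minimal sketch that represents a MARC field as a plain dictionary rather than using any MARC library; the function name `crosswalk_subject` and the field representation are assumptions, indicators are omitted, and the scheme name is passed through unchanged as the `$2` code.

```python
# Scheme-specific targets taken from the slide: scheme -> (MARC tag, subfield code).
SCHEME_TO_MARC = {
    "LCSH": ("650", "a"),
    "LCC": ("050", "a"),
    "DDC": ("082", "a"),
}


def crosswalk_subject(value, scheme=None):
    """Map one DC Subject value to a MARC field (hypothetical dict representation)."""
    if scheme is None:
        # Simple case: uncontrolled index term.
        return {"tag": "653", "subfields": [("a", value)]}
    if scheme in SCHEME_TO_MARC:
        tag, code = SCHEME_TO_MARC[scheme]
        return {"tag": tag, "subfields": [(code, value)]}
    # Any other scheme: 650 $a, with the scheme recorded in $2.
    return {"tag": "650", "subfields": [("a", value), ("2", scheme)]}
```

For example, `crosswalk_subject("Metadata", scheme="LCSH")` yields a 650 field, while the same value with no scheme falls back to 653.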

Metadata revealed! Here are some other examples of crosswalks: DC/MARC/GILS Crosswalk MARC/FGDC fgdc2marc.html GILS/MARC Also: Dublin Core to FGDC MARC to SGML Crosswalks allow resource discovery across syntaxes

Metadata revealed! Examples of metadata schemes Text Encoding Initiative (TEI) Global Information Locator Service (GILS) Computer Interchange of Museum Information (CIMI) Encoded Archival Description (EAD) Content Standard for Digital Geospatial Metadata (CSDGM)

Metadata revealed! Nordic Metadata Project HotOil: Distributed Searching over Heterogeneous Information Sources National Biological Information Infrastructure (NBII) Categories for the Description of Works of Art (CDWA) Interoperability of Data in ECommerce Systems (INDECS) This presentation is on the web at: