Metadata Standards in Various Environments Spring 2006 30 January, 2006 Bharat Mehra IS 520 Organization and Representation of Information School of Information.

Slides:



Advertisements
Similar presentations
Putting the Pieces Together Grace Agnew Slide User Description Rights Holder Authentication Rights Video Object Permission Administration.
Advertisements

Putting together a METS profile. Questions to ask when setting down the METS path Should you design your own profile? Should you use someone elses off.
METS: An Introduction Structuring Digital Content.
An Introduction to MODS: The Metadata Object Description Schema Tech Talk By Daniel Gelaw Alemneh October 17, 2007 October 17, 2007.
3. Technical and administrative metadata standards Metadata Standards and Applications.
An Introduction to Metadata by Wendy Duff ECURE 2000 October 6, 2000.
Metadata: An Introduction By Wendy Duff October 13, 2001 ECURE.
© Tefko Saracevic, Rutgers University1 metadata considerations for digital libraries.
8/28/97Information Organization and Retrieval Metadata and Data Structures University of California, Berkeley School of Information Management and Systems.
1 CS 502: Computing Methods for Digital Libraries Lecture 17 Descriptive Metadata: Dublin Core.
OLC Spring Chapter Conferences Metadata, Schmetadata … Tell Me Why I Should Care? OLC Spring Chapter Conferences, 2004 Margaret.
The RDF meta model: a closer look Basic ideas of the RDF Resource instance descriptions in the RDF format Application-specific RDF schemas Limitations.
Metadata: Its Functions in Knowledge Representation for Digital Collections 1 Summary.
Digital Encoding What’s behind E-text Resources?.
Guest Lecture LIS 656, Spring 2011 Kathryn Lybarger.
UKOLUG - July Metadata for the Web RDF and the Dublin Core Andy Powell UKOLN, University of Bath UKOLN.
Metadata Standards and Applications 4. Metadata Syntaxes and Containers.
By Carrie Moran. To examine the Metadata Object Description Schema (MODS) metadata scheme to determine its utility based on structure, interoperability.
Metadata Standards and Applications 5. Applying Metadata Standards: Application Profiles.
Data Exchange Tools (DExT) DExT PROJECTAN OPEN EXCHANGE FORMAT FOR DATA enables long-term preservation and re-use of metadata,
Teaching Metadata and Networked Information Organization & Retrieval The UNT SLIS Experience William E. Moen School of Library and Information Sciences.
METS-Based Cataloging Toolkit for Digital Library Management System Dong, Li Tsinghua University Library
1 © Netskills Quality Internet Training, University of Newcastle Metadata Explained © Netskills, Quality Internet Training.
8/28/97Organization of Information in Collections Introduction to Description: Dublin Core and History University of California, Berkeley School of Information.
Metadata: An Overview Katie Dunn Technology & Metadata Librarian
Metadata Standards and Applications 1. Introduction to Digital Libraries and Metadata.
An Overview of MPEG-21 Cory McKay. Introduction Built on top of MPEG-4 and MPEG-7 standards Much more than just an audiovisual standard Meant to be a.
Metadata Xiangming Mu. What is metadata? What is metadata? (cont’) Data about data –Any data aids in the identification, description and location of.
1 XML as a preservation strategy Experiences with the DiVA document format Eva Müller, Uwe Klosa Electronic Publishing Centre Uppsala University Library,
The Metadata Object Description Schema (MODS) NISO Metadata Workshop May 20, 2004 Rebecca Guenther Network Development and MARC Standards Office Library.
Metadata Considerations Implementing Administrative and Descriptive Metadata for your digital images 1.
TEXT ENCODING INITIATIVE (TEI) Inf 384C Block II, Module C.
METADATA What Is It and What Can I Do With It? Vicki L. Gregory Associate Professor School of Library & Information Science University of South Florida.
Organizing Internet Resources OCLC’s Internet Cataloging Project -- funded by the Department of Education -- from October 1, 1994 to March 31, 1996.
Metadata and Geographical Information Systems Adrian Moss KINDS project, Manchester Metropolitan University, UK
Metadata: Essential Standards for Management of Digital Libraries ALI Digital Library Workshop Linda Cantara, Metadata Librarian Indiana University, Bloomington.
Meta Tagging / Metadata Lindsay Berard Assisted by: Li Li.
Lifecycle Metadata for Digital Objects (INF 389K) September 18, 2006 The Big Metadata Picture, Web Access, and the W3C Context.
1 Metadata –Information about information – Different objects, different forms – e.g. Library catalogue record Property:Value: Author Ian Beardwell Publisher.
Evolving MARC 21 for the future Rebecca Guenther CCS Forum, ALA Annual July 10, 2009.
Metadata and Documentation Iain Wallace Performing Arts Data Service.
Discovery Metadata for Special Collections Concepts, Considerations, Choices William E. Moen School of Library and Information Sciences Texas Center for.
Metadata Bridget Jones Information Architecture I February 23, 2009.
Introduction to metadata
Lifecycle Metadata for Digital Objects November 1, 2004 Descriptive Metadata: “Modeling the World”
Slavic Digital Text Workshop 2006 The Open Archives Initiative Protocol for Metadata Harvesting: an Opportunity for Sharing Content in a Distributed Environment.
Intellectual Works and their Manifestations Representation of Information Objects IR Systems & Information objects Spring January, 2006 Bharat.
The RDF meta model Basic ideas of the RDF Resource instance descriptions in the RDF format Application-specific RDF schemas Limitations of XML compared.
Metadata “Data about data” Describes various aspects of a digital file or group of files Identifies the parts of a digital object and documents their content,
Digitization – Basics and Beyond workshop Interoperability of cultural and academic resources New services for digitized collections Muriel Foulonneau.
Metadata and Meta tag. What is metadata? What does metadata do? Metadata schemes What is meta tag? Meta tag example Table of Content.
Metadata and Digital Libraries M. SURULINATHI Assistant Librarian.
Differences and distinctions: metadata types and their uses Stephen Winch Information Architecture Officer, SLIC.
Describing resources II: Dublin Core CERN-UNESCO School on Digital Libraries Rabat, Nov 22-26, 2010 Annette Holtkamp CERN.
A centre of expertise in digital information management UKOLN is supported by: Metadata – what, why and how Ann Chapman.
Online Information and Education Conference 2004, Bangkok Dr. Britta Woldering, German National Library Metadata development in The European Library.
Attributes and Values Describing Entities. Metadata At the most basic level, metadata is just another term for description, or information about an entity.
Geospatial metadata Prof. Wenwen Li School of Geographical Sciences and Urban Planning 5644 Coor Hall
Some basic concepts Week 1 Lecture notes INF 384C: Organizing Information Spring 2016 Karen Wickett UT School of Information.
Information organization Week 2 Lecture notes INF 380E: Perspectives on Information Spring 2015 Karen Wickett UT School of Information.
Metadata standards and interoperability 384C – Organizing Information Spring 2016 Karen Wickett School of Information University of Texas at Austin.
Session 3 Metadata & Workflow
WHAT DOES THE FUTURE HOLD? Ann Ellis Dec. 18, 2000
Attributes and Values Describing Entities.
Metadata for research outputs management
Metadata in Digital Preservation: Setting the Scene
Oya Y. Rieger Cornell University Library May 2004
Some Options for Non-MARC Descriptive Metadata
Attributes and Values Describing Entities.
Presentation transcript:

Metadata Standards in Various Environments Spring January, 2006 Bharat Mehra IS 520 Organization and Representation of Information School of Information Sciences University of Tennessee

What is Metadata? Structured information that describes, explains, locates, or makes it easier to retrieve, use, or manage an information resource (information object?) Data about data or information about information Used differently in different communities “machine understandable information”; library environment: any formal scheme of resource (information object) description

Why Metadata?

Types of Metadata Schemes ä ä Descriptive metadata describes a resource for purposes such as discovery and identification (title, abstract, author, and keywords) ä ä Structural metadata indicates how compound objects are put together (how pages are ordered to form chapters) ä ä Administrative metadata provides information to help manage a resource (when and how it was created, file type and other technical information, and who can access it) ä ä Rights management metadata, which deals with intellectual property rights ä ä Preservation metadata, which contains information needed to archive and preserve a resource

Characteristics of Metadata Scale of Metadata can describe resources at any level of aggregation Metadata can be for work, expression, manifestation, or item Metadata can be embedded in a digital object or it can be stored separately QUESTION Metadata is often embedded in HTML documents and in the headers of image files. Advantages and disadvantages?

Functions of Metadata Facilitates discovery of relevant information (resource discovery) Organizes electronic resources Facilitates interoperability and legacy resource integration Provide digital identification, and support archiving and preservation Resource Discovery ¶ ¶allowing resources to be found by relevant criteria; ¶ ¶identifying resources ¶ ¶bringing similar resources together ¶ ¶distinguishing dissimilar resources ¶ ¶giving location information

Functions of Metadata Organizing Electronic Resources ¶ ¶ Aggregate sites or portals organizing links to resources based on audience or topic ¶ ¶ Built as static webpages, with the names and locations of the resources “hardcoded” in HTML ¶ ¶ Build pages dynamically from metadata stored in databases: extraction tools Interoperability Allows information object to be understood by both humans and machines in ways that promote interoperability

Functions of Metadata Interoperability ability of multiple systems with different hardware and software platforms, data structures, and interfaces to exchange data with minimal loss of content and functionality Two approaches to interoperability a) Cross-system search and metadata harvesting Z39.50 protocol does not share metadata but maps own search capabilities to a common set of search attributes b) Open Archives Initiative translate their native metadata to a common core set of elements and expose this for harvesting. A search service provider gathers the metadata into a consistent central index to allow cross-repository searching regardless of the metadata formats used

Functions of Metadata Digital Identification ¶ ¶ Most metadata schemes include elements (standard numbers) to uniquely identify the info object to which the metadata refers ¶ ¶ Location of a info object may also be given using a file name, URL (Uniform Resource Locator), or some more persistent identifier such as a PURL (Persistent URL) or DOI (Digital Object Identifier) Archiving and Preservation: digital resources will not survive in usable form into the future. Why? Metadata is key to ensuring that resources will survive and continue to be accessible into the future. Archiving and preservation require special elements to track the lineage of a digital object (where it came from and how it has changed over time), to detail its physical characteristics, and to document its behavior in order to emulate it

Metadata: Vocabulary Metadata framework: sets of metadata elements designed for a specific purpose, such as describing a particular type of information resource Semantics of scheme: definition or meaning of the elements themselves Content: values given to metadata elements Metadata schemes: specify names of elements, their semantics, and content rules for how content must be formulated (for example, how to identify the main title), representation rules for content (for example, capitalization rules), and allowable content values (for example, terms must be used from a specified controlled vocabulary) Syntax rules: how the elements and their content should be encoded. A metadata scheme with no prescribed syntax rules is called syntax independent

Metadata: Vocabulary Metadata Framework: Metadata Framework: Define what information is needed as metadata. Metadata content from any one framework can be encoded in many different formats Encoding in definable syntax: e.g., SGML (Standard Generalized Mark- up Language) or XML (Extensible Mark-up Language). XML is an extended form of HTML that allows for locally defined tag sets and the easy exchange of structured information. SGML is a superset of both HTML and XML and allows for the richest mark-up of a document. Metadata Encoding: Metadata Encoding: A syntactic scheme for writing down some metadata content. A specification of the kinds of info that should be presented to describe an info object Mark-up language: Mark-up language: Metadata encoded and actually embedded within a document using a specific syntactic (and semantic) scheme

Metadata Schemes Qualified and unqualified (or simple) Dublin Core. Qualifiers can be used to refine (narrow the scope of) an element, or to identify the encoding scheme used in representing an element value. The element Date, for example, can be used with the refinement qualifier created to narrow the meaning of the element to the date the object was created. Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage, and Rights.

Metadata Scheme Dublin Core Dublin Core from a 1995 workshop sponsored by OCLC and the National Center for Supercomputing Applications Title=”Metadata Demystified” Creator=”Brand, Amy” Creator=”Daly, Frank” Creator=”Meyers, Barbara” Subject=”metadata” Description=”Presents an overview of metadata conventions in publishing.” Publisher=”NISO Press” Publisher=”The Sheridan Press” Date=” " Type=”Text” Format=”application/pdf” Identifier=” Metadata_Demystified.pdf” Language=”en”

The Text Encoding Initiative (TEI) The Text Encoding Initiative is an international project to develop guidelines for marking up electronic texts such as novels, plays, and poetry, primarily to support research in the humanities. In addition to specifying how to encode the text of a work, the TEI Guidelines for Electronic Text Encoding and Interchange also specify a header portion, embedded in the resource, that consists of metadata about the work. The TEI header, like the rest of the TEI, is defined as an SGML DTD (Document Type Definition)— a set of tags and rules defined in SGML syntax that describe the structure and elements of a document. This SGML mark-up becomes part of the electronic resource itself. Since the TEI DTD is rather large and complicated in order to apply to a vast range of texts and uses, a simpler subset of the DTD, known as TEI Lite, is commonly used in libraries. It is assumed that TEI-encoded texts are electronic versions of printed texts. Therefore the TEI Header can be used to record bibliographic information about both the electronic version of the text and about the non-electronic source version. The basic bibliographic information is similar to that recorded in library cataloging and can be mapped to and from MARC. However, there are also elements defined to record details about how the text was transcribed and edited, how mark-up was performed, what revisions were made, and other non-bibliographic facts. Libraries tend to use TEI headers when they have collections of SGML-encoded full text. Some libraries use TEI headers to derive MARC records for their catalogs, while others use MARC records as the basis for creating TEI header descriptions for the source texts. Metadata Encoding and Transmission Standard (METS) The Metadata Encoding and Transmission Standard (METS) was developed to fill the need for a standard data structure for describing complex digital library objects. METS is an XML Schema for creating XML document instances that express the structure of digital library objects, the associated descriptive and administrative metadata, and the names and locations of the files that comprise the digital object. The metadata necessary for successful management and use of digital objects is both more extensive than and different from the metadata used for managing collections of printed works and other physical materials. Structural metadata is needed to ensure that separately digitized files (for example, different pages of a digitized book) are structured appropriately. Technical metadata is needed for information about the digitization process so that scholars may determine how accurate a reflection of the original the digital version provides. Other technical metadata is required for internal purposes in order to periodically refresh and migrate the data, ensuring the durability of valuable resources. METS was originally an outgrowth of the Making of America II project, a digitization project of major research libraries that attempted to address these metadata issues, in part by providing an encoding format for metadata for textual and image-based works. The Digital Library Federation (DLF) built on that earlier work to create METS, a standard schema for providing a method for expressing and packaging together descriptive, administrative, and structural metadata for objects within a digital library. Expressed using the XML schema language, METS provides a document format for encoding the metadata necessary for management of digital library objects within a repository and for exchange between repositories. The Text Encoding Initiative (TEI) ¶ ¶ International project to develop guidelines for marking up electronic texts such as novels, plays, and poetry, primarily to support research in the humanities ¶ ¶ Specifies how to: encode the text of a work, header portion, embedded resource (contains metadata about the work) ¶ ¶ TEI header, like the rest of the TEI, is defined as an SGML DTD (Document Type Definition)—a set of tags and rules defined in SGML syntax that describe the structure and elements of a document. This SGML mark-up becomes part of the electronic resource itself Metadata Schemes

The Text Encoding Initiative (TEI) The Text Encoding Initiative is an international project to develop guidelines for marking up electronic texts such as novels, plays, and poetry, primarily to support research in the humanities. In addition to specifying how to encode the text of a work, the TEI Guidelines for Electronic Text Encoding and Interchange also specify a header portion, embedded in the resource, that consists of metadata about the work. The TEI header, like the rest of the TEI, is defined as an SGML DTD (Document Type Definition)— a set of tags and rules defined in SGML syntax that describe the structure and elements of a document. This SGML mark-up becomes part of the electronic resource itself. Since the TEI DTD is rather large and complicated in order to apply to a vast range of texts and uses, a simpler subset of the DTD, known as TEI Lite, is commonly used in libraries. It is assumed that TEI-encoded texts are electronic versions of printed texts. Therefore the TEI Header can be used to record bibliographic information about both the electronic version of the text and about the non-electronic source version. The basic bibliographic information is similar to that recorded in library cataloging and can be mapped to and from MARC. However, there are also elements defined to record details about how the text was transcribed and edited, how mark-up was performed, what revisions were made, and other non-bibliographic facts. Libraries tend to use TEI headers when they have collections of SGML-encoded full text. Some libraries use TEI headers to derive MARC records for their catalogs, while others use MARC records as the basis for creating TEI header descriptions for the source texts. Metadata Encoding and Transmission Standard (METS) The Metadata Encoding and Transmission Standard (METS) was developed to fill the need for a standard data structure for describing complex digital library objects. METS is an XML Schema for creating XML document instances that express the structure of digital library objects, the associated descriptive and administrative metadata, and the names and locations of the files that comprise the digital object. The metadata necessary for successful management and use of digital objects is both more extensive than and different from the metadata used for managing collections of printed works and other physical materials. Structural metadata is needed to ensure that separately digitized files (for example, different pages of a digitized book) are structured appropriately. Technical metadata is needed for information about the digitization process so that scholars may determine how accurate a reflection of the original the digital version provides. Other technical metadata is required for internal purposes in order to periodically refresh and migrate the data, ensuring the durability of valuable resources. METS was originally an outgrowth of the Making of America II project, a digitization project of major research libraries that attempted to address these metadata issues, in part by providing an encoding format for metadata for textual and image-based works. The Digital Library Federation (DLF) built on that earlier work to create METS, a standard schema for providing a method for expressing and packaging together descriptive, administrative, and structural metadata for objects within a digital library. Expressed using the XML schema language, METS provides a document format for encoding the metadata necessary for management of digital library objects within a repository and for exchange between repositories. Metadata Encoding and Transmission Standard (METS) ¶ ¶ Provides a standard data structure for describing complex digital library objects ¶ ¶ METS is an XML Schema for creating XML document instances that express the structure of digital library objects, the associated descriptive and administrative metadata, and the names and locations of the files that comprise the digital object Metadata Schemes

Metadata Object Description Schema (MODS) ¶ ¶ Descriptive metadata schema that is a derivative of MARC 21 and intended to either carry selected data from existing MARC 21 records or enable the creation of original resource description records ¶ ¶ Includes a subset of MARC fields and uses language based tags rather than the numeric ones used in MARC 21 records. In some cases, it regroups elements from the MARC 21 bibliographic format ¶ ¶ Like METS, MODS is expressed using the XML schema language

Metadata Schemes A MODS Record Example Metadata demystified Brand Amy author text 2003 Bethesda, MD NISO Press

Metadata Schemes Learning Object Metadata ¶ ¶ Developed by the Learning Technology Standards Committee (LTSC) to enable the use and re-use of technology-supported learning resources such as computer-based training and distance learning ¶ ¶ LOM defines the minimal set of attributes to manage, locate, and evaluate learning objects that include: General, containing information about the object as a whole Lifecycle, containing metadata about the objects evolution Technical, with descriptions of the technical characteristics and requirements Educational, containing the educational

Metadata Schemes The Encoded Archival Description (EAD) Used for marking up the data contained in finding aids so that they can be searched and displayed online for archive and special collections E-Commerce – and ONIX Developed to support electronic commerce applications. Framework (Interoperability of Data in ECommerce Systems) and ONIX (Online Information Exchange): XML-based scheme developed by publishers in the book industry Visual Objects – CDWA and VRA Categories for Descriptors for Works of Art Visual Resource Association MPEG Multimedia Metadata Moving Pictures Expert Group Metadata for Datasets Federal Geographic Data Committee (FGDC) Content Standard for Digital Geospatial Metadata

Sample Metadata Encoding: Dublin Core in HTML <META NAME=“DC.Creator.Address” <META NAME=“DC.Subject” Content=“Information Organization and Representation”>

Sample Metadata Encoding: Dublin Core in XML IS 520 Syllabus Page IS 520 Syllabus Page Bharat Mehra Bharat Mehra Information Organization and Representation Information Organization and Representation IS 520 IS 520 <dc:date> </dc:date><dc:format>text/html</dc:format><dc:language>en</dc:language> Comparison: same content but different encoding HTML: HTML: XML: XML: Bharat Mehra

Metadata Encoding: Interoperability & Exchange The Resource D e s c r i p t i o n Framework (RDF), developed by the World Wide Web Consortium (W3C), is a data model for the description of resources on the Web that provides a mechanism for integrating multiple metadata schemes Metadata Cross-walks: that help to map the elements, semantics, and syntax from one metadata scheme to those of another Metadata Registries: Provide information about definition, origin, source, and location of data

Extensions and Profiles: Modifications to metadata schemes An extension is the addition of elements to an already developed scheme to support the description of an information resource of a particular type or subject or to meet the needs of a particular interest group Profiles are subsets of a scheme that are implemented by a particular interest group Creation Tools: Templates, Mark-up tools, Extraction tools, Conversion tools

Metadata Quality Control Metadata Quality Control Framework of Guidance for Building Good Digital Collections Articulates six principles applying to good metadata: Appropriate to the materials in the collection, users of the collection, and intended, current and likely use of the digital object Supports interoperability Uses standard controlled vocabularies to reflect the what, where, when and who of the content Includes a clear statement on the conditions and terms of use for the digital object Records are objects themselves and therefore should have the qualities of archiving, persistence, unique identification Authoritative and verifiable Supports the long-term management of objects in collections

“Critical Reflection Assignment 3” Each group will be given an “information object” and the group task is to create a bibliographic card of eight fields. Each group has to choose the eight fields according to their perception of the importance of those fields in searching for that particular item and rank the fields in the order of importance. Identify what you did and what you learned from the process. What thoughts did you have before, during or after the exercise? What did the process help you understand about the process of organization and representation of information?