Non-MARC Cataloging Standards Overview: TEI & EAD, MODS, METS, XML-based MARC. Eric Childress, OCLC. February 10, 2003.

1 Non-MARC Cataloging Standards Overview: TEI & EAD, MODS, METS, XML-based MARC
Eric Childress, OCLC
February 10, 2003

2 Overview
Fundamentals
  –Metadata and content
  –Types of metadata
  –Document mark-up languages & character encoding
The Big Picture
Metadata formats:
  –MARC
  –MODS
  –METS
  –MIX
  –TEI
  –EAD
  –ONIX

3 Fundamentals: Metadata and content
1. Metadata embedded in content object
  –Title page / CIP
  –HTML header in HTML document
2. Metadata separate from content object
  –Book + catalog card
  –Book + MARC record
3. Metadata linked to content object
  –MARC record with URL for ftp object
4. Metadata embedded and linked
  –MARC record with URL for HTML document
  –PDF document linked to DC-XML record
  –Aggregation of discrete objects linked to record

4 Fundamentals: Types of metadata
Administrative metadata: Data about the metadata (e.g., record number)
Descriptive metadata: Description of the object for discovery and retrieval (e.g., title)
Technical metadata: Technical characteristics of the object (e.g., file size)

5 Fundamentals: Markup languages & character encoding
Markup languages:
  –Address the structure of a document
  –Convey instructions to software that will process the text in order to:
    –Index the text for searching
    –Render the text (e.g., for screen display or print)
    –Transform the text for some output device (e.g., a voice synthesizer)
  –The markup is generally invisible to end-users
Extensible Markup Language (XML):
  –XML is a metalanguage: agencies define their own XML vocabulary to suit their task by creating Document Type Definitions (DTDs) or XML schemas
  –Data is kept separate from presentation instructions (which are recorded in a style sheet)
  –Balances flexibility with structure
Character encoding:
  –Used for communicating text characters in a computing environment
  –Hundreds of character encoding standards exist
  –Character conversion is complex and expensive
Unicode:
  –A single, “comprehensive” global encoding standard
  –Includes characters from the scripts of all major modern languages, most minor ones, and selected ancient ones
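A minimal sketch of the two ideas on this slide, using only the Python standard library. The `record`, `title`, and `creator` element names are invented for illustration and do not belong to any of the standards discussed here. The markup carries structure only, and the same title occupies a different number of bytes depending on the character encoding chosen.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: these element names are made up for this example.
doc = "<record><title>Les Misérables</title><creator>Hugo, Victor</creator></record>"

# The markup describes structure only; nothing here says how to display it.
root = ET.fromstring(doc)
title = root.findtext("title")

# One string of characters, two byte representations.
utf8_bytes = title.encode("utf-8")      # Unicode, variable-width encoding
latin1_bytes = title.encode("latin-1")  # legacy single-byte encoding

print(title, len(title), len(utf8_bytes), len(latin1_bytes))
```

The accented character takes two bytes in UTF-8 and one in Latin-1, which is one small reason conversion between encodings needs care.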

6 The Big Picture: Standards in a grid
[Figure: grid of metadata standards. Axes: Simple Description to Rich Description; Item to Collections. Standards plotted: Dublin Core, RSLP, OAI set record, TEI, VRA Core, ONIX, MARC 8, CSDGM.]

7 Library-related standards: MARC 21 (ISO 2709)
MARC 21 (ISO 2709) MARC 8:
  –Library metadata communications format based on ISO 2709
  –Strengths:
    –Mature standard
    –Widely adopted by libraries (U.S., Canada, and beyond)
    –Large universe of records available
    –Wide choice of software vendors
  –Weaknesses (in the present & future):
    –Virtually unused outside of libraries
    –Field and record size limitations
    –Restricted range of scripts supported (MARC 8 repertoire only)
    –Limited ability to convey hierarchical & complex relationships, attributes
    –No ability to embed related objects (e.g., book cover GIF)
    –Cannot be directly processed by widely-used web applications
MARC 21 (ISO 2709) Unicode:
  –MARC 21 with Unicode character encoding
  –Limited to 16K characters equivalent to MARC 8 repertoire
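The "field and record size limitations" follow from the fixed-width numbers in the ISO 2709 layout: the leader stores the record length in 5 digits and each directory entry stores a field length in 4 digits. The sketch below is a simplified illustration, not a full MARC 21 implementation. The helper names `build_record` and `parse_record` are invented here, and the code assumes ASCII-only data so that character counts equal byte counts.

```python
FIELD_TERMINATOR = "\x1e"
RECORD_TERMINATOR = "\x1d"
MAX_FIELD_LENGTH = 9999    # directory stores each field length in 4 digits
MAX_RECORD_LENGTH = 99999  # leader stores the record length in 5 digits

def build_record(fields):
    """Assemble a simplified ISO 2709 record from (tag, data) pairs."""
    directory, body = "", ""
    for tag, data in fields:
        data += FIELD_TERMINATOR
        if len(data) > MAX_FIELD_LENGTH:
            raise ValueError(f"field {tag} exceeds {MAX_FIELD_LENGTH} characters")
        # Directory entry: 3-char tag, 4-digit length, 5-digit start offset.
        directory += f"{tag}{len(data):04d}{len(body):05d}"
        body += data
    directory += FIELD_TERMINATOR
    base_address = 24 + len(directory)           # leader is always 24 characters
    total_length = base_address + len(body) + 1  # +1 for the record terminator
    if total_length > MAX_RECORD_LENGTH:
        raise ValueError(f"record exceeds {MAX_RECORD_LENGTH} characters")
    # Leader positions 0-4 hold the record length, 12-16 the base address.
    leader = f"{total_length:05d}nam  22{base_address:05d} a 4500"
    return leader + directory + body + RECORD_TERMINATOR

def parse_record(raw):
    """Read the record length and the fields back out of the raw string."""
    record_length = int(raw[0:5])
    base_address = int(raw[12:17])
    directory = raw[24:base_address - 1]  # drop the directory's terminator
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        begin = base_address + start
        fields.append((tag, raw[begin:begin + length - 1]))  # drop terminator
    return record_length, fields

# Hypothetical sample data, for illustration only.
sample = [("001", "12345"), ("245", "10\x1faNon-MARC standards")]
print(parse_record(build_record(sample)))
```

A field of 10,000 characters cannot be described by a 4-digit length, so `build_record` rejects it rather than writing a malformed directory.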

8 Library-related standards: MARC and XML
MARC 21 and XML:
  –Library of Congress’ MARCXML:
    –LC’s schema provides a lossless conversion of MARC 21 (ISO 2709) to XML
    –LC’s XML framework positions MARCXML as both an end format and an intermediate format on the way to non-MARC formats
  –Stanford University’s Lane Medical Library’s XMLMARC:
    –Developed before LC’s MARCXML schema
    –Ignores/simplifies some MARC 21 data
UNIMARC and XML:
  –Ministère de la culture et de la communication (France), Board of Research and Technology:
    –BiblioML DTD for converting UNIMARC to XML
    –Conversion tools in development
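As a rough illustration of what a MARCXML record looks like, the sketch below parses a small hand-written record with Python's standard library. The namespace and the `record`, `leader`, `controlfield`, `datafield`, and `subfield` elements come from LC's MARCXML schema; the record content and the `subfields` helper are made up for this example.

```python
import xml.etree.ElementTree as ET

MARCXML_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical record content, written by hand for illustration.
sample = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <controlfield tag="001">12345</controlfield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Non-MARC cataloging standards :</subfield>
    <subfield code="b">an overview</subfield>
  </datafield>
</record>"""

def subfields(record, tag):
    """Return (code, value) pairs for every subfield of the given data field."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield"
    return [(sf.get("code"), sf.text) for sf in record.findall(path, MARCXML_NS)]

record = ET.fromstring(sample)
print(subfields(record, "245"))
```

Tags, indicators, and subfield codes all survive as attributes, which is what makes a round trip back to ISO 2709 possible.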

9 Library-related standards: MODS
Metadata Object Description Schema (MODS):
  –Essentially MARC 21 recast in an XML-native framework
    –Text-based tags rather than numeric ones
    –Selected clusters of related MARC 21 attributes condensed into a single MODS element
  –MARC 21 converts readily to MODS, but MODS cannot be converted back to MARC 21 without loss
Value of MODS:
  –A rich, library-metadata-oriented XML metadata schema
  –Optimized for from-MARC conversion of legacy records
  –Selectively “improves” some of MARC’s mechanisms for representing resource type
  –Well-suited as a metadata format for OAI harvesting
  –Maintained by the same agency (LC) that maintains MARC 21
Applications of MODS:
  –LC planning to convert 100K American Memory records
  –Minerva project, U of Chicago Press, California Digital Library, and others using or planning to use MODS for records for web sites and e-texts
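The move from numeric tags to text-based tags can be sketched as follows. This is an illustrative mapping of two fields only, not LC's official MARC-to-MODS conversion. The function name `marc_to_mods` and the input dictionary shape are assumptions; the `mods`, `titleInfo`, `title`, `subTitle`, `name`, and `namePart` elements and the namespace are from the MODS schema.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"

def q(name):
    """Qualify a local element name with the MODS namespace."""
    return f"{{{MODS_NS}}}{name}"

def marc_to_mods(fields):
    """Map a few MARC fields to MODS.

    fields: dict of MARC tag -> dict of subfield code -> value (assumed shape).
    Indicators are not carried over, one reason the reverse trip is lossy.
    """
    mods = ET.Element(q("mods"))
    if "245" in fields:  # title statement
        title_info = ET.SubElement(mods, q("titleInfo"))
        ET.SubElement(title_info, q("title")).text = fields["245"].get("a")
        if "b" in fields["245"]:
            ET.SubElement(title_info, q("subTitle")).text = fields["245"]["b"]
    if "100" in fields:  # main entry, personal name
        name = ET.SubElement(mods, q("name"), type="personal")
        ET.SubElement(name, q("namePart")).text = fields["100"].get("a")
    return mods

# Hypothetical input, for illustration only.
mods = marc_to_mods({
    "245": {"a": "Non-MARC cataloging standards", "b": "an overview"},
    "100": {"a": "Childress, Eric"},
})
print(ET.tostring(mods, encoding="unicode"))
```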

10 Library-related standards: METS
Metadata Encoding and Transmission Standard (METS):
  –Standard for encoding descriptive, administrative, structural, rights, and other data essential for retrieving, preserving, and serving up digital resources
  –Six modules (header, descriptive metadata, administrative metadata, file section, structural map, behavior section)
  –Header and structural map are required; descriptive, administrative, and behavior metadata may reside in the METS object or be external
Value of METS:
  –Need for METS identified at DLF metadata experts meetings: varied local approaches to non-descriptive metadata were neither scaling well nor supporting interoperability between agencies
  –Can be used to collect digital resource metadata for submission to a repository, hold metadata in the repository, and inform user access applications
Applications of METS:
  –LC using for moving images, audio recordings, folk life mixed media collections
  –OCLC DPR, RLG, Harvard, National Library of Wales exploring or using for a variety of projects
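A skeleton METS document can be sketched with the standard library as below. Only three of the six modules are built (header, file section, structural map); the element and attribute names are from the METS schema, while the `build_mets` helper, the file ID, and the example.org URL are placeholders invented for this illustration.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"

def q(name):
    """Qualify a local element name with the METS namespace."""
    return f"{{{METS_NS}}}{name}"

def build_mets():
    """Build a minimal METS tree: header, one file, one structural division."""
    mets = ET.Element(q("mets"))
    ET.SubElement(mets, q("metsHdr"), CREATEDATE="2003-02-10T00:00:00")

    # File section: an inventory of the content files (placeholder URL).
    file_sec = ET.SubElement(mets, q("fileSec"))
    group = ET.SubElement(file_sec, q("fileGrp"))
    file_el = ET.SubElement(group, q("file"), ID="FILE1", MIMETYPE="image/tiff")
    ET.SubElement(file_el, q("FLocat"), {
        "LOCTYPE": "URL",
        f"{{{XLINK_NS}}}href": "http://example.org/page1.tif",
    })

    # Structural map: arranges the files into a navigable hierarchy.
    struct_map = ET.SubElement(mets, q("structMap"))
    div = ET.SubElement(struct_map, q("div"), TYPE="page", LABEL="Page 1")
    ET.SubElement(div, q("fptr"), FILEID="FILE1")
    return mets

mets = build_mets()
print(ET.tostring(mets, encoding="unicode"))
```

The `fptr` element points at a `file` by ID, which is how the structural map ties the logical arrangement to the physical files.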

11 Library-related standards: MIX
Metadata for Images in XML (MIX):
  –Collaboration of LC and the NISO Technical Metadata for Digital Still Images Standards Committee
  –XML schema for a set of technical data elements required to manage digital image collections
  –Format for interchange and/or storage of the data specified in the NISO Draft Standard Data Dictionary: Technical Metadata for Digital Still Images (version 1.2)
  –Still in early development and testing phases
Value of MIX:
  –Provides a common XML schema for expressing technical data particular to still and moving digital images
  –Can be used with other schemas such as METS and MODS as part of a comprehensive approach to managing and preserving digital images
Applications of MIX:
  –OCLC DPR, LC, others planning or testing
  –MIX still in nascent stage of development and testing

12 E-text-related standard: TEI
Text Encoding Initiative (TEI):
  –For complex markup of literary texts
  –Both SGML & XML [new] DTDs available
  –TEI “header” (TEIH) can be used as a descriptive metadata record
  –Maintenance agency: TEI Consortium
    –TEI Consortium has executive offices in Bergen, Norway, and is hosted at four university sites worldwide: the University of Bergen, Brown University, Oxford University, and the University of Virginia
    –Consortium maintains the “P4” Guidelines for Electronic Text Encoding and Interchange
Value of TEI:
  –Designed to meet the needs of the scholarly research community (esp. in the humanities) for a variety of activities, including:
    –Adding in-line academic commentary to e-texts
    –Aiding research by supporting special indexing points, etc.
Applications of TEI:
  –Widely used by major humanities electronic text collections such as CETH, the UVa e-text center, and many others
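To show how the TEI header can serve as a descriptive metadata record, the sketch below pulls the title and author out of a tiny hand-written P4-style document. The `TEI.2`, `teiHeader`, `fileDesc`, `titleStmt`, `publicationStmt`, and `sourceDesc` elements follow the P4 XML DTD; the text content and the `header_metadata` helper are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, written by hand; only the element names are TEI's.
sample = """<TEI.2>
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Sample text</title>
        <author>Anonymous</author>
      </titleStmt>
      <publicationStmt><p>Unpublished sample.</p></publicationStmt>
      <sourceDesc><p>Born digital.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text><body><p>The encoded text itself goes here.</p></body></text>
</TEI.2>"""

def header_metadata(root):
    """Read basic descriptive metadata from the header's title statement."""
    stmt = root.find("teiHeader/fileDesc/titleStmt")
    return {"title": stmt.findtext("title"), "author": stmt.findtext("author")}

root = ET.fromstring(sample)
print(header_metadata(root))
```

The header travels inside the same file as the text, an instance of metadata embedded in the content object.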

13 Archives-related standard: EAD
Encoded Archival Description (EAD):
  –A format for expressing electronic archival finding aids
  –Created by LC and the Society of American Archivists (SAA)
  –EAD DTD (Version 2002) is designed to function as both an SGML and an XML DTD
Value of EAD:
  –Effectively an organized presentation of a collection of documents
    –EAD header carries metadata for the finding aid
    –Provides for simple or complex mark-up to support varying levels of indexing
    –Well-suited for interweaving narrative with links to specific objects in a collection (either directly to the object or via a record for the object that may link to the object)
Applications of EAD:
  –Conversion of existing paper finding aids to electronic form
  –Widely used by academic institutions and archives in North America
  –RLG Archival Resources database hosts copies of many EADs
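The hierarchy of a finding aid, and the way a component can link out to a digital object, can be sketched as below. The element names (`ead`, `eadheader`, `archdesc`, `did`, `unittitle`, `dsc`, `c01`, `c02`, `dao`) follow the EAD 2002 DTD; the collection, the example.org link, and the `list_components` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical finding aid, written by hand for illustration.
sample = """<ead>
  <eadheader>
    <eadid>example-001</eadid>
    <filedesc><titlestmt>
      <titleproper>Guide to the Example Papers</titleproper>
    </titlestmt></filedesc>
  </eadheader>
  <archdesc level="collection">
    <did><unittitle>Example Papers</unittitle></did>
    <dsc>
      <c01 level="series">
        <did><unittitle>Correspondence</unittitle></did>
        <c02 level="file">
          <did>
            <unittitle>Letters, 1901</unittitle>
            <dao href="http://example.org/letter1.jpg"/>
          </did>
        </c02>
      </c01>
    </dsc>
  </archdesc>
</ead>"""

# Numbered component elements c01 through c12, plus the unnumbered <c>.
COMPONENT_TAGS = {f"c{n:02d}" for n in range(1, 13)} | {"c"}

def list_components(ead_root):
    """List (tag, title, link) for each component, in document order."""
    rows = []
    for el in ead_root.iter():
        if el.tag in COMPONENT_TAGS:
            dao = el.find("did/dao")
            link = dao.get("href") if dao is not None else None
            rows.append((el.tag, el.findtext("did/unittitle"), link))
    return rows

root = ET.fromstring(sample)
for row in list_components(root):
    print(row)
```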

14 Publishing-related standard: ONIX
ONIX International (Online Information Exchange):
  –Standard format for publishers to use to distribute electronic information about their publications
  –XML schema with Unicode encoding
  –Based on EPICS (EDItEUR Product Information Communication Standards)
  –Maintenance agency: EDItEUR, working with input from Book Industry Communication (BIC) and the Book Industry Study Group (BISG)
Value of ONIX:
  –Designed to meet the needs of publishers, jobbers, and retail sellers for:
    –Richer book data online (including cover art)
    –A common data exchange format that relieves players of the burden of costly, custom programming to handle data from individual suppliers
  –Offers two levels of richness (level 1 & level 2)
Applications of ONIX:
  –Primarily oriented towards jobbers and publishers
  –Most major players (Amazon, Baker & Taylor, etc.) now using/supporting
  –Some interest in implementation in library systems

15 Questions & Answers

16 Links: Major emphasis in this presentation
MARC 21: http://lcweb.loc.gov/marc/marcdocz.html
MARCXML: http://www.loc.gov/marc/marcxml.html
XMLMARC: http://laneweb.stanford.edu:2380/wiki/medlane/xmlmarc
BiblioML (UNIMARC XML): http://www.culture.fr/BiblioML
MODS: http://www.loc.gov/standards/mods
METS: http://www.loc.gov/standards/mets
MIX: http://www.loc.gov/standards/mix
TEI: http://www.tei-c.org
EAD: http://www.loc.gov/ead
ONIX: http://www.editeur.org/onix.html
Further reading on MARCXML, MODS, METS: “New Metadata Standards for Digital Resources,” Bulletin of the American Society for Information Science and Technology, Dec/Jan 2003, pp. 12-15. http://www.asis.org/Bulletin/Dec-02/ASISTDecJan.pdf

17 Links: Also appearing (in Big Picture)
SCORM: http://www.adlnet.org/index.cfm?fuseaction=scormabt
RSLP: http://www.ukoln.ac.uk/metadata/rslp
VRA Core: http://www.vraweb.org/vracore3.htm
IMS LOM: http://www.imsglobal.org/metadata
CSDGM: http://www.fgdc.gov/metadata/contstan.html
GEM: http://www.geminfo.org/Workbench
CIMI: http://www.cimi.org/old_site/standards

