Create and Manage METS in retrodigitization. Markus Enders, Goettingen State and University Library. www.sub.uni-goettingen.de/GDZ

Presentation transcript:

Create and Manage METS in retrodigitization. Markus Enders, Goettingen State and University Library.

Digitization Center: located at the State and University Library Göttingen; founded in 1997; funded by DFG; build infrastructure; set up production line for digitization.

Digitization Center, production line: 3 b/w and greyscale book scanners; quality control; 2 color digitization workplaces; image enhancement; Ca pages / year; production line for all in-house digitization projects.

Digitization Center, infrastructure: software to create content; software to present content on the web and software to manage content (the DMS); hardware to store and manage content.

Document model: logical structure (monograph, chapters, articles, etc.); physical structure (only pages; no metadata for pages).

Document model, descriptive metadata: MODS extension with its own namespace.

Document model, fulltext: fulltext with coordinates for words, kept in a separate TEI/XML file that is linked to METS.

Document model, fulltext. Problem with TEI: the physical structure has to be tagged in TEI, but TEI only supports page and column breaks.

Document model, fulltext. Solution: tag the smallest physical structure in the fulltext: text-blocks ( element).

Document model, content files: fulltext with coordinates for words; one image per page.
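
The document model above (a logical structure, a page-only physical structure, and one image per page) maps naturally onto METS structMap, fileSec and structLink elements. The following is a minimal sketch in Python, not the Göttingen METS profile: the ID scheme, the TYPE values ("monograph", "chapter", "physSequence", "page") and the image file names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS)
ET.register_namespace("xlink", XLINK)


def q(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS}}}{tag}"


def build_mets(title, chapters, page_count):
    """Build a METS tree; chapters is a list of (label, first_page, last_page)."""
    root = ET.Element(q("mets"))

    # One image file per page (file names are hypothetical).
    file_grp = ET.SubElement(ET.SubElement(root, q("fileSec")), q("fileGrp"))
    for n in range(1, page_count + 1):
        f = ET.SubElement(file_grp, q("file"), ID=f"IMG_{n:04d}", MIMETYPE="image/tiff")
        ET.SubElement(f, q("FLocat"), {"LOCTYPE": "URL", f"{{{XLINK}}}href": f"{n:08d}.tif"})

    # Logical structure: a monograph containing chapters.
    logical = ET.SubElement(root, q("structMap"), TYPE="LOGICAL")
    mono = ET.SubElement(logical, q("div"), ID="LOG_0000", TYPE="monograph", LABEL=title)
    for i, (label, _, _) in enumerate(chapters, start=1):
        ET.SubElement(mono, q("div"), ID=f"LOG_{i:04d}", TYPE="chapter", LABEL=label)

    # Physical structure: pages only, each pointing at its image file.
    physical = ET.SubElement(root, q("structMap"), TYPE="PHYSICAL")
    seq = ET.SubElement(physical, q("div"), ID="PHYS_0000", TYPE="physSequence")
    for n in range(1, page_count + 1):
        page = ET.SubElement(seq, q("div"), ID=f"PHYS_{n:04d}", TYPE="page", ORDER=str(n))
        ET.SubElement(page, q("fptr"), FILEID=f"IMG_{n:04d}")

    # structLink ties every chapter to the pages it covers.
    links = ET.SubElement(root, q("structLink"))
    for i, (_, first, last) in enumerate(chapters, start=1):
        for n in range(first, last + 1):
            ET.SubElement(links, q("smLink"), {
                f"{{{XLINK}}}from": f"LOG_{i:04d}",
                f"{{{XLINK}}}to": f"PHYS_{n:04d}",
            })
    return root
```

Calling `ET.tostring(build_mets("Example", [("Chapter 1", 1, 2)], 2), encoding="unicode")` serializes the tree. Keeping the two structMaps separate and joining them through structLink means page-level data never has to be repeated inside the logical structure.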

Production (metadata), Excel spreadsheet: bibliographic information; pagination information; structure information with metadata.

Excel spreadsheet – bibliographic information on Monograph level

Excel spreadsheet – pagination information. Columns A and C: counted pages start and end, logical page numbers. Columns D and E: uncounted pages start and end. Columns M and N: calculated physical page numbers.
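
How the physical page numbers might be calculated from the counted and uncounted ranges can be sketched as follows. The slide only names the columns, so the row layout and the rule used here (ranges are listed in book order and physical numbering runs consecutively from 1) are assumptions, and `physical_ranges` is a hypothetical helper rather than the spreadsheet's actual formula.

```python
def physical_ranges(rows):
    """Assign consecutive physical page numbers to pagination ranges.

    rows: iterable of (kind, first, last) in book order, where kind is
    "counted" or "uncounted" and first/last are the page numbers of the range.
    Returns a list of (kind, first, last, phys_start, phys_end).
    """
    result = []
    next_phys = 1  # physical numbering starts with the first scanned page
    for kind, first, last in rows:
        length = last - first + 1
        if length < 1:
            raise ValueError(f"empty or reversed range: {first}-{last}")
        result.append((kind, first, last, next_phys, next_phys + length - 1))
        next_phys += length
    return result


# Four uncounted front-matter pages, 120 counted pages, two uncounted end pages.
example = physical_ranges([
    ("uncounted", 1, 4),
    ("counted", 1, 120),
    ("uncounted", 1, 2),
])
```

In this example the counted pages 1 to 120 occupy physical pages 5 to 124, which is the offset a viewer needs to jump from a printed page number to the right image.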

Excel spreadsheet – structural information. Column B: type of structure element. Columns C and D: start location of structure element (sequence and page). Columns H and I: author and title of structure element.
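
Because only the start location of each structure element is recorded, an end page has to be derived before the elements can be linked to pages. A minimal sketch, assuming a flat (non-nested) list of elements in book order and assuming that an element runs up to the page before the next element starts; the dictionary keys are hypothetical stand-ins for the spreadsheet columns.

```python
def element_ranges(rows, last_page):
    """Derive an end page for each structure element from the start pages.

    rows: list of dicts with keys "type", "start" (physical sequence number),
    "author" and "title", in book order. last_page is the final physical page.
    """
    ranges = []
    for i, row in enumerate(rows):
        if i + 1 < len(rows):
            # Ends just before the next element; if both start on the same
            # page, the element covers that single page.
            end = max(row["start"], rows[i + 1]["start"] - 1)
        else:
            end = last_page
        ranges.append({**row, "end": end})
    return ranges


articles = element_ranges(
    [
        {"type": "article", "start": 1, "author": "A. Author", "title": "First"},
        {"type": "article", "start": 5, "author": "B. Author", "title": "Second"},
        {"type": "article", "start": 20, "author": "C. Author", "title": "Third"},
    ],
    last_page=30,
)
```

Nested structures (chapters containing sections) would need the element hierarchy as well, which this sketch deliberately leaves out.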

Excel spreadsheet: conversion of content to an RDF/XML-based file using a Visual Basic script; conversion of content to METS using Java (POI library); METS file still in beta test.

AGORA Editor: commercial program; structural and bibliographic metadata; images are displayed during capturing; pagination information is captured "automatically".

AGORA Editor: writes an RDF/XML-based file, which is converted to METS using a Java program.

Production (metadata and fulltext), docWorks: software by CCS; structure data, metadata and fulltext; direct METS output (no conversion necessary); testing started in June.

Production, METS: only docWorks has direct METS output. For the other solutions, a Java program will convert the output to METS (Excel -> METS, RDF/XML -> METS). The converter can also be used to migrate old data to METS.

Management and presentation, Document Management System: one platform for all digitization projects; development began in 1998; definition of an own RDF/XML-based format; cooperation with an external company, "Satz-Rechen-Zentrum", Berlin.

Document Management System “AGORA”: Java-based server; Verity search engine for metadata and fulltext; Java-based system that uses a relational database; Windows administration client.

Document Management System “AGORA”, data storage: metadata, structure data and fulltext in a relational database; images stored in the file system.

Document Management System “AGORA”, import: RDF/XML files (metadata; structure); image data from the file system; METS support in the August release; TEI/XML for fulltext (stored in database); batch import possible (hotfolder).
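
The hotfolder idea behind the batch import can be sketched in a few lines: files dropped into a watched directory are imported one by one and then moved away so they are not picked up twice. This is an illustrative sketch only and does not show how AGORA implements it; `process_hotfolder`, the `*.xml` pattern and the `import_file` callback are assumptions.

```python
import shutil
from pathlib import Path


def process_hotfolder(hotfolder, done_dir, import_file):
    """Import every XML file found in hotfolder, then move it to done_dir.

    import_file is a caller-supplied function taking a Path (hypothetical
    stand-in for the real import routine). Returns the imported file names.
    """
    hotfolder, done_dir = Path(hotfolder), Path(done_dir)
    done_dir.mkdir(parents=True, exist_ok=True)
    imported = []
    for path in sorted(hotfolder.glob("*.xml")):
        import_file(path)  # an exception here leaves the file in place for a retry
        shutil.move(str(path), str(done_dir / path.name))
        imported.append(path.name)
    return imported
```

A scheduler or a simple loop with a sleep interval would call `process_hotfolder` repeatedly; moving files only after a successful import keeps failed files visible in the hotfolder.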

Document Management System “AGORA”, access via web front end: HTML templates (WebMacro); caching of HTML pages for high performance; XML output possible (via WebMacro).

DMS “AGORA”, page view: zoom with on-the-fly conversion of images.

DMS “AGORA”, hit list: image highlighting possible (fulltext search).
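
Image highlighting of this kind relies on the word coordinates stored with the fulltext. A minimal sketch of the lookup, assuming each word carries a pixel rectangle on the page image; the `Word` record, the exact-match rule and the `scale` parameter for zoomed images are assumptions, not the AGORA implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """A fulltext word with its rectangle on the page image, in pixels."""
    text: str
    x: int
    y: int
    width: int
    height: int


def highlight_boxes(words, query, scale=1.0):
    """Return (x, y, width, height) rectangles for words matching query.

    Matching is case-insensitive and exact; scale converts coordinates of the
    master image to those of a zoomed or reduced derivative.
    """
    needle = query.casefold()
    return [
        (round(w.x * scale), round(w.y * scale),
         round(w.width * scale), round(w.height * scale))
        for w in words
        if w.text.casefold() == needle
    ]
```

The viewer would draw these rectangles over the page image; because the image is converted on the fly for zooming, the same scale factor has to be applied to the coordinates.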

Document Management System “AGORA”, access via Java API: full functionality available (add, update, read and delete elements; retrieval); OAI-PMH implementation based on the API.
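
From a harvester's point of view, an OAI-PMH interface is addressed with plain HTTP requests that carry a `verb` and a few arguments. The sketch below builds such request URLs; the base URL and the record identifier are made up, and nothing here reflects the actual AGORA endpoint. The verbs and the `oai_dc` metadata prefix come from the OAI-PMH protocol itself.

```python
from urllib.parse import urlencode


def oai_request(base_url, verb, **args):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    params = {"verb": verb}
    # Skip arguments that were passed as None so optional ones can be omitted.
    params.update({k: v for k, v in args.items() if v is not None})
    return f"{base_url}?{urlencode(params)}"


BASE = "https://example.org/oai"  # hypothetical endpoint

list_url = oai_request(BASE, "ListRecords", metadataPrefix="oai_dc")
record_url = oai_request(
    BASE, "GetRecord",
    identifier="oai:example.org:record-1",  # hypothetical identifier
    metadataPrefix="oai_dc",
)
```

A repository that also exposes METS would advertise an additional metadata prefix in its `ListMetadataFormats` response; which prefix that is depends on the installation.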

Document Management System “AGORA” Export: XML export (with images)

Document Management System “AGORA”, PDF export: logical structure as bookmarks.

Future document model: logical structure (monograph, chapters, articles, etc.); physical structure (pages, columns, ...); descriptive metadata; technical metadata for images (NISO / MIX); fulltext; derivatives of content files (images).

Future document model, metadata production line (using METS): docWorks; AGORA Editor; AGORA DMS; archive; METS converter.

Further information: GDZ; DigiZeitschriften (example); AGORA.