ECHO DEPository Project: Highlight on tools & emerging issues

Presentation transcript:

The ECHO DEPository Project is a 3-year digital preservation research and development project at the University of Illinois at Urbana-Champaign, in partnership with OCLC and a consortium of content provider partners, and funded by the Library of Congress under its National Digital Information Infrastructure and Preservation Program (NDIIPP).

Overview

This poster presents selected highlights of work to date in three core areas of activity:

1. Repository Evaluation
2. Preservation Research
3. Tools Development

Web Archives Workbench (WAW)

Based on the Arizona selection model, the Workbench will help users identify, select and harvest Web content, add metadata to harvested objects, and package harvested objects for ingest into a digital repository. The first two tools are in beta testing as of May

A suite of four tools

- Discovery tool (May 2005): Identify and maintain a list of domains relevant to the collecting area
- Properties tool (May 2005): Organize collecting scope; add metadata that will be inherited by harvested content
- Analysis tool (Jan 2006): Provide visual analysis of site structure to inform decisions at the series level regarding what should be harvested
- Packager tool (Jan 2006): Package the harvested content and associated metadata for ingest into a repository

WAW Benefits

- Reduces time to identify & select web materials to a more practical size, while keeping a scalable degree of human involvement
- Fits within any number of preservation strategies quickly and efficiently; operates in multiple repository environments
- Not a repository but a “front-end” for many repositories
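The Properties tool is described above as adding metadata that will be inherited by harvested content. The poster does not say how that inheritance is implemented, so what follows is only a minimal sketch of the idea, assuming a simple parent/child hierarchy. The class name `Aggregate`, its fields, and the example values are hypothetical and are not taken from the Workbench.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Aggregate:
    """Hypothetical node in a collecting hierarchy (provider, site, document)."""
    name: str
    metadata: Dict[str, str] = field(default_factory=dict)
    parent: Optional["Aggregate"] = None

    def effective_metadata(self) -> Dict[str, str]:
        """Merge metadata from the root down; local values override inherited ones."""
        inherited = self.parent.effective_metadata() if self.parent else {}
        return {**inherited, **self.metadata}


# Hypothetical example values, for illustration only.
provider = Aggregate("Example Agency",
                     {"creator": "Example Agency", "subject": "Water quality"})
site = Aggregate("example.gov", {"subject": "Annual reports"}, parent=provider)
document = Aggregate("report-2005.pdf", parent=site)

# The document carries no metadata of its own, yet resolves creator and subject.
print(document.effective_metadata())
```

In this sketch a value set lower in the hierarchy overrides one set higher up, which is one plausible design choice; a real tool might instead accumulate values such as multiple subject headings.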
a. Discovery tool

- Helps curators identify domains that are within their collecting scope
- Crawls web sites & discovers domains of possible interest from content
- Maintains lists of domains
- Monitors selected domains for changes

Entry points: Select a ‘seed’ URI from which the crawler will discover domains of possible interest
Domains: Review the list of discovered domains; assign each as in or out of collecting scope

b. Properties tool

- Relates content providers to web sites
- Organizes a ‘group’ of web sites hierarchically
- Associates metadata with content providers and, later, with selected content
- Metadata can be subject headings, preferred names, aliases, etc.

Entities: Associate metadata with content creators

c. Analysis tool

- Content selection at varying levels of granularity
- Harvests an entire site or one document
- Scheduled harvesting of content
- Shows site structure for content identification and harvest parameters

d. Packaging tool

- Content is associated with the content provider’s metadata
- Combines descriptive metadata with the digital object
- Packages web content and metadata into an XML standard package (METS)
- Neutral format for ingest into the OCLC archive and other repositories

See our web site at

Emerging issues and research questions

Status: Initial population of DSpace is in progress, in preparation for export to other repository packages. Content includes DOQs, full-text journal articles, and PBS media files.

Emerging issues: Storage space is being consumed faster than anticipated because of the duplication needed to run multiple repositories.

How exactly do we understand resources as digital objects? Consider the UIUC legacy mainframe database. We have arrived (with effort and luck) at a description of its file structure. If we record a description of that structure, we can later review the resource for archival purposes, but not continue to use the database for retrieval.
On the other hand, if we convert the records into a modern relational database that can run via a DBMS, we have extended the lifespan of the resource. But the file structure at the byte level and the record structure at the logical level will be very different from those of the legacy binary file. There will be certain questions (of, e.g., provenance) that we can no longer answer. Or we can do both. But what, exactly, have we preserved in either case, and how do we describe what we've done precisely using metadata?

Further emerging issues

- Need for a common packaging format
- Anticipated paucity of metadata

Further research questions

- What does it mean to say that we have “preserved” a digital resource?
- Is the encapsulation of an object's internal structure behind an object-oriented interface a benefit for preservation or a drawback?

WAW Basis: The Arizona Model

- Materials are managed as a hierarchy of aggregates
- Website is viewed as similar to an archival collection
- Creates efficiencies for … selection of content; name authority & other metadata creation
- Beyond initial analysis, content selection & description can be automated

Other poster panel labels: WAW Tools; WAW Benefits; In development
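The poster states that the Packaging tool combines descriptive metadata with the digital object in a METS package. It does not show the package itself, so the following is only a rough sketch of what a minimal METS wrapper for one harvested file might look like, built with Python's standard library. The function name `build_mets`, the choice of Dublin Core for the descriptive metadata, the identifiers `DMD1` and `FILE1`, and the example values are all assumptions for illustration. The output has not been validated against the METS schema and is not the Workbench's actual format.

```python
import xml.etree.ElementTree as ET

# Published namespace URIs for METS, XLink, and Dublin Core.
METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"
DC = "http://purl.org/dc/elements/1.1/"

ET.register_namespace("mets", METS)
ET.register_namespace("xlink", XLINK)
ET.register_namespace("dc", DC)


def build_mets(title: str, creator: str, file_url: str, mimetype: str) -> ET.Element:
    """Hypothetical helper: wrap one harvested file and its metadata in a METS skeleton."""
    mets = ET.Element(f"{{{METS}}}mets")

    # Descriptive metadata section: Dublin Core wrapped inside METS (an assumption).
    dmd = ET.SubElement(mets, f"{{{METS}}}dmdSec", ID="DMD1")
    wrap = ET.SubElement(dmd, f"{{{METS}}}mdWrap", MDTYPE="DC")
    xml_data = ET.SubElement(wrap, f"{{{METS}}}xmlData")
    ET.SubElement(xml_data, f"{{{DC}}}title").text = title
    ET.SubElement(xml_data, f"{{{DC}}}creator").text = creator

    # File section: a pointer to the harvested content.
    file_sec = ET.SubElement(mets, f"{{{METS}}}fileSec")
    group = ET.SubElement(file_sec, f"{{{METS}}}fileGrp")
    file_el = ET.SubElement(group, f"{{{METS}}}file", ID="FILE1", MIMETYPE=mimetype)
    ET.SubElement(file_el, f"{{{METS}}}FLocat",
                  {"LOCTYPE": "URL", f"{{{XLINK}}}href": file_url})

    # Structural map: ties the file to its descriptive metadata.
    struct = ET.SubElement(mets, f"{{{METS}}}structMap")
    div = ET.SubElement(struct, f"{{{METS}}}div", DMDID="DMD1")
    ET.SubElement(div, f"{{{METS}}}fptr", FILEID="FILE1")
    return mets


# Hypothetical example values, for illustration only.
package = build_mets("Annual Report 2005", "Example Agency",
                     "http://example.gov/report-2005.pdf", "application/pdf")
print(ET.tostring(package, encoding="unicode"))
```

Keeping the metadata and the file pointer in one XML document is what makes such a package a neutral hand-off format: a receiving repository needs only a METS reader, not knowledge of the harvesting tool.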