Working prototype Multi-Institution Testbed for Scalable Digital Archiving Three institutions are working together to rescue at-risk media, establish interoperability,


Working prototype
Multi-Institution Testbed for Scalable Digital Archiving

Three institutions are working together to rescue at-risk media, establish interoperability, and provide community access to shipboard and deep submergence vehicle data. The scientific benefits extend beyond these three institutions, as the holdings contain the results of 1600 major research expeditions from dozens of institutions, worldwide. Preservation is also motivated by acquisition cost, at $1M to $1.5M per expedition.

This DIGARCH project tests the extension of digital library architectures, establishment of controlled vocabularies, auto-harvesting of metadata, automation of ingest and validation of content. SIO and SDSC contribute technology from the SIOExplorer NSDL project, including data and metadata harvesting, federated digital libraries, and user interfaces, as well as a digital library of 647 SIO cruises with 92,000 digital objects. WHOI contributes GeoBrowser Technologies and GIS Server based applications, and a collection of digital, video, film and paper items from 5000 campaigns to explore the deep sea over 40 years.

Stephen Miller (SIO), John Helly (SDSC), Bob Detrick (WHOI)
Scripps Institution of Oceanography (SIO), San Diego Supercomputer Center (SDSC), Woods Hole Oceanographic Institution (WHOI)

We thank the DIGARCH Program of the National Science Foundation and the Library of Congress for their support (NSF IIS 0455998). SIOExplorer was largely developed as an NSDL Collections Track project (NSF DUE 0121684).

Overview
At-risk data of historic significance
Collection building tools
Acknowledgements
Access and display tools
Building a human network

Testbed combining WHOI and SIO resources

1. Modify information architecture to enable scalable metadata evolution across institutions. Make greater use of controlled vocabularies.
2. Extend access tools across federated collections.
3. Inventory at-risk media and stage selected content for prototype test.
4. Harvest metadata from data and distributed resources.
5. Publish metadata and data in digital library.
6. Adapt video display tools for digital library use.
7. Adapt GIS server for digital library use.

Related project

Original film archives
Alvin nuclear bomb search, 1966
Discovery of 350 °C Black Smoker hot vents
20 years of digital tapes, in critical need of migration

mtfCreator - design a project
  Create a metadata template file (mtf)
  Define digital library structure
  Design for arbitrary digital object (ado)
  Data, image, document
  Metadata blocks
  Collection, Canonical ADO, domain-specific
  Controlled vocabularies
  Dictionaries
  Allow scalable, flexible changes to project

adoHarvest - manage collection building
  Ingest "ado" objects of all types
  Automatic recognition of data categories
  Prioritize selection from alternative resources
  Auto-harvest metadata from data
  Scalable for individual objects or mass migrations
  Graphically monitor status, collection-wide
  Manage collections at distributed institutions

adoQC - quality control
  QC of data and metadata
  Used during harvest, and throughout lifecycle
  Evaluation and maintenance, collection-wide

adoCreator - prepare digital library entry
  Arbitrary digital object (ado) creator
  Finalize metadata record, synchronized with ado
  Implement persistent filename

Collection access tools
Data, images, documents
Jason2 ROV Virtual Control Van
Alvin submersible Framegrabber

DIGARCH is more than storage systems and metadata. This multi-institution testbed is developing a network of computer scientists, researchers, librarians, programmers and students with a wide range of expertise.

SIO: Miller, Clark, Gee, Peckman, Symons, Thach
SDSC: Helly, Sutton, Weatherford
WHOI: Detrick, Chandler, Gaylord, Gegg, Goldsmith, Lemmond, Lerner, Maffei, Tivey, Walden
WHOI/MBL Library: Norton, Raymond, Rioux

WHOI cruise, Alvin, and Jason2 data are fed into GeoBrowser and GIS Server applications.
Working with federated collections.
Template-driven, designed for re-use with diverse projects.
Metadata and data are harvested from SIO cruises, ingested into the digital library and then accessed with SIOExplorer GUIs.

Next Generation IODP Site Survey Data Bank
Support international community
1000 scientists, 40 nations
68 proposals online in Digital Library
Expedition cost $10-15M
Archival lifecycle data access
Preliminary ideas
Review panels
Expedition planning and safety
Publication
Subcontract to IODP-MI from NSF OCE 0432224
Sustainable effort
9-year contract
Technology shared with DIGARCH

Video display tools
Integrated with metadata and other sensor streams
Java search tool
GIS Server
Text-based search
Prototype working across federated collections
Shipboard DataGrabber
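
The collection building tools listed in the transcript (a metadata template, automatic recognition of data categories, controlled vocabularies, and quality control of metadata) can be illustrated with a minimal Python sketch. Nothing here reflects the actual mtfCreator, adoHarvest, or adoQC code, which the poster does not show. The field names, the vocabulary contents, the extension-to-category mapping, and the functions `harvest` and `quality_control` are all hypothetical, chosen only to show how a template and a controlled vocabulary might drive ingest checks.

```python
import os
from dataclasses import dataclass, field

# Hypothetical controlled vocabulary: allowed values per metadata field.
CONTROLLED_VOCAB = {
    "institution": {"SIO", "SDSC", "WHOI"},
    "category": {"data", "image", "document"},
}

# Hypothetical stand-in for a metadata template file (mtf):
# the fields every record must carry.
TEMPLATE_FIELDS = ("institution", "category", "expedition")

# Hypothetical mapping used to recognize a data category from the file name.
EXTENSION_CATEGORY = {".dat": "data", ".jpg": "image", ".pdf": "document"}


@dataclass
class Ado:
    """An arbitrary digital object (ado): a file plus its metadata record."""
    filename: str
    metadata: dict = field(default_factory=dict)


def harvest(filename, institution, expedition):
    """Build an ado, deriving the category from the file extension."""
    ext = os.path.splitext(filename)[1].lower()
    category = EXTENSION_CATEGORY.get(ext)  # None when the type is unrecognized
    return Ado(filename, {
        "institution": institution,
        "category": category,
        "expedition": expedition,
    })


def quality_control(ado):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for name in TEMPLATE_FIELDS:
        value = ado.metadata.get(name)
        if not value:
            problems.append(f"missing {name}")
        elif name in CONTROLLED_VOCAB and value not in CONTROLLED_VOCAB[name]:
            problems.append(f"{name}={value!r} not in controlled vocabulary")
    return problems


if __name__ == "__main__":
    # "EX01" is a made-up expedition identifier.
    print(quality_control(harvest("track.dat", "SIO", "EX01")))
    print(quality_control(harvest("notes.xyz", "MIT", "EX01")))
```

Keeping the template and vocabulary as plain data, separate from the checking logic, is one way to get the "scalable, flexible changes to project" the poster mentions: a new field or term would be a data change, not a code change.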

