

ADAPT: An Approach to Digital Archiving and Preservation Technology
Principal Investigator: Joseph JaJa
Lead Programmers: Mike Smorul and Mike McGann
Graduate Students: Sang Song and Muluwork Geremew
Institute for Advanced Computer Studies
University of Maryland, College Park

Research Objectives
Development of tools and technologies for:
  – Automated Distributed Ingestion: flexible platform for Producer-Archive Interactions
  – Management of Preservation Processes: monitoring, integrity auditing, and preservation services
Evaluation and demonstration of tools on widely different collections.

Recent Major Accomplishments
FOCUS: a scalable and secure registry for persistent information and services, applied to formats.
ACE (Auditing Control Environment): a policy-driven software environment to continually verify the integrity of an archive’s holdings.
PAWN (Producer-Archive Workflow Network): a software platform for data ingestion.
SRB Replication Monitor: third-party replication in a data grid environment.

FOrmat CUration Service
Maintains persistent information on digital formats and the applications used to access and manipulate them.
Accessible either:
  – directly through LDAP
  – or indirectly through SOAP (Web Services)
[Diagram labels: Web Service Agent, Format Registry, LDAP, SOAP]

Integrity Auditing Service
Many types of errors:
  – Media or hardware degradation
  – Technology evolution/upgrades
  – Operational errors
  – Malicious alterations
  – Hardware/software malfunctions
  – ...
Digital objects are subject to transformations and changing standards/protocols.

Basic Ideas
The auditing service is managed and run independently of the archiving system.
Active and user-triggered auditing.
Time-stamped certificates enable verification of an object's integrity throughout its lifetime, providing an auditable record of every transformation.
Highly available and secure service with the ability to detect and correct errors.
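
The idea of time-stamped certificates that record every transformation can be illustrated with a small hash chain. This is only a sketch under stated assumptions: the slides do not describe ACE's actual certificate format, so the `Certificate` fields, the use of SHA-256, and the `issue`/`verify` helpers are all hypothetical.

```python
import hashlib
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Certificate:
    """Hypothetical certificate layout; not ACE's real format."""
    object_id: str
    digest: str       # SHA-256 of the object's bytes when the certificate was issued
    issued_at: float  # Unix timestamp
    prev_hash: str    # hash of the previous certificate ("" for the first one)

    def cert_hash(self) -> str:
        payload = f"{self.object_id}|{self.digest}|{self.issued_at}|{self.prev_hash}"
        return hashlib.sha256(payload.encode()).hexdigest()


def issue(object_id, data, history, now=None):
    """Append a certificate for the current bytes, linked to the previous one."""
    prev = history[-1].cert_hash() if history else ""
    stamp = time.time() if now is None else now
    cert = Certificate(object_id, hashlib.sha256(data).hexdigest(), stamp, prev)
    history.append(cert)
    return cert


def verify(data, history):
    """True if the chain is unbroken and data matches the latest certificate."""
    prev = ""
    for cert in history:
        if cert.prev_hash != prev:
            return False  # a certificate was altered, removed, or reordered
        prev = cert.cert_hash()
    return bool(history) and history[-1].digest == hashlib.sha256(data).hexdigest()
```

Issuing one certificate per transformation (for example, after a format migration) leaves a record in which changing any earlier entry breaks every later link.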

Overall Structure

Software Components
Audit Manager: registers objects to be audited and performs auditing, either actively or as triggered by the user/archive.
Certificate Management System (CMS): an independent, highly available, and highly secure environment for preserving and ensuring the integrity of the certificates.
Object Monitor: verifies the availability of the data in the archive using the object IDs in the CMS.
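
As a rough illustration of how registration, auditing, and availability checks might fit together, the sketch below folds the Audit Manager and Object Monitor roles into one small class. The class name, the `fetch` callback, the status strings, and the use of SHA-256 are assumptions made for this example, not the ACE interfaces.

```python
import hashlib


class AuditManager:
    """Hypothetical, in-memory stand-in for the auditing components."""

    def __init__(self, fetch):
        self._fetch = fetch    # callable: object_id -> bytes, or None if unavailable
        self._registered = {}  # object_id -> digest recorded at registration

    def register(self, object_id):
        data = self._fetch(object_id)
        if data is None:
            raise KeyError(object_id)
        self._registered[object_id] = hashlib.sha256(data).hexdigest()

    def audit(self, object_id):
        """User-triggered audit of a single registered object."""
        data = self._fetch(object_id)
        if data is None:
            return "missing"   # availability check (Object Monitor role)
        matches = hashlib.sha256(data).hexdigest() == self._registered[object_id]
        return "intact" if matches else "corrupt"

    def audit_all(self):
        """Active audit: walk every registered object."""
        return {oid: self.audit(oid) for oid in self._registered}
```

Because the manager only needs a `fetch` callable, it can run separately from whatever system stores the objects, which mirrors the independence the slides call for.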

PAWN
Flexible platform for creating custom package ingest workflows.
Handles complex interactions while providing simple end-user ingestion.
Accountability of transfer and guarantee of data integrity.
Scalable infrastructure.

Distributed Ingestion with PAWN
Multiple producing sites with different requirements.
Separation of administrative responsibility.
Customizable roles for various parties.

Components

Software Components
Management Servers: track administrative functionality and high-level package details for a set of domains.
Scheduler: allocates resources from receiving servers for client packages.
Receiving Server: holding pool for packages in PAWN; handles third-party package operations.
Client: creates packages and submits them to a receiving server.
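
The scheduler's role of allocating receiving-server resources to client packages might look roughly like the following. The slides do not state PAWN's allocation policy, so the "most free space" rule, the `ReceivingServer` fields, and the `allocate` method are illustrative assumptions only.

```python
from dataclasses import dataclass


@dataclass
class ReceivingServer:
    """Hypothetical view of a receiving server's holding pool."""
    name: str
    free_bytes: int


class Scheduler:
    def __init__(self, servers):
        self._servers = list(servers)

    def allocate(self, package_size):
        """Reserve space on the server with the most room.

        Returns the chosen server's name, or None if no server can hold
        the package.
        """
        candidates = [s for s in self._servers if s.free_bytes >= package_size]
        if not candidates:
            return None
        best = max(candidates, key=lambda s: s.free_bytes)
        best.free_bytes -= package_size
        return best.name
```

A client would ask the scheduler for a server before uploading, then submit the package to whichever receiving server was returned.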

Package Workflow Overview
1. Create Producer-Archive Agreement.
2. Client package template.
3. Create package based on template.
4. Once approved, packages can be archived.
5. Rejected packages can be held until rectified or deleted for resubmission.
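
The approval and rejection steps above can be read as a small state machine. The sketch below is one possible reading, not PAWN's implementation: the state names, the explicit `SUBMITTED` state, and the transition table are assumptions introduced for illustration.

```python
from enum import Enum


class State(Enum):
    CREATED = "created"      # built from the client package template
    SUBMITTED = "submitted"  # assumed intermediate state awaiting review
    APPROVED = "approved"
    REJECTED = "rejected"    # held until rectified or deleted
    ARCHIVED = "archived"
    DELETED = "deleted"


# Allowed transitions; anything not listed here is refused.
ALLOWED = {
    State.CREATED: {State.SUBMITTED},
    State.SUBMITTED: {State.APPROVED, State.REJECTED},
    State.APPROVED: {State.ARCHIVED},
    State.REJECTED: {State.SUBMITTED, State.DELETED},
    State.ARCHIVED: set(),
    State.DELETED: set(),
}


class Package:
    def __init__(self, name):
        self.name = name
        self.state = State.CREATED
        self.history = [State.CREATED]  # record of every state, for accountability

    def move_to(self, new_state):
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.value} -> {new_state.value} not allowed")
        self.state = new_state
        self.history.append(new_state)
```

Keeping the full `history` list is one simple way to support the accountability of transfer that PAWN aims for.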

Extensible Platform
Customizable roles for ingestion.
  – Arbitrary grouping of actions within PAWN.
API for creating custom clients.
  – Hierarchical package building.
  – PAWN handles transport and tracking.
Pluggable modules for communicating with various archive resources.

Replication Monitoring
Automatically synchronize collections between master and mirror sites.
Log any actions or anomalies.
Support multiple collections.
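
One way to picture the monitoring loop is as a comparison of per-collection manifests. The sketch assumes each site can produce a manifest mapping file paths to content digests; that manifest format, the action names, and both functions are hypothetical and are not taken from the SRB Replication Monitor.

```python
import logging

log = logging.getLogger("replica_monitor")


def plan_sync(master, mirror):
    """Compare two manifests (path -> digest) and list actions for the mirror."""
    actions = []
    for path in sorted(master):
        if path not in mirror:
            actions.append(("copy", path))      # missing on the mirror
        elif mirror[path] != master[path]:
            actions.append(("replace", path))   # content differs from the master
    for path in sorted(set(mirror) - set(master)):
        actions.append(("anomaly", path))       # on the mirror but not the master
    return actions


def monitor(collections):
    """collections maps a name to a (master_manifest, mirror_manifest) pair."""
    report = {}
    for name, (master, mirror) in collections.items():
        report[name] = plan_sync(master, mirror)
        for action, path in report[name]:
            log.info("%s: %s %s", name, action, path)
    return report
```

Files found only on the mirror are reported as anomalies rather than deleted, on the assumption that an operator should review them first.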

Replica Monitor Demonstrations
Transcontinental Persistent Archive Prototype
  – 5.5 million files between UMD, Archives I, and Archives II
  – 1.2 TB image collection between UMD and SDSC
Chronopolis testbed
  – >5 TB replicated and monitored between SDSC, UMD, and NCAR

Conclusion
Research program focusing on tools and environments for ingestion, management of preservation processes, and, in the near future, access for long-term digital archives.
Software prototyping and testing on a wide variety of collections that are available locally.
Tools to be used by the Chronopolis Consortium, NARA, and NDIIPP partners.