Presentations Introduction Case Studies: – Policies, Services, Interoperability, Mashups: BNF, DCAPE, PoDRI, e-Legacy – RENCI Federated Data Projects:


1 Presentations Introduction Case Studies: – Policies, Services, Interoperability, Mashups: BNF, DCAPE, PoDRI, e-Legacy – RENCI Federated Data Projects: NARA TPAP, RENCI VO, TIP – Interfaces: Islandora, Jargon, CDR

2 A Unified Web Interface for Browsing or Searching. Flickr file system: /flickr/commons/, using the Flickr API, a RESTful web API. Each /flickr/commons/Institution folder translates to the result of one or two calls to the Flickr API, presented to iRODS as if it were a file system. For a collection to integrate, it would need to have some remote API that we could write a driver for and one or more ways to map that collection into a tree. Each mountable service is made into a resource with all relevant info (location, resource type, etc.). iRODS federates major collections. From Ken Arnold, SHAMAN project. YouTube: media accessible through API. User sees single hierarchy. New service: mountable file system (Hulu, Photobucket, etc.).
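The mapping described on this slide, where a folder listing is answered by one or two remote API calls, can be sketched roughly as follows. This is only an illustration: `MountedService`, `fake_api`, and the institution names are hypothetical, and the remote service is replaced by an in-memory dictionary rather than the real Flickr API or an actual iRODS driver.

```python
from typing import Callable, Dict, List


class MountedService:
    """Hypothetical driver that exposes a remote API as a read-only directory tree."""

    def __init__(self, mount_point: str, list_children: Callable[[str], List[str]]):
        self.mount_point = mount_point.rstrip("/")
        # Stands in for the "one or two calls" to the remote API per folder.
        self._list_children = list_children

    def listdir(self, path: str) -> List[str]:
        path = path.rstrip("/")
        inside = path == self.mount_point or path.startswith(self.mount_point + "/")
        if not inside:
            raise FileNotFoundError(path)
        # Path relative to the mount point, e.g. "Institution A".
        relative = path[len(self.mount_point):].strip("/")
        return sorted(self._list_children(relative))


# Assumed stand-in for the remote service: folder name -> entries.
FAKE_COMMONS: Dict[str, List[str]] = {
    "": ["Institution A", "Institution B"],
    "Institution A": ["photo1.jpg", "photo2.jpg"],
    "Institution B": ["photo3.jpg"],
}


def fake_api(relative_path: str) -> List[str]:
    return FAKE_COMMONS.get(relative_path, [])


flickr = MountedService("/flickr/commons", fake_api)
top_level = flickr.listdir("/flickr/commons/")
photos = flickr.listdir("/flickr/commons/Institution A")
```

Mounting another service (the slide names Hulu and Photobucket) would, under the same assumptions, mean supplying a different `list_children` function and mount point.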

3 User With Client Views & Manages Data My Data Disk, Tape, Database, Filesystem, etc. The iRODS Data System can install in a layer over existing or new data, letting you view, manage, and share part or all of diverse data in a unified Collection. iRODS Shows Unified Virtual Collection My Data Disk, Tape, Database, Filesystem, etc. User Sees Single Virtual Collection Partners Data Remote Disk, Tape, Filesystem, etc.

4 User With iRODS Client searches CATALOG to find and get Data Users can search for, access, add/extract metadata, annotate, analyze & process, replicate, copy, share data, manage & track access, subscribe, and more. Accessing Data in the iRODS System Gets data to user. I need data! Finds the data. Data Server Disk, Tape, Database, Filesystem, etc. iRODS Metadata Catalog Keeps track of data iRODS Data System

5 User Interface Web or GUI Client to Access and Manage Data & Metadata* Overview of iRODS Components iRODS Server Data on Disk iRODS Metadata Catalog Database Tracks state of data iRODS Rule Engine Implements Policies *Access data with: Web-based Browser, iRODS GUI, Command Line clients, DSpace, Fedora, Kepler workflow, WebDAV, user level file system, etc.

6 "Layers" in iRODS: From Users to Storage. Community decides how to manage shared Collection(s). Policies express goals for data access, sharing, preservation, etc. Administrator/User applies Rules. Rules implement Policies in computer-actionable form. iRODS Server executes Micro-services. Micro-services operate on remote data.

7 Under the hood - a glimpse. User asks for data (using logical properties). Data request goes to 1st server. Server looks up information in catalog. Catalog tells that the 2nd federated server has the data. 1st server asks 2nd server for data. 2nd server applies Rules and serves data. Diagram labels: iRODS Server with Rule Engine (three instances), Metadata Catalog DB, NC State, Duke, Chapel Hill.
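The request flow on this slide can be sketched as a small model. The classes `Server` and `Catalog` and the function `request` are hypothetical illustrations of the sequence, not iRODS code; the site names come from the slide, and which site holds which data is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Server:
    name: str
    storage: Dict[str, bytes] = field(default_factory=dict)

    def serve(self, logical_path: str) -> bytes:
        # A real server would apply its local rules before serving the data.
        return self.storage[logical_path]


class Catalog:
    """Maps a logical path to the server that physically holds the data."""

    def __init__(self) -> None:
        self._location: Dict[str, Server] = {}

    def register(self, logical_path: str, server: Server, data: bytes) -> None:
        server.storage[logical_path] = data
        self._location[logical_path] = server

    def locate(self, logical_path: str) -> Server:
        return self._location[logical_path]


def request(first_server: Server, catalog: Catalog, logical_path: str) -> Tuple[str, bytes]:
    """The first server consults the catalog, then asks the holder for the data."""
    holder = catalog.locate(logical_path)
    return holder.name, holder.serve(logical_path)


nc_state = Server("NC State")
duke = Server("Duke")
catalog = Catalog()
catalog.register("/zone/home/report.pdf", duke, b"report bytes")

served_by, payload = request(nc_state, catalog, "/zone/home/report.pdf")
```

The user only names the logical path; the physical location is resolved by the catalog lookup.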

8 Policies in iRODS
Policies: Express community goals for data access and sharing, management, long-term preservation, uses, etc.
Policy Examples
– Run a particular workflow when a set of files is ingested into a collection (e.g. make thumbnails of images, post to website).
– Automatically replicate a file added to a collection into 3 geographically distributed sites.
– Automatically extract metadata for a file of a certain type and store in metadata catalog.
– Periodically check integrity of files in a Collection and repair/replace if needed/possible.
– Automatically pick a certain storage location based on user or collection or size or type.
– Let a user access a collection only if using certificate-based login.
– Send a notification when a certain file is ingested.
– etc.
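As a rough illustration of how policies like these become computer-actionable rules, the sketch below registers two of the examples as handlers for an ingest event. It is written in Python rather than the iRODS rule language; `RuleEngine`, the event name `"ingest"`, the site names, and the size threshold are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class RuleEngine:
    """Hypothetical engine: each event name maps to the rules that fire on it."""

    def __init__(self) -> None:
        self._rules: Dict[str, List[Callable]] = defaultdict(list)

    def on(self, event: str) -> Callable:
        def decorator(rule: Callable) -> Callable:
            self._rules[event].append(rule)
            return rule
        return decorator

    def fire(self, event: str, **context) -> list:
        # Run every rule registered for the event, in registration order.
        return [rule(**context) for rule in self._rules[event]]


engine = RuleEngine()
SITES = ["site-a", "site-b", "site-c"]  # assumed names for the three sites


@engine.on("ingest")
def replicate_to_three_sites(path: str, size: int) -> List[str]:
    # Policy: replicate a newly added file into three distributed sites.
    return [f"{site}:{path}" for site in SITES]


@engine.on("ingest")
def choose_storage(path: str, size: int) -> str:
    # Policy: pick a storage location based on size (threshold is illustrative).
    return "tape" if size > 1_000_000 else "disk"


results = engine.fire("ingest", path="/zone/images/a.tif", size=5)
```

Keeping each policy as a separate rule means a community can add or retire one without touching the others.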

9 Policies, Services, Interoperability, Mashups: Richard Marciano, SILS

10 e-Legacy Mashup RSS Feed Reader Data Grid (SRB/iRODS)

11 Appraisal Description Arrangement Preservation e-Legacy Demo Subscribe to RSS Review Received Entry Share and Tag Meet Preservation Criteria Preserve to iRODS Yes

12 National Library of France: Distributed Archiving & Preservation System (SPAR)

13 BNF: French National Library
Three rules:
– Import
  Import an input document into iRODS
  Add import date and checksum as AVU-triplet metadata
  Replicate to other resources
– Get
  Locate a copy of the record
  Return it if the physical checksum equals the stored checksum
  If not, delete the replica and copy a good one over it
– Audit
  Locate all replicas of a data object
  Compute a physical checksum using the system's MD5
  Compare the result with the checksum stored in user metadata
  All stale copies are removed and then replicated from another good copy
  When all copies are audited, a clean copy is staged onto a specific FS directory
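The three rules can be sketched with in-memory replicas and Python's `hashlib`. This is an illustration of the checksum logic only, not the BNF rule set: the `Archive` class, the resource names, and the metadata keys are assumptions, and the final staging of a clean copy onto a file-system directory is left out.

```python
import hashlib
from datetime import date


def md5sum(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


class Archive:
    """Hypothetical in-memory model: resource name -> {path: content}."""

    def __init__(self, resources):
        self.replicas = {name: {} for name in resources}
        self.metadata = {}  # path -> {attribute: value}

    def import_object(self, path, data):
        # Import: record checksum and import date, then replicate everywhere.
        self.metadata[path] = {
            "MD5SUM": md5sum(data),
            "import_date": date.today().isoformat(),
        }
        for store in self.replicas.values():
            store[path] = data

    def _is_good(self, resource, path):
        data = self.replicas[resource].get(path)
        return data is not None and md5sum(data) == self.metadata[path]["MD5SUM"]

    def _good_copy(self, path):
        for resource, store in self.replicas.items():
            if self._is_good(resource, path):
                return store[path]
        raise IOError(f"no valid replica of {path}")

    def get(self, resource, path):
        # Get: return the copy if it verifies, otherwise overwrite it first.
        if not self._is_good(resource, path):
            self.replicas[resource][path] = self._good_copy(path)
        return self.replicas[resource][path]

    def audit(self, path):
        # Audit: check every replica and replace stale ones from a good copy.
        good = self._good_copy(path)
        repaired = []
        for resource in self.replicas:
            if not self._is_good(resource, path):
                self.replicas[resource][path] = good
                repaired.append(resource)
        return repaired


archive = Archive(["disk-a", "disk-b", "tape-c"])
archive.import_object("/bnf/doc-001", b"scanned page")
archive.replicas["disk-b"]["/bnf/doc-001"] = b"bit rot"  # simulate corruption
repaired = archive.audit("/bnf/doc-001")
```

After the audit, every replica again matches the checksum stored in the metadata, so a second audit finds nothing to repair.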

15 BNF: French National Library
Micro-Services
– Add metadata to an iRODS object
– Import an object into iRODS, compute MD5 checksum and validate against the supplied one. Once validated, add MD5SUM and import date as metadata. If invalid, content is removed from iRODS
– Return the value of an iRODS object metadata attribute
– Prepare to retrieve a metadata attribute for a resource
– Prepare to retrieve a metadata attribute for an object
– Get the input resources belonging to a zone name
– Get iCAT results regarding location info for a record
– Execute MD5SUM on the physical content and return value
– Return a pseudo random string of specified length
– Delete a stale replica and replicate over it from another fresh copy
– Stale replica replacement can be eager (synchronous execution) or lazy (delayed execution)
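The difference between eager and lazy replacement of a stale replica can be sketched as follows. `RepairScheduler` and its method names are hypothetical; the repair action itself is passed in as a plain function, and "delayed execution" is modelled as a simple queue that is drained later.

```python
from collections import deque
from typing import Callable, Deque, List


class RepairScheduler:
    """Hypothetical scheduler: repair stale replicas now (eager) or later (lazy)."""

    def __init__(self, repair: Callable[[str], None]) -> None:
        self._repair = repair
        self._pending: Deque[str] = deque()

    def report_stale(self, replica_id: str, eager: bool) -> None:
        if eager:
            # Synchronous execution: the caller waits for the repair.
            self._repair(replica_id)
        else:
            # Delayed execution: remember the replica and return at once.
            self._pending.append(replica_id)

    def run_pending(self) -> int:
        """Drain the queue of delayed repairs; returns how many were run."""
        count = 0
        while self._pending:
            self._repair(self._pending.popleft())
            count += 1
        return count


repaired: List[str] = []
scheduler = RepairScheduler(repaired.append)

scheduler.report_stale("replica-1", eager=True)
scheduler.report_stale("replica-2", eager=False)
after_reports = list(repaired)  # only the eager repair has happened so far
drained = scheduler.run_pending()
```

Eager repair keeps every read consistent at the cost of a slower request; lazy repair returns quickly and defers the copy to a later pass.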

16 DCAPE

17

18

19 PoDRI: Policy-Driven Repository Interoperability

20 RENCI Federated Data Projects Leesa Brieger, RENCI

21 RENCI VO Data Grid. Client asks for data. Data request goes to iRODS server. Server looks up information in iCAT. iCAT tells which iRODS server has data. Data is retrieved from physical location and delivered to client. Diagram labels: iRODS Server, Metadata Catalog (iCAT) DB, RENCI Europa Center, UNC-A, UNC-CH, NCSU, Duke, ECU.

22 National Archives and Records Administration Transcontinental Persistent Archive Prototype (TPAP) UMD UCSD iCAT Georgia Tech iCAT Federation of Seven Independent Data Grids NARA II iCAT NARA I iCAT Extensible Environment: can federate with additional research and education sites. Each data grid uses different vendor products. Rocket Center UNC iCAT

23 Federated Repositories TUCASI Infrastructure Project (TIP)

24 Goals
– Leverage data resources for competitive research and leadership
– Support research and education efforts in a wide range of disciplines and domains
– National leadership in next-generation data management
– Model for long term campus storage
– Architecture and design; hardware, software
– Operations and support
– Data policies
– Selection and retention
– Ingest, curation and preservation
– Collections and repository management

25 A Test: Classroom content on a DICE/RENCI data grid. Panopto, Elluminate

26 Interfaces Jargon, Web, REST, SOAP Mike Conway, DICE Center Jargon, Java, Interface Developer

27 Goals: Make integration simple by creating a clear, familiar service API. Make iRODS a familiar, easy-to-use resource for mid-tier Java developers. Develop a REST/SOAP service model for common use cases using mature tools. Create an out-of-the-box web interface that makes iRODS easy for administrators and archivists.

28 Currently... Jargon is a pure-Java API that talks to iRODS over Java sockets. Jargon is fairly low-level and can be tricky at first. It is used in multiple projects, including the WebDAV interface, as well as integration with the Fedora repository via the irodsfedora library.

29 Jargon (next...) Jargon-core: Jargon re-factored. High-level service API, POJOs, Spring-friendly. Emphasis on testability. Jargon-akubra: Implementation of an Akubra module for iRODS via Jargon. Jargon-lingo: Application of mature open-source tools over Jargon-core to provide RESTful, SOAP, and Web interfaces to iRODS.

30 Conceptual Diagram: iRODS Grid, Jargon-core, Jargon-lingo, Jargon-akubra, Custom code (Java, Groovy, Jython, JRuby, etc.), DuraSpace Frameworks, Web, SOAP/REST, iRODS Service Model

31 TRLN Partners Questionnaire
Respondents: NC State Jim Tuttle, Duke Seth Shaw, Duke Winston Atkins, Duke Russell Koonts, UNC Will Owen
1. Preservation Projects: Geo NDIIPP Images e-Theses Dissertations records TRAC 30 criteria Fedora iRODS checksum 2 copies CDR
2. Status: Planned planned production ½ way testing phase near production
3. Preservation Challenges: permission auditing replication search/browse version control policies tiered storage getting the backlog generating meta. consolidating meta. prez. planning sys. reliability
4. iRODS: no yes
5. iRODS Challenges: NA none rules syntax documentation production configuration stable release
6. Questions: None working w. archivists maintenance releases iRODS book

