WMS, RUcore and Fedora Mini-Conference


1 WMS, RUcore and Fedora Mini-Conference
Wednesday Morning
  Greetings and Introduction – Grace
  Collaboration and Architecture Overview – Ron
  RUcore Data Model – Grace
  WMS Tutorial – Mary Beth, Kalaivani, Sharon
Lunch (box lunch in conference room)
Wednesday Afternoon
  Hands-On Experience – Mary Beth, Kalaivani, Sharon
  Feedback from WMS sessions
  Collaboration Discussion – All

2 WMS, RUcore and Fedora Mini-Conference
Thursday Morning
  Brief Recap – Ron
  WMS architecture – Yang
  User Interface, Search engine and collections – Chad
  Management services – Ron
Lunch (on your own)
Thursday Afternoon
  Further collaboration discussion
  Wrap-up and next steps

3 Possible Areas for Collaboration
Data Registries
  File formats
  Content Models
Software Development
  Requirements
  Sharing software
  Joint development
  Life cycle support
Sharing Content
  Exchange, harvesting
  Federated Searching
Fedora Experimentation
  Relationship services
  Directory ingest
  Use of XACML
  Very large files
  Event management

4 Fedora Enterprise Architecture Major Goals – 2007 thru 2009
Paradigm Focus Scholarly Communication Collaboration Libraries and Museums Access and Publishing Infinite Scalability Size of and number of objects Capacity and throughput (e.g. ingest 20TB a day) Life cycle preservation Trust Model Transactions - Begin/Commit Transactions across repositories Enable graph based objects (compound objects)

5 Persistence and Layered Architecture
Applications Repository Middleware App. Prog. Interface Data

6 Layered Architecture - RUcore
Applications and Portals (NJDH, RUcore, workflow, etc) Middleware Services (searching, alerting, integrity, etc) API Fedora Core & Framework FOXML & Datastreams

7 RUcore - How it Works
[Diagram labels: User Input (Metadata and Archival masters, Dissertations, Faculty Submissions); Workflow Management System; XML Digital Object Ingest; Digital Object Repository (Fedora); Fedora Repository Service; User, Collection, & Preservation Services; RUCORE Portal; NJ Digital Highway; Custom Portals]

8 Simple and Compound Objects
Article Object (Simple): Persistent ID, Metadata, Behaviors (Disseminators), Data streams
  SMAP1 – StrMap (TOC)
  DJVU1 – presentation
  PDF1 – presentation
  XML1 – OCR text
  ARCH1 – Archival master (tiffs of each page)
Compound Object - Graph Model: the article and objects A1 and A2, linked by IsAnnotationOf relationships

9 Collections In RUcore
A digital collection is simply a grouping of objects according to some criteria.
Types of digital collections in RUcore:
  Explicit – A digital collection whose object membership is specified explicitly within the descriptive metadata.
  Dynamic – A digital collection of objects which are grouped according to user-specified criteria.
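The two membership styles can be sketched as follows. This is a minimal illustration, not RUcore code: the `DigitalObject` class, the `member_of` field, and the `demo:`/`coll:` identifiers are all hypothetical, and a Python predicate stands in for whatever search criteria a dynamic collection really stores.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class DigitalObject:
    pid: str                                            # persistent identifier (hypothetical)
    metadata: Dict[str, str]                            # simplified descriptive metadata
    member_of: Set[str] = field(default_factory=set)    # explicit collection IDs


def explicit_members(collection_id: str, objects: List[DigitalObject]) -> List[str]:
    """Objects that name the collection in their own descriptive metadata."""
    return [o.pid for o in objects if collection_id in o.member_of]


def dynamic_members(criteria: Callable[[DigitalObject], bool],
                    objects: List[DigitalObject]) -> List[str]:
    """Objects that satisfy a user-specified criterion at query time."""
    return [o.pid for o in objects if criteria(o)]


objects = [
    DigitalObject("demo:1", {"type": "ETD", "dept": "Physics"}, {"coll:physics"}),
    DigitalObject("demo:2", {"type": "ETD", "dept": "History"}),
    DigitalObject("demo:3", {"type": "preprint", "dept": "Physics"}, {"coll:physics"}),
]

physics = explicit_members("coll:physics", objects)
etds = dynamic_members(lambda o: o.metadata.get("type") == "ETD", objects)
```

Explicit membership travels with the object, so it survives re-indexing unchanged; dynamic membership is recomputed from the criteria each time.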

10 Using Explicit and Dynamic Collections
Personal Collections
Department Collections
  Including Faculty Personal collections (e.g. preprints, reports, etc.)
  ETDs for the Department
Centers and Grant Funded Research
  New Jersey Digital Highway
  Center for Remote Sensing and Spatial Analysis (CRSSA) – Access and preservation of GIS resources related to New Jersey

11 RUcore Collection Architecture
[Diagram legend: circles – collection objects; rectangles – content objects; solid line – explicit membership; dashed line – dynamic membership]
[Diagram labels: RUCORE; NJDH (Grant Project); New Jersey Historical Society; Rutgers University Libraries; Rutgers University; Centers/Departments; Eagleton Archive; General Collections; Special Collections; Roosevelt; objects O1, O2, M1, B1, N1, N2, N3, P1, P2]

12 Collection Architecture - Lefty
[Diagram legend: solid line – explicit membership; dashed line – dynamic membership]
[Diagram labels: RUCORE; Penn State (1782.1); Princeton (1782.1); Rutgers University (1782.2); N’Western (1782.1); RUL (1782.1); Center/Dept Collections; RU ETDs (Graduate School); Department ETDs; Dept. ETDs; FacColl One; FacColl Two; objects D1, D2, D3]

13 Management Services (incl. Collection and Preservation)
Super-user editing (handles, datastreams, metadata)
Purging an object
Export (FOXML, METS)
Collections
  Collection administration
  Statistics
Preservation
  Creation of archival master
  Creation of persistent ID (handle)
  Checksum verification

14 Management Services
Access to individual objects is provided by a special search portal using the same indexes as the public search but providing Fedora API management functionality:
  Viewing, exporting and/or purging objects
  Editing metadata, adding/changing datastreams
  Validating objects, checking audit trails, testing signatures
There is a special Fedora database search allowing access to all objects whether or not they are members of an active collection.

15 Collection Administration
Edit collection information
Add parents to a collection
Add dynamic search terms to a collection
Generate an XML structure map

16 Collections - Indexing and Ingest
Active Collections may be indexed individually or all together at any time, though this is typically done using a nightly cron job. Ingest is done through the management API and is typically called by the WMS program, but may be called directly from the management interface as well.

17 Preservation - Alerting
All Fedora API management functions trigger alerting messages, are stored in the Fedora audit trails, and are registered in the collection statistics database. Statistics are kept for all object downloads as well as editing activities and may be accessed at collection or repository levels.

18 Preservation – PIDs and Handles
Handles are normally created as part of the ingest process, but may be manually created, changed, or purged on a per-object basis using the management interface.
Three global registries for RU:
  Rutgers University Libraries
  Rutgers University
  NJ Digital Highway

19 Object Integrity – Verifying Checksums
Archival datastreams have SHA1 checksums, created during the WMS pipeline process, as well as file size data, both stored in the technical metadata section of each object. SHA1 checksums are tested using the sha1sum checking algorithm in conjunction with a management function that polls the repository and extracts the sha1sum character strings from the techMD of individual objects or groups of objects. The function has a calendar feature that allows it to be run as a cron job on a subset of objects for each day of the week, with result reports sent to the appropriate data managers.
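A minimal sketch of this kind of fixity check, using Python's standard `hashlib`. The record layout (`pid`, `path`, `size`, `sha1`) and the rule that assigns an object to a weekday by its position in the list are assumptions made for illustration; the slides do not say how RUcore extracts techMD values or partitions objects across the week.

```python
import hashlib
import os
import tempfile
from typing import Dict, List, Tuple


def sha1_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archival masters need not fit in memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(records: List[Dict], weekday: int) -> List[Tuple[str, str]]:
    """Check only the records scheduled for `weekday` (0-6); return failures."""
    failures = []
    for index, rec in enumerate(records):
        if index % 7 != weekday:          # hypothetical scheduling rule
            continue
        if os.path.getsize(rec["path"]) != rec["size"]:
            failures.append((rec["pid"], "size mismatch"))
        elif sha1_of_file(rec["path"]) != rec["sha1"]:
            failures.append((rec["pid"], "checksum mismatch"))
    return failures


with tempfile.TemporaryDirectory() as tmp:
    data = b"archival master"
    path = os.path.join(tmp, "ARCH1.tar")
    with open(path, "wb") as handle:
        handle.write(data)
    good = {"pid": "demo:1", "path": path, "size": len(data),
            "sha1": hashlib.sha1(data).hexdigest()}
    bad = dict(good, pid="demo:2", sha1="0" * 40)   # deliberately wrong checksum
    day0_failures = verify([good, bad], weekday=0)  # checks demo:1 only
    day1_failures = verify([good, bad], weekday=1)  # checks demo:2 only
```

Comparing the file size first is a cheap way to catch truncation before paying for a full hash of a large file.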

20 Certification as a Trusted Repository*
Ultimately, we want to become certified as a trusted repository. There are four major areas:
A. Organization
  Repository staff have skills appropriate to their duties.
B. Repository Functions
  Repository actively monitors Archival Information Package integrity.
C. Designated Community
  Repository defines its Designated Community.
D. Technologies
  Repository has technologies to monitor security.
* RLG/NARA draft “An Audit Checklist for the Certification of Trusted Digital Repositories”

21 Preservation Services Architecture
[Diagram labels: Preservation Portal; Preservation Services (Alerting, Migration, Monitoring, Statistics, . . .); Fedora Service Framework (Event Messaging, Preservation Monitoring, Preservation Integrity); Fedora Repository Service; Digital Object Repository; Content Models; Format Registry]
Event notification. Provides basic messaging service based on JMS. The event message payload is based on the PREMIS event semantic units with extensions.
Preservation monitoring. Includes the ability to create checksums, validate checksums and have Fedora send a message via “event notification” if there is a checksum failure.
Preservation integrity. Includes the capability to define content models and validate an object’s content model on ingest. Other services include audit trails and versioning.
Preservation Services. These are application services that rely on the Fedora framework services. For example, alerting messages are sent based on certain events such as an ingest, an object purge, an object edit, etc. Migration would involve services for detecting object obsolescence and invoking tools to perform mass migrations. Statistics of various kinds can be built based on selected event messages (e.g. reporting number of ingests per collection).
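The statistics example mentioned above (number of ingests per collection) could be derived from event messages roughly as follows. This is only a sketch: events are modelled as plain dictionaries, and the `collection` key is a hypothetical field, since the slides do not specify how an event identifies its collection.

```python
from collections import Counter
from typing import Dict, Iterable

# Event type URI as used in the deck's example message.
INGEST = "info:premis/preservation/event/ingest"


def ingests_per_collection(events: Iterable[Dict[str, str]]) -> Counter:
    """Count ingest events per collection, ignoring every other event type."""
    return Counter(e["collection"] for e in events if e["eventType"] == INGEST)


events = [
    {"eventType": INGEST, "collection": "ETDs"},
    {"eventType": INGEST, "collection": "ETDs"},
    {"eventType": "info:premis/preservation/event/delete", "collection": "ETDs"},
    {"eventType": INGEST, "collection": "NJDH"},
]

report = ingests_per_collection(events)
```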

22 Content Models (Content Model Dissemination Architecture – CMDA)
The CM object specifies constraints on the digital object (DO):
  MIME type and format
  Min/max number of datastreams
  Whether multiple datastreams are ordered
The CM is used to determine runtime behavior
On ingest, Fedora validates the DO based on CM constraints
Disseminators are not bound into the DO
Run-time binding occurs through the CM object and the RELS-EXT datastream
The CM can point to a format registry
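Validation on ingest can be pictured with a small sketch like the one below. It is not Fedora's validator: the `TypeModel` class and `validate` function are hypothetical, and only the MIME type and min/max constraints are modelled (ordering is left out).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TypeModel:
    ds_id: str          # datastream ID, e.g. "PDF1"
    mime: str           # required MIME type
    minimum: int = 1    # minimum number of datastreams with this ID
    maximum: int = 1    # maximum number of datastreams with this ID


def validate(datastreams: Dict[str, List[str]], model: List[TypeModel]) -> List[str]:
    """`datastreams` maps a datastream ID to the MIME types of its instances."""
    errors = []
    for rule in model:
        found = datastreams.get(rule.ds_id, [])
        if not rule.minimum <= len(found) <= rule.maximum:
            errors.append(
                f"{rule.ds_id}: expected {rule.minimum}..{rule.maximum} "
                f"datastreams, found {len(found)}"
            )
        for mime in found:
            if mime != rule.mime:
                errors.append(f"{rule.ds_id}: unexpected MIME type {mime}")
    return errors


book_model = [
    TypeModel("PDF1", "application/pdf"),
    TypeModel("ARCH1", "application/tar"),
]

ok = validate({"PDF1": ["application/pdf"], "ARCH1": ["application/tar"]}, book_model)
bad = validate({"PDF1": ["image/tiff"]}, book_model)
```

Returning a list of error strings rather than raising on the first problem lets an ingest report every constraint violation at once.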

23 Content Models, Formats, and Disseminators
Book Object: Persistent ID, Metadata, Rels-Ext (cmodel: book), Data streams
  PDF1 – presentation
  XML1 – OCR text
  ARCH1 – Archival master (tiffs of each page)
  DJVU1 – presentation
  SMAP1 – StrMap (TOC)
[Diagram labels: Composite Model; Content Model; WSDL; Bmech Object; Bdef Object; MethodMap; relationships hasCM, hasBmech, hasBdef]
Composite model:
<dsCompositeModel>
  <dsTypeModel ID="PDF1" ordered="false" min="1" max="1">
    <form MIME="application/pdf"></form>
  </dsTypeModel>
  <dsTypeModel ID="ARCH1" ...>
    <form MIME="application/tar"></form>
  ...
</dsCompositeModel>
Format Registry: pdf, tiff, tar

24 Events and Outcomes
An event is an:
  . . . action that involves at least one object, agent, and/or rights entity (PREMIS).
  . . . occurrence that is significant to the performance of a task.
Event outcome – a situation or state that follows an event and is a result of the event.

25 Fedora Event Management
Generic Framework
  Events can have messages which are associated with all types of services (preservation, collection, user, etc.)
  Messages represent events with actions and outcomes
  Fedora will provide a middleware messaging solution based on the open-source Java Message Service (JMS)
Fedora Working Group Focus
  Preservation events are atomic (i.e. associated with a Fedora API)
  The event message will be based on the PREMIS event entity
  Initial types: ingest, delete, modify, fixityCheck

26 The Event Message
Event message structure
  The message payload will be XML-based and use the PREMIS event entity semantic units
  Global identifiers (URIs) will be used for event type and outcome
An example might look like the following:
<event>
  <eventIdentifier>
    <eventIdentifierType>Rucore event</eventIdentifierType>
    <eventIdentifierValue>30169</eventIdentifierValue>
  </eventIdentifier>
  <eventType>info:premis/preservation/event/ingest</eventType>
  <eventDateTime> T19:20:30</eventDateTime>
  <eventDetail>(to be used for general information)</eventDetail>
  <eventOutcomeInformation>
    <eventOutcome>info:premis/preservation/outcome/success</eventOutcome>
    <eventOutcomeDetail>(more text)</eventOutcomeDetail>
  </eventOutcomeInformation>
  <linkingAgentIdentifier>rutgers-lib:200</linkingAgentIdentifier>
  <linkingAgentIdentifier>rutgers-lib:400</linkingAgentIdentifier>
  <linkingObjectIdentifier>rutgers-lib:4291</linkingObjectIdentifier>
</event>
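A message of this shape could be produced and read back with Python's standard `xml.etree.ElementTree`, as in the sketch below. The `build_event` helper is hypothetical, the element names follow the example on the slide, and the timestamp is a made-up placeholder because the date in the slide's example did not survive.

```python
import xml.etree.ElementTree as ET
from typing import List


def build_event(event_id: int, event_type: str, outcome: str, date_time: str,
                agents: List[str], objects: List[str]) -> str:
    """Serialize one event using the element names from the slide's example."""
    event = ET.Element("event")
    identifier = ET.SubElement(event, "eventIdentifier")
    ET.SubElement(identifier, "eventIdentifierType").text = "Rucore event"
    ET.SubElement(identifier, "eventIdentifierValue").text = str(event_id)
    ET.SubElement(event, "eventType").text = event_type
    ET.SubElement(event, "eventDateTime").text = date_time
    info = ET.SubElement(event, "eventOutcomeInformation")
    ET.SubElement(info, "eventOutcome").text = outcome
    for agent in agents:
        ET.SubElement(event, "linkingAgentIdentifier").text = agent
    for obj in objects:
        ET.SubElement(event, "linkingObjectIdentifier").text = obj
    return ET.tostring(event, encoding="unicode")


payload = build_event(
    30169,
    "info:premis/preservation/event/ingest",
    "info:premis/preservation/outcome/success",
    "2007-01-01T19:20:30",                      # placeholder timestamp (assumption)
    agents=["rutgers-lib:200", "rutgers-lib:400"],
    objects=["rutgers-lib:4291"],
)

# A subscriber would parse the payload and dispatch on the event type URI.
parsed = ET.fromstring(payload)
event_type = parsed.findtext("eventType")
agent_ids = [a.text for a in parsed.findall("linkingAgentIdentifier")]
```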

27 Event Management - Ingest (Using the publisher/subscriber model)
[Diagram labels: User Input; Workflow Management System (snd/rcv JMS); XML Digital Object Ingest; Digital Object Repository (Fedora) (snd/rcv JMS); JMS Topic Queue carrying <eventType> messages such as ingest and delete; Preservation Service (alerting) (snd/rcv JMS); Preservation Service (reporting) (snd/rcv JMS)]
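The publisher/subscriber pattern on this slide can be illustrated with a small in-memory stand-in. This is deliberately not JMS: the `Topic` class is hypothetical, delivery is synchronous and in-process, and events are plain dictionaries rather than XML payloads.

```python
from collections import Counter
from typing import Callable, Dict, List

Event = Dict[str, str]
Handler = Callable[[Event], None]


class Topic:
    """In-memory stand-in for a message topic (not a JMS implementation)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = {}

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._subscribers.setdefault(event_type, []).append(handler)

    def publish(self, event: Event) -> int:
        """Deliver the event to every subscriber of its type; return the count."""
        handlers = self._subscribers.get(event["eventType"], [])
        for handler in handlers:
            handler(event)
        return len(handlers)


topic = Topic()
alerts: List[Event] = []          # stands in for the alerting service
report: Counter = Counter()       # stands in for the reporting service

topic.subscribe("ingest", alerts.append)
topic.subscribe("ingest", lambda e: report.update([e["eventType"]]))
topic.subscribe("delete", lambda e: report.update([e["eventType"]]))

# The repository publishes; it does not need to know who is listening.
delivered = topic.publish({"eventType": "ingest", "object": "rutgers-lib:4291"})
topic.publish({"eventType": "delete", "object": "rutgers-lib:4291"})
```

The point of the pattern is the decoupling: new services subscribe to event types without any change to the publisher.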

