The Fedora Project March 19, 2003 ISTEC Symposium, Brazil


1 The Fedora Project
March 19, 2003, ISTEC Symposium, Brazil
Sandy Payette, Cornell Information Science

2 Motivation
The Problem of Complex Content

3 Digital Library Content: not just documents ...
Some familiar objects
Complex, compound, dynamic objects

4 Research Questions
How can clients interact with heterogeneous collections of complex objects in a simple and interoperable manner?
How can complex objects be designed to be both generic and genre-specific at the same time?
How can we associate services and tools with objects to provide different presentations or transformations of the object content?
How can we associate fine-grained access control policies with specific objects, or with groups of objects?
How can we facilitate the long-term management and preservation of complex objects that have dependencies on distributed content and services?

5 The Flexible Extensible Digital Object Repository Architecture (FEDORA)
DARPA and NSF-funded research at Cornell (1997-present)
  CORBA-based reference implementation (Payette/Lagoze)
  Extensive interoperability testing (with Arms/Blanchi/Overly)
  Policy Enforcement (Payette/Schneider)
Interpreted and re-implemented at U of Virginia (1999-)
  Simple web-oriented implementation, focused on access to collections
  Java servlet and relational db
  Testbed of 10,000,000 objects with performance metrics ( )
Mellon-Funded FEDORA Software (2002-)
  University of Virginia and Cornell - joint development
  Open source
  Web services and XML
  Mediation of distributed services
  Preservation focus

6 Fedora: Key Features
Open System – public APIs, exposed as web services
Flexible Digital Object Model
  XML submission and storage (METS Schema)
  Local and distributed content
  Data (any type) and metadata (any schema – DC, other)
  Supports inter-relationships among objects
  Behavior “contracts” for objects
  Associate services with objects
  Objects can provide launch-pad or tool to use object content
Repository System
  Management Service - manage digital resources, metadata, as well as computer programs, services and tools that support them
  Access Service – repository search and object disseminations
  Mediation - interacts with other distributed web services for content transformation and presentation
  OAI Provider
  Access Control
  Preservation service (future release)

7 Requirements: Heterogeneous Digital Collections
Books Rare Books Multimedia Music E-texts Maps Photographs Statistics Video Art Manuscripts Data Images 3-D Objects Journals Sound Effects

8 Shortcomings of commercial digital library products
Narrow focus on specific media formats (e.g. image databases, document management)
Fail to effectively address interrelationships among digital entities
Fail to address interoperability; no open interfaces to facilitate sharing of services; no standard protocols for cross-system interoperability
Fail to provide facilities for managing programs and tools that are integral to delivering digital content
Not extensible; do not enable easy integration of new tools and services
Do not address fine-grained access control and preservation issues

9 The Fedora Architecture
Digital Object Model
The Repository
Web Services

10 FEDORA Basic Object Architecture
Digital Object Model
Container to aggregate digital content of any type
  Data or metadata
  Local or distributed
Behavior “contracts”
  Definitions of abstract operations
  Fulfillment via bindings to external services
  Enables multiple “disseminations” of content

11 Digital Object Model Functional View
Dynamic data Application services

12 Digital Object Model Architectural View
Persistent ID (PID): globally unique persistent id
Disseminators: public view; access methods for obtaining “disseminations” of digital object content
System Metadata: internal view; metadata necessary to manage the object
Datastreams: protected view; content that makes up the “basis” of the object

13 Digital Object Model Example Disseminators
Persistent ID (PID)
Disseminators
  Default: Get Profile, List Items, Get Item, List Methods, Get DC Record
  Simple Image: Get Thumbnail, Get Medium, Get High, Get VeryHigh
System Metadata
Datastreams
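
The four-part object model on slides 12 and 13 can be pictured as a small data structure. The following is only an illustrative sketch in Python, not Fedora's actual implementation or storage format: the class names, field names, the PID "demo:1", and the example URL are all assumptions introduced here. Only the four parts (PID, disseminators, system metadata, datastreams) and the Simple Image method names come from the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Datastream:
    """Protected view: one piece of content forming the basis of the object."""
    ds_id: str
    mime_type: str
    location: str  # content may be local or distributed (hypothetical field)


@dataclass
class DigitalObject:
    """Hypothetical container mirroring the four parts named on the slide."""
    pid: str                                                          # persistent id
    system_metadata: Dict[str, str] = field(default_factory=dict)     # internal view
    datastreams: Dict[str, Datastream] = field(default_factory=dict)  # protected view
    # Public view: disseminator name -> names of the methods it exposes.
    disseminators: Dict[str, List[str]] = field(default_factory=dict)

    def add_datastream(self, ds: Datastream) -> None:
        self.datastreams[ds.ds_id] = ds

    def list_methods(self) -> List[str]:
        """Return every public method as 'Disseminator/Method', sorted."""
        return sorted(
            f"{name}/{method}"
            for name, methods in self.disseminators.items()
            for method in methods
        )


# Illustrative values only; "demo:1" and the URL are made up for this sketch.
obj = DigitalObject(pid="demo:1", system_metadata={"state": "active"})
obj.add_datastream(Datastream("IMG", "image/jpeg", "http://example.org/image.jpg"))
obj.disseminators["SimpleImage"] = [
    "GetThumbnail", "GetMedium", "GetHigh", "GetVeryHigh",
]
print(obj.list_methods())
```

Keeping disseminators separate from datastreams reflects the slide's distinction between the public view (what clients may call) and the protected view (the content those calls draw on).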

14 Object Behavior Contracts
Behavior Definition Object: Persistent ID (PID), System Metadata, Datastreams (Behavior Definition Metadata)
Data Object: Persistent ID (PID), Disseminators, System Metadata, Datastreams
Behavior Mechanism Object: Persistent ID (PID), System Metadata, Datastreams (Service Binding Metadata (WSDL))
Web Service
Diagram labels: behavior subscription, behavior contract, data contract
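
One way to read the contract diagram: a behavior definition names abstract operations, a behavior mechanism promises to implement that definition (the behavior contract) and declares which datastreams it needs (the data contract), and a dissemination succeeds only when both hold. The Python sketch below illustrates that reading under stated assumptions. All class, field, and function names are hypothetical, and a plain callable stands in for the external web service that a real mechanism would reach through its WSDL binding metadata.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class BehaviorDefinition:
    pid: str
    methods: List[str]  # abstract operation names only, no implementation


@dataclass
class BehaviorMechanism:
    pid: str
    implements: str                  # PID of the definition it fulfils
    required_datastreams: List[str]  # data contract
    # Method name -> stand-in for a call to an external web service.
    bindings: Dict[str, Callable[[Dict[str, str]], str]]


def disseminate(datastreams: Dict[str, str], bdef: BehaviorDefinition,
                bmech: BehaviorMechanism, method: str) -> str:
    """Check both contracts, then delegate to the bound service."""
    if bmech.implements != bdef.pid:
        raise ValueError("mechanism does not implement this behavior definition")
    if method not in bdef.methods:
        raise KeyError(method)
    missing = [d for d in bmech.required_datastreams if d not in datastreams]
    if missing:
        raise ValueError(f"data contract not met, missing: {missing}")
    inputs = {d: datastreams[d] for d in bmech.required_datastreams}
    return bmech.bindings[method](inputs)


# Illustrative objects; the PIDs and the URL are made up for this sketch.
image_bdef = BehaviorDefinition("demo:bdef-image", ["getThumbnail"])
image_bmech = BehaviorMechanism(
    pid="demo:bmech-image",
    implements="demo:bdef-image",
    required_datastreams=["IMG"],
    bindings={"getThumbnail": lambda ds: "thumbnail-of:" + ds["IMG"]},
)
result = disseminate(
    {"IMG": "http://example.org/a.jpg"}, image_bdef, image_bmech, "getThumbnail"
)
print(result)
```

Because the data object refers only to the definition and mechanism, the service behind a method could be swapped without changing the object's public methods, which appears to be the point of separating the two object types.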

15 FEDORA Basic Repository Architecture
Repository System
Object Management
Lifecycle (Ingest/create → Store → Delete → Approve → Purge)
Validation
PID Generation
Version management
Access Control
Preservation support
Object Access
Object Dissemination
Object Reflection
Service Mediation

16 Fedora Implementation
Understanding the system implementation
Web Services
Server Design

17 What is a Web Service?
A distributed application that runs over the internet.
A web application that publishes an open interface through which clients can send requests and receive responses.
Standards
  Transport protocol: HTTP, others
  Messaging protocol: SOAP, HTTP GET/POST
  Message encoding: XML
  Service description: WSDL
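
To make the "messaging protocol" and "message encoding" items concrete, here is a minimal sketch that builds and parses an XML request wrapped in a SOAP 1.1 envelope using Python's standard library. The envelope namespace is the standard SOAP 1.1 one. The application namespace `urn:example:repository`, the operation name `getObject`, and the `pid` parameter are assumptions invented for illustration and are not taken from the Fedora APIs.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Tuple

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
APP_NS = "urn:example:repository"                       # hypothetical namespace


def build_request(operation: str, params: Dict[str, str]) -> bytes:
    """Encode an operation call as an XML message inside a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(op, f"{{{APP_NS}}}{name}").text = value
    return ET.tostring(envelope, encoding="utf-8")


def parse_request(message: bytes) -> Tuple[str, Dict[str, str]]:
    """Recover the operation name and parameters on the receiving side."""
    envelope = ET.fromstring(message)
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    op = body[0]  # first child of Body is the operation element
    operation = op.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
    params = {child.tag.split("}", 1)[1]: child.text for child in op}
    return operation, params


message = build_request("getObject", {"pid": "demo:1"})
print(parse_request(message))
```

In a deployed service the message would travel over HTTP and the available operations would be described in a WSDL document; both are left out here to keep the sketch self-contained.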

18 Fedora and Web Services
Fedora Repository system is a web service
  Access/Search (API-A) and Management (API-M)
  Service descriptions published using WSDL
  Both SOAP and HTTP bindings
Back-end services
  Digital object behaviors implemented as linkages to other distributed web services
  Service binding metadata (WSDL) stored in special Fedora Behavior Mechanism objects
  Fedora acts as mediator to these services

19 Fedora Repository System Client and Web Service Interactions
Diagram: on the frontend, users reach the Fedora Repository System web service through client applications and web browsers; on the backend, the repository's service dispatch calls content transform web services.

20 Fedora Server Design
3-Tiered Architecture
Modular & Extensible
System Diagram

21 Server Design: 3 Layers
Interface: Service Exposure. API-A, API-M, pure HTTP and SOAP via HTTP.
Application Logic: Implements requests in terms of the Fedora object model.
Storage: Database, File system, Object serializations and cache(s).
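
The three layers can be sketched as three small classes, each depending only on the one below it. This is a simplified illustration in Python and not the Fedora server design itself: the class names, the in-memory dictionary standing in for the database and file system, and the `/objects/<pid>/<datastream>` path layout are all assumptions made for the example.

```python
from typing import Dict, Optional, Tuple


class Storage:
    """Storage layer: an in-memory dict stands in for database and files."""

    def __init__(self) -> None:
        self._objects: Dict[str, Dict[str, str]] = {}

    def save(self, pid: str, datastreams: Dict[str, str]) -> None:
        self._objects[pid] = dict(datastreams)

    def load(self, pid: str) -> Optional[Dict[str, str]]:
        return self._objects.get(pid)


class RepositoryLogic:
    """Application logic layer: expresses requests in object-model terms."""

    def __init__(self, storage: Storage) -> None:
        self._storage = storage

    def ingest(self, pid: str, datastreams: Dict[str, str]) -> None:
        self._storage.save(pid, datastreams)

    def get_datastream(self, pid: str, ds_id: str) -> Optional[str]:
        obj = self._storage.load(pid)
        return None if obj is None else obj.get(ds_id)


class HttpInterface:
    """Interface layer: maps a request path onto application logic calls."""

    def __init__(self, logic: RepositoryLogic) -> None:
        self._logic = logic

    def handle_get(self, path: str) -> Tuple[int, str]:
        # Hypothetical path layout: /objects/<pid>/<datastream id>
        parts = path.strip("/").split("/")
        if len(parts) != 3 or parts[0] != "objects":
            return 400, "bad request"
        content = self._logic.get_datastream(parts[1], parts[2])
        if content is None:
            return 404, "not found"
        return 200, content


logic = RepositoryLogic(Storage())
logic.ingest("demo:1", {"DC": "<title>Example</title>"})
interface = HttpInterface(logic)
print(interface.handle_get("/objects/demo:1/DC"))
```

Because the interface layer only calls `RepositoryLogic`, a second interface (for example a SOAP binding) could sit beside `HttpInterface` without touching storage, which is the modularity the slide points to.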

22 Fedora System Diagram

23 Open Source Fedora: Implementation Technologies
Fedora Web Services Layer
  Apache Axis for SOAP over HTTP
  Apache Tomcat 4.1
Core Repository System
  Sun Java J2SDK1.4
  Xerces for XML parsing and validation
  Saxon 6.5 for XSLT transformation
  Schematron 1.5 for validation
  MySQL and Mckoi relational database
Deployment Platforms
  Windows 2000, NT, XP
  Solaris
  Linux

24 DEMO: Use Cases
Connect to Repository

25 Release Plan
Phase 1 – Fedora 1.0 (May 1, 2003 public)
Advanced Access Control
Preservation Service
R2R Repository Federation
Reliability
Fault tolerance
Mirroring and replication
Performance tuning
Caching
Load balancing
Storage scalability

26 Deployment Partners
Los Alamos National Laboratory: Research Library
Library of Congress: Motion Picture and Recorded Sound Division
Indiana University: Digital Library group
King's College London: Humanities Computing
NYU: Humanities Computing
Northwestern University: Academic Computing
Oxford: Oxford Digital Library and The Refugee Studies Center
Tufts: Digital Collections and Archives Department

27 More Information

