The Mellon-Funded Fedora Project A Presentation to the European Digital Library Conference September 17, 2002 Sandy Payette and Thornton Staples.


1 The Mellon-Funded Fedora Project A Presentation to the European Digital Library Conference September 17, 2002 Sandy Payette and Thornton Staples

2 The Flexible Extensible Digital Object Repository Architecture (FEDORA) Developed as a DARPA and NSF-funded research project at Cornell (1997-present) Interpreted and re-implemented at University of Virginia (1999) Virginia prototype supported a testbed of 10,000,000 digital objects with very good results (1999-2001) Andrew W. Mellon Foundation granted Virginia and Cornell $1,000,000 to develop a full-featured, web-based production FEDORA system (2002+)

3 Managing the Collection Provide a way to universally name all resources without respect to machine address Track all files for resources, metadata and computer programs consistently Enforce appropriate policies for use of Library resources Manage resources in all media and content types Support preservation activities appropriately

4 Delivering the Collection Deliver tools with content for all media and content types Allow every resource to be used in any number of contexts Interoperate with other digital libraries Move towards a library that aware users can configure for themselves

5 Supporting Digital Scholarship Supporting the creation of digital scholarly projects Collecting born-digital scholarly projects – For preservation – Taking over responsibility for primary delivery Supporting information communities

6 Shortcomings of commercial digital library products Narrow focus on specific media formats (e.g. image databases, document management) Fail to effectively address interrelationships among digital entities Fail to address interoperability; no open interfaces to facilitate sharing of services; no standard protocols for cross-system interoperability Fail to provide facilities for managing programs and tools that are integral to delivering digital content. Not extensible; do not enable easy integration of new tools and services

7 The Current Project An efficient, scalable, freely distributable FEDORA repository system ASAP A complete basic management interface with the initial release Add important digital library functionality in later releases Multiple testbed repositories to deploy and evaluate the software Make all software open source

8 Deployment Group The Digital Library group, Indiana University The Humanities Computing group, New York University The Digital Collections and Archives Department, Tufts University The Humanities Computing group, King's College London The Refugee Studies Centre, Oxford University Audio/Video Project, Library of Congress Library group, Los Alamos National Laboratory

9 The Fedora Architecture Overview of Basic Model

10 FEDORA Basic Architectural Abstractions Digital Object – Container for aggregating any digital content – Content disseminations based on behavior definitions – Extensibility of behavior mechanisms Repository – Service layer for “contained” Digital Objects – Object lifecycle management – Access management

11 FEDORA Digital Object
– Persistent ID (PID): globally unique persistent id
– Disseminators: public view; access methods for obtaining “disseminations” of digital object content
– System Metadata: internal view; metadata necessary to manage the object
– Datastreams: protected view; content that makes up the “basis” of the object
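The four parts of a digital object listed on this slide can be sketched as a small data structure. This is only an illustration: the class and field names (`DigitalObject`, `ds_id`, `behavior_definition_pid`, and so on) are assumptions made for the sketch, not names taken from the FEDORA software.

```python
from dataclasses import dataclass, field


@dataclass
class Datastream:
    """Protected view: one piece of content forming the 'basis' of the object."""
    ds_id: str       # hypothetical field name
    mime_type: str   # hypothetical field name
    location: str    # hypothetical field name


@dataclass
class Disseminator:
    """Public view: ties the object to a Behavior Definition and a Behavior Mechanism."""
    behavior_definition_pid: str
    behavior_mechanism_pid: str


@dataclass
class DigitalObject:
    pid: str                                             # globally unique persistent id
    system_metadata: dict = field(default_factory=dict)  # internal view
    datastreams: list = field(default_factory=list)      # protected view
    disseminators: list = field(default_factory=list)    # public view

    def behavior_definitions(self):
        """Return the PIDs of the Behavior Definitions this object subscribes to."""
        return [d.behavior_definition_pid for d in self.disseminators]
```

A client would normally touch only the public view (the disseminators); the datastreams and system metadata stay behind the repository.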

12 FEDORA Digital Object Architecture
– Behavior Definition Object: Persistent ID (PID), Service Definition Metadata, System Metadata, Datastreams
– Data Object: Persistent ID (PID), Disseminators, System Metadata, Datastreams
– Behavior Mechanism Object: Persistent ID (PID), Service Binding Metadata, System Metadata, Datastreams

13 Data Object Association to External Behavior Service

14 Digital Object Interoperability Common Behaviors for Variable Content Functional equivalency

15 Virginia Prototype Content Models and Fedora Demos

16 (Mycenae image example) General Image Content Model

17 (Pavilion III image example) MrSID Image Content Model

18 (ICPSR survey example) Numerical Data Content Model

19 The New FEDORA Repository System Implementation

20 Web Services and XML Web Service: A web application that publishes an open interface through which clients can send structured messages and receive structured responses. Simple Object Access Protocol (SOAP) – SOAP is a messaging protocol that can run over different transport protocols (e.g., HTTP, SMTP) – Requests and responses sent as XML messages Web Services Description Language (WSDL) – XML format used to formally define APIs (application programming interfaces) as a set of abstract operations and service bindings – Supports simple and complex data typing in requests and responses
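To make "requests and responses sent as XML messages" concrete, here is a minimal sketch that builds a SOAP 1.1 envelope with Python's standard library. The envelope namespace is the standard SOAP 1.1 one; the operation namespace `urn:example:repository` and the function name `build_request` are assumptions for illustration and do not come from the FEDORA WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
EXAMPLE_NS = "urn:example:repository"                  # hypothetical service namespace


def build_request(operation, params):
    """Wrap an operation name and its parameters in a SOAP envelope (as text)."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # The operation element and its children carry the structured request.
    op = ET.SubElement(body, f"{{{EXAMPLE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(op, f"{{{EXAMPLE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")
```

The resulting text would be sent as the body of an HTTP POST (or another transport); a WSDL document would describe which operations and parameter types the service accepts.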

21 New Fedora: Key Features Repository system exposed as two related Web services – described using WSDL – both SOAP and HTTP bindings Digital objects encoded and stored as XML using Metadata Encoding and Transmission Standard (METS) Digital object behaviors implemented as linkages to distributed web services (also described using WSDL) Digital objects support versioning of both content and services.

22 New Fedora System

23 Web Service Communication View

24 The New FEDORA Encoding Digital Objects in XML

25 Metadata Encoding and Transmission Standard (METS) Emerging XML standard for encoding descriptive, administrative, and structural metadata of digital library objects Developed under the auspices of the Digital Library Federation METS standard maintained by the Network Development and MARC Standards Office of the Library of Congress http://www.loc.gov/standards/mets/

26 METS Schema METS is written in the XML Schema Language METS defines four sections for an object – Descriptive metadata – Administrative metadata – File group – Structure map METS goals include: – Facilitate management of objects within a repository – Provide a standard format for exchange of objects between repositories – Provide standard format for transmission of objects to users for rendering (via tools or applications)
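The four sections named on this slide correspond to the METS elements `dmdSec`, `amdSec`, `fileSec` (which holds `fileGrp` elements), and `structMap`. The sketch below builds an empty skeleton with those elements; it is not a schema-valid METS document and not FEDORA's actual object encoding, and the ID values and the function name `mets_skeleton` are placeholders chosen for illustration.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"


def mets_skeleton(object_id):
    """Return a bare METS element tree with the four main sections present."""
    ET.register_namespace("mets", METS_NS)

    def q(name):
        # Qualify a local element name with the METS namespace.
        return f"{{{METS_NS}}}{name}"

    root = ET.Element(q("mets"), {"OBJID": object_id})
    ET.SubElement(root, q("dmdSec"), {"ID": "DMD1"})   # descriptive metadata
    ET.SubElement(root, q("amdSec"), {"ID": "AMD1"})   # administrative metadata
    file_sec = ET.SubElement(root, q("fileSec"))       # file section
    ET.SubElement(file_sec, q("fileGrp"), {"ID": "GRP1"})
    struct_map = ET.SubElement(root, q("structMap"))   # structure map
    ET.SubElement(struct_map, q("div"))
    return root
```

Calling `ET.tostring(mets_skeleton("101"), encoding="unicode")` gives the serialized skeleton, which a real object would fill with metadata and file references.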

27 Digital Object Versioning
Versioning within Data Objects
– Datastream versioning: date/time stamped; new version every time datastream is modified
– Disseminator versioning: date/time stamped; new version if disseminator is modified to reference a different Behavior Mechanism (“better mousetrap”)
Versioning within Behavior Definition and Mechanism Objects
– New versions of WSDL metadata recorded in these objects (with date/time stamps)
– This deserves much more explanation than this slide can offer!
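Datastream versioning as described here (a new date/time-stamped version on every modification) might be modeled as an append-only list. This is a sketch under assumptions: the class names are invented, and the `as_of` lookup by date is an illustrative addition that the slide does not describe.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class DatastreamVersion:
    created: datetime   # date/time stamp of this version
    content: bytes


@dataclass
class VersionedDatastream:
    ds_id: str
    versions: list = field(default_factory=list)

    def modify(self, content, when=None):
        """Record a new version; earlier versions are never overwritten."""
        stamp = when if when is not None else datetime.now(timezone.utc)
        version = DatastreamVersion(stamp, content)
        self.versions.append(version)
        return version

    def as_of(self, when):
        """Hypothetical helper: latest version created at or before `when`."""
        candidates = [v for v in self.versions if v.created <= when]
        return max(candidates, key=lambda v: v.created) if candidates else None
```

Disseminator versioning could follow the same pattern, with the stored value being the referenced Behavior Mechanism rather than content bytes.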

28 The New FEDORA Repository Services and Sub-systems

29 Fedora Repository System

30 FEDORA Web Service API Definitions “API-M” – interface for management sub-system – Operations necessary to create and maintain objects and their components – Interface directly with authoritative XML version of object “API-A” – interface for access sub-system – Operations necessary for clients to perform disseminations on objects in the repository – No direct access to object internal structure or components – Will work against cached representation of object to optimize performance.
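The split between the two interfaces can be sketched as one store with two views: management operations write the authoritative XML, and access operations read only a derived cache. Everything named here (`RepositorySketch`, `ingest`, the cache layout) is hypothetical; the slides do not list the API-M operations, and only the `GetDissemination` request on the access side appears in the source.

```python
class RepositorySketch:
    """Toy model of the API-M / API-A split; not the real FEDORA interfaces."""

    def __init__(self):
        self._xml_store = {}      # authoritative XML per PID (API-M side)
        self._access_cache = {}   # derived, read-optimized view (API-A side)

    def ingest(self, pid, xml_text, disseminations):
        """Hypothetical API-M operation: store the object and refresh the cache."""
        self._xml_store[pid] = xml_text
        # `disseminations` maps (BID, method) pairs to prepared results.
        self._access_cache[pid] = dict(disseminations)

    def get_dissemination(self, pid, bid, method):
        """API-A style read: served from the cache, never from the XML itself."""
        return self._access_cache[pid][(bid, method)]
```

The point of the design is that access clients never see the object's internal structure, so the XML encoding can change without breaking them.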

31 Fedora Management Sub-System Implements API-M Object Management Object Component Management Object Validation PID Generation Interacts with Storage Subsystem Access control via Security Subsystem

32 Fedora Access Sub-System Implements API-A Object Reflection – Identify the types of Behavior Definitions to which an object subscribes (via the object’s Disseminators) – Reflect on a Behavior Definition to identify the kinds of disseminations that can be run on the object (i.e., as method requests) Dissemination – Fulfills requests for particular methods (i.e., of a Behavior Definition) to be run on an object – Mediates access to supporting services (i.e., Behavior Mechanisms) used to present or transform datastreams of the object – Returns a view of the object’s content to client

33 API-A: Object Reflection Requests Identify Types of Behavior Definitions Each Disseminator is said to “subscribe” to a Behavior Definition It does this by referencing the PID of a particular Behavior Definition Object. Each Behavior Definition Object contains metadata that describes a set of related behaviors (or operations) Via API-A, clients can send a service request to determine what Behavior Definitions an object subscribes to.

34 API-A: Object Reflection Request Get Behavior Methods Each Disseminator has a Behavior Definition Object associated with it. Each Disseminator has a Behavior Mechanism Object associated with it that describes how to bind to a particular service that complies with the Disseminator’s Behavior Definition. Via API-A, clients can send a service request to obtain the list of method definitions associated with a particular Disseminator of the digital object.

35 API-A: Object Reflection Requests
[Diagram] MrSID Image Object (PID = 101) in the Repository, reached through API-A. Disseminators: Web-default, Web-image, Admin; System Metadata; Basis (MrSID-encoded image file)
– GetBehaviorDefinitions?PID=101 returns: Web-default, Web-image, Admin
– GetBehaviorMethods?PID=101&BID=Web-default returns: get-as-page; get-in-context
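The two reflection requests on this slide can be imitated with an in-memory stand-in for the repository. The data below is only what the slide shows for object 101; the function names and the dictionary layout are assumptions for illustration, not the real API-A signatures.

```python
# Behavior Definitions that each object subscribes to (from the slide example).
OBJECT_BEHAVIOR_DEFINITIONS = {
    "101": ["Web-default", "Web-image", "Admin"],
}

# Methods per Behavior Definition; the slide lists methods only for Web-default.
BEHAVIOR_METHODS = {
    "Web-default": ["get-as-page", "get-in-context"],
}


def get_behavior_definitions(pid):
    """Mimic GetBehaviorDefinitions?PID=...: list the object's Behavior Definitions."""
    return list(OBJECT_BEHAVIOR_DEFINITIONS[pid])


def get_behavior_methods(pid, bid):
    """Mimic GetBehaviorMethods?PID=...&BID=...: list methods for one definition."""
    if bid not in OBJECT_BEHAVIOR_DEFINITIONS[pid]:
        raise KeyError(f"object {pid} does not subscribe to {bid}")
    return list(BEHAVIOR_METHODS.get(bid, []))
```

A client that knows nothing about the object beyond its PID can call the first function, pick a Behavior Definition, and then ask the second which methods it may request.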

36 API-A: Dissemination Request Clients can obtain content from a digital object with minimal knowledge about the object. Behavior Definition identifiers and method definitions are the basis for making dissemination requests on digital objects Clients do not need to know particulars of how to attach to the service (Behavior Mechanism) that is operating on their behalf. A dissemination request requires just three things: – Digital Object Identifier (PID) – Behavior Definition Identifier (BID) – Method name (and optional parameters) for a behavior

37 API-A: Dissemination Request
[Diagram] A client page (“Bird Digital Library1”, listing “White Birds: Image 1, Image 2, Image 3”) sends a request to the Repository through API-A: GetDissemination?PID=101&BID=Web-default&method=get-as-page
Target: MrSID Image Object with disseminators Web-default, Web-image, Admin; System Metadata; Basis (MrSID-encoded image file)
Result: a page for “Digital Object: 101” showing an image of a bird
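A dissemination request needs only a PID, a BID, and a method name, so composing one over the HTTP binding amounts to building a query string. In this sketch the base URL is a placeholder and the function name is invented; the `GetDissemination` operation and the `PID`, `BID`, and `method` parameter names follow the slide.

```python
from urllib.parse import urlencode


def dissemination_url(base_url, pid, bid, method, **method_params):
    """Build a GetDissemination URL from the three required parts plus optional parameters."""
    query = {"PID": pid, "BID": bid, "method": method}
    query.update(method_params)  # optional method parameters, if any
    return f"{base_url}/GetDissemination?{urlencode(query)}"


# Example with a hypothetical repository address:
url = dissemination_url("http://repository.example.org", "101", "Web-default", "get-as-page")
```

Nothing in the URL names a datastream, a file format, or the service that does the work, which is what keeps the client independent of the object's internals.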

38 Disseminations Benefits Simple access: dissemination requests shield clients from the internal structure of digital objects Stable interface: dissemination requests are like requests against an abstract interface in that they are not tied to object implementation details that may change over time (e.g., storage locations of datastreams) Foster Interoperability: different digital objects can vary in both the format of content and how it is structured, yet we can access them in a consistent manner via disseminations.

39 Fedora Project Plan
Phase 1: (pre-release Oct 31, 2002; final Jan 2003)
– Repository system with management and access subsystems exposed as web services
– Storage subsystem with XML object store and replication to relational database cache
– Object builder tools (GUI and batch)
– Basic set of behavior services
Phase 2: Add more production support
– Security and policy enforcement
– Additional management tools
– Optimize performance for accessing XML objects
– Object versioning
– Collection objects
– Advanced disk management
Phase 3: Enhance end-user support
– New kinds of disseminators, with supporting behavior services
– Efficiency and scale optimization

40 FEDORA Web Site: www.fedora.info

