Presentation is loading. Please wait.

Presentation is loading. Please wait.

Fedora An Architecture for Complex Objects and their Relationships Old Dominion University, VA April 7, 2005 Sandy Payette Cornell University.

Similar presentations


Presentation on theme: "Fedora An Architecture for Complex Objects and their Relationships Old Dominion University, VA April 7, 2005 Sandy Payette Cornell University."— Presentation transcript:

1 Fedora An Architecture for Complex Objects and their Relationships Old Dominion University, VA April 7, 2005 Sandy Payette Cornell University

2 The Fedora Project Fedora –Flexible –Extensible –Digital –Object –Repository –Architecture Open source software –Not Red Hat ! –Mozilla Public License

3 The Digital Problem Space: Manage, Publish, Preserve… Conventional digital objects Complex, compound, dynamic objects

4 Fedora History Cornell Research (1997-present) –DARPA and NSF-funded research –First reference implementation developed –Distributed, Interoperable Repositories (experiments with CNRI) –Policy Enforcement First Application (1999-2001) –University of Virginia digital library prototype –Technical implementation: adapted to web; RDBMS storage –Scale/stress testing for 10,000,000 objects Open Source Software (2002-present) –Andrew W. Mellon Foundation grants –Technical implementation: XML and web services –Fedora 1.0 (May 2003) –Fedora 2.0 (Jan 2005)

5 Why Fedora? (1) Digital Object Model –Abstraction for heterogeneous digital resources –Flexibility for different “content models” –Container for content and metadata (both local and remote) –Behaviors (extensible service attachments) Service-Oriented Architecture –Core Repository web service (SOAP and REST) –Service Framework for collaborating and supporting services Feature-worthy for archiving and preservation –XML object serialization for ingest, storage, and export –Content versioning –Event history –Manage dependencies (e.g., object-to-object relationships) –New service framework for plug-in of preservation services

6 Why Fedora? (2) Relationships –RDF-based index –Define and query “the graph” of objects in repository Content repurposing –Reuse digital content in different contexts –Re-purpose content via mechanisms for dynamically transforming content to fit new requirements Web Services –SOAP and REST bindings –WSDL to define interfaces –XML transmission Easy integration with other apps and systems –Does not assume any particular workflow or end-user application –Generic repository service as substrate

7 Fedora Use Cases Digital Library Collections Institutional Repository Educational Software Information Network Overlay Digital archives and preservation Digital Asset Management Content Management System Scholarly publishing

8 Selected Fedora Adopters University of Virginia (image, EAD, e-texts)imageEADe-texts VTLS Tufts University OhioLink Northwestern: Library and Academic Technologies National Science Digital Library (NSDL): Core Integration ARROW: National Library of Australia and Monash University Royal Library Denmark, National Library, and DTU Rutgers University Indiana University American Geophysical Union Library of Congress: I Hear America Singing University of Delaware Hamilton College Cornell CIT Tibetan Buddhist Resource Center Yale University DISA – South Africa, History of Apartheid resistance

9 Fedora Digital Object Model

10 Digital object identifier Reserved Datastreams Key object metadata Disseminators Pointers to service definitions to provide service-mediated views Datastreams Set of content or metadata items Fedora Digital Object Model Component View Persistent ID (PID) Dublin Core (DC) Datastream Audit Trail (AUDIT) Relations (RELS-EXT) Disseminator Default Disseminator

11 Managed Content Fedora stores and manages the content bytestream Fedora stores a reference (URL) to the content Fedora stores a reference (URL) to the content, but will not mediate access to content. Fedora stores a name-spaced block of XML content within the Fedora digital object XML file. The Datastream Component External Referenced External Redirected Inline XML 4 Classifications for Datastreams

12 implements Behavior Definition (BDEF Object) External Service The Disseminator Component Library Resource (Data Object) Persistent ID (PID) Disseminator Default Disseminator Persistent ID (PID) Service Binding Metadata (WSDL) Persistent ID (PID) Service Definition Metadata Behavior Mechanism (BMECH Object) Encapsulates references to service definition objects

13 Example Digital Object Get Object Profile Get DC Get THUMB Get MRSID Get Medium Quality Get High Quality Get MARC Record External Service External Service Default Disseminations Custom Disseminations Persistent ID (PID) DC (text/xml) AUDIT (text/xml) RELS-EXT (text/xml) Image Disseminator Default Disseminator Metadata Disseminator MRSID (image/x-mrsid) THUMB (image/gif)

14 Fedora – XML for digital objects FOXML (Fedora Object XML) –Simple XML format directly expresses Fedora object model –Easily adapts to Fedora new and planned features –Easily translated to other well-known formats –Internal storage format for objects in repository XML-based Ingest/Export of objects –FOXML, METS (Fedora extension) –Extensible to accommodate new XML formats –Planned: METS 1.4, MPEG21 DIDL

15 FOXML – Object Properties 2

16 FOXML – Datastream (type ‘E’) 2

17 FOXML – Relationships Datastream 2

18 FOXML – Disseminator 2

19 Fedora Repository Service

20 RDF files rdbms

21 Fedora 2.0 Repository Modules Management and Validation (via API-M) –Object Ingest and Export –Object Validation –Object Maintenance (create-modify-delete-purge-etc.) –Object Versioning –Initiates incremental object Indexing Access and Dissemination (via API-A and API-A-Lite) –Get object profile –Get Datastream Disseminations –Get Service-Mediated Disseminations Storage –Default file system implementation –Stores XML object wrappers and datastream byte stream content –Relational database registry

22 Fedora 2.0 Repository Modules Basic Search –Search object properties and DC record of each object –Relational database Resource Index Search –Search “graph” of objects with object properties, object relationships, DC –Kowari triple-store (RDF-based index of repository) Authorization –XACML policy enforcement –Repository policies and object-specific policies –Sun XACML Engine OAI-PMH

23 Fedora Web Service APIs in a Nutshell Management Service (API-M) –Ingest Object –Export Object –Get Object XML –Purge Object –Modify Object –Get Next PID –Get Datastream(s) –Get DatastreamHistory –Get DisseminatorHistory –Get Disseminator(s) –Add/modify/purge Datastream –Add/modify/purge Disseminator –Set State

24 Fedora Web Service APIs in a Nutshell Access Service (API-A and API-A-LITE) –Describe Repository –Get Object Profile –Get Object History –Get Datastream –Get Dissemination –Find Objects –Resume Find Objects

25 Fedora Web Service APIs in a Nutshell API-A-Lite –Repository-level operations: fedora/describe - Describe Repository fedora/search – methods to locate objects via the default repository index –Object-level operations: fedora/get - method to get object profile fedora/get/.. – method to “disseminate” a view of an object’s content Fedora/getMethods – methods get information about all disseminations available on object OAI-PMH Provider Service –All OAI-PMH methods to harvest OAI-DC from each object

26 Fedora 2.1 (May 2005) Authentication plug-ins –HTTP basic authentication and SSL –Plug-in #1 : Tomcat user/password file/db –Plug-in #2 : LDAP tie-in –Plug-in #3 : Radius Authentication Authorization module –XML-based policies using XACML –Fine-grained policy enforcement (API actions X subject attrs X object attrs) –Repository-wide policies –Object-specific policies

27 Fedora 2.0 - Clients Fedora Administrator (via SOAP interfaces) –Java Swing client –Ingest/Export objects –Batch and single object creation and modification –Wizards for creating BDEF/BMECH objects Web Browser (via REST interfaces) –API-A-LITE: Access, Search, –OAI –RISearch: Resource Index Search –API-M-LITE: Selected management operations Command Line Utilities –Ingest, export, purge –Migration Policy Builder (available in 2.1) –Graphical user interface to assist in authoring XACML policies

28 Fedora Service Framework (as of Fedora 2.1)

29 Phase 2 – Preservation Services Enhanced ingest/export (for archive transfers) –Focus on exchange of self-contained archives among different systems –OAI Harvesting of complex objects (e.g., MPEG21-DIDL) Preservation Integrity Service –Check Datastreams on ingest and modification –Validate byte stream format integrity (e.g., via JHOVE) –Checksums Preservation Monitoring Service –Dependency checking –Publication of repository events significant to preservation

30 Fedora Service Framework (Year 1 - starting v2.1)

31 Fedora Service Framework (Year 2)

32 Fedora Service Framework (Year 3)

33 Fedora Resource Index: Practical Use of Semantic Web Technologies

34 Fedora Digital Objects Resource Index View

35 Fedora Digital Objects Service Relationships – Object to BDef/BMech

36 Fedora 2.0 and RDF Object-to-object Relationships –Ontology of common relationships (RDF schema) –Relationships stored in special datastream (RELS-EXT) Resource Index (RI) –RDF-based index of repository (Kowari triple-store) –Graph-based index includes: Object properties and Dublin Core Object Relationships Object Disseminations RI Search –Powerful querying of graph of inter-related objects –REST-based query interface (using RDQL or ITQL) –Results in different formats (triples, tuples, sparql)

37 Uses of Object Relationships Define collections (e.g., collection objects) Assert critical relationships among object for management purposes Enable network overlay –Surrogate objects referring to external entities –Assert relationships among them –Assert other relationships (e.g., annotations) Enable navigation of repository (as tree or graph)

38 Fedora Relationship Ontology (RDFS) isPartOf / hasPart isMemberOf / hasMember isDescriptionOf / hasDescription hasEquivalent … others

39 Demo: Collection – Member Relationships Collection Object [smiley]smiley –Datastream containing a query to Resource Index for all members of collection Image Objects [brush]brush –Use RELS-EXT datastream to assert relationship to collection object

40 Example: UVA’s Scholarly Text Collections

41 UVa TEI Book Content Model

42 TEI Book Example link

43 Example: UVA’s Quantitative Data Collections

44 Example: NSDL Network Overlay Architecture Fedora

45 Phase 2 - Development Plans New Services (as depicted in Framework) Federated repositories Performance – goal of 10 million objects Web services security and Shibboleth Event Notification – pub/sub Code Refactoring New Client Applications –Fedora-Web-IR (for institutional repository use) –Advanced Scholarly workbench –Workflow –Tools for RDF browse and graph traversal

46 Fedora Software Distribution Open Source (Mozilla Public License) 100% Java (Sun Java J2SDK1.4) Supporting Technologies –Apache Tomcat –Apache Axis (SOAP) –Xerces for XML parsing and validation –Saxon for XSLT transformation –Schematron for validation –RDBMS: MySQL, Mckoi, Oracle 9i support –Kowari triple store –Sun XACML Engine for policy enforcement Deployment Platforms –Windows 2000, NT, XP –Solaris –Linux –Mac OSX

47 Fedora Development Consortium Advisory Board –University of Virginia –Tufts –VTLS –ARROW (Monash University and Nat’l Lib Australia) –Harris Corp. –Danish Royal Library and DTU –Northwestern University –NSDL – Core Integration Mission –Requirements Definition, Specifications. Joint Development –Commission of Working Groups Content Modeling Outreach and Education Workflow and Service-Oriented Processes –Recommendation for Long-Term sustainability model Governance and Funding Set Fedora Free – full open source model (e.g., public SourceForge) Code Maintenance (UVA until 2012; plan for beyond)

48 Recent News Downloads ~20K; 52 countries Growth – lots of new interest Fedora Users Conference (May 13-14, Rutgers) Interesting new adopters –OhioLink –DISA (South Africa history) Interesting new proposals –Company X finalist for large government contract –Cornell Lab of Ornithology (data + tools + documents) Recent Article –XML CoverPages http://xml.coverpages.org/ni2005-03-18-a.html http://xml.coverpages.org/ni2005-03-18-a.html

49 Finally, a new Fedora Web Site! www.fedora.info


Download ppt "Fedora An Architecture for Complex Objects and their Relationships Old Dominion University, VA April 7, 2005 Sandy Payette Cornell University."

Similar presentations


Ads by Google