Digital Library Architecture: A Service-Based Approach


1 Digital Library Architecture: A Service-Based Approach
Mo i Rana, Norway, November 10, 1998
Sandra Payette, Department of Computer Science, Cornell University

2 Overview
Why talk about DL architecture?
Digital Libraries - the architectural perspective
Review of service-based architecture
NCSTRL - a working example
Dienst - existing service-oriented architecture
Cornell next generation (component-oriented)
Conclusion

3 Why Talk about Digital Library Architecture?
Web alone is not a digital library
Commercial packages limited
  limited flexibility
  standards issues
  network-enabled applications, not DL architecture
Must position for broader DL opportunities

4 Web by itself not a DL Architecture
Documents - Files, CGI, MIME-Types
Naming - URLs
Document Servers - HTTP servers
Resource Discovery - web crawlers
Collections - web pages, ad-hoc
IP - Access Control List, passwords, ad-hoc

5 WWW Infrastructure Evolving
Resource Description Framework (RDF) will allow rich metadata semantics for documents
Extensible Markup Language (XML) will allow highly structured documents and rich linking (relationship) capabilities
Uniform Resource Names (URNs) will allow for persistent, globally unique identifiers

6 But still need Digital Library Architecture
Richer document model - digital objects
Persistent, unique naming - URNs
Well-defined digital library services
Better facilities for resource discovery
Flexible definition of collections
Management of distributed content & services
Rights management for intellectual property

7 Digital Library Interoperability
Cornell Digital Library Nordic Digital Library

8 Digital Library Architecture: Key Principles
Open Architecture
  functionality partitioned into set of well-defined services
  services accessible via well-defined protocol
Modularization
  promotes interoperability
  scalable to different clientele (research library, informal web)
Federation
  enable aggregations into logical collections
Distribution
  of content (collections) and services
  of administration and management of DL

9 Component-Ware Digital Libraries
[Diagram labels: User Interface Gateway; Collection Services; Name Service (persistent names); Index Services; Repository Services; Digital Objects]

10 NCSTRL: A Working Example
A Globally Distributed Digital Library
120+ Institutions in US, Europe, and Asia

11 NCSTRL Participants: collections federated
120+ institutions
  Universities/labs - research reports
  European Research Consortium for Informatics and Mathematics (ERCIM)
  Los Alamos (Physics pre-prints, ACM)
  D-Lib Magazine
40+ independent servers

12 Federation of Collections

13 Documents in Distributed Repositories

14 Multi-Format Document Model

15 NCSTRL: Real-world testbed for ...
modular system based on a standard open architecture
study of hard, real-world problems: policy issues, quality of service, federation of publishers
creation of a self-sustaining international federated digital collection

16 Dienst: NCSTRL technical base
Implements a service-based architecture for distributed digital libraries
Protocol and reference implementation
Network of services
WWW browser access
Uniform search over distributed indexes
Access to documents in distributed repositories
Access to multi-formatted documents

17 Dienst: Service-Based Architecture
Document model
Naming service (CNRI’s Handle System)
Repository service
Indexer service
Collection service
User Interface service

18 Dienst Document Model
[Diagram labels: Handle (URN); metadata; decompositions (logical); representations (physical); underlying formats: ASCII, TIFF, PostScript]

19

20 Dienst: Document Protocol
Documents addressable through their URNs
Document service requests
  get document metadata
  get document formats
  get document in format
  get document partition (page) in format
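
The four document service requests on this slide can be pictured as methods on a repository object. The sketch below is only an in-memory illustration, not the Dienst protocol or its reference implementation: the class names, method names, and the identifier `example.authority/TR-0001` are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """A document with metadata and per-format page content (illustrative model)."""
    urn: str
    metadata: dict[str, str]
    # format name -> list of pages, each page held as bytes
    formats: dict[str, list[bytes]] = field(default_factory=dict)


class Repository:
    """Toy in-memory repository answering the four document requests."""

    def __init__(self) -> None:
        self._docs: dict[str, Document] = {}

    def add(self, doc: Document) -> None:
        self._docs[doc.urn] = doc

    def get_metadata(self, urn: str) -> dict[str, str]:
        # "get document metadata"
        return dict(self._docs[urn].metadata)

    def get_formats(self, urn: str) -> list[str]:
        # "get document formats"
        return sorted(self._docs[urn].formats)

    def get_document(self, urn: str, fmt: str) -> bytes:
        # "get document in format": all pages of that format, concatenated
        return b"".join(self._docs[urn].formats[fmt])

    def get_page(self, urn: str, fmt: str, page: int) -> bytes:
        # "get document partition (page) in format"; pages are numbered from 1
        if page < 1:
            raise IndexError("page numbers start at 1")
        return self._docs[urn].formats[fmt][page - 1]


repo = Repository()
repo.add(Document(
    urn="example.authority/TR-0001",  # hypothetical identifier
    metadata={"title": "An Example Report"},
    formats={"ascii": [b"page one\n", b"page two\n"], "tiff": [b"<p1>", b"<p2>"]},
))

print(repo.get_formats("example.authority/TR-0001"))
print(repo.get_page("example.authority/TR-0001", "ascii", 2))
```

Every request takes the document's name as its first argument, which mirrors the slide's point that documents are addressed through their URNs rather than through file locations.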

21 Dienst 5.0: Document Protocol
More complex document model:
  versions
  hierarchical part specification
  binders (multi-part documents)
“Structure” service request
  Reveal, in XML, full or collapsed structure of a document
    e.g., chapters, sections, figures, etc.
  Describe multiple views of a document
    e.g., bibliography, content, thumbnails
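
To make the "full or collapsed structure" idea concrete, here is a minimal sketch that renders a document's part hierarchy as XML. The element and attribute names (`structure`, `label`, and the part kinds) are invented for illustration; the slides do not give the actual Dienst 5.0 XML vocabulary.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Part:
    kind: str   # e.g. "chapter", "section", "figure"
    label: str
    children: list["Part"] = field(default_factory=list)


def structure_xml(doc_id: str, parts: list[Part], collapsed: bool = False) -> str:
    """Render the part hierarchy as XML; collapsed shows only top-level parts."""
    root = ET.Element("structure", {"document": doc_id})

    def emit(parent: ET.Element, part: Part) -> None:
        node = ET.SubElement(parent, part.kind, {"label": part.label})
        if not collapsed:
            for child in part.children:
                emit(node, child)

    for part in parts:
        emit(root, part)
    return ET.tostring(root, encoding="unicode")


# Hypothetical document: one chapter holding a section and a figure.
parts = [Part("chapter", "1", [Part("section", "1.1"), Part("figure", "1")])]

full_view = structure_xml("example.authority/TR-0001", parts)
collapsed_view = structure_xml("example.authority/TR-0001", parts, collapsed=True)
print(full_view)
print(collapsed_view)
```

The collapsed view lets a client show a table of contents cheaply and ask for the full structure only when the user drills down.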

22 Dienst: Core Services
[Diagram: WWW browser and Dienst User Interface: send search request, receive unified hit list; send document request, receive MIME-typed document. Dienst User Interface and Index: send site-specific search request, receive hit list. Components shown: WWW browser, Dienst User Interface, Repository, Index; links between them are labeled "Protocol".]
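
The search flow on this slide (one search request in, site-specific requests out, one unified hit list back) might be sketched as follows. This is an in-memory stand-in with no networking; `IndexServer`, `Hit`, `unified_search`, and the sample data are hypothetical names, not part of Dienst.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    doc_id: str
    title: str
    site: str


class IndexServer:
    """Stand-in for one site's index service (in-memory, no network)."""

    def __init__(self, site: str, titles: dict[str, str]) -> None:
        self.site = site
        self._titles = titles  # doc_id -> title

    def search(self, term: str) -> list[Hit]:
        term = term.lower()
        return [Hit(doc_id, title, self.site)
                for doc_id, title in self._titles.items()
                if term in title.lower()]


def unified_search(term: str, servers: list[IndexServer]) -> list[Hit]:
    """Send the query to every index and merge the per-site hit lists."""
    merged: dict[str, Hit] = {}
    for server in servers:
        for hit in server.search(term):
            merged.setdefault(hit.doc_id, hit)  # keep the first copy of a duplicate
    return sorted(merged.values(), key=lambda h: h.title)


site_a = IndexServer("site-a", {"a/1": "Distributed Indexing", "a/2": "Handle Resolution"})
site_b = IndexServer("site-b", {"b/1": "Indexing Technical Reports"})

hits = unified_search("indexing", [site_a, site_b])
for hit in hits:
    print(hit.site, hit.doc_id, hit.title)
```

The user interface layer is the only component that knows how many index servers exist; the browser sees a single hit list.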

23 Building Gateways to non-Conforming Sites
[Diagram labels: Dienst Protocol; Standard Servers; User Interface; Gateway Server; FTP/HTTP “Repositories”]

24 Dienst: Collection Service

25 Naming Service
Documents identified by globally unique names
Names are persistent, permanent
Registered names resolve to specific location (URL)
Example: cnri.dlib/april97-payette
  Persistent Identifier (e.g., URN) = Naming Authority (cnri.dlib) + Item Name (april97-payette), which resolves to a Location (URL)
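
A minimal sketch of the naming idea, assuming a name has the form `authority/item` as in the slide's example `cnri.dlib/april97-payette`. The `NameService` class and both URLs are hypothetical; this is not the Handle System API.

```python
from typing import NamedTuple


class Identifier(NamedTuple):
    naming_authority: str
    item_name: str


def parse_identifier(name: str) -> Identifier:
    """Split 'authority/item' at the first slash."""
    authority, sep, item = name.partition("/")
    if not sep or not authority or not item:
        raise ValueError(f"expected 'authority/item', got {name!r}")
    return Identifier(authority, item)


class NameService:
    """Toy resolver: the name is permanent, the location behind it may change."""

    def __init__(self) -> None:
        self._locations: dict[Identifier, str] = {}

    def register(self, name: str, url: str) -> None:
        # Registering an existing name again just updates its location.
        self._locations[parse_identifier(name)] = url

    def resolve(self, name: str) -> str:
        return self._locations[parse_identifier(name)]


names = NameService()
# Hypothetical locations, used only to show that the name survives a move.
names.register("cnri.dlib/april97-payette", "http://old-host.example.org/payette.html")
names.register("cnri.dlib/april97-payette", "http://new-host.example.org/payette.html")

print(parse_identifier("cnri.dlib/april97-payette"))
print(names.resolve("cnri.dlib/april97-payette"))
```

Citations carry the name rather than the URL, so moving the document means updating one registry entry instead of every reference to it.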

26 Identifiers: Current Initiatives
IETF Uniform Resource Names (URN)
  specification of URN framework
  requirements for resolution systems
  syntax definition
Existing Systems
  CNRI’s Handle System (used by NCSTRL)
  OCLC PURLs
  DOI Initiative

27 Looking Ahead: Current Research at Cornell
Digital Objects and Repository
  FEDORA
  Joint work in Interoperability with CNRI
Access Management
Resource Discovery
  STARTS (Cornell/Stanford collaboration)
  Intelligent Distributed Searching
Collection Definition

28 Digital Object is... recognizable by what it can do
[Diagram labels: getSection, getArticle, getTrack, getLabel, getChapter, getPage, getFrame, getLength]

29 What the client sees vs. What the object is
[Diagram labels: Book; Content-Type; Interfaces; MARC; Mechanism; Structure]

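30 FEDORA
[Diagram: a DigitalObject holding datastreams DS1 (MARC) and DS2 (application/postscript). A Generic Disseminator offers ListContentTypes, which returns Book, DublinCore. A Book Disseminator offers GetChapter, GetIndex, GetPage. A DublinCore disseminator is also shown. Example request: Get(Book.getPage(1)).]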
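
The FEDORA diagram can be read as an object that stores raw datastreams and exposes them through disseminators, each advertising a content type. The sketch below follows that reading, with DS1 as a MARC-like record and DS2 as paged content; this pairing and all class and method names are assumptions for illustration, not the FEDORA interfaces.

```python
class BookDisseminator:
    """Presents a paged datastream through book-like behaviors."""
    content_type = "Book"

    def __init__(self, pages: list[str]) -> None:
        self._pages = pages

    def get_page(self, number: int) -> str:
        if not 1 <= number <= len(self._pages):
            raise IndexError(f"no page {number}")
        return self._pages[number - 1]


class DublinCoreDisseminator:
    """Presents a descriptive record through metadata behaviors."""
    content_type = "DublinCore"

    def __init__(self, record: dict[str, str]) -> None:
        self._record = record

    def get_title(self) -> str:
        return self._record["title"]


class DigitalObject:
    """Datastreams plus the disseminators that give them behavior."""

    def __init__(self, datastreams: dict[str, object], disseminators: list) -> None:
        self.datastreams = datastreams
        self._disseminators = {d.content_type: d for d in disseminators}

    def list_content_types(self) -> list[str]:
        # Generic behavior every object supports.
        return sorted(self._disseminators)

    def get(self, content_type: str, method: str, *args):
        # Dispatch a request such as Book.get_page(1) to the right disseminator.
        return getattr(self._disseminators[content_type], method)(*args)


record = {"title": "An Example Book"}    # stands in for DS1 (descriptive record)
pages = ["Title page", "Chapter 1 text"]  # stands in for DS2 (paged content)

obj = DigitalObject(
    datastreams={"DS1": record, "DS2": pages},
    disseminators=[BookDisseminator(pages), DublinCoreDisseminator(record)],
)

print(obj.list_content_types())
print(obj.get("Book", "get_page", 1))
```

A client only needs `list_content_types` to discover what an unfamiliar object can do, which is the sense in which a digital object is "recognizable by what it can do".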

31 FEDORA: Extensibility for Content Types
Simple, familiar content types
Complex, compound, dynamic content types

32 Resource Discovery
Meta-Searching for Resource Discovery
  query multiple document sources
  choose best sources to evaluate a query
  evaluate the query at these sources
  merge the query results from these sources
Stanford Protocol Proposal for Internet Retrieval and Search (STARTS)
  www-db.stanford.edu/~gravano/starts.html
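
The three meta-searching steps (choose sources, evaluate, merge) might look like the sketch below. It is not the STARTS protocol: the per-source word summary, the term-count scoring, and the names `Source` and `meta_search` are simplifying assumptions chosen to keep the example small.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str
    docs: dict[str, str]  # doc_id -> text

    def summary(self) -> dict[str, int]:
        """Number of documents containing each word (a simple content summary)."""
        counts: dict[str, int] = {}
        for text in self.docs.values():
            for word in set(text.lower().split()):
                counts[word] = counts.get(word, 0) + 1
        return counts

    def evaluate(self, terms: list[str]) -> list[tuple[str, int]]:
        """Return (doc_id, score); score counts occurrences of query terms."""
        results = []
        for doc_id, text in self.docs.items():
            words = text.lower().split()
            score = sum(words.count(term) for term in terms)
            if score:
                results.append((doc_id, score))
        return results


def meta_search(query: str, sources: list[Source], max_sources: int = 2):
    terms = query.lower().split()

    def promise(source: Source) -> int:
        summary = source.summary()
        return sum(summary.get(term, 0) for term in terms)

    # 1. Choose the best sources from their summaries.
    ranked = sorted(sources, key=promise, reverse=True)
    chosen = [s for s in ranked[:max_sources] if promise(s) > 0]
    # 2. Evaluate the query at the chosen sources only.
    rows = [(score, s.name, doc_id) for s in chosen for doc_id, score in s.evaluate(terms)]
    # 3. Merge: highest score first, ties broken by source and document name.
    rows.sort(key=lambda r: (-r[0], r[1], r[2]))
    return [(name, doc_id, score) for score, name, doc_id in rows]


sources = [
    Source("physics", {"p1": "quantum field theory",
                       "p2": "quantum computing and quantum information"}),
    Source("cs", {"c1": "distributed digital library",
                  "c2": "quantum computing survey"}),
    Source("math", {"m1": "number theory"}),
]

results = meta_search("quantum computing", sources)
print(results)
```

The merge step here assumes scores from different sources are directly comparable. Real sources rank with different algorithms, which is one reason a shared protocol for exchanging source metadata and result statistics is useful.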

33 Distributed Collection Service: Definition and Access
[Diagram: a User Interface, a Central Collection Server, and three Collection Query Routers; caption: Intelligent routing based on regional conditions]

34 Conclusions: Design with an Eye Toward the Future
Know limitations of ad-hoc web development and commercial packages
Embrace a service-based approach
  modular designs increase flexibility, extensibility, plug-in/plug-out
  well-defined services with protocols to enable federation and interoperability
  can utilize various technologies or commercial software underneath the service layers
Watch Web developments in XML and RDF

35 Further reading
Lagoze and Payette: An Infrastructure for Open-Architecture Digital Libraries
Davis and Lagoze: NCSTRL: Design and Deployment of a Globally Distributed Digital Library, Draft of submission to IEEE Computer Special Issue on Digital Libraries, February
Payette: Persistent Identifiers, RLG DigiNews
Payette and Lagoze: Flexible and Extensible Digital Object and Repository Architecture (FEDORA)

