Digital Library Architecture: A Service-Based Approach Sandra Payette Department of Computer Science Cornell University Mo i Rana,

1 Digital Library Architecture: A Service-Based Approach Sandra Payette Department of Computer Science Cornell University Mo i Rana, Norway November 10, 1998

2 Overview
Why talk about DL architecture?
Digital Libraries - the architectural perspective
Review of service-based architecture
NCSTRL - a working example
Dienst - existing service-oriented architecture
Cornell next generation (component-oriented)
Conclusion

3 Why Talk about Digital Library Architecture?
Web alone is not a digital library
Commercial packages limited
–limited flexibility
–standards issues
–network-enabled applications not DL architecture
Must position for broader DL opportunities

4 Web by itself not a DL Architecture
Documents - Files, CGI, MIME-Types
Naming - URLs
Document Servers - HTTP servers
Resource Discovery - web crawlers
Collections - web pages, ad-hoc
Intellectual property (IP) - Access Control Lists, passwords, ad-hoc

5 WWW Infrastructure Evolving
Resource Description Framework (RDF)
–will allow rich metadata semantics for documents
Extensible Markup Language (XML)
–will allow highly structured documents and rich linking (relationship) capabilities
Uniform Resource Names (URNs)
–will allow for persistent, globally unique identifiers

6 But still need Digital Library Architecture
Richer document model - digital objects
Persistent, unique naming - URNs
Well-defined digital library services
Better facilities for resource discovery
Flexible definition of collections
Management of distributed content & services
Rights management for intellectual property

7 Digital Library Interoperability
[Diagram labels: Nordic Digital Library; Cornell Digital Library]

8 Digital Library Architecture: Key Principles
Open Architecture
–functionality partitioned into set of well-defined services
–services accessible via well-defined protocol
Modularization
–promotes interoperability
–scalable to different clientele (research library, informal web)
Federation
–enable aggregations into logical collections
Distribution
–of content (collections) and services
–of administration and management of DL

9 Component-Ware Digital Libraries
[Diagram labels: Repository Services; Collection Services; Index Services; Name Service (persistent names); User Interface Gateway; Digital Objects]

10 NCSTRL: A Working Example
A Globally Distributed Digital Library
120+ Institutions in US, Europe, and Asia

11 NCSTRL Participants: collections federated
120+ institutions
–Universities/labs - research reports
–European Research Consortium for Informatics and Mathematics (ERCIM)
–Los Alamos (Physics pre-prints, ACM)
–D-Lib Magazine
40+ independent servers

12 Federation of Collections

13 Documents in Distributed Repositories

14 Multi-Format Document Model

15 NCSTRL: Real-world testbed for...
modular system based on a standard open architecture
study of hard, real-world problems: policy issues, quality of service, federation of publishers
creation of a self-sustaining international federated digital collection

16 Dienst
NCSTRL technical base
Implements a service-based architecture for distributed digital libraries
Protocol and reference implementation
Network of services
WWW browser access
Uniform search over distributed indexes
Access to documents in distributed repositories
Access to multi-formatted documents

17 Dienst: Service-Based Architecture
Document model
Naming service (CNRI’s Handle System)
Repository service
Indexer service
Collection service
User Interface service

18 Dienst Document Model
[Diagram labels: Handle (URN); metadata; decompositions; representations; physical; logical; underlying formats: ASCII, TIFF, PostScript]


20 Dienst: Document Protocol
Documents addressable through their URNs
Document service requests
–get document metadata
–get document formats
–get document in format
–get document partition (page) in format
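The four document service requests can be pictured with a small in-memory sketch. This is only an illustration under stated assumptions: the class and method names (DocumentService, get_metadata, and so on) and the sample identifier are hypothetical, and the code does not reproduce the actual Dienst request syntax or wire format.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """A document: metadata plus one list of pages per format."""
    urn: str
    metadata: dict
    # format name -> list of pages, each page stored as bytes
    formats: dict = field(default_factory=dict)


class DocumentService:
    """Hypothetical in-memory stand-in for a repository's document requests."""

    def __init__(self):
        self._docs = {}

    def add(self, doc):
        self._docs[doc.urn] = doc

    def get_metadata(self, urn):
        # "get document metadata"
        return self._docs[urn].metadata

    def get_formats(self, urn):
        # "get document formats"
        return sorted(self._docs[urn].formats)

    def get_document(self, urn, fmt):
        # "get document in format": all pages of that format, concatenated
        return b"".join(self._docs[urn].formats[fmt])

    def get_page(self, urn, fmt, page):
        # "get document partition (page) in format"; pages count from 1
        return self._docs[urn].formats[fmt][page - 1]


# Made-up identifier and content, for illustration only.
service = DocumentService()
service.add(Document(
    urn="example.authority/TR-0001",
    metadata={"title": "A Sample Report"},
    formats={"ascii": [b"page one\n", b"page two\n"]},
))
second_page = service.get_page("example.authority/TR-0001", "ascii", 2)
```

Every request takes the document's name as its first argument, which mirrors the slide's point that documents are addressed through their URNs rather than through file locations.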

21 Dienst 5.0: Document Protocol
More complex document model:
–versions
–hierarchical part specification
–binders (multi-part documents)
“Structure” service request
–Reveal, in XML, full or collapsed structure of a document, e.g., chapters, sections, figures, etc.
–Describe multiple views of a document, e.g., bibliography, content, thumbnails

22 Dienst: Core Services
[Diagram components: WWW browser; Dienst User Interface; Index; Repository (two instances)]
[Diagram message labels: send search request; send site specific search request; receive hit list; receive unified hit list; send document request; receive MIME-typed document (the last two appear twice)]
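The search half of this message flow (one search request in, site specific requests out, one unified hit list back) might look roughly like the sketch below. It is a simplification under assumptions: the function names and sample data are hypothetical, each index server is modeled as a plain Python callable instead of a network service, and duplicates are simply dropped in arrival order.

```python
def unified_search(query, index_servers):
    """Send `query` to every index server and merge the hit lists.

    `index_servers` maps a site name to a callable that takes a query
    string and returns a list of document identifiers. The callable is a
    hypothetical stand-in for a site specific search request.
    """
    unified = []
    seen = set()
    for site, search in index_servers.items():
        for doc_id in search(query):
            if doc_id not in seen:  # drop duplicates across sites
                seen.add(doc_id)
                unified.append((site, doc_id))
    return unified


def make_index(entries):
    """Build a toy index that matches the query against document titles."""
    def search(query):
        return [doc_id for doc_id, title in entries.items()
                if query.lower() in title.lower()]
    return search


# Made-up sites and documents, for illustration only.
servers = {
    "site-a": make_index({"a/1": "Digital Library Protocols",
                          "a/2": "Compilers"}),
    "site-b": make_index({"b/7": "Library Federation",
                          "a/1": "Digital Library Protocols"}),
}
hits = unified_search("library", servers)
```

The caller sees a single list and never learns how many index servers were consulted, which is the role the slide assigns to the user interface service.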

23 Building Gateways to non-Conforming Sites
[Diagram labels: Dienst Protocol; FTP/HTTP “Repositories”; Standard Servers; User Interface; Gateway Server]

24 Dienst: Collection Service

25 Naming Service
Documents identified by globally unique names
Names are persistent, permanent
Registered names resolve to specific location (URL)
Example: cnri.dlib/april97-payette (naming authority: cnri.dlib; item name: april97-payette)
[Diagram labels: Persistent Identifier (e.g., URN); Location (URL)]
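The split between a persistent name and its current location can be sketched as a small registry. This is a toy model, not the Handle System's API: the NameService class and its methods are hypothetical, and the two URLs are placeholders on example.org rather than real locations.

```python
class NameService:
    """Toy registry mapping persistent names to current locations."""

    def __init__(self):
        self._locations = {}

    @staticmethod
    def parse(name):
        """Split 'authority/item' into (naming authority, item name)."""
        authority, sep, item = name.partition("/")
        if not sep or not authority or not item:
            raise ValueError(f"expected 'authority/item', got {name!r}")
        return authority, item

    def register(self, name, url):
        # Registering again under the same name moves the document;
        # the name itself never changes.
        self._locations[self.parse(name)] = url

    def resolve(self, name):
        return self._locations[self.parse(name)]


names = NameService()
# Placeholder URLs, not the article's real location.
names.register("cnri.dlib/april97-payette",
               "http://old-host.example.org/april97/payette.html")
names.register("cnri.dlib/april97-payette",
               "http://new-host.example.org/dlib/april97-payette.html")
location = names.resolve("cnri.dlib/april97-payette")
```

Anything that cites the name keeps working after the move, because only the registry entry changes.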

26 Identifiers: Current Initiatives
IETF Uniform Resource Names (URN)
–specification of URN framework
–requirements for resolution systems
–syntax definition
Existing Systems
–CNRI’s Handle System (used by NCSTRL)
–OCLC PURLs
–DOI Initiative

27 Looking Ahead: Current Research at Cornell
Digital Objects and Repository
–FEDORA
–Joint work in Interoperability with CNRI
–Access Management
Resource Discovery
–STARTS (Cornell/Stanford collaboration)
–Intelligent Distributed Searching
Collection Definition

28 Digital Object is... recognizable by what it can do
[Example operations: getChapter, getPage, getTrack, getLabel, getSection, getArticle, getFrame, getLength]

29 Structure Mechanism Content-Type Interfaces Book MARC What the client sees vs. What the object is

30 FEDORA DigitalObject
[Diagram labels: DS 1 application/MARC; DS 2 application/postscript; Generic Disseminator: ListContentTypes (Book, DublinCore); Book Disseminator: GetChapter, GetIndex, GetPage; DublinCore Disseminator; example request: Get(Book.getPage(1))]
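One way to read this diagram is that a digital object stores raw datastreams and answers requests through content-type disseminators. The sketch below follows that reading under heavy simplification: all class and method names are hypothetical rather than the FEDORA interfaces, the PostScript datastream is modeled as a list of page strings, and the MARC datastream as a plain dictionary.

```python
class DigitalObject:
    """Raw datastreams plus the disseminators that expose them."""

    def __init__(self, datastreams):
        self.datastreams = datastreams  # MIME type -> stored data
        self._disseminators = {}

    def attach(self, content_type, disseminator):
        self._disseminators[content_type] = disseminator

    def list_content_types(self):
        # Plays the role of ListContentTypes on the generic disseminator.
        return sorted(self._disseminators)

    def get(self, content_type, operation, *args):
        # Plays the role of Get(Book.getPage(1)): dispatch by content type.
        disseminator = self._disseminators[content_type]
        return getattr(disseminator, operation)(self, *args)


class BookDisseminator:
    """Presents the PostScript datastream as a book with pages."""

    def get_page(self, obj, number):
        return obj.datastreams["application/postscript"][number - 1]


class DublinCoreDisseminator:
    """Presents the MARC datastream as descriptive metadata."""

    def get_title(self, obj):
        return obj.datastreams["application/marc"]["title"]


# Made-up content, for illustration only.
report = DigitalObject({
    "application/marc": {"title": "A Sample Report"},
    "application/postscript": ["%! page 1", "%! page 2"],
})
report.attach("Book", BookDisseminator())
report.attach("DublinCore", DublinCoreDisseminator())
first_page = report.get("Book", "get_page", 1)
```

A client asks what the object can do and then calls those operations, without needing to know which stored format backs each one.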

31 FEDORA: Extensibility for Content Types
Simple, familiar content types
Complex, compound, dynamic content types

32 Resource Discovery
Meta-Searching for Resource Discovery
–query multiple document sources
–choose best sources to evaluate a query
–evaluate the query at these sources
–merge the query results from these sources
Stanford Protocol Proposal for Internet Retrieval and Search (STARTS)
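The meta-searching steps listed above (choose sources, evaluate the query there, merge the results) can be sketched as follows. This is an illustration, not the STARTS protocol: it assumes each source publishes a term-count summary, the function names and sample data are hypothetical, and results are merged by comparing raw relevance scores, which is a naive choice because scores from different engines are not generally comparable.

```python
def choose_sources(query_terms, summaries, limit):
    """Rank sources by how often their summaries mention the query terms."""
    def score(source):
        return sum(summaries[source].get(term, 0) for term in query_terms)

    ranked = sorted(summaries, key=score, reverse=True)
    return [source for source in ranked if score(source) > 0][:limit]


def meta_search(query_terms, summaries, engines, limit=2):
    """Evaluate the query at the best sources and merge the results."""
    merged = []
    for source in choose_sources(query_terms, summaries, limit):
        for doc_id, relevance in engines[source](query_terms):
            merged.append((relevance, source, doc_id))
    # Naive merge: sort by each source's own relevance score.
    merged.sort(key=lambda hit: hit[0], reverse=True)
    return [(source, doc_id) for _, source, doc_id in merged]


# Made-up summaries and engines, for illustration only.
summaries = {
    "physics": {"quark": 40, "library": 1},
    "cs": {"library": 25, "protocol": 30},
    "math": {"topology": 12},
}
engines = {
    "physics": lambda terms: [("phys/9", 0.2)],
    "cs": lambda terms: [("cs/1", 0.9), ("cs/4", 0.5)],
    "math": lambda terms: [],
}
results = meta_search(["library", "protocol"], summaries, engines)
```

Skipping sources whose summaries never mention the query terms is what keeps the search from contacting every server for every query.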

33 Distributed Collection Service Definition and Access
[Diagram labels: Central Collection Server; Collection Query Router (three instances); User Interface]
Intelligent routing based on regional conditions

34 Conclusions: Design with an Eye Toward the Future
Know limitations of ad-hoc web development and commercial packages
Embrace a service-based approach
–modular designs increase flexibility, extensibility, plug-in/plug-out
–well-defined services with protocols to enable federation and interoperability
–can utilize various technologies or commercial software underneath the service layers
Watch Web developments in XML and RDF

35 Further reading
Lagoze and Payette: An Infrastructure for Open-Architecture Digital Libraries
Davis and Lagoze: NCSTRL: Design and Deployment of a Globally Distributed Digital Library, Draft of submission to IEEE Computer Special Issue on Digital Libraries, February 1999.
Payette: Persistent Identifiers, RLG DigiNews
Payette and Lagoze: Flexible and Extensible Digital Object and Repository Architecture (FEDORA)
