Published by Amaya Marshman. Modified over 10 years ago.
1
Digital Library Architecture: A Service-Based Approach
Mo i Rana, Norway
November 10, 1998
Sandra Payette
Department of Computer Science
Cornell University
2
Overview
Why talk about DL architecture?
Digital Libraries - the architectural perspective
Review of service-based architecture
NCSTRL - a working example
Dienst - existing service-oriented architecture
Cornell next generation (component-oriented)
Conclusion
3
Why Talk about Digital Library Architecture?
Web alone is not a digital library
Commercial packages limited
  limited flexibility
  standards issues
  network-enabled applications, not DL architecture
Must position for broader DL opportunities
4
Web by itself not a DL Architecture
Documents - Files, CGI, MIME-Types
Naming - URLs
Document Servers - HTTP servers
Resource Discovery - web crawlers
Collections - web pages, ad-hoc
IP (intellectual property) - Access Control List, passwords, ad-hoc
5
WWW Infrastructure Evolving
Resource Description Framework (RDF) will allow rich metadata semantics for documents
Extensible Markup Language (XML) will allow highly structured documents and rich linking (relationship) capabilities
Uniform Resource Names (URNs) will allow for persistent, globally unique identifiers
6
But still need Digital Library Architecture
Richer document model - digital objects
Persistent, unique naming - URNs
Well-defined digital library services
Better facilities for resource discovery
Flexible definition of collections
Management of distributed content & services
Rights management for intellectual property
7
Digital Library Interoperability
Diagram: Cornell Digital Library, Nordic Digital Library
8
Digital Library Architecture: Key Principles
Open Architecture
  functionality partitioned into set of well-defined services
  services accessible via well-defined protocol
Modularization
  promotes interoperability
  scalable to different clientele (research library, informal web)
Federation
  enable aggregations into logical collections
Distribution
  of content (collections) and services
  of administration and management of DL
9
Component-Ware Digital Libraries
User Interface Gateway
Collection Services
Name Service (persistent names)
Index Services
Repository Services (Digital Objects)
10
NCSTRL: A Working Example
A Globally Distributed Digital Library
120+ Institutions in US, Europe, and Asia
11
NCSTRL Participants: collections federated
120+ institutions
Universities/labs - research reports
European Research Consortium for Informatics and Mathematics (ERCIM)
Los Alamos (Physics pre-prints, ACM)
D-Lib Magazine
40+ independent servers
12
Federation of Collections
13
Documents in Distributed Repositories
14
Multi-Format Document Model
15
NCSTRL Real-world testbed for ...
modular system based on a standard open architecture
study of hard, real-world problems: policy issues, quality of service, federation of publishers
creation of a self-sustaining international federated digital collection
16
Dienst: NCSTRL technical base
Implements a service-based architecture for distributed digital libraries
Protocol and reference implementation
Network of services
WWW browser access
Uniform search over distributed indexes
Access to documents in distributed repositories
Access to multi-formatted documents
17
Dienst: Service-Based Architecture
Document model
Naming service (CNRI’s Handle System)
Repository service
Indexer service
Collection service
User Interface service
18
Dienst Document Model
Diagram labels: Handle (URN); metadata; decompositions; representations; logical; physical; underlying formats (ASCII, TIFF, PostScript)
20
Dienst: Document Protocol
Documents addressable through their URNs
Document service requests
  get document metadata
  get document formats
  get document in format
  get document partition (page) in format
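The four document service requests can be pictured as methods on a repository keyed by URN. The following is a minimal sketch under stated assumptions, not the Dienst wire protocol: the `Repository` and `Document` classes, the method names, and the sample handle `example.authority/tr-0001` are all hypothetical, and the store is a plain in-memory dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """A document held in several formats, each split into pages."""
    metadata: dict
    # format name -> list of page contents (bytes), page 1 first
    formats: dict = field(default_factory=dict)


class Repository:
    """Hypothetical in-memory repository keyed by URN (handle)."""

    def __init__(self):
        self._docs = {}

    def add(self, urn, doc):
        self._docs[urn] = doc

    def get_metadata(self, urn):
        # "get document metadata"
        return self._docs[urn].metadata

    def get_formats(self, urn):
        # "get document formats"
        return sorted(self._docs[urn].formats)

    def get_document(self, urn, fmt):
        # "get document in format": all pages of one format, joined
        return b"".join(self._docs[urn].formats[fmt])

    def get_page(self, urn, fmt, page):
        # "get document partition (page) in format"; pages count from 1
        return self._docs[urn].formats[fmt][page - 1]


repo = Repository()
repo.add(
    "example.authority/tr-0001",  # hypothetical handle
    Document(
        metadata={"title": "Sample Report"},
        formats={
            "ascii": [b"page one\n", b"page two\n"],
            "tiff": [b"II*\x00", b"II*\x00"],
        },
    ),
)

print(repo.get_formats("example.authority/tr-0001"))
print(repo.get_page("example.authority/tr-0001", "ascii", 2))
```

Every request takes the URN as its first argument, so a client never needs to know where or how the repository stores the underlying files.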
21
Dienst 5.0 : Document Protocol
More complex document model:
  versions
  hierarchical part specification
  binders (multi-part documents)
“Structure” service request
  Reveal, in XML, full or collapsed structure of a document
    e.g., chapters, sections, figures, etc.
  Describe multiple views of a document
    e.g., bibliography, content, thumbnails
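To make the idea of a full versus collapsed structure concrete, here is a small sketch that renders a hierarchical document as XML. The element names, the `label` attribute, and the `structure_xml` function are illustrative assumptions; the slides do not give the actual Dienst 5.0 schema.

```python
import xml.etree.ElementTree as ET


def structure_xml(node, collapsed=False):
    """Render a document part tree as an XML string.

    node is a (kind, label, children) tuple. With collapsed=True only
    the top-level parts are listed, without their descendants.
    """
    def build(part, depth):
        kind, label, children = part
        elem = ET.Element(kind, {"label": label})
        # In collapsed mode, stop descending below the top-level parts
        if not (collapsed and depth >= 1):
            for child in children:
                elem.append(build(child, depth + 1))
        return elem

    return ET.tostring(build(node, 0), encoding="unicode")


# Hypothetical document: two chapters, the first with a section and a figure
doc = (
    "document", "Sample Report",
    [
        ("chapter", "1", [("section", "1.1", []), ("figure", "1", [])]),
        ("chapter", "2", []),
    ],
)

full = structure_xml(doc)
short = structure_xml(doc, collapsed=True)
print(full)
print(short)
```

The collapsed form lets a client show a table of contents cheaply and ask for the full structure only when the user drills into a part.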
22
Dienst: Core Services
Components: WWW browser, Dienst User Interface, Repository, Index (connected via Protocol)
Search: send search request, send site specific search request, receive hit list, receive unified hit list
Document retrieval: send document request, receive MIME-typed document
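The search path on this slide, one request fanned out to several index services and the per-site hit lists combined into a unified hit list, can be sketched as follows. The function name, the `(urn, score)` hit shape, and the rule of keeping the best score for duplicates are assumptions made for illustration; the slides do not say how Dienst ranks or merges hits.

```python
def unified_search(query, indexes):
    """Send the query to every index service and merge the hit lists.

    indexes maps a site name to a search function that returns
    (urn, score) pairs. A URN reported by several sites keeps its
    best score.
    """
    best = {}
    for site, search in indexes.items():
        for urn, score in search(query):
            if urn not in best or score > best[urn][0]:
                best[urn] = (score, site)
    # Highest score first; ties broken by URN for a stable order
    ranked = sorted(best.items(), key=lambda kv: (-kv[1][0], kv[0]))
    return [(urn, score, site) for urn, (score, site) in ranked]


# Two hypothetical index services standing in for remote sites
def site_a(query):
    return [("a/1", 0.9), ("shared/7", 0.4)]


def site_b(query):
    return [("b/2", 0.7), ("shared/7", 0.6)]


hits = unified_search("digital libraries", {"site_a": site_a, "site_b": site_b})
for urn, score, site in hits:
    print(urn, score, site)
```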
23
Building Gateways to non-Conforming Sites
Diagram labels: Dienst Protocol; Standard Servers; User Interface; Gateway Server; FTP/HTTP “Repositories”
24
Dienst: Collection Service
25
Naming Service
Documents identified by globally unique names
Names are persistent, permanent
Registered names resolve to specific location (URL)
Persistent Identifier (e.g., URN): cnri.dlib/april97-payette
  Naming Authority: cnri.dlib
  Item Name: april97-payette
  resolves to Location (URL)
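The split of a persistent identifier into naming authority and item name, and its resolution to a current location, can be sketched in a few lines. This is an illustrative in-memory registry, not the Handle System API: the `NameService` class, its methods, and both example.org URLs are hypothetical.

```python
def parse_handle(handle):
    """Split a handle into (naming_authority, item_name)."""
    authority, sep, item = handle.partition("/")
    if not sep or not authority or not item:
        raise ValueError(f"not a handle: {handle!r}")
    return authority, item


class NameService:
    """Hypothetical registry mapping persistent names to current URLs."""

    def __init__(self):
        self._locations = {}

    def register(self, handle, url):
        self._locations[parse_handle(handle)] = url

    def resolve(self, handle):
        return self._locations[parse_handle(handle)]


names = NameService()
names.register(
    "cnri.dlib/april97-payette",
    "http://repository.example.org/april97/payette.html",  # hypothetical URL
)
# The document moves: the name stays the same, only the location changes
names.register(
    "cnri.dlib/april97-payette",
    "http://mirror.example.org/dlib/april97-payette.html",  # hypothetical URL
)

print(parse_handle("cnri.dlib/april97-payette"))
print(names.resolve("cnri.dlib/april97-payette"))
```

Because citations carry only the name, relocating a document requires one update in the registry rather than a fix to every link that points at it.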
26
Identifiers: Current Initiatives
IETF Uniform Resource Names (URN)
  specification of URN framework
  requirements for resolution systems
  syntax definition
Existing Systems
  CNRI’s Handle System (used by NCSTRL)
  OCLC PURLs
  DOI Initiative
27
Looking Ahead: Current Research at Cornell
Digital Objects and Repository
  FEDORA
  Joint work in Interoperability with CNRI
Access Management
Resource Discovery
  STARTS (Cornell/Stanford collaboration)
  Intelligent Distributed Searching
Collection Definition
28
Digital Object is... recognizable by what it can do
getSection, getArticle, getTrack, getLabel, getChapter, getPage, getFrame, getLength
29
What the client sees vs. What the object is
Book Content-Type Interfaces MARC Mechanism Structure
30
FEDORA DigitalObject
Datastreams: DS1 (MARC), DS2 (application/postscript)
Disseminators: Generic Disseminator, Book Disseminator (GetChapter, GetIndex, GetPage), DublinCore
Requests: ListContentTypes returns Book, DublinCore; Get(Book.getPage(1))
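The relationship between datastreams and disseminators on this slide can be sketched as a small object model. This is an assumption-laden illustration rather than the FEDORA interface definition: the Python class and method names are hypothetical, the datastream contents are made up, and the mapping of MARC fields 245 and 100 to title and creator is a simplified crosswalk.

```python
class DigitalObject:
    """Hypothetical digital object: datastreams plus disseminators."""

    def __init__(self, datastreams):
        self.datastreams = datastreams  # e.g. {"DS1": ..., "DS2": ...}
        self.disseminators = {}         # content type -> disseminator

    def attach(self, content_type, disseminator):
        self.disseminators[content_type] = disseminator

    def list_content_types(self):
        # Generic request: which content types does this object offer?
        return sorted(self.disseminators)

    def get(self, content_type, method, *args):
        # Generic request: invoke a content-type-specific method
        disseminator = self.disseminators[content_type]
        return getattr(disseminator, method)(self.datastreams, *args)


class BookDisseminator:
    """Presents a paginated datastream as a book."""

    def __init__(self, source):
        self.source = source  # name of the datastream holding the pages

    def get_page(self, datastreams, number):
        return datastreams[self.source][number - 1]  # pages count from 1


class DublinCoreDisseminator:
    """Derives a small Dublin Core record from a MARC-like datastream."""

    def __init__(self, source):
        self.source = source

    def get_record(self, datastreams):
        marc = datastreams[self.source]
        return {"title": marc["245"], "creator": marc["100"]}


obj = DigitalObject({
    "DS1": {"245": "Sample Book", "100": "Doe, Jane"},  # stand-in for MARC
    "DS2": ["page 1 text", "page 2 text"],              # stand-in for PostScript pages
})
obj.attach("Book", BookDisseminator("DS2"))
obj.attach("DublinCore", DublinCoreDisseminator("DS1"))

print(obj.list_content_types())
print(obj.get("Book", "get_page", 1))
```

The client addresses the object only through content types and their methods, so the same stored datastreams can be exposed as a book, as a metadata record, or through types added later.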
31
FEDORA: Extensibility for Content Types
Simple, familiar content types
Complex, compound, dynamic content types
32
Meta-Searching for Resource Discovery
query multiple document sources
  choose best sources to evaluate a query
  evaluate the query at these sources
  merge the query results from these sources
Stanford Protocol Proposal for Internet Retrieval and Search (STARTS)
  www-db.stanford.edu/~gravano/starts.html
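The three meta-searching steps (choose sources, evaluate, merge) can be sketched as below. This is not the STARTS protocol itself: the per-source term-count summaries, the coverage heuristic, the score normalization, and all source names are assumptions chosen to keep the example small.

```python
def choose_sources(query_terms, summaries, k=2):
    """Pick up to k sources whose summaries mention the query terms most."""
    def coverage(source):
        return sum(summaries[source].get(term, 0) for term in query_terms)

    ranked = sorted(summaries, key=lambda s: (-coverage(s), s))
    return [s for s in ranked[:k] if coverage(s) > 0]


def merge_results(result_lists):
    """Merge per-source hits after scaling each list's scores to [0, 1]."""
    merged = []
    for source, hits in result_lists.items():
        top = max((score for _, score in hits), default=0)
        for doc_id, score in hits:
            merged.append((doc_id, score / top if top else 0.0, source))
    # Best normalized score first; ties broken by document id
    return sorted(merged, key=lambda hit: (-hit[1], hit[0]))


def meta_search(query_terms, summaries, engines, k=2):
    chosen = choose_sources(query_terms, summaries, k)
    return merge_results({s: engines[s](query_terms) for s in chosen})


# Hypothetical sources: term counts per source, and a search function each
summaries = {
    "cs_reports": {"digital": 55, "library": 40},
    "physics": {"quark": 90},
    "dlib": {"digital": 30, "library": 35},
}
engines = {
    "cs_reports": lambda terms: [("cs/12", 8.0), ("cs/40", 4.0)],
    "physics": lambda terms: [("ph/1", 1.0)],
    "dlib": lambda terms: [("dl/3", 0.5), ("dl/9", 0.45)],
}

results = meta_search(["digital", "library"], summaries, engines)
for doc_id, score, source in results:
    print(doc_id, round(score, 2), source)
```

Scores are normalized per source because independent search engines rarely use comparable scales; a raw 8.0 from one engine and 0.5 from another say nothing about relative relevance.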
33
Distributed Collection Service Definition and Access
Diagram: User Interface; three Collection Query Routers; Central Collection Server
Intelligent routing based on regional conditions
34
Conclusions: Design with an Eye Toward the Future
Know limitations of ad-hoc web development and commercial packages
Embrace a service-based approach
  modular designs increase flexibility, extensibility, plug-in/plug-out
  well-defined services with protocols to enable federation and interoperability
  can utilize various technologies or commercial software underneath the service layers
Watch Web developments in XML and RDF
35
Further reading
Lagoze and Payette: An Infrastructure for Open-Architecture Digital Libraries
Davis and Lagoze: NCSTRL: Design and Deployment of a Globally Distributed Digital Library, Draft of submission to IEEE Computer Special Issue on Digital Libraries, February
Payette: Persistent Identifiers, RLG DigiNews
Payette and Lagoze: Flexible and Extensible Digital Object and Repository Architecture (FEDORA)