IndonesiaDLN km rg Ismail Fahmi, Ismail Khalil Ibrahim, Donny Fauzan, Rurie Muharto Knowledge Management Research Group ITB Extending The OAI Protocol as the Data Integration Framework for the Digital Library Network in the Third World
Outline
– Introduction
– Background & Motivation
– Basic Concept of OAI Protocol
– Extending OAI Protocol for IndonesiaDLN
Introduction
Digital Library
A digital library is a vast collection of entities stored and maintained by multiple information sources, including databases, image banks, file systems, email systems, the Web, and applications providing structured or semi-structured data.
Network of Digital Libraries
Why a network?
– Physically distributed information sources
– Heterogeneous storage
– Autonomous content and data formats
Goal: provide users with a uniform interface to access, relate, and combine data stored in multiple, distributed, autonomous, and possibly heterogeneous information sources.
Background & Motivation
Challenges
– Interoperability
– Unreliable, low-speed, and high-cost Internet connections
– Most knowledge and information sources do not have a dedicated Internet connection (IndonesiaDLN partners are split about 50:50 between dedicated and dial-up connections)
– A centralized data provider solution results in slow responses, high costs, and disappointment due to unreliable connections
Data Integration Architecture
The virtual approach is suited to:
– A large number of information sources
– Rapidly changing data
– Unpredictable client needs
– Queries over vast amounts of data from a very large number of information sources
Data Integration Architecture
The materialization approach is suited to:
– Cases where predictable portions of the available information are required
– Cases where high-performance queries are required
– Access to private copies
– The need to save information that is not maintained by the source
Data Integration Architecture
The approach suitable for a digital library network in the third world is the materialization (harvesting) approach, because of:
– The need for fast query responses
– Low-quality Internet connections
– The availability of dedicated Internet connections at some knowledge sources
OAI: Basic Concept
Open Archives Initiative
http://www.openarchives.org
OAI Objectives
"The Open Archives Initiative has been set up to create a forum to discuss and solve matters of interoperability between preprint solutions, as a way to promote their global acceptance."
Paul Ginsparg, Rick Luce & Herbert Van de Sompel
OAI Implementers
– arXiv: physics, mathematics, non-linear systems and computer science (Los Alamos)
– clinmed: Clinical Medicine and Health Research Netprints
– CogPrints: U. Southampton
– CSTC: Computer Science Teaching Center, digital library of peer-reviewed teaching resources for computer science educators
– ETD: Virginia Tech
– HeinOnline: law journals from Cornell U
– HUBerlin: Humboldt University at Berlin/Germany Document Server
– NACA: NACA Technical Report Server, scanned reports of the National Advisory Committee for Aeronautics (1917-1958), the predecessor organisation to NASA
– NCSTRL: Networked Computer Science Technical Reference Library
– NDLTD: Networked Digital Library of Theses and Dissertations
– OLA: Open Language Archives, U Pennsylvania
– OVP: Open Video Project, U North Carolina
– WCR: Web Characterisation Repository, database of meta-information relating to trace files, tools and publications that are relevant to characterisation of the World Wide Web
– T&D Worldcat: OCLC
OAI Definitions & Concept
– A repository is a network-accessible server to which OAI protocol requests can be submitted.
– A record is an XML-encoded byte stream that is returned by a repository in response to an OAI protocol request for metadata from an item in that repository.
– OAI records are organized into header, metadata, and about.
– The header is necessary for the harvesting process and consists of two parts: a unique identifier, which is the key for extracting metadata from an item in a repository; and a datestamp giving the date of creation, deletion, or last modification.
– Metadata is a single manifestation of the metadata from an item. The OAI protocol supports multiple metadata formats.
– About is an optional container that holds data about the metadata of the record, such as rights information and terms and conditions for usage.
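As an illustration of the header / metadata / about structure described above, here is a minimal Python sketch. The class and field names (`Header`, `Record`, `identifier`, `datestamp`) are chosen for illustration and are not taken from the slides or from any OAI library; the title value is a placeholder.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Header:
    identifier: str  # unique key for the item in the repository
    datestamp: str   # date of creation, deletion, or last modification


@dataclass
class Record:
    header: Header
    metadata: Optional[dict] = None            # one metadata format for the item
    about: list = field(default_factory=list)  # optional data about the metadata


# Identifier and datestamp reused from the ListIdentifiers example in the slides;
# the metadata content is a hypothetical placeholder.
record = Record(
    header=Header("oai:gdlhub.indonesiaDLN.org:agriknow-2002-607", "2002-07-02"),
    metadata={"title": "placeholder title"},
)
```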
OAI Definitions & Concept
– Data providers administer systems that support the OAI protocol as a means of exposing metadata about the content in their systems.
– Service providers issue OAI protocol requests to the systems of data providers and use the returned metadata as a basis for building value-added services.
OAI Data Flow
– The local data provider harvests metadata from remote data providers, and then serves requests from the service provider.
– The service provider acts as the interface for the users (searching, browsing, etc.).
OAI Protocol Specification
– Uses HTTP as the transport protocol
– Uses HTTP URLs as the request format
– Uses XML as the response data encoding
OAI Service Requests
– Identify is a request for information about the repository as a whole. It returns information such as the name of the repository, the version of the protocol, and the email address of the administrator.
– ListIdentifiers lists identifiers for all objects, or for those within a given date range and/or within a given set.
– ListMetadataFormats returns the list of all metadata formats supported by the archive.
– ListRecords lists complete metadata for all objects, or for those within a given date range and/or within a given set.
– ListSets lists the sets (and subsets, recursively) contained within the repository.
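Since requests are plain HTTP URLs carrying a verb and its arguments, building one can be sketched in a few lines of Python. The helper name `build_request` is hypothetical; the base URL and the `metadataPrefix=oai_dc` argument are the ones shown in the ListIdentifiers example in these slides, and the verb list is limited to the five requests named above.

```python
from urllib.parse import urlencode

# The five requests listed on the slide.
VERBS = {"Identify", "ListIdentifiers", "ListMetadataFormats", "ListRecords", "ListSets"}


def build_request(base_url, verb, **args):
    """Return an OAI request URL for the given verb and optional arguments."""
    if verb not in VERBS:
        raise ValueError(f"unsupported verb: {verb}")
    query = {"verb": verb}
    # Skip arguments the caller left unset.
    query.update({name: value for name, value in args.items() if value is not None})
    return base_url + "?" + urlencode(query)


url = build_request(
    "http://gdlhub.indonesiadln.org/OAI/response.php",
    "ListIdentifiers",
    metadataPrefix="oai_dc",
)
```

Here `url` reproduces the request shown in the harvesting example; sending it (for instance with `urllib.request.urlopen`) is left out so the sketch does not depend on the server being reachable.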
OAI-PMH (Protocol for Metadata Harvesting)
[Diagram: Data Provider, Local Network, Service Provider]
OAI-PMH (Protocol for Metadata Harvesting)
Data Provider: gdlhub.indonesiadln.org
Service Provider: digilib.itb.ac.id
Request:
http://gdlhub.indonesiadln.org/OAI/response.php?verb=ListIdentifiers&metadataPrefix=oai_dc
Response (values only; the XML markup is not preserved in the transcript):
http://localhost
oai:gdlhub.indonesiaDLN.org:agriknow-2002-607 2002-07-02
oai:gdlhub.indonesiaDLN.org:agriknow-2002-603 2002-03-10
...
oai:gdlhub.indonesiaDLN.org:agriknow-1998-521 1998-01-12
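A harvester has to turn such a response into identifier and datestamp pairs. The sketch below assumes the response follows the OAI-PMH 2.0 XML layout (a `ListIdentifiers` element containing `header` elements in the `http://www.openarchives.org/OAI/2.0/` namespace); the transcript does not preserve the actual markup returned by the hub, so `SAMPLE` is a reconstruction for illustration only, and `parse_identifiers` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH 2.0 responses (assumed to apply to the hub's output).
NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Reconstructed sample; only the identifiers and datestamps come from the slides.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header>
      <identifier>oai:gdlhub.indonesiaDLN.org:agriknow-2002-607</identifier>
      <datestamp>2002-07-02</datestamp>
    </header>
    <header>
      <identifier>oai:gdlhub.indonesiaDLN.org:agriknow-2002-603</identifier>
      <datestamp>2002-03-10</datestamp>
    </header>
  </ListIdentifiers>
</OAI-PMH>"""


def parse_identifiers(xml_text):
    """Return (identifier, datestamp) pairs from a ListIdentifiers response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for header in root.iterfind(".//oai:header", NS):
        identifier = header.findtext("oai:identifier", namespaces=NS)
        datestamp = header.findtext("oai:datestamp", namespaces=NS)
        pairs.append((identifier, datestamp))
    return pairs


pairs = parse_identifiers(SAMPLE)
```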
OAI Extension
Background idea:
– Many resource providers are connected to the Internet only temporarily (e.g. via dial-up), and some sit behind a proxy.
– Network connections in Indonesia face barriers such as limited Internet bandwidth.
OAI Extension
Solution:
– Add data uploading functionality, so that a provider with these problems can put its data into a Data Provider.
– The Data Provider can then be harvested by others through OAI-PMH.
OAI Extension
Interoperability through Metadata Harvesting and Posting Mechanism
Harvesting mechanism
– Metadata harvesting with the OAI v2.0 protocol
Posting mechanism
– For non-dedicated or temporary servers; both data and metadata are involved
– Has been implemented in the GDL software environment
OAI Extension: Framework Definitions and Concepts
The Protocol for Metadata Posting (PMP) involves two participants:
– Data Provider: retrieves the uploaded metadata to build its own value-added services
– Service Provider: puts metadata into the data provider
Protocol Requests and Responses
There are 6 verbs that can be used as requests:
– Connect: the request carries keyword arguments containing the PUBLISHER_ID, its serial number, and epoch_time.
– Disconnect: ends the connection to the current hub data provider.
– PutRecord: puts a record into the repository.
– PutListRecords: puts the list of records that need to be uploaded; the records are compared so that only the newer metadata is uploaded.
– PutFileFragment: puts fragments of a file onto the server.
– MergeFileFragments: merges the uploaded file fragments on the server.
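The slides do not say how PutListRecords compares records. One plausible reading, sketched here under that assumption, is that the client compares its local datestamps with those known to the hub and uploads only records that are missing or newer. The function name `select_newer` and both dictionaries are hypothetical; the identifiers are shortened placeholders.

```python
def select_newer(local_datestamps, hub_datestamps):
    """Return identifiers whose local copy is absent from the hub or newer.

    Both arguments map identifier -> ISO date string (YYYY-MM-DD), which
    compares correctly as plain text.
    """
    to_upload = []
    for identifier, local_date in local_datestamps.items():
        hub_date = hub_datestamps.get(identifier)
        if hub_date is None or local_date > hub_date:
            to_upload.append(identifier)
    return sorted(to_upload)


local = {"rec-a": "2002-07-02", "rec-b": "2002-03-10", "rec-c": "1998-01-12"}
hub = {"rec-a": "2002-01-01", "rec-b": "2002-03-10"}
pending = select_newer(local, hub)
```

In this example `rec-a` is selected because the local copy is newer, `rec-c` because the hub does not have it, and `rec-b` is skipped because the datestamps match.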
Protocol Metadata Posting
– First, the service provider issues Connect to the Hub Data Provider.
– The data provider returns an ID to the service provider if authentication succeeds. The remaining verbs can then be used; if authentication fails, the remaining verbs cannot be used.
– The service provider then puts a single metadata record with PutRecord, or several metadata records with PutListRecords. The response to these requests reports whether the upload succeeded.
– The service provider can also put a file with PutFileFragment, in which the pieces of the file are uploaded to the Data Provider. The pieces are then merged with the MergeFileFragments request.
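The sequence above can be sketched as a small in-memory simulation. This is not the GDL implementation: the class `HubSketch`, its method names, the fragment size, the serial number value, and the use of a random hex string as the session ID are all assumptions made for illustration. Only the verb names and the rule that verbs other than Connect require a successful authentication come from the slides.

```python
import uuid


class HubSketch:
    """In-memory stand-in for a hub data provider that accepts PMP verbs."""

    def __init__(self, known_providers):
        self.known_providers = known_providers  # provider ID -> serial number
        self.sessions = set()
        self.records = {}
        self.fragments = {}
        self.files = {}

    def connect(self, provider_id, serial_number):
        # Authentication: return a session ID on success, None on failure.
        if self.known_providers.get(provider_id) != serial_number:
            return None
        session_id = uuid.uuid4().hex
        self.sessions.add(session_id)
        return session_id

    def _require_session(self, session_id):
        if session_id not in self.sessions:
            raise PermissionError("Connect must succeed before other verbs")

    def put_record(self, session_id, identifier, metadata):
        self._require_session(session_id)
        self.records[identifier] = metadata
        return "success"

    def put_file_fragment(self, session_id, name, index, data):
        self._require_session(session_id)
        self.fragments.setdefault(name, {})[index] = data
        return "success"

    def merge_file_fragments(self, session_id, name):
        self._require_session(session_id)
        parts = self.fragments.pop(name)
        # Reassemble the fragments in index order.
        self.files[name] = b"".join(parts[i] for i in sorted(parts))
        return "success"

    def disconnect(self, session_id):
        self.sessions.discard(session_id)


def split_file(data, size):
    """Cut a byte string into fragments of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


hub = HubSketch({"JBPTITBPP": "hypothetical-serial"})
session = hub.connect("JBPTITBPP", "hypothetical-serial")
hub.put_record(session, "agriknow-2002-607", {"title": "placeholder"})
payload = b"example file contents sent over a slow link"
for index, chunk in enumerate(split_file(payload, 8)):
    hub.put_file_fragment(session, "report.pdf", index, chunk)
hub.merge_file_fragments(session, "report.pdf")
hub.disconnect(session)
```

Uploading in small fragments means that a dropped dial-up connection costs at most one fragment rather than the whole file, which appears to be the motivation for splitting PutFileFragment from MergeFileFragments.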
OAI-PMP (Protocol for Metadata Posting)
Hub/Central Data Provider: gdlhub.indonesiadln.org
Data Provider (dial-up client)
Request:
http://gdlhub.indonesiadln.org/OAI/OAI-PMP-script.php?verb=Connect&providerId=JBPTITBPP&providerSerialNumber=28H3oIZETdASw&epochTime=1031366328
Response (values only; the XML markup is not preserved in the transcript):
2002-02-08T08:55:46Z
http://agri/OAI-PMP-script.php
ca612fe33acc768d4aa2f5940238c8ae
Conclusion
IndonesiaDLN has a standard interoperability framework that is based on both metadata harvesting and metadata uploading.
Final Remarks
It is hoped that this protocol model can be implemented in third-world countries such as Indonesia.