ICDL 2004 Improving Federated Service for Non-cooperating Digital Libraries R. Shi, K. Maly, M. Zubair Department of Computer Science Old Dominion University.

Presentation transcript:

Improving Federated Service for Non-cooperating Digital Libraries
R. Shi, K. Maly, M. Zubair
Department of Computer Science, Old Dominion University
ICDL 2004, New Delhi, Feb 24-27, 2004

Overview
- Introduction
- Architecture & Design
- Experimentation & Implementation
- Conclusion & Future Work

Introduction
- Many approaches for DL interoperation
  - Harvesting and distributed search
- Earlier work on LFDL (Lightweight Federated Digital Library)
  - Universal search interface
  - DL specification in DLDL
  - DL registration
  - Query mapping
- Limitations
  - Organizing the result set, and performance
- Enhanced LFDL
  - Interactive user-centered search

LFDL Introduction
- General principle
  - Aimed at non-cooperating digital libraries
  - Distributed search
  - Lightweight for both data and service providers
- Basic solution
  - DL specification definition language
  - Dynamic DL metadata registration
  - Universal interface
  - Dynamic query mapping
  - Local repository

Limitations and Issues
- Limited service usability
  - Search results are presented in a flat structure
  - Metadata is needed to present rich search results
- Performance
  - Caching is neither flexible nor efficient
  - A local metadata repository is needed to generate an intelligent cache
- Solution
  - Retrieve metadata from remote digital libraries
  - Build an intelligent cache based on the retrieved metadata

LFDL Architecture: Enhancement
[Architecture diagram; figure not included in the transcript]

LFDL Architecture: data flows among modules
1) At initialization, the system reads all DL specifications, including query mapping rules and metadata parsing rules.
2) A resource discovery user submits a query using the universal search interface.
3) The front-end filter pre-processes the query (query clean-up) and passes it to the Search Engine.
4) The Search Engine uses the query mapping rules to transform the universal query into a DL's native local query.
5) A DL agent sends the transformed query to the remote DL and receives the search results.
6) The Result Process Engine parses the search result pages, extracts the metadata according to the metadata parsing rules, and stores it in the Local Repository.
7) The Controller merges all parsed results into an intermediate XML document.
8) The resulting XML document is displayed using an XSLT processor.
9) Once the Local Repository has been populated, the Search Engine executes searches against the Local Repository (cache) first instead of sending queries directly to remote DLs.
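Step 4 (query mapping) can be illustrated with a minimal sketch. The slides do not show the DLDL rule format, so the `MappingRule` class, the field and parameter names, and the example URL below are all hypothetical; the sketch only assumes that a rule pairs each universal search field with the remote DL's own parameter name.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Sketch of step 4: rewriting a universal query into one DL's native query. */
class QueryMappingSketch {

    /** Hypothetical mapping rule: base URL plus universal-field -> native-parameter names. */
    static final class MappingRule {
        final String baseUrl;
        final Map<String, String> fieldToParam;

        MappingRule(String baseUrl, Map<String, String> fieldToParam) {
            this.baseUrl = baseUrl;
            this.fieldToParam = fieldToParam;
        }
    }

    /** Builds the native query URL; universal fields the DL does not support are skipped. */
    static String toNativeQuery(MappingRule rule, Map<String, String> universalQuery) {
        StringJoiner params = new StringJoiner("&");
        for (Map.Entry<String, String> term : universalQuery.entrySet()) {
            String nativeParam = rule.fieldToParam.get(term.getKey());
            if (nativeParam == null) {
                continue; // this DL has no equivalent search field
            }
            params.add(nativeParam + "=" + URLEncoder.encode(term.getValue(), StandardCharsets.UTF_8));
        }
        return rule.baseUrl + "?" + params;
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("keyword", "q");   // assumed native parameter names
        fields.put("date", "year");
        MappingRule rule = new MappingRule("http://dl.example.org/search", fields);

        Map<String, String> query = new LinkedHashMap<>();
        query.put("keyword", "computer");
        query.put("date", "2002");
        System.out.println(toNativeQuery(rule, query));
    }
}
```

Keeping the rule as plain data rather than code matches the idea that a DL is registered through a specification, so a new library could be added without touching the search engine itself.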

Search Usability and Performance: Metadata Is Key
- Available metadata sources
  - List page of search results
  - Detail page of a selected document/record
- Metadata retrieval approach
  - Define a specification of how metadata are presented in those pages
  - Use Dublin Core as the common metadata mapping set
  - Develop a metadata parser to extract metadata
  - Store parsed metadata in the local repository
- Build up the metadata repository
  - Proactive
  - Passive or piggyback
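The metadata retrieval approach can be sketched as a small rule-driven parser. This is an illustration only: the slides do not say how parsing rules are expressed, so the sketch assumes one regular expression per Dublin Core element, and the HTML fragment and rule patterns are invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch of a metadata parser driven by per-DL parsing rules (assumed to be regexes). */
class MetadataParserSketch {

    /** Applies each rule to the page and returns Dublin Core element -> first captured value. */
    static Map<String, String> parse(String page, Map<String, Pattern> rules) {
        Map<String, String> record = new LinkedHashMap<>();
        for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
            Matcher m = rule.getValue().matcher(page);
            if (m.find()) {
                record.put(rule.getKey(), m.group(1).trim());
            }
            // Elements whose rule does not match are simply left out of the record.
        }
        return record;
    }

    public static void main(String[] args) {
        // Hypothetical detail page of one record in a remote DL.
        String page = "<b>Title:</b> Sample Record<br><b>Author:</b> A. Author<br>";

        Map<String, Pattern> rules = new LinkedHashMap<>();
        rules.put("title", Pattern.compile("<b>Title:</b>\\s*([^<]+)"));
        rules.put("creator", Pattern.compile("<b>Author:</b>\\s*([^<]+)"));
        rules.put("date", Pattern.compile("<b>Date:</b>\\s*([^<]+)"));

        System.out.println(parse(page, rules));
    }
}
```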

Performance Improvement: Intelligent Cache
- Search scenario
  - Case 1: a query for keyword=computer
  - Case 2: a query for keyword=computer AND date=2002
- Results: LFDL v1 caching (cache grouped by query string)
  - Case 1: no cache hits; distributed search request sent to DLs
  - Case 2: no cache hits; distributed search request sent to DLs
- Results: enhanced LFDL caching, the intelligent cache (cache grouped by metadata)
  - Case 1: no cache hits; distributed search request sent to DLs
  - Case 2: cache hits; search served locally
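The difference between the two caching schemes can be reproduced with a toy model. The classes below are hypothetical and reduce a metadata record to two fields; they are meant only to show why Case 2 misses in a cache keyed by query string but hits in a cache organized by metadata.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy comparison of query-string caching and metadata-based caching. */
class CacheComparisonSketch {

    /** Minimal metadata record; a real one would carry the full Dublin Core set. */
    static final class Record {
        final String keyword;
        final String date;

        Record(String keyword, String date) {
            this.keyword = keyword;
            this.date = date;
        }
    }

    /** LFDL v1 style: result lists are stored under the exact query string. */
    static final class QueryStringCache {
        private final Map<String, List<Record>> byQuery = new HashMap<>();

        void store(String query, List<Record> results) {
            byQuery.put(query, results);
        }

        List<Record> lookup(String query) {
            return byQuery.getOrDefault(query, Collections.emptyList());
        }
    }

    /** Enhanced style: records are stored individually and filtered by their fields. */
    static final class MetadataCache {
        private final List<Record> records = new ArrayList<>();

        void store(List<Record> results) {
            records.addAll(results);
        }

        /** A null date means the query puts no constraint on the date. */
        List<Record> lookup(String keyword, String date) {
            List<Record> hits = new ArrayList<>();
            for (Record r : records) {
                if (r.keyword.equals(keyword) && (date == null || r.date.equals(date))) {
                    hits.add(r);
                }
            }
            return hits;
        }
    }
}
```

In this model, the records fetched for Case 1 are enough to answer Case 2 locally, because the narrower query selects a subset of metadata already held.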

Local Metadata Repository
- All searches are served locally first
- A secondary in-memory metadata cache for better performance and system reliability
- Cache grouped by metadata instead of query string
- Cache-based distributed search
  - Display results from the cache and, at the same time, still send the query to the DLs to update the cache
  - Transparent to end users
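One way to picture the cache-based distributed search is a lookup that answers from the cache immediately while a background task refreshes it. The sketch below is an assumption about how this could be done, not the LFDL implementation: records are reduced to title strings, and the `remoteSearch` function stands in for the DL agents.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Function;

/** Sketch: serve from the cache first, refresh it from the remote DLs in the background. */
class CacheFirstSearchSketch {
    private final CopyOnWriteArrayList<String> cachedTitles = new CopyOnWriteArrayList<>();
    private final Function<String, List<String>> remoteSearch; // stand-in for the DL agents

    CacheFirstSearchSketch(Function<String, List<String>> remoteSearch) {
        this.remoteSearch = remoteSearch;
    }

    /** Cached hits available right away, plus a handle on the background refresh. */
    static final class SearchResult {
        final List<String> fromCache;
        final CompletableFuture<Void> refresh;

        SearchResult(List<String> fromCache, CompletableFuture<Void> refresh) {
            this.fromCache = fromCache;
            this.refresh = refresh;
        }
    }

    SearchResult search(String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (String title : cachedTitles) {
            if (title.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(title);
            }
        }
        // The remote query still goes out; its results only update the cache.
        CompletableFuture<Void> refresh = CompletableFuture
                .supplyAsync(() -> remoteSearch.apply(keyword))
                .thenAccept(titles -> titles.forEach(cachedTitles::addIfAbsent));
        return new SearchResult(hits, refresh);
    }
}
```

The caller can display `fromCache` without waiting on `refresh`, which is what keeps the update transparent to the end user.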

Local Metadata Search: detailed process
1) When the system starts, it loads the most recently and most often used metadata from the database into the memory cache.
2) The user submits a query using the LFDL unified search interface.
3) The query is converted to a local SQL query using predefined translation rules.
4) The SQL query is sent to the local metadata database; the query results are the internal IDs of the matching metadata.
5) The in-memory cache is searched by ID. Metadata found in the cache is merged into the result; missing records are loaded from the database into the cache.
6) If the local database has no results, the original query string is transformed into the native query of the non-cooperating DL and sent to the remote DL. Results returned from the DL are parsed to extract metadata, which is saved to the local repository and loaded into the in-memory cache.
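Steps 4 and 5 amount to a read-through cache keyed by internal ID. The following sketch assumes a plain map as a stand-in for the local metadata database and reduces each metadata record to a string; the class and the `databaseLoads` counter are hypothetical and exist only to make the cache's effect visible.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch of steps 4-5: resolve matching IDs through the in-memory cache. */
class LocalSearchSketch {
    private final Map<Integer, String> database;               // stand-in for the metadata database
    private final Map<Integer, String> memoryCache = new HashMap<>();
    int databaseLoads = 0;                                      // counts cache misses, for illustration

    LocalSearchSketch(Map<Integer, String> database) {
        this.database = database;
    }

    /** Takes the internal IDs returned by the SQL query and returns the merged metadata. */
    List<String> fetch(List<Integer> matchingIds) {
        List<String> merged = new ArrayList<>();
        for (Integer id : matchingIds) {
            String metadata = memoryCache.get(id);
            if (metadata == null) {
                // Cache miss: load the record from the database and keep it in memory.
                metadata = database.get(id);
                memoryCache.put(id, metadata);
                databaseLoads++;
            }
            merged.add(metadata);
        }
        return merged;
    }
}
```

A second query that matches overlapping IDs only touches the database for the records it has not seen before.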

Cache Replacement Algorithm
- Replacement algorithm: least used plus least recently used metadata
- Initial system-wide parameters: cache size, cache keep-safe size
- Runtime parameters per metadata record: date_last_used, total_usage
- Algorithm implementation
  - At first start: load from the db ordered by date_last_used and total_usage, and pick records based on the cache size

        // Most used records first; ties broken by most recent use.
        String orderBy = " ORDER BY total_usage desc, date_last_used desc";
        String selectMetadata = "SELECT internalID, identifier, archive, datestamp, title, creator, "
            + "subject, description, publisher, publication, keyword, category, contributor, "
            + "type, format, source, language, status, date_last_used, total_usage FROM dc" + orderBy;

  - Each time a user views a metadata record, update its date_last_used and total_usage
  - If the cache is full, remove the least used record from the cache and save it to the db (first sort by date_last_used and keep the safe set, then sort by total_usage)
  - The cache size and keep-safe size can be changed at runtime
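The eviction step can be read in more than one way, so the sketch below states its interpretation explicitly: the `keepSafeSize` most recently used records are protected, and among the remaining records the one with the lowest `total_usage` is evicted. The `Entry` class and `chooseVictim` method are hypothetical names, and timestamps are simplified to long values.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Sketch of "least used plus least recently used" victim selection (one possible reading). */
class CacheReplacementSketch {

    static final class Entry {
        final int internalId;
        final long dateLastUsed; // larger value = used more recently
        final int totalUsage;

        Entry(int internalId, long dateLastUsed, int totalUsage) {
            this.internalId = internalId;
            this.dateLastUsed = dateLastUsed;
            this.totalUsage = totalUsage;
        }
    }

    /** Returns the entry to evict, or null if every cached entry falls in the keep-safe set. */
    static Entry chooseVictim(List<Entry> cache, int keepSafeSize) {
        // First sort by recency so the keep-safe set is the head of the list.
        List<Entry> byRecency = new ArrayList<>(cache);
        byRecency.sort(Comparator.comparingLong((Entry e) -> e.dateLastUsed).reversed());
        if (byRecency.size() <= keepSafeSize) {
            return null;
        }
        // Then pick the least used entry among those outside the keep-safe set.
        List<Entry> candidates = byRecency.subList(keepSafeSize, byRecency.size());
        return Collections.min(candidates, Comparator.comparingInt((Entry e) -> e.totalUsage));
    }
}
```

Protecting recently used records keeps a record that was just loaded, and therefore has a low usage count, from being evicted right away.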

Results Merging and Presentation
- Show results based on metadata field
- Tailor the interface using XSLT
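As an illustration of tailoring the presentation with XSLT, the sketch below runs a merged result document through a stylesheet using the standard `javax.xml.transform` API. The XML element names (`results`, `record`, `creator`, `title`) and the stylesheet itself are assumptions, since the slides do not show the intermediate XML format; the stylesheet orders records by the creator field.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Sketch: present merged results by one metadata field using an XSLT stylesheet. */
class XsltPresentationSketch {

    /** Hypothetical stylesheet: one "creator: title" line per record, sorted by creator. */
    static final String SORT_BY_CREATOR =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/results\">"
        + "<xsl:for-each select=\"record\">"
        + "<xsl:sort select=\"creator\"/>"
        + "<xsl:value-of select=\"creator\"/>"
        + "<xsl:text>: </xsl:text>"
        + "<xsl:value-of select=\"title\"/>"
        + "<xsl:text>&#10;</xsl:text>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to the merged result document and returns the rendered text. */
    static String render(String resultsXml, String stylesheet) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(resultsXml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT processing failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<results>"
            + "<record><creator>Shi</creator><title>First title</title></record>"
            + "<record><creator>Maly</creator><title>Second title</title></record>"
            + "</results>";
        System.out.print(render(xml, SORT_BY_CREATOR));
    }
}
```

Swapping in a different stylesheet changes the grouping or layout without touching the merging code, which is the benefit of keeping the merged results in an intermediate XML document.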

Results
[Figure not included in the transcript]

Conclusion and Future Work
- A federation service for non-cooperating DLs is possible
- A local metadata repository improves service usability and performance
- Future work
  - Complex interface mapping, access control
  - Populate the metadata repository more efficiently
  - Cache maintenance: size, consistency…
  - Automatic specification generation, discovery of DL behavior changes
  - Personalized portal: customized interface and results display; most often used searches and remembered search preferences; caching options for fresh data or fast results …