ICDL 2004 Improving Federated Service for Non-cooperating Digital Libraries R. Shi, K. Maly, M. Zubair Department of Computer Science Old Dominion University.


1 ICDL 2004 Improving Federated Service for Non-cooperating Digital Libraries R. Shi, K. Maly, M. Zubair Department of Computer Science Old Dominion University (shi,maly,zubair)@cs.odu.edu

2 ICDL 2004 Overview
Introduction
Architecture & Design
Experimentation & Implementation
Conclusion & Future Works

3 ICDL 2004 Introduction
Many approaches for DL Interoperation
    Harvesting and distributed search
Earlier work on LFDL – Lightweight Federated Digital Library
    Universal search interface
    DL specification in DLDL
    DL registration
    Query mapping
Limitations
    Organizing result set and performance
Enhanced LFDL
    Interactive user-centered search

4 ICDL 2004 LFDL Introduction
General principle
    Aim at non-cooperating digital libraries
    Distributed search
    Lightweight: both to data and service providers
Basic solution
    DL specification definition language
    Dynamic DL metadata registration
    Universal interface
    Dynamic Query mapping
    Local repository

5 ICDL 2004 Limitations and Issues
Limited service usability
    Search results presented in flat structure
    Need metadata to present rich search results
Performance
    Caching is neither flexible nor efficient
    Need local metadata repository to generate intelligent cache
Solution
    Retrieve metadata from remote digital libraries
    Intelligent cache based on retrieved metadata

6 ICDL 2004 LFDL Architecture - Enhancement

7 ICDL 2004 LFDL Architecture – data flows among modules
1) At initialization, the system reads all DL specifications, including query mapping rules and metadata parsing rules.
2) A resource discovery user submits a query using the universal search interface.
3) The front-end filter does pre-processing (query clean-up), and then the query is passed to the Search Engine.
4) The Search Engine uses the query mapping rules to transform the universal query to a DL's native local query.
5) A DL agent sends the transformed query to the remote DL and receives the search results.
6) The Result Process Engine parses the search result pages, extracts the metadata according to the metadata parsing rules, and stores it in the Local Repository.
7) All parsed results are merged by the Controller into an intermediate XML document.
8) The resulting XML document is displayed using an XSLT processor.
9) Once the Local Repository has been populated, the Search Engine executes searches against the Local Repository (cache) first instead of sending queries directly to remote DLs.
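Step 4 above (query mapping) can be illustrated with a minimal sketch. The slides do not show the DLDL rule format, so everything here is an assumption made for illustration: the class name QueryMapperSketch, the representation of mapping rules as a plain field-name map, and the example URL and parameter names are all hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical sketch of step 4: rewrite a universal query into one DL's native form.
// The real LFDL rules live in a DLDL specification; a simple map stands in for them here.
class QueryMapperSketch {

    // fieldRules: universal field name -> this DL's native parameter name (assumed shape).
    // universalQuery: universal field name -> search value.
    static String toNativeQuery(String searchUrl,
                                Map<String, String> fieldRules,
                                Map<String, String> universalQuery) {
        StringJoiner params = new StringJoiner("&");
        for (Map.Entry<String, String> term : universalQuery.entrySet()) {
            String nativeName = fieldRules.get(term.getKey());
            if (nativeName == null) {
                continue; // this DL has no equivalent field, so the term is dropped
            }
            params.add(nativeName + "=" + encode(term.getValue()));
        }
        return searchUrl + "?" + params;
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }
}
```

With rules {keyword → q, date → year} and an insertion-ordered query map {keyword → "digital library", date → "2002"}, the sketch yields a query string of the form ?q=digital+library&year=2002 appended to the DL's search URL. Dropping unsupported fields silently is one possible design choice; a real mapper might instead report them to the user.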

8 ICDL 2004 Search Usability and Performance: metadata is key
Available metadata sources
    List page of search results
    Detail page of a selected document/record
Metadata retrieval approach
    Define specification on how metadata are presented in those pages
    Use Dublin Core as common metadata mapping set
    Develop metadata parser to extract metadata
    Store parsed metadata in local repository
Build up metadata repository
    Proactive
    Passive or piggyback

9 ICDL 2004 Performance Improvement – Intelligent Cache
Search scenario
    Case 1: a query for keyword=computer
    Case 2: a query for keyword=computer AND date=2002
Results: LFDL v1 caching (cache grouped by query string)
    Case 1: no cache hits, distributed search request sent to DLs
    Case 2: no cache hits, distributed search request sent to DLs
Results: Intelligent Cache, the enhanced LFDL caching (cache grouped by metadata)
    Case 1: no cache hits, distributed search request sent to DLs
    Case 2: cache hits, search served locally
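The two-case scenario above can be reproduced with a small sketch that contrasts the two cache organizations. This is not the LFDL code: MetaRecord, QueryStringCache, and MetadataGroupedCache are hypothetical names, and a record is reduced to the two fields the scenario uses.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical record type holding only the fields used in the scenario.
class MetaRecord {
    final String title;
    final String keyword;
    final int year;

    MetaRecord(String title, String keyword, int year) {
        this.title = title;
        this.keyword = keyword;
        this.year = year;
    }
}

// v1 style: results are stored under the literal query string.
class QueryStringCache {
    private final Map<String, List<MetaRecord>> byQuery = new HashMap<>();

    void put(String query, List<MetaRecord> results) {
        byQuery.put(query, results);
    }

    // Returns null unless exactly the same query string was cached before.
    List<MetaRecord> get(String query) {
        return byQuery.get(query);
    }
}

// Enhanced style: individual records are stored and filtered by metadata field.
class MetadataGroupedCache {
    private final List<MetaRecord> records = new ArrayList<>();

    void addAll(List<MetaRecord> results) {
        records.addAll(results);
    }

    // year == null means the query carries no date constraint.
    List<MetaRecord> search(String keyword, Integer year) {
        List<MetaRecord> hits = new ArrayList<>();
        for (MetaRecord r : records) {
            boolean keywordMatches = r.keyword.equals(keyword);
            boolean yearMatches = (year == null) || r.year == year;
            if (keywordMatches && yearMatches) {
                hits.add(r);
            }
        }
        return hits;
    }
}
```

After Case 1 populates both caches with the records returned for keyword=computer, the query-string cache has no entry for "keyword=computer AND date=2002", while the metadata-grouped cache can answer Case 2 by filtering the stored records on their year.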

10 ICDL 2004 Local Metadata Repository
All searches are served locally first
A secondary in-memory metadata cache for better performance and system reliability
Cache grouped by metadata instead of query string
Cache-based distributed search
    Display results from cache; at the same time, still send out the query to DLs to update the cache
    Transparent to end users

11 ICDL 2004 Local Metadata Search – detailed process
1) When the system starts, it loads the most recently and most often used metadata from the database into the memory cache.
2) The user submits a query using the LFDL unified search interface.
3) The query is converted to a local SQL query using predefined translation rules.
4) The SQL query is sent to the local metadata database; the query results are the internal IDs of the matching metadata.
5) The in-memory cache is searched by those IDs. Matching metadata is merged into the result; missing records are loaded from the database into the cache.
6) If the local database has no results, the original query string is transformed to the native query of the non-cooperating DL and sent to the remote DL. Results returned from the DL are parsed to extract metadata, which is saved to the local repository and loaded into the in-memory cache.
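Steps 4 to 6 above can be sketched as follows. This is a simplified illustration under stated assumptions, not the LFDL implementation: LocalSearchSketch is a hypothetical class, two in-memory maps stand in for the SQL database and the memory cache, a metadata record is reduced to its title, and the remote DL is passed in as a function.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of the local-first search path (steps 4 to 6).
class LocalSearchSketch {
    final Map<Integer, String> database = new HashMap<>();    // stand-in for the local metadata db
    final Map<Integer, String> memoryCache = new HashMap<>(); // stand-in for the in-memory cache
    int remoteCalls = 0;                                      // counts distributed searches

    // Step 4 stand-in: a substring match replaces the translated SQL query.
    List<Integer> queryDatabaseIds(String keyword) {
        List<Integer> ids = new ArrayList<>();
        for (Map.Entry<Integer, String> e : database.entrySet()) {
            if (e.getValue().toLowerCase().contains(keyword.toLowerCase())) {
                ids.add(e.getKey());
            }
        }
        Collections.sort(ids);
        return ids;
    }

    List<String> search(String keyword, Function<String, List<String>> remoteSearch) {
        List<Integer> ids = queryDatabaseIds(keyword);
        if (ids.isEmpty()) {
            // Step 6: nothing local, so query the remote DL and store what comes back.
            remoteCalls++;
            List<String> fetched = remoteSearch.apply(keyword);
            for (String title : fetched) {
                int id = database.size() + 1; // simplistic id assignment, for the sketch only
                database.put(id, title);
                memoryCache.put(id, title);
            }
            return new ArrayList<>(fetched);
        }
        List<String> results = new ArrayList<>();
        for (int id : ids) {
            // Step 5: use the cached record, loading it from the database if it is missing.
            results.add(memoryCache.computeIfAbsent(id, database::get));
        }
        return results;
    }
}
```

A first search for a keyword goes to the remote DL and populates both stores; repeating the same search is then served locally, and records that have dropped out of the memory cache are reloaded from the database rather than fetched remotely.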

12 ICDL 2004 Cache Replacement Algorithm
Replacement algorithm: least used plus least recently used metadata
Initial system-wide parameters: cache size, cache keep-safe size
Runtime parameters per metadata record: date_last_used, total_usage
Algorithm implementation
    At first start: load from db ordered by date_last_used and total_usage, and pick based on cache size
        String orderBy = " ORDER BY total_usage desc, date_last_used desc";
        String selectMetadata = "SELECT internalID, identifier, archive, datestamp, title, creator, subject, description, publisher, publication, keyword, category, contributor, type, format, source, language, status, date_last_used, total_usage FROM dc" + orderBy;
    Each time a user views a metadata record, update date_last_used and total_usage
    If the cache is full, remove the least used record from the cache and save it to the db (first sort by date_last_used and keep the safe set, then sort by total_usage)
    Cache size and keep-safe size can be changed at runtime
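The eviction rule above can be read in more than one way; the sketch below takes one plausible reading and should be treated as an assumption, not as the authors' implementation. Entries are sorted by date_last_used, the keep-safe most recent ones are protected, and among the rest the entry with the lowest total_usage is evicted. CachedEntry and ReplacementCacheSketch are hypothetical names, timestamps are plain numbers, and the sizes are fixed at construction for brevity.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CachedEntry {
    final int internalId;
    long dateLastUsed; // logical timestamp of the last view
    int totalUsage;    // number of views so far

    CachedEntry(int internalId, long dateLastUsed, int totalUsage) {
        this.internalId = internalId;
        this.dateLastUsed = dateLastUsed;
        this.totalUsage = totalUsage;
    }
}

class ReplacementCacheSketch {
    private final Map<Integer, CachedEntry> entries = new HashMap<>();
    private final int cacheSize;
    private final int keepSafeSize;

    ReplacementCacheSketch(int cacheSize, int keepSafeSize) {
        if (keepSafeSize >= cacheSize) {
            throw new IllegalArgumentException("keep-safe size must be smaller than cache size");
        }
        this.cacheSize = cacheSize;
        this.keepSafeSize = keepSafeSize;
    }

    // Called when a user views a record: update both runtime parameters.
    void recordView(int id, long now) {
        CachedEntry e = entries.get(id);
        if (e != null) {
            e.dateLastUsed = now;
            e.totalUsage++;
        }
    }

    // Adds an entry; returns the evicted entry (for the caller to save to the db), or null.
    CachedEntry put(CachedEntry entry) {
        CachedEntry evicted = null;
        if (!entries.containsKey(entry.internalId) && entries.size() >= cacheSize) {
            evicted = chooseVictim();
            if (evicted != null) {
                entries.remove(evicted.internalId);
            }
        }
        entries.put(entry.internalId, entry);
        return evicted;
    }

    boolean contains(int id) {
        return entries.containsKey(id);
    }

    // Protect the keepSafeSize most recently used entries, then pick the least used of the rest.
    private CachedEntry chooseVictim() {
        List<CachedEntry> byRecency = new ArrayList<>(entries.values());
        byRecency.sort(Comparator.comparingLong((CachedEntry c) -> c.dateLastUsed).reversed());
        List<CachedEntry> candidates =
            byRecency.subList(Math.min(keepSafeSize, byRecency.size()), byRecency.size());
        return candidates.stream()
            .min(Comparator.comparingInt((CachedEntry c) -> c.totalUsage))
            .orElse(null);
    }
}
```

With a cache size of 3 and a keep-safe size of 1, a freshly loaded record with zero usage survives the next eviction because it is the most recently used, and the least used of the older records is evicted instead. That protection of new entries appears to be the purpose of the keep-safe parameter.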

13 ICDL 2004 Results Merging and Presentation
Show results based on metadata field
Tailor interface using XSLT

14 ICDL 2004 Results

15 ICDL 2004 Conclusion and Future Works
Federation service for non-cooperating DLs is possible
Local metadata repository improves service usability and performance
Future works
    Complex interface mapping, access control
    Populate metadata repository more efficiently
    Cache maintenance: size, consistency…
    Automatic specification generation, DL behavior changes discovery
    Personalized portal: customized interface and results displaying; most often used search and remember search preference; caching options for fresh data or fast results …

