LinkSphere: P2P Cross Database Search -- Architecture and Issues Hugo Mills University of Reading.

Presentation transcript:

LinkSphere: P2P Cross Database Search -- Architecture and Issues Hugo Mills University of Reading

LinkSphere: Linking Researchers and their Data
Social networking for researchers
Cross-database search
– Mostly Arts and Humanities datasets
– Promoting serendipity
– Access by and presentation of datasets to wider audiences

Datasets
Museums
Archives
Archaeology: Silchester Excavation, IADB
Ure Museum of Classical Archaeology
CentAUR: ePrints
Library
Beckett Collection
Cole Museum of Zoology
Film Collection
Herbarium
Typography Collections

Tycho
Fully asynchronous peer-to-peer communications framework
Written in Java
Fully distributed
Robust
"A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable." (Leslie Lamport)
Has a simple distributed data store (Virtual Registry) for client metadata

Tycho
(Relatively) lightweight: 3 MiB for a fully functional system
Fast
Flexible, Extensible
– Bootstrap handlers
– Additional message types
– VR extensions
– Alternative communication protocols
– Discovery of core mediators via Bonjour/ZeroConf

XDB System Architecture
[Diagram labels: VR, Repo, Tycho, Core, Repo, JDBC, Web API, SPARQL, ..., REST search API, Search App, Meta]

User Interface
Main UI is web-based
– Uses AJAX
– Currently embedded within the LinkSphere project site
– Will ultimately move to the SNS
Any UI possible using the REST API

Issues
Getting the data is hard
– Implementation problems
– Maintenance problems
– Admin problems
– Social problems
– Legal problems

Muddling along
Archive of material for intra-departmental use only
– Some legal issues involved
Group of technicians administering the data
– Poor quality data
Excel spreadsheet(!)
Reluctant to have index of material made public

Not ready yet
Big university projects
New systems, (potentially) large data sets
MERL museums archive (AdLib)
– Data all loaded from previous systems
– Access modules not yet installed
CentAUR publications archive (ePrints 3)
– Very little data available yet

Works For Me
Custom web application
– PHP, sophisticated
External developer
No documentation
MySQL underneath

It works, but... (part 1)
Non-technical users
Admins are Mac-only, desktop-only people
FileMaker Pro
DB structure and UI developed externally
– No documentation
– This has bad implications

It works, but... (part 2)
Completely custom application
– External developer
– No documentation (again)
– Large lump of write-only Perl
Custom data store
– Not SQL. Not XML. Not RDF.
No external access

Unreachable data
Uncommunicative systems
Custom applications
– Developers/administrators AWOL
Custom data models
Lost passwords
Excel spreadsheets
– See also, Uncommunicative

Unreachable data
Private data
– Legal issues
– Possessive owners
Internal use only
Poor quality
No data!

Conclusions
Building the software is easy
There is still lots of hard-to-reach data out there
Issues are largely not technical
More outreach to A&H areas needed

Acknowledgements and thanks
LinkSphere team: Mark Baker, Shirley Williams, Pat Parslow (Reading), Claire Warwick, Melissa Terras, Claire Ross (UCL)
Repository owners at Reading: Amy Smith (Ure Museum), Guy Baxter (University Archivist), Mary Dyson, Hadj Messelles (Typography), Jonathan Bignell (Film Studies), Alison Sutton (CentAUR), Mike Fulford, Amanda Clarke (Silchester)
JISC VRE 3 programme

Tycho Architecture
[Diagram: one node labelled VR, four nodes labelled M, eight nodes labelled C]

REST Interface
/api/query
– POST to start new query asynchronously
/api/query/query_id
– GET for query metadata
– DELETE to cancel query (or it will time out naturally)
/api/query/query_id/start/finish
– GET a range of results from the query
Feedback API coming soon
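
The query endpoints on this slide could be driven by a client along the following lines. This is only a sketch under stated assumptions: the host name, the plain-text request body, and the helper function names are all hypothetical, since the slide gives only the paths and HTTP methods. The code builds the requests without sending them.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical host: the slide lists only the paths, not the server.
BASE_URL = "http://linksphere.example.org"


def _query_url(query_id: str) -> str:
    """Return the URL of one query, percent-encoding the identifier."""
    return BASE_URL + "/api/query/" + quote(query_id, safe="")


def start_query(search_terms: str) -> Request:
    """Build the POST that starts a new asynchronous query.

    Sending the search terms as a plain UTF-8 body is an assumption;
    the slide does not describe the request format.
    """
    return Request(BASE_URL + "/api/query",
                   data=search_terms.encode("utf-8"), method="POST")


def query_metadata(query_id: str) -> Request:
    """Build the GET that fetches metadata for a running query."""
    return Request(_query_url(query_id), method="GET")


def cancel_query(query_id: str) -> Request:
    """Build the DELETE that cancels a query before it times out."""
    return Request(_query_url(query_id), method="DELETE")


def query_results(query_id: str, start: int, finish: int) -> Request:
    """Build the GET for a range of results (start/finish as path segments)."""
    return Request(f"{_query_url(query_id)}/{start}/{finish}", method="GET")


if __name__ == "__main__":
    # Show the method and URL of each request without contacting a server.
    for request in (start_query("roman pottery"), query_metadata("q1"),
                    query_results("q1", 0, 9), cancel_query("q1")):
        print(request.get_method(), request.full_url)
```

Because queries run asynchronously, a real client would presumably poll the metadata endpoint and fetch result ranges as they become available; how the query identifier is returned by the POST is not stated on the slide.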

REST Interface
/api/repository
– GET list of repositories currently online
/api/repository/repo_id
– GET for repository metadata
   Link to repository itself
   Link to LinkSphere description of it
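
As a rough illustration of the repository endpoints, the sketch below builds the two GET requests named on this slide. The host name and function names are hypothetical, and the response format is not shown because the slide does not specify it.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical host: the slide lists only the paths, not the server.
BASE_URL = "http://linksphere.example.org"


def list_repositories() -> Request:
    """Build the GET for the list of repositories currently online."""
    return Request(BASE_URL + "/api/repository", method="GET")


def repository_metadata(repo_id: str) -> Request:
    """Build the GET for one repository's metadata.

    The identifier is percent-encoded so that it is safe as a path segment.
    """
    return Request(BASE_URL + "/api/repository/" + quote(repo_id, safe=""),
                   method="GET")


if __name__ == "__main__":
    # Print the requests instead of sending them; no server is contacted.
    for request in (list_repositories(), repository_metadata("ure museum")):
        print(request.get_method(), request.full_url)
```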