LinkSphere: P2P Cross Database Search -- Architecture and Issues Hugo Mills University of Reading.


1 LinkSphere: P2P Cross Database Search -- Architecture and Issues
Hugo Mills
University of Reading

2 LinkSphere
Linking Researchers and their Data
Social networking for researchers
Cross-database search
  – Mostly Arts and Humanities datasets
  – Promoting serendipity
  – Access by and presentation of datasets to wider audiences

3 Datasets Museums Archives Archaeology: Silchester Excavation, IADB Ure Museum of Classical Archaeology CentAUR: ePrints Library Beckett Collection Cole Museum of Zoology Film Collection Herbarium Typography Collections

4 Tycho
Fully asynchronous peer-to-peer communications framework
Written in Java
Fully distributed
Robust
"A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable." (Leslie Lamport)
Has a simple distributed data store (Virtual Registry) for client metadata

5 Tycho
(Relatively) lightweight
  – 3 MiB for a fully functional system
Fast
Flexible, Extensible
  – Bootstrap handlers
  – Additional message types
  – VR extensions
  – Alternative communication protocols
  – Discovery of core mediators via Bonjour/ZeroConf

6 XDB System Architecture
[Diagram. Labels: Tycho Core, VR, Repo (x2), Meta, Search App, REST search API; repository access via JDBC, Web API, SPARQL, ...]

7 User Interface
Main UI is web-based
  – Uses AJAX
  – Currently embedded within the LinkSphere project site
  – Will ultimately move to the SNS
Any UI possible using the REST API

8 Issues
Getting the data is hard
  – Implementation problems
  – Maintenance problems
  – Admin problems
  – Social problems
  – Legal problems

9 Muddling along
Archive of material for intra-departmental use only
  – Some legal issues involved
Group of technicians administering the data
  – Poor quality data
Excel spreadsheet(!)
Reluctant to have index of material made public

10 Not ready yet
Big university projects
New systems, (potentially) large data sets
MERL museums archive (AdLib)
  – Data all loaded from previous systems
  – Access modules not yet installed
CentAUR publications archive (ePrints 3)
  – Very little data available yet

11 Works For Me
Custom web application
  – PHP, sophisticated
External developer
No documentation
MySQL underneath

12 It works, but... (part 1)
Non-technical users
Admins are Mac-only, desktop-only people
FileMaker Pro
DB structure and UI developed externally
  – No documentation
  – This has bad implications

13 It works, but... (part 2)
Completely custom application
  – External developer
  – No documentation (again)
  – Large lump of write-only Perl
Custom data store
  – Not SQL. Not XML. Not RDF.
No external access

14 Unreachable data
Uncommunicative systems
Custom applications
  – Developers/administrators AWOL
Custom data models
Lost passwords
Excel spreadsheets
  – See also, Uncommunicative

15 Unreachable data
Private data
  – Legal issues
  – Possessive owners
Internal use only
Poor quality
No data!

16 Conclusions
Building the software is easy
There is still lots of hard-to-reach data out there
Issues are largely not technical
More outreach to A&H areas needed

17 Acknowledgements and thanks
LinkSphere team: Mark Baker, Shirley Williams, Pat Parslow (Reading), Claire Warwick, Melissa Terras, Claire Ross (UCL)
Repository owners at Reading: Amy Smith (Ure Museum), Guy Baxter (University Archivist), Mary Dyson, Hadj Messelles (Typography), Jonathan Bignell (Film Studies), Alison Sutton (CentAUR), Mike Fulford, Amanda Clarke (Silchester)
JISC VRE 3 programme

19 Tycho Architecture
[Diagram. Labels: one VR node, four nodes marked M, eight nodes marked C]

20 REST Interface
/api/query
  – POST to start new query asynchronously
/api/query/query_id
  – GET for query metadata
  – DELETE to cancel query (or it will time out naturally)
/api/query/query_id/start/finish
  – GET a range of results from the query
Feedback API coming soon
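
The query endpoints listed on this slide could be exercised from a client roughly as follows. This is only a sketch in Python using the standard library: the host in `BASE_URL`, the function names, and the POST body are all assumptions made for illustration, since the slides give only the paths and HTTP methods. The functions build request objects without sending anything.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical host and port; the slides do not say where the API is served.
BASE_URL = "http://localhost:8080"


def start_query(body: bytes, base: str = BASE_URL) -> Request:
    """Build the POST to /api/query that starts a new asynchronous query."""
    # The request body format is not given in the slides; `body` is a placeholder.
    return Request(f"{base}/api/query", data=body, method="POST")


def query_metadata(query_id: str, base: str = BASE_URL) -> Request:
    """Build the GET for /api/query/query_id (query metadata)."""
    return Request(f"{base}/api/query/{quote(query_id, safe='')}", method="GET")


def cancel_query(query_id: str, base: str = BASE_URL) -> Request:
    """Build the DELETE for /api/query/query_id (cancel the query)."""
    return Request(f"{base}/api/query/{quote(query_id, safe='')}", method="DELETE")


def query_results(query_id: str, start: int, finish: int,
                  base: str = BASE_URL) -> Request:
    """Build the GET for /api/query/query_id/start/finish (a range of results)."""
    # Whether `finish` is inclusive is not stated in the slides.
    path = f"/api/query/{quote(query_id, safe='')}/{start}/{finish}"
    return Request(f"{base}{path}", method="GET")
```

Passing any of these objects to `urllib.request.urlopen` would send the request. Because the query runs asynchronously, a client would presumably start it, then fetch metadata and result ranges as they become available; the response format is not described in the slides.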

21 REST Interface
/api/repository
  – GET list of repositories currently online
/api/repository/repo_id
  – GET for repository metadata
    Link to repository itself
    Link to LinkSphere description of it
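
The repository endpoints on this slide can be sketched the same way. Again this is illustrative Python only: `BASE_URL` and the function names are assumptions, and the shape of the returned metadata (beyond the two links the slide mentions) is not specified.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical host and port; the slides do not say where the API is served.
BASE_URL = "http://localhost:8080"


def list_repositories(base: str = BASE_URL) -> Request:
    """Build the GET for /api/repository (repositories currently online)."""
    return Request(f"{base}/api/repository", method="GET")


def repository_metadata(repo_id: str, base: str = BASE_URL) -> Request:
    """Build the GET for /api/repository/repo_id (metadata for one repository)."""
    # Percent-encode the identifier so it stays a single path segment.
    return Request(f"{base}/api/repository/{quote(repo_id, safe='')}", method="GET")
```

A client might call the first endpoint to discover which repositories are online, then request metadata for each identifier it receives.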

