Aggregation as a tactic - to support discovery Peter Burnhill & Stuart Macdonald EDINA national data centre University of Edinburgh CERN workshop on Innovations.

Presentation transcript:

Aggregation as a tactic - to support discovery
Peter Burnhill & Stuart Macdonald, EDINA national data centre, University of Edinburgh
CERN workshop on Innovations in Scholarly Communication (OAI7), University of Geneva, 23 June 2011

Context: RDTF Vision
The joint JISC / RLUK Resource Discovery Task Force (RDTF) Vision: UK researchers and students will have easy, flexible, and ongoing access to content and services through a collaborative, aggregated and integrated resource discovery and delivery framework which is comprehensive, open and sustainable.
Making content more discoverable by both people and machines via a mixed economy of technological solutions.
The Discovery Initiative aims to:
  Engage stakeholders across libraries, archives and museums
  Build critical mass of open content to inspire others to participate
  Encourage development of purposeful aggregations and compelling applications - mashing at the macro-level
  Exemplify what can be done across domains to free data and explore how to make that data work harder
No one-size-fits-all solution!

Key concept in the RDTF Vision is aggregation, directly or represented through metadata – to unlock the online & digital riches held in our organisations.
Regard aggregation as an intervention to exploit the telematic opportunity for things [that] are 'remote, digital & published' - a phrase derived from an IASSIST conference in 1990 exploring what it meant, with the Internet, if we regarded all [content] as remote and published.
The Web in the mid-1990s simplified and thus improved.
Unfortunately, even now, much which is online and on the Web is badly or inadequately published … We have to improve, re-interpreting what it means to be well-published.
'Aggregation as a tactic' - a phrase coined to end an impasse during a meeting to discuss technical aspects of the RDTF Vision statement to identify stakeholder groups.

The term aggregation is used a lot in computer science for:
  objects … assembled or configured together to create a more complex object (UML, IBM)
  aggregating resources based on … properties. … they are owl:sameAs and their other properties can be intermixed.
For the purposes of RDTF, aggregation means an assembly of data sources – more than a collection of objects (image banks, data services, catalogues, activity data) – related or otherwise, for machine-as-user, independent of the presentation layer.
However, aggregation is not a goal nor an end in itself. It is an intervention to be used for a twofold strategic purpose:
  improvement - merge & match, customisation and consumption, multiple output formats, reduce duplication of effort
  discoverability – via promiscuous or well-dressed metadata through e.g. Google or tailored services
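As an illustration of the "merge & match" idea, here is a minimal Python sketch that brings together records from two sources when they share an identifier and intermixes their other properties. The source names, field names and the ISSN value are hypothetical, and matching on a single key is an assumption made for brevity; the slides do not describe how any EDINA service actually matches records.

```python
from collections import defaultdict


def aggregate(sources, key="issn"):
    """Merge records from several sources that share the same key value.

    `sources` maps a (hypothetical) source name to a list of flat records
    whose values are strings.
    """
    merged = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            ident = record.get(key)
            if ident is None:
                continue  # no shared identifier, so nothing to match on
            target = merged[ident]
            for field, value in record.items():
                # Keep every distinct value and note which sources supplied it.
                target.setdefault(field, {}).setdefault(value, []).append(source_name)
    return dict(merged)


# Hypothetical input: two libraries describing the same serial.
sources = {
    "library_a": [{"issn": "1234-5679", "title": "Example Journal", "holdings": "1990-2005"}],
    "library_b": [{"issn": "1234-5679", "title": "Example Journal", "holdings": "2001-2011"}],
}
result = aggregate(sources)
```

In this sketch the shared title is stored once with both sources listed against it, while the two differing holdings statements are kept side by side, which is one simple way to reduce duplication without discarding information.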

Language & Perspectives
Digital Library has mixed parentage - a re-mix of the document tradition & the computation tradition:
  approaches based on a concern with documents, with signifying records: archives, bibliography, documentation, librarianship, records management, and the like … [Content Provider speak]
  approaches based on uses of formal techniques, whether mechanical (such as punch cards and data-processing equipment) or mathematical/computational (as in algorithmic procedures). [Developer speak]
Prof. Michael Buckland, Presidential Address, American Society for Information Science, JASISs 50th (1998)

Perspectives … as provider
EDINA - develops and delivers JISC-sponsored national online services – adding value to data and content
  Digimap Collections (OS mapping; SeaZone; BGS)
  NewsfilmOnline (various; digitised with JISC £)
  UK Access Management Federation (institutions; authentication)
Data Library – move from support to middle folk
  Research data support for Edinburgh researchers
  Research data management guidelines, training, OER materials
  Edinburgh DataShare – open data repository
  RADAR – Researching A Data Asset Registry
Maybe as middle folk - c.f. those who deal in middleware:
  sometimes having the role of creator and supplier of some service
  sometimes being the user of what others supply
  inter-operator

Perspective … as aggregator: developing and delivering JISC-sponsored aggregation services
JISCMediahub - links to collections & hosted content (c. 1m resources)
  CultureGrid; First World War Poetry; Films of Scotland; Getty images (all content searchable and viewable within JISC Media Hub)
GoGeo! - metadata registry for spatially-referenced data
  Geodoc Metadata creation tool, ShareGeo Open
SUNCAT – serials union catalogue: 80 libraries
  metadata/links to full text, download MARC records (& XML & SUTRS - Simple Unstructured Text Record Syntax - data exchange format widely used in Z39.50)
PEPRS - e-journal preservation registry jointly led by EDINA with the ISSN International Centre
  metadata registry of available back copy e-journals - aggregated from preservation agencies (incl. British Library, UK LOCKSS Alliance, CLOCKSS)

Some RDTF-related EDINA projects:
GOgeo Linked Data (GOLD) – triplify INSPIRE-compliant metadata to improve discoverability of metadata records via search engines
SUNCAT: Exploring Open [bibliographic] Metadata (working with OKF to open up data sent by contributing libraries – convert to RDF)
Sharing OpenURL Activity Data - monthly usage data: date & time; anonymised IP address/inst. ID; title; author; ISSN; DOI
  Uses – article/journal recommendations, publishers reviewing what content is of interest to specific communities, innovative services to meet users' needs
CHALICE – use data mining to extract placenames from the English Place Name Survey to create a UK historic gazetteer published as Linked Data & link it to the Geonames ontology on the semantic web
AddressingHistory – geo-parsing of Scottish Post Office Directories, API onto digitised content, output in XML, CSV, JSON
3 further case studies on other EDINA services illustrating how other collections can benefit from the same techniques.
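The OpenURL activity data is described as carrying an anonymised IP address. The slides do not say how the anonymisation is done, so the following Python sketch only shows one common approach, a keyed hash (HMAC-SHA256) that turns the same address into the same opaque token each time. The key, the field names and the truncation to 16 characters are all assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical secret; it would need to stay out of any published data.
SECRET_KEY = b"replace-with-a-private-key"


def anonymise_ip(ip_address, key=SECRET_KEY):
    """Return a stable, opaque token for an IP address using a keyed hash."""
    digest = hmac.new(key, ip_address.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability; an arbitrary choice


def anonymise_entry(entry):
    """Copy a log entry, replacing its 'ip' field with the anonymised token."""
    cleaned = dict(entry)
    cleaned["ip"] = anonymise_ip(cleaned["ip"])
    return cleaned


# Hypothetical log entry using a documentation-only IP address.
raw = {
    "timestamp": "2011-06-23T10:15:00",
    "ip": "192.0.2.10",
    "issn": "1234-5679",
    "title": "Example Journal",
}
published = anonymise_entry(raw)
```

Because the token is stable, requests from the same address can still be grouped for uses such as recommendations, while the address itself is not exposed in the shared data.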

The end is the start of a new beginning …
In earlier web time we had the MODELS user-verbs: Discover -> Locate -> Request -> Access (Deliver) (Dempsey, Russell & Murray, 1999), where Access was the end game for us middle folk, even if the beginning & part of a deeper process for researchers, students …
Now there is a call for more than bilateral & negotiated interoperability, where Access is the beginning for developers and for other services.
  RDF/Linked Data enables information to be shared in a more Web-friendly way
  RDF/Linked Data enables structure and content of those data sources to be explicit - vocabularies, ontologies, relationships
Exposing the complexity and relationship in the underlying data, hanging the insides on the outside!
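To make "hanging the insides on the outside" concrete, here is a small Python sketch that turns a flat catalogue record into N-Triples statements, so each property is stated explicitly against a shared vocabulary. It uses the Dublin Core elements namespace for the predicates; the subject URI, the record and the choice of fields are hypothetical, and this is not presented as how GOLD or SUNCAT produce their RDF.

```python
# Dublin Core Metadata Element Set namespace (title, identifier, publisher).
DC = "http://purl.org/dc/elements/1.1/"


def escape_literal(text):
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_ntriples(subject_uri, record):
    """Emit one triple per field, using the field name as a DC element."""
    lines = []
    for field, value in record.items():
        lines.append(
            '<{}> <{}{}> "{}" .'.format(subject_uri, DC, field, escape_literal(value))
        )
    return "\n".join(lines)


# Hypothetical record and subject URI.
record = {
    "title": "Example Journal",
    "identifier": "1234-5679",
    "publisher": "Example Press",
}
triples = to_ntriples("http://example.org/journal/1", record)
print(triples)
```

Each output line is a self-describing statement, so a machine consuming the data does not need to know the layout of the original catalogue record.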

The treasures are on show inside, but … Centre Pompidou

… and so to summarise …
Early web approaches focused on making content accessible for humans:
  hiding the complexity and relationship in the underlying data
  paying attention to the user interface: HCI & GUI; Usability and Accessibility
However, to ensure content gets noticed it must be made easier for machines to understand by:
  exposing the complexity and relationship in the underlying data
  having in mind the machine-as-user: API as well as HCI
Aggregation should be seen as intervention, with strategic purpose:
  1. to engage in value-added improvement of content
  2. to enhance the discoverability of that which is aggregated, to be a focus of attention (through promiscuous metadata!)
If it is with RDF, then that's good; don't make a fuss if not.
Publish RDBMS schemas, catalogue records, codebooks, and ancillary or related content in multiple, machine-readable formats.
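Publishing the same content in multiple machine-readable formats can be sketched with the Python standard library alone. The example below serialises one list of records as JSON, CSV and XML; the record fields and element names are hypothetical and are not taken from any EDINA API.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def to_json(records):
    """Serialise the records as a JSON array."""
    return json.dumps(records, indent=2, sort_keys=True)


def to_csv(records):
    """Serialise the records as CSV with a header row (columns sorted by name)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=sorted(records[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_xml(records):
    """Serialise the records as <records><record>...</record></records>."""
    root = ET.Element("records")
    for record in records:
        item = ET.SubElement(root, "record")
        for field, value in sorted(record.items()):
            ET.SubElement(item, field).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical directory-style record.
records = [{"name": "Example Street", "year": "1861", "occupation": "baker"}]

for render in (to_json, to_csv, to_xml):
    print(render(records))
```

Keeping one internal representation and rendering it several ways means a new output format is an extra function, not a new copy of the data.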

The Many Minds principle
the coolest thing to do with your data will be thought of by someone else
Using data as the building platform
Jo Walsh & Rufus Pollock. Open Data and Componentization. XTech 2007 (slide 14)
"Benefits of freeing data are many, arguably being the most relevant one the Many Minds principle: there'll always be someone that will find out a way to reuse data that you wouldn't have even figured."
José Manuel Alonso, Notes from the 5th Internet, Law and Politics Conference: The Pros and Cons of Social Networking Sites, organized by the Open University of Catalonia, School of Law and Political Science, and held in Barcelona, Spain, on July 6th and 7th, 2009.

Repository Fringe 2011 – call for participants: THANK YOU CC BY-NC-ND image by enggul courtesy of Flickr –