The Faster Research Cycle Interoperability for better science Brian Matthews, Leader, Information Management Group, E-Science Centre, STFC Rutherford Appleton.

Presentation transcript:

The Faster Research Cycle
Interoperability for better science
Brian Matthews, Leader, Information Management Group, E-Science Centre, STFC Rutherford Appleton Laboratory

The Research Lifecycle
E-Science: providing the infrastructure for the research lifecycle

How do we speed up this cycle?
By speeding up the cycle we can increase the volume of good science
  – Make a better return from the investment in science
  – Make breakthroughs in science earlier
Do this via:
  – Integration
      Support the whole lifecycle
      See Kerstin’s talk
  – Interoperability
      Support across lifecycles

Interoperability
Sharing across boundaries
  – Across different research lifecycles
  – Across institutions
  – Across information objects
  – Across disciplines
  – Across time
Characteristics
  – Loosely coupled
  – Across different authorities
  – Different internal models

Enabling better science
Neutron diffraction, X-ray diffraction, NMR → High-quality structure refinement → SCIENCE MASHUPS

Vision
Infrastructure to support science across disciplines, scientific institutions and research groups

EDNS: European Data Infrastructure for Neutron and Synchrotron Sources
Combining European Neutron and Synchrotron Facilities
Already a common user community
Across many disciplines
  – Materials, chemistry, proteomics, pharmaceuticals, nuclear physics, archaeology …

Interoperability Across Facilities
                      UK         France
Neutrons              ISIS       ILL
Synchrotron X-Rays    Diamond    ESRF
e-Science

Integration and interoperation across facilities
Single Infrastructure → Single User Experience
Different Infrastructures → Different User Experiences
Facility 1: Raw Data, Data Analysis, Analysed Data, Published Data, Publications, User Data
Facility 2: Raw Data, Data Analysis, Analysed Data, Published Data, Publications, User Data
Facility 3: Raw Data, Data Analysis, Analysed Data, Published Data, Publications, User Data
Across facilities: Raw Data Catalogue, Data Analysis, Analysed Data Catalogue, Published Data Catalogue, Publications Catalogue, User Catalogue
Underlying: Publications Repositories, Data Repositories, Software Repositories, User Registries, Capacity Storage, Common CRIS

Potential Impact
Most of Research Lifecycle
  – User Management, Data Collection, Analysis, Publication
Establish a Production service
  – benefit to users – usability, findability: user info, data, pubs, software
  – benefit to facilities – manageability: users, data, pubs, software
Outreach and expansion
  – Linking with other facilities in Europe and the wider world
      USA, Canada, Australia
  – Linking with User communities
But at the moment, we are still in the planning and discussion phase

Sharing Users
Sharing knowledge of users
  – Enhancing level of support for users
  – Can correlate similar applications put into different facilities
  – Facilities can provide a continuity of service
  – Facilities can increase accuracy
Common Authentication
  – Common UID ?
  – Shibboleth
  – Grid Certificates
  – SSO at STFC, ShibGrid
  – Virtual Organisation Support
Policy Issues
  – Data protection
  – Institutional Security policy
[Diagram labels: FedID, Facility User DN, Shibboleth ID, SRBSystem UID, SSH PK, Facility UserID]
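
The "Common UID ?" question above can be pictured as a small registry that maps each facility-specific identifier to one person record. This is only a sketch under assumptions: the class names, the scheme labels used as keys and the sample identifier values are all hypothetical and do not describe any actual STFC or facility system.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    """One real user, however many facility accounts they hold."""
    common_uid: str
    # (scheme, value) pairs, e.g. ("FedID", "abc12345"); all values are made up.
    identifiers: set = field(default_factory=set)


class UserRegistry:
    """Hypothetical cross-facility registry: resolves any known identifier to one Person."""

    def __init__(self):
        self._by_identifier = {}

    def link(self, person, scheme, value):
        """Attach an identifier to a person, refusing to reassign one already in use."""
        key = (scheme, value)
        owner = self._by_identifier.get(key)
        if owner is not None and owner is not person:
            raise ValueError(f"{scheme}:{value} already belongs to {owner.common_uid}")
        self._by_identifier[key] = person
        person.identifiers.add(key)

    def resolve(self, scheme, value):
        """Return the Person holding this identifier, or None if it is unknown."""
        return self._by_identifier.get((scheme, value))


registry = UserRegistry()
alice = Person(common_uid="user-0001")
registry.link(alice, "FedID", "abc12345")
registry.link(alice, "Shibboleth ID", "alice@example.ac.uk")
registry.link(alice, "Facility UserID", "isis-4711")

# Any of the three identifiers now leads back to the same record.
same_person = registry.resolve("Shibboleth ID", "alice@example.ac.uk")
```

Keeping the facility identifiers as they are and adding only a lookup layer matches the loosely coupled approach of the talk: no facility has to change its internal account model.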

Sharing Data
Sharing data is hard:
  – Different data formats
  – Different access rights
  – Complex objects
  – Maintaining context
Metadata is key
  – Structural Metadata (CSMD)
  – Conceptual structures (Ontologies) – maintain meaning
  – Metadata is hard to collect
Consistent data policies are needed

eCrystals ‘Data Federation’ Model
[Diagram. Components: Data creation & capture in “Smart lab”; Laboratory repository; Institutional data repositories; Subject Repository; Institution Library & Information Services; Publishers: peer-review journals, conference proceedings, etc; Aggregator services; Presentation services / portals. Flows and functions: Deposit; Validation; Publication; Data analysis; Search, harvest; Data discovery, linking, citation; Curation; Preservation]

Data Policy
Data policy
  – Retention
  – Quality
  – Access
Learning how to manage policy as part of the SOA infrastructure
  – E.g. GridTrust
  – Consequence – looking at Data Policy
Remains as a very large Business question
[Diagram labels: Goals & Requirements Self-* … Dynamic VO Policies VO Mngt … … Trust and Security for NGGs Usage control Resources]

Sharing Publications
Institutional Repository s/w now very well established
  – ePrints, DSpace, Fedora, ePubs
  – Large body of expertise available
  – Standard metadata models and protocols: DC-APs, FRBR, OAI-PMH, OAI-ORE
  – Not yet embedded in science practice, except in HEP!
Linking science data and publications
  – Not yet well established
  – Needs data citation
  – Needs peer review of data
  – Can (and should) be done on a P2P basis
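
As an illustration of how lightweight the OAI-PMH protocol named above is for a harvester, the sketch below builds a `ListRecords` request URL. The verb and the `metadataPrefix`, `set` and `from` arguments come from the OAI-PMH specification; the repository base URL and the helper name `list_records_url` are assumptions made for this example only.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL (no network access is made here)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec      # optional: restrict harvest to one set
    if from_date:
        params["from"] = from_date    # optional: selective harvest by datestamp
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint; replace with a real OAI-PMH base URL.
url = list_records_url("https://repository.example.org/oai", from_date="2008-01-01")
```

A harvester would fetch this URL, parse the returned XML and follow any resumption token for further pages; those steps are left out of the sketch.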

STFC

Sharing Software
Analysis software tends to be specialised
  – Dependent on specific data formats
  – Dependent on the nature of the data
  – Dependent on the particular result to be demonstrated
Nevertheless common s/w repositories exist
  – GAMS, StarLink, NAG, CCPForge etc
Advantages in sharing it
  – Saves programmer effort
  – Verification of results
  – Common algorithms
  – Visualisation tools
Little work on systematic preservation of s/w
  – Significant properties of s/w

Common Representation and Transport of Information
To support the infrastructure we need a means to share information
  – Lightweight
  – Minimal impact on internal systems
  – Keeps control at the source
  – Easy to share and merge
  – Can share conceptual information
The Semantic Web (still) provides the best current option

DataWebs
DataWeb concept
  – David Shotton, Oxford
  – Biological images
  – Publishing metadata locally
  – With different conceptual description
  – Mapped to core Ontology
  – Search and aggregator service
Integration comes for free
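
The DataWeb idea above (local metadata, local vocabularies, a mapping to a core ontology, then an aggregator) can be sketched in a few lines. Everything here is hypothetical: the two lab sources, their field names, the `core:` terms and the helper functions are invented for illustration and are not taken from the DataWeb project itself.

```python
# Two hypothetical local sources, each describing images with its own field names.
LAB_A = [{"organism": "Drosophila", "technique": "confocal"}]
LAB_B = [{"species": "Drosophila", "imaging_method": "electron microscopy"}]

# Each source publishes a mapping from its local terms to a shared core vocabulary.
MAPPING_A = {"organism": "core:species", "technique": "core:method"}
MAPPING_B = {"species": "core:species", "imaging_method": "core:method"}


def to_core(records, mapping, source):
    """Re-express local records in core terms, remembering where they came from."""
    converted = []
    for record in records:
        core = {mapping[key]: value for key, value in record.items() if key in mapping}
        core["source"] = source
        converted.append(core)
    return converted


def search(aggregated, criteria):
    """Return every aggregated record matching all of the core-term criteria."""
    return [r for r in aggregated
            if all(r.get(term) == value for term, value in criteria.items())]


aggregated = to_core(LAB_A, MAPPING_A, "lab-a") + to_core(LAB_B, MAPPING_B, "lab-b")
hits = search(aggregated, {"core:species": "Drosophila"})
```

The local records are never rewritten; only the mappings are shared, which is the sense in which each source keeps control of its own metadata.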

SKOS: Simple conceptual relationships
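
A minimal sketch of what "simple conceptual relationships" look like in practice, using plain Python tuples instead of an RDF library. `skos:broader` and `skos:prefLabel` are real SKOS properties; the `ex:` concepts and the two helper functions are assumptions chosen to echo the diffraction example earlier in the talk. Note that `skos:broader` is not itself transitive in SKOS; the transitive walk below corresponds to the separate `skos:broaderTransitive` property.

```python
# (subject, predicate, object) triples; the ex: concepts are hypothetical.
TRIPLES = [
    ("ex:NeutronDiffraction", "skos:broader", "ex:Diffraction"),
    ("ex:XRayDiffraction", "skos:broader", "ex:Diffraction"),
    ("ex:Diffraction", "skos:broader", "ex:StructuralTechnique"),
    ("ex:NeutronDiffraction", "skos:prefLabel", "Neutron diffraction"),
]


def broader_transitive(concept, triples):
    """Collect every ancestor reachable by repeatedly following skos:broader."""
    found, seen, frontier = [], set(), [concept]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "skos:broader" and obj not in seen:
                seen.add(obj)
                found.append(obj)
                frontier.append(obj)
    return found


def narrower(concept, triples):
    """Direct children: concepts that name this one as their skos:broader."""
    return sorted(s for s, p, o in triples if p == "skos:broader" and o == concept)


ancestors = broader_transitive("ex:NeutronDiffraction", TRIPLES)
children = narrower("ex:Diffraction", TRIPLES)
```

A search for "diffraction" data could use `children` to widen the query to both neutron and X-ray results without either facility changing its own vocabulary.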

A Reality Check: the SUPER Report
Do the users really want all this?
Study of User Priorities for e-Infrastructure for e-Research (SUPER)
Survey commissioned by the UK NeSC
  – Steven Newhouse, Jennifer Schopf, Andrew Richards, Malcolm Atkinson
Covered 45 people from over 30 e-Science projects
  – Small survey
  – Selected from the already converted!
Available:

SUPER Results
Some Concerns:
How to share data with colleagues
  – Large-scale data sets (files)
  – Metadata standards seen as key
  – Automatic capture of provenance
Long-term data curation
  – Help with best practice to curate data
Authentication
  – Simpler authentication mechanisms
  – Easier use of Virtual Organisations
Training and outreach
We seem to be hitting the right points!

Summary
Interoperability gives leverage to speed up the science lifecycle
Access to resources across institutions and disciplines
Metadata is key
Policy is key
Need to use semantic description to share meaning
Loose coupling of resources via Semantic Web