UK Digital Curation Centre : enabling research data management at the coalface Dr Liz Lyon Associate Director DCC / Director UKOLN University of Bath,

Presentation transcript:

UK Digital Curation Centre : enabling research data management at the coalface Dr Liz Lyon Associate Director DCC / Director UKOLN University of Bath, UK

Overview: 1. Moving data across boundaries : structural science 2. Managing data in institutions : emerging DCC tools 3. Making data count : publication and attribution

Bridging the chasm between the local laboratory bench and large-scale facilities, e.g. DIAMOND synchrotron; Develop Integrated Information Model; Use cases and Inter-disciplinary Pilots; Cost-benefit analysis: before and after

Structural Sciences Infrastructure

Site | Diamond Light Source Synchrotron | National Crystallography Service, University of Southampton | Local Earth Sciences Lab, University of Cambridge
Function | International service - multiple communities | UK service - multiple institutions. Also uses Diamond | Lone researcher at institution - uses NCS and ISIS large-scale facility
Administration | Peer-reviewed proposal required | Vetted applications. Electronic & paper-based records – experiments, safety ERA, instrument time | Multiple proposals, multiple forms
Workflow | Formulaic and bespoke | Formulaic | Complex, unrecorded
Software | In-house scripts | In-house scripts + open-source suite
Raw data storage | In-house GDA store | ATLAS data-store | Laptop / local server
Derived data storage | Taken offsite on laptop / USB stick | eCrystals repository | Laptop / local server / USB stick
Metadata | Core Scientific MetaData Model | eBank/eCrystals schema | ?
Identifiers | Beam-line number | DOI InChI | ?

Diagram: An Idealised Scientific Research Activity Lifecycle Model
Key: Research Activity; Administrative Activity; Curation Activity; Publication Activity; Information Flow
Activities: Research Concept and/or Experiment Design; Start Project; Write Proposal (include DMP); Peer-review Proposal; Acquire Sample; Conduct Experiment; Generate, Create, & Collect Raw Data; Check & Clean Raw Data; Process & Analyse Derived Data; Interpret & Analyse Results; Appraisal & Quality Control; Documentation, Metadata & Storage (Reference, Provenance, Context, Calibration etc.); Data Archive, Preservation & Curation (OAIS conformant; Representation Information etc.); IPR, Embargo & Access Control; Discover, Access, Validate, Reuse & Repurpose Data; Write Usage Report; Prepare Manuscript; Prepare Supplementary Data; Peer Review; Publish Research
Data and outputs: Raw Data; Processed Data; Derived Data; Results Data; Programs (generate customised software); User registration data; Instrument allocation data etc.; Risk assessment data; other sample data; Comments, annotations, ratings etc.; Citations, References; Research Outputs; Papers, articles, presentations, reports; Publications Database; Scholarly Knowledge

Existing work : mappings and gaps Data Management and Provenance (CSMD, OPM?) Bibliographic records (FRBR, SWAP) Curation (OAIS, PREMIS?) DC, Ontologies Software descriptions (??) Slide : Brian Matthews, STFC PROCESS

Focus on Open Methodology Develop Data Model Join up to other Data Model work : OreChem Data Conservancy Linked data approach Integrated Information Model

Requirements Analysis Report … it is apparent that the greatest need is for a robust data management infrastructure which supports each researcher in capturing, storing, managing and working with all the data generated during an experiment. Internal sharing of research data amongst collaborating scientists … is also a primary concern as is a requirement for access to research data in the long run so that a researcher … can return to and validate the results well into the future.

INCREMENTAL Project Institutional perspective : Scoping study Creating & organising data Storage and access Back-up Preservation Sharing and re-use

Incremental Project Report, June 2010: While many researchers are positive about sharing data in principle, they are almost universally reluctant in practice. Using these data to publish results before anyone else is the primary way of gaining prestige in nearly all disciplines. The majority of people felt that some form of policy or guidance was needed...

Emerging funder requirements

Data types, formats, standards, capture; Ethics and Intellectual Property; Access, sharing and re-use; Short-term storage & data management; Deposit & long-term preservation; Adherence and review
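The headings above can be read as a checklist for a data management plan. The following is a minimal Python sketch, assuming a plan is held as a plain dictionary keyed by those headings. The function name, the data layout and the sample entries are hypothetical illustrations and are not taken from DMP Online.

```python
# Section headings come from the slide above; everything else is hypothetical.
DMP_SECTIONS = [
    "Data types, formats, standards, capture",
    "Ethics and Intellectual Property",
    "Access, sharing and re-use",
    "Short-term storage & data management",
    "Deposit & long-term preservation",
    "Adherence and review",
]


def missing_sections(plan: dict[str, str]) -> list[str]:
    """Return the headings that are absent from the plan or left blank."""
    return [s for s in DMP_SECTIONS if not plan.get(s, "").strip()]


# Made-up draft plan: one section filled in, one present but blank.
draft = {
    "Data types, formats, standards, capture": "Diffraction images and CIF files",
    "Access, sharing and re-use": "   ",
}

for heading in missing_sections(draft):
    print("Still to do:", heading)
```

A reviewer assessing a proposal could run a check of this kind before reading the plan in detail, so that incomplete plans are caught early.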

DMP Online Currently updating Version 2.0 Version 3.0 summer 2010

Making DMPs work : the start of a long process… Embed DMPs in research lifecycles / activity model as the norm; Code of Conduct for Research; Assess & review DMPs (not just the science content of proposals); Educate reviewers (DCC guidance for social science in prep); Manage compliance; Infrastructure to share DMPs; Analyse cost-benefits

Diagram, shown again: An Idealised Scientific Research Activity Lifecycle Model

Data citation, credit, metrics, attribution Incentives?

Complexity : what are we citing? Journal Article; Workflow; Visualisation; Model; Data; Annotation; Concept. Attribution granularity : Macro; Micro / Nano

Integrative genomics Gene expression & clinical traits data in Sage Commons Genome-Wide Association Studies (GWAS) Large-scale predictive network models of disease Co-expression and Bayesian (probabilistic graph) networks Complex data analysis pipelines

Large-scale predictive network models of disease Sage Pipeline Multiple datasets Visualise: Cytoscape Workflow: Taverna

Functionality? How do we cite? Persistent identification - URIs; Identifier-agnostic framework; Resilient resolution service; Multi-directional linking e.g. to peer-reviewed paper, to datasets; Version control, provenance

Take homes... Infrastructure : seamless & cost-effective; Open Methodology : emerging Data Model; Researchers need help with data management; Data Management Plans : DCC DMP online tool; We need to incentivise data management; Citation Framework : assure credit & attribution

Chicago Mart Plaza, 6-8 December 2010 Thank you…