Changing Roles, Responsibilities and Relationships. Dr Liz Lyon, Director, UKOLN, and Associate Director, UK Digital Curation Centre. Opening the research data lifecycle, JISC Conference 2007. UKOLN: a centre of expertise in digital information management, www.ukoln.ac.uk. This work is licensed under a Creative Commons Attribution-ShareAlike 2.0 Licence.
Preliminary findings from a JISC study, October 2006 to March 2007 (N.B. work in progress!). Terms of Reference for UKOLN: to define how institutions (collectively and individually) and scientific data centres can together effectively achieve preservation; access, both managed and open; and re-use, including data citation, data mining and re-interpretation.
Some of the data stakeholders?
Funders. Interviews: 4 Research Councils + 1 charity. Support for data curation is (still) patchy. Approaches are mixed, from proactive to passive. There are gaps in infrastructure support for data outputs, and limited formal links between programme planning and support infrastructure. Some funders have data management and sharing policies; some use Data Management Plans. Wellcome Trust: Policy + Q&A, January 2007.
Wellcome Trust policy, January 2007: a Data Management and Sharing Plan is required if a proposal has creating or developing a resource for the research community as its primary goal, or involves the generation of a significant quantity of data that could potentially be shared for added benefit.
Funders 2. Limited advocacy work. Funding models vary, both for infrastructure support and for research programmes. Some productive partnerships, e.g. MRC and Wellcome Trust, CCLRC and Wellcome Trust. Some examples of good practice.
Hierarchy of drivers (for data sharing). Level 0: deliver project. Level 1: meet good scientific practice. Level 2: support own science. Level 3: employer's requirements. Level 4: funder's requirements. Level 5: public policy requirements. Acknowledgement: Mark Thorley, Natural Environment Research Council (NERC). NERC has: 7 designated data centres; a Data Management Co-ordinator; DataGrid.
MRC is developing a data support plan. Acknowledgement: Alan Sudlow.
Data centres & Data services. Interviews with 5 data services. Deep levels of expertise and subject knowledge. Exemplars of good practice: standards, policies, manuals, robust curation / preservation practice. Limited sharing of expertise between centres. Some effective partnerships: AHDS Stormont Papers with Queen's University Belfast; BADC with the CLADDIER Project. Wide range of community awareness. Use of licences, but IPR issues remain, e.g. in performing arts. Technical issues: complexity of data sets, version control, identifiers, application profiles.
Data centres & Data services 2. Exemplar of good practice: European Bioinformatics Institute (EBI). Microarray data to inform gene expression. Consensus on the community standard MIAME. Data pipelines at source via Laboratory Information Management Systems (LIMS). User tools (MIAMExpress) & value-added services. Annotation of data using the Gene Ontology. Submission & deposit is embedded in community culture: it is a requirement for publication. Training programme; eLearning materials coming. This level of data curation is expensive!
EBI data resources (diagram labels): Reactome; EnsEMBL (genome annotation); EMBL-Bank (DNA sequences); UniProt (protein sequences); ArrayExpress (microarray expression data); EMSD (macromolecular structure data); IntAct (protein interactions). Source: Graham Cameron, EBI.
Diagram relating core biomolecular resources to three groups: specialist biomolecular data resource examples; model organism resource examples; large resources in related disciplines. Resource labels in the diagram: Flybase, MGD, SGD, BRENDA, IMGT, Pasteur DBs, Eumorphia/Phenotypes, Mutants, Mouse Atlas, chemical data resources, medical data resources, biodiversity data resources. Source: Graham Cameron, EBI.
General Data Selection Criteria. Usability: quality of data; usable data format; conditions of use; reputable author; documentation. Usefulness: data quality; uniqueness of data; potential strategic use; usefulness of parameters.
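The usability and usefulness criteria above could be recorded as a simple checklist when a data set is appraised. The following is only a minimal sketch: the criterion names come from the slide, but the `Assessment` class, its `unmet` method and the example data set name are hypothetical and are not part of the JISC study.

```python
from dataclasses import dataclass, field

# Criterion names are copied from the slide; the checklist structure is an assumption.
USABILITY = ("quality of data", "usable data format", "conditions of use",
             "reputable author", "documentation")
USEFULNESS = ("data quality", "uniqueness of data", "potential strategic use",
              "usefulness of parameters")


@dataclass
class Assessment:
    """Hypothetical record of which criteria a candidate data set meets."""
    dataset: str
    met: set = field(default_factory=set)

    def unmet(self, criteria):
        """Return the criteria not yet marked as met, in slide order."""
        return [c for c in criteria if c not in self.met]


# Example appraisal of an invented data set.
assessment = Assessment(
    "example-survey-2006",
    met={"usable data format", "documentation", "uniqueness of data"},
)
print(assessment.unmet(USABILITY))
print(assessment.unmet(USEFULNESS))
```

A flat checklist like this makes gaps visible without implying any weighting between criteria, which the slide does not specify.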
Institutions & Data Repositories. Not much data, or duplication (yet?). Departmental audits of research data practice at the University of Southampton are informing a developing institutional data & curation policy. Barriers to data sharing: IPR and geospatial data; lack of awareness amongst researchers; cultural roots and resistance to change. Exemplars of good practice: eBank Project.
eCrystals Global Federation Model (diagram). Components: Smart lab (data creation & capture); laboratory repository; institutional data repositories; subject repository; aggregator services; presentation services / portals; publishers (peer-reviewed journals, conference proceedings, etc.); institution; Library & Information Services. Flow labels: deposit; validation; publication; curation; preservation; search, harvest; data analysis; data discovery, linking, citation.
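One way to read the deposit, validation and harvest labels in the federation model is as a two-step flow: repositories validate records on deposit, and an aggregator later harvests them into a shared index for discovery. The sketch below illustrates that reading only. The `Repository` and `Aggregator` classes, the required metadata fields and the sample records are all assumptions for illustration; the slide does not specify a schema or protocol, and this is not the eCrystals implementation.

```python
from dataclasses import dataclass, field

# Assumed minimal metadata; the slide does not specify any schema.
REQUIRED_FIELDS = ("identifier", "title", "creator")


@dataclass
class Repository:
    """Hypothetical institutional or subject repository."""
    name: str
    records: list = field(default_factory=list)

    def deposit(self, record):
        """Validate a record on deposit; reject it if required metadata is missing."""
        missing = [f for f in REQUIRED_FIELDS if not record.get(f)]
        if missing:
            raise ValueError(f"missing metadata: {', '.join(missing)}")
        # Store a copy tagged with the repository it was deposited in.
        self.records.append(dict(record, source=self.name))


@dataclass
class Aggregator:
    """Hypothetical aggregator service that harvests from many repositories."""
    index: dict = field(default_factory=dict)

    def harvest(self, repositories):
        """Copy every record from each repository into one index keyed by identifier."""
        for repo in repositories:
            for record in repo.records:
                self.index[record["identifier"]] = record

    def search(self, term):
        """Return harvested records whose title contains the term (case-insensitive)."""
        term = term.lower()
        return [r for r in self.index.values() if term in r["title"].lower()]


institutional = Repository("institutional")
subject = Repository("subject")
institutional.deposit({"identifier": "ds-1", "title": "Crystal structure A", "creator": "Lab 1"})
subject.deposit({"identifier": "ds-2", "title": "Crystal structure B", "creator": "Lab 2"})

aggregator = Aggregator()
aggregator.harvest([institutional, subject])
hits = aggregator.search("crystal")
```

Keeping validation at the point of deposit means the aggregator can assume every harvested record carries the fields it indexes on.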
Roles, Rights & Responsibilities. Scientist: creation and use of data. Data centre: curation of and access to data. User: use of third-party data. Funder: set / react to public policy drivers. Publisher: maintain integrity of the scientific record. Acknowledgement: Mark Thorley, NERC.
Closing thoughts. Co-ordination and join-up: high-level and strategic (funders); operational and practical (JISC data services & research council data centres). Funding: are current economic models for preservation & data sharing infrastructure a) appropriate? b) adequate? c) sustainable? The answers should inform prioritisation and investment.
Closing thoughts 2. Good practice requirements: data management and sharing policies; Data Management Plans (peer-reviewed); institutional data curation policies & planning. Technical interoperability and integration: data are diverse and complex; the JISC IIE vision of discovery across repositories; contextual linking offers an opportunity for data centres and institutional repositories to realise synergies and work more closely together.
Closing thoughts 3. Advocacy: programmes to reach across sectors; harmonisation and consistent messages; tailored & targeted to disciplines; the researcher has some curatorial responsibility. Training: lack of skills; eLearning opportunity; data scientists? Recognition and career development: native data scientists are coming….
Dealing with the Data Deluge. JISC Repositories Programme: Supporting Institutions in the Digital Age. Digital Repositories Conference, 5-6 June 2007, University of Manchester. Research Data Strand.