Research data management – a brief introduction

Presentation on theme: "Research data management – a brief introduction"— Presentation transcript:

1 Research data management – a brief introduction
Research Services
Slides provided by the DaMaRO Project, University of Oxford

This slideshow was prepared by the DaMaRO Project (a joint endeavour between IT Services, the Bodleian Libraries, and Research Services at the University of Oxford) and is made available under the Creative Commons Attribution Non-Commercial License. Subject to the terms of the license, you are welcome to reuse or adapt this material for your own purposes.

November 11, 2018

2 What is research data management?
Research data is all the information you make use of in your research
Structured data (in databases, tables, etc.)
Unstructured data (in textual sources, images, audio recordings, personal notes, etc.)
Data management is how you organize, structure, store, and care for this information
It’s about ensuring you have the information you need at your fingertips
And about ensuring that information remains available and intelligible in the longer term

Research data management is a rather forbidding term – but it actually just refers to how you deal with the information you make use of in the course of your research. Some people think of research data in fairly narrow terms – perhaps it conjures up images of experimental results, or of something like a database of statistics. Those are important kinds of research data, of course, but the term can also be used much more broadly, to cover both structured and unstructured information – which includes textual sources, images, manuscripts, websites, sound recordings, your own notes, and a whole host of other things.

Data management is about how you organize and structure material (both print and electronic), and about storage and backing up. It’s also about what happens to the data in the longer term – about preserving data, and perhaps sharing it with other researchers.

3 Data management basics
Do your current methods of dealing with information allow you to find what you need, when you need it?
Are files and data suitably labelled to aid retrieval?
Be proactive and plan ahead – what will you want to be able to do with your data in the future?
How are you storing your data for the duration of your project?
What’s your back-up plan?
The HFS service offers free back-up to graduates and staff
Do you have access to space on a departmental server?
Synchronization software (e.g. Dropbox) can help if you use multiple computers

One chief goal of research data management is to make the research process as efficient as possible – cutting down on the time spent looking for things so you have more time available for the meat of the research process: the analysis, the critical thinking, and the writing. In the short term, the aim of data management is to make sure you can find the information you need when you want it. There are other benefits to organizing information, too – for example, it can help highlight connections that might not otherwise have been obvious. If you realize that your current approach isn’t meeting your needs, are there small (or larger) changes you can make that will make life easier?

One of the keys to good data management is planning ahead. At the start of a research project, you haven’t yet accumulated much material, so it’s usually easy to keep track of everything, and consequently it’s very tempting not to think too much about organization – because you just want to dive in and get on with the research. But it gets progressively harder as you go along and gather more stuff – and it’s much, much easier to think about how you’re going to organize everything at the beginning rather than waiting until things have already descended into chaos. (If you feel that you’re already dealing with chaos, don’t panic! The important thing is to get started: set up a system for the new material you acquire, and deal with the backlog as and when you have time; if you try to deal with the backlog first, it may never happen.) Look at data management as an investment: it takes a little time now, but will save you more in the long run.

It’s also worth thinking about what you’re likely to be doing later on in your research project. While you may not be able to anticipate everything, considering what you’ll ultimately want to do with the information you’re collecting can help avoid time-consuming delays later on, by reducing the need to go back and fill in gaps or reorganize material.

You don’t need me to tell you that the information on your computer is valuable. However, although we all know we should store things securely and back up regularly, it’s very easy not to get around to it. It’s best to have a back-up set to run automatically, and ideally you should have copies in multiple places – so you’re still protected in case of fire or equipment theft. IT Services’ HFS back-up service is available free of charge to university postgraduates and staff, and will store back-ups of your data in three separate places, one of which is outside Oxford (so even if IT Services goes up in flames, your data will be safe). In some cases, you may have the option of storing your data on a server owned by your department. If you work on multiple computers, a file synchronization service is an easy way of keeping the content of multiple machines the same (and provides an additional back-up copy). Dropbox is a popular online service, but plenty of others exist.
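For anyone who likes to script their own safety net alongside a managed service, the idea of keeping verified copies in multiple places can be sketched in a few lines of Python. This is only an illustration, not how HFS or any synchronization service works: the function names `sha256_of` and `back_up` and the destination folders are made up for the example, and a real set-up would point at genuinely separate locations (an external drive, a departmental server) rather than neighbouring folders.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source: Path, destinations: list[Path]) -> list[Path]:
    """Copy `source` into each destination folder and check every copy.

    The destination folders are hypothetical; in practice they should
    live on different devices or in different buildings.
    """
    expected = sha256_of(source)
    copies = []
    for folder in destinations:
        folder.mkdir(parents=True, exist_ok=True)
        # copy2 also preserves timestamps where the platform allows it
        copy = Path(shutil.copy2(source, folder))
        if sha256_of(copy) != expected:
            raise OSError(f"Back-up copy does not match original: {copy}")
        copies.append(copy)
    return copies
```

Comparing checksums after copying is a cheap way to confirm that each back-up is a faithful copy rather than a truncated or corrupted one.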

4 Data management tools
Are you using the best tools for the task in hand?
Don’t struggle on with the wrong software or technique just because it’s what you know
If you’ve ever thought ‘I wish I could…’, maybe someone else has, too – and has written some software to make it possible
Tools exist to do a huge range of jobs – to help organize and analyse information, annotate Web pages, PDFs, or images, and much more
Discover new tools via the Research Skills Toolkit website
You could also ask colleagues for recommendations, or search for online reviews

Make sure you’re using the right tools for the job in hand. It’s sometimes tempting to struggle on with a familiar tool that isn’t really designed for a particular task just because it’s what we know. But if you’re going to be spending a substantial amount of time on something, it will almost certainly save time and effort in the long run to seek out and learn how to use a tool that does the job properly.

If you find yourself thinking ‘I wish I could…’ while in the middle of a tedious or laborious task, there’s a reasonable chance you’re not the first. Very often, you’ll find someone has written a software tool to make it possible. There’s a wide range of software out there designed to do specific jobs – to annotate Web pages, PDF files, or images, for example, or to help organize and analyse ideas, notes, and other types of information. Remember that new software is appearing all the time, so it’s worth checking periodically to see if anything new has appeared.

Bibliographic software is one example of something that is useful in nearly all research fields. As well as using it to store references and to add citations to written work, many people find it helpful for creating a searchable index of their research materials – you can store your notes in it, and attach electronic copies of articles.

There’s a lot of software available free of charge online (though be careful what you download – and make sure you have up-to-date anti-virus software). In some cases, though, it will be worth investing in specialist software – you may find there are special deals available to you if you’re a student or a university staff member. One thing to be wary of (particularly with free software) is ending up with your data locked in to a particular application which might cease to be supported, or if it’s a Web service, even cease to exist. It’s always worth checking whether a program will allow you to export your data in a format you can use elsewhere – and being extra careful to make regular back-ups.

The Research Skills Toolkit website provides an overview of lots of useful software and services. It’s also worth asking your colleagues what they use, and whether they’d recommend it. Searching for online software reviews may also provide pointers.

5 Longer term goals
If you return to your data in a year or two, will it still be intelligible?
Does the format make it clear what everything means?
Are there abbreviations that need explanation?
Is the data adequately documented?
Where did it come from?
Who created it?
What changes have been made to it?
Is any additional information needed to place the data in context?
Are there any restrictions on how it can be used?
What’s your long term storage plan?

A second major aim of data management is to make sure the information you’ve collected remains useful. You want it to be stored safely, and easy to retrieve, not just now, but a few months or a few years (or even a few decades) down the road; you want to be sure that the information will still make sense when you return to it.

When you’re working with a dataset on a daily basis, it’s usually easy to keep track of everything. But if you return to that dataset after a few months or a year or more of working on something else, will it still be obvious what everything means – or are there additional details that need to be included to ensure clarity? Have you used abbreviations or other non-standard conventions? If so, do you need to include a key to ensure you can interpret them correctly later on? Are there brief notes you’ve made for yourself that require some amplification to remain intelligible?

Adequate documentation means ensuring that data is accompanied by whatever additional information is needed to make sense of it and place it in context. This might include:
Details of where the information came from, who created or collected it, how it was gathered, and so on.
Has the data been edited or otherwise manipulated since creation (or if you obtained the dataset from elsewhere, since you received it)? If so, what’s been done to it?
Is there background information that’s needed to enable accurate interpretation? If so, are copies of this (or information about where to find it) stored alongside the data?
Are there restrictions on how the data can be used – because of confidentiality requirements or licensing issues, for example? If so, is it clear exactly what these restrictions are?

Do you have a plan for keeping the data safe beyond the lifetime of the project? For example, if your data is stored or backed up on an institutional server, what happens if you move institutions? (Will the data go with you, or stay at the institution – and if the latter, who will become responsible for it?)
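One low-effort way to keep this kind of documentation from drifting away from the data is to store it in a small text file right next to the dataset. The Python sketch below shows the idea under some stated assumptions: the function name `write_metadata`, the `.metadata.json` suffix, and the field names used in the usage note are all invented for illustration and do not follow any particular metadata standard (a repository or subject librarian may recommend one).

```python
import json
from datetime import date
from pathlib import Path


def write_metadata(data_file: Path, **fields) -> Path:
    """Write a JSON 'sidecar' file describing `data_file`.

    `fields` can hold whatever documentation the dataset needs, for
    example who created it, where it came from, what changes have been
    made, a key to abbreviations, and any restrictions on use.
    """
    record = {
        "file": data_file.name,
        "documented_on": date.today().isoformat(),
        **fields,
    }
    # e.g. survey.csv -> survey.csv.metadata.json, in the same folder
    sidecar = data_file.with_name(data_file.name + ".metadata.json")
    sidecar.write_text(
        json.dumps(record, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return sidecar
```

A call might look like `write_metadata(Path("survey.csv"), creator="A. Researcher", source="Postal survey", changes=["Removed duplicate rows"], abbreviations={"NR": "no response"}, restrictions="Contains confidential responses")`, where every value shown is a placeholder. Because the record is plain text, it should stay readable even if the software used to create the data does not.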

6 Data sharing and curation
Data sharing is…
Good practice – helps make the most of data
Good for you – improve your citation rate
Now required by most major funders
Much easier if planned for early on in a project
Is the data in an appropriate format?
Does it have the relevant documentation and metadata?
Are there confidentiality or IP issues?
Consider depositing data in a repository or archive
There are many subject-specific repositories
From 2013, Oxford will offer DataBank (an institutional digital data archive) and DataFinder (a catalogue of datasets)

Data management also covers what happens to research data at the end of a project. Many research projects will work towards a book, thesis, series of articles, or some other form of written output. But the underlying data generated in the course of the project may also be a valuable output in its own right – whether it’s a set of lab results or survey returns, the outcome of some statistical analysis, a database or spreadsheet you compiled for your own use, a marked-up text, a manuscript transcription, or countless other types of material.

Sharing your data by making it publicly available is a good thing for many reasons. First of all, it benefits the academic community. Other researchers may be able to make use of your data – by sharing it, you’re avoiding duplication of effort, and helping ensure that research funding is used as efficiently as possible. It also allows others to verify the conclusions, helping to ensure research integrity.

Secondly, it’s good for you and your own academic reputation. Datasets can be cited, and additionally, studies have shown that publications for which the underlying data is available often have higher citation rates (see, for example, Piwowar et al. (2007), ‘Sharing Detailed Research Data Is Associated with Increased Citation Rate’, PLoS ONE 2(3)).

Thirdly, most major funding bodies now require that data is made publicly available at the end of a project (it may be worth checking the conditions of your grant if you’re not sure whether this applies to you).

Making data available at the end of a project is made much more straightforward if you start thinking about it from the beginning. That way, you can ensure that data is prepared in a format suitable for sharing – which is likely to save a lot of time-consuming work later on. If the data will need to be accompanied by documentation or metadata to make it intelligible for others, it’s usually easiest to factor this in from the beginning, rather than having to go back and try to add it later. (Metadata is data about the data – this might include information about who compiled it and when, a brief description, keywords, etc.)

In some cases, there may be restrictions that limit your ability to share the data – it may contain confidential information, or there may be intellectual property issues (for example, you may have obtained the data from another source, and hence may not have the right to share it). Often, however, some advance planning can help on this front. For example, if you’re collecting data via interviews or surveys, it’s much easier to ask your subjects at the time whether they’re happy for their responses to be shared than to have to go back and seek consent later on.

One of the best ways of sharing your data is to deposit it in a repository or archive. Repositories offer search tools that will help other researchers to discover your dataset, and once you’ve handed over your data, they will take care of its long-term curation, meaning you don’t have to worry about making sure it stays available. Many repositories also offer other features – such as the ability to embargo data for a certain period, or to require anyone who wants to access it to agree to a set of terms and conditions (which may be important if, for example, the data has intellectual property or confidentiality restrictions). There are many subject-specific repositories, e.g. the Economic and Social Data Service, or the NERC data centres for environmental data.

In 2013, Oxford will launch two new services: an institutional digital data archive called DataBank, and a catalogue of datasets called DataFinder, designed to aid data discovery (and therefore reuse). Datasets made available via DataBank will automatically be included in DataFinder, and researchers will also be strongly encouraged to register data held elsewhere. These will complement Oxford’s existing repository for textual research outputs: ORA, or the Oxford Research Archive.

7 Training and advice
The Oxford University Research Data Management website provides information and guidance
Covers data management planning, back-up and security, sharing and archiving, and more
Bodleian Libraries can advise on curation and description of datasets (metadata), and can assign DOIs
The IT Learning Programme offers courses on a wide range of software
Also database design, working with digital images, etc.
The Graduate Training site and the Skills Hub (both on WebLearn) detail other training opportunities

The University’s Research Data Management website provides guidance and further information about the various services available, and so is a great place to start if you want to follow up any of the topics covered in this presentation.

The Bodleian Libraries can provide advice on many issues relevant to data curation: how to describe your dataset (that is, what metadata you need) and data standards, for example. They can also assign DOIs (permanent unique digital object identifiers) for datasets deposited in DataBank. A network of subject librarians provides discipline-specific support.

If you want to learn more about a particular software package, IT Services’ IT Learning Programme offers a wide range of courses – both for beginners, and for more advanced users who want to learn additional tips and tricks. There are also courses which focus more on general skills rather than specific software – database design, working with digital images, building websites, and so on.

If you want to find out what other training is available, the Graduate Training page lists some of the courses available to graduate students, while the Skills Hub provides links to the university’s training providers. Both of these are available through WebLearn.

