Demystifying the data interview SEQld Data Intensive - 30 January 2015 Kathryn Unsworth
2 Image of Barack Obama and his team laughing aboard Air Force One, with the caption “And then I told them – we’re only collecting your metadata”. Not displayed, as it may breach copyright regulations.
3 Metadata for impact: making RIF-CS work for data producers. In RIF-CS, certain elements are required (or mandatory) while others are recommended (or optional). Creating metadata descriptions involves some effort, so how should researchers decide which recommended elements to include in their data descriptions? A good way to approach this is to consider what your institution wants to achieve by publishing data, particularly via Research Data Australia (RDA), and, as mentioned previously, how you expect people to search for and reuse research data.
4 So… What does your institution want to achieve by publishing data via RDA? ▪ Goal 1: We want our data to be highly visible and easy to find (discovery) ▪ Goal 2: We want to know who is reusing our data and how often (citation) ▪ Goal 3: We want our data to be widely shared and reused across the research community (reuse)
6 Goal 2: Citation ▪ citationMetadata ▪ DOIs ▪ DCI encoding guidance ▪ Closely linked to reuse
7 Goal 3: Reuse ▪ Rights and licensing ▪ Description includes provenance and reuse information ▪ Direct access to the data ▪ Closely linked to citation
8 Scenarios for data producers (researchers):
9 We want to highlight our ‘open’ data. The transparency of our research is important to our reputation. Which RIF-CS elements are important to us? The ‘openness’ of data is defined by several characteristics, so describing it requires more than one element: ▪ Electronic address attribute – download link for the dataset (direct download or link to a landing page); can also include additional information such as title, media type, byte size and notes ▪ Access rights – choose ‘open’ from the suggested vocabulary to indicate the data are free to access and reuse (the other choices are ‘conditional’ and ‘restricted’) ▪ Licence – choose an open licence such as CC-BY to ensure the researcher is attributed when the data are reused.
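The three characteristics above can be checked mechanically before a record is published. The following is a minimal Python sketch, not an ANDS tool: the record is a plain dictionary, and the key names (`electronic_address`, `access_rights`, `licence`) and the `OPEN_LICENCES` set are assumptions made for illustration, not RIF-CS element names or an official vocabulary.

```python
# Hypothetical sketch: key names and OPEN_LICENCES are illustrative assumptions.
OPEN_LICENCES = {"CC-BY"}  # the slide names only CC-BY; extend as your institution decides


def openness_gaps(record):
    """Return a list of reasons why a record would not read as 'open'."""
    gaps = []
    address = record.get("electronic_address") or {}
    if not address.get("url"):
        gaps.append("no download link (direct download or landing page)")
    if record.get("access_rights") != "open":
        gaps.append("access rights are not 'open'")
    if record.get("licence") not in OPEN_LICENCES:
        gaps.append("no open licence such as CC-BY")
    return gaps


example = {
    "electronic_address": {
        "url": "https://example.org/data.csv",  # made-up URL
        "media_type": "text/csv",
    },
    "access_rights": "open",
    "licence": "CC-BY",
}
print(openness_gaps(example))
```

An empty list means all three characteristics are described; each missing one is reported separately, which mirrors the point that openness needs more than one element.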
10 We want citation metrics for our data, the same as for our publications. It’s another way to demonstrate the impact of our research, and we might identify new collaborators. Which RIF-CS elements are important to us? ANDS has established a process for harvesting records from a data source in RDA to the Thomson Reuters Data Citation Index (DCI), enabling metrics for citation of data to be counted in the same way as for publications: ▪ Citation metadata – describes in a structured way (human and machine readable) how the data should be cited, which means citations can be readily identified by indexing services such as the DCI ▪ Identifier – if possible, assign a DOI to the data. DOIs uniquely identify a dataset and are considered ‘best practice’ for the accurate capture of citation metrics.
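To illustrate what structured citation metadata makes possible, here is a minimal Python sketch that assembles a human-readable citation from separate fields and runs a loose shape check on the DOI. The field names, the citation layout (creators, year, title, publisher, DOI link) and the regular expression are assumptions for illustration; they are not the RIF-CS citationMetadata schema or the DCI encoding guidance.

```python
import re

# Loose shape check only (an assumption, not a full DOI validator):
# "10.", a numeric registrant code, "/", then a non-empty suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def looks_like_doi(value):
    """Return True if the value has the general shape of a DOI."""
    return bool(DOI_PATTERN.match(value))


def format_citation(meta):
    """Build a citation string from hypothetical structured fields."""
    if not looks_like_doi(meta["doi"]):
        raise ValueError("identifier does not look like a DOI: " + meta["doi"])
    creators = "; ".join(meta["creators"])
    return "{} ({}): {}. {}. https://doi.org/{}".format(
        creators, meta["year"], meta["title"], meta["publisher"], meta["doi"]
    )


example = {
    "creators": ["Researcher, A.", "Researcher, B."],
    "year": 2015,
    "title": "Example dataset",
    "publisher": "Example University",
    "doi": "10.1234/example-5678",  # made-up DOI for illustration
}
print(format_citation(example))
```

Because each part of the citation is held in its own field, the same record can feed both a readable citation and a machine harvest, which is the property the slide attributes to citation metadata.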
11 How do we in research support (data describers) help researchers create the metadata their research deserves? I’d like to share what another institution is doing to engage researchers!
12 Our colleagues at Edith Cowan University have been on a mission to engage with researchers for the dual purpose of raising awareness of RDM and building their research data collections. The following slides describe the steps they have taken in this process:
13 Step 1: We sent a first invite to the targeted researcher for an interview session. (Note: For the purpose of this project, we excluded all the Category 1 researchers, as their RDM interviews are taken care of by our Office of Research and Innovation.) As we already have an ECU Research Data Management Policy in place, links to the policy, the RDA webpage and the ECU Institutional Repository (Research Online) Dataset pages were included in the invite. Purpose: It made things easier for us when the researcher was made aware that the institution is committed to RDM. The Policy is expected to be implemented university-wide in […]. We tried to get buy-in from the researchers with the argument that now is a good time to start thinking about how they can meet the mandatory requirements of the policy when it is officially implemented.
14 Step 2: Once the interview date was set, we profiled the researcher's research area via their published articles. We specifically looked out for any existing datasets that may have already been published via Dryad, PLOS, Nature etc. We developed a form to guide us in this process (Researcher Meeting Profile Checklist). Purpose: This helped us to understand who we were dealing with and how we wanted to engage with the researcher.
15 Step 3: We also brought some sales kit items along to the interview session to show exactly how the dataset records will appear on our Institutional Repository webpage, how the metadata is represented on the RDA webpage and how the dataset can be linked to the related publications we have in the Institutional Repository. We printed these examples on A3 paper and had them laminated. Purpose: We found that showing something tangible to the researchers helped them to understand what we plan to do with their dataset and gave them an idea of what the end result might look like. Sometimes we were only given 30 minutes for the interview session, so we did not want to waste time logging on to different web pages and distract the researcher from the discussion we were having with them.
16 Step 4: During the interview session, we also introduced a list of the metadata that we would need from them to create a dataset record in our Institutional Repository (Researcher Meeting – Metadata fields required). During the project, we quickly learned that obtaining research data from our researchers is certainly not a straightforward process. For completed and current research projects, most researchers realised that they had issues with the types of restrictions they had imposed on themselves at the Ethics Committee application stage. Many were unaware that there are various options for dealing with data, even sensitive data. For future research projects, we realised that many institutional changes have to start from the very top level and be integrated into the administration of the university's research activities.
17 “We have learned so much from the ANDS team and other librarians during our Gold Coast training session and we are happy to be able to share our experience in return.” Poh Lin Teow, Librarian/ICT Business Analyst, Edith Cowan University
18 Learning together: A researcher has just had an article accepted in PLOS and has found out that the article won’t be published unless the supporting data are available. He asks you for help with getting the data record ready for publication in your repository.
19 This work is licensed under a Creative Commons Attribution 3.0 Australia License. ANDS is supported by the Australian Government through the National Collaborative Research Infrastructure Strategy (NCRIS).