Exploring the lifecycle


1 Exploring the lifecycle
RDMRose: Research Data Management for LIS
Session 3: The Digital Curation Lifecycle
Session 3.1: Exploring the lifecycle
Nov-18 Learning material produced by RDMRose

2 Learning outcomes At the end of this session you will be able to:
Explain the DCC Curation Lifecycle Model
Reflect upon the relevance of lifecycle actions to your role and to the institution
Further develop ideas on how aspects of digital curation might be explained to students and researchers

3 Session overview
Background
Target audience
The lifecycle actions:
  8 sequential actions
  3 occasional actions
  4 full lifecycle actions

4 Digital Curation Centre
The DCC Curation Lifecycle Model is an authoritative generic model outlining what the umbrella term RDM (research data management) consists of.
It outlines the activities that are required to successfully curate research data throughout its entire lifecycle.
It was developed by the DCC (Digital Curation Centre), the leading centre of expertise in digital information curation.
You will explore the DCC and its extensive website in session 4.1.

5 Target audience of the model
The model describes an idealised situation: curation is planned from the very beginning, and planned for throughout the lifecycle.
You can start at any point and use the model to identify gaps and undertake appropriate actions.
According to the DCC (2012c) this model is relevant to:
  Data creators
  Data archivists/curators
  Data (re)users

6 Background
The DCC Curation Lifecycle Model is based on the OAIS Reference Model.
OAIS = Open Archival Information System (pictured).
OAIS is a model that defines a generic framework for building a digital archive.
You can find out more about this influential model from the excellent introductions by Lavoie (2004) and Ball (2006).

7 Background
The DCC Curation Lifecycle Model adds to the OAIS Reference Model.
It includes activities that take place outside the archival system: the research lifecycle.
In particular: the creation of data, and the use and reuse of data.

8 DCC Curation Lifecycle Model

9 Actions
Three sets of actions:
  Sequential Actions (8): key actions needed as data move through their lifecycle
  Occasional Actions (3): occur only when special conditions are met, so they do not apply to all data
  Full Lifecycle Actions (4): apply to all stages in the lifecycle

10 Actions
Sequential actions (8):
  Conceptualise
  Create or receive
  Appraise and select
  Ingest
  Preservation action
  Store
  Access, use and reuse
  Transform

11 Actions
Occasional actions (3):
  Dispose
  Re-appraise
  Migrate

12 Actions
Full Lifecycle Actions (4):
  Description and representation information
  Preservation planning
  Community watch and participation
  Curate and preserve
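The three sets of actions listed on slides 10 to 12 can be written down as a small lookup structure, for example to check which sequential actions a project has not yet addressed (the "identify gaps" use mentioned on slide 5). This is only an illustrative sketch: the action names come from the slides, while `ActionKind`, `LIFECYCLE_ACTIONS` and `missing_sequential_actions` are hypothetical names, not part of any DCC tool.

```python
from enum import Enum


class ActionKind(Enum):
    SEQUENTIAL = "sequential"
    OCCASIONAL = "occasional"
    FULL_LIFECYCLE = "full lifecycle"


# Action names are taken from the slides; the structure itself is an assumption.
LIFECYCLE_ACTIONS = {
    ActionKind.SEQUENTIAL: [
        "Conceptualise",
        "Create or receive",
        "Appraise and select",
        "Ingest",
        "Preservation action",
        "Store",
        "Access, use and reuse",
        "Transform",
    ],
    ActionKind.OCCASIONAL: ["Dispose", "Re-appraise", "Migrate"],
    ActionKind.FULL_LIFECYCLE: [
        "Description and representation information",
        "Preservation planning",
        "Community watch and participation",
        "Curate and preserve",
    ],
}


def missing_sequential_actions(covered):
    """Return the sequential actions not in `covered`, in lifecycle order."""
    return [
        action
        for action in LIFECYCLE_ACTIONS[ActionKind.SEQUENTIAL]
        if action not in covered
    ]
```

Calling `missing_sequential_actions({"Conceptualise", "Store"})` would list the six remaining sequential actions in the order the model presents them.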

13 Action 1 Conceptualise
Sequential Action 1
Aim: Designing research projects (and grant proposals) with digital curation in mind, to produce curation-ready data
Rusbridge (2008): “Repeat after me: curation begins before creation!”

14 Information entropy
See Michener et al. (1997, fig. 1) for a graphic representation of “the normal degradation in information content associated with data and metadata over time (‘information entropy’).”

15 Key activities
DCC’s key activities include planning for:
  Data capture and storage in curation-friendly file formats (open standards)
  Recording sufficient information at the time of data capture to assist with ongoing management of those data and with their use
  Scrupulous identification of files
  Data storage on appropriate media
  Identification of a safe place for the data and ensuring that an archive will take them

16 Action 2 Create or receive
Sequential Action 2
Aims: Creating or receiving digital data that is curation-ready
DCC’s key activities:
  Researcher: Create data that is curation-ready, including administrative, descriptive, structural and technical metadata. Preservation metadata may also be added at the time of creation.
  LIS professional: Receive data, in accordance with documented collecting policies, from data creators, other archives, repositories or data centres, and if required assign appropriate metadata.

17 Data quality
Authentic: be what it purports to be.
Reliable: have trusted contents.
Have integrity: be complete and unaltered.
Usable: can be located, retrieved, presented and interpreted.
(Based on Higgins, 2012, p. 20 and ISO )

18 Metadata
Descriptive: ensures identification, location and retrieval.
Technical: records the technical infrastructure used to create or access the data.
Administrative: for management of data such as acquisition, appraisal decisions, and IPR.
Use: manages access rights and usage.
Preservation: records preservation actions, such as checksums.
(Based on Higgins, 2012, p. 38.)
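To make the five metadata categories concrete, a dataset record could be sketched as one container with a slot per category. This is a minimal illustration under stated assumptions: `DatasetMetadata` and `empty_categories` are hypothetical names, and the keys inside each category would in practice come from whichever metadata standard an institution adopts, not from this sketch.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class DatasetMetadata:
    """One dictionary per metadata category named on the slide (hypothetical layout)."""

    descriptive: dict = field(default_factory=dict)     # identification, location, retrieval
    technical: dict = field(default_factory=dict)       # infrastructure used to create/access
    administrative: dict = field(default_factory=dict)  # acquisition, appraisal, IPR
    use: dict = field(default_factory=dict)             # access rights and usage
    preservation: dict = field(default_factory=dict)    # preservation actions, e.g. checksums

    def empty_categories(self):
        """Return the names of categories that have no entries yet."""
        return [name for name, value in asdict(self).items() if not value]
```

A record created with only `descriptive` and `preservation` filled in would report `technical`, `administrative` and `use` as still empty, which is one simple way to spot missing documentation at the point of creation or receipt.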

19 Action 3 Appraise and select
Sequential Action 3
“the process of evaluating material in order to decide which to retain over the long term, which to retain for the meantime, and which to discard.” (Higgins, 2012, p. 28)
DCC’s key activities:
  Evaluate data and select for long-term curation and preservation
  Adhere to documented guidance, policies or legal requirements

20 Why appraise and select?
Digital content expands.
Backup and mirroring increase costs.
Discovery gets harder.
Managing and preserving is expensive.
(Based on Whyte and Wilson, 2010.)

21 Significance
Appraisal = “determination of significance” (Harvey, 2010, p. 132)
What data do you need/want to keep?
  Which datasets or digital resources do you want to keep?
  Which characteristics or elements of these datasets or resources do you want to keep? (Designated community, representation information.)
How long do you need/want to keep the data?
  E.g. in terms of user requirements (as evidence for verifying conclusions) or risks of not keeping the data.

22 Criteria
General appraisal criteria (Whyte & Wilson, 2010):
  Relevance to mission of the repository
  Scientific or historical value (inferring anticipated future use)
  Uniqueness (the only or most complete source? At risk of loss if not accepted?)
  Potential for redistribution (depending on reliability, integrity and usability of the data; legal issues may limit this)
  Non-replicability (not feasible or impossible)
  Economic case (costs vs potential future benefits, available funding)
  Full documentation (to facilitate discovery, access, reuse etc.)

23 Occasional actions
Occasional Actions related to appraise and select:
  Re-appraise
  Dispose

24 Action 4 Ingest
Sequential Action 4
DCC’s key activities:
  Transfer data to an archive, repository, data centre or other custodian
  Adhere to documented guidance, policies or legal requirements
The term “Ingest” was introduced by the Open Archival Information System (OAIS) Reference Model.

25 Key activities for ingest
Preparing the data for placing in long-term storage could involve (identified by the CAIRO project):
  Assigning a persistent identifier
  Checking that the data does not contain malware
  Extracting, creating and assigning description and representation information
  Creating fixity values
  Confirming technical details such as file formats
  Combining the data and their associated metadata into an Archival Information Package
  Migrating data to a different file format
(DCC, n.d. a)
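"Creating fixity values" usually means recording a checksum for each file at ingest so that later changes can be detected. The sketch below shows one way this might look using SHA-256 from Python's standard library; the choice of algorithm, the function names `sha256_of` and `build_manifest`, and the manifest layout are assumptions for illustration, not a procedure prescribed by the DCC or the CAIRO project.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file under `root` (as a relative POSIX-style path) to its fixity value."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }
```

The resulting dictionary could be stored alongside the data as preservation metadata, so that the same values are available when integrity is checked during storage.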

26 Action 5 Preservation Action
Sequential Action 5
Aims: To ensure that data remains authentic, reliable and usable while maintaining its integrity (data quality).
DCC’s key activities: Undertaking actions to ensure long-term preservation and retention of the authoritative nature of data.

27 Preservation actions and methods
DCC’s specific actions:
  Data cleaning (detecting and correcting/removing corrupt or inaccurate data)
  Validation
  Assigning preservation metadata
  Assigning representation information
  Ensuring acceptable data structures or file formats (open standards)
Preservation methods (Lord & Macdonald, 2003):
  Migration
  Emulation
Also:
  Formal descriptions (UVC)
  Digital archaeology
  Computer museums

28 Significant properties (Higgins, 2012, p. 34)
Preservation actions are undertaken to ensure data retains its significant properties. These could be a choice of:
  Look and feel: does the migration need to retain formatting which gives it a specific appearance or is it sufficient to maintain the contents?
  Structure: are there relationships between constituent parts which need to be retained?
  Functionality: does certain functionality, such as hyperlinks to other material or embedded comments, need to be retained?
  Interoperability: does the data need to retain interoperability with other datasets?

29 Action 6 Store
Sequential Action 6
DCC’s key activities:
  Storing the data in a secure manner adhering to relevant standards
  This includes the storage facilities themselves, including refreshment of storage media to avoid hardware obsolescence or bit-rot
  And the administration of the data storage service with appropriate policies

30 Specific activities (Harvey, 2010)
Develop, maintain, and apply policies relating to secure data storage
Ensure that sufficient description and representation information is stored with data
Use a reliable storage medium, preferably on more than one carrier and with geographically distributed backup systems
Monitor events that might trigger other preservation actions (e.g., file format migration, file corruption)
Regularly check to ensure the integrity of the stored data and their description and representation information
Ensure system and physical security
Maintain and replace the technical infrastructure as necessary
Develop, and administer as necessary, data recovery procedures
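The regular integrity check in the list above can be sketched as a comparison between checksums recorded earlier and checksums recomputed from the stored files. This is a hedged illustration only: `sha256_of`, `verify_manifest`, and the manifest format (relative path mapped to a SHA-256 hex digest) are assumptions, not taken from Harvey (2010) or the DCC.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(root, manifest):
    """Return {relative_path: "missing" | "changed"} for files that fail the check.

    `manifest` maps relative paths to the digests recorded at an earlier time.
    An empty result means every listed file is present and unchanged.
    """
    root = Path(root)
    problems = {}
    for relative_path, expected in manifest.items():
        target = root / relative_path
        if not target.is_file():
            problems[relative_path] = "missing"
        elif sha256_of(target) != expected:
            problems[relative_path] = "changed"
    return problems
```

A non-empty result is the kind of event that might trigger another preservation action, such as restoring the affected file from a geographically separate backup.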

31 Action 7 Access, use and reuse
Sequential Action 7
Aims: Data can be located, and used and reused by legitimate users
DCC’s key activities:
  Ensuring that data is accessible to both designated users and reusers, on a day-to-day basis, usually (but not necessarily) in the form of publicly available published information
  Applying robust access controls and authentication procedures where applicable

32 Specific actions
Ensuring data can be discovered (located) by applying standards that ensure appropriate metadata are present
Ensuring that the required legal permissions are available for data to be used and reused, and that legal restrictions on the use and reuse of data are adhered to (funding bodies, legislation about confidentiality and privacy, IPR)
Providing tools that allow collaboration in the use and reuse of data (e.g. annotation)
Ensuring data is accessible only by authorised users, by applying access controls and authentication procedures

33 Action 8 Transform
Sequential Action 8
DCC’s key activities: Create new data from the original
Methods:
  Creating a subset (by selection or query) to create newly derived results, for verification of results, or as the basis of further research
  Migration into a different format (migration changes data)
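Both transform methods can be illustrated with a few lines of standard-library Python: selecting a subset of rows from tabular data, then migrating that subset to another format. The sketch assumes CSV input and JSON output purely for illustration; `subset_rows`, `migrate_to_json`, and the column names in the usage note are hypothetical.

```python
import csv
import io
import json


def subset_rows(csv_text, predicate):
    """Select the rows of a CSV document (with a header line) for which predicate(row) is true."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader if predicate(row)]


def migrate_to_json(rows):
    """Serialise the selected rows as JSON.

    Note that this is a format migration: every value read from CSV is a
    string, so numeric columns arrive in the JSON as strings unless they
    are converted explicitly. This is one small way migration changes data.
    """
    return json.dumps(rows, indent=2)
```

For example, with a CSV containing `site`, `year` and `count` columns, `subset_rows(text, lambda row: int(row["year"]) >= 2011)` would keep only the later records, and `migrate_to_json` would then produce the derived dataset in the new format.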

34 Occasional action
Occasional action related to transform: Migrate

35 Activity 3.1.1 Exploring the DCC curation lifecycle

36 Activity 3.1.1 Exploring the DCC curation lifecycle
For the sequential actions in the DCC Lifecycle Model, discuss one or more of the following questions on your activity sheet:
  What role could LIS professionals play in this Action and what skills would they need?
  How could/should an institution support this Action?
  How would you explain this Action to undergraduates, postgraduates and researchers?

37 Full lifecycle Actions: Curate and Preserve
DCC’s key activities:
  Being aware of management and administrative actions planned to promote curation and preservation throughout the curation lifecycle
  Undertaking management and administrative actions planned to promote curation and preservation throughout the curation lifecycle

38 Full lifecycle Actions: Community Watch and Participation
DCC’s key activities:
  Maintaining a watch on appropriate community activities
  Participating in the development of shared standards, tools and suitable software

39 Specific activities
DCC’s indicative activities for data curators include:
  Keeping up-to-date with data curation activities and developments in related areas (session 4)
  Sharing data and participating in other activities which form the basis of data reuse
  Participating in the development of standards for data curation
  Participating in the development of tools and toolkits for data curation

40 Full lifecycle Actions: Preservation Planning
DCC’s key activities:
  Planning for preservation throughout the curation lifecycle of digital material
  Developing and applying plans for management and administration of all curation lifecycle actions

41 Full lifecycle Actions: Description and representation information
DCC’s key activities:
  Assigning administrative, descriptive, technical, structural and preservation metadata, using appropriate standards, required to ensure adequate description and control over the long term
  Collecting and assigning representation information, which is required to understand and render both the digital material and the associated metadata

42 Activity 3.1.2 How usable is the model?

43 Activity 3.1.2 How usable is the model?
How is the model different from a library’s typical emphasis on collection development for access (as opposed to preservation)?
If you were a researcher, how would you interpret this model?
How usable is the DCC Lifecycle Model for you?
Write down your initial thoughts. Then discuss with a partner.

44 Activity 3.1.3 Alternative lifecycle models

45 Activity 3.1.3 Alternative lifecycle models
Look at the Review of Data Management Lifecycle Models by A. Ball (2012) at
The document gives an overview of 8 alternative models, including the DCC Curation Lifecycle Model.
For the UK Data Archive lifecycle model (number 7 in Ball’s review) there is also a visual representation available on
Examine the models. Which of these, if any, would you prefer to use when discussing RDM with a researcher, and why?

46 Sources and References

47 Sources Slides on the DCC Curation Lifecycle Model are based on:
DCC (n.d. a). Digital 101 materials. Edinburgh: Digital Curation Centre. Retrieved from
DCC (n.d. b). DCC Charter and Statement of Principles. Edinburgh: Digital Curation Centre. Retrieved from
DCC (n.d. c). Lifecycle Model FAQ. Edinburgh: Digital Curation Centre. Retrieved from

48 References
Ball, A. (2006). Briefing Paper: the OAIS Reference Model. Retrieved from
Ball, A. (2012). Review of Data Management Lifecycle Models. Bath: University of Bath. Retrieved from
Donnelly, M. (2012). Data management plans and planning. In G. Pryor (Ed.). Managing Research Data (pp ). London: Facet.
Harvey, R. (2010). Digital Curation: A How-To-Do-It Manual. London: Facet.
Higgins, S. (2012). The lifecycle of data management. In G. Pryor (Ed.). Managing Research Data (pp ). London: Facet.

49 References
Lavoie, B.F. (2004). The Open Archival Information System Reference Model: introductory guide. Dublin, Ohio; York: OCLC Online Computer Library Centre; Digital Preservation Coalition. Retrieved from
Lord, P. & Macdonald, A. (2003). Data Curation for e-Science in the UK: An Audit to Establish Requirements for Future Curation and Provision. Twickenham: The Digital Archiving Consultancy.
Michener, W.K., Brunt, J.W., Helly, J.J., Kirchner, T.B., & Stafford, S.G. (1997). Nongeospatial metadata for the ecological sciences. Ecological Applications, 7(1),
Rusbridge, C. (2008). Project data life course. Blogs. Edinburgh: Digital Curation Centre,
Whyte, A., & Wilson, A. (2010). How to Appraise & Select Research Data for Curation. Edinburgh: Digital Curation Centre,

