GRAD 521, Research Data Management Winter 2014 – Lecture 7 Amanda L. Whitmire, Asst. Professor.


1 GRAD 521, Research Data Management Winter 2014 – Lecture 7 Amanda L. Whitmire, Asst. Professor

2 Follow-up from last class. What is a reasonable timeline for DCP?
         M   Tu  W   Th  F
WEEK 4   27  28  29  30  31
WEEK 5   3   4   9   10  11
WEEK 6   14  15  16  17  18
WEEK 7   21  22  23  24  25
WEEK 8   28  29  30

3 Overview for today. Why? Where to store data: local drive | network drive | cloud (consider capacity & access by co-workers). Data backup: disaster recovery (research continuity). Data security: corruption or loss (hardware failure or data deletion); confidentiality (personal or intellectual property).

4 Why data storage, backup & security are important. “Your data are the life blood of your research. If you lose your data, recovery could be slow, costly or even worse… it could be impossible.”

5 Most common loss scenario: drive failure

6 This happens a lot: physical theft & unintentional damage. Cute, but not a valid security plan.

7 Rare, unexpected events happen. University of Southampton, School of Electronics and Computer Science, Southampton, UK, 2005

8 It CAN happen to you

9 Real-world lesson: Audit your backups…

10 Data storage options: 1. Personal computers (PCs) & laptops 2. External storage devices 3. Networked drives 4. Cloud servers

11 Storage: PC/laptop. Advantages: convenient. Disadvantages: drive failure is common; laptops are susceptible to theft & unintentional damage; not replicated. Bottom line: do NOT use to store master copies of data; not a long-term storage solution; back up important data & files regularly.

12 Storage: external storage devices. Advantages: convenient, cheap & portable. Disadvantages: longevity not guaranteed (e.g. Zip disks); errors writing to CD/DVD are common; easily damaged, misplaced or lost (= security risk); may not be big enough to hold all data, so multiple drives may be needed. Bottom line: do NOT use to store master copies of data; not recommended for long-term storage.

13 Storage: networked drives. Advantages: data in a single place, backed up regularly; replicated storage is not vulnerable to loss due to hardware failure; secure storage minimizes risk of loss, theft & unauthorized use; available as needed (assuming the network is available). Disadvantages: cost may be prohibitive; export control. Bottom line: highly recommended for master copies of data; recommended for long-term storage (~5 years).

14 Storage: cloud storage. Advantages: data in a single place, backed up regularly; replicated storage is not vulnerable to loss due to hardware failure; secure storage minimizes risk of loss, theft & unauthorized use. Disadvantages: cost may be prohibitive; upload/download bottleneck & fees; longevity?; export control. Bottom line: possibly recommended for master copies of data; not recommended for in-process data or large files.

15 Storage: Google Drive for OSU. Advantages: all the same advantages as network & cloud storage; file sharing & collaboration w/ variable access levels; unlimited storage (GD), 30 GB non-GD; automatic version control on GD. Disadvantages: 30 GB may not be enough; upload/download bottleneck. Bottom line: possibly recommended for master copies of data; possibly not recommended for in-process data or large files.

16 ? ? ? ?

17 Data backup “Keeping backups is probably your most important data management task.” -Everyone

18 Data backup. Best practice: 3 copies of datasets (original; external, local; external, remote).
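The three-copy practice on this slide can be sketched as a small helper that copies one dataset file into two further locations. This is a minimal sketch, not a full backup tool: the function name `replicate` and the destination folders are hypothetical, and in practice the "remote" destination would be a mounted network or cloud location rather than a local folder.

```python
import shutil
from pathlib import Path


def replicate(source, destinations):
    """Copy one file into each destination folder and return the new paths.

    `destinations` are hypothetical folders standing in for an external
    local drive and an external remote location.
    """
    source = Path(source)
    written = []
    for destination in destinations:
        target_dir = Path(destination)
        # Create the destination folder if it does not exist yet.
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / source.name
        # copy2 copies the contents and tries to preserve timestamps.
        shutil.copy2(source, target)
        written.append(target)
    return written
```

Counting the original, two destinations give the three copies named on the slide. Copying alone does not prove the copies are readable, so it is worth pairing with a verification step.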

19 Backups: full. Advantages: data can be easily & fully restored from a recent full backup. Disadvantages: time consuming; takes up the most storage. Bottom line: recommended for master copies of data; frequency depends on data size & mutability.

20 Backups: differential. Advantages: data can be easily & fully restored from a full backup + 1 differential backup. Disadvantages: size of each differential backup increases each time; backup window increases each time. Bottom line: frequency depends on data size & mutability.

21 Backups: incremental. Advantages: smallest file size between backups (full or incremental); shortest backup window. Disadvantages: to restore data, the full backup + all incremental backups are required, which makes for a more difficult restore scenario. Bottom line: frequency depends on data size & mutability.
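The difference between full, differential and incremental backups comes down to which files each one copies and which backups a restore needs. The sketch below illustrates that logic under simplifying assumptions: files are compared by modification time only, and the names `FileInfo`, `select_for_backup` and `restore_chain` are hypothetical, not part of any backup product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    path: str
    modified: float  # modification time (any consistent clock)


def select_for_backup(files, kind, last_full, last_backup):
    """Return the paths that a backup of the given kind would copy.

    last_full   -- time of the most recent full backup
    last_backup -- time of the most recent backup of any kind
    """
    if kind == "full":
        return [f.path for f in files]  # everything, every time
    if kind == "differential":
        # everything changed since the last FULL backup
        return [f.path for f in files if f.modified > last_full]
    if kind == "incremental":
        # only what changed since the last backup of ANY kind
        return [f.path for f in files if f.modified > last_backup]
    raise ValueError(f"unknown backup kind: {kind}")


def restore_chain(history):
    """Return the indices of the backups needed to restore the latest state.

    `history` lists backup kinds in chronological order and must contain
    at least one "full" entry.
    """
    last_full = max(i for i, kind in enumerate(history) if kind == "full")
    later = list(range(last_full + 1, len(history)))
    if later and all(history[i] == "differential" for i in later):
        later = later[-1:]  # only the newest differential is needed
    # Otherwise keep every later backup (conservative for mixed histories).
    return [last_full] + later
```

With a full backup followed by two differentials, `restore_chain` asks for two backups; with a full backup followed by two incrementals, it asks for all three, which is the harder restore scenario the slides describe.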

22 Backups: bottom line. Pick a strategy. Be consistent. Test your approach!
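One way to test a backup approach is to audit a backup folder against the original by comparing checksums. The following is a minimal sketch assuming both copies are reachable as ordinary folders; the function names `sha256_of` and `audit_backup` are hypothetical, and SHA-256 is just one reasonable choice of checksum.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_backup(original_dir, backup_dir):
    """Compare every file under original_dir with its copy in backup_dir.

    Returns a dict mapping relative paths to "missing" or "differs".
    An empty dict means every original file has an identical copy.
    """
    original_dir, backup_dir = Path(original_dir), Path(backup_dir)
    problems = {}
    for source in sorted(original_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(original_dir)
        copy = backup_dir / relative
        if not copy.is_file():
            problems[relative.as_posix()] = "missing"
        elif sha256_of(copy) != sha256_of(source):
            problems[relative.as_posix()] = "differs"
    return problems
```

Running an audit like this on a schedule catches silent failures, such as a backup job that stopped copying new files, before the backup is actually needed.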

23 Data security “Data security is the means of ensuring that research data are kept safe from corruption and that access is suitably controlled.”

24 Data security. It is important to consider the security of your data to prevent: accidental or malicious damage/modification to data; theft of valuable data; breach of confidentiality agreements and privacy laws; premature release of data, which can void intellectual property claims; release before data have been checked for accuracy and authenticity.

25 Data security. There are different levels of security to consider for your research data. Access: the mechanisms for limiting the availability of your data. Systems: protecting your hardware and software systems. Data integrity: the mechanisms for ensuring that your data are not manipulated in an unauthorized way.

26 Data security: access. Limit the availability of your data. ID/password: step 1, for everyone really. Role-based access: limited privileges/permissions to data depending on the user. Wireless devices: lack anti-virus software and firewalls; vulnerable to data theft & theft of the device; use a PIN and limit storage of sensitive data on the device.
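Role-based access can be pictured as a lookup from a user's role to the actions that role is permitted to take. The roles and actions below are hypothetical examples chosen for illustration, not an OSU policy; a real system would enforce this through the storage platform's own permission settings.

```python
# Hypothetical roles and the actions each may perform on a dataset.
ROLE_PERMISSIONS = {
    "principal_investigator": {"read", "write", "delete", "share"},
    "lab_member": {"read", "write"},
    "external_collaborator": {"read"},
}


def is_allowed(role, action):
    """Return True if the given role may perform the given action.

    Unknown roles get no permissions at all (deny by default).
    """
    return action in ROLE_PERMISSIONS.get(role, set())
```

Denying by default means that a misspelled or unlisted role cannot gain access by accident.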

27 Data security: systems. Protect your hardware & software systems. Anti-virus software: required of all OSU computers. OS & media software: keep them up to date. Firewalls: block unwanted network traffic from reaching your computer or server (e.g. typical home router). Intrusion detection software: detects & alerts, does not prevent. Physical access: locked office; password on wake; cable lock for laptops.

28 Data security: data integrity. Protect the integrity of your data at the file level. Encryption: the process of converting data into an unreadable code. You must have access to a password or a secret encryption key to be able to read an encrypted file. Check with the OSU Data Security team for advice (no “one size fits all” solution). Electronic signatures: meant to ensure the authenticity of the signer and, by extension, the document; now carry legal significance. Watermarking: embeds a digital marker for authorship verification and can alert someone to alterations made to data files; most often used w/ images & media.
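A simple way to detect unauthorized alteration of a data file is a keyed integrity tag. The sketch below uses HMAC-SHA256 from the Python standard library; note that this is a swapped-in illustration of file-level integrity checking, not encryption, a legally recognized electronic signature, or watermarking, and the function names `make_tag` and `is_unmodified` are hypothetical.

```python
import hashlib
import hmac


def make_tag(data: bytes, key: bytes) -> str:
    """Return an HMAC-SHA256 tag for the data, as a hex string.

    Only someone who holds the secret key can produce a matching tag.
    """
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def is_unmodified(data: bytes, key: bytes, expected_tag: str) -> bool:
    """Return True if the data still matches the tag recorded earlier."""
    # compare_digest avoids leaking information through timing.
    return hmac.compare_digest(make_tag(data, key), expected_tag)
```

The tag would be recorded when the file is finalized and stored separately from the data; any later change to the file's bytes makes the check fail. The data themselves remain readable, so confidential files still need encryption.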

29 ? ? ? ?

30 Exercise Complete the ‘Data Storage, Backup & Security Checklist’

