Documenting and organising your data For an easier life lib.uts.edu.au utslibrary.

1 Documenting and organising your data For an easier life lib.uts.edu.au utslibrary

2 Over the next 60ish mins: Why this stuff matters Metadata Tagging and file hierarchies File naming and renaming Version control

3 Documenting your data

4 So what might this be?

5 Why document? Enables you to understand/interpret data Tells the story of where the data came from Ensures informed and correct use, reduces chance of incorrect use/misinterpretation

6 What to document? Wider contextual information Data collection methodology and processes Information on dataset structure Variable-level documentation Data confidentiality, access and use conditions

7 Bad vs Good http://figshare.com/articles/Excel_database_of_the_PhD_thesis/1360019 http://figshare.com/articles/Main_Dataset_for_Evolution_of_Popular_Music_USA_1960_2010_/1309953

8 Let’s get organised

9 Why? You think you’ll remember things, but over time… Multitude of formats and versions of data and documentation Investment of time at the beginning can save time in the long run Good file management practices/naming protocols enable sharing with collaborators

10 Can you relate? Experimentdata.txt Laurensdata.dat Data:currentversion.dat Todaysimage.tif ReportDraft.doc ReportFinal.doc ReportFinalv2LastOne.doc ReportFinalFinal.doc

11 Some filing principles There’s no single right way to do it Establish and document a system that works for you Strike the balance between doing too much and too little: be realistic The 5 Cs: be Clear, Concise, Consistent, Correct, and Conformant

12 Hierarchical or Tag-based Hierarchical – Items are organised in folders and sub-folders Tag-based – Each item assigned one or more tags Often used in combination

13 Hierarchical filing The good: Familiar and widely used; good at representing the structure of information – constructing the hierarchy can itself be a helpful exercise; similar items are stored together; sub-folders can function as task lists. The not so good: Surprisingly hard work to set up and maintain – ‘a heavyweight cognitive activity’; can be hard to get the right balance between breadth and depth; items can only go in one place; time consuming to re-organise if the hierarchy becomes out of date.

14 Sample folder hierarchy from the UK data archive

15 Tag-based filing The good: Items can go in more than one category – and multiple types of category can be used; many people find tagging quicker and easier than hierarchical filing; can be easier to combine than hierarchical systems when collaborating; you can search for tags in Finder and Windows Explorer. The not so good: Not how operating systems store files; if material isn’t tagged properly at first it can be hard to find later; inconsistent tagging is common; similarly named categories can get mixed; less good at representing the structure of information.
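
To make the tagging idea concrete, here is a minimal Python sketch of a tag index. It is an illustration only, not something from the workshop: the file names, the tags and the helper names `build_tag_index` and `find_by_tags` are all hypothetical. Tags are lower-cased and trimmed before they are stored, which is one simple way to soften the inconsistent-tagging problem.

```python
from collections import defaultdict


def build_tag_index(tagged_files):
    """Map each normalised tag to the set of files that carry it.

    `tagged_files` is a dict of file name -> list of tags (hypothetical data).
    """
    index = defaultdict(set)
    for file_name, tags in tagged_files.items():
        for tag in tags:
            # Normalise so "Interview", "interview " and "INTERVIEW" match.
            index[tag.strip().lower()].add(file_name)
    return index


def find_by_tags(index, *tags):
    """Return the files that carry every one of the requested tags."""
    matches = [index.get(tag.strip().lower(), set()) for tag in tags]
    return set.intersection(*matches) if matches else set()


# Hypothetical example data: one file can sit in several categories at once.
files = {
    "FG1_CONS_12Feb10.txt": ["Focus group", "2010"],
    "Int024_AP_5June08.txt": ["interview", "2008"],
    "Int025_AP_9June08.txt": ["Interview ", "2008"],
}
index = build_tag_index(files)
print(sorted(find_by_tags(index, "interview", "2008")))
```

Unlike a folder hierarchy, the same file can be found under any of its tags, but the index is only as good as the tags that were entered in the first place.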

16 Let’s do metadata Open a Word doc and choose File > Information

17 File naming Important for future access and retrieval Provides contextual information Creates logical structure for skimming through many files and versions

18 How could these file names be improved?

19 Best practice for File Naming Keep file names short but meaningful Define the types of data and file formats for the research Avoid using generic file names, e.g. draft, final version etc. Use underscores to differentiate between words (avoid spaces) Avoid special characters such as: & * % $ £ ] { ! @ / as these are often used for specific tasks in a digital environment Consider scalability Not all systems/software are case-sensitive, so assume that TANGO, Tango and tango are the same Don’t rely on file names as your sole source of documentation
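
A few of these rules can be checked automatically. The Python sketch below is one possible approach, not part of the original workshop materials; the helper names `clean_file_name` and `case_collisions` are hypothetical. It replaces spaces and special characters with underscores, lower-cases the extension, and flags names that differ only by capitalisation.

```python
import re


def clean_file_name(name):
    """Replace spaces and special characters with underscores (hypothetical helper)."""
    stem, dot, ext = name.rpartition(".")
    if not dot:  # no extension present
        stem, ext = name, ""
    # Any run of characters other than letters, digits or hyphens becomes one underscore.
    stem = re.sub(r"[^A-Za-z0-9-]+", "_", stem).strip("_")
    ext = ext.lower()
    return f"{stem}.{ext}" if ext else stem


def case_collisions(names):
    """Group names that would clash on a case-insensitive system."""
    groups = {}
    for name in names:
        groups.setdefault(name.lower(), []).append(name)
    return [group for group in groups.values() if len(group) > 1]


print(clean_file_name("Data:current version.dat"))    # Data_current_version.dat
print(clean_file_name("Report Final (v2)!.DOC"))      # Report_Final_v2.doc
print(case_collisions(["TANGO.txt", "Tango.txt", "tango.txt", "waltz.txt"]))
```

A script like this only tidies the characters; it cannot make a generic name such as "final" meaningful, so the naming convention itself still has to be decided and documented by you.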

20 Possible elements Project/grant name and/or number Date of creation: useful for version control, e.g., YYYYMMDD Name of creator/investigator: last name first followed by (initials of) first name Description of content/subject descriptor Data collection method (instrument, site, etc.) Version number
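
As a rough illustration, the elements above could be assembled into a file name like this. The order of the elements, the underscore separator, and the function name `build_file_name` are assumptions made for this sketch, and the example values are made up; pick whatever order suits your project and document it.

```python
from datetime import date


def build_file_name(project, description, created, creator, version, extension):
    """Join naming elements with underscores (hypothetical convention).

    created: a datetime.date, written as YYYYMMDD so names sort chronologically.
    creator: last name first, followed by initials, e.g. "SmithJ".
    """
    parts = [project, description, created.strftime("%Y%m%d"), creator, f"v{version}"]
    return "_".join(parts) + "." + extension


# Made-up example values.
print(build_file_name("GrantX", "SoilSurvey", date(2015, 3, 9), "SmithJ", "1.2", "csv"))
# GrantX_SoilSurvey_20150309_SmithJ_v1.2.csv
```

Writing the date as YYYYMMDD means an ordinary alphabetical sort of the folder also puts the files in date order.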

21 Example of good file naming FG1_CONS_12Feb10 is the file that contains the transcript of the first focus group with a study of consumers, that took place on 12 February 2010 Int024_AP_5June08 is an interview with participant 024, interviewed by Anne Parsons on 5 June 2008

22 Naming and renaming Check to see if your instrument, software, or other equipment that outputs your data files can be set with a file naming system Less work than retrospectively changing filenames Batch renaming tools available
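
If you do end up renaming retrospectively, a short script can stand in for a batch renaming tool. The Python sketch below is a minimal example under assumed conventions: the numbering pattern `prefix_001.ext` and the function names `plan_renames` and `apply_renames` are hypothetical. It defaults to a dry run that only prints the planned changes, so you can check them before anything is touched.

```python
from pathlib import Path


def plan_renames(folder, prefix):
    """Pair each file in `folder` with a new name: prefix_001.ext, prefix_002.ext, ..."""
    files = sorted(p for p in Path(folder).iterdir() if p.is_file())
    return [
        (old, old.with_name(f"{prefix}_{number:03d}{old.suffix.lower()}"))
        for number, old in enumerate(files, start=1)
    ]


def apply_renames(plan, dry_run=True):
    """Print every planned rename; only rename on disk when dry_run is False."""
    for old, new in plan:
        if new != old and new.exists():
            raise FileExistsError(f"{new.name} already exists, nothing overwritten")
        print(f"{old.name} -> {new.name}")
        if not dry_run:
            old.rename(new)


# Usage (the folder name is a placeholder):
# plan = plan_renames("raw_images", "SiteA_20150309")
# apply_renames(plan)                 # dry run: prints only
# apply_renames(plan, dry_run=False)  # actually renames
```

Keep a copy of the original names (for example, by saving the printed dry-run output) so the renaming itself is documented.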

23 Version control Create a version control table or file history Document your convention and be consistent Record every change Put old versions in separate folder Consider discarding or deleting obsolete versions (while retaining the original 'raw' copy) if appropriate
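
A version control table can be as simple as a CSV file kept next to the data. The following Python sketch shows one way to append an entry for each change; the column names and the function `record_change` are assumptions for illustration, not a prescribed format.

```python
import csv
from datetime import date
from pathlib import Path

# Assumed columns for the version table; adjust to your own convention.
FIELDS = ["version", "date", "author", "changes"]


def record_change(log_path, version, author, changes, when=None):
    """Append one row to the version table, writing the header if the file is new."""
    log_path = Path(log_path)
    is_new = not log_path.exists()
    with log_path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "version": version,
            "date": (when or date.today()).isoformat(),
            "author": author,
            "changes": changes,
        })


# Usage (file name and entries are made up):
# record_change("version_history.csv", "v1", "SmithJ", "Original raw export")
# record_change("version_history.csv", "v1.1", "SmithJ", "Fixed column headers")
```

Because rows are only ever appended, the file doubles as a file history that can be opened in any spreadsheet program.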

24 Version control cont. In the file/folder names, use whole numbers (1, 2, 3, etc.) for major changes and a decimal for minor changes, e.g. v1, v1.1, v2.6 Beware of imprecise labels: revision, final, final2, definitive_copy - they may not be as definitive as you thought
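
The numbering scheme can be sketched in a few lines of Python. The helpers `parse_version` and `bump` are hypothetical, and the sketch assumes the part after the dot is a simple counter (so v1.10 comes after v1.9), which is an interpretation rather than something the slide specifies.

```python
def parse_version(label):
    """Split a label such as 'v2.6' into (major, minor) integers."""
    major, _, minor = label.lstrip("v").partition(".")
    return int(major), int(minor or 0)


def bump(label, major_change):
    """Return the next label: a new whole number for major changes, else the next decimal."""
    major, minor = parse_version(label)
    if major_change:
        return f"v{major + 1}"
    return f"v{major}.{minor + 1}"


print(bump("v1", major_change=False))    # v1.1
print(bump("v1.1", major_change=True))   # v2
print(bump("v2.5", major_change=False))  # v2.6

# Sorting by the parsed numbers avoids 'v1.10' landing before 'v1.2'.
print(sorted(["v1.10", "v2", "v1.2"], key=parse_version))  # ['v1.2', 'v1.10', 'v2']
```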

25 Version Control Doc

26 Version Control Final Final Some software has built-in version control facilities, e.g.: control rights to file editing: read/write permissions (Windows Explorer); versioning or tracking features in collaborative documents (Wikis, intranets, GoogleDocs) Consider using version control software: Guidance from MIT Libraries on software options: http://libraries.mit.edu/data-management/files/2014/05/version-control-handout.pdf

27 But how will I remember all this stuff? You can use this form to plot out the structure of your own data Establishes good practice early by helping form working habits. Print out and stick on the wall above your desk!

28 Questions? David Litting david.litting@uts.edu.au Many thanks to MIT Libraries for making the excellent materials this workshop is based on available for reuse: http://libraries.mit.edu/data-management/files/2014/05/file-organization-july2014.pdf lib.uts.edu.au utslibrary This work is licensed under a Creative Commons Attribution 4.0 International License.

