Form Preparation
– Case IDs
– Variable Names
– Data Codes

Case IDs & Variable Names are Unique Identifiers
It is the combination of the two that makes your data meaningful.

Case IDs
Best Practices for Case IDs
– Unique, Meaningful, Confidential, & Stable
– Put the Case ID on every form
Major considerations
– How many times are you collecting data from each person?
– Do you need institutional data?
– Do you need to protect confidentiality?
    Employees v. Participants (mandatory v. voluntary)
– Are any of the data sensitive?

Rules of Thumb
Once, with no need for institutional data?
– Sequential numbers, with random start (randomize forms before numbering)
More than once, no need for institutional data?
– Use something meaningful to respondent
– Sample size may challenge uniqueness
Need institutional data?
– Use something meaningful to you

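The "sequential numbers, with random start" rule can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `assign_case_ids`, the four-digit range for the starting number, and the form labels are all hypothetical and do not come from the slides.

```python
import random

def assign_case_ids(forms, seed=None, min_start=1000, max_start=9000):
    """Shuffle the forms, then number them sequentially from a random start.

    The start range (1000 to 9000) is an arbitrary choice for this sketch.
    """
    rng = random.Random(seed)
    shuffled = list(forms)                      # copy; the caller's list is untouched
    rng.shuffle(shuffled)                       # randomize forms before numbering
    start = rng.randint(min_start, max_start)   # random starting number
    return {start + i: form for i, form in enumerate(shuffled)}

# Hypothetical stack of four collected forms
case_ids = assign_case_ids(["form_a", "form_b", "form_c", "form_d"], seed=42)
```

Because the forms are shuffled first, the Case ID says nothing about the order in which forms were handed in.
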
Trade-Offs
Meaningful for participant
– Easy to remember (stable)
– Might not be confidential
– May or may not link to institutional data
Meaningful to you
– Not easy to remember (not stable)
– Promotes confidentiality
– May need a key, which is a risk to confidentiality
– Promotes link to institutional data

Put the Case ID on Everything
– Every Form, Every Page
– Double Check
– Back Track
– Multiple Coders

Naming Variables
Best Practices for Variable Names
– Unique & Meaningful Abbreviations
– Short
    Standard is 8 characters
    Excel can handle more, but your column size will increase
– Start with a letter, not a number
Mnemonic strategy vs. Question Number
– workhrs vs. Question1 (q001)
– Mnemonic: one-time projects, with one person handling data
– Question Numbers: repeated or large projects, multiple people handling the data

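The naming rules above (unique, at most 8 characters, starting with a letter) are easy to check automatically before data entry begins. The following is a minimal sketch; the function name `check_variable_names`, the case-insensitive uniqueness test, and the sample names other than workhrs and q001 are assumptions made for illustration.

```python
def check_variable_names(names, max_len=8):
    """Return {name: [problems]} for names that break the naming rules."""
    problems = {}
    seen = set()
    for name in names:
        issues = []
        if not name[:1].isalpha():
            issues.append("does not start with a letter")
        if len(name) > max_len:
            issues.append(f"longer than {max_len} characters")
        key = name.lower()        # assumption: names differing only in case collide
        if key in seen:
            issues.append("duplicate name")
        seen.add(key)
        if issues:
            problems[name] = issues
    return problems

# "1stjob" and "hoursworked" are hypothetical names used to trigger the checks
problems = check_variable_names(["workhrs", "q001", "1stjob", "hoursworked", "workhrs"])
```

Here `problems` flags `1stjob` (starts with a digit), `hoursworked` (too long), and the second `workhrs` (duplicate).
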
I use both
Meaningful Abbreviations
– Less meaningful… question1 or q001
– More meaningful… q1reshall
Avoid generic, be specific
– What can you expect to find for q1reshall?
    Names of residence halls (Nye, Lincoln, White Pine…)
    Codes for residence halls (1 = Nye, 2 = Lincoln…)
    Lives in a residence hall (0 = no, 1 = yes)

Meaningfulness is tied to your coding!!
1s and 0s
– 0 = no, does not have the characteristic
– 1 = yes, has the characteristic
sex vs. female
– what does a 0 mean?
– what does a 1 mean?
race vs. white?

Coding Data
Categorical Data
– Two categories: use 0s and 1s, and name the variable after the category coded 1 (e.g., female rather than sex)
– Three or more categories
    Nominal: no meaningful numerical difference between categories
    Dummy code: instead of one variable race, make five variables
        0 = not Asian, 1 = Asian
        0 = not Black, 1 = Black
        0 = not Hispanic, 1 = Hispanic
        0 = not Native American, 1 = Native American
        0 = not White, 1 = White

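As a rough sketch of the dummy coding described above, the Python below turns one race entry into five 0/1 variables while leaving the original entry in place. The short variable names (such as `natamer`), the `caseid` column, and the function name `dummy_code` are hypothetical choices for this example.

```python
# Category label as entered -> name of its 0/1 variable (names are illustrative)
RACE_DUMMIES = {
    "Asian": "asian",
    "Black": "black",
    "Hispanic": "hispanic",
    "Native American": "natamer",
    "White": "white",
}

def dummy_code(race):
    """Return one 0/1 variable per category for a single race entry."""
    if race not in RACE_DUMMIES:
        raise ValueError(f"unexpected race entry: {race!r}")
    return {var: int(race == label) for label, var in RACE_DUMMIES.items()}

row = {"caseid": 4821, "race": "Hispanic"}
row.update(dummy_code(row["race"]))   # adds new variables, keeps the original "race"
```

Each new variable is named for the category coded 1, so a 1 in `hispanic` can only mean "yes, Hispanic".
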
Coding Data (cont.)
Categorical Data
– Three or more categories with a meaningful, numerical difference between categories
– Academic Level, with alternative numeric codings:

Level                          1–7    10–70   0–100   Years
Freshman                       1      10      0       13
Sophomore                      2      20      16      14
Junior                         3      30      33      15
Senior                         4      40      50      16.4
Second Degree                  5      50      66      17
Masters Student                6      60      83      18
PhD or Professional Medical    7      70      100     22

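One way to move between these codings is a lookup table that writes the result into a new variable. The sketch below assumes that the last value in each row of the slide is years of education and that rows line up with levels 1 through 7 in order; the variable names `acadlvl` and `acadyrs` and the sample rows are hypothetical.

```python
# 1-7 academic level code -> years, read off the slide (alignment is an assumption)
YEARS_BY_LEVEL = {1: 13, 2: 14, 3: 15, 4: 16.4, 5: 17, 6: 18, 7: 22}

def add_years(row):
    """Return a copy of the row with a new 'acadyrs' variable; 'acadlvl' is kept."""
    recoded = dict(row)
    level = row["acadlvl"]
    recoded["acadyrs"] = None if level is None else YEARS_BY_LEVEL[level]
    return recoded

rows = [
    {"caseid": 1001, "acadlvl": 1},
    {"caseid": 1002, "acadlvl": 6},
    {"caseid": 1003, "acadlvl": None},   # blank entry stays blank
]
recoded_rows = [add_years(r) for r in rows]
```
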
Coding Data
You do many types of coding when you create your survey
– How committed are you to Nevada?
    1 = not committed at all … 7 = Extremely committed
– Even if it is not perfect, enter that information in your spreadsheet
    Then RECODE it into a NEW VARIABLE
    Never throw information out
Always have a system to check your codes
– Enter Race
    Then Recode (Dummy Code)

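A system for checking codes can be as simple as comparing every entered value with the codes the survey allows and listing the Case IDs that fail. This is a minimal sketch for the 1 to 7 commitment item; the variable name `commit`, the helper `find_bad_codes`, and the sample rows are invented for illustration.

```python
VALID_COMMIT = set(range(1, 8))   # 1 = not committed at all ... 7 = extremely committed

def find_bad_codes(rows, variable, valid_codes):
    """Return (case ID, value) pairs whose code is not allowed; blanks are skipped."""
    return [
        (row["caseid"], row[variable])
        for row in rows
        if row[variable] is not None and row[variable] not in valid_codes
    ]

rows = [
    {"caseid": 1001, "commit": 7},
    {"caseid": 1002, "commit": 77},    # likely a double keystroke
    {"caseid": 1003, "commit": None},  # left blank on the form
    {"caseid": 1004, "commit": 1},
]
bad = find_bad_codes(rows, "commit", VALID_COMMIT)
```

The Case ID in each flagged pair is what lets you back track to the paper form and correct the entry.
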
Entering Data
– Enter it exactly
– Recode anything that can be quantified into new variables
Missing Data
– Leave it Blank
– If you must, use an extreme number, something way out of range (-999)

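Whichever convention is used, blanks and the out-of-range code both have to be treated as missing before any arithmetic, or a -999 will distort the results. The sketch below reads a small made-up CSV and averages only the observed values; the column names and data are hypothetical.

```python
import csv
import io
from statistics import mean

MISSING_CODE = -999   # the out-of-range missing-data code from the slide

def parse_value(cell):
    """Convert a spreadsheet cell to a float, or None if blank or coded missing."""
    cell = cell.strip()
    if cell == "":
        return None
    value = float(cell)
    return None if value == MISSING_CODE else value

# Made-up data: one blank cell and one -999
raw = "caseid,workhrs\n1001,40\n1002,\n1003,-999\n1004,20\n"
rows = list(csv.DictReader(io.StringIO(raw)))

hours = [parse_value(row["workhrs"]) for row in rows]
observed = [h for h in hours if h is not None]
avg_hours = mean(observed)   # averages 40 and 20 only
```
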
Qualitative Data
Content Analysis (Implicit Quantification)
– Identify themes, categories, patterns
– Start Broad
– Get multiple perspectives
– Narrow it down to a manageable number of themes
– Count

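Once the themes are settled, the final "Count" step is a simple tally. The sketch below assumes each open-ended response has already been hand-coded with one or more themes; the Case IDs and theme labels are invented for illustration.

```python
from collections import Counter

# Case ID -> themes assigned to that respondent's answer (hypothetical coding)
coded_responses = {
    1001: ["cost", "location"],
    1002: ["cost"],
    1003: ["friends", "location"],
    1004: ["cost", "friends"],
}

# Count each theme once per respondent, even if it was tagged more than once
theme_counts = Counter(
    theme for themes in coded_responses.values() for theme in set(themes)
)
```

`theme_counts.most_common()` then lists the themes from most to least frequent, which is the quantified summary of the qualitative data.
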