Forms to Spreadsheets
  A-Team Spring Brown Bags, February 7, 2014
  Jennifer Lowman, Coordinator, Student Persistence Research, University of Nevada, Reno
Outline
  – Excel Basics
  – Form Preparation
  – Naming Variables
  – Coding Data
  – Entering Data
  – Cleaning Data
  – Analyzing Data
Excel Basics
  – Workbooks & Worksheets
  – 3 Parts to a Spreadsheet
    – Rows (numbered)
    – Columns (lettered)
    – Cells (column letter plus row number, e.g., H7)
Excel Basics
  – Rows = Cases
  – Columns = Variables
  – Cells = Data
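As a rough illustration of the row, column, and cell layout, the sketch below writes two made-up cases to CSV text that Excel can open. The variable names (caseid, workhrs, female) and the values are hypothetical, chosen only to show one case per row and one variable per column.

```python
import csv
import io

# Hypothetical cases: each dict is one row (case); each key is one column (variable).
cases = [
    {"caseid": 2001, "workhrs": 20, "female": 1},
    {"caseid": 2002, "workhrs": 35, "female": 0},
]

buffer = io.StringIO()
writer = csv.DictWriter(
    buffer,
    fieldnames=["caseid", "workhrs", "female"],
    lineterminator="\n",
)
writer.writeheader()     # first row holds the variable names
writer.writerows(cases)  # one row per case, one cell per value
sheet_text = buffer.getvalue()
print(sheet_text)
```

Opened in Excel, caseid would land in column A, workhrs in column B, and the second case's work hours in cell B3.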
Form Preparation
  – Case IDs
  – Variable Names
  – Data Codes
  – Case IDs & Variable Names are Unique Identifiers
  – It is the combination of the two that makes your data meaningful.
Case IDs
  – Best Practices for Case IDs
    – Unique, Meaningful, Confidential, & Stable
    – Put the Case ID on every form
  – Major considerations
    – How many times are you collecting data from each person?
    – Do you need institutional data?
    – Do you need to protect confidentiality? Employees v. Participants (mandatory v. voluntary)
    – Are any of the data sensitive?
Rules of Thumb
  – Once, with no need for institutional data?
    – Sequential numbers, with random start (randomize forms before numbering)
  – More than once, no need for institutional data?
    – Use something meaningful to the respondent
    – Sample size may challenge uniqueness
  – Need institutional data?
    – Use something meaningful to you
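The first rule of thumb (shuffle the forms, then number them sequentially from a random start) could be sketched as follows. The function name, the form labels, and the 1000 to 8999 starting range are assumptions made for illustration, not part of the presentation.

```python
import random

def assign_case_ids(forms, seed=None):
    """Shuffle the forms, then number them sequentially from a random start."""
    rng = random.Random(seed)
    shuffled = list(forms)            # copy so the caller's list is untouched
    rng.shuffle(shuffled)             # randomize forms before numbering
    start = rng.randint(1000, 8999)   # random start; this range is an arbitrary choice
    return {start + i: form for i, form in enumerate(shuffled)}

# Hypothetical stack of four paper forms.
forms = ["form_a", "form_b", "form_c", "form_d"]
case_ids = assign_case_ids(forms, seed=42)
for case_id, form in sorted(case_ids.items()):
    print(case_id, form)
```

Because the order is shuffled first, an ID says nothing about when or where a form was collected, which supports confidentiality.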
Trade-Offs
  – Meaningful to the participant
    – Easy to remember (stable)
    – Might not be confidential
    – May or may not link to institutional data
  – Meaningful to you
    – Not easy to remember (not stable)
    – Promotes confidentiality
    – May need a key, which is itself a risk to confidentiality
    – Promotes link to institutional data
Put the Case ID on Everything
  – Every Form, Every Page
  – Double Check
  – Back Track
  – Multiple Coders
Naming Variables
  – Best Practices for Variable Names
    – Unique & Meaningful Abbreviations
    – Short: the standard is 8 characters (Excel can handle more, but your column width will increase)
    – Start with a letter, not a number
  – Mnemonic strategy vs. Question Number
    – workhrs vs. Question1 (q001)
    – Mnemonic: one-time projects with one person handling the data
    – Question Numbers: repeated or large projects with multiple people handling the data
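The naming rules above (unique, at most 8 characters, starting with a letter) can be checked mechanically. The sketch below is one way to do that. The function name and the sample names other than workhrs and q001 are hypothetical, and the 8-character limit is treated as a soft convention rather than an Excel requirement.

```python
import re

def check_variable_names(names, max_len=8):
    """Return a dict mapping each problem name to a list of rule violations."""
    problems = {}
    seen = set()
    for name in names:
        issues = []
        if len(name) > max_len:
            issues.append("longer than %d characters" % max_len)
        if not re.match(r"[A-Za-z]", name):
            issues.append("does not start with a letter")
        if name.lower() in seen:   # compare case-insensitively
            issues.append("duplicate")
        seen.add(name.lower())
        if issues:
            problems[name] = issues
    return problems

report = check_variable_names(
    ["workhrs", "q001", "1stgen", "hoursworkedperweek", "workhrs"]
)
for name, issues in report.items():
    print(name, "->", "; ".join(issues))
```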
I use both
  – Meaningful Abbreviations
    – Less meaningful… question1 or q001
    – More meaningful… q1reshall
  – Avoid generic names, be specific
    – What can you expect to find for q1reshall?
      – Names of residence halls (Nye, Lincoln, White Pine…)
      – Codes for residence halls (1 = Nye, 2 = Lincoln…)
      – Lives in a residence hall (0 = no, 1 = yes)
Meaningfulness is tied to your coding!!
  – 1s and 0s
    – 0 = no, does not have the characteristic
    – 1 = yes, has the characteristic
  – sex vs. female
    – what does a 0 mean?
    – what does a 1 mean?
  – race vs. white?
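One practical payoff of naming a 0/1 variable after the characteristic it flags: the sum is the number of cases with the characteristic and the mean is their share. A minimal sketch with made-up values:

```python
# Hypothetical data: 1 = female, 0 = not female.
female = [1, 0, 1, 1, 0]

count_female = sum(female)                 # number of cases coded 1
share_female = sum(female) / len(female)   # proportion of cases coded 1
print(count_female, share_female)
```

Had the variable been named sex, neither number would be interpretable without looking up the code book.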
Coding Data
  – Categorical Data
    – Two categories: use 0s and 1s, and name the variable after the category coded 1 (e.g., female)
    – Three or more categories, nominal (no meaningful numerical difference between categories): dummy code. Instead of one variable, race, make five variables
      – 0 = not Asian, 1 = Asian
      – 0 = not Black, 1 = Black
      – 0 = not Hispanic, 1 = Hispanic
      – 0 = not Native American, 1 = Native American
      – 0 = not White, 1 = White
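Dummy coding the five race categories listed above might look like the sketch below. The function name is hypothetical, and the choice to keep a blank entry as missing (None) rather than coding it 0 is an assumption.

```python
CATEGORIES = ["Asian", "Black", "Hispanic", "Native American", "White"]

def dummy_code(race):
    """Return one 0/1 indicator per category; a blank entry stays missing (None)."""
    if race is None or race.strip() == "":
        return {cat: None for cat in CATEGORIES}
    value = race.strip()
    return {cat: int(value == cat) for cat in CATEGORIES}

coded = dummy_code("Hispanic")
print(coded)
```

Note that an entry outside the five categories comes back as all zeros, so it is worth checking for those separately before analysis.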
Coding Data (cont.)
  – Categorical Data
    – Three or more categories with a meaningful, numerical difference between categories
    – Academic Level
      – 1 = Freshman
      – 2 = Sophomore
      – 3 = Junior
      – 4 = Senior
      – 5 = Second Degree
      – 6 = Masters Student
      – 7 = PhD or Professional Medical
    – [Table garbled in extraction. Surviving values: a column running 0, 16, 33, 50, 66, 83, 100 and a column labeled "years" running from 13 to 22.]
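The values 0, 16, 33, 50, 66, 83, 100 that survive on this slide are consistent with stretching the codes 1 to 7 linearly onto a 0 to 100 scale and rounding down. That reading is an inference, not something the slide states, and the function name below is hypothetical.

```python
LEVELS = {
    1: "Freshman",
    2: "Sophomore",
    3: "Junior",
    4: "Senior",
    5: "Second Degree",
    6: "Masters Student",
    7: "PhD or Professional Medical",
}

def rescale_0_100(code, lowest=1, highest=7):
    """Map an ordinal code onto 0-100, rounding down to a whole number."""
    return (code - lowest) * 100 // (highest - lowest)

rescaled = {code: rescale_0_100(code) for code in LEVELS}
for code, label in LEVELS.items():
    print(code, label, rescaled[code])
```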
Coding Data
  – Much of your coding is done when you create your survey
    – How committed are you to Nevada? 1 = Not committed at all … 7 = Extremely committed
  – Even if it is not perfect, enter that information in your spreadsheet
    – Then RECODE it into a NEW VARIABLE
    – Never throw information out
  – Always have a system to check your codes
    – Enter Race, then Recode (Dummy Code)
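A sketch of "recode into a new variable, never throw information out, and check your codes", using the 1 to 7 commitment item. The variable names (commit, commithi), the cutoff of 5, and the data are all hypothetical.

```python
from collections import Counter

# Hypothetical entered data: the original 1-7 answers, exactly as written on the forms.
rows = [
    {"caseid": 2001, "commit": 7},
    {"caseid": 2002, "commit": 3},
    {"caseid": 2003, "commit": None},  # left blank on the form
    {"caseid": 2004, "commit": 5},
]

def recode_high(value, cutoff=5):
    """Return 1 if the answer is at or above the cutoff, 0 if below, None if missing."""
    if value is None:
        return None
    return 1 if value >= cutoff else 0

for row in rows:
    row["commithi"] = recode_high(row["commit"])  # new variable; the original is kept

# Check system: cross-tabulate original values against recoded values.
check = Counter((row["commit"], row["commithi"]) for row in rows)
for (original, recoded), n in check.items():
    print(original, "->", recoded, "count:", n)
```

Each original value should map to exactly one recoded value in the cross-tabulation; anything else signals a recoding mistake.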
Entering Data
  – Enter it exactly
  – Recode anything that can be quantified into new variables
  – Missing Data
    – Leave it blank
    – If you must, use an extreme number, something way out of range (-999)
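If a placeholder such as -999 is used for missing data, it has to be excluded before any arithmetic, or it will swamp the result. A minimal sketch with made-up work hours (the function name is hypothetical):

```python
MISSING = -999

def mean_ignoring_missing(values, missing_code=MISSING):
    """Average the values, skipping blanks (None) and the missing-data code."""
    valid = [v for v in values if v is not None and v != missing_code]
    return sum(valid) / len(valid) if valid else None

# Hypothetical column: one -999 placeholder and one blank cell.
workhrs = [20, 35, -999, None, 10]

entered = [v for v in workhrs if v is not None]
naive_mean = sum(entered) / len(entered)        # wrongly includes -999
clean_mean = mean_ignoring_missing(workhrs)     # uses only 20, 35, 10
print(naive_mean, clean_mean)
```

The naive average is far below zero, which is impossible for hours worked. This is one reason an out-of-range code is easy to spot, and also why leaving the cell blank is the safer default.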
Qualitative Data
  – Content Analysis (Implicit Quantification)
    – Identify themes, categories, patterns
    – Start broad
    – Get multiple perspectives
    – Narrow it down to a manageable number of themes
    – Count
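Once open-ended responses have been tagged with themes, the final "Count" step is a simple tally. The sketch below counts how many respondents mentioned each theme. The case IDs and theme labels are hypothetical.

```python
from collections import Counter

# Hypothetical coding result: case ID -> themes a coder assigned to that response.
coded_responses = {
    2001: ["cost", "belonging"],
    2002: ["cost"],
    2003: ["advising", "cost"],
    2004: [],  # response with no codable theme
}

# Count each theme at most once per respondent.
theme_counts = Counter(
    theme for themes in coded_responses.values() for theme in set(themes)
)
for theme, n in theme_counts.most_common():
    print(theme, n)
```

Counting each theme once per respondent keeps a single long answer from inflating a theme's total.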