Agenda
- The whole process
- Questionnaire design
- Data collection
- Software design
- Data entry

The whole process
- Questionnaire design
  - Asking the right questions, in the right way
  - Structure the questionnaire effectively
  - How: pilot & back-translate
- Data collection
  - Veracity
  - Quality of survey
  - Quality of filling questionnaires
  - How: back-checks & accompaniments
- Software design, data entry and management
  - Minimize data entry errors
  - Organize data in an effective way
  - Clean data
  - How: double entry & error checking

Questionnaire design
- Use clear skip patterns wherever needed. The software designer will then need to build them into the data entry software.
- Grids
- Single/multiple options
- Interviewer checkpoints
- When coding your questions, make sure that all options are included. For example, if there is any chance, even a small one, that people will say “I don’t know”, include the code “-999” in the question.

Pilot and translate survey
- Pilot: in non-research areas, but in a similar setting
  - Depending on how ready the questionnaire is, 30 to 40 pilots
  - Can also pilot some sections more intensively
- Translation: back-translation is MANDATORY

Data collection: surveyors
- Selection
- Training: before the survey, and ongoing
- Before the survey:
  - Classroom and field
  - Questionnaire + field instructions + behavior in the field
  - Training on the issue of interest
  - If you have time to write an instruction manual, it is useful
- Keep going to the field with them and do reminder trainings (e.g. when you notice they prompt too much)
- Maintain motivation: go out with them, bonuses, etc.
- STAY IN THE FIELD WITH THEM

Data collection: quality checks
- Team structure
  - One supervisor for five surveyors
  - A field monitor, if your team is big, to help you manage the team
- Monitoring in the field
  - Accompaniments by supervisor: all the time
  - Accompaniments by monitor: 75% of the time
  - Accompaniments by yourself: maybe 15% of the time
  - Back-checks by field monitor: 15% of questionnaires, some sections (mandatory!)
  - Do some back-checks yourself
  - Analyse the data from back-checks right away!
- If you use a survey company, you still need to do your own back-checks and some accompaniments

Questionnaire quality: scrutiny
- Scrutinize questionnaires
  - Have surveyors and supervisors do it
  - But also do it yourself!
  - If you have a project assistant, ask them to scrutinize 100%, but still scrutinize 50% or so yourself (at least the trickiest sections)
  - Examples of mistakes only you can catch: codes for activity, logical consistency
- When scrutinizing, write all codes, even if not pre-coded
  - “-777” for missing, or “-999” for “I don’t know”
- If you find too much missing or inconsistent data, send surveyors back to the field

Data management: goals
- Quality
- Timing
- Timing is important, and you need to monitor the Data Entry Officers (DEOs) or the Data Entry (DE) company carefully to make sure they stick to timelines. But by no means should you sacrifice any quality-check step: if you save time on those steps, you’ll lose time later.

Data entry software
- Software
  - Think about it as soon as the questionnaire is close to final
  - Could be done by the survey company or outsourced to someone else (less expensive, or someone you trust more)
  - The goal is that DEOs should make as few mistakes as possible

Data entry software
- Software development: send the developer a detailed spreadsheet with instructions for each question (the range of acceptable values, logical checks, etc.). The more detailed it is, the more time you’ll save later.
- Software testing: once the software designer has built the software, test it yourself by entering a batch of questionnaires (e.g. pilot questionnaires, or invented responses; just make sure you test every part of the software).
- Check output: then look at the output carefully and make sure it looks fine. Also send it to the professors you work with to make sure they are satisfied with it.

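As an illustration of what the per-question instructions for the developer might translate into, here is a minimal Python sketch of range checks and one logical check. The question names, ranges, and the specific logical rule are hypothetical; only the special codes -777 (missing) and -999 ("I don't know") come from these slides.

```python
# Hypothetical per-question rules, as they might be listed in the spreadsheet
# sent to the developer. Question names and ranges are illustrative only.
RULES = {
    "age": {"min": 0, "max": 120, "special": {-777, -999}},
    "has_children": {"min": 0, "max": 1, "special": {-777, -999}},
    "num_children": {"min": 0, "max": 20, "special": {-777, -999}},
}


def range_errors(record, rules=RULES):
    """Return the names of fields that are not entered or out of range."""
    errors = []
    for field, rule in rules.items():
        value = record.get(field)
        if value is None:
            errors.append(field)  # field was not entered at all
        elif value in rule["special"]:
            continue  # -777 / -999 are always acceptable codes
        elif not rule["min"] <= value <= rule["max"]:
            errors.append(field)
    return errors


def logic_errors(record):
    """One example logical check: no children reported, yet a positive count."""
    errors = []
    if record.get("has_children") == 0 and record.get("num_children", 0) > 0:
        errors.append("num_children inconsistent with has_children")
    return errors
```

Entering a few pilot questionnaires through checks like these is one way to confirm that every rule in the spreadsheet is actually enforced.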
Checking output
- When checking output, imagine yourself analysing the data!
- All fields need to be numerical (except text fields, like comments or “others – specify”). There is not much you can do with text fields when you analyse.
- One example: questions with multiple-choice responses (say the question is “Where do you take your water from?” and there are 5 options: “well, tap, etc.”)
  - This question should be treated as 5 questions (1. Do you take your water from the well? Yes or no. 2. Do you take your water from the tap? Yes or no. Etc.)
  - The response to each of these questions is a binary variable, i.e. either 1 (yes) or 0 (no).
  - This becomes obvious if you put yourself in the shoes of the person who will analyse the data (among others, you!). If this is treated as a single question and the DEO enters “1, 2, 5” in the single response field, you cannot do anything with that data!

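The water-source example can be sketched in Python as follows. This is only an illustration of the idea of one binary variable per option: the labels for options 3 to 5 are placeholders (the slide names only "well" and "tap"), and the variable prefix is an assumption.

```python
# Option codes for "Where do you take your water from?".
# Only "well" and "tap" come from the slide; the other labels are placeholders.
OPTIONS = {1: "well", 2: "tap", 3: "option3", 4: "option4", 5: "option5"}


def expand_multiple_choice(raw, prefix="water", options=OPTIONS):
    """Turn a raw entry such as "1, 2, 5" into one 0/1 variable per option."""
    chosen = {int(code) for code in raw.split(",") if code.strip()}
    unknown = chosen - set(options)
    if unknown:
        raise ValueError(f"unknown option codes: {sorted(unknown)}")
    return {f"{prefix}_{label}": int(code in chosen) for code, label in options.items()}
```

With this layout, a respondent who answered "1, 2, 5" gets a 1 in the first, second and fifth variables and a 0 in the others, so each source can be tabulated directly.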
Data entry
- Timing: data entry should start as soon as possible after data collection starts, and before collection is over!
- Double entry: mandatory. Must be written into the contract.
  - One output
  - Two outputs, reconciled
- Error checking: check the error rate on a regular basis (batches of 200 or 300 questionnaires), and before you do any cleaning.
- Payment to the DE company: include a contract clause stating that the first payment will be made only after 200 or so questionnaires have been delivered to you and you have checked that the error rate is below 0.5%. Pay only after that.
- Get bad data re-entered entirely, whatever the nature of the errors.

Error rate checking
- What is it? For each batch, re-enter a sample of data fields and compare them with the data given by the company (for those fields)
  - Need approximately 3000 fields per batch
- How to do it?
  - Divide your data into sub-sections (of about 25 questions)
  - In some cases you will receive your data split into tabs; you can use those tabs as sub-sections, if small enough
  - For each sub-section, randomly select 5% of the questionnaires in your batch
  - Enter the data from that section of the selected questionnaires (using an Excel spreadsheet, or the data entry software)
  - Compare your dataset with the original data (use Stata, Excel, or comparison software), and check on the physical questionnaire who made the mistake
  - Error rate: number of errors made by the company / number of fields (one error is one field with a mistake, not one question!)
  - Calculate the error rate for each section, and overall

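The sampling and comparison steps could be sketched in Python roughly as below. The data layout (dictionaries keyed by questionnaire id and field name) and all function names are assumptions made for illustration. The sketch also assumes the re-entered values have already been verified against the physical questionnaires, so every mismatch is counted as a company error.

```python
import random


def sample_questionnaires(ids, fraction=0.05, seed=None):
    """Randomly pick a fraction of questionnaire ids (at least one)."""
    rng = random.Random(seed)
    ids = list(ids)
    k = max(1, round(len(ids) * fraction))
    return sorted(rng.sample(ids, k))


def error_rate(company, recheck):
    """Compare re-entered fields with the company's data.

    Both arguments map questionnaire id -> {field name: value}.
    Only fields present in the re-entered data are compared.
    Returns (number of errors, number of fields compared, rate).
    """
    errors = fields = 0
    for qid, answers in recheck.items():
        for field, value in answers.items():
            fields += 1  # one field is one unit, not one question
            if company[qid].get(field) != value:
                errors += 1
    return errors, fields, (errors / fields if fields else 0.0)


def error_rates_by_section(company, recheck_by_section):
    """Error rate per sub-section, plus an overall rate across all fields."""
    rates = {}
    total_errors = total_fields = 0
    for section, recheck in recheck_by_section.items():
        errors, fields, rate = error_rate(company, recheck)
        rates[section] = rate
        total_errors += errors
        total_fields += fields
    rates["overall"] = total_errors / total_fields if total_fields else 0.0
    return rates
```

The overall rate is computed from pooled error and field counts rather than by averaging the section rates, so sections with more fields weigh more.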
Data cleaning and organizing
- Clean your data in a different file
- Rename and label variables
- Check for logical errors
- Look at ranges and outliers
- Do basic data summaries
- Check for duplicate data
- Check for missing data
- Look at the distribution of data by surveyors/teams
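A few of these checks can be sketched in plain Python, working on a copy of the data loaded as a list of dictionaries. The column names ("id", "surveyor") and the helper names are hypothetical; the codes -777 (missing) and -999 ("I don't know") are the ones used earlier in these slides.

```python
from collections import Counter

MISSING = -777     # code for missing data
DONT_KNOW = -999   # code for "I don't know"


def duplicate_ids(rows, id_field="id"):
    """Questionnaire ids that appear more than once."""
    counts = Counter(row[id_field] for row in rows)
    return sorted(i for i, n in counts.items() if n > 1)


def missing_share(rows, field):
    """Share of rows where the field is absent, None, or coded -777."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(field) in (None, MISSING))
    return missing / len(rows)


def out_of_range(rows, field, low, high, id_field="id"):
    """Ids of rows whose value falls outside [low, high], ignoring special codes."""
    flagged = []
    for row in rows:
        value = row.get(field)
        if value in (None, MISSING, DONT_KNOW):
            continue
        if not low <= value <= high:
            flagged.append(row[id_field])
    return flagged


def missing_share_by_surveyor(rows, field, surveyor_field="surveyor"):
    """Missing-data share for one field, computed separately per surveyor."""
    groups = {}
    for row in rows:
        groups.setdefault(row[surveyor_field], []).append(row)
    return {s: missing_share(g, field) for s, g in groups.items()}
```

A surveyor whose share of missing values is far above the others is a natural candidate for a closer look or a reminder training.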