Trusting human beings to accurately report on their own characteristics
Surveys (along with observations, discussed later) are a very common technique for collecting nonexperimental data. Surveys are a systematic way of asking people to volunteer information about their attitudes, behaviors, opinions, and beliefs. The success of survey research rests on how closely the answers that people give to survey questions match reality (i.e., how people really think and act).
The first problem that a survey researcher has to tackle is how to design the survey so that it gets the right information. Is this survey necessary? Is the purpose of the survey to evaluate people or programs? Can the data be obtained by other means? What level of detail is required? The second problem is how accurate the survey has to be. Is this a one-time survey, or can the researcher repeat it on different occasions and in different settings? How will the results be used? How easy is it to do the survey?
The survey is an appropriate means of gathering information under three conditions: when the goals of the research call for quantitative and qualitative data; when the information sought is specific and familiar to the respondents; and when the researcher has prior knowledge of the responses likely to emerge.
Categorical data: numbers or words are used to group things. Examples: gender, race, religion, food group, or place of residence. Ordinal data: numbers are used to order a list of things. Examples: the ranking of football or basketball teams, a list of things to do, the color of a medal in the Olympics. Interval data: responses represent an actual quantity. Examples: height, weight, age.
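The three data types above limit which summary statistics make sense. The following is a minimal Python sketch, assuming made-up responses (all variable names and values are invented for illustration): it reports the most common category for categorical data, the median for ordinal data, and the mean for interval data.

```python
from statistics import mean, median, mode

# Hypothetical responses, invented for illustration only.
residence = ["urban", "rural", "urban", "suburban", "urban"]  # categorical
medal_rank = [1, 3, 2, 3, 3]  # ordinal: 1 = gold, 2 = silver, 3 = bronze
height_cm = [160.0, 172.5, 181.0, 168.5, 175.0]  # interval: actual quantities

# Categorical: the values only group things, so count and report the mode.
most_common_residence = mode(residence)

# Ordinal: the order is meaningful, so the median is a safe summary.
median_rank = median(medal_rank)

# Interval: the values are quantities, so the mean can be used.
mean_height = mean(height_cm)

print(most_common_residence, median_rank, mean_height)
```

Averaging the categorical codes would produce a number with no meaning, which is why the data type should be decided before the analysis is planned.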
There are six basic types of data that you might collect: attitudes, opinions, beliefs, behavior, attributes (demographic characteristics), and preferences.
The way a question or statement is worded and the response options offered determine the nature of the data received. Types of survey questions include: open-ended response, closed response, semantic differential scales, agreement and rating scales, ranking scales, and checklists.
Open-ended response: the respondent writes the response in their own words. Considerations for using open-ended questions: the data must be entered by hand; a coding scheme must be developed for the responses; content analysis may be needed. Open-ended questions are frequently used in exploratory studies to facilitate better understanding of a concept. Suggestion: always include an open-ended question that gives the respondent the opportunity to add any additional comments.
Advantages of open-ended questions: they allow the respondent to answer the question with few limitations and to report more information than with discrete answers. Disadvantages: qualitative methods or a coding system are needed to analyze the responses, and the coding requires subjective judgments. Examples: What habits increase a person’s risk for being overweight? Describe the pain you experience with walking.
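A coding scheme for open-ended answers can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the categories, keywords, and responses below are hypothetical, and simple keyword matching is a crude stand-in for the judgment a human coder applies.

```python
from collections import Counter

# Hypothetical coding scheme: category -> keywords that signal it.
CODING_SCHEME = {
    "diet": ["fast food", "sugar", "snack", "soda"],
    "inactivity": ["exercise", "sitting", "tv", "sedentary"],
    "sleep": ["sleep"],
}


def code_response(text):
    """Return the sorted categories whose keywords appear in the text."""
    lowered = text.lower()
    return sorted(
        category
        for category, keywords in CODING_SCHEME.items()
        if any(keyword in lowered for keyword in keywords)
    )


# Invented answers to "What habits increase a person's risk for being overweight?"
responses = [
    "Eating fast food and drinking soda every day",
    "Not getting enough exercise",
    "Too many snacks while sitting in front of the TV",
    "Genetics",
]

coded = [code_response(r) for r in responses]
counts = Counter(category for codes in coded for category in codes)

# Responses that match no category still need a human decision.
uncoded = [r for r, codes in zip(responses, coded) if not codes]
print(counts, uncoded)
```

The last response matches no category, which shows where the subjective judgment comes in: someone has to decide whether to add a new category or set the answer aside.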
Closed response questions are the "multiple-choice" variety, where a person has to choose among several possible answers. There are two types of closed response questions. Ordered answer choices represent points along a continuum. Example: pain on a scale of 0 (none) to 10 (worst pain ever). Unordered answer choices are each an independent answer. Examples: ethnicity and marital status.
Advantages of closed response questions: they are quicker and easier to answer; they are easier to tabulate and analyze; the list of possible responses helps the participant understand the meaning of the question; they are suitable for multi-item scales designed to provide a single score. Disadvantages: they do not allow participants to express their own answers; the set of answers may not be exhaustive; the instructions must be clear about whether to select one item or as many as apply.
Questions of this type require specific, short answers that do not encourage free expression. They are a compromise between closed response and open response forms, and they provide an “Other” category where a person can give additional information. Example: blank spaces provided for the questions on racial background and persons living with you.
Semantic differential scales use a five- to seven-point rating scale, with each end of the scale anchored by an adjective or phrase. These adjectives, called bipolar adjectives, are direct opposites. Semantic differential scales have three common factors: an evaluative factor covering such dimensions as good-bad, pleasant-unpleasant, and positive-negative; a potency factor representing the dimensions of strong-weak, hard-soft, and heavy-light; and an activity factor with such scales as fast-slow, active-passive, and excitable-calm.
Agreement and rating scales: the end points are identified by adjectives or phrases, and all of the steps may have an adjective or phrase associated with them. Example: a five-point scale with steps labeled Strongly Agree, Agree, Neutral, Disagree, and Strongly Disagree (a “Likert” scale).
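To analyze a five-point agreement scale like the one above, the labels are usually converted to numbers and summed across items. The sketch below assumes the common 1 to 5 coding and a hypothetical three-item scale. The `reverse` flag for negatively worded items is also an assumption, not something these slides describe.

```python
# Assumed numeric coding for the five labels (1 = lowest agreement).
LIKERT = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly Agree": 5,
}


def score_item(label, reverse=False):
    """Convert a label to its number; flip the scale for reverse-worded items."""
    value = LIKERT[label]
    return 6 - value if reverse else value


# One hypothetical respondent's answers: (label, is the item reverse-worded?)
answers = [
    ("Agree", False),
    ("Strongly Agree", False),
    ("Disagree", True),  # disagreeing with a negative statement counts as 4
]

total = sum(score_item(label, reverse) for label, reverse in answers)
print(total)  # 4 + 5 + 4 = 13
```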
Personal interviews. Advantages: generally yield the highest cooperation and lowest refusal rates; allow for longer, more complex interviews; high response quality; take advantage of interviewer presence; multi-method data collection. Disadvantages: most costly mode of administration; longer data collection period; interviewer concerns.
Telephone surveys. Advantages: less expensive than personal interviews; random-digit-dialing (RDD) samples of the general population; shorter data collection period than personal interviews; interviewer administration (vs. mail); better control and supervision of interviewers (vs. personal); better response rate than mail for list samples. Disadvantages: biased against households without telephones or with unlisted numbers; nonresponse; questionnaire constraints; difficult to administer questionnaires on sensitive or complex topics.
Mail surveys. Advantages: generally lowest cost; can be administered by a smaller team of people (no field staff); access to otherwise difficult-to-locate, busy populations; respondents can look up information or consult with others. Disadvantages: most difficult to obtain cooperation; no interviewer involved in the collection of data; need a good sample; more likely to need an incentive for respondents; slower data collection period than telephone.
Web surveys. Advantages: lower cost (no paper, postage, mailing, or data entry costs); can reach international populations; time required for implementation is reduced; complex skip patterns can be programmed; sample size can be greater. Disadvantages: approximately 77% of homes have a computer (2010 data), which leaves out the other 23%; representative samples are difficult because random samples of the general population cannot be generated; people's computers and software differ in their capability to access Web surveys; differing ISPs and line speeds limit the extent of graphics that can be used.
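The idea of a programmed skip pattern can be shown with a short Python sketch. The questionnaire, question keys, and routing rules below are hypothetical: each question names the next question to ask, and the route can depend on the answer just given.

```python
# Hypothetical questionnaire: "next" maps the answer to the next question key,
# or to None when the survey is finished.
QUESTIONS = {
    "smoke": {
        "text": "Do you smoke?",
        "next": lambda answer: "per_day" if answer == "yes" else "age",
    },
    "per_day": {
        "text": "How many cigarettes do you smoke per day?",
        "next": lambda answer: "age",
    },
    "age": {
        "text": "What is your age?",
        "next": lambda answer: None,
    },
}


def questions_asked(answers, start="smoke"):
    """Return the question keys a respondent sees, given their answers."""
    path = []
    current = start
    while current is not None:
        path.append(current)
        current = QUESTIONS[current]["next"](answers.get(current))
    return path


print(questions_asked({"smoke": "no", "age": "41"}))
print(questions_asked({"smoke": "yes", "per_day": "10", "age": "41"}))
```

A non-smoker is routed past the follow-up question automatically, which on paper would require a written "skip to question 3" instruction that respondents can misread.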
Good survey questions: are clear and use simple language; are concise; are specific; are possible to answer; are relevant to the respondent; do not use negatives; avoid biased terms; have only one part (no two-part questions).
So that every respondent will understand a question, it is important to keep the reading level at or below the average reading level of the population. Complex words may be replaced by simpler ones or ones more easily understood. If you are giving a survey to a particular group, you would want to use words that are common to the group.
Response options should reflect the concepts you are trying to measure and fit the wording of the question. Avoid simple “yes” or “no” answers and attempt to measure intensity if possible. Options should be mutually exclusive (a respondent can select only one answer) and exhaustive (all possible answers are listed, including “other”, “not applicable”, or “don’t know”).
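For numeric answer choices such as age brackets, the two requirements above can be checked mechanically. The following Python sketch makes several assumptions: brackets are whole-number ranges with inclusive bounds, the target population is ages 18 to 120, and the function name `check_brackets` is made up for this example.

```python
def check_brackets(brackets, lowest, highest):
    """Check inclusive integer brackets against the range lowest..highest.

    Returns (mutually_exclusive, exhaustive).
    """
    ordered = sorted(brackets)
    # Mutually exclusive: each bracket must end before the next one starts.
    exclusive = all(
        prev_hi < lo for (_, prev_hi), (lo, _) in zip(ordered, ordered[1:])
    )
    # Exhaustive: every value in the target range falls in some bracket.
    covered = set()
    for lo, hi in ordered:
        covered.update(range(lo, hi + 1))
    exhaustive = set(range(lowest, highest + 1)) <= covered
    return exclusive, exhaustive


# A 30-year-old fits two options here, and nobody over 65 has an option.
overlapping = [(18, 30), (30, 45), (45, 65)]
# Corrected brackets: no shared boundaries, and the top bracket is open-ended in effect.
fixed = [(18, 29), (30, 44), (45, 64), (65, 120)]

print(check_brackets(overlapping, 18, 120))  # (False, False)
print(check_brackets(fixed, 18, 120))        # (True, True)
```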
Be attentive to question order: initial questions affect answers to subsequent ones. Start with easy, salient, non-threatening questions, and place sensitive ones near the end. Cluster questions addressing the same topic or concept together. Avoid redundancy.
Reliability: expect to obtain the same information time after time. It is assessed by a correlation coefficient. The concept of reliability can also be applied to sampling: if we repeatedly draw random samples of equal size from a population, we can expect to get the same sample values each time (plus or minus a certain amount due to sampling error). Validity: the instrument measures the concept it is intended to measure, and it is presented or used in the way for which it was intended. An IQ test is valid only if it is used to measure intelligence; it is not valid if it is used to assign individuals to groups. A psychological test that is a valid measure of anxiety is not a valid measure of stress.
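The sampling claim above can be explored with a small simulation. This is a sketch under stated assumptions: the population is 10,000 invented scores drawn from a normal distribution with mean 100 and standard deviation 15, and each sample has 100 members. It draws 20 equal-sized random samples and reports how far the sample means stray from the population mean.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 scores (values invented for illustration).
population = [random.gauss(100, 15) for _ in range(10_000)]
population_mean = mean(population)

# Draw 20 random samples of equal size and record each sample mean.
sample_means = [mean(random.sample(population, 100)) for _ in range(20)]

# Sampling error: how far each sample mean lands from the population mean.
errors = [m - population_mean for m in sample_means]
largest_error = max(abs(e) for e in errors)

print(f"population mean: {population_mean:.2f}")
print(f"largest sampling error in 20 samples: {largest_error:.2f}")
```

The sample means cluster around the population mean without matching it exactly. The leftover differences are the sampling error, and they shrink as the sample size grows.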
Face validity: the information collected appears to be what was expected (face value). A question that asks “Do you smoke?” would appear to have face validity as a measure of smoking behavior. Content validity: a question adequately reflects the underlying behavior or body of knowledge. It is established by having a panel of experts evaluate and agree on the relevance of the test items. Concurrent validity: one instrument or question is comparable to another that has been shown to validly measure the same content or construct. It is established by correlating one question with another that has previously been validated.
Discriminant validity: a question or survey is able to discriminate between groups. Example: depression scales have discriminant validity if individuals who are depressed score differently from those who are not clinically depressed. Predictive validity: a question can be used to predict behavior. Example: “Can you walk 5 blocks?” Construct validity: a construct is a theoretical dimension, like self-esteem, that is measured by having several questions that all relate to how people view themselves. Self-esteem does not exist by itself but is represented by how people respond to these questions. In this example, construct validity measures the extent to which these responses can be called self-esteem.
Test-retest reliability: obtained by administering the same test on two or more successive occasions and then correlating the scores. The statistic that reflects reliability is the correlation coefficient; higher is better. Internal consistency: obtained by correlating the scores on several questions that pertain to the same content with the sum total of the scores. The average item-total correlation is a measure of how consistently people respond to related items on a test.
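Both statistics described above rest on the Pearson correlation coefficient. The sketch below computes a test-retest correlation and an average item-total correlation on invented scores for five respondents. The `pearson` helper and all data are assumptions for illustration. The item-total version follows the description above, where each item is correlated with the sum total that includes it.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Test-retest: the same five people take the same test on two occasions.
time1 = [12, 15, 9, 20, 17]
time2 = [13, 14, 10, 19, 18]
test_retest_r = pearson(time1, time2)

# Internal consistency: rows are respondents, columns are three related items.
items = [
    [4, 5, 4],
    [2, 1, 2],
    [3, 3, 4],
    [5, 4, 5],
    [1, 2, 1],
]
totals = [sum(row) for row in items]
item_total = [
    pearson([row[j] for row in items], totals) for j in range(len(items[0]))
]
average_item_total = sum(item_total) / len(item_total)

print(f"test-retest r = {test_retest_r:.2f}")
print(f"average item-total r = {average_item_total:.2f}")
```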
Stability: how much variation exists in scores upon repeated administrations of the instrument. Stable measures will reproduce the same score on repeated administrations. This concept is similar to test-retest reliability, except that in test-retest situations there is no assumption that the absolute value of each person's test score will stay the same. Considerations: the time between administrations (if too short, respondents may remember their answers) and the learning effect on repeated administrations.
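The difference between stability and test-retest correlation can be made concrete with a short Python sketch on invented scores. Here every respondent gains exactly 5 points on the second administration, a hypothetical learning effect: the correlation is perfect, yet no one's score stayed the same.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Invented scores: everyone gains 5 points on the second administration.
first = [10, 14, 18, 22, 26]
second = [score + 5 for score in first]

# Test-retest view: the rank order and spacing are preserved, so r is 1.
r = pearson(first, second)

# Stability view: compare the absolute scores, which all shifted.
mean_shift = sum(b - a for a, b in zip(first, second)) / len(first)
unchanged = sum(1 for a, b in zip(first, second) if a == b)

print(f"r = {r:.2f}, mean shift = {mean_shift}, unchanged scores = {unchanged}")
```

A high correlation alone therefore does not show that a measure is stable; the absolute scores have to be compared as well.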