# Copyright ©2011 Brooks/Cole, Cengage Learning Sampling: Surveys and How to Ask Questions Chapter 5 1.

## Presentation on theme: "Copyright ©2011 Brooks/Cole, Cengage Learning Sampling: Surveys and How to Ask Questions Chapter 5 1."— Presentation transcript:


Principal Idea: The data collection method used affects the extent to which sample data can be used to make inferences about a larger population.

### 5.1 Collecting and Using Sample Data Wisely

- Descriptive statistics: using numerical and graphical summaries to characterize a data set or describe a relationship.
- Inferential statistics: using sample information to draw conclusions about a broader range of individuals than just those observed.

The Fundamental Rule for Using Data for Inference: Available data can be used to make inferences about a much larger group if the data can be considered representative with regard to the question(s) of interest.

Example 5.1: Do First Ladies Represent Other Women? Past First Ladies are not likely to be representative of other American women, nor even of future First Ladies, on the question of age at death, since medical, social, and political conditions keep changing in ways that may affect their health.

Example 5.2: Do Penn State Students Represent Other College Students?

- If the question of interest is the average handspan of females in the college age range: yes.
- If the question of interest is the fastest speed at which they have ever driven a car: no, since Penn State is in a rural area with open spaces, county roads, and little traffic.

Populations, Samples, and Simple Random Samples

- Population: the entire group of units about which inferences are to be made.
- Sample: the smaller group of units actually measured or surveyed.
- Census: every unit in the population is measured or surveyed.

Simple random sample: every conceivable group of units of the required size from the population has the same chance of being the selected sample. This helps ensure that the sample data will be representative of the population, but such a sample can be difficult to obtain.

Sample survey: a subgroup of a large population is questioned on a set of topics. It is a special type of observational study, and it costs less and takes less time than a census.

Advantages of a Sample Survey over a Census

- Sometimes a census isn't possible: for example, when measurements destroy units.
- Speed: especially if the population is large.
- Accuracy: resources can be devoted to getting accurate sample results.

Bias: How Surveys Can Go Wrong

Results based on a survey are biased if the method used to obtain those results would consistently produce values that are either too high or too low.

- Selection bias occurs if the method for selecting participants produces a sample that does not represent the population of interest.
- Nonparticipation bias (nonresponse bias) occurs when a representative sample is chosen but a subset cannot be contacted or doesn't respond.
- Biased response (response bias) occurs when participants respond differently from how they truly feel.

### 5.2 Margin of Error, Confidence Intervals, and Sample Size

With proper methods, a sample of 1600 people from a population of millions can almost certainly gauge the percentage of the entire population who have a certain trait or opinion to within 2.5%.

Margin of Error: The Accuracy of Sample Surveys

The sample proportion and the population proportion with a certain trait or opinion differ by less than the margin of error in at least 95% of all random samples.

Conservative margin of error = $1/\sqrt{n}$, where $n$ is the sample size.

Add and subtract the margin of error from the sample proportion to create an approximate 95% confidence interval.

Confidence Intervals

95% confidence interval for a population proportion: for about 95% of properly conducted sample surveys, the interval

$$\text{sample proportion} - \frac{1}{\sqrt{n}} \quad \text{to} \quad \text{sample proportion} + \frac{1}{\sqrt{n}}$$

will contain the actual population proportion, where $n$ is the sample size and $1/\sqrt{n}$ is the conservative margin of error.

Another way to write it: $\text{sample proportion} \pm \frac{1}{\sqrt{n}}$

Example 5.3: The Importance of Religion for Adult Americans

Poll of n = 1025 adult Americans: “How important would you say religion is in your own life?”

| Response | Percent |
|---|---|
| Very important | 56% |
| Fairly important | 25% |
| Not very important | 19% |

The conservative margin of error is 3%. Approximate 95% confidence interval for the percent of all adult Americans who say religion is very important: 56% ± 3%, or 53% to 59%.
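The arithmetic in this example can be checked with a short Python sketch. It assumes the conservative margin of error is computed as 1/√n, which agrees with the 3% quoted for n = 1025; the function names are illustrative and do not come from the slides.

```python
import math


def conservative_margin_of_error(n: int) -> float:
    """Conservative 95% margin of error for a sample proportion: 1 / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return 1.0 / math.sqrt(n)


def approx_95_ci(sample_proportion: float, n: int) -> tuple[float, float]:
    """Approximate 95% interval: sample proportion plus or minus the margin."""
    me = conservative_margin_of_error(n)
    # Clamp to [0, 1] because a proportion cannot fall outside that range.
    return max(0.0, sample_proportion - me), min(1.0, sample_proportion + me)


# Values from Example 5.3: n = 1025, 56% answered "very important".
me = conservative_margin_of_error(1025)
low, high = approx_95_ci(0.56, 1025)
print(f"margin of error: {me:.1%}")
print(f"approx. 95% CI: {low:.1%} to {high:.1%}")
```

The unrounded margin is a little over 3%, which the slide rounds to 3% to get the interval 53% to 59%.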

Interpreting the Confidence Interval: The interval 53% to 59% may or may not capture the percent of adult Americans who considered religion to be very important in their lives. In the long run, however, this procedure will produce intervals that capture the unknown population values about 95% of the time. That long-run rate is called the 95% confidence level.

Choosing a Sample Size for a Survey: If m.e. is the desired margin of error for a 95% confidence interval for a population proportion, the required sample size is

$$n = \left(\frac{1}{\text{m.e.}}\right)^{2}$$

with m.e. expressed as a proportion (for example, 0.03 for 3%).
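A minimal sketch of the sample-size calculation, assuming the conservative margin of error 1/√n is the one intended, so that the required size is n = (1/m.e.)² rounded up to a whole person. The function name and the decimal-string input are choices made here for illustration, not part of the slides.

```python
import math
from fractions import Fraction


def required_sample_size(margin_of_error: str) -> int:
    """Smallest n whose conservative margin of error 1/sqrt(n) meets the target.

    The margin is passed as a decimal string (e.g. "0.025") so it converts to
    an exact fraction; floating-point rounding cannot push the answer up by one.
    """
    me = Fraction(margin_of_error)
    if me <= 0:
        raise ValueError("margin of error must be positive")
    return math.ceil(1 / me**2)


for target in ("0.05", "0.03", "0.025", "0.01"):
    print(f"m.e. = {target}: n = {required_sample_size(target)}")
```

A target of 2.5% gives n = 1600, the sample size quoted at the start of Section 5.2. Halving the margin of error quadruples the required sample.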

The Effect of Population Size: The m.e. for a sample of 1000 is about 3% whether the population size is 30,000 or 200 million. In practice, as long as the population is at least 10 times as large as the sample, the population size has almost no influence on the accuracy of sample estimates.
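A small simulation can make this claim concrete. The sketch below draws repeated simple random samples of 1000 from populations of 30,000 and 200 million and counts how often the sample proportion lands within 1/√n of the truth. The true proportion of 0.5, the 200 trials, and the seed are all assumptions chosen for illustration.

```python
import math
import random


def share_within_margin(population_size, n=1000, true_proportion=0.5,
                        trials=200, seed=1):
    """Fraction of simple random samples whose sample proportion falls
    within 1/sqrt(n) of the true population proportion."""
    rng = random.Random(seed)
    margin = 1 / math.sqrt(n)
    # Units numbered 0 .. n_with_trait - 1 are treated as having the trait.
    n_with_trait = int(population_size * true_proportion)
    hits = 0
    for _ in range(trials):
        # Sampling from a range avoids building a list of the whole population.
        sample = rng.sample(range(population_size), n)
        p_hat = sum(unit < n_with_trait for unit in sample) / n
        if abs(p_hat - n_with_trait / population_size) < margin:
            hits += 1
    return hits / trials


for size in (30_000, 200_000_000):
    share = share_within_margin(size)
    print(f"population {size:>11,}: {share:.0%} of samples within the margin")
```

Both population sizes should give a share in the neighborhood of 95%, which is the point of the slide: the sample size, not the population size, drives the accuracy.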

### 5.3 Choosing a Simple Random Sample

- Probability sampling plan: everyone in the population has a specified chance of making it into the sample.
- Simple random sample: every conceivable group of units of the required size has the same chance of being the selected sample.

Choosing a Simple Random Sample. You need:

1. A list of the units in the population.
2. A source of random numbers: a table of random digits, a random number generator, or computer software.

Simple Random Sample of Students: A school has 5000 students, and we want a simple random sample of 10 of them.

1. Number the units: students are numbered 1 to 5000.
2. Ask a computer program (e.g. Minitab) to randomly select 10 of them.
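The two steps above can be sketched in Python instead of Minitab. This is only an illustration: the function name and the seed are assumptions, and the standard library's `random.sample` stands in for whatever generator the statistical package uses.

```python
import random


def simple_random_sample(population_size, sample_size, seed=None):
    """Pick sample_size distinct unit numbers from 1..population_size, with
    every possible group of that size equally likely to be chosen."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so no student appears twice.
    return sorted(rng.sample(range(1, population_size + 1), sample_size))


# Step 1: students are numbered 1 to 5000. Step 2: select 10 at random.
student_ids = simple_random_sample(5000, 10, seed=2011)
print(student_ids)
```

Fixing the seed makes the selection reproducible; leaving it as `None` gives a fresh sample on each run.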

Example 5.6: Representing the Heights of British Women

Simple random sample of 10 from 199 British women.

1. Assign an ID number from 001 to 199 to each woman.
2. Use random digits to select ten numbers between 001 and 199, and sample the heights of the women with those IDs.

Sample 1, using the statistical package Minitab:

- IDs: 176, 10, 1, 40, 85, 162, 46, 69, 77, 154
- Heights: 60.6, 63.4, 62.6, 65.7, 69.3, 68.7, 61.8, 64.6, 60.8, 59.9; mean = 63.7 inches

Sample 2, using a table of random digits:

- IDs: 41, 93, 167, 33, 157, 131, 110, 180, 185, 196
- Heights: 59.4, 66.5, 63.8, 62.6, 65.0, 60.2, 67.3, 59.8, 67.7, 61.8; mean = 63.4 inches
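The two sample means can be reproduced directly from the listed heights. The snippet below is only an arithmetic check on the values given in the slide; the variable names are illustrative.

```python
from statistics import mean

# Heights in inches, copied from the two samples in Example 5.6.
sample_1 = [60.6, 63.4, 62.6, 65.7, 69.3, 68.7, 61.8, 64.6, 60.8, 59.9]
sample_2 = [59.4, 66.5, 63.8, 62.6, 65.0, 60.2, 67.3, 59.8, 67.7, 61.8]

for name, heights in (("Sample 1", sample_1), ("Sample 2", sample_2)):
    print(f"{name}: mean = {mean(heights):.1f} inches")
```

The two means differ slightly because each simple random sample picks a different set of ten women.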

### 5.4 Other Sampling Methods

It is not always practical to take a simple random sample, because it can be difficult to get a numbered list of all units.

Example: A college administration would like to survey a sample of students living in dormitories.

(Figure: shaded squares show a simple random sample of 30 rooms.)

Stratified Random Sampling: Divide the population of units into groups (called strata) and take a simple random sample from each of the strata.

- College survey: two strata, undergraduate and graduate dorms. Take a simple random sample of 15 rooms from each of the strata for a total of 30 rooms.
- Ideal: stratify so that there is little variability in responses within each of the strata.
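A minimal sketch of the stratified plan for the college survey. The room lists are hypothetical (the slides do not say how many rooms each dorm type has), and the function name is chosen here for illustration.

```python
import random


def stratified_sample(strata, per_stratum, seed=None):
    """Take a simple random sample of per_stratum units from every stratum."""
    rng = random.Random(seed)
    return {name: sorted(rng.sample(units, per_stratum))
            for name, units in strata.items()}


# Hypothetical room labels: 200 undergraduate rooms and 100 graduate rooms.
strata = {
    "undergrad": [f"U{room:03d}" for room in range(1, 201)],
    "graduate": [f"G{room:03d}" for room in range(1, 101)],
}

chosen = stratified_sample(strata, per_stratum=15, seed=5)
for name, rooms in chosen.items():
    print(name, rooms)
```

Each stratum is guaranteed exactly 15 rooms, which a single simple random sample of 30 rooms from the combined list would not ensure.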

Cluster Sampling: Divide the population of units into groups (called clusters), take a random sample of clusters, and measure only the items in those clusters.

- College survey: each floor of each dorm is a cluster. Take a random sample of 5 floors, and survey all rooms on those floors.
- Advantage: only a list of the clusters is needed, instead of a list of all individuals.
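The cluster plan can be sketched the same way. The dorm layout below (4 dorms, 5 floors each, 12 rooms per floor) is entirely hypothetical, as is the function name; only the idea of sampling 5 floors and surveying every room on them comes from the slide.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose n_clusters clusters and keep every unit inside them."""
    rng = random.Random(seed)
    # Only the list of cluster labels is needed to make the random choice.
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {label: clusters[label] for label in chosen}


# Hypothetical layout: each (dorm, floor) pair is one cluster of 12 rooms.
clusters = {
    (dorm, floor): [f"{dorm}-{floor}{room:02d}" for room in range(1, 13)]
    for dorm in ("A", "B", "C", "D")
    for floor in range(1, 6)
}

surveyed = cluster_sample(clusters, n_clusters=5, seed=5)
rooms = [room for floor_rooms in surveyed.values() for room in floor_rooms]
print(sorted(surveyed))
print(len(rooms), "rooms surveyed")
```

Unlike the stratified plan, the randomness here is in which groups are picked, not in which rooms are picked within a group.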

Systematic Sampling: Order the population of units in some way, select one of the first k units at random, and then select every kth unit thereafter.

- College survey: order the list of rooms starting at the top floor of the 1st undergraduate dorm. Pick one of the first 11 rooms at random (for example, room 3), then pick every 11th room after that.
- Note: this is often a good alternative to random sampling, but it can lead to a biased sample.
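A short sketch of the systematic plan with k = 11. The total of 330 rooms is an assumption made here so that every 11th room yields 30 rooms; the slides do not state the number of rooms, and the function name is illustrative.

```python
import random


def systematic_sample(units, k, seed=None):
    """Pick one of the first k units at random, then every kth unit after it."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # index 0..k-1, i.e. one of the first k units
    return units[start::k]


# Hypothetical ordered list of 330 rooms, numbered from 1.
rooms = list(range(1, 331))
sample = systematic_sample(rooms, k=11, seed=3)
print(sample)
```

Only the starting point is random, so if the ordering has a pattern that repeats every 11 rooms (say, every 11th room is a corner room), the sample can be biased, as the slide's note warns.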

Random-Digit Dialing: This method approximates a simple random sample of all households in the United States that have telephones.

1. List all possible exchanges (= area code + next 3 digits).
2. Take a sample of exchanges (chance of being sampled based on the white-pages proportion of households with a specific exchange).
3. Take a random sample of banks (= next 2 digits) within each sampled exchange.
4. Randomly generate the last two digits from 00 to 99.

Once a phone number has been determined, make multiple attempts to reach someone at that household.
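The four steps can be sketched as below. This is a simplified illustration, not the procedure any polling organization actually runs: the exchanges and their household shares are made up, and step 3 is reduced to drawing one bank uniformly at random for each number rather than sampling a set of banks per exchange.

```python
import random


def random_digit_dial(exchange_weights, seed=None):
    """Generate one 10-digit number: exchange (6) + bank (2) + last two digits."""
    rng = random.Random(seed)
    exchanges = list(exchange_weights)
    weights = [exchange_weights[e] for e in exchanges]
    # Step 2: sample an exchange in proportion to its share of households.
    exchange = rng.choices(exchanges, weights=weights, k=1)[0]
    # Step 3 (simplified): pick one bank, the next two digits, at random.
    bank = f"{rng.randrange(100):02d}"
    # Step 4: randomly generate the last two digits from 00 to 99.
    last_two = f"{rng.randrange(100):02d}"
    return exchange + bank + last_two


# Step 1: hypothetical exchanges (area code + next 3 digits) with made-up shares.
exchange_weights = {"814555": 0.5, "212555": 0.3, "312555": 0.2}

number = random_digit_dial(exchange_weights, seed=27)
print(number)
```

Because the final digits are generated rather than looked up, unlisted numbers have the same chance of being dialed as listed ones.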

### 5.5 Difficulties and Disasters in Sampling

- Using the wrong sampling frame
- Not reaching the individuals selected
- Nonresponse or nonparticipation
- Self-selected sample
- Convenience/haphazard sample

Some problems occur even when a sampling plan has been well designed.

Case Study 5.1: The Infamous Literary Digest Poll of 1936

Election of 1936: Democratic incumbent Franklin D. Roosevelt versus Republican Alf Landon.

Literary Digest poll:

- Sent questionnaires to 10 million people drawn from magazine subscriber lists, phone directories, and lists of car owners, who were more likely to be wealthy and unhappy with Roosevelt.
- Received only 2.3 million responses, a 23% response rate. Those with strong feelings, the Landon supporters wanting a change, were more likely to respond.
- (Incorrectly) predicted a 3-to-2 victory for Landon.

Case Study 5.1 (cont): the Gallup poll

- George Gallup had just founded the American Institute of Public Opinion in 1935.
- Surveyed a random sample of 50,000 people from a list of registered voters.
- Also took a random sample of 3000 people from the Digest lists.
- (Correctly) predicted Roosevelt the winner.
- Also predicted the (wrong) results of the Literary Digest poll to within 1%.

### 5.6 How to Ask Survey Questions

Possible Sources of Response Bias in Surveys

- Deliberate bias: the wording of a question can deliberately bias the responses toward a desired answer.
- Unintentional bias: questions can be worded such that the meaning is misinterpreted by a large percentage of the respondents.
- Desire to please: respondents have a desire to please the person who is asking the question. They tend to understate their response to an undesirable social habit or opinion.

Possible Sources of Response Bias in Surveys (cont)

- Asking the uninformed: people do not like to admit that they don't know what you are talking about when you ask them a question.
- Unnecessary complexity: if questions are to be understood, they must be kept simple. Some questions ask more than one question at once.
- Ordering of questions: if one question requires respondents to think about something that they may not have otherwise considered, then the order in which questions are presented can change the results.

Possible Sources of Response Bias in Surveys (cont)

- Confidentiality and anonymity: people will often answer questions differently based on the degree to which they believe they are anonymous. It is easier to ensure confidentiality (a promise not to release identifying information) than anonymity (the researcher does not know the identity of the respondents).

- Be sure you understand what was measured: words can have different meanings, so it is important to get a precise definition of what was actually asked or measured. E.g. who is really unemployed?
- Some concepts are hard to define precisely: e.g. how to measure intelligence?
- Measuring attitudes and emotions: e.g. how to measure self-esteem and happiness?

Open or Closed Questions: Should Choices Be Given?

- Open question: respondents are allowed to answer in their own words.
- Closed question: respondents are given a list of alternatives, usually including a choice of “other” with a blank to fill in.
- If closed questions are preferred, they should first be presented as open questions (in a pilot survey) to establish the list of choices.
- Problems with open questions: results can be difficult to summarize.
