# Chapter Eleven Sampling Foundations.

Chapter Objectives
- Define and distinguish between sampling and census studies
- Discuss when to use a probability versus a nonprobability sampling method, and implement the different methods
- Explain sampling error and sampling distribution
- Construct confidence intervals for population means and proportions
- List the factors to consider in determining sample size, and compute the required sample size to achieve a specific degree of precision at a desired confidence level

Copyright © Houghton Mifflin Company. All rights reserved.

Gallup Poll on Sampling: China
- 12,500 counties, cities, and urban districts were divided into 50 strata based on their geographic location, degree of economic development, and proportion of non-agricultural population
- One primary sampling unit (PSU), consisting of either a county or a city, was selected from each stratum with probability proportional to population size
- Within each PSU, the populations of all neighborhoods and villages were compiled; from this listing, four neighborhoods or villages were selected with probability proportional to size
- From each of these four neighborhoods or villages, five households were selected at random

Gallup Poll on Sampling: China (Cont’d)
- One respondent was selected from each of the selected households, ensuring proper representation in the sample of all age groups for both genders
- The respondent to be interviewed was selected according to a prescribed systematic procedure
- If the designated respondent was not at home or could not be reached, a second or, if needed, a third adult family member was selected systematically from among the household members remaining on the list
- If contact with the designated respondent could not be made after a total of three separate visits to the household, an interview with a respondent in a substitute household in the same locality was permitted
- Two substitute households were kept in reserve for every five assigned households in the interviewing area

Gallup Poll on Sampling: China (Cont’d)
With this methodology, and after correcting for any rural/urban sampling issues, the Gallup China polls are statistically accurate to within ±2%

National Poll – Sample Size
- Harris Poll: a weekly study that monitors the reactions of the American public to a variety of economic, political, and social issues
- Sample size: based on a nationally representative telephone survey of 1,000 adults age 18 or over

AC Nielsen Scantrack Index
- Offers scanner-based sales and brand share data on a regular basis to manufacturers of a wide variety of consumer products such as food, drugs, and cosmetics
- Sample size: sales and brand share estimates are gathered weekly from a representative sample of more than 4,800 stores representing over 800 retailers in 50 major markets

Sampling vs. Census Studies
- A census study draws inferences from the entire body of units of interest (the population)
- A sample study draws inferences from a sample drawn from the population

Sampling and Nonsampling Errors
- Sampling error: the difference between a statistic value that is generated through a sampling procedure and the parameter value, which can be determined only through a census study
- Nonsampling error: any error in a research study other than sampling error (which arises purely because a sample, rather than the entire population, is studied)

Minimizing Sampling Errors
- Increase the sample size
- Use a statistically efficient sampling plan
- Make the sample as representative of the population as possible

Types of Nonsampling Errors
- Nonsampling error: any error other than sampling error
- Sampling frame error: the sampling frame is not representative of the ideal population
- Nonresponse error: the final sample is not representative of the planned sample
- Data error: distortions in collected data and mistakes in data coding, analysis, or interpretation

Potential Causes of Sampling Frame Errors

Minimizing Sampling Frame Errors

Potential Causes of Nonresponse Errors
- Mail surveys and Internet surveys: certain types of sample units are more likely to respond than others
- Telephone and personal interview surveys: the person-not-at-home problem and the respondent refusal problem

Minimizing Nonresponse Errors
- Mail surveys: increase response rates through the use of incentives, follow-up mailings, etc.
- Caution: an increase in response rate per se may not reduce nonresponse error
- Telephone and personal interview surveys: make call-backs and spread out the time blocks during which interviews are conducted

Potential Causes of Data Errors

Exhibit 11.1 Types and Potential Causes of Nonsampling Errors

When Census Studies Are Appropriate
- The feasibility condition: the population is relatively small or can be accessed easily
- The necessity condition: the population units are extremely varied, and each population unit is likely to be very different from all the other units

Probability and Nonprobability Sampling
- Probability sampling is an objective procedure in which the probability of selection is known in advance for each population unit
- Nonprobability sampling is a subjective procedure in which the probability of selection for each population unit is unknown beforehand

Exhibit 11.3 Classification of Sampling Methods

Probability Sampling Methods

Gallup Poll: USA
- Identify and describe the population that a given poll is attempting to represent
- Choose or design a method that will enable Gallup to sample the target population randomly
- Random Digit Dialing (RDD): a procedure that creates a list of all possible household phone numbers in America and then selects a subset of numbers from that list for Gallup to call

Simple Random Sampling
Every possible sample of a certain size within a population has a known and equal probability of being chosen as the study sample
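As a rough illustration of this definition, the sketch below draws a simple random sample without replacement using Python's standard library. The ten unit labels, the sample size of two, and the function name `simple_random_sample` are assumptions made for the example, not part of the slides.

```python
import random
from itertools import combinations


def simple_random_sample(population, n, seed=None):
    """Draw n units without replacement; every size-n subset is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)


# Hypothetical population: ten units labeled 1..10
families = list(range(1, 11))
sample = simple_random_sample(families, 2, seed=42)

# Number of distinct samples of two units; each has probability 1 / n_possible
n_possible = len(list(combinations(families, 2)))
print(sample, n_possible)
```

With ten units and samples of two, there are 45 possible samples, so each one has a 1/45 chance of being the study sample.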

Stratified Random Sampling

Proportionate Stratified Random Sampling
Sample consists of units selected from each population stratum in proportion to the total number of units in the stratum

Kirkwood University- Proportionate Stratified Random Sampling
- Administrators of Kirkwood University wanted to determine the attitudes of their students toward various aspects of the university
- They selected a proportionate stratified random sample of 500 students for the attitude survey
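A minimal sketch of how a total sample of 500 could be allocated proportionately across strata follows. The four class-year strata and their enrollment counts are hypothetical placeholders, not the Kirkwood figures, and `proportionate_allocation` is a name introduced only for this example.

```python
def proportionate_allocation(stratum_sizes, total_sample):
    """Allocate total_sample across strata in proportion to stratum size."""
    population = sum(stratum_sizes.values())
    exact = {name: total_sample * size / population
             for name, size in stratum_sizes.items()}
    allocation = {name: int(share) for name, share in exact.items()}
    # Hand any units lost to rounding down to the largest fractional shares
    leftover = total_sample - sum(allocation.values())
    by_fraction = sorted(exact, key=lambda name: exact[name] - allocation[name],
                         reverse=True)
    for name in by_fraction[:leftover]:
        allocation[name] += 1
    return allocation


# Hypothetical enrollment by class year (assumed numbers, for illustration only)
enrollment = {"freshmen": 3000, "sophomores": 2500, "juniors": 2500, "seniors": 2000}
allocation = proportionate_allocation(enrollment, 500)
print(allocation)
```

Each stratum's share of the sample matches its share of the population; a simple random sample of the allocated size would then be drawn within each stratum.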

Table 11.2 Proportionate Allocation of Total Sample of Kirkwood University Students

Disproportionate Stratified Random Sampling

Exhibit 11.4 Disproportionate Stratified Random Sampling Used by A. C

Cluster Sampling
Clusters of population units are selected at random and then all or some units in the chosen clusters are studied
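The idea can be sketched as a two-step draw: pick clusters at random, then keep all units in each chosen cluster or a random subset of them. The city blocks and household labels below are hypothetical, and `cluster_sample` is an illustrative helper rather than a standard routine.

```python
import random


def cluster_sample(clusters, n_clusters, units_per_cluster=None, seed=None):
    """Select clusters at random, then take all or some units from each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    units = []
    for name in chosen:
        members = clusters[name]
        if units_per_cluster is None or units_per_cluster >= len(members):
            units.extend(members)  # study every unit in the cluster
        else:
            units.extend(rng.sample(members, units_per_cluster))  # study a subset
    return chosen, units


# Hypothetical clusters: city blocks, each listing its households
blocks = {
    "block A": ["A1", "A2", "A3", "A4"],
    "block B": ["B1", "B2", "B3"],
    "block C": ["C1", "C2", "C3", "C4", "C5"],
    "block D": ["D1", "D2", "D3"],
}
chosen_all, households_all = cluster_sample(blocks, 2, seed=1)
chosen_some, households_some = cluster_sample(blocks, 2, units_per_cluster=2, seed=1)
```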

Systematic Sampling Steps
An organized procedure for selecting a sample from a list containing all the population units

Steps:
1) Determine the sampling interval $k$:

$$k = \frac{\text{number of units in the population}}{\text{number of units desired in the sample}}$$

Systematic Sampling Steps (Cont’d)
2) Randomly choose one unit between the first and kth units in the population list
3) The randomly chosen unit and every kth unit thereafter are designated as part of the sample
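The three steps can be sketched in a few lines of Python. The list of 100 numbered units and the sample size of 10 are assumptions for illustration, and the sketch uses integer division for the interval, which is one common way to handle a population size that is not an exact multiple of the sample size.

```python
import random


def systematic_sample(units, n, seed=None):
    """Systematic sample of n units from an ordered population list."""
    k = len(units) // n                 # step 1: sampling interval
    rng = random.Random(seed)
    start = rng.randint(1, k)           # step 2: random unit among the first k
    # step 3: the chosen unit and every kth unit thereafter
    return [units[i] for i in range(start - 1, len(units), k)][:n]


# Hypothetical population list of 100 numbered units
population_list = list(range(1, 101))
picked = systematic_sample(population_list, 10, seed=7)
print(picked)
```

Only the starting point is random; once it is chosen, the rest of the sample is fully determined by the interval.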

Practical Considerations: Probability Sampling Methods
Probability sampling techniques are generally used by large commercial marketing research firms that maintain national samples or panels that can be readily accessed for conducting periodic research surveys

Nonprobability Sampling Methods

Convenience Sampling
- The researcher's convenience forms the basis for selecting a sample of units
- Example: The administrators of a college have announced a sharp increase in tuition fees for the next year. A TV reporter covering this news item is shown standing on campus talking to several students, one at a time, about their reactions to the proposed tuition fee increase. The TV reporter says: “While some of the students feel that the 10 percent fee hike is justified, most of them consider it to be unfair.”

Judgment Sampling
- A procedure in which a researcher exerts some effort in selecting a sample that he or she believes is most appropriate for a study
- Example: The administrators of a college have announced a sharp increase in tuition fees for the next year. A judgment sample of student officers may be more representative than a convenience sample of students
- The researcher should be knowledgeable about the ideal population for a study

Quota Sampling
Involves sampling a quota of units from each population cell, with the quotas based on the judgment of the researchers and/or decision makers

Steps:
1) Divide the population into segments (referred to as cells) based on certain control characteristics
2) Determine the quota of units for each cell (quotas are determined by the researchers and/or decision makers)
3) Instruct the interviewers to fill the quotas assigned to the cells

Parameter & Statistic
- Parameter: the actual, or true, population mean value or population proportion for any variable (e.g., income, product ownership)
- Statistic: an estimate of a parameter from sample data

Sampling Error
- Sampling error = parameter value − statistic value
- It is the difference between a statistic value that is generated through a sampling procedure and the parameter value, which can be determined only through a census study

Sampling Distribution
A representation of the sample statistic values obtained from every conceivable sample of a certain size, chosen from a population by a specified sampling procedure, along with the relative frequency of occurrence of those statistic values

Sampling Distribution

Table 11.4 Expenditures for Eating Out for a Hypothetical Population
| Family Number | Annual expenditure for eating out (\$) |
|---|---|
| 1 | 50 |
| 2 | 100 |
| 3 | 150 |
| 4 | 200 |
| 5 | 250 |
| 6 | 300 |
| 7 | 350 |
| 8 | 400 |
| 9 | 450 |
| 10 | 500 |

Table 11.5 Partial List of Possible Samples and Sample Means

Exhibit 11.5 Sampling Distribution (Bar Chart) for Simple Random Samples of Two Units

Exhibit 11.6 Sampling Distribution Shown as a Histogram
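For the ten-family population of Table 11.4, the sampling distribution for simple random samples of two units can be enumerated directly. The sketch below assumes the two units are drawn without replacement and that order does not matter, which gives 45 possible samples; it also computes the sampling error (parameter value minus statistic value) for each sample.

```python
from collections import Counter
from itertools import combinations

# Annual eating-out expenditures ($) for families 1..10 (Table 11.4)
expenditures = [50, 100, 150, 200, 250, 300, 350, 400, 450, 500]
population_mean = sum(expenditures) / len(expenditures)  # the parameter

# Assumption: samples of two are drawn without replacement, order ignored
sample_means = [(a + b) / 2 for a, b in combinations(expenditures, 2)]

# Relative frequency of each possible sample mean
counts = Counter(sample_means)
sampling_distribution = {
    m: c / len(sample_means) for m, c in sorted(counts.items())
}

# Sampling error for each possible sample
sampling_errors = [population_mean - m for m in sample_means]

for m, rel_freq in sampling_distribution.items():
    print(f"sample mean {m:>6.1f}: relative frequency {rel_freq:.3f}")
```

The sample means are centered on the population mean, and the sampling errors across all possible samples cancel out.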

Central Limit Theorem
When the sample size is sufficiently large, the sampling distribution associated with the sampling procedure displays the properties of a normal distribution.
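A small simulation can make the theorem concrete. The sketch below is only an illustration under assumed settings: a deliberately skewed hypothetical population (exponential, with mean 1 and standard deviation 1), samples of 50 units, and 2,000 repeated samples. If the sample means behave like a normal distribution, roughly 95 percent of them should fall within 1.96 standard errors of the population mean.

```python
import math
import random

rng = random.Random(11)   # fixed seed so the run is repeatable
n = 50                    # units per sample (assumed)
replications = 2000       # number of repeated samples (assumed)

# Hypothetical skewed population: exponential with mean 1 and std. deviation 1
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(replications)
]

standard_error = 1.0 / math.sqrt(n)
within = sum(abs(m - 1.0) <= 1.96 * standard_error for m in sample_means)
share_within = within / replications  # a normal distribution gives about 0.95

print(f"share of sample means within 1.96 standard errors: {share_within:.3f}")
```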

Confidence Estimation for Interval Data
- $n$ = number of units in the sample
- $\bar{x}$ = sample mean value
- $s$ = sample standard deviation
- $s_{\bar{x}} = s / \sqrt{n}$ = estimated standard error of the sample mean

Confidence Estimation for Interval Data (Cont’d)
Given $n = 100$, $\bar{x} = 1{,}278$ units, and $s = 399$ units, construct a 95 percent confidence interval:

$$s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{399}{\sqrt{100}} = 39.9 \text{ units}$$

The 95 percent confidence interval is

$$\bar{x} \pm 1.96\, s_{\bar{x}} = 1{,}278 \pm (1.96)(39.9) = 1{,}278 \pm 78.2 \approx 1{,}278 \pm 78$$

Confidence Estimation for Interval Data (Cont’d)
Interpretation: From the sample data, we can be 95 percent confident that the average annual sales of men's suits, across all men's clothing stores in the population, are between 1,200 and 1,356 units
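The men's suits calculation can be checked with a short sketch. It uses the same inputs as the slides (n = 100, sample mean 1,278, s = 399, z = 1.96); the function name `mean_confidence_interval` is introduced here only for illustration.

```python
import math


def mean_confidence_interval(x_bar, s, n, z=1.96):
    """Confidence interval for a population mean: x_bar +/- z * s / sqrt(n)."""
    standard_error = s / math.sqrt(n)
    margin = z * standard_error
    return x_bar - margin, x_bar + margin


low, high = mean_confidence_interval(x_bar=1278, s=399, n=100)
print(f"95% confidence interval: {low:.0f} to {high:.0f} units")
```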

Finding Confidence Intervals for Population Proportions
- $\pi$ = true population proportion (i.e., the parameter value)
- $p$ = proportion obtained from a single sample (i.e., the statistic value)
- $s_p$ = estimate of the standard error of the sample proportion

$$p = \frac{\text{number of sample units having a certain feature}}{\text{total number of sample units (i.e., } n\text{)}}, \qquad s_p = \sqrt{\frac{p(1-p)}{n}}$$

Confidence interval for the population proportion, where $z_q$ is the standard normal value for the desired confidence level (1.96 for 95 percent):

$$p - z_q s_p \le \pi \le p + z_q s_p$$

Finding Confidence Intervals for Population Proportions (Cont’d)
Given $n = 100$ and $p = .64$, construct a 95 percent confidence interval for the population proportion:

$$s_p = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{(.64)(.36)}{100}} = .048$$

The 95 percent confidence interval is

$$p \pm 1.96\, s_p = .64 \pm (1.96)(.048) = .64 \pm .094 \approx .64 \pm .09$$

Finding Confidence Intervals for Population Proportions (Cont’d)
Interpretation:
- This confidence interval can also be expressed in percentage terms: 64% ± 9%
- In other words, we can be 95 percent confident that between 55 and 73 percent of all grocery stores in the city carry potted plants
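The grocery store example can be reproduced with a minimal sketch using the same inputs (n = 100, p = .64, z = 1.96). The function name `proportion_confidence_interval` is an assumption made for this example.

```python
import math


def proportion_confidence_interval(p, n, z=1.96):
    """Confidence interval for a population proportion: p +/- z * sqrt(p(1-p)/n)."""
    standard_error = math.sqrt(p * (1 - p) / n)
    margin = z * standard_error
    return p - margin, p + margin


low, high = proportion_confidence_interval(p=0.64, n=100)
print(f"95% confidence interval: {low:.2f} to {high:.2f}")
```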

Factors Influencing Sample Size

Methods for Determining Sample Size
- The desired precision level
- The desired confidence level
- An estimate of the degree of variability in the population, expressed in the form of a standard deviation

Sample Size Estimation
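- $H$ = desired precision level
- $q$ = desired confidence level
- $z_q$ = standard normal value corresponding to confidence level $q$
- $s$ = sample standard deviation
- $n$ = required sample size

The precision level is the half-width of the confidence interval:

$$H = \frac{z_q s}{\sqrt{n}}$$

Solving for $n$ gives

$$n = \frac{z_q^2 s^2}{H^2}$$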

Sample Size Estimation (Cont’d)
- A marketing manager of a frozen-foods firm wants to estimate, within ±\$10, the average annual amount that families in a certain city spend on frozen foods, and to have 99 percent confidence in the estimate
- He estimates that the standard deviation of annual family expenditures on frozen foods is about \$100
- How many families must be chosen for this study?

Sample Size Estimation (Cont’d)
H = \$10, s = \$100, and $z_q = 2.575$ (corresponding to a confidence level of 99 percent)

$$n = \frac{z_q^2 s^2}{H^2} = \frac{(2.575)^2 (100)^2}{(10)^2} \approx 663 \text{ families}$$
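The frozen-foods calculation can be sketched as follows, with the same inputs (H = 10, s = 100, z = 2.575). The function name `sample_size_for_mean` is illustrative. The formula yields about 663.06, which the slide reports as approximately 663; rounding up to the next whole family would give 664.

```python
import math


def sample_size_for_mean(precision, std_dev, z):
    """Required sample size n = z^2 * s^2 / H^2 (unrounded)."""
    return (z ** 2) * (std_dev ** 2) / (precision ** 2)


n_required = sample_size_for_mean(precision=10, std_dev=100, z=2.575)
print(f"n = {n_required:.2f}; rounded up: {math.ceil(n_required)}")
```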

Determining Sample Size
- A sporting goods marketer wants to estimate the proportion of tennis players among high school students in the United States
- The marketer wants the estimate to be accurate within ±.02 and wants to have 95 percent confidence in the interval estimate
- A pilot telephone survey of 50 high school students showed that 20 of them played tennis
- Estimate the required sample size for the final study from the given data
- What should the sample size be if the desired precision and confidence levels are to be guaranteed?

Determining Sample Size (Cont’d)
$H = .02$, $z_q = 1.96$, and $p = 20/50 = .4$

$$s = \sqrt{p(1-p)} = \sqrt{(.4)(.6)} = \sqrt{.24}$$

$$n = \frac{z_q^2 s^2}{H^2} = \frac{(1.96)^2 (.24)}{(.02)^2} \approx 2{,}305 \text{ students}$$

The maximum sample size occurs when $p = .5$, so that $s^2 = p(1-p) = .25$:

$$n_{\max} = \frac{.25\, z_q^2}{H^2} = \frac{(.25)(1.96)^2}{(.02)^2} = 2{,}401 \text{ students}$$
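The tennis example can be checked with a short sketch, using the variance of a proportion, p(1 − p), in place of s² and the same inputs (H = .02, z = 1.96, pilot p = 20/50). The function name `sample_size_for_proportion` is introduced here only for illustration.

```python
def sample_size_for_proportion(precision, z, p=0.5):
    """Required sample size n = z^2 * p(1-p) / H^2 (unrounded).

    The default p = 0.5 gives the largest possible n, since p(1-p) <= 0.25.
    """
    return (z ** 2) * p * (1 - p) / (precision ** 2)


pilot_p = 20 / 50
n_pilot = sample_size_for_proportion(precision=0.02, z=1.96, p=pilot_p)
n_max = sample_size_for_proportion(precision=0.02, z=1.96)

print(f"based on pilot: about {round(n_pilot)}; guaranteed: about {round(n_max)}")
```

Using p = .5 answers the "guaranteed" question, because no other value of p can require a larger sample.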