Sampling Variability Sampling Distributions


Chapter 8: Sampling Variability and Sampling Distributions
Created by Kathy Fritz

Suppose that you are interested in learning about the proportion of women in the group of students pictured below and that this group is the entire population of interest. The proportion of women is $19/34 \approx 0.56$. You can compute this proportion because the picture provides complete information on gender for the entire population (a census).

But suppose that the population information is not available. To learn about the proportion of women in the population, you decide to select a sample from the population by choosing 5 students at random. Here is one possible sample. The proportion of women is $3/5 = 0.6$. Here is a different sample. The proportion of women is $2/5 = 0.4$. Notice that the proportions from the two different samples are NOT the same AND that neither of these proportions equals the population proportion. In this chapter, you will learn about how the value of the sample proportion varies from sample to sample (sampling variability) AND about the long-run behavior of sample proportions (the sampling distribution).

Statistics and Sampling Variability

Statistic: A number computed from the values in a sample is called a statistic. The observed value of a statistic varies from sample to sample depending on the particular sample selected. This variability is called sampling variability. Statistics, such as the sample mean $\bar{x}$, the sample median, the sample standard deviation $s$, and the sample proportion $\hat{p}$, provide information about population characteristics. Recall the notation for the corresponding population characteristics: the mean $\mu$, the standard deviation $\sigma$, and the proportion $p$.

Consider a small population consisting of 100 students in an introductory psychology course. Students in the class completed a survey on academic procrastination. Suppose that on the basis of their responses, 40% of the students were identified as severe procrastinators, so $p = 40/100 = 0.40$. Let's investigate what happens if random samples of size 20 are selected from this population. To do this, write the numbers 1 to 100 on slips of paper, where 1 to 40 represent students who are severe procrastinators. Mix the slips well, then select 20 slips without replacement.

Severe procrastinators, continued. One sample of size 20 might be:
78 16 31 86 5 67 39 28 97 70 34 65 47 89 26 79 52 4 82 6
The nine highlighted numbers (those from 1 to 40: 16, 31, 5, 39, 28, 34, 26, 4, and 6) correspond to students who were identified as severe procrastinators, so $\hat{p} = 9/20 = 0.45$. This value is 0.05 larger than the population proportion of 0.40. Is this difference typical, or is this particular sample proportion unusually far away from $p$? Looking at some additional samples will provide some insight.

Severe procrastinators, continued. Let's look at 50 samples of size 20.
[Table of $\hat{p}$ for samples 1 to 50; only these entries are legible in the transcript]
Sample 1: 0.45
Sample 3: 0.35
Sample 12: 0.50
Sample 13: 0.20
Sample 16: 0.40
Sample 21: 0.25
Sample 23: 0.40
Sample 26: 0.35
Sample 31: 0.30
Sample 32: 0.55
Sample 36: 0.65
It is difficult to see, by looking at this table, if a sample proportion of 0.45 is typical or unusual. Let's look at a histogram of these sample proportions. The histogram shows the sampling variability in the statistic $\hat{p}$. It also provides an approximation to the distribution of $\hat{p}$ values that would have been observed if you had considered EVERY different possible sample of size 20 from this population.
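The slips-of-paper procedure for the procrastination example can be imitated in software. The following is a minimal sketch, not the method used to build the slides: the function name `sample_proportion` and the seed are assumptions chosen for illustration, and the simulated values will not match the table from the presentation.

```python
import random

def sample_proportion(rng, population_size=100, n_success=40, sample_size=20):
    """Draw slips numbered 1..population_size without replacement.

    Slips 1..n_success stand for severe procrastinators, as in the slide.
    Returns the sample proportion of severe procrastinators.
    """
    slips = rng.sample(range(1, population_size + 1), sample_size)
    return sum(1 for slip in slips if slip <= n_success) / sample_size

rng = random.Random(2024)  # arbitrary seed (an assumption), used only for reproducibility
proportions = [sample_proportion(rng) for _ in range(50)]

# Tally how often each value of p-hat occurred: a text version of the histogram.
counts = {}
for p_hat in proportions:
    counts[p_hat] = counts.get(p_hat, 0) + 1

for value in sorted(counts):
    print(f"{value:.2f} {'*' * counts[value]}")
```

Each row of the printout is one possible value of the sample proportion, so the rows together give a rough picture of the sampling variability across the 50 samples.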

Sampling Distribution The distribution formed by the values of a sample statistic for every possible different sample of a given size is called its sampling distribution.
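For the small procrastination population (100 students, 40 severe procrastinators, samples of size 20), the sampling distribution can be computed exactly by counting samples rather than simulating them. The sketch below assumes simple random sampling without replacement, which makes the count of successes hypergeometric; the variable names are illustrative.

```python
from math import comb

N, K, n = 100, 40, 20  # population size, severe procrastinators, sample size

# Number of distinct samples of size n that can be drawn from the population.
total_samples = comb(N, n)

# P(p-hat = k/n) = (samples with exactly k successes) / (all possible samples)
sampling_dist = {
    k / n: comb(K, k) * comb(N - K, n - k) / total_samples
    for k in range(n + 1)
}

mean_p_hat = sum(value * prob for value, prob in sampling_dist.items())
# Small tolerance guards against floating-point rounding in k / n.
prob_at_least_045 = sum(
    prob for value, prob in sampling_dist.items() if value >= 0.45 - 1e-9
)

print(f"mean of p-hat    = {mean_p_hat:.4f}")
print(f"P(p-hat >= 0.45) = {prob_at_least_045:.3f}")
```

The second printed value addresses the earlier question of whether a sample proportion of 0.45 is unusual for this population.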

Sampling Distribution of a Sample Proportion

In the fall of 2008, there were 18,516 students enrolled at California Polytechnic State University, San Luis Obispo. Of these students, 8091 (43.7%) were female. What would you expect to see for the sample proportion of females if you were to take a random sample of size 10 from this population? With success denoting a female student, the proportion of successes in this population is p = 0.437. Using a statistical software package, we will generate 500 samples of size 10 and compute the proportion of females for each sample.

California Polytechnic State University, continued. The relative frequency histogram displays the 500 values of $\hat{p}$. Notice that there is a lot of sample-to-sample variability in the value of $\hat{p}$, with values from 0.0 to 0.9. This tells you that a sample of size 10 from this population of students may not provide very accurate information about the proportion of women in the population. How would selecting a larger sample affect the distribution of $\hat{p}$?

California Polytechnic State University Continued . . . We will generate 500 samples of each of the following sample sizes: n = 10, n = 25, n = 50, n = 100 and compute the proportion of females for each sample. The following histograms display the distributions of the sample proportions for the 500 samples of each sample size. Compare the center, spread, and shape of these histograms.

Are these histograms centered around the population proportion p = 0.437? What do you notice about the standard deviation of these distributions? What do you notice about the shape of these distributions?
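The comparison across sample sizes can be reproduced approximately with a short simulation. This is a sketch under stated assumptions: it uses Python's standard library rather than the unnamed statistical package from the slides, labels the first 8091 student indices as female purely for convenience, and uses an arbitrary seed, so its numbers will differ from the histograms in the presentation.

```python
import random
import statistics

POPULATION_SIZE = 18516  # students enrolled, from the slide
FEMALES = 8091           # female students, from the slide

def simulate(n, rng, num_samples=500):
    """Return num_samples sample proportions of females for samples of size n."""
    # rng.sample on a range draws student indices without replacement;
    # indices below FEMALES stand for female students (an arbitrary labeling).
    return [
        sum(1 for i in rng.sample(range(POPULATION_SIZE), n) if i < FEMALES) / n
        for _ in range(num_samples)
    ]

rng = random.Random(8)  # arbitrary seed (an assumption)
summary = {}
for n in (10, 25, 50, 100):
    p_hats = simulate(n, rng)
    summary[n] = (statistics.mean(p_hats), statistics.stdev(p_hats))
    print(f"n={n:3d}  mean={summary[n][0]:.3f}  sd={summary[n][1]:.3f}")
```

The printed means speak to the question about center, and the printed standard deviations speak to the question about spread.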

General Properties for Sampling Distributions of $\hat{p}$
$\hat{p}$ is the proportion of successes in a random sample of size $n$ from a population where the proportion of successes is $p$. The mean value of $\hat{p}$ is denoted by $\mu_{\hat{p}}$, and the standard deviation of $\hat{p}$ is denoted by $\sigma_{\hat{p}}$.
Rule 1: $\mu_{\hat{p}} = p$. This means that the $\hat{p}$ values from many different random samples will tend to cluster around the actual value of the population proportion.
Rule 2: $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$. This rule is exact if the population is infinite, and is approximately correct if the population is finite and no more than 10% of the population is included in the sample. This means that for larger samples, $\hat{p}$ values tend to cluster more tightly around the actual value of the population proportion.
Rule 3: When $n$ is large and $p$ is not too near 0 or 1, the sampling distribution of $\hat{p}$ is approximately normal. Does this mean that the sampling distribution of $\hat{p}$ will always be approximately normal? For what values of $n$ and $p$ is the sampling distribution of $\hat{p}$ approximately normal?
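Rules 1 and 2 can be evaluated directly for the Cal Poly example. The sketch below is illustrative: the helper names `p_hat_mean_sd` and `within_ten_percent` are assumptions, and the second helper simply encodes the 10% condition stated with Rule 2.

```python
from math import sqrt

def p_hat_mean_sd(p, n):
    """Mean (Rule 1) and standard deviation (Rule 2) of the sample proportion."""
    return p, sqrt(p * (1 - p) / n)

def within_ten_percent(n, population_size):
    """True when the sample is no more than 10% of a finite population."""
    return n <= 0.10 * population_size

p = 0.437                 # proportion of female students, from the slide
population_size = 18516   # enrolled students, from the slide

for n in (10, 25, 50, 100):
    mean, sd = p_hat_mean_sd(p, n)
    ok = within_ten_percent(n, population_size)
    print(f"n={n:3d}  mean={mean:.3f}  sd={sd:.4f}  10% condition met: {ok}")
```

Because $n$ appears under the square root, quadrupling the sample size only halves the standard deviation.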

When is the sampling distribution of $\hat{p}$ approximately normal? The farther the value of $p$ is from 0.5, the larger $n$ has to be in order for $\hat{p}$ to have a sampling distribution that is approximately normal. A conservative rule of thumb is that if both $np \ge 10$ and $n(1 - p) \ge 10$, then the sampling distribution of $\hat{p}$ is approximately normal. An equivalent way of stating this rule is to say that the sampling distribution of $\hat{p}$ is approximately normal if the sample size is large enough to expect at least 10 successes and at least 10 failures in the sample.
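The rule of thumb is easy to check mechanically. The following sketch uses hypothetical helper names (`approx_normal_ok`, `smallest_n`) and shows how the required sample size grows as $p$ moves away from 0.5.

```python
def approx_normal_ok(n, p, threshold=10):
    """Rule of thumb: expect at least `threshold` successes and failures."""
    return n * p >= threshold and n * (1 - p) >= threshold

def smallest_n(p, threshold=10):
    """Smallest sample size that satisfies the rule of thumb for a given p."""
    n = 1
    while not approx_normal_ok(n, p, threshold):
        n += 1
    return n

for p in (0.5, 0.437, 0.2, 0.05):
    print(f"p={p:.3f}  smallest n satisfying the rule: {smallest_n(p)}")
```

For the Cal Poly population with p = 0.437, a sample of size 10 fails the check while a sample of size 25 passes it.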

How Sampling Distributions Support Learning from Data

What role do sampling distributions play in learning about a population characteristic? In an estimation situation, you need to understand sampling variability to assess how close an estimate is likely to be to the actual value of the corresponding population characteristic. Sample data can also be used to evaluate whether or not a claim about a population is believable. There are two reasons why a sample statistic may not equal the value of the claim:
1. Sampling variability.
2. There really is a difference between the corresponding population characteristic and the claim value.

Each person in a random sample of U.S. adults was asked the following question: "Do you believe that extraterrestrial beings have visited Earth at some time in the past?" Of the 1255 people in the survey, 442 answered yes to this question, resulting in a sample proportion of $\hat{p} = 442/1255 = 0.35$. The population proportion who believe that extraterrestrial beings have visited Earth isn't exactly 0.35. How accurate is this estimate likely to be? Let's look at the sampling distribution of $\hat{p}$.
Rule 1: $\mu_{\hat{p}} = p$, so we know that the sampling distribution of $\hat{p}$ is centered at $p$.
Rule 2: $\sigma_{\hat{p}} \approx \sqrt{\frac{(0.35)(0.65)}{1255}} = 0.013$, so the $\hat{p}$ values will be tightly clustered around the actual value of the population proportion.
Rule 3: The sampling distribution of $\hat{p}$ is approximately normal since there are 442 successes and 813 failures.
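The arithmetic behind this example can be checked in a few lines. This is a sketch, assuming that $\hat{p}$ is substituted for the unknown $p$ in Rule 2 (as the slide does); the final step, which uses the normal approximation to find the chance of landing within two standard deviations of $p$, is an added illustration and is not part of the original slides.

```python
from math import sqrt
from statistics import NormalDist

successes, n = 442, 1255   # survey counts from the slide
failures = n - successes

p_hat = successes / n
# Rule 2 with p-hat substituted for the unknown p (hence "approximately").
sd_p_hat = sqrt(p_hat * (1 - p_hat) / n)

# Rule 3 check: at least 10 successes and at least 10 failures.
normal_ok = successes >= 10 and failures >= 10

# Illustration only: under the normal approximation, the probability that
# a sample proportion falls within two standard deviations of p.
standard_normal = NormalDist()
within_two_sd = standard_normal.cdf(2) - standard_normal.cdf(-2)

print(f"p-hat = {p_hat:.3f}, sd = {sd_p_hat:.3f}, failures = {failures}")
print(f"normal approximation reasonable: {normal_ok}")
print(f"P(within 2 sd of p) = {within_two_sd:.3f}, i.e. within about {2 * sd_p_hat:.3f}")
```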