Chapter 7 Probability and Samples: The Distribution of Sample Means.


Presentation on theme: "Chapter 7 Probability and Samples: The Distribution of Sample Means."— Presentation transcript:

1 Chapter 7 Probability and Samples: The Distribution of Sample Means

2 Samples and Sampling Error The z-scores and probabilities we have looked at thus far apply to situations where the sample consists of a single score. This chapter will extend the concepts of z-scores and probability to cover situations with larger samples. Ex: A z-score for an entire sample.

3 Z-scores (review) A z-score describes exactly where the score is located in the distribution. Ex: a z-score of +2.00 is extreme.

4 Figure 6.4 The normal distribution following a z-score transformation. Figure labels: Extreme Sample; Central, Representative Sample. Copyright © 2002 Wadsworth Group. Wadsworth is an imprint of the Wadsworth Group, a division of Thomson Learning

5 Probability (review) If the distribution of scores is normal, we should be able to determine the probability value for each score. A score beyond z = +2.00 has a probability of only p = .0228.

6 Figure 6.4 The normal distribution following a z-score transformation. Figure labels: Extreme Sample; Central, Representative Sample.

7 Z-Scores So far we have been limited to situations where the sample consists of a single score. Most studies have larger samples. We will now extend the concepts of z-scores and probability to cover situations with larger samples.

8 A z-score near zero indicates a central, representative sample. A z-score beyond ±2.00 indicates an extreme sample. It will be possible to determine exact probabilities for a sample.

9 Figure 6.4 The normal distribution following a z-score transformation. Figure labels: Extreme Sample; Central, Representative Sample.

10 Difficulties with using samples Samples provide an incomplete picture of the population. Any statistics computed from a sample will not be identical to the corresponding parameters for the entire population. Ex: the mean IQ for a sample of 25 students will differ from the mean IQ of the whole population. The difference is called sampling error.

11 Sampling Error This difference, or error, between the sample statistics and the corresponding population parameters is called sampling error. A sampling error is the discrepancy, or amount of error, between a sample statistic and its corresponding population parameter.
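
The definition above can be illustrated with a small simulation. This is only a sketch: the slides do not give the IQ population parameters, so the values MU = 100 and SIGMA = 15 and the helper draw_sample_mean are assumptions made for illustration.

```python
import random
import statistics

# Assumed population parameters for IQ scores (not stated in the slides).
MU = 100.0
SIGMA = 15.0


def draw_sample_mean(n, rng):
    """Draw a random sample of n scores from a normal population and return its mean."""
    sample = [rng.gauss(MU, SIGMA) for _ in range(n)]
    return statistics.fmean(sample)


rng = random.Random(7)  # fixed seed so the run is repeatable
sample_mean = draw_sample_mean(25, rng)

# Sampling error: sample statistic minus the corresponding population parameter.
sampling_error = sample_mean - MU

print(f"sample mean = {sample_mean:.2f}, sampling error = {sampling_error:+.2f}")
```

Changing the seed gives a different sample and therefore a different sampling error, which is the point of the slide: no single sample reproduces the population mean exactly.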

12 Questions How can you tell which sample is giving the best description of the population? Can you predict how a sample will describe its population? What is the probability of selecting a sample that has a certain sample mean? We can answer these, but we need to set rules that relate samples to populations.

13 Distribution of Sample Means Different samples produce different results. A huge set of possible samples forms a relatively simple, orderly, and predictable pattern. This pattern makes it possible to predict the characteristics of a sample with some accuracy.

14 Distribution of Sample Means (cont.) The ability to predict sample characteristics is based on the distribution of sample means. The distribution of sample means is the collection of sample means for all the possible random samples of a particular size (n) that can be obtained from a population.

15 Distribution of Sample Means (cont.) It is necessary to have all the possible values in order to compute probabilities. If a set has 100 samples, the probability of obtaining any specific sample is 1 out of 100, or p = 1/100.

16 Before, we discussed only scores; now we are discussing statistics (sample means). Because statistics are obtained from samples, a distribution of statistics is referred to as a sampling distribution.

17 Sampling Distribution A sampling distribution is a distribution of statistics obtained by selecting all the possible samples of a specific size from a population.

18 To construct the distribution of sample means: Take a sample. Get the mean. Replace. Get the next sample. Get the mean. Replace. Do this until you have gotten all possible sample combinations. Look at Ex. 7.1: a population of 4 scores with n = 2 gives 16 sample means; look at the histogram on p. 147.

19 Sample Means Note that the sample means tend to pile up around the population mean, μ = 5. The sample means are clustered around a value of 5.

20 Sample Means (cont.) Samples are supposed to be representative of the population. Therefore, the sample means tend to approximate the population mean.

21 Sample Means (cont.) The distribution of sample means is approximately normal in shape. We can use the distribution of sample means to answer probability questions about sample means. Ex: if you take a sample of n = 2 scores from the original population, what is the probability of obtaining a sample mean greater than 7? P(X > 7) = ?

22 Figure 7.1 Frequency distribution for a population of four scores: 2, 4, 6, 8

23 Table 7.1 The possible samples of n = 2 scores from the population in Figure 7.1

24 Ex: if you take a sample of n = 2 scores from the original population, what is the probability of obtaining a sample mean greater than 7? P(X > 7) = ? Because probability is equivalent to proportion, the probability question can be restated as follows:

25 Of all the possible sample means, what proportion has values greater than 7? In Figure 7.2, all the possible sample means are pictured, and only 1 out of the 16 means has a value greater than 7. Answer: 1 out of 16, or p = 1/16.
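
The counting argument above can be checked by listing every possible sample. The sketch below assumes sampling with replacement (which is what produces 16 samples from 4 scores) and uses the population 2, 4, 6, 8 from Figure 7.1; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

population = [2, 4, 6, 8]  # the four scores in Figure 7.1
n = 2

# Sampling with replacement: every ordered pair of scores is one possible sample.
samples = list(product(population, repeat=n))
sample_means = [Fraction(sum(s), n) for s in samples]

# Probability is the proportion of all possible sample means that exceed 7.
count_above_7 = sum(1 for m in sample_means if m > 7)
p_greater_than_7 = Fraction(count_above_7, len(sample_means))

# The mean of all the sample means equals the population mean.
mean_of_means = sum(sample_means) / len(sample_means)

print(len(samples))       # 16
print(p_greater_than_7)   # 1/16
print(mean_of_means)      # 5
```

Only the sample (8, 8) has a mean above 7; the samples (6, 8) and (8, 6) have a mean of exactly 7 and so are not counted.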

26 Figure 7.2 The distribution of sample means for n = 2

27 The Central Limit Theorem It might not be possible to list all the samples and compute all the possible sample means. As the size of n increases, the number of possible samples increases too. Therefore, it is necessary to develop the general characteristics of the distribution of sample means that can be applied in any situation. These characteristics are specified in the Central Limit Theorem, a cornerstone for much of inferential statistics.

28 Central Limit Theorem For any population with mean μ and standard deviation σ, the distribution of sample means for sample size n will have a mean of μ and a standard deviation of σ/√n, and will approach a normal distribution as n approaches infinity.

29 Central Limit Theorem Describes the distribution of sample means for any population, no matter what shape, mean, or standard deviation. The distribution of sample means "approaches" a normal distribution very rapidly. Describes the distribution of sample means by identifying the three basic characteristics that describe any distribution: shape, central tendency, and variability.
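
A simulation can make the theorem concrete. The sketch below is an illustration, not part of the original slides: it assumes a uniform population on [0, 1) (deliberately non-normal, with mean 0.5 and standard deviation √(1/12)), draws many samples of n = 30, and compares the mean and standard deviation of the sample means with the values the theorem predicts.

```python
import math
import random
import statistics


def simulate_sample_means(n, num_samples, rng):
    """Return the means of num_samples random samples of size n
    drawn from a uniform (non-normal) population on [0, 1)."""
    means = []
    for _ in range(num_samples):
        sample = [rng.random() for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


# Parameters of the assumed uniform population on [0, 1).
mu = 0.5
sigma = math.sqrt(1 / 12)

n = 30
rng = random.Random(2024)  # fixed seed so the run is repeatable
means = simulate_sample_means(n, 10_000, rng)

observed_mean = statistics.fmean(means)
observed_sd = statistics.pstdev(means)
predicted_sd = sigma / math.sqrt(n)  # standard deviation predicted by the theorem

print(f"mean of sample means: {observed_mean:.4f} (population mean {mu})")
print(f"SD of sample means:   {observed_sd:.4f} (predicted {predicted_sd:.4f})")
```

A histogram of the simulated means would also look roughly bell-shaped even though the population itself is flat.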

30 Shape of the Distribution of Means The distribution of sample means tends to be a normal distribution. It is almost perfectly normal if either of these conditions holds: The population from which the samples are selected is a normal distribution. The number of scores (n) in each sample is relatively large, around 30 or more.

31 Mean of the Distribution of Means The expected value of X. The mean of the distribution of sample means is equal to μ (the population mean) and is called the expected value of X.

32 Standard Error of X We have considered the shape and the central tendency of the distribution of sample means. To completely describe this distribution, we need one more characteristic: variability.

33 Standard Error of X We will be working with the standard deviation for the distribution of sample means. It is called the standard error of X. The standard error defines the standard, or typical, distance from the mean.

34 Remember, a sample is not expected to provide a perfectly accurate reflection of its population. There will be some error between the sample and the population.

35 Standard Error of X The standard deviation of the distribution of sample means is called the standard error of X. The standard error measures the standard amount of difference between X and μ due to chance.

36 Standard Error of X Standard error = σ_X = standard distance between X and μ. The symbol σ indicates that we are measuring a standard deviation, or a standard distance from the mean. The subscript X indicates that we are measuring the standard deviation for a distribution of sample means.

37 Standard Error Valuable because it specifies precisely how well a sample mean estimates its population mean, that is, how much error you should expect on the average. We can use the sample mean as an estimate of the population mean.

38 Standard Error Magnitude determined by two factors: (1) The size of the sample. The larger the sample size (n), the more probable it is that the sample mean will be close to the population mean. (2) The standard deviation of the population from which the sample is selected. Standard error = σ_X = σ/√n

39 Standard error When the sample size increases, the standard error decreases. As n decreases, the standard error increases.
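
The relationship between sample size and standard error follows directly from the formula σ/√n. The sketch below is illustrative: the helper name standard_error is hypothetical, and σ = 100 is borrowed from the SAT example later in the chapter.

```python
import math


def standard_error(sigma, n):
    """Standard error of the sample mean: sigma divided by the square root of n."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return sigma / math.sqrt(n)


sigma = 100  # population standard deviation (value taken from the SAT example)

# As n grows, the standard error shrinks.
for n in (1, 4, 25, 100):
    print(f"n = {n:3d}  standard error = {standard_error(sigma, n):.1f}")
```

With n = 1 the standard error equals the population standard deviation, which matches the single-score situation covered before this chapter.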

40 Probability and the Distribution of Sample Means The primary use of the distribution of sample means is to find the probability associated with any specific sample. Remember that probability is equivalent to proportion. Because the distribution of sample means presents the entire set of all possible X's, we can use proportions of this distribution to determine probabilities.

41 Example 7.2 Population of SAT scores: normal, with μ = 500 and σ = 100. If you take a random sample of n = 25 students, what is the probability that the sample mean would be greater than X = 540?

42 Restate the probability question as a proportion question: Out of all the possible sample means, what proportion has values greater than 540? The set of all the possible sample means is the distribution of sample means. The problem is to find a specific portion of this distribution.

43 What we know: The distribution is normal because the population of SAT scores is normal. The distribution has a mean of 500 because the population mean is μ = 500. The distribution has a standard error of σ_X = 20, because σ_X = σ/√n = 100/√25 = 100/5 = 20.

44 Figure 7.3 A distribution of sample means

45 We are interested in sample means greater than 540 (the shaded area). Next, find the z-score value that defines the exact location of X = 540. The value of 540 is located above the mean by 40 points. This is 2 standard deviations (in this case, 2 standard errors) above the mean. The z-score for X = 540 is z = +2.00.

46 Because this distribution of sample means is normal, you can use the unit normal table to find the probability associated with z = +2.00. The table indicates that 0.0228 of the distribution is located in the tail of the distribution beyond z = +2.00. Conclusion: it is very unlikely (p = 0.0228, or 2.28%) to obtain a random sample of n = 25 students with an average SAT score greater than 540.
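
The steps of Example 7.2 can be reproduced in a few lines. This sketch uses Python's standard-library NormalDist in place of the unit normal table; μ = 500, σ = 100, and n = 25 are the values used in the example, and the variable names are illustrative.

```python
from statistics import NormalDist

# Values from Example 7.2.
mu = 500          # population mean of SAT scores
sigma = 100       # population standard deviation
n = 25            # sample size
sample_mean = 540

# Standard error of the mean: sigma / sqrt(n) = 100 / 5 = 20.
standard_error = sigma / n ** 0.5

# z-score locating the sample mean within the distribution of sample means.
z = (sample_mean - mu) / standard_error

# Proportion of the standard normal distribution in the tail beyond z.
p = 1 - NormalDist().cdf(z)

print(f"standard error = {standard_error}, z = {z:+.2f}, p = {p:.4f}")
```

The computed tail proportion rounds to the same 0.0228 read from the table.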

47 Z-scores It is possible to use a z-score to describe the position of any specific sample within the distribution of sample means. The z-score tells exactly where a specific sample is located in relation to all the other possible samples that could have been obtained.

48 Figure 7.8 Showing standard error in a graph

