Chapter 5: Introduction to Statistical Inference


1 Chapter 5: Introduction to Statistical Inference
Suppose we wish to know whether children who grow up in homes without access to television have higher IQs than children in the general population. Assume that IQ is normally distributed in the general population, with μ = 100 and σ = 15 points. We draw a random sample of N = 25 children from homes without television (no more than one child per household) and measure each child’s IQ. The mean IQ for our sample turns out to be 103.9. Can we conclude that children without TV are indeed smarter? It is possible that our sample mean exceeds 100 due entirely to chance factors involved in drawing our random sample. We will return to this problem after we have developed some tools for statistical inference. Chapter 5 Prepared by Samantha Gaies, M.A.

2 Sampling Distributions
Population distributions are composed of individual scores; however, psychologists commonly perform their studies on groups. Therefore, we need to understand distributions that are composed of statistics from groups (all of which are the same size). Such distributions are called sampling distributions. If we are looking specifically at the mean of each sample, the distribution is called the sampling distribution of the mean (SDM). What will the SDM look like? The Central Limit Theorem tells us that as the size of the samples increases, the SDM becomes closer in shape to the normal distribution (e.g., less skewed), regardless of the shape of the original population distribution.
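The Central Limit Theorem claim above can be checked with a small simulation. This is only a sketch under stated assumptions: the skewed population (an exponential distribution), the sample sizes, the number of samples, and the helper names `skewness` and `sample_means` are all chosen for illustration and do not come from the slides.

```python
import random
import statistics

def skewness(values):
    """Skewness as the mean cubed z score (0 for a symmetric distribution)."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)

def sample_means(n, reps, rng):
    """Draw `reps` random samples of size `n` from a skewed population
    (exponential, an illustrative assumption) and return each sample's mean."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Build two approximate SDMs: one from tiny samples, one from larger samples.
skew_small = skewness(sample_means(2, 5000, rng))
skew_large = skewness(sample_means(30, 5000, rng))

print(f"skewness of SDM with N = 2:  {skew_small:.2f}")
print(f"skewness of SDM with N = 30: {skew_large:.2f}")
```

The SDM built from samples of 30 should come out noticeably less skewed than the one built from samples of 2, even though both are drawn from the same strongly skewed population.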

3 The Sampling Distribution of the Mean
The mean of the SDM is the same as the mean of the population from which the samples are being randomly drawn. The standard deviation of the SDM is called the Standard Error of the Mean (SEM). It is found from the standard deviation of the population and the sample size, according to this formula: SEM = σ / √N. The SEM is larger when the standard deviation of the population distribution is larger, and it is smaller than the standard deviation of the population distribution (whenever N > 1). The larger the samples, the smaller the SEM (i.e., as N increases, the SEM decreases).
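As a quick illustration of how the SEM shrinks with sample size, here is a minimal sketch. The function name `standard_error` and the list of sample sizes are illustrative assumptions; σ = 15 is the IQ standard deviation from the first slide.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: population SD divided by the square root of N."""
    return sigma / math.sqrt(n)

# sigma = 15 comes from the IQ example; the sample sizes are arbitrary.
for n in (1, 4, 25, 100):
    print(f"N = {n:3d}  SEM = {standard_error(15, n):.2f}")
```

Note that quadrupling N only halves the SEM, because N enters the formula through its square root.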

4 The z Score for Sample Means
In summary, the sampling distribution of the mean can, in most cases, be assumed to be a normal distribution, with a mean equal to µ (the mean of the population being sampled), and a standard deviation equal to σ divided by the square root of N. To determine whether the mean of a particular sample is unusual, we can use the methods of the previous chapter and calculate a z score for a sample mean. The formula needed for this is just like the z score for individuals, except that the raw score is a sample mean (X̄), and the standard deviation is the SEM: z = (X̄ − µ) / SEM, where SEM = σ / √N.

5 The z Score and p value for the TV/IQ Example
Let us now find the z score for the example in the first slide to determine whether drawing a random sample of 25 children would frequently yield a sample mean as far from 100 as the mean in our example. First, note that the SEM for our example is: SEM = σ / √N = 15 / √25 = 3. Therefore, the z score for our sample mean is: z = (X̄ − µ) / SEM = (103.9 − 100) / 3 = 1.3. The (one-tailed) p value corresponding to this z score is the area beyond z = 1.3, which is (from Table A) 50.00 − 40.32 = 9.68; 9.68/100 = .0968.
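The table lookup can be cross-checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist` in place of Table A. The sample mean of 103.9 is an assumption: it is the value implied by z = 1.3 with an SEM of 3.

```python
import math
from statistics import NormalDist

mu, sigma, n = 100, 15, 25   # population values from the first slide
sample_mean = 103.9          # assumption: the mean implied by z = 1.3

sem = sigma / math.sqrt(n)               # standard error of the mean
z = (sample_mean - mu) / sem             # z score for the sample mean
p_one_tailed = 1 - NormalDist().cdf(z)   # area beyond z in the upper tail

print(f"SEM = {sem:.1f}, z = {z:.2f}, one-tailed p = {p_one_tailed:.4f}")
```

The computed tail area should agree with the table value to four decimal places.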

6 Null Hypothesis Testing
Finding the p value for the non-TV sample of children is a major step toward using null hypothesis testing (NHT) to decide whether we can conclude that an entire population of children without TV would be any smarter than the current population of children with nearly universal access to TV. The major steps of NHT can be summarized as follows:
Step 1: Assume that the worst-case scenario (i.e., the null hypothesis, H0) is true. In this example, assume that the mean of the non-TV population is exactly the same as for the TV population; that is, µ0 = µ = 100.
Step 2: Set an alpha (α) level such that if p is less than α you will reject H0 as implausible. If a two-tailed test is decided upon, double your p value before comparing it to α.
Step 3: Find the p value with respect to the null hypothesis distribution (NHD). For the one-sample case, the NHD is just the sampling distribution of the mean.
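The three steps can be sketched as one small function. This is a hedged illustration, not a prescribed implementation: the function name `one_sample_z_test`, its parameters, and the sample mean of 105.5 used in the usage lines are hypothetical. The one-tailed branch assumes the result fell in the predicted direction.

```python
import math
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05, two_tailed=False):
    """Return (z, p, reject) for a one-sample z test.

    Step 1: H0 says the sampled population has mean mu0.
    Step 2: alpha is the cutoff; p is doubled for a two-tailed test.
    Step 3: p is found on the null hypothesis distribution, which is
            normal with mean mu0 and standard deviation sigma / sqrt(n).
    """
    sem = sigma / math.sqrt(n)
    z = (sample_mean - mu0) / sem
    # Area beyond |z|; for a one-tailed test this assumes the sample mean
    # landed on the predicted side of mu0.
    p = 1 - NormalDist().cdf(abs(z))
    if two_tailed:
        p *= 2
    return z, p, p < alpha

# Hypothetical sample mean of 105.5, with the IQ parameters from the slides.
z1, p1, reject1 = one_sample_z_test(105.5, 100, 15, 25, two_tailed=False)
z2, p2, reject2 = one_sample_z_test(105.5, 100, 15, 25, two_tailed=True)
print(f"one-tailed: z = {z1:.2f}, p = {p1:.4f}, reject H0: {reject1}")
print(f"two-tailed: z = {z2:.2f}, p = {p2:.4f}, reject H0: {reject2}")
```

With these hypothetical numbers the same data lead to rejection under a one-tailed test but not under a two-tailed test, which is why the number of tails should be chosen before looking at the results.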

7 The Logic of Null Hypothesis Testing
It is only for the null hypothesis that we can easily find the distribution of possible results and then determine the probability of obtaining results as extreme as ours when H0 is really true. We hope that our results will look unlikely enough when H0 is true that p will be less than α, and we can therefore reject H0 and declare our results to be statistically significant. The alternative hypothesis (HA) is simply the complement of H0; if H0 is that µ = 100, then HA is that µ ≠ 100 (anything but 100). NHT involves what is called an indirect proof. By casting doubt on the hypothesis we hope is not true (i.e., H0), we lend some support to the opposite hypothesis, HA, which is more consistent with our research hypothesis.

8 Statistical Decisions
If your p value is greater than your alpha level (normally .05), you need not “accept” the null hypothesis, but you cannot reject it. Some researchers say that they have to “retain” H0, whereas others prefer to say that they have “failed to reject” H0, or that they have insufficient evidence for rejecting it.
Type I and Type II Errors
Type I error: rejecting the null hypothesis when it is in fact true. This is the error rate that we control directly by choosing the level for alpha.
Type II error: failing to reject the null hypothesis when it is in fact false. This error rate (beta) is not directly controlled, but it does increase if alpha is made smaller and all else stays the same.

9 More about Type I and Type II Errors
A Type I error is like a false alarm, because you are saying that the mean of your sample is not consistent with the mean of the larger population, when in reality your sample mean is unusual due to chance factors alone. Alpha, the Type I error rate, can be defined as the percentage of null (ineffective) experiments that nonetheless attain statistical significance.
A Type II error is a “miss,” because you are saying that the mean of your sample is consistent with the mean of the larger population, when in reality it is not, but the difference you found happened not to be large enough to attain statistical significance.
You could reduce your Type I error rate by using a smaller alpha level (e.g., .01), but that would increase the rate of Type II errors (all else equal). You could reduce your Type II error rate by using a one-tailed rather than a two-tailed test, but you would have to convince your audience that there is no chance you would have tested the other tail.
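The trade-off between the two error rates can be made concrete with a short calculation. This sketch rests on assumptions that are not in the slides: a hypothetical true mean of 106 for the non-TV population, and a helper named `type_ii_rate`. The one-tailed branch assumes the true mean lies in the predicted (upper) direction.

```python
import math
from statistics import NormalDist

def type_ii_rate(mu0, mu_true, sigma, n, alpha, two_tailed):
    """Probability of failing to reject H0 when the true mean is mu_true."""
    std = NormalDist()
    sem = sigma / math.sqrt(n)
    delta = (mu_true - mu0) / sem  # expected z score when H0 is false
    if two_tailed:
        crit = std.inv_cdf(1 - alpha / 2)
        # Reject if z lands beyond either critical value.
        power = (1 - std.cdf(crit - delta)) + std.cdf(-crit - delta)
    else:
        crit = std.inv_cdf(1 - alpha)
        # Reject only if z lands beyond the upper critical value.
        power = 1 - std.cdf(crit - delta)
    return 1 - power

# Hypothetical true mean of 106; mu0, sigma, and N are from the IQ example.
beta_05_two = type_ii_rate(100, 106, 15, 25, alpha=0.05, two_tailed=True)
beta_01_two = type_ii_rate(100, 106, 15, 25, alpha=0.01, two_tailed=True)
beta_05_one = type_ii_rate(100, 106, 15, 25, alpha=0.05, two_tailed=False)

print(f"two-tailed, alpha = .05: beta = {beta_05_two:.3f}")
print(f"two-tailed, alpha = .01: beta = {beta_01_two:.3f}")
print(f"one-tailed, alpha = .05: beta = {beta_05_one:.3f}")
```

Under these assumptions, lowering alpha from .05 to .01 raises beta, while switching from a two-tailed to a one-tailed test at the same alpha lowers it.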

10 Try this example…
National data show that AP scores are normally distributed, with μ = 40 and σ = 5. For the past semester, N = 9 students were privately tutored for their AP psychology tests. The mean performance of the 9 students was 44. Is the mean of tutored students different from the mean of all students?
State the hypotheses.
Select the significance (alpha) level and the number of tails for the test.
Calculate the test statistic (z score for groups, in this case).
Find the p value corresponding to your test statistic and compare to alpha.
Make a statistical decision with respect to the null hypothesis.
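After working the example by hand, one way to check the arithmetic is the sketch below. The choice of α = .05 and a two-tailed test is an assumption (the question asks only whether the tutored mean is "different"), and the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu0, sigma, n = 40, 5, 9   # national AP parameters and sample size
sample_mean = 44           # mean of the tutored students
alpha = 0.05               # assumption: conventional alpha, two-tailed test

sem = sigma / math.sqrt(n)                        # 5 / 3
z = (sample_mean - mu0) / sem                     # test statistic
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z))) # doubled tail area
reject_h0 = p_two_tailed < alpha

print(f"z = {z:.2f}, two-tailed p = {p_two_tailed:.4f}, reject H0: {reject_h0}")
```

If your hand calculation gives a different z score, check first that you divided σ by √N rather than by N.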

11 Assumptions Underlying the One-Sample z Test
The dependent variable was measured on an interval or ratio scale.
The sample was drawn randomly from the population of interest.
The dependent variable follows a normal distribution in the population. Because of the Central Limit Theorem, this assumption is not critical when the size of the sample is about 30 or more, unless the population distribution is very far from normal in shape.
The standard deviation for the sampled population is the same as that of the comparison population. (This allows the SEM to be found from the known population standard deviation.)

