Intro to Statistics for the Behavioral Sciences PSYC 1900 Lecture 9: Hypothesis Tests for Means: One Sample.


1 Intro to Statistics for the Behavioral Sciences PSYC 1900 Lecture 9: Hypothesis Tests for Means: One Sample

2 Hypothesis Testing with Means To this point, we have discussed the use of hypothesis testing in general. We have also examined the use of correlation and regression techniques to determine if there is a linear relation between two variables. Over the next few weeks, we will focus on hypothesis tests involving differences among means.

3 Hypothesis Testing with Means Comparisons among means represent a common analysis when conducting experiments. Individuals are assigned to groups (nominal scales; control and treatment groups). The goal is to determine whether the differences between group means are likely due to sampling error or not.

4 Hypothesis Testing with Means We will begin with the case of comparing the results of a single sample to known parameters. For example, comparing results on intelligence, language development, etc., to known population norms. The null hypothesis will take a form in which we postulate a specific value: H0: μ = μ0 (for example, H0: μ = 100 for IQ). After this, we will move to the more common tack of comparing two groups to one another.

5 An Example We know that in the population, the mean IQ = 100 with a sd = 15. Let’s say that a new supplement is being marketed as a pill capable of increasing intelligence. To examine this claim, we could have a sample of 100 individuals take the pill for a month and then give them an IQ test. Let’s say the mean score from this sample is 105. Did the pill work?

6 An Example To answer this question, we will have to examine the sampling distribution of the mean to determine the standard error. Even if the null is true, sample means would be expected to vary around the population mean of 100. The Central Limit Theorem allows us to obtain information about the nature of the sampling distribution for any variable.

7 Central Limit Theorem Given a population with mean μ and variance σ², the sampling distribution of the mean will have mean μ and variance σ²/N (so its sd, the standard error, is σ/√N). The distribution will also approach the normal distribution as N increases. The CLT thus not only allows us to calculate the mean and sd of the sampling distribution but also tells us that it will be approximately normal with larger N.

8 Central Limit Theorem If the variable is normally distributed in the population, the sampling distribution will be normal regardless of N. Even if the population distribution is very non-normal (e.g., rectangular, skewed), the sampling distribution will approach normality as N increases. Why?

9 [Figure: two panels labeled N=3 and N=10]
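
A small simulation can illustrate what the Central Limit Theorem claims. The sketch below is not from the lecture: it assumes a strongly skewed exponential population with mean 1 and variance 1, and the helper names (sample_means, skewness) are hypothetical. It draws many samples of size N = 3, 10, and 30 and reports the mean, variance, and skewness of the resulting sample means.

```python
import random
import statistics

def sample_means(n, reps, rng):
    """Return `reps` sample means, each from n draws of an exponential(1) population."""
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(reps)]

def skewness(values):
    """Population skewness: mean cubed deviation divided by the cubed sd."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean((v - m) ** 3 for v in values) / sd ** 3

rng = random.Random(0)  # fixed seed so the run is repeatable
results = {}
for n in (3, 10, 30):
    means = sample_means(n, 20000, rng)
    # Store (mean of the means, variance of the means, skewness of the means).
    results[n] = (statistics.fmean(means),
                  statistics.pvariance(means),
                  skewness(means))
    print(f"N={n:2d}  mean={results[n][0]:.3f}  "
          f"variance={results[n][1]:.4f} (sigma^2/N={1 / n:.4f})  "
          f"skewness={results[n][2]:.3f}")
```

Under these assumptions, the mean of the sample means should stay near the population mean of 1, the variance should track σ²/N, and the skewness should shrink toward 0 as N grows, which is the sense in which the sampling distribution approaches normality.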

10 Back to the Example Sampling distribution for our example: mean = 100, standard error = 15/√100 = 1.5. Given that we know the population variance, we can calculate a z-score for our group: z = (105 - 100)/1.5 = 3.33.
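
The z calculation for the IQ example can be sketched in a few lines. The numbers (population mean 100, sd 15, N = 100, sample mean 105) come from the slides; the variable names are illustrative, and reporting both one- and two-tailed probabilities is an addition here, since the slides do not say which was used.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 100      # population mean under the null hypothesis
sigma = 15     # known population sd
n = 100        # sample size
xbar = 105     # observed sample mean

se = sigma / sqrt(n)          # standard error of the mean
z = (xbar - mu0) / se         # how many standard errors the sample mean lies from mu0

# Probability of a sample mean at least this extreme if the null is true.
p_one_tailed = 1 - NormalDist().cdf(z)
p_two_tailed = 2 * p_one_tailed

print(f"standard error = {se:.2f}")
print(f"z = {z:.2f}")
print(f"one-tailed p = {p_one_tailed:.5f}, two-tailed p = {p_two_tailed:.5f}")
```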

11 Testing a Sample Mean When σ Is Unknown: One-Sample t Test More frequently, we know the norms for a population in terms of means but do not know the variance. We must estimate the population variance from the sample variance. In these cases, we cannot use z-scores to calculate probabilities, as the sampling distribution of sample sd’s is positively skewed. This implies that, more often than not, we will underestimate the population variance, which will result in lower estimates of the standard error. What problem will this cause with hypothesis testing? Increases in Type I errors.

12 The sample variance is an unbiased estimator of the population variance, but any single sample variance is more likely than not to underestimate it. Standard error calculations using the sample sd will therefore usually produce probability values that are too low (i.e., z scores that are too high).
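
The claim that a single sample tends to underestimate the population variance can be checked by simulation. The sketch below assumes a normal population with mean 100 and sd 15 (the IQ norms from the example) and a deliberately small sample size of N = 5, which is an illustrative choice and not a value from the lecture.

```python
import random
import statistics

MU, SIGMA = 100, 15   # population parameters (IQ norms from the example)
N = 5                 # small sample size, chosen for illustration
REPS = 20000          # number of simulated samples

rng = random.Random(1)  # fixed seed so the run is repeatable

# Sample variance (N - 1 in the denominator) for each simulated sample.
variances = [
    statistics.variance([rng.gauss(MU, SIGMA) for _ in range(N)])
    for _ in range(REPS)
]

mean_var = statistics.fmean(variances)
share_below = sum(v < SIGMA ** 2 for v in variances) / REPS

print(f"population variance        = {SIGMA ** 2}")
print(f"average sample variance    = {mean_var:.1f}")
print(f"share of samples below it  = {share_below:.3f}")
```

Under these assumptions the average of the sample variances should land close to 225, while clearly more than half of the individual samples fall below 225 (the theoretical share for N = 5 is about 0.59). That asymmetry is what the positive skew produces.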

13 Student’s t Distribution To remedy this problem, we use the t distribution to calculate probabilities instead of the normal distribution. The t distribution is a family of sampling distributions whose shape varies as a function of sample size. As sample size gets larger, the t distribution approaches the normal. Smaller samples produce “fatter” tails.
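
To make the "fatter tails" concrete, the sketch below compares the two-tailed probability beyond ±1.96 under the normal distribution with the same probability under t distributions of varying degrees of freedom. Python's standard library has no t distribution, so the helpers t_pdf and t_cdf are hypothetical names for a numerical integration of the standard t density by Simpson's rule; the df values are illustrative.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t with `df` degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), integrating the density from 0 to |x| with Simpson's rule."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    # The density is symmetric about 0, so the left side mirrors the right.
    return 0.5 + area if x > 0 else 0.5 - area

cutoff = 1.96
normal_tail = 2 * (1 - NormalDist().cdf(cutoff))
tails = {df: 2 * (1 - t_cdf(cutoff, df)) for df in (2, 10, 30, 99)}

print(f"normal: P(|Z| > {cutoff}) = {normal_tail:.4f}")
for df, p in tails.items():
    print(f"df={df:3d}: P(|T| > {cutoff}) = {p:.4f}")
```

The tail probability should be largest for the smallest df and shrink toward the normal value as df increases, which is why treating a small-sample t as a z overstates significance.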

14 Student’s t Distribution t distributions represent deviations as t’s rather than z’s. t’s are simply “z-scores” calculated using a standard error based on the sample variance.

15 Student’s t Distribution Degrees of freedom refer to the number of information items free to vary. When calculating the variance in a sample, we use the sample mean. Thus, N-1 items are allowed to vary. Once we know the sample mean and N-1 items, the final item is constrained. Thus, the shape of the t distribution varies with degrees of freedom.

16 Back to the Example Let’s go back to the IQ example, except we’ll now assume that we don’t know the population sd, but the sample sd = 20. Does a mean score of 105 in our sample suggest increased intelligence? Here t = (105 - 100)/(20/√100) = 2.5. To answer this question, we find the critical t value for 99 df. t is significant, but the result is less extreme (its probability is larger) than the normal distribution would suggest. What would happen if we used a smaller sample (N=11)? Assuming that the standard error was 11 as well.
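
The one-sample t test for this example can be sketched as follows. The inputs (sample mean 105, null mean 100, sample sd 20, N = 100) come from the slide; the two-tailed test and the helper names t_pdf and t_cdf are assumptions, and the t probabilities are obtained by numerically integrating the t density because the standard library does not provide a t distribution.

```python
from math import exp, lgamma, log, pi, sqrt
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t with `df` degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), integrating the density from 0 to |x| with Simpson's rule."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

xbar, mu0, s, n = 105, 100, 20, 100

se = s / sqrt(n)              # estimated standard error from the sample sd
t_stat = (xbar - mu0) / se    # same form as z, but with an estimated se
df = n - 1

p_t = 2 * (1 - t_cdf(abs(t_stat), df))            # two-tailed p from t(99)
p_z = 2 * (1 - NormalDist().cdf(abs(t_stat)))     # what the normal would give

print(f"t({df}) = {t_stat:.2f}")
print(f"two-tailed p using t      = {p_t:.4f}")
print(f"two-tailed p using normal = {p_z:.4f}")
```

With 99 degrees of freedom the two probabilities are close, but the t-based one is the larger of the two, so the t test is slightly more conservative.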

17 Factors that Affect the Magnitude of t Size of the obtained difference between means: larger difference = larger t. Magnitude of sample variance: larger variance = smaller t. Sample size: larger sample size = larger t.
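
Each of these three factors can be seen by changing one input at a time in the t formula. The baseline values below are the ones from the IQ example; the alternative values (a mean of 110, an sd of 40, an N of 400) and the function name one_sample_t are made up for illustration.

```python
from math import sqrt

def one_sample_t(sample_mean, mu0, sample_sd, n):
    """t = (sample mean - hypothesized mean) / (sample sd / sqrt(n))."""
    return (sample_mean - mu0) / (sample_sd / sqrt(n))

baseline = one_sample_t(105, 100, 20, 100)            # the IQ example
bigger_difference = one_sample_t(110, 100, 20, 100)   # larger mean difference
bigger_variance = one_sample_t(105, 100, 40, 100)     # larger sample sd
bigger_sample = one_sample_t(105, 100, 20, 400)       # larger sample size

print(f"baseline          t = {baseline:.2f}")
print(f"bigger difference t = {bigger_difference:.2f}")
print(f"bigger variance   t = {bigger_variance:.2f}")
print(f"bigger sample     t = {bigger_sample:.2f}")
```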

18 Confidence Limits on the Mean A specific estimate for a parameter value is termed a point estimate. We can also set confidence intervals: a range of values with an associated probability that the true parameter value is contained within the range.

19 Confidence Limits on the Mean Let’s say we want to know the 95% confidence interval for the population mean IQ as estimated from our sample. We know the sample mean (105) and the standard error (20/√100 = 2), and we use the critical t for 99 df (about 1.984): CI = 105 ± 1.984 × 2, or roughly 101.03 to 108.97. Note that the CI does not include 100. This suggests that our sample represents a different population.
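
A 95% confidence interval for the IQ example can be sketched as sample mean ± critical t × standard error. The sample values (mean 105, sd 20, N = 100) come from the earlier slide; the helpers t_pdf, t_cdf, and t_critical are hypothetical names, and the critical value is found by bisection on a numerically integrated t distribution because the standard library has no t quantile function.

```python
from math import exp, lgamma, log, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t with `df` degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))

def t_cdf(x, df, steps=1000):
    """P(T <= x), integrating the density from 0 to |x| with Simpson's rule."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

def t_critical(prob, df):
    """Find t such that P(T <= t) = prob (prob > 0.5) by bisection."""
    lo, hi = 0.0, 20.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

xbar, s, n = 105, 20, 100
se = s / sqrt(n)
t_crit = t_critical(0.975, n - 1)   # leaves 2.5% in each tail

lower = xbar - t_crit * se
upper = xbar + t_crit * se

print(f"critical t({n - 1}) = {t_crit:.3f}")
print(f"95% CI = ({lower:.2f}, {upper:.2f})")
```

Because the lower limit lies above 100, the interval leads to the same conclusion as the two-tailed t test at the .05 level.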

