 # Intro to Statistics for the Behavioral Sciences PSYC 1900 Lecture 10: Hypothesis Tests for Two Means: Related & Independent Samples.


Clarification of Estimating Standard Errors The sample variance is an unbiased estimator of the population variance, but any single sample sd is likely to underestimate the population sd. Standard error calculations using the sample sd will therefore usually produce probability values that are too low (i.e., z scores that are too high). Consequently, we use the t distribution, rather than the normal, to adjust for this bias.

One More Example for When Population Mean is Known One case where this is quite common is testing whether participants' responses are greater than chance. For example, can participants identify subliminally presented stimuli? The comparison mean would be .50, with responses across a number of trials scored 0 (incorrect) or 1 (correct).

One More Example for When Population Mean is Known We do not, however, know what the population variance is; we must estimate it using the sample variance. When we do this, we underestimate it, resulting in lower standard errors and higher z scores (i.e., Type I errors). Therefore, we will use the t distribution.

One More Example for When Population Mean is Known Let's assume the sample is 25 people, the mean accuracy = .56, and the sample sd = .09.
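The test the slide sets up can be sketched with these values (a minimal illustration using the numbers given on the slide):

```python
import math

# Values from the slide's example
n = 25        # sample size
mean = 0.56   # observed mean accuracy
mu0 = 0.50    # chance accuracy under the null
sd = 0.09     # sample standard deviation

se = sd / math.sqrt(n)    # estimated standard error of the mean
t = (mean - mu0) / se     # one-sample t statistic
df = n - 1                # degrees of freedom

print(round(t, 2), df)    # t is about 3.33 with df = 24
```

With df = 24, a t of about 3.33 exceeds the two-tailed .05 critical value, so accuracy is reliably above chance.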

Confidence Limits What is the 95% confidence interval for accuracy?
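A sketch of the interval using the example values above; the critical t of 2.064 (two-tailed, df = 24, alpha = .05) is taken from a t table:

```python
import math

n, mean, sd = 25, 0.56, 0.09
se = sd / math.sqrt(n)
t_crit = 2.064   # critical t for df = 24, two-tailed .05 (from a t table)

lower = mean - t_crit * se
upper = mean + t_crit * se
print(round(lower, 3), round(upper, 3))   # roughly .523 to .597
```

Because .50 falls outside the interval, the confidence limits agree with the significant t test.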

Comparing Means from Related Samples A more frequent case found in behavioral research is the comparison of two sets of scores that are related (i.e., not independent), as in pre-test/post-test designs or dyads. Dependence implies that knowing a score in one distribution allows you better-than-chance prediction about the related score in the other distribution.

Comparing Means from Related Samples The null hypothesis in all cases is that the two population means are equal. This can be recast using difference scores, which are calculated as the difference between the subjects' performance on two occasions (or the difference between related data points).
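The null hypothesis the slide refers to (originally shown as a slide formula) can be written as:

```latex
H_0:\ \mu_1 = \mu_2
\quad\text{equivalently}\quad
H_0:\ \mu_D = \mu_1 - \mu_2 = 0
```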

Comparing Means from Related Samples Once we do this, we are again working with a "single" sample with a known prediction for the mean. Thus, we can use a t test as we did previously, with minor modifications: we simply calculate the sd of the distribution of difference scores and then use it to estimate the associated standard error. Note that df again = N - 1, where N is the number of difference scores (pairs).
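The procedure can be sketched as follows; the pre/post scores are hypothetical illustrative data, not values from the slides:

```python
import math
import statistics

# Hypothetical pre/post scores for 8 participants (illustrative only)
pre  = [12, 15, 11, 14, 13, 16, 12, 15]
post = [14, 16, 13, 15, 16, 17, 13, 16]

diffs = [b - a for a, b in zip(pre, post)]   # difference scores
n = len(diffs)
mean_d = statistics.mean(diffs)              # mean of the difference scores
sd_d = statistics.stdev(diffs)               # sd of the difference scores
se_d = sd_d / math.sqrt(n)                   # standard error of the mean difference

t = mean_d / se_d                            # related-samples t, df = n - 1
print(round(t, 2), n - 1)                    # about 5.61 with df = 7
```

Note how the t is computed entirely from the single distribution of difference scores, just as in the one-sample case.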

Advantages and Disadvantages of Using Related Samples

- Greatly reduces variability: variability is only with respect to change in the dv.
- Provides perfect control for extraneous variables: each participant serves as his or her own control.
- Requires fewer participants.
- Problems of order and carry-over effects: experience at time 1 may alter scores at time 2 irrespective of any manipulations.

Effect Size Can we use p-values to quantify the magnitude of an effect? No, as any given difference between means will be more or less significant as a function of sample size (all else being equal). We need a measure of the magnitude of the difference that is separate from sample size.

Effect Size Cohen's d is a common effect size measure for comparing two means. By convention, d = .2 is small, d = .5 is medium, and d = .8 is large. It can be interpreted as the "non-overlap" of distributions.
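Using the earlier subliminal-perception example, d is simply the mean difference expressed in sd units (a sketch with the slide's values):

```python
mean, mu0, sd = 0.56, 0.50, 0.09   # values from the one-sample example above

d = (mean - mu0) / sd              # Cohen's d: mean difference in sd units
print(round(d, 2))                 # about 0.67, between medium and large by convention
```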

Comparing Means from Independent Samples This represents one of the most frequent cases encountered in behavioral research. No specific information about the population mean or variance is known. We randomly sample two groups and provide one with a relevant manipulation. We then wish to determine whether any difference in group means is more likely attributable to the manipulation or to sampling error.

Comparing Means from Independent Samples In this case, we have two independent distributions, each with its own mean and variance. We can easily determine the difference between the two means, but we will need a measure of sampling error with which to compare it. Unlike previous examples, we will need a standard error for the difference between two means.

Standard Errors for Mean Differences Between Independent Samples The logic is similar to what we have done before. Assume two distinct population distributions; then sample pairs of means from each. The distribution of the mean differences constitutes the appropriate sampling distribution, and its sd is the standard error for the t test. The variance sum law dictates that the variance of the sum (or difference) of two independent variables is equal to the sum of their variances.

The means and sds for the distributions and their differences follow from the variance sum law. We know from the central limit theorem that the resulting sampling distributions will be normal. But the problem of not knowing the true population sd arises again. To deal with this problem, we must again use the t, as opposed to the normal, distribution to calculate standard errors.
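The quantities the slide describes were originally shown in a figure; a reconstruction of the standard formulas is:

```latex
\mu_{\bar{X}_1 - \bar{X}_2} = \mu_1 - \mu_2,
\qquad
\sigma^2_{\bar{X}_1 - \bar{X}_2} = \frac{\sigma^2_1}{n_1} + \frac{\sigma^2_2}{n_2},
\qquad
s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s^2_1}{n_1} + \frac{s^2_2}{n_2}}
```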

t Tests for Independent Samples The formula is a generalization of the previous formula. The null is that the mean difference between the samples is zero. df = (n1 - 1) + (n2 - 1) = n1 + n2 - 2.
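The independent-samples t can be sketched as follows; the two groups' scores are hypothetical illustrative data, not values from the slides:

```python
import math
import statistics

# Hypothetical scores for two independent groups of equal size (illustrative only)
group1 = [10, 12, 9, 11, 13, 10, 12, 11]
group2 = [13, 15, 12, 14, 16, 13, 15, 14]

n1, n2 = len(group1), len(group2)
m1, m2 = statistics.mean(group1), statistics.mean(group2)
v1, v2 = statistics.variance(group1), statistics.variance(group2)

se_diff = math.sqrt(v1 / n1 + v2 / n2)   # standard error of the mean difference
t = (m1 - m2) / se_diff                  # independent-samples t
df = n1 + n2 - 2
print(round(t, 2), df)                   # about -4.58 with df = 14
```

The sign of t just reflects which group mean was subtracted from which; the two-tailed decision is the same either way.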

t Tests for Independent Samples with Unequal n's In the previous formula, we assumed equal condition n's. Sometimes, however, the n of one sample exceeds the other, in which case its variance is a better approximation of the population variance. In such cases, we pool the variances using a weighted average.
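The pooling step can be sketched as follows; the summary statistics are hypothetical, chosen so the larger sample visibly gets more weight:

```python
import math

# Hypothetical summary statistics for unequal group sizes (illustrative only)
n1, v1 = 12, 4.0   # larger sample: its variance gets more weight
n2, v2 = 8, 6.0

# Pooled variance: df-weighted average of the two sample variances
sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)

# Pooled standard error of the difference between means
se_diff = math.sqrt(sp2 * (1 / n1 + 1 / n2))
print(round(sp2, 2), round(se_diff, 2))
```

Note that sp2 lands closer to 4.0 than to 6.0 because the larger sample contributes more degrees of freedom.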

Assumptions for t Tests Homogeneity of Variance: the population variances of the two distributions are equal, which implies that the variances of the two samples should be relatively equal. Heterogeneity is usually not a problem unless the variance of one sample is greater than 3 times that of the other. If this occurs, SPSS and other programs will provide both a normal and an adjusted t value. The adjustment lowers the df, which reduces the chances of a Type I error.

Assumptions for t Tests Normality of Distributions: we assume that the sampled data are normally distributed. They need not be exactly normal, but should be unimodal and symmetric. This is really only a problem for small samples, as the CLT applies for large samples.

Effect Size Cohen's d is also used for independent samples. The only difference is that we use the pooled sd term.
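A sketch of d with the pooled sd, reusing hypothetical two-group summary values (illustrative, not from the slides):

```python
import math

# Hypothetical summary statistics (illustrative only)
m1, m2 = 11.0, 14.0
n1, v1 = 8, 12 / 7
n2, v2 = 8, 12 / 7

# Pooled variance, as in the unequal-n formula (reduces to the average here)
sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)

d = (m2 - m1) / math.sqrt(sp2)   # Cohen's d using the pooled sd
print(round(d, 2))
```

Unlike the t statistic, d does not grow with sample size; it depends only on the mean difference and the pooled sd.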

Confidence Limits What is the 95% confidence interval for accuracy?
