1 Psych 5500/6500 t Test for Dependent Groups (aka ‘Paired Samples’ Design) Fall, 2008.

2 Experimental Designs t test for dependent groups is used in the following two experimental designs: 1.Repeated measures (a.k.a. within-subjects) design. 2.Matched pairs design.

3 Repeated Measures (Within-subjects) Design Measure each participant twice, once in ‘Condition A’ and once in ‘Condition B’. The scores in the two groups are no longer independent as they come from the same participants. I like to use ‘Condition A’ and ‘Condition B’ rather than ‘Group 1’ and ‘Group 2’ as the latter terms seem to imply (at least to me) that there are different subjects in each group.

4 Design
Subject   Condition A   Condition B
1         S1            S1
2         S2            S2
3         S3            S3
etc.
Each subject’s two scores are dependent, but are independent of the other subjects’ scores.

5 Matched Pairs The two paired scores don’t have to come from the same person, there are other ways the scores within the pairs could be associated (dependent). For example, measuring marital satisfaction within married couples (a static group design).

6 Design
Couple   Wife    Husband
1        S1.W    S1.H
2        S2.W    S2.H
3        S3.W    S3.H
etc.
The scores within each couple are dependent, but each couple’s scores are independent of the other couples’ scores.

7 Example: Repeated Measures or Within-Subjects Design You are interested in whether attending a mixed-race day camp affects children’s racial prejudice. Six children attending the day camp were given a test to measure racial prejudice (higher scores = more prejudice) when they first arrived at camp. The same six children were given the same test seven days later when they left the camp.

8 Data
Subject   Before   After
(the six rows of scores and the column means are not preserved in this transcript)

9 Getting Rid of the Nonindependence Because we have two scores per person the scores are not all independent of each other, which means we can’t run a t test on the raw scores as they stand. The solution is simple: we will turn those two scores per person into just one score per person, a score which reflects the difference in each person’s score when they are in Condition A compared to when they are in Condition B.

10 Difference Scores
Subject   Before - After = Difference
(six rows; only the last difference, 3, survives in this transcript)
For each subject we now have just one score, their ‘difference’ score. 1) The difference scores measure how much the subject’s score differed in the two conditions. 2) The difference scores are independent of each other, so we can now perform the t test for a single group of scores on the difference scores.

11 Difference Scores (difference-score table repeated, with the last column relabeled ‘D’) To simplify, let’s call the difference scores ‘D scores’. The mean of the D scores is a measure of the average difference between the scores in condition A and the scores in condition B.

12 Difference Scores (difference-score table repeated) In our sample, the prejudice scores were on average 2.83 higher before the day camp than they were after the day camp.
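As a rough illustration of the step from paired scores to D scores, here is a minimal Python sketch. The before and after values are hypothetical, because the data table is not preserved in this transcript. They were picked only so that the differences agree with what the slides do report: a mean difference of 2.83, differences ranging from -1 to 5, and a last difference of 3.

```python
from statistics import mean

# Hypothetical scores for six children (NOT the original slide data).
before = [32, 14, 45, 21, 38, 27]
after = [28, 15, 40, 19, 34, 24]

# One difference score per child: Before - After.
d_scores = [b - a for b, a in zip(before, after)]

# The mean of the D scores is the average change across children.
mean_d = mean(d_scores)

print("D scores:", d_scores)
print("mean D:", round(mean_d, 2))
```

A positive D here means the child scored higher (more prejudice) before the camp than after it.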

13 Difference Scores (table of D scores and their mean repeated) Mean D is a statistic; it reflects what we found in those six kids. Our hypotheses will concern the larger population these six kids represent (2-tailed): H0: μ_D = 0 Ha: μ_D ≠ 0

14 Same Thing What we are about to do is exactly the same thing as performing a t test for a single group of scores; we have simply relabeled our variable as ‘D’ (to stand for ‘difference scores’) rather than ‘Y’. This is not really a third t test, it is just another context in which we can use the t test for a single group of scores.

15 Sampling Distribution All the results we could get for mean D assuming H0 were true.

16 df and t_c: df = N - 1 = 6 - 1 = 5, so with α = .05 (two-tailed) t_c = ±2.571

17 est. standard error (Compare to t test for a single group mean).

18

19 t obt You should be able to guess what this formula is.

20 t(5)=3.248, p=.023
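The formulas on the preceding slides did not survive in this transcript, so the following Python sketch simply applies the single-group t test to the D scores. The D scores are an assumption: illustrative values chosen to be consistent with the reported mean (2.83) and t(5)=3.248, not the original data. The critical value 2.571 is the one quoted on the df slide.

```python
import math
from statistics import mean, stdev

# Assumed D scores (Before - After), consistent with the reported summary values.
d_scores = [4, -1, 5, 2, 4, 3]

n = len(d_scores)
df = n - 1                            # degrees of freedom: N - 1

mean_d = mean(d_scores)
est_sigma_d = stdev(d_scores)         # estimate of population SD (N - 1 denominator)
est_se = est_sigma_d / math.sqrt(n)   # estimated standard error of mean D

mu_d_h0 = 0                           # value of mu_D under H0
t_obt = (mean_d - mu_d_h0) / est_se

t_crit = 2.571                        # two-tailed critical t for df = 5, alpha = .05
reject_h0 = abs(t_obt) > t_crit

print(f"df={df}, est. SE={est_se:.3f}, t_obt={t_obt:.3f}, reject H0: {reject_h0}")
```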

21 Difference Scores (difference-score table repeated) If we analyze the difference scores to see if the mean of their population differs from zero we get: t(5)=3.248, p=.023. We can conclude that there is a statistically significant difference in the before and after scores (i.e. μ_D ≠ 0). If we have no serious confounding variables then we conclude that the day camp affected prejudice scores.

22 One-Tail Tests If we are testing a theory which predicts that prejudice should be less after the day camp then that would imply that the mean of the difference scores (Before - After) should be greater than zero (write Ha to express the prediction). H0: μ_D ≤ 0 Ha: μ_D > 0 This is indeed the direction the results fell, so the p value would be p=.023/2=.012. So the results are t(5)=3.248, p=.012

23 One-Tail Tests If we are testing a theory which predicts that prejudice should be greater after the day camp then that would imply that the mean of the difference scores (Before - After) should be less than zero (write Ha to express the prediction). H0: μ_D ≥ 0 Ha: μ_D < 0 This is opposite from the direction the results fell, so the p value would be p=1-.023/2=.988. So the results are t(5)=3.248, p=.988
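To see where the two-tail and one-tail p values come from, the sketch below integrates the t distribution numerically using only the Python standard library. The helper names (`t_pdf`, `upper_tail`) and the use of Simpson's rule are my own choices for illustration; a statistics package would normally do this for you. Because the slides halve the rounded value .023, working from the unrounded p gives one-tail values that can differ in the third decimal (roughly .011 and .989).

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=2000):
    """P(T >= t), using Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # area between 0 and |t|
    tail = 0.5 - area             # area beyond |t| on one side
    return tail if t >= 0 else 1 - tail

t_obt, df = 3.248, 5              # values reported on the slides

p_upper = upper_tail(t_obt, df)
p_two_tail = 2 * min(p_upper, 1 - p_upper)   # Ha: mu_D != 0
p_one_tail_predicted = p_upper               # Ha: mu_D > 0 (results in predicted direction)
p_one_tail_opposite = 1 - p_upper            # Ha: mu_D < 0 (results in opposite direction)

print(f"two-tail p = {p_two_tail:.3f}")
print(f"one-tail p (Ha: mu_D > 0) = {p_one_tail_predicted:.3f}")
print(f"one-tail p (Ha: mu_D < 0) = {p_one_tail_opposite:.3f}")
```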

24 Matched Pairs Design This type of design is analyzed exactly the same way as a repeated measures design: you analyze the difference scores.
Couple   Wife - Husband = Difference
1             -         =
2             -         =
3             -         =
4             -         =
5             -         =
6             -         =

25 Lowering Variance Since the beginning of the semester I’ve been making the point that lowering the variance of the data is a good thing, it leads to more representative data and thus makes it easier to draw conclusions about the population from which the sample was drawn. Lowering variance increases power. I have been promising we would look at a way of accomplishing that other than simply sampling from a more homogeneous population, here it is...

26 Look again at our original data. If these scores came from an independent groups design (e.g. a random half of the kids measured before the day camp and the other half measured after the day camp) we would be in trouble: look at how much the scores vary within each group. The kids really differed in prejudice levels. This variance would kill the power of our experiment. (Before/After data table repeated)

27 But with a repeated measures design we are just looking at the effect of the independent variable (attending the camp) on each kid (how much they differed before and after rather than at how prejudiced they are). The independent variable had fairly similar effects on the kids (from -1 to 5), and thus the difference scores don’t have nearly as much variability as the prejudice levels of the various kids. (difference-score table repeated)

28 (difference-score table repeated) Analyzed as t for independent groups | Analyzed as t for dependent
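The side-by-side results from this slide are not preserved, so here is a hedged Python sketch of the comparison using hypothetical scores (not the original data). The scores were built so that the children differ a lot from one another while the before-minus-after differences stay small and match the reported mean of 2.83. The same numbers are analyzed once as if they came from two independent groups and once as paired scores.

```python
import math
from statistics import mean, stdev, variance

# Hypothetical scores (NOT the original slide data).
before = [32, 14, 45, 21, 38, 27]
after = [28, 15, 40, 19, 34, 24]
n = len(before)

# Analysis 1: treat the columns as two independent groups (pooled-variance t).
pooled_var = ((n - 1) * variance(before) + (n - 1) * variance(after)) / (2 * n - 2)
se_ind = math.sqrt(pooled_var * (1 / n + 1 / n))
t_ind = (mean(before) - mean(after)) / se_ind      # df = 2n - 2 = 10

# Analysis 2: treat the scores as pairs and analyze the difference scores.
d_scores = [b - a for b, a in zip(before, after)]
se_dep = stdev(d_scores) / math.sqrt(n)
t_dep = mean(d_scores) / se_dep                    # df = n - 1 = 5

print(f"independent groups: t({2 * n - 2}) = {t_ind:.3f}")
print(f"dependent groups:   t({n - 1}) = {t_dep:.3f}")
```

The numerator (the mean difference) is identical in both analyses; only the standard error changes, because the paired analysis removes the large child-to-child variability.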

29 Variability and Designs Which t test you use is based upon how you run the study. In deciding how to run the study:
1. If you think the effect of the independent variable will be rather similar for each subject and that the subjects’ actual scores will vary quite a bit, then use a paired samples design (repeated measures or matched pairs design).
2. If you think the effect of the independent variable will vary quite a bit and that the subjects’ actual scores will be rather similar, then use an independent groups design (true experiment, quasi-experiment, static group design).
A repeated measures design is usually more powerful than an independent groups design.

30 Effect Size The direct measure of effect size in this t test is simply the mean of the difference scores. This value represents the effect of the independent variable on the participants, and it also happens to equal the mean of the first group minus the mean of the second group (making it the same as the measure of effect size in the t test for independent groups).

31 Standardized Effect Size

32 Manual Calculations

33 From SPSS When doing a ‘Paired Samples t Test’ (SPSS’s name for what I call the ‘t test for correlated groups’) the analysis will provide the following under the title ‘Paired Samples Test’: Mean = Std Deviation= In our use of symbols these would be represented as: Which is enough to compute Hedges’s g; for Cohen’s d we need the standard deviation of the sample, which can be found by:
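The formulas for this slide are missing from the transcript, so the following Python sketch rests on two assumptions. First, it follows the wording above: Hedges's g divides mean D by the standard deviation SPSS reports (N - 1 denominator), and Cohen's d divides mean D by the standard deviation of the sample (N denominator). Second, the D scores are illustrative values consistent with the reported mean and t, not the original data. Other conventions for standardized effect size in paired designs exist.

```python
import math
from statistics import mean, stdev, pstdev

# Assumed D scores (Before - After), consistent with the reported summary values.
d_scores = [4, -1, 5, 2, 4, 3]
n = len(d_scores)

mean_d = mean(d_scores)
est_sigma_d = stdev(d_scores)       # what SPSS reports as 'Std Deviation' (N - 1)

# Convert the N - 1 estimate to the sample standard deviation (N denominator).
s_d = est_sigma_d * math.sqrt((n - 1) / n)

hedges_g = mean_d / est_sigma_d     # standardized by the population estimate
cohens_d = mean_d / s_d             # standardized by the sample SD

print(f"est. sigma_D = {est_sigma_d:.3f}, S_D = {s_d:.3f}")
print(f"Hedges's g = {hedges_g:.3f}, Cohen's d = {cohens_d:.3f}")
```

The conversion can be checked against `statistics.pstdev`, which computes the N-denominator standard deviation directly.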

34

35 Warning.... There is some controversy about the correct calculations for standardized effect size. The shortcuts provided in the earlier lecture on a single group t test (repeated below) don’t work in this context: If we were to use those formulas we would get larger effect sizes:

36 GPower 3.0 In GPower this t test is called the t test for “Means: Difference between two dependent means (matched pairs)”. If you give it mean D and the standard deviation of D (‘S D ’) it will compute Cohen’s d (big deal, as we have seen that is a simple formula). The ‘Total sample size’ is the number of pairs of scores (6 in our example). By the way, the post hoc analysis shows that this example had a power of 0.80! This was due to my having the mean D be rather large compared to the S D.

37 Carry-Over Effect Carry-Over Effect: A confounding variable that may arise due to measuring the same person more than once, thus can only happen in a repeated-measures design. Practice effect: the general term for when a carry-over effect leads to an increase in performance over subsequent measures. Fatigue effect: the general term for when a carry-over effect leads to a decrease in performance over subsequent measures.

38 Options for Controlling Carry-Over Effects
1. If your independent variable is a carry-over effect (e.g. the effect of practice) then you do not need or want to control it. Otherwise:
2. If applicable, use different forms of the same test.
3. Minimize the carry-over effect (e.g. increase the time between first measure and second measure).
4. Counterbalance the order of conditions.

39 Counterbalancing the Order of Conditions Half the participants are in Condition A first and in Condition B second. The other half of the participants are in Condition B first and Condition A second.
Subject   Condition A   Condition B
S1        1st           2nd
S2        2nd           1st
S3        1st           2nd
S4        2nd           1st
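A small Python sketch of the alternating assignment shown in the table. The function name `counterbalance` is hypothetical, and the strict alternation is just one simple scheme. In an actual study you would probably shuffle the participant list first so that which person gets which order is random.

```python
def counterbalance(subjects):
    """Assign condition orders so that half get A then B, half get B then A."""
    orders = {}
    for i, subject in enumerate(subjects):
        # Even positions get A first, odd positions get B first.
        orders[subject] = ("A", "B") if i % 2 == 0 else ("B", "A")
    return orders

orders = counterbalance(["S1", "S2", "S3", "S4"])
for subject, (first, second) in orders.items():
    print(f"{subject}: Condition {first} first, Condition {second} second")
```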