Presentation on theme: "Repeated Measures t - statistic & Estimation"— Presentation transcript:
1 Repeated Measures t-statistic & Estimation. Lecture 9: Chapters 11 & 12.
2 Two-sample t-test: 2 research strategies. Between subjects (independent measures): the 2 data sets come from 2 completely different samples, e.g. a sample of depressed patients treated with talk therapy and a separate sample of depressed patients treated with meditation therapy. Within subjects (repeated measures): the 2 data sets come from the same sample, e.g. a sample of depressed people before treatment and the same sample of depressed people after treatment, or dogs that are given a choice between two types of dog food while researchers measure the amount eaten of each type over a 10 min. period.
3 Repeated Measures Stats. Within subjects: a single sample of individuals is measured more than once on the same dependent variable; the same subjects are used in all treatment conditions. Matched sample: 2 separate samples of subjects, but each individual in one sample is matched with a subject in the other sample, e.g. subjects are matched on several variables such as age, income, education, and sex. In this chapter we focus on repeated measures because they are more common than matched subjects, but the same statistical techniques apply.
4 t-statistic for repeated measures. The related-samples t is based on difference scores rather than raw scores (X values). Difference score: $D = X_2 - X_1$, e.g. $X_2$ was obtained after treatment and $X_1$ is the baseline condition. As with every t-statistic, $t = \frac{\text{sample statistic} - \text{population parameter}}{\text{estimated standard error}}$, which for difference scores becomes $t = \frac{M_D - \mu_D}{s_{M_D}}$, where $M_D$ = mean of the sample difference scores, $\mu_D$ = unknown population mean difference, and $s_{M_D}$ = estimated standard error of the mean difference.
5 Hypothesis Tests for Repeated Measures. (1) State the hypotheses (remember, these could also be directional). $H_0: \mu_D = 0$. There is no effect, no change, no difference. According to this hypothesis, some individuals may show positive D scores and some negative D scores, but not systematically; when averaged, they balance to 0. E.g. your resting HR if tested every week for 8 weeks under the same conditions, with no change in exercise regime. $H_1: \mu_D \neq 0$. Difference scores are consistently positive or consistently negative. E.g. your resting HR if you are tested every week for 8 weeks while cardiovascular exercise is increased every day.
6 Example. A researcher is investigating the effects of eating oatmeal on cholesterol. A sample of 10 volunteers was obtained (none of whom ate oatmeal on a regular basis). Each volunteer had his/her cholesterol measured. Subjects were then asked to eat 2 cups of oatmeal each day. After 30 days their cholesterol was measured again. Does oatmeal change cholesterol levels? $H_0: \mu_D = 0$ (changes in cholesterol levels are due to chance). $H_1: \mu_D \neq 0$ (changes in cholesterol levels are due to oatmeal). Data for subjects 1 to 10 as (baseline, test, diff.): (145, 145, 0), (187, 157, -30), (130, 119, -11), (155, 140, -15), (152, 140, -12), (112, 115, +3), (120, 111, -9), (208, 199, -9), (167, 159, -8), (184, 186, +2).
7 Hypothesis Tests for Repeated Measures. (2) Locate the critical region. All calculations for the t-statistic are done with difference scores, and there is only 1 D score for each subject, so df = n - 1 (n refers to the number of D scores, not the number of X scores). For the oatmeal example (n = 10): df = 9, alpha = .05, two-tailed, critical t-value = +/-2.262.
8 Hypothesis Tests for Repeated Measures. (3) Collect data and compute the test statistic, using D scores in the formulas. D scores for subjects 1 to 10: 0, -30, -11, -15, -12, +3, -9, -9, -8, +2. $\sum D = -89$, $M_D = -8.9$, $SS = 836.9$, $s^2 = SS/df = 92.99$, $s_{M_D} = \sqrt{s^2/n} = 3.05$, and $t = \frac{M_D - \mu_D}{s_{M_D}} = \frac{-8.9 - 0}{3.05} = -2.92$.
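The computation on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the function name `repeated_measures_t` is hypothetical, and two of the ten difference scores (the 0 and the second -9) are assumptions inferred from the slide's table and its reported variance of 92.99.

```python
import math

# Difference scores (test - baseline) for the 10 volunteers.
# Assumption: the 0 and the second -9 are inferred, not printed in the table.
d_scores = [0, -30, -11, -15, -12, 3, -9, -9, -8, 2]

def repeated_measures_t(d, mu_d=0.0):
    """Return (M_D, SS, s^2, s_MD, t) for a list of difference scores."""
    n = len(d)
    m_d = sum(d) / n                        # mean difference score
    ss = sum((x - m_d) ** 2 for x in d)     # sum of squared deviations
    s2 = ss / (n - 1)                       # sample variance, df = n - 1
    s_md = math.sqrt(s2 / n)                # estimated standard error
    t = (m_d - mu_d) / s_md                 # mu_d = 0 under the null
    return m_d, ss, s2, s_md, t

m_d, ss, s2, s_md, t = repeated_measures_t(d_scores)
print(f"M_D = {m_d:.2f}, SS = {ss:.2f}, s^2 = {s2:.2f}, s_MD = {s_md:.2f}, t = {t:.2f}")
```

With these scores the sketch reproduces the slide's values of 92.99 for the variance, 3.05 for the standard error, and -2.92 for t.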
9 Hypothesis Tests for Repeated Measures. (4) Make a decision. Our t value is -2.92 and the critical value was +/-2.262. The obtained t falls in the critical region, so we can reject the null hypothesis! Eating oatmeal significantly reduces cholesterol, t(9) = -2.92, p < .05.
10 Directional Hypothesis Test for Repeated Measures. What would change in our previous example if we had predicted that oatmeal would lower cholesterol levels? State the hypotheses: $H_0: \mu_D \geq 0$ (cholesterol levels are not reduced after eating oatmeal); $H_1: \mu_D < 0$ (cholesterol levels are reduced after eating oatmeal). Locate the critical region: df = n - 1 = 9, alpha = .05, one-tailed, critical value = -1.833.
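The difference between the two-tailed and the directional decision rule can be sketched as follows. The helper `reject_null` is a hypothetical name, and the critical values 2.262 and 1.833 are assumed to be the standard t-table entries for df = 9 at alpha = .05 (two tails and one tail).

```python
def reject_null(t_obs, t_crit, tail="two"):
    """Decide whether t_obs falls in the critical region.

    t_crit is the positive table value for the chosen alpha and df.
    tail: "two", "lower" (H1: mu_D < 0) or "upper" (H1: mu_D > 0).
    """
    if tail == "two":
        return abs(t_obs) > t_crit
    if tail == "lower":
        return t_obs < -t_crit
    if tail == "upper":
        return t_obs > t_crit
    raise ValueError("tail must be 'two', 'lower' or 'upper'")

t_obs = -2.92  # oatmeal example, df = 9
print(reject_null(t_obs, 2.262, "two"))    # two-tailed test, alpha = .05
print(reject_null(t_obs, 1.833, "lower"))  # one-tailed test, alpha = .05
```

The oatmeal result is rejected under both rules. A weaker result such as t = -1.9 would be rejected only by the directional test, because its critical region sits entirely in one tail.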
11 Measuring Effect Size. d = mean difference / standard deviation, i.e. $d = M_D / s$. From our oatmeal example: $M_D = -8.9$, $s^2 = 92.99$, $s = \sqrt{92.99} = 9.64$, so $d = 8.9 / 9.64 = 0.92$. Large effect.
12 Measuring Effect Size. $r^2 = \frac{t^2}{t^2 + df} = \frac{(-2.92)^2}{(-2.92)^2 + 9} = \frac{8.53}{17.53} = .49$, or 49% of the variance. Large effect size using this measure.
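Both effect-size measures can be checked with a short sketch. The function names are hypothetical, and the mean difference is assumed to be $M_D = -8.9$, the value implied by $t \times s_{M_D} = -2.92 \times 3.05$ in the oatmeal example.

```python
import math

def cohens_d(mean_diff, variance):
    """Estimated Cohen's d = mean difference / sample standard deviation."""
    return mean_diff / math.sqrt(variance)

def r_squared(t, df):
    """Proportion of variance accounted for: r^2 = t^2 / (t^2 + df)."""
    return t ** 2 / (t ** 2 + df)

d = cohens_d(-8.9, 92.99)   # oatmeal example: M_D and s^2
r2 = r_squared(-2.92, 9)    # oatmeal example: t and df
print(f"|d| = {abs(d):.2f}, r^2 = {r2:.2f}")
```

The sign of d only reflects the direction of the change, so its magnitude is what gets compared with the usual small, medium, and large benchmarks.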
13 Matched Sample - Same Idea. A new reading program has been developed for students. We want to make sure that one of the two samples doesn’t, just by chance, contain better readers, so we match the groups one-to-one: if student A has a pre-treatment reading achievement score of 75, we match that student with a student B in the second sample who also scored 75. Data as (control, reading program, D, D^2) for matched pairs A to D: A (6, 15, +9, 81), B (5, 15, +10, 100), C (11, 17, +6, 36), D (6, 13, +7, 49).
14 Matched-pair D scores: +9, +10, +6, +7. $H_0: \mu_D = 0$ (no effect on reading comprehension). $H_1: \mu_D \neq 0$ (the program has an effect on reading comprehension). $M_D = 8$, $SS = 10$, $s^2 = 3.33$, $s_{M_D} = .91$.
15 Matched-pair summary: $M_D = 8$, $SS = 10$, $s^2 = 3.33$, $s_{M_D} = .91$. Set the critical region for a two-tailed test: df = n - 1 = 3, so with alpha = .05 the critical values are +/-3.182. $t = 8 / .91 = 8.79$. Reject the null: the new reading program works!
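The matched-sample numbers can be reproduced with the computational formula for SS, which is what the D^2 column of the table is for. This is a sketch under assumptions: the reading-program score for pair B (15) and the control score for pair D (6) are inferred from the D column rather than read directly from the table.

```python
import math

# (control, reading program) scores for matched pairs A to D.
# Assumption: 15 for pair B and 6 for pair D are inferred from the D column.
pairs = [(6, 15), (5, 15), (11, 17), (6, 13)]

d = [program - control for control, program in pairs]
n = len(d)
sum_d = sum(d)                      # sum of D
sum_d2 = sum(x * x for x in d)      # sum of D^2

m_d = sum_d / n
ss = sum_d2 - sum_d ** 2 / n        # computational formula for SS
s2 = ss / (n - 1)
s_md = math.sqrt(s2 / n)
t = m_d / s_md                      # null hypothesis: mu_D = 0

print(d, m_d, ss, round(s2, 2), round(s_md, 2), round(t, 2))
```

Carrying full precision gives t of about 8.76; the slide's 8.79 comes from dividing by the rounded standard error of .91. Either value is far beyond the critical region.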
16 Relationship between Descriptive and Inferential Statistics. Descriptive stats should help you visualize the sample data and so give you a better understanding of the results. For example, take mood scores for depressed people before and after meditation therapy, with $M_D = 12$ and $s = 3.6$. Without even performing a hypothesis test, it looks as if the sample was obtained from a different population of scores.
17 Advantages to Repeated Measures. Good when only a small N is available, e.g. patient populations, rare species, etc. Good for questions about changes across time: developmental questions, learning questions. No worry about individual differences: the difference cannot simply be due to the individuals (e.g. in a learning study, one group just being smarter than the other). Decreases sample variance (remember, the higher the variance, the more difficult it is to see a pattern in our data).
18 Disadvantages. Carryover effects: a participant’s response in the second treatment is altered by lingering effects of the first treatment, e.g. administering the second drug too soon after the first. Progressive error: a participant’s performance changes consistently over time, e.g. a participant improves simply due to practice. Counterbalancing the order of treatments can help with these problems, e.g. subject 1 gets treatment 1 first and then treatment 2, while subject 2 gets treatment 2 first and then treatment 1.
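Counterbalancing with two treatments amounts to alternating the order across subjects. A minimal sketch, with hypothetical subject labels and a hypothetical helper name `counterbalance`:

```python
def counterbalance(subjects, treatments=("treatment 1", "treatment 2")):
    """Alternate the order of two treatments across subjects."""
    first, second = treatments
    orders = {}
    for i, subject in enumerate(subjects):
        # Even positions get first -> second; odd positions get the reverse.
        orders[subject] = (first, second) if i % 2 == 0 else (second, first)
    return orders

orders = counterbalance(["S1", "S2", "S3", "S4"])
for subject, order in orders.items():
    print(subject, "->", " then ".join(order))
```

With an even number of subjects, each order is used equally often, so practice or carryover effects are spread evenly over both treatments rather than piling up on the second one.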
19 Estimation - Another Inferential Statistical Procedure. Hypothesis testing lets us know whether a particular sample came from a particular population. It helps us evaluate the effect of the treatment: the treatment either has an effect or it doesn’t (e.g. eating oatmeal does lower cholesterol). Estimation is the process of using sample data to estimate the values of population parameters. So it will estimate the value of the population mean after treatment, and with it how big or small the treatment effect is. Estimation and hypothesis testing are both inferential stats: they use sample data as the basis for drawing conclusions about an unknown population.
20 Estimation. The use of samples to estimate the population is quite common, e.g. 42% of the population supports the president; 8% of UA students are members of the Green Party; 34% of Americans are homeowners. NOTE: We’ve already done estimation before. We know how to estimate the population variance from the sample variance, using df.
21 2 Types of Estimation. We also know how to estimate the SE, so we know the average distance between our sample mean and the population mean. So we can begin to estimate the population mean. Point estimation: uses a single number to estimate the population mean. If I were to estimate the weight of my dog, I’d say 100 lbs. Interval estimation / confidence intervals: uses a range of values to estimate the population mean. Interval estimates are usually accompanied by a probability, called the level of confidence, that expresses how sure we are that the range contains the population mean. If I were to estimate the weight of my dog, I might say between 90 and 110 lbs.
22 When should we estimate? (1) After a hypothesis test in which H0 is rejected. In this case we know there is a treatment effect, but we want to know how large it is (like the oatmeal example). (2) When you already know there is an effect, but you want to know how much. For instance, we probably know that tutoring will improve grades. However, tutoring is $25 / hour. We want to know how much our grade will improve to figure out whether the cost is justified. (3) When we want basic information about a population. For instance, we want to know the average number of times college students eat pizza each week, or about how many people support the war in Iraq.
23 Hypothesis Testing vs. Estimation. Hypothesis testing (goal = testing the null hypothesis): (1) Hypothesize about the unknown pop. parameter. (2) Calculate z or t by substituting the hypothesized value into the formula. (3) If we get an extreme value for z or t, we conclude the hypothesized value was incorrect and reject the null. (4) An extreme value is determined by its location in the distribution; with alpha = .05, extreme values have a probability of less than 5%. Estimation (goal = estimating the value of the parameter): (1) Don’t calculate z or t. Instead, estimate what z or t should be if our parameter is reasonable. (2) We usually select a z or t of 0 (or a range around 0), because this is the most probable value. (3) The z or t score is inserted into the formula and we solve for the parameter. (4) Because we chose a reasonable z or t, we assume our parameter will be a reasonable estimate.
24 Calculating an Estimate. The middle of the distribution holds the high-probability, REASONABLE outcomes for z and t; the tails hold the extreme, low-probability outcomes. Unknown parameter = statistic +/- (z or t * SE). (1) We know all the values on the right side except for z or t. (2) We don’t know the z or t score, but we do know what the distribution looks like; we know its mean is always zero. (3) For a point estimate, the best value for z or t is 0. For an interval estimate, the best values mark off the middle part of the distribution.
25 Estimate with a z-score: point estimation. Mean grade for Latin classes at UofA was 75 ($\mu = 75$, $\sigma = 10$). We want to know how much grades improved after some students (n = 15) took tutoring. Their mean grade was 85, so after tutoring $\mu = ?$. Point estimate: $z = \frac{M - \mu}{\sigma_M}$ or (because of algebra) $\mu = M \pm z\,\sigma_M$, with $\sigma_M = 10/\sqrt{15} = 2.58$. Setting z = 0: $0 = \frac{85 - \mu}{2.58}$, so $\mu = 85$. So the sample mean estimates the population mean. Because of the Central Limit Theorem this should make sense: as n increases, the sample mean should approximate the population mean.
26 Estimate with a z-score: confidence intervals. Mean grade for Latin classes at UofA was 75. We want to know how much grades improved after some students (n = 15) took tutoring. Their mean grade was 85. Confidence intervals: commonly used levels of confidence start at 60% and go up. Let’s use our tutoring example and say we want to be 75% confident that the true population mean lies within our predicted range. (1) Determine the z-scores that bound the middle 75% of the distribution. That leaves 25% in the two tails, so .25 / 2 = .125 in each tail, and the z-score associated with .125 in the tail column of our unit normal table is +/-1.16. (2) Estimate the population mean: $z = \frac{M - \mu}{\sigma_M}$ or (because of algebra) $\mu = M \pm z\,\sigma_M$. $1.16 = \frac{85 - \mu}{2.58}$ gives $\mu = 85 - 2.99 = 82.01$; $-1.16 = \frac{85 - \mu}{2.58}$ gives $\mu = 85 + 2.99 = 87.99$. So we are 75% confident that the population mean after tutoring is between 82.01 and 87.99.
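The table lookup and the interval can be sketched with Python's standard library. Two assumptions are worth flagging: the population standard deviation is taken to be 10 (the value implied by the standard error of 2.58 with n = 15), and the function name `z_confidence_interval` is hypothetical.

```python
import math
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence):
    """Interval estimate of mu: M +/- z * sigma_M, with sigma known."""
    sigma_m = sigma / math.sqrt(n)                       # standard error
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # upper boundary of the middle area
    margin = z * sigma_m
    return sample_mean - margin, sample_mean + margin, z

# Tutoring example: M = 85, n = 15, 75% confidence; sigma = 10 is an assumption.
low, high, z = z_confidence_interval(85, 10, 15, 0.75)
print(f"z = {z:.3f}, interval = ({low:.2f}, {high:.2f})")
```

The computed z is close to 1.15, so the interval comes out slightly narrower than the one built from the table value of 1.16 used on the slide.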
27 Interpreting a Confidence Interval. Population mean = sample mean +/- some error. What do we know from our confidence interval? The sample mean of 85 is located somewhere in the distribution of sample means. Although we do not know its exact location, we can be 75% sure that it lies between z-scores of +/-1.16. If we took additional samples, 75% of the sample means would be located between z-scores of +/-1.16. And as long as the sample mean is located in the middle 75% of the distribution, the interval will contain the population mean.
28 Let’s Try One. A farmer is interested in increasing his corn yield. He read about a new fertilizer that is purported to increase yield. Before spending the money to fertilize all his fields, he decided to test the fertilizer on 2 fields. The average yield of corn without the fertilizer is 150 bushels with a standard deviation of 25. The average yield on his sample of 2 fields was 190 bushels. Make a point estimate. Make an interval estimate of the population mean, so that you are 80% sure that the true mean is in your interval.
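29 Answers. Point estimate: with $\sigma_M = 25/\sqrt{2} \approx 17.67$ and z = 0, $0 = \frac{190 - \mu}{17.67}$, so $\mu = 190$. Confidence interval: the z-scores that bound the middle 80% of the distribution (20% left in the tails, so 10% in each tail) are +/-1.29. $1.29 = \frac{190 - \mu}{17.67}$ gives $\mu = 190 - 22.79 = 167.21$; $-1.29 = \frac{190 - \mu}{17.67}$ gives $\mu = 190 + 22.79 = 212.79$. So we are 80% sure that the population mean for the amount of corn harvested after treatment with the fertilizer is between 167.21 and 212.79 bushels. Should we buy the fertilizer?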
30 Estimation with a single-sample t. Rearrange the t-statistic so that population mean = sample mean +/- t * standard error. Single-sample t: $\mu = M \pm t\,s_M$. Same rules, only now we are estimating a value or range of values for t. Estimate where the sample data are located in the t distribution: the most likely value is t = 0 for a point estimate; for an interval, the chosen level of confidence determines the t values. The sample mean and standard error (both computed from our sample) and our estimated t value get plugged into the formula.
31 Point estimate with a single-sample t. A toy manufacturer asks a developmental psychologist to test children’s responses to a new product. Specifically, the manufacturer wants to know how long, on average, the toy keeps a child’s attention. A sample of 9 children is taken and the psychologist measures the amount of time they play with the toy. Sample mean = 31 min. and SS = 648. $t = \frac{M - \mu}{s_M}$ or $\mu = M \pm t\,s_M$. $s^2 = SS/df = 648/8 = 81$; $s = \sqrt{s^2} = 9$; $s_M = s/\sqrt{n} = 9/\sqrt{9} = 3$. With t = 0: $0 = \frac{31 - \mu}{3}$, so $\mu = 31$. NOTE: We aren’t applying a treatment here, so we aren’t trying to estimate the treatment effect size, only the population mean.
32 Confidence Intervals with a single-sample t. Same toy example: n = 9, sample mean = 31 min., SS = 648. $t = \frac{M - \mu}{s_M}$ or $\mu = M \pm t\,s_M$. The toy manufacturer decides they want a 95% confidence interval. So we need the t values that form the boundaries of the middle 95%, i.e. the critical t-value when .05 is the proportion in 2 tails. We also need to account for df, which is 8. Our critical t values are +2.306 and -2.306.
33 Confidence Intervals with a single-sample t. Same toy example: sample mean = 31 min., $s_M = 3$, and critical t values of +2.306 and -2.306 (rounded to +/-2.3 in the calculation). Estimate the range of our population mean: $2.3 = \frac{31 - \mu}{3}$ gives $6.9 = 31 - \mu$, so $\mu = 24.1$; $-2.3 = \frac{31 - \mu}{3}$ gives $-6.9 = 31 - \mu$, so $\mu = 37.9$. We are 95% confident that the population mean is between 24.1 and 37.9 minutes.
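The toy example can be worked from SS and n in one step. In this sketch the helper name `t_interval_from_ss` is hypothetical, and the critical value 2.306 is assumed to be the t-table entry for df = 8 with .05 in two tails; the standard library has no t distribution, so the table value is passed in.

```python
import math

def t_interval_from_ss(sample_mean, ss, n, t_crit):
    """Interval estimate of mu: M +/- t * s_M, computed from SS and n.

    t_crit is looked up in a t table for df = n - 1.
    """
    s2 = ss / (n - 1)              # sample variance
    s_m = math.sqrt(s2 / n)        # estimated standard error
    margin = t_crit * s_m
    return sample_mean - margin, sample_mean + margin, s_m

# Toy example: M = 31, SS = 648, n = 9; table value for df = 8, 95% confidence.
low, high, s_m = t_interval_from_ss(31, 648, 9, 2.306)
print(f"s_M = {s_m:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

Using the unrounded critical value gives limits of about 24.08 and 37.92, which agree with the slide's 24.1 and 37.9 after rounding.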
34 Let’s Try One. You want to know how many times on average college students order pizza a month. You take a sample of 25 college students and find that they on average ordered pizza 1.8 times per week, with an SS of 326. Do a point estimate. Calculate a 90% confidence interval.
35 Independent-Measures. $\mu_1 - \mu_2 = (M_1 - M_2) \pm t\,s_{(M_1 - M_2)}$, from $t = \frac{(M_1 - M_2) - (\mu_1 - \mu_2)}{s_{(M_1 - M_2)}}$. Let’s compute a point estimate and a 95% confidence interval. Sample 1: n = 10, M = 25, SS = 250. Sample 2: n = 5, M = 33, SS = 140.
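One way to work this exercise is sketched below. It assumes the standard error is built from the pooled variance, which the slide does not state explicitly, and it takes 2.160 as the t-table value for df = 13 with .05 in two tails. The function name is hypothetical.

```python
import math

def independent_measures_interval(m1, ss1, n1, m2, ss2, n2, t_crit):
    """Interval estimate of mu1 - mu2 using the pooled variance."""
    df = (n1 - 1) + (n2 - 1)
    pooled_var = (ss1 + ss2) / df                        # assumption: pooled variance
    se = math.sqrt(pooled_var / n1 + pooled_var / n2)    # s_(M1 - M2)
    point = m1 - m2                                      # point estimate (t = 0)
    margin = t_crit * se
    return point, point - margin, point + margin, df, se

point, low, high, df, se = independent_measures_interval(25, 250, 10, 33, 140, 5, 2.160)
print(f"df = {df}, SE = {se:.2f}, point = {point}, 95% CI = ({low:.2f}, {high:.2f})")
```

Under these assumptions the point estimate of the mean difference is -8, and the interval runs from about -14.48 to -1.52, so it does not include zero.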
36 Repeated Measures. $\mu_D = M_D \pm t\,s_{M_D}$, from $t = \frac{M_D - \mu_D}{s_{M_D}}$. Let’s compute a point estimate and a 90% confidence interval. $M_D = 21$, SS = 1215, n = 15.
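A possible worked solution for this exercise, assuming the t-table value for df = 14 with .10 in two tails is 1.761:

```latex
\begin{aligned}
s^2 &= \frac{SS}{n - 1} = \frac{1215}{14} \approx 86.79 \\
s_{M_D} &= \sqrt{\frac{s^2}{n}} = \sqrt{\frac{86.79}{15}} \approx 2.41 \\
\text{point estimate } (t = 0):\quad \mu_D &= M_D = 21 \\
\text{90\% interval } (t = \pm 1.761):\quad \mu_D &= 21 \pm 1.761(2.41) = 21 \pm 4.24
\end{aligned}
```

Under that assumption the interval estimate of the mean difference runs from about 16.76 to 25.24.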
37 Factors affecting CI width. To gain more confidence in your estimate, you must increase the width: the larger the level of confidence (%), the larger the t value and the wider the interval. Interval width decreases as n increases: a bigger sample gives more info about the population, so we can make a more precise estimate. Sample size controls the magnitude of the standard error.
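Both factors can be seen numerically in a small sketch. To stay within the standard library it uses z rather than t, which is a simplification of what the slide describes, and the population standard deviation of 10 is a hypothetical value chosen for illustration.

```python
import math
from statistics import NormalDist

def interval_width(sigma, n, confidence):
    """Width of a z-based interval: 2 * z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * sigma / math.sqrt(n)

sigma = 10  # hypothetical population standard deviation
for confidence in (0.80, 0.90, 0.95):
    for n in (15, 60):
        width = interval_width(sigma, n, confidence)
        print(f"{confidence:.0%} confidence, n = {n}: width = {width:.2f}")
```

Raising the confidence level widens the interval at every sample size, while quadrupling n halves the width because the standard error shrinks with the square root of n.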