7.1 Using Statistics To Make Inferences 7 Summary Single sample test of variance. Comparison of two variances. Monday, 22 June 2015 9:52 PM
7.2 Goals To perform and interpret χ² and F tests. These tests are not available individually within SPSS, but are embedded within more complex procedures. Practical Revert to the data from practical 5 and employ the Mann-Whitney test where you previously employed a t test. Perform a t test on reading ability data.
7.3 Recall In lecture 4 we compared the means of two samples using an appropriate t test. What assumption did we make about the variances of the two samples? It was assumed that the two sample variances were effectively equal.
7.4 Recall What notation do we employ for a population mean and for a sample mean? Population mean μ Sample mean x̄ Typically assessed with a t test
7.5 Recall What notation do we employ for a population variance and for a sample variance? Population variance σ² Sample variance s² Typically assessed with a χ² or F test
7.6 Examination of the Variance Equality of means does not imply equality of variances. It is often important to control (minimise) the variance.
7.7 Examination of the Variance How do we compare a sample variance against the expected population value?
7.8 Single Sample Variance Test The null hypothesis is that a population standard deviation is equal to a particular value σ. Assuming that the data are normally distributed, a sample of size n is obtained from the population and a standard deviation, s, calculated. The test statistic is χ² = (n − 1)s²/σ²
7.9 Conclusion This statistic follows a chi-squared distribution with ν = n − 1 degrees of freedom; it is compared with the tabulated values at significance level α.
Example From past records students' marks have a standard deviation of 10. A group of 20 students is taught by a new method; the standard deviation of their marks is 7.6. Is the group significantly more, or less, variable? H₀ is that σ = 10 H₁ is that σ ≠ 10
Conclusion n = 20 s = 7.6 σ = 10 ν = n − 1 = 19 χ² = 19 × 7.6²/10² = 10.97
Use of Tables The χ² table lists upper tail columns p = 0.1, 0.05, 0.025, 0.01, 0.005 and lower tail columns p = 0.9, 0.95, 0.975, 0.99, 0.995; read the row ν = 19. Here the 5% and 95% values are 30.14 and 10.12. Since 10.12 < 10.97 < 30.14 the value is not significant at the 10% level (two tail test), the null hypothesis is not rejected, and there is no evidence that the new method affects the variability of the marks. For 5% (2.5% and 97.5%) the corresponding values are 32.85 and 8.91 and the conclusion is unchanged.
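The arithmetic of the marks example can be checked with a short Python sketch. The function name `chi_squared_statistic` is illustrative only, and the two critical values are typed in from a standard χ² table (row ν = 19) rather than computed.

```python
def chi_squared_statistic(n, s, sigma0):
    """Single sample variance test statistic (n - 1) * s**2 / sigma0**2."""
    return (n - 1) * s ** 2 / sigma0 ** 2

# Values from the marks example: 20 students, s = 7.6, hypothesised sigma = 10.
n, s, sigma0 = 20, 7.6, 10.0
stat = chi_squared_statistic(n, s, sigma0)

# Tabulated values for nu = 19: lower 5% point and upper 5% point.
lower_5, upper_5 = 10.12, 30.14

# Two tail test at the 10% level: significant only if outside both cut offs.
significant_at_10 = stat < lower_5 or stat > upper_5
print(round(stat, 2), significant_at_10)
```

The statistic falls between the two tabulated values, which matches the conclusion that the result is not significant.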
Example The following data are obtained. Past experiments suggest that the standard deviation is never more than 2. H₀ is that σ = 2 H₁ is that σ > 2 (a one sided test)
Calculation You might find the following sums useful Σx = and Σx² =
Calculation n = 12 s = (direct calculation) σ = 2 ν = n − 1 = 11 χ² = (n − 1)s²/σ² = 21.79 From the χ² table (row ν = 11) the upper 5% value is 19.68. Since 19.68 < 21.79, the result is significant at the 5% level, which means you can be 95% confident of your result; the null hypothesis is rejected, and the variability is significantly higher.
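When only Σx and Σx² are to hand, s can be obtained from s² = (Σx² − (Σx)²/n)/(n − 1). A minimal sketch follows; the data list is hypothetical (it is NOT the data set of the example above) and the name `sample_sd_from_sums` is illustrative.

```python
import math
import statistics

def sample_sd_from_sums(n, sum_x, sum_x2):
    """Sample standard deviation from n, the sum of x and the sum of x squared."""
    variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    return math.sqrt(variance)

# Hypothetical observations, used only to illustrate the formula.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)
sum_x = sum(data)
sum_x2 = sum(x * x for x in data)

s = sample_sd_from_sums(n, sum_x, sum_x2)

# Test statistic against a hypothesised sigma of 2, as in H0 above.
chi2 = (n - 1) * s ** 2 / 2.0 ** 2

# Cross-check against the standard library's direct calculation.
print(math.isclose(s, statistics.stdev(data)), round(chi2, 2))
```

The sums formula is convenient on a calculator; for long data sets a direct calculation such as `statistics.stdev` is less prone to rounding error.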
Confidence Interval A confidence interval for the variance, with confidence level 1 − α, is (n − 1)s²/χ²(α/2) < σ² < (n − 1)s²/χ²(1 − α/2), where χ²(p) is the tabulated value with upper tail probability p for ν = n − 1 degrees of freedom.
Confidence Interval For example, if n = 31 and s = 27.63, our degrees of freedom are n − 1 = 30, so that if the confidence level is 95% (α = 0.05), we look up χ²(0.025) = 46.98 and χ²(0.975) = 16.79.
Confidence Interval Recall n = 31 and s = 27.63, so 30 × 27.63²/46.98 < σ² < 30 × 27.63²/16.79, that is 487.5 < σ² < 1364.1
Confidence Interval So we can be 95% certain that σ lies in the interval [22.08, 36.93] (on taking the square root).
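The interval can be reproduced with a few lines of Python. This is a sketch only: the function name `sd_confidence_interval` is illustrative, and the two χ² values for ν = 30 (46.98 for upper tail 2.5%, 16.79 for upper tail 97.5%) are typed in from a standard table.

```python
import math

def sd_confidence_interval(n, s, chi2_upper, chi2_lower):
    """Confidence interval for sigma from a sample standard deviation s.

    chi2_upper is the tabulated value with upper tail probability alpha/2,
    chi2_lower the value with upper tail probability 1 - alpha/2.
    """
    sum_of_squares = (n - 1) * s ** 2
    return (math.sqrt(sum_of_squares / chi2_upper),
            math.sqrt(sum_of_squares / chi2_lower))

# n = 31, s = 27.63, 95% confidence, nu = 30.
low, high = sd_confidence_interval(31, 27.63, 46.98, 16.79)
print(round(low, 2), round(high, 2))
```

Note that the larger tabulated value gives the lower bound, because it appears in the denominator.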
Aside Consider Boys and Girls and the desire to predict gender based on some simple test. Assume that 50% of births are Boys so that Prob(Boy) = Prob(Girl) = ½. A simple, inexpensive, non-invasive gender testing procedure is "perfect" for boys, Prob(Test Boy|Boy) = 1, implying Prob(Test Girl|Boy) = 0. Unfortunately, this simple gender testing procedure for girls is a "coin toss," Prob(Test Girl|Girl) = Prob(Test Boy|Girl) = ½. By evaluating Prob(Boy|Test Boy) and Prob(Girl|Test Girl) assess which is the more likely.
Aside What approach is appropriate (simplest)?
Solution (tree diagram) What is the grand total of the probabilities?
Prob(girl|test says girl) (tree diagram, "test says girl" branches)
Prob(boy|test says boy) (tree diagram, "test says boy" branches)
Solution The tree diagram or application of Bayes' theorem yields what seems to be a strange inversion, Prob(Boy|Test Boy) = ⅔ and Prob(Girl|Test Girl) = 1. That is, somehow, "perfection" switched from Boy to Girl. The test itself was perfect in "confirming" that a Boy was a Boy and had a 50% error rate in confirming that a Girl was a Girl.
Alternate Approach What if we tested 100 boys and 100 girls? Complete the following table.
Alternate Approach Complete the table using Prob(Test Boy|Boy) = 1 and Prob(Test Girl|Girl) = Prob(Test Boy|Girl) = ½.
        Test says boy   Test says girl
Boy     100             0
Girl    50              50
Prob(girl|test says girl) From the "test says girl" column, Prob(Girl|Test says girl) = 50/50 = 1
Prob(boy|test says boy) From the "test says boy" column, Prob(boy|test says boy) = 100/150 = ⅔
Conclusion The previous result follows. Of course! More of this next week.
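For those who prefer to check the aside by machine, Bayes' theorem can be applied with exact fractions. The sketch below uses only the probabilities stated in the aside; the dictionary layout and the function name `posterior` are illustrative choices.

```python
from fractions import Fraction

# Prior probabilities of each gender.
prior = {"boy": Fraction(1, 2), "girl": Fraction(1, 2)}

# Prob(test result | true gender), as stated in the aside.
likelihood = {
    "boy": {"test boy": Fraction(1), "test girl": Fraction(0)},
    "girl": {"test boy": Fraction(1, 2), "test girl": Fraction(1, 2)},
}

def posterior(gender, result):
    """Prob(gender | test result) by Bayes' theorem."""
    joint = prior[gender] * likelihood[gender][result]
    evidence = sum(prior[g] * likelihood[g][result] for g in prior)
    return joint / evidence

p_boy = posterior("boy", "test boy")
p_girl = posterior("girl", "test girl")
print(p_boy, p_girl)
```

The denominator `evidence` is the total probability of the test result, which plays the same role as the column total in the table of 100 boys and 100 girls.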
Comparison of Two Sample Variances We know that a t test may be used to compare two sample means. We now compare two sample variances, assuming that the data are normally distributed.
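Two Sample Variance Test The test statistic is F = s₁²/s₂² with ν₁ = n₁ − 1 and ν₂ = n₂ − 1 degrees of freedom. Note that the tables only give upper tail significance levels, so the larger sample variance must be placed in the numerator. The result is significant if F exceeds the tabulated value F crit for (ν₁, ν₂).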
Two Sample Variance Test The tables only give upper tail significance levels. What if the lower tail is required? So, swap the degrees of freedom and reciprocate.
Two Sample Variance Test To illustrate swapping the degrees of freedom and reciprocating:
ν₁   ν₂   α       F crit
5    4    0.025   9.36
4    5    0.025   7.39
5    4    0.975   0.14 (reciprocal of 7.39)
Example Two samples are taken to check for equality of their variances. Sample 1: 16 observations with standard deviation 8.4. Sample 2: 20 observations with standard deviation 5.2.
Hypothesis H₀ is that σ₁ = σ₂ Note s₁ > s₂ So F = s₁²/s₂² = 8.4²/5.2² = 2.61 n₁ = 16 n₂ = 20 And ν₁ = n₁ − 1 = 15 ν₂ = n₂ − 1 = 19
Conclusion At 90% the upper cut off is 2.23 and 2.23 < 2.61. The result is significant at the 10% level, which means you can be 90% confident of your result; reject H₀, the variances are probably inconsistent. But further work is probably required.
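The variance ratio for this example can be checked as follows. This is a sketch under stated assumptions: the function name `variance_ratio` is illustrative, and the cut off 2.23 is the value quoted above for (15, 19) degrees of freedom rather than something the code computes.

```python
def variance_ratio(s_a, n_a, s_b, n_b):
    """Return (F, nu1, nu2) with the larger sample variance in the numerator."""
    if s_a < s_b:
        # Swap the samples so that the ratio is at least 1.
        s_a, n_a, s_b, n_b = s_b, n_b, s_a, n_a
    return s_a ** 2 / s_b ** 2, n_a - 1, n_b - 1

# 16 observations with s = 8.4 and 20 observations with s = 5.2.
f_stat, nu1, nu2 = variance_ratio(8.4, 16, 5.2, 20)

# Upper cut off for (15, 19) degrees of freedom, taken from the slide.
f_crit = 2.23
significant = f_stat > f_crit
print(round(f_stat, 2), nu1, nu2, significant)
```

Swapping inside the function means the samples can be supplied in either order without having to consult the lower tail of the F table.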
Example In a clinical test the following scores were obtained for "normal" and "diseased" patients (table columns: Normal, Diseased). Is there a significant difference between the mean test scores for the two groups? A t test was performed previously, see lecture 4 example 1. This assumed "equality" of the variances!
Previously H₀ is that μ₁ = μ₂ H₁ is that μ₁ ≠ μ₂ under a two tail test. Because s₁ and s₂ are similar we assumed that σ₁ = σ₂ (chapter 4). Was this justified?
Conclusion H₀ is that σ₁ = σ₂ There would appear to be no significant difference and the original assumption was justified.
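Confidence Interval The confidence interval for the ratio of the two variances, with confidence level 1 − α, is (s₁²/s₂²)/F(α/2; ν₁, ν₂) < σ₁²/σ₂² < (s₁²/s₂²) × F(α/2; ν₂, ν₁), where F(p; ν₁, ν₂) is the tabulated value with upper tail probability p. Note the change in the degrees of freedom for the two choices of F. In fact one gives the upper tail and one the lower. It is not necessary that s₁ > s₂; the two bounds are always given by these two expressions.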
Confidence Interval (α = 0.05) For example, the first value from the table is read with (ν₁, ν₂) degrees of freedom and the second value with (ν₂, ν₁). Switching the roles of the groups gives bounds that are the reciprocals of the values reported above.
What if I have lost my statistical tables? Most tabulated statistical values may be obtained from Excel or from a statistical calculator.
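As a further alternative, χ² table values can be recovered with nothing more than the Python standard library. The sketch below is one possible approach, not a prescribed method: it evaluates the χ² distribution function from the series for the incomplete gamma function and then finds the required point by bisection. The function names `chi2_cdf` and `chi2_upper_point` are illustrative; F table values are not covered.

```python
import math

def chi2_cdf(x, nu):
    """P(X <= x) for a chi-squared variable with nu degrees of freedom."""
    if x <= 0:
        return 0.0
    a, z = nu / 2.0, x / 2.0
    # Series for the lower incomplete gamma function:
    # gamma(a, z) = z**a * exp(-z) * sum over k of z**k / (a (a+1) ... (a+k))
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def chi2_upper_point(p, nu):
    """Value x with upper tail probability p, i.e. P(X > x) = p."""
    lo, hi = 0.0, float(nu)
    # Widen the bracket until the upper tail beyond hi is no more than p.
    while 1.0 - chi2_cdf(hi, nu) > p:
        hi *= 2.0
    # Bisection: the upper tail probability decreases as x increases.
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if 1.0 - chi2_cdf(mid, nu) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

print(round(chi2_upper_point(0.05, 19), 2), round(chi2_upper_point(0.95, 19), 2))
```

The probability argument follows the table convention used in this lecture, so p = 0.05 gives the upper 5% point and p = 0.95 the lower 5% point.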
Next Week Bring your calculators next week
Read Read Howitt and Cramer Read Davis and Smith pages
Solution To The First Assignment The individual solutions to the first assignment should now be available on the module web page. Please access the “SPSS Verification” which employs the syntax window. You will find this particularly useful at Stage III.
Practical 7 The instructions and the material for the practical are available from the module web page.
Whoops! Last week, a formatting error led to us inadvertently suggesting that there was a one in 1,019 chance of the world ending before this edition. That should have read, er, one in 10¹⁹, rather less likely. Sorry. Feel free to remove the crash helmet. Independent 13/09/08
Whoops! Yeah... that's not the quadratic formula.