Simpson’s Paradox Fr Chris - St Francis High School October 5, 2006
Why is it Simpson? Simpson, E.H., 1951, “The interpretation of interaction in contingency tables”, Journal of the Royal Statistical Society, Series B, 13:
The Ashes Series The two Waugh brothers, Steve and Mark, decided to have a little wager on who would have the better overall batting average over the two upcoming Ashes Test series, the first in England and the next here in Australia. After the first Ashes series finished Steve said to Mark, ‘You’ve got your work cut out for you, mate. I have scored 500 runs for 10 outs, for an average of 50. You have 270 runs for 6 outs, for an average of 45.’ After the second Ashes series, Steve said, ‘Ok, mate, pay up. In this series I scored 320 runs for 4 outs, an average of 80, while you had 700 runs for 10 outs, which is only an average of 70. I topped you in each of the series.’ ‘Hold on’, Mark said, ‘The wager was for the better batting average overall, not series by series. As I reckon it, you have scored 820 runs for 14 outs, and I have scored 970 runs for 16 outs. My trusty calculator tells me your average is 58.6, while my average is 60.6. A clear case, old son, of being pipped at the bails.’ How is this possible, that Steve could have a better average in each of the two series but a lower average overall?
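The arithmetic behind the wager can be checked with a short sketch. The runs and outs are the ones quoted in the story; the names `records`, `batting_average` and `overall_average` are chosen here for illustration only.

```python
# (runs, outs) per series, as quoted by the brothers
records = {
    "Steve": {"first series": (500, 10), "second series": (320, 4)},
    "Mark": {"first series": (270, 6), "second series": (700, 10)},
}

def batting_average(runs, outs):
    """Runs scored per dismissal."""
    return runs / outs

def overall_average(series):
    """Pool runs and outs across all series, then divide once."""
    total_runs = sum(runs for runs, _ in series.values())
    total_outs = sum(outs for _, outs in series.values())
    return batting_average(total_runs, total_outs)

for player, series in records.items():
    for name, (runs, outs) in series.items():
        print(f"{player}, {name}: {batting_average(runs, outs):.1f}")
    print(f"{player}, overall: {overall_average(series):.1f}")
```

Steve wins each series, yet most of his outs (10 of 14) fall in the series where he averaged 50, while most of Mark's outs (10 of 16) fall in the series where he averaged 70. The pooled averages are therefore pulled in opposite directions.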
Ask Marilyn Parade Magazine, 28 April 1996, p. 6. A reader poses the following question: A company decided to expand, so it opened a factory generating 455 jobs. For the 70 white collar positions, 200 males and 200 females applied. Of the females who applied, 20% were hired, while only 15% of the males were hired. Of the 400 males applying for the blue collar positions, 75% were hired, while 85% of the 100 females who applied were hired. A federal Equal Employment enforcement official noted that many more males were hired than females, and decided to investigate. Responding to charges of irregularities in hiring, the company president denied any discrimination, pointing out that in both the white collar and blue collar fields, the percentage of female applicants hired was greater than it was for males. But the government official produced his own statistics, which showed that a female applying for a job had a 58% chance of being denied employment while male applicants had only a 45% denial rate. As the current law is written, this constituted a violation.... Can you explain how two opposing statistical outcomes are reached from the same raw data?
Explanation What we have, of course, is an example of Simpson's paradox: The direction of association between gender and hiring rate appears to reverse when the data are aggregated across job classes. Marilyn correctly notes that, even though all the figures presented are correct, the two outcomes are not opposing.
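Both sets of figures can be reproduced from the same counts, as in the sketch below. The hire counts are the letter's percentages applied to the applicant numbers. The figure of 100 female blue collar applicants is consistent with the totals: 455 jobs less 70 white collar leaves 385, of which 300 went to males. The names `applications`, `hire_rate` and `denial_rate` are illustrative only.

```python
# (hired, applied) for each job class and gender
applications = {
    "white collar": {"female": (40, 200), "male": (30, 200)},
    "blue collar": {"female": (85, 100), "male": (300, 400)},
}

def hire_rate(job_class, gender):
    """Share of applicants hired within one job class."""
    hired, applied = applications[job_class][gender]
    return hired / applied

def denial_rate(gender):
    """Share of applicants turned away, pooled over both job classes."""
    hired = sum(jobs[gender][0] for jobs in applications.values())
    applied = sum(jobs[gender][1] for jobs in applications.values())
    return 1 - hired / applied

for job_class in applications:
    print(f"{job_class}: female {hire_rate(job_class, 'female'):.0%}, "
          f"male {hire_rate(job_class, 'male'):.0%} hired")
print(f"overall denial: female {denial_rate('female'):.0%}, "
      f"male {denial_rate('male'):.0%}")
```

The company president is reading the per-class lines and the official is reading the pooled line. Both are computed from the same table.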
Which is better? Say a company tests two treatments for an illness. In trial No. 1, treatment A cures 20% of its cases (40 out of 200) and treatment B cures 15% of its cases (30 out of 200). In trial No. 2, treatment A cures 85% of its cases (85 out of 100) and treatment B cures 75% of its cases (300 out of 400).... So, in two trials, treatment A scored 20% and 85%. Also in two trials, treatment B scored only 15% and 75%. No matter how many people were in those trials, treatment A (at 20% and 85%) is surely better than treatment B (at 15% and 75%), right?
Wrong! Treatment B performed better. It cured 330 (300+30) out of the 600 cases (200+400) in which it was tried, a success rate of 55%... By contrast, treatment A cured 125 (40+85) out of the 300 cases (200+100) in which it was tried, a success rate of only about 42%. Marilyn notes that this is exactly what happened to the employer. Because so many more men applied for the blue collar positions, even if the employer had hired all the women who applied for blue collar positions, it couldn't have satisfied the government regulations.
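One way to see why the pooled figures reverse is to write each overall cure rate as a weighted average of the per-trial rates, with weights equal to the share of that treatment's cases falling in each trial. The worked lines below use only the counts quoted in the two trials.

```latex
\begin{align*}
\text{Treatment A:}\quad
  \frac{200}{300}(0.20) + \frac{100}{300}(0.85)
  &= \frac{40 + 85}{300} = \frac{125}{300} \approx 0.42 \\
\text{Treatment B:}\quad
  \frac{200}{600}(0.15) + \frac{400}{600}(0.75)
  &= \frac{30 + 300}{600} = \frac{330}{600} = 0.55
\end{align*}
```

Two thirds of treatment A's cases come from the trial with the low cure rates, while two thirds of treatment B's cases come from the trial with the high cure rates. The unequal weights, not the per-trial rates, decide the pooled comparison.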
NSF The National Science Foundation in the US conducted a study of persons who received a degree in science or engineering in 1977 or The study found that at the bachelor’s degree level, women with full-time jobs earned on average 77% of the average male salary. But comparing salaries within each field, the average salary for women was in each case at least 92% of the average male salary. The explanation here is what is called a lurking variable: women were concentrated in the life sciences and social sciences, which had lower salaries in general.
death & taxes A government taxes people at two rates. All income below $ is taxed at the low rate of 20% and income above $ is taxed at the high rate of 40%. Being a kind and generous government, they decide to give the people a tax cut. The low rate is cut to 15% and the high rate is cut to 35%. Imagine the surprise of the government when the overall tax rate turned out to be higher than before! What the government forgot was to take inflation into account (it's easy to forget about these things when you are busy governing). Even though the tax rate for each group was lower, the amount of money taxed at the higher rate had increased so much because of inflation that it more than offset the money lost by cutting tax rates. (This is called ‘bracket creep’. Occasionally the public misuses this term, and mistakenly applies the word ‘creep’ to the tax collector instead.)
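Bracket creep can be illustrated with a small sketch. The slide's threshold and income figures are not given, so the threshold of 20,000 and the incomes of 25,000 and 40,000 below are hypothetical values chosen only to show the effect. The sketch also assumes the threshold is not adjusted for inflation. Only the four tax rates come from the slide.

```python
def tax_owed(income, threshold, low_rate, high_rate):
    """Two-bracket tax: low_rate up to the threshold, high_rate above it."""
    low_part = min(income, threshold)
    high_part = max(income - threshold, 0)
    return low_part * low_rate + high_part * high_rate

def effective_rate(income, threshold, low_rate, high_rate):
    """Total tax as a share of total income."""
    return tax_owed(income, threshold, low_rate, high_rate) / income

THRESHOLD = 20_000      # hypothetical; assumed not indexed to inflation
INCOME_BEFORE = 25_000  # hypothetical income before inflation
INCOME_AFTER = 40_000   # hypothetical nominal income after inflation

before = effective_rate(INCOME_BEFORE, THRESHOLD, 0.20, 0.40)
after = effective_rate(INCOME_AFTER, THRESHOLD, 0.15, 0.35)
print(f"effective rate before the cut: {before:.1%}")
print(f"effective rate after the cut:  {after:.1%}")
```

With these made-up figures, both bracket rates fall by five points, yet the effective rate rises because a larger share of the income now sits above the threshold.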