1 Health Warning! All may not be what it seems! These examples demonstrate both the importance of graphing data before analysing it and the effect of outliers on statistical properties. F.J. Anscombe, "Graphs in Statistical Analysis," American Statistician, 27 (February 1973), Thursday, 23 April :07 AM
2 Set 1 Describe the data Seems to be distributed normally, and corresponds to what one would expect when considering two variables correlated and following the assumption of normality.
3 Set 2 Describe the data Is not distributed normally; while an obvious relationship between the two variables can be observed, it is not linear, and the Pearson correlation coefficient is not relevant.
4 Set 3 Describe the data The distribution is linear, but with a different regression line, which is offset by the one outlier which exerts enough influence to alter the regression line and lower the correlation coefficient from 1 to
5 Set 4 Describe the data Shows another example when one outlier is enough to produce a high correlation coefficient, even though the relationship between the two variables is not linear.
6 All Data Sets
8 Yet all usual measures are identical!!
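The claim that the usual measures agree can be checked numerically. The sketch below is only an illustration: the data values are the quartet as published by Anscombe (1973) and are not reproduced on the slides, so treat the lists as an assumption, and the helper name summary is made up for this example.

```python
from math import sqrt
from statistics import mean, variance

# Anscombe's quartet (assumed values; not taken from the slides).
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "Set 1": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "Set 2": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "Set 3": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "Set 4": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
              [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summary(x, y):
    """Means, sample variances, Pearson r and least-squares line for one data set."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return {
        "mean_x": mx,
        "mean_y": my,
        "var_x": variance(x),
        "var_y": variance(y),
        "r": sxy / sqrt(sxx * syy),
        "slope": slope,
        "intercept": my - slope * mx,
    }


# Print the summaries side by side; they agree to about two decimal places.
for name, (x, y) in QUARTET.items():
    print(name, {k: round(v, 2) for k, v in summary(x, y).items()})
```

The numbers match closely across the four sets even though the scatter plots look nothing alike, which is the point of graphing before analysing.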
9 When correlations go bad Thom Baguley cautions against the careless and routine application of standardisation in psychology. The effect of range restriction where x_1 are the central 100 x values.
10 When correlations go bad Thom Baguley cautions against the careless and routine application of standardisation in psychology. To standardise a variable (e.g. x to z_x) first subtract its original mean from every value, then divide this value by the original standard deviation (SD). This preserves the distribution of x and y but rescales them so that both have a mean of 0 and an SD of 1. The resulting regression therefore has an intercept of zero. Its slope is r (and must fall somewhere from –1 to +1).
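A minimal sketch of the standardisation step described above, using made-up data. The function names and the x and y values are assumptions for illustration only. It checks that regressing the standardised y on the standardised x gives an intercept of zero and a slope equal to r.

```python
from math import sqrt
from statistics import mean, stdev


def standardise(values):
    """Subtract the mean from every value, then divide by the sample SD."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def least_squares(x, y):
    """Return (slope, intercept) of the least-squares line of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def pearson_r(x, y):
    """Pearson correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


# Hypothetical data, chosen only to have a clear positive relationship.
x = [2.0, 4.0, 5.0, 7.0, 9.0, 12.0]
y = [1.5, 3.0, 2.5, 6.0, 6.5, 9.0]

zx, zy = standardise(x), standardise(y)
slope, intercept = least_squares(zx, zy)
print("r on raw data:        ", round(pearson_r(x, y), 4))
print("slope on z-scores:    ", round(slope, 4))
print("intercept on z-scores:", round(intercept, 4))
```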
11 When correlations go bad Thom Baguley cautions against the careless and routine application of standardisation in psychology. Range restriction occurs whenever the range of values in a sample differs from those in the population of interest. The figure shows the effect of selecting the middle 100 x values on the x–y correlation. (Here, x and y are sampled from normally distributed variables with a population correlation of .80). In the full sample of 500 simulated participants the correlation is .65, while the correlation in the restricted sample is only .08.
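The effect can be reproduced in spirit with a small simulation. This is a sketch under stated assumptions, not the code behind the figure: the seed, the function names and the way y is generated from x are choices made for this example, so the exact correlations will differ from those quoted on the slide.

```python
import random
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


def simulate(n=500, keep=100, rho=0.80, seed=1):
    """Draw n (x, y) pairs with population correlation rho, then keep the
    `keep` pairs whose x values lie in the middle of the sample."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    pairs = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = rho * x + sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        pairs.append((x, y))
    pairs.sort()  # tuples sort by x first
    start = (n - keep) // 2
    middle = pairs[start:start + keep]
    r_full = pearson_r(*zip(*pairs))
    r_restricted = pearson_r(*zip(*middle))
    return r_full, r_restricted


r_full, r_restricted = simulate()
print("full sample r:      ", round(r_full, 2))
print("restricted sample r:", round(r_restricted, 2))
```

Restricting x to its central values shrinks the variance of x while leaving the noise in y untouched, so the correlation in the restricted sample drops well below that of the full sample.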
12 When correlations go bad Thom Baguley cautions against the careless and routine application of standardisation in psychology. Careless and routine application of standardisation in psychology (without any awareness of the potential pitfalls) is dangerous.
13 Multiple Comparisons In statistics, the multiple comparisons or multiple testing problem occurs when one considers a set of statistical inferences simultaneously. Errors in inference, including confidence intervals that fail to include their corresponding population parameters or hypothesis tests that incorrectly reject the null hypothesis, are more likely to occur when one considers the set as a whole.
14 Multiple Comparisons Several statistical techniques have been developed to prevent this from happening, allowing significance levels for single and multiple comparisons to be directly compared. These techniques generally require a stronger level of evidence to be observed in order for an individual comparison to be deemed "significant", so as to compensate for the number of inferences being made.
15 Multiple Comparisons If the inferences are hypothesis tests, with just one test performed at the 5% level, there is only a 5% chance of incorrectly rejecting the null hypothesis if the null hypothesis is true. However, for 100 tests where all null hypotheses are true, the expected number of incorrect rejections is 5. If the tests are independent, the probability of at least one incorrect rejection is 99.4% (Prob = 1 − 0.95^100 ≈ 0.994). These errors are called false positives.
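The arithmetic on this slide can be checked in a few lines. The function names below are made up for this sketch, and the calculation assumes independent tests with every null hypothesis true, as the slide does.

```python
def expected_false_positives(alpha, n_tests):
    """Expected number of incorrect rejections when all nulls are true."""
    return alpha * n_tests


def familywise_error_rate(alpha, n_tests):
    """Probability of at least one incorrect rejection among n_tests
    independent tests, each run at level alpha, when all nulls are true."""
    return 1 - (1 - alpha) ** n_tests


print(expected_false_positives(0.05, 100))         # about 5
print(round(familywise_error_rate(0.05, 100), 3))  # 0.994
print(round(familywise_error_rate(0.05, 1), 3))    # 0.05
```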
16 Multiple Comparisons Techniques have been developed to control the false positive error rate associated with performing multiple statistical tests. Similarly, techniques have been developed to adjust confidence intervals so that the probability of at least one of the intervals not covering its target value is controlled.
17 Multiple Comparisons In statistics, the Bonferroni correction is a method used to counteract the problem of multiple comparisons. It is considered the simplest and most conservative method to control the family-wise error rate. Bland J.M. and Altman D.G. “Multiple significance tests: The Bonferroni method” British Medical Journal (6973)
18 Multiple Comparisons “Calculating numerous correlations increases the risk of a type I error, i.e., to erroneously conclude the presence of a significant correlation. To avoid this, the level of statistical significance of correlation coefficients should be adjusted.” Curtin, F. and Schulz, P “Multiple correlations and Bonferroni's correction” Biological Psychiatry 44(8)
19 Multiple Comparisons Statistical inference logic is based on rejecting the null hypothesis if the likelihood of the observed data under the null hypothesis is low. The problem of multiplicity arises from the fact that as we increase the number of hypotheses being tested, we also increase the likelihood of witnessing a rare event and, therefore, the chance of rejecting a null hypothesis when it is true (type I error).
20 Multiple Comparisons Bonferroni correction is the most naive way to address this issue. The correction is based on the idea that if an experimenter is testing n dependent or independent hypotheses on a set of data, then one way of maintaining the family-wise error rate is to test each individual hypothesis at a statistical significance level of 1/n times what it would be if only one hypothesis were tested. So, if it is desired that the significance level for the whole family of tests should be (at most) α, then the Bonferroni correction would be to test each of the individual tests at a significance level of α/n. With p the level used for each test and p' the resulting level for the whole family: 1 − p' = (1 − p)^n ≈ 1 − np, so p' ≈ np. If p' = α, then p ≈ α/n.
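As a sketch of the rule just described, the snippet below tests each p-value against α/n. The function name and the p-values are hypothetical and chosen only for illustration.

```python
def bonferroni(p_values, alpha=0.05):
    """Return one True/False per hypothesis: True means 'reject' when each
    test is run at the Bonferroni-adjusted level alpha / n."""
    n = len(p_values)
    threshold = alpha / n
    return [p <= threshold for p in p_values]


# Hypothetical p-values from four tests on the same data set.
p_values = [0.001, 0.012, 0.03, 0.2]
print("per-test level:", 0.05 / len(p_values))
print("reject:", bonferroni(p_values, alpha=0.05))

# For independent tests the family-wise rate at level alpha / n stays below alpha.
n, alpha = 4, 0.05
print("family-wise rate:", round(1 - (1 - alpha / n) ** n, 4))
```

Without the correction, 0.03 would also count as significant at the 5% level; with four tests it no longer does.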
21 Multiple Comparisons Statistically significant simply means that a given result is unlikely to have occurred by chance assuming the null hypothesis is actually correct (i.e., no difference among groups, no effect of treatment, no relation among variables).
22 Multiple Comparisons In several situations scientists are interested in addressing multiple statistical tests among samples. The most common application is to carry out all 2 by 2 comparisons between all samples or to perform only those comparisons of interest. How Many Statistical Tests Are Too Many? The Problem Of Conducting Multiple Ecological Inferences Revisited Pedro R. Peres-Neto Marine Ecology Progress Series, Vol. 176: , 1999.
23 Multiple Comparisons This paper presents a simple and widely applicable multiple test procedure of the sequentially rejective type, i.e. hypotheses are rejected one at a time until no further rejections can be done. It is shown that the test has a prescribed level of significance protection against error of the first kind for any combination of true hypotheses. The power properties of the test and a number of possible applications are also discussed. A Simple Sequentially Rejective Multiple Test Procedure Sture Holm Scandinavian Journal of Statistics, Vol. 6: 65-70, 1979.
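A minimal sketch of a sequentially rejective procedure of this kind, in the step-down form usually attributed to Holm: order the p-values from smallest to largest, compare the k-th smallest with α/(n − k + 1), and stop at the first one that fails. The function name and the p-values are hypothetical, and this is an illustration rather than a transcription of the paper.

```python
def holm(p_values, alpha=0.05):
    """Step-down procedure: reject hypotheses one at a time, smallest
    p-value first, until one fails its threshold. Returns True/False per
    hypothesis in the original order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    reject = [False] * n
    for rank, i in enumerate(order):  # rank 0 is the smallest p-value
        if p_values[i] <= alpha / (n - rank):
            reject[i] = True
        else:
            break  # no further rejections can be made
    return reject


# Hypothetical p-values, deliberately not in sorted order.
p_values = [0.04, 0.001, 0.02, 0.015]
print("Holm reject:      ", holm(p_values))
print("Bonferroni reject:", [p <= 0.05 / len(p_values) for p in p_values])
```

On these values the step-down procedure rejects all four hypotheses, while testing every p-value against α/n rejects only the smallest, which illustrates why the sequential method has more power at the same family-wise level.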