1 Health Warning! All may not be what it seems! These examples demonstrate both the importance of graphing data before analysing it and the effect of outliers.

Presentation transcript:

1 Health Warning! All may not be what it seems! These examples demonstrate both the importance of graphing data before analysing it and the effect of outliers on statistical properties. F.J. Anscombe, "Graphs in Statistical Analysis," American Statistician, 27 (February 1973), Thursday, 23 April :07 AM

2 Set 1 Describe the data The data appear to be normally distributed and correspond to what one would expect of two correlated variables that satisfy the assumption of normality.

3 Set 2 Describe the data The data are not normally distributed. There is an obvious relationship between the two variables, but it is not linear, so the Pearson correlation coefficient is not an appropriate summary.

4 Set 3 Describe the data The relationship is linear, but the fitted regression line is offset by a single outlier, which exerts enough influence to alter the regression line and pull the correlation coefficient below 1.

5 Set 4 Describe the data Another example in which a single outlier is enough to produce a high correlation coefficient, even though the relationship between the two variables is not linear.

6 All Data Sets

7

8 Yet all usual measures are identical!!
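The claim on this slide can be checked directly. The sketch below is illustrative only: the slides do not list the numbers, so the values are an assumption, namely the data as commonly reproduced from Anscombe (1973). The helper name `summarise` is hypothetical. It computes the mean, variance, Pearson r and least-squares line for each set.

```python
from math import sqrt

# Assumption: the quartet as commonly reproduced from Anscombe (1973);
# the slides themselves do not list these values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "Set 1": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "Set 2": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "Set 3": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "Set 4": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
              [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarise(x, y):
    """Return the usual summary measures for paired data (hypothetical helper)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "var_x": sxx / (n - 1),          # sample variance
        "var_y": syy / (n - 1),
        "r": sxy / sqrt(sxx * syy),      # Pearson correlation
        "slope": slope,                  # least-squares line y = intercept + slope * x
        "intercept": mean_y - slope * mean_x,
    }


SUMMARIES = {name: summarise(x, y) for name, (x, y) in QUARTET.items()}
for name, stats in SUMMARIES.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

Under that assumption the four sets agree to roughly two decimal places on every measure, which is why the plots, not the summaries, reveal the differences.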

9 When correlations go bad Thom Baguley cautions against the careless and routine application of standardisation in psychology. Figure: the effect of range restriction, where x_1 are the central 100 x values.

10 When correlations go bad To standardise a variable (e.g. x to z_x), first subtract its original mean from every value, then divide the result by the original standard deviation (SD). This preserves the shape of the distributions of x and y but rescales them so that both have a mean of 0 and an SD of 1. The regression of z_y on z_x therefore has an intercept of zero, and its slope is r (which must fall somewhere from –1 to +1).
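A minimal sketch of this result, using made-up numbers rather than data from the slides. The function names `standardise` and `least_squares` are hypothetical. Regressing the standardised y on the standardised x should give an intercept of 0 and a slope equal to Pearson's r.

```python
from math import sqrt


def standardise(values):
    """Subtract the mean, then divide by the sample SD (n - 1 denominator)."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def least_squares(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pearson_r(x, y):
    """Pearson correlation computed from the raw (unstandardised) values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


# Illustrative data only (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 6]

intercept, slope = least_squares(standardise(x), standardise(y))
r = pearson_r(x, y)
print(f"intercept={intercept:.6f} slope={slope:.6f} r={r:.6f}")
```

The equality holds whichever SD denominator is used, provided the same one is applied to both variables.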

11 When correlations go bad Range restriction occurs whenever the range of values in a sample differs from that in the population of interest. The figure shows the effect of selecting the middle 100 x values on the x–y correlation. (Here, x and y are sampled from normally distributed variables with a population correlation of .80.) In the full sample of 500 simulated participants the correlation is .65, while the correlation in the restricted sample is only .08.
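The effect can be reproduced with a small simulation. This is a sketch under stated assumptions: the seed, the function name `range_restriction_demo` and the way correlated normals are generated are illustrative choices, not taken from the original figure, so the printed values will not match the slide's exact numbers.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


def range_restriction_demo(n=500, rho=0.8, keep=100, seed=1):
    """Simulate n (x, y) pairs with population correlation rho, then
    compare r in the full sample with r in the middle `keep` x values."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        # y shares a fraction rho of x, plus independent noise, so corr(x, y) = rho.
        y = rho * x + sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        pairs.append((x, y))

    pairs.sort(key=lambda p: p[0])      # order participants by x
    start = (n - keep) // 2
    middle = pairs[start:start + keep]  # the central `keep` x values

    r_full = pearson_r([p[0] for p in pairs], [p[1] for p in pairs])
    r_restricted = pearson_r([p[0] for p in middle], [p[1] for p in middle])
    return r_full, r_restricted


r_full, r_restricted = range_restriction_demo()
print(f"full sample r = {r_full:.2f}, restricted r = {r_restricted:.2f}")
```

Restricting x to a narrow central band removes most of the variance in x while leaving the noise in y untouched, so the correlation drops sharply even though the underlying relationship has not changed.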

12 When correlations go bad Careless and routine application of standardisation in psychology (without any awareness of the potential pitfalls) is dangerous.

13 Multiple Comparisons In statistics, the multiple comparisons or multiple testing problem occurs when one considers a set of statistical inferences simultaneously. Errors in inference, including confidence intervals that fail to include their corresponding population parameters or hypothesis tests that incorrectly reject the null hypothesis, become more likely when the set is considered as a whole.

14 Multiple Comparisons Several statistical techniques have been developed to prevent this from happening, allowing significance levels for single and multiple comparisons to be directly compared. These techniques generally require a stronger level of evidence to be observed in order for an individual comparison to be deemed "significant", so as to compensate for the number of inferences being made.

15 Multiple Comparisons If the inferences are hypothesis tests, with just one test performed at the 5% level, there is only a 5% chance of incorrectly rejecting the null hypothesis if the null hypothesis is true. However, for 100 tests where all null hypotheses are true, the expected number of incorrect rejections is 5. If the tests are independent, the probability of at least one incorrect rejection is 99.4% (Prob = 1 − 0.95^100 ≈ 0.994). These errors are called false positives.
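The arithmetic on this slide can be sketched as follows, assuming independent tests and all null hypotheses true. The function names are hypothetical.

```python
def expected_false_positives(m, alpha=0.05):
    """Expected number of incorrect rejections among m true null hypotheses."""
    return m * alpha


def familywise_error_rate(m, alpha=0.05):
    """P(at least one incorrect rejection) for m independent tests at level alpha."""
    return 1 - (1 - alpha) ** m


for m in (1, 10, 100):
    print(m, expected_false_positives(m), round(familywise_error_rate(m), 3))
```

The family-wise rate climbs quickly: a single test carries a 5% risk, but 100 independent tests make at least one false positive almost certain.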

16 Multiple Comparisons Techniques have been developed to control the false positive error rate associated with performing multiple statistical tests. Similarly, techniques have been developed to adjust confidence intervals so that the probability of at least one of the intervals not covering its target value is controlled.

17 Multiple Comparisons In statistics, the Bonferroni correction is a method used to counteract the problem of multiple comparisons. It is considered the simplest and most conservative method of controlling the family-wise error rate. Bland J.M. and Altman D.G. “Multiple significance tests: The Bonferroni method” British Medical Journal (6973)

18 Multiple Comparisons “Calculating numerous correlations increases the risk of a type I error, i.e., to erroneously conclude the presence of a significant correlation. To avoid this, the level of statistical significance of correlation coefficients should be adjusted.” Curtin, F. and Schulz, P “Multiple correlations and Bonferroni's correction” Biological Psychiatry 44(8)

19 Multiple Comparisons The logic of statistical inference is to reject the null hypothesis if the likelihood of the observed data under the null hypothesis is low. The problem of multiplicity arises because, as we increase the number of hypotheses tested, we also increase the likelihood of witnessing a rare event, and therefore the chance of rejecting a null hypothesis when it is true (type I error).

20 Multiple Comparisons The Bonferroni correction is the most naive way to address this issue. The correction is based on the idea that if an experimenter is testing n dependent or independent hypotheses on a set of data, then one way of maintaining the family-wise error rate is to test each individual hypothesis at a statistical significance level of 1/n times what it would be if only one hypothesis were tested. So, if it is desired that the significance level for the whole family of tests should be (at most) α, then the Bonferroni correction would be to test each of the individual tests at a significance level of α/n. For n independent tests each at level p, the family-wise level p' satisfies 1 − p' = (1 − p)^n ≈ 1 − np for small p, so p' ≈ np. If p' = α, then p ≈ α/n.
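A minimal sketch of the correction described above. The function names and the example p-values are hypothetical and chosen only for illustration.

```python
def bonferroni_threshold(alpha, n):
    """Per-test significance level that keeps the family-wise level at most alpha."""
    return alpha / n


def bonferroni_reject(p_values, alpha=0.05):
    """Reject each hypothesis whose p-value is at or below alpha / n."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


# Illustrative p-values only (not from the slides).
p_values = [0.001, 0.004, 0.012, 0.03, 0.2]
print(bonferroni_threshold(0.05, len(p_values)))  # per-test level alpha / n
print(bonferroni_reject(p_values))

# Check of the approximation p' ~ np for n = 10 independent tests:
n, alpha = 10, 0.05
family_level = 1 - (1 - bonferroni_threshold(alpha, n)) ** n
print(round(family_level, 4))  # slightly below alpha
```

With independent tests the resulting family-wise level sits just under α, which is the sense in which the method is conservative.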

21 Multiple Comparisons Statistically significant simply means that a given result is unlikely to have occurred by chance assuming the null hypothesis is actually correct (i.e., no difference among groups, no effect of treatment, no relation among variables).

22 Multiple Comparisons In many situations scientists are interested in performing multiple statistical tests among samples. The most common application is to carry out all pairwise comparisons between samples, or to perform only those comparisons of interest. How Many Statistical Tests Are Too Many? The Problem Of Conducting Multiple Ecological Inferences Revisited Pedro R. Peres-Neto Marine Ecology Progress Series, Vol. 176: , 1999.

23 Multiple Comparisons “This paper presents a simple and widely applicable multiple test procedure of the sequentially rejective type, i.e. hypotheses are rejected one at a time until no further rejections can be made. It is shown that the test has a prescribed level of significance protection against error of the first kind for any combination of true hypotheses. The power properties of the test and a number of possible applications are also discussed.” A Simple Sequentially Rejective Multiple Test Procedure Sture Holm Scandinavian Journal of Statistics, Vol. 6: 65-70, 1979.
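The slide quotes only the abstract, so the sketch below assumes the usual textbook statement of Holm's step-down rule: sort the m p-values in ascending order, compare the smallest with α/m, the next with α/(m − 1), and so on, stopping at the first p-value that exceeds its threshold. The function name `holm_reject` and the p-values are hypothetical.

```python
def holm_reject(p_values, alpha=0.05):
    """Sequentially rejective (step-down) test: returns one True/False
    decision per hypothesis, in the original order of p_values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    reject = [False] * m
    for rank, idx in enumerate(order):
        # Thresholds are alpha/m, alpha/(m-1), ..., alpha/1.
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, no further rejections are made
    return reject


# Illustrative p-values only (not from the slides).
p_values = [0.001, 0.015, 0.02, 0.04]
holm_decisions = holm_reject(p_values)
bonferroni_decisions = [p <= 0.05 / len(p_values) for p in p_values]
print("Holm:      ", holm_decisions)
print("Bonferroni:", bonferroni_decisions)
```

On these values the step-down rule rejects all four hypotheses while the plain α/n rule rejects only the first, which illustrates why the sequential procedure is at least as powerful at the same family-wise level.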