Common Statistical Mistakes


1 Common Statistical Mistakes

2 Mistake #1 Failing to investigate data for data entry or recording errors. Failing to graph data and calculate basic descriptive statistics before analyzing data.
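A minimal sketch of the screening step described on this slide: compute basic descriptive statistics and flag values worth double-checking before any test is run. The data are hypothetical (they are not the values from the presentation), and the 1.5 × IQR fence is one common convention, assumed here for illustration.

```python
import statistics

def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "n": len(values),
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }

def flag_suspects(values, k=1.5):
    """Return values outside the fences (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical measurements; 235.0 stands in for a misplaced decimal point.
data = [23.1, 24.0, 23.7, 22.9, 24.3, 235.0, 23.5, 23.8]
summary = describe(data)
suspects = flag_suspects(data)
print(summary)
print("Values to double-check:", suspects)
```

A flagged value is a prompt to go back to the original records, not a licence to delete it automatically.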

3 Example: Wrong Decision Due to Error

4 Example: Wrong Decision Due to Error
Test of mu = vs mu not =
Variable N Mean StDev SE Mean T P
With
Without
Variable N Mean StDev SE Mean % CI
With (23.513, )
Without (23.741, )

5 Mistake #2 Using the wrong statistical procedure in analyzing your data. Includes failing to check that necessary assumptions are met.

6 Example: Wrong Decision Due to Wrong Analysis
Pulse Rates Before and After Marching
Student BEFORE AFTER DIFF (A-B)
Paired data design, so analyze with a paired t-test.

7 Example: Wrong Decision Due to Wrong Analysis
Paired T for AFTER - BEFORE
N Mean StDev SE Mean
AFTER
BEFORE
Difference
95% CI for mean difference: (2.99, 19.01)
T-Test of mean difference = 0 (vs not = 0): T-Value = 4.37 P-Value = 0.02
Conclude that the mean pulse rate after marching is greater than the mean pulse rate before.

8 Example: Wrong Decision Due to Wrong Analysis
Two sample T for AFTER vs BEFORE
N Mean StDev SE Mean
AFTER
BEFORE
95% CI for mu AFTER - mu BEFORE: ( -15.3, 37.3)
T-Test mu AFTER = mu BEFORE (vs not =): T = P = 0.33 DF = 5
The two-sample test ignores the pairing, so it leads to the wrong conclusion: no difference in mean pulse rates before and after marching.
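The contrast between the two analyses can be sketched in a few lines. The pulse rates below are hypothetical (the original table values did not survive in this transcript), and the two-sample statistic uses the unpooled (Welch) standard error as an assumption. The point is only that the same data give a much larger t statistic when the pairing is respected, because between-student variation cancels out in the differences.

```python
import math
import statistics

def paired_t(after, before):
    """t statistic for the mean of the within-subject differences."""
    diffs = [a - b for a, b in zip(after, before)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se

def two_sample_t(x, y):
    """Unpooled (Welch) t statistic treating x and y as independent samples."""
    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    return (statistics.mean(x) - statistics.mean(y)) / se

# Hypothetical pulse rates for four students (not the slide's data).
before = [60, 72, 80, 95]
after = [68, 85, 96, 102]

t_paired = paired_t(after, before)
t_two_sample = two_sample_t(after, before)
print(f"paired t = {t_paired:.2f}, two-sample t = {t_two_sample:.2f}")
```

Both statistics share the same numerator (the mean difference); only the standard error changes, which is why choosing the wrong procedure can flip the decision.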

9 Mistake #3 Failing to design your study so that it has high enough power to call meaningful differences “significantly different.” Includes concluding that the null hypothesis is true when a test is not significant. The conclusion should be “not enough evidence to say the null is false.”

10 Example: Low Power
Success = Yes, I recycle.
Gender X N Sample p
Male
Female
Estimate for p(1) - p(2):
95% CI for p(1) - p(2): ( , )
Test for p(1) - p(2) = 0 (vs not = 0): Z = -1.49 P-Value = 0.135
A number of students said that they were surprised that the hypothesis test said “no difference in percentages.”

11 Example: Low Power
Power and Sample Size
Test for Two Proportions
Testing proportion 1 = proportion 2 (versus not =)
Calculating power for: proportion 1 = 0.55 and proportion 2 = 0.70
Alpha = Difference = -0.15
Sample Size Power
*Sample size = # in EACH group
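A rough sketch of the kind of power calculation shown on this slide, using the normal approximation for a two-sided test of two proportions. The proportions 0.55 and 0.70 come from the slide; the alpha value is missing from the transcript, so 0.05 is assumed here, and the sample sizes are illustrative. Statistical software may use a slightly different formula, so the numbers need not match the original output.

```python
import math
from statistics import NormalDist

def power_two_proportions(p1, p2, n, alpha=0.05):
    """Approximate power of a two-sided z-test of p1 = p2 with n per group."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    # Standard error under the null (pooled) and under the alternative.
    se_null = math.sqrt(2 * p_bar * (1 - p_bar) / n)
    se_alt = math.sqrt(p1 * (1 - p1) / n + p2 * (1 - p2) / n)
    diff = abs(p1 - p2)
    upper = 1 - z.cdf((z_crit * se_null - diff) / se_alt)
    lower = z.cdf((-z_crit * se_null - diff) / se_alt)
    return upper + lower

# Illustrative per-group sample sizes (assumed, not from the slide).
for n in (50, 100, 200):
    print(n, round(power_two_proportions(0.55, 0.70, n), 3))

power_50 = power_two_proportions(0.55, 0.70, 50)
power_200 = power_two_proportions(0.55, 0.70, 200)
```

With small groups the test has little chance of detecting a 15-point difference, which is why a non-significant result is not evidence that the percentages are equal.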

12 Mistake #4 Failing to report a confidence interval as well as the P-value. The P-value tells you whether a result is statistically significant. The confidence interval tells you what the population value might be.

13 Example: A Significant, but Potentially Meaningless Difference
Two sample T for Phone
Gender N Mean StDev SE Mean
Male
Female
95% CI for mu (1) - mu (2): ( -142, -5)
T-Test mu (1) = mu (2) (vs not =): T = P = 0.036 DF = 135
The P-value tells us there is a significant difference, but the confidence interval tells us that the difference in the averages could be as small as 5 minutes.
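One way to make a habit of reporting both quantities is to compute them together. The sketch below works from summary statistics and uses the normal distribution as a large-sample approximation to the t distribution; the means, standard deviations, and group sizes are hypothetical, since the original values are missing from this transcript.

```python
import math
from statistics import NormalDist

def compare_means(mean1, sd1, n1, mean2, sd2, n2, level=0.95):
    """Return the difference in means, a two-sided P-value, and a CI.

    Uses a normal (large-sample) approximation, an assumption of this sketch.
    """
    z = NormalDist()
    diff = mean1 - mean2
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z_stat = diff / se
    p_value = 2 * (1 - z.cdf(abs(z_stat)))
    z_crit = z.inv_cdf(1 - (1 - level) / 2)
    return {
        "diff": diff,
        "p_value": p_value,
        "ci": (diff - z_crit * se, diff + z_crit * se),
    }

# Hypothetical monthly phone minutes for two groups of 70 students each.
result = compare_means(200.0, 180.0, 70, 270.0, 200.0, 70)
print(result)
```

In this made-up example the P-value is below 0.05, yet the interval's upper end sits close to zero, so the true difference could be too small to matter in practice.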

14 Incidentally… Outliers

15 Removing Outliers …
Two sample T for Phone
Gender N Mean StDev SE Mean
Male
Female
95% CI for mu (1) - mu (2): ( , -35)
T-Test mu (1) = mu (2) (vs not =): T = P = DF = 121
The difference in male and female phone usage becomes even more significant. We are 95% confident that the difference in the averages is now more than 35 minutes.

16 Mistake #5 “Fishing” for significant results. That is, performing several hypothesis tests on a data set, and reporting only those results that are significant. If α = P(Type I error) = 0.05 and we perform 20 tests on the same data set when every null hypothesis is true, we can expect to make 1 Type I error (0.05 × 20 = 1).
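The arithmetic on this slide can be checked directly, along with a related quantity: the chance of at least one false positive across all the tests. The second calculation assumes the tests are independent and that every null hypothesis is true, which is a simplification for tests run on the same data set.

```python
def expected_false_positives(alpha, n_tests):
    """Expected number of Type I errors when every null hypothesis is true."""
    return alpha * n_tests

def prob_at_least_one_false_positive(alpha, n_tests):
    """P(at least one Type I error), assuming independent tests."""
    return 1 - (1 - alpha) ** n_tests

alpha, n_tests = 0.05, 20
expected = expected_false_positives(alpha, n_tests)
at_least_one = prob_at_least_one_false_positive(alpha, n_tests)
print(f"expected false positives: {expected:.2f}")
print(f"P(at least one false positive): {at_least_one:.3f}")
```

Under these assumptions, roughly 64% of such 20-test fishing trips turn up at least one "significant" result by chance alone.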

17 Example: Results Obtained from Fishing
Primary driver of $10,000 vehicle and going away for Spring Break are related (P=0.01). Virginity and supporting self through school are related (P = 0.045). Virginity and graduating in four years are related (P = 0.041). Virginity and attending non-football PSU sports events are related (P = 0.016).

18 Mistake #6 Overstating the results of an observational study.
That is, suggesting that one variable “caused” the differences in the other variable, as opposed to correctly saying that the two variables are “associated” or “correlated.” Don’t forget that a significant result may be “spurious.”

19 Example: Misleading Headlines
Virgins don’t support themselves through school. Non-virgins too busy to go to non-football PSU sporting events. Non-virgins also too busy to graduate in four years.

20 Mistake #7 Using a non-random or unrepresentative sample.
Includes extending the results of an unrepresentative sample to the population.

21 Example: Unrepresentative sample
Shere Hite wrote a book in 1987 called “Women and Love.” 100,000 questionnaires about love, sex, and relationships were sent to women’s groups. Only 4,500 questionnaires were returned. The entire book is devoted to the results of the survey. Examples: 91% of divorcees initiated the divorce; 70% of women married 5 years committed adultery.

22 Mistake #8 Failing to use all of the basic principles of experiments, including randomization, blinding, and controlling.

