Random error, Confidence intervals and P-values


Random error, Confidence intervals and P-values
Simon Thornley

Overview
A simulated, repeated epidemiological study to understand:
- Random error
- Confidence intervals
- P-values
- Sample size calculations

Interpreting study results
1. Estimate the OR/RR and 95% CI: is there an association between exposure and outcome?
2. Is P < 0.05, or does the 95% CI for the measure of association exclude the null value (1)?
   - No: the study hypothesis is likely false. Consider type-2 error, confounding, bias, and other studies.
   - Yes: the exposure is associated with disease. Is there another explanation?
     - Type-1 error (consider the strength of association)
     - Bias: information (e.g. recall), selection (survivor bias, loss to follow-up, hospital controls)
     - Confounding: a shared common cause of exposure and disease (address with regression or stratified analysis)
3. How does my study compare with others? Could the study design be improved?

What is random error?
- It is only one type of error in epidemiological studies. What are the others?
- We rarely have the whole population; we rely on a sample.
- Who is picked? Chance plays a role.

A 20-sided die
- Rolling the die simulates an outcome, e.g. diarrhoea.
- 10 rolls = exposed subjects; a roll of 6 to 20 = diarrhoea in the exposed.
- 10 rolls = unexposed subjects; a roll of 16 to 20 = diarrhoea in the unexposed.

Setting up the table (1 roll = 1 participant)

Exposed (10 rolls):   roll ≥ 6 = diarrhoea; roll < 6 = no diarrhoea
Unexposed (10 rolls): roll ≥ 16 = diarrhoea; roll < 16 = no diarrhoea

             Diarrhoea   No diarrhoea
Exposed      a           b
Unexposed    c           d
Total        a+c         b+d
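The dice exercise can be simulated directly. A minimal sketch (the function name and seed are my own, not from the slides) that rolls one d20 per participant and fills in the two-by-two table:

```python
import random

def simulate_study(n_per_group=10, seed=1):
    """Simulate the dice-rolling study: one d20 roll per participant.
    Exposed: roll >= 6 counts as diarrhoea (true risk 15/20).
    Unexposed: roll >= 16 counts as diarrhoea (true risk 5/20)."""
    rng = random.Random(seed)
    a = sum(rng.randint(1, 20) >= 6 for _ in range(n_per_group))   # exposed, diarrhoea
    b = n_per_group - a                                            # exposed, no diarrhoea
    c = sum(rng.randint(1, 20) >= 16 for _ in range(n_per_group))  # unexposed, diarrhoea
    d = n_per_group - c                                            # unexposed, no diarrhoea
    return a, b, c, d

a, b, c, d = simulate_study()
print(f"Exposed:   {a} diarrhoea, {b} no diarrhoea")
print(f"Unexposed: {c} diarrhoea, {d} no diarrhoea")
```

Each run of 20 rolls gives a different table, which is the point of the exercise: the random error is visible across repetitions.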

What are the risk and odds ratios?
Assuming the dice are fair, calculate the 'true' risk ratio and odds ratio of the outcome.

What is the true odds ratio?
- Risk in the exposed?
- Risk in the unexposed?
- Odds in the exposed?
- Odds in the unexposed?

Risk in exposed?

=15/20

Risk in unexposed?

=5/20

Odds among the exposed? Options: 1/3, 1/2, 3/4, 4/3, 3

Odds among the unexposed? Options: 1/3, 1/2, 1/4, 3, 4

True odds ratio? Options: 3, 4, 5, 6, 8, 9

What is the odds ratio?
- Odds in the exposed: 15/5 = 3
- Odds in the unexposed: 5/15 = 1/3
- Odds ratio: 3 / (1/3) = 9
The risk ratio is 3, but it is bounded by the risk in the unexposed (the denominator), which is 1/4: the reciprocal of this (4) is the upper limit of the risk ratio.
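The 'true' ratios above follow directly from the dice design; a minimal check in Python, using the cross-product form so the arithmetic stays exact:

```python
# 'True' 2x2 table implied by the dice design: exposed roll >= 6
# (15/20 diseased), unexposed roll >= 16 (5/20 diseased).
a, b = 15, 5   # exposed: diarrhoea, no diarrhoea (out of 20 faces)
c, d = 5, 15   # unexposed: diarrhoea, no diarrhoea

risk_ratio = (a / (a + b)) / (c / (c + d))  # 0.75 / 0.25 = 3.0
odds_ratio = (a * d) / (b * c)              # cross-product: (15 * 15) / (5 * 5) = 9.0
print(risk_ratio, odds_ratio)               # 3.0 9.0
```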

Aggregate outcomes
- Rolling the die defines the outcome, e.g. diarrhoea.
- 10 students are exposed, 10 unexposed; each student simulates an epidemiological study with n = 10 exposed and n = 10 unexposed (20 rolls in total).
- For the exposed, a roll > 5 (i.e. ≥ 6) = disease; for the unexposed, a roll > 15 (i.e. ≥ 16) = disease; otherwise, no disease.

Every pair: two-by-two table

             Diarrhoea   No diarrhoea
Exposed      a           b
Unexposed    c           d
Total        a+c         b+d

(Speaker note: get students to prepare a two-by-two table to summarise their results.)

Odds ratio

OR = odds of exposure in cases / odds of exposure in controls = (a × d) / (b × c)

(Speaker note: also the odds ratio, if time.)

95% confidence interval for the odds ratio
- se(log OR) = √(1/a + 1/b + 1/c + 1/d)
- Error factor (EF) = exp[1.96 × se(log OR)]
- 95% CI: OR / EF to OR × EF
- z statistic = log(OR) / se(log OR); use tables to convert to a p-value.
(Speaker note: this can be difficult; no need to calculate the p-value.)

My example

             Diarrhoea   No diarrhoea
Exposed      6           4
Unexposed    2           8
Total        8           12

- OR = (6/4) / (2/8) = (6 × 8) / (2 × 4) = 48/8 = 6
- se(log OR) = √(1/6 + 1/4 + 1/2 + 1/8) = 1.02
- Error factor = exp(1.96 × 1.02) = 7.39
- 95% CI = 6/7.39 to 6 × 7.39 = 0.81 to 44.4
- z statistic = log(OR) / se(log OR) = 1.76; P = 0.08
The true odds ratio is 9. Does the calculated 95% CI contain the true value?
(Speaker note: an example for students to work through, if having trouble.)
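The worked example above can be checked in code; here `math.erfc` stands in for the slide's "use tables" step when converting z to a two-sided p-value (a sketch, not part of the slides):

```python
import math

# Worked example table: a=6, b=4, c=2, d=8
a, b, c, d = 6, 4, 2, 8

or_ = (a * d) / (b * c)                       # 48 / 8 = 6.0
se_log_or = math.sqrt(1/a + 1/b + 1/c + 1/d)  # ~1.02
ef = math.exp(1.96 * se_log_or)               # error factor, ~7.39
lo, hi = or_ / ef, or_ * ef                   # ~0.81 to ~44.4
z = math.log(or_) / se_log_or                 # ~1.76
p = math.erfc(abs(z) / math.sqrt(2))          # two-sided p from the normal tail

print(f"OR = {or_:.1f}, 95% CI {lo:.2f} to {hi:.2f}, z = {z:.2f}, P = {p:.2f}")
```

The interval (0.81 to 44.4) is wide but does contain the true value of 9, even though P = 0.08 is "not significant".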

No significant difference
What does this mean? What is the "null" hypothesis? What is the "alternative" hypothesis?

No significant difference
Null hypothesis (no difference):
- P(Diarrhoea | Exposed) = 0.5 and P(Diarrhoea | Unexposed) = 0.5, i.e. P(D|E) − P(D|not E) = 0
Alternative hypothesis (yes, a difference):
- P(D|E) − P(D|not E) = δ (delta), for some δ ≠ 0

Which of the following is not affected by the study sample size?
- 95% confidence interval
- P-value
- Bias
- Random error

Results of the student dice-rolling exercises
- Notice that the point estimates for larger study samples get closer to the true value (OR = 9), and the 95% CIs are narrower: P(type-2 error) falls as sample size increases.
- 19/20 (95%) of the confidence intervals should include the true effect (OR = 9); one interval here does not contain the true value (see the definition of a confidence interval).
- Type-2 errors: intervals crossing the null value of 1 (3/10, or 30%). All other groups made the correct inference from their study data (rejected the null hypothesis).
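The repeated-study exercise can be simulated to see both properties at once: how often the 95% CI covers the true OR of 9, and how often it crosses the null. The 0.5 continuity correction for zero cells is my addition (a standard fix), not part of the slides:

```python
import math
import random

def simulate_ci_coverage(n_sims=2000, n=10, seed=42):
    """Repeat the dice study many times; count how often the Woolf 95% CI
    contains the true OR (9) and how often it crosses the null value (1).
    Adds 0.5 to every cell to avoid division by zero with empty cells."""
    rng = random.Random(seed)
    true_or, covered, crossed_null = 9.0, 0, 0
    for _ in range(n_sims):
        a = sum(rng.random() < 0.75 for _ in range(n))  # exposed with diarrhoea
        c = sum(rng.random() < 0.25 for _ in range(n))  # unexposed with diarrhoea
        a2, b2 = a + 0.5, n - a + 0.5
        c2, d2 = c + 0.5, n - c + 0.5
        or_ = (a2 * d2) / (b2 * c2)
        ef = math.exp(1.96 * math.sqrt(1/a2 + 1/b2 + 1/c2 + 1/d2))
        lo, hi = or_ / ef, or_ * ef
        covered += lo <= true_or <= hi
        crossed_null += lo <= 1.0 <= hi
    return covered / n_sims, crossed_null / n_sims

coverage, type2_rate = simulate_ci_coverage()
print(f"CI coverage: {coverage:.2f}, intervals crossing 1 (type-2 errors): {type2_rate:.2f}")
```

With only 10 per group, a substantial fraction of intervals cross 1 even though the true OR is 9, echoing the 3/10 type-2 errors seen in class.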

Errors in hypothesis testing

Test result (P or 95% CI):
Null hypothesis                          Accept null                      Reject null
                                         (P ≥ 0.05 / 95% CI includes 1)   (P < 0.05 / 95% CI excludes 1)
True (exposure doesn't cause disease)    OK                               Type-1 error
False (exposure causes disease)          Type-2 error                     OK

This is the so-called "Neyman–Pearson" approach to hypothesis testing. We fix the probability of making a type-1 error (concluding an effect is present when there is none) and of a type-2 error (concluding no effect is present when the exposure truly does influence the outcome), and calculate the sample size required on that basis.

[Figure: two overlapping sampling distributions for the estimated risk of disease. Under H0 (exposure doesn't cause disease) the distribution is centred at 0.50; under HAlt (exposure associated with disease) it is centred at 0.75. Shaded areas mark v/2 in each tail of the null distribution (5% two-sided significance) and u (~10%, corresponding to 90% power).]

When we perform a study, our result is a draw from a hypothesised distribution. Under the null, the risk in our sample is 0.50. Under the alternative, our estimate for the exposed is most likely to land at 0.75 but, as you have seen, it may differ by chance, and there is a small chance of getting a result that suggests our exposure has no effect on the outcome. The width of the probability curves can be reduced by increasing the numbers in the study. To design a study, we fix the shaded areas (the error rates), estimate the likely treatment effect (0.75), and then calculate the sample size that produces curves of the required width.

Sample size

n = {u√[π1(1−π1) + π0(1−π0)] + v√[2π̄(1−π̄)]}² / (π1 − π0)²,   where π̄ = (π0 + π1)/2

- n = sample size in each group
- π0 = risk in the unexposed (1/4)
- π1 = risk in the exposed (3/4)
- u = one-sided normal deviate corresponding to 100% − power (e.g. 10% for 90% power; u = 1.28)
- v = two-sided normal deviate corresponding to the two-sided significance level required (5%; v = 1.96)

(Speaker note: an ugly formula, put in for interest; not examinable, but the principles are.)

Our example...
n = 18.8 per group (38 in total). We were unlucky not to find a significant difference between the two samples.
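Plugging the slide's values into the sample-size formula reproduces n = 18.8; the hard-coded deviates (u = 1.28, v = 1.96) are the slide's own, and the "simpler formula" from the following slide is included for comparison:

```python
import math

def sample_size_two_proportions(pi0, pi1, u=1.28, v=1.96):
    """Sample size per group for comparing two proportions, in the
    slides' u/v formulation: u is the one-sided normal deviate for
    100% - power (10% -> 1.28 for 90% power), v the two-sided deviate
    for the significance level (5% -> 1.96)."""
    pbar = (pi0 + pi1) / 2
    num = (u * math.sqrt(pi1 * (1 - pi1) + pi0 * (1 - pi0))
           + v * math.sqrt(2 * pbar * (1 - pbar))) ** 2
    return num / (pi1 - pi0) ** 2

n = sample_size_two_proportions(0.25, 0.75)
print(f"n = {n:.1f} per group")  # n = 18.8 per group

# Simpler formula: n = 16 * pbar * (1 - pbar) / (pi1 - pi0)^2, pbar = 0.5
lehr = 16 * 0.5 * (1 - 0.5) / (0.75 - 0.25) ** 2
print(f"simpler formula: n = {lehr:.0f} per group")  # n = 16 per group
```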

A simpler formula...
n = 16 π̄(1 − π̄) / (π1 − π0)² = 16 per group, or 32 in total.
(Speaker note: a simplification, with π̄ the average of the null (π0 = 0.25) and alternative (π1 = 0.75) risks: 16 × 0.5 × 0.5 / (0.5)² = 16.)

Sample size vs. power
Statistical power (1 − the probability of a type-2 error) increases as sample size increases. Most investigators consider 90% power (0.9) standard for most study designs.

Five different effect measures
We have 10 repetitions of the study.
- How many of these 95% confidence intervals would you expect to contain the true value?
- How many studies draw the incorrect conclusion of no difference between the treatment and no-treatment groups using p-values?

Two approaches
- Yes/no answer → p-value
- What is the true difference? → 95% confidence interval

Confidence interval
What is the true difference? The 95% confidence interval is not "a 95% probability that the true effect estimate lies within the confidence interval". Rather: "if we repeated the study over and over, and calculated a 95% confidence interval each time, 19 out of 20 of these intervals would contain the true effect estimate."

95% CI: "a range of values compatible with the true value".

Confidence interval
If the 95% confidence interval does not contain 0 (for a difference between means) or 1 (for ratio effect measures), the null hypothesis can be rejected. Confidence intervals are increasingly preferred over p-values.

P-value
- Propose a 'null' hypothesis: no effect of eating contaminated food on the risk of diarrhoea.
- Is the observed difference true, or spurious (sampling error, chance)? Gives a yes-or-no answer.
- P-value = P(observed results, or more extreme | null true)
- If the p-value is low (P < 0.05): chance is unlikely to explain the result, so reject the null hypothesis.

Problems with p-values
- NOT the probability that the null hypothesis is true: that is the wrong conditional interpretation.
- Affected by both sample size and effect size; sample size is set by design (cf. meta-analyses).
- Is the effect clinically significant? Are the null and alternative hypotheses sufficiently different?

Problems with p-values
Imagine a study in which everyone is given both treatments. At follow-up, which drug do you prefer?

Total       Preference (treatment : placebo)   % preferring treatment   Two-sided p-value
20          15 : 5                             75%                      0.04
200         115 : 85                           57.5%                    –
2,000       1,046 : 954                        52.3%                    –
2,000,000   1,001,445 : 998,555                50.07%                   –
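The p-values missing from the table can be filled in approximately with a normal approximation to the binomial (a sketch; the slide's 0.04 for 15/20 comes from an exact test, so values differ slightly). All rows land near 0.04 even as the effect shrinks toward 50:50, which is the slide's point:

```python
import math

def two_sided_p(successes, n):
    """Two-sided p-value for a preference split against H0: p = 0.5,
    using a normal approximation to the binomial."""
    z = (successes - n / 2) / math.sqrt(n / 4)
    return math.erfc(abs(z) / math.sqrt(2))

rows = [(15, 20), (115, 200), (1046, 2000), (1_001_445, 2_000_000)]
for k, n in rows:
    print(f"n = {n:>9}: {k / n:.2%} prefer treatment, p = {two_sided_p(k, n):.3f}")
```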

Which of the following statements about p-values is true?
A) The probability of the observed results (or more extreme) if we assume there is no association
B) The probability that the null hypothesis is true
C) Unaffected by the sample size
D) Used to adjudicate the presence of confounding

Imagine…
We redesign the dice-rolling experiment so that the risk of disease is the same in both the exposed and unexposed groups (rolls 1 to 10 = disease; 11 to 20 = no disease).
- What is the chance of a type-1 (false-positive) error?
- Is the type-1 error rate improved by increasing the sample size?
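A quick simulation (my own sketch, reusing the Woolf interval with a 0.5 continuity correction) suggests the answer: with the null true, the rate of false rejections stays at or below the fixed 5% level regardless of sample size.

```python
import math
import random

def type1_error_rate(n_per_group, n_sims=4000, seed=7):
    """With both groups at risk 0.5 the null is true; count how often
    the Woolf 95% CI (0.5 added to each cell to avoid zero cells)
    wrongly excludes OR = 1."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        a = sum(rng.random() < 0.5 for _ in range(n_per_group))
        c = sum(rng.random() < 0.5 for _ in range(n_per_group))
        a2, b2 = a + 0.5, n_per_group - a + 0.5
        c2, d2 = c + 0.5, n_per_group - c + 0.5
        or_ = (a2 * d2) / (b2 * c2)
        ef = math.exp(1.96 * math.sqrt(1/a2 + 1/b2 + 1/c2 + 1/d2))
        rejections += not (or_ / ef <= 1.0 <= or_ * ef)
    return rejections / n_sims

for n in (10, 100):
    print(f"n = {n} per group: type-1 error rate = {type1_error_rate(n):.3f}")
```

Increasing the sample size does not shrink the type-1 error rate: it is fixed by the chosen significance level, not by n.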

Summary
- Confidence interval: a range of values within which we are reasonably confident the population measure lies.
- P-value: the strength of evidence against the null hypothesis that the true difference in the population is zero.

A type-2 error is best described as:
- Not seeing an effect that truly is there
- Detecting an effect that really isn't there
- Bias
- Increases with increasing sample size

A type-1 error is:
- Detecting an effect that really isn't there
- Not detecting an effect that really is there
- Bias
- Not fixed

The probability of a type-1 error is usually...
- 20%
- 5%
- 1%
- 10%

During the analysis of the results of this epidemiological study, you derive a crude odds ratio with a 95% CI which excludes the null value of 1. Which of the following could not explain the result you have found?
a) It is a true effect of the exposure on the outcome
b) It is a type-2 error
c) It is a type-1 error
d) It is due to information bias
e) It is due to the presence of unadjusted confounding factors

Which of the following measures of disease frequency can be estimated from a cross-sectional study?
- Incidence rate
- Cumulative incidence
- Prevalence
- Risk ratio