1 EPI 5240: Introduction to Epidemiology Measures used to compare groups October 5, 2009 Dr. N. Birkett, Department of Epidemiology & Community Medicine,


1 EPI 5240: Introduction to Epidemiology Measures used to compare groups October 5, 2009 Dr. N. Birkett, Department of Epidemiology & Community Medicine, University of Ottawa

2 Session Overview
Methods of comparing groups
–Risk/rate ratios
–Odds ratios
–Difference measures

3 ONE BIG WARNING!!!!! Some books (e.g. the Greenberg one used in the summer course) rotate their 2X2 tables from the normal approach. That is, they have the outcomes as the rows and the exposure as the columns. BE WARNED. This could cause confusion. My tables use the more common approach.

4 Comparing groups (1)
Two main outcome measures
–Incidence (either risk or rate)
–Prevalence
How do you determine if an exposure is related to an outcome?
–Need to compare the measure in the two groups: differences, or ratios (we’ll start with this one).
–Ratio measures have NO units.
–All ratio measures have the same interpretation: 1.0 = no effect; < 1.0 → protective effect; > 1.0 → increased risk
–Values over 2.0 are of strong interest

5 Comparing groups: Cohorts (2)
RISK RATIO
                  Disease
Exp           YES        NO     Total
YES         1,000     9,000    10,000
NO            100     9,900    10,000
Total       1,100    18,900    20,000
Risk in exposed = 1000/10000
Risk in non-exposed = 100/10000
If exposure increases risk, you would expect the risk in the exposed to be larger than the risk in the unexposed. How much larger can be assessed by the ratio of one to the other:
Risk ratio (RR) = (Risk in exp)/(Risk in non-exp) = (1000/10000)/(100/10000) = 10.0

6 Comparing groups: Cohorts (3)
RISK RATIO
                  Disease
Exp           YES        NO     Total
YES             a         b       a+b
NO              c         d       c+d
Total         a+c       b+d         N
Risk in exposed = a/(a+b)
Risk in non-exposed = c/(c+d)
If exposure increases risk, you would expect a/(a+b) to be larger than c/(c+d). How much larger can be assessed by the ratio of one to the other:
Risk ratio (RR) = (Risk in exp)/(Risk in non-exp) = (a/(a+b))/(c/(c+d))

7 Comparing groups: Cohorts (4)
                      Death
Pollutant level   YES      NO    Total
High               42      80      122
Low                43     302      345
Total              85     382      467
Risk in exposed = 42/122 = 0.344
Risk in non-exposed = 43/345 = 0.125
Risk ratio (RR) = (Exp risk)/(Non-exp risk) = 0.344/0.125 = 2.76
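As a rough illustration of the risk-ratio calculation, the sketch below computes RR from the four cells of a 2x2 cohort table. The function name `risk_ratio` and its argument order are choices made for this example only; the counts of survivors are derived from the slide's denominators as 122 - 42 and 345 - 43.

```python
def risk_ratio(a, b, c, d):
    """Risk ratio from a 2x2 cohort table.

    a = exposed with outcome,   b = exposed without outcome
    c = unexposed with outcome, d = unexposed without outcome
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed


# Hypothetical cohort: 1,000/10,000 exposed vs 100/10,000 unexposed
rr_example = risk_ratio(1000, 9000, 100, 9900)

# Pollutant data: 42 deaths among 122 (high), 43 deaths among 345 (low)
rr_pollutant = risk_ratio(42, 122 - 42, 43, 345 - 43)

print(round(rr_example, 2), round(rr_pollutant, 2))
```

Passing the four cell counts rather than precomputed risks keeps the same inputs usable for the confidence-interval formulas later in the lecture.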

8 95% CI’s for CIR (1)
For a mean value, the 95% CI is given as: $\bar{x} \pm 1.96 \times se(\bar{x})$, where $se(\bar{x}) = s/\sqrt{n}$
Assumes the mean has a normal (Gaussian) distribution

9 95% CI’s for CIR (2)
Might try using the same approach to obtain a ‘95% CI’ for CIR using: $\mathrm{CIR} \pm 1.96 \times se(\mathrm{CIR})$
BUT: CIR is NOT normally distributed
–Range from 0 to +∞
–Null value = 1.0
–Implies a non-symmetric distribution

10 [Figure: plot of the ‘CIR’ distribution when H0 is true]

11 95% CI’s for CIR (3)
Instead, use ‘log(CIR)’, where the log is taken to the ‘natural’ base ‘e’
–Often written ln(CIR)
ln(CIR) is approximately normally distributed
–Range from -∞ to +∞
–Null value = 0
–95% CI is given by: $\ln(\mathrm{CIR}) \pm 1.96 \times se(\ln(\mathrm{CIR}))$
Need to find a formula for ‘se(ln(CIR))’

12 [Figure: plot of the ‘ln(CIR)’ distribution when H0 is true]

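13 95% CI’s for CIR (4)
$\mathrm{var}(\ln(\mathrm{CIR})) = \mathrm{var}(\ln(\text{risk in exp})) + \mathrm{var}(\ln(\text{risk in non-exp}))$ if exposed and unexposed are independent
After some math, this gives the following result (next slide)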

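14 95% CI’s for CIR (5)
                  Disease
Exp           YES        NO     Total
YES             a         b       a+b
NO              c         d       c+d
Total         a+c       b+d         N
$\mathrm{var}(\ln(\mathrm{CIR})) = \frac{b}{a(a+b)} + \frac{d}{c(c+d)}$
$se(\ln(\mathrm{CIR})) = \sqrt{\mathrm{var}(\ln(\mathrm{CIR}))}$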

15 95% CI’s for CIR (6)
We’re close now. Just take the ‘anti-logs’ (usually called the ‘exp’ function):
95% CI = $\mathrm{CIR} \times \exp(\pm 1.96 \times se(\ln(\mathrm{CIR})))$

16 Comparing groups: Cohorts (5)
                      Death
Pollutant level   YES      NO    Total
High               42      80      122
Low                43     302      345
Risk ratio (RR) or CIR = 2.76
var(ln(CIR)) = 80/(42*122) + 302/(43*345) = 0.036
se(ln(CIR)) = sqrt(0.036) = 0.190
Upper 95% CI = 2.76 * exp(+1.96*0.190) = 4.00
Lower 95% CI = 2.76 * exp(-1.96*0.190) = 1.90
Conclusion: CIR is: 2.76 (1.90 to 4.00)
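The log-scale interval for the CIR can be sketched in a few lines. This is a minimal example, assuming the usual large-sample variance var(ln(CIR)) = b/(a(a+b)) + d/(c(c+d)), which reproduces the se of 0.190 quoted on the slide; the function name `risk_ratio_ci` is invented for this illustration.

```python
import math


def risk_ratio_ci(a, b, c, d, z=1.96):
    """Risk ratio with a log-scale (Wald-type) confidence interval.

    a, b = exposed with / without outcome
    c, d = unexposed with / without outcome
    """
    rr = (a / (a + b)) / (c / (c + d))
    # Large-sample variance of ln(RR); assumes the two groups are independent
    var_ln_rr = b / (a * (a + b)) + d / (c * (c + d))
    se_ln_rr = math.sqrt(var_ln_rr)
    lower = rr * math.exp(-z * se_ln_rr)
    upper = rr * math.exp(z * se_ln_rr)
    return rr, se_ln_rr, lower, upper


# Pollutant data: 42/122 deaths (high) vs 43/345 deaths (low)
rr, se_ln_rr, rr_lower, rr_upper = risk_ratio_ci(42, 80, 43, 302)
print(f"RR = {rr:.2f}, se(ln RR) = {se_ln_rr:.3f}, "
      f"95% CI = ({rr_lower:.2f}, {rr_upper:.2f})")
```

Because the code carries full precision through every step, the limits may differ from the slide's hand-rounded figures in the last decimal place.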

17 Comparing groups: Cohorts (6)
Hypothesis testing (H0: CIR = 1)
–Much less common than 95% CI’s
–Normal approximation test is generally OK

18 Comparing groups: Cohorts (7)
RISK DIFFERENCE
                  Disease
Exp           YES        NO     Total
YES         1,000     9,000    10,000
NO            100     9,900    10,000
Total       1,100    18,900    20,000
Risk in exposed = 1000/10000
Risk in non-exposed = 100/10000
If exposure increases risk, you would expect the risk in the exposed to be larger than the risk in the unexposed. How much larger can be assessed by the difference between the two:
Risk difference (RD) = (Risk in exp) - (Risk in non-exp) = 1,000/10,000 - 100/10,000 = 900/10,000 = 0.09

19 Comparing groups: Cohorts (8)
RISK DIFFERENCE
                  Disease
Exp           YES        NO     Total
YES             a         b       a+b
NO              c         d       c+d
Total         a+c       b+d         N
Risk in exposed = a/(a+b)
Risk in non-exposed = c/(c+d)
If exposure increases risk, you would expect a/(a+b) to be larger than c/(c+d). How much larger can be assessed by the difference between the two:
Risk difference (RD) = (Risk in exp) - (Risk in non-exp) = a/(a+b) - c/(c+d)

20 Comparing groups: Cohorts (9)
                      Death
Pollutant level   YES      NO    Total
High               42      80      122
Low                43     302      345
Risk in exposed = 42/122 = 0.344
Risk in non-exposed = 43/345 = 0.125
Risk difference (RD) = (Risk in exp) - (Risk in non-exp) = 0.344 - 0.125 = 0.219

21 95% CI’s for Risk Diff (1)
We assume that the incidence follows a binomial distribution (which can be considered approximately normal if the incidence isn’t too small).

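22 95% CI’s for Risk Diff (2)
                      Death
Pollutant level   YES      NO    Total
High               42      80      122
Low                43     302      345
RD = 0.219
var(RD) = (42*80)/122^3 + (43*302)/345^3 = 0.0022
se(RD) = sqrt(0.0022) = 0.047
Upper 95% CI = 0.219 + 1.96*0.047 = 0.310
Lower 95% CI = 0.219 - 1.96*0.047 = 0.127
Conclusion: RD is: 0.219 (0.127 to 0.310)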
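A minimal sketch of the risk-difference interval, assuming the binomial variance p(1-p)/n for each group and a normal approximation. The function name `risk_difference_ci` is invented for this example. Run at full precision, the limits come out within about 0.001 of the slide's rounded values.

```python
import math


def risk_difference_ci(a, n1, c, n2, z=1.96):
    """Risk difference with a normal-approximation confidence interval.

    a events among n1 exposed; c events among n2 unexposed.
    """
    p1 = a / n1
    p2 = c / n2
    rd = p1 - p2
    # Binomial variance of each risk; groups assumed independent
    var_rd = p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2
    se_rd = math.sqrt(var_rd)
    return rd, se_rd, rd - z * se_rd, rd + z * se_rd


# Pollutant data: 42/122 deaths (high) vs 43/345 deaths (low)
rd, se_rd, rd_lower, rd_upper = risk_difference_ci(42, 122, 43, 345)
print(f"RD = {rd:.3f}, se = {se_rd:.3f}, "
      f"95% CI = ({rd_lower:.3f}, {rd_upper:.3f})")
```

Note that p(1-p)/n with p = 42/122 is the same quantity as (42*80)/122^3, the form used on the slide.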

23 Comparing groups: Cohorts (10)
Which comparative measure do you use? Depends on the circumstances.
Risk Ratio → RELATIVE risk measure
Risk Difference → ABSOLUTE risk measure
Post-menopausal estrogens & endometrial cancer
–RR = 2.3
–RD = 2/10,000

24 Comparing groups: Cohorts (11)
RATE RATIO
Exp       Disease    Person-years
YES         1,000           9,500
NO            100           9,950
Total       1,100          19,450
Rate in exposed = 1000/9500
Rate in non-exposed = 100/9950
If exposure increases the rate of getting disease, you would expect the rate in the exposed to be larger than the rate in the unexposed. How much larger can be assessed by the ratio of one to the other:
Rate ratio (RR) = (Rate in exp)/(Rate in non-exp) = (1000/9500)/(100/9950) = 10.5

25 Comparing groups: Cohorts (12)
RATE RATIO
Exp       Disease    Person-time
YES             A             Y1
NO              B             Y2
Total         A+B          Y1+Y2
Rate in exposed = A/Y1
Rate in non-exposed = B/Y2
If exposure increases the rate of getting disease, you would expect A/Y1 to be larger than B/Y2. How much larger can be assessed by the ratio of one to the other:
Rate ratio (RR) = (Rate in exp)/(Rate in non-exp) = (A/Y1)/(B/Y2)

26 Comparing groups: Cohorts (13)
Pollutant level   Dead    Person-years
High                42           101
Low                 43           323.5
Rate in exposed = 42/101 = 0.416
Rate in non-exposed = 43/323.5 = 0.133
Rate ratio (RR) = (Rate in exp)/(Rate in non-exp) = 0.416/0.133 = 3.13

27 95% CI’s for IDR (1)
Use the same approach to obtain a ‘95% CI’ for IDR as we used for CIR.
BUT: IDR is NOT normally distributed
–Range from 0 to +∞
–Null value = 1.0
–Implies a non-symmetric distribution

28 95% CI’s for IDR (2)
Instead, use ‘log(IDR)’, where the log is taken to the ‘natural’ base ‘e’
–Often written ln(IDR)
ln(IDR) is approximately normally distributed
–Range from -∞ to +∞
–Null value = 0
–95% CI is given by: $\ln(\mathrm{IDR}) \pm 1.96 \times se(\ln(\mathrm{IDR}))$
Need to find a formula for ‘se(ln(IDR))’

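29 95% CI’s for IDR (3)
$\mathrm{var}(\ln(\mathrm{IDR})) = \mathrm{var}(\ln(\text{rate in exp})) + \mathrm{var}(\ln(\text{rate in non-exp}))$ if exposed and unexposed are independent
After some math, this gives the following result (next slide)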

30 95% CI’s for IDR (4)
Exp       Disease    Person-time
YES             a             Y1
NO              c             Y2
Total         a+c          Y1+Y2
$\mathrm{var}(\ln(\mathrm{IDR})) = \frac{1}{a} + \frac{1}{c}$
DOES NOT DEPEND ON PERSON-TIME!!

31 95% CI’s for IDR (5)
We’re close now. Just take the ‘anti-logs’ (usually called the ‘exp’ function):
95% CI = $\mathrm{IDR} \times \exp(\pm 1.96 \times se(\ln(\mathrm{IDR})))$

32 Comparing groups: Cohorts (14)
Pollutant level   Dead    Person-years
High                42           101
Low                 43           323.5
Rate ratio (RR) or IDR = 3.13
var(ln(IDR)) = 1/42 + 1/43 = 0.047
se(ln(IDR)) = sqrt(0.047) = 0.217
Upper 95% CI = 3.13 * exp(+1.96*0.217) = 4.79
Lower 95% CI = 3.13 * exp(-1.96*0.217) = 2.05
Conclusion: IDR is: 3.13 (2.05 to 4.79)
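The rate-ratio calculation follows the same pattern as the risk ratio. The sketch below is illustrative only: the function name `rate_ratio_ci` is made up for this example, and it assumes var(ln(IDR)) = 1/a + 1/c, which matches the 0.047 shown on the slide.

```python
import math


def rate_ratio_ci(a, y1, c, y2, z=1.96):
    """Rate ratio (IDR) with a log-scale confidence interval.

    a events over y1 person-years (exposed);
    c events over y2 person-years (unexposed).
    """
    idr = (a / y1) / (c / y2)
    # The variance of ln(IDR) depends only on the event counts
    se_ln_idr = math.sqrt(1 / a + 1 / c)
    lower = idr * math.exp(-z * se_ln_idr)
    upper = idr * math.exp(z * se_ln_idr)
    return idr, se_ln_idr, lower, upper


# Pollutant data: 42 deaths / 101 PY (high) vs 43 deaths / 323.5 PY (low)
idr, se_ln_idr, idr_lower, idr_upper = rate_ratio_ci(42, 101, 43, 323.5)
print(f"IDR = {idr:.2f}, se(ln IDR) = {se_ln_idr:.3f}, "
      f"95% CI = ({idr_lower:.2f}, {idr_upper:.2f})")
```

Person-time enters the point estimate but not the standard error, so doubling both person-year totals would leave the width of the interval on the log scale unchanged.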

33 Comparing groups: Cohorts (15)
Hypothesis testing (H0: IDR = 1)
–Much less common than 95% CI’s
–Normal approximation test is generally OK

34 Comparing groups: Cohorts (16)
RATE DIFFERENCE
Exp       Disease    Person-years
YES         1,000           9,500
NO            100           9,950
Total       1,100          19,450
Rate in exposed = 1000/9500
Rate in non-exposed = 100/9950
If exposure increases the rate of getting disease, you would expect the rate in the exposed to be larger than the rate in the unexposed. How much larger can be assessed by the difference between the two:
Rate difference = (Rate in exp) - (Rate in non-exp) = 1000/9500 - 100/9950 = 0.095 cases/PY

35 Comparing groups: Cohorts (17)
RATE DIFFERENCE
Exp       Disease    Person-time
YES             A             Y1
NO              B             Y2
Total         A+B          Y1+Y2
Rate in exposed = A/Y1
Rate in non-exposed = B/Y2
If exposure increases the rate of getting disease, you would expect A/Y1 to be larger than B/Y2. How much larger can be assessed by the difference between the two:
Rate difference = (Rate in exp) - (Rate in non-exp) = A/Y1 - B/Y2

36 Comparing groups: Cohorts (18)
Pollutant level   Dead    Person-years
High                42           101
Low                 43           323.5
Rate in exposed = 42/101 = 0.416
Rate in non-exposed = 43/323.5 = 0.133
Rate difference (RD) = (Rate in exp) - (Rate in non-exp) = 0.416 - 0.133 = 0.283 cases/person-year

37 95% CI’s for Rate Diff (1)
We assume that the incidence follows a Poisson distribution (which can be considered approximately normal if the incidence isn’t too small).

38 95% CI’s for Rate Diff (2)
Pollutant level   Dead    Person-years
High                42           101
Low                 43           323.5
RD = 0.283 cases/PY
var(RD) = 42/101^2 + 43/323.5^2 = 0.0045
se(RD) = sqrt(0.0045) = 0.067
Upper 95% CI = 0.283 + 1.96*0.067 = 0.415
Lower 95% CI = 0.283 - 1.96*0.067 = 0.152
Conclusion: Rate Diff is: 0.283 (0.152 to 0.415) cases/PY
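A sketch of the rate-difference interval, assuming event counts are Poisson so that the variance of a rate A/Y is estimated by A/Y^2. The function name `rate_difference_ci` is hypothetical. With unrounded intermediates the limits land within about 0.001 of the slide's figures.

```python
import math


def rate_difference_ci(a, y1, c, y2, z=1.96):
    """Rate difference with a normal-approximation confidence interval.

    a events over y1 person-years (exposed);
    c events over y2 person-years (unexposed).
    """
    rate_diff = a / y1 - c / y2
    # Poisson assumption: var(events) = events, so var(rate) = events / PY^2
    var_rd = a / y1 ** 2 + c / y2 ** 2
    se_rd = math.sqrt(var_rd)
    return rate_diff, se_rd, rate_diff - z * se_rd, rate_diff + z * se_rd


# Pollutant data: 42 deaths / 101 PY (high) vs 43 deaths / 323.5 PY (low)
rate_diff, se_rate_diff, rate_lower, rate_upper = rate_difference_ci(
    42, 101, 43, 323.5
)
print(f"Rate difference = {rate_diff:.3f} cases/PY, se = {se_rate_diff:.3f}, "
      f"95% CI = ({rate_lower:.3f}, {rate_upper:.3f})")
```

Unlike the rate ratio, the standard error here does depend on person-time, because the difference is measured in cases per person-year.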

39 Comparing groups: Cohorts (19)
Some Issues
What does RR (or RD) mean?
–Can mean risk or rate ratio (or difference). Some people think this distinction is pedantic rather than correct.
–Need to tell which from context.
–Sometimes referred to as Relative Risk (generic term).
Are risk differences or ratios preferred?
–RR’s are much more common
–Both have a role to play.

40 Comparing groups: Case-control (1)
CAN NOT COMPUTE A RISK RATIO!
Can not estimate incidence from a case-control study. Can not compute risk differences.
Why? We choose the subjects based on their outcome status. Usually, that means making the number of cases and controls equal. Hence, the ‘incidence’ in the case-control study is fixed at 0.5. In the real world, it is most likely much lower (e.g. 1/100,000).
Let’s look at an example.

41 Comparing groups: Case-control (2)
Cohort:
                  Disease
Exp           YES        NO     Total
YES         1,000     9,000    10,000
NO            100     9,900    10,000
Total       1,100    18,900    20,000
RISK RATIO
Risk in exposed = 0.1
Risk in non-exposed = 0.01
RR = 0.1/0.01 = 10.0
Case-control sample from the same population:
Exp          Case   Control     Total
YES         1,000       524     1,524
NO            100       576       676
Total       1,100     1,100     2,200
‘RISK RATIO’
‘Risk’ in exposed = 1000/1524 = 0.656
‘Risk’ in non-exposed = 100/676 = 0.148
‘RR’ = 0.656/0.148 = 4.44

42 Comparing groups: Case-control (3)
CAN NOT COMPUTE A RISK RATIO!
So, what do we do?
–Cornfield & Haenszel provided a solution. They looked at the ODDS of exposure. The ratio of the odds of exposure in the cases and controls is almost the same as the RR, if the disease is rare.
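To make the point concrete, the sketch below draws a case-control sample from the hypothetical cohort of 20,000 used earlier (1,000/10,000 exposed cases, 100/10,000 unexposed cases). It assumes all 1,100 cases are kept and 1,100 controls are taken from the non-cases in proportion to exposure, using expected counts rather than random sampling. The variable names are choices made for this example.

```python
# Source cohort
exp_cases, exp_noncases = 1000, 9000
unexp_cases, unexp_noncases = 100, 9900

true_rr = (exp_cases / (exp_cases + exp_noncases)) / (
    unexp_cases / (unexp_cases + unexp_noncases)
)

# Case-control sample: every case, plus an equal number of controls whose
# exposure split mirrors that of the non-cases (expected counts, rounded)
n_controls = exp_cases + unexp_cases
frac_exposed = exp_noncases / (exp_noncases + unexp_noncases)
exp_controls = round(n_controls * frac_exposed)
unexp_controls = n_controls - exp_controls

# Treating the sample as if it were a cohort gives a misleading 'risk ratio'
pseudo_rr = (exp_cases / (exp_cases + exp_controls)) / (
    unexp_cases / (unexp_cases + unexp_controls)
)

# The exposure odds ratio does not depend on the case:control sampling ratio
odds_ratio = (exp_cases / unexp_cases) / (exp_controls / unexp_controls)

print(f"true RR = {true_rr:.1f}, pseudo 'RR' = {pseudo_rr:.2f}, "
      f"OR = {odds_ratio:.1f}")
```

The pseudo 'RR' falls well below the true value of 10 because the sampling design fixes the proportion of cases, while the odds ratio stays close to the true RR.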

43 Comparing groups: Case-control (4)
ODDS RATIO
                  Disease
Exp           YES        NO     Total
YES           900       400     1,300
NO            100       600       700
Total       1,000     1,000     2,000
Odds of exposure in cases = 900/100
Odds of exposure in controls = 400/600
If exposure increases the rate of getting disease, you would expect to find more exposed cases than exposed controls. That is, the odds of exposure for cases would be higher. How much higher can be assessed by the ratio of one to the other:
Odds ratio (OR) = (Exp odds in cases)/(Exp odds in controls) = (900/100)/(400/600) = 13.5

44 Comparing groups: Case-control (5)
ODDS RATIO
                  Disease
Exp           YES        NO     Total
YES             a         b       a+b
NO              c         d       c+d
Total         a+c       b+d         N
Odds of exposure in cases = a/c
Odds of exposure in controls = b/d
If exposure increases the rate of getting disease, you would expect to find more exposed cases than exposed controls. That is, the odds of exposure for cases would be higher (a/c > b/d). How much higher can be assessed by the ratio of one to the other:
Odds ratio (OR) = (Exp odds in cases)/(Exp odds in controls) = (a/c)/(b/d) = ad/bc

45 Comparing groups: Case-control (6)
                     Disease
Pollutant level   Yes      No    Total
High               42      18       60
Low                43      67      110
Total              85      85      170
Odds of exp in cases = 42/43 = 0.977
Odds of exp in controls = 18/67 = 0.269
Odds ratio (OR) = (Odds in cases)/(Odds in controls) = 0.977/0.269 = (42*67)/(43*18) = 3.64
NOTE: Risk ratio = 2.76; Rate ratio = 3.13

46 95% CI’s for OR (1)
Use the same approach to obtain a ‘95% CI’ for OR as we used for CIR/IDR.
BUT: OR is NOT normally distributed
–Range from 0 to +∞
–Null value = 1.0
–Implies a non-symmetric distribution

47 95% CI’s for OR (2)
Instead, use ‘log(OR)’, where the log is taken to the ‘natural’ base ‘e’
–Often written ln(OR)
ln(OR) is approximately normally distributed
–Range from -∞ to +∞
–Null value = 0
–95% CI is given by: $\ln(\mathrm{OR}) \pm 1.96 \times se(\ln(\mathrm{OR}))$
Need to find a formula for ‘se(ln(OR))’

48 95% CI’s for OR (3)
Exp          Case   Control
YES             a         b
NO              c         d
Total         a+c       b+d
$\mathrm{var}(\ln(\mathrm{OR})) = \frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}$

49 95% CI’s for OR (4)
We’re close now. Just take the ‘anti-logs’ (usually called the ‘exp’ function):
95% CI = $\mathrm{OR} \times \exp(\pm 1.96 \times se(\ln(\mathrm{OR})))$

50 Comparing groups: Case-control (6)
                     Disease
Pollutant level   Yes      No
High               42      18
Low                43      67
Odds ratio (OR) = 3.63
var(ln(OR)) = 1/42 + 1/18 + 1/43 + 1/67 = 0.118
se(ln(OR)) = sqrt(0.118) = 0.343
Upper 95% CI = 3.63 * exp(+1.96*0.343) = 7.11
Lower 95% CI = 3.63 * exp(-1.96*0.343) = 1.85
Conclusion: OR is: 3.63 (1.85 to 7.11)
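The odds ratio and its interval can be reproduced with the sketch below, which assumes the usual large-sample variance var(ln(OR)) = 1/a + 1/b + 1/c + 1/d (this matches the 0.118 on the slide). The function name `odds_ratio_ci` is invented for this example. At full precision the OR is about 3.636, so the limits can differ from hand-rounded figures by roughly 0.01.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a log-scale confidence interval.

    a = exposed cases,   b = exposed controls
    c = unexposed cases, d = unexposed controls
    """
    odds_ratio = (a * d) / (b * c)
    se_ln_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = odds_ratio * math.exp(-z * se_ln_or)
    upper = odds_ratio * math.exp(z * se_ln_or)
    return odds_ratio, se_ln_or, lower, upper


# Pollutant case-control data: 42/18 (high) and 43/67 (low), cases/controls
or_est, se_ln_or, or_lower, or_upper = odds_ratio_ci(42, 18, 43, 67)
print(f"OR = {or_est:.2f}, se(ln OR) = {se_ln_or:.3f}, "
      f"95% CI = ({or_lower:.2f}, {or_upper:.2f})")
```

The smallest cell (18 exposed controls) contributes the largest term to the variance, which is why sparse cells widen the interval so quickly.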

51 Comparing groups: Case-control (7)
Hypothesis testing (H0: OR = 1)
–Much less common than 95% CI’s
–Normal approximation test is generally OK
JUST USE THE STANDARD Chi-square TEST!
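For readers who want to see the test spelled out, here is a minimal sketch of the Pearson chi-square statistic for a 2x2 table, applied to the pollutant case-control counts (42, 18, 43, 67). It assumes no continuity correction, which may or may not be what the lecturer intends by 'standard'. The function name `chi_square_2x2` is hypothetical. The p-value uses the fact that a chi-square variable with 1 degree of freedom is a squared standard normal.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for a 2x2 table."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    # For 1 df: P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


chi2, p_value = chi_square_2x2(42, 18, 43, 67)
print(f"chi-square = {chi2:.2f}, p = {p_value:.5f}")
```

A small p-value here agrees with the 95% CI for the OR excluding the null value of 1.0.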

52 Comparing groups: Case-control (8)
You can compute an OR for a cohort. Why would you do so?
–OR’s are the key outcome measure for logistic regression, one of the most common analysis methods used in epidemiology
–Unless disease is common, the OR and the RR from the cohort will be very similar. But, where possible, rate ratios are preferred.
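The claim that OR and RR agree unless disease is common can be checked on the two cohorts used in this lecture. The sketch below is illustrative only, and the function name `rr_and_or` is made up for this example. It compares the hypothetical cohort (risks of 10% and 1%) with the pollutant cohort (risks of about 34% and 12%).

```python
def rr_and_or(a, b, c, d):
    """Return (risk ratio, odds ratio) for a 2x2 cohort table.

    a, b = exposed with / without outcome
    c, d = unexposed with / without outcome
    """
    rr = (a / (a + b)) / (c / (c + d))
    odds_ratio = (a * d) / (b * c)
    return rr, odds_ratio


# Lower-risk cohort: 1,000/10,000 exposed vs 100/10,000 unexposed
rr_low, or_low = rr_and_or(1000, 9000, 100, 9900)

# Higher-risk cohort (pollutant data): 42/122 vs 43/345
rr_high, or_high = rr_and_or(42, 80, 43, 302)

print(f"lower risk:  RR = {rr_low:.2f}, OR = {or_low:.2f}")
print(f"higher risk: RR = {rr_high:.2f}, OR = {or_high:.2f}")
```

In both cohorts the OR lies further from 1.0 than the RR, and the gap widens as the outcome becomes more common.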

53 Summary: comparisons
Cohort studies
–Relative risk
–Relative rate
–Risk/rate differences
Case-control study
–Odds ratio