# Analytic Epidemiology


Analytic Epidemiology
Unit 4: Analytic Epidemiology

Unit 4 Learning Objectives:
1. Understand hypothesis formulation in epidemiologic studies.
2. Understand and calculate measures of effect (risk difference, risk ratio, rate ratio, odds ratio) used to evaluate epidemiologic hypotheses.
3. Understand statistical parameters used to evaluate epidemiologic hypotheses and results:
   --- P-values
   --- Confidence intervals
   --- Type I and Type II error
   --- Power

Unit 4 Learning Objectives (cont.):
4. Recognize the primary study designs used to evaluate epidemiologic hypotheses:
   --- Randomized trial
   --- Prospective & retrospective cohort studies
   --- Case-control study
   --- Case-crossover study
   --- Cross-sectional study

Assigned Readings: Textbook (Gordis): Chapter 11 Rothman: Random error and the role of statistics. In Epidemiology: an Introduction, Chapter 6, pages

Analytic Epidemiology
Study of the DETERMINANTS of health-related events

Hypothesis Formulation
Scientific Method (not unique to epi) --- Formulate a hypothesis --- Test the hypothesis

Basic Strategy of Analytical Epi
1. Identify variables you are interested in:
   • Exposure
   • Outcome
2. Formulate a hypothesis
3. Compare the experience of two groups of subjects with respect to the exposure and outcome

Basic Strategy of Analytical Epi
Note: Assembling the study groups to compare, whether on the basis of exposure or disease status, is one of the most important elements of study design. Ideally, we would like to know what would have happened to exposed individuals had they not been exposed, but this is “counterfactual” since, by definition, such individuals were exposed.

Hypothesis Formulation
The “Biostatistician’s” way
H0: “Null” hypothesis (assumed)
H1: “Alternative” hypothesis
The “Epidemiologist’s” way
Direct risk estimate (e.g. best estimate of risk of disease associated with the exposure).

Hypothesis Formulation
Biostatistician:
H0: There is no association between the exposure and disease of interest
H1: There is an association between the exposure and disease of interest (beyond what might be expected from random error alone)

Hypothesis Formulation
Epidemiologist: What is the best estimate of the risk of disease in those who are exposed compared to those who are unexposed (i.e. exposed are at XX times higher risk of disease)? This moves away from the simple yes-or-no dichotomy for an exposure/disease association and toward the estimated magnitude of effect, irrespective of whether it differs from the null hypothesis.

Hypothesis Formulation
“Association” Statistical dependence between two variables: • Exposure (risk factor, protective factor, predictor variable, treatment) • Outcome (disease, event)

Hypothesis Formulation
“Association” The degree to which the rate of disease in persons with a specific exposure is either higher or lower than the rate of disease among those without that exposure.

Hypothesis Formulation
Ways to Express Hypotheses: 1. Suggest possible events… The incidence of tuberculosis will increase in the next decade.

Hypothesis Formulation
Ways to Express Hypotheses: 2. Suggest relationship between specific exposure and health-related event… A high cholesterol intake is associated with the development (risk) of coronary heart disease.

Hypothesis Formulation
Ways to Express Hypotheses: 3. Suggest cause-effect relationship…. Cigarette smoking is a cause of lung cancer

Hypothesis Formulation
Ways to Express Hypotheses: 4. “One-sided” vs. “Two-sided”
One-sided example: Helicobacter pylori infection is associated with increased risk of stomach ulcer
Two-sided example: Weight-lifting is associated with risk of lower back injury

Hypothesis Formulation
Guidelines for Developing Hypotheses:
• State the exposure to be measured as specifically as possible.
• State the health outcome as specifically as possible.
• Strive to explain the smallest amount of ignorance.

Hypothesis Formulation
Example Hypotheses:
POOR: Eating junk food is associated with the development of cancer.
GOOD: The human papilloma virus (HPV) subtype 16 is associated with the development of cervical cancer.

“Measures of Effect” Used to evaluate the research hypotheses
Reflects the disease experience of groups of persons with and without the exposure of interest Often referred to as a “point estimate” (best estimate of exposure/disease relationship between the two groups)

“Measures of Effect”
• Risk Difference (RD)
• Relative Risk (RR)
   --- Risk Ratio (RR)
   --- Rate Ratio (RR)
• Odds Ratio (OR)

“Measures of Effect” • Risk Difference (RD)
The absolute difference in the incidence (risk) of disease between the exposed group and the non-exposed (“reference”) group

“Risk Difference”
Hypothesis: Asbestos exposure is associated with mesothelioma
Results:
• Of 100 persons with high asbestos exposure, 14 develop mesothelioma over 10 years
• Of 200 persons with low/no asbestos exposure, 12 develop mesothelioma over 10 years

|    | D+ | D-  | Total |
|----|----|-----|-------|
| E+ | 14 | 86  | 100   |
| E- | 12 | 188 | 200   |

$$RD = I_{E+} - I_{E-} = \frac{14}{100} - \frac{12}{200} = 0.14 - 0.06 = 0.08$$

The absolute 10-year risk of mesothelioma is 8 percentage points higher in persons with high asbestos exposure compared to persons with low or no exposure to asbestos.
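The risk difference calculation can be sketched in a few lines of Python. This is only an illustration using the asbestos counts from the slide; the function name `risk_difference` and its parameter names are assumptions introduced here, not part of the original deck.

```python
def risk_difference(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Absolute difference in cumulative incidence: I(E+) - I(E-)."""
    risk_exposed = exposed_cases / exposed_total        # 14 / 100
    risk_unexposed = unexposed_cases / unexposed_total  # 12 / 200
    return risk_exposed - risk_unexposed


# Counts from the asbestos / mesothelioma example
rd = risk_difference(14, 100, 12, 200)
print(f"RD = {rd:.2f}")
```

The result is on the absolute scale, so it is read as a difference in risk over the 10-year follow-up, not as a ratio.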

“Measures of Effect”
• Relative Risk (RR)
   --- Risk Ratio
   --- Rate Ratio
Compares the incidence of disease (risk) among the exposed with the incidence of disease (risk) among the non-exposed (“reference”) by means of a ratio. The reference group assumes a value of 1.0 (the “null” value).

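The ‘null’ value (1.0)
The relative risk equals 1.0 whenever the incidence is the same in the exposed and non-exposed groups, regardless of how high or low that incidence is:
--- CIexposed = CInon-exposed = 0.0026: RR = 1.0
--- CIexposed = CInon-exposed = 0.49: RR = 1.0
--- IRexposed = IRnon-exposed (per 100K): RR = 1.0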

The ‘null’ value (1.0)
• If the relative risk estimate is > 1.0, the exposure appears to be a risk factor for disease.
• If the relative risk estimate is < 1.0, the exposure appears to be protective against disease occurrence.

“Risk Ratio”
Hypothesis: Being subject to physical abuse in childhood is associated with lifetime risk of attempted suicide
Results:
• Of 2,240 children not subject to physical abuse, 16 have attempted suicide.
• Of 840 children subjected to physical abuse, 10 have attempted suicide.

|       | E+  | E-    |
|-------|-----|-------|
| D+    | 10  | 16    |
| D-    | 830 | 2,224 |
| Total | 840 | 2,240 |

Note that the row and column headings have been arbitrarily switched from the prior example.

$$RR = \frac{I_{E+}}{I_{E-}} = \frac{10 / 840}{16 / 2{,}240} = \frac{0.0119}{0.0071} = 1.68$$

“Risk Ratio” RR = IE+ / IE- = 1.68
Children with a history of physical abuse are approximately 1.7 times more likely to attempt suicide in their lifetime compared to children without a history of physical abuse. The risk of lifetime attempted suicide is approximately 70% higher in children with a history of physical abuse compared to children without a history of physical abuse.
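A minimal Python sketch of the same risk ratio calculation follows. The function name `risk_ratio` and the variable names are assumptions made for illustration. Computed without rounding the two risks first, the ratio comes out at about 1.67; the 1.68 on the slide reflects dividing the rounded risks 0.0119 and 0.0071. The interpretation (roughly 1.7 times the risk) is unchanged.

```python
def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Ratio of cumulative incidences: I(E+) / I(E-)."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed


# Counts from the childhood abuse / attempted suicide example
rr = risk_ratio(10, 840, 16, 2240)
excess_percent = (rr - 1) * 100  # relative excess risk, in percent

print(f"RR = {rr:.2f}")
print(f"Risk is about {excess_percent:.0f}% higher in the exposed group")
```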

“Rate Ratio”
Hypothesis: Average daily fiber intake is associated with risk of colon cancer
Results:
• Of 112 adults with high fiber intake followed for 840 person-years, 9 developed colon cancer.
• Of 130 adults with moderate fiber intake followed for 900 person-years, 14 developed colon cancer.
• Of 55 adults with low fiber intake followed for 450 person-years, 12 developed colon cancer.

• Assume that high fiber intake is the reference group (value of 1.0)
• Compare the incidence rate (IR) of colon cancer:
   --- Moderate fiber intake versus high fiber intake
   --- Low fiber intake versus high fiber intake

| Expos. | D+ | D-  | PY  | IR     | RR   |
|--------|----|-----|-----|--------|------|
| High   | 9  | --- | 840 | 0.0107 | 1.0  |
| Mod    | 14 | --- | 900 | 0.0156 | 1.46 |
| Low    | 12 | --- | 450 | 0.0267 | 2.50 |

“Rate Ratio” RR = Imoderate / Ihigh = 1.46 RR = Ilow / Ihigh = 2.50
Persons with moderate fiber intake are at 1.46 times higher risk of developing colon cancer than persons with high fiber intake. Persons with low fiber intake are at 2.50 times higher risk of developing colon cancer than persons with high fiber intake.
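The rate ratio table can be reproduced with a short Python sketch. The dictionary `cohorts` and the helper `incidence_rate` are illustrative names, not part of the original slides. Note that carrying full precision gives ratios of about 1.45 and 2.49; the slide values of 1.46 and 2.50 come from dividing incidence rates already rounded to four decimals.

```python
# (cases, person-years) for each fiber-intake group, taken from the slide
cohorts = {
    "high": (9, 840),       # reference group
    "moderate": (14, 900),
    "low": (12, 450),
}


def incidence_rate(cases, person_years):
    """Cases per person-year of follow-up."""
    return cases / person_years


reference_rate = incidence_rate(*cohorts["high"])

# Rate ratio of each group relative to the high-fiber reference group
rate_ratios = {
    level: incidence_rate(*counts) / reference_rate
    for level, counts in cohorts.items()
}

for level, ratio in rate_ratios.items():
    rate = incidence_rate(*cohorts[level])
    print(f"{level:>8}: IR = {rate:.4f} per person-year, RR = {ratio:.2f}")
```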

“Measures of Effect” • Odds Ratio (OR)
Compares the odds of exposure among those with disease to the odds of exposure among those without the disease. Does not compare the incidence of disease between groups.

“Odds Ratio”
Hypothesis: Eating chili peppers is associated with development of gastric cancer.
Cases: 12 ate chili peppers, 9 did not eat chili peppers
Controls: 88 ate chili peppers, 391 did not eat chili peppers

|       | D+     | D-      |
|-------|--------|---------|
| E+    | 12 (a) | 88 (b)  |
| E-    | 9 (c)  | 391 (d) |
| Total | 21     | 479     |

$$OR = \frac{a / c}{b / d} = \frac{12 / 9}{88 / 391} = \frac{1.333}{0.225} = 5.92$$

$$OR = \frac{ad}{bc}$$

“Odds Ratio” OR = 5.92 • The odds of being exposed to chili peppers are 5.92 times higher for gastric cancer cases as compared to controls • (Interpreting OR as RR – if appropriate) The incidence (or risk) of gastric cancer is 5.92 times higher for persons who eat chili peppers as compared with persons who do not eat chili peppers (Is this appropriate?)
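As a rough check on the arithmetic, the odds ratio can be computed both ways in Python: as the ratio of exposure odds in cases versus controls, and as the cross-product (ad)/(bc). The function names below are assumptions introduced for this sketch; the cell labels a, b, c, d follow the slide's table.

```python
def odds_ratio_exposure_odds(a, b, c, d):
    """Odds of exposure among cases (a/c) divided by odds among controls (b/d)."""
    return (a / c) / (b / d)


def odds_ratio_cross_product(a, b, c, d):
    """Algebraically equivalent cross-product form: (a*d) / (b*c)."""
    return (a * d) / (b * c)


# a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls
a, b, c, d = 12, 88, 9, 391

or_from_odds = odds_ratio_exposure_odds(a, b, c, d)
or_from_cross = odds_ratio_cross_product(a, b, c, d)

print(f"OR (exposure odds) = {or_from_odds:.2f}")
print(f"OR (cross-product) = {or_from_cross:.2f}")
```

Both forms give the same estimate, which is why the cross-product is the usual shortcut.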

Odds Ratio & Risk Ratio Relationship between RR and OR:
The odds ratio will provide a good estimate of the risk ratio when: 1. The outcome (disease) is rare OR 2. The effect size is small or modest

Odds Ratio & Risk Ratio
The odds ratio will provide a good estimate of the risk ratio when:
1. The outcome (disease) is rare

|    | D+ | D- |
|----|----|----|
| E+ | a  | b  |
| E- | c  | d  |

$$RR = \frac{a / (a + b)}{c / (c + d)} \qquad OR = \frac{a / c}{b / d} = \frac{ad}{bc}$$

If the disease is rare, then cells (a) and (c) will be small relative to (b) and (d), so that a + b ≈ b and c + d ≈ d:

$$RR = \frac{a / (a + b)}{c / (c + d)} \approx \frac{a / b}{c / d} = \frac{ad}{bc} = OR$$
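The rare-disease approximation can also be checked numerically. The sketch below is only an illustration: it assumes a hypothetical true risk ratio of 2.0 and a handful of made-up baseline risks (none of these numbers come from the slides), and shows how far the odds ratio drifts from the risk ratio as the outcome becomes more common.

```python
def odds_ratio_from_risks(risk_exposed, risk_unexposed):
    """Odds ratio implied by two risks, where odds = risk / (1 - risk)."""
    odds_exposed = risk_exposed / (1 - risk_exposed)
    odds_unexposed = risk_unexposed / (1 - risk_unexposed)
    return odds_exposed / odds_unexposed


TRUE_RR = 2.0                            # hypothetical, fixed risk ratio
baseline_risks = [0.001, 0.01, 0.1, 0.3]  # hypothetical risks in the unexposed

# Odds ratio at each baseline risk, holding the risk ratio at 2.0
implied_odds_ratios = {
    p0: odds_ratio_from_risks(TRUE_RR * p0, p0) for p0 in baseline_risks
}

for p0, implied_or in implied_odds_ratios.items():
    print(f"baseline risk {p0:>5}: RR = {TRUE_RR:.2f}, OR = {implied_or:.3f}")
```

With a baseline risk of 0.001 the odds ratio is nearly indistinguishable from 2.0, while at a baseline risk of 0.3 it reaches 3.5.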

Odds Ratio & Risk Ratio
The odds ratio will provide a good estimate of the risk ratio when:
2. The effect size is small or modest.

|    | D+  | D-  |
|----|-----|-----|
| E+ | 40  | 60  |
| E- | 120 | 180 |

$$OR = \frac{40 / 120}{60 / 180} = \frac{0.333}{0.333} = 1.0$$

$$RR = \frac{40 / (40 + 60)}{120 / (120 + 180)} = \frac{0.40}{0.40} = 1.0$$

Odds Ratio & Risk Ratio
Finally, we expect the risk ratio to be closer to the null value of 1.0 than the odds ratio. Therefore, be especially cautious when interpreting the odds ratio as a measure of relative risk when the outcome is not rare and the effect size is large.

|    | D+ | D- |
|----|----|----|
| E+ | 20 | 30 |
| E- | 10 | 90 |

$$OR = \frac{20 / 10}{30 / 90} = \frac{2.0}{0.333} = 6.0$$

$$RR = \frac{20 / 50}{10 / 100} = \frac{0.40}{0.10} = 4.0$$
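The two common-outcome tables from these slides (the null-effect table and the large-effect table) can be run through both formulas side by side. This is a sketch with assumed helper names (`risk_ratio_2x2`, `odds_ratio_2x2`) and an assumed `tables` dictionary; the cell counts themselves are the ones shown on the slides.

```python
def risk_ratio_2x2(a, b, c, d):
    """RR = [a / (a + b)] / [c / (c + d)], rows E+/E-, columns D+/D-."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio_2x2(a, b, c, d):
    """OR = (a * d) / (b * c)."""
    return (a * d) / (b * c)


# (a, b, c, d) = (E+D+, E+D-, E-D+, E-D-)
tables = {
    "no effect, common outcome": (40, 60, 120, 180),
    "large effect, common outcome": (20, 30, 10, 90),
}

comparison = {
    label: (risk_ratio_2x2(*cells), odds_ratio_2x2(*cells))
    for label, cells in tables.items()
}

for label, (rr, odds_ratio) in comparison.items():
    print(f"{label}: RR = {rr:.1f}, OR = {odds_ratio:.1f}")
```

When there is no effect the two measures coincide at 1.0 even though the outcome is common; with a large effect the odds ratio (6.0) sits further from the null than the risk ratio (4.0).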