Study of the causes of disease Siti Setiati

Presentation transcript:

Etiologic research: Study of the causes of disease. Siti Setiati

Etiologic research The research question: Is there a relation between a determinant (risk factor) and a disease-outcome? Research question for causal relation! In etiologic research the investigator studies a causal relation between the 'determinant' and the 'disease'. The aim is therefore to find determinants of, or risk factors for, a certain disease or disorder. As stated, this lecture is about a non-experimental or natural exposure as determinant.

Major Types of Clinical Epidemiologic Research (Jakarta, March 2007)
Type of research | Question (descriptive/causal) | Aim
Diagnostic research | Descriptive | Predict the probability of presence of target disease from clinical and non-clinical profile
Prognostic research | Descriptive | Predict the course of disease from clinical and non-clinical profile
Etiologic research | Causal | Causally explain occurrence of target disease from determinant
Intervention research | Causal & Descriptive | Causally explain the course of disease as influenced by treatment; predict the course of disease given treatment (options) and clinical and non-clinical profile

Etiologic research Characteristics To demonstrate causality (cause-effect) Cause comes before effect Exposure or determinant occurs before the disease-outcome occurs Determinant-outcome relation is not explained by other factors Explanatory research versus descriptive research

Hill's Criteria Temporal relationship, where the cause precedes the outcome Strong association Dose-response relationship Biological plausibility (has biological and theoretical reason, consistent with existing biological and medical knowledge)

Etiologic research What study design? Experimental Exposure or determinant assigned by investigator versus Observational Exposure or determinant not assigned by investigator This lecture: observational research

Etiologic research What study design? Design of two observational studies to show the relation of cause and effect: Cohort study Case-control study

Cohort study Also called follow-up study Definition Study in which persons, based on their exposure or a determinant and free of the disease outcome at the start of the study, are followed in time to assess the occurrence of the disease outcome.

Cohort study [Diagram: at the start of the study, a cohort without the outcome is divided into determinant + and determinant -; both groups are followed over time and classified as disease + or disease -.]

Framingham Heart Study 1948 – Framingham, MA 5200 persons 30-62 years old Aim: identification of risk factors for cardiovascular diseases Remeasured every 2 years Example of a research question: Is hypertension a risk factor for MI?

Framingham Heart Study [Diagram: cohort without myocardial infarction in 1948, divided into hypertension + and hypertension -; each group is followed over time to 1998 and classified as MI + or MI -.]

Cohort study: determinant-outcome relation
                  MI +   MI -
hypertension +     a      b
hypertension -     c      d
Incidence+ = a/(a+b) = probability of MI for hypertension +
Incidence- = c/(c+d) = probability of MI for hypertension -
relative risk = Incidence+ / Incidence-
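A minimal sketch of the relative risk calculation from the 2x2 cohort table above. The function name and the counts are hypothetical and chosen only for illustration; they are not Framingham data.

```python
def relative_risk(a, b, c, d):
    """Relative risk from a 2x2 cohort table.

    a, b: exposed persons with / without the outcome
    c, d: unexposed persons with / without the outcome
    """
    incidence_exposed = a / (a + b)      # Incidence+
    incidence_unexposed = c / (c + d)    # Incidence-
    return incidence_exposed / incidence_unexposed


# Hypothetical counts (assumption, not from the lecture):
# 30 of 100 hypertensive and 10 of 100 normotensive persons develop MI.
rr = relative_risk(30, 70, 10, 90)
print(f"relative risk = {rr:.2f}")
```

A relative risk above 1 means the outcome is more frequent among the exposed; a value of 1 means no association.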

Cohort study How do you get a cohort?

Cohort study How do you get a cohort? Geographical (Framingham Heart Study) Birth cohort (British 1946 birth cohort) Dynamic cohort (Leidsche Rijn) Occupational cohort (Whitehall study)

Cohort study How do you follow the cohort? How do you find the disease-outcome?

Cohort study How do you follow the cohort? How do you find the disease-outcome? After a certain time interval, send out a questionnaire or invite for interview or medical examination Record disease outcomes via medical files or registrations

Cohort study summary determinant disease-outcome

Prospective Cohort [Diagram: start here with subjects free of outcome at t0, classified by exposure (+/-); follow to t1 and record the outcome (+/-) in each exposure group.]

Historical Cohort [Diagram: start here with subjects free of outcome at t0, classified by exposure (+/-); follow to t1 and record the outcome (+/-) in each exposure group.]

Case-control study Also called patient-control study Definition Study in which patients with the disease-outcome and a control group without the disease-outcome are selected and in which it is determined how many people in both groups have been exposed to the determinant

Case-control study [Diagram: at the start of the study, patients (disease +) and controls (disease -) are selected; looking back in time, each group is classified as determinant + or determinant -.]

Creutzfeldt-Jakob’s Disease

Creutzfeldt-Jakob’s Disease Fast, progressive form of dementia In the 90s a new variant of Creutzfeldt-Jakob was discovered in Europe after an epidemic of mad-cow disease Caused by eating beef? What research question? Why case control?

Creutzfeldt-Jakob's Disease [Diagram: at the start of the study, patients with CJD and controls from hospital are selected; looking back in time, each group is classified as beef + or beef -.]

Case-control study: determinant-outcome relation
          CJD +   CJD -
beef +      a       b
beef -      c       d
a/c = odds of beef+ in cases
b/d = odds of beef+ in controls
Odds Ratio = (a/c) / (b/d) = (a x d) / (b x c)

Case-control study How do you find cases/patients? How do you select a control group?

Case-control study How do you find patients? GP; hospital; cancer registration How to select a control group? GP; hospital; general population Patients and controls have to come from the same ‘source’ population.

Case-control study How do you assess exposure to determinant?

Case-control study How do you assess exposure to determinant? Interview with participant Interview with proxy Medical file

Case-control study summary determinant disease-outcome

Case Control [Diagram: start here with the outcome, selecting cases (+) and controls (-) from the population; then determine exposure (+/-) in each group.]

Study Design: Direction of inquiry [Diagram: timeline around TODAY showing the direction of inquiry for survey / cross-sectional, cohort, case-control and historical cohort designs.]

Measures of association: Cohort approach Research question? Is smoking associated with lung cancer? Cohort approach divide the cohort in smokers and non-smokers estimate the incidence density (or CI) in each group prior: ID smokers > ID not smokers

Measures of association: Cohort approach
                   Disease yes   Disease no   Person-years
Determinant yes        a             -           PY1
Determinant no         c             -           PY0
RR = (a/PY1) / (c/PY0)

Measures of association: Cohort approach. Smoking and lung cancer
                   Disease yes   Disease no   Person-years
Determinant yes       440            -          22,008 py
Determinant no        212            -          21,235 py
RR = (440/22,008) / (212/21,235) = 2.0

Measures of association Risk difference (RD) between exposed and non-exposed: reflects public health impact. RD = CI exposed - CI nonexposed, or RD = ID exposed - ID nonexposed. Risk difference, smoking and lung cancer: RD = 20/1000 py - 10/1000 py = 10/1000 py
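The rate ratio and the risk difference for the smoking and lung cancer example can be checked with a few lines of Python. This is only a sketch: the counts and person-years are the ones on the slides (440 cases in 22,008 py among smokers, 212 cases in 21,235 py among non-smokers), and the function name is an illustrative assumption.

```python
def incidence_density(cases, person_years, per=1000):
    """Incidence density expressed per `per` person-years."""
    return cases / person_years * per


# Counts taken from the smoking and lung cancer slide.
id_smokers = incidence_density(440, 22008)       # about 20 per 1000 py
id_nonsmokers = incidence_density(212, 21235)    # about 10 per 1000 py

rate_ratio = id_smokers / id_nonsmokers          # relative measure
rate_difference = id_smokers - id_nonsmokers     # absolute measure

print(f"ID smokers      = {id_smokers:.1f} per 1000 py")
print(f"ID non-smokers  = {id_nonsmokers:.1f} per 1000 py")
print(f"rate ratio      = {rate_ratio:.1f}")
print(f"rate difference = {rate_difference:.1f} per 1000 py")
```

The ratio describes the strength of the association, while the difference describes the excess number of cases per 1000 person-years among the exposed.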

Measures of association: Case Control approach Research question: Does smoking increase the risk of lung cancer ? Patient control study select cases and controls Estimate the frequency of smoking among cases and controls prior: % smokers among cases > % smokers among controls

Measures of association: Case Control approach
                   Disease yes   Disease no
Determinant yes        a             b
Determinant no         c             d
RR? Odds ratio = (a/c) / (b/d) = ad / bc

Measures of association: Case Control approach. Smoking and lung cancer (controls = 10% random sample from the cohort)
                   Disease yes   Disease no   Total
Determinant yes       440           300        740
Determinant no        212           350        562
Odds ratio = (440/212) / (300/350) = 2.42
RR = (440/740) / (212/562) = 1.58 (shouldn't be calculated)
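As a sketch of why the odds ratio is reported instead of a relative risk, the snippet below recomputes both from the case-control table above. The function name is an illustrative assumption; the counts are the ones on the slide.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 case-control table.

    a, b: exposed cases / exposed controls
    c, d: unexposed cases / unexposed controls
    """
    return (a * d) / (b * c)


# Counts from the slide: smokers 440 cases, 300 controls;
# non-smokers 212 cases, 350 controls.
or_estimate = odds_ratio(440, 300, 212, 350)

# Treating the row totals as if they were cohorts gives a "relative risk"
# that depends on how many controls the investigator chose to sample,
# which is why it should not be calculated in a case-control study.
naive_rr = (440 / (440 + 300)) / (212 / (212 + 350))

print(f"odds ratio = {or_estimate:.2f}")
print(f"naive RR   = {naive_rr:.2f} (not interpretable)")
```

Doubling the number of controls would change the naive RR but leave the odds ratio essentially unchanged, since the control sampling fraction cancels out of ad/bc.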

Cohort study Advantages and disadvantages What are the advantages of a cohort study? What are the disadvantages of a cohort study?

Cohort study Advantages Cause is measured before effect Not very sensitive to selection- and information bias Appropriate for rare determinant Can study several outcomes Disadvantages Selective withdrawal / loss to follow-up Expensive and time consuming Not appropriate for rare outcome

Case-control study Advantages and disadvantages What are the advantages of a case-control study? What are the disadvantages of a case-control study?

Case-control study Advantages Efficient and relatively cheap Appropriate for rare outcome Can study several determinants Disadvantages Cause is measured after effect Very sensitive to selection and information bias Not appropriate to study several outcomes

Exercise 1 Ad question 3 Cohort research Randomized trial Disadvantage: in both cases you need many women

Validity and bias Validity: absence of systematic errors in design, conduct or data-analysis of the research Bias: degree of distortion of the determinant-outcome relation caused by systematic errors; leads to reduced validity 3 types of bias in etiologic research: selection bias, information bias, confounding

Bias Any systematic process in the conduct of a study that results in an incorrect estimate of a measure of disease occurrence or a measure of association

Precision (or accuracy) The absence of Random error Depends on Standardisation of measurements Numbers Number of persons Number of (repeated) observations / measurements

Confounding definition Determinant - disease outcome relation is disturbed by the effect of another factor (the confounder) ("mixing of effects") Can you think of an example? A confounder distorts the relation between determinant and outcome. This is only possible if the confounder is associated with the determinant and if the confounder is associated with the outcome independently of the determinant under study (and is not in the causal chain). (Possibly add a slide showing this triangular relation schematically?) Confounding can be corrected for in a study (beforehand: restriction; afterwards: stratification or adjustment; confounders have to be known and measured).

Confounding [Diagram: Exposure, Outcome, Third variable] To be a confounding factor, two conditions must be met: Be associated with exposure - without being the consequence of exposure Be associated with outcome - independently of exposure (not an intermediary)

Confounding example Children with a higher birth order more often have Down’s syndrome What could be a confounder?

Confounding [Diagram: determinant (birth order), disease outcome (Down syndrome), confounder (age of mother)] Confounder is determinant of the disease outcome Confounder is associated with the determinant Confounder is no factor in the causal chain Confounder is: 1. predictive for outcome, independently of determinant 2. associated with determinant 3. not in the causal chain

Confounding Think of another example of confounding [Diagram: determinant, disease outcome, confounder] Confounder is: 1. predictive for outcome, independently of determinant 2. associated with determinant 3. not in the causal chain

Confounding Coffee CHD Smoking Smoking is correlated with coffee drinking and a risk factor even for those who do not drink coffee

Confounding Birth Order Down Syndrome Maternal Age Maternal age is correlated with birth order and a risk factor even if birth order is low

Confounding Obesity Mastitis Age In cows, older ones are heavier and older age increases the risk for mastitis. This association may appear as an obesity association
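The confounding examples above can be made concrete with a small numerical sketch. All counts below are hypothetical and constructed for illustration only: within each level of the confounder the exposure has no effect, yet the crude relative risk suggests a strong association because the confounder is unevenly distributed over the exposure groups.

```python
def risk(cases, total):
    return cases / total


# Hypothetical strata (assumption): (cases, total) for exposed and unexposed.
strata = {
    "confounder absent":  {"exposed": (2, 200),  "unexposed": (8, 800)},
    "confounder present": {"exposed": (80, 800), "unexposed": (20, 200)},
}

# Stratum-specific relative risks: exposure effect within each stratum.
stratum_rr = {}
for name, groups in strata.items():
    rr = risk(*groups["exposed"]) / risk(*groups["unexposed"])
    stratum_rr[name] = rr
    print(f"{name}: RR = {rr:.2f}")

# Crude relative risk: ignore the confounder and pool the strata.
exposed_cases = sum(g["exposed"][0] for g in strata.values())
exposed_total = sum(g["exposed"][1] for g in strata.values())
unexposed_cases = sum(g["unexposed"][0] for g in strata.values())
unexposed_total = sum(g["unexposed"][1] for g in strata.values())
crude_rr = risk(exposed_cases, exposed_total) / risk(unexposed_cases, unexposed_total)
print(f"crude RR = {crude_rr:.2f}")
```

Here the confounder meets the conditions on the slides: it is more common among the exposed and it raises the risk of the outcome regardless of exposure. Stratifying removes the apparent association.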

Information bias definition Distortion of the determinant-outcome relation caused by systematic errors in the measurement of the determinant and/or outcome. Who knows an example?

Information bias examples Misclassification of determinant Self reporting more accurate for cases than controls (or the other way around) Misclassification of outcome Disease better diagnosed in people with determinant In what cases can this play a role? Can this also play a role in cohort research?

Selection bias definition Distortion of the determinant-outcome relation caused by systematic errors in the selection of study participants (cases and/or controls) Bias that is caused when individuals have different probabilities of being included in the study according to relevant study characteristics Determinant-outcome relation is different for those that do and do not participate

Selection bias example 1 Oral contraception and probability of DVT Patients: women with DVT admitted to hospital. Controls: healthy women between 25-45 years old. Patients turned out to use oral contraception more often, so oral contraception would appear to be the cause of DVT. How could selection bias play a role here?

Selection bias example 1 Medical circuit: 'oral contraception could lead to DVT'. Women with DVT complaints who use oral contraception will be referred more often than those who do not use oral contraception. Because of this selective referral, oral contraception users have a higher probability of entering the study as a case, and the effect of oral contraception on DVT will be overestimated

Selection bias example 2 Patients from hospital – control group from hospital: In the hospital co-morbidity and unhealthy lifestyles occur more often than in the population Relation between smoking and cancer can be underestimated due to over-representation of controls who smoke

Selection bias example 3 Mortality rates among people who work are often lower than in the general population, because people who work are healthier than people who do not work (“healthy worker effect”).

How to prevent bias? Confounding – cannot be prevented Measure and adjust in data analysis Information bias - prevent during design Disease status blind for determinant status Medical files instead of self-reporting Same way of reporting for cases and controls Selection bias - prevent during design Control selection independent of determinant status Good definition of source population

Controlling for Information Bias - Blinding: prevents investigators and interviewers from knowing case/control or exposed/non-exposed status of a given participant - Form of survey: mail may impose less "white coat tension" than a phone or face-to-face interview - Questionnaire: use multiple questions that ask for the same information; this acts as a built-in double-check - Accuracy: multiple checks in medical records; gathering diagnosis data from multiple sources

THANK YOU

Confounding vs Interaction Confounding: An extraneous or nuisance pathway that an investigator hopes to prevent or rule out. Interaction: A more detailed description of the relationship between the exposure and disease; a richer description of the biologic or behavioral system under study; a finding to be reported, not a bias to be eliminated. Recall that it was while trying to prevent potential confounding of an association between an exposure and a given disease by some third variable, by performing stratification, that we discovered the concept of interaction. There is sometimes a lot of confusion about the differences between confounding and interaction, and there really need not be. How do confounding and interaction differ? Confounding is an extraneous pathway that we want to rule out or avoid when looking at the direct association between our exposure in question and the disease under study. Interaction, however, when present, is a more detailed description of the biological or behavioral system under study. It is not extraneous but rather a richer description of the system. When present, it is not a bias we are seeking to eliminate but rather a new finding we should report.

Information / Measurement / Misclassification Bias Reporting bias: Individuals with severe disease tend to have complete records, and therefore more complete information about exposures, so a greater association is found Individuals who are aware of being participants in a study behave differently (Hawthorne effect)

Exercise 1 Ad question 1 Determinant: 3rd-generation pill compared to 2nd-generation pill Outcome: first case of deep vein thrombosis Domain: women in the fertile age, who have not yet had deep vein thrombosis

Exercise 1 Ad question 2 Case-control study: etiologic question about rare disorder (side effect)

Exercise 2 Ad question 1 Determinant: consumption soya (products) Outcome: (new cases of) breast cancer Domain: women (at risk for breast cancer)  

Exercise 2 Ad question 2 Absolute risk is 427/34,759 = 0.0123 (1.23%), or expressed as incidence density: ID = 427 / 488,989 person-years = 87.3 per 100,000 person-years. For high tofu consumption: 52 / 52,695 person-years = 98.7 per 100,000 person-years
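The arithmetic in this answer can be reproduced as follows. This is a sketch that uses only the numbers given in the answer (427 cases among 34,759 women over 488,989 person-years; 52 cases in 52,695 person-years for high tofu consumption); the variable names are illustrative.

```python
# Numbers taken from the answer to Exercise 2, question 2.
cases_total = 427
women_total = 34759
person_years_total = 488989

cases_high_tofu = 52
person_years_high_tofu = 52695

# Absolute risk (cumulative incidence) over the whole follow-up.
absolute_risk = cases_total / women_total

# Incidence density per 100,000 person-years.
id_total = cases_total / person_years_total * 100_000
id_high_tofu = cases_high_tofu / person_years_high_tofu * 100_000

print(f"absolute risk  = {absolute_risk:.4f} ({absolute_risk:.2%})")
print(f"ID (all women) = {id_total:.1f} per 100,000 py")
print(f"ID (high tofu) = {id_high_tofu:.1f} per 100,000 py")
```

The absolute risk depends on the length of follow-up, whereas the incidence density accounts for the person-time actually observed.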

Exercise 2 Ad question 3 Etiologic research question Age is possible confounder: Tofu eaters are older and higher age gives higher probability of breast cancer

Exercise 2 Ad question 4 cohort study In a P-C study patients with breast cancer are compared with a sample from the domain. For both groups information about tofu consumption is collected, e.g. via a questionnaire or interview Problem: information (recall) bias

Effect modification Definition: The association between exposure and disease differs across strata of the population Example: Tetracycline discolours teeth in children, but not in adults Example: Measles vaccine protects in children > 15 months, but not in children < 15 months Rare occurrence

Exercise 2

Selection bias example 2 Population-based versus hospital-based research In which way could selection bias play a role?

Selection Bias Examples (www)

Selection Bias Examples Selective survival (Neyman's) bias (www)

Selection Bias Examples Case-control study: Controls have less potential for exposure than cases Outcome = brain tumour; exposure = overhead high voltage power lines Cases chosen from province wide cancer registry Controls chosen from rural areas Systematic differences between cases and controls

Case-Control Studies: Potential Bias Schulz & Grimes, 2002 (www) (PDF)

Selection Bias Examples Cohort study: Differential loss to follow-up Especially problematic in cohort studies Subjects in follow-up study of multiple sclerosis may differentially drop out due to disease severity Differential attrition leads to selection bias

Selection Bias Examples Self-selection bias: - You want to determine the prevalence of HIV infection - You ask for volunteers for testing - You find no HIV - Is it correct to conclude that there is no HIV in this location?

Selection Bias Examples Healthy worker effect: Another form of self-selection bias “self-screening” process – people who are unhealthy “screen” themselves out of active worker population Example: - Course of recovery from low back injuries in 25-45 year olds - Data captured on worker’s compensation records - But prior to identifying subjects for study, self-selection has already taken place

Information / Measurement / Misclassification Bias Method of gathering information is inappropriate and yields systematic errors in measurement of exposures or outcomes If misclassification of exposure (or disease) is unrelated to disease (or exposure) then the misclassification is non-differential If misclassification of exposure (or disease) is related to disease (or exposure) then the misclassification is differential Distorts the true strength of association

Information / Measurement / Misclassification Bias Recall bias: Those exposed have a greater sensitivity for recalling exposure (reduced specificity) - specifically important in case-control studies - when exposure history is obtained retrospectively cases may more closely scrutinize their past history looking for ways to explain their illness - controls, not feeling a burden of disease, may less closely examine their past history Those who develop a cold are more likely to identify the exposure than those who do not – differential misclassification - Case: Yes, I was sneezed on - Control: No, can’t remember any sneezing

Confounding ? Maternal Age Down Syndrome Birth Order Birth order is correlated with maternal age but not a risk factor in younger mothers

Effect of randomisation on outcome of trials in acute pain Bandolier Bias Guide (www)

Confounding If each case is matched with a same-age control, there will be no association (OR for old age = 2.6, P = 0.0001) (www)

No Confounding (www)

Confounding or Effect Modification Birth Weight Leukaemia Sex Can sex be responsible for the birth weight association in leukaemia? - Is it correlated with birth weight? - Is it correlated with leukaemia independently of birth weight? - Is it on the causal pathway? - Can it be associated with leukaemia even if birth weight is low? - Is sex distribution uneven in comparison groups?

Confounding or Effect Modification [Diagram: birth weight, leukaemia, sex] Overall: birth weight and leukaemia, OR = 1.5. Does the birth weight association differ in strength according to sex? Boys: OR = 1.8. Girls: OR = 0.9.

Effect modification is similar to interaction in statistics. In an association study, if the strength of the association varies over different categories of a third variable, this is called effect modification. The third variable is changing the effect of the exposure. The effect modifier may be sex, age, an environmental exposure or a genetic effect. There is no adjustment for effect modification. Once it is detected, stratified analysis can be used to obtain stratum-specific odds ratios.

Effect modifier: Belongs to nature; Different effects in different strata; Simple; Useful; Increases knowledge of biological mechanism; Allows targeting of public health action
Confounding factor: Belongs to study; Adjusted OR/RR different from crude OR/RR; Distortion of effect; Creates confusion in data; Prevent (design); Control (analysis)

Modification-1 Present when the measure of association between a given determinant and outcome is not constant across levels of a subject characteristic. Descriptive modification may easily occur due to differences in prevalence of the disease across populations or population subgroups. The presence or absence of modification has a bearing on the domain and the generalizability of research findings. Modifiers point to subdomains, which implies that generalizing results from a study should be different for populations with or without the (particular level of the) modifier

Modification-2 In etiologic research, analysis of modifiers may help the investigator to understand the complexity of multicausality and causally explain why a particular disease may be more common in certain individuals despite an apparent similar exposure to determinant

Statistical Interaction Definition: when the magnitude of a measure of association (between exposure and disease) meaningfully differs according to the value of some third variable. Synonyms: effect modification, effect-measure modification, heterogeneity of effect. Proper terminology, e.g. smoking, caffeine use, and delayed conception: Caffeine use modifies the effect of smoking on the risk for delayed conception. There is interaction between caffeine use and smoking in the risk for delayed conception. Caffeine is an effect modifier in the relationship between smoking and delayed conception. The example using smoking, caffeine use, and delayed conception illustrates statistical interaction, which is what we call the situation when a particular measure of association (between an exposure and disease; for example a risk ratio) meaningfully differs according to the level of some third variable. Synonyms for statistical interaction include effect modification, effect-measure modification and heterogeneity of effect. You will hear interaction and effect modification most commonly. What's the proper usage in this situation? For example, we would say: . . .

[Figure: two panels plotting risk of disease (log scale) against exposure status, with one line per level of a third variable. Top panel: RR = 3.0 and RR = 3.0. Bottom panel: RR = 11.2 and RR = 3.0.] Our text, like many others, uses a graphical approach to depict interaction. Let's look at the top graph. Risk of disease (on a log scale, which, remember, is a multiplicative scale) is shown on the y axis; exposure status (exposed vs unexposed) is on the x axis. Let's look at the line with the red symbols first; it is for persons who have a third variable present, say something you are evaluating as a potential confounder. In the presence of the third variable, the risk of disease in the unexposed group is 0.05 and it goes up threefold to 0.15 in the exposed group. When the third variable is absent (the black squares), risk in the unexposed is 0.15, which goes up to 0.45 in the exposed group, again a threefold increase. In other words, the risk ratio does not change according to the third variable. The lines are parallel; this means that there is no statistical interaction in terms of the risk ratio. In the bottom panel, you can see that the risk ratio of disease does change according to the level of the third variable. With the third variable present, the risk ratio is 3.0. When the third variable is absent, the risk ratio is 11.2. Non-parallel lines equal statistical interaction. Does this make sense to everyone?

[Figure: RR = 2.5 with the third variable absent; RR = 0.72 with the third variable present.] What's going on here? In the presence of the third variable, exposed persons appear to be protected relative to unexposed, but in the absence of the third variable, exposed persons are at over twofold increased risk. This is what we see in the smoking, caffeine use, and delayed conception example. The effects in the two levels of the third variable are on opposite sides of 1.0. This is what we call qualitative interaction; in other words, the interaction is huge!

Interaction is likely everywhere Susceptibility to infectious diseases, e.g. exposure: sexual activity; disease: HIV infection; effect modifier: chemokine receptor phenotype. Susceptibility to non-infectious diseases, e.g. exposure: smoking; disease: lung cancer; effect modifier: genetic susceptibility to smoke. Susceptibility to drugs (efficacy and side effects); effect modifier: genetic susceptibility to drug. But in practice to date, difficult to document. Genomics may change this. If you think about it for a moment, I think you will agree that interaction is likely everywhere. As an example from infectious diseases, if the exposure is sexual activity and the outcome is HIV infection, we know that certain persons are more apt to become infected than others. One such effect modifier that has been discovered is the presence of a particular chemokine receptor phenotype. From non-infectious diseases, we have the example of smoking and lung cancer. Although not well worked out, we can imagine that there are host genetic factors that modify the effect of smoke and make some persons much more susceptible to the harmful effects of smoke. How about the effectiveness of drugs? We all suspect there is substantial heterogeneity in how people respond, both in terms of therapeutic efficacy and toxicity, and that this is likely due to various genetically coded susceptibilities. These are just beginning to be described. However, although we all believe that interaction is likely everywhere around us, it has to date been relatively difficult in practice to find and document these factors. This is one hope of the genomics revolution: that we will be able to find these different host susceptibility factors.

Additive vs Multiplicative Interaction Assessment of whether interaction is present depends upon the measure of association ratio measure (multiplicative interaction) or difference measure (additive interaction) Hence, the term effect-measure modification Absence of multiplicative interaction typically implies presence of additive interaction Additive interaction present Multiplicative interaction absent RR = 3.0 RD = 0.3 RR = 3.0 RD = 0.1 So, when talking about interaction, we have to be precise about whether we are talking about interaction of ratio measures (i.e., multiplicative interaction), interaction of difference measures (i.e., additive interaction), or both. That’s why some like to call this effect-measure modification, because whether or not interaction is occurring depends upon the measure of association in question. Let’s go through a few scenarios. Absence of multiplicative interaction typically implies presence of additive interaction. As you can see here, although there is no interaction for the ratio of risks, there is interaction in the risk differences. When the third variable is present, the risk difference is 0.1, but when the third variable is absent the risk difference is 0.3. What this illustrates is that when we talk about interaction we really have to tie it to the measure of association.
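The scenario on this slide can be checked with a few lines of Python. This is only a sketch: the stratum risks are not shown on the slide, so they are backed out here from the stated values (RR = 3.0 with RD = 0.3 implies risks of 0.45 and 0.15; RR = 3.0 with RD = 0.1 implies 0.15 and 0.05), and the function names are made up for illustration. Real data would call for a statistical test of heterogeneity rather than a simple equality check.

```python
import math


def stratum_measures(risk_exposed, risk_unexposed):
    """Return (risk ratio, risk difference) for one stratum."""
    return risk_exposed / risk_unexposed, risk_exposed - risk_unexposed


def interaction_by_scale(stratum_a, stratum_b, tol=1e-9):
    """Say, for each scale, whether the effect differs between two strata.

    Each stratum is a (risk_exposed, risk_unexposed) pair. Comparing the
    point estimates directly is an illustrative simplification.
    """
    rr_a, rd_a = stratum_measures(*stratum_a)
    rr_b, rd_b = stratum_measures(*stratum_b)
    return {
        "multiplicative": not math.isclose(rr_a, rr_b, abs_tol=tol),
        "additive": not math.isclose(rd_a, rd_b, abs_tol=tol),
    }


# Risks implied by the slide's RR and RD values (derived, not shown on the slide).
result = interaction_by_scale((0.45, 0.15), (0.15, 0.05))
print(result)
```

The same risks give the same ratio in both strata but different differences, which is exactly the point of the slide: whether interaction is present depends on the scale.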

Additive vs Multiplicative Interaction Absence of additive interaction typically implies presence of multiplicative interaction Multiplicative interaction present Additive interaction absent RR = 1.7 RD = 0.1 RR = 3.0 RD = 0.1 Absence of additive interaction (when an effect is present) typically implies presence of multiplicative interaction. Here, the risk difference is 0.1 in both strata of the third variable but the risk ratio differs between strata - multiplicative interaction is present.

Additive vs Multiplicative Interaction Presence of multiplicative interaction may or may not be accompanied by additive interaction RR = 2.0 RD = 0.1 No additive interaction RR = 3.0 RD = 0.1 RR = 3.0 RD = 0.4 Additive interaction present The presence of multiplicative interaction may or may not be accompanied by additive interaction. In the top panel, we see that despite the presence of multiplicative interaction, the risk difference is 0.1 in both strata of the third variable - i.e., no additive interaction. In the bottom panel, there is again multiplicative interaction, but this time the risk difference is 0.1 in one stratum and 0.4 in the other - i.e., additive interaction is present. RR = 2.0 RD = 0.1

Additive vs Multiplicative Interaction Presence of additive interaction may or may not be accompanied by multiplicative interaction RR = 3.0 RD = 0.4 Multiplicative interaction present RR = 2.0 RD = 0.1 RR = 3.0 RD = 0.2 Multiplicative interaction absent Likewise, the presence of additive interaction may or may not be accompanied by multiplicative interaction. In the top panel, we see additive interaction and multiplicative interaction. In the bottom panel, we see additive interaction but no multiplicative interaction. RR = 3.0 RD = 0.1

Additive vs Multiplicative Interaction Presence of qualitative multiplicative interaction is always accompanied by qualitative additive interaction Multiplicative and additive interaction both present One thing that you can count on for sure is that the presence of qualitative multiplicative interaction is always accompanied by qualitative additive interaction. We saw this in our example of smoking, caffeine use, and delayed conception.
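A short derivation shows why this must hold. The symbols are not from the slide and are introduced here only for illustration: R_{1k} and R_{0k} stand for the risk in exposed and unexposed persons in stratum k of the third variable, and the baseline risk R_{0k} is assumed to be greater than zero.

```latex
\begin{align*}
RR_k &= \frac{R_{1k}}{R_{0k}}, \qquad
RD_k = R_{1k} - R_{0k} = R_{0k}\,(RR_k - 1) \\
RR_k > 1 &\iff RD_k > 0, \qquad
RR_k < 1 \iff RD_k < 0
\end{align*}
```

So if the risk ratios in two strata fall on opposite sides of 1.0, the risk differences in those strata must fall on opposite sides of 0.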

Additive vs Multiplicative Scales Additive measures (e.g., risk difference): readily translated into impact of an exposure (or intervention) in terms of number of outcomes prevented e.g. 1/risk difference = no. needed to treat to prevent (or avert) one case of disease or no. of exposed persons one needs to take the exposure away from to avert one case of disease gives “public health impact” of the exposure Multiplicative measures (e.g., risk ratio) favored measure when looking for causal association (etiologic research) So, hopefully these past few slides point out the importance of paying attention to which measure of association you are dealing with. For example, there’s no need to look for additive interaction if the relative risk is the right measure of association for you. Knowing which measure of association to care about gets us back to the material covered by Dennis Osmond earlier in the course. Additive measures (like the risk difference) are most readily translated into the impact an exposure (or intervention) has in terms of the actual number of cases of disease. For example, as I think you all know, 1/risk difference is the number of exposed persons in whom you would have to eliminate exposure in order to avert one case of disease. Or, when the exposure is a drug, 1/risk difference is the number of persons you would have to treat to avert one case of disease. In clinical trials, this is known as the number needed to treat. The background incidence of disease in the unexposed group is an important component of this. This gives you the public health impact of the exposure. Multiplicative measures (like the risk ratio) are the favored measures when looking for causal relationships, in other words, etiologic research. Here you are simply asking whether a given exposure causes a given disease. These relative measures are favored when studying causality because they don’t depend upon the background incidence of a disease in unexposed persons.

Additive vs Multiplicative Scales Causally related but minor public health importance - Risk ratio = 2 Risk difference = 0.0001 - 0.00005 = 0.00005 Need to eliminate exposure in 20,000 persons to avert one case of disease Causally related and major public health importance RR = 2 RD = 0.2 - 0.1 = 0.1 Need to eliminate exposure in 10 persons to avert one case of disease How about some examples? In the upper panel, we are working with a disease that is very rare, but nonetheless the exposure in question is associated with a twofold risk of disease. While this may be causally related, the risk difference between exposed and unexposed is very small, just 0.00005. That means you have to eliminate exposure in 20,000 persons just to avert one case of disease. A simple way to see this: if you took 100,000 exposed persons and then took away their exposure, you would end up with 5 cases of disease (the background) and 5 cases of disease averted. In other words, taking away exposure from 100,000 persons to avert 5 cases translates into taking away exposure in 20,000 persons to avert 1 case. Contrast that with the lower panel. Here, the disease is more common in the unexposed, but the relative risk of the exposure is still 2, just like above. In this case, the risk difference is 0.1, which translates into only needing to eliminate exposure in 10 persons to avert one case of disease.
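The arithmetic for the two panels can be written out as a small Python sketch. The risks are the ones on the slide; the function names are invented for illustration, and the calculation assumes that removing the exposure brings a person's risk down to that of the unexposed (that is, that the association is causal).

```python
def risk_difference(risk_exposed, risk_unexposed):
    """Excess risk attributable to the exposure."""
    return risk_exposed - risk_unexposed


def persons_per_case_averted(risk_exposed, risk_unexposed):
    """1 / risk difference: persons who must lose the exposure to avert one case."""
    rd = risk_difference(risk_exposed, risk_unexposed)
    if rd <= 0:
        raise ValueError("exposure does not raise risk, so no cases can be averted")
    return 1.0 / rd


def cases_averted(n_exposed, risk_exposed, risk_unexposed):
    """Expected cases averted if exposure is removed from n_exposed persons."""
    return n_exposed * risk_difference(risk_exposed, risk_unexposed)


panels = [("upper panel", 0.0001, 0.00005), ("lower panel", 0.2, 0.1)]
for label, r1, r0 in panels:
    print(
        label,
        "| RR =", round(r1 / r0, 2),
        "| persons per case averted =", round(persons_per_case_averted(r1, r0)),
    )

# The 100,000-person shortcut from the notes, applied to the upper panel.
print("cases averted per 100,000:", round(cases_averted(100000, 0.0001, 0.00005)))
```

Both panels have the same risk ratio, yet the number of persons per case averted differs by a factor of 2,000, which is driven entirely by the background risk.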

Smoking, Family History and Cancer: Additive vs Multiplicative Interaction Crude Family History Present Stratified Family History Absent Risk ratio (family history) = 2.0 RD (family history) = 0.20 Risk ratio (no family history) = 2.0 RD (no family history) = 0.05 No multiplicative interaction but presence of additive interaction If etiology is the goal, risk ratios may be sufficient If goal is to define sub-groups of persons to target: Rather than ignoring, it is worth reporting that only 5 persons with a family history have to be prevented from smoking to avert one case of cancer Let’s look at this hypothetical example of a cohort study. Smoking is the exposure, cancer (say, one type of cancer, or a variety of cancers) is the outcome, and family history of cancer is the third variable in question. We see that there is no multiplicative interaction: the risk ratios are the same in both strata. However, there is apparently additive interaction. The risk difference is 0.2 among those with a family history and 0.05 among those without a family history. If your goal was simply to assess whether smoking was a risk factor, you would probably go with the risk ratio of 2 and not bother to report the additive interaction to your readers. After all, it is much easier to report just one number instead of two, particularly when your study may have many different risk factors. But say you already had a pretty good sense that smoking was a risk factor, and now your goal is to see where you can have the most impact in terms of getting persons to stop smoking. If your goal is to identify subgroups of persons to target with an intervention (say, a smoking cessation intervention), then you have actually found something interesting. The impact of an intervention would differ depending upon the third variable, family history.
In the stratum with a family history present, you just need to eliminate smoking in 5 persons to avert one case of cancer. In the family history absent stratum, you need to eliminate smoking in 20 persons to avert one case of cancer. Hence, the most efficient group to intervene upon is those with a family history, and it is well worth reporting the presence of interaction based upon family history. This is the mathematical basis for choosing high-risk groups for such interventions.
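As a rough sketch of this targeting logic in Python: the slide gives only the risk ratio and risk difference in each stratum, so the underlying risks are recovered here algebraically (unexposed risk = RD / (RR - 1)), and the strata are then ranked by how many persons must stop smoking to avert one case. The function and variable names are hypothetical.

```python
def risks_from_rr_rd(rr, rd):
    """Recover (risk_exposed, risk_unexposed) from a risk ratio and risk difference.

    Follows from RD = R0 * (RR - 1); undefined when RR equals 1.
    """
    if rr == 1:
        raise ValueError("risks cannot be recovered when the risk ratio is 1")
    risk_unexposed = rd / (rr - 1.0)
    return risk_unexposed * rr, risk_unexposed


def rank_strata(strata):
    """Return rows (name, risk_exposed, risk_unexposed, persons_per_case),
    sorted so the most efficient stratum to target comes first."""
    rows = []
    for name, (rr, rd) in strata.items():
        r1, r0 = risks_from_rr_rd(rr, rd)
        rows.append((name, r1, r0, round(1.0 / rd)))
    return sorted(rows, key=lambda row: row[3])


# (risk ratio, risk difference) per stratum, as stated on the slide.
strata = {
    "family history absent": (2.0, 0.05),
    "family history present": (2.0, 0.20),
}

rows = rank_strata(strata)
for name, r1, r0, persons in rows:
    print(f"{name}: risks {r1:.2f} vs {r0:.2f}, stop smoking in {persons} to avert 1 case")
```

With a single pooled risk ratio of 2 these two strata would look identical; ranking on the additive scale is what separates them.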

Controlling confounding In the design Restriction of the study Matching In the analysis Restriction of the analysis Stratification Multivariable methods

Confounding Imagine you have repeated a positive finding of birth order association in Down syndrome or association of coffee drinking with CHD in another sample. Would you be able to replicate it? If not why? Imagine you have included only non-smokers in a study and examined association of alcohol with lung cancer. Would you find an association? Imagine you have stratified your dataset for smoking status in the alcohol - lung cancer association study. Would the odds ratios differ in the two strata? Imagine you have tried to adjust your alcohol association for smoking status (in a statistical model). Would you see an association?

Confounding Imagine you have repeated a positive finding of birth order association in Down syndrome or association of coffee drinking with CHD in another sample. Would you be able to replicate it? If not why? You would not necessarily be able to replicate the original finding, because it was a spurious association due to confounding (by maternal age in the first example and by smoking in the second). In another sample where all mothers are under 30 years of age, there would be no association with birth order. In another sample in which there are few smokers, the coffee association with CHD would not be replicated.

Confounding Imagine you have included only non-smokers in a study and examined association of alcohol with lung cancer. Would you find an association? No, because the original association was confounded. The association with alcohol was actually due to smoking. By restricting the study to non-smokers, we remove the influence of smoking and find the truth. Restriction is one way of preventing confounding at the time of study design.

Confounding Imagine you have stratified your dataset for smoking status in the alcohol - lung cancer association study. Would the odds ratios differ in the two strata? The alcohol association would yield similar odds ratios in both strata, and both would be close to unity. In confounding, the stratum-specific odds ratios should be similar to each other and different from the crude odds ratio by at least 15%. Stratification is one way of identifying confounding at the time of analysis. If the stratum-specific odds ratios differ from each other, then this is not confounding but effect modification.
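Here is a minimal Python sketch of that stratified check. The counts are invented purely to mimic the alcohol, smoking and lung cancer scenario and are not from any study. The pooled estimate uses the Mantel-Haenszel odds ratio, which is one standard way to combine strata but is an assumption here, since the slide does not name a pooling method. The 15% threshold is the one quoted above.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for one 2x2 table:
    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    Tables with a zero in b or c are not handled in this sketch."""
    return (a * d) / (b * c)


def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio across strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


def collapse(tables):
    """Add the strata together cell by cell to get the crude 2x2 table."""
    return tuple(sum(cells) for cells in zip(*tables))


# HYPOTHETICAL counts: (drinker cases, drinker controls,
#                       non-drinker cases, non-drinker controls)
hypothetical_strata = {
    "smokers": (60, 30, 20, 10),
    "non-smokers": (5, 20, 15, 60),
}

tables = list(hypothetical_strata.values())
stratum_ors = {name: odds_ratio(*t) for name, t in hypothetical_strata.items()}
crude_or = odds_ratio(*collapse(tables))
adjusted_or = mantel_haenszel_or(tables)
relative_change = abs(crude_or - adjusted_or) / adjusted_or

print("stratum-specific ORs:", stratum_ors)
print("crude OR:", round(crude_or, 2), "| pooled OR:", round(adjusted_or, 2))
print("confounding suspected:", relative_change >= 0.15)
```

In these made-up data the stratum-specific odds ratios agree with each other and sit at unity while the crude odds ratio does not, which is the pattern described for confounding. Had the two stratum-specific values differed from each other, pooling them would not be appropriate and the finding would be effect modification instead.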

Confounding Imagine you have tried to adjust your alcohol association for smoking status (in a statistical model). Would you see an association? If smoking is included in the statistical model, the alcohol association would lose its statistical significance. Adjustment by multivariable modelling is another method to identify confounders at the time of data analysis.

Confounding For confounding to occur, the confounders should be differentially represented in the comparison groups. Randomisation is an attempt to evenly distribute potential (unknown) confounders in study groups. It does not guarantee control of confounding. Matching is another way of achieving the same. It ensures equal representation of subjects with known confounders in study groups. It has to be coupled with matched analysis. Restriction for potential confounders in design also prevents confounding but causes loss of statistical power (instead stratified analysis may be tried).

Confounding Randomisation, matching and restriction can be tried at the time of designing a study to reduce the risk of confounding. At the time of analysis: Stratification and multivariable (adjusted) analysis can achieve the same. It is preferable to try something at the time of designing the study.