Introduction to Risk Factors & Measures of Effect Meg McCarron, CDC.

Presentation transcript:


Introduction to Risk Analysis

What is a risk analysis?
- The analysis of an association between a variable (e.g. underlying condition) and an outcome (e.g. death)
Why do risk analysis?
- The probability of an outcome often depends on the interplay between a variety of factors
- Follow up on suggested associations observed in descriptive analysis (e.g. the elderly appear to die more frequently than healthy young adults; a risk analysis can tell you whether that observation holds up)
- Determine the severity of risk
- Identify significant risk factors
Using this type of analysis we can measure the risk ratio (RR) and the odds ratio (OR)

What is a risk factor?
- A risk factor is a factor that is associated with an increased chance of getting a disease.
- In epidemiological terms: a risk factor is a variable (determinant) associated with an increased risk of disease or infection (outcome).
- Example: obesity (determinant/exposure) is associated with an increased risk of heart attack (outcome)
- When we measure risk factors we assess
  - Strength
  - Direction
  - Shape

Risk factors in SARI surveillance
- Information about a number of potential risk factors and outcomes is often recorded, e.g.
  - Outcomes: death, influenza status
  - Risk factors: age, co-morbid conditions
- Surveillance data can be analyzed to increase the understanding of the association of risk factors with severe outcomes
- Surveillance data describing exposures allow analysis of associations without expensive in-depth studies

Is a risk factor the cause of a disease?
- Risk factors are correlational and not necessarily causal
- Correlation does not imply causation
- The statistical methods used do not consider the direction of effects
- For an effect to be causal, the exposure must have occurred before the outcome
- e.g. young age does not cause measles (Morbillivirus causes measles), but young people are at greater risk because they are less likely to have developed immunity through previous exposure or vaccination

The Correlation-Causation Problem
- Somalia has many pirates, but low carbon emissions

How are risk factors/disease determinants identified?
- Individual-level data
- Two key variables
  - Outcome: e.g. influenza
  - Exposure: e.g. vaccination
- Should consider multiple risk factors
- Epidemiological study designs used to identify risk factors
  - Case-control
  - Cohort
  - Surveillance data may approximate a cohort study
- Biological plausibility
  - e.g. age and influenza infection
- Exposure (risk factor) must occur prior to outcome (disease)

Types of variables
- Continuous
  - e.g. age
- Categorical variables
  - Binary
    - e.g. gender, vaccination status
  - Ordinal
    - e.g. age group, socioeconomic status (SES)
  - Nominal
    - e.g. geographic region
- Count
  - e.g. number of ILI symptoms


Cohort study
- Follow people over time
- Collect data on their exposures (risks)
- Monitor their outcomes
- Compare risk of disease among exposed versus unexposed
[Figure: participants 1 to 4 followed over time; D marks participants who develop disease]

Example: cohort study
- e.g. risk of death among SARI admissions
- Outcome: death
- Risk factors: age, underlying conditions, influenza-positive
- Source population: all patients admitted with SARI, followed until death or discharge

Case-control study
- Cases: people with disease
  - Deliberately over-selected
- Controls: people without disease
  - Represent the exposure distribution of the source population
- Find out their exposure status
- Compare the odds of exposure among diseased and non-diseased
[Figure: participants 1 to 6 on a time axis; E marks exposure, D marks disease]

Example: case-control study
- Risk of influenza among vaccinated patients
- Cases: people with influenza
- Controls: people without influenza
- Outcome: influenza status
- Risk factors: vaccination status, age, underlying comorbidity

Statistical significance: is the association due to chance alone?
- A statistical test is used to assess whether an association may be due to chance alone (random error)
- In statistics, a result is called statistically significant if it is unlikely to have occurred by chance alone, according to a pre-determined threshold probability, the significance level (e.g. α = 0.05)

Common statistical tests
- Categorical data:
  - Chi-square (χ²) test
  - Fisher's exact test
  - McNemar's test
- Continuous data:
  - T-test
  - Wilcoxon rank-sum test
  - ANOVA
- These tests can tell whether there is a difference between groups but do not convey the size or direction of effects

Common measures of association / effect
- Measure the size of an association (effect)
- Compare some measure of disease in exposed versus unexposed
- Absolute difference: Y1 - Y2
  - Risk difference
- Relative difference (ratio): Y1 / Y2
  - Odds ratio
  - Risk ratio
  - Incidence rate ratio
  - Hazard ratio (survival data)
- Attributable risk
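As a rough illustration of the absolute and relative measures listed above, the sketch below computes a risk difference and a risk ratio. The counts are borrowed from the influenza mortality table used later in the deck; treating them as cohort-style data is an assumption made here for illustration, and the helper name `risk` is hypothetical.

```python
def risk(events, total):
    """Proportion of a group that experienced the outcome."""
    return events / total

# Counts taken from the deck's influenza mortality table.
# Assumption: the table is read as a cohort (exposed = Flu+, unexposed = Flu-).
exposed_events, exposed_total = 200, 200 + 150      # Flu+: 200 died, 150 alive
unexposed_events, unexposed_total = 50, 50 + 100    # Flu-: 50 died, 100 alive

risk_exposed = risk(exposed_events, exposed_total)        # Y1
risk_unexposed = risk(unexposed_events, unexposed_total)  # Y2

risk_difference = risk_exposed - risk_unexposed   # absolute difference, Y1 - Y2
risk_ratio = risk_exposed / risk_unexposed        # relative difference, Y1 / Y2

print(f"Risk difference: {risk_difference:.2f}")
print(f"Risk ratio: {risk_ratio:.2f}")
```

The difference answers "how many extra cases per person exposed", while the ratio answers "how many times higher is the risk"; the same data can look modest on one scale and large on the other.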

Odds ratios
- Most common measure of association used in epidemiology
- Binary outcome
- Odds ratio (OR): compares the odds of exposure among cases (people with disease) with the odds of exposure among controls (people without disease)
- Odds: ratio of the probability (p) of an event occurring versus it not occurring
  - Odds = p / (1 - p)

Calculation of the RR & OR

|           | Cases | Controls |
|-----------|-------|----------|
| Exposed   | a     | b        |
| Unexposed | c     | d        |

OR = (a/c) / (b/d)
- OR = 1: no association
- OR < 1: negative association (reduces risk)
- OR > 1: positive association (increases risk)

Example of OR Calculations
Outcome: influenza patients that died

Calculation of the RR & OR

|      | Died    | Alive   |
|------|---------|---------|
| Flu+ | 200 (a) | 150 (b) |
| Flu- | 50 (c)  | 100 (d) |

OR = (a/c) / (b/d) = (a*d) / (b*c)
OR = (200/50) / (150/100) = 2.7

Calculation of the RR & OR

|        | Died    | Alive   |
|--------|---------|---------|
| Female | 200 (a) | 180 (b) |
| Male   | 98 (c)  | 100 (d) |

OR = (200*100) / (180*98) = 1.1
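The two worked examples above can be reproduced with a few lines of code. This is a minimal sketch; the function name `odds_ratio` and the zero-cell check are illustrative choices, not part of the original slides.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table laid out as on the slides:

                 outcome+   outcome-
    exposed         a          b
    unexposed       c          d
    """
    if b == 0 or c == 0:
        # (a/c) / (b/d) is undefined when b or c is zero.
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)   # algebraically equal to (a/c) / (b/d)

flu_or = odds_ratio(200, 150, 50, 100)   # Flu+ vs Flu-, died vs alive
sex_or = odds_ratio(200, 180, 98, 100)   # female vs male, died vs alive

print(round(flu_or, 1), round(sex_or, 1))   # prints: 2.7 1.1
```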

Confidence intervals
- OR is a point estimate
- Confidence interval (CI) is a measure of uncertainty around your point estimate
- CI is based on the standard error (SE)
  - Smaller SE = narrower confidence interval
- If CI includes 1, then not statistically significant
  - A wide CI is also a problem
- Usually use 95% CI

|           | Cases | Controls |
|-----------|-------|----------|
| Exposed   | a     | b        |
| Unexposed | c     | d        |

$\mathrm{SE}(\ln \mathrm{OR}) = \sqrt{1/a + 1/b + 1/c + 1/d}$
$95\%\ \mathrm{CI} = e^{\ln(\mathrm{OR}) \pm 1.96 \times \mathrm{SE}}$

OR = 1.1 (95% CI = 1.01, 1.4)

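Confidence intervals
e.g. Victorian surveillance data, adults, influenza B

|              | Flu+    | Flu-    |
|--------------|---------|---------|
| Vaccinated   | 44 (a)  | 95 (b)  |
| Unvaccinated | 205 (c) | 260 (d) |

OR = (44/205) / (95/260) = 0.587 ≈ 0.59
$\ln(\mathrm{OR}) = \ln(0.587) = -0.532$
$\mathrm{SE} = \sqrt{1/44 + 1/95 + 1/205 + 1/260} = 0.205$
$95\%\ \mathrm{CI} = e^{-0.532 \pm 1.96 \times 0.205}$
Lower limit (LL) $= e^{-0.93} = 0.39$
Upper limit (UL) $= e^{-0.13} = 0.88$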
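The calculation for the Victorian surveillance table can be checked with the sketch below. It applies the SE and CI formulas from the previous slide on the log scale; the function name `odds_ratio_ci` and the default multiplier `z=1.96` are illustrative assumptions.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Return (OR, lower, upper) using the log-scale interval from the slides."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # standard error of ln(OR)
    lower = math.exp(math.log(or_) - z * se)
    upper = math.exp(math.log(or_) + z * se)
    return or_, lower, upper

# Vaccinated/unvaccinated by Flu+/Flu- counts from the slide.
or_, lower, upper = odds_ratio_ci(44, 95, 205, 260)
print(f"OR = {or_:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
# prints: OR = 0.59, 95% CI = (0.39, 0.88)
```

Because the interval (0.39, 0.88) lies entirely below 1, the association between vaccination and influenza B would be read as statistically significant at the 5% level.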

Interpreting Results
- Size of the CI is an indicator of uncertainty
  - Wide CI = more uncertainty
  - Narrow CI = less uncertainty
- If CI includes 1, then not statistically significant
  - The observed effect could just be due to chance
- P-values are often used to convey statistical significance
  - The p-value for an OR is calculated from a chi-squared test
  - The p-value reference for a 95% CI is 5% or 0.05
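To make the link between an OR and a chi-squared p-value concrete, here is a minimal sketch using the vaccination table from the confidence-interval example (44, 95, 205, 260). It assumes the Pearson chi-squared statistic without continuity correction, which the slides do not specify, and uses the identity that for 1 degree of freedom the upper-tail probability equals erfc(sqrt(chi2 / 2)). The function name is hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-squared test for a 2x2 table (no continuity correction).

    Returns (chi2, p_value); the p-value uses 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 df, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

chi2, p_value = chi_square_2x2(44, 95, 205, 260)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
print("statistically significant at 0.05" if p_value < 0.05 else "not significant at 0.05")
```

The p-value falls below 0.05, which agrees with the 95% CI for the same table excluding 1.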

P-values
- The p-value helps us to determine whether the difference between the two groups might be due to random variation
- CI and p-values
  - 95% CI = 1.0, 2.3 indicates that the two-sided p-value for no association is about 0.05
  - 95% CI = 0.9, 2.4 suggests p > 0.05
  - 95% CI = 0.9, 2.4 indicates that the data are compatible with a two-fold higher risk (i.e. the interval includes 2)
- The p-value is a measure of the compatibility of the data and the null hypothesis
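The correspondence between a 95% CI and a two-sided p-value can be sketched numerically. The approach below is an assumption layered on the slide, not something it states: it treats the interval as symmetric on the log scale, recovers the point estimate as the geometric mean of the limits, backs out the SE from the interval width, and converts the resulting z statistic with the normal distribution. The function name `p_from_ratio_ci` is hypothetical.

```python
import math

def p_from_ratio_ci(lower, upper, z_crit=1.96):
    """Approximate two-sided p-value for 'ratio = 1' from a 95% CI.

    Assumes the CI was built as exp(ln(estimate) +/- z_crit * SE).
    """
    log_estimate = (math.log(lower) + math.log(upper)) / 2   # midpoint on log scale
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)  # SE of ln(estimate)
    z = log_estimate / se
    return math.erfc(abs(z) / math.sqrt(2))   # two-sided normal tail probability

p_touching_one = p_from_ratio_ci(1.0, 2.3)    # lower limit sits exactly on 1
p_including_one = p_from_ratio_ci(0.9, 2.4)   # interval spans 1

print(f"{p_touching_one:.3f}")    # about 0.05
print(f"{p_including_one:.3f}")   # larger than 0.05
```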

Implementation of a statistical test
- We start with a research hypothesis
- State the relevant null hypothesis (H0)
  - No effect (any observed effect is due to chance)
- State the alternative hypothesis (HA)
  - An effect exists
- Decide which test is appropriate (see earlier list)
- Compute the test statistic and the associated p-value (probability)
- Compare the computed p-value to a reference value (usually 0.05) to reject or fail to reject the null hypothesis
- If the p-value of the test is lower than the reference value, H0 is rejected
  - The effect is not likely to be due to chance

Example: Implementation of a statistical test
- Influenza prevalence in hospitalized patients:
  - Non-pregnant women: 100/1000 = 10%
  - Pregnant women: 30/200 = 15%
- Question:
  - Is the influenza prevalence in hospitalized pregnant women different from that in non-pregnant women?
- Hypothesis
  - H0: p1 = p2; p1 - p2 = 0
  - HA: p1 ≠ p2; p1 - p2 ≠ 0
  - Reject H0 if p (test) is < α = 0.05
- Test results (Z test):
  - p-value: 0.037
  - 0.037 < 0.05 → Reject H0
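The example above can be worked through in code. This sketch assumes a two-proportion z-test with a pooled standard error, which is a common choice but is not confirmed by the slide; the function name is hypothetical. Under that assumption the p-value comes out close to, though not necessarily identical to, the 0.037 reported on the slide.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for H0: p1 = p2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                       # estimate of p under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))           # two-sided normal tail
    return z, p_value

# Pregnant women: 30/200; non-pregnant women: 100/1000 (from the slide).
z, p_value = two_proportion_z_test(30, 200, 100, 1000)
alpha = 0.05

print(f"z = {z:.2f}, p = {p_value:.3f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```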

Example: factors associated with influenza-positive diagnosis among ILI patients
[Table: OR, p-value and 95% CI (lower limit, upper limit) for vaccinated, underlying condition, epi week and age group (<20 as reference); cell values not preserved]
Crude OR = 0.59 (95% CI = 0.39, 0.88)
Adjusted OR = 0.54 (95% CI = 0.32, 0.89)

Summary
- A risk factor is a variable associated with an increased (or decreased) risk of an outcome
- We can assess the influence of risk factors using individual-level data from case-control and cohort studies
- The size of the effect can be measured by effect measures
  - The most common effect measure is the odds ratio
- The uncertainty of the effect can be measured by the confidence interval
- Whether an effect could be due to random error is assessed with a statistical test and summarised by the p-value
- Multivariable methods can tell us how much influence one risk factor has compared with others