March 20121 Back to Basics, 2012 POPULATION HEALTH (1): Epidemiology Methods, Critical Appraisal, Biostatistical Methods N. Birkett, MD Epidemiology & Community Medicine Other resources available on Individual & Population Health web site
March 20122 THE PLAN (1) Session 1 (March 23, 9:00-12:00) –Diagnostic tests Sensitivity, specificity, validity, PPV –Critical Appraisal –Intro to Biostatistics –Brief overview of epidemiological research methods
March 20123 THE PLAN (2) Aim to spend about 2-2.5 hours on lectures –Review MCQs in remaining time A 10 minute break about half-way through You can interrupt for questions, etc. if things aren’t clear. –Goal is to help you, not to cover a fixed curriculum.
March 20124 INVESTIGATIONS (1) 78.2 –Determine the reliability and predictive value of common investigations –Applicable to both screening and diagnostic tests.
March 20125 Reliability = reproducibility. Does it produce the same result every time? Related to chance error Averages out in the long run, but in patient care you hope to do a test only once; therefore, you need a reliable test
March 20126 Validity Whether it measures what it purports to measure in the long run, viz., presence or absence of disease Normally use criterion validity, comparing test results to a gold standard Link to SIM web on validity
March 20127 Reliability and Validity: the metaphor of target shooting. Here, reliability is represented by consistency, and validity by aim. [Figure: grid of targets, reliability (low/high) by validity (low/high).]
March 20128 Test Properties (1)
              Diseased   Not diseased   Total
Test +ve         90            5           95
Test -ve         10           95          105
Total           100          100          200
Test +ve row: true positives (90), false positives (5). Test -ve row: false negatives (10), true negatives (95).
March 20129 Test Properties (2)
              Diseased   Not diseased   Total
Test +ve         90            5           95
Test -ve         10           95          105
Total           100          100          200
Sensitivity = 90/100 = 0.90    Specificity = 95/100 = 0.95
March 201210 2x2 Table for Testing a Test
                 Gold standard: Disease
                 Present      Absent
Test Positive    a (TP)       b (FP)
Test Negative    c (FN)       d (TN)
Sensitivity = a/(a+c)    Specificity = d/(b+d)
March 201211 Test Properties (6) Sensitivity = Pr(test positive in a person with disease) Specificity = Pr(test negative in a person without disease) Range: 0 to 1 –> 0.9: Excellent –0.8-0.9: Not bad –0.7-0.8: So-so –< 0.7: Poor
March 201212 Test Properties (7) Values depend on cutoff point Generally, high sensitivity is associated with low specificity and vice-versa. Not affected by prevalence, if severity is constant Do you want a test to have high sensitivity or high specificity? –Depends on cost of ‘false positive’ and ‘false negative’ cases –PKU – one false negative is a disaster –Ottawa Ankle Rules: insisted on sensitivity of 1.00
March 201213 Test Properties (8) Sens/Spec not directly useful to clinician, who knows only the test result Patients don’t ask: “If I’ve got the disease, how likely is a positive test?” They ask: “My test is positive. Does that mean I have the disease?” → Predictive values.
March 201214 Predictive Values Based on rows, not columns –PPV = a/(a+b); interprets positive test –NPV = d/(c+d); interprets negative test Depend upon prevalence of disease, so must be determined for each clinical setting Immediately useful to clinician: they provide the probability that the patient has the disease
March 201215 Test Properties (9)
              Diseased   Not diseased   Total
Test +ve         90            5           95
Test -ve         10           95          105
Total           100          100          200
PPV = 90/95 = 0.95    NPV = 95/105 = 0.90
March 201216 2x2 Table for Testing a Test
                 Gold standard: Disease
                 Present      Absent
Test +           a (TP)       b (FP)       PPV = a/(a+b)
Test -           c (FN)       d (TN)       NPV = d/(c+d)
Total            a+c          b+d          N
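The four test properties can be computed directly from the cells of the 2x2 table. The following is a minimal Python sketch, not part of the original slides; the function name `two_by_two_properties` is an illustrative assumption, and the numbers are the worked example used in the Test Properties slides (a = 90, b = 5, c = 10, d = 95).

```python
def two_by_two_properties(a, b, c, d):
    """Return sensitivity, specificity, PPV and NPV for a 2x2 table.

    Cell names follow the slides: a = true positives, b = false positives,
    c = false negatives, d = true negatives. The function name itself is
    hypothetical (chosen for this illustration).
    """
    return {
        "sensitivity": a / (a + c),  # column-based: among the diseased
        "specificity": d / (b + d),  # column-based: among the non-diseased
        "ppv": a / (a + b),          # row-based: among test positives
        "npv": d / (c + d),          # row-based: among test negatives
    }


props = two_by_two_properties(a=90, b=5, c=10, d=95)
for name, value in props.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity read down the columns, while the predictive values read across the rows, which is why only the latter change when the column totals (the prevalence) change.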
March 201217 Prevalence of Disease Is your best guess about the probability that the patient has the disease, before you do the test Also known as Pretest Probability of Disease (a+c)/N in 2x2 table Is closely related to Pre-test odds of disease: (a+c)/(b+d)
March 201218 Test Properties (10)
              Diseased   Not diseased   Total
Test +ve         a            b           a+b
Test -ve         c            d           c+d
Total           a+c          b+d          a+b+c+d = N
Prevalence odds = (a+c)/(b+d)    Prevalence proportion = (a+c)/N
March 201219 Prevalence and Predictive Values Predictive values of a test are dependent on the pre-test prevalence of the disease –Tertiary hospitals see more pathology than FPs Their tests are more often true positives. How to ‘calibrate’ a test for use in a different setting? Relies on the stability of sensitivity & specificity across populations.
March 201220 Methods for Calibrating a Test Four methods can be used: –Apply definitive test to a consecutive series of patients (rarely feasible) –Hypothetical table –Bayes’s Theorem –Nomogram You need to be able to do one of the last 3. By far the easiest is using a hypothetical table.
March 201221 Calibration by hypothetical table Fill cells in following order:
              “Truth”
              Disease Present   Disease Absent   Total           PV
Test Pos      4th               7th              8th             10th
Test Neg      5th               6th              9th             11th
Total         2nd               3rd              1st (10,000)
March 201222 Test Properties (11) Tertiary care: research study. Prev=0.5
              Diseased   Not diseased   Total
Test +ve        450           25          475
Test -ve         50          475          525
Total           500          500        1,000
Sens = 0.90    Spec = 0.95    PPV = 450/475 = 0.95
March 201223 Test Properties (12) Primary care: Prev=0.01
              Diseased   Not diseased   Total
Test +ve         90          495          585
Test -ve         10        9,405        9,415
Total           100        9,900       10,000
Sens = 0.90    Spec = 0.95    PPV = 90/585 = 0.1538
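The hypothetical-table method can be sketched in a few lines of Python. This is an illustration only, not part of the original slides: the function name `hypothetical_table` and the dictionary keys are assumptions, and the cells are filled in the same order as on the "Calibration by hypothetical table" slide (grand total, column totals, cells, then predictive values).

```python
def hypothetical_table(prevalence, sensitivity, specificity, n=10_000):
    """Build a hypothetical 2x2 table and return its cells and predictive values.

    Assumes sensitivity and specificity carry over unchanged to the new
    setting. Names here are hypothetical, chosen for this sketch.
    """
    diseased = prevalence * n          # 2nd: column total, disease present
    healthy = n - diseased             # 3rd: column total, disease absent
    tp = sensitivity * diseased        # 4th: true positives
    fn = diseased - tp                 # 5th: false negatives
    tn = specificity * healthy         # 6th: true negatives
    fp = healthy - tn                  # 7th: false positives
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "ppv": tp / (tp + fp),         # 10th: positive predictive value
        "npv": tn / (tn + fn),         # 11th: negative predictive value
    }


primary = hypothetical_table(prevalence=0.01, sensitivity=0.90, specificity=0.95)
print(f"Primary care PPV: {primary['ppv']:.4f}")
```

Changing only the `prevalence` argument shows how the same test performs in a different clinical setting.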
March 201224 Calibration by Bayes’ Theorem You don’t need to learn Bayes’ theorem Instead, work with the Likelihood Ratio (+ve) –Equivalent process exists for Likelihood Ratio (–ve), but we shall not calculate it here
March 201225 Test Properties (13)
              Diseased   Not diseased   Total
Test +ve         90            5           95
Test -ve         10           95          105
Total           100          100          200
Pre-test odds = 100/100 = 1.00
Post-test odds (+ve) = 90/5 = 18.0
Post-test odds (+ve) = LR(+) * Pre-test odds = 18.0 * 1.0 = 18.0, but of course you do not know the LR(+)
March 201226 Calibration by Bayes’s Theorem You can convert sens and spec to likelihood ratios LR(+) = sens/(1-spec) LR(+) is fixed across populations just like sensitivity & specificity. Bigger is better. Posttest odds(+) = pretest odds * LR(+) –Convert to posttest probability if desired…
March 201227 Converting odds to probabilities Pre-test odds = prevalence/(1-prevalence) –if prevalence = 0.20, then pre-test odds =.20/0.80 = 0.25 Post-test probability = post-test odds/(1+post-test odds) –if post-test odds = 0.25, then prob =.25/1.25 = 0.20
March 201228 Calibration by Bayes’s Theorem How does this help? Remember: –Post-test odds(+) = pretest odds * LR(+) To ‘calibrate’ your test for a new population: –Use the LR(+) value from the reference source –Estimate the pre-test odds for your population –Compute the post-test odds –Convert to post-test probability to get PPV
March 201229 Example of Bayes' Theorem (‘new’ prevalence 1%, sens 90%, spec 95%) LR(+) =.90/.05 = 18 (>>1, pretty good) Pretest odds =.01/.99 = 0.0101 Positive Posttest odds =.0101*18 =.1818 PPV =.1818/1.1818 = 0.1538 = 15.38% Compare to the ‘hypothetical table’ method (PPV=15.38%)
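The likelihood-ratio calculation above can be written as a short Python sketch. It is an illustration under the slide's stated inputs (prevalence 1%, sensitivity 90%, specificity 95%); the function name `post_test_probability` is a hypothetical choice, not something from the original presentation.

```python
def post_test_probability(prevalence, sensitivity, specificity):
    """Positive predictive value via the likelihood-ratio form of Bayes' theorem."""
    lr_pos = sensitivity / (1 - specificity)   # LR(+) = sens / (1 - spec)
    pre_odds = prevalence / (1 - prevalence)   # probability -> odds
    post_odds = pre_odds * lr_pos              # post-test odds (+ve)
    return post_odds / (1 + post_odds)         # odds -> probability


ppv = post_test_probability(prevalence=0.01, sensitivity=0.90, specificity=0.95)
print(f"PPV at 1% prevalence: {ppv:.4f}")
```

The result should agree with the hypothetical-table method, since both are the same calculation arranged differently.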
March 201230 Calibration with Nomogram Graphical approach avoids some arithmetic Expresses prevalence and predictive values as probabilities (no need to convert to odds) Draw lines from pretest probability (=prevalence) through likelihood ratios; extend to estimate posttest probabilities Only useful if someone gives you the nomogram!
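March 201231 Example of Nomogram (pretest probability 1%, LR+ 18, LR– 0.105) [Nomogram figure: a line from pretest probability 1% through LR+ 18 gives a posttest probability of about 15%; a line through LR– 0.105 gives a posttest probability of about 0.1%.]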
March 201232 Are sens & spec really constant? Generally, assumed to be constant. BUT….. Sensitivity and specificity usually vary with severity of disease, and may vary with age and sex Therefore, you can use sensitivity and specificity only if they were determined on patients similar to your own Risk of spectrum bias (populations may come from different points along the spectrum of disease)
Cautionary Tale #1: Data Sources March 201233 The Government is extremely fond of amassing great quantities of statistics. These are raised to the nth degree, the cube roots are extracted, and the results are arranged into elaborate and impressive displays. What must be kept ever in mind, however, is that in every case, the figures are first put down by a village watchman, and he puts down anything he damn well pleases! Sir Josiah Stamp, Her Majesty’s Collector of Internal Revenue.
March 201234 78.2: CRITICAL APPRAISAL (1) “Evaluate scientific literature in order to critically assess the benefits and risks of current and proposed methods of investigation, treatment and prevention of illness” UTMCCQE does not present hierarchy of evidence (e.g., as used by Task Force on Preventive Health Services)
March 201235 Hierarchy of evidence (lowest to highest quality, approximately): Expert opinion; Case report/series; Ecological (for individual-level exposures); Cross-sectional; Case-Control; Historical Cohort; Prospective Cohort; Quasi-experimental; Experimental (Randomized) } similar/identical
Cautionary Tale #2: Analysis March 201236 Consider a precise number: the normal body temperature of 98.6°F. Recent investigations involving millions of measurements have shown that this number is wrong: normal body temperature is actually 98.2°F. The fault lies not with the original measurements - they were averaged and sensibly rounded to the nearest degree: 37°C. When this was converted to Fahrenheit, however, the rounding was forgotten and 98.6 was taken as accurate to the nearest tenth of a degree.
March 201237 BIOSTATISTICS Core concepts (1) Sample: –A group of people, animals, etc. which is used to represent a larger ‘target’ population. Best is a random sample Most common is a convenience sample. –Subject to strong risk of bias. Sample size: –the number of units in the sample Much of statistics concerns how samples relate to the population or to each other.
March 201238 BIOSTATISTICS Core concepts (2) Mean: –average value. Measures the ‘centre’ of the data. Will be roughly in the middle. Median: –The middle value: 50% above and 50% below. Used when data is skewed. Variance: –A measure of how spread out the data are. Defined by subtracting the mean from each observation, squaring, adding them all up and dividing by the number of observations. Standard deviation: –square root of the variance.
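These summary measures can be checked with Python's standard `statistics` module. The sketch below is illustrative only: the data values are made up for the example (they are not from the presentation), and `pvariance`/`pstdev` are used because the slide defines variance by dividing by the number of observations.

```python
import statistics

# Hypothetical data, chosen only to give round answers.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)           # the 'centre' of the data
median = statistics.median(data)       # middle value: half above, half below
variance = statistics.pvariance(data)  # mean squared deviation from the mean
sd = statistics.pstdev(data)           # square root of the variance

print(mean, median, variance, sd)
```

Many statistical packages divide by n - 1 instead of n when estimating variance from a sample (`statistics.variance` in Python); the difference is small for large samples.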
March 201239 BIOSTATISTICS Core concepts (3) Standard error: –SE = SD/√n, where SD is the standard deviation and n is sample size. –Is the standard deviation of the sample mean, so measures the variability of that mean. Confidence Interval: –A range of numbers which tells us where we believe the correct answer lies. For a 95% confidence interval, we are 95% sure that the true value lies in the interval, somewhere. –Usually computed as: mean ± 2 SE
March 201240 Example of Confidence Interval If sample mean is 80, standard deviation is 20, and sample size is 25 then: –SE = 20/5 = 4. We can be 95% confident that the true mean lies within the range: 80 ± (2*4) = (72, 88). If the sample size were 100, then –SE = 20/10 = 2.0, and 95% confidence interval is 80 ± (2*2) = (76, 84). More precise.
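The confidence-interval arithmetic in this example can be reproduced with a short Python sketch. It uses the slide's rule of thumb (mean ± 2 SE, with SE = SD/√n); the function name `approx_ci95` is a hypothetical label for this illustration.

```python
import math


def approx_ci95(mean, sd, n):
    """Approximate 95% confidence interval for a mean: mean +/- 2 * SE."""
    se = sd / math.sqrt(n)  # standard error of the sample mean
    return mean - 2 * se, mean + 2 * se


ci_small = approx_ci95(mean=80, sd=20, n=25)   # SE = 4
ci_large = approx_ci95(mean=80, sd=20, n=100)  # SE = 2
print(ci_small, ci_large)
```

Quadrupling the sample size halves the standard error, so the interval is half as wide.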
March 201241 Core concepts (4) Random Variation (chance): –every time we measure anything, errors will occur. –In addition, by selecting only a few people to study (a sample), we will get people with values different from the mean, just by chance. –These are random factors which affect the precision (SD) of our data but not the validity. –Statistics and bigger sample sizes can help here.
March 201242 Core concepts (5) Bias: –A systematic factor which causes two groups to differ. A study uses a two section measuring scale for height which was incorrectly assembled (with a 1” gap between the upper and lower section). Over-estimates height by 1” (a bias). –Bigger numbers and statistics don’t help much; you need good design instead.
March 201243 BIOSTATISTICS Inferential Statistics Draws inferences about populations, based on samples from those populations. –Inferences are valid only if samples are representative (to avoid bias). Polls, surveys, etc. use inferential statistics to infer what the population thinks based on talking to a few people. RCTs use them to infer treatment effects, etc. 95% confidence intervals are a very common way to present these results.
March 201244 [Figure: a sample is drawn from a population, which sits within the wider target population and alongside your practice patients. Inferences are drawn from the sample; the confidence interval is used to indicate the accuracy of extrapolating results to the broader population from which the sample was drawn.]
March 201245 Effects of bias and random error on study results [Figure: results from different samples plotted against the population parameter. Axes: increasing random error; increasing systematic error (bias).]
March 201246 Hypothesis Testing (1) Used to compare two or more groups. –We first assume that the two groups have the same outcome results: the null hypothesis (H0). –Compute some number (a statistic) which, under this null hypothesis (H0), should be ‘0’. –If we find a large value for the statistic, then we can conclude that our assumption (null hypothesis) is unlikely to be true (reject the null hypothesis).
March 201247 Hypothesis Testing (2) Formal methods use this approach by determining the probability that the value you observe could occur by chance if H0 were true –The p-value. Reject H0 if the observed statistic exceeds the critical value expected from chance alone.
March 201248 Hypothesis Testing (3) Common methods used are: –T-test –Z-test –Chi-square test –ANOVA Approach can be extended through the use of regression models –Linear regression Toronto notes are wrong in saying this relates 2 variables. It can relate many independent variables to one dependent variable. –Logistic regression –Cox models
March 201249 Hypothesis Testing (4) Once you select a method for hypothesis testing, interpretation involves: –Type 1 error (alpha) –Type 2 error (beta) –P-value Essentially the alpha value –Power Related to type 2 error (Beta)
March 201250 Hypothesis testing (5)
                               Actual Situation
Results of Stats Analysis      No effect            Effect
No effect                      No error             Type 2 error (β)
Effect                         Type 1 error (α)     No error
March 201251 Hypothesis Testing (6) P-value: –The probability of making a type 1 error You observe a value for your statistic –Z=1.96 If the null hypothesis were true, you can figure out the probability of observing a value of your statistic which is as big as or bigger than this –0.05 This is the p-value –If the null hypothesis is true, how likely would I be to observe a value of my statistic that is as big as I did (or bigger)? This is not quite the same as the chance that the group difference is ‘real’.
March 201252 Example of significance test Is there an association between sex and smoking? –35 of 100 men smoke but only 20 of 100 women smoke Calculate the chi-square (the statistic) –χ² = 5.64. –If there is no effect of sex on smoking (the null hypothesis), a chi-square value as large as 5.64 would occur only 1.8% of the time. P=0.018 –Can also compare your statistic to the ‘critical value’ The value of the Chi-square which gives p=0.05 is 3.84 (1 degree of freedom). Since 5.64 > 3.84, we conclude that p<0.05
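The chi-square value in this example can be reproduced with a small Python sketch. This is an illustration, not the presenter's code: it assumes the Pearson chi-square without continuity correction (which matches the quoted 5.64), and the function name `chi_square_2x2` is hypothetical. For 1 degree of freedom the p-value can be obtained from the standard library's `math.erfc`.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its p-value on 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 df, P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: men, women. Columns: smokers, non-smokers.
chi2, p_value = chi_square_2x2(35, 65, 20, 80)
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
```

In practice a statistics package would normally be used; the point here is only that the statistic and the p-value follow mechanically from the four cell counts.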
March 201253 Hypothesis Testing (7) Power: –The chance you will find a difference between groups when there really is a difference (of a given amount). Basically, this is 1-β –Power depends on how big a difference you consider to be important
March 201254 How to improve your power? Increase sample size Improve precision of the measurement tools used (reduces standard deviation) Use better statistical methods Use better designs Reduce bias
Cautionary Tale #3: Anecdotes March 201255 Laboratory and anecdotal clinical evidence suggest that some common non-antineoplastic drugs may affect the course of cancer. The authors present two cases that appear to be consistent with such a possibility: that of a 63-year-old woman in whom a high-grade angiosarcoma of the forehead improved after discontinuation of lithium therapy and then progressed rapidly when treatment with carbamazepine was started, and that of a 74-year-old woman with metastatic adenocarcinoma of the colon which regressed when self-treatment with a non-prescription decongestant preparation containing antihistamine was discontinued. The authors suggest...... ‘that consideration be given to discontinuing all nonessential medications for patients with cancer.’
March 201256 Epidemiology overview Key study designs to examine (SIM web link) –Case-control –Cohort –Randomized Controlled Trial (RCT) Confounding Relative Risks/odds ratios –All ratio measures have the same interpretation 1.0 = no effect < 1.0 protective effect > 1.0 increased risk –Values over 2.0 are of strong interest
March 201257 The Epidemiological Triad Host Agent Environment
March 201258 Terminology Prevalence: –The probability that a person has the outcome of interest today. Relates to existing cases of disease. Useful for measuring burden of illness. Incidence: –The probability (chance) that someone without the outcome will develop it over a fixed period of time. Relates to new cases of disease. Useful for studying causes of illness.
March 201259 Prevalence On July 1, 2007, 140 graduates from the U. of O. medical school start working as interns. Of this group, 100 had insomnia the night before. Therefore, the prevalence of insomnia is: 100/140 = 0.71 = 71%
March 201260 Incidence Proportion (risk) On July 1, 2007, 140 graduates from the U. of O. medical school start working as interns. Over the next year, 30 develop a stomach ulcer. Therefore, the incidence proportion (risk) of an ulcer in the first year post-graduation is: 30/140 = 0.21 = 214/1,000 over 1 yr
March 201261 Incidence Rate (1) Incidence rate is the ‘speed’ with which people get ill. Everyone dies (eventually). It is better to die later (i.e., the death rate is lower). Compute with person-time denominator: PT = # people * duration of follow-up
March 201262 Incidence rate (2) 140 U. of O. medical students were followed during their residency –50 did 2 years of residency –90 did 4 years of residency –Person-time = 50 * 2 + 90 * 4 = 460 PY’s During follow-up, 30 developed ‘stress’. Incidence rate of stress is: 30/460 PY = 0.065 per PY = 65 per 1,000 PY
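The person-time calculation can be sketched in Python as follows. The variable names are illustrative assumptions; the numbers are the ones given on the slide (50 residents followed for 2 years, 90 for 4 years, 30 events).

```python
# (number of people, years of follow-up) for each group, from the slide.
follow_up_groups = [(50, 2), (90, 4)]
events = 30  # residents who developed 'stress' during follow-up

# Person-time = sum over groups of (people * duration of follow-up).
person_years = sum(people * years for people, years in follow_up_groups)

# Incidence rate = new cases / person-time at risk.
incidence_rate = events / person_years

print(f"{person_years} person-years, rate = {incidence_rate * 1000:.0f} per 1,000 PY")
```

Unlike the incidence proportion, the denominator here is time at risk rather than a count of people, so subjects followed for different lengths of time contribute unequally.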
March 201263 Prevalence & incidence As long as conditions are ‘stable’ and disease is fairly rare, we have this relationship: Prevalence ≈ Incidence rate * average disease duration
March 201264 Cohort study (1) Select non-diseased subjects based on their exposure status Main method used: Select a group of people with the exposure of interest Select a group of people without the exposure Can also simply select a group of people without the disease and study a range of exposures. Follow the group to determine what happens to them. Compare the incidence of the disease in exposed and unexposed people If exposure increases risk, there should be more cases in exposed subjects than unexposed subjects Compute a relative risk. Framingham Study is standard example.
March 201265 [Figure: cohort study design. When the study begins, an exposed group and an unexposed group are identified; each is followed over time to the outcomes (disease or no disease).]
March 201266 Cohort study (2)
                 Disease
Exp           YES      NO       Total
YES            a        b       a+b
NO             c        d       c+d
Total         a+c      b+d       N
Risk in exposed = a/(a+b)
Risk in non-exposed = c/(c+d)
If exposure increases risk, you would expect a/(a+b) to be larger than c/(c+d). How much larger can be assessed by the ratio of one to the other:
RISK RATIO (RR) = [a/(a+b)] / [c/(c+d)]
March 201267 Cohort study (3)
                 Death
Exposure      YES      NO       Total
Yes            42       80       122
No             43      302       345
Total          85      382       467
Risk in exposed = 42/122 = 0.344
Risk in non-exposed = 43/345 = 0.125
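The risk ratio for this cohort example follows directly from the two risks. The Python sketch below is illustrative; the function name `risk_ratio` is a hypothetical choice, and the cell labels a, b, c, d follow the cohort 2x2 layout (rows: exposed, unexposed; columns: outcome yes, no).

```python
def risk_ratio(a, b, c, d):
    """Return (risk in exposed, risk in unexposed, risk ratio) for a cohort 2x2 table."""
    risk_exposed = a / (a + b)      # exposed subjects who developed the outcome
    risk_unexposed = c / (c + d)    # unexposed subjects who developed the outcome
    return risk_exposed, risk_unexposed, risk_exposed / risk_unexposed


risk_exp, risk_unexp, rr = risk_ratio(a=42, b=80, c=43, d=302)
print(f"risk exposed = {risk_exp:.3f}, risk unexposed = {risk_unexp:.3f}, RR = {rr:.2f}")
```

A ratio above 1.0 indicates higher risk in the exposed group; here the exposed group has between two and three times the risk of the unexposed group.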
March 201268 Cohort study (4) Historical cohort study Recruit subjects sometime in the past Follow-up to the present Usually use administrative records Can continue to follow into the future Example: cancer in Gulf War Vets Identify soldiers deployed to Gulf in 1991 Identify soldiers not deployed to Gulf in 1991 Compare development of cancer from 1991 to 2010
March 201269 Case-control study (1) Select subjects based on their final outcome. –Select a group of people with the outcome/disease (cases) –Select a group of people without the outcome (controls) –Ask them about past exposures –Compare the frequency of exposure in the two groups If exposure increases risk, there should be more exposed cases than exposed controls –Compute an Odds Ratio –Under many conditions, OR ≈ RR
March 201270 [Figure: case-control study design. The study begins by selecting subjects based on their outcome: disease (cases) or no disease (controls). Records are then reviewed to classify each group as exposed or unexposed.]
March 201271 Case-control study (2)
                 Disease?
Exp?          YES      NO       Total
YES            a        b       a+b
NO             c        d       c+d
Total         a+c      b+d       N
Odds of exposure in cases = a/c
Odds of exposure in controls = b/d
If exposure increases risk, you would expect to find more exposed cases than exposed controls. That is, the odds of exposure for cases would be higher. This can be assessed by the ratio of one to the other:
ODDS RATIO (OR) = (a/c) / (b/d) = ad/bc
March 201272 Case-control study (3)
                 Death
Exposure      Yes      No
Yes            42       18
No             43       67
Total          85       85
Odds of exp in cases = 42/43 = 0.977
Odds of exp in controls = 18/67 = 0.269
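The odds ratio for this case-control example is the ratio of the two exposure odds. The Python sketch below is illustrative; the function name `odds_ratio` is hypothetical, and the cell labels follow the case-control 2x2 layout (a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls).

```python
def odds_ratio(a, b, c, d):
    """Return (odds of exposure in cases, odds in controls, odds ratio)."""
    odds_cases = a / c        # exposed cases : unexposed cases
    odds_controls = b / d     # exposed controls : unexposed controls
    return odds_cases, odds_controls, odds_cases / odds_controls


odds_cases, odds_controls, or_estimate = odds_ratio(a=42, b=18, c=43, d=67)
print(f"odds (cases) = {odds_cases:.3f}, odds (controls) = {odds_controls:.3f}, "
      f"OR = {or_estimate:.2f}")
```

The same value comes from the cross-product ad/bc, which is often the quicker route by hand.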
March 201273 Randomized Controlled Trials Basically a cohort study where the researcher decides which exposure (treatment) the subjects get. –Recruit a group of people meeting pre-specified eligibility criteria. –Randomly assign some subjects (usually 50% of them) to get the control treatment and the rest to get the experimental treatment. –Follow-up the subjects to determine the risk of the outcome in both groups. –Compute a relative risk or otherwise compare the groups.
March 201274 Randomized Controlled Trials (2) Some key design features –Allocation concealment –Blinding (masking) Patient Treatment team Outcome assessor Statistician –Monitoring committee Two key problems –Contamination Control group gets the new treatment –Co-intervention Some people get treatments other than those under study
March 201275 Randomized Controlled Trials: Analysis Outcome is often an adverse event –RR is expected to be <1
Absolute risk reduction (ARR) = risk in control group - risk in treatment group
Relative risk reduction (RRR) = ARR / risk in control group
Number needed to treat, NNT (to prevent one adverse event) = 1/ARR
March 201276 RCT – Example of Analysis
              Asthma attack   No attack   Total   Incid
Treatment          15             35        50     0.30
Control            25             25        50     0.50
Relative Risk = 0.30/0.50 = 0.60
Absolute Risk Reduction = 0.50-0.30 = 0.20
Relative Risk Reduction = 0.20/0.50 = 40%
Number Needed to Treat = 1/0.20 = 5
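The four trial summary measures in this example can be computed together. The Python sketch below is illustrative only; the function name `rct_summary` and its argument names are hypothetical, and the inputs are the asthma example's counts (15 of 50 events on treatment, 25 of 50 on control).

```python
def rct_summary(events_treated, n_treated, events_control, n_control):
    """Return RR, ARR, RRR and NNT for a two-arm trial with an adverse outcome."""
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    arr = risk_control - risk_treated          # absolute risk reduction
    return {
        "rr": risk_treated / risk_control,     # relative risk
        "arr": arr,
        "rrr": arr / risk_control,             # relative risk reduction
        "nnt": 1 / arr,                        # number needed to treat
    }


summary = rct_summary(events_treated=15, n_treated=50, events_control=25, n_control=50)
print({name: round(value, 2) for name, value in summary.items()})
```

Note that the relative risk reduction equals 1 minus the relative risk, while the number needed to treat depends on the absolute difference, so the same relative effect gives a larger NNT when the baseline risk is low.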
March 201277 Confounding Interest in the effect of an exposure on an outcome –Does alcohol drinking cause oral cancer? BUT, the effect of alcohol is ‘mixed up’ with the effect of smoking. The effect of this third factor ‘confounds’ the relationship we are interested in. –Produces biased results. –Can make the result more or less strong Confounder is an extraneous factor which is associated with both exposure and outcome, and is not an intermediate step in the causal pathway Proper statistical analysis must adjust for the confounder.
March 201278 The Confounding Triangle Exposure Outcome Confounder Causal Association
March 201279 Confounding (example) Does heavy alcohol drinking cause mouth cancer? –Do a case-control study –OR=3.4 (95% CI: 2.1-4.8). BUT –Smoking causes mouth cancer –Heavy drinkers tend to be heavy smokers. –Smoking is not part of causal pathway for alcohol. Therefore, we have confounding. We do a statistical adjustment (logistic regression is most common): –OR=1.3 (95% CI: 0.92-1.83)
March 201280 Standardization A method of adjusting for confounding (usually used for differences in age between two populations) Refers observed events to a standard population, producing hypothetical values Direct: –yields age-standardized mortality rate (ASMR) Indirect: –yields standardized mortality ratio (SMR) You don’t need to know how to do this Nearly always used when presenting population rates and trends.
March 201281 Mortality data Three ways to summarize them Mortality rates (crude, specific, standardized) PYLL (potential years of life lost): –subtracts age at death from some “acceptable” age of death. –Places more emphasis on causes that kill at younger ages. Life expectancy: –average age at death if current mortality rates continue. Derived from life table.
March 201282 Summary measures of population health Combine mortality and morbidity statistics, in order to provide a more comprehensive population health indicator –QALY Years lived are weighted according to quality of life, disability, etc. Two types: –Health expectancies point up from zero –Health gaps point down from ideal
March 201283 Impact of different causes of death in Canada 2001: Mortality rates and PYLL Source: Statistics Canada
March 201284 Attributable Risks (1) (SIM web link) Generally, tries to give an estimate of the amount of a disease which might be prevented –Gives an upper limit on amount of preventable disease. –Meaningful only if association is causal. Tricky area since there are several measures with similar names. Attributable risk. –The amount of disease due to exposure in the exposed subjects. The same as the risk difference. Can also express as attributable fraction. Often, we want the proportion of risk attributed to the exposure in the general population –(depends on how common the exposure is).
March 201285 Attributable risks (2) Risk Difference or Attributable Risk [Figure: incidence in the exposed (I exp) compared with incidence in the unexposed (I unexp).] RD = AR = I exp - I unexp
March 201286 Attributable risks (2) Population Attributable Risk [Figure: incidence in the exposed (I exp), in the unexposed (I unexp) and in the whole population (I pop).]