Clinical Epidemiology – the basics


What do the terms relative risk and absolute risk mean? What are the advantages and disadvantages of each? A new screening test is described as having a sensitivity of 78%, a specificity of 89%, a positive likelihood ratio of 13 and a negative likelihood ratio of 0.05. Explain these terms.

A 2 minute exercise What do you know about this topic? Could you explain what you know to others? What bits would you like clarifying? (Hint – do you understand RR, OR, CIs etc)

EVIDENCE BASED MEDICINE EBM is an approach to practicing medicine in which the clinician is aware of the evidence in support of his / her clinical practice, and the strength of that evidence.

EVIDENCE BASED HEALTHCARE Evidence based health care promotes the collection, interpretation, and integration of valid, important and applicable patient-reported, clinician-observed and research derived evidence. The best available evidence, moderated by patient circumstances and preferences, is applied to improve the quality of clinical judgements and facilitate effective healthcare.

EFFECTIVE / SAFE / PATIENT FACTORS / COST

These work best if drawn on acetates on OHP or flip chart. Q1 to audience: “Imagine you see a patient with dysuria, nocturia and blood in the urine. She’s 25 and sexually active so not unreasonably you diagnose a bacterial UTI and you are going to prescribe an antibiotic. What are you looking for in the antibiotic in order to come to the best choice for most patients?” If you get the answer “one which is indicated by lab culture”, turn it round and say “So you want one which is effective” and write that on the acetate / flip chart. Then the audience gets the point and you can rapidly get safety and cost. Usually examples of patient factors are thrown up – once a day, previous experience of patient etc. – and again you can turn that into a principle. Then you can make the point about the difficulty of decision making: what happens when you choose trimethoprim (reasonably effective and safe, low cost) but the patient says “I don’t want them, they didn’t work last time”? So there is no such thing as 100% compliance with any policy decision in health care. This comes as a surprise to quite a few, I find – there seems to be an expectation that the expert has come to tell them what to do, when all I aim to do is describe the evidence and then let them decide. Helps to let them know this up front IMHO. This grid comes from a BMJ paper by (I think) Nick Barbour c 1997/8 and I’ll post the full ref when I dig it out – just can’t locate it on eBMJ now.

[Graph: QUALITY OF CARE (y axis) against SYSTEMATIC APPLICATION OF THERAPY, 0% to 100% (x axis)]

Q2. Imagine you could measure the quality of care – y axis – and against this measure the systematic application of a single intervention – say aspirin after myocardial infarction. So at the far end of the x axis (100%) everybody gets aspirin, and at the opposite extreme (0%) we have the pick and mix approach to secondary prevention where prescribers stick a pin in the BNF (when they remember). So at 0% quality of care is low – nods from audience (hopefully) – what happens as care gets better organised? Hopefully the audience agrees quality of care goes up and you can draw the left hand green line. Then ask what happens if you give aspirin to everybody. Hopefully someone says the quality of care goes down coz you give it to allergic people or those who had a GI bleed the day before or something similar, and you can then draw in the right hand red line. Then say “if you get 85-90% compliance with a policy of aspirin post MI – an intervention with a good evidence base (unless you are John Cleland), low cost, acceptable safety and few patients objecting to taking it on personal grounds – what sort of curve do you get where the evidence is less good, or the safety profile worse or less certain, the costs against comparators are a closer call or patient factors figure more prominently?” Usually someone gets it pretty quickly that the peak for quality shifts to the left and you can draw in the second set of lines.

LEVELS OF EVIDENCE
LARGE WELL DESIGNED RCT
META ANALYSIS OF SMALLER RCTs
CASE CONTROL AND COHORT STUDIES
(CASE REPORTS AND CASE SERIES)
CONSENSUS FROM EXPERT PANELS
I THINK

Q3. Again, starting with a blank acetate or flip chart ask “So where do we get this evidence from?” No group has yet failed to get somewhere the words “randomised trial”, and we tease out blind and double blind and often need to explain what that means. Then ask “What happens if you don’t have a big, well designed and conducted RCT, but have lots of smaller RCTs with 30, 50, or 70 people in them instead of several thousands?” Again, so far someone has mentioned meta-analysis or systematic review (apparently not understanding the difference) and you can then explain the process of SR and MA very briefly. Then ask where the safety data comes from – enabling a discussion of observational data and the potential for bias. And finally the limitations of consensus. Then start the workshop!!

Why don’t we always use an RCT then?
Ethics
Cost
Feasibility
Practicality

Ethics – can’t usually do RCTs to answer Qs re potential harm. E.g. getting thousands of medical students and randomising half of them to smoke and the rest not to smoke and seeing how many in each group get lung cancer 30 years later ain’t on. But there are RCTs comparing NSAIDs in which the outcomes were PUBs (perforations, ulcers and bleeds). OK to do this coz the intervention sought to reduce the rate of harmful events.
Cost – obvious.
Feasibility – e.g. may not be possible to reproduce a one-off exposure to an environmental mishap such as tipping aluminium into a water supply.
Practicality – Qs about prevalence can be satisfactorily answered by cross-sectional studies.

Why read journals?
Need to make the best possible decisions for patients
Need to make the best possible decisions for healthcare
Need to feel confident about being “on top of the job”
Need to feel knowledgeable themselves and credible with peers

Q4. Ask who reads journals (even occasionally) – HANDS UP. Then ask all to brainstorm WHY they read. The answers can then be grouped on a flipchart / OHP in the 4 categories above. And so, if they read journals, they need to be able to (a) understand the terminology, (b) critically appraise and (c) apply the evidence. Hence the workshop.

WHY THE MOVE TO EBM? RANDOMISED CONTROLLED TRIALS PRE-1960 WERE ODDITIES REVIEWS AND META-ANALYSES AVAILABLE AS ACCESSIBLE DIGESTS OF EVIDENCE ACCESS TO EVIDENCE VIA I.T. METHODOLOGICAL ADVANCEMENTS E.G. NUMBERS NEEDED TO TREAT

EBM IS ABOUT ... CLINICAL EXPERIENCE, DIAGNOSTIC SKILLS AND CLINICAL INSTINCT ARE A NECESSARY PART OF A COMPETENT PHYSICIAN. HOWEVER, CLINICAL PRACTICE BASED SOLELY UPON CLINICAL EXPERIENCE “BECOMES TOMORROW’S BAD JOKE”. “RATIONAL” TREATMENT BASED SOLELY UPON BASIC PATHOLOGICAL PRINCIPLES MAY IN FACT BE INCORRECT, LEADING TO INACCURATE TREATMENT. UNDERSTANDING CERTAIN RULES OF EVIDENCE IS NECESSARY TO CORRECTLY INTERPRET LITERATURE ON CAUSATION, PROGNOSIS, DIAGNOSTIC TESTS AND TREATMENT STRATEGY.

20,000 biomedical journals in print. So why isn’t all practice based on scientific evidence?

Not RELEVANT
Upstream to clinical decisions being made, e.g. animal or in vitro studies
Study populations and / or settings do not reflect question type, practice population and settings

Not RELIABLE
Poor study design
Bias and confounding
Measurement validity
Insufficient power

E.g. a study of ARBs v placebo in diabetic nephropathy in 15 native Australians with biochemical outcome measures, whereas we need a comparison v ACEi in several hundred people in Slagthorpe, UK with mortality, or at least progression to end stage renal failure, as the outcome in order to decide whether ARBs are better than ACEis.

BIAS
Selection bias – select sicker patients to get the active or new Rx and fitter patients to get placebo or older Rx.
Observer bias – if we know the patient has active treatment we can subconsciously record health status as being better.
Participant bias – e.g. in a study looking at GI bleeds in NSAID v non-NSAID users, the people who are not prescribed NSAIDs buy them OTC.
Withdrawal or drop out bias – if we lose people from the study, those left at the end may not be representative of those originally included, and their numbers may be very much smaller, so affecting the validity and generalisability of event rates.
Recall bias – mothers of kids with leukaemia remember living near high voltage cables. Mothers of kids without leukaemia won’t remember living near cables coz to them it’s a trivial fact.
Measurement bias – e.g. measuring BP in trials with sphygs that are not calibrated.
Publication bias – positive studies get published much more often than equivocal or negative studies.

CONFOUNDING Confounding is a particular form of bias where both the disease or outcome being measured (here it’s lung cancer) and the “intervention” (here it’s coffee drinking) are associated with the confounding variable (here it’s smoking). Coffee drinking is positively associated with smoking, and smoking is positively associated with lung cancer. Hence a study could show an association between coffee drinking and lung cancer but it would be confounded (rather than biased).

Power
The ability of the study to detect an effect if in truth there is an effect. An RCT may be underpowered if:-
The duration is too short (too few events)
It includes too few people (too few events)
The wrong outcome was used (too few events)
A higher level of statistical proof is expected than is realistic for the condition and the intervention being tested

Don’t worry about the maths. We have some slides for complete anoraks if you must. It’s all about alpha and beta and the answer always being 42. Do worry if the RCT doesn’t include a power calculation, and worry especially if the study shows no benefit and there isn’t a power calculation. The study may be negative because it was underpowered.

EBM SKILLS - STATISTICS
CHANCE - p = 1 in 20 (0.05). p > 1 in 20 (e.g. 0.051) = not significant; p < 1 in 20 (e.g. 0.049) = statistically significant.
CONFIDENCE INTERVALS - what is the range of values between which we could be 95% certain that this result would lie if this intervention was applied to the general population?

Straightforward, surely. If not see Simple Statistics by Frances Clegg, Cambridge Press.

TYPES OF STUDY - HYPOTHESIS FORMING
CASE REPORTS / CASE SERIES
CROSS SECTIONAL / PREVALENCE STUDIES: measure personal factors & disease states – a snapshot
CORRELATIONAL / ECOLOGICAL / GEOGRAPHIC STUDIES: prevalence &/or incidence measurement in one population c/w another pop.

Case reports and case series are often derided. But imagine you were a physician in San Francisco in the 1980s and you had in the space of a few weeks 3 young homosexual men admitted under your care all with pneumocystis pneumonia. Wouldn’t that be an important case series to write up?

TYPES OF STUDY - HYPOTHESIS TESTING CASE CONTROL STUDIES

CASE CONTROL EXAMPLE - SMOKING & LUNG CANCER

                        DISEASE
                    Cases    Controls
EXPOSURE   Yes        a         b
EXPOSURE   No         c         d

Odds Ratio = ad/bc (1 = no association, > 1 = possible association, < 1 = protective effect)

                     (lung cancer)
                    Cases    Controls
(smoking)  Yes       56        230
           No         7        246

The odds ratio would therefore be (56 x 246) / (7 x 230) = 13776 / 1610 = 8.6.

If the maths bother you, just revert to common sense. There are 56 cases of lung cancer who are heavy smokers and 7 cases of lung cancer who don’t smoke. It is then obvious that if you get lung cancer you are about 8 times more likely to be a heavy smoker. NB this is an ASSOCIATION between smoking and lung cancer. It’s a big OR so the association is likely to be causal. But it doesn’t NECESSARILY prove that if you smoke you are 8 times more likely to get lung cancer. (This was a prop of the tobacco industry for years, of course.)
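For anyone who would rather check the arithmetic than trust it, here is a minimal sketch of the odds ratio calculation. The function name and argument order are illustrative assumptions, not something from the slides; the counts are the ones in the table above.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio ad/bc for a 2 x 2 case control table.

    a = exposed cases,   b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    return (a * d) / (b * c)


# Smoking and lung cancer counts from the slide
smoking_or = odds_ratio(a=56, b=230, c=7, d=246)
print(f"Odds ratio: {smoking_or:.1f}")
```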

TYPES OF STUDY - HYPOTHESIS TESTING COHORT STUDIES

COHORT STUDIES

                   OUTCOME
                  Yes    No
Exposed            a      b
Not exposed        c      d

Relative risk: "How many times are exposed persons more likely to develop the disease, relative to non-exposed persons?" i.e. the incidence in the exposed divided by the incidence in the non-exposed. This is expressed as [a / (a+b)] divided by [c / (c+d)].

COHORT STUDY EXAMPLE
Deep vein thromboses (DVT) in oral contraceptive users. (Hypothetical results.)

                                      OUTCOME (DVT)
                                      Yes      No
Exposed (on oral contraceptive)        41     9998
Not exposed (not on o.c.)               7    10009

These results would give a relative risk of 6 - significantly large enough numbers to indicate the possibility of a real association between exposure and outcome. However, NB biases.

Again, this is almost easier if the maths is ignored. 41 DVTs in those on the COC v only 7 in those not on a COC – so you’re potentially 6 times more likely to have a DVT if you’re on a COC than if you’re not.
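The relative risk of about 6 can be reproduced with a short sketch. The function name is an illustrative assumption; the counts are the hypothetical DVT figures from the slide.

```python
def relative_risk(a, b, c, d):
    """Relative risk [a/(a+b)] / [c/(c+d)] for a cohort 2 x 2 table.

    a, b = exposed with / without the outcome,
    c, d = not exposed with / without the outcome.
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed


# Hypothetical DVT counts from the slide
rr = relative_risk(a=41, b=9998, c=7, d=10009)
print(f"Relative risk: {rr:.1f}")
```

The result is a little under 6, which the slide rounds to 6.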

RANDOMISED CONTROLLED TRIALS

RANDOMISED CONTROLLED TRIALS

                               OUTCOME
                              Yes    No
Comparison intervention        a      b
Experimental intervention      c      d

Absolute risk reduction: “What is the size of this effect in the population?”
Control event rate - experimental event rate = a/(a+b) - c/(c+d)

Relative risk reduction: “How many fewer patients will get the outcome measured if they get active treatment versus comparison intervention?”
[a/(a+b) - c/(c+d)] / [a/(a+b)]

ARR and RRR: a quick test
In a study lasting 12 months, the death rate on placebo was 10% and the death rate on Marvelicoxib was 5%. What is the ARR? What is the RRR?

ARR = CER – EER; that’s 10% - 5% = 5% or 0.05. A good group could go on and do 100/5 or 1/0.05 and get the NNT of 20. So that means 20 people being treated with Marvelicoxib for 12 months to prevent 1 death. The other 19 would live or die despite getting Marvelicoxib. And you can’t identify which is the 1 in 20. A really really good group could be challenged to find a way of increasing the odds of doing more good – the answer being to give Marvelicoxib to people at a higher baseline risk. Give it to people with a 20% chance of dying, and if the relative benefits were the same then only 10% would die: ARR 10%, NNT 10, not 20.

RRR = (CER – EER) / CER so that’s 5% / 10% = 50%. Or for the non-mathematicians it goes something like:- the death rate went down by 5% from a baseline of 10%, so the death rate was halved or “reduced by 50%”.
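The quick test can be written out as a small sketch. The function names are illustrative assumptions; CER and EER are the control and experimental event rates expressed as proportions.

```python
def arr(cer, eer):
    """Absolute risk reduction: control event rate minus experimental event rate."""
    return cer - eer


def rrr(cer, eer):
    """Relative risk reduction: the ARR as a fraction of the control event rate."""
    return (cer - eer) / cer


def nnt(cer, eer):
    """Number needed to treat: 1 / ARR (left unrounded here)."""
    return 1 / (cer - eer)


# Marvelicoxib example: 10% deaths on placebo, 5% on treatment
print(f"ARR = {arr(0.10, 0.05):.2f}, RRR = {rrr(0.10, 0.05):.0%}, NNT = {nnt(0.10, 0.05):.0f}")

# Same relative benefit at a 20% baseline risk (EER becomes 10%)
print(f"Higher risk group: NNT = {nnt(0.20, 0.10):.0f}")
```

The second call shows the point made to the "really really good group": the same relative benefit applied to a higher baseline risk halves the NNT.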

ARR and RRR in more detail
4S STUDY
STABLE ANGINA OR MYOCARDIAL INFARCTION MORE THAN 6 MONTHS PREVIOUSLY
SERUM CHOLESTEROL > 6.2mmol/l
EXCLUDED PATIENTS WITH ARRHYTHMIAS AND HEART FAILURE
ALL PATIENTS GIVEN 8 WEEKS OF DIETARY THERAPY
IF CHOLESTEROL STILL RAISED (>5.5) RANDOMISED TO RECEIVE SIMVASTATIN (20mg > 40mg) OR PLACEBO
DEATH OR MYOCARDIAL INFARCTION WERE THE OUTCOMES (LENGTH OF TREATMENT 5.4 YEARS)

RCT EXAMPLE - 4S STUDY

                                             OUTCOME (death)
                                             Yes     No     Total
Comparison intervention (placebo)            256    1967    2223
Experimental intervention (simvastatin)      182    2039    2221

The ARR is (256/2223) - (182/2221) = 0.115 - 0.082 = 0.033.
The RRR is 0.033/0.115 = 0.29 or, expressed as a percentage, 29%.
1/ARR = NUMBER NEEDED TO TREAT. 1/0.033 = 30, i.e. if we treat 30 patients with IHD with simvastatin as per the 4S study, in 5.4 years we will have prevented 1 death.

Another way of calculating NNTs

                                             OUTCOME (death)
                                             Yes     No     Total
Comparison intervention (placebo)            256    1967    2223
Experimental intervention (simvastatin)      182    2039    2221

Prevalence of event in control group = 256/2223 x 100 = 11.5%
RRR = 29%

Now that’s magic! The Rev Bayes in Tunbridge Wells in the 1750s was truly a genius. You can now play about with baseline risk in particular and work out how good statins are if given to people at 5% or 1% risk over 5.4 years. Lights may start to switch on in the group at this point.
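Playing about with baseline risk can be sketched as follows. This assumes, as the exercise does, that the 29% relative risk reduction from 4S stays the same at every baseline risk; that is an assumption made for illustration, not a trial result, and the function name is hypothetical.

```python
def nnt_from_baseline(baseline_risk, rrr):
    """NNT = 1 / (baseline risk x RRR), assuming a constant relative risk reduction."""
    return 1 / (baseline_risk * rrr)


RRR_4S = 0.29  # relative risk reduction for death reported on the 4S slide

# 11.5% is the 4S placebo group; 5% and 1% are the lower risk groups in the notes
for risk in (0.115, 0.05, 0.01):
    print(f"Baseline risk {risk:.1%}: NNT about {nnt_from_baseline(risk, RRR_4S):.0f}")
```

At the 4S baseline this gives back the NNT of about 30; at 5% and 1% baseline risk the NNT climbs to roughly 69 and 345 over the same 5.4 years.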

NNT EXAMPLES
[Table with columns: Intervention, Outcome, NNT]

But these are all in different patient groups with interventions with very different costs, so tables of NNTs are illustrative but no answer.

Why are RCTs the “gold standard”?
[Figure: Breast cancer mortality in studies of screening with mammography; women aged 50 and over (55 in Malmo study, 45 in UK)]

Still a useful example despite the controversy over the RCTs’ validity.

Egger M et al. Spurious precision? Meta-analysis of observational studies. BMJ 1998;316:140-144
Meta-analysis of association between ß carotene intake and cardiovascular mortality: results from observational studies show considerable benefit, whereas the findings from randomised controlled trials show an increase in the risk of death.

Odds ratios or relative risks?
Macfarlane J et al. BMJ 2002; 13: 105-9

                                      Patients who       Patients who did
                                      took antibiotics   not take antibiotics   TOTAL
Patients who were given a leaflet            49                  55              104
Patients not given a leaflet                 63                  38              101
TOTAL                                       112                  93              205

A real trial.

Relative risk: (49/104) / (63/101) = 0.76, i.e. the relative risk of patients taking an antibiotic if they were given a leaflet is reduced by 24%. Also called risk ratio.

Odds ratio: (49/55) / (63/38) = 0.54. There was a 46% reduction in the ratio of those taking antibiotics who had a leaflet compared with the ratio of those taking antibiotics who did not have a leaflet.

All these ways of comparing data are valid and it’s a matter of judgement which one best describes the differences between the outcomes of two groups. Relative risk reductions are appropriately used to compare different interventions in the different trials. ARRs give a more realistic description of the benefits in population terms. And the benefits are often related to the baseline risk in an individual. Sometimes the RR can be impressive, but if there’s a 50% reduction in very rare events, then a lot of people have to be given the intervention for comparatively few to benefit. And of course ALL those given the intervention are exposed to the side effects of the intervention. ORs and RRs are often the same for rare events but for commoner events (as in the above example) they may give very different numbers.

A bit more practice for those who need it. Absolute risk reduction: (63/101) – (49/104) = 0.15. Also known as the risk difference, i.e. the difference in the risk of taking antibiotics depending on whether a leaflet was used or not.

NNT: 1 / 0.15 = 7, i.e. 7 people need to be given a leaflet in order for 1 additional person not to take antibiotics.
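All four measures from the leaflet trial can be computed side by side, which makes the gap between the odds ratio and the relative risk for a common event easy to see. This is a minimal sketch; the function and key names are illustrative assumptions, and the counts are those in the trial table (49 of 104 with a leaflet took antibiotics, 63 of 101 without).

```python
def two_by_two_measures(events_int, total_int, events_ctl, total_ctl):
    """Relative risk, odds ratio, absolute risk reduction and NNT for one trial."""
    risk_int = events_int / total_int   # risk in the intervention (leaflet) group
    risk_ctl = events_ctl / total_ctl   # risk in the control (no leaflet) group
    odds_int = events_int / (total_int - events_int)
    odds_ctl = events_ctl / (total_ctl - events_ctl)
    arr = risk_ctl - risk_int
    return {
        "rr": risk_int / risk_ctl,
        "or": odds_int / odds_ctl,
        "arr": arr,
        "nnt": 1 / arr,
    }


leaflet = two_by_two_measures(49, 104, 63, 101)
print(f"RR  = {leaflet['rr']:.2f}")
print(f"OR  = {leaflet['or']:.2f}")
print(f"ARR = {leaflet['arr']:.2f}")
print(f"NNT = {leaflet['nnt']:.0f}")
```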

Jüni P, Rutjes AWS, Dieppe PA. Are selective COX 2 inhibitors superior to traditional non steroidal anti-inflammatory drugs? BMJ 2002; 324: 1287-1288

Just a classic example of how when you measure the results and report them can make a huge difference. When would you have reported the study – at 6 months or at 12 months? (A: the original researchers reported at 6 months but didn’t say they had 12 months data.) Why might that be reasonable? (A: drop out rates might have meant fewer people left in the study beyond 6 months, so making event rates from that point on less statistically valid.)

Screening and Diagnostic Tests

SCREENING - WILSON & JUNGNER (WHO, 1968)
IS THE DISORDER COMMON / IMPORTANT
ARE THERE TREATMENTS FOR THE DISORDER
IS THERE A KNOWN NATURAL HISTORY & “WINDOW OF OPPORTUNITY” WHERE SCREENING CAN DETECT DISEASE EARLY WITH IMPROVED CHANCE OF CURE
IS THE TEST ACCEPTABLE TO PATIENTS, SENSITIVE AND SPECIFIC, GENERALISABLE, CHEAP / COST EFFECTIVE
APPLY TO GROUP AT HIGH RISK

Useful exercise is to ask the group to apply these criteria to a test for prostate cancer.

Tests ain’t what they used to be
Joseph Heller, Catch 22, 1962: “Gus and Wes had succeeded in elevating medicine to an exact science. All men reporting on sick call with temperatures above 102 were rushed to hospital. All those except Yossarian reporting on sick call with temperatures below 102 had their gums and toes painted with gentian violet solution and were given a laxative to throw away in the bushes. All those reporting on sick call with temperatures of exactly 102 were asked to return in an hour to have their temperatures taken again.”

[Figure: overlapping distributions of test VALUE (arbitrary units, x axis) against percent of population (y axis) for people with No Disease and with Disease, with cut off points marked A, B and C]

Set cut off at A: a lot of people who do not have the disease are labeled as having it (false positives).
Set cut off at B: a lot of people who do have the disease are labeled as not having it (false negatives).

Point out the distribution of results for people without disease, and for those with disease. Then set the cut off point at A (lots of false positives), then at B (lots of false negatives), and then at C (some false positives and negatives but the optimum point for this test in this condition).

                     DISEASE
                  Present   Absent
TEST   Positive      a         b
       Negative      c         d
       Total        50        50

The start of an example to work through. Ask the group what is the prevalence here? (A: 50%)

Measure the usefulness of the TEST by..

                     DISEASE
                  Present   Absent
TEST   Positive    45 (a)    5 (b)
       Negative     5 (c)   45 (d)

Sensitivity = a / (a + c) = 45 / 50 = 90% (high sensitivity = few false negatives)
Specificity = d / (b + d) = 45 / 50 = 90% (high specificity = few false positives)

Just work through this in two vertical columns explaining sensitivity and specificity.
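Working down the two vertical columns can be sketched like this. The function names are illustrative assumptions; a, b, c and d are the cells of the 2 x 2 table above.

```python
def sensitivity(a, c):
    """Proportion of people with the disease who test positive: a / (a + c)."""
    return a / (a + c)


def specificity(b, d):
    """Proportion of people without the disease who test negative: d / (b + d)."""
    return d / (b + d)


# Cells from the slide: a = 45, b = 5, c = 5, d = 45
print(f"Sensitivity = {sensitivity(a=45, c=5):.0%}")
print(f"Specificity = {specificity(b=5, d=45):.0%}")
```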

Test with a high specificity: useful to rule in a diagnosis, e.g. before cancer chemotherapy.
Test with a high sensitivity: useful to rule out a diagnosis, e.g. antenatal for syphilis.
Sensitivity and specificity are properties of the test and are taken into account when deciding whether to test.

But…… (and this is the hard bit so concentrate NOW)
When the test result is available the usefulness of the result depends on:-
How good (or bad) the test was at detecting true positives and true negatives
The pre-test probability of the person being tested actually having the disease for which they are being tested.

What is the pre-test probability of someone with dyspepsia being H pylori positive? What is the pre-test probability of someone with dyspepsia being H pylori negative? Another example to get the important concept of pre-test probability over.

The Impact of Prevalence on Predictive Value (Bayes Theorem)

                     DISEASE
                  Present   Absent
TEST   Positive    45 (a)    5 (b)
       Negative     5 (c)   45 (d)

Positive predictive value = a / (a + b) = 45 / 50 = 90%
Negative predictive value = d / (c + d) = 45 / 50 = 90%
Sens = 45/50 i.e. 90%. Spec = 45/50 i.e. 90%. Prevalence = 50%

Watch what happens when the prevalence drops to 10%…….

                     DISEASE
                  Present   Absent
TEST   POSITIVE     9 (a)    9 (b)
       NEGATIVE     1 (c)   81 (d)

PPV = 9 / 18 = 50%
NPV = 81 / 82 = 99%
Sensitivity = 9 / 10 = 90%
Specificity = 81 / 90 = 90%

Sensitivity and specificity stay the same (so the test doesn’t alter). But suddenly, if you have a positive test it’s a 50:50 chance whether that positive result is a true positive or a false positive. If this is seeming a bit dry – put the group in the position of having to explain a positive screening mammogram to a woman. We know that only 1 in 10 women with a positive mammogram turn out to have breast cancer. So the PPV is 10%, i.e. 90% are false positives. But all 100% will think they have breast cancer. And then after further tests 90% will be told they don’t. Ask the group if they are comfortable with that consultation. Would they want to be screened?
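The effect of prevalence on predictive values can be checked with a short sketch of Bayes theorem. The function name is an illustrative assumption; the inputs are the sensitivity and specificity of 90% used on the slides, at prevalences of 50% and 10%.

```python
def predictive_values(sens, spec, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sens * prevalence
    false_neg = (1 - sens) * prevalence
    false_pos = (1 - spec) * (1 - prevalence)
    true_neg = spec * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


for prevalence in (0.5, 0.1):
    ppv, npv = predictive_values(sens=0.9, spec=0.9, prevalence=prevalence)
    print(f"Prevalence {prevalence:.0%}: PPV = {ppv:.0%}, NPV = {npv:.0%}")
```

The test itself is unchanged between the two runs; only the prevalence moves, and the PPV falls from 90% to 50%.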

This change can be described arithmetically by likelihood ratios. Likelihood ratios express how many more times (or fewer times) a test result is likely to be found in diseased people compared with non-diseased people.

                     DISEASE
                  Present   Absent
TEST   Positive      a         b
       Negative      c         d

LR +ve = [a / (a + c)] / [b / (b + d)]
LR -ve = [c / (a + c)] / [d / (b + d)]

Now this gets more complicated – again don’t worry too much about the maths.

Likelihood ratios - EXAMPLE

                     DISEASE
                  PRESENT   ABSENT
TEST   POSITIVE     9 (a)    9 (b)
       NEGATIVE     1 (c)   81 (d)

LR +ve = (9/10) / (9/90) = 0.9 / 0.1 = 9
LR -ve = (1/10) / (81/90) = 0.1 / 0.9 = 0.11
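A minimal sketch of the likelihood ratio arithmetic, applying the definitions LR +ve = sensitivity / (1 - specificity) and LR -ve = (1 - sensitivity) / specificity to the four cell counts 9, 9, 1 and 81. The function name is an illustrative assumption.

```python
def likelihood_ratios(a, b, c, d):
    """Return (LR+, LR-) from the cells of a diagnostic 2 x 2 table.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    sens = a / (a + c)
    spec = d / (b + d)
    lr_positive = sens / (1 - spec)
    lr_negative = (1 - sens) / spec
    return lr_positive, lr_negative


lr_pos, lr_neg = likelihood_ratios(a=9, b=9, c=1, d=81)
print(f"LR +ve = {lr_pos:.1f}, LR -ve = {lr_neg:.2f}")
```

Because both ratios are built only from sensitivity and specificity, the 50% prevalence table (45, 5, 5, 45) gives the same pair of values.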

New non-invasive tests for H. Pylori Gastritis. Comparison with tissue-based gold standard. Douglas O, et al. Digestive Diseases and Sciences 1996; 41:740-8

                     Sens.   Spec.   LR +ve   LR -ve
Urea Breath Test      90      96      22       0.10
Serum Anti-bodies     74      89       7       0.30

The key thing here is not the maths of LRs. What is important to realise is that ALL diagnostic and screening tests have false positive and false negative rates, and that the usefulness of an individual test result depends on both the ability of the test to discriminate disease from health AND THE PRE-TEST PROBABILITY OF THE INDIVIDUAL HAVING THE DISEASE. If someone has a low chance of having a disease then many positive results are false positive. Conversely, if someone has a very high chance of having a disease then a negative test may well be a false negative. Here comes another (different) magic nomogram!

                     +ve    -ve
       Prevalence    PTP    PTP
UBT       20%        85%     2%
          40%        95%     5%
Sab       20%        60%     5%
          40%        80%    12%

PTP - Post-test probability
UBT - urea breath test
Sab - serum antibody test

Work through these in the same way as we did earlier for the 4S study. Altering the baseline risk and how good the test is at discriminating (likelihood ratios) is enlightening.
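What the nomogram does can be sketched in a few lines: convert the pre-test probability to odds, multiply by the likelihood ratio, and convert back to a probability. The function name is an illustrative assumption, and the likelihood ratios used are the rounded values quoted for the urea breath test (22 and 0.10), so the output will only approximate figures read off a nomogram.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Post-test probability from a pre-test probability and a likelihood ratio."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Urea breath test at 20% prevalence, using the rounded LRs quoted on the slide
after_positive = post_test_probability(0.20, 22)
after_negative = post_test_probability(0.20, 0.10)
print(f"UBT, 20% prevalence: +ve result {after_positive:.0%}, -ve result {after_negative:.0%}")
```

Changing the first argument to 0.40, or swapping in the serum antibody likelihood ratios (7 and 0.30), reproduces the other rows to within nomogram reading error.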


H Pylori infection in a population with a 25% prevalence. MeReC Bulletin 2001; 12 (1): 1-4

                                 Sensitivity  Specificity  Positive predictive  Negative predictive  False positive  False negative
                                     (%)          (%)          value (%)            value (%)          results (%)     results (%)
Breath test (13C)                    96.5         96              89                   -                   11              -
Breath test (14C)                    97.5         95.5            88                   99                  12              1
Laboratory serological tests         91           90              75                   97                  25              3
Near-patient serological tests       86           75.5            54                   94                  46              6

We did this exercise for this table in a recent MeReC bulletin. The false positive results for serology and near patient tests make sobering reading.

SUMMARY

EVIDENCE BASED MEDICINE
FORMULATE QUESTION
EFFICIENTLY TRACK DOWN BEST AVAILABLE EVIDENCE
CRITICALLY REVIEW THE VALIDITY AND USEFULNESS OF THE EVIDENCE
IMPLEMENT CHANGES IN CLINICAL PRACTICE
EVALUATE PERFORMANCE

Whilst this is fine for academics, most busy clinicians won’t go and do Medline searches. They need summaries of evidence.

Evidence Based Medicine
“The evidence isn’t there” (whinge, moan) OR “I don’t have the time” (whine, complain)
Clinical Evidence
Cochrane
DTB, MeReC Bulletin
PRODIGY

LIMITATIONS STILL LOTS OF ROOM FOR DEBATE ABOUT THE EVIDENCE BASE EBM = WHAT IS BEST FOR AN INDIVIDUAL PATIENT (patient utility) EVIDENCE BASED PURCHASING = BEST USE OF HEALTH CARE RESOURCES FOR THE LOCAL POPULATION (cost utility). i.e. knowledge of local needs, priorities and constraints WHAT IF THESE CONFLICT? (Anybody want to mention beta interferon and MS?!)

EBM VISION FROM 1996 Modesty forbids, but this was done for a book written in 1996 as a signpost for the future – to try and convince sceptics that EBM was worth persevering with. Starting to look about right in 2003 innit?!