Presentation on theme: "Know Your Limitations Or, Debunking The Easy Way UKCSJ London June 18, 2014 Ivan Oransky VP, Global Editorial Director, MedPage Today"— Presentation transcript:
Know Your Limitations Or, Debunking The Easy Way UKCSJ London June 18, 2014 Ivan Oransky VP, Global Editorial Director, MedPage Today http://medpagetoday.com Co-founder, Retraction Watch http://retractionwatch.com @ivanoransky
“Somehow we feel that the ability to grab eyeballs by putting "crotch length" in a headline dictated editorial judgment on this story. Because it certainly fell short in delivering the goods that a man would need in order to evaluate the potential of this finding.”
Crotch Length and Fertility “Certain limitations warrant mention. As a referral center for male infertility, it was not always possible to blind observers to the men's diagnoses or fatherhood status which theoretically can lead to observer bias. Although, the magnitude of observed differences in AGD between fathers and infertile men (i.e. 40% in mean AGD and 45% in median AGD) suggests that any bias would be unlikely to affect the overall conclusions. Moreover, the current method of AGD measurement in adult men has not been studied, thus its accuracy and reproducibility were difficult to assess other than the performed comparison of measurements between investigators...
Crotch Length and Fertility “Future studies are necessary to compare techniques for measurement as well as other anatomic locations of the AGD measurement. In addition, only men referred to and evaluated in our clinic were eligible for enrollment; therefore, it is possible that our patient population does not represent all infertile men...
Crotch Length and Fertility “It is also important to note that the fertile controls were significantly older than the infertile patients. While age was not associated with AGD after accounting for fatherhood status and no evidence of effect modification by age was found, it is possible that AGD could change with age. In addition, while all patients were measured in the same position, some men were measured at the time of surgery under general anesthesia while others were awake. It is conceivable that anesthesia may affect measurements, although stratifying by anesthesia status did not affect the conclusions.”
“Numerous headlines tout that substituting a fist bump for a handshake reduces transmission of infection. But is this just one of many examples of the media sensationalizing the findings of a paper far beyond what it is due?” Guest Blogger: Skeptical Scalpel
“This study was designed as a pilot to explore the potential harmful effects of the handshake within the healthcare system, and as such it has several limiting factors. The study is limited by our small sample size and it could not assess statistical significance. Limited funding curtailed our ability to identify specific strains of bacteria. A larger study is planned to assess the level of significance between the handshake and fist bump as well as to assess virulence of the cultured strains. Furthermore, viral transmission is thought to be commonly transferred by skin contact, but was not assessed in this study.”
“Our study has several potential limitations. First, our study populations primarily consisted of white female nurses, which may limit the generalizability of the findings to other ethnic groups or males. Second, because diet was assessed by FFQs, measurement error of nut intake is inevitable, which may underestimate the true associations. Third, biochemical markers for type 2 diabetes (fasting glucose, insulin, lipids, and HbA1C, etc.) were not available in the full NHS cohorts, and thus could not be adjusted in the models...
Walnuts and Diabetes “Furthermore, habitual nut consumption was associated with several healthy lifestyle practices and may be a marker for an overall healthy lifestyle. Although we carefully controlled for a number of diabetes risk factors, unmeasured and residual confounding is still possible to explain the association and we could not fully exclude the potential influence from the overall diet quality and healthy cuisine effects.”
“Limitations of this study include its retrospective component, potential for recall bias, and cohort selectivity limited to the Motherisk database. Also, the use of different versions of the assessment instrument and broad age range of the children may be limiting factors.”
“In a more detailed assessment of the medical literature, in which two independent reviewers assessed the abstract and discussion sections of 300 medical research papers, published in first and second tier general medical and specialty journals, 73% of all papers were found to acknowledge a median of 3 limitations.”
Press Releases: They’re Not There
– Academic medical centers issue a mean of 49 press releases/year
– Among 200 randomly selected releases, 87 (44%) promoted animal or laboratory research, of which 64 (74%) explicitly claimed relevance to human health
– Among 95 releases about clinical research, 22 (23%) omitted study size and 32 (34%) failed to quantify results
– 113 releases promoted human research: 17% promoted randomized trials or meta-analyses; 40% reported on uncontrolled interventions, small samples (<30 participants), surrogate primary outcomes, or unpublished data, yet 58% lacked the relevant cautions
Woloshin S et al. Press releases by academic medical centers: not so academic? Ann Intern Med 2009;150:613-618
Get the full study and read it – “I think it’s journalistic malpractice to not have the full study in front of you when you’re reporting,” Oransky says.
How to Get Studies
– www.EurekAlert.org for embargoed material
– AHCJ membership includes access to Cochrane Library, Health Affairs, JAMA, and many other journals: www.healthjournalism.org
– ScienceDirect (Elsevier) gives reporters free access to hundreds of journals: www.sciencedirect.com
– Open access journals (e.g., Public Library of Science, www.plos.org)
– Ask press officers, or the authors
Ask “Dumb” Questions If you lack experience dealing with scientific material, don’t be afraid to ask for definitions of jargon and scientific terms. This is no time to pretend you understand everything. Oransky says the science and medical industries are full of jargon that masks important details. “You’ll get off the phone and have a notebook full of gibberish and jargon,” he says. “You can’t be afraid of asking a dumb question.”
Ask Smart Questions Was it: – Peer-reviewed? – Published? Where? Not all journals are created equal. “Dr. X said they published in Y rather than a clinical journal because the paper was too long for the word limits in the clinical journals. I'm not sure where a detail like that would go…but he was impressed with my question.”
Ask Smart Questions Was it in humans? – It’s remarkable there are any mice left with cancer, depression, or restless leg syndrome
Ask Smart Questions Size matters Look for the power calculation, and ask if you don’t see one
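To get a feel for what a power calculation is asking, here is a minimal sketch using the textbook normal-approximation formula for comparing two proportions. It is not the method of any study quoted in this talk; the function name, the event rates (20% vs. 10%), and the defaults of 5% two-sided significance and 80% power are all assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate participants needed per arm to detect a difference
    between two event rates (two-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_control - p_treatment
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical example: 20% of controls have the outcome vs. 10% on treatment.
print(sample_size_per_group(0.20, 0.10))
# A smaller expected difference needs far more participants.
print(sample_size_per_group(0.20, 0.15))
```

Under these assumptions the first case needs roughly 200 people per arm, so a trial with a few dozen participants claiming to detect that difference deserves a follow-up question.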
Ask Smart Questions Was it well-designed? From Covering Medical Research, Schwitzer/AHCJ
Ask Smart Questions “Were those your primary endpoints?” “Looks as though that endpoint reached statistical significance. Is that difference clinically significant?”
Who Could Benefit? How many people have the disease? Keep potential disease-mongering in mind
How Effective is the Treatment? Clinically significant endpoints, or surrogates – does this matter? Preventing complications? How many? Always remember to quantify results, not just “patients improved”
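One way to quantify results is to turn event counts into absolute risk reduction, relative risk reduction, and number needed to treat. The sketch below uses made-up counts (20 complications per 1,000 controls vs. 10 per 1,000 treated); the function name and the numbers are assumptions for illustration, not data from any study mentioned here.

```python
from math import ceil


def summarize_effect(control_events, control_total, treated_events, treated_total):
    """Express a trial result as absolute and relative risk reduction and NNT."""
    control_risk = control_events / control_total
    treated_risk = treated_events / treated_total
    arr = control_risk - treated_risk        # absolute risk reduction
    rrr = arr / control_risk                 # relative risk reduction
    # Number needed to treat: round up; undefined if treatment shows no benefit.
    nnt = ceil(round(1 / arr, 6)) if arr > 0 else None
    return {"arr": arr, "rrr": rrr, "nnt": nnt}


# Hypothetical counts, for illustration only.
result = summarize_effect(20, 1000, 10, 1000)
print(f"Relative risk reduction: {result['rrr']:.0%}")
print(f"Absolute risk reduction: {result['arr']:.1%}")
print(f"Number needed to treat:  {result['nnt']}")
```

With these hypothetical counts, "cuts risk in half" and "helps 1 person in 100" describe the same result, which is why asking for the absolute numbers matters.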
What Are The Side Effects? Every treatment has them Where to look: – Go beyond press releases and abstracts – Look at tables, charts, and results sections
Who Dropped Out? Why did they leave the trial? Intention to treat analysis
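Why dropouts matter can be shown with a small sketch. It compares a completers-only response rate with an intention-to-treat style rate that counts everyone randomized and treats dropouts as non-responders. That worst-case handling of dropouts is only one convention, assumed here for simplicity, and the function name and counts are hypothetical.

```python
def response_rates(randomized, completed, improved):
    """Compare response rates with and without the people who dropped out."""
    completers_only = improved / completed  # ignores everyone who left the trial
    itt_worst_case = improved / randomized  # dropouts counted as not improved
    return completers_only, itt_worst_case


# Hypothetical arm: 100 randomized, 70 finished, 50 of those improved.
completers_only, itt_worst_case = response_rates(100, 70, 50)
print(f"Completers only:    {completers_only:.0%}")
print(f"Intention to treat: {itt_worst_case:.0%}")
```

In this made-up arm the same 50 responders look like a 71% or a 50% success rate, depending on whether the 30 people who left are counted.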
How Much Does it Cost? If it’s ready to be the subject of a story, someone has projected the likely cost and market. – At least ask.
Who Has an Interest? Disclose conflicts – PharmedOut.org – Dollars For Docs series http://projects.propublica.org/docdollars/
Are There Alternatives? Did the study compare the new treatment to existing alternatives, or to placebo? What are the advantages and disadvantages (and costs) of those existing alternatives? Consider alternative explanations. Remember coffee and pancreatic cancer?
Don’t Rely Only on Study Authors Find outside sources. Here’s how:
“The best piece I saw was by John Gever, a dependable debunker at MedPage Today, who led by saying the test's "accuracy fell short of what would normally be acceptable for a screening test." “That's a perspective I hadn't seen anywhere else. And Gever reports that the test has a "positive predictive value" of 35%, which means "nearly two-thirds of positive screening results would be false." In other words, out of every 10 people who tested positive, two-thirds of them--say, six or seven--would be told they will develop Alzheimer's disease when that was not the case. “If Gever's math is correct--I can't vouch for it, but he has a long track record--then the test is nowhere close to being useful for screening patients.”
“But they left out an important statistic for judging the usefulness of such a test, as it would be applied in the clinic -- the positive predictive value (PPV) or the accuracy of positive results seen in the target population (in this case, cognitively healthy seniors). “Contrary to what I later learned is popular belief, calculating a PPV is easy, requiring nothing more than fourth grade arithmetic.
Some Math: Alzheimer’s “So let's look at an assay with sensitivity and specificity rates of 90%, to be used in a population of 1,000 people in which 5% will actually convert to cognitive impairment. We know that 50 of these 1,000 will convert and 950 will not. Of those 50 converters, 10% will falsely test negative, and, of the 950 nonconverters, 10% will falsely test positive. “That means the testing program produces a total of 140 positive results, 45 of which are correct and 95 false. The PPV is just the fraction of total positives that are correct -- in this example, 45/140 or 32%.
Some Math: Alzheimer’s “Even if the conversion rate to cognitive impairment in healthy seniors is 10% instead of 5%, the PPV remains low. In 1,000 people, 100 will be true positives and 900 will be true negatives; thus, with the same test sensitivity and specificity, there will be 180 positive results, split evenly between true and false, for a PPV of 50%.
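The arithmetic in the two examples above can be checked with a short script. This is a minimal sketch, not Gever's own calculation; the function name and the default population of 1,000 are assumptions made for illustration.

```python
def positive_predictive_value(sensitivity, specificity, prevalence, population=1000):
    """Return (true_positives, false_positives, ppv) for a screening test."""
    affected = population * prevalence            # people who will convert
    unaffected = population - affected            # people who will not
    true_positives = affected * sensitivity
    false_positives = unaffected * (1 - specificity)
    ppv = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, ppv


# Same test (90% sensitivity and specificity) at the two conversion rates quoted.
for prevalence in (0.05, 0.10):
    tp, fp, ppv = positive_predictive_value(0.90, 0.90, prevalence)
    print(f"prevalence {prevalence:.0%}: {tp:.0f} true + {fp:.0f} false positives, PPV {ppv:.0%}")
# prevalence 5%: 45 true + 95 false positives, PPV 32%
# prevalence 10%: 90 true + 90 false positives, PPV 50%
```

In the 10% case, 100 of the 1,000 people convert and 90 of them test positive, alongside 90 false positives among the 900 who do not convert, which is where the 180 positive results and the even split come from.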