SADC Course in Statistics Preparing & presenting epidemiological information: I (Session 07)
To put your footer here go to View > Header and Footer 2
Learning Objectives
At the end of this session, you will be able to:
- access and browse vast amounts of web-based epidemiological material
- explain, and at a basic level discuss, a number of broad terms used to describe the quality of epidemiological and other data
- recognise issues about inaccuracy in the ascertainment of binary data

Reference Material: 1
Interested in epidemiology and want to learn more? A free way is to type "epidemiology supercourse" into a web search engine. From several sites you can access: a global repository of lectures on public health and prevention targeting educators across the world. Supercourse has a network of over 42,500 scientists in 174 countries who are sharing for free a library of over 3,232 lectures in 26 languages. The concept of the Supercourse and its lecture style has been described as the Global Health Network University.

Reference Material: 2
N.B. One of the sites is the South African Medical Research Council's.
N.B. The site includes 2 best-selling books, Statistics at Square One and Epidemiology for the Uninitiated: read or download the whole of these books free of charge.
N.B. Some lectures are wordy and easy to follow, but some are more cryptic or more medically scientific.

Epidemiological data in general
Most data are reported as graphs, charts and tables of standard sorts. The subject matter makes these epidemiological, but the reporting methods are standard (see journal papers, epidemiology books, the Supercourse etc. for endless examples). Quality and appropriateness often need to be checked: that is our concern here.

The critical approach
To give a good summary or presentation of some data, you need to ensure it is explained well enough that you have answered reasonable questions about your data. Some will be specific: about the study location, the reasons why the epidemiological problem is of significance in that place, etc. Many will be generic, as below: these are Qs you should ask about other people's study write-ups as well as answering in your own!

General concept of reliability
This is about individual measurements. There are several concepts/measurements. One, where there is no objective measurement, is inter-rater reliability: this concerns measures of agreement between trained observers as to the scores they give to a set of subjects, e.g. students' essays on a set theme. Test/retest reliability concerns whether different versions of the same attainment test give the students essentially the same grades or ranks.

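As a rough illustration of a "measure of agreement" between two observers, the sketch below computes Cohen's kappa, which compares the observed agreement with the agreement expected by chance. Kappa is only one commonly used measure and is not named in the slides; the grades given by the two raters are invented for illustration.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters scoring the same subjects."""
    n = len(ratings_a)
    # Proportion of subjects on which the two raters gave the same score.
    observed = sum(x == y for x, y in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical grades given by two trained raters to the same ten essays.
rater_1 = ["A", "B", "B", "C", "A", "B", "C", "C", "A", "B"]
rater_2 = ["A", "B", "C", "C", "A", "B", "C", "B", "A", "A"]

kappa = cohens_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.2f}")
```

A kappa of 1 means perfect agreement and a kappa near 0 means agreement no better than chance.
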
Repeatability/reproducibility: 1
Especially in industrial settings, the reliability idea is extended. Repeatability concerns whether the same observer, using the same methods and instruments, would get essentially the same answer measuring the same thing. If not, the measurement itself is weak. Reproducibility concerns whether different observers (e.g. different labs) using differing methods and instruments get very nearly the same answer. If yes, the measurement is quite robust.

Repeatability/reproducibility: 2
Note that in more complex statistical studies, the variance of measurements is broken down into components of variation attributed to separable sources, e.g. method, observer, instrument and laboratory variability OR, in a survey, interviewer, community, ethnic group and other effects. This uses a form of the general technique called analysis of variance: see higher modules.

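The simplest case of this breakdown can be sketched as follows: with a single source (here, the observer), the total sum of squares splits exactly into a between-observer part and a within-observer part. This is only the one-way case, not the full variance-components analysis mentioned above, and the readings are invented for illustration.

```python
from statistics import mean


def sum_of_squares_split(groups):
    """Split the total sum of squares into between-group and within-group parts."""
    all_values = [v for g in groups.values() for v in g]
    grand = mean(all_values)
    # Variation of the group means around the grand mean.
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    # Variation of individual readings around their own group mean.
    within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    total = sum((v - grand) ** 2 for v in all_values)
    return between, within, total


# Hypothetical repeat readings of the same item by three observers.
readings = {
    "observer_1": [10.1, 10.3, 10.2],
    "observer_2": [10.6, 10.8, 10.7],
    "observer_3": [9.9, 10.0, 10.1],
}

between, within, total = sum_of_squares_split(readings)
print(f"between = {between:.2f}, within = {within:.2f}, total = {total:.2f}")
```

In these made-up data most of the variation lies between observers, which would point to a reproducibility problem rather than a repeatability problem.
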
Use of reliability
For a given scenario, a relevant, plausible check on measurement reliability needs to be devised, explained and used. No single standard method exists. Often the check is against a gold standard that is too expensive to use all the time, e.g. self-administered depression questionnaire results are checked for a sample of patients against gold-standard diagnoses after full-scale examination by a trained psychiatrist.

General concept of validity
It is fairly obvious what constitutes an adequate measurement of the height of a child standing up straight. Instruments and procedures will affect accuracy, but the concept is clear. A more abstract idea will be harder, e.g. a set of questionnaire measures to assess user satisfaction with local peri-natal care provision. Do all concerned agree that the set of Qs covers all aspects of what may (dis-)satisfy a user, in a brief but comprehensive and balanced way?

Use of validity
Validity is much more a social science concept than reliability. The question "Does the measurement system properly reflect what it should?" raises other Qs, e.g. "According to whom?" For example, a poor community's ideas of wealth/poverty may not involve the same elements as an expert economist's idea. A multi-disciplinary, consultative check is often needed for measures of relatively big abstract ideas.

Accuracy vs. precision
Repeated measures of weight on a scale may be very precise (and therefore repeatable) if the scale gives almost the same measurement each time an object is weighed, but can still be consistently wrong if the scale is not calibrated correctly. If the measurement is precise and gives virtually the correct answer each time, it is accurate. Checking this requires reference to a gold standard to know what is correct.

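The distinction can be illustrated numerically. In the sketch below, precision is summarised by the standard deviation of repeated readings and calibration error by the difference between their mean and a reference value. The reference weight and both sets of readings are invented for illustration.

```python
from statistics import mean, stdev


def bias_and_spread(readings, reference):
    """Return (mean error against the reference, standard deviation of readings)."""
    return mean(readings) - reference, stdev(readings)


# Hypothetical gold-standard weight of the object, in kg.
true_weight = 50.0

# Scale 1: very repeatable, but badly calibrated (reads about 2 kg high).
miscalibrated = [51.9, 52.0, 52.1, 52.0, 52.0]
# Scale 2: correct on average, but individual readings scatter widely.
noisy = [48.5, 51.6, 49.2, 50.9, 49.8]

bias_1, spread_1 = bias_and_spread(miscalibrated, true_weight)
bias_2, spread_2 = bias_and_spread(noisy, true_weight)
print(f"scale 1: bias = {bias_1:+.2f} kg, spread = {spread_1:.2f} kg")
print(f"scale 2: bias = {bias_2:+.2f} kg, spread = {spread_2:.2f} kg")
```

Scale 1 is precise but consistently wrong; scale 2 is right on average but imprecise. Neither would count as accurate in the sense used above, which needs both small spread and small bias.
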
Critiquing a set of measurements
Human and veterinary epidemiologists seldom measure just one thing on a subject. The overall quality of the profile of measurements is often looked at to ensure readings are mutually consistent and fit with the selection criteria, e.g. the parents' perception of a child's development and health is looked at along with height, weight and age.

Sampling
Often a main variable in an epidemiological study will be zero/one and the sample size will need to be very large OR sometimes a very detailed study has to be restricted to quite a small sample. Each poses questions about how the sample was chosen. This is often not well described or justified, e.g. a small study claims to describe a whole country though in fact it was conducted in 1 or 2 tiny areas!

Large samples
Often a sample of hundreds or thousands is treated as if it were a simple random sample, though it was collected in a stratified and/or clustered fashion, or may even be a number of arbitrarily selected convenience samples. Ask what the sample design was, how it was taken into account in the analysis and write-up, what (if anything) the sample can claim to represent, and in what ways it might be biased or untypical.

Errors in binary data
For any data, including binary, there are possible errors in sampling from the wrong frame, sampling biases and so on, but measurement error for observations of one binary variable reduces to false negatives and false positives. The rates at which these arise can only be assessed by using "the usual" data collection procedures on samples whose Yes/No status is incontrovertibly known, i.e. a gold standard. See the table that follows the example on the next slide for concepts.

Example
Consider a population of 100,000 which in reality has 5% seropositive for HIV. All members are screened using a second-generation ELISA assay whose sensitivity is 98% [i.e. out of 100 positive individuals the test will on average detect 98] and whose specificity is 99% [i.e. out of 100 negative individuals the test will on average correctly classify 99 as negative].

THE INFECTION STATUS
TEST RESULT   INFECTED                  NOT INFECTED                    TOTAL
Positive      TRUE POSITIVE (A)         FALSE POSITIVE (B)              ALL TEST POSITIVES (A+B)
Negative      FALSE NEGATIVE (C)        TRUE NEGATIVE (D)               ALL TEST NEGATIVES (C+D)
Total         ALL TRULY INFECTED (A+C)  ALL TRULY NOT INFECTED (B+D)    TOTAL POPULATION (A+B+C+D)

Measures of data quality: 1
A general outcome measure is
\[ \text{prevalence} = \frac{\text{all truly infected}}{\text{total population of interest}} = \frac{A + C}{A + B + C + D} \]
Genuine quality measures include
\[ \text{sensitivity} = \frac{\text{true positives}}{\text{all truly infected}} = \frac{A}{A + C} \]
and
\[ \text{specificity} = \frac{\text{true negatives}}{\text{all truly non-infected}} = \frac{D}{B + D} \]

Measures of data quality: 2
\[ \text{Positive Predictive Value (PPV)} = \frac{\text{true positives}}{\text{all test positives}} = \frac{A}{A + B} \]
\[ \text{Negative Predictive Value (NPV)} = \frac{\text{true negatives}}{\text{all test negatives}} = \frac{D}{C + D} \]
See the expected numbers for the example on the next slide. These are expected values, so there is no statistical variation in them!

THE INFECTION STATUS
ELISA RESULT  INFECTED                         NOT INFECTED                          TOTAL
Positive      TRUE POSITIVE 4,900 (A)          FALSE POSITIVE 950 (B)                ALL ELISA POSITIVE 5,850 (A+B)
Negative      FALSE NEGATIVE 100 (C)           TRUE NEGATIVE 94,050 (D)              ALL ELISA NEGATIVE 94,150 (C+D)
Total         ALL TRULY INFECTED 5,000 (A+C)   ALL TRULY NOT INFECTED 95,000 (B+D)   TOTAL POPULATION 100,000 (A+B+C+D)
Here B = 95,000 - 94,050 = 950 and B+D = 100,000 - 5,000 = 95,000.

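The expected counts in the table above follow directly from the three assumed rates. A minimal sketch of that arithmetic is given below; the function name `expected_table` is hypothetical, and exact fractions are used only so that the counts come out as whole numbers without rounding.

```python
from fractions import Fraction


def expected_table(population, prevalence, sensitivity, specificity):
    """Return the expected counts (A, B, C, D) of the 2x2 screening table."""
    infected = population * prevalence
    not_infected = population - infected
    a = infected * sensitivity       # true positives
    c = infected - a                 # false negatives
    d = not_infected * specificity   # true negatives
    b = not_infected - d             # false positives
    return a, b, c, d


# Rates assumed in the example: 5% prevalence, 98% sensitivity, 99% specificity.
A, B, C, D = expected_table(
    100_000, Fraction(5, 100), Fraction(98, 100), Fraction(99, 100)
)
print(A, B, C, D)        # 4900 950 100 94050
print(A + B, C + D)      # 5850 94150
```
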
Example results
Prevalence = 5,000/100,000 = 5%
Sensitivity = 4,900/5,000 = 98%
Specificity = 94,050/95,000 = 99%
(all as assumed and built into the table)
PPV = A/(A+B) = 4,900/5,850 = 83.8%*
NPV = D/(C+D) = 94,050/94,150 = 99.89%
*So about 1/6th of all positives are false positives, because of the large proportion of uninfected people in the overall population.

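These results can be checked by applying the definitions of the five measures to the four cell counts of the example table. The sketch below does that; the function name `screening_measures` is hypothetical.

```python
def screening_measures(a, b, c, d):
    """Compute prevalence and test-quality measures from the 2x2 cell counts.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    total = a + b + c + d
    return {
        "prevalence": (a + c) / total,
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "ppv": a / (a + b),
        "npv": d / (c + d),
    }


# Expected counts from the ELISA example table.
measures = screening_measures(a=4900, b=950, c=100, d=94050)
for name, value in measures.items():
    print(f"{name}: {value:.2%}")
```
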
Example conclusions
These results come from thinking about what might be expected given certain supposed rates in the population. Such rates are generally not knowable in reality, but the results show the need for caution in interpreting real binary data! If these were the real figures, we would expect on further examination to find that about 1/6 of the positives from initial testing were NOT real cases. Concepts of error/accuracy are as important here as for measurement data!

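The dependence of the false-positive share on the proportion of uninfected people can be made concrete by recomputing the PPV for other prevalences while keeping the test unchanged. In the sketch below, the 98% sensitivity and 99% specificity are those of the example; the prevalences other than 5% are hypothetical values chosen for illustration.

```python
def ppv(prevalence, sensitivity=0.98, specificity=0.99):
    """Positive predictive value for a given prevalence and test quality."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


# Same test applied to populations with different (hypothetical) prevalences.
for p in (0.001, 0.01, 0.05, 0.20):
    print(f"prevalence {p:.1%}: PPV = {ppv(p):.1%}")
```

The rarer the condition, the larger the share of test positives that are false, even though the test itself has not changed.
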
Practical work follows to ensure learning objectives are achieved…