1 Statistical Methods for Alerting Algorithms in Biosurveillance
Howard S. Burkom
The Johns Hopkins University Applied Physics Laboratory
National Security Technology Department
Washington Statistical Society Seminar
February 3, 2006
National Center for Health Statistics
Hyattsville, MD
2 ESSENCE Biosurveillance Systems
ESSENCE: An Electronic Surveillance System for the Early Notification of Community-based Epidemics
Monitoring health care data from ~800 military treatment facilities since Sept. 2001
Evaluating data sources
  Civilian physician visits
  OTC pharmacy sales
  Prescription sales
  Nurse hotline/EMS data
  Absentee rate data
Developing & implementing alerting algorithms

In the ESSENCE system, outpatient visit and emergent care data are collected from about 800 military treatment facilities (MTFs) worldwide. We get syndromic counts from these sites on a daily basis, so we are looking at data streams with a daily sample rate, which in some cases is increasing. Put these remarks in the context of multiple streams of syndromic counts/rates each day.
3 Outline of Talk
Prospective Syndromic Surveillance: introduction, challenges
Algorithm Evaluation Approaches
Statistical Quality Control in Health Surveillance
Data Modeling and Process Control
Regression Modeling Approach
Generalized Exponential Smoothing
Comparison Study
Summary & Research Directions
4 Required Disciplines: Medical/Epi
Medical/Epidemiological
  filtering/classifying clinical records => syndromes
  interpretation/response to system output
  coding/chief complaint interpretation
5 Required Disciplines: Informatics
Information Technology
  surveillance system architecture
  data ingestion/cleaning
  interface between health monitors and system
6 Required Disciplines: Analytics
Analytical
  Statistical hypothesis tests
  Data mining/automated learning
  Adaptation of methodology to background data behavior
7 Essential Task Interaction in Volatile Data Background
Medical/Epidemiological
  filtering/classifying clinical records => syndromes
  interpretation/response to system output
  coding/chief complaint interpretation
Information Technology
  surveillance system architecture
  data ingestion/cleaning
  interface between health monitors and system
Analytical
  Statistical hypothesis tests
  Data mining/automated learning
  Adaptation of methodology to background data behavior
8 The Multivariate Temporal Surveillance Problem
Varying Nature of the Data:
  Scale, trend, day-of-week, and seasonal behavior, depending on grouping
Multivariate Nature of Problem:
  Many locations
  Multiple syndromes
  Stratification by age, gender, other covariates
Surveillance Challenges:
  Defining anomalous behavior(s)
  Hypothesis tests that are both appropriate and timely
  Avoiding excessive alerting due to multiple testing
    Correlation among data streams
    Varying noise backgrounds
  Communication with/among users at different levels
  Data reduction and visualization
9 Data issues affecting monitoring
Statistical properties (most suitable for modeling without data-specific information)
  Scale and random dispersion
Periodic effects
  Day-of-week effects, seasonality
Delayed (often variably) availability in monitoring system
Trends, long or short term; many causes, incl. changes in:
  Population distribution or demographic composition
  Data provider participation
  Consumer health care behavior
  Coding or billing practices
Prolonged data drop-outs, sometimes with catch-ups
Outliers unrelated to infectious disease levels
  Often due to problems in data chain
  Inclement weather
  Media reports (example: the “Clinton effect”)
10 Forming the Outcome Variable: Binning by Diagnosis Code
13 Dynamic Detection
Simulated Data
14 Example with Detection Statistic Plot
Plot labels: Threshold; Injected Cases Presumed Attributable to Outbreak Event
15 Comparing Alerting Algorithms
Criteria:
Sensitivity
  Probability of detecting an outbreak signal
  Depends on effect of outbreak in data
Specificity (1 – false alert rate)
  Probability(no alert | no outbreak)
  May be difficult to prove no outbreak exists
Timeliness
  Once the effects of an outbreak appear in the data, how soon is an alert expected?
16 Modeling the Signal as Epicurve of Primary Cases
Need “data epicurve”: time series of attributable counts above background
Plausible to assume proportional to epidemic curve of infected
Sartwell lognormal model gives idealized shape for a given disease type

Epicurve: plot of number of symptomatic cases by day.
The canonical idea of a bioterrorist attack is a localized, point-source outbreak, such as the 1979 accident at Sverdlovsk, where weaponized anthrax spores were released in aerosol form, an unknown number of people were infected, and about 70 died.
The magenta dotted curve shows the actual epicurve we constructed from a plot in the 1992 Meselson paper.
We have taken data such as these to calculate zeta and sigma for a disease-specific lognormal distribution, and can then plot the “maximum likelihood epicurve”.
For a lognormal with log-scale mean zeta and standard deviation sigma, the modal day is exp(zeta - sigma^2). In constructing a test signal, we set the modal number of cases to a multiple of the estimated standard deviation of the time series of interest, then divide by the modal probability to get the total number infected, and we add the resulting counts to the authentic data.
Sartwell PE. The distribution of incubation periods of infectious disease. Am J Hyg 1950; 51:310-318.
17 Signal Modeling: Realizations of Smallpox Epicurve
Each symptomatic case a random draw
Plot label: “maximum likelihood” epicurve

However, we do not assume the maximum likelihood epicurve in our simulation; we form a stochastic signal using the calculated N, zeta, and sigma. For each of the N simulated cases, we take a random draw from the lognormal distribution with parameters zeta and sigma, just as each individual’s incubation period could be seen as a random draw from distributions of dosage and susceptibility. Each trial is then a set of N such random draws, giving a large set of random signals. These signals are what we add to the noise background of authentic data.
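The signal construction described on slides 16 and 17 can be sketched roughly as follows. This is a minimal illustration, not the code used in the study: the function names, the values of zeta and sigma, the flat background series, and the choice to bin onset times into whole days are all assumptions made for the example.

```python
import math
import random
from collections import Counter

def lognormal_cdf(x, zeta, sigma):
    """CDF of a lognormal whose log has mean zeta and standard deviation sigma."""
    if x <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(x) - zeta) / (sigma * math.sqrt(2.0))))

def daily_probabilities(zeta, sigma, n_days):
    """Probability that symptom onset falls in day d, for d = 0 .. n_days - 1."""
    return [lognormal_cdf(d + 1, zeta, sigma) - lognormal_cdf(d, zeta, sigma)
            for d in range(n_days)]

def total_cases(peak_cases, zeta, sigma, n_days=200):
    """Divide the desired modal-day count by the modal-day probability to get N."""
    modal_probability = max(daily_probabilities(zeta, sigma, n_days))
    return round(peak_cases / modal_probability)

def simulate_epicurve(n_cases, zeta, sigma, rng):
    """One stochastic signal: one lognormal incubation draw per case, binned by day."""
    days = [int(rng.lognormvariate(zeta, sigma)) for _ in range(n_cases)]
    counts = Counter(days)
    return [counts[d] for d in range(max(days) + 1)]

def inject(background, signal, start_day):
    """Add the signal to a copy of the background, starting at start_day."""
    spiked = list(background)
    for offset, extra in enumerate(signal):
        if start_day + offset < len(spiked):
            spiked[start_day + offset] += extra
    return spiked

rng = random.Random(2006)        # fixed seed so the sketch is repeatable
zeta, sigma = 2.4, 0.35          # hypothetical disease parameters, not from the talk
background = [20] * 90           # stand-in for an authentic daily count series
background_sd = 4.0              # assumed standard deviation of that series

# Modal count set to 2 standard deviations of the background (an arbitrary multiple).
n_cases = total_cases(2.0 * background_sd, zeta, sigma)
signal = simulate_epicurve(n_cases, zeta, sigma, rng)
spiked = inject(background, signal, start_day=10)
```

Repeating `simulate_epicurve` with fresh draws gives the large set of random signals that the slide describes; each one would be injected into the authentic background in a separate trial.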
18 Assessing Algorithm Performance
Summary processing: measure dependence of sensitivity or timeliness on false alert rate (ROC or AMOC curves, or key sample values at practical rates)
Sensitivity/specificity as a function of threshold: Receiver Operating Characteristic (ROC)
  Plot axes: Detection Probability (sensitivity) vs. False Alert Rate (1 – specificity), traced out by varying the threshold
Timeliness/specificity as a function of threshold: Activity Monitor Operating Characteristic (AMOC)
  Plot axes: Timeliness Score (e.g., mean or median time to alert) vs. False Alert Rate (1 – specificity), traced out by varying the threshold
Actual false alert rate of interest depends on:
  resources of public health dept using the system
  “prior” likelihood of an outbreak (DHS threat level)
However, we do NOT look at area under the curve, but at detection probability (PD) at alert rates of interest
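A single ROC/AMOC operating point of the kind described above might be computed as in the sketch below. The data layout is an assumption for illustration: `background_stats` holds the detection statistic on days with no injected signal, and each entry of `trials` holds the statistic on the successive days of one injected outbreak. The numbers are made up.

```python
def false_alert_rate(background_stats, threshold):
    """Fraction of signal-free days on which the statistic exceeds the threshold."""
    return sum(s > threshold for s in background_stats) / len(background_stats)

def threshold_for_rate(background_stats, target_rate):
    """Smallest observed statistic value whose false alert rate is <= target_rate."""
    for t in sorted(background_stats):
        if false_alert_rate(background_stats, t) <= target_rate:
            return t
    return max(background_stats)  # defensive fallback; the loop returns for any rate >= 0

def detection_summary(trials, threshold):
    """Detection probability and mean days-to-first-alert over injected trials."""
    delays = []
    for stats in trials:
        first = next((i for i, s in enumerate(stats) if s > threshold), None)
        if first is not None:
            delays.append(first)
    p_detect = len(delays) / len(trials)
    mean_delay = sum(delays) / len(delays) if delays else None
    return p_detect, mean_delay

# Toy values, purely illustrative.
background_stats = [0.1, 0.5, 1.2, 0.3, 2.8, 0.9, 1.7, 0.4, 0.2, 3.1]
trials = [[1.0, 3.0, 4.0], [0.5, 1.0, 2.0], [3.5, 4.0, 1.0], [2.0, 2.9, 3.3]]

threshold = threshold_for_rate(background_stats, target_rate=0.1)
p_detect, mean_delay = detection_summary(trials, threshold)
```

Sweeping `target_rate` over the practical range and recording `p_detect` gives ROC points; recording `mean_delay` instead gives AMOC points. Reading off values at a few alert rates of interest, rather than integrating the curve, matches the emphasis on the slide.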
20 Quality Control Charts and Health Surveillance
Benneyan JC, Statistical Quality Control Methods in Infection Control and Hospital Epidemiology, Infection Control and Hospital Epidemiology, Vol. 19, (3)
  Part I: Introduction and Basic Theory
  Part II: Chart use, statistical properties, and research issues
  1998 survey article gives 135 references
Many applications: monitoring surgical wound infections, treatment effectiveness, general nosocomial infection rate, …
Monitoring process for “special causes” of variation
  Organize data into fixed-size groups of observations
  Look for out-of-control conditions by monitoring mean, standard deviation, …
General 2-phase procedure:
  Phase I: Determine mean m and standard deviation s of process from historical “in-control” data; control limits often set to m ± 3s
  Phase II: Apply control limits prospectively to monitor process graphically
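The two-phase procedure can be illustrated with a small sketch. The function names and the toy history are hypothetical; the only element taken from the slide is the use of limits at the in-control mean plus or minus three standard deviations.

```python
import statistics

def phase_one_limits(in_control, k=3.0):
    """Phase I: estimate m and s from in-control history and return (m - k*s, m + k*s)."""
    m = statistics.mean(in_control)
    s = statistics.stdev(in_control)   # sample standard deviation
    return m - k * s, m + k * s

def phase_two_flags(new_values, limits):
    """Phase II: flag each new observation that falls outside the fixed limits."""
    lower, upper = limits
    return [not (lower <= x <= upper) for x in new_values]

history = [12, 15, 11, 14, 13, 16, 12, 14, 13, 15]   # made-up in-control counts
limits = phase_one_limits(history)
flags = phase_two_flags([14, 17, 25, 13], limits)
```

With this history the mean is 13.5 and the sample standard deviation is about 1.58, so only the value 25 falls outside the limits. The limits stay fixed after Phase I, which is exactly the property the following slides question for surveillance data whose background drifts.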
21 Adaptation of Traditional Process Control to Early Outbreak Detection
On adapting statistical quality control to biosurveillance:
Woodall, W.H. (2000). “Controversies and Communications in Statistical Process Control”, Journal of Quality Technology 32, pp
  “Researchers rarely…put their narrow contributions into the context of an overall SPC strategy. There is a role for theory, but theory is not the primary ingredient in most successful applications.”
Woodall, W.H. (2006, in press). “The Use of Control Charts in Health Care Monitoring and Public Health Surveillance”
  “In industrial quality control it has been beneficial to carefully distinguish between the Phase I analysis of historical data and the Phase II monitoring stage”
  “It is recommended that a clearer distinction be made in health-related SPC between Phase I and Phase II…”
Does infectious disease surveillance require an “ongoing Phase I” strategy to maintain robust performance?
22 Statistical Process Control in Advanced Disease Surveillance
Key application issues:
Background data characteristics change over time
  Hospital/clinic visits and consumer purchases are not governed by physical science or engineering
  But monitoring requires robust performance: algorithms must be adaptive
Target signal: effect of infectious disease outbreak
  Transient signal, not a mean shift
  May be sudden or gradual
23 The Challenge of Data Modeling for Daily Health Surveillance
Conventional scientific application of regression
  Do covariates such as age or gender affect treatment? Does treatment success differ among sites if we control for covariates?
  Studies use static data sets with exploratory analysis
In surveillance, we model to predict data levels in the absence of the signal of interest
  Need reliable estimates of expected levels to recognize abnormal levels
  Data sets are dynamic: covariate relationships change
24 The Challenge of Data Modeling for Daily Health Surveillance, cont’d
Modeling to generate expected data levels
  Predictive accuracy matters, not just strength of association or overall goodness-of-fit
  For a gradual outbreak, recent data can “train” model to predict abnormal levels
Alerting decisions based on model residuals
  Residual = observed value – modeled value
  Conventional approach:
    assume residuals fit a known distribution (normal, Poisson, …)
    hypothesis test for membership in that distribution
  For surveillance, can also apply control-chart methods to residuals
25 Monitoring Data Series with Systematic Features
Problem: How to account for short-term trends and cyclic data features in alerting decisions?
Approaches
  Data Modeling
    Regression: GLM, ARIMA, others & combinations
  Signal Processing
    LMS filters and wavelets
  Exponential Smoothing: generalizes EWMA
26 Example: OTC Purchasing Behavior Influenced by Many Factors
Loglinear Regression Example: Tracking Daily Sales of Flu Remedies
  $\log(Y) = b_0 + b_{1\text{-}6}\,d + b_7\,t + b_{8\text{-}9}\,h + b_{10}\,w + b_{11}\,p + e$
where:
  Y = daily count of anti-flu sales
  d = day of week (6 indicators)
  t = linear trend
  h = harmonic (seasonal)
  w = weather (temp.)
  p = sales promotion (indicator)
  e = deviation (Poisson dist.)

2a. The black curve on this plot shows daily sales of flu remedies from a large urban region.
2b. The blue curve gives the fitted values from a single regression model over the entire interval on all data features shown except for sales promotions; will discuss model specifics on next slides.
2c. The drop called “unusual weather” was during a heatwave.
27 Recent Surveillance Method Based on Loglinear Regression
Modeling emergency department visit patterns for infectious disease complaints: results and application to disease surveillance
Judith C Brillman, Tom Burr, David Forslund, Edward Joyce, Rick Picard and Edith Umland
BMC Medical Informatics and Decision Making 2005, 5:4, pp 1-14
Modeling visit counts on day d:
  Let S(d) = log(visits(day d) + 1), the “started log”
  $S(d) = \left[\sum_{i=1}^{7} c_i\, I_i(d)\right] + \left[c_8 + c_9\, d\right] + \left[c_{10}\cos(kd) + c_{11}\sin(kd)\right]$, with k = 2π/…
  c1-c7: day-of-week effects
  c9: long-term trend
  c10-c11: seasonal harmonic terms
Training period: 3036 days ~ 8.33 years
Test period: 1 year
28 Brillman et al., Figure 1
29 EWMA Monitoring
Exponentially Weighted Moving Average
Average with most weight on recent $X_k$:
  $S_k = w X_k + (1-w) S_{k-1}$, where $0 < w < 1$
Test statistic:
  $S_k$ compared to expectation from sliding baseline
  Basic idea: monitor $(S_k - m_k)/s_k$
Added sensitivity for gradual events
Larger w means less smoothing
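A minimal EWMA sketch follows, written in the convention where w multiplies the newest observation, so that a larger w means less smoothing. The scaling in `ewma_statistic` is one possible reading of the slide's $(S_k - m_k)/s_k$: it assumes $s_k$ is the baseline standard deviation adjusted by the long-run EWMA factor $\sqrt{w/(2-w)}$, which the slide does not spell out.

```python
import math

def ewma(values, w, start=None):
    """Return the series S_k = w * X_k + (1 - w) * S_{k-1}."""
    if not 0.0 < w < 1.0:
        raise ValueError("w must lie strictly between 0 and 1")
    s = values[0] if start is None else start   # initial smoothed value S_0
    smoothed = []
    for x in values:
        s = w * x + (1.0 - w) * s
        smoothed.append(s)
    return smoothed

def ewma_statistic(s_k, baseline_mean, baseline_sd, w):
    """Standardize S_k against a baseline mean and an EWMA-adjusted baseline sd."""
    sigma_s = baseline_sd * math.sqrt(w / (2.0 - w))
    return (s_k - baseline_mean) / sigma_s

counts = [10, 10, 10, 20]                 # toy series with a jump on the last day
heavy_smoothing = ewma(counts, w=0.25)    # last value 12.5
light_smoothing = ewma(counts, w=0.5)     # last value 15.0
z = ewma_statistic(light_smoothing[-1], baseline_mean=10.0, baseline_sd=2.0, w=0.5)
```

The two runs show the trade-off on the slide: the smaller w reacts less to the single-day jump, which stabilizes the statistic but delays response to sudden events.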
30 EWMA Concept & Smoothing Constant
Brown, R.G. and Meyer, R.F. (1961), "The Fundamental Theorem of Exponential Smoothing," Operations Research, 9,
Exponential smoothing represents “an elementary model of how a person learns”:
  $\hat{x}_k = \hat{x}_{k-1} + w\,(x_k - \hat{x}_{k-1})$, where $0 < w < 1$
For the smoothed value $S_k$,
  $S_k = w X_k + (1-w) S_{k-1}$
The variance of $S_k$ is $\sigma_S^2 = [w/(2-w)]\,\sigma_X^2$
So a smaller w is preferred because it gives a more stable $S_k$; values between 0.1 and 0.3 are often used
But Chatfield: changes in global behavior will result in a larger optimal w
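The variance factor w/(2 - w) can be checked with a short derivation. It assumes the observations are independent with common variance $\sigma_X^2$, that w is the weight on the newest observation, and that the recursion has run long enough for the starting value to be negligible; these assumptions are not stated on the slide.

```latex
\begin{align*}
S_k &= w X_k + (1-w) S_{k-1}
     = w \sum_{i=0}^{\infty} (1-w)^i X_{k-i}, \\
\operatorname{Var}(S_k) &= w^2 \sigma_X^2 \sum_{i=0}^{\infty} (1-w)^{2i}
     = \frac{w^2 \sigma_X^2}{1-(1-w)^2}
     = \frac{w}{2-w}\,\sigma_X^2 .
\end{align*}
```

For the range quoted on the slide, w = 0.1 gives a factor of about 0.053 and w = 0.3 gives about 0.176, so the smoothed series has between roughly 5% and 18% of the variance of the raw series.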
31 Generalized Exponential Smoothing
Holt-Winters Method: modeling level, trend, and seasonality
Annex_B_The_Holt-Winters_forecasting_method.pdf
Forecast function (multiplicative form, for k ≤ s):
  $\hat{y}_{j+k|j} = (m_j + k\,b_j)\,c_{j-s+k}$
where:
  $m_j$ = level at time j
  $b_j$ = trend at time j
  $c_j$ = periodic multiplier at time j
  s = periodic interval
  k = number of steps ahead
and $m_j$, $b_j$, $c_j$ are updated by exponential smoothing
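32 Holt-Winters Updating Equations
Updating equations, multiplicative method, with observation $y_t$ and smoothing constants $0 < \alpha, \beta, \gamma < 1$:
  Level at time t: $m_t = \alpha\, y_t / c_{t-s} + (1-\alpha)(m_{t-1} + b_{t-1})$
  Slope at time t: $b_t = \beta\,(m_t - m_{t-1}) + (1-\beta)\, b_{t-1}$
  Periodic multiplier at time t: $c_t = \gamma\, y_t / m_t + (1-\gamma)\, c_{t-s}$
Initial values $m_0, b_0, c_0, \ldots, c_{s-1}$ should be calculated from available data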
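As a rough illustration of multiplicative Holt-Winters updating, the sketch below produces one-step-ahead forecasts using the standard textbook recursions for level, slope, and periodic multiplier. The function name, the parameter names `alpha`, `beta`, `gamma`, and the initialization (level and multipliers from the first cycle, zero initial slope) are assumptions chosen for brevity; the talk does not specify its initialization, and the series must be strictly positive for the divisions to make sense.

```python
def holt_winters_multiplicative(y, s, alpha, beta, gamma):
    """One-step-ahead forecasts for y[s:], using multiplicative Holt-Winters.

    y     : positive observations
    s     : periodic interval (e.g. 7 for day-of-week effects)
    alpha : smoothing constant for the level
    beta  : smoothing constant for the slope
    gamma : smoothing constant for the periodic multipliers
    """
    if len(y) <= s:
        raise ValueError("need more than one full cycle of data")
    # Simple initialization from the first cycle (one convention among several).
    level = sum(y[:s]) / s
    slope = 0.0
    season = [y[i] / level for i in range(s)]

    forecasts = []
    for t in range(s, len(y)):
        c_old = season[t % s]                       # multiplier from one cycle ago
        forecasts.append((level + slope) * c_old)   # forecast made before seeing y[t]
        new_level = alpha * y[t] / c_old + (1.0 - alpha) * (level + slope)
        slope = beta * (new_level - level) + (1.0 - beta) * slope
        season[t % s] = gamma * y[t] / new_level + (1.0 - gamma) * c_old
        level = new_level
    return forecasts

# A perfectly periodic toy series: the forecasts should reproduce it.
series = [10, 20, 30] * 4
forecasts = holt_winters_multiplicative(series, s=3, alpha=0.4, beta=0.1, gamma=0.3)
residuals = [obs - f for obs, f in zip(series[3:], forecasts)]
```

In a surveillance setting the residuals, observed minus forecast, would be the quantities passed on to the alerting decision.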
33 Forecasting Local Linearity: Automatic vs Nonautomatic Methods
Chatfield, C. (1978), "The Holt-Winters Forecasting Procedure," Applied Statistics, 27,
Chatfield, C. and Yar, M. (1988), "Holt-Winters Forecasting: Some Practical Issues," The Statistician, 37,
“Modern thinking favors local linearity rather than global linear regression in time…”
“Local linearity is also implicit in ARIMA modelling…”
  Simple EWMA ~ ARIMA(0,1,1)
  EWMA + trend ~ ARIMA(0,2,2)
  Multiplicative Holt-Winters has no ARIMA equivalent
“Practical considerations rule out [Box-Jenkins] if there are insufficient observations or …expertise available”
“Box-Jenkins… requires the user to identify an appropriate… [ARIMA] model”
For a “fair” comparison of H-W to B-J, make both automatic or both nonautomatic.
Assertion: The simplicity of H-W permits easier classification, requiring less historic data.
Can an automatic B-J give robust forecasting over a range of input series types?
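The correspondence between simple EWMA and ARIMA(0,1,1) can be seen from the one-step forecast of that model. The notation here ($\theta$ for the moving-average coefficient, $\varepsilon_t$ for the innovations) is introduced for this sketch and does not appear on the slide.

```latex
\begin{align*}
X_t &= X_{t-1} + \varepsilon_t - \theta\,\varepsilon_{t-1}, \qquad |\theta| < 1, \\
\hat{X}_{t+1} &= X_t - \theta\,\varepsilon_t, \qquad \varepsilon_t = X_t - \hat{X}_t, \\
\hat{X}_{t+1} &= (1-\theta)\,X_t + \theta\,\hat{X}_t .
\end{align*}
```

The last line is an exponentially weighted moving average with weight $w = 1 - \theta$ on the newest observation, so choosing the smoothing constant amounts to choosing the moving-average coefficient.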
34 Regression vs Holt-Winters
Ongoing study with Galit Shmueli, U. of MD, and Sean Murphy, JHU/APL
30 time series, 700 days’ data
  5 cities
  3 data types
  2 syndromes
    Respiratory: seasonal & day-of-week behavior
    Gastrointestinal: day-of-week effects
35 Temporal Aggregation for Adaptive Alerting
Data stream(s) to monitor in time:
test interval
  Counts to be tested for anomaly
  Nominally 1 day
  Longer to reduce noise, test for epicurve shape
  Will shorten as data acquisition improves
guardband
  Avoids contamination of baseline with outbreak signal
baseline interval
  Used to get some estimate of normal data behavior
    Mean, variance
    Regression coefficients
    Expected covariate distrib.
      -- spatial
      -- age category
      -- % of claims/syndrome
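The baseline, guardband, and test-interval layout can be sketched for the simplest case, a one-day test interval scored against the baseline mean and variance. The 28-day baseline length and the function name are arbitrary choices for the example, not values from the talk.

```python
import statistics

def baseline_zscore(series, day, baseline_len=28, guard=2):
    """Z-score of series[day] against a sliding baseline that ends `guard` days earlier.

    The baseline covers days [day - guard - baseline_len, day - guard), so the
    `guard` days immediately before the test day are never used for estimation.
    """
    end = day - guard
    start = end - baseline_len
    if start < 0:
        raise ValueError("not enough history for the requested baseline")
    window = series[start:end]
    mean = statistics.mean(window)
    sd = statistics.stdev(window)
    if sd == 0:
        return None          # undefined when the baseline has no variation
    return (series[day] - mean) / sd

# Toy series: alternating counts, then two high days, the second being the test day.
series = [10, 12] * 16       # days 0..31
series[31] = 100             # early outbreak day; falls inside the guardband
series.append(30)            # day 32, the test day
z = baseline_zscore(series, day=32)
```

Because day 31 sits in the guardband, its inflated count does not enter the baseline mean or variance, which is the contamination the slide says the guardband avoids.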
36 Candidate Methods
1. Global loglinear regression of Brillman et al.
2. Holt-Winters exponential smoothing
  fixed sets of smoothing parameters for data:
    with both day-of-week & seasonal behavior
    with only day-of-week behavior
3. Adaptive Regression
  $\log(Y) = b_0 + b_{1\text{-}6}\,d + b_7\,t + b_8\,\mathrm{hol} + b_9\,\mathrm{posthol} + e$
  56-day baseline, 2-day guardband
  b1-6 = day-of-week indicator coefficients
  b7 = centered ramp coefficient
  b8 = coefficient for holiday indicator
  b9 = coefficient for post-holiday indicator
  1-day-ahead and 7-day-ahead predictions
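A simplified version of the adaptive regression can be sketched with ordinary least squares on a sliding 56-day baseline. Several choices here are assumptions made for illustration: the holiday and post-holiday terms are omitted, log(count + 1) is used to avoid taking the log of zero, day index modulo 7 stands in for the calendar day of week, and the baseline is taken to end `guard` days before the predicted day.

```python
import numpy as np

def design_matrix(day_indices, center):
    """Columns: intercept, six day-of-week indicators, centered ramp."""
    idx = np.asarray(day_indices)
    dow = idx % 7                       # weekday 0 is the reference category
    X = np.zeros((len(idx), 8))
    X[:, 0] = 1.0
    for j in range(1, 7):
        X[:, j] = (dow == j)
    X[:, 7] = idx - center
    return X

def adaptive_prediction(counts, day, baseline_len=56, guard=2):
    """Predict the count on `day` from a regression fit to the sliding baseline."""
    end = day - guard
    start = end - baseline_len
    if start < 0:
        raise ValueError("not enough history for the requested baseline")
    train_idx = np.arange(start, end)
    center = train_idx.mean()
    X = design_matrix(train_idx, center)
    y = np.log(np.asarray(counts[start:end], dtype=float) + 1.0)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    predicted_log = (design_matrix([day], center) @ coef)[0]
    return float(np.exp(predicted_log) - 1.0)

# Toy series with a pure weekly pattern and no trend.
pattern = [30, 40, 42, 41, 39, 35, 20]
counts = [pattern[i % 7] for i in range(70)]
prediction = adaptive_prediction(counts, day=60)
residual = counts[60] - prediction
```

On this noise-free series the fit is exact, so the prediction for day 60 recovers the weekly pattern value of 39 and the residual is essentially zero. On real data the residual would feed the alerting decision, and the coefficients would be refit each day as the baseline slides forward.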
37 Respiratory Visit Count Data
Plot legend: Holt-Winters, Regression, Adaptive Regr.
All series display this autocorrelation; good test for published regression model
38 GI Visit Count Data
Plot legend: Data, Holt-Winters, Regression, Adaptive Regr.
40 Mean Residual Comparison
When mean residuals favor regression, the difference is small, and it results from the largest residuals
If the holiday terms in adaptive regression are removed, H-W mean residuals are uniformly smaller
45 Summary
Data-adaptive methods are required for robust prospective surveillance
Appropriate algorithm selection requires an automated data classification methodology, often with little data history
Statistical expertise is required to manage practical issues to maintain required detection performance as datasets evolve:
  stationarity (causes rooted in population behavior, evolving informatics, others)
  late reporting
  data dropouts
46 Research Directions
Classification of time series for automatic forecasting
  Easier for Holt-Winters than for Box-Jenkins?
  Determining reliable discriminants:
    Autocorrelation coefficients
    Simple means/medians
    Goodness-of-fit measures
  How little startup data history required?
Most effective alerting algorithm using residuals, given signal of interest
  Apply control chart to residuals?
  Need to detect both sudden and gradual signals
Detection performance constraints:
  Minimum detection sensitivity
  Maximum background alert rate