Spatio-temporal surveillance

Moving beyond the simple scan statistics
Dr Ross Sparks | Senior Principal Research Scientist
14 August 2014
Digital Productivity and Services

Outline
Spatio-temporal surveillance - Moving beyond the simple scan statistics
Discuss my aim for spatio-temporal surveillance
Simple Scan statistic
EWMA Scan plan
Forward Selection Scan plan
2-Pass CUSUM Plan
Plans based on order statistics
  Group of high-end order statistics of fixed size
  Aggregation over a dynamic group size of high-end order statistics
  CUSUM of high-end order statistics
An alternative forward selection scan plan
Spatio-temporal surveillance | Ross Sparks

Aim of spatio-temporal surveillance
Aim to detect unusual increases in events in a spatial setting over time. Examples of application are:
Increase in people falling ill with influenza in Peru. When and where is it increasing?
Increase in traffic accidents in Lima. When and where do these increase?
Increases in a certain noxious weed in Peru in 2014. Where is it increasing and when did it start?
Decrease in the population of an important indigenous bird or animal in Peru. When did this first become evident and where in Peru does it occur?
A decrease in immunisation/vaccine rates in children born in Peru. When and where does this decrease occur?
Decrease in taking illicit drugs in Peru. When and where does this decrease occur?

Summary of the Simple Scan Plan – Spatial aspect
Count the number of events within the circle and compare it to the expected value.

Simple Scan Plan – Temporal aspect
The base of the cylinder is the spatial aspect (the circle) and its height is time, e.g., for daily monitoring of disease counts within a circular space, going back say 10 days. Count the number of events within the cylinder and compare it to its expected value.
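The cylinder count can be sketched in a few lines of Python. This is only an illustration: the function names, the (x, y, day) event layout and the homogeneous rate used for the expected value are assumptions, not part of the original plan.

```python
import numpy as np

def cylinder_count(events, centre, radius, t_now, window):
    """Count events inside a space-time cylinder.

    events: array of shape (n, 3) with columns x, y, day (assumed layout).
    The cylinder has a circular base (centre, radius) and covers the
    `window` most recent days up to and including t_now.
    """
    events = np.asarray(events, dtype=float)
    dx = events[:, 0] - centre[0]
    dy = events[:, 1] - centre[1]
    in_circle = dx**2 + dy**2 <= radius**2
    in_time = (events[:, 2] > t_now - window) & (events[:, 2] <= t_now)
    return int(np.sum(in_circle & in_time))

def expected_cylinder_count(rate_per_area_per_day, radius, window):
    """Expected count under a homogeneous rate (an illustrative assumption)."""
    return rate_per_area_per_day * np.pi * radius**2 * window

# Hypothetical events: two inside the cylinder, one too far away, one too old.
events = np.array([
    [0.0, 0.0, 10],
    [0.5, 0.5, 9],
    [3.0, 0.0, 10],
    [0.1, -0.2, 0],
])
observed = cylinder_count(events, centre=(0.0, 0.0), radius=1.0, t_now=10, window=10)
expected = expected_cylinder_count(0.02, radius=1.0, window=10)
```

In practice the expected value would come from a model of the in-control process rather than a single constant rate.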

How does it work?
Getting the size of the circles right is important. Getting the number of circles of different sizes right is important.

Modelling the point process
Diggle P, Rowlingson B, Su TL. Point process methodology for on-line spatio-temporal disease surveillance. Environmetrics 2005, 16(5);
These authors model the likelihood of an event occurring in space at any point in time; thus they treat both space and time as continuous. Although I find this approach attractive, I think biological processes are inherently very difficult to model. Their dynamic nature makes it difficult to identify outbreaks in real time. In addition, the spatial distribution of the disease is unlikely to be smooth and omnidirectional (equal likelihood in every direction). I am not a fan of this approach, but greats like Noel Cressie and Peter Diggle are its obvious apostles.

The issues with the simple scan statistic
There is a massive multiple testing problem, especially when circles of different radii are allowed.
Overlapping circles are needed to cover the whole region, so individual tests are correlated.
Not efficient for multiple outbreak regions that are geographically dispersed.
Shapes of outbreaks are not always circular.
What size to consider?
The temporal memory is a moving average, which is not as efficient as CUSUM or EWMA.
It is very popular: intuitive and easy to understand.

Alternative approaches we have tried
Dividing space into a regular rectangular grid.
Advantages:
Reduces multiple testing and computational effort; overlapping scan regions are well defined.
Allows easier application of well-known techniques such as CUSUM and EWMA to accumulate spatial and temporal memory, respectively.
The ability to extend the approach to additional dimensions.
Reduces the number of spurious outbreaks, i.e., improves diagnosis of the nature of the outbreak.

Dividing the region into a rectangular grid
Advantages:
Reduces the search space through non-overlapping scanning.
Allows temporal memory to be built using multivariate EWMA smoothing.
Allows various options for searching for rectangular clustered outbreaks.
Disadvantages:
Outbreaks don’t occur in rectangles.
The grid restricts the search.

EWMA Scan plan Features:
The cell counts are temporally smoothed using EWMA. How many window sizes should be used in the scanning process: 1 size? 2 sizes? 3 sizes? The process of searching is similar to the scan statistic, but now it is in terms of rectangles rather than circles.

EWMA Scan
EWMA-smooth the cell counts across time, and then scan the rectangular grid for outbreaks in the usual way. The optimal size of the scanning window still depends on the size of the outbreak.
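A minimal sketch of the EWMA scan idea, under stated assumptions: the grid size, smoothing constant, outbreak location and the standardisation by the steady-state EWMA variance (lam / (2 - lam) times the Poisson mean) are illustrative choices, and the function names are hypothetical rather than taken from the referenced paper.

```python
import numpy as np

def ewma_update(prev, counts, lam=0.2):
    """One EWMA step per cell: lam * new counts + (1 - lam) * previous value."""
    return lam * counts + (1.0 - lam) * prev

def window_sums(grid, mr, mc):
    """Sum over every mr-by-mc rectangular window, using a summed-area table."""
    sat = np.zeros((grid.shape[0] + 1, grid.shape[1] + 1))
    sat[1:, 1:] = grid.cumsum(axis=0).cumsum(axis=1)
    return sat[mr:, mc:] - sat[:-mr, mc:] - sat[mr:, :-mc] + sat[:-mr, :-mc]

rng = np.random.default_rng(0)
lam = 0.2
mu = np.ones((20, 20))            # assumed in-control Poisson mean per cell
smoothed = mu.copy()              # start the EWMA at its in-control mean
for day in range(30):
    mean_today = mu.copy()
    mean_today[5:10, 8:14] += 3.0  # hypothetical 5 x 6 outbreak
    smoothed = ewma_update(smoothed, rng.poisson(mean_today), lam)

# Scan every 5 x 6 window and standardise by the steady-state EWMA variance.
expected = window_sums(mu, 5, 6)
z = (window_sums(smoothed, 5, 6) - expected) / np.sqrt(lam / (2 - lam) * expected)
i, j = np.unravel_index(np.argmax(z), z.shape)  # top-left corner of the best window
```

Scanning with several window sizes only requires repeated calls to `window_sums`, which is where the multiple-testing cost discussed in these slides comes from.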

EWMA Scan plan
The only difference between this approach and the Scan plan is the use of the EWMA, rather than a traditional moving average, to build temporal memory.
100 by 100 grid. Outbreak 11 x 21 (mr = 11 and mc = 21).
Number of window sizes: two (m1 = 10, m2 = 25), three (m1 = 10, m2 = 20, m3 = 35), one (m1 = 21).
ATS by delta for outbreaks 1 2 3 4:
100.3 102.1 99.2 0.5 55.2 47.3 40.7 35.0 52.2 48.5 40.8 36.1 53.5 50.0 39.9 34.6 1.0 14.9 14.0 12.8 11.2 14.8 14.1 12.6 11.4 17.2 15.4 12.4 1.5 8.2 7.8 7.5 6.9 8.1 7.3 8.5 8.6 6.8 2.0 6.1 5.9 5.8 5.4 6.0 5.6 5.3 6.4 6.2 2.5 5.0 4.8 4.7 4.5 4.6 4.4 5.1 3.0 4.3 4.0 3.8 3.9 3.4 3.2 3.1 3.3 3.5 2.8 2.7 2.6 2.9 2.4 2.3 2.2 7.0 2.1 8.0 1.9 1.8

EWMA Scan plan
Outbreak mr = 11 and mc = 11.
Number of window sizes: two (m1 = 10, m2 = 25), three (m1 = 10, m2 = 20, m3 = 35), one (m1 = 21).
ATS by delta for outbreaks 5 6 7 8:
100.3 102.1 99.2 1 32.5 26.6 27.4 21.8 41.5 26.0 34.5 22.1 57.5 40.5 41.1 25.3 2 8.8 8.0 8.3 7.6 9.1 8.9 16.2 12.5 12.4 9.3 3 5.7 5.4 5.5 5.3 6.0 5.8 8.6 7.2 6.1 4 4.4 4.3 4.1 4.6 4.2 4.5 6.3 4.9 3.7 3.6 3.4 3.8 5.1 3.9 3.2 3.1 3.0 2.9 3.3 4.0 2.5 2.4 2.3 2.6 3.5 2.7 10 2.1 2.0 2.2 1.9 12 1.8 14 1.7 1.6 1.5 16 1.4

Reference
Sparks, R.S. (2012). Spatially Clustered Outbreak Detection Using the EWMA SCAN Statistics with Multiple Sized Windows. Communications in Statistics - Simulation and Computation, 41: 1637–1653.

Recursive partitioning – defining a forward selection scan plan
Partition the region into two parts recursively:
one region most likely to include the outbreak;
the other region least likely to include the outbreak.
Stop partitioning as soon as 5 generations of partitioning have been completed.
Prune away insignificant partitions, leaving only outbreak regions.

Recursive partition – forward selection scan

Recursive partition – Pruning insignificant regions
For very large lattice structures this approach can significantly reduce the amount of searching.

Strengths and weaknesses of forward selection scan
Strengths:
Reduces the amount of multiple testing.
Simple: outbreaks are described by a set of rules, simplifying diagnosis.
Allows for flexibility in size.
Allows for EWMA temporal smoothing of the cell counts, thus building the necessary temporal memory to improve sensitivity.
Weaknesses:
Restricts shapes to rectangles.
Like the scan statistic, not efficient for multiple spatially dispersed outbreak regions.
The partitioning suffers the same issues as forward selection and backward elimination in variable selection for regression.

References
Sparks, R., Okugami, C. and Bolt, S. (2012). Outbreak detection of spatio-temporal smoothed crashes. Open Journal of Safety Science and Technology. 2:
Sparks, R.S., Bolt, S. and Okugami, C. (2012). Spatio-Temporal Disease Surveillance. Invited chapter in the book on Bioterrorism edited by Dr Stephen S. Morse, CDC USA, InTech – Open Access Publisher (see Chapter 8, pages ).
Sparks, R.S. and Okugami, C. (2010). “Recursive partitioning – an efficient approach to early detection of anomalies.” Published in InterStat, January 2010.
Bolt, S. and Sparks, R. (2013). Detecting and diagnosing hotspots for the enhanced management of hospital emergency departments in Queensland, Australia. BMC Medical Informatics and Decision Making, 13:134 (9 December 2013). DOI: /

2 pass CUSUM
We tried a plan that applies the CUSUM first to the column totals (backwards and forwards), then to the rows of the column-flagged region, again backwards and forwards. The intersecting region is taken as the outbreak.

2 pass CUSUM approach
Step 1 is to find column totals.
Apply the CUSUM approach to the column total data as if they are univariate counts. Record what is signalled.
Reverse the order of the CUSUM and record what is signalled.
If common columns are signalled by both the forwards and backwards CUSUM, flag these and go to Step 2. Otherwise no signal is given.

2 pass CUSUM – applied to column totals
Assume a fixed threshold. Don’t reset the CUSUM to zero when it exceeds the threshold. Don’t let the CUSUM exceed 4.90.
Table columns: Column, Counts, Expected Counts, Z-scores, CUSUM(k=0.5) (Forwards), CUSUM(k=0.5) (Backwards)

Blank out the region not flagged with an outbreak
Step 2 is to find row totals of the remaining region.
Apply the CUSUM approach to these totals as if they are univariate counts. Record what is signalled.
Reverse the order of the CUSUM and record what is signalled.
Check whether common rows are signalled by both the forwards and backwards CUSUM. If so, the common region identifies the outbreak.
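The two steps can be sketched as follows. This is a hedged illustration, not the published algorithm: the threshold h = 3.0 is an assumed value (the slide's threshold is not given), the z-scores use a Poisson approximation, and the function names are hypothetical. The reference value k = 0.5, the cap of 4.90 and the no-reset rule follow the slides.

```python
import numpy as np

def cusum_flags(z, k=0.5, h=3.0, cap=4.90):
    """CUSUM over a sequence of z-scores; never reset, capped at `cap`.

    h is an assumed threshold chosen only for illustration.
    """
    c = 0.0
    flags = np.zeros(len(z), dtype=bool)
    for idx, zi in enumerate(z):
        c = min(cap, max(0.0, c + zi - k))
        flags[idx] = c > h
    return flags

def both_directions(z, **kw):
    """Positions signalled by both the forwards and the backwards CUSUM."""
    return cusum_flags(z, **kw) & cusum_flags(z[::-1], **kw)[::-1]

def two_pass_cusum(counts, expected, **kw):
    # Step 1: column totals treated as a univariate sequence.
    col_e = expected.sum(axis=0)
    cols = both_directions((counts.sum(axis=0) - col_e) / np.sqrt(col_e), **kw)
    if not cols.any():
        return None  # no signal
    # Step 2: row totals within the flagged columns only.
    row_e = expected[:, cols].sum(axis=1)
    row_c = counts[:, cols].sum(axis=1)
    rows = both_directions((row_c - row_e) / np.sqrt(row_e), **kw)
    if not rows.any():
        return None
    return np.flatnonzero(rows), np.flatnonzero(cols)

# Hypothetical 10 x 10 example with a 3 x 4 block of elevated counts.
expected = np.full((10, 10), 4.0)
counts = expected.copy()
counts[3:6, 4:8] = 12.0
rows, cols = two_pass_cusum(counts, expected)
```

Because the CUSUM is not reset, the forwards pass keeps signalling for a while after the outbreak columns and the backwards pass before them; intersecting the two trims both tails.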

2 pass CUSUM
Table columns: Row, Counts, Expected, Z-scores, CUSUM(k=0.5) (Forwards), CUSUM(k=0.5) (Backwards)

Blank out the region not flagged as an outbreak
The flagged outbreak is defined by the region outlined in red.

Strengths:
Reduces the amount of multiple testing to the bare minimum.
Based on the efficient CUSUM.
Allows for flexibility in size.
Allows for EWMA temporal smoothing of the cell counts, thus building the necessary temporal memory to improve sensitivity.
Weaknesses:
Restricts shapes to rectangles.
Like the scan statistic, not efficient for multiple spatially dispersed outbreak regions.
Not that efficient at diagnosis: can miss some of the outbreak region.

References
Bolt, S. and Sparks, R. (2013). Detecting and diagnosing hotspots for the enhanced management of hospital emergency departments in Queensland, Australia. BMC Medical Informatics and Decision Making, 13:134 (9 December 2013). DOI: /
Sparks, R.S., Bolt, S. and Okugami, C. (2012). Spatio-Temporal Disease Surveillance. Invited chapter in the book on Bioterrorism edited by Dr Stephen S. Morse, CDC USA, InTech – Open Access Publisher (see Chapter 8, pages ).
Sparks, R., Okugami, C. and Bolt, S. (2012). Outbreak detection of spatio-temporal smoothed crashes. Open Journal of Safety Science and Technology. 2:
Sparks, R.S. and Okugami, C. (2010). “Recursive partitioning – an efficient approach to early detection of anomalies.” Published in InterStat, January 2010.

Reference
Sparks, R. (2011). Detection of Spatially Clustered Outbreaks in Motor Vehicle Crashes: What is the Best Method? Safety Science. 49;
Sparks, R.S. (2010). “Enhancing road safety through early detection of outbreaks in the frequency of motor vehicle crashes.” Safety Science. 48;
Sparks, R. (2010). Two-pass CUSUM for identifying age cluster outbreaks. Aust. N. Z. J. Stat. 52(3), 245–260.

Multiple outbreak regions – recursive partitioning

Multiple outbreak regions – pruning

Spatial smoothing helps if events cluster spatially
Temporal smoothing such as EWMA is good at smoothing away noise and hence efficient at revealing trends. The same is true for spatial trends. We apply spatial smoothing to the rows and columns of the matrix of temporally smoothed counts. This smoothing is similar to smoothing with a double exponential distribution. How to design the combination of temporal and spatial smoothing is an open question (research is needed in this area).
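One way to picture the row-and-column smoothing is to average a forwards and a backwards EWMA along each row and then along each column, which gives weights that decay exponentially on both sides of a cell. This is an assumed construction for illustration only (including the smoothing constant and function names); the slides do not specify the exact smoother.

```python
import numpy as np

def ewma_1d(x, lam):
    """Forward EWMA along the last axis, started at the first value."""
    out = np.empty_like(x, dtype=float)
    out[..., 0] = x[..., 0]
    for i in range(1, x.shape[-1]):
        out[..., i] = lam * x[..., i] + (1 - lam) * out[..., i - 1]
    return out

def two_sided(x, lam):
    """Average of forwards and backwards EWMA: a two-sided exponential kernel."""
    return 0.5 * (ewma_1d(x, lam) + ewma_1d(x[..., ::-1], lam)[..., ::-1])

def spatial_smooth(grid, lam=0.5):
    """Smooth along each row, then along each column."""
    rows_done = two_sided(grid, lam)
    return two_sided(rows_done.T, lam).T

# A single elevated cell is spread over its neighbours, with the peak kept in place.
spike = np.zeros((9, 9))
spike[4, 4] = 1.0
smoothed_spike = spatial_smooth(spike)
```

Here `grid` would be the matrix of temporally smoothed cell counts; a smaller `lam` spreads each cell over a wider neighbourhood.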

Plans based on order statistics
For homogeneous counts, the ordered cell counts contain all the necessary information, in order of importance, for detecting the outbreak. The important decision is how many of the top order statistics to use and how to use them.

Harnessing the power of order statistics?
After temporal smoothing and then spatial smoothing of the counts, the outbreak region(s) should be quite clear. The high-end order statistics of the cell count signal-to-noise ratios should identify the full outbreak regions. Aggregating cells over the top g high-end order statistics and comparing them to their expected values seems like a good idea.

Order statistics – say the 5 top order statistics
12 17 33 20 17 If we assume that all cells have expected counts equal to 6 and counts are Poisson distributed, then

When is the group of high end order statistics high enough to flag an outbreak?
If the aggregated group of high-end order statistics is higher than a threshold, this flags an outbreak.
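A threshold of this kind can be approximated by simulation. The sketch below is an assumption-laden illustration: it uses the raw sum of the top g counts as the statistic, a hypothetical grid of 25 cells (the slide does not state the grid size), the cell mean of 6 from the earlier example, and an arbitrary false-alarm level; the function names are invented for the example.

```python
import numpy as np

def top_g_sum(counts, g):
    """Sum of the g largest cell counts."""
    flat = np.sort(counts, axis=None)
    return flat[-g:].sum()

def null_threshold(mean, n_cells, g, alpha=0.01, n_sim=2000, seed=0):
    """Monte Carlo (1 - alpha) quantile of the top-g sum for in-control
    homogeneous Poisson counts."""
    rng = np.random.default_rng(seed)
    sims = [top_g_sum(rng.poisson(mean, n_cells), g) for _ in range(n_sim)]
    return float(np.quantile(sims, 1 - alpha))

# The five top order statistics from the earlier example sum to 99.
observed = top_g_sum(np.array([12, 17, 33, 20, 17]), g=5)
threshold = null_threshold(mean=6.0, n_cells=25, g=5)
outbreak_flagged = observed > threshold
```

Note that the top-g sum is not simply Poisson with mean 5 x 6 = 30, because the largest cells are selected after looking at the data; simulating the in-control distribution accounts for that selection.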

Early detection performance relative to the scan statistic
Outbreak mr = 21 and mc = 11. EWMA SCAN: threshold = 1.917, g = 400. SCAN PLAN: pc=
Row i Column j 1 25 65 55
Shift / ATS:
100.1 101.2 0.5 21.6 32.3 40.2 56.9 47.0 33.0 1.0 8.2 11.8 13.2 17.0 12.9 11.9 1.5 5.1 6.8 7.5 8.6 2.0 3.8 4.8 5.4 6.3 5.8 5.3 2.5 3.1 4.1 4.9 4.3 3.0 2.6 3.2 3.5 4.5 4.0 5.0 1.7 2.1 2.2 2.7 6.0 1.8 1.9 2.3 7.0 1.3 1.6 8.0 1.2 1.4
100 x 100 grid with the 21 x 11 outbreak located at rows i to i+20 and columns j to j+10. Assume the daily counts are Poisson on a grid with a mean rate of 0.01 per cell, i.e., about 100 events per day. The Scan plan searches a 21 x 21 grid for outbreaks. The EWMA Scan examines the top 400 cell order statistics.

Reasonable designs for plans in 100 by 100 grid
Range in cell means, roughly uniform (min, max) | Expected count over the whole region (approx.) | Recommended choice for the plans for small to large outbreaks
0.0025, 0.019 | 100 | 200 to 400
0.0025, 0.029 | 150 | 200 to 500
0.0025, 0.039 | 200 | 300 to 600
0.0025, 0.049 | 250 | 400 to 700
0.0025, 0.059 | 300 |
0.0025, 0.069 | 350 | 500 to 700
0.0025, 0.079 | 400 | 500 to 800
0.0025, 0.089 | 450 | 600 to 900
0.0025, 0.099 | 500 | 700 to 1000

Group of order statistics approach
Advantages:
Independent of the shape of the outbreak(s).
Uses all the information in all outbreaks if the group of high-end order statistics is large enough, hence more sensitive when we have good signal-to-noise ratio information.
Easy to implement and computationally efficient.
Disadvantages:
The number of high-end order statistics to consider depends on the spread of the outbreak (which is unknown in advance).
Does not work well for low counts, when the signal-to-noise ratios are unstable, i.e., does not work well when cell expected values are low over a whole local spatial region.

Reference Sparks, RS and Patrick, E. (2014). “Detection of multiple outbreaks using Spatio-temporal EWMA ordered statistics”. Communications in Statistics—Simulation and Computation; 43: 2678–2701. DOI: /

Extending the order statistic concept
The use of order statistics seems like a good idea. However, we need a more efficient way of deciding on the number of order statistics dynamically. Questions: Can we use the data to determine the number of order statistics to use? If the data can inform us on the number of order statistics to use, can we devise a plan that adapts to a variable number of order statistics, i.e., variable size outbreaks?

Determining the number of order statistics
Even though counts may not be homogeneous, we know that the square root of Poisson counts has approximately homogeneous variance. This is even more so after spatio-temporal EWMA smoothing. This is very convenient because it lets us establish the expected distribution of the spatio-temporally smoothed counts, and therefore the expected quantiles of this distribution.
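The variance-stabilising effect of the square root is easy to check numerically. The short sketch below uses illustrative means (10, 40, 160) chosen for the demonstration, not values from the talk: for moderately large means, the variance of the square root of a Poisson count is close to 1/4 whatever the mean.

```python
import numpy as np

rng = np.random.default_rng(1)

# Variance of sqrt(count) for Poisson counts with very different means.
variances = {
    mean: float(np.var(np.sqrt(rng.poisson(mean, 200_000))))
    for mean in (10, 40, 160)
}
# The raw count variances would be about 10, 40 and 160 respectively.
```

The approximation is weaker for very small means, which is consistent with the later remark that order statistic plans struggle when cell expected values are low.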

An example: 100 by 100 grid, cell means 0.1

QQ-plots may help decide on the number of high-end order statistics to consider
(QQ-plot figure; axis label: Expected quantiles)

Number of high-end order statistic to consider
Let the number of high-end order statistics be the number of the largest order statistics that are greater than some fixed offset above the expected quantiles. If we train the offset so that it delivers the optimal group number for certain outbreaks, then this helps adapt the number of high-end order statistics to the locally optimal value. Since quantiles are highly volatile, we smooth this number of high-end order statistics from one time to the next to help reduce the volatility.

Estimating the number of high end quantiles to take
Offset: the number of high-end order statistics that are greater than their theoretical quantile plus the offset is taken as optimal.
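The counting rule can be sketched as below. Everything numeric here is an assumption for illustration: raw (unsmoothed) Poisson counts on a 30 x 30 grid with mean 4, an offset of 1.5, a 40-cell outbreak with mean 16, Monte Carlo estimates in place of theoretical quantiles, and an EWMA constant of 0.3 for smoothing the number over time. The function names are hypothetical.

```python
import numpy as np

def expected_order_stats(mean, n_cells, n_sim=300, seed=0):
    """Monte Carlo estimate of the expected in-control order statistics (ascending)."""
    rng = np.random.default_rng(seed)
    sims = np.sort(rng.poisson(mean, (n_sim, n_cells)), axis=1)
    return sims.mean(axis=0)

def n_high_end(values, quantiles, offset):
    """Number of order statistics exceeding their expected quantile plus the offset."""
    ordered = np.sort(values, axis=None)
    return int(np.sum(ordered > quantiles + offset))

def smooth_number(previous, current, lam=0.3):
    """EWMA of the number of order statistics from one time to the next."""
    return lam * current + (1 - lam) * previous

rng = np.random.default_rng(42)
quantiles = expected_order_stats(mean=4.0, n_cells=900)
in_control = rng.poisson(4.0, (30, 30))
outbreak = in_control.copy()
outbreak[10:15, 10:18] = rng.poisson(16.0, (5, 8))  # 40 hypothetical outbreak cells

m_in = n_high_end(in_control, quantiles, offset=1.5)
m_out = n_high_end(outbreak, quantiles, offset=1.5)
m_smoothed = smooth_number(m_in, m_out)
```

Because inserting outbreak cells at the top pushes the remaining cells down the ranking, the raw number can overshoot the true outbreak size, which is one reason the offset needs to be trained.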

The process for m high end order statistics
Let S(m) be the sum of the m highest order statistic when the grid cell means sum to An outbreak is flagged whenever Since now that adjusts with each time t the adaptive plan with changing can be expressed as This is useful when examining the non-homogeneous approach.

Non-homogeneous processes
Let the sum of m non-homogeneous order statistics be given by where The respective mean value is given by Then the process for monitoring non-homogeneous cell means is This gives the same false alarm rate as the homogeneous thresholds used to design this plan.

Strengths:
Reduces the amount of multiple testing.
Flexible in terms of the number of outbreaks, their size and shape.
Allows for EWMA temporal smoothing of the cell counts, thus building the necessary temporal memory.
Ideal for multiple spatially dispersed outbreaks.
Weaknesses:
Diagnosis is not as easy as the forward selection scan statistic.
More complicated for non-technical users.

Ordered statistic plan
Some results – single outbreak
Plans: Ordered statistic plan (offset = 0.125, hOSP = 0.99513); Scan plan (hscan =0:)
Outbreak region: Rows 1:21 25:45; Columns 1:11 65:75
Delta=0 100.1 90.5 0.5 20.7 26.3 39.6 53.0 48.6 33.7 1.0 8.5 10.0 13.6 17.0 15.7 11.9 1.5 5.4 6.4 8.1 8.9 6.8 2.0 4.1 4.7 5.7 6.3 2.5 3.3 3.6 4.6 5.3 5.1 4.5 3.0 2.8 3.1 3.8 4.2 3.9 4.0 2.2 2.4 2.9 3.5 3.4 6.0 1.6 1.7 8.0 1.3 1.4 1.8

Non-rectangular outbreaks
Scenario 1 Scenario 2 Scenario 3

Ordered statistic plan
Signal outbreak regions that are not rectangular
Plans: Ordered statistic plan (offset = 0.125, Threshold = 1.0026); Scan plan (hscan =0:)
Scenario 1 2 3
100.1 90.5 0.5 37.9 56.5 39.8 36.6 66.6 43.3 1.0 13.8 34.8 14.9 13.0 46.4 15.8 1.5 8.1 21.6 9.2 7.5 31.3 8.9 2.0 5.9 15.2 6.6 5.7 21.0 6.3 2.5 4.7 11.7 5.1 4.5 5.2 3.0 3.9 9.5 4.3 11.4 4.4 4.0 6.9 3.3 3.1 8.2 3.6 6.0 2.1 2.3 2.2 5.6 8.0 1.7 3.5 1.8

Multiple outbreak scenarios
Scenario 4: Three spatially dispersed diamond shapes.
Scenario 5: Combination of Scenario 1 and Scenario 3.
Scenario 6: Similar to two smaller versions of Scenario 3, spatially dispersed.
Scenario 7: Similar to two smaller versions of Scenario 3, spatially dispersed, plus a smaller version of Scenario 1.

Multiple outbreak regions
Plans: Ordered statistic plan (offset = 0.125, hOSP = 1.0026); Scan plan (hscan =0:)
Scenario 4 5 6 7
100.1 90.5 0.5 38.6 29.1 39.2 52.9 44.8 30.7 51.8 61.5 1.0 15.2 11.3 13.4 23.8 19.9 10.9 29.7 38.1 1.5 9.4 7.3 8.1 14.4 11.2 6.9 11.7 21.9 2.0 6.7 5.4 5.9 9.7 8.0 5.6 7.5 14.0 2.5 5.3 4.2 4.8 7.7 6.4 4.4 6.0 10.4 3.0 3.5 4.1 6.2 3.9 4.9 4.0 3.4 2.7 2.9 4.6 2.4 1.9 2.1 3.1 2.2 1.6 1.8 3.6

References Sparks R, Hai Y & Tsui K (2014). Outbreak Detection Using A Dynamic Number of Spatio-Temporal Smoothed Ordered Statistics (Submitted to Journal of Quality Technology).

CUSUM of the high end order statistics
First assume that we are working with spatially homogeneous Poisson cell counts. We start by ranking the cell counts of, say, a 100 by 100 grid of spatio-temporal EWMA smoothed counts, i.e., n = 10000.

Estimate the median value for each order statistic. We then examine the CUSUM of the order statistics from their respective median values for a given reference value (k > 0), for i = 1, 2, …. However, for reasonably large values of the reference value, most of the smaller order statistics will result in CUSUM values of zero. Most of the information in the outbreaks is housed in the high-end order statistics.

We only calculate the CUSUM for the top g=300,400,500,600 or 700 of the order statistics. The optimal value of g depends on the number of cells in the outbreak. The larger the number of cells the larger value of g will be optimal. However large outbreaks are largely unaffected by the choice of g because the CUSUM plan takes care of these. Choice of g=500 is reasonably robust to most changes

CUSUM the high-end order statistics
Rather than examine all order statistics, we examine only the g largest high-end ordered counts, i.e., we examine the CUSUM for i = 1, 2, …, m (< n), and flag a significant increase in counts if the CUSUM exceeds a threshold, where the threshold is selected so that the plan delivers an acceptably low false alarm rate.

Average time to signal for 50 by 20 outbreak in 100 by 100 lattice
Outbreak region: rows 25:74, columns 55:74
Size of the location shift per cell: 0.1 0.2 0.3 0.4 0.5 0.6 0.8 1.0 1.2 1.4 1.6
Average time to signal by k and m:
800 57.4 35.9 20.3 13.8 9.8 7.8 5.5 4.3 3.7 3.1 2.7 58.3 35.1 9.7 7.9 3.6 0.0005 61.4 34.9 20.4 13.6 3.5 700 60.8 33.9 20.7 7.7 4.2 58.2 33.8 13.5 9.6 5.6 63.2 34.7 21.1 13.7 4.4 3.0 600 65.4 38.1 20.8 14.1 10.3 8.0 5.7 67.4 37.0 14.3 10.2 67.9 37.1 21.8 14.4 2.8 500 64.9 37.2 21.7 10.4 8.2 63.8 21.6 8.1 62.3 20.9 13.9 450 69.6 39.6 22.9 10.5 8.3 5.8 4.6 3.2 66.6 39.0 22.4 14.6 4.5 64.2 37.5 22.0 14.7 10.7 5.9 400 65.8 23.4 8.4 66.0 39.4 23.2 10.6 65.2 22.8

Average time to signal for 21 by 11 outbreak in 100 by 100 lattice
Outbreak region: rows 25:45, columns 65:75
Size of the location shift per cell: 0.5 1 1.5 2 2.5 3 4 5 6 7 8
Average time to signal by k and m:
800 38.8 14.1 8.0 5.6 4.4 3.7 2.8 2.2 2.0 1.7 1.6 39.0 13.5 7.9 5.7 4.5 1.9 0.0005 38.6 13.4 7.7 5.5 3.6 2.7 700 38.4 2.3 38.0 7.8 4.3 5.4 600 38.3 13.3 4.2 2.6 500 7.6 3.5 37.9 13.2 7.5 1.4 37.1 13.0 7.4 1.8 450 41.0 39.7 39.2 12.9 2.1 400 40.0 39.9 5.3 3.4 38.5

Problems with non-homogeneous means
Assume that cell count has mean count has median order statistics given by Find the rank position of amongst these median values, i.e. find Denote this rank by Rij. Repeat this for all i and j=1 to Rank these from highest to smallest, i.e. let corresponding counts by

These correspond to the 500 most unusually high counts in the lattice. The adaptive CUSUM plan now is defined as follows n=1,2,…,500 The out-of control threshold for this is 1, i.e., an outbreak is flagged whenever This plan will have the same false alarm rate as the homogeneous plans used to derive )

This approach is only approximately valid, because the median of smoothed homogeneous counts is not the same as the median of the count with the same smoothed mean. The alternative of finding the median of smoothed non-homogeneous counts is far too onerous, and therefore not feasible. So this approximate result only works if the non-homogeneous means do not differ by much.

CUSUM Ordered statistic plan
Results
Plans: CUSUM Ordered statistic plan, Scan plan
Settings: g = 500 & ks = hc = hscan =0:
Outbreak region: Rows 1:21 25:45; Columns 1:11 65:75
Delta=0 100.5 90.5 0.5 24.9 28.4 37.1 53.0 48.6 33.7 1.0 9.6 10.4 13.0 17.0 15.7 11.9 1.5 5.8 6.3 7.4 8.9 8.1 6.8 2.0 4.3 4.4 5.4 6.4 2.5 3.4 3.6 4.2 5.3 5.1 4.5 3.0 2.9 3.5 4.7 3.9 4.0 2.2 2.3 2.6 5.0 1.8 1.9 2.8 6.0 1.6 1.7 2.4

Non-rectangular outbreaks
Plans: CUSUM Ordered statistic plan, Scan plan
Settings: g = 500 & ks = hc = hscan =0:
Scenario 1 2 3
100.5 90.5 0.5 40.4 62.5 39.8 37.6 66.6 43.3 1.0 13.5 34.9 14.4 14.0 44.4 15.8 1.5 7.7 21.1 8.4 8.1 30.3 8.9 2.0 5.4 13.9 5.8 20.7 6.3 2.5 4.3 10.3 4.5 15.0 5.2 3.0 3.5 3.8 3.9 11.4 4.4 4.0 2.7 5.7 2.8 3.1 3.6 5.0 2.2 2.3 2.6 6.5 6.0 1.9

Multiple outbreak regions
Plans: CUSUM Ordered statistic plan, Scan plan
Settings: g = 500 & ks = hc = hscan =0:
Scenario 4 5 6 7
0.5 40.4 30.1 36.8 49.4 39.0 28.6 51.8 61.5 1.0 13.6 10.9 13.9 20.3 17.7 10.5 29.7 38.1 1.5 7.7 6.5 7.9 11.2 10.2 6.6 11.7 21.9 2.0 5.4 4.7 5.6 7.5 7.3 5.2 14.0 2.5 4.3 3.7 4.4 5.8 5.9 4.2 6.0 10.4 3.0 3.5 3.1 3.6 5.1 4.9 8.1 4.0 2.7 2.4 2.9 4.1 6.2 5.0 2.2 1.9 2.8 3.4 3.2 1.7 2.1 7.0 2.6 3.8

Strengths:
Reduces multiple testing.
Flexible in terms of the number of outbreaks, their size and shape.
Allows for EWMA temporal smoothing of the cell counts.
Ideal for multiple spatially dispersed outbreaks.
Very efficient for problems with homogeneous cell means.
Weaknesses:
Diagnosis is not as easy as the forward selection scan statistic.
More complicated for non-technical users.
Non-homogeneous cell means complicate the process.

Reference Sparks, R (2014). Spatio-temporal Surveillance using CUSUM of order statistics. Accepted by Quality and Reliability Engineering International.

Alternative forward selection scan
Start with the highest order statistic and then grow the region with neighbouring cells to increase the significance of the outbreak. Next slide demonstrates this approach using the earlier example
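The grow-from-the-maximum idea can be sketched as a greedy search. This is an assumed, simplified construction and not the flexibly shaped scan statistic of the reference: significance is approximated by a Poisson-style z-score of the region total, neighbours are the four adjacent cells, and the function names and example grid are hypothetical.

```python
import numpy as np

def region_z(counts, expected, cells):
    """Approximate significance of a region: (observed - expected) / sqrt(expected)."""
    obs = sum(counts[c] for c in cells)
    exp = sum(expected[c] for c in cells)
    return (obs - exp) / np.sqrt(exp)

def grow_region(counts, expected):
    """Start at the most extreme cell and greedily add neighbours while z improves."""
    z_cell = (counts - expected) / np.sqrt(expected)
    r0, c0 = np.unravel_index(np.argmax(z_cell), counts.shape)
    region = {(int(r0), int(c0))}
    best = region_z(counts, expected, region)
    n_rows, n_cols = counts.shape
    while True:
        frontier = set()
        for r, c in region:
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nb = (r + dr, c + dc)
                if 0 <= nb[0] < n_rows and 0 <= nb[1] < n_cols and nb not in region:
                    frontier.add(nb)
        if not frontier:
            return region, best
        cand = max(frontier, key=lambda nb: region_z(counts, expected, region | {nb}))
        z_new = region_z(counts, expected, region | {cand})
        if z_new <= best:          # no neighbour increases the significance
            return region, best
        region.add(cand)
        best = z_new

# Hypothetical L-shaped outbreak on an 8 x 8 grid.
expected = np.full((8, 8), 4.0)
counts = expected.copy()
l_shape = {(2, 2), (2, 3), (2, 4), (3, 4), (4, 4)}
for cell in l_shape:
    counts[cell] = 12.0
region, z_best = grow_region(counts, expected)
```

Like any forward selection, the search is greedy: a single seed cannot recover several spatially dispersed outbreak regions, so it would have to be restarted from further high-end cells.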

Alternative forward selection scan

Reference
Takahashi K, Kulldorff M, Tango T and Yih K. (2008). A flexibly shaped space-time scan statistic for disease outbreak detection and monitoring. International Journal of Health Geographics 2008, 7:14 doi: / X-7-14

Reviewing the spatio-temporal surveillance approaches considered
Plan | Diagnostic issues | Detection issues
Simple scan | Diagnostics weak | Inflexible to shape & size
Forward selection scan statistic | Diagnostics excellent | Not efficient for multiple outbreak regions
EWMA scan | Improved diagnostics but still weak |
Group of order statistics | | Flexible to shape & size. Great for multiple outbreak regions
Dynamic group of order statistics | Diagnostics alright | Very flexible to shape & size. Great for multiple outbreak regions
CUSUM order statistics | | Not flexible enough for non-homogeneous cell counts
Alternative forward selection scan | | Not flexible enough for multiple outbreak

Thank you. Questions?
Dr Ross Sparks
Senior Principal Research Scientist
CSIRO Digital Productivity and Services
