Presentation is loading. Please wait.

Presentation is loading. Please wait.

University of Southern California Department of Mathematics University of Southern California Department of Mathematics Modeling and Detection of Sudden.

Similar presentations


Presentation on theme: "University of Southern California Department of Mathematics University of Southern California Department of Mathematics Modeling and Detection of Sudden."— Presentation transcript:

1 University of Southern California Department of Mathematics University of Southern California Department of Mathematics Modeling and Detection of Sudden Spurts in Activity Profile of Terrorist Groups QMDNS May 1, 2012 Vasanthan Raghavan Joint work with Aram Galstyan and Alexander Tartakovsky 1

2 Broad Objectives  Terrorism is no longer a shocking word to most of us!  Ongoing radicalization of different interest groups  Rise of social media has made tracking terrorist activity a harder task  Objective 1: Development of data-driven probabilistic models of activity of adversarial networks based on interacting hierarchical hidden Markov models  Objective 2: Develop methods for  model learning  inferencing – rapid detection and tracking of changes in adversarial behavioral patterns and objectives based on optimal nonlinear filtering and quickest changepoint detection techniques 2

3 Features of Activity Profile  Collecting terrorism data is a painful exercise!  Open-source databases: ITERATE, RDWTI, GTD, etc.  Temporal ambiguity (time-scale of modeling = Days)  Attributional ambiguity (turf warfare)  Data sparsity (604 attacks over 9 years)  Missing data, mis-labeled data 199820002002 2004 20062008 Source: RAND Database on Worldwide Terrorism Incidents (RDWTI) 3

4 Model for Terrorist Activity  Observation Period:  Observations: No. of attacks =  History of group:  Model for activity profile  Desired Qualities in an Ideal Model:  Describes data sufficiently accurately  Motivated by a small set of hypotheses  Robust to looseness in hypothesis statements  Described by few parameters  Robust to missing or mis-labeled data 4

5 Existing Models  Type 1: Classical time-series techniques  Fit trend, seasonality and stationary components to time-series [Enders & Sandler]  Quadratic or cubic trend = 4 parameters, seasonality = 3, stationary part = 1  Fit lagged value of endogenous variables, and other variables [Barros]  8 or more model parameters  Key Theme: Good-to-acceptable fit for time-series at the cost of large number of parameters in a model with complicated dependencies  Type 2: Group-based trajectory analysis  Identify cases with similar development trends [Nagin]  Cox proportional hazards model + logistic regression methods for model selection [LaFree, Dugan & co-workers]  Key Theme: Contagion theoretic viewpoint  Current activity of group is influenced by past history of group  Attacks are clustered 5

6 Self-exciting Hurdle Model  Type 3: Self-exciting hurdle model  Puts the contagion point-of-view on a theoretical footing  Motivated by similar model development in  Earthquake models – Aftershocks are function of current shock  Inter-gang violence – Action-reaction violence between gangs  Epidemiology – immigrants + offsprings in a cell colony [Porter & White 2012, White, Porter & Mazerolle @ QMDNS 2010, Erik Lewis]  Hurdle probability component: Accounts for few attacks  Self-exciting component: Accounts for clustering of attacks  Key Theme:  Excellent model-fit  Explains clustering of attacks from a theoretical perspective  Self-exciting component can be complicated  more parameters Is a self-exciting model necessary to explain clustering? 6

7 Motivating Hypotheses  Hypothesis 1: Current activity of the group depends on past history only through k dominant states (that remain hidden)  Hypothesis 2: Of these k states, the two most dominant are  Its Capabilities ( ) – Manpower assets, special skills (bomb-making, IED), propaganda warfare skills, logistics skills, coordination with other groups, ability to raise finances, etc.  Its Intentions ( ) – Guiding ideology/philosophy (e.g., Marxist-Leninist- Maoist thought, political Islam), designated enemy group, nature of high profile attacks, nature of propaganda warfare, etc. [Cragin and Daly, “The dynamic terrorist threat: An assessment of group motivations and capabilities in a changing world”] 7

8 Motivating Hypotheses  Hypothesis 3:  Mature group  Intentions are to attack (more or less)  Change in capabilities is primarily responsible for change in attack patterns  A d-state model for Capabilities  d = 2:  Active state (high capability/strong), Inactive state (low capability/weak)  Observation density: Different possibilities (Poisson, shifted Zipf, geometric, etc.) 8 Hurdle/State transitions  Data rarity Self-exciting comp./Diff. rates  Clustering

9 Consequences  Metrics to capture capabilities of group  Time–window:  No. of days of terrorist activity in a day time-window (X n ) – measures resilience of group [Santos]  No. of attacks in a day time-window (Y n ) – measures level of coordination in group [Lindberg]  Consequence 1: where  Consequence 2: Time to next day of terrorist activity is appx. exponential  The days of activity form a Poisson process for  Consequence 3: Y n is compound Poisson whose density can be written in terms of density of M i 9

10 Case-Study 1: FARC  Revolutionary Armed Forces of Colombia (FARC)  Oldest and largest terrorist group in the Americas  Marxist-Leninist ideology, anti-establishmentist  Uses guerilla warfare  Actively involved in cocaine cultivation and trans-shipment to U.S. and W. Europe, kidnapping rings, …  Database used : RAND Database on Worldwide Terrorism Incidents (RDWTI)  Time-period of interest: 1998 – 2007  Why FARC? Ans: Reliable dataset from RDWTI  Dominant in Colombia  Less ambiguity in terms of other groups’ attacks  Anti-establishment group  Strong signature in attack profile  Easy to differentiate FARC from non-FARC attacks in case of ambiguity 10

11 Activity Profile  Why 1998 – 2007? Ans: Two key geo-political events  Spurt 1  1997: Colombia becomes leading cultivator of coca  1999–2000: Plan Colombia with U.S. aid  2001–2002: President Uribe’s election on anti-FARC plank  Spurt 2  2003–2004: Anti-FARC efforts bear fruit  2005 – 2006: President Uribe’s re-election bid and local elections 199820002002 2004 20062008 Elections, Uribe’s win, Plan Colombia, cocaine fields destroyed Local elections, Pres. Uribe’s successful bid Becomes leading cultivator of coca Massive increase in Plan Colombia funding announced 11

12 Models for M i 12

13 Model Verification More attacks, Good first- order fit Heavy tails Large inter- arrival time, Good fit Normal Activity Profile Reduced inter- arrival time, Good first-order fit Spurt in Activity Fewer no. of attacks, Good fit Normal Activity Profile Spurt in Activity 13

14 Case-Study 2: Shining Path  Shining Path (Sendero Luminoso) of Peru  Marxist-Leninist ideology, anti-establishmentist, attacks perceived interventionist forces  Actively involved in cocaine cultivation and trans-shipment to U.S. and W. Europe  Database used : RDWTI  Time-period of interest: 1981 – 1996 (full cycle in Sh. Path’s evolution)  Spurt 1  Early 1980s: Becomes violent  1981–1984: Tepid response from govt., seizes initiative, stabilizes  Downfall 1  1985–1986: Attack on Pres. Elections in ‘85, COIN ops.  slight lull  Spurt 2  1987–1990: Reorganization in early ‘87, stabilizes  1991: Excessively violent, Sh. Path controls center and south of Peru and in outskirts of Lima  Downfall 2  1992: Personality cult around leader, disrespect for indigenous culture, loss of support base, institutional apathy, Guzman captured in Sept. 1992  1993–1996: Death of outfit 14

15 Activity Profile 15

16 Models for M i 16

17 Model Verification Heavy tails Beginning of activity, Good fit Intense activity, Good first-order fit Stabilization, Good first- order fit Decay and death, Good fit 17

18 Quickest Changepoint Detection  Changepoint ( ): Point of transition from normal behavior to abnormal behavior  Statistical model: Density changes from f to g at  Goal: Detection procedure (stopping time) to detect quickly without too many false alarms  Numerous applications: anomaly detection, failure detection, process/quality control, intrusion detection, target detection, finance, etc.  Simplest setting: Observations are i.i.d. both pre- and post-change 18

19 Proposed Solution 1 (EWMA)  Changepoint detection procedures  Page’s Cumulative Sum (CUSUM) test  Shiryaev-Roberts (S-R) test  Disadvantage: Require pre- and post-change distributions  Advantage: Optimality properties can be proved under certain ideal assumptions  Exponential weighted moving average (EWMA) test  Introduced by Roberts in 1959 and studied in statistics literature  Detects drift in mean of a sequence and tracks it (continuously)  How does it work? Smoothens small changes and enhances big changes  Essentially a first-order auto-regressive process  Advantage:  Easy to implement in practice  Works well with no specific assumptions on underlying distributions 19

20 EWMA Test  Time-window =  No. of days of terrorist activity ( X n )  Total no. of attacks in the time-window ( Y n )  Update equations for statistic:  First-order auto-regressive process  are smoothing parameters, designed experimentally, small values work best  Stopping time: chosen to optimize trade-off between FAR & Det. delay 20

21 Detecting Spurts: EWMA  Conclusions:  EWMA statistic detects persistent changes and tracks underlying process  But short moderate changes are not tracked 21

22 Proposed Solution 2 (State Estimation)  Methodology:  Train: Learn HMM parameters (p 0, q 0, Active rate and Inactive rate) – Baum-Welch/EM algorithm  Test: Verify inter-arrival duration density during Active/Inactive periods – Q-Q plots, scatter diagram  Classify States: As Active/Inactive – Viterbi algorithm, non-linear filtering (small hit over VA) 22

23 Detecting Spurts and Downfalls: Shining Path 23

24 Concluding Remarks  A competing model for activity profile  Desired Qualities in an Ideal Model:  Motivated by a small set of hypotheses  Yes, 3 hypotheses. Are these hypotheses justified?  Robust to looseness in hypothesis statements?  Intentions do not matter?!  Described by few parameters  Yes, 4 parameters in the simplest setting  Describes data sufficiently accurately  Yes, for spurt /downfall detection and tracking activity. No, in general.  Refine the model to incorporate heavy tails, non-geometric density, spatio-temporal, etc.  Robust to missing or mis-labeled data  Ongoing work 24

25 Concluding Remarks  Ripley’s K function  Indonesia/Timor-Leste attacks have abnormal tails (36 attacks on one day!) – cant see how a 2-state HMM with geometric obs. density will work for this dataset 25 Is a self-exciting model necessary to explain clustering?


Download ppt "University of Southern California Department of Mathematics University of Southern California Department of Mathematics Modeling and Detection of Sudden."

Similar presentations


Ads by Google