Anomaly Detection in Crowded Scenes. Vijay Mahadevan, Weixin Li, Viral Bhalodia & Nuno Vasconcelos. Statistical Visual Computing Laboratory, University of California, San Diego.
Anomaly Detection: surveillance cameras are everywhere, mainly in crowded public spaces. Can we detect anomalous events automatically? E.g. a car driving in the wrong direction, or someone jumping the turnstile.
Defining an anomaly: it is an outlier, i.e. any deviation from normal behavior. Most common definition: temporal outliers, events not conforming to a learned pattern. An anomaly is hard to define precisely.
Another type: spatial outliers. Context based: events that “stand out”, depending on the surroundings.
Modeling the video: need good models of normal video to identify temporal and spatial anomalies. Current approaches mainly rely on optical flow [Adam et al `08, Kim & Grauman `09], which does not retain pixel appearance. Need to model both appearance and dynamics; how to combine the two? Mixtures of dynamic textures (MDT): a joint stochastic model for appearance and dynamics.
Mixture of dynamic textures: multiple components, each modeling appearance and dynamics [Chan & Vasconcelos `08, `09].
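For reference, a dynamic texture is commonly written as a linear dynamical system, and an MDT as a mixture of K such systems. The notation below (z, x_t, y_t, A_z, C_z, Q_z, R_z, \pi_z) follows the usual convention in the dynamic-texture literature and is an assumption here, not something shown on the slide:

```latex
\begin{aligned}
z &\sim \mathrm{Multinomial}(\pi_1, \dots, \pi_K) \\
x_{t+1} &= A_z x_t + v_t, \qquad v_t \sim \mathcal{N}(0, Q_z) \\
y_t &= C_z x_t + w_t, \qquad w_t \sim \mathcal{N}(0, R_z)
\end{aligned}
```

In this reading, y_t is the vectorized patch at time t, C_z carries the appearance of component z, and A_z carries its dynamics through the hidden state x_t.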
Temporal anomaly detection: learn an MDT from spatiotemporal patches of the training video; the negative log-likelihoods of the test-frame patches under this model give the temporal anomaly map.
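The scoring step can be sketched as follows. This is only an illustration under strong assumptions: a 1-D Gaussian mixture over a scalar patch feature stands in for the MDT, and the model parameters and the test frame are made-up values.

```python
import math

def gaussian_logpdf(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def mixture_nll(x, weights, means, variances):
    """Negative log-likelihood of scalar x under a 1-D Gaussian mixture
    (a stand-in for the MDT likelihood of a spatiotemporal patch)."""
    logs = [math.log(w) + gaussian_logpdf(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)  # log-sum-exp for numerical stability
    return -(top + math.log(sum(math.exp(l - top) for l in logs)))

def temporal_anomaly_map(frame_features, model):
    """Score every patch location of a test frame by its NLL."""
    return [[mixture_nll(x, *model) for x in row] for row in frame_features]

# Hypothetical "normal" model: (weights, means, variances).
normal_model = ([0.6, 0.4], [1.0, 2.0], [0.1, 0.2])

# Hypothetical test frame: one scalar feature per patch location.
test_frame = [[1.0, 1.1, 2.1],
              [0.9, 6.0, 1.9]]

anomaly_map = temporal_anomaly_map(test_frame, normal_model)
```

Patches that the normal model explains poorly (here the value 6.0) receive a high negative log-likelihood and stand out in the map.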
Spatial anomaly detection: saliency and anomaly are related. Discriminant center-surround saliency: the saliency S(l) at location l is the extent to which the center (c=1) can be differentiated from the surround (c=0) [Gao & Vasconcelos `08]. Figure labels: center, surround, marginal.
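In the discriminant center-surround formulation, this notion of saliency is usually expressed as the mutual information between the observed features and the center/surround label. The symbols x, C(l) and X(l) below are notation assumed for this sketch:

```latex
S(l) = I_l(X; C) = \sum_{c=0}^{1} p_{C(l)}(c)\,
       \mathrm{KL}\!\left[\, p_{X \mid C(l)}(x \mid c) \,\big\|\, p_{X(l)}(x) \,\right]
```

Each term compares a class-conditional density (center or surround) with the marginal density over the whole window, so S(l) is large when the center looks unlike its surround.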
MDT for saliency: at each location l, we need to learn 2 MDTs, one for the center and one for the surround (the marginal is the average of the two): computationally expensive. Approximation: learn one global MDT from the spatiotemporal patches of the entire frame, get posterior label assignments, and use these to get the 2 MDTs at each location l.
Posterior label allocation: in each local window (center, c=1; surround, c=0), use the posterior label assignments from the global MDT to compute the mixing fraction of each mixture component, using an indicator that is 1 if a patch is assigned to that mixture component and 0 otherwise.
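A minimal sketch of this allocation step, assuming hard label assignments are already available for every patch. The function name `mixing_fractions` and the example labels are hypothetical:

```python
from collections import Counter

def mixing_fractions(labels, num_components):
    """Fraction of patches in a window assigned to each mixture component.

    labels: component index assigned to each patch in the window.
    """
    if not labels:
        raise ValueError("window contains no patches")
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(k, 0) / total for k in range(num_components)]

# Hypothetical assignments from a 3-component global model.
center_labels = [0, 0, 2, 0]
surround_labels = [1, 1, 0, 1, 2, 1, 1, 1]

pi_center = mixing_fractions(center_labels, 3)      # center window, c=1
pi_surround = mixing_fractions(surround_labels, 3)  # surround window, c=0
```

The center and surround models then share the components of the global mixture and differ only in these mixing fractions, which avoids learning two new mixtures at every location.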
Computing the saliency: the saliency expression requires the KL divergence between 2 MDTs, which cannot be computed in closed form. Use a variational approximation [Hershey & Olsen `07], which relies on the KL between 2 DTs [Chan & Vasconcelos `05].
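The variational approximation of Hershey & Olsen needs only the mixture weights and the pairwise KL divergences between components. The sketch below illustrates it with 1-D Gaussian components as a stand-in for dynamic textures, since the Gaussian KL has a simple closed form; the mixture values are made up.

```python
import math

def kl_gauss(m1, v1, m2, v2):
    """Closed-form KL( N(m1, v1) || N(m2, v2) ) for 1-D Gaussians."""
    return 0.5 * (math.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0)

def kl_variational(f, g, component_kl):
    """Variational approximation to KL(f || g) between two mixtures.

    f, g: lists of (weight, params) pairs.
    component_kl(p, q): KL divergence between two single components.
    Note: very large component KLs can underflow exp(); this sketch
    does not guard against that.
    """
    total = 0.0
    for w_a, f_a in f:
        num = sum(w_b * math.exp(-component_kl(f_a, f_b)) for w_b, f_b in f)
        den = sum(w_b * math.exp(-component_kl(f_a, g_b)) for w_b, g_b in g)
        total += w_a * math.log(num / den)
    return total

def gauss_kl(p, q):
    """Adapter: params are (mean, variance) tuples."""
    return kl_gauss(p[0], p[1], q[0], q[1])

# Hypothetical mixtures: (weight, (mean, variance)).
f = [(0.7, (0.0, 1.0)), (0.3, (3.0, 0.5))]
g = [(0.5, (0.5, 1.0)), (0.5, (4.0, 1.0))]

kl_fg = kl_variational(f, g, gauss_kl)
kl_ff = kl_variational(f, f, gauss_kl)
kl_single = kl_variational([(1.0, (0.0, 1.0))], [(1.0, (1.0, 1.0))], gauss_kl)
```

For mixtures with a single component the approximation reduces to the exact component KL, and it is zero when both mixtures are identical. Swapping `gauss_kl` for a KL between dynamic textures would give the MDT case.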
Spatial anomaly detection pipeline: spatiotemporal patches; learning a mixture model (mixture of dynamic textures); label allocation; center-surround saliency; spatial anomaly map.
Integrated anomaly map: the overall anomaly map is the normalized sum of the temporal and spatial maps. Smoothing with a spatiotemporal filter reduces noisy predictions. Finally, anomalous regions are obtained by thresholding.
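One way these three steps could look for a single frame is sketched below. The details are assumptions: each map is min-max normalized before averaging, a 3x3 spatial box filter stands in for the spatiotemporal filter, and the maps and threshold are made-up values.

```python
def normalize(m):
    """Min-max normalize a 2-D map to [0, 1]."""
    lo = min(min(r) for r in m)
    hi = max(max(r) for r in m)
    span = (hi - lo) or 1.0  # avoid division by zero on a flat map
    return [[(v - lo) / span for v in r] for r in m]

def box_smooth(m):
    """3x3 spatial box filter; border pixels average the neighbors that exist."""
    h, w = len(m), len(m[0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            vals = [m[a][b]
                    for a in range(max(0, i - 1), min(h, i + 2))
                    for b in range(max(0, j - 1), min(w, j + 2))]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out

def integrated_mask(temporal, spatial, threshold):
    """Combine the two maps, smooth, and threshold into a binary mask."""
    t, s = normalize(temporal), normalize(spatial)
    combined = [[(a + b) / 2.0 for a, b in zip(rt, rs)]
                for rt, rs in zip(t, s)]
    smoothed = box_smooth(combined)
    return [[v > threshold for v in r] for r in smoothed]

# Hypothetical maps: a real anomaly in the bottom-right corner,
# plus one isolated noisy response in the temporal map at (0, 0).
temporal = [[4, 0, 0, 0],
            [0, 0, 0, 0],
            [0, 0, 8, 8],
            [0, 0, 8, 8]]
spatial = [[0, 0, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 3, 3],
           [0, 0, 3, 3]]

mask = integrated_mask(temporal, spatial, threshold=0.4)
```

The isolated response at (0, 0) is averaged away by the filter, while the region supported by both maps survives the threshold.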
UCSD Anomaly Dataset (new dataset!): scenes from a pedestrian walkway. Anomalous events are due to non-pedestrian entities or previously unseen motion patterns. Varying crowd density; two different camera views. Typical anomalies: bikers, skaters, small carts, and people cutting across the walkway. Peds1: Training 35; Test: No anomaly 10, Bikes 15, Skaters 8, Carts 3, Walk. Across 2, Others 4. Peds2: 17 5 16 1
Evaluation procedure. Anomaly detection: does the frame have an anomaly or not? {0,1} ground truth for each frame. Anomaly localization: where is the anomaly? Pixel-level ground-truth masks.
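Frame-level detection can be scored by sweeping a threshold over per-frame anomaly scores and comparing against the {0,1} ground truth. The sketch below computes one ROC point per threshold; how a frame score is derived from the anomaly map is not specified here, so the scores and labels are hypothetical.

```python
def roc_point(scores, labels, threshold):
    """Return (false positive rate, true positive rate) at one threshold.

    scores: one anomaly score per frame.
    labels: 1 if the frame contains an anomaly, 0 otherwise.
    Assumes both classes are present in labels.
    """
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    pos = sum(labels)
    neg = len(labels) - pos
    return fp / neg, tp / pos

# Hypothetical per-frame scores and ground truth.
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2]
labels = [0, 0, 1, 1, 1, 0]

fpr_low, tpr_low = roc_point(scores, labels, 0.3)
fpr_high, tpr_high = roc_point(scores, labels, 0.5)
```

Collecting these points over many thresholds traces the ROC curve for a detector.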
Experiments: compared to social force (SF) [Mehran et al `09], MPPCA [Kim & Grauman `09], optical flow monitoring [Adam et al `08], and SF + MPPCA.
Results
Results …
More examples : MDT
ROC curves: Peds1 and Peds2. Anomaly detection: error rate. Peds1: Social Force 31%, MPPCA 40%, Adam et al. 38%, SF+MPPCA 32%, MDT 25%. Peds2: 42% 30% 36%
Anomaly localization: accuracy. Peds1: Social Force 21%, MPPCA 18%, Adam et al. 15%, SF+MPPCA 28%, MDT 45%.
Conclusion: a new anomaly detection approach using a mixture of dynamic textures representation for the video. It enables joint modeling of appearance and dynamics and gives a principled way to account for temporal and spatial outliers. New dataset for benchmarking: http://www.svcl.ucsd.edu/projects/anomaly/dataset.htm
Thank you! visit our poster - F4 http://www.svcl.ucsd.edu/projects/anomaly/
Video results : bike
Video results : another cart
One more
MDT SF+ MPPCA
FAQs. Running time/computation: not real time yet; 45 min to train, ~20 sec to process one frame. Failure modes: when anomalies are small (of the order of patch size).