Activity Analysis in Video
Spring 2005 Computational Intelligence Seminar Series
Partial Review of the Paper "Discovery and Segmentation of Activities in Video" by Matthew Brand (MERL)
Presented by Derek Anderson

Topics
1. TigerPlace Project
2. Monitoring Silhouette Activity
3. Monitoring Object Activity
4. Monitoring Both (Separate or Combined)
5. Hidden Markov Models (Brief Introduction)
6. Evolutionary Computing for Structure Discovery
7. Matthew Brand's Approach to Activity Recognition

Context for this Presentation
- TigerPlace Project
- One component of our system will involve analyzing video in real time and recognizing an important set of "short-term" activities.

Sensor and Video Networks
- We are doing the research for the video sensor network: iPAQ hx4700-series PDAs with HP PhotoSmart digital cameras.
- The results from the video network can be combined with other sources of information from the sensor network (gait monitor, bed sensors, ...) to reduce false-alarm rates and increase the overall confidence that the activities occurred.
- Will this be handled inside the behavior-reasoning component of the system (fuzzy rules)? Fuzzy integrals?
- Fuzzy integral: use each source of information in the sensor and video networks, taking into account how reliable each source is (possibly different for different kinds of tasks), and assess our confidence in a particular hypothesis, i.e. an individual activity (see the sketch below).
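As an illustration of the fuzzy-integral idea, here is a minimal sketch of a discrete Choquet integral fusing per-source confidences. It assumes the fuzzy measure g is already given for every subset of sources; the source names and measure values are invented for illustration, not taken from the project.

```python
# Minimal sketch of a discrete Choquet fuzzy integral for fusing sensor
# confidences. g must be defined on every subset of sources, with
# g(empty set) = 0 and g(all sources) = 1.

def choquet_integral(h, g):
    """h: dict source -> confidence in [0, 1]
       g: dict frozenset(sources) -> fuzzy measure value in [0, 1]"""
    ordered = sorted(h, key=h.get, reverse=True)   # sources by descending confidence
    total, prev_measure = 0.0, 0.0
    coalition = set()
    for src in ordered:
        coalition.add(src)
        measure = g[frozenset(coalition)]
        # Weight each confidence by how much the measure grows when
        # this source joins the coalition.
        total += h[src] * (measure - prev_measure)
        prev_measure = measure
    return total

# Hypothetical fusion of evidence for a "fall" hypothesis.
h = {"video": 0.8, "bed_sensor": 0.4, "gait_monitor": 0.6}
g = {frozenset(): 0.0,
     frozenset({"video"}): 0.5,
     frozenset({"bed_sensor"}): 0.2,
     frozenset({"gait_monitor"}): 0.3,
     frozenset({"video", "bed_sensor"}): 0.6,
     frozenset({"video", "gait_monitor"}): 0.8,
     frozenset({"bed_sensor", "gait_monitor"}): 0.45,
     frozenset({"video", "bed_sensor", "gait_monitor"}): 1.0}
print(choquet_integral(h, g))
```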

Important Elderly Activities
- What kinds of activities should we recognize? Presently, we are deciding on an initial set to study. A few possibilities include:
- Total body motion
  - Falling down (and not being able to get up)
  - Someone entering and leaving their bed
  - Sitting in and getting up from a chair
- Partial body motion
  - Taking their medicine
  - Drinking

Monitoring while Ensuring Privacy
- What features should the video system use? A common approach: silhouettes.
- A silhouette is an image-based representation of an individual with nearly all personal and distinguishing information removed.
- Features from silhouettes will be used to monitor an individual's activity.
- The silhouettes will initially be extracted through image subtraction against a known and stationary background, then cleaned up with binary morphology and a reconstruction operator (see the sketch below).
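A minimal OpenCV sketch of this extraction step, assuming a stored grayscale background image and a fixed threshold (both illustrative); the reconstruction operator is approximated here by simply keeping the largest connected component.

```python
# Rough sketch of silhouette extraction by subtraction against a known,
# stationary background, cleaned up with binary morphology.
import cv2
import numpy as np

background = cv2.imread("background.png", cv2.IMREAD_GRAYSCALE)  # illustrative file names
frame = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)

# Absolute difference against the background, then a fixed threshold.
diff = cv2.absdiff(frame, background)
_, silhouette = cv2.threshold(diff, 30, 255, cv2.THRESH_BINARY)

# Morphological opening removes isolated noise pixels.
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
silhouette = cv2.morphologyEx(silhouette, cv2.MORPH_OPEN, kernel)

# Crude stand-in for the reconstruction step: keep the largest component.
num, labels, stats, _ = cv2.connectedComponentsWithStats(silhouette)
if num > 1:
    largest = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])
    silhouette = np.where(labels == largest, 255, 0).astype(np.uint8)
```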

What the silhouettes really look like (still a very ideal setting)
[Figure: conventional morphological opening of the extracted silhouette (left); morphological reconstruction operation on the extracted silhouette (right)]

Silhouette motion over time (identification of activity regions)
[Figure: consecutive silhouette subtraction (left) and after an additional erosion operation (right)]
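A small sketch of the consecutive-subtraction idea, assuming the two binary silhouette masks have already been extracted as above; the kernel size is an illustrative choice.

```python
import cv2

def activity_regions(prev_sil, curr_sil):
    """prev_sil, curr_sil: binary silhouette masks (uint8) from consecutive frames."""
    motion = cv2.absdiff(curr_sil, prev_sil)              # pixels that changed between frames
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    return cv2.erode(motion, kernel, iterations=1)        # erosion suppresses thin boundary noise
```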

New Application?
- Do not necessarily focus on the silhouettes, but rather on the objects in the environment (or the co-interaction of the two).
- Object or interesting-landmark identification: SIFT (Scale-Invariant Feature Transform).
  - Is there enough interesting texture on everything? Where are the cameras placed?
  - Too complex to apply at first? Will it run in real time? (With present equipment, Bob says no.)
- Low-level, simple image-processing techniques.
  - Have to see what the resolution and quality of the images are.
  - Use simpler image-processing techniques to recognize particular objects.
  - How to deal with some occlusion (why co-interaction might be helpful): the YUV color space was used to identify skin regions, which helped deal with occlusion of objects the individual would interact with (the hands were tracked).
- NLM Short-Term Fellowship (Summer 2004): at the end of the summer, I used Bob's SIFT implementation to identify key points from a pill bottle (using a minimum spanning tree and a density measure). This helped reduce some of the false alarms in the pill-taking activity.

Activity Recognition
- I don't think we have decided on the exact approach to use yet.
- Some form of HMM looks like as good a place as any to start:
  - Simple DOHMMs, COHMMs, or MDCOHMMs
  - HHMMs (hierarchical): "Learning Hierarchical Hidden Markov Models for Video Structure Discovery"
  - Entropic HMMs (structure discovery): "Discovery and Segmentation of Activities in Video"

Temporal Pattern Recognition
- Hidden Markov Models (HMMs) are statistical methods (stochastic networks) that model sequential patterns arising from a set of observation sequences believed to have come from the process of interest.
- HMMs are known for their application in areas such as natural speech recognition, word and symbol recognition, etc.
- An HMM is a doubly embedded stochastic process with an underlying process that is not observable (hidden) and can only be observed through another set of stochastic processes that produce the sequence of observations.
[Trellis diagram: K hidden states unrolled over time, each time step emitting one of the observations x1 ... xK]
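To make the two stochastic layers concrete, here is a toy two-state model; the states, symbols, and probabilities are invented for illustration. The probability computed is for one particular hidden path, whereas the full likelihood sums this quantity over all possible paths.

```python
# Toy illustration of the two layers of an HMM: a hidden Markov chain over
# states and an emission process that produces the observations.
import numpy as np

states = ["resting", "moving"]
observations = ["no_motion", "small_motion", "large_motion"]

pi = np.array([0.7, 0.3])              # initial state distribution
A = np.array([[0.9, 0.1],              # transition probabilities a_st
              [0.2, 0.8]])
B = np.array([[0.8, 0.15, 0.05],       # emission probabilities b_s(o)
              [0.1, 0.40, 0.50]])

obs = [0, 1, 2]                        # no_motion, small_motion, large_motion
path = [0, 1, 1]                       # resting -> moving -> moving
p = pi[path[0]] * B[path[0], obs[0]]
for t in range(1, len(obs)):
    p *= A[path[t - 1], path[t]] * B[path[t], obs[t]]
print(p)                               # joint probability of this path and these observations
```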

Mixture Density Continuous Observation HMM
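The equations on this slide did not survive extraction; a standard form of the mixture-density continuous-observation emission model, consistent with the Rabiner-style notation used elsewhere in the talk, is:

```latex
b_j(\mathbf{o}_t) \;=\; \sum_{k=1}^{M} c_{jk}\,
  \mathcal{N}\!\left(\mathbf{o}_t;\, \boldsymbol{\mu}_{jk},\, \boldsymbol{\Sigma}_{jk}\right),
\qquad \sum_{k=1}^{M} c_{jk} = 1,\quad c_{jk} \ge 0 .
```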

HMM Problems
1) Given the observation sequence O = O1 O2 O3 ... OT and a model λ = (A, B, π), how do we efficiently compute P(O | λ)?
2) Given the observation sequence O and a model λ, how do we choose a corresponding state sequence Q = q1 q2 q3 ... qT that is optimal in some meaningful sense?
3) How do we adjust the model parameters λ to maximize P(O | λ)?
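A minimal sketch of the forward algorithm for Problem 1, reusing the toy model from the earlier slide; Problems 2 and 3 are handled analogously by the Viterbi and Baum-Welch algorithms.

```python
# Forward algorithm: P(O | lambda) for a discrete-observation HMM.
import numpy as np

def forward_likelihood(obs, pi, A, B):
    """obs: sequence of observation indices; returns P(O | lambda)."""
    alpha = pi * B[:, obs[0]]                 # initialization
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]         # induction: sum over previous states
    return alpha.sum()                        # termination

pi = np.array([0.7, 0.3])                     # illustrative toy model
A = np.array([[0.9, 0.1], [0.2, 0.8]])
B = np.array([[0.8, 0.15, 0.05], [0.1, 0.4, 0.5]])
print(forward_likelihood([0, 1, 2, 1], pi, A, B))
```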

Structure Discovery
- A serious problem in deploying HMMs is how to specify or learn the HMM model structure.
- Matthew Brand has proposed a method based on entropy to learn an "optimal" model structure.
- We might look at identifying a general way to learn the model structure in a simpler fashion, independent of the HMM type, since this will be used outside of just a "lab" setting.
- I am presently looking into using Evolutionary Computing (EC) techniques to evolve and learn the HMM structure automatically.
- The difference would be related to the "compression" aspect and the small number of observation sequences Brand claims is sufficient.

EP Overview
[Diagram: each individual is an HMM structure over states S1 ... S4; parents P1, P2, P3 in generation t are mutated into offspring O1, O2, O3; fitness F(Pi) and F(Oi) is evaluated for {P1, P2, P3, O1, O2, O3}; selection produces generation t+1]

Walk before we start running
- Initially:
  - Test how well the procedure works on a fully connected DOHMM when we only mutate the states (add and remove operators).
  - Test a few different measures of complexity (the different fitness functions).
  - Each chromosome in a generation acts as a seed for the next iteration's Baum-Welch algorithm (a rough sketch follows below).
- Later:
  - Consider a more complicated MDCOHMM model.
  - Try to derive a series of equations and mutation operators that can take an initial population estimated by Baum-Welch and evolve what was found (I believe this would be a completely new technique).
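A rough sketch of the initial experiment described above. It assumes the hmmlearn library (in older versions the discrete-observation class is MultinomialHMM rather than CategoricalHMM) and uses a BIC-style penalty as just one candidate fitness; the population sizes, sequences, and penalty are illustrative, not the final design.

```python
# Evolutionary loop over DOHMM structure: each chromosome is a state count,
# mutation adds or removes a state, and every candidate seeds a Baum-Welch run.
import math
import random
import numpy as np
from hmmlearn import hmm

def fitness(n_states, X, lengths, n_symbols):
    model = hmm.CategoricalHMM(n_components=n_states, n_iter=50)
    model.fit(X, lengths)                                 # Baum-Welch training
    loglik = model.score(X, lengths)
    n_params = (n_states * (n_states - 1)                 # transitions
                + n_states * (n_symbols - 1)              # emissions
                + (n_states - 1))                         # initial distribution
    return loglik - 0.5 * n_params * math.log(len(X))     # BIC-style penalty

def mutate(n_states):
    return max(2, n_states + random.choice([-1, 1]))      # add or remove a state

def evolve(X, lengths, n_symbols, parents=(3, 5, 8), generations=10):
    pop = list(parents)
    for _ in range(generations):
        offspring = [mutate(p) for p in pop]
        pop = sorted(pop + offspring,
                     key=lambda k: fitness(k, X, lengths, n_symbols),
                     reverse=True)[:len(parents)]          # (mu + lambda) selection
    return pop[0]

# Toy usage: two short symbol sequences over a 3-symbol alphabet.
seqs = [[0, 1, 2, 1, 0, 0, 1], [2, 2, 1, 0, 1]]
X = np.concatenate(seqs).reshape(-1, 1)
lengths = [len(s) for s in seqs]
print(evolve(X, lengths, n_symbols=3))
```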

Matthew Brand's Approach
- The principle of maximum likelihood is not valid for small data sets; the training data is rarely enough to wash out the sampling artifacts (i.e., noise).
- He also leaves out the obvious issue of whether we have enough observations to estimate all the different parameters in the network (the degrees of freedom). We may only have a small number of observations with a few "reflective" sub-observation sequences.
- He advocates replacing the Baum-Welch formulae with parameter estimators that minimize entropy.
- The claim is that this exploits the duality between learning and compression.

Entropy Minimization
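The equations on this slide were lost in extraction. As a hedged reconstruction from Brand's entropic-estimation papers, the prior over a multinomial parameter vector θ and the resulting MAP objective take roughly the form:

```latex
P_e(\theta) \;\propto\; e^{-H(\theta)} \;=\; \prod_i \theta_i^{\theta_i},
\qquad
\hat{\theta} \;=\; \arg\max_{\theta}\; P(O \mid \theta)\, e^{-H(\theta)} .
```

Maximizing this posterior drives weakly supported parameters toward zero ("parameter extinction"), which is what trims transitions and states out of the model.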

First Setup
- A variety of activities, from picking up the phone (a few seconds) to activities such as writing (which could take hours).
- Used a "blob" representation consisting of ellipse parameters fit to the single largest connected set of active pixels.
- Background subtraction through a statistical model of the background and an adaptive Gaussian color/location model (distinguishing pixels that have changed from those that differ due to motion).
- Cleaned up the "blob" through dilation (he makes reference to using a seed from the previous frame).
- The observation vector uses high-level geometric features calculated from the mean and eigenvectors of a 2D Gaussian fitted to the foreground pixels (see the sketch below).
- 30 minutes of data taken at random; frames with no one in the video were removed, leaving roughly 21 minutes.
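A sketch of how such blob features could be computed from a binary foreground mask; the 1-sigma axis scaling is an arbitrary choice.

```python
# Fit a 2D Gaussian to the foreground pixels and read off ellipse parameters
# (center, axis lengths, orientation) from its mean and eigenvectors.
import numpy as np

def blob_ellipse(mask):
    """mask: 2D binary array; returns (center_xy, axis_lengths, angle_rad)."""
    ys, xs = np.nonzero(mask)
    pts = np.stack([xs, ys], axis=1).astype(float)
    mean = pts.mean(axis=0)                       # ellipse center
    cov = np.cov(pts, rowvar=False)               # 2x2 covariance of pixel coordinates
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    axes = 2.0 * np.sqrt(eigvals)                 # ~1-sigma axis lengths
    major = eigvecs[:, np.argmax(eigvals)]
    angle = np.arctan2(major[1], major[0])        # orientation of the major axis
    return mean, axes, angle
```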

Training
- Only three sequences were used for training.
- They varied from 100 to 1,900 frames in length.
- Number of states: {12, 16, 20, 25, 30}

Procedure 1: Model Activity

Procedure 2: Monitoring Traffic

Monitoring Simultaneous Processes
- HMMs are traditionally used to model a single hidden process.
- Brand modified HMMs to take a varying number of observations per time step (I don't know if he is the first; he claims this is novel).
- The new image representation is a variable-length list of flow vectors between two subsequent images; flow vectors smaller than a predefined threshold are disregarded (see the sketch below).
- The model learns the typical locations and directions of the moving pixels, and the dynamic changes of these patterns.
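A sketch of building that flow-vector list with OpenCV's Farnebäck dense optical flow; the flow parameters and the magnitude threshold are illustrative choices, not the paper's settings.

```python
# Variable-length flow-vector representation: dense optical flow between two
# consecutive grayscale frames, keeping only vectors above a magnitude threshold.
import cv2
import numpy as np

def flow_list(prev_gray, curr_gray, min_mag=1.0):
    """Returns an (N, 4) array of (x, y, dx, dy) flow vectors for one frame pair."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag = np.linalg.norm(flow, axis=2)
    ys, xs = np.nonzero(mag > min_mag)            # discard near-static pixels
    dx, dy = flow[ys, xs, 0], flow[ys, xs, 1]
    return np.stack([xs, ys, dx, dy], axis=1)
```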

Internals
- Brand uses a modified version of a multivariate Gaussian mixture model.
- He deals with multiple observations per time step by treating each frame's flow list as an observation sequence for a mixture model at one time step.

multi-observation-mixture+counter (MOMC) HMM
- The first term is a distribution on the observation count.
- The mixture Gaussians are 4D, observing flow vectors in (x, y, dx, dy) space.
- The mixture components model motion in particular directions and locations.
- The counter variable essentially models the combined surface area of the moving objects.
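Reading the bullets above together, the per-state likelihood of one frame's variable-length flow list plausibly has the form below; this is a hedged reconstruction from the description, not the paper's exact notation.

```latex
P(\mathbf{o}_{t,1:N_t},\, N_t \mid s)
  \;=\; P(N_t \mid s)\;
  \prod_{n=1}^{N_t} \sum_{k=1}^{M} c_{sk}\,
  \mathcal{N}\!\left(\mathbf{o}_{t,n};\, \boldsymbol{\mu}_{sk},\, \boldsymbol{\Sigma}_{sk}\right),
\qquad \mathbf{o}_{t,n} = (x, y, dx, dy).
```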

Any Questions?

HMM Links
- Hidden Markov Models (general introductions): krogh96/cabios.html
- Baum-Welch algorithm and EM (simpler math derivation): Bilmes
- Entropic Hidden Markov Models (Matthew Brand): "Discovery and Segmentation of Activities in Video," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22, No. 8, Aug. 2000
- Fuzzy Hidden Markov Models (Gader and Mohammed): "Generalized Hidden Markov Models – Part I: Theoretical Frameworks," IEEE Transactions on Fuzzy Systems, Vol. 8, No. 1, Feb. 2000