Learning the space of time warping functions for Activity Recognition Function-Space of an Activity Ashok Veeraraghavan Rama Chellappa Amit K. Roy-Chowdhury.

Slides:



Advertisements
Similar presentations
Coherent Laplacian 3D protrusion segmentation Oxford Brookes Vision Group Queen Mary, University of London, 11/12/2009 Fabio Cuzzolin.
Advertisements

Bilinear models for action and identity recognition Oxford Brookes Vision Group 26/01/2009 Fabio Cuzzolin.
Principal Component Analysis Based on L1-Norm Maximization Nojun Kwak IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008.
Word Spotting DTW.
Human Identity Recognition in Aerial Images Omar Oreifej Ramin Mehran Mubarak Shah CVPR 2010, June Computer Vision Lab of UCF.
1. A given pattern p is sought in an image. The pattern may appear at any location in the image. The image may be subject to some tone changes. 2 pattern.
Extended Gaussian Images
One-Shot Learning Gesture Recognition Students:Itay Hubara Amit Nishry Supervisor:Maayan Harel Gal-On.
Patch to the Future: Unsupervised Visual Prediction
Modeling the Shape of People from 3D Range Scans
IIIT Hyderabad Pose Invariant Palmprint Recognition Chhaya Methani and Anoop Namboodiri Centre for Visual Information Technology IIIT, Hyderabad, INDIA.
Generation of Virtual Image from Multiple View Point Image Database Haruki Kawanaka, Nobuaki Sado and Yuji Iwahori Nagoya Institute of Technology, Japan.
Computer Vision Laboratory 1 Unrestricted Recognition of 3-D Objects Using Multi-Level Triplet Invariants Gösta Granlund and Anders Moe Computer Vision.
International Conference on Automatic Face and Gesture Recognition, 2006 A Layered Deformable Model for Gait Analysis Haiping Lu, K.N. Plataniotis and.
Multiple People Detection and Tracking with Occlusion Presenter: Feifei Huo Supervisor: Dr. Emile A. Hendriks Dr. A. H. J. Stijn Oomes Information and.
Image Indexing and Retrieval using Moment Invariants Imran Ahmad School of Computer Science University of Windsor – Canada.
Optimization & Learning for Registration of Moving Dynamic Textures Junzhou Huang 1, Xiaolei Huang 2, Dimitris Metaxas 1 Rutgers University 1, Lehigh University.
Local Descriptors for Spatio-Temporal Recognition
Relevance Feedback Content-Based Image Retrieval Using Query Distribution Estimation Based on Maximum Entropy Principle Irwin King and Zhong Jin Nov
Model: Parts and Structure. History of Idea Fischler & Elschlager 1973 Yuille ‘91 Brunelli & Poggio ‘93 Lades, v.d. Malsburg et al. ‘93 Cootes, Lanitis,
Shape and Dynamics in Human Movement Analysis Ashok Veeraraghavan.
Non-metric affinity propagation for unsupervised image categorization Delbert Dueck and Brendan J. Frey ICCV 2007.
Face Recognition Under Varying Illumination Erald VUÇINI Vienna University of Technology Muhittin GÖKMEN Istanbul Technical University Eduard GRÖLLER Vienna.
Shape and Dynamics in Human Movement Analysis Ashok Veeraraghavan.
A Study of Approaches for Object Recognition
Motion based Correspondence for Distributed 3D tracking of multiple dim objects Ashok Veeraraghavan.
Recognition of Human Gait From Video Rong Zhang, C. Vogler, and D. Metaxas Computational Biomedicine Imaging and Modeling Center Rutgers University.
Prénom Nom Document Analysis: Data Analysis and Clustering Prof. Rolf Ingold, University of Fribourg Master course, spring semester 2008.
Parametric Inference.
Recognizing and Tracking Human Action Josephine Sullivan and Stefan Carlsson.
Gait Recognition Simon Smith Jamie Hutton Thomas Moore David Newman.
Dynamic Time Warping Applications and Derivation
Hand Signals Recognition from Video Using 3D Motion Capture Archive Tai-Peng Tian Stan Sclaroff Computer Science Department B OSTON U NIVERSITY I. Introduction.
Relevance Feedback Content-Based Image Retrieval Using Query Distribution Estimation Based on Maximum Entropy Principle Irwin King and Zhong Jin The Chinese.
Statistical Shape Models Eigenpatches model regions –Assume shape is fixed –What if it isn’t? Faces with expression changes, organs in medical images etc.
Face Recognition and Retrieval in Video Basic concept of Face Recog. & retrieval And their basic methods. C.S.E. Kwon Min Hyuk.
Ensuring Home-based Rehabilitation Exercise by Using Kinect and Fuzzified Dynamic Time Warping Algorithm Qiao Zhang.
Overview Introduction to local features
Recognizing Action at a Distance A.A. Efros, A.C. Berg, G. Mori, J. Malik UC Berkeley.
Similarity measuress Laboratory of Image Analysis for Computer Vision and Multimedia Università di Modena e Reggio Emilia,
Presented by Tienwei Tsai July, 2005
Abstract Developing sign language applications for deaf people is extremely important, since it is difficult to communicate with people that are unfamiliar.
Miguel Reyes 1,2, Gabriel Dominguez 2, Sergio Escalera 1,2 Computer Vision Center (CVC) 1, University of Barcelona (UB) 2
General Tensor Discriminant Analysis and Gabor Features for Gait Recognition by D. Tao, X. Li, and J. Maybank, TPAMI 2007 Presented by Iulian Pruteanu.
Raviteja Vemulapalli University of Maryland, College Park.
Mingyang Zhu, Huaijiang Sun, Zhigang Deng Quaternion Space Sparse Decomposition for Motion Compression and Retrieval SCA 2012.
ICDE, San Jose, CA, 2002 Discovering Similar Multidimensional Trajectories Michail VlachosGeorge KolliosDimitrios Gunopulos UC RiversideBoston UniversityUC.
Fourier Descriptors For Shape Recognition Applied to Tree Leaf Identification By Tyler Karrels.
Action and Gait Recognition From Recovered 3-D Human Joints IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS— PART B: CYBERNETICS, VOL. 40, NO. 4, AUGUST.
Extracting features from spatio-temporal volumes (STVs) for activity recognition Dheeraj Singaraju Reading group: 06/29/06.
CVPR2013 Poster Detecting and Naming Actors in Movies using Generative Appearance Models.
CS Statistical Machine learning Lecture 24
P RW GEI: Poisson Random Walk based Gait Recognition Intelligent Systems Research Centre School of Computing and Intelligent Systems,
MURI Annual Review, Vanderbilt, Sep 8 th, 2009 Heterogeneous Sensor Webs for Automated Target Recognition and Tracking in Urban Terrain (W911NF )
Multi-Speaker Modeling with Shared Prior Distributions and Model Structures for Bayesian Speech Synthesis Kei Hashimoto, Yoshihiko Nankaku, and Keiichi.
3D Face Recognition Using Range Images
Bayesian Speech Synthesis Framework Integrating Training and Synthesis Processes Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda Nagoya Institute.
University of South Florida, Tampa1 Gait Recognition and Inverse Biometrics Sudeep Sarkar (Zongyi Liu, Pranab Mohanty) Computer Science and Engineering.
Multi-view Synchronization of Human Actions and Dynamic Scenes Emilie Dexter, Patrick Pérez, Ivan Laptev INRIA Rennes - Bretagne Atlantique
Presenter: Jae Sung Park
Learning Image Statistics for Bayesian Tracking Hedvig Sidenbladh KTH, Sweden Michael Black Brown University, RI, USA
Gait Recognition Gökhan ŞENGÜL.
Dynamical Statistical Shape Priors for Level Set Based Tracking
Simon Smith Jamie Hutton Thomas Moore David Newman
The Functional Space of an Activity Ashok Veeraraghavan , Rama Chellappa, Amit Roy-Chowdhury Avinash Ravichandran.
Raviteja Vemulapalli University of Maryland, College Park.
Outline H. Murase, and S. K. Nayar, “Visual learning and recognition of 3-D objects from appearance,” International Journal of Computer Vision, vol. 14,
The EM Algorithm With Applications To Image Epitome
View-Invariant Representation and Recognition of actions
Presentation transcript:

Learning the space of time warping functions for Activity Recognition Function-Space of an Activity Ashok Veeraraghavan Rama Chellappa Amit K. Roy-Chowdhury

Human Motion Analysis [Johansson 1973] 3D models of humans with joint angles Motion capture [mocap.cs.cmu.edu] Shape Sequences [Veeraraghavan 2004]

Human Action Recognition – Prior Work Non-Imaging based methods : Motion capture, light displays etc Point Light Displays 3D Motion capture GPS Tracking Anthropometry View Invariance Execution Rate Modeling Point Trajectories Local Affine Invariant Models View invariant moments 3D models of human action Modeling View-Transition Gait - [Bobick et. al. 2003] Activity - [Sheikh et. al. 2005] Anthropometry Invariants Dynamic Models Shape Normalization

Why is learning Execution rate important? Sequence 1 Sequence 2 Average Sequence Ghost Hands Ghost Heads Time Warped Average Sequence Structural Inconsistencies Structurally consistent average Sequence Structural Inconsistencies. (2 heads, 4 arms etc.) Wrong match in recognition experiments. Algorithms attempt to explain temporal variation by modeling feature variation.

Feature Space Time Warping Space Model for an Activity Possible Features: Silhouettes, 3D motion capture, shape of outline, body angles, location of joints in 3D/2D etc Space of functions Space of time warping functions W3 Run W1 Jog W2 Sit A a1(t): Jog a2(t): Sit

Modeling time-warping for Activities Model For an Activity Nominal activity trajectory : a(t); t ε (0,1) W : Space of time warping functions. Realizations of the activity Also need to know how to sample candidate functions “f” from W. Properties of functions in A 1.All realizations of the activity starts at time t=0 and ends at time t=1, i.e., f(0)=0 and f(1)=1. 2.The order of action units for each activity remains unaltered for all realizations i.e., 3.We note that A is a convex set., i.e., if f 1 and f 2 are in A, then for α є (0,1) f is also in A.

Activity Specific time-warping space W W is a subset of A the space of time warping functions. f(t) = t is a candidate function in W. This represents no time warping. It is reasonable to assume that W is pointwise convex, i.e., for all f1,f2 Є W and α Є (0,1), f = α f1 + (1-α ) f2 is also in W.  Since the derivative is a linear operator, this means that if the rate of execution of some action can be speeded up by factors α1 and α2 then it can also be speeded up by any factor β in between α1 and α2. This is not just reasonable but in fact desirable.

Activity Specific time-warping space W These properties imply that W can be represented by a warping constraint window, given by two functions u(t) and l(t) where u(t) is the upper bounding function and l(t) is the lower bounding function. Time warping functions f in the activity specific warping space W are such that W

From DTW to Activity Specific Warping Constraints

Symmetric Representation of Activity Model Model parameters {a, W} non-unique. There exists an equivalence class of model parameters. Each equivalence class contains one member whose model parameters are symmetric. Learning and inference made only on symmetric representation of model in order to ensure uniqueness.

Learning Symmetric Representation of Activity Model Learn the nominal activity trajectory a(t) Learn the functional space of time-warps W. Learning algorithm can be based on any chosen time alignment procedure. We have used the Dynamic Time Warping (DTW) for time alignment. EM based learning algorithm Expectation Maximization

Learning the Model

Pre-processing and Feature Extraction 1.Background Subtraction 2.Connected Component analysis 3.Extract Shape Feature. (Kendall’s statistical shape) 4.Shape Feature lives on a spherical shape-space. 5.Use appropriate distance measures like Procrustes distance for local distance computations. [Veeraraghavan et. al.] CVPR 2004, PAMI 2005

Results on USF Data 70 people, upto 10 sequences per person Variabilities: shoe type, surface, view point. Figure: CMS curve for various algorithms on the USF database. Baseline : [Sarkar 2005] DTW Shape, HMM Shape: [Veeraraghavan 2004] HMM Image : [Kale 2004]

Activity Recognition on UMD Database 1.Database of 10 Activities and 10 Sequences per activity % Recognition and increase in discrimination in the similarity matrix compared to traditional DTW.

Organize a large Database Hierarchically (Dendrogram) 1.USF Database. Total 1870 Sequences of 122 individuals. 2.No: of Leaves at every node = 3. 3.No: of Levels of Dendrogram = 3

Conclusions Modeling, Learning and accounting for time warping is important for activity recognition. We proposed a convex activity specific function space for time warping functions and derived learning, recognition and clustering algorithms using this model. Appropriate feature selection will enable view and anthropometry invariance.

Thank You! Contact :

Thank You! Contact :

Human Action Recognition – Prior Work Non-Imaging based methods : Motion capture, light displays etc [Murray 1964] [Johansson1973] [Cunado et. al. 1994] Anthropometry View Invariance Execution Rate [Seitz et. al. 1997] [Yacoob et. al. 1998] [Rao et. al. 2002] [Parameswaran et. al. 2003] [Syeda-Mahmood et. al. 2001] [Bobick et. al. 2003] [Sheikh et. al. 2005] [Bissacco et. al. 2001] [Gritai et. al. 2004][Kale et. al. 2004] [Veeraraghavan et. al. 2004]