Visual Tracking

Conventional approach:
- Build a model before tracking starts
- Use contours, color, or appearance to represent an object
- Optical flow
- Incorporate invariance to cope with variation in pose, lighting, view angle, ...
View-based approach:
- Solve a complicated optimization problem
Problem: object appearance and environments are always changing

Prior Art

Eigentracking [Black et al. 96]
- View-based learning method
- Learns to track a "thing" rather than some "stuff"
- Needs to solve a nonlinear optimization problem
Active contour [Isard and Blake 96]
- Uses importance sampling
- Propagates uncertainty over time
- Edge information is sensitive to lighting change
Gradient-based [Shi and Tomasi 94, Hager and Belhumeur 96]
- Template-based
- Constructs an illumination cone per person to handle lighting change
WSL model [Jepson et al. 2001]
- Models each pixel as a mixture of Gaussians (MoG)
- On-line learning of the MoG
- Tracks "stuff"

Incremental Visual Learning

Aim: build a tracker that
- Is not view-based
- Constantly updates the model
- Runs fast (close to real time)
- Tracks a "thing" (structure information) rather than "stuff" (a collection of pixels)
- Operates with a moving camera
- Learns a representation while tracking
Challenges:
- Pose variation
- Partial occlusion
- Adapting to new environments
- Illumination change
- Drift

Main Idea

An adaptive visual tracker:
- Particle filter algorithm
  - draws samples from distributions
  - does not need to solve nonlinear optimization problems
- Subspace-based tracking
  - learns to track the "thing"
  - uses the subspace to determine the most likely sample
- With incremental update
  - does not need to build the model prior to tracking
  - handles variation in lighting, pose, and expression
Performs well under large variation in:
- Pose
- Lighting (cast shadows)
- Rotation
- Expression change
Joint work with David Ross (Toronto), Jongwoo Lim (UIUC/UCSD), and Ruei-Sung Lin (UIUC)

Two Sampling Algorithms

- A simple sampling method with incremental subspace update [ECCV04] (joint work with David Ross and Jongwoo Lim)
- A sequential inference sampling method with a novel subspace update algorithm [NIPS05a, NIPS05b] (joint work with David Ross, Jongwoo Lim, and Ruei-Sung Lin)

Graphical Model for Inference

Given the previous location L_{t-1} and the current observation F_t, infer the target location L_t in the current frame:

p(L_t | F_t, L_{t-1}) ∝ p(F_t | L_t) p(L_t | L_{t-1})

- p(L_t | L_{t-1}): dynamic model; use Brownian motion to model the dynamics
- p(F_t | L_t): observation model; use an eigenbasis with approximation

Dynamic Model: p(L_t | L_{t-1})

Representation of L_t: position (x_t, y_t), rotation r_t, and scale s_t:

L_t = (x_t, y_t, r_t, s_t)

or an affine transform with 6 parameters.
Simple dynamics model: each parameter is independently Gaussian distributed.
[Diagram: graphical-model nodes L_t → L_{t+1}]
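As a concrete illustration, here is a minimal sketch of drawing candidate states from this Brownian-motion model. The function name and the per-parameter noise levels are our assumptions, not values from the talk.

```python
import numpy as np

def sample_dynamics(l_prev, sigmas, n_samples):
    """Draw candidate states around the previous state L_{t-1}.
    Each parameter (x, y, rotation, scale) is perturbed by
    independent Gaussian noise, i.e. Brownian-motion dynamics.
    l_prev: shape (4,) holding (x, y, r, s); sigmas: shape (4,)."""
    noise = np.random.randn(n_samples, len(l_prev)) * sigmas
    return l_prev + noise

# Example: 500 candidates around the current estimate (illustrative values).
samples = sample_dynamics(np.array([160.0, 120.0, 0.0, 1.0]),
                          sigmas=np.array([5.0, 5.0, 0.02, 0.01]),
                          n_samples=500)
```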

Observation Model: p(F_t | L_t)

Use probabilistic principal component analysis (PPCA) to model the image observation process. Given a location L_t, assume the observed frame was generated from the eigenbasis.
The probability of observing a datum z given the eigenbasis B and mean μ is

p(z | B) = N(z; μ, BB^T + εI)

where εI is additive Gaussian noise. In the limit ε → 0, N(z; μ, BB^T + εI) is proportional to the negative exponential of the squared distance between z and the linear subspace B, i.e.,

p(z | B) ∝ exp(−||(z − μ) − BB^T(z − μ)||²)
p(F_t | L_t) ∝ exp(−||(F_t − μ) − BB^T(F_t − μ)||²)

[Diagram: point z, mean μ, and the projection of z onto subspace B]
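A minimal sketch of this limiting-case log-likelihood, assuming a flattened image patch z, an orthonormal basis B, and mean μ:

```python
import numpy as np

def observation_log_likelihood(z, B, mu):
    """Log-likelihood (up to a constant) of patch z under the
    subspace model in the eps -> 0 limit: the negative squared
    distance between (z - mu) and its projection onto span(B).
    z: (d,) flattened patch; B: (d, k) orthonormal basis; mu: (d,)."""
    diff = z - mu
    residual = diff - B @ (B.T @ diff)  # component orthogonal to span(B)
    return -np.dot(residual, residual)
```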

Inference

Fully Bayesian inference needs to compute p(L_t | F_t, F_{t-1}, F_{t-2}, ..., L_0).
This is infeasible to compute in closed form, so we approximate it with p(L_t | F_t, l*_{t-1}), where l*_{t-1} is the best prediction at time t−1.

Sampling Comes to the Rescue

- Draw a number of sample locations from the prior p(L_t | l*_{t-1})
- For each sample l_s, compute the posterior p_s = p(l_s | F_t, l*_{t-1})
- p_s is the likelihood of l_s under the PPCA distribution, times the probability with which l_s was sampled
- Maximum a posteriori estimate: l*_t = arg max_{l_s} p(l_s | F_t, l*_{t-1})
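A sketch of the MAP step, reusing sample_dynamics and observation_log_likelihood from the sketches above; crop is a hypothetical helper that warps the frame at a candidate location into a flattened, normalized patch:

```python
import numpy as np

def map_estimate(samples, frame, crop, B, mu):
    """Score each candidate location l_s under the PPCA model
    and return the MAP estimate l*_t."""
    scores = [observation_log_likelihood(crop(frame, l), B, mu)
              for l in samples]
    return samples[int(np.argmax(scores))]
```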

Remarks

We specifically do not assume that the probability of an observation remains fixed over time, in order to allow incremental updates of the object model.
Given an initial eigenbasis B_{t-1} and a new observation w_{t-1}, we compute a new eigenbasis B_t; B_t is then used in p(F_t | L_t).

Incremental Subspace Update

- Accounts for appearance change due to pose, illumination, and shape variation
- Learns a representation while tracking
- Based on the R-SVD algorithm [Golub and Van Loan 96] and the sequential Karhunen-Loeve algorithm [Levy and Lindenbaum 00]
- First assume a zero (or fixed) mean; then develop an update algorithm with respect to a running mean

R-SVD Algorithm

Let A = UΣV^T be the SVD of the data seen so far, and let E be the matrix of new observations.
Decompose E into its projection onto U and its orthogonal complement: let Ẽ = orth(E − UU^T E).
The SVD of [A E] can be written as

[A E] = [U Ẽ] K [V 0; 0 I]^T,  where K = [Σ U^T E; 0 Ẽ^T(E − UU^T E)]

Compute the SVD of the small matrix K = U′Σ′V′^T.
Then the SVD of [A E] is

[A E] = ([U Ẽ] U′) Σ′ ([V 0; 0 I] V′)^T
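A compact sketch of this update in code (the fixed-mean version; the running-mean correction comes later in the talk). The function name and the truncation to the top-k basis vectors are our own choices:

```python
import numpy as np

def rsvd_update(U, S, E, k):
    """One R-SVD step: update eigenbasis U (d x k) and singular
    values S (k,) with a block of new zero-mean observations E (d x m)."""
    proj = U.T @ E                        # coefficients of E in span(U)
    E_orth = E - U @ proj                 # component orthogonal to span(U)
    Q, R = np.linalg.qr(E_orth)           # orthonormalize the residual
    # Small (k+m) x (k+m) matrix whose SVD yields the updated factors.
    K = np.block([[np.diag(S), proj],
                  [np.zeros((R.shape[0], len(S))), R]])
    Up, Sp, _ = np.linalg.svd(K, full_matrices=False)
    U_new = np.hstack([U, Q]) @ Up
    return U_new[:, :k], Sp[:k]           # keep the top-k subspace
```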

Schematically

Decompose a vector into components within and orthogonal to the subspace, then compute a smaller SVD.

Putting It All Together

1. (Optional) Construct an initial eigenbasis if necessary (e.g., for initial detection)
2. Choose the initial location L_0
3. Search possible locations: p(L_t | L_{t-1})
4. Predict the most likely location: p(L_t | F_t, L_{t-1})
5. Update the eigenbasis using the R-SVD algorithm
6. Go to step 3

A minimal code sketch wiring these steps together follows.
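This sketch reuses the functions above; init_basis, first_location, get_frames, crop, mu, and sigmas are hypothetical stand-ins for detection, video I/O, and model state:

```python
import numpy as np

# Hypothetical helpers: init_basis, first_location, get_frames, crop,
# plus model state mu (subspace mean) and sigmas (dynamics noise).
U, S = init_basis()                  # step 1: optional initial eigenbasis
l_star = first_location()            # step 2: initial location L_0
patches = []
for frame in get_frames():
    samples = sample_dynamics(l_star, sigmas, n_samples=500)  # step 3
    l_star = map_estimate(samples, frame, crop, U, mu)        # step 4
    patches.append(crop(frame, l_star))
    if len(patches) == 5:            # step 5: update every 5 frames
        E = np.column_stack(patches) - mu[:, None]  # zero-mean block
        U, S = rsvd_update(U, S, E, k=U.shape[1])
        patches = []
```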

Experiments

- 30 frames per second at 320 x 240 pixel resolution
- Draw at most 500 samples
- 50 eigenvectors
- Update every 5 frames
- Runs at 8 frames per second in Matlab
Results:
- Large pose variation
- Large lighting variation
- Previously unseen object

Most Recent Work

- Subspace update with the correct mean
- Sequential inference model
- Tracking without building a subspace a priori; learn a representation while tracking
- Better observation model
- Handling occlusion

Sequential Inference Model

Let X_t be the hidden state variable describing the motion parameters. Given the set of observed images I_t = {I_1, ..., I_t} and using Bayes' theorem,

p(X_t | I_t) ∝ p(I_t | X_t) ∫ p(X_t | X_{t-1}) p(X_{t-1} | I_{t-1}) dX_{t-1}

We need to compute:
- Dynamic model: p(X_t | X_{t-1})
- Observation model: p(I_t | X_t)
Use a Gaussian distribution for the dynamic model: p(X_t | X_{t-1}) = N(X_t; X_{t-1}, Σ)
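One sequential Monte Carlo step consistent with this recursion might look like the following sketch; the resample-propagate-reweight scheme and the log_lik scoring function are our assumptions:

```python
import numpy as np

def particle_filter_step(particles, weights, frame, log_lik, sigma_diag):
    """Resample particles by their weights, propagate them through the
    Gaussian dynamic model N(X_t; X_{t-1}, Sigma) with diagonal noise
    sigma_diag, and re-weight by the observation likelihood.
    log_lik(frame, x) is an assumed per-particle scoring function."""
    n = len(particles)
    idx = np.random.choice(n, size=n, p=weights)           # resample
    moved = particles[idx] + np.random.randn(n, particles.shape[1]) * sigma_diag
    log_w = np.array([log_lik(frame, x) for x in moved])
    log_w -= log_w.max()                                    # numerical stability
    w = np.exp(log_w)
    return moved, w / w.sum()
```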

Observation Model

Use probabilistic PCA to model the observation with
- distance to subspace: p_dt(I_t | X_t) = N(I_t; μ, UU^T + εI)
- distance within subspace: p_dw(I_t | X_t) = N(I_t; μ, UΣ^{-2}U^T)
It can be shown that

p(I_t | X_t) = p_dt(I_t | X_t) p_dw(I_t | X_t) = N(I_t; μ, UU^T + εI) N(I_t; μ, UΣ^{-2}U^T)

[Diagram: point z, subspace U, distance-to-subspace d_t and distance-within-subspace d_w]
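A minimal sketch of evaluating both factors in log space, under the reading that the within-subspace factor is a Mahalanobis distance of the projection coefficients with per-direction scales given by the singular values; eps and the names are assumptions:

```python
import numpy as np

def log_likelihood_two_part(z, U, sigma_vals, mu, eps=1e-2):
    """Combine distance-to-subspace and distance-within-subspace
    in log space (up to additive constants).
    z: (d,) patch; U: (d, k) orthonormal basis;
    sigma_vals: (k,) singular values; eps: assumed noise level."""
    diff = z - mu
    coeff = U.T @ diff                           # projection coefficients
    residual = diff - U @ coeff                  # orthogonal component
    log_dt = -np.dot(residual, residual) / eps   # distance to subspace
    log_dw = -np.sum((coeff / sigma_vals) ** 2)  # distance within subspace
    return log_dt + log_dw
```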

Incremental Update of the Eigenbasis

The R-SVD and SKL methods assume a fixed sample mean, i.e., uncentered PCA. We derive a method to incrementally update the eigenbasis with the correct sample mean. See [Jolliffe 96] for arguments on centered versus uncentered PCA.

R-SVD with Updated Mean

Proposition: Let I_p = {I_1, ..., I_n}, I_q = {I_{n+1}, ..., I_{n+m}}, and I_r = {I_1, ..., I_n, I_{n+1}, ..., I_{n+m}}. Denote the means and scatter matrices of I_p, I_q, I_r as μ_p, μ_q, μ_r and S_p, S_q, S_r respectively. Then

S_r = S_p + S_q + nm/(n+m) (μ_p − μ_q)(μ_p − μ_q)^T

Using this proposition, augment the new observations with the additional column sqrt(nm/(n+m)) (μ_q − μ_p), so the scatter matrix has the correct mean. Modify the R-SVD algorithm with this augmented data matrix; the rest is the same.
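A direct transcription of the proposition as a sketch, with our own function name:

```python
import numpy as np

def merged_mean_scatter(mu_p, S_p, n, mu_q, S_q, m):
    """Combine the means and scatter matrices of two data blocks
    I_p (n samples) and I_q (m samples) via
    S_r = S_p + S_q + nm/(n+m) (mu_p - mu_q)(mu_p - mu_q)^T."""
    mu_r = (n * mu_p + m * mu_q) / (n + m)       # merged sample mean
    d = (mu_p - mu_q)[:, None]                   # mean difference, column vector
    S_r = S_p + S_q + (n * m / (n + m)) * (d @ d.T)
    return mu_r, S_r
```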

Occlusion Handling

An iterative method to compute a weight mask that estimates the probability that each pixel is occluded. Given an observation I_t, initially assume there is no occlusion (W^(0) is all ones), then iteratively re-estimate the mask from the weighted reconstruction, where .* denotes element-wise multiplication.
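The exact update equation was on the slide but did not survive extraction; below is a plausible sketch under assumed details (Gaussian re-weighting of the reconstruction residual, an assumed sigma, and a fixed iteration count):

```python
import numpy as np

def occlusion_mask(z, U, mu, n_iters=5, sigma=0.1):
    """Iteratively estimate a per-pixel weight mask W in [0, 1];
    small weights mark pixels likely to be occluded.
    z: (d,) flattened patch; U: (d, k) orthonormal basis; mu: (d,)."""
    W = np.ones_like(z)                        # W^(0): assume no occlusion
    for _ in range(n_iters):
        diff = W * (z - mu)                    # element-wise weighting (.*)
        recon = U @ (U.T @ diff)               # reconstruct from the subspace
        residual = (z - mu) - recon
        W = np.exp(-residual ** 2 / sigma ** 2)  # large residual -> low weight
    return W
```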

Experiments

- Videos recorded at 15 or 30 frames per second, 320 x 240 gray-scale images
- 6 affine motion parameters
- 16 eigenvectors
- Images normalized to 32 x 32 pixels
- Update every 5 frames
- Runs at 4 frames per second
Results:
- Dudek sequence (partial occlusion)
- Sylvester Jr (30 frames per second with cluttered background)
- Large lighting variation with a moving camera (15 frames per second)
- Heavily shadowed conditions with a moving camera (15 frames per second)

Does the Incremental Update Work Well?

Compare the results of:
- 121 incremental updates (every 5 frames) using our method
- Conventional PCA using all 605 images
Tracking results and reconstructions:
- Reconstruction using our method: residue 5.65 x per pixel
- Reconstruction using all images: residue 5.73 x per pixel

Future Work

- Verify incoming samples
- Construct a global manifold
- Handle drift
- Learn the dynamics
- Recover from failure
- Infer 3D structure from the video stream
- Analyze lighting variation

Concluding Remarks

- Adaptive tracker
- Works with a moving camera
- Handles variation in pose, illumination, and shape
- Handles occlusions
- Learns a representation while tracking
- Reasonably fast