Pixels and Particles Sequential Monte Carlo for Image Analysis Patrick Pérez Microsoft Research, Cambridge, UK Workshop on Particle.

1 Pixels and Particles Sequential Monte Carlo for Image Analysis Patrick Pérez Microsoft Research, Cambridge, UK Workshop on Particle Filter, Paris, 2-3 Dec. 2002

2 Outline  Foreword  Visual Tracking  Specificities  Why it is difficult  Why particle filters are appealing  Various forms  Six years of visual tracking with particles  Tracking at Oxford  Blossom  Two applied examples at Microsoft Research  Color-based tracking  Interactive contour extraction  Thoughts: promises, pitfalls, open problems, and alternatives

3 Why SMC with Images?  Probabilistic generative models: powerful for a large range of image analysis and computer vision tasks  Good at capturing (high-dimensional) priors  Good at solving inverse problems under uncertainty  Visual Tracking: « following » entities in successive video frames  Perceptual interfaces  Visual communication  Intelligent cars  Robotics  Surveillance, biometrics  Medical imaging  Motion capture for sport, medicine, games, movies  Video editing, analysis, compression  Other applications: extracting contours in still images

4 Many Faces of Visual Tracking Tracking…  Object of a given nature. Cars. People. Faces.  Object of a given nature, with a specific attribute. Moving cars. Walking people. Talking heads. Face of a given person.  A picked object, whatever its nature. Moving entities.

5 Tasks and Problems  Hierarchy of tasks  Tracking an entity, based on frame-to-model consistency  Detection (for initialization and re-initialization)  Recognition (identity, activity)  Multiple sources of trouble  Dimension loss  Noise  Variability of image-based appearance  Occlusions, partial to total  Clutter  Motion amplitude  Real time constraints

6 Why Particle Filter?  Sometimes hybrid and/or high dimensional state-spaces  Always complex measurement models  Huge number of measurements at each instant  The state tells which subset of data to look at  Hence highly non-linear, and multi-modal under clutter  Total and partial occlusions make data association even worse

7 Hidden Process  State captures various aspects of tracked objects  3D pose/shape [cont.]  2D pose/shape [cont. or disc.]  Point-wise appearance [cont. or disc.]  Color [cont. or disc.]  Identity [disc.]  Activity [disc.]  Dimension ranges from 2 to 50  Dynamics: often AR-p on continuous, HMM on discrete  Link to data:  Part of state defines a portion of image plane:  Part of state might define an appearance:  Likelihood will explain data in and/or around
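The predict / weight / resample loop that underlies the trackers surveyed here can be sketched as follows. This is a minimal bootstrap filter, not the model of any cited paper: the scalar AR(1) dynamics, the Gaussian likelihood, and the parameter names (`a`, `q_std`, `r_std`) are illustrative assumptions standing in for the high-dimensional states and image likelihoods discussed in the slides.

```python
import numpy as np

def bootstrap_filter(observations, n_particles=500, a=0.9,
                     q_std=1.0, r_std=0.5, seed=0):
    """Bootstrap particle filter for a scalar AR(1) state (assumed toy model)."""
    rng = np.random.default_rng(seed)
    # Initial particle cloud drawn from an assumed standard normal prior.
    particles = rng.normal(0.0, 1.0, size=n_particles)
    estimates = []
    for y in observations:
        # Predict: push every particle through the AR(1) dynamics.
        particles = a * particles + rng.normal(0.0, q_std, size=n_particles)
        # Weight: Gaussian likelihood of the observation, computed in log space.
        log_w = -0.5 * ((y - particles) / r_std) ** 2
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # Estimate: weighted mean of the particle cloud.
        estimates.append(float(np.sum(w * particles)))
        # Resample: multinomial resampling proportional to the weights.
        idx = rng.choice(n_particles, size=n_particles, p=w)
        particles = particles[idx]
    return np.array(estimates)
```

In a visual tracker the weighting step would be replaced by one of the image likelihoods described in the following slides, and the state would carry pose, shape, or appearance parameters rather than a single scalar.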

8 Deformable Curves [Freedman’s active contours]  State: control points  Context: entities of unspecified nature, e.g., moving objects

9 Eigen Shapes [Taylor and Cootes’s Active Shapes]  State: Affinity + a few deformation modes on average shape  Context: for objects of a given type whose shape is learnt off-line and linearly parameterized (PCA)

10 Eigen Appearances [Taylor and Cootes’s Active Appearances]  State: Affinity + a few deformation modes on average template  Context: for objects of a given type whose appearance is learnt off-line and linearly parameterized (PCA)

11 Exemplars [Gavrila’s Chamfer System]  State: Affinity applied to a small set of exemplars  Context: for objects of a given type, whose shape and/or appearance are learnt off-line as a “flip book”

12 3D Models [Sminchisescu’s body model][Sidenbladh’s body tracker]  State: 3D pose of a set of parameterized parts (possibly articulated)  Context: pose tracking of objects of known type (manufactured objects, human body) whose geometry is known, assumed, or learnt

13 Measurements  Raw images (monocular, binocular, etc.)  Intensity  Color Support: pixel grid, possibly sub-sampled  Filtered image  Smoothed image  Frame difference  Gradients  Wavelet coefficients, steerable filters Support: pixel grid, possibly sub-sampled  Low-level features (output of detectors)  Edges  Corners  Moving edges Support: sparse, possibly dependent on

14 Outline Likelihood  Measurements: maxima of projected luminance gradient along normals ( such events on normal)

15 Fg/Bkg Grid Likelihood  Measurements: outputs of a filter bank on a grid of points  Background distribution: learned at each grid point on empty scene  Foreground distribution: learned off-line for objects of interest  Scene likelihood

20 Appearance Likelihood  Reference appearance  Hypothesized appearance (affine warp)  Likelihood  Point-wise  Shuffled

22 Oxford Heritage Mostly contour-based tracking. All papers there.  [Isard’96] CONDENSATION  [Isard’98] Contour/skin color SIS with color-based proposal density  [Isard’98] Smoothing  [Isard’98] Switching AR-processes  [MacCormick’99] Exclusion principle/partitioned sampling for MOT  [Deutscher’99] 3D articulated tracking with singularities  [Rittscher’99] Partial importance sampling for human motion classification  [MacCormick’00] Partitioned sampling for articulated motion  [Deutscher’00] Annealed particle filter for 3D human tracking

23 2001 Blossom ICCV’01  [Philomin’01] Quasi-random sampling  [Toyama’01] Likelihood for contour/appearance exemplars  [Choo’01] Hybrid Monte Carlo for 3D human tracking  [Isard’01] 3D multi-people tracking with background subtraction  [Vermaak’01] SIS for audio-visual speaker localization  [Sullivan’01] Deterministic search guidance  [Pérez’01] Interactive contour extraction with particles  [Sidenbladh’01] 3D human tracking with 2D motion data CVPR’01  [Rui’01] Unscented particle filter for contour-based face tracking  [Sminchisescu’01] Covariance scaled sampling for 3D body tracking Misc.  [Spengler’01] Multi-cue democratic integration

24 2002 Blossom ECCV’02  [Sidenbladh’02] Example-based state process for 3D human tracking  [Sullivan’02] View-based tracking/recognition of human actions  [Vermaak’02] Adaptive multi-cue tracking  [Pérez’02] Color histogram-based tracking of multiple objects  [Sminchisescu’02] Hyperdynamics importance sampling Misc.  [Nummiaro’02] Color histogram-based tracking  [Spengler’02] Multi-cue multi-nature (car/human) tracking on background  [Tweed’02] Tracking many objects with subordinated PF  [Nummiaro’02b] Adaptive color histogram-based tracking

25 Color-based Tracking [Joint work with C. Hue, J. Vermaak, M. Gangnet. ECCV’02]  Colour-only tracking appealing when:  No prior knowledge of entities to be tracked  Dramatic changes of appearance through the sequence  Principle: compare colour content of candidate regions against a reference colour histogram  Two deterministic predecessors: [Bradski’98][Comaniciu’00]

26 Model Ingredients State vector (position, scale) Associated image region N-bin colour histogram Reference histogram Likelihood based on Bhattacharyya distance:
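The formula on this slide did not survive extraction, so the following is only a sketch of one commonly used form of a Bhattacharyya-based colour likelihood, not a reconstruction of the slide. The names `color_histogram`, `bhattacharyya_distance`, `color_likelihood`, the 8-bins-per-channel RGB quantization, and the value `lam=20.0` are all assumptions made for illustration.

```python
import numpy as np

def color_histogram(pixels, bins_per_channel=8):
    """Normalized N-bin colour histogram (N = bins_per_channel**3).

    pixels: non-empty (n, 3) array of 8-bit RGB values taken from the
    image region associated with a state hypothesis.
    """
    pixels = np.asarray(pixels).astype(np.int64)
    idx = (pixels * bins_per_channel) // 256          # per-channel bin index
    flat = (idx[:, 0] * bins_per_channel + idx[:, 1]) * bins_per_channel + idx[:, 2]
    hist = np.bincount(flat, minlength=bins_per_channel ** 3).astype(float)
    return hist / hist.sum()

def bhattacharyya_distance(p, q):
    """D = sqrt(1 - sum_n sqrt(p_n * q_n)) for two normalized histograms."""
    coefficient = float(np.sum(np.sqrt(p * q)))
    return float(np.sqrt(max(0.0, 1.0 - coefficient)))

def color_likelihood(candidate_hist, reference_hist, lam=20.0):
    """Unnormalized likelihood exp(-lam * D^2); lam is an assumed tuning value."""
    d = bhattacharyya_distance(candidate_hist, reference_hist)
    return float(np.exp(-lam * d ** 2))
```

A particle whose region matches the reference histogram exactly gets a likelihood of 1, while a region with no colour bins in common gets `exp(-lam)`, so `lam` controls how sharply the filter discriminates between candidates.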

27 Results Clutter [ deterministic vs. Monte Carlo ] large motion, blur, shape changes, partial occlusion complete occlusion

28 Multipart Colour Model  Idea: capture roughly spatial colour layout  Multipart model  Region is partitioned as with associated reference histograms  Assuming conditional independence of sub-regions where the histogram is collected in region of

29 Multiple Objects  State with object associated to ref. hist.  Independent dynamics  Data likelihood: marginalizing out depth ordering with computed on

30 Background Modelling  When still camera: background subtraction  Reference background image  Likelihood

31 Skin Detection  Learn skin colour histogram off-line  Label pixel on/off-skin with thresholded likelihood  Start new object around skin-labelled pixel cluster of sufficient size and away from existing hypothesized objects  Filter-out false alarms with motion information if still camera
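The first two steps of this slide (learning a skin histogram off-line, then labelling pixels by thresholding the likelihood) can be sketched as below. The function names, the RGB quantization, and the threshold value are assumptions; the clustering of labelled pixels and the motion-based filtering of false alarms are not covered.

```python
import numpy as np

def _bin_index(rgb, bins_per_channel):
    """Map 8-bit RGB values (last axis of size 3) to a flat histogram bin index."""
    idx = (np.asarray(rgb).astype(np.int64) * bins_per_channel) // 256
    return (idx[..., 0] * bins_per_channel + idx[..., 1]) * bins_per_channel + idx[..., 2]

def learn_skin_histogram(skin_pixels, bins_per_channel=8):
    """Off-line step: normalized colour histogram of hand-labelled skin pixels."""
    flat = _bin_index(skin_pixels, bins_per_channel).ravel()
    hist = np.bincount(flat, minlength=bins_per_channel ** 3).astype(float)
    return hist / hist.sum()

def label_skin(image, skin_hist, bins_per_channel=8, threshold=0.01):
    """On-line step: boolean mask, True where the skin likelihood exceeds threshold.

    image: (height, width, 3) array of 8-bit RGB values.
    threshold: assumed value; in practice it would be tuned on labelled data.
    """
    return skin_hist[_bin_index(image, bins_per_channel)] > threshold
```

New object hypotheses would then be started around sufficiently large clusters of `True` pixels that lie away from existing hypothesized objects.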

32 Results Automatic detection (skin-based) and background subtraction

33 Interactive contour extraction [Joint work with A. Blake and M. Gangnet. ICCV’01]  Applications  Interactive cutout for image editing  Road extraction in aerial/satellite images  Blood vessels extraction in endoscopic images  The SMC approach  Contour as trajectory of a hidden dynamic process  Difficult tracking: gaps, spurious contours, branching  Unconventional tracking: no natural time, no sequential data

34 Ingredients  State model: two-dimensional 2nd-order AR process  : chain of pixels traversed by polyline with vertices  Measurement model: on and off the curve  Combined in the posterior: probability of a path knowing the data

35 Data Model  Measurements: norm of intensity gradient  Likelihoods  over the whole image: consistent exponential behaviour  on plausible contours interactively extracted: complex mixture over the whole range. We chose a uniform distribution

36 Dynamics  Fixed step-size: e.g.,  Smooth: Gaussian direction changes with a few abrupt changes
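A prior of this kind (fixed step length, mostly small Gaussian turns, occasional abrupt ones) can be sampled as in the sketch below. The mixture form, with an abrupt turn drawn uniformly from (-π/2, π/2) with probability `nu`, and all parameter values are assumptions chosen for illustration; the slide's own step size and formula are not recoverable from the transcript.

```python
import numpy as np

def sample_contour(start, direction, n_steps=50, step=1.0,
                   sigma=0.1, nu=0.05, seed=0):
    """Draw one polyline from a fixed-step, mostly-smooth direction prior.

    start: (x, y) of the first vertex; direction: initial heading vector.
    sigma: std. dev. (radians) of the smooth Gaussian turn (assumed value).
    nu: probability of an abrupt turn, e.g. at a corner (assumed value).
    """
    rng = np.random.default_rng(seed)
    angle = float(np.arctan2(direction[1], direction[0]))
    points = [np.asarray(start, dtype=float)]
    for _ in range(n_steps):
        if rng.random() < nu:
            # Abrupt change: any turn up to a right angle is equally likely.
            angle += rng.uniform(-np.pi / 2, np.pi / 2)
        else:
            # Smooth change: small Gaussian perturbation of the heading.
            angle += rng.normal(0.0, sigma)
        # Every segment has the same length `step`.
        points.append(points[-1] + step * np.array([np.cos(angle), np.sin(angle)]))
    return np.stack(points)
```

Each sampled polyline plays the role of one particle trajectory; in the contour extraction setting it would be weighted by the gradient-based data model rather than used on its own.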

37 Proposal Density Smooth component of the dynamics, except if a corner is present (as assessed by Harris corner detector, and labelled otherwise) [Figures: proposal without corners; proposal with corners]

38 JetStream: Interactive Cutout  User interaction  Starting point and direction  Rough positioning of « dams » to block strong spurious contours  Restarting, especially at corners  Demo…

39 “Ribbon” Extraction  Joint extraction of two “parallel” contours  width part of the unknowns  dynamics on it:  likelihood ratios on

40 Road Extraction [Figure panels: JetStream; Ribbon JetStream; varying width] (Aerial photographs: courtesy of the GeoInformation Group)

41 Pros and Cons of SMC  Advantages of SMC  Easy to implement and expand  Robust to clutter and brief occlusions  A wealth of theoretical tools  Problems  Jitter of the final estimate  Computational loads  Only brief capture of multimodality

42 Thoughts  Research directions?  Long-term multimodality  Multiple objects  Data fusion  On-line model adaptation  Proper likelihoods  Data-driven proposal function  Final controversial view  Often, dynamics simply maintains temporal coherence  A good engine does not fix a weak model  A discriminant and robust data model for the task at hand remains the challenge: pattern recognition  Alternatives to PF: Variational approximation, EM?

