

+ Integrated Systems Biology Lab, Department of Systems Science, Graduate School of Informatics, Kyoto University. Sep. 2nd, 2013, IBISML 2013. Enhancing Probabilistic Appearance-Based Object Tracking with Depth Information: Object Tracking under Occlusion. Kourosh MESHGI, Yu-zhe LI, Shigeyuki OBA, Shin-ichi MAEDA, and Prof. Shin ISHII

+ Outline Introduction Literature Review Proposed Framework Gridding Confidence Measure Occlusion Flag Experiments Conclusions

+ INTRODUCTION & LITERATURE REVIEW

+ Object Tracking: Applications Human-Computer Interfaces Human Behavior Analysis Video Communication/Compression Virtual/Augmented Reality Surveillance

+ Object Tracking: Strategies Bottom-Up: objects are segmented out from the image in each frame and the segments are used for tracking, e.g. blob detection (not efficient). Top-Down: generates hypotheses and aims to verify them using the image, e.g. model-based tracking, template matching, particle filters

+ Object Tracking: Discriminative vs. Generative. Generative: keep the status of each object as a PDF; particle filtering; Monte Carlo-based methods; Bayesian networks with HMMs. Real-time computation; uses compact appearance models, e.g. color histograms or color distributions. Trades off the number of evaluated solutions against the granularity of each solution

+ Particle Filters Idea: apply a recursive Bayesian filter based on a sample set. Applicable to nonlinear and non-Gaussian systems. In computer vision: the Condensation algorithm, developed initially to track objects in cluttered environments. Multiple hypotheses help with short-term occlusions, but long-term occlusions remain difficult
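The sample-set idea can be sketched with a minimal multinomial resampling step, the core of Condensation-style filters. This is an illustrative sketch only; the function name and stdlib-based implementation are mine, not the authors' code:

```python
import random

def resample(particles, weights):
    """Multinomial resampling: draw N particles with probability
    proportional to their (unnormalized) likelihood weights."""
    n = len(particles)
    total = sum(weights)
    probs = [w / total for w in weights]
    return random.choices(particles, weights=probs, k=n)
```

High-weight hypotheses are duplicated and low-weight ones die out, which is how the filter concentrates its samples on plausible target states.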

+ Object Tracking: Occlusion Generative models do not address occlusion explicitly; they maintain a large set of hypotheses. Discriminative models allow direct occlusion detection and are robust against partial and temporary occlusions, but long-lasting occlusions hinder their tracking heavily. Occlusions affect how the target model is updated: the type of occlusion is important (keep memory vs. keep focus on the target). Dynamic occlusion: pixels of another object closer to the camera. Scene occlusion: still objects closer to the camera than the target object. Apparent occlusion: result of shape change, silhouette motion, shadows, or self-occlusions

+ Object Tracking: Depth Integration Use depth information only; use depth information for better foreground segmentation; statistically estimate 3D object positions

+ Object Tracking: Challenges Appearance changes: illumination changes, shadows, affine transformations, non-rigid body deformations, occlusion. Sensor parameters & compatibility: field of view, position, resolution, signal-to-noise ratio, channel data fusion. Segmentation inherent problems: partial segmentation, split and merge

+ PROPOSED FRAMEWORK

+ Overview Goals of the tracker: handle object tracking even under persistent occlusions; be highly adaptive to object scale and trajectory; perform color and depth information fusion. Particle filter: rectangular bounding boxes serve as hypotheses of target presence; the target is described by a color histogram and the median of depth, and the template is compared to each bounding box in terms of the Bhattacharyya distance; regular grids over the bounding box with a confidence measure for each cell; an occlusion flag attached to each particle

+ Design: Representation Center of mass: insufficient information about shape, size and distance. Silhouette (or blobs): computational complexity. Bounding boxes: a simplified version of silhouettes, enhanced with gridding (new)

+ Design: Preprocessing Foreground-background separation: temporal median background. Normalizing depth using sensor characteristics: the relation between raw depth values and metric depth; sensitivity to IR-absorbing materials, especially at long distances; clipping out-of-range values; up-sampling to match the image size
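The depth-preprocessing steps (clip, normalize, up-sample) can be sketched as below. The clipping thresholds and the up-sampling factor are hypothetical placeholders, not values from the paper:

```python
import numpy as np

def preprocess_depth(raw, d_min=400, d_max=4000, scale=2):
    """Clip out-of-range depth readings, normalize to [0, 1], and
    nearest-neighbour up-sample to match the RGB image resolution.
    d_min/d_max (in sensor units) and scale are assumed values."""
    d = np.clip(raw.astype(float), d_min, d_max)
    d = (d - d_min) / (d_max - d_min)
    # nearest-neighbour up-sampling: repeat rows then columns
    return np.repeat(np.repeat(d, scale, axis=0), scale, axis=1)
```

With the Kinect data described later (640×480 RGB vs. 320×240 depth), a scale of 2 would bring the two channels to the same resolution.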

+ Design: Notation Bounding box; occlusion flag; image appearance (RGB); depth map; ratio of foreground pixels; histogram of colors; median of depth; goal template; grid cell i

+ Observation Model Each particle is represented by a bounding box and an occlusion flag. The box is decomposed into grid cells to capture local information; cells are assumed independent. The template has a similar grid
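The grid decomposition can be sketched as a simple helper that splits a bounding box into equal cells. The function name and integer-division layout are my own illustration:

```python
def grid_cells(x, y, w, h, rows=2, cols=2):
    """Split a bounding box (top-left x, y, width w, height h)
    into a rows×cols grid of equal cells, row-major order."""
    cw, ch = w // cols, h // rows
    cells = []
    for i in range(rows):
        for j in range(cols):
            cells.append((x + j * cw, y + i * ch, cw, ch))
    return cells
```

Each cell then gets its own features (color histogram, median depth, confidence), which is what lets the tracker reason about partial occlusion of a box.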

+ Non-Occlusion Case Information is obtained from two channels: the color channel (RGB) provides appearance information, and the depth channel provides the depth map. The channels are assumed independent, but they are not always reliable; problems usually arise from the appearance data. This requires a confidence measure: the ratio of pixels containing information to all pixels helps

+ Design: Feature Extraction I Histogram of Colors (HoC): a bag-of-words model, invariant to rigid-body motions, non-rigid-body deformations, partial occlusion, down-sampling and perspective projection. Each scene occupies a certain volume of the color space, exposed by clustering colors: RGB space + K-means clustering with a fixed number of bins
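Once cluster centres have been learned by K-means, building the HoC reduces to nearest-centre assignment and counting. A minimal sketch (assuming pre-computed centres; the function name is mine):

```python
import numpy as np

def color_histogram(pixels, centers):
    """Bag-of-colours histogram: assign each RGB pixel (N×3 array) to its
    nearest cluster centre (K×3 array) and return normalised bin counts."""
    # pairwise distances between every pixel and every centre
    dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    hist = np.bincount(labels, minlength=len(centers)).astype(float)
    return hist / hist.sum()
```

Fixing the centres per scene keeps histograms comparable between the template and every candidate bounding box.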

+ Design: Feature Extraction II Median of Depth. Depth information is noisy. A histogram of depth needs finely tuned bins to separate foreground and background pixels in clutter; subjects usually span a short range of depth values, so the resulting histograms are so spiky that a few bins essentially represent the whole histogram. Higher-level depth features exist, e.g. the Histogram of Oriented Depths (HOD), but they bring surplus computational complexity and need special consideration (sensitivity to noise, etc.). Hence: the median of depth!

+ Design: Feature Extraction III Confidence depends on the number of pixels containing information in a cell. The ratio of foreground pixels to all pixels of each box is invariant to the box size. As the ratio moves towards zero, the box does not contain many foreground pixels, so the HoC may not be statistically significant. A ratio moving towards one does not mean the cell is completely trustworthy: the whole bounding box could be very small, or misplaced. A Beta distribution over the ratio is fit on training data, with two shape parameters a and b
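One simple way to fit such a Beta confidence model is the method of moments; the slide does not specify the fitting procedure, so this is an assumed, illustrative choice:

```python
import math

def fit_beta(samples):
    """Method-of-moments fit of Beta(a, b) to foreground-pixel
    ratios in (0, 1). Returns the two shape parameters."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    common = mean * (1 - mean) / var - 1
    return mean * common, (1 - mean) * common

def beta_pdf(r, a, b):
    """Beta density, usable as the confidence weight of a cell
    whose foreground-pixel ratio is r."""
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return math.exp((a - 1) * math.log(r) + (b - 1) * math.log(1 - r) - log_norm)
```

The fitted density then down-weights cells whose foreground ratio is atypical of well-placed boxes in the training data.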

+ Design: Similarity Measure Histogram of Colors: Bhattacharyya distance (alternatives: KL-divergence, Euclidean distance). Median of depth: Bhattacharyya distance. Log-sum-exp trick for numerical stability
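The two ingredients named on this slide, the Bhattacharyya distance between normalised histograms and the log-sum-exp trick, can be sketched as follows (the clamping constant is my own safeguard, not from the paper):

```python
import math

def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two normalised histograms:
    -log of the Bhattacharyya coefficient sum(sqrt(p_i * q_i))."""
    bc = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    return -math.log(max(bc, 1e-12))  # clamp to avoid log(0)

def logsumexp(xs):
    """Numerically stable log(sum(exp(x))) for combining
    per-cell log-likelihoods without underflow."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))
```

Identical histograms give distance 0; fully disjoint ones give a large (clamped) distance, so the measure ranks candidate boxes by appearance similarity.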

+ Occlusion Case The occlusion flag is part of the representation, along with the occlusion time. Likelihood: there is no big difference between bounding boxes under occlusion, so a uniform distribution is used

+ Particle Filter Update Occlusion state transition: a 2×2 matrix decides whether a newly sampled particle stays in its previous state or changes it in a stochastic manner. Together with particle-filter resampling and the uniform distribution in the occlusion case, this handles occlusion stochastically. Particle resampling is based on particle probability (occlusion case vs. non-occlusion case). Bounding box position and size are assumed to have no effect on the occlusion flag, for simplicity
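The 2×2 occlusion state transition can be sketched as below. The matrix entries are hypothetical values for illustration; the paper does not state them here:

```python
import random

# Hypothetical 2×2 transition matrix: row = current flag (0 = visible,
# 1 = occluded); first column = probability of being visible next.
TRANSITION = [[0.9, 0.1],   # visible tends to stay visible
              [0.3, 0.7]]   # occluded tends to stay occluded

def sample_flag(current, rng=random.random):
    """Stochastically draw the next occlusion flag for a particle
    given its current flag, using the transition row."""
    return 0 if rng() < TRANSITION[current][0] else 1
```

During resampling, each new particle draws its flag this way, so a fraction of particles always explores the alternative hypothesis (occluded vs. visible).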

+ Target Model Initialization: manually, or automatically with a known color histogram or an object detection algorithm. Expectation of target: the statistical expectation over all particles. Target update: smooth transition between video frames by discarding image outliers (a forgetting process); the same applies to updating depth. Occlusion handling: updating a model under partial or full occlusion gradually loses the proper template for the following frames
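The forgetting-process update, frozen during occlusion so the template is not corrupted, can be sketched like this (the blending factor alpha is an assumed placeholder):

```python
def update_template(template, observation, alpha=0.9, occluded=False):
    """Exponential forgetting: blend the new observation into the
    template feature vector, but freeze the template while the
    target is flagged as occluded."""
    if occluded:
        return template
    return [alpha * t + (1 - alpha) * o for t, o in zip(template, observation)]
```

Freezing the update under occlusion is exactly what prevents the occluder's appearance from leaking into the target model.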

+ Visualizations

+ Algorithm Initialization: color clustering, target initialization, preprocessing. Tracking loop: input frame, update background, calculate bounding box features, calculate similarity measures, estimate next target, resample particles, update target template

+ EXPERIMENTS

+ Criteria Specially designed metrics to evaluate bounding boxes for multiple-object tracking, proposed by the Classification of Events, Activities and Relationships (CLEAR) project. Multiple object tracker precision (MOTP): the ability of the tracker to estimate precise object positions, independent of its skill at recognizing object configurations, keeping consistent trajectories, etc. Multiple object tracker accuracy (MOTA): accounts for the numbers of misses, false positives, and mismatches at each time t. Scale adaptation (SA): lower values of SA indicate better adaptation of the algorithm to scale
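The MOTA definition described above can be written out directly; this is the standard CLEAR formula, with per-frame error lists as assumed inputs:

```python
def mota(misses, false_positives, mismatches, num_objects):
    """MOTA = 1 - (total misses + false positives + mismatches)
    divided by the total number of ground-truth objects over all
    frames. Each argument is a per-frame list of counts."""
    errors = sum(misses) + sum(false_positives) + sum(mismatches)
    return 1.0 - errors / sum(num_objects)
```

A perfect tracker scores 1.0; the score drops (and can go negative) as errors accumulate relative to the number of ground-truth objects.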

+ Experiments Toy dataset acquired with Microsoft Kinect: image resolution of 640×480 and depth resolution of 320×240, annotated with target bounding box coordinates and occlusion status as ground truth. Scenario one: walking, mostly parallel to the camera z-plane plus movements towards the camera, to test tracking accuracy and scale adaptability. The appearance of the subject changes drastically in several frames, with rapid changes in the direction of movement and velocity, while the depth information of those frames remains intact, testing the robustness of the algorithm. Scenario two: the same video, with a rectangular region of the data occluded manually

+ Result I Tracker (MOTP / MOTA / SA): RGB – / –% / 112.8; RGB-D – / –% / 99.9; RGB-D Grid 2×2 – / –% / 48.5; RGB-D Grid 2×2 + Occlusion Flag – / –% / –

+ Result II Tracker (MOTP / MOTA / SA): RGB – / –% / 98.8; RGB-D – / –% / 91.9; RGB-D Grid 2×2 – / –% / 59.2; RGB-D Grid 2×2 + Occlusion Flag – / –% / –

+ FINALLY…

+ Conclusion Hybrid space of color and depth: resolves loss of track under abrupt appearance changes; increases the robustness of the method; utilizes the depth information effectively to judge who occludes whom. Gridding the bounding box: better representation of the local statistics of the foreground image and occlusion; improves the scale adaptation of the algorithm by preventing the bounding box size from wandering around the optimal value. Occlusion flag: distinguishes the occluded and un-occluded cases explicitly; suppresses the template update; extends the search space under occlusion. Confidence measure: evaluates the ratio of fore- and background pixels in a box, giving flexibility to the bounding box size; enables search in scale space as well as splitting & merging

+ Future Work Preprocessing: shadow removal; handling a crawling background. Design: a better color clustering method to handle a crawling background, e.g. growing K-means; removing the independence assumption between grid cells; a more elaborate state transition matrix. Experiments: using public datasets (ViSOR, PETS 2006, etc.); using real-world occlusion scenarios

+ Thank You!

+ Object Tracking: Input Data Categorized by type and configuration of cameras. 2D (monocular cameras): rely on appearance models with one-to-one correspondence to objects in the image; suffer from occlusions and fail to handle all object interactions. 3D (stereo cameras / multiple cameras): more robust to occlusions, but still prone to major tracking issues. 2.5D (depth-augmented images): e.g. Microsoft Kinect

+ Object Tracking: Single Object Plenty of literature, a wealth of tools. Rough categorization: tracking separated targets (a popular competition topic) vs. multiple object tracking (a challenging task, due to the dynamic change of object attributes: color distribution, shape, visibility). Approaches: model-based, appearance-based, feature-based

+ Why Bounding Boxes A bounding box for tracking encapsulates a rectangle of pixels (RGB pixels in color, normalized depth), parameterized by its top-left corner, width and height (x, y, w, h). The size of the bounding box can change freely during tracking, accommodating scale changes and handling perspective projection effects. It does not model velocity components explicitly, which gives trajectory independence: it handles sudden changes in direction and large variations in speed