Visual Object Tracking Based on Local Steering Kernels and Color Histograms
Olga Zoidi, Anastasios Tefas, Member, IEEE, and Ioannis Pitas, Fellow, IEEE
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 23, NO. 5, MAY 2013

Overview: Introduction, Proposed Method, Experimental Results, Conclusion

Overview: Introduction (Visual Tracking, Object Representation, Object Position Prediction), Proposed Method, Experimental Results, Conclusion

Visual Tracking Visual tracking is difficult to accomplish for several reasons: illumination conditions change; the object may be nonrigid or articulated; the object may be occluded; and its movements can be rapid and complicated.

Overview: Introduction (Visual Tracking, Object Representation, Object Position Prediction), Proposed Method, Experimental Results, Conclusion

Object Representation Object representation methods: model-based, appearance-based, contour-based, feature-based, and hybrid.

Object Representation: Model-Based Exploits a priori information about the object shape to create a model [7]. Handles object tracking under illumination variations, viewing-angle changes, and partial occlusion. Computationally heavy. [7] D. Koller et al., "Model-based object tracking in monocular image sequences of road traffic scenes," Int. J. Comput. Vision, vol. 10, pp. 257-281, Mar. 1993.

Object Representation: Appearance-Based Uses the visual information of the object projection on the image plane, i.e., color, texture, and shape. Handles simple object transformations. Sensitive to illumination changes.

Object Representation: Contour-Based Employs shape matching or contour-evolution techniques [9]. The contour can be represented by active models, such as snakes or B-splines [10]. Handles both rigid and nonrigid objects and can incorporate occlusion detection and estimation techniques. [9] A. Yilmaz, X. Li, and M. Shah, "Contour-based object tracking with occlusion handling in video acquired using mobile cameras," IEEE Trans. Pattern Anal. Mach. Intell., vol. 26, no. 11, pp. 1531-1536, Nov. 2004. [10] Y. Wang and O. Lee, "Active mesh – a feature seeking and tracking image sequence representation scheme," IEEE Trans. Image Process., vol. 3, no. 5, pp. 610-624, Sep. 1994.

Object Representation: Feature-Based Tracks a set of feature points, which are then grouped into objects. The main problem is correctly distinguishing target-object features from background features.

Overview: Introduction (Visual Tracking, Object Representation, Object Position Prediction), Proposed Method, Experimental Results, Conclusion

Object Position Prediction The position of the object in the following frame is usually predicted using a linear Kalman filter [32]. [32] G. Welch and G. Bishop, "An introduction to the Kalman filter," Univ. North Carolina, Chapel Hill, NC, Tech. Rep. TR 95-041, 2000.

Overview: Introduction, Proposed Method, Experimental Results, Conclusion

Proposed Method Common techniques in object tracking algorithms: color histograms (CHs) handle severe changes in the object view; decomposing the target into fragments, which are tracked separately, handles partial occlusion; the local steering kernel (LSK) texture descriptor represents the region of interest (ROI). The proposed tracking approach is appearance-based and uses both CHs and the LSK descriptor.

Proposed Method First, image regions with high color similarity to the object CH are sought in the video frame, yielding candidate regions. Next, LSK descriptors of both the target object and the candidate search regions are extracted. After the image regions with small CH similarity to the object CH are discarded, the new object position is selected as the image region whose LSK representation has the maximum similarity to that of the target object. As tracking evolves, the target object appearance changes, so a stack of different object instances is maintained and updated with the representation of the most recently detected object.

LSK Object Tracking Framework Steps (see the sketch below): 1) Initialization of the object ROI in the first video frame; initialization can be done manually. 2) Color-similarity search in the current search region using CH information, which acts as a form of background subtraction and reduces the number of candidates. 3) Representation of both the object and the search region with LSK features. 4) Decision on the object ROI in the new frame, based on the similarities between each candidate and a) the ROI in the previous frame and b) the top object instance in the stack. 5) Update of the object instance stack. 6) Prediction of the object position in the following video frame and initialization of a new object search region.
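For concreteness, here is a minimal, runnable Python sketch of one tracking step. It makes simplifying assumptions: grayscale frames, an intensity histogram standing in for the per-channel CHs, and a gradient-magnitude map standing in for the LSK descriptor (the actual LSK computation is given in the Object Texture Description section); names such as track_step and texture_feat are illustrative, not from the paper.

```python
import numpy as np

def color_hist(patch, bins=8):
    # Intensity histogram; a stand-in for the per-channel color histograms (CHs)
    h, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    return h.astype(float)

def texture_feat(patch):
    # Gradient-magnitude map; a crude stand-in for the LSK descriptor
    gy, gx = np.gradient(patch)
    f = np.hypot(gx, gy).ravel()
    return f / (np.linalg.norm(f) + 1e-12)

def cos_sim(a, b):
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def track_step(frame, roi_patch, stack_top, center, q=16, r=48, d=4,
               b_pct=50.0, lam=0.5):
    # Step 2: sample q x q candidate ROIs every d pixels inside the r x r
    # search region and rank them by CH similarity to the object
    y0, x0 = center[0] - r // 2, center[1] - r // 2
    cands, ch_sims = [], []
    for dy in range(0, r - q, d):
        for dx in range(0, r - q, d):
            p = frame[y0 + dy:y0 + dy + q, x0 + dx:x0 + dx + q]
            cands.append((y0 + dy + q // 2, x0 + dx + q // 2, p))
            ch_sims.append(cos_sim(color_hist(p), color_hist(roi_patch)))
    # the Bt% least color-similar patches are treated as background (mask B_CH)
    ch_sims = np.array(ch_sims)
    b_ch = (ch_sims >= np.percentile(ch_sims, b_pct)).astype(float)
    # Steps 3-4: texture similarity to the previous ROI and to the stack top,
    # fused as M = ((1 - lam) * M_LSK1 + lam * M_LSK2) * B_CH
    m1 = np.array([cos_sim(texture_feat(p), texture_feat(roi_patch))
                   for _, _, p in cands])
    m2 = np.array([cos_sim(texture_feat(p), texture_feat(stack_top))
                   for _, _, p in cands])
    m = ((1 - lam) * m1 + lam * m2) * b_ch
    best = int(np.argmax(m))
    return cands[best]  # (cy, cx, patch): new position and object instance

# Toy usage: a bright square on a dark background
frame = np.zeros((120, 120))
frame[50:66, 54:70] = 1.0
roi = frame[50:66, 52:68]          # previous ROI, slightly off the square
cy, cx, new_roi = track_step(frame, roi, roi, center=(58, 60))
print("new object center:", cy, cx)
```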

Overview: Introduction, Proposed Method (Color Similarity, Object Texture Description, Object Localization and Model Update, Search Region Extraction in the Next Frame), Experimental Results, Conclusion

Color Similarity After object position prediction and search region selection, the search region of size $R_1 \times R_2$ is divided into candidate ROIs of size $Q_1 \times Q_2$. A parameter $d$ determines a uniform sampling of the candidate object ROIs every $d$ pixels in the search region. At frame $t$, the $B_t\%$ of the search region patches with the minimal histogram similarity to the object histogram are considered to belong to the background. Cosine similarity: $c(h_1, h_2) = \cos\theta = \frac{\langle h_1, h_2 \rangle}{\|h_1\|\,\|h_2\|}$; normalized form: $S = \frac{c^2}{1 - c^2}$.

Color Similarity The normalized similarities $S$ of the three color channels of all patches form a matrix $M_{CH}$. The distribution of the $M_{CH}$ values sets a threshold for deciding whether a patch is a valid candidate ROI: the binary matrix $B_{CH}$ has entries set to 1 where the corresponding entry of $M_{CH}$ is greater than or equal to the threshold and to 0 otherwise. $B_{CH}$ is reused during tracking in the Object Localization step. Here $\bar{M}$, $M_{max}$, and $M_{min}$ denote the mean, maximal, and minimal values of the $M_{CH}$ entries, respectively.
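A small Python sketch of this stage, assuming the paper's threshold rule (based on $\bar{M}$, $M_{max}$, and $M_{min}$) can be approximated by a linear interpolation between $M_{min}$ and $M_{max}$; the names ch_similarity and background_mask are illustrative.

```python
import numpy as np

def ch_similarity(h1, h2):
    # Cosine similarity c between two color histograms, then the
    # normalized form S = c^2 / (1 - c^2)
    c = (h1 @ h2) / (np.linalg.norm(h1) * np.linalg.norm(h2) + 1e-12)
    return c**2 / (1.0 - c**2 + 1e-12)

def background_mask(m_ch, thr_ratio=0.5):
    # Threshold M_CH between M_min and M_max: entries >= threshold map to 1
    # (valid candidate ROI), the rest to 0 (background)
    m_min, m_max = m_ch.min(), m_ch.max()
    thr = m_min + thr_ratio * (m_max - m_min)
    return (m_ch >= thr).astype(np.uint8)

# M_CH combines the S values of the three color channels for every
# candidate patch position in the search region (toy values here)
m_ch = np.array([[0.2, 1.5, 3.0],
                 [0.1, 2.4, 0.9]])
print(background_mask(m_ch))   # B_CH, reused in the localization step
```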

Overview: Introduction, Proposed Method (Color Similarity, Object Texture Description, Object Localization and Model Update, Search Region Extraction in the Next Frame), Experimental Results, Conclusion

Object Texture Description Introduction to LSK descriptors: LSKs describe salient image features and have proven robust to small scale and orientation changes and to deformations, which results in successful tracking of slowly deforming objects. LSK descriptors are a nonlinear combination of weighted spatial distances between a pixel $p$ of an image of size $N_1 \times N_2$ and its $M \times M$ surrounding pixels $p_i$ ($M$ is equal to 3 pixels in this paper). The distance $K$ is measured using a weighted Euclidean distance, whose weights are given by the covariance matrix $C_i$ of the image gradients.

Object Texture Description $$K_i(p) = \frac{\sqrt{\det(C_i)}}{2\pi} \exp\left(-\frac{(p_i - p)^T C_i\,(p_i - p)}{2}\right)$$ To obtain the matrix $C_i$ in $K_i(p)$, the gradient vectors $g_i$ are collected into the matrix $G_i \in \mathbb{R}^{M^2 \times 2}$, where $$G_i = \begin{bmatrix} g_{x1} & g_{y1} \\ g_{x2} & g_{y2} \\ \vdots & \vdots \\ g_{xM^2} & g_{yM^2} \end{bmatrix}.$$ $C_i$ can then be calculated via the singular value decomposition (SVD) of $G_i$: $$G_i = U_i S_i V_i^T = U_i \begin{bmatrix} s_{1i} & 0 \\ 0 & s_{2i} \end{bmatrix} \begin{bmatrix} v_{1i}^T \\ v_{2i}^T \end{bmatrix}, \qquad C_i = \gamma\left(\alpha_1^2\, v_{1i} v_{1i}^T + \alpha_2^2\, v_{2i} v_{2i}^T\right)$$ $$\alpha_1 = \frac{s_{1i} + 1}{s_{2i} + 1}, \qquad \alpha_2 = \frac{1}{\alpha_1}, \qquad \gamma = \left(\frac{s_{1i}\, s_{2i} + 10^{-7}}{M^2}\right)^{0.008}$$

Object Texture Description For each neighboring pixel $p_i$, $i = 1, \dots, M^2$, $K(p)$ is extracted and normalized to $K_N(p) = \frac{K(p)}{\|K(p)\|_1}$, where $\|\cdot\|_1$ is the $L_1$ norm. These concepts are applied as follows: the ROI and the search region are first converted from RGB to the CIE La*b* color space, and the LSKs are computed for each channel separately through the steps above. The final ROI representation is obtained by applying PCA [26]. Finally, the search region is divided into patches, and the LSK similarity matrix used in the next section is estimated (as with color similarity) by applying the cosine similarity measure. [26] H. Seo and P. Milanfar, "Training-free, generic object detection using locally adaptive regression kernels," IEEE Trans. Pattern Anal. Mach. Intell., vol. 32, no. 9, pp. 1688-1704, Sep. 2010.
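The following Python sketch computes the LSKs of one grayscale channel directly from the formulas above. It is an unoptimized illustration rather than the authors' implementation; border pixels are simply left at zero, and the La*b* conversion and PCA step are omitted.

```python
import numpy as np

def lsk_descriptor(img, M=3):
    # One M*M local steering kernel per interior pixel
    gy, gx = np.gradient(img.astype(float))
    H, W = img.shape
    r = M // 2
    offs = [(u, v) for u in range(-r, r + 1) for v in range(-r, r + 1)]
    K = np.zeros((H, W, M * M))
    for y in range(r, H - r):
        for x in range(r, W - r):
            # G_i: the gradients of the M*M window around pixel p = (y, x)
            G = np.array([(gx[y + u, x + v], gy[y + u, x + v])
                          for u, v in offs])          # shape (M^2, 2)
            _, s, Vt = np.linalg.svd(G, full_matrices=False)
            s1, s2 = s[0], s[1]
            a1 = (s1 + 1.0) / (s2 + 1.0)               # alpha_1
            a2 = 1.0 / a1                              # alpha_2 = 1 / alpha_1
            gam = ((s1 * s2 + 1e-7) / M**2) ** 0.008   # gamma
            C = gam * (a1**2 * np.outer(Vt[0], Vt[0])
                       + a2**2 * np.outer(Vt[1], Vt[1]))
            det = max(np.linalg.det(C), 0.0)
            for k, (u, v) in enumerate(offs):
                d = np.array([v, u], dtype=float)      # p_i - p as (dx, dy)
                K[y, x, k] = (np.sqrt(det) / (2 * np.pi)
                              * np.exp(-0.5 * d @ C @ d))
            K[y, x] /= np.abs(K[y, x]).sum() + 1e-12   # L1 normalization K_N
    return K

K = lsk_descriptor(np.random.default_rng(0).random((32, 32)))
print(K.shape)   # (32, 32, 9): one 3x3 kernel per pixel
```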

Overview: Introduction, Proposed Method (Color Similarity, Object Texture Description, Object Localization and Model Update, Search Region Extraction in the Next Frame), Experimental Results, Conclusion

Object Localization and Model Update Object localization in the search region is performed by taking into account the CH and LSK similarities of each patch to 1) the ROI in the previous frame and 2) the object instance in the stack. First, the search region is divided into overlapping patches of size equal to the detected object, and CH and LSK features are extracted for each patch. Then, three cosine similarity matrices are constructed: LSK similarity between each patch and the detected object in the previous frame; LSK similarity between each patch and the last updated object instance; and CH similarity between each patch and the last updated object instance.

Object Localization and Model Update The new ROI is decided with the final decision matrix, computed as $M = \left((1 - \lambda) M_{LSK1} + \lambda M_{LSK2}\right) * B_{CH}$, where $*$ denotes element-wise matrix multiplication and $\lambda$ usually takes the value 0.5. The new candidate object position is at the patch with the maximal value $\max_{i,j} M_{ij}$. This maximum is compared with the previous frame's maximal value; if it drops under a threshold, a possible change in the object appearance is indicated, and four more decision matrices, for rotated and scaled versions of the object, are calculated. The final decision for the new object corresponds to the maximal value over the five decision matrices. The newly localized object is stored in a stack of constant size (see the sketch below).
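A Python sketch of the localization decision and the stack update; the fusion follows the formula on this slide, while max_size is an assumed placeholder (the paper fixes the stack size, but the value is not restated here).

```python
import numpy as np

def localize(m_lsk1, m_lsk2, b_ch, lam=0.5):
    # Decision matrix M = ((1 - lam) * M_LSK1 + lam * M_LSK2) * B_CH,
    # with '*' the element-wise product; return the best patch and its score
    m = ((1.0 - lam) * m_lsk1 + lam * m_lsk2) * b_ch
    idx = np.unravel_index(np.argmax(m), m.shape)
    return idx, m[idx]

def update_stack(stack, instance, max_size=10):
    # Constant-size stack of object instances (max_size is illustrative)
    stack.append(instance)
    if len(stack) > max_size:
        stack.pop(0)            # drop the oldest instance
    return stack

m1 = np.array([[0.2, 0.8], [0.5, 0.4]])   # LSK similarity to previous ROI
m2 = np.array([[0.3, 0.7], [0.9, 0.1]])   # LSK similarity to stack top
b  = np.array([[1, 1], [0, 1]])           # B_CH from the color stage
idx, score = localize(m1, m2, b)
print(int(idx[0]), int(idx[1]), round(float(score), 2))   # 0 1 0.75
```

If the score drops below the appearance-change threshold, localize would be re-run on the four rotated and scaled object versions and the global maximum taken.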

Overview: Introduction, Proposed Method (Color Similarity, Object Texture Description, Object Localization and Model Update, Search Region Extraction in the Next Frame), Experimental Results, Conclusion

Search Region Extraction in the Next Frame The position of the object in the following frame is predicted using a linear Kalman filter. The object motion state is $x_t = (x, y, dx, dy)^T$, and the new state is given by $x_t = A x_{t-1} + w_{t-1}$, where $w_{t-1}$ denotes the process noise with a Gaussian probability distribution. The measurement $z_t = H x_t + v_t$ is then used in the standard Kalman correction equations $$K_t = P_t^{-} H^T \left(H P_t^{-} H^T + R\right)^{-1}, \qquad \hat{x}_t = \hat{x}_t^{-} + K_t\left(z_t - H \hat{x}_t^{-}\right), \qquad P_t = (I - K_t H)\, P_t^{-}$$ where $P_t$ is the state error covariance matrix of the stochastic model, adjusted through these equations at every frame. The predicted position $\hat{x}_t^{-}$ gives the center of the next search region.
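A self-contained constant-velocity Kalman step in Python, matching the state $x_t = (x, y, dx, dy)^T$ above; the noise covariances Q and R are illustrative values, not taken from the paper.

```python
import numpy as np

A = np.array([[1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], float)   # constant-velocity state transition
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], float)   # only the center (x, y) is measured
Q = 0.01 * np.eye(4)                  # process noise covariance (assumed)
R = 1.0 * np.eye(2)                   # measurement noise covariance (assumed)

def kalman_step(x, P, z):
    # Correct the current state with the measured object center z
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    x = x + K @ (z - H @ x)
    P = (np.eye(4) - K @ H) @ P
    # Predict the state at the next frame; its (x, y) part is the
    # center of the next search region
    return A @ x, A @ P @ A.T + Q

x, P = np.array([60.0, 58.0, 1.0, 0.0]), np.eye(4)
x, P = kalman_step(x, P, z=np.array([61.0, 58.5]))
print("next search region centered at:", x[:2])
```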

Overview: Introduction, Proposed Method, Experimental Results, Conclusion

Experimental Results Quantitative evaluation is performed through the frame detection accuracy (FDA) measure. FDA calculates the overlap between the ground-truth objects $G_i$ and the detected objects $D_i$ at a given frame $t$: $$FDA(t) = \frac{1}{N_t} \sum_{i=1}^{N_t} \frac{\left|G_i(t) \cap D_i(t)\right|}{\left|G_i(t) \cup D_i(t)\right|}, \qquad ATA = \frac{1}{N} \sum_{t=1}^{N} FDA(t), \qquad OTA = \frac{1}{N_T} \sum_{i=1}^{n} N_i\, ATA_i$$ where $N_t$ is the number of objects in frame $t$, ATA averages FDA over the $N$ frames of a sequence, and OTA averages ATA over all sequences, weighted by their frame counts $N_i$.
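A short Python sketch of these measures, reading the per-object FDA term as an intersection-over-union; the box format and the toy values are illustrative.

```python
import numpy as np

def iou(a, b):
    # Boxes as (x1, y1, x2, y2): overlap area over union area
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def fda(gt_boxes, det_boxes):
    # FDA(t): mean overlap over the N_t ground-truth/detection pairs
    return float(np.mean([iou(g, d) for g, d in zip(gt_boxes, det_boxes)]))

# ATA averages FDA(t) over a sequence; OTA then averages ATA over all
# sequences, weighted by their frame counts
frames = [([(10, 10, 50, 50)], [(12, 12, 52, 52)]),
          ([(20, 10, 60, 50)], [(18, 12, 55, 50)])]
ata = np.mean([fda(g, d) for g, d in frames])
print(f"ATA = {ata:.3f}")
```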

Experimental Results The performance of the proposed tracker is compared with that of two other trackers, the PF tracker and the FT tracker, on the following test cases.


Experimental Results Test Case 1: variation of object scale

Experimental Results Test Case 2: variation of object scale

Experimental Results Test Case 3: object rotation

Experimental Results Test Case 4: partial occlusion

Experimental Results Test Case 5: partial occlusion (the man walks behind the woman)

Experimental Results Test Case 6: strong change in illumination

Experimental Results Test Case 7: human activity (orientation of a glass)

Experimental Results Test Case 8: human activity (hands are articulated objects)

Experimental Results Test Case 9: human activity

Experimental Results Test Case 10: face tracking

Overview: Introduction, Proposed Method, Experimental Results, Conclusion

Conclusion The tracker extracts a representation of the target object based on LSKs and CHs at frame $t-1$ and tries to find its location in frame $t$. The proposed method is effective for object tracking under severe changes in appearance, affine transformations, and partial occlusion. The method cannot handle full occlusion: the tracker continues by tracking another object in the background. The Kalman filter cannot follow sudden changes in the object's direction or speed; a larger search region might solve this issue, but it would cause a rapid decrease in tracking speed.