Real-Time Object Localization and Tracking from Image Sequences


1 Real-Time Object Localization and Tracking from Image Sequences
Yuanwei Wu, Yao Sui, Arjan Gupta, and Guanghui Wang. Friday, Sep. 9, 2016

2 Background Amazon: On December 1, 2013, Amazon CEO Jeff Bezos revealed plans for a future delivery system using small unmanned aerial vehicles (UAVs). Such a system requires a real-time, autonomous sense-and-avoid navigation system. Vision-based methods are robust to electromagnetic interference compared with conventional sensor-based methods, and are compact with low power consumption. Source: Amazon, "Amazon Prime Air."

3 Salient Object Detection and Tracking
The task of salient object detection is to compute a saliency map and segment an accurate boundary of the object. This Amazon UAV sequence contains challenging situations, namely scale variation, out-of-view, and re-appearance. For detection, MB+ fails to provide a high-quality saliency map.

4 Salient Object Detection and Tracking
The goal of visual tracking is to estimate the boundary and trajectory of the object in every frame of an image sequence. For tracking, the existing trackers cannot handle the out-of-view and re-appearance challenges. Our method provides a high-quality saliency map in detection, and an accurate scale and position of the target in tracking.

5 Previous Works Real-time Automatic Initialization
Detection and tracking [Andriluka et al. CVPR'08] Saliency-based tracking [Mahadevan et al. CVPR'09] [Andriluka et al. CVPR'08] combines a detector with a tracker; however, it requires a large amount of off-line training for pedestrians. [Mahadevan et al. CVPR'09] utilizes the center-surround contrast cue to compute the saliency map and discriminate the object from the background; however, it builds the motion saliency maps using optical flow, which makes it computationally intensive.

6 Previous Works Real-time Automatic Initialization
Detection and tracking [Andriluka et al. CVPR'08] Saliency-based tracking [Mahadevan et al. CVPR'09] State-of-the-art Tracking-by-detection CT [Zhang et al. ECCV'12] STC [Zhang et al. ECCV'14] CN [Danelljan et al. TPAMI'14] SAMF [Li et al. ECCVW'14] DSST [Danelljan et al. BMVC'14] CCT [Zhu et al. BMVC'15] KCF [Henriques et al. TPAMI'15] Real-time trackers: CT is sparsity-based compressive tracking, with no scale-adaptive bounding box. Correlation-filter-based trackers: STC, CN, SAMF, DSST, CCT, KCF.

7 Previous Works Real-time Automatic Initialization
Detection and tracking [Andriluka et al. CVPR'08] Saliency-based tracking [Mahadevan et al. CVPR'09] State-of-the-art Tracking-by-detection CT [Zhang et al. ECCV'12] STC [Zhang et al. ECCV'14] CN [Danelljan et al. TPAMI'14] SAMF [Li et al. ECCVW'14] DSST [Danelljan et al. BMVC'14] CCT [Zhu et al. BMVC'15] KCF [Henriques et al. TPAMI'15] Detection-then-tracking Proposed Proposed approach: real-time, automatic initialization by integrating a Kalman filter with salient object detection.

8 Contributions The proposed algorithm integrates the saliency map into a dynamic model and adopts the target-specific saliency map as the observation for tracking; it develops a tracker with automatic initialization for real-world applications; and it achieves state-of-the-art performance in extensive real-world experiments.

9 System Overview Fast Object Localization and Tracking (FOLT)
In this approach, the bounding box of the object is initialized from the saliency map of the entire image. A dynamic motion model is established to predict the object's position and size in the next frame. After initialization, the proposed approach runs recursively through prediction, observation, and correction phases, as sketched below.
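A minimal sketch of this recursive loop, assuming a hypothetical `kalman` filter object and `detect_salient_object` function that stand in for the components detailed on the following slides (not the authors' code):

```python
# Minimal sketch of the FOLT loop; `kalman` and `detect_salient_object` are
# hypothetical stand-ins for the prediction/correction and observation components.
def track_sequence(frames, kalman, detect_salient_object):
    # Automatic initialization: saliency map of the entire first frame.
    bbox = detect_salient_object(frames[0], search_region=None)
    kalman.initialize(bbox)
    results = [bbox]
    for frame in frames[1:]:
        pred_box = kalman.predict()                    # prediction phase
        bbox = detect_salient_object(frame, pred_box)  # observation phase: saliency is
                                                       # computed inside the expanded box
        kalman.correct(bbox)                           # correction phase
        results.append(bbox)
    return results
```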

10 System Overview: prediction
Coarse solution: linear Gaussian motion model [24]. Motion state: s_t = [x, y, u, v, w, h]^T, where (x, y) denotes the center coordinates, (u, v) denotes the velocities, and (w, h) denotes the width and height of the minimum bounding box. Project the state ahead: s_t^- = F s_{t-1} + q_{t-1} (1), where q_{t-1} is additive white Gaussian process noise with covariance Q. Project the error covariance ahead: P_t^- = F P_{t-1} F^T + Q. In most tracking scenarios, a linear Gaussian motion model has been demonstrated to be an effective representation of the motion behavior of a salient object in natural image sequences [24, 36]. Under the constraint of natural motion, this predicted bounding box provides the tracking algorithm with a coarse solution that is not far from the ground truth [24]. [24] Yin, S., Na, J.H., Choi, J.Y., Oh, S.: "Hierarchical Kalman-particle filter with adaptation to motion changes for object tracking", CVIU 115(6) (2011)
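A small numpy sketch of this prediction step, using the standard Kalman prediction equations (the transition matrix F and noise covariance Q are supplied by the caller; see the parameter slides later, and note this is not the authors' implementation):

```python
import numpy as np

def predict(s, P, F, Q):
    """Project the state and error covariance ahead (Eq. 1).
    s: current state [x, y, u, v, w, h]^T; P: error covariance;
    F: transition matrix; Q: process noise covariance."""
    s_prior = F @ s              # s_t^- = F s_{t-1}  (plus white Gaussian noise q_{t-1})
    P_prior = F @ P @ F.T + Q    # P_t^- = F P_{t-1} F^T + Q
    return s_prior, P_prior
```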

11 System Overview: observation
Refined solution. A search region is automatically obtained by expanding the predicted bounding box by a fixed percentage. The location and size of the object are then refined by computing the saliency map within the search region and thresholding it. The observation z_t is the output of the fast salient object detector.
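A hedged sketch of this observation step; `saliency_fn` stands in for the fast MBD-based detector, and `expand_ratio` and `thresh` are illustrative values rather than the paper's parameters:

```python
import numpy as np

def observe(frame, pred_box, saliency_fn, expand_ratio=0.5, thresh=0.5):
    """Refine the predicted box by thresholding a saliency map inside a search region."""
    x, y, w, h = pred_box                          # center and size of the predicted box
    sw, sh = w * (1 + expand_ratio), h * (1 + expand_ratio)
    x0, y0 = int(max(x - sw / 2, 0)), int(max(y - sh / 2, 0))
    x1 = int(min(x + sw / 2, frame.shape[1]))
    y1 = int(min(y + sh / 2, frame.shape[0]))
    region = frame[y0:y1, x0:x1]                   # expanded search region

    sal = saliency_fn(region)                      # saliency map in [0, 1]
    ys, xs = np.nonzero(sal > thresh)              # pixels belonging to the salient object
    if len(xs) == 0:
        return pred_box                            # fall back to the prediction
    bw, bh = xs.max() - xs.min() + 1, ys.max() - ys.min() + 1
    cx, cy = x0 + (xs.min() + xs.max()) / 2, y0 + (ys.min() + ys.max()) / 2
    return (cx, cy, bw, bh)                        # observation z_t
```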

12 System Overview: correction
Update the motion model. Compute the Kalman gain: K_t = P_t^- H^T (H P_t^- H^T + R)^{-1}. Update the estimate with the measurement: s_t = s_t^- + K_t (z_t - H s_t^-) (2). Update the error covariance: P_t = (I - K_t H) P_t^-. Figure: (a) prediction, (b) observation, (c) correction. Next, the refined bounding box, as a new observation, is fed to the Kalman filter to update the motion model in the correction phase.
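A numpy sketch of this correction phase, following the standard Kalman update equations (a generic sketch, not the authors' code; H and R are the measurement matrix and measurement noise covariance from the parameter slides):

```python
import numpy as np

def correct(s_prior, P_prior, z, H, R):
    """Standard Kalman correction: gain, state update (Eq. 2), covariance update."""
    S = H @ P_prior @ H.T + R                           # innovation covariance
    K = P_prior @ H.T @ np.linalg.inv(S)                # Kalman gain K_t
    s_post = s_prior + K @ (z - H @ s_prior)            # update estimate with measurement z_t
    P_post = (np.eye(len(s_prior)) - K @ H) @ P_prior   # update error covariance
    return s_post, P_post
```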

13 Salient Object Detection
Measuring boundary connectivity by distance transform: compute the distance of each pixel w.r.t. the image boundary. The idea of MBD, and how to measure image boundary connectivity with a distance transform: set the image boundary pixels as the seed set (shown in red); for each pixel (shown in green), find the shortest path (shown in grey) to the seed set according to the given path cost function. The cost of this shortest path is the distance between the green pixel and the seed set. Source: Zhang et al. ICCV'15
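In symbols (notation assumed here, following Zhang et al. ICCV'15): with S the seed set of image boundary pixels, Pi_{S,x} the set of paths connecting S to a pixel x, and F a path cost function, the distance transform is

```latex
\[
  D(x) \;=\; \min_{\pi \,\in\, \Pi_{S,x}} F(\pi).
\]
```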

14 MBD vs Geodesic Distance
In what follows, we consider a single-channel, real-valued image. MBD [Strand et al. CVIU'13] vs. geodesic distance. The path cost function of MBD: pi is the path, i.e., a sequence of adjacent pixels, and I(pi(i)) denotes the pixel value of the i-th pixel on the path. The MBD path cost measures the difference between the highest and the lowest pixel values along the path. The geodesic path cost, in contrast, accumulates the absolute differences between consecutive pixel values along the path. Source: Zhang et al. ICCV'15
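Written out, for a path pi = (pi(0), ..., pi(k)) of adjacent pixels (standard definitions following Zhang et al. ICCV'15; the symbol names are chosen here for clarity):

```latex
% MBD path cost: range of pixel values along the path
\[
  \beta_I(\pi) \;=\; \max_{i=0,\dots,k} I\bigl(\pi(i)\bigr) \;-\; \min_{i=0,\dots,k} I\bigl(\pi(i)\bigr)
\]
% Geodesic path cost: accumulated absolute differences along the path
\[
  \Sigma_I(\pi) \;=\; \sum_{i=1}^{k} \bigl| I\bigl(\pi(i)\bigr) - I\bigl(\pi(i-1)\bigr) \bigr|
\]
```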

15 MBD is robust to small pixel value fluctuation
MBD vs. geodesic distance: MBD is robust to small pixel value fluctuations. With a higher sampling frequency (higher resolution), pixel values along the path can fluctuate more. The MBD path cost is robust to this, because it depends only on the lowest and highest points of the path. The geodesic path cost, however, is sensitive to small pixel value fluctuations, as they accumulate along the path. When applied to raw image pixels, the effectiveness of the geodesic distance can be greatly degraded by this phenomenon. Source: Zhang et al. ICCV'15

16 Raster scanning/Inverse-raster scanning

17 Algorithm For each visited pixel x:
Check each of its 4-connected neighbors y during the forward (raster) and backward (inverse-raster) passes. Source: Zhang et al. ICCV'15

18 Algorithm For each visited pixel x:
Check each of the 4-connected neighbors and minimize the path cost. During the passes, we keep track of the following information for each pixel: D(x), the cost of the currently assigned path; U(x), the highest value on the assigned path; and L(x), the lowest value on the assigned path. For a neighbor y of x, the cost of extending y's path to x is max(U(y), I(x)) - min(L(y), I(x)); if this is smaller than D(x), the path is reassigned and D(x), U(x), and L(x) are updated accordingly. Source: Zhang et al. ICCV'15
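A simplified single-channel sketch of one such scan, assuming D has been initialized to infinity everywhere except at the boundary seed pixels (where D = 0 and U = L = I); this is a didactic version, not the authors' optimized implementation:

```python
import numpy as np

def raster_pass(I, D, U, L, forward=True):
    """One raster (forward) or inverse-raster (backward) scan of the MBD transform.
    I: single-channel image; D: current path costs; U/L: highest/lowest values
    on the currently assigned paths. Arrays are updated in place."""
    H, W = I.shape
    ys = range(H) if forward else range(H - 1, -1, -1)
    xs = range(W) if forward else range(W - 1, -1, -1)
    # Forward pass looks at the upper and left neighbors; backward pass at lower and right.
    offsets = [(-1, 0), (0, -1)] if forward else [(1, 0), (0, 1)]
    for y in ys:
        for x in xs:
            for dy, dx in offsets:
                ny, nx = y + dy, x + dx
                if 0 <= ny < H and 0 <= nx < W:
                    u = max(U[ny, nx], I[y, x])   # highest value if y's path is extended to x
                    l = min(L[ny, nx], I[y, x])   # lowest value on the extended path
                    cost = u - l                  # MBD cost of the candidate path
                    if cost < D[y, x]:
                        D[y, x], U[y, x], L[y, x] = cost, u, l
```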

19 Algorithm Combined

20 Experiments Parameters: Transition state matrix, Measurement matrix

21 Experiments Parameters: Process noise covariance matrix
Measurement noise covariance matrix. A priori error covariance.
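A numpy sketch of one plausible parameterization of these matrices for the 6-dimensional state [x, y, u, v, w, h] and a 4-dimensional measurement [x, y, w, h]; the actual values used in the paper are not shown on the slide, so the scalars below are placeholders:

```python
import numpy as np

# Transition (state) matrix F: constant-velocity model for the center, size carried over.
F = np.eye(6)
F[0, 2] = 1.0            # x <- x + u
F[1, 3] = 1.0            # y <- y + v

# Measurement matrix H: the detector observes position and size, not the velocities.
H = np.zeros((4, 6))
H[0, 0] = H[1, 1] = 1.0  # x, y
H[2, 4] = H[3, 5] = 1.0  # w, h

# Noise covariances and initial (a priori) error covariance; scalars are placeholders.
Q = 1e-2 * np.eye(6)     # process noise covariance
R = 1e-1 * np.eye(4)     # measurement noise covariance
P0 = 1.0 * np.eye(6)     # initial error covariance
```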

22 Qualitative Evaluation

23 Precision and Success Plots
One-pass evaluation (OPE); temporal robustness evaluation (TRE).

24 Precision and Success Rates on 15 Sequences

25 Precision and Success Rates on 15 Sequences

26 Qualitative Evaluation
Illumination variation; in-plane and out-of-plane rotations; scale variation.

27 Limitations

28 Conclusions In this paper, we have proposed an effective and efficient
approach for real-time visual object localization and tracking. Our method integrates a fast salient object detector within the Kalman filtering framework. Compared to the state-of-the-art trackers, our approach not only initializes automatically, but also achieves the fastest speed and better performance than competing trackers.

29 The end!

