CVPR2019 Jiahe Li 2019.06.03 SiamRPN introduces the region proposal network after the Siamese network and performs joint classification and regression.


1 CVPR2019 Jiahe Li
SiamRPN introduces the region proposal network after the Siamese network and performs joint classification and regression for tracking. DaSiamRPN further introduces a distractor-aware module and improves the discrimination power of the model.
[SiamRPN] B. Li, J. Yan, W. Wu, Z. Zhu, and X. Hu. High performance visual tracking with siamese region proposal network. CVPR, 2018.
[DaSiamRPN] Z. Zhu, Q. Wang, B. Li, W. Wu, J. Yan, and W. Hu. Distractor-aware siamese networks for visual object tracking. ECCV, 2018.

2 The intrinsic restriction
Matching function
Restriction: strict translation invariance, defined with the translation shift sub-window operator
Padding in deep networks destroys strict translation invariance
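A sketch of the matching function and the restriction in formula form, assuming the notation of the SiamRPN++ paper. The symbols $\phi$ (shared backbone), $b$ (bias), $z$ (template), $x$ (search region) and $[\Delta\tau_j]$ (translation shift sub-window operator) are taken from the paper, not from this slide:

```latex
\begin{align}
  f(z, x) &= \phi(z) * \phi(x) + b \\
  f(z, x[\Delta\tau_j]) &= f(z, x)[\Delta\tau_j]
\end{align}
```

The second line says that shifting the search window and then matching should give the same result as matching first and then shifting the response.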

3 Strict translation invariance
Previous Siamese-based networks: strict translation invariance exists only in networks without padding, such as AlexNet.
Deeper networks: padding is unavoidable when making the network deeper, as in ResNet or MobileNet, and it breaks the strict translation invariance restriction.
The hypothesis: violating this restriction leads to a spatial bias.
A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv, 2017.
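The effect of padding can be illustrated with a toy 1-D example. This is only a sketch, not an experiment from the paper: the signal, the kernel, and the helper names (`correlate_valid`, `correlate_same`, `crop`) are hypothetical, and a single correlation layer stands in for a deep network. Without padding, cropping a sub-window and extracting features commute; with zero padding they do not, because the padded zeros replace the real neighbours at the crop border.

```python
def correlate_valid(signal, kernel):
    """1-D cross-correlation without padding (output is shorter than input)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def correlate_same(signal, kernel):
    """1-D cross-correlation with zero padding (output length equals input length)."""
    pad = len(kernel) // 2
    padded = [0] * pad + list(signal) + [0] * pad
    return correlate_valid(padded, kernel)


def crop(seq, start, length):
    """Sub-window operator: take `length` samples beginning at `start`."""
    return seq[start:start + length]


# Hypothetical toy data, chosen only for illustration.
signal = [1, 4, 2, 7, 3, 8, 5, 6]
kernel = [1, 2, 1]
start, length = 2, 5

# No padding: features of the cropped window equal the cropped features.
valid_of_crop = correlate_valid(crop(signal, start, length), kernel)
crop_of_valid = crop(correlate_valid(signal, kernel), start,
                     length - len(kernel) + 1)
no_padding_commutes = valid_of_crop == crop_of_valid

# Zero padding: the two orders disagree at the window borders.
same_of_crop = correlate_same(crop(signal, start, length), kernel)
crop_of_same = crop(correlate_same(signal, kernel), start, length)
padding_commutes = same_of_crop == crop_of_same

print("no padding commutes:", no_padding_commutes)
print("zero padding commutes:", padding_commutes)
```

In this toy case only the border positions differ; in a deep network each padded layer adds its own border effect.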

4 A spatial bias
Testing the hypothesis: simulation experiments on a network with padding.
Shift: the maximum range of translation, drawn from a uniform distribution during data augmentation.
In three separate training experiments, targets are placed at the center with shift ranges of 0, 16, and 32.
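The shift augmentation can be sketched as follows. This is a minimal illustration, assuming the shift is an integer pixel offset drawn independently per axis from a uniform distribution over [-max_shift, max_shift]; the function names and the example center coordinates are hypothetical and not taken from the paper's code.

```python
import random


def sample_shift(max_shift, rng):
    """Draw a (dx, dy) offset, each uniform over the integers in [-max_shift, max_shift]."""
    dx = rng.randint(-max_shift, max_shift)
    dy = rng.randint(-max_shift, max_shift)
    return dx, dy


def shifted_center(center, max_shift, rng):
    """Move the target away from the crop center by a random shift."""
    cx, cy = center
    dx, dy = sample_shift(max_shift, rng)
    return cx + dx, cy + dy


rng = random.Random(0)   # fixed seed so the sketch is repeatable
center = (127, 127)      # assumed crop center, for illustration only

# The three shift ranges used in the three separate training experiments.
samples = {
    max_shift: [shifted_center(center, max_shift, rng) for _ in range(1000)]
    for max_shift in (0, 16, 32)
}
```

With a shift range of 0 every training target stays exactly at the center, which is the setting where a spatial bias toward the center would be expected to be learned.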

5 A suitable shift

6 Layer-wise aggregation
As shown in Fig. 3, the outputs of conv3, conv4, and conv5 are fed into three Siamese RPN modules individually.
The conv4 and conv5 blocks are modified to have unit spatial stride.
Features from earlier layers mainly capture low-level information such as color and shape, which is essential for localization but lacks semantic information.
Features from later layers carry rich semantic information, which helps in challenging scenarios such as motion blur and large deformation.
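One way to combine the three RPN outputs is a weighted sum per position. The sketch below assumes this form of aggregation and assumes the weights are normalized with a softmax; neither detail is stated on the slide, and the names `softmax` and `aggregate` are hypothetical.

```python
import math


def softmax(raw_weights):
    """Normalize raw weights so they are positive and sum to 1."""
    peak = max(raw_weights)
    exps = [math.exp(w - peak) for w in raw_weights]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(layer_maps, raw_weights):
    """Weighted sum of same-sized 2-D maps, one map per layer (e.g. conv3, conv4, conv5)."""
    weights = softmax(raw_weights)
    rows, cols = len(layer_maps[0]), len(layer_maps[0][0])
    return [[sum(w * m[r][c] for w, m in zip(weights, layer_maps))
             for c in range(cols)]
            for r in range(rows)]


# Toy 1x2 response maps standing in for the three RPN outputs.
conv3_map = [[0.0, 3.0]]
conv4_map = [[3.0, 3.0]]
conv5_map = [[6.0, 3.0]]

# Equal raw weights reduce the aggregation to a plain average.
fused = aggregate([conv3_map, conv4_map, conv5_map], [0.0, 0.0, 0.0])
```

Because conv4 and conv5 have unit spatial stride, the three maps share one resolution and can be summed position by position without resizing.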

7 Depth-wise cross correlation
Two feature maps with the same number of channels are correlated channel by channel.
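A minimal pure-Python sketch of the channel-by-channel correlation, assuming features are stored as nested lists indexed `[channel][row][col]` and that no padding is used. The function names are hypothetical; a real tracker would use a grouped convolution from a deep learning framework instead.

```python
def xcorr_single(search, template):
    """Cross-correlate one search channel with one template channel (no padding)."""
    sh, sw = len(search), len(search[0])
    th, tw = len(template), len(template[0])
    return [[sum(search[r + i][c + j] * template[i][j]
                 for i in range(th) for j in range(tw))
             for c in range(sw - tw + 1)]
            for r in range(sh - th + 1)]


def depthwise_xcorr(search_feat, template_feat):
    """Correlate channel k of the search feature with channel k of the template only."""
    if len(search_feat) != len(template_feat):
        raise ValueError("both feature maps need the same number of channels")
    return [xcorr_single(s, t) for s, t in zip(search_feat, template_feat)]


# Two channels: 3x3 search features, 2x2 template features.
search_feat = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[1, 1, 1], [1, 1, 1], [1, 1, 1]],
]
template_feat = [
    [[1, 0], [0, 1]],
    [[1, 1], [1, 1]],
]

response = depthwise_xcorr(search_feat, template_feat)
```

The output keeps one response map per channel, so the channel count of the result equals that of the inputs.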

8 SiamRPN++

9 Loss Function

10 Dataset & Evaluation Metrics
Single Object Tracking: Datasets
OTB2015: 98 videos, frames per video
VOT2018: 60 videos
UAV123: 123 videos, 915 frames per video
LaSOT: 1400 videos, 280 videos in the testing set
TrackingNet: 511 videos in the testing set
Evaluation metrics:
IoU
Expected Average Overlap (EAO)
AUC: the area under curve of each success plot, which is the average of the success rates corresponding to the sampled overlap thresholds
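IoU and the success-plot AUC can be sketched as below. The sketch assumes boxes in `(x, y, width, height)` format, 21 evenly spaced overlap thresholds from 0 to 1, and a frame counted as a success when its overlap is strictly greater than the threshold; these are common conventions, not details given on the slide, and the function names are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x, y, width, height) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def success_auc(overlaps, num_thresholds=21):
    """Average success rate over evenly sampled overlap thresholds in [0, 1]."""
    thresholds = [i / (num_thresholds - 1) for i in range(num_thresholds)]
    rates = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(rates) / len(rates)


# Toy sequence: predicted vs. ground-truth boxes for three frames.
predictions = [(0, 0, 2, 2), (1, 0, 2, 2), (5, 5, 2, 2)]
ground_truth = [(0, 0, 2, 2), (0, 0, 2, 2), (0, 0, 2, 2)]

overlaps = [iou(p, g) for p, g in zip(predictions, ground_truth)]
auc = success_auc(overlaps)
```

Averaging the success rates over the sampled thresholds is the discrete form of the area under the success plot.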

11 Ablation experiments
They report performance as the Area Under Curve (AUC) of the success plot on OTB2015 with respect to the top-1 accuracy on ImageNet.

12 OTB2015
OPE: one-pass evaluation
TRE: temporal robustness evaluation (different start frames)
SRE: spatial robustness evaluation (four center shifts and four corner shifts)

13 VOT2018

14 UAV123

15 LaSOT

16 TrackingNet
M. Müller, A. Bibi, S. Giancola, S. Al-Subaihi, and B. Ghanem. TrackingNet: A large-scale dataset and benchmark for object tracking in the wild. ECCV, 2018.

17 Conclusion
+ Evaluation with efficient experiments
- No experiment is conducted to demonstrate that strict translation invariance exists only in networks without padding.

