Published by Maryann McKinney. Modified over 6 years ago.
1
HCP: A Flexible CNN Framework for Multi-Label Image Classification
Source: IEEE Transactions on Pattern Analysis & Machine Intelligence, 2016, 38(9):
Authors: Wei Y, Wei X, Lin M, et al.
Speaker: Jiefan Tan
Date: 2017/6/29
Department of Computer Science and Information Engineering, National Taiwan University (國立台灣大學資訊工程學系)
2
Outline
Introduction
Proposed method
  Hypotheses Extraction
  Training HCP
Experiment
Conclusions
3
Introduction- single-label image classification
Image classification: hand-crafted features + classifier vs. CNN (+10%)
4
Introduction- single-label vs. multi-label images
Single-label images: roughly aligned. Multi-label images: mis-aligned, occluded.
5
Proposed method- infrastructure of the proposed HCP
6
Proposed method- Hypotheses Extraction
Source image → Bounding boxes (BING) → Hypotheses-Bb → Hypotheses-HS (resize)
Filter out hypotheses with:
1. Small area (< 900 pixels)
2. High height/width or width/height ratio (> 4)
High object detection recall rate
Small number of hypotheses
High computational efficiency
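The two filtering rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the box layout `(x_min, y_min, x_max, y_max)` and the names `keep_hypothesis` and `filter_hypotheses` are hypothetical and do not come from the slides; only the thresholds (900 pixels, ratio 4) are taken from the slide.

```python
from typing import List, Tuple

# Hypothetical box layout (not from the slides): (x_min, y_min, x_max, y_max)
Box = Tuple[int, int, int, int]

MIN_AREA = 900    # pixels; threshold stated on the slide
MAX_RATIO = 4.0   # height/width or width/height; threshold stated on the slide


def keep_hypothesis(box: Box) -> bool:
    """Return True if the box passes both the area and the aspect-ratio rule."""
    x0, y0, x1, y1 = box
    w, h = x1 - x0, y1 - y0
    if w <= 0 or h <= 0:
        return False                      # degenerate box
    if w * h < MIN_AREA:
        return False                      # rule 1: area too small
    if max(w / h, h / w) > MAX_RATIO:
        return False                      # rule 2: too elongated
    return True


def filter_hypotheses(boxes: List[Box]) -> List[Box]:
    """Keep only the boxes that survive both rules, preserving order."""
    return [b for b in boxes if keep_hypothesis(b)]


boxes = [
    (0, 0, 20, 20),      # area 400: dropped by rule 1
    (0, 0, 100, 100),    # area 10000, ratio 1: kept
    (0, 0, 200, 40),     # ratio 5: dropped by rule 2
    (10, 10, 70, 130),   # 60 x 120, ratio 2: kept
]
kept = filter_hypotheses(boxes)
print(kept)
```

Whether the thresholds are inclusive or exclusive at exactly 900 pixels or a ratio of exactly 4 is not stated on the slide; the sketch keeps boxes that sit exactly on either boundary.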
7
Proposed method- Training HCP
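Training stages:
1. Pre-train (parameter initialization) on single-label images
2. Image-fine-tuning (initialize final fully-connected layer) on multi-label images
3. Hypotheses-fine-tuning on hypotheses
Cross-hypothesis max-pooling:
$v^{(j)} = \max\left(v_1^{(j)}, v_2^{(j)}, \ldots, v_m^{(j)}\right)$
where $v_i$ ($i = 1, \ldots, m$) is the output vector of the $i$-th hypothesis, $v_i^{(j)}$ ($j = 1, \ldots, c$) is the $j$-th component of $v_i$, $m$ is the number of hypotheses (hypothesis images), and $c$ is the number of categories.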
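Cross-hypothesis max-pooling takes, for each category, the maximum score over all hypotheses of one image. A minimal pure-Python sketch follows; the function name `cross_hypothesis_max_pool` and the toy scores are illustrative assumptions, not part of the presentation.

```python
from typing import List, Sequence


def cross_hypothesis_max_pool(outputs: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise maximum over m hypothesis output vectors of length c.

    outputs[i][j] plays the role of the j-th component of v_i.
    """
    if not outputs:
        raise ValueError("at least one hypothesis is required")
    c = len(outputs[0])
    if any(len(v) != c for v in outputs):
        raise ValueError("all output vectors must have the same length")
    return [max(v[j] for v in outputs) for j in range(c)]


# Toy example: m = 3 hypotheses, c = 3 categories (made-up scores).
outputs = [
    [0.9, 0.1, 0.0],
    [0.2, 0.7, 0.1],
    [0.1, 0.1, 0.3],
]
pooled = cross_hypothesis_max_pool(outputs)
print(pooled)  # [0.9, 0.7, 0.3]
```

Because each category takes its maximum independently, a hypothesis that scores low everywhere (for example a background patch) does not change the pooled vector.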
8
Proposed method- Training HCP
Cross-hypothesis max-pooling is followed by a squared loss:
$J = \frac{1}{N} \sum_{i=1}^{N} \sum_{k=1}^{c} \left(p_{ik} - \hat{p}_{ik}\right)^2$
where $\hat{p}_i$ is the ground-truth probability vector of the $i$-th image, $p_i$ is the predictive probability vector, $N$ is the number of images, and $c$ is the number of categories.
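The squared loss can be written out directly as a double sum over images and categories. The sketch below is illustrative only. The helper `normalize_labels`, which turns a binary label vector into a probability vector by dividing by its sum, is an assumption about how the ground-truth probability vector is formed; the slide does not state it.

```python
from typing import List, Sequence


def normalize_labels(y: Sequence[int]) -> List[float]:
    """Assumption (not on the slide): ground-truth probabilities = y / sum(y)."""
    total = sum(y)
    if total == 0:
        raise ValueError("label vector must contain at least one positive label")
    return [label / total for label in y]


def squared_loss(predicted: Sequence[Sequence[float]],
                 target: Sequence[Sequence[float]]) -> float:
    """J = (1/N) * sum_i sum_k (predicted[i][k] - target[i][k]) ** 2."""
    n = len(predicted)
    if n == 0 or n != len(target):
        raise ValueError("predicted and target must be non-empty and equal in length")
    total = 0.0
    for p_i, t_i in zip(predicted, target):
        total += sum((p - t) ** 2 for p, t in zip(p_i, t_i))
    return total / n


# Toy example: N = 2 images, c = 2 categories (made-up values).
target = [normalize_labels([0, 1]), normalize_labels([0, 1])]
predicted = [[1.0, 0.0], [0.0, 1.0]]
loss = squared_loss(predicted, target)
print(loss)  # 1.0
```

In the toy example the first image contributes a squared error of 2 and the second contributes 0, so the mean over the two images is 1.0.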
9
Proposed method- HCP
10
Experiment
Datasets: VOC 2007 (trainval/test = 5011/4952)
Shared CNN: Alex Net, VGG Net
11
Experiment Results
1. mAP: mean of Average Precision
2. Number of hypotheses
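As a rough illustration of the mAP metric, the sketch below computes a non-interpolated Average Precision per category and averages it over categories. This is a simplification chosen for brevity: the official VOC 2007 protocol uses an 11-point interpolated AP, so numbers from this sketch would not match benchmark figures exactly. The function names and toy data are hypothetical.

```python
from typing import List, Sequence


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Non-interpolated AP: mean precision at the rank of each positive image."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precisions: List[float] = []
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(score_matrix: Sequence[Sequence[float]],
                           label_matrix: Sequence[Sequence[int]]) -> float:
    """Average the per-category AP; rows are images, columns are categories."""
    c = len(score_matrix[0])
    aps = [
        average_precision([row[j] for row in score_matrix],
                          [row[j] for row in label_matrix])
        for j in range(c)
    ]
    return sum(aps) / c


# Toy example: 3 images, 2 categories (made-up scores and labels).
scores = [[0.9, 0.2], [0.8, 0.7], [0.1, 0.6]]
labels = [[1, 0], [0, 1], [1, 1]]
m_ap = mean_average_precision(scores, labels)
print(round(m_ap, 4))  # 0.9167
```

Here category 0 has AP = (1/1 + 2/3) / 2 = 5/6 and category 1 has AP = 1, giving mAP = 11/12.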
12
Experiment Results
13
Conclusions
No ground-truth bounding box information is required.
Robust to noisy and/or redundant hypotheses.
Can be well pre-trained by a large single-label image dataset.
The HCP outputs are intrinsically multi-label prediction results.
14
Thank you!