Programme
2pm Introduction
  – Andrew Zisserman, Chris Williams
2.10pm Overview of the challenge and results
  – Mark Everingham (Oxford)
2.40pm Session 1: The Classification Task
  – Frederic Jurie presenting work by Jianguo Zhang (INRIA) 20 mins
  – Frederic Jurie (INRIA) 20 mins
  – Thomas Deselaers (Aachen) 20 mins
  – Jason Farquhar (Southampton) 20 mins
4-4.30pm Coffee break
4.30pm Session 2: The Detection Task
  – Stefan Duffner/Christophe Garcia (France Telecom) 30 mins
  – Mario Fritz (Darmstadt) 30 mins
5.30pm Discussion
  – Lessons learnt, and future challenges

The PASCAL Visual Object Classes Challenge Mark Everingham Luc Van Gool Chris Williams Andrew Zisserman

Challenge
Four object classes
  – Motorbikes
  – Bicycles
  – People
  – Cars
Classification
  – Predict object present/absent
Detection
  – Predict bounding boxes of objects

Competitions
Train on any (non-test) data
  – How well do state-of-the-art methods perform on these problems?
  – Which methods perform best?
Train on supplied data
  – Which methods perform best given specified training data?

Data sets
train, val, test1
  – Sampled from the same distribution of images
  – Images taken from PASCAL image databases
  – “Easier” challenge
test2
  – Freshly collected for the challenge (mostly Google Images)
  – “Harder” challenge

Training and first test set

train+val
Class         Images   Objects
Motorbikes       214       217
Bicycles         114       123
People            84       152
Cars             272       320
Total            684

test1
Class         Images   Objects
Motorbikes       216       220
Bicycles         114       123
People            84       149
Cars             275       341
Total            689

Example images

Second test set

test2
Class         Images   Objects
Motorbikes       202       227
Bicycles         279       399
People           526      1038
Cars             275       381
Total           1282

Example images

Annotation for training
Object class present/absent
Sub-class labels (partial)
  – Car side, Car rear, etc.
Bounding boxes
Segmentation masks (partial)

Issues in ground truth
What objects should be considered detectable?
  – Subjective judgement by size in image, level of occlusion, detection without ‘inference’
    Disagreements will cause noise in evaluation, i.e. incorrectly judged false positives
“Errors” in training data
  – Un-annotated objects
    Requires machine learning algorithms robust to noise on class labels
  – Inaccurate bounding boxes
    Hard to specify for some instances, e.g. bicycles
    Detection threshold was set “liberally”

Results: Classification

Participants
Columns: test1 (Motorbikes, Bicycles, People, Cars) and test2 (Motorbikes, Bicycles, People, Cars)
Rows: Aachen; Darmstadt; Edinburgh; FranceTelecom; HUT; INRIA: dalal; INRIA: dorko; INRIA: jurie; INRIA: zhang; METU; MPITuebingen; Southampton
[Table: the marks showing which classes each participant entered are not preserved in this transcript]

Methods
Interest points (LoG/Harris) + patches/SIFT
  – Histogram of clustered descriptors
    SVM: INRIA: Dalal, INRIA: Zhang
    Log-linear model: Aachen
    Logistic regression: Edinburgh
    Other: METU
  – No clustering step
    SVM with other kernels: MPITuebingen, Southampton
  – Additional features
    Color: METU, moments: Southampton
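The "histogram of clustered descriptors" representation can be illustrated with a minimal sketch. It assumes the cluster centres have already been learned (for example by k-means) and uses toy 2-D vectors in place of real 128-dimensional SIFT descriptors; the function names are hypothetical and this is not any participant's actual implementation.

```python
import math

def nearest_center(descriptor, centers):
    """Index of the cluster centre closest (Euclidean) to the descriptor."""
    return min(range(len(centers)), key=lambda k: math.dist(descriptor, centers[k]))

def bow_histogram(descriptors, centers):
    """Normalized histogram of cluster assignments for one image."""
    counts = [0] * len(centers)
    for d in descriptors:
        counts[nearest_center(d, centers)] += 1
    total = sum(counts)
    # An image with no descriptors gets an all-zero histogram.
    return [c / total for c in counts] if total else [0.0] * len(centers)

# Toy data (assumed): two cluster centres, four local descriptors.
centers = [(0.0, 0.0), (10.0, 10.0)]
descriptors = [(0.5, 0.2), (9.0, 11.0), (10.5, 9.5), (1.0, -1.0)]
hist = bow_histogram(descriptors, centers)
print(hist)
```

The resulting fixed-length vector is what a classifier such as an SVM, log-linear model, or logistic regression would take as input, regardless of how many interest points the image contains.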

Methods
Image segmentation and region features: HUT
  – MPEG-7 color, shape, etc.
  – Self organizing map
Classification by detection: Darmstadt
  – Generalized Hough transform/SVM verification

Evaluation
Receiver Operating Characteristic (ROC)
  – Equal Error Rate (EER)
  – Area Under Curve (AUC)
[Figure: ROC curve with the EER point and the AUC marked]
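A minimal sketch of how the two ROC summaries could be computed from classifier scores. It assumes, since the "Max EER" figures on the results slides read as higher-is-better, that the EER is reported as the true positive rate at the point where the false positive rate equals one minus the true positive rate; that reading, the function names, and the toy scores are assumptions rather than the challenge's evaluation code.

```python
def roc_points(scores, labels):
    """ROC points (fpr, tpr) from sweeping a threshold down through the scores."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        s = ranked[i][0]
        # Items with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == s:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

def tpr_at_eer(points):
    """True positive rate where the ROC curve crosses the line tpr = 1 - fpr."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        g0, g1 = x0 + y0 - 1, x1 + y1 - 1
        if g0 <= 0 <= g1:
            if g1 == g0:
                return y0
            t = -g0 / (g1 - g0)  # linear interpolation along the segment
            return y0 + t * (y1 - y0)
    return None

# Toy scores (assumed): label 1 = object present, 0 = absent.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
pts = roc_points(scores, labels)
area = auc(pts)
eer_tpr = tpr_at_eer(pts)
print(area, eer_tpr)
```

For the toy data the area equals the fraction of positive/negative pairs ranked correctly (8 of 9), which is a quick sanity check on the trapezoid sum.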

Competition 1: train+val/test1 1.1: Motorbikes Max EER: 0.977 (INRIA: Jurie)

Competition 1: train+val/test1 1.2: Bicycles Max EER: 0.930 (INRIA: Jurie, INRIA: Zhang)

Competition 1: train+val/test1 1.3: People Max EER: 0.917 (INRIA: Jurie, INRIA: Zhang)

Competition 1: train+val/test1 1.4: Cars Max EER: 0.961 (INRIA: Jurie)

Competition 2: train+val/test2 2.1: Motorbikes Max EER: 0.798 (INRIA: Zhang)

Competition 2: train+val/test2 2.2: Bicycles Max EER: 0.728 (INRIA: Zhang)

Competition 2: train+val/test2 2.3: People Max EER: 0.719 (INRIA: Zhang)

Competition 2: train+val/test2 2.4: Cars Max EER: 0.720 (INRIA: Zhang)

Classes and test1 vs. test2 Mean EER of ‘best’ results across classes –test1 : 0.946, test2 : 0.741
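The two means can be checked from the per-class best figures quoted on the Competition 1 and Competition 2 slides. The sketch below only averages those numbers; the dictionary layout is an illustrative choice.

```python
# Best per-class EER figures as quoted on the Competition 1 and 2 slides.
best_eer = {
    "test1": {"Motorbikes": 0.977, "Bicycles": 0.930, "People": 0.917, "Cars": 0.961},
    "test2": {"Motorbikes": 0.798, "Bicycles": 0.728, "People": 0.719, "Cars": 0.720},
}

# Unweighted mean over the four classes for each test set.
mean_eer = {name: sum(per_class.values()) / len(per_class)
            for name, per_class in best_eer.items()}

for name, value in mean_eer.items():
    print(f"{name}: {value:.3f}")
```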

Conclusions?
Interest points + SIFT + clustering (histogram) + SVM did ‘best’
  – Log-linear model (Aachen) a close second
  – Results with SVM (INRIA) significantly better than with logistic regression (Edinburgh)
Method using detection (Darmstadt) did not do so well
  – Cannot exploit context (= unintended bias?) of image
  – Used subset of training data and is able to localize

Competitions 3 & 4
Classification
Any (non-test) training data to be used
No entries submitted

Results: Detection

Participants
Columns: test1 (Motorbikes, Bicycles, People, Cars) and test2 (Motorbikes, Bicycles, People, Cars)
Rows: Aachen; Darmstadt; Edinburgh; FranceTelecom; HUT; INRIA: dalal; INRIA: dorko; INRIA: jurie; INRIA: zhang; METU; MPITuebingen; Southampton
[Table: the marks showing which classes each participant entered are not preserved in this transcript]

Methods
Generalized Hough Transform
  – Interest points, clustered patches/descriptors, GHT
    Darmstadt: (SVM verification stage), side views with segmentation mask used for training
    INRIA: Dorko: SIFT features, semi-supervised clustering, single detection per image
“Sliding window” classifiers
  – Exhaustive search over translation and scale
    FranceTelecom: Convolutional neural network
    INRIA: Dalal: SVM with SIFT-based input representation
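The "exhaustive search over translation and scale" behind sliding-window detectors can be sketched as a window generator. The base window size, scale step, and stride fraction below are illustrative assumptions, not values used by any participant; in a real detector each window would then be scored by the classifier (CNN or SVM).

```python
def sliding_windows(img_w, img_h, base_w, base_h, scale_step=1.25, stride_frac=0.25):
    """Yield (x, y, w, h) for every window position and scale that fits in the image."""
    scale = 1.0
    while base_w * scale <= img_w and base_h * scale <= img_h:
        w = int(round(base_w * scale))
        h = int(round(base_h * scale))
        # Step size grows with the window so coarse scales are sampled more sparsely.
        stride_x = max(1, int(w * stride_frac))
        stride_y = max(1, int(h * stride_frac))
        for y in range(0, img_h - h + 1, stride_y):
            for x in range(0, img_w - w + 1, stride_x):
                yield (x, y, w, h)
        scale *= scale_step

# Toy setting (assumed): 100x50 image, 40x20 base window, doubling scales.
windows = list(sliding_windows(100, 50, 40, 20, scale_step=2.0, stride_frac=0.5))
print(len(windows), windows[0], windows[-1])
```

The number of windows grows quickly as the stride and scale step shrink, which is why the cost of the per-window classifier dominates this family of methods.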

Methods
Baselines: Edinburgh
  – Detection confidence
    Class prior probability
    Whole-image classifier (SIFT + logistic regression)
  – Bounding box
    Entire image
    Scale-normalized mean bounding box from training data
    Bounding box of all interest points
    Bounding box of interest points weighted by ‘class purity’
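Two of the simpler bounding-box baselines listed above can be sketched in a few lines. The box convention (xmin, ymin, xmax, ymax) and the function names are assumptions made for illustration; the mean-box and purity-weighted variants are not specified in enough detail on the slide to reproduce here.

```python
def entire_image_box(img_w, img_h):
    """Baseline: predict the whole image as the object's bounding box."""
    return (0, 0, img_w, img_h)

def interest_point_box(points):
    """Baseline: tightest axis-aligned box containing all interest points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))

# Toy interest-point locations (assumed) as (x, y) pixel coordinates.
points = [(30, 40), (120, 35), (75, 90), (60, 20)]
box_all = entire_image_box(200, 150)
box_pts = interest_point_box(points)
print(box_all, box_pts)
```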

Evaluation
Correct detection: 50% overlap in bounding boxes
  – Multiple detections considered as (one true + ) false positives
Precision/Recall
  – Average Precision (AP) as defined by TREC
    Mean precision interpolated at recall = 0,0.1,…,0.9,1
[Figure: precision/recall curve, measured and interpolated]
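Both evaluation ingredients can be sketched briefly. The overlap measure below is intersection over union, which is an assumption (the slide says only "50% overlap"); the AP function follows the slide's description of averaging interpolated precision at the 11 recall levels, where the interpolated precision at recall r is taken as the highest measured precision at any recall of at least r.

```python
def overlap(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def average_precision_11pt(recalls, precisions):
    """Mean interpolated precision at recall = 0, 0.1, ..., 1."""
    total = 0.0
    for k in range(11):
        r = k / 10
        reachable = [p for rec, p in zip(recalls, precisions) if rec >= r]
        # Recall levels the detector never reaches contribute zero precision.
        total += max(reachable) if reachable else 0.0
    return total / 11

# Toy example (assumed): two ground-truth objects, detections ranked TP, FP, TP.
recalls = [0.5, 0.5, 1.0]
precisions = [1.0, 0.5, 2 / 3]
ap = average_precision_11pt(recalls, precisions)
iou_shifted = overlap((0, 0, 10, 10), (5, 0, 15, 10))
print(ap, iou_shifted)
```

In the toy case the shifted box overlaps the ground truth by one third, so under this overlap measure it would not count as a correct detection.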

Competition 5: train+val/test1 5.1: Motorbikes Max AP: 0.886 (Darmstadt)

Competition 5: train+val/test1 5.2: Bicycles Max AP: 0.119 (Edinburgh)

Competition 5: train+val/test1 5.3: People Max AP: 0.013 (INRIA: Dalal)

Competition 5: train+val/test1 5.4: Cars Max AP: 0.613 (INRIA: Dalal)

Competition 6: train+val/test2 6.1: Motorbikes Max AP: 0.341 (Darmstadt)

Competition 6: train+val/test2 6.2: Bicycles Max AP: 0.113 (Edinburgh)

Competition 6: train+val/test2 6.3: People Max AP: 0.021 (INRIA: Dalal)

Competition 6: train+val/test2 6.4: Cars Max AP: 0.304 (INRIA: Dalal)

Classes and test1 vs. test2 Mean AP of ‘best’ results across classes –test1 : 0.408, test2 : 0.195
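As a check, the two means follow directly from the per-class maxima quoted on the Competition 5 and Competition 6 slides (motorbikes, bicycles, people, cars, in that order):

```latex
\[
\overline{\mathrm{AP}}_{\text{test1}}
  = \frac{0.886 + 0.119 + 0.013 + 0.613}{4}
  = \frac{1.631}{4} \approx 0.408,
\qquad
\overline{\mathrm{AP}}_{\text{test2}}
  = \frac{0.341 + 0.113 + 0.021 + 0.304}{4}
  = \frac{0.779}{4} \approx 0.195
\]
```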

Conclusions?
GHT (Darmstadt) method did ‘best’ on classes entered
  – SVM verification stage effective
  – Limited to lower recall (by use of only side views)
SVM (INRIA: Dalal) comparable for cars, better on test2
  – Smaller objects?, higher recall
Performance on bicycles, people was ‘poor’
  – “Non-solid” objects, articulation?

Competition 7: any train/test1
One entry: 7.3: people (INRIA: Dalal)
AP: 0.416
Use of own training data improved results dramatically (AP with supplied training data: 0.013)

Competition 8: any train/test2
One entry: 8.3: people (INRIA: Dalal)
AP: 0.438
Use of own training data improved results dramatically (AP with supplied training data: 0.021)

Conclusions
Classification
  – Variety of methods and variations on SIFT+SVM
  – Encouraging performance on all object classes
Detection
  – Variety of methods and variations on GHT
  – Encouraging performance on cars, motorbikes
    People and bicycles more challenging
Use of own training data
  – Only one entry (people detection), much better results than using provided training data
  – State-of-the-art performance for pre-built classification/detection remains to be assessed
