WORD-PREDICTION AS A TOOL TO EVALUATE LOW-LEVEL VISION PROCESSES
Prasad Gabbur, Kobus Barnard, University of Arizona


1 WORD-PREDICTION AS A TOOL TO EVALUATE LOW-LEVEL VISION PROCESSES Prasad Gabbur, Kobus Barnard University of Arizona

2 Overview
- Word-prediction using a translation model for object recognition
- Feature evaluation
- Segmentation evaluation
  - Modifications to the Normalized Cuts segmentation algorithm
- Evaluation of color constancy algorithms
  - Effects of illumination color change on object recognition
  - Strategies to deal with illumination color change

3 Motivation
- Low-level computer vision algorithms
  - Segmentation, edge detection, feature extraction, etc.
  - Building blocks of computer vision systems
- Is there a generic task to evaluate these algorithms quantitatively?
- Word-prediction using a translation model for object recognition
  - Sufficiently general
  - Quantitative evaluation is possible

4 Translation model for object recognition
Translate from a visual description to a semantic description.

5 Approach
Model the joint probability distribution of visual representations and associated words using a large, annotated image collection (the Corel database).

6 Image pre-processing
Each image is segmented* into regions, and visual features [f_1 f_2 f_3 ... f_N] are computed per region; together with the image's keywords (e.g., "sun", "sky", "waves", "sea") these feed the joint distribution.
* Thanks to the N-cuts team [Shi, Tal, Malik] for their segmentation algorithm.

7 Joint visual/textual concepts*
Each latent node l generates both words and blobs (regions):
- P(w|l): frequency table over words
- P(b|l): Gaussian over blob features
- Learn P(w|l), P(b|l), and P(l) from data using EM
* Barnard et al., JMLR 2003
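Spelled out, the mixture structure described on this slide factors the joint distribution over the latent nodes l:

```latex
P(w, b) = \sum_{l} P(l)\, P(w \mid l)\, P(b \mid l)
```

with P(w|l) a frequency table, P(b|l) a Gaussian over region features, and all three distributions fit by EM on the annotated collection.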

8 Annotating images
- Segment the image into blobs b_1, b_2, ...
- Compute P(w|b_i) for each region
- Sum over regions: P(w|b_1) + P(w|b_2) + ... gives P(w|image)
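The sum-over-regions step above can be sketched in a few lines; the function name and dict-based representation of each P(w|b_i) are illustrative, not from the original system.

```python
def annotate_image(region_word_posteriors):
    """Combine per-region word posteriors P(w|b_i) into an image-level
    distribution P(w|image) by summing over regions and renormalizing."""
    totals = {}
    for posterior in region_word_posteriors:
        for word, p in posterior.items():
            totals[word] = totals.get(word, 0.0) + p
    z = sum(totals.values())
    return {w: p / z for w, p in totals.items()}
```

Words supported by several regions accumulate mass, so they dominate the image-level annotation.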

9 Measuring performance
Predicted words: CAT TIGER GRASS FOREST
Actual keywords: CAT HORSE GRASS WATER
- Record percent correct
- Use annotation performance as a proxy for recognition
  - Large region-labeled databases are not available
  - Large annotated databases are available
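The slide does not give the exact scoring formula; one natural reading of "record percent correct", with a hypothetical function name, is the fraction of predicted words that appear among the actual keywords:

```python
def percent_correct(predicted, actual):
    """Score an annotation: percentage of predicted words that occur
    among the image's actual keywords (order-insensitive)."""
    actual = set(actual)
    hits = sum(1 for w in predicted if w in actual)
    return 100.0 * hits / len(predicted)
```

On the slide's example (CAT and GRASS match, TIGER and FOREST do not), this scores 50%.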

10 Experimental protocol (sampling scheme)
- Corel database: each CD contains 100 images on one specific topic, like "aircraft"
- 160 CDs: 75% of images for training, 25% for test; 80 further CDs held out as novel
- Average results over 10 different samplings
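One sampling of this protocol can be simulated as follows; the function and the 100-images-per-CD enumeration are illustrative (the slide gives only the counts and split fractions).

```python
import random

def sample_split(cd_ids, n_train_cds=160, n_novel_cds=80,
                 train_frac=0.75, images_per_cd=100, seed=0):
    """Draw one sampling: choose training CDs, split their images
    75/25 into train/test, and hold out further CDs as novel."""
    rng = random.Random(seed)
    cds = list(cd_ids)
    rng.shuffle(cds)
    train_cds = cds[:n_train_cds]
    novel_cds = cds[n_train_cds:n_train_cds + n_novel_cds]
    train, test = [], []
    for cd in train_cds:
        images = [(cd, i) for i in range(images_per_cd)]
        rng.shuffle(images)
        cut = int(train_frac * len(images))
        train += images[:cut]
        test += images[cut:]
    novel = [(cd, i) for cd in novel_cds for i in range(images_per_cd)]
    return train, test, novel
```

Averaging results over 10 seeds reproduces the "10 different samplings" step.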

11 Semantic evaluation of vision processes
- Feature sets: combinations of visual features
- Segmentation methods
  - Mean Shift [Comaniciu, Meer]
  - Normalized Cuts [Shi, Tal, Malik]
- Color constancy algorithms
  - Train with illumination change
  - Color constancy processing: gray-world, scale-by-max

12 Feature evaluation
Features:
- Size
- Location
- Shape: second moment, compactness, convexity, outer boundary descriptor
- Color (RGB, L*a*b, rgS): average color, standard deviation
- Texture: responses to a bank of filters (even- and odd-symmetric, rotationally symmetric (DoG))
- Context (average surrounding color)

13 Feature evaluation
Base feature set = Size + Location + Second moment + Compactness
[Chart: annotation performance (bigger is better) for each feature combination]

14 Segmentation evaluation
- Mean Shift [Comaniciu, Meer]
- Normalized Cuts (N-Cuts) [Shi, Tal, Malik]

15 Segmentation evaluation
- Performance depends on the number of regions used for annotation
- Mean Shift is better than N-Cuts for # regions < 6
[Chart: annotation performance (bigger is better) vs. # regions]

16 Normalized Cuts
- Graph partitioning technique: bi-partitions an edge-weighted graph in an optimal sense
- Normalized cut (Ncut) is the optimizing criterion: minimize Ncut(A, B) over bi-partitions (A, B) of the nodes
- Edge weight w_ij encodes the similarity between nodes i and j
- Image segmentation: each pixel is a node; edge weight is the similarity between pixels, based on color, texture, and contour cues
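The Ncut criterion named on this slide, in the standard form of Shi and Malik, is:

```latex
\mathrm{Ncut}(A, B) = \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(A, V)} + \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(B, V)},
\qquad
\mathrm{cut}(A, B) = \sum_{i \in A,\; j \in B} w_{ij},
\qquad
\mathrm{assoc}(A, V) = \sum_{i \in A,\; j \in V} w_{ij}
```

Normalizing the cut by each side's total association discourages cutting off small isolated sets of nodes, which a plain minimum cut would favor.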

17 Normalized Cuts: original algorithm
- Pixel-level pre-segmentation, then region-level segmentation (preseg, initial seg, final seg)
- Produces splits in homogeneous regions, e.g., "sky", because of the local connectivity between pixels

18 Modifications to Normalized Cuts: meta-segmentation
[Figure: original vs. modified algorithm; the modified version iterates segmentation at the region level (preseg, iteration 1, ..., iteration n)]

19 Modifications to Normalized Cuts
[Figure: example segmentations, original vs. modified]

20 Original vs. modified
- For # regions < 6, the modified algorithm outperforms the original
- For # regions > 6, the original is better
[Chart: annotation performance (bigger is better) vs. # regions]

21 Incorporating high-level information into segmentation algorithms
- Low-level segmenters split up objects (e.g., the black and white halves of a penguin)
- Word-prediction gives a way to incorporate high-level semantic information into segmentation algorithms
- Propose a merge between regions that have similar posterior distributions over words
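The merge proposal above can be sketched as follows; the choice of symmetric KL divergence as the similarity measure and the threshold value are assumptions for illustration, not details given on the slide.

```python
import math

def kl(p, q, eps=1e-9):
    """KL divergence between two word distributions given as dicts;
    eps smooths words missing from either distribution."""
    words = set(p) | set(q)
    return sum((p.get(w, 0.0) + eps) *
               math.log((p.get(w, 0.0) + eps) / (q.get(w, 0.0) + eps))
               for w in words)

def propose_merges(posteriors, threshold=0.1):
    """Propose merging region pairs whose posterior word distributions
    are close under symmetric KL divergence."""
    pairs = []
    for i in range(len(posteriors)):
        for j in range(i + 1, len(posteriors)):
            d = 0.5 * (kl(posteriors[i], posteriors[j]) +
                       kl(posteriors[j], posteriors[i]))
            if d < threshold:
                pairs.append((i, j))
    return pairs
```

In the penguin example, the black and white halves both predict "penguin" strongly, so their posteriors are close and the pair is proposed for merging even though their low-level color features differ.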

22 Illumination change
- Illumination color change makes recognition difficult (example images* under illuminant 1 vs. illuminant 2)
- Strategies to deal with illumination change:
  - Train for illumination change
  - Color constancy pre-processing and normalization
* Images from http://www.cs.sfu.ca/~colour/data

23 Training
Train for illumination change: capture the variation of color under expected illumination changes [Matas et al. 1994, Matas 1996, Matas et al. 2000]

24 Color constancy pre-processing [Funt et al. 1998]
- Training database: images under a canonical (reference) illuminant
- Test input: unknown illuminant; the color constancy algorithm maps the image as if it were taken under the reference illuminant before it enters the recognition system

25 Color normalization [Funt and Finlayson 1995, Finlayson et al. 1998]
- The normalization algorithm is applied to both the training database (canonical illuminant) and the test input (unknown illuminant) before recognition

26 Simulating illumination change
11 illuminants (illuminant 0 is canonical)
[Figure: the same scene rendered under illuminants 0 through 10]
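A common way to simulate an illumination color change, assuming a diagonal (von Kries style) model, is to scale each RGB channel independently; this is a sketch under that assumption, not necessarily the exact rendering used in the talk (the illuminant data linked earlier is from the SFU colour dataset).

```python
import numpy as np

def apply_illuminant(image, diag):
    """Simulate an illumination color change with a diagonal model:
    scale the R, G, B channels by per-channel factors and clip to
    the valid 8-bit range."""
    out = image.astype(np.float64) * np.asarray(diag, dtype=np.float64)
    return np.clip(out, 0.0, 255.0)
```

Scaling with, say, (1.2, 1.0, 0.8) shifts the scene toward a warmer illuminant while leaving scene content unchanged.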

27 Train with illumination variation
- Experiment A: training with no illumination change, testing with no illumination change
- Experiment B: training with no illumination change, testing with illumination change
- Experiment C: training with illumination change, testing with illumination change
[Chart: annotation performance (bigger is better) for each experiment]

28 Color constancy pre-processing: gray-world
- Assumes the mean color of a scene is constant
- Training images: canonical illuminant; test images: unknown illuminant, mapped to the canonical illuminant by the algorithm

29 Color constancy pre-processing: scale-by-max
- Assumes the maximum response in each color channel is constant
- Training images: canonical illuminant; test images: unknown illuminant, mapped to the canonical illuminant by the algorithm
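The two algorithms on these slides reduce to simple per-channel scalings; this is a minimal sketch of both, assuming float RGB images (exact gain conventions may differ from the talk's implementation).

```python
import numpy as np

def gray_world(image):
    """Gray-world: scale each channel so its mean matches the overall
    mean intensity (mean color -> constant gray)."""
    img = image.astype(np.float64)
    means = img.reshape(-1, 3).mean(axis=0)
    return img * (means.mean() / means)

def scale_by_max(image):
    """Scale-by-max: divide each channel by its maximum value
    (max color -> constant white)."""
    img = image.astype(np.float64)
    maxes = img.reshape(-1, 3).max(axis=0)
    return img / maxes
```

Gray-world relies on a scene-wide average assumption, while scale-by-max relies on a bright (near-white) patch being present, which is one reason the two behave differently in the evaluation.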

30 Color constancy pre-processing: results
- Experiment A: training with no illumination change, testing with no illumination change
- Experiment B: training with no illumination change, testing with illumination change
- Others: training with no illumination change, testing with illumination change + color constancy algorithm
[Chart: annotation performance (bigger is better)]

31 Color normalization
- Gray-world (mean color = constant) and scale-by-max (max color = constant)
- The normalization algorithm is applied to both training images (canonical illuminant) and test images (unknown illuminant)

32 Color normalization: results
- Experiment A: training with no illumination change, testing with no illumination change
- Experiment B: training with no illumination change, testing with illumination change
- Others: training with no illumination change + color constancy algorithm, testing with illumination change + color constancy algorithm
[Chart: annotation performance (bigger is better)]

33 Conclusions
- Translation (visual to semantic) model for object recognition: identify and evaluate low-level vision processes for recognition
- Feature evaluation
  - Color and texture are the most important, in that order
  - Shape needs better segmentation methods
- Segmentation evaluation
  - Performance depends on the # regions used for annotation
  - Mean Shift and modified N-Cuts do better than original N-Cuts for # regions < 6
- Color constancy evaluation
  - Training with illumination change helps
  - Color constancy processing helps (scale-by-max better than gray-world)

34 Thank you!

