WORD-PREDICTION AS A TOOL TO EVALUATE LOW-LEVEL VISION PROCESSES
Prasad Gabbur, Kobus Barnard, University of Arizona


1 WORD-PREDICTION AS A TOOL TO EVALUATE LOW-LEVEL VISION PROCESSES Prasad Gabbur, Kobus Barnard University of Arizona

2 Overview
- Word-prediction using a translation model for object recognition
- Feature evaluation
- Segmentation evaluation
  - Modifications to the Normalized Cuts segmentation algorithm
- Evaluation of color constancy algorithms
  - Effects of illumination color change on object recognition
  - Strategies to deal with illumination color change

3 Motivation
- Low-level computer vision algorithms
  - Segmentation, edge detection, feature extraction, etc.
  - Building blocks of computer vision systems
- Is there a generic task to evaluate these algorithms quantitatively?
- Word-prediction using a translation model for object recognition
  - Sufficiently general
  - Quantitative evaluation is possible

4 Translation model for object recognition
Translate from a visual description to a semantic description.

5 Approach
Model the joint probability distribution of visual representations and associated words using a large, annotated image collection (the Corel database).

6 Image pre-processing
Each image is segmented* into regions, and visual features [f_1 f_2 f_3 ... f_N] are computed per region; together with the image's keywords (e.g., "sun", "sky", "waves", "sea") these feed the joint distribution.
* Thanks to the N-cuts team [Shi, Tal, Malik] for their segmentation algorithm.

7 Joint visual/textual concepts*
Each latent node l generates both words and blobs (regions):
- P(w|l): frequency table over words
- P(b|l): Gaussian over blob features
- Learn P(w|l), P(b|l), and P(l) from data using EM
* Barnard et al., JMLR 2003
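Spelled out, the mixture structure described on this slide factors the joint distribution over the latent nodes l:

```latex
P(w, b) = \sum_{l} P(l)\, P(w \mid l)\, P(b \mid l)
```

with P(w|l) a frequency table, P(b|l) a Gaussian over region features, and all three distributions fit by EM on the annotated collection.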

8 Annotating images
- Segment the image into blobs b_1, b_2, ...
- Compute P(w|b_i) for each region
- Sum over regions: P(w|b_1) + P(w|b_2) + ... gives P(w|image)
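The sum-over-regions step above can be sketched in a few lines; the function name and dict-based representation of each P(w|b_i) are illustrative, not from the original system.

```python
def annotate_image(region_word_posteriors):
    """Combine per-region word posteriors P(w|b_i) into an image-level
    distribution P(w|image) by summing over regions and renormalizing."""
    totals = {}
    for posterior in region_word_posteriors:
        for word, p in posterior.items():
            totals[word] = totals.get(word, 0.0) + p
    z = sum(totals.values())
    return {w: p / z for w, p in totals.items()}
```

Words supported by several regions accumulate mass, so they dominate the image-level annotation.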

9 Measuring performance
Predicted words: CAT TIGER GRASS FOREST
Actual keywords: CAT HORSE GRASS WATER
- Record percent correct
- Use annotation performance as a proxy for recognition
  - Large region-labeled databases are not available
  - Large annotated databases are available
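The slide does not give the exact scoring formula; one natural reading of "record percent correct", with a hypothetical function name, is the fraction of predicted words that appear among the actual keywords:

```python
def percent_correct(predicted, actual):
    """Score an annotation: percentage of predicted words that occur
    among the image's actual keywords (order-insensitive)."""
    actual = set(actual)
    hits = sum(1 for w in predicted if w in actual)
    return 100.0 * hits / len(predicted)
```

On the slide's example (CAT and GRASS match, TIGER and FOREST do not), this scores 50%.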

10 Experimental protocol (sampling scheme)
- Corel database: each CD contains 100 images on one specific topic, like "aircraft"
- 160 CDs: 75% of images for training, 25% for test; 80 further CDs held out as novel
- Average results over 10 different samplings
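One sampling of this protocol can be simulated as follows; the function and the 100-images-per-CD enumeration are illustrative (the slide gives only the counts and split fractions).

```python
import random

def sample_split(cd_ids, n_train_cds=160, n_novel_cds=80,
                 train_frac=0.75, images_per_cd=100, seed=0):
    """Draw one sampling: choose training CDs, split their images
    75/25 into train/test, and hold out further CDs as novel."""
    rng = random.Random(seed)
    cds = list(cd_ids)
    rng.shuffle(cds)
    train_cds = cds[:n_train_cds]
    novel_cds = cds[n_train_cds:n_train_cds + n_novel_cds]
    train, test = [], []
    for cd in train_cds:
        images = [(cd, i) for i in range(images_per_cd)]
        rng.shuffle(images)
        cut = int(train_frac * len(images))
        train += images[:cut]
        test += images[cut:]
    novel = [(cd, i) for cd in novel_cds for i in range(images_per_cd)]
    return train, test, novel
```

Averaging results over 10 seeds reproduces the "10 different samplings" step.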

11 Semantic evaluation of vision processes
- Feature sets: combinations of visual features
- Segmentation methods
  - Mean Shift [Comaniciu, Meer]
  - Normalized Cuts [Shi, Tal, Malik]
- Color constancy algorithms
  - Train with illumination change
  - Color constancy processing: gray-world, scale-by-max

12 Feature evaluation
Features:
- Size
- Location
- Shape: second moment, compactness, convexity, outer boundary descriptor
- Color (RGB, L*a*b, rgS): average color, standard deviation
- Texture: responses to a bank of filters (even- and odd-symmetric, rotationally symmetric (DoG))
- Context (average surrounding color)

13 Feature evaluation
Base feature set = Size + Location + Second moment + Compactness
[Chart: annotation performance (bigger is better) for each feature combination]

14 Segmentation evaluation
- Mean Shift [Comaniciu, Meer]
- Normalized Cuts (N-Cuts) [Shi, Tal, Malik]

15 Segmentation evaluation
- Performance depends on the number of regions used for annotation
- Mean Shift is better than N-Cuts for # regions < 6
[Chart: annotation performance (bigger is better) vs. # regions]

16 Normalized Cuts
- Graph partitioning technique: bi-partitions an edge-weighted graph in an optimal sense
- Normalized cut (Ncut) is the optimizing criterion: minimize Ncut(A, B) over bi-partitions (A, B) of the nodes
- Edge weight w_ij encodes the similarity between nodes i and j
- Image segmentation: each pixel is a node; edge weight is the similarity between pixels, based on color, texture, and contour cues
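The Ncut criterion named on this slide, in the standard form of Shi and Malik, is:

```latex
\mathrm{Ncut}(A, B) = \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(A, V)} + \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(B, V)},
\qquad
\mathrm{cut}(A, B) = \sum_{i \in A,\; j \in B} w_{ij},
\qquad
\mathrm{assoc}(A, V) = \sum_{i \in A,\; j \in V} w_{ij}
```

Normalizing the cut by each side's total association discourages cutting off small isolated sets of nodes, which a plain minimum cut would favor.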

17 Normalized Cuts: original algorithm
- Pixel-level pre-segmentation, then region-level segmentation (preseg, initial seg, final seg)
- Produces splits in homogeneous regions, e.g., "sky", because of the local connectivity between pixels

18 Modifications to Normalized Cuts: meta-segmentation
[Figure: original vs. modified algorithm; the modified version iterates segmentation at the region level (preseg, iteration 1, ..., iteration n)]

19 Modifications to Normalized Cuts
[Figure: example segmentations, original vs. modified]

20 Original vs. modified
- For # regions < 6, the modified algorithm outperforms the original
- For # regions > 6, the original is better
[Chart: annotation performance (bigger is better) vs. # regions]

21 Incorporating high-level information into segmentation algorithms
- Low-level segmenters split up objects (e.g., the black and white halves of a penguin)
- Word-prediction gives a way to incorporate high-level semantic information into segmentation algorithms
- Propose a merge between regions that have similar posterior distributions over words
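The merge proposal above can be sketched as follows; the choice of symmetric KL divergence as the similarity measure and the threshold value are assumptions for illustration, not details given on the slide.

```python
import math

def kl(p, q, eps=1e-9):
    """KL divergence between two word distributions given as dicts;
    eps smooths words missing from either distribution."""
    words = set(p) | set(q)
    return sum((p.get(w, 0.0) + eps) *
               math.log((p.get(w, 0.0) + eps) / (q.get(w, 0.0) + eps))
               for w in words)

def propose_merges(posteriors, threshold=0.1):
    """Propose merging region pairs whose posterior word distributions
    are close under symmetric KL divergence."""
    pairs = []
    for i in range(len(posteriors)):
        for j in range(i + 1, len(posteriors)):
            d = 0.5 * (kl(posteriors[i], posteriors[j]) +
                       kl(posteriors[j], posteriors[i]))
            if d < threshold:
                pairs.append((i, j))
    return pairs
```

In the penguin example, the black and white halves both predict "penguin" strongly, so their posteriors are close and the pair is proposed for merging even though their low-level color features differ.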

22 Illumination change
- Illumination color change makes recognition difficult (example images* under illuminant 1 vs. illuminant 2)
- Strategies to deal with illumination change:
  - Train for illumination change
  - Color constancy pre-processing and normalization
* Images from http://www.cs.sfu.ca/~colour/data

23 Training
Train for illumination change: capture the variation of color under expected illumination changes [Matas et al. 1994, Matas 1996, Matas et al. 2000]

24 Color constancy pre-processing [Funt et al. 1998]
- Training database: images under a canonical (reference) illuminant
- Test input: unknown illuminant; the color constancy algorithm maps the image as if it were taken under the reference illuminant before it enters the recognition system

25 Color normalization [Funt and Finlayson 1995, Finlayson et al. 1998]
- The normalization algorithm is applied to both the training database (canonical illuminant) and the test input (unknown illuminant) before recognition

26 Simulating illumination change
11 illuminants (illuminant 0 is canonical)
[Figure: the same scene rendered under illuminants 0 through 10]
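A common way to simulate an illumination color change, assuming a diagonal (von Kries style) model, is to scale each RGB channel independently; this is a sketch under that assumption, not necessarily the exact rendering used in the talk (the illuminant data linked earlier is from the SFU colour dataset).

```python
import numpy as np

def apply_illuminant(image, diag):
    """Simulate an illumination color change with a diagonal model:
    scale the R, G, B channels by per-channel factors and clip to
    the valid 8-bit range."""
    out = image.astype(np.float64) * np.asarray(diag, dtype=np.float64)
    return np.clip(out, 0.0, 255.0)
```

Scaling with, say, (1.2, 1.0, 0.8) shifts the scene toward a warmer illuminant while leaving scene content unchanged.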

27 Train with illumination variation
- Experiment A: training with no illumination change, testing with no illumination change
- Experiment B: training with no illumination change, testing with illumination change
- Experiment C: training with illumination change, testing with illumination change
[Chart: annotation performance (bigger is better) for each experiment]

28 Color constancy pre-processing: gray-world
- Assumes the mean color of a scene is constant
- Training images: canonical illuminant; test images: unknown illuminant, mapped to the canonical illuminant by the algorithm

29 Color constancy pre-processing: scale-by-max
- Assumes the maximum response in each color channel is constant
- Training images: canonical illuminant; test images: unknown illuminant, mapped to the canonical illuminant by the algorithm
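The two algorithms on these slides reduce to simple per-channel scalings; this is a minimal sketch of both, assuming float RGB images (exact gain conventions may differ from the talk's implementation).

```python
import numpy as np

def gray_world(image):
    """Gray-world: scale each channel so its mean matches the overall
    mean intensity (mean color -> constant gray)."""
    img = image.astype(np.float64)
    means = img.reshape(-1, 3).mean(axis=0)
    return img * (means.mean() / means)

def scale_by_max(image):
    """Scale-by-max: divide each channel by its maximum value
    (max color -> constant white)."""
    img = image.astype(np.float64)
    maxes = img.reshape(-1, 3).max(axis=0)
    return img / maxes
```

Gray-world relies on a scene-wide average assumption, while scale-by-max relies on a bright (near-white) patch being present, which is one reason the two behave differently in the evaluation.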

30 Color constancy pre-processing: results
- Experiment A: training with no illumination change, testing with no illumination change
- Experiment B: training with no illumination change, testing with illumination change
- Others: training with no illumination change, testing with illumination change + color constancy algorithm
[Chart: annotation performance (bigger is better)]

31 Color normalization
- Gray-world (mean color = constant) and scale-by-max (max color = constant)
- The normalization algorithm is applied to both training images (canonical illuminant) and test images (unknown illuminant)

32 Color normalization: results
- Experiment A: training with no illumination change, testing with no illumination change
- Experiment B: training with no illumination change, testing with illumination change
- Others: training with no illumination change + color constancy algorithm, testing with illumination change + color constancy algorithm
[Chart: annotation performance (bigger is better)]

33 Conclusions
- Translation (visual to semantic) model for object recognition: identify and evaluate low-level vision processes for recognition
- Feature evaluation
  - Color and texture are the most important, in that order
  - Shape needs better segmentation methods
- Segmentation evaluation
  - Performance depends on the # regions used for annotation
  - Mean Shift and modified N-Cuts do better than original N-Cuts for # regions < 6
- Color constancy evaluation
  - Training with illumination change helps
  - Color constancy processing helps (scale-by-max better than gray-world)

34 Thank you!

