LARGE-SCALE NONPARAMETRIC IMAGE PARSING
Joseph Tighe and Svetlana Lazebnik
University of North Carolina at Chapel Hill
CVPR 2011 Workshop on Large-Scale Learning for Vision

Small-scale image parsing
Tens of classes, hundreds of images
He et al. (2004), Hoiem et al. (2005), Shotton et al. (2006, 2008, 2009), Verbeek and Triggs (2007), Rabinovich et al. (2007), Galleguillos et al. (2008), Gould et al. (2009), etc.
Figure from Shotton et al. (2009)

Large-scale image parsing
Hundreds of classes, tens of thousands of images
Non-uniform class frequencies
Evolving training set: http://labelme.csail.mit.edu/

Challenges
• What’s considered important for small-scale image parsing?
  • Combination of local cues
  • Multiple segmentations, multiple scales
  • Context
  • Graphical model inference (CRFs, etc.)
• How much of this is feasible for large-scale, dynamic datasets?

Our first attempt: A nonparametric approach
• Lazy learning: do (almost) nothing at training time
• At test time:
  • Find a retrieval set of similar images for each query image
  • Transfer labels from the retrieval set by matching segmentation regions (superpixels)
• Related work: SIFT Flow (Liu et al. 2008, 2009)

Step 1: Scene-level matching
Gist (Oliva & Torralba, 2001)
Spatial Pyramid (Lazebnik et al., 2006)
Color Histogram
Retrieval set:
• Source of possible labels
• Source of region-level matches
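
As a rough illustration of Step 1, the sketch below ranks training images by distance to the query under each global descriptor and keeps the best-ranked ones as the retrieval set. The slides name the descriptors but not how they are combined, so the Euclidean distance, the rank-sum rule, the toy 2-D descriptors, and the function name `retrieval_set` are all assumptions made for illustration.

```python
import math


def retrieval_set(query, training, size):
    """Return the ids of the `size` training images closest to the query.

    query:    dict mapping descriptor name -> vector
    training: dict mapping image id -> dict of the same descriptors
    Combination rule (an assumption): sum of per-descriptor ranks.
    """
    ids = sorted(training)
    rank_sum = {image_id: 0 for image_id in ids}
    for name, query_vector in query.items():
        # Rank all training images by Euclidean distance for this descriptor.
        ordered = sorted(
            ids, key=lambda i: math.dist(query_vector, training[i][name])
        )
        for rank, image_id in enumerate(ordered):
            rank_sum[image_id] += rank
    # Smaller rank sum means more similar across descriptors.
    return sorted(ids, key=lambda i: rank_sum[i])[:size]


# Toy descriptors (hypothetical values, two dimensions each).
toy_query = {"gist": [0.0, 0.0], "color": [1.0, 0.0]}
toy_training = {
    "street_a": {"gist": [0.1, 0.0], "color": [0.9, 0.1]},
    "street_b": {"gist": [0.2, 0.1], "color": [1.0, 0.2]},
    "beach": {"gist": [0.9, 0.8], "color": [0.0, 1.0]},
    "forest": {"gist": [0.5, 0.9], "color": [0.2, 0.8]},
}

print(retrieval_set(toy_query, toy_training, 2))
```

Only the labels that occur in the returned images would then be considered for the query, which is what makes the retrieval set a "source of possible labels".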

Step 2: Region-level matching
Superpixels (Felzenszwalb & Huttenlocher, 2004)
Superpixel features:
• Pixel area (size) [Figure labels: snow, road, tree, building, sky]
• Absolute mask (location) [Figure labels: road, sidewalk]
• Texture [Figure labels: road, sky, snow, sidewalk]
• Color histogram [Figure labels: building, sidewalk, road]
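
To make three of the superpixel features concrete, here is a minimal sketch that computes pixel area, an absolute location mask, and a color histogram for one region of a toy image. The slides do not specify feature dimensions or normalization, so the relative-area normalization, the full-resolution mask, the per-channel histogram with `bins` bins, and the name `region_features` are assumptions; texture is omitted.

```python
def region_features(label_map, image, region_id, bins=2):
    """Compute simple features for one superpixel.

    label_map: 2-D list of superpixel ids, one per pixel
    image:     2-D list of (r, g, b) tuples with values in 0..255
    Assumes the region contains at least one pixel.
    """
    height, width = len(label_map), len(label_map[0])
    # Absolute mask: 1 where the pixel belongs to the region, else 0.
    mask = [
        [1 if label_map[y][x] == region_id else 0 for x in range(width)]
        for y in range(height)
    ]
    pixel_count = sum(map(sum, mask))
    # Per-channel color histogram, concatenated as [R bins, G bins, B bins].
    hist = [0] * (3 * bins)
    for y in range(height):
        for x in range(width):
            if mask[y][x]:
                for channel, value in enumerate(image[y][x]):
                    hist[channel * bins + min(value * bins // 256, bins - 1)] += 1
    return {
        "area": pixel_count / (height * width),  # fraction of the image
        "mask": mask,
        "color_hist": [count / pixel_count for count in hist],
    }


# Toy 2x3 image with two superpixels (hypothetical values).
toy_labels = [[0, 0, 1], [0, 1, 1]]
blue, gray = (10, 20, 250), (200, 200, 200)
toy_image = [[blue, blue, gray], [blue, gray, gray]]

features = region_features(toy_labels, toy_image, region_id=1)
print(features["area"], features["color_hist"])
```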

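Region-level likelihoods
• Nonparametric estimate of class-conditional densities for each class c and feature type k:
  $$\hat{P}(f_i^k \mid c) = \frac{\#\{\text{features of class } c \text{ within some radius of } r_i\}}{\#\{\text{total features of class } c \text{ in the dataset}\}}$$
  where $f_i^k$ is the kth feature type of the ith region $r_i$.
• Per-feature likelihoods combined via Naïve Bayes:
  $$\hat{P}(r_i \mid c) = \prod_k \hat{P}(f_i^k \mid c)$$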
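
The counting estimate described on this slide can be sketched in a few lines: for each feature type, take the fraction of class-c training features that fall within a radius of the query region's feature, then multiply across feature types. The Euclidean distance, the per-feature radii, the toy 1-D features, and the function names are assumptions for illustration only.

```python
import math


def class_likelihood(query_feature, features, labels, label, radius):
    """Fraction of training features of class `label` within `radius` of the query."""
    total = 0
    nearby = 0
    for feature, feature_label in zip(features, labels):
        if feature_label != label:
            continue
        total += 1
        if math.dist(query_feature, feature) <= radius:
            nearby += 1
    return nearby / total if total else 0.0


def naive_bayes_likelihood(region, training, label, radii):
    """Multiply the per-feature-type estimates (Naive Bayes combination).

    region:   dict feature name -> vector for the query region
    training: dict feature name -> (list of vectors, list of class labels)
    radii:    dict feature name -> radius used for that feature type
    """
    likelihood = 1.0
    for name, query_feature in region.items():
        features, labels = training[name]
        likelihood *= class_likelihood(
            query_feature, features, labels, label, radii[name]
        )
    return likelihood


# Toy training features (hypothetical 1-D values).
toy_train = {
    "color": ([[0.0], [0.3], [0.9], [1.0]], ["sky", "sky", "road", "road"]),
    "location": ([[0.1], [0.2], [0.8], [0.9]], ["sky", "sky", "road", "road"]),
}
toy_region = {"color": [0.05], "location": [0.15]}
toy_radii = {"color": 0.1, "location": 0.1}

for c in ("sky", "road"):
    print(c, naive_bayes_likelihood(toy_region, toy_train, c, toy_radii))
```

With raw counts, a class with no nearby features gets a likelihood of exactly zero; a practical version would probably smooth the counts, which this sketch does not attempt.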

Region-level likelihoods
[Figure panels: building, car, crosswalk, sky, window, road]

Step 3: Global image labeling
• Compute a global image labeling by optimizing a Markov random field (MRF) energy function over the vector of region labels. Terms annotated on the slide:
  • Likelihood score for region r_i and label c_i, over regions
  • Co-occurrence penalty and smoothing penalty, over neighboring regions r_i, r_j
• Efficient approximate minimization using α-expansion (Boykov et al., 2002)

Step 3: Global image labeling
[Figure panels: maximum likelihood labeling, edge penalties, final labeling, final edge penalties; label legends: road, building, car, window, sky / road, building, car, sky]

Step 3: Global image labeling
[Figure panels: original image, maximum likelihood labeling, edge penalties, MRF labeling; label legends: sky, tree, sand, road, sea / road, sky, sand, sea]
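
The following sketch shows how an energy of this kind can turn a noisy maximum likelihood labeling into a smoother one. It is not the method from the slides: the exact energy form is assumed (a per-region data cost plus a weighted pairwise penalty on neighboring regions), and the minimizer is iterated conditional modes (ICM), a simple greedy procedure swapped in for α-expansion. All names and numbers are hypothetical.

```python
def energy(labels, data_cost, edges, pair_penalty, smoothing):
    """Assumed energy: sum of data costs plus weighted pairwise penalties."""
    unary = sum(data_cost[i][labels[i]] for i in range(len(labels)))
    pairwise = sum(pair_penalty[labels[i]][labels[j]] for i, j in edges)
    return unary + smoothing * pairwise


def icm(data_cost, edges, pair_penalty, smoothing, sweeps=10):
    """Greedy minimization: relabel one region at a time while the energy drops."""
    # Start from the labeling that minimizes each data cost independently.
    labels = [min(cost, key=cost.get) for cost in data_cost]
    for _ in range(sweeps):
        changed = False
        for i in range(len(labels)):
            best = min(
                data_cost[i],
                key=lambda c: energy(
                    labels[:i] + [c] + labels[i + 1:],
                    data_cost, edges, pair_penalty, smoothing,
                ),
            )
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels


# Three regions in a chain; the middle one weakly prefers "sea".
toy_data_cost = [
    {"road": 0.1, "sea": 2.0},
    {"road": 1.0, "sea": 0.8},
    {"road": 0.1, "sea": 2.0},
]
toy_edges = [(0, 1), (1, 2)]
toy_penalty = {"road": {"road": 0.0, "sea": 1.0}, "sea": {"road": 1.0, "sea": 0.0}}

print(icm(toy_data_cost, toy_edges, toy_penalty, smoothing=1.0))
```

In this toy case the isolated "sea" region is relabeled "road" because the two pairwise penalties outweigh its small data-cost advantage. ICM can get stuck in local minima that α-expansion would escape, so it serves here only to keep the example short.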

Joint geometric/semantic labeling
• Semantic labels: road, grass, building, car, etc.
• Geometric labels: sky, vertical, horizontal
• Gould et al. (ICCV 2009)
[Figure panels: original image; semantic labeling (sky, tree, car, road); geometric labeling (sky, horizontal, vertical)]

Joint geometric/semantic labeling
• Objective function for joint labeling, defined over the semantic labels and the geometric labels, combines:
  • Cost of semantic labeling
  • Cost of geometric labeling
  • Geometric/semantic consistency penalty
[Figure panels: original image; semantic labeling (sky, tree, car, road); geometric labeling (sky, horizontal, vertical)]
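
A small sketch may help show what the consistency penalty does. Here each labeling cost is reduced to a per-region cost (no neighborhood terms), and a fixed penalty `weight` is added whenever a region's semantic and geometric labels disagree. The list of inconsistent pairs, the costs, and the function name are hypothetical; the slides do not give them.

```python
from itertools import product


def best_joint_labeling(semantic_cost, geometric_cost, inconsistent, weight):
    """Pick, per region, the (semantic, geometric) pair with the lowest joint cost.

    semantic_cost / geometric_cost: one dict (label -> cost) per region
    inconsistent: set of (semantic, geometric) pairs that incur the penalty
    """
    result = []
    for s_costs, g_costs in zip(semantic_cost, geometric_cost):
        best = min(
            product(s_costs, g_costs),
            key=lambda pair: s_costs[pair[0]]
            + g_costs[pair[1]]
            + (weight if pair in inconsistent else 0.0),
        )
        result.append(best)
    return result


# One region whose geometric classifier slightly prefers "vertical".
toy_semantic = [{"road": 0.2, "building": 0.9}]
toy_geometric = [{"horizontal": 0.6, "vertical": 0.5, "sky": 2.0}]
toy_inconsistent = {
    ("road", "vertical"), ("road", "sky"),
    ("building", "horizontal"), ("building", "sky"),
}

print(best_joint_labeling(toy_semantic, toy_geometric, toy_inconsistent, 0.0))
print(best_joint_labeling(toy_semantic, toy_geometric, toy_inconsistent, 1.0))
```

Without the penalty the region is labeled road/vertical; with it, the geometric label flips to horizontal so that the two labelings agree.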

Example of joint labeling

Understanding scenes on many levels To appear at ICCV 2011

Datasets

Dataset                             Training images   Test images   Labels
SIFT Flow (Liu et al., 2009)        2,488             200           33
Barcelona (Russell et al., 2007)    14,871            279           170
LabelMe+SUN                         50,424            300           232

Overall performance

               SIFT Flow              Barcelona              LabelMe + SUN
               Semantic      Geom.    Semantic      Geom.    Semantic      Geom.
Base           73.2 (29.1)   89.8     62.5 (8.0)    89.9     46.8 (10.7)   81.5
MRF            76.3 (28.8)   89.9     66.6 (7.6)    90.2     50.0 (9.1)    81.0
MRF + Joint    76.9 (29.4)   90.8     66.9 (7.6)    90.7     50.2 (10.5)   82.2

               LabelMe + SUN Indoor   LabelMe + SUN Outdoor
               Semantic      Geom.    Semantic      Geom.
Base           22.4 (9.5)    76.1     53.8 (11.0)   83.1
MRF            27.5 (6.5)    76.4     56.4 (8.6)    82.3
MRF + Joint    27.8 (9.0)    78.2     56.6 (10.8)   84.1

*SIFT Flow: 74.75

Per-class classification rates

Results on SIFT Flow dataset

Results on LM+SUN dataset
[Figure panels: image, ground truth, initial semantic, final semantic, final geometric; values shown: 55.3, 92.2, 93.6]

Results on LM+SUN dataset
[Figure panels: image, ground truth, initial semantic, final semantic, final geometric; values shown: 58.9, 93.0, 57.3]

Results on LM+SUN dataset
[Figure panels: image, ground truth, initial semantic, final semantic, final geometric; values shown: 11.6, 0.0, 60.3, 93.0]

Results on LM+SUN dataset
[Figure panels: image, ground truth, initial semantic, final semantic, final geometric; values shown: 65.6, 75.8, 87.7]

Running times
[Figure panels: SIFT Flow dataset, Barcelona dataset]

Conclusions
• Lessons learned
  • Can go pretty far with very little learning
  • Good local features, global (scene) context more important than neighborhood context
• What’s missing
  • A rich representation for scene understanding
  • The long tail
  • Scalable, dynamic learning
