Segmentation-Based Stereo. Michael Bleyer, LVA Stereo Vision.



What Happened Last Time?
- Once again, we looked at our energy function.
- We investigated the matching cost function m():
  - Standard measures: absolute/squared intensity differences; sampling-insensitive measures.
  - Radiometrically insensitive measures: mutual information, ZNCC, Census.
- The role of color.
- Segmentation-based aggregation methods.

What is Going to Happen Today?
- Occlusion handling in global stereo
- Segmentation-based matching
- The matting problem in stereo matching

Occlusion Handling in Global Stereo

There is Something Wrong with our Data Term
- Recall the data term: we compute the pixel dissimilarity m() for each pixel of the left image.
- As we know, not every pixel has a correspondence, i.e. there are occluded pixels.
- It does not make sense to compute the pixel dissimilarity for occluded pixels.

We Should Modify the Data Term
- In a more correct formulation, we incorporate occlusion information:

  E_data(d) = Σ_p [ (1 − O(p)) · m(p, d_p) + O(p) · P_occ ]

  where
  - O(p) is a function that returns 1 if p is occluded and 0 otherwise.
  - P_occ is a constant penalty for occluded pixels (the occlusion penalty).
- Idea: we measure the pixel dissimilarity if the pixel is not occluded, and impose the occlusion penalty if it is.
- Why do we need the occlusion penalty? Without it, declaring all pixels occluded would represent a trivial energy optimum (the data costs would be equal to 0).

How can we define the occlusion function O()?
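The occlusion-aware data term can be sketched in a few lines of NumPy (a minimal sketch: the function name, array shapes, and the value of P_occ are illustrative assumptions, and `dissim` stands in for any matching cost m(p, d_p)):

```python
import numpy as np

def occlusion_aware_data_cost(dissim, occluded, p_occ=20.0):
    """Occlusion-aware data term: sum the pixel dissimilarity where a pixel
    is visible, and a constant penalty P_occ where it is occluded.

    dissim   -- per-pixel matching cost m(p, d_p), shape (H, W)
    occluded -- boolean occlusion map O(p), shape (H, W)
    p_occ    -- occlusion penalty (assumed value)
    """
    return float(np.where(occluded, p_occ, dissim).sum())

# Toy example: a 2x2 image with one occluded pixel.
dissim = np.array([[1.0, 2.0], [3.0, 4.0]])
occluded = np.array([[False, True], [False, False]])
cost = occlusion_aware_data_cost(dissim, occluded, p_occ=20.0)
# 1 + 20 + 3 + 4 = 28
```

Note how setting `p_occ=0` would reproduce the trivial optimum from the slide: labeling every pixel occluded would then cost nothing.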

Occlusion Function
- Let us assume we have two surfaces in the left image.
- We know their disparity values.
[Figure: disparity plotted over x-coordinates, left image]

Occlusion Function
- We can use the disparity values to transform the left image into the geometry of the right image. (We say that we warp the left image.)
- The x-coordinate in the right view, x'_p, is computed as x'_p = x_p − d_p.
- A small disparity gives a small shift; a large disparity gives a large shift.
[Figure: disparity over x-coordinates for the left image and the warped right image]

Occlusion Function
- There are pixels that project to the same x-coordinate in the right view (see p and q).
- Only one of these pixels can be visible (uniqueness constraint).
- Which of the two pixels is visible, p or q? q has the higher disparity => q is closer to the camera => q has to be visible, and p is occluded by q.
- Visibility constraint: a pixel p is occluded if there exists a pixel q such that p and q have the same matching point in the other view and q has a higher disparity than p.
[Figure: warped view in which p and q land on the same x-coordinate]
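The visibility constraint above can be turned into a small per-scanline procedure (a sketch under the assumptions of integer disparities and a rectified pair; function and variable names are illustrative):

```python
import numpy as np

def occlusion_map(disparity):
    """Compute the occlusion function O(p) for every pixel of the left image.

    A pixel p is occluded if some pixel q on the same scanline maps to the
    same x-coordinate in the right view (x' = x - d) with a higher disparity
    (uniqueness / visibility constraint).

    disparity -- integer disparity map of the left image, shape (H, W)
    Returns a boolean map, True where occluded.
    """
    h, w = disparity.shape
    occluded = np.zeros((h, w), dtype=bool)
    for y in range(h):
        best = {}  # target x' -> (disparity, source x) of the visible pixel
        for x in range(w):
            d = int(disparity[y, x])
            xp = x - d  # x-coordinate in the right view
            if xp in best:
                d_best, x_best = best[xp]
                if d > d_best:              # current pixel is closer: it wins
                    occluded[y, x_best] = True
                    best[xp] = (d, x)
                else:                       # stored pixel keeps visibility
                    occluded[y, x] = True
            else:
                best[xp] = (d, x)
    return occluded

# One scanline: the two right pixels (disparity 2) warp onto the positions
# of the two left pixels (disparity 0), occluding them.
disp = np.array([[0, 0, 2, 2]])
occ = occlusion_map(disp)
# occ == [[True, True, False, False]]
```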

The Occlusion-Aware Data Term
- We have already defined our data term.
- The function O(p) is defined using the visibility constraint:

  O(p) = 1 if there exists a pixel q with x_p − d_p = x_q − d_q (the pixels have the same matching point) and d_q > d_p (q has a higher disparity than p); 0 otherwise.

How Can We Optimize That?
- I just give a rough sketch of a graph-cut construction.
- It works for α-expansions and fusion moves.
- I follow the construction of [Woodford,CVPR08].

How Can We Optimize That?
- The trick is to add an occlusion node for each node representing a pixel.
- Node representing pixel q: has two states, active/inactive (active means that the pixel takes a specific disparity).
- Occlusion node O_q for q: has two states, visible/occluded.

How Can We Optimize That?
- Data costs are implemented as pairwise interactions between q and O_q:
  - If q is active and O_q is visible, we impose the pixel dissimilarity as cost.
  - If q is active and O_q is occluded, we impose the occlusion penalty as cost.
  - If q is inactive, the cost is 0. (I am simplifying here.)

How Can We Optimize That?
- We have another pixel p.
- If p becomes active, it will map to the same pixel in the right image as q.
- The disparity of p is smaller than that of q.
- => We have to prohibit the occlusion node of p being in the visible state while q is active (visibility constraint).
- How can we do that? We define a pairwise term between q and O_p that gives infinite cost if q is active and O_p is visible. => This configuration can never occur as the result of energy minimization.

Result
- I show the result of Surface Stereo [Bleyer,CVPR10] used in conjunction with the presented occlusion-aware data term. (Red pixels are occlusions.)
- I will speak about the energy function of Surface Stereo next time.

Result
- Our occlusion term works well, but it is not perfect.
- It detects occlusions on slanted surfaces where there should be none.

Uniqueness Constraint Violated by Slanted Surfaces
- A slanted surface is sampled differently in the left and right images.
- In the example on the right (image taken from [Ogale,CVPR04]), the slanted surface is represented by 3 pixels in the left image and by 6 pixels in the right image.
- For slanted surfaces, a pixel can therefore have more than one correspondence in the other view => the uniqueness assumption is violated.
- We will see how to tackle this problem with Surface Stereo next time.

Segmentation-Based Stereo

- Segmentation-based stereo has become very popular over the last couple of years, most likely because it gives high-quality results.
- This is especially true on the Middlebury set: the top positions are clearly dominated by segmentation-based approaches.

Key Assumptions
- We assume that:
  1. Disparity inside a segment can be modeled by a single 3D plane.
  2. Disparity discontinuities coincide with segment borders.
- We apply a strong over-segmentation to make it more likely that these assumptions are fulfilled.
- We no longer use pixels as the matching primitive, but segments. Our goal is to assign each segment to a "good" disparity plane.
[Figures: Tsukuba left image; result of color segmentation (segment borders are shown); disparity discontinuities in the ground-truth solution]

How Do Segmentation-Based Methods Work?
- Two-step procedure:
  1. Initialization: assign each segment to an initial disparity plane.
  2. Optimization: optimize the assignment of segments to planes to improve the initial solution.
- Segmentation-based methods basically differ in how they implement these two steps.
- I will explain the steps using the algorithm of [Bleyer,ICIP04].

Initialization Step (1)
- Two preprocessing steps:
  1. Apply color segmentation to the left image.
  2. Compute an initial disparity map via a window-based method (block matching).
[Figures: Tsukuba left image; color segmentation (pixels of the same segment are given identical colors); initial disparity map (obtained by block matching)]

Initialization Step (2)
- Plane fitting: fit a plane to each segment using the initial disparity map.
- This is accomplished via least-squared-error fitting.
- A plane is defined by 3 parameters a, b and c. Knowing the plane, one can compute the disparity of a pixel as d_{x,y} = ax + by + c.
- We now try to refine the initial plane-fitting result in the optimization step.
[Figures: color segmentation (pixels of the same segment are given identical colors); plane-fitting result]
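The per-segment plane fit can be sketched as an ordinary least-squares problem (a minimal sketch with illustrative names; practical systems often add outlier rejection, which is not shown here):

```python
import numpy as np

def fit_plane(xs, ys, ds):
    """Least-squared-error fit of a disparity plane d = a*x + b*y + c
    to the initial disparities of one segment's pixels.

    xs, ys -- pixel coordinates of the segment (1-D arrays)
    ds     -- initial disparities at those pixels (e.g. from block matching)
    Returns the plane parameters (a, b, c).
    """
    # Each row of A is [x, y, 1], so A @ [a, b, c] approximates d.
    A = np.column_stack([xs, ys, np.ones_like(xs, dtype=float)])
    (a, b, c), *_ = np.linalg.lstsq(A, np.asarray(ds, dtype=float), rcond=None)
    return a, b, c

# Toy segment whose disparities lie exactly on d = 0.5*x + 0.25*y + 2.
xs = np.array([0, 1, 2, 0, 1, 2], dtype=float)
ys = np.array([0, 0, 0, 1, 1, 1], dtype=float)
ds = 0.5 * xs + 0.25 * ys + 2.0
a, b, c = fit_plane(xs, ys, ds)
# recovers a ≈ 0.5, b ≈ 0.25, c ≈ 2.0
```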

Optimization Step
- We use energy minimization:
  1. Design an energy function that measures the goodness of an assignment of segments to planes.
  2. Minimize the energy to obtain the final solution.

Idea Behind the Energy Function
- We use the disparity map to warp the left image into the geometry of the right view.
- If the disparity map is correct, the warped view should be very similar to the real right image.
[Figure: reference image + disparity map => warped view, compared against the real right view; the difference is minimized]

Visibility Reasoning and Occlusion Detection
- We warp the left view (segments S_1, S_2, S_3, plotted as disparity over x-coordinates) into the right view.
- If two pixels of the left view map to the same pixel in the right view, the one of higher disparity is visible.
- If there is no pixel of the left view that maps to a specific pixel of the right view, we have detected an occlusion.
[Figure: left view and warped view; disparity [pixels] over x-coordinates [pixels], segments S_1, S_2, S_3]
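This visibility reasoning can be sketched for a single scanline (illustrative names; integer disparities assumed; "color" is a single intensity value to keep the sketch short):

```python
import numpy as np

def warp_and_detect(left_row, disparity_row, width):
    """Warp one scanline of the left view into the right view and detect
    occlusions there.

    Where several left pixels land on the same right x-coordinate, the one
    with the higher disparity is visible; right pixels that receive no left
    pixel at all are detected as occlusions.
    Returns (warped_row, occluded_mask) for the right view.
    """
    warped = np.zeros(width, dtype=float)
    best_d = np.full(width, -1, dtype=int)   # -1 marks "no pixel mapped here"
    for x, (c, d) in enumerate(zip(left_row, disparity_row)):
        xp = x - int(d)                      # target x in the right view
        if 0 <= xp < width and int(d) > best_d[xp]:
            warped[xp] = c                   # closer pixel overwrites
            best_d[xp] = int(d)
    occluded = best_d < 0
    return warped, occluded

# The two pixels with disparity 2 shift left and cover the right-view
# positions 0 and 1; positions 2 and 3 receive nothing => occluded.
warped, occ = warp_and_detect([10, 20, 30, 40], [0, 0, 2, 2], 4)
# warped == [30, 40, 0, 0], occ == [False, False, True, True]
```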

Overall Energy Function
- Measures the pixel dissimilarity between the warped and real right views for visible pixels.
- Assigns a fixed penalty for each detected occluded pixel.
- Assigns a penalty for neighboring segments that are assigned to different disparity planes (smoothness).

How can we optimize that?

Energy Optimization
- Start from the plane-fitting result of the initialization step.
- Optimization algorithm (Iterated Conditional Modes, ICM):
  Repeat a few times:
    For each segment s:
      For each segment t that is a spatial neighbor of s:
        Test if assigning s to the plane of t reduces the energy.
        If so, assign s to t's plane.
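The ICM loop above can be sketched as follows (a minimal sketch: `energy` stands in for the full warping-based energy, `neighbors` for the segment adjacency; all names are illustrative):

```python
def icm(plane_of, neighbors, energy, n_iters=3):
    """Iterated Conditional Modes over segment-to-plane assignments.

    plane_of  -- dict: segment id -> current plane parameters (a, b, c)
    neighbors -- dict: segment id -> list of spatially adjacent segment ids
    energy    -- function(plane_of) -> float, the energy to minimize
    Greedily tests, for each segment, the planes of its neighbors and keeps
    any reassignment that lowers the energy.
    """
    best = energy(plane_of)
    for _ in range(n_iters):
        for s in plane_of:
            for t in neighbors.get(s, []):
                old = plane_of[s]
                plane_of[s] = plane_of[t]   # try the neighbor's plane
                e = energy(plane_of)
                if e < best:
                    best = e                # keep the improvement
                else:
                    plane_of[s] = old       # revert
    return plane_of, best

# Toy example: fronto-parallel planes (a = b = 0), energy is the deviation
# of each segment's constant disparity c from a target disparity.
target = {0: 5.0, 1: 5.0, 2: 8.0}
plane_of = {0: (0, 0, 5.0), 1: (0, 0, 4.0), 2: (0, 0, 8.0)}
neighbors = {0: [1], 1: [0, 2], 2: [1]}
energy = lambda p: sum(abs(p[s][2] - target[s]) for s in p)
result, e = icm(plane_of, neighbors, energy)
# segment 1 adopts segment 0's plane, e drops from 1.0 to 0.0
```

Like any ICM scheme, this only finds a local optimum, which is why the initialization from the plane-fitting step matters.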

Results
- Ranked second in the Middlebury benchmark at the time of submission (2004).
[Figures: computed disparity map; absolute disparity errors]

Disadvantages of Segmentation-Based Methods
- If a segment overlaps a depth discontinuity, there will definitely be a disparity error. (Segmentation is a hard constraint.)
- A planar model is oftentimes not sufficient to model the disparity inside a segment correctly (e.g. rounded objects).
- Leads to a difficult optimization problem:
  - The set of all 3D planes is of infinite size (label set of infinite size).
  - Cannot apply α-expansions or BP (at least not in a direct way).
[Figures: reference frame (color segmentation generates segments that overlap disparity discontinuities); ground truth; result of [Bleyer,ICIP04]]

The Matting Problem in Stereo

The Matting Problem
- Let us do a strong zoom-in on the Tsukuba image.
- At depth discontinuities, there occur pixels whose color is a mixture of fore- and background colors.
- These pixels are called mixed pixels.

Single-Image Matting Methods
- Perform a foreground/background segmentation: bright pixels represent foreground, dark pixels represent background.
- This is not just a binary segmentation!
- The grey value expresses the percentage to which a mixed pixel belongs to the foreground. (This is the so-called alpha value.)
[Figures: input image and alpha matte; zoomed-in view of the alpha matte]

How Can We Compute the Alpha Matte?
- We have to solve the compositing equation:

  C = α F + (1 − α) B

- More precisely, given the color image C, we have to compute:
  - the alpha value α,
  - the foreground color F,
  - the background color B.
- These are 3 unknowns in one equation => a severely under-constrained problem.
- Hence matting methods typically require user input (scribbles).
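The forward direction of the compositing equation is straightforward to write down (a minimal per-pixel NumPy sketch; names and values are illustrative). It is recovering α, F and B from C alone that is under-constrained:

```python
import numpy as np

def composite(alpha, fg, bg):
    """Compositing equation C = alpha*F + (1 - alpha)*B, applied per pixel.

    alpha  -- matte in [0, 1], shape (H, W)
    fg, bg -- foreground / background colors, shape (H, W, 3)
    """
    a = alpha[..., None]          # broadcast alpha over the color channels
    return a * fg + (1.0 - a) * bg

# A mixed pixel that is 25% foreground:
fg = np.array([[[255.0, 0.0, 0.0]]])   # red foreground
bg = np.array([[[0.0, 0.0, 255.0]]])   # blue background
alpha = np.array([[0.25]])
c = composite(alpha, fg, bg)
# c == [[[63.75, 0.0, 191.25]]]
```

The reverse problem has 3 unknowns (α, F, B) per color channel equation, which is why matting needs extra information such as scribbles.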

Why Do We Need It?
- For photomontage!
- We give an image as well as scribbles as input to the matting algorithm: red scribbles mark the foreground, blue scribbles mark the background.
- The matting algorithm computes α and F.
- Using α and F, we can paste the foreground object against a new background.
[Figures: input image; novel background]

Why Bother About This When Doing Stereo?
- I will now go through the presentation slides for the paper [Bleyer,CVPR09].
- You can find them here:

Summary
- Occlusion handling in global stereo
- Segmentation-based methods
- The matting problem in stereo

References
- [Bleyer,ICIP04] M. Bleyer, M. Gelautz. A Layered Stereo Matching Algorithm Using Global Visibility Constraints. ICIP 2004.
- [Bleyer,CVPR09] M. Bleyer, M. Gelautz, C. Rother, C. Rhemann. A Stereo Approach that Handles the Matting Problem via Image Warping. CVPR 2009.
- [Bleyer,CVPR10] M. Bleyer, C. Rother, P. Kohli. Surface Stereo with Soft Segmentation. CVPR 2010.
- [Ogale,CVPR04] A. Ogale, Y. Aloimonos. Stereo Correspondence with Slanted Surfaces: Critical Implications of Horizontal Slant. CVPR 2004.
- [Woodford,CVPR08] O. Woodford, P. Torr, I. Reid, A. Fitzgibbon. Global Stereo Reconstruction under Second Order Smoothness Priors. CVPR 2008.