Michael Bleyer LVA Stereo Vision

Name: Michael Bleyer LVA Stereo Vision
Uploaded: 2017-11-26T13:11:14+00:00
Duration: PTM22S3
Channel: Jennifer Elliott
Description: Michael Bleyer LVA Stereo Vision

Surface Stereo
Michael Bleyer, LVA Stereo Vision

What happened last time?
Occlusion treatment in global methods:
- Visibility constraint
- Sketch for optimization via graph-cuts
Segmentation-based stereo matching:
- Segmentation assumption
- Example method [Bleyer,ICIP04]
- Pros/Cons
Stereo and the matting problem

What is Going to Happen Today?
- I will speak about the surface stereo paper [Bleyer,CVPR10].
- We are going to watch TV (in 3D).

Surface Stereo [Bleyer,CVPR10]

Contributions
In the following, I will present several problems of segmentation-based methods. For each problem, I will also present a strategy of [Bleyer,CVPR10] to overcome it.

Segmentation-Based Methods
Recall the segmentation assumption:
- All pixels within a segment lie on the same disparity plane.
- Disparity discontinuities coincide with segment borders.
Methods implementing this assumption deliver high-quality results, but have a number of weaknesses.
Figures: Tsukuba left image; result of color segmentation (segment borders are shown); disparity discontinuities in the ground truth solution.

Violations of Segmentation Assumptions
Each segment is assigned to a single disparity plane.
=> The validity of the segmentation assumption is enforced strictly. (We say that the segmentation assumption is implemented as a hard constraint.)
=> If a segment overlaps a disparity discontinuity, there will definitely be a disparity error.
How can we make it better?
- Do not assign whole segments to planes, but assign each pixel to a plane.
- If all pixels within a segment are assigned to the same plane, the cost is 0; otherwise, a penalty is imposed.
- We allow deviations from the segmentation assumption (but penalize them) = soft constraint.
Figures: Map reference frame (color segmentation generates segments that overlap disparity discontinuities); ground truth; result of [Bleyer,ICIP04].

Planar World Assumption
In segmentation-based stereo, it is commonly assumed that surface shapes can be modeled via planes.
If there are rounded objects, this assumption does not hold true. Most segmentation-based methods would fail on the images below.
How can we make it better?
- We can use more general surfaces than planes.
- In particular, we assign pixels to planes and B-spline surfaces.
Figures: Umbrella test set of [Lin,PAMI04]; disparity map with contour lines overlaid.

MDL Prior
Segmentation-based methods tend to produce solutions containing an unnecessarily large number of surfaces.
For example: In the real world, the fence and the background in the image below are one surface each. However, in the computed solution the background is modeled by a large number of slightly different surfaces.
How can we make it better?
- We should tell the algorithm to prefer a simple explanation of the scene (consisting of a small number of surfaces) over a more complex one (consisting of a large number of surfaces) = Minimum Description Length (MDL) principle.
- We modify our energy so that the number of surfaces occurring in the solution is penalized (a low number is cheaper than a high number).
Figures: Crop of Cones set; result of a segmentation-based method (different colors represent different surfaces).

Curvature Term
Recall our discussion of the second-order smoothness term of [Woodford,CVPR08]:
- Second-order terms are important, because they do not prefer fronto-parallel planes over slanted ones.
- [Woodford,CVPR08] approximates curvature by the finite difference dp − 2·dq + dr, where p, q and r are consecutive horizontal (or vertical) neighbors.
- This is just an approximation of the surface's real second-order derivative.
- It leads to a quite complex graph for optimization (triple cliques over p, q and r, with an auxiliary node).
How can we make it better?
- We are operating on surfaces. Hence, we can compute the second-order derivative analytically and exactly from the surface model (no approximation).
- In surface stereo, the curvature term is a unary term, so it adds no triple cliques to the graph for optimization.
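To make the claim about second-order terms concrete, here is a minimal sketch, assuming a single scanline of disparities stored in a NumPy array. The function name and the example profiles are illustrative and not taken from either paper. It shows that the three-pixel finite difference vanishes on a slanted plane, whereas a first-order difference does not.

```python
import numpy as np

def second_difference(d):
    """Finite-difference curvature d[p] - 2*d[q] + d[r] for consecutive triples."""
    d = np.asarray(d, dtype=float)
    return d[:-2] - 2.0 * d[1:-1] + d[2:]

x = np.arange(6)
slanted = 0.5 * x + 1.0   # slanted plane: disparity changes linearly
curved = 0.25 * x ** 2    # curved profile: disparity changes quadratically

# A first-order term would penalize the slanted plane (non-zero differences).
first_order_slanted = np.diff(slanted)
# The second-order term does not penalize it, but does penalize the curved profile.
second_order_slanted = second_difference(slanted)
second_order_curved = second_difference(curved)
```

Each value of `second_difference` depends on three pixels at once, which is where the triple cliques come from.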

Improved Occlusion Handling
Recall from last session:
- The uniqueness constraint does not hold true for slanted surfaces.
- Due to sampling problems, occlusions are detected where there should be none.
How can we make it better?
- We can state that pixels of the same surface must not occlude each other.
Figure: Red pixels are occlusions.

Energy Optimization in Segmentation-Based Stereo
In the initialization step, segmentation-based methods compute a set of planes. In the optimization step, these planes are propagated among neighboring segments.
- Problem 1: If a correct plane is missing in the initial plane fitting result, you will definitely get a disparity error.
- Problem 2: The number of surfaces contained in the initial plane fitting result is very high. Energy minimization becomes infeasible when using α-expansions or belief propagation.
How can we make it better? Fusion moves!
- Problem 1 (planes are missing): You can fuse a large number of different initial plane fitting results. This large number makes it less likely that you miss a plane (or B-spline).
- Problem 2 (large label set): The fusion move algorithm does not care about the number of labels.

Disparity-Based Versus Surface-Based Representation
There are two options to represent depth:
- You assign your pixels to discrete disparity values. That is what the vast majority of stereo matching algorithms do (e.g., all methods that we heard about in the first 7 sessions).
- You assign your pixels to 3D surfaces. This is what segmentation-based methods do on the segment level. On the pixel level, I am aware of just two papers ([Lin,PAMI04] and [Birchfield,ICCV99]).

Disparity-Based Versus Surface-Based Representation
Although the disparity-based representation is far more popular, we believe that it has considerable disadvantages compared to the surface-based representation:
- In the surface-based representation, you get sub-pixel disparity estimates for free.
- More importantly, it is extremely difficult to implement a soft segmentation term or MDL prior in the disparity-based representation: You would have to explicitly model all disparity labelings that form a surface (and there are plenty of them).
- Curvature is more difficult to model in the disparity representation (triple cliques).
- Slanted surfaces are difficult to handle correctly with an asymmetric occlusion model in a disparity representation.

Energy Model
Our goal is to assign each pixel p of the reference image to a single surface fp. A surface is either a plane or a B-spline.
Assigning p to a surface fp implicitly defines p's disparity:
- If fp is a plane, we can compute d(p,fp) = fp[a]·px + fp[b]·py + fp[c].
- If fp is a B-spline, we can compute the disparity via the spline's blending function.
We measure the quality of a mapping F that assigns each pixel of the left view to a surface via an energy function E(F). It consists of 5 terms: data, smoothness, soft segmentation, MDL and curvature. I will now present each of these 5 terms.
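The plane case of the disparity formula can be sketched in a few lines. The function name `plane_disparity` and the example coefficients are assumptions for illustration; only the formula d(p,fp) = fp[a]·px + fp[b]·py + fp[c] comes from the slide.

```python
import numpy as np

def plane_disparity(plane, height, width):
    """Evaluate d = a*x + b*y + c at every pixel (x, y) of a height x width image."""
    a, b, c = plane
    ys, xs = np.mgrid[0:height, 0:width]  # ys[y, x] = y, xs[y, x] = x
    return a * xs + b * ys + c

# Hypothetical plane parameters (a, b, c), chosen only for the example.
disparity = plane_disparity((0.25, -0.5, 10.0), height=3, width=4)
```

Because the plane parameters are real-valued, the resulting disparities are not restricted to integers. This is the sense in which a surface-based representation yields sub-pixel estimates for free.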

Data Term
I have presented the occlusion-aware data term in the previous session. The term computes:
- The pixel dissimilarity if p is visible.
- The occlusion penalty if p is occluded.
We use an occlusion function that returns 1 if all of its conditions hold, and 0 otherwise (the condition formulas are missing from this transcript). The slide annotates the conditions as follows:
- Visibility constraint (as discussed in the previous session).
- Pixels of the same surface cannot occlude each other (slanted surfaces).

Smoothness Term
Penalizes neighboring pixels that are assigned to different surfaces: every pair of neighboring pixels p and q contributes Psmooth · T[fp ≠ fq], where
- Psmooth is the smoothness penalty.
- T[] is the indicator function: it returns 1 if p and q have different surfaces, and 0 otherwise.
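As a rough illustration, the smoothness term can be evaluated on a label image as follows. This sketch assumes a 4-connected neighborhood, which the slide does not specify, and integer surface ids stored in a 2D array. The function name is made up for the example.

```python
import numpy as np

def smoothness_energy(labels, p_smooth):
    """Sum p_smooth over all 4-connected neighbor pairs with different surface ids."""
    labels = np.asarray(labels)
    horizontal = np.count_nonzero(labels[:, 1:] != labels[:, :-1])
    vertical = np.count_nonzero(labels[1:, :] != labels[:-1, :])
    return p_smooth * (horizontal + vertical)

# Two surfaces (ids 0 and 1) on a 2 x 3 image.
example_labels = [[0, 0, 1],
                  [0, 1, 1]]
example_energy = smoothness_energy(example_labels, p_smooth=2.0)
```

The penalty depends only on whether the surface ids differ, not on how far apart the two disparities are.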

Soft Segmentation Term
We are given a segmentation of the left image as an input.
For each pixel p:
- We center a square window Wp on p.
- We look up p's segment Sp.
- We build a subsegment Lp by intersection: Lp = Wp ∩ Sp.

We encourage the segmentation assumption to be fulfilled in each subsegment: each subsegment Lp contributes 0 if all pixels within Lp are assigned to the same surface, and Pseg otherwise, where Pseg is a penalty for violations of the segmentation assumption.

Example: A surface discontinuity intersects the segment, which leads to the presence of two different surfaces in the segment.
(Left image) The segmentation assumption is violated for a large number of subsegments (colored red). Each of them imposes a penalty.
(Right image) A smaller deviation from the segment's shape is penalized less, because the segmentation assumption is violated for only a small number of subsegments.
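The subsegment construction can be sketched with a brute-force loop. This is only an illustration under assumptions: the window is clipped at the image border, `radius` is a hypothetical parameter for the half window size, and the code simply evaluates the penalty for a given labeling (it says nothing about how the term is optimized).

```python
import numpy as np

def soft_segmentation_energy(labels, segments, radius, p_seg):
    """Add p_seg for every pixel whose subsegment contains more than one surface."""
    labels = np.asarray(labels)
    segments = np.asarray(segments)
    height, width = labels.shape
    energy = 0.0
    for y in range(height):
        for x in range(width):
            # Square window W_p centered on p, clipped to the image.
            y0, y1 = max(0, y - radius), min(height, y + radius + 1)
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            # Subsegment L_p: window pixels that share p's segment S_p.
            in_segment = segments[y0:y1, x0:x1] == segments[y, x]
            sub_labels = labels[y0:y1, x0:x1][in_segment]
            # p itself belongs to L_p, so comparing against p's label suffices.
            if np.any(sub_labels != labels[y, x]):
                energy += p_seg
    return energy

one_segment = np.zeros((1, 4), dtype=int)
two_segments = np.array([[0, 0, 1, 1]])
surfaces = np.array([[0, 0, 1, 1]])

# The surface border cuts through a single segment: two subsegments are violated.
violated = soft_segmentation_energy(surfaces, one_segment, radius=1, p_seg=3.0)
# The surface border coincides with the segment border: no penalty.
aligned = soft_segmentation_energy(surfaces, two_segments, radius=1, p_seg=3.0)
```

Only the subsegments near the offending border are penalized, so a labeling that deviates slightly from a segment's shape costs less than one that cuts the segment in half.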

MDL Term
Imposes a penalty on the occurrence of a surface: every surface f in Ω contributes Pmdl · T[f occurs in F], where
- Ω is the set of all surfaces.
- Pmdl is a penalty.
- T[] is the indicator function: it returns 1 if there is at least one pixel in F that is assigned to surface f, and 0 otherwise.
A solution F that contains a small number of surfaces is cheaper than one that contains a large number = MDL principle.
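For a given labeling, the MDL term reduces to counting the distinct surfaces that are actually used. A minimal sketch, assuming surfaces are identified by integer ids in a 2D array (the function name is illustrative):

```python
import numpy as np

def mdl_energy(labels, p_mdl):
    """p_mdl times the number of surfaces that own at least one pixel."""
    return p_mdl * np.unique(np.asarray(labels)).size

# Three surfaces (ids 0, 3 and 7) occur in this labeling.
mdl_example = mdl_energy([[0, 0, 3],
                          [3, 7, 7]], p_mdl=5.0)
```

Surfaces in Ω that own no pixel contribute nothing, so the size of the candidate set does not matter, only the number of surfaces in the solution.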

Curvature Term
Penalizes surfaces of high curvature. In this term:
- Pcurv is a penalty for curvature.
- The remaining factor computes the second-order derivative of surface fp at pixel p.
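To illustrate what "analytic second-order derivative from the surface model" can mean, here is a sketch for the simplest case I can think of: one segment of a 1D uniform quadratic B-spline with three control values. The spline type, the use of the absolute value and all function names are assumptions for illustration; the slides do not state which B-spline variant or penalty function the paper uses.

```python
def quadratic_bspline_value(ctrl, t):
    """Uniform quadratic B-spline segment with control values ctrl, for t in [0, 1]."""
    c0, c1, c2 = ctrl
    return 0.5 * ((1 - t) ** 2 * c0 + (-2 * t ** 2 + 2 * t + 1) * c1 + t ** 2 * c2)

def quadratic_bspline_second_derivative(ctrl):
    """Exact second derivative of the segment; it is constant in t."""
    c0, c1, c2 = ctrl
    return c0 - 2.0 * c1 + c2

def curvature_cost(ctrl, p_curv):
    """Per-pixel (unary) cost: p_curv times the curvature magnitude (assumed form)."""
    return p_curv * abs(quadratic_bspline_second_derivative(ctrl))

flat_cost = curvature_cost((1.0, 2.0, 3.0), p_curv=4.0)  # collinear control values
bent_cost = curvature_cost((0.0, 1.0, 0.0), p_curv=4.0)  # bent segment
```

The cost depends only on the surface assigned to the pixel, not on the labels of neighboring pixels, which is why it can enter the energy as a unary term.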

Energy Optimization
We use the fusion move algorithm:
- We start with an arbitrary assignment of pixels to surfaces.
- Loop:
  - Generate a proposal solution.
  - Fuse the current solution with this proposal. The fusion result is guaranteed to have equal or lower energy than the current solution.
  - Current solution := fusion result.
There are 2 things that one needs to discuss:
- How to generate proposals.
- How to build the graph that computes a "good" fusion (I will skip this).
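The outer loop can be sketched as follows. Note the loud assumption: the `fuse` function below is a toy stand-in that tries a handful of binary mixes on a 1D scanline and keeps the cheapest. It is not the graph-based fusion of the paper, which is skipped here as in the lecture. The toy energy (unary costs plus a penalty for label changes) and all names are likewise illustrative.

```python
import numpy as np

def toy_energy(labels, unary, p_smooth):
    """Unary cost per pixel plus p_smooth for each label change along the scanline."""
    data = unary[np.arange(labels.size), labels].sum()
    smooth = p_smooth * np.count_nonzero(labels[1:] != labels[:-1])
    return data + smooth

def fuse(current, proposal, energy):
    """Toy fusion: each pixel keeps its current label or takes the proposal's.

    Because `current` itself is a candidate, the result never has higher energy.
    """
    candidates = [current, proposal]
    for split in range(1, current.size):
        candidates.append(np.concatenate([current[:split], proposal[split:]]))
        candidates.append(np.concatenate([proposal[:split], current[split:]]))
    return min(candidates, key=energy)

def fusion_moves(initial, proposals, energy):
    """Fuse the current solution with one proposal after another."""
    current = initial
    for proposal in proposals:
        current = fuse(current, proposal, energy)
    return current

# Toy problem: 6 pixels, 3 labels, cost 0 for the "true" label and 1 otherwise.
truth = np.array([0, 0, 0, 2, 2, 2])
unary = np.ones((6, 3))
unary[np.arange(6), truth] = 0.0
energy = lambda labels: toy_energy(labels, unary, p_smooth=0.5)

initial = np.full(6, 1)
proposals = [np.full(6, 0), np.full(6, 2)]
result = fusion_moves(initial, proposals, energy)
```

The loop only ever sees two labelings at a time, so its cost per iteration does not grow with the total number of surfaces. Neither proposal is correct on its own, but the fused result combines the correct part of each.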

Proposal Generation
Method also used in [Woodford,CVPR08].
Preprocessing steps:
- Compute different color segmentations by varying the segmentation parameters.
- Compute different initial disparity maps by changing the smoothness settings of a fast method.
How to compute a proposal:
- Select one of the segmentations and one initial disparity map.
- Apply plane or spline fitting for each segment using the initial disparity map.
- Return the result of model fitting.
There are 5 other ways of generating a proposal described in the paper:
- Fronto-Parallel Proposals
- Single Surface Proposals
- Refit Proposals
- K-Means Proposals
- Surface Dilation Proposals
Figures: Different Mean Shift Segmentations; Different Initial Disparity Maps.
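The plane fitting step of a proposal can be sketched as follows, assuming an ordinary least-squares fit per segment. The paper may well use a robust fitting scheme instead; the function name and the synthetic inputs are made up for the example. Each fitted plane uses the same (a, b, c) parameterization as the disparity formula d = a·x + b·y + c.

```python
import numpy as np

def fit_planes(segments, disparity):
    """Least-squares fit of d = a*x + b*y + c to the disparities of each segment."""
    segments = np.asarray(segments)
    disparity = np.asarray(disparity, dtype=float)
    ys, xs = np.mgrid[0:segments.shape[0], 0:segments.shape[1]]
    planes = {}
    for seg_id in np.unique(segments):
        mask = segments == seg_id
        # One row (x, y, 1) per pixel of the segment.
        design = np.column_stack([xs[mask], ys[mask], np.ones(mask.sum())])
        coeffs, *_ = np.linalg.lstsq(design, disparity[mask], rcond=None)
        planes[int(seg_id)] = coeffs
    return planes

# Synthetic input: two segments, each lying exactly on its own plane.
ys, xs = np.mgrid[0:4, 0:4]
segments = (xs >= 2).astype(int)
initial_disparity = np.where(segments == 0, 0.5 * xs + 2.0, -0.25 * ys + 5.0)
planes = fit_planes(segments, initial_disparity)
```

Running this for several segmentations and several initial disparity maps yields many different sets of candidate surfaces, each of which can serve as one proposal.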

Results

Results - Contribution of Individual Terms
When we use all terms of our energy function, we obtain the 6th rank in the Middlebury table.
We now measure the contribution of the individual terms of our energy:
- Not using the soft segmentation term considerably worsens matching performance. We lose 29 ranks on Middlebury.
- Turning off the MDL or curvature term leads to a moderate increase in error rates. We lose 7 or 6 ranks, respectively.
- Using standard occlusion handling (with errors at slanted surfaces) does not really worsen the results. The ranking stays the same.

Advantage of Soft Segmentation Term
Figures:
- Map test set: ground truth disparities and left image.
- Result of using segmentation as a hard constraint: disparity and error maps.
- Result of Surface Stereo: disparity and error maps.

Advantage of MDL Term
Figures: Crop of Cones set; result without MDL prior; result with MDL prior.
The background is reconstructed by a very small number of surfaces when using the MDL prior.

Advantage of Curvature Term
Venus result without curvature term (left image with contour lines and error maps): A background plane is erroneously modeled by a high-curvature B-spline surface.
Venus result with curvature term: The background is correctly modeled as a plane (of 0 curvature).

Advantage of Improved Occlusion Handling
Figures (red pixels are occlusions): result when using standard occlusion handling (visibility constraint); result when using improved occlusion handling.
We can avoid wrongly detected occluded pixels at slanted surfaces.

Summary
Surface stereo:
- Soft segmentation
- MDL prior
- Simple way of incorporating curvature information
- Improved occlusion handling

References
[Bleyer,ICIP04] M. Bleyer, M. Gelautz, A Layered Stereo Matching Algorithm Using Global Visibility Constraints, ICIP 2004.
[Bleyer,CVPR10] M. Bleyer, C. Rother, P. Kohli, Surface Stereo with Soft Segmentation, CVPR 2010.
[Lin,PAMI04] M. Lin, C. Tomasi, Surfaces with Occlusions from Layered Stereo, PAMI 2004.
[Woodford,CVPR08] O. Woodford, P. Torr, I. Reid, A. Fitzgibbon, Global Stereo Reconstruction under Second Order Smoothness Priors, CVPR 2008.
