Automatic Photo Pop-up. Derek Hoiem, Alexei A. Efros, Martial Hebert. Carnegie Mellon University.


Abstract This paper presents a fully automatic method for creating a 3D model from a single photograph. Our algorithm labels regions of the input image with coarse categories: ground, sky, and vertical. Because of the inherent ambiguity of the problem and the statistical nature of the approach, the algorithm is not expected to work on every image.

Overview 1. Image to superpixels (superpixels: nearly uniform regions) 2. Superpixels to multiple constellations 3. Multiple constellations to superpixel labels 4. Superpixel labels to a 3D model!

Features for Geometric Classes

Color: valuable in identifying the material of a surface. Texture: provides additional information about the material of a surface. Location: provides strong cues for distinguishing between ground, vertical structures, and sky. 3D geometry: helps determine the 3D orientation of a surface.
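As a toy illustration of the color and location cues above (not the paper's actual feature set, which is much richer), the following sketch computes a region's mean color and normalized centroid; the function name and representation are assumptions for this example:

```python
# Toy sketch: compute two of the cue types above for one region --
# mean RGB color and normalized image location (sky tends to be high
# in the image, ground low). Not the paper's exact features.
def region_features(image, mask):
    """image: nested list of (r, g, b) rows; mask: same shape, True inside region."""
    h, w = len(image), len(image[0])
    pixels = [(x, y) for y in range(h) for x in range(w) if mask[y][x]]
    n = len(pixels)
    mean_rgb = tuple(sum(image[y][x][c] for x, y in pixels) / n for c in range(3))
    # Normalized centroid in [0, 1] x [0, 1]: a strong location cue.
    cx = sum(x for x, y in pixels) / (n * w)
    cy = sum(y for x, y in pixels) / (n * h)
    return {"mean_rgb": mean_rgb, "centroid": (cx, cy)}
```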

Features for Geometric Classes Num: the number of variables in each set. Used: how many variables from each set are actually used in the classifier.

Features for Geometric Classes

Horizon Position We estimate the horizon position from the intersections of nearly parallel lines by finding the position that minimizes the L1 or L2 distance to all of the intersection points in the image. This often provides a reasonable estimate, since these scenes contain many lines parallel to the ground plane.
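Under a simplifying assumption (the horizon is a horizontal row, and the intersection points of nearly parallel line pairs are already available), the L1 version of this estimate has a closed form: the median of the intersections' y-coordinates. A minimal sketch:

```python
# Sketch of the L1 horizon estimate: the row y that minimizes
# sum |y - y_i| over intersection points (x_i, y_i) is their median.
from statistics import median

def estimate_horizon_l1(intersections):
    """intersections: list of (x, y) intersection points of near-parallel lines."""
    ys = [y for _, y in intersections]
    return median(ys)
```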

Labeling the Image

Obtaining Superpixels Forming Constellations Geometric Classification

Labeling the Image Obtaining Superpixels Superpixels correspond to small, nearly uniform regions in the image. Our implementation uses the over-segmentation technique of [Felzenszwalb and Huttenlocher 2004]. The use of superpixels improves the computational efficiency of our algorithm.
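The idea behind [Felzenszwalb and Huttenlocher 2004] can be sketched compactly for a grayscale image (the real algorithm works on color and adds Gaussian smoothing and a minimum-size merge pass, all omitted here): sort grid edges by intensity difference and merge components when the connecting edge is light relative to each component's internal variation plus a k/|C| tolerance.

```python
# Simplified graph-based over-segmentation in the style of
# [Felzenszwalb and Huttenlocher 2004]; a sketch, not their implementation.
def felzenszwalb_segment(img, k=50.0):
    h, w = len(img), len(img[0])
    n = h * w
    parent = list(range(n))
    size = [1] * n
    internal = [0.0] * n          # max edge weight inside each component

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    # Build 4-connected grid edges weighted by intensity difference.
    edges = []
    for y in range(h):
        for x in range(w):
            i = y * w + x
            if x + 1 < w:
                edges.append((abs(img[y][x] - img[y][x + 1]), i, i + 1))
            if y + 1 < h:
                edges.append((abs(img[y][x] - img[y + 1][x]), i, i + w))
    edges.sort()

    for wgt, a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            continue
        # Merge if the edge is no heavier than either component's
        # internal variation plus its size-dependent slack k/|C|.
        if wgt <= min(internal[ra] + k / size[ra], internal[rb] + k / size[rb]):
            parent[rb] = ra
            size[ra] += size[rb]
            internal[ra] = max(internal[ra], internal[rb], wgt)

    return [[find(y * w + x) for x in range(w)] for y in range(h)]
```

Larger k yields larger (coarser) superpixels.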

Labeling the Image Forming Constellations We group superpixels that are likely to share a common geometric label into constellations. To form constellations, we initialize by assigning one randomly selected superpixel to each of Nc constellations. We then iteratively assign each remaining superpixel to the constellation most likely to share its label, maximizing the average pairwise log-likelihood with the other superpixels in the constellation:
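The equation on this slide did not survive transcription; a plausible reconstruction, using the notation defined on the next slide (Nc constellations, constellation Ck with nk superpixels, labels y and features z), is:

```latex
C(s_i) \;=\; \operatorname*{arg\,max}_{k \in \{1, \dots, N_c\}}
\; \frac{1}{n_k} \sum_{s_j \in C_k} \log P\!\left(y_i = y_j \mid z_i, z_j\right)
```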

Labeling the Image Forming Constellations Nc: the number of constellations. nk: the number of superpixels in constellation Ck. Y: the label of a superpixel. Z: its features.
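Using this notation, the greedy grouping step can be sketched as follows; the pairwise log-likelihood matrix is assumed to be given, and the function name is an illustration, not the paper's code:

```python
# Toy sketch of constellation formation: loglik[i][j] = log P(y_i = y_j | z_i, z_j),
# seeds are Nc randomly chosen superpixel indices. Each remaining superpixel
# joins the constellation with the highest average pairwise log-likelihood.
def form_constellations(loglik, seeds):
    n = len(loglik)
    groups = [[s] for s in seeds]
    seeded = set(seeds)
    for i in range(n):
        if i in seeded:
            continue
        best = max(range(len(groups)),
                   key=lambda k: sum(loglik[i][j] for j in groups[k]) / len(groups[k]))
        groups[best].append(i)
    return groups
```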

Labeling the Image Geometric Classification For each constellation, we estimate: 1. Label likelihood: the confidence in each geometric label. 2. Homogeneity likelihood: whether all superpixels in the constellation have the same label.

Labeling the Image Geometric Classification Next, we estimate the likelihood of a superpixel label by marginalizing over the constellation likelihoods: Si: the i-th superpixel. Yi: the label of Si.
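The marginalization equation itself did not survive transcription. One plausible form, combining the two likelihoods from the previous slide over the constellations that contain the superpixel (a reconstruction, not verbatim from the slide), is:

```latex
P(y_i = v) \;\approx\; \sum_{k \,:\, s_i \in C_k}
P\!\left(y_{C_k} = v \mid C_k\right)\,
P\!\left(C_k \text{ homogeneous} \mid C_k\right)
```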

Training

Training Data The likelihood functions used to group superpixels and label constellations are learned from training images. Each training image is over-segmented into superpixels, and each superpixel is given a ground-truth label according to its geometric class.

Training

Superpixel Same-Label Likelihoods To learn the likelihood that two superpixels have the same label, we sample 2500 same-label and different-label pairs of superpixels from our training data. The pairwise likelihood function is estimated using the logistic regression version of AdaBoost [Collins 2002] with weak learners based on eight-node decision trees [Friedman 2000].
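To make the boosting setup concrete, here is a minimal stand-in sketch: discrete AdaBoost with decision stumps instead of the slide's logistic-regression variant with eight-node trees (both of those substitutions are simplifications for illustration; this is not the paper's learner):

```python
# Minimal discrete AdaBoost with decision-stump weak learners.
import math

def train_adaboost(X, y, rounds=5):
    """X: list of feature vectors; y: labels in {-1, +1}."""
    n, d = len(X), len(X[0])
    w = [1.0 / n] * n
    model = []                          # (feature, threshold, polarity, alpha)
    for _ in range(rounds):
        best = None
        for f in range(d):              # exhaustively pick the best stump
            for t in sorted({x[f] for x in X}):
                for pol in (1, -1):
                    err = sum(wi for xi, yi, wi in zip(X, y, w)
                              if (pol if xi[f] > t else -pol) != yi)
                    if best is None or err < best[0]:
                        best = (err, f, t, pol)
        err, f, t, pol = best
        err = max(err, 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((f, t, pol, alpha))
        # Reweight: misclassified examples gain weight.
        w = [wi * math.exp(-alpha * yi * (pol if xi[f] > t else -pol))
             for xi, yi, wi in zip(X, y, w)]
        z = sum(w)
        w = [wi / z for wi in w]
    return model

def predict(model, x):
    score = sum(a * (p if x[f] > t else -p) for f, t, p, a in model)
    return 1 if score > 0 else -1
```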

Training z1, z2: the features of a pair of superpixels. y1, y2: the labels of the superpixels. nf: the number of features. The likelihood function is obtained using kernel density estimation [Duda 2000] over the m-th weighted distribution.

Training Constellation Label and Homogeneity Likelihoods To learn the label and homogeneity likelihoods, we form multiple sets of constellations for the superpixels in our training images using the learned pairwise function. Each constellation is then labeled as ground, vertical, sky, or mixed, according to the ground truth. Each decision-tree weak learner selects the best features to use and estimates the confidence in each label based on those features.

Training Constellation Label and Homogeneity Likelihoods The boosted decision-tree estimator outputs a confidence for each of ground, vertical, sky, and mixed, normalized to sum to 1. The product of the label and homogeneity likelihoods for a particular geometric label is then given by the normalized confidence in ground, vertical, sky, and mixed.

Creating the 3D Model

Cutting and Folding Our model = ground plane + planar objects. We need to partition the vertical regions into a set of objects and determine where each object meets the ground. Preprocessing: set any superpixels that are labeled ground or sky and are completely surrounded by non-ground or non-sky pixels, respectively, to the most common label of the neighboring superpixels.

Creating the 3D Model Cutting and Folding 1. We divide the vertically labeled pixels into disconnected or loosely connected regions using a connected-components algorithm. 2. For each region, we fit a set of line segments to the region's boundary with the labeled ground using the Hough transform [Duda 1972]. 3. Next, within each region, we form the disjoint line segments into a set of polylines.
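Step 1 above can be sketched with a simple BFS flood fill over the binary "vertical" mask (the paper also merges loosely connected regions, a refinement omitted here; the function name is illustrative):

```python
# Split a binary mask into 4-connected regions via BFS flood fill.
from collections import deque

def connected_components(mask):
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]   # 0 = background / unvisited
    count = 0
    for sy in range(h):
        for sx in range(w):
            if mask[sy][sx] and labels[sy][sx] == 0:
                count += 1
                labels[sy][sx] = count
                q = deque([(sx, sy)])
                while q:
                    x, y = q.popleft()
                    for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                        if 0 <= nx < w and 0 <= ny < h and mask[ny][nx] \
                                and labels[ny][nx] == 0:
                            labels[ny][nx] = count
                            q.append((nx, ny))
    return labels, count
```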

Creating the 3D Model Cutting and Folding We treat each polyline as a separate object, modeled with a set of connected planes that are perpendicular to the ground plane.

Creating the 3D Model Camera Parameters To obtain true 3D world coordinates, we would need to know the two sets of camera parameters: 1. Intrinsic 2. Extrinsic. vfx/05spring/lectures/scribe/07scribe.pdf
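For reference (standard pinhole-camera background, not specific to this paper), the intrinsic matrix K with focal length f and principal point (c_x, c_y), together with the extrinsics [R | t], maps world points to image pixels:

```latex
K = \begin{bmatrix} f & 0 & c_x \\ 0 & f & c_y \\ 0 & 0 & 1 \end{bmatrix},
\qquad
\lambda \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= K \,\bigl[\, R \mid t \,\bigr]
\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}
```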

Failure Cases

Failure

Examples of failure causes: 1. Labeling error 2. Polyline fitting error 3. Modeling assumptions 4. Occlusion in the image 5. Poor estimation of the horizon position

Result

Conclusion

Future work 1. Use an interactive segmentation technique [Li 2004, Lazy Snapping]. 2. Estimate the orientation of vertical regions from the image data, allowing a more robust polyline fit. 3. Extend the system to indoor scenes.

Thanks for listening!