Presentation on theme: "3D Shape Representation Tianqiang 04/01/2014. Image/video understanding Content creation Why do we need 3D shapes?"— Presentation transcript:

1 3D Shape Representation Tianqiang 04/01/2014

2 Image/video understanding Content creation Why do we need 3D shapes?

3 Image/video understanding [Chen et al. 2010] Why do we need 3D shapes?

4 Image/video understanding [Xiang et al. 2014 (ECCV)] [Chen et al. 2010] Why do we need 3D shapes?

5 Image/video understanding Need a 3D shape prior! Why do we need 3D shapes?

6 Image/video understanding Content creation [Yumer et al. 2010][Kalogerakis et al. 2012] Why do we need 3D shapes?

7 Content creation Why do we need 3D shapes? Need a 3D shape prior, again!

8 Papers to be covered in this lecture Recovering 3D from single images 3D Reconstruction from a Single Image [Rother et al. 08] Shared Shape Spaces [Prisacariu et al. 11] A Hypothesize and Bound Algorithm for Simultaneous Object Classification, Pose Estimation and 3D Reconstruction from a Single 2D Image [Rother et al. 11] Beyond PASCAL: A Benchmark for 3D Object Detection in the Wild [Xiang et al. 14 (WACV)] Inferring 3D Shapes and Deformations from Single Views [Chen et al. 10]

9 Papers to be covered in this lecture Tracking Simultaneous Monocular 2D Segmentation, 3D Pose Recovery and 3D Reconstruction [Prisacariu et al. 12] Monocular Multiview Object Tracking with 3D Aspect Parts [Xiang et al. 14 (ECCV)]

10 Papers to be covered in this lecture Shape synthesis A Probabilistic Model for Component-based Shape Synthesis [Kalogerakis et al. 12] Co-constrained Handles for Deformation in Shape Collections [Yumer et al. 14]

11 Recovering 3D from single images Input: a single image

12 Recovering 3D from single images Output: 3D shape, and pose or viewpoint

13 Outline A traditional generative model Shared shape space A benchmark for 3D object detection

14 Outline A traditional generative model Shared shape space A benchmark for 3D object detection

15 A Traditional Model : 3D shape (output) : Viewpoint (output) : Image (input)

16 : 3D shape (output) : Viewpoint (output) : Image (input) Projection likelihood Shape prior Viewpoint prior A Traditional Model

17 Shape prior Viewpoint prior A Traditional Model [Rother et al. 2008] [Xiang et al. 2014] [Chen et al. 2010] Projection likelihood

18 Rother et al. 2008 - Overview Voxel representation for 3D shape Viewpoint is known or enumerated Uses an optics law (the Beer-Lambert law) to model projection

19 Rother et al. 2008 - Overview

20 Voxel occupancies Rother et al. 2008 – 3D Shape Prior (2D is just for illustration)

21 Rother et al. 2008 – Projection likelihood (Assuming the viewpoint is known)

22 Rother et al. 2008 – Projection likelihood Two issues: Effect is unrelated to travel distance in each voxel Dependent on the resolution of grids

23 Rother et al. 2008 – Projection likelihood Beer-Lambert law:
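The Beer-Lambert law ties a ray's attenuation to the occupancy it passes through, weighted by the distance travelled inside each voxel, which addresses both issues from the previous slide. A minimal sketch (my notation, not the paper's exact formulation):

```python
import numpy as np

def transmittance(occupancies, distances, absorption=1.0):
    """Beer-Lambert: I/I0 = exp(-k * sum_i rho_i * d_i).

    occupancies: per-voxel occupancy rho_i in [0, 1] along the ray
    distances:   travel distance d_i of the ray inside each voxel
    """
    optical_depth = absorption * np.sum(occupancies * distances)
    return np.exp(-optical_depth)

# A ray crossing three voxels: the contribution of each voxel scales
# with the distance travelled inside it, so the result no longer
# depends on the grid resolution.
rho = np.array([0.0, 0.9, 0.5])
d   = np.array([1.0, 0.2, 1.0])
print(transmittance(rho, d))  # exp(-(0.9*0.2 + 0.5*1.0)) ≈ 0.507
```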

24 Rother et al. 2008 – Optimization Possible solution: belief propagation But what if the graph is loopy?

25 Rother et al. 2008 – Optimization Possible solution: belief propagation But what if the graph is loopy? Down-sample the image!

26 Rother et al. 2008 – Experiments Learning the shape prior: separately for each subclass, e.g. right leg in front, left leg up, … (Figure: priors learned separately vs. together, at viewpoints 0°, 60°, 120°, 180°)

27 Rother et al. 2008 – Experiments

28 Chen et al. 2010 - Overview Modeling intrinsic shape (phenotype) and pose variations Mesh representation for 3D shape. Only leveraging silhouettes in images

29 Chen et al. 2010 - Overview Pose Shape Viewpoint Projection Silhouette Latent variables Intrinsic shape

30 Pose Shape Viewpoint Projection Silhouette Intrinsic shape Chen et al. 2010 - Overview Latent variables

31 Gaussian Process Latent Variable Models Chen et al. 2010 – 3D Shape Prior

32 Gaussian Process Latent Variable Model Pose

33 Gaussian Process Latent Variable Model Intrinsic shape (phenotype)
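A GPLVM places a GP prior on the mapping from a low-dimensional latent space to the high-dimensional shape (or pose) vectors, and the latent positions are found by maximizing the marginal likelihood. A toy sketch of that objective, assuming an RBF kernel (illustrative, not Chen et al.'s exact model):

```python
import numpy as np

def rbf_kernel(X, lengthscale=1.0, variance=1.0, noise=1e-3):
    # squared-exponential kernel over latent positions, plus jitter
    d2 = np.sum((X[:, None, :] - X[None, :, :])**2, axis=-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2) + noise * np.eye(len(X))

def gplvm_log_likelihood(X, Y):
    """log p(Y | X) for a GPLVM: each output dimension of Y is an
    independent zero-mean GP over the latent positions X."""
    N, D = Y.shape
    K = rbf_kernel(X)
    Kinv = np.linalg.inv(K)
    _, logdet = np.linalg.slogdet(K)
    # sum over the D output dims of the GP marginal log-likelihood
    return -0.5 * (D * (N * np.log(2 * np.pi) + logdet)
                   + np.trace(Y @ Y.T @ Kinv))

# toy: 5 shapes (each a 12-dim vertex vector) embedded in a 2-D latent space
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))   # latent positions (optimized in practice)
Y = rng.normal(size=(5, 12))  # mean-centred shape vectors
print(gplvm_log_likelihood(X, Y))
```

In practice X (and the kernel hyperparameters) are optimized with gradients of this objective; here they are just random for illustration.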

34 Shape transfer

35 : projected silhouettes : samples on 3D mesh : camera parameters : input silhouettes Chen et al. 2010 – Projection

36 Projecting → Matching Chen et al. 2010 – Projection

37 Projecting: : projection : translation vector Chen et al. 2010 – Projection

38 : a similarity function : normalizer Matching: Chen et al. 2010 – Projection

39 Chen et al. 2010 – Optimization Approximation Adaptive-scale line search Multiple initialization

40 Chen et al. 2010 – Experiments

41 Outline A traditional generative model Shared shape space A benchmark for 3D object detection

42 Shared Shape Space Silhouette 3D shape Latent variable

43 Shared Shape Space Multimodality

44 Silhouette 3D shape Latent variable Cannot model multimodality Shared Shape Space

45 Silhouette 3D shape Latent variables Can model multimodality Similarity distribution Shared Shape Space

46

47 A little more detail: Hierarchy of latent manifolds Representation of 2D and 3D shapes

48 Hierarchy of Manifolds 10D 4D 2D

49 2D/3D Discrete Cosine Transforms Advantages: Fewer dimensions than directly representing the images.

50 2D/3D Discrete Cosine Transforms Image: 640×480 = 307,200 dims; DCT: 35×35 = 1,225 dims
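The reduction works because silhouettes are smooth, so most of their energy sits in the low-frequency DCT coefficients. A self-contained sketch (hand-rolled orthonormal DCT-II rather than a library call; the toy silhouette is made up):

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix (rows are cosine basis vectors)."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    C = np.cos(np.pi * (m + 0.5) * k / n) * np.sqrt(2.0 / n)
    C[0] /= np.sqrt(2.0)
    return C

def dct_compress(img, keep):
    """Keep only the top-left keep×keep DCT coefficients, then invert."""
    Cr = dct_matrix(img.shape[0])
    Cc = dct_matrix(img.shape[1])
    coeffs = Cr @ img @ Cc.T          # 2-D DCT
    coeffs[keep:, :] = 0              # zero out high frequencies
    coeffs[:, keep:] = 0
    return Cr.T @ coeffs @ Cc         # inverse 2-D DCT

# a smooth toy "silhouette" is reconstructed well from few coefficients
x, y = np.meshgrid(np.linspace(0, 1, 64), np.linspace(0, 1, 48))
img = ((x - 0.5)**2 + (y - 0.5)**2 < 0.1).astype(float)
rec = dct_compress(img, keep=35)
print(np.abs(img - rec).mean())
```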

51 Shared Shape Space - Inference 10D 4D 2D Input → Output

52 Shared Shape Space - Inference Step 1: find a latent-space point which produces a Y that best segments the image

53 Shared Shape Space - Inference Step 2: obtain a similarity function

54 Shared Shape Space - Inference Step 3: pick the peaks in the similarity function, which are latent points in the space

55 Shared Shape Space - Inference Step 4: take these points as initializations, and search for the latent points that best segment the image.

56 Shared Shape Space - Inference Step 5: obtain latent points in the upper layer and continue.

57 Shared Shape Space - Inference Step 5: obtain latent points in the upper layer and continue.

58 Shared Shape Space - Inference Step 5: obtain latent points in the upper layer and continue. Please refer to the paper to see how the latent points that best segment the image are found in each step! (This also uses a GPLVM.)
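In miniature, steps 1-4 amount to "score a coarse set of candidates, keep several peaks, refine each locally". A generic sketch with a made-up 1-D score function, standing in for the paper's GPLVM-based search:

```python
import numpy as np

def refine(score, x0, step=0.05, iters=50):
    """Simple local hill-climb, playing the role of 'search from each peak'."""
    x = x0
    for _ in range(iters):
        x = max([x - step, x, x + step], key=score)
    return x

def peaks_then_refine(score, grid, k=3):
    """Score a coarse grid, keep the k best peaks, refine each locally,
    and return the overall best point."""
    vals = np.array([score(g) for g in grid])
    seeds = grid[np.argsort(vals)[-k:]]
    refined = [refine(score, s) for s in seeds]
    return max(refined, key=score)

# multimodal toy score with its global optimum at x = 2.0: multiple
# initializations keep the search from settling on the weaker mode
score = lambda x: np.exp(-(x - 2.0)**2) + 0.8 * np.exp(-(x + 1.0)**2)
best = peaks_then_refine(score, np.linspace(-4, 4, 17))
print(best)  # close to 2.0
```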

59 Shared Shape Space - Results Tracking

60 Shared Shape Space - Results Shape from single images

61 Shared Shape Space - Results Gaze tracking

62 Outline A traditional generative model Shared shape space A benchmark for 3D object detection

63 A Benchmark: Beyond PASCAL Benchmark for what?

64 A Benchmark: Beyond PASCAL Advantage over previous benchmarks Background clutter Occluded/truncated objects Large number of classes and intra-class variation Large number of viewpoints

65 Advantages Background clutter Occluded/truncated objects Large number of classes and intra-class variation Large number of viewpoints

66 Background clutter Occluded/truncated objects Large number of classes and intra-class variation Large number of viewpoints Advantages

67 Background clutter Occluded/truncated objects Large number of classes and intra-class variation Large number of viewpoints

68 Advantages Background clutter Occluded/truncated objects Large number of classes and intra-class variation Large number of viewpoints

69 Statistics of azimuth distribution A Benchmark: Beyond PASCAL

70 Statistics of occlusion distribution

71 A Benchmark: Beyond PASCAL Statistics of occlusion distribution

72 A Benchmark: Beyond PASCAL

73 Papers to be covered in this lecture Tracking Simultaneous Monocular 2D Segmentation, 3D Pose Recovery and 3D Reconstruction [Prisacariu et al. 12] Monocular Multiview Object Tracking with 3D Aspect Parts [Xiang et al. 14 (ECCV)]

74 Tracking Input: a sequence of frames Output: segmentation, pose, 3D reconstruction per frame

75 Outline
Paper | Shape representation | Motion prior | Shape consistency
Prisacariu et al. 12 | DCT (same as Shared Shape Space) | None | Constrained to be the same 3D shape
Xiang et al. 14 | Assembly of 3D aspect parts | Gaussian priors |

76 Outline
Paper | Shape representation | Motion prior | Shape consistency
Prisacariu et al. 12 | DCT (same as Shared Shape Space) | None | Constrained to be the same 3D shape
Xiang et al. 14 | Assembly of 3D aspect parts | Gaussian priors |

77 Tracking with 3D Aspect Parts Frames3D aspect parts

78 Tracking with 3D Aspect Parts : positions of aspect parts : viewpoints : image To maximize the following posterior distribution:

79 Tracking with 3D Aspect Parts : positions of aspect parts : viewpoints : image To maximize the following posterior distribution:

80 : position of an aspect part Likelihood term Assumption of independence:

81 Likelihood term Using classifiers in both category and instance level:

82 Likelihood term Descriptor of a rectified aspect part Extracting Haar-like features for instance-level templates HOG
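Under the part-independence assumption, the likelihood is a product of per-part classifier responses, i.e. a sum in log space. A trivial sketch (the scores are made up; the paper's classifiers produce them from HOG and Haar-like features):

```python
import numpy as np

def log_likelihood(part_scores):
    """Independence across aspect parts: the image likelihood factorizes
    over parts, so per-part log-scores simply add up."""
    return float(np.sum(np.log(part_scores)))

# e.g. four visible aspect parts of a car, each scored by a classifier
print(log_likelihood([0.9, 0.8, 0.95, 0.7]))
```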

83 Tracking with 3D Aspect Parts : positions of aspect parts : viewpoints : image To maximize the following posterior distribution:

84 Tracking with 3D Aspect Parts : positions of aspect parts : viewpoints : image To maximize the following posterior distribution:

85 Motion prior term

86 Assumption of independence: Motion prior term

87 Pairwise term: Motion prior term
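The pairwise term can be read as a Gaussian penalty on frame-to-frame pose change. A hedged 1-D sketch (azimuth in degrees, illustrative σ — not the paper's exact parameterization):

```python
import numpy as np

def log_motion_prior(poses, sigma=5.0):
    """Pairwise Gaussian smoothness prior: each frame's pose should stay
    close to the previous frame's.  Returns the unnormalized log-prior."""
    diffs = np.diff(np.asarray(poses, dtype=float))
    return -0.5 * np.sum(diffs**2) / sigma**2

smooth = [10.0, 12.0, 13.5, 15.0]
jumpy  = [10.0, 40.0, 5.0, 50.0]
print(log_motion_prior(smooth) > log_motion_prior(jumpy))  # True
```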

88 MCMC sampling Inference
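A generic Metropolis sampler illustrates how MCMC explores a posterior over continuous pose variables (the paper's proposal moves are more structured; the toy posterior below is made up):

```python
import numpy as np

def metropolis(log_post, x0, steps=2000, scale=0.5, seed=0):
    """Random-walk Metropolis: propose a Gaussian perturbation, accept
    with probability min(1, exp(log_post(prop) - log_post(x)))."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    lp = log_post(x)
    samples = []
    for _ in range(steps):
        prop = x + rng.normal(scale=scale, size=x.shape)
        lp_prop = log_post(prop)
        if np.log(rng.random()) < lp_prop - lp:
            x, lp = prop, lp_prop
        samples.append(x.copy())
    return np.array(samples)

# toy posterior: Gaussian centred on a "true" pose (3.0, -1.0)
log_post = lambda x: -0.5 * np.sum((x - np.array([3.0, -1.0]))**2)
chain = metropolis(log_post, x0=[0.0, 0.0])
print(chain[500:].mean(axis=0))  # near [3.0, -1.0] after burn-in
```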

89

90 Results - Tracking

91 Results – Viewpoint and 3D recovery

92 Papers to be covered in this lecture Shape synthesis A Probabilistic Model for Component-based Shape Synthesis [Kalogerakis et al. 12] Co-constrained Handles for Deformation in Shape Collections [Yumer et al. 14]

93 Papers to be covered in this lecture Shape synthesis A Probabilistic Model for Component-based Shape Synthesis [Kalogerakis et al. 12] Co-constrained Handles for Deformation in Shape Collections [Yumer et al. 14]

94 Probabilistic Model for Shape Synthesis (from Kalogerakis’ SIGGRAPH slides) Input Probabilistic model Output Learning stage Synthesis stage

95 Learning stage Input Probabilistic model Learning stage

96 R P(R)

97 R

98 P(R) Π_{l ∈ L} P(N_l | R)

99 P(R) Π_{l ∈ L} [ P(N_l | R) P(S_l | R) ]

100 P(R) Π_{l ∈ L} [ P(N_l | R) P(S_l | R) ]

101 P(R) Π_{l ∈ L} [ P(N_l | R) P(S_l | R) P(D_l | S_l) P(C_l | S_l) ]

102 P(R) Π_{l ∈ L} [ P(N_l | R) P(S_l | R) P(D_l | S_l) P(C_l | S_l) ] (D_l covers geometric features such as height and width)

103 P(R) Π_{l ∈ L} [ P(N_l | R) P(S_l | R) P(D_l | S_l) P(C_l | S_l) ] (R: latent object style; S_l: latent component style)

104 Learn from training data: latent styles, lateral edges, parameters of CPDs
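Plugging toy CPDs into the factorization above shows how a complete assignment is scored. All numbers are illustrative, and C_l is omitted for brevity:

```python
import numpy as np

# Toy CPDs for P(R) * prod_l P(N_l|R) P(S_l|R) P(D_l|S_l):
# R: latent object style; per label l: N_l number of components,
# S_l latent component style, D_l continuous geometric features.
P_R = {1: 0.6, 2: 0.4}
P_N = {(1, 1): 0.9, (1, 2): 0.1, (2, 1): 0.3, (2, 2): 0.7}        # P(N_l | R)
P_S = {(1, 'a'): 0.8, (1, 'b'): 0.2, (2, 'a'): 0.25, (2, 'b'): 0.75}

def gauss(x, mu, var):
    return np.exp(-0.5 * (x - mu)**2 / var) / np.sqrt(2 * np.pi * var)

def joint(R, parts):
    """Score one instantiation; D_l | S_l is a 1-D Gaussian over,
    say, part height (made-up means/variances)."""
    mu_var = {'a': (1.0, 0.1), 'b': (2.0, 0.2)}                    # P(D_l | S_l)
    p = P_R[R]
    for (N_l, S_l, D_l) in parts:
        mu, var = mu_var[S_l]
        p *= P_N[(R, N_l)] * P_S[(R, S_l)] * gauss(D_l, mu, var)
    return p

# e.g. a shape with two labels (legs, back) in object style R = 1
print(joint(1, [(1, 'a', 1.05), (2, 'b', 1.9)]))
```

Synthesis then amounts to enumerating instantiations and keeping the high-probability ones, as the later slides describe.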

105 Learning stage Given observed data O, find structure G that maximizes:

106 Learning stage Given observed data O, find structure G that maximizes: Assuming uniform prior over structures, maximize marginal likelihood:

107 Learning stage Given observed data O, find structure G that maximizes: Assuming uniform prior over structures, maximize marginal likelihood: (Cheeseman-Stutz approximation)

108 Synthesis Stage (from Kalogerakis’ SIGGRAPH slides) Probabilistic model Output Synthesis stage

109 Synthesizing a set of components Enumerate high-probability instantiations of the model

110 Optimizing component placement: source shapes → unoptimized new shape → optimized new shape

111 Results Source shapes (colored parts are selected for the new shape) New shape

112 Results of alternative models: no latent variables

113 Results of alternative models: no part correlations

114 Papers to be covered in this lecture Shape synthesis A Probabilistic Model for Component-based Shape Synthesis [Kalogerakis et al. 12] Co-constrained Handles for Deformation in Shape Collections [Yumer et al. 14]

115 Co-Constrained Handles for Shape Deformation Input Constraints Output Learning stage Synthesis stage

116 Pipeline: Initial Segmentation → Initial Abstraction → Clustering → Co-constrained Abstraction → Handles + Constraints

117 Pipeline: Initial Segmentation → Initial Abstraction → Clustering → Co-constrained Abstraction → Handles + Constraints

118 Clustering Seed surface clustering Global Clustering

119 Clustering Seed surface clustering – Aligning semantically similar/identical segments Global Clustering – Creating larger clusters that join the surfaces from different segments – Based on geometric features of the surfaces, including PCA basis, normal and curvature signature of the surface.
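As a stand-in for the global clustering step, plain two-cluster k-means over made-up per-surface feature vectors (centers initialized at the most distant pair so the demo is deterministic; the paper's features and clustering are richer):

```python
import numpy as np

def kmeans2(X, iters=20):
    """Two-cluster k-means over surface feature vectors (e.g. derived
    from PCA basis, normals, curvature)."""
    d = ((X[:, None] - X[None])**2).sum(-1)
    i, j = np.unravel_index(np.argmax(d), d.shape)
    centers = X[[i, j]].astype(float)       # start at the farthest pair
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None])**2).sum(-1), axis=1)
        for c in range(2):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels

# toy features: three low-curvature (flat) vs. three high-curvature surfaces
feats = np.array([[0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
                  [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]])
labels = kmeans2(feats)
print(labels)  # first three surfaces share one cluster, last three the other
```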

120 Clustering Door handle Wheels Seed clustering Different clusters

121 Clustering Door handle Wheels Global clustering Same cluster

122 Pipeline: Initial Segmentation → Initial Abstraction → Clustering → Co-constrained Abstraction → Handles + Constraints

123 Co-constrained abstraction Planar, spherical, or cylindrical? Initial abstraction Co-abstraction

124 Identify constraints of DOFs Identify as constraints if the variance is below a certain threshold Translation Rotation Scaling
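The variance test is easy to state concretely: collect each deformation DOF across the shape collection and flag it as constrained when its variance is (nearly) zero. A minimal sketch with an illustrative threshold and made-up values:

```python
import numpy as np

def find_constraints(dof_values, threshold=1e-2):
    """Flag a DOF (translation/rotation/scaling component) as constrained
    when its variance across the collection falls below the threshold."""
    dof_values = np.asarray(dof_values, dtype=float)
    variances = dof_values.var(axis=0)
    return variances < threshold

# rows: shapes in the collection; columns: e.g. [x-scale, y-scale, rotation]
vals = np.array([[1.00, 1.5,  0.0],
                 [1.01, 2.3,  0.1],
                 [0.99, 0.9, -0.1]])
print(find_constraints(vals))  # [ True False  True]
```

Here x-scale and rotation barely vary across the collection, so they become constraints, while y-scale stays a free deformation handle.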

125 Pipeline: Initial Segmentation → Initial Abstraction → Clustering → Co-constrained Abstraction → Handles + Constraints

126 Correlation among deformation handles Estimate the conditional feature probability

127 Results

128 Summary 3D shape from images or video Using silhouettes, colors, or features (e.g. HOG) in images Using voxel occupancies, aspect parts, level sets, or meshes to represent 3D shapes Either a generative model, a shared shape space, or a discriminative model

129 Summary Shape synthesis A generative model based on semantic parts, which also models styles. Modeling correlations between geometric features.

