Manifold Learning in the Wild: A New Manifold Modeling and Learning Framework for Image Ensembles. Richard G. Baraniuk, Chinmay Hegde, Sriram Nagaraj, Aswin C. Sankaranarayanan. Rice University

Sensor Data Deluge

Concise Models. Efficient processing/compression requires a concise representation. Our interest in this talk: collections of images.

Concise Models. Our interest in this talk: collections of images parameterized by θ ∈ Θ – translations of an object: x-offset and y-offset – rotations of a 3D object: pitch, roll, yaw – wedgelets: orientation and offset. Such a collection is an image articulation manifold.

Image Articulation Manifold. N-pixel images I_θ ∈ R^N, with a K-dimensional articulation parameter space θ ∈ Θ. Then M = { I_θ : θ ∈ Θ } is a K-dimensional manifold in the ambient space R^N. Very concise model.
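
As a concrete illustration (not from the talk), the sketch below builds a tiny point-cloud IAM: N-pixel images of a blob translated over a 2D grid of offsets, so the ambient dimension is N while the intrinsic dimension is K = 2. The template and grid are arbitrary choices for the example.

```python
import numpy as np
from scipy.ndimage import shift

# Template: a 32x32 image containing a single Gaussian blob.
n = 32
yy, xx = np.mgrid[0:n, 0:n]
template = np.exp(-(((xx - 10) ** 2 + (yy - 10) ** 2) / 8.0))

# Articulation parameters: a grid of (dx, dy) translations (K = 2).
offsets = [(dx, dy) for dx in range(0, 12, 2) for dy in range(0, 12, 2)]

# Each image is one point in the N = 32*32 = 1024-dimensional ambient space.
X = np.stack([shift(template, (dy, dx), order=1).ravel() for dx, dy in offsets])
print(X.shape)   # (36, 1024): 36 samples of a 2-D manifold in R^1024
```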

Smooth IAMs. N-pixel images; low-dimensional articulation parameter space. Local isometry: image distance ≈ parameter-space distance. Linear tangent spaces are a close approximation locally.

Ex: Manifold Learning. LLE, ISOMAP, LE (Laplacian eigenmaps), HE (Hessian eigenmaps), Diff. Geo., … Example: K = 1 (rotation).
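
A minimal sketch of this kind of experiment (a toy example of mine, not the talk's data): build images of a rotating pattern and recover the 1-D articulation with scikit-learn's Isomap. The pattern, angle range, and neighborhood size are arbitrary assumptions.

```python
import numpy as np
from scipy.ndimage import rotate
from sklearn.manifold import Isomap

# K = 1 articulation: rotation angle over a limited arc (an open curve in R^N).
n = 32
img = np.zeros((n, n))
img[14:18, 4:28] = 1.0                       # a horizontal bar to rotate
angles = np.arange(0, 81, 2)                 # 0..80 degrees
X = np.stack([rotate(img, a, reshape=False, order=1).ravel() for a in angles])

# Isomap should recover an (approximately) 1-D ordering that tracks the angle.
emb = Isomap(n_neighbors=6, n_components=1).fit_transform(X)
corr = np.corrcoef(emb[:, 0], angles)[0, 1]
print(f"correlation between 1-D embedding and true angle: {abs(corr):.3f}")
```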

Ex: Manifold Learning. K = 2 (rotation and scale).

Theory/Practice Disconnect: Smoothness. Practical image manifolds are not smooth! If images have sharp edges, then the manifold is everywhere non-differentiable [Donoho and Grimes]. Tangent approximations?

Failure of Tangent Plane Approx. Ex: cross-fading when synthesizing/interpolating images that should lie on the manifold. (Figure: input image, geodesic path, linear path.)

Theory/Practice Disconnect: Isometry. Ex: translation manifold; all blue images are equidistant from the red image. Local isometry is satisfied only when sampling is dense.

Theory/Practice Disconnect: Nuisance articulations. Unsupervised data invariably has additional, undesired articulations – illumination – background clutter, occlusions, … The image ensemble is no longer low-dimensional.

Image representations. Conventional representation for an image: a vector of pixels. Inadequate! Remainder of the talk: TWO novel image representations that alleviate the theoretical/practical challenges in manifold learning on image ensembles.

Transport operators for image manifolds

The concept of transport operators: beyond a point-cloud model for image manifolds. An example follows.

Example: Translation. 2D translation manifold; the set of all transport operators = the set of 2D translations (shifts) of the image plane. Beyond a point-cloud model – the action of the articulation is more accurate and meaningful.

Optical Flow. Generalizing this idea: pixel correspondences. Idea: optical flow (OF) between two images is a natural and accurate transport operator. (Figures from Ce Liu's optical flow page: I1 and I2, and the OF from I1 to I2.)
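
To make the "flow as transport" idea concrete, here is a hedged sketch (my own, using OpenCV's Farneback flow rather than whichever solver the talk used): estimate the flow between two grayscale frames and warp one frame along it; a small residual means the flow transports one image onto the other. The file names are placeholders.

```python
import cv2
import numpy as np

# Two grayscale frames of the same scene (placeholder file names).
I1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)
I2 = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

# Dense optical flow from I2 to I1: flow[y, x] = (dx, dy) such that
# I2(y, x) ~ I1(y + dy, x + dx).
flow = cv2.calcOpticalFlowFarneback(I2, I1, None, 0.5, 3, 15, 3, 5, 1.2, 0)

# Transport I1 toward I2 by backward warping along the flow.
h, w = I1.shape
grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
map_x = (grid_x + flow[..., 0]).astype(np.float32)
map_y = (grid_y + flow[..., 1]).astype(np.float32)
I2_hat = cv2.remap(I1, map_x, map_y, cv2.INTER_LINEAR)

residual = np.mean((I2_hat.astype(float) - I2.astype(float)) ** 2)
print("mean squared transport residual:", residual)
```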

Optical Flow Transport. Consider a reference image I_θ0 and a K-dimensional articulation. Collect optical flows from I_θ0 to all images reachable by a K-dimensional articulation. For a large class of articulations, the collection of OFs is a smooth, K-dimensional manifold (even if the IAM is not smooth): the optical flow manifold (OFM) at I_θ0.
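
A small sketch of the "collect flows from a reference image" step (again a toy stand-in, assuming the frames are already loaded as a list of grayscale arrays): each flow field, flattened, becomes one point of the OFM at the reference image.

```python
import cv2
import numpy as np

def flow_from(ref, img):
    """Dense Farneback flow from the reference image to img (H x W x 2)."""
    return cv2.calcOpticalFlowFarneback(ref, img, None, 0.5, 3, 15, 3, 5, 1.2, 0)

def build_ofm(frames):
    """frames: list of H x W uint8 grayscale images; frames[0] is the reference I_theta0."""
    ref = frames[0]
    # Each flattened flow field is one point on the OFM at the reference image.
    return np.stack([flow_from(ref, f).ravel() for f in frames])

# The resulting OFM points can then be fed to standard manifold learning (e.g., Isomap).
```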

OFM is Smooth (Rotation). (Figure: articulation θ in degrees; pixel intensity I(θ) at 3 points; optical flow v(θ), which is nearly linear.)

Main results. Local model at each I_θ0: each point on the OFM defines a transport operator – each transport operator maps I_θ0 to one of its neighbors. For a large class of articulations, OFMs are smooth and locally isometric – traditional manifold processing techniques work on OFMs.

Linking it all together. (Diagram: IAM, OFM at I_θ0, articulations, nonlinear dimensionality reduction.) The non-differentiability does not disappear: it is embedded in the mapping from the OFM to the IAM. However, this is a known map.

The Story So Far… (Diagram: IAM with a tangent space at I_θ0 vs. IAM with the OFM at I_θ0, over the articulation space.)

(Figure: synthesis from an input image along a geodesic vs. a linear path, on the IAM and on the OFM.)

OFM Manifold Learning. Data: 196 images of two bears moving linearly and independently. Task: find a low-dimensional embedding. (Results: IAM vs. OFM embeddings.)

OFM ML + Parameter Estimation. Data: 196 images of a cup moving on a plane. Task 1: find a low-dimensional embedding. Task 2: parameter estimation for new images (tracing an "R"). (Results: IAM vs. OFM.)

Karcher Mean. The point on the manifold that minimizes the sum of squared geodesic distances to every other point. An important concept in nonlinear data modeling, compression, and shape analysis [Srivastava et al]. (Figure: 10 images from an IAM; ground-truth KM, OFM KM, linear KM.)
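
For readers who want the mechanics, below is a generic Karcher-mean iteration on the unit hypersphere (chosen only because its log/exp maps have closed forms; the talk computes the mean on image manifolds via optical flow, which is not shown here).

```python
import numpy as np

def log_map(p, q):
    """Log map on the unit sphere: tangent vector at p pointing toward q."""
    c = np.clip(np.dot(p, q), -1.0, 1.0)
    theta = np.arccos(c)
    if theta < 1e-12:
        return np.zeros_like(p)
    v = q - c * p
    return theta * v / np.linalg.norm(v)

def exp_map(p, v):
    """Exp map on the unit sphere: move from p along tangent vector v."""
    t = np.linalg.norm(v)
    if t < 1e-12:
        return p
    return np.cos(t) * p + np.sin(t) * v / t

def karcher_mean(points, iters=50):
    """Iteratively average log-map vectors and step along the exp map."""
    mu = points[0] / np.linalg.norm(points[0])
    for _ in range(iters):
        v = np.mean([log_map(mu, q) for q in points], axis=0)
        if np.linalg.norm(v) < 1e-10:
            break
        mu = exp_map(mu, v)
    return mu

# Example: mean of noisy unit vectors clustered around a common direction.
rng = np.random.default_rng(0)
base = np.array([1.0, 0.0, 0.0])
pts = [p / np.linalg.norm(p) for p in base + 0.2 * rng.standard_normal((20, 3))]
print(karcher_mean(pts))
```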

Sparse keypoint-based image representation

Image representations. Replace the vector of pixels with an abstract bag of features. Ex: SIFT (Scale-Invariant Feature Transform) selects keypoint locations in an image and computes a keypoint descriptor for each keypoint. Feature descriptors are local; it is very easy to make them robust to nuisance imaging parameters.
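
A minimal sketch of extracting such a bag of features with OpenCV's SIFT implementation (assuming an OpenCV build ≥ 4.4 where cv2.SIFT_create is available; the file name is a placeholder):

```python
import cv2

img = cv2.imread("image.png", cv2.IMREAD_GRAYSCALE)   # placeholder file name

sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)

# Keypoint locations (x, y) and their 128-dimensional descriptors.
locations = [kp.pt for kp in keypoints]
print(len(keypoints), "keypoints;", descriptors.shape, "descriptor array")
```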

Loss of Geometrical Info. Bag-of-features representations hide potentially useful image geometry. Goal: make salient image geometrical info more explicit for exploitation. (Diagram: image space vs. keypoint space.)

Key idea / Keypoint Kernel. The keypoint space can be endowed with a rich low-dimensional structure in many situations. Mechanism: define kernels between keypoint locations and between keypoint descriptors; the joint keypoint kernel between two images is built from these.

Keypoint Geometry. Theorem: under the metric induced by the kernel, certain ensembles of articulating images form smooth, isometric manifolds. In contrast, the conventional approach to image fusion via image articulation manifolds (IAMs) is fraught with non-differentiability (due to sharp image edges) – not smooth – not isometric.

Application: Manifold Learning. 2D translation. (Embeddings: IAM vs. KAM, the keypoint articulation manifold.)

Manifold Learning in the Wild Rice University’s Duncan Hall Lobby –158 images –360° panorama using handheld camera –Varying brightness, clutter

Manifold Learning in the Wild. Duncan Hall lobby: ground truth obtained using state-of-the-art structure-from-motion software. (Embeddings: ground truth, IAM, KAM.)

Internet-scale imagery. Notre Dame cathedral – 738 images – collected from Flickr – large variations in illumination (night/day/saturation), clutter (people, decorations), camera parameters (focal length, field of view, …) – non-uniform sampling of the space.

Organization k-nearest neighbors

Organization: "geodesics" – 3D rotation, "walk closer", "zoom out".

Summary. Need for novel image representations – transport operators: enable differentiability and meaningful transport – sparse features: robustness to outliers, nuisance articulations, etc.; learning in the wild on unsupervised imagery. The true power of manifold signal processing lies in fast algorithms that mainly use neighbor relationships – what are compelling applications where such methods can achieve state-of-the-art performance?

Summary. IAMs are a useful concise model for many image processing problems involving image collections and multiple sensors/viewpoints, but practical IAMs are non-differentiable – IAM-based algorithms have not lived up to their promise. Optical flow manifolds (OFMs) – smooth even when the IAM is not – OFM ~ nonlinear tangent space – support accurate image synthesis, learning, charting, … Barely discussed here: OF enables the safe extension of differential-geometry concepts – log/exp maps, Karcher mean, parallel transport, …

Summary. Bag-of-features representations are pervasive in image information fusion applications (ex: SIFT). Progress to date: – bags of features can contain rich geometrical structure – the keypoint distance is very efficient to compute (contrast with the combinatorial complexity of feature-correspondence algorithms) – the keypoint distance significantly outperforms the standard image Euclidean distance in learning and fusion applications – punch line: KAM preferred over IAM. Current/future work: – more exhaustive experiments with occlusion and clutter – studying the geometrical structure of feature representations of other kinds of non-image data (ex: documents).

Open Questions. Our treatment is specific to image manifolds under brightness constancy. What are the natural transport operators for other data manifolds? dsp.rice.edu

Related Work. Analytic transport operators – transport operators with group structure [Xiao and Rao 07] [Culpepper and Olshausen 09] [Miller and Younes 01] [Tuzel et al 08] – non-linear analytics [Dollar et al 06] – spatio-temporal manifolds [Li and Chellappa 10] – shape manifolds [Klassen et al 04]. The analytic approach is limited to a small class of standard image transformations (ex: affine transformations, Lie groups). In contrast, the OFM approach works reliably with real-world image samples (point clouds) and a broader class of transformations.

Limitations. Violations of brightness constancy – optical flow is no longer meaningful. Occlusion – undefined pixel flow in theory, arbitrary flow estimates in practice – heuristics to deal with it. Changing backgrounds, etc. – the transport-operator assumption is too strict – sparse correspondences?

Occlusion. Detect occlusion using forward-backward flow reasoning; remove occluded pixels from the computations. This is a heuristic: formal occlusion handling is hard. (Figure: occluded region.)
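
A sketch of the forward-backward consistency check described above (my own rendering with OpenCV's Farneback flow; the threshold is an arbitrary assumption): pixels whose forward flow is not undone by the backward flow are flagged as occluded.

```python
import cv2
import numpy as np

def occlusion_mask(I1, I2, tau=1.0):
    """Flag pixels of I1 whose forward flow is inconsistent with the backward flow."""
    fwd = cv2.calcOpticalFlowFarneback(I1, I2, None, 0.5, 3, 15, 3, 5, 1.2, 0)
    bwd = cv2.calcOpticalFlowFarneback(I2, I1, None, 0.5, 3, 15, 3, 5, 1.2, 0)

    h, w = I1.shape
    gx, gy = np.meshgrid(np.arange(w), np.arange(h))
    # Sample the backward flow at the forward-advected positions.
    map_x = (gx + fwd[..., 0]).astype(np.float32)
    map_y = (gy + fwd[..., 1]).astype(np.float32)
    bwd_at_fwd = cv2.remap(bwd, map_x, map_y, cv2.INTER_LINEAR)

    # For visible pixels, forward + backward flow should roughly cancel.
    err = np.linalg.norm(fwd + bwd_at_fwd, axis=2)
    return err > tau   # True where the pixel is (heuristically) occluded
```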

Open Questions. Theorem: random measurements stably embed a K-dimensional manifold with high probability [Baraniuk and Wakin, FOCM '08]. Q: Is there an analogous result for OFMs?

Tools for manifold processing. Geodesics, exponential maps, log maps, Riemannian metrics, Karcher means, … (Diagram: smooth differentiable manifolds; algebraic manifolds; data manifolds – LLE, kNN graphs, point-cloud model.)

Concise Models. Efficient processing/compression requires a concise representation. Sparsity of an individual image: pixels vs. large wavelet coefficients (blue = 0).
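
To illustrate the sparsity claim, here is a hedged sketch using PyWavelets on a synthetic piecewise-smooth image (the image, wavelet, and decomposition depth are arbitrary choices): most wavelet coefficients are negligible, so a small fraction carries nearly all of the energy.

```python
import numpy as np
import pywt

# A simple piecewise-smooth test image: a bright disk on a dark background.
n = 128
yy, xx = np.mgrid[0:n, 0:n]
img = ((xx - 64) ** 2 + (yy - 64) ** 2 < 30 ** 2).astype(float)

# 2-D wavelet decomposition, flattened into a single coefficient array.
coeffs = pywt.wavedec2(img, "db4", level=3)
arr, _ = pywt.coeffs_to_array(coeffs)

# Fraction of total energy captured by the largest 5% of coefficients.
mags = np.sort(np.abs(arr).ravel())[::-1]
k = int(0.05 * mags.size)
print("energy in top 5% of coefficients:", (mags[:k] ** 2).sum() / (mags ** 2).sum())
```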

History of Optical Flow. Dark ages (pre-1981) – special cases solved – linearized brightness constancy (LBC) gives an under-determined set of linear equations. Horn and Schunck (1981) – regularization term: smoothness prior on the flow. Brox et al (2005) – shows that linearization of brightness constancy (BC) is a bad assumption – develops an optimization framework to handle BC directly. Brox et al (2010), Black et al (2010), Liu et al (2010) – practical systems with reliable code.
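
For reference, the standard equations behind this history (textbook forms, not reproductions of the slide): brightness constancy, its linearization, and the Horn-Schunck regularized objective.

```latex
% Brightness constancy between frames t and t+1:
%   I(x + u, y + v, t + 1) = I(x, y, t)
% Linearized brightness constancy (LBC), one equation in two unknowns (u, v):
%   I_x u + I_y v + I_t \approx 0
% Horn--Schunck: add a smoothness prior on the flow field (u, v):
\min_{u,v} \int \left( I_x u + I_y v + I_t \right)^2
          + \alpha^2 \left( \|\nabla u\|^2 + \|\nabla v\|^2 \right) \, dx \, dy
```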

Manifold Learning. ISOMAP embedding error for the OFM and the IAM (2D rotations; reference image shown).

Embedding of the OFM (2D rotations; reference image shown).

OFM Implementation Details. (Reference image shown.)

Pairwise distances and embedding

Flow Embedding

OFM Synthesis

Manifold Charting. Goal: build a generative model for an entire IAM/OFM based on a small number of base images. eLCheapo™ algorithm: – choose a reference image randomly – find all images that can be generated from this image by OF – compute the Karcher (geodesic) mean of these images – compute the OF from the Karcher-mean image to the other images – repeat on the remaining images until no images remain. Exact representation when there are no occlusions.
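
Below is a loose sketch of this charting loop (my own simplification, not the authors' code): reachability is decided by thresholding the warp residual of a Farneback flow, and for brevity the randomly chosen reference image itself stands in for the Karcher mean of its chart.

```python
import random
import cv2
import numpy as np

def warp_residual(ref, img):
    """Mean squared error after transporting `ref` onto `img` along optical flow."""
    flow = cv2.calcOpticalFlowFarneback(img, ref, None, 0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = ref.shape
    gx, gy = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (gx + flow[..., 0]).astype(np.float32)
    map_y = (gy + flow[..., 1]).astype(np.float32)
    warped = cv2.remap(ref, map_x, map_y, cv2.INTER_LINEAR)
    return np.mean((warped.astype(float) - img.astype(float)) ** 2)

def chart_manifold(frames, tau=50.0):
    """Greedy charting: pick a reference, claim every frame it can reach by OF, repeat."""
    remaining = list(range(len(frames)))
    charts = []
    while remaining:
        ref_idx = random.choice(remaining)
        reachable = [i for i in remaining
                     if warp_residual(frames[ref_idx], frames[i]) < tau]
        reachable = sorted(set(reachable) | {ref_idx})   # the reference covers itself
        charts.append((ref_idx, reachable))
        remaining = [i for i in remaining if i not in reachable]
    return charts   # list of (reference index, member indices)
```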

Goal: build a generative model for an entire IAM/OFM based on a small number of base images. Ex: a cube rotating about an axis – all cube images can be represented using 4 reference images + OFMs. Many applications – selection of target templates for classification – "next-view" selection for adaptive sensing applications.

SIFT Features. Features (including SIFT) are ubiquitous in fusion and processing apps (15k+ citations for the 2 SIFT papers): building 3D models, part-based object recognition, organizing internet-scale databases, image stitching. Figures courtesy of Rob Fergus (NYU), the Phototourism website, Antonio Torralba (MIT), and Wei Lu.

Many Possible Kernels. Euclidean kernel, Gaussian kernel, polynomial kernel, pyramid match kernel [Grauman et al. ’07], and many others.
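
For concreteness, here are the first three of these kernels written out (standard textbook forms, not reproductions of the slide's formulas):

```python
import numpy as np

def euclidean_kernel(x, y):
    """Linear (Euclidean inner-product) kernel."""
    return float(np.dot(x, y))

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel."""
    return float(np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2)))

def polynomial_kernel(x, y, degree=2, c=1.0):
    """Polynomial kernel."""
    return float((np.dot(x, y) + c) ** degree)
```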

Keypoint Kernel. The joint keypoint kernel between two images combines a kernel between keypoint locations with a kernel between keypoint descriptors, summed over pairs of keypoints from the two images. Using the Euclidean/Gaussian (E/G) combination yields an explicit formula for this kernel.
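
The sketch below spells out one plausible reading of this construction (a Gaussian kernel on keypoint locations paired with a Euclidean inner-product kernel on descriptors, summed over all keypoint pairs); the exact pairing and bandwidths used in the talk may differ. It also forms the kernel-induced distance used on the next slides.

```python
import numpy as np

def keypoint_kernel(locs1, desc1, locs2, desc2, sigma=20.0):
    """Sum over all keypoint pairs of location-kernel x descriptor-kernel.

    locs*: (n, 2) keypoint coordinates; desc*: (n, d) keypoint descriptors.
    Assumes a Gaussian kernel on locations and an inner product on descriptors.
    """
    # Pairwise squared distances between keypoint locations.
    d2 = np.sum((locs1[:, None, :] - locs2[None, :, :]) ** 2, axis=2)
    k_loc = np.exp(-d2 / (2.0 * sigma ** 2))
    k_desc = desc1 @ desc2.T
    return float(np.sum(k_loc * k_desc))

def keypoint_distance(kp1, kp2):
    """Metric induced by the kernel: d^2 = K(I1,I1) + K(I2,I2) - 2 K(I1,I2)."""
    k11 = keypoint_kernel(*kp1, *kp1)
    k22 = keypoint_kernel(*kp2, *kp2)
    k12 = keypoint_kernel(*kp1, *kp2)
    return float(np.sqrt(max(k11 + k22 - 2.0 * k12, 0.0)))

# kp1 = (np.array(locations1), descriptors1), etc., e.g. from the SIFT sketch above.
```

With SIFT descriptors one might L2-normalize the descriptors first; that is a design choice, not something specified by the talk.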

From Kernel to Metric. Theorem: the E/G keypoint kernel is a Mercer kernel – enables algorithms such as SVM. Theorem: the E/G keypoint kernel induces a metric on the space of images – an alternative to the conventional L2 distance between images – the keypoint metric is robust to nuisance imaging parameters, occlusion, clutter, etc.

Keypoint Geometry. Theorem: the E/G keypoint kernel induces a metric on the space of images. NOTE: the metric requires no permutation-based matching of keypoints (which has combinatorial complexity in general).

Keypoint Geometry. Theorem: under the metric induced by the kernel, certain ensembles of articulating images form smooth, isometric manifolds. The keypoint representation is compact, efficient, and … robust to illumination variations, non-stationary backgrounds, clutter, and occlusions.

Manifold Learning in the Wild. Viewing angle – 179 images. (Embeddings: IAM vs. KAM.)

Manifold Learning in the Wild. Rice University's Brochstein Pavilion – 400 outdoor images of a building – occlusions, movement in the foreground, varying background. (Embeddings: IAM vs. KAM.)