Multiple-View Geometry for Image-Based Modeling (Course 42)


Multiple-View Geometry for Image-Based Modeling (Course 42) Lecturers: Yi Ma (UIUC) Stefano Soatto (UCLA) Jana Kosecka (GMU) Rene Vidal (UC Berkeley) Yizhou Yu (UIUC)

COURSE LECTURE OUTLINE
A. Introduction (Ma)
B. Preliminaries: geometry & image formation (Ma)
C. Image primitives & correspondence (Soatto)
D. Two calibrated views (Kosecka)
E. Uncalibrated geometry and stratification (Soatto)
F. Multiple-view geometry & algebra (Vidal, Ma)
G. Reconstruction from scene knowledge (Ma)
H. Step-by-step building of a 3D model (Kosecka, Soatto)
I. Image-based texture mapping and rendering (Yu)

Multiple-View Geometry for Image-Based Modeling
Introduction (Lecture A)
Yi Ma
Perception & Decision Laboratory, Decision & Control Group, CSL
Image Formation & Processing Group, Beckman
Electrical & Computer Engineering Dept., UIUC
http://decision.csl.uiuc.edu/~yima

IMAGES AND GEOMETRY – A Little History of Perspective Imaging
Pinhole (perspective) imaging was known in most ancient civilizations.
Euclid, perspective projection, 4th century B.C., Alexandria
Pompeii frescoes, 1st century A.D.
Image courtesy of C. Taylor

IMAGES AND GEOMETRY – A Little History of Perspective Imaging
Filippo Brunelleschi, first Renaissance artist to paint with correct perspective, 1413
"Della Pittura", Leone Battista Alberti, 1435
Leonardo da Vinci, stereopsis, shading, color, 1500s
"The School of Athens", Raphael, 1518
Image courtesy of C. Taylor

IMAGES AND GEOMETRY – The Fundamental Problem Input: Corresponding “features” in multiple images. Output: Camera calibration, pose, scene structure, surface photometry. Jana’s apartment

IMAGES AND GEOMETRY – History of "Modern" Geometric Vision
Chasles, formulated the two-view seven-point problem, 1855
Hesse, solved the above problem, 1863
Kruppa, solved the two-view five-point problem, 1913
Longuet-Higgins, the two-view eight-point algorithm, 1981
Liu and Huang, the three-view trilinear constraints, 1986
Huang and Faugeras, SVD-based eight-point algorithm, 1989
Tomasi and Kanade, (orthographic) factorization method, 1992
Ma, Huang, Kosecka, Vidal, multiple-view rank condition, 2000

APPLICATIONS – 3-D Modeling and Rendering

APPLICATIONS – 3-D Modeling and Rendering Image courtesy of Paul Debevec

APPLICATIONS – Image Morphing, Mosaicing, Alignment Images of CSL, UIUC

APPLICATIONS – Real-Time Sports Coverage
First-down line and virtual advertising
Image courtesy of Princeton Video Image, Inc.

APPLICATIONS – Real-Time Virtual Object Insertion UCLA Vision Lab

APPLICATIONS – Autonomous Highway Vehicles Image courtesy of E.D. Dickmanns

APPLICATIONS – Unmanned Aerial Vehicles (UAVs)
Berkeley Aerial Robot (BEAR) Project
Rate: 10 Hz; accuracy: 5 cm, 4°

Multiple-View Geometry for Image-Based Modeling
Preliminaries: Imaging Geometry & Image Formation (Lecture B)
Yi Ma
Perception & Decision Laboratory, Decision & Control Group, CSL
Image Formation & Processing Group, Beckman
Electrical & Computer Engineering Dept., UIUC
http://decision.csl.uiuc.edu/~yima

Preliminaries: Imaging Geometry and Image Formation
INTRODUCTION
3D EUCLIDEAN SPACE & RIGID-BODY MOTION
- Coordinates and coordinate frames
- Rigid-body motion and homogeneous coordinates
GEOMETRIC MODELS OF IMAGE FORMATION
- Lens & Lambertian surfaces
- Pinhole camera model
CAMERA INTRINSIC PARAMETERS & RADIAL DISTORTION
- From space to pixel coordinates
- Notation: image, preimage, and coimage
- Radial distortion and correction
SUMMARY OF NOTATION

3D EUCLIDEAN SPACE – Cartesian Coordinate Frame
Coordinates of a point in space:
Standard basis vectors:
We will always use column vectors, except for one case mentioned later on. Given a three-dimensional vector u, we use u-hat to represent the 3×3 skew-symmetric matrix associated with it; the literature also uses u-cross for the same thing. With this notation, u-hat multiplying a vector v equals their cross product; in particular, u crossed with itself gives zero.

3D EUCLIDEAN SPACE – Vectors
A "free" vector is defined by a pair of points:
Coordinates of the vector:

3D EUCLIDEAN SPACE – Inner Product and Cross Product
Inner product between two vectors:
Cross product between two vectors:
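The hat notation described in the note above maps a vector u to a skew-symmetric matrix û so that û v = u × v. A minimal numpy sketch (the vector values are illustrative, not from the slides):

```python
import numpy as np

def hat(u):
    """Return the 3x3 skew-symmetric matrix u-hat with hat(u) @ v == u x v."""
    return np.array([[0.0,  -u[2],  u[1]],
                     [u[2],  0.0,  -u[0]],
                     [-u[1], u[0],  0.0]])

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

# u-hat multiplying v equals the cross product u x v.
assert np.allclose(hat(u) @ v, np.cross(u, v))
# u crossed with itself is zero.
assert np.allclose(hat(u) @ u, np.zeros(3))
# Inner product of the two column vectors.
assert np.isclose(u @ v, 32.0)
```

Writing cross products as matrix multiplications is what makes the later multiple-view constraints (e.g. the epipolar constraint) expressible in linear algebra.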

RIGID-BODY MOTION – Rotation
Rotation matrix:
Coordinates are related by:

RIGID-BODY MOTION – Rotation and Translation
Coordinates are related by:

RIGID-BODY MOTION – Homogeneous Coordinates
3D coordinates are related by:
Homogeneous coordinates of a vector:
Homogeneous coordinates are related by:
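Homogeneous coordinates turn the affine update p ↦ R p + T into a single 4×4 matrix multiplication. A small numpy sketch with an illustrative rotation and translation:

```python
import numpy as np

# Illustrative rigid-body motion: 90-degree rotation about z, then a translation.
theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
T = np.array([1.0, 0.0, 2.0])

# 4x4 homogeneous representation g = [[R, T], [0, 1]] of the motion (R, T).
g = np.eye(4)
g[:3, :3] = R
g[:3, 3] = T

p = np.array([1.0, 0.0, 0.0])   # 3-D coordinates of a point
X = np.append(p, 1.0)           # its homogeneous coordinates

# One matrix product replaces the affine map R @ p + T.
X_new = g @ X
assert np.allclose(X_new[:3], R @ p + T)
```

The payoff is that compositions of motions become matrix products, g = g2 @ g1, with no special handling of the translation part.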

IMAGE FORMATION – Lens, Light, and Surfaces
Image irradiance, surface radiance
BRDF, Lambertian surface, thin lens, small FOV

IMAGE FORMATION – Pinhole Camera Model
Pinhole; frontal pinhole

IMAGE FORMATION – Pinhole Camera Model
2D coordinates
Homogeneous coordinates
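In homogeneous coordinates the pinhole projection becomes linear: λx = Π X for a 3×4 projection matrix Π. A numpy sketch with an assumed focal length and point (all numeric values illustrative):

```python
import numpy as np

f = 0.05  # focal length (illustrative, in meters)
# Pinhole projection matrix in homogeneous coordinates.
Pi = np.array([[f,   0.0, 0.0, 0.0],
               [0.0, f,   0.0, 0.0],
               [0.0, 0.0, 1.0, 0.0]])

X = np.array([0.2, 0.1, 2.0, 1.0])   # homogeneous 3-D point with depth Z = 2
x_h = Pi @ X                          # equals lambda * x, with lambda = Z
x = x_h / x_h[2]                      # normalized image point (f*X/Z, f*Y/Z, 1)
```

Dividing out the last coordinate is exactly the perspective division x = f X/Z, y = f Y/Z of the 2-D coordinate form.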

CAMERA PARAMETERS – Pixel Coordinates
Linear transformation from calibrated coordinates to pixel coordinates

CAMERA PARAMETERS – Calibration Matrix and Camera Model
Ideal pinhole; pixel coordinates
Calibration matrix (intrinsic parameters)
Projection matrix
Camera model
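Composing the calibration matrix K with the standard projection Π₀ = [I | 0] and the camera pose g gives pixel coordinates, λx' = K Π₀ g X. A sketch with hypothetical intrinsics (an 800-pixel focal length, zero skew, principal point (320, 240)) and the camera frame aligned with the world frame:

```python
import numpy as np

# Hypothetical intrinsic parameters: square pixels, zero skew.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

Pi0 = np.hstack([np.eye(3), np.zeros((3, 1))])  # standard projection [I | 0]
g = np.eye(4)                                   # camera frame = world frame

X = np.array([0.1, -0.05, 2.0, 1.0])            # homogeneous 3-D point
x_h = K @ Pi0 @ g @ X                           # lambda * x' in pixel coordinates
x_pix = x_h / x_h[2]
```

For this point, u = 800 * 0.1 / 2 + 320 = 360 and v = 800 * (-0.05) / 2 + 240 = 220, so the overall 3×4 product K Π₀ g is the projection matrix of the camera model.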

CAMERA PARAMETERS – Radial Distortion
Nonlinear transformation along the radial direction
Distortion correction: make lines straight
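A common way to model the radial nonlinearity is a polynomial in the squared radius, x_d = (1 + k1 r² + k2 r⁴) x. The lecture does not fix a specific model, so the coefficients and the fixed-point inversion below are illustrative assumptions:

```python
def distort(x, y, k1, k2):
    """Apply an assumed polynomial radial distortion about the optical axis."""
    r2 = x * x + y * y
    s = 1.0 + k1 * r2 + k2 * r2 * r2
    return s * x, s * y

def undistort(xd, yd, k1, k2, iters=20):
    """Invert the distortion by fixed-point iteration; adequate for mild distortion."""
    x, y = xd, yd
    for _ in range(iters):
        r2 = x * x + y * y
        s = 1.0 + k1 * r2 + k2 * r2 * r2
        x, y = xd / s, yd / s
    return x, y

# Round trip with illustrative coefficients.
xd, yd = distort(0.3, 0.2, k1=0.1, k2=0.01)
xu, yu = undistort(xd, yd, k1=0.1, k2=0.01)
```

Applying the inverse map to every pixel is what "make lines straight" amounts to: straight scene lines, bent by the distortion, become straight again in the corrected image.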

IMAGE FORMATION – Image of a Point
Homogeneous coordinates of a 3-D point
Homogeneous coordinates of its 2-D image
Projection of a 3-D point to an image plane
Now let me quickly go through the basic mathematical model for a camera system. Here is the notation. We use a four-dimensional vector X for the homogeneous coordinates of a 3-D point p; its image on a pre-specified plane is described, also in homogeneous coordinates, as a three-dimensional vector x. If everything is normalized, then W and z can be chosen to be 1. We use a 3×4 matrix Π to denote the transformation from the world frame to the camera frame; R stands for rotation, T for translation. The image x and the world coordinates X of a point are then related through the equation λx = ΠX, where λ is a scale associated with the depth of the 3-D point relative to the camera center o. In general, Π can be any 3×4 matrix, because the camera may add an unknown linear transformation on the image plane, usually denoted by a 3×3 matrix A(t).

NOTATION – Image, Coimage, Preimage of a Point
Image of a 3-D point
Coimage of the point
Preimage of the point

NOTATION – Image, Coimage, Preimage of a Line
Coimage of a 3-D line
Preimage of the line
Image of the line

IMAGE FORMATION – Coimage of a Line
Homogeneous representation of a 3-D line
Homogeneous representation of its 2-D coimage
Projection of a 3-D line to an image plane
First let us talk about line features. To describe a line in 3-D, we specify a base point on the line and a vector indicating its direction. On the image plane we can use a three-dimensional vector l to describe the image of a line L. More specifically, if x is the image of a point on this line, its inner product with l is 0.
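Since every image point x on the line satisfies lᵀx = 0, the coimage l can be computed from any two image points on the line as their cross product. A numpy sketch with made-up points:

```python
import numpy as np

# Two image points (homogeneous, illustrative values) lying on the projected line.
x1 = np.array([0.1, 0.2, 1.0])
x2 = np.array([0.4, -0.1, 1.0])

# The coimage l is orthogonal to every image point on the line,
# so it is the cross product of any two of them.
l = np.cross(x1, x2)

# Incidence check: l^T x = 0 for points on the line.
assert np.isclose(l @ x1, 0.0)
assert np.isclose(l @ x2, 0.0)
```

Geometrically, l is the normal of the preimage plane spanned by the line and the camera center, which is why a single 3-vector suffices to represent the image of a line.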

IMAGE FORMATION – Multiple Images
"Preimages" are all incident at the corresponding features.
Now consider multiple images of a simple object, say a cube. All the constraints are incidence relations, all of the same nature. Is there any way to express all the constraints in a unified way? Yes, there is.

LIST OF REFERENCES
Chapters 2 & 3, An Invitation to 3-D Vision: From Images to Geometric Models, Ma, Soatto, Kosecka, Sastry, Springer-Verlag, 2003.
This is the outline of my talk. Basically, we are interested in the geometry of multiple images taken of a scene with multiple moving objects or non-rigid motions, the so-called dynamical scenes. This requires us to generalize existing multiple-view geometry, developed mostly for static scenes, to a dynamical scenario. We first introduce one way to model perspective projection of such a scene by embedding its dynamics into a higher-dimensional space. This allows us to address conceptual issues such as whether a full reconstruction of the scene structure and dynamics is possible, the so-called observability issue from a system-theoretic viewpoint. As we will see, in a multiple-view setting observability is not a critical issue, in the sense that in principle it is always possible to fully recover the scene from sufficiently many views, even when a rather rich class of dynamics is concerned. Then, as in classic multiple-view geometry, what matters is to identify all the intrinsic constraints among the images, such as the epipolar constraint, which potentially allow us to recover the structure and dynamics. In multiple-view geometry for static scenes these constraints boil down to multilinear constraints; however, they are difficult to generalize to the dynamical setting, because many intrinsic constraints that arise there are not linear, even if the scene dynamics themselves are. We therefore propose a different approach in this talk. Our previous work has shown that a more global characterization of the constraints among multiple images of a static scene is the so-called rank condition on a certain matrix. We will show that the same principle carries over to dynamical scenes, even when different types of geometric primitives are considered. Finally, we conclude by pointing out a few open directions and some of our current work on rank-related issues.