
1 Image and Geometry Yi Ma
Digital Signal Processing Seminar (2002)
Perception & Decision Laboratory, Decision & Control Group, CSL
Image Formation & Processing Group, Beckman
Electrical & Computer Engineering Dept., UIUC

2 IMAGES AND GEOMETRY: From 3D to 2D and then back to 3D
GEOMETRY FOR MULTIPLE IMAGES: Without knowledge of the scene
GEOMETRY FOR SINGLE IMAGES: With knowledge of the scene
GEOMETRY FOR MULTIPLE IMAGES: With knowledge of the scene
APPLICATIONS: Vision, graphics, robotics, and cognition, etc.
CONCLUSIONS: Open problems and future work

3 IMAGES AND GEOMETRY – A Little History of Perspective Imaging
Pinhole (perspective) imaging, known in most ancient civilizations.
Euclid, perspective projection, 4th century B.C., Alexandria.
Pompeii frescoes, 1st century A.D.
Image courtesy of C. Taylor

4 IMAGES AND GEOMETRY – A Little History of Perspective Imaging
Filippo Brunelleschi, the first Renaissance artist to paint with correct perspective, 1413.
"Della Pittura", Leone Battista Alberti, 1435.
Leonardo da Vinci: stereopsis, shading, color, 1500s.
"The School of Athens", Raphael, 1518.
Image courtesy of C. Taylor

5 IMAGES AND GEOMETRY – The Fundamental Problem
Input: corresponding "features" in multiple images.
Output: camera motion, camera calibration, object structure.
(Example: Jana's apartment.)
Other pictorial cues (texture, shading, contour, etc.) are ignored here.
Image courtesy of Jana Kosecka

6 IMAGES AND GEOMETRY – History of "Modern" Geometric Vision
Chasles formulated the two-view seven-point problem, 1855
Hesse solved the above problem, 1863
Kruppa solved the two-view five-point problem, 1913
Longuet-Higgins, the two-view eight-point algorithm, 1981
Liu and Huang, the three-view trilinear constraints, 1986
Huang and Faugeras, SVD-based eight-point algorithm, 1989
Triggs, the four-view quadrilinear constraints, 1995
Ma et al., the multiple-view rank condition, 2000

7 IMAGES AND GEOMETRY – An Uncanny Déjà Vu?
“The rise of projective geometry made such an overwhelming impression on the geometers of the first half of the nineteenth century that they tried to fit all geometric considerations into the projective scheme. ... The dictatorial regime of the projective idea in geometry was first successfully broken by the German astronomer and geometer Möbius, but the classical document of the democratic platform in geometry establishing the group of transformations as the ruling principle in any kind of geometry and yielding equal rights to independent consideration to each and any such group, is F. Klein's Erlangen program.” --- Hermann Weyl, The Classical Groups, 1952
Synonyms: Group = Symmetry

8 IMAGES AND GEOMETRY – Motivating Examples (Berkeley Campus)
Image courtesy of Paul Debevec

9 IMAGES AND GEOMETRY – Motivating Examples (CSL Building, UIUC)
Image courtesy of Kun Huang

10 IMAGES AND GEOMETRY – Are Multiple Views Necessary?
Observe that symmetry is ubiquitous in nature, in both the micro world and the macro world.

11 IMAGES AND GEOMETRY – Are Multiple Views Necessary?
In the man-made world, symmetry is even more dominant. A common phenomenon associated with symmetric objects is that, even from a single perspective image, we get a very accurate sense of their 3-D structure and pose.

12 GEOMETRY FOR MULTIPLE IMAGES – A Little Notation
The "hat" of a vector: we always use column vectors, except for one case mentioned later on. Given a three-dimensional vector u, u-hat denotes the 3x3 skew-symmetric matrix associated with it; in the literature this is also written as the cross-product matrix of u. With this notation, u-hat multiplying a vector v equals the cross product u x v; in particular, u-hat times u itself is zero.
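As a quick illustration, here is a minimal NumPy sketch of the hat operator just described; the test vectors below are arbitrary.

```python
import numpy as np

def hat(u):
    """3x3 skew-symmetric (cross-product) matrix of a 3-vector u."""
    return np.array([[0.0, -u[2], u[1]],
                     [u[2], 0.0, -u[0]],
                     [-u[1], u[0], 0.0]])

u = np.array([1.0, 2.0, 3.0])
v = np.array([-0.5, 4.0, 1.0])
assert np.allclose(hat(u) @ v, np.cross(u, v))   # u-hat times v equals u x v
assert np.allclose(hat(u) @ u, 0.0)              # u-hat times u is zero
```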

13 GEOMETRY FOR MULTIPLE IMAGES – Image of a Point
Homogeneous coordinates of a 3-D point; homogeneous coordinates of its 2-D image; projection of a 3-D point onto an image plane. Notation: a four-dimensional vector X gives the homogeneous coordinates of a 3-D point p; its image on a pre-specified plane is described, also in homogeneous coordinates, by a three-dimensional vector x. If everything is normalized, the last entries W and z can be chosen to be 1. A 3x4 matrix Pi denotes the transformation from the world frame to the camera frame, with R standing for rotation and T for translation. The image x and the world coordinates X of a point are then related by lambda x = Pi X, where lambda is a scale associated with the depth of the 3-D point relative to the camera center o. In general, Pi can be any 3x4 matrix, because the camera may apply an unknown linear transformation on the image plane, usually denoted by a 3x3 matrix A(t).
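A small NumPy sketch of this projection model follows. The rotation R, translation T, calibration matrix A, and the test point are assumed synthetic values, not data from the slides.

```python
import numpy as np

def project(X_world, R, T, A=np.eye(3)):
    """Project a homogeneous 3-D point X_world via lambda * x = A [R | T] X."""
    Pi = A @ np.hstack([R, T.reshape(3, 1)])   # the 3x4 projection matrix
    y = Pi @ X_world                           # un-normalized image point
    lam = y[2]                                 # scale related to the depth
    return y / lam, lam                        # homogeneous image [u, v, 1] and scale

R = np.eye(3)                                  # synthetic camera rotation
T = np.array([0.1, 0.0, 0.0])                  # synthetic camera translation
X = np.array([1.0, 2.0, 5.0, 1.0])             # homogeneous coordinates of a 3-D point
x, lam = project(X, R, T)
print(x, lam)                                  # [0.22 0.4 1.0] and depth scale 5.0
```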

14 GEOMETRY FOR MULTIPLE IMAGES – Image of a Line
Homogeneous representation of a 3-D line; homogeneous representation of its 2-D co-image; projection of a 3-D line onto an image plane. To describe a line in 3-D, we specify a base point on the line and a vector giving its direction. On the image plane, a three-dimensional vector l describes the (co-)image of a line L: if x is the image of any point on this line, its inner product with l is zero.
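As a small illustration (not from the slides), the co-image l can be obtained as the cross product of the homogeneous images of two points on the line, after which the incidence relation l^T x = 0 holds for every image point of the line:

```python
import numpy as np

x1 = np.array([0.2, 0.1, 1.0])    # image of one point on the 3-D line
x2 = np.array([-0.3, 0.4, 1.0])   # image of another point on the same line
l = np.cross(x1, x2)              # co-image of the line

# Every image point of the line satisfies the incidence relation l^T x = 0.
x_mid = 0.5 * (x1 + x2)
print(abs(l @ x_mid) < 1e-12)     # True
```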

15 GEOMETRY FOR MULTIPLE IMAGES – Incidence Relations
"Pre-images" are all incident at the corresponding features. Now consider multiple images of the simplest object, say a cube. All the constraints are incidence relations of the same nature. Is there a way to express all of them in a unified form? Yes, there is.

16 GEOMETRY FOR MULTIPLE IMAGES – Point vs. Line
Point features; line features. M encodes exactly the 3-D information missing in one image. A multiple-view matrix can likewise be derived for a line feature, call it M_l. Compared with the matrix for a point feature, now denoted M_p, both have rank 1, but M_l has four columns. The linear dependency between any two rows of M_l gives rise to the well-known trilinear constraints on image lines. The geometric interpretation of M_l is no longer a sphere but a circle: the circle parameterizes a family of parallel lines that give the same M_l. The radius of the circle is the distance of these lines from the center of the reference camera frame, and the normal of the circle is the direction of these lines.
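The rank-1 condition for a point feature is easy to verify numerically. The sketch below uses one form commonly given for the point multiple-view matrix, with row blocks [x̂_i R_i x_1, x̂_i T_i] for views i = 2, ..., m; all poses and the 3-D point are synthetic test values.

```python
import numpy as np

def hat(u):
    return np.array([[0, -u[2], u[1]], [u[2], 0, -u[0]], [-u[1], u[0], 0]], dtype=float)

def rot_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

X = np.array([0.5, -0.2, 4.0])                 # 3-D point in the reference (first) camera frame
x1, lam1 = X / X[2], X[2]                      # its image and depth in view 1

blocks = []
for i in range(1, 4):                          # three further views, synthetic poses
    R, T = rot_z(0.1 * i), np.array([0.3 * i, 0.1, 0.05 * i])
    Xi = R @ X + T
    xi = Xi / Xi[2]                            # image of the point in view i+1
    blocks.append(np.column_stack([hat(xi) @ R @ x1, hat(xi) @ T]))

M_p = np.vstack(blocks)                        # 3(m-1) x 2 multiple-view matrix
print(np.linalg.matrix_rank(M_p))              # 1
print(M_p @ np.array([lam1, 1.0]))             # ~0: (lambda1, 1) spans the kernel
```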

17 GEOMETRY FOR MULTIPLE IMAGES – Point vs. Line
Continuing with the constraints that the rank of M_l imposes on multiple images of a line feature: if rank(M_l) is 1, all the planes through each camera center and its image line intersect in a unique line in 3-D. If the rank is 0, we are in the only degenerate case: all the planes coincide, so the 3-D line is determined only up to a plane, on which all the camera centers must lie.

18 GEOMETRY FOR MULTIPLE IMAGES – The Multiple-view Matrix
Theorem [Universal Rank Condition] for images of a point on a line: consider multiple images of a point on a line. To express all the incidence constraints among these features and their images, formally define a matrix M as follows, where the blocks D_i and D_i^⊥ may be chosen freely: the D_i are the images of either the point or the line, and the D_i^⊥ are the co-images; the choices for D_i and D_j are independent when i is not equal to j. The interesting fact is that, no matter what is chosen, the rank of the resulting matrix must fall into one of two cases. The second case yields all the multilinear constraints, though only up to two or three views; among four views there remain some nonlinear constraints. A few examples follow to build intuition.

19 GEOMETRY FOR MULTIPLE IMAGES – Images of a Family of Lines
Each is an image of a (possibly different) line in 3-D: rank = 3, rank = 2, rank = 1. What is the essential meaning of the rank-2 case? Consider a family of lines in 3-D intersecting at one point p. In each view, choose the image of any line in the family and form a multiple-view matrix. In general this matrix has rank 2. We already know that if all the chosen images happen to correspond to the same 3-D line, the rank of M is 1. Here exact correspondence among the lines is not needed, yet non-trivial constraints among their images remain.

20 GEOMETRY FOR MULTIPLE IMAGES – Mixed Features
The rank condition allows multiple-view analysis to be carried out globally. For example, if a multiple-view matrix is chosen as follows, its rank can only be 1 or 2; each value corresponds to a generic configuration of the 3-D features involved and of the relative camera poses.

21 GEOMETRY FOR MULTIPLE IMAGES – Coplanar Features
Homogeneous representation of a 3-D plane. Having discussed points and lines, what about planes? Suppose the point or line feature belongs to some plane pi in 3-D. In general such a plane can be expressed by the following equation; we lump all the coefficients into a vector pi, where pi^1 collects the first three components and pi^2 is the offset d. The earlier rank condition must then account for this extra restriction. It turns out that all that is needed is to append one extra row to the multiple-view matrix, and the rank conditions remain exactly the same. Corollary [Coplanar Features]: the rank conditions on the extended multiple-view matrix remain exactly the same!
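Continuing the synthetic point example, here is a minimal sketch of the coplanar case. Assuming the plane is written in the reference camera frame as pi1^T X + pi2 = 0, the appended row [pi1^T x1, pi2] also annihilates the kernel (lambda1, 1), so the extended matrix stays rank 1.

```python
import numpy as np

def hat(u):
    return np.array([[0, -u[2], u[1]], [u[2], 0, -u[0]], [-u[1], u[0], 0]], dtype=float)

pi1, pi2 = np.array([0.0, 0.0, 1.0]), -4.0     # plane Z = 4 in the reference frame
X = np.array([0.5, -0.2, 4.0])                 # a point on that plane
x1, lam1 = X / X[2], X[2]

R, T = np.eye(3), np.array([0.3, 0.1, 0.05])   # a second view with a synthetic pose
Xi = R @ X + T
xi = Xi / Xi[2]

M = np.vstack([np.column_stack([hat(xi) @ R @ x1, hat(xi) @ T]),
               [[pi1 @ x1, pi2]]])             # extra "coplanarity" row
print(np.linalg.matrix_rank(M))                # still 1
print(M @ np.array([lam1, 1.0]))               # ~0
```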

22 GEOMETRY FOR MULTIPLE IMAGES – Pairwise Homography
Given that point and line features lie on a plane in 3-D space: in addition to the previous constraints, the extended matrix simultaneously gives the homography. For example, the multiple-view matrices for m images of a point or a line on such a plane are given by the following two matrices; by the theorem, both have rank 1. What the extra row yields is exactly the homography familiar from the computer vision literature, and in addition to the homography, the multiple-view matrix keeps all the other constraints.
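A minimal numerical sketch of the induced homography, using the standard planar-homography form H = R + (1/d) T N^T for a plane N^T X = d expressed in the reference frame; the pose and plane values below are assumed.

```python
import numpy as np

theta = 0.1
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0, 0, 1]])                      # synthetic rotation between the two views
T = np.array([0.3, 0.1, 0.05])                 # synthetic translation
N, d = np.array([0.0, 0.0, 1.0]), 4.0          # plane N^T X = d in the reference frame
H = R + np.outer(T, N) / d                     # homography induced by the plane

X = np.array([0.5, -0.2, 4.0])                 # a point on the plane
x1 = X / X[2]
X2 = R @ X + T
x2 = X2 / X2[2]

x2_from_H = H @ x1
x2_from_H = x2_from_H / x2_from_H[2]
print(np.allclose(x2, x2_from_H))              # True: x2 ~ H x1
```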

23 GEOMETRY FOR MULTIPLE IMAGES – Reconstruction Algorithms
Given m images of n (> 7) points. For the jth point: SVD. For the ith image: SVD. Iteration. Assume we have m images of n points, with n at least 8 (fewer are actually needed, but 8 allows initialization with the two-view linear algorithm). For each point there is a rank condition on its m images. If all the motions (R_i, T_i) are known, then from its m images we can find the kernel of the matrix and hence the depth (structure) of the point. Conversely, for each view the images of all the points share the same camera motion, so (R_i, T_i) can be deduced by SVD when the structure lambda_j is known. With perfect, noise-free data, one can pick two views, compute the structure lambda_j for all points from them, and then compute the motion for each image. In the presence of noise, iterate: initialize the structure, compute the motions, re-compute the structure, then the motions again, until convergence.
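Below is a minimal sketch of the two alternating steps just described, with assumed helper names; it is not the slides' exact algorithm. Given motions, each point's depth is the least-squares solution making its rank-1 multiple-view matrix annihilate (lambda1, 1); given depths, each view's pose comes from a DLT-style linear system solved by SVD, followed by projection onto SO(3).

```python
import numpy as np

def hat(u):
    return np.array([[0, -u[2], u[1]], [u[2], 0, -u[0]], [-u[1], u[0], 0]], dtype=float)

def depth_from_motions(x1, xs, Rs, Ts):
    """Least-squares depth lambda1 of one point: minimize ||a*lambda1 + b||,
    where a, b stack hat(x_i) R_i x1 and hat(x_i) T_i over the other views."""
    a = np.concatenate([hat(x) @ R @ x1 for x, R in zip(xs, Rs)])
    b = np.concatenate([hat(x) @ T for x, T in zip(xs, Ts)])
    return -(a @ b) / (a @ a)

def pose_from_structure(x1s, lam1s, xs):
    """Pose of one view: hat(x) (R * lam1 * x1 + T) = 0 is linear in [vec(R); T];
    solve by SVD (needs roughly 6+ points in general position)."""
    rows = []
    for x1, lam, x in zip(x1s, lam1s, xs):
        p = lam * np.asarray(x1)                   # 3-D point in the reference frame
        rows.append(np.hstack([np.kron(p, hat(x)), hat(x)]))
    _, _, Vt = np.linalg.svd(np.vstack(rows))
    v = Vt[-1]                                     # null direction, up to scale and sign
    R_raw = v[:9].reshape(3, 3, order='F')         # un-vec (column-stacked)
    U, S, Vt2 = np.linalg.svd(R_raw)
    sign = np.sign(np.linalg.det(U @ Vt2))
    R = sign * (U @ Vt2)                           # nearest rotation, sign fixed
    T = (sign * 3.0 / S.sum()) * v[9:]             # undo the unknown scale of v
    return R, T
```

In practice one would initialize with a two-view algorithm (e.g., the eight-point algorithm) and then alternate the two steps, depths and poses, until convergence, as the slide describes.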

24 GEOMETRY FOR MULTIPLE IMAGES – Reconstruction Algorithms
Experiment: reconstructing a calibration cube from 4 views of 23 points on two of its sides. Recovered angles: 90.84 and 89.82 degrees (nominally 90). Coplanarity of the points is almost preserved, the shape is correct, and orthogonality is good.

25 GEOMETRY FOR MULTIPLE IMAGES – Reconstruction Algorithms
Another example, an outdoor scene: there are some errors, but the basic shape is maintained. Note that more points is not always better. As another result in the proposal shows, selecting 15 of these 24 points sometimes gives an even better reconstruction, while in another experiment 10 of the 24 points fail completely. This raises the question of under what conditions the algorithm converges and gives reasonable results; we do not yet have a systematic answer and plan to explore it in future work.

26 GEOMETRY FOR SINGLE IMAGES – With Scene Knowledge
This observation leads to a few fundamental questions related to symmetry:
Why does an image of a symmetric object give away its structure?
Why does an image of a symmetric object give away its pose?
What else can we get from an image of a symmetric object?

27 GEOMETRY FOR SINGLE IMAGES – Symmetry related scene knowledge
The concept of symmetry is ubiquitous: it captures almost all of our notions about regularity of shapes in a scene. Symmetry captures almost all "regularities".

28 GEOMETRY FOR SINGLE IMAGES – Hidden Images from Rotation
Let us look at a few examples to see why symmetry encodes 3-D information about a structure in a single image, and why that information can be reliably retrieved. First consider a square board, which is obviously invariant under rotation by 90 degrees about its center. The rotation does not change the overall image, but it does change the correspondence between features in the image and features in 3-D. The fact that a proper permutation of features in a single image corresponds to an isometric transformation of the symmetric object in 3-D often encodes sufficient information about the structure and pose of the object. Every such permutation therefore generates a new image, which we call a hidden image, of the object seen from a different vantage point.

29 GEOMETRY FOR SINGLE IMAGES – Hidden Images from Reflection
Rotation is obviously not the only symmetry the board admits: we may also reflect it about the x and y axes or its diagonals. These transformations give rise to even more hidden images.

30 GEOMETRY FOR SINGLE IMAGES – Hidden Images from Translation
Besides rotation and reflection, a third type of isometric transformation is translation. A single image of a tiled floor contains tens or hundreds of "hidden images", which give overwhelming information about the orientation and location of the floor.

31 GEOMETRY FOR SINGLE IMAGES – Symmetric Structure
Definition. A set of 3-D features S is called a symmetric structure if there exists a non-trivial subgroup G of E(3) acting on it such that for every g in G, the map g: S → S is an (isometric) automorphism of S. We say the structure S has group symmetry G. G is isometric; G is discontinuous (acts discretely). Image of a symmetric object.

32 GEOMETRY FOR SINGLE IMAGES – Hidden Multiple Views

33 GEOMETRY FOR SINGLE IMAGES – Symmetric Rank Condition
Solving g0 from Lyapunov equations, with g'_i and g_i known:
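A minimal numerical sketch of this step, assuming the equations take the form R'_i R0 = R0 R_i on the rotational part of g'_i = g0 g_i g0^{-1}, with the object symmetries R_i and the recovered relative rotations R'_i known. Each relation is linear in R0 via Kronecker products, so R0 is read off the common null space and projected onto SO(3). With too few (or axis-sharing) symmetries the null space is larger, which corresponds to the pose ambiguity discussed later; the two rotational symmetries below are synthetic and chosen to make the solution unique.

```python
import numpy as np

def rot(axis, theta):
    """Rotation matrix about 'axis' by 'theta' (Rodrigues' formula)."""
    a = np.asarray(axis, float) / np.linalg.norm(axis)
    K = np.array([[0, -a[2], a[1]], [a[2], 0, -a[0]], [-a[1], a[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * K @ K

R0_true = rot([0.2, 1.0, 0.3], 0.7)                    # unknown pose rotation to recover
Rs = [rot([0, 0, 1], np.pi / 2), rot([1, 0, 0], np.pi / 2)]   # known object symmetries
Rps = [R0_true @ R @ R0_true.T for R in Rs]            # "measured" R'_i from hidden images

# Each relation R' R0 - R0 R = 0 becomes (kron(I, R') - kron(R^T, I)) vec(R0) = 0.
A = np.vstack([np.kron(np.eye(3), Rp) - np.kron(R.T, np.eye(3))
               for R, Rp in zip(Rs, Rps)])
_, _, Vt = np.linalg.svd(A)
R0_raw = Vt[-1].reshape(3, 3, order='F')               # un-vec (column-stacked)
U, _, Vt2 = np.linalg.svd(R0_raw)
R0 = np.sign(np.linalg.det(U @ Vt2)) * (U @ Vt2)       # nearest rotation, sign fixed
print(np.allclose(R0, R0_true, atol=1e-6))             # True
```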

34 THREE TYPES OF SYMMETRY – Pose from a Reflective Symmetry

35 THREE TYPES OF SYMMETRY – Reflective Symmetry (Experiment)
A large baseline between the hidden images results in a high signal-to-noise ratio.

36 THREE TYPES OF SYMMETRY – Reflective Symmetry (Experiment)

37 THREE TYPES OF SYMMETRY – Pose from a Rotational Symmetry

38 THREE TYPES OF SYMMETRY – Rotational Symmetry (Experiment)

39 THREE TYPES OF SYMMETRY – Rotational Symmetry (Experiment)

40 THREE TYPES OF SYMMETRY – Pose from a Translatory Symmetry

41 THREE TYPES OF SYMMETRY – Translatory Symmetry (Experiment)

42 THREE TYPES OF SYMMETRY – Translatory Symmetry (Experiment)

43 THREE TYPES OF SYMMETRY – Ambiguity in Pose Recovery
"(a+b)-parameter" means there is an a-parameter family of ambiguity in the R0 of g0 and a b-parameter family of ambiguity in the T0 of g0.

44 GEOMETRY FOR MULTIPLE IMAGES – With Scene Knowledge
3-D reconstruction from multiple views with symmetry is simple, accurate and robust!

45 GEOMETRY FOR MULTIPLE IMAGES – Hidden Images in Each View
Symmetry on the object maps features 1, 2, 3, 4 to (3), (4), (2), (1), inducing a virtual camera-to-camera relation between each view and its hidden counterpart.

46 GEOMETRY FOR MULTIPLE IMAGES – Reflective Homography
Two pairs of symmetric points: 1(2), 2(1), 3(4), 4(3). Reflective homography. Decompose H to obtain (R', T', N) and T0. Solve a Lyapunov equation to obtain R0.

47 GEOMETRY FOR MULTIPLE IMAGES – Alignment of Different Objects

48 GEOMETRY FOR MULTIPLE IMAGES – Scale Correction
For a point p on the intersection line:

49 GEOMETRY FOR MULTIPLE IMAGES – Alignment of Different Views

50 GEOMETRY FOR MULTIPLE IMAGES – Scale Correction
For any image x1 in the first view, its corresponding image in the second view is:

51 GEOMETRY FOR MULTIPLE IMAGES – Alignment of Multiple Views
The method is object-centered and baseline-independent.

52 GEOMETRY FOR MULTIPLE IMAGES – Experimental Results

53 GEOMETRY FOR MULTIPLE IMAGES – Experimental Results

54 GEOMETRY FOR MULTIPLE IMAGES – Image Transfer 1

55 GEOMETRY FOR MULTIPLE IMAGES – Image Transfer 2

56 GEOMETRY FOR MULTIPLE IMAGES – Camera Poses

57 GEOMETRY FOR MULTIPLE IMAGES – Full 3-D Model

58 APPLICATIONS – Automatic Matching & Reconstruction
Resolving the “chicken and egg” problem between automatic cell detection, matching (across large baselines) and reconstruction.

59 APPLICATIONS – Automatic Matching & Reconstruction

60 APPLICATIONS – Automatic Matching & Reconstruction

61 APPLICATIONS – Automatic Matching & Reconstruction
Aspect ratios    Reconstruction    Ground truth
Whiteboard       1.54              1.51
Table            1.01              1.00

62 APPLICATIONS – Vehicle Heading
On the road to my apartment from CSL…

63 APPLICATIONS – Unmanned Aerial Vehicles (UAVs)
Rate: 10 Hz. Accuracy: 5 cm, 4°. Berkeley Aerial Robot (BEAR) Project.

64 APPLICATIONS – Visual Perception
INS canonical views (3/4 view)

65 APPLICATIONS – Visual Perception (Illusions)
(Perspective + symmetry) > (orthographic + symmetry) ? Necker cube A perspective cube

66 APPLICATIONS – Visual Perception (Illusions)
What happens if symmetry is wrongly applied? The Ames room; Escher's Waterfall.

67 CONCLUSIONS – A Few “Equalities”
Multiple (perspective) images = multiple-view rank condition
Single image + symmetry = "multiple-view" rank condition
Multiple images + symmetry = rank condition + scale correction
In applications such as computer graphics and autonomous robotics, vision = the best laser range sensors, INS, and GPS

68 CONCLUSIONS – Open Directions and Problems
1. Resolving the "chicken and egg" problem between segmentation, correspondence, reconstruction, and recognition.
2. Global symmetry vs. local symmetry (Groupoids).
3. Multiple-motion segmentation (Polynomial factorization).
4. 3-D invariants vs. 2-D invariants (Differential geometry).
5. Geometry, algebra, statistics, and perception.

