Image Metrology and 3D Reconstruction
Stereo The classic way by which one obtains 3D information from images is via stereo. We have two eyes, and because the world is projected slightly differently onto each of them, we are able to compute the relative distances of objects. The two projections of a close object differ widely in position on our retinas, while those of a far object fall at nearly the same position.
To understand how we can take measurements from images we need to understand how images are formed...
In a camera we have a flat image plane (rather than a spherical retina). Points in the 3D world are projected onto the image plane. For convenience we typically draw the image plane in front of the projection centre. This produces a result that is identical to the case where the image plane is behind except that now the image is the right way up!
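In order to use a camera for 3D measurements it must be calibrated. This involves finding the mathematical relationship between 3D points in the world and where they appear in the image. This relationship is expressed by a 3×4 matrix P, known as the Projection Matrix or the Camera Calibration Matrix:
\[ \begin{bmatrix} su \\ sv \\ s \end{bmatrix} = P \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} \]
Here (X, Y, Z) are the 3D world coordinates and (su, sv, s) are the scaled image coordinates (divide by s to get the actual image coordinates (u, v)).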
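As a minimal sketch of how a projection matrix is applied, the following projects one 3D point into the image. The matrix values are a hypothetical camera chosen for illustration (focal length 800 pixels, image centre at (320, 240), placed at the world origin); they do not come from a real calibration.

```python
def project(P, world_point):
    """Project a 3D world point through a 3x4 projection matrix P."""
    X, Y, Z = world_point
    homogeneous = [X, Y, Z, 1.0]
    # Each row of P gives one scaled image coordinate: su, sv and s.
    su, sv, s = (sum(p * x for p, x in zip(row, homogeneous)) for row in P)
    # Divide by the scale s to get the actual image coordinates.
    return su / s, sv / s

# Hypothetical projection matrix (illustrative values only).
P_example = [
    [800.0, 0.0, 320.0, 0.0],
    [0.0, 800.0, 240.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

u, v = project(P_example, (0.5, -0.25, 4.0))
print(u, v)
```

Note that the scale s here is simply the depth of the point in front of the camera, which is why distant objects appear smaller.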
Calibration typically requires a calibration target marked with carefully measured 3D positions. These known 3D positions are then related to the positions where they appear in the image to obtain the calibration matrix.
Image Metrology: Stereo If a 3D calibration frame is placed in a scene and photographed from two or more directions, 3D measurements can be made. However, care is needed to get accurate results.
Image Metrology: Stereo Stereo reconstruction 3D Measurements can be taken from the photographs long after the scene has been destroyed.
Stereo Knowing the position of an object in one image means that the object must lie somewhere along the viewing ray defined by that point in the image. If we have two images of a scene, and hence can define two viewing rays, we can solve for the 3D location of that point by finding the intersection of the two viewing rays.
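The intersection of two viewing rays can be sketched as follows. With real measurements the two rays rarely meet exactly, so this sketch assumes a common approximation: it returns the midpoint of the shortest segment joining the rays. The camera centres and ray directions are hypothetical values; in practice they would come from the calibration of each camera.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def triangulate_midpoint(c1, d1, c2, d2):
    """Return the point midway between the closest points of two rays.

    Ray 1 is c1 + t1*d1 and ray 2 is c2 + t2*d2 (centres c, directions d).
    """
    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b  # zero when the rays are parallel
    if abs(denom) < 1e-12:
        raise ValueError("viewing rays are parallel; no unique intersection")
    t1 = (b * e - c * d) / denom
    t2 = (a * e - b * d) / denom
    p1 = [ci + t1 * di for ci, di in zip(c1, d1)]
    p2 = [ci + t2 * di for ci, di in zip(c2, d2)]
    return [(x + y) / 2.0 for x, y in zip(p1, p2)]

# Hypothetical set-up: two cameras 1 unit apart viewing the same point.
centre_left, centre_right = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
ray_left, ray_right = (0.5, 0.2, 4.0), (-0.5, 0.2, 4.0)
point_3d = triangulate_midpoint(centre_left, ray_left, centre_right, ray_right)
print(point_3d)
```

The length of the segment between the two closest points also gives a rough indication of how consistent the two measurements are.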
The main problem in stereo is establishing correspondences between points in the two images. This is known as the ‘matching problem’. An important part of matching involves the calculation of the epipolar lines. The position of a point in one image defines a viewing ray. The image of this viewing ray in the other image is its epipolar line. The matching point in this image must be on this line. For forensic applications we can afford to use a manual, or semi-manual approach. Matched points can be picked out manually by a user and the computer can then refine this by searching near the selected points, along epipolar lines, to obtain an accurate match by correlation.
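The refinement step can be sketched as a correlation search. This is a simplified illustration, not the method of any particular system: it assumes the image pair has been arranged so that the epipolar line is a single image row, uses a one-dimensional window, and the function names and pixel values are made up for the example.

```python
import math

def ncc(a, b):
    """Normalised cross-correlation of two equal-length intensity lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0

def refine_match(left_row, right_row, x_left, x_guess, half_window=2, search=3):
    """Search near the user's guess, along the row, for the best correlation."""
    template = left_row[x_left - half_window:x_left + half_window + 1]
    best_x, best_score = x_guess, -2.0
    for x in range(x_guess - search, x_guess + search + 1):
        lo, hi = x - half_window, x + half_window + 1
        if lo < 0 or hi > len(right_row):
            continue  # window falls outside the image
        score = ncc(template, right_row[lo:hi])
        if score > best_score:
            best_x, best_score = x, score
    return best_x, best_score

# Made-up intensities: the right row is the left row shifted by 2 pixels.
left_row = [10, 10, 10, 20, 80, 200, 80, 20, 10, 10, 10, 10]
right_row = left_row[2:] + [10, 10]
matched_x, matched_score = refine_match(left_row, right_row, x_left=5, x_guess=4)
print(matched_x, matched_score)
```

Restricting the search to the epipolar line reduces a two-dimensional search to a one-dimensional one, which is both faster and less prone to false matches.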
3D Reconstruction From a Single View We, as humans, can look at a photograph or a well executed painting and deduce considerable 3D information. Computer vision techniques that emulate aspects of this process are starting to emerge.
Merton College, Oxford From a single view such as this we can deduce considerable 3D information
This work has resulted from the fusion of two areas of research:
The use of projective invariants for object recognition. This involves finding properties of objects that are invariant to perspective projection. This allows you to recognize an object/feature no matter what view you have of it. This is typified by the work of Zisserman, Forsyth, Mundy and Hartley at Oxford and GE Research in the early 90s.
Camera Autocalibration. This involves automatically determining camera calibration parameters from matched image points in stereo pairs of images, or motion sequences. Major contributors to this area of research were the group at INRIA, France, headed by Olivier Faugeras, and the groups at Oxford and GE Research.
A Brief history of Perspective... Early art depicted scenes in a very symbolic form. Cave paintings showed animals in profile. Egyptian art depicted people with their heads in profile, torso in front view, and waist and legs in profile. Medieval art depicted people and objects very much like cardboard cutouts stuck on a screen. It was during the Italian Renaissance around 1430 that concepts of perspective were developed. Uccello was one of the first artists to use perspective. Here is some of his work.
The Battle of San Romano ~1430 Note the alignment of the spears on the ground, the foreshortened image of the fallen soldier on the left and the road in the background. These features were sensational for the time.
The Hunt ~1460 Note the pattern of the trees vanishing towards an apparent horizon in the distance
Drawing of a Chalice (date unknown) This is a remarkable piece of work. It emulates what we take for granted with computer generated wire-frame graphics - except that it was done over 500 years ago with pencil and paper!
The first book that described perspective was produced by Leon Battista Alberti in 1435. Copies of this book can be found in the Architecture and Fine Arts Library at UWA. It is fascinating reading.
Vanishing Points and Vanishing Lines Establishing vanishing points and vanishing lines are the fundamental operations in reconstructing 3D information from a perspective image. In perspective, two parallel lines meet at a point: their vanishing point. Two sets of parallel lines in different directions in a plane will give two vanishing points that define the vanishing line of the plane. All lines that lie in planes parallel to this plane will vanish at points on this line.
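A minimal sketch of these two operations, assuming the standard homogeneous-coordinate representation in which both the line through two points and the intersection of two lines are cross products. The image coordinates are invented for illustration.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def line_through(p, q):
    """Homogeneous line (a, b, c), with ax + by + c = 0, through two image points."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))

def intersection(l1, l2):
    """Image point where two lines meet (their vanishing point if the
    lines are parallel in the 3D world)."""
    x, y, w = cross(l1, l2)
    if abs(w) < 1e-12:
        raise ValueError("lines are parallel in the image")
    return (x / w, y / w)

# Two images of parallel lines in one direction (invented coordinates)...
vp1 = intersection(line_through((0, 500), (200, 300)),
                   line_through((300, 500), (350, 300)))
# ...and two in a second direction in the same plane.
vp2 = intersection(line_through((0, 500), (-100, 300)),
                   line_through((300, 500), (50, 300)))
# The vanishing line of the plane joins the two vanishing points.
vanishing_line = line_through(vp1, vp2)
print(vp1, vp2, vanishing_line)
```

In this example both vanishing points lie at image height y = 100, so the vanishing line is the horizontal line y = 100.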
The Horizon The horizon is where a plane through the projection centre, and parallel to the reference plane, cuts through the image plane. Anything below this plane will be projected to a point below the horizon; anything above it is projected above the horizon.
Establishing Relative Sizes of Objects in a Perspective Image
The point h_r is the reference height. If we draw a line through the base points b_r and b_u to the vanishing line we get the vanishing point of that line. A line from h_r to this vanishing point represents a line that is parallel (in the 3D world) to the line through the base points. Point i is the same height above b_u as h_r is above b_r. However, we cannot simply use the ratio of the image lengths b_u h_u to b_u i, because ratios of lengths are not preserved under perspective.
The Cross Ratio If we have 4 collinear points A, B, C and D, the ratio of ratios of lengths is invariant to projection. That is, the expression
\[ \text{Cross ratio} = \frac{AC}{AD} \Big/ \frac{BC}{BD} = \frac{AC \cdot BD}{AD \cdot BC} \]
is invariant to perspective projection. In the previous diagram the four points b_u, h_u, i and v are collinear, as are the four points b_r, h_r, c and v. (Actually one cannot apply the cross ratio directly here because point v is at infinity, but the principle holds.)
Cross ratio of equi-spaced points: for four points A, B, C, D spaced 1 unit apart,
\[ \text{Cross ratio} = \frac{AC}{AD} \Big/ \frac{BC}{BD} = \frac{2}{3} \Big/ \frac{1}{2} = \frac{4}{3} \approx 1.333 \]
Measurements from an image…
\[ \text{Cross ratio} = \frac{AC}{AD} \Big/ \frac{BC}{BD} = \frac{488}{596} \Big/ \frac{173}{282} \approx 1.334 \]
A cross ratio measured from an image will be identical to the same cross ratio measured in the real world. If we can measure some of the lengths from reference objects in the world we can calculate an unknown length using the cross ratio. With the cross ratio from the image on the left and the known lengths on the right:
\[ \frac{488}{596} \Big/ \frac{173}{282} \approx 1.334 = \frac{?}{3} \Big/ \frac{1}{2} \]
This gives the unknown length ? = 2.
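The arithmetic on these slides can be checked with a few lines of code. This sketch takes the cross ratio of four collinear points A, B, C, D to be (AC × BD) / (AD × BC), which reproduces the values quoted above; the function names are illustrative only.

```python
def cross_ratio(ac, ad, bc, bd):
    """Cross ratio (AC * BD) / (AD * BC) of four collinear points."""
    return (ac * bd) / (ad * bc)

def solve_for_ac(ratio, ad, bc, bd):
    """Recover the unknown length AC from a known cross ratio."""
    return ratio * ad * bc / bd

# Equi-spaced points one unit apart: AC = 2, AD = 3, BC = 1, BD = 2.
equal_spacing = cross_ratio(2, 3, 1, 2)

# Lengths measured on the image, in pixels.
from_image = cross_ratio(488, 596, 173, 282)

# Known world lengths AD = 3, BC = 1, BD = 2; AC is unknown.
unknown_length = solve_for_ac(from_image, 3, 1, 2)
print(equal_spacing, from_image, unknown_length)
```

The recovered length is close to, but not exactly, 2 because the pixel measurements are themselves only accurate to about a pixel.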
An example of height measurement taken from Andrew Zisserman's web pages
Cross ratio measured on image = 1.28 Cross ratio measured on car = 1.32
Cross ratio measured on image = 1.39 Cross ratio measured on car = 1.41
Image Metrology: Rectification Calibration targets allow views of flat surfaces to be rectified. Rectified views allow measurements to be taken.
Transformations of a planar surface [Figure panels: original surface; rotate; scale; affine transformation; perspective transformation]
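The projective transformation of a planar surface into an image (which is simply another planar surface) can be represented by a matrix equation:
\[ \begin{bmatrix} su \\ sv \\ s \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \]
Here (x, y) are the xy coordinates in the plane and (su, sv, s) are the scaled image coordinates. There are 8 unknown parameters in the projection matrix. If we know the coordinates of 4 points in the original plane (no three of them collinear) and where they appear in the image, each point gives two equations and we can solve for these 8 parameters. We can then invert this equation to convert our image of the plane into a plan view of the plane.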
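Solving for the 8 parameters from 4 point pairs can be sketched as below. This is one straightforward formulation, assuming the bottom-right matrix entry is fixed at 1 so that each point pair contributes two linear equations; the point coordinates are invented, and the helper names are illustrative.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def transform_from_4_points(src_pts, dst_pts):
    """3x3 projective transformation taking each src point to its dst point."""
    A, b = [], []
    for (x, y), (u, v) in zip(src_pts, dst_pts):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = solve_linear(A, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]

def apply_transform(H, pt):
    x, y = pt
    su, sv, s = (row[0] * x + row[1] * y + row[2] for row in H)
    return su / s, sv / s  # divide by the scale s

# Invented data: corners of a unit square in the plane and in the image.
plane_pts = [(0, 0), (1, 0), (1, 1), (0, 1)]
image_pts = [(100, 400), (300, 380), (260, 200), (120, 210)]

plane_to_image = transform_from_4_points(plane_pts, image_pts)
# Swapping the roles of the two point sets gives the inverse mapping,
# which converts image positions back into plan-view coordinates.
image_to_plane = transform_from_4_points(image_pts, plane_pts)

centre_in_image = apply_transform(plane_to_image, (0.5, 0.5))
centre_in_plan = apply_transform(image_to_plane, centre_in_image)
print(centre_in_image, centre_in_plan)
```

Solving directly for the image-to-plane mapping avoids an explicit matrix inversion; with more than 4 points one would normally use a least-squares solution instead.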
Image Metrology Rectified views of the fence and ground. Criminisi, Reid and Zisserman 1999
Image Rectification Even if you do not have a calibration target in the scene it is possible to undo the perspective distortion of a plane in the scene if we can find the vanishing line of the plane, and if we have two reference measurements of known lengths, or angles, in the scene.
Transformations of a planar surface [Figure panels: original surface; rotate; scale; affine transformation; perspective transformation] Knowledge of the vanishing line of the plane allows you to invert the perspective transformation. Knowledge of two lengths, or two angles, allows you to invert the affine transformation.
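The first of these two steps can be sketched as follows. This assumes the standard construction in which a matrix with rows (1, 0, 0), (0, 1, 0) and the vanishing line (l1, l2, l3) sends the vanishing line to infinity; the line and point coordinates are invented. The result is correct only up to an affine transformation, which the second step (known lengths or angles) would remove.

```python
def remove_perspective(vanishing_line, point):
    """Map an image point so that the vanishing line goes to infinity."""
    l1, l2, l3 = vanishing_line
    x, y = point
    w = l1 * x + l2 * y + l3  # zero for points on the vanishing line
    if abs(w) < 1e-12:
        raise ValueError("point lies on the vanishing line")
    return (x / w, y / w)

def direction(p, q):
    return (q[0] - p[0], q[1] - p[1])

def cross_2d(a, b):
    """Zero when the two 2D directions are parallel."""
    return a[0] * b[1] - a[1] * b[0]

# Invented example: the vanishing line is the image row y = 100.
line = (0.0, -1.0, 100.0)

# Two image lines that are parallel in the world but converge in the image.
a1, a2 = (0.0, 500.0), (200.0, 300.0)
b1, b2 = (300.0, 500.0), (350.0, 300.0)
before = cross_2d(direction(a1, a2), direction(b1, b2))

ra1, ra2 = remove_perspective(line, a1), remove_perspective(line, a2)
rb1, rb2 = remove_perspective(line, b1), remove_perspective(line, b2)
after = cross_2d(direction(ra1, ra2), direction(rb1, rb2))
print(before, after)
```

After the mapping the two lines are parallel again, which is the defining property of having undone the perspective part of the distortion.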
Parallel lines in the paving are used to define two vanishing points and hence the vanishing line of the plane. The blue lines lead to the two vanishing points. The black line across the top of the image is the vanishing line of the plane. It indicates the height of the camera relative to the objects in the scene. Points R1-R3 and S1-S4 are used to provide constraints on the affine transformation.
There is still some distortion due to uncorrected lens distortion - lines are not quite straight in the image. Look at the portion of the square in the paving that is cut off at the very left of the rectified image - now try to find it in the original image... Note that the rectified image cannot include points that are too close to the vanishing line, as these points map towards infinity in the rectified view! This is why the rectified image is cut off on the left hand side where it is.
Some results generated by the Oxford Visual Geometry Group…