# Finding A Chessboard: An Introduction To Computer Vision

Martin C. Martin, March 9, 2005

Abstract
Complete with a demonstration.
Inspired by the Tangible Media group at the Media Lab, I thought it would be cool to make a computer chess game that you play on a real chess board, where the human player's pieces are real pieces and the computer's are projected. The computer has a camera pointed at the board, and uses computer vision techniques to find the individual pieces. At the moment, all I have working is finding the full 3D position & orientation of the board. It turns out that just finding the board in the image is full of interesting subtleties, and uses techniques from three decades of computer vision. Come learn how computer vision works, what homogeneous coordinates are, and what a little algebra can do for you.

Points To Make
- Assume audience doesn’t have any pressing chessboard-locating problems
- Instead, interested in:
  - Computer Vision
  - “Lessons Learned” about engineering
  - “Lessons Learned” about “flavour” of approach, e.g. what I tried, even if it didn’t work, in order to show a set of reasonable alternatives

Inspiration
- Visited Media Lab in 1998
- Very intuitive & immediate
- Urp by John Underkoffler @ Tangible Media, MIT [Video]

Computer Chess
- Project: Chess against the computer, where the board and your pieces are real, and its pieces are projected
- [Picture?]

Motivation & Inspiration
- A spare time project, because I thought it would be fun
- Project: Chess against the computer, where the board and your pieces are real, and its pieces are projected
- Working so far: very robust localizing of chessboard (full 3D location and orientation)

Requirements
- Interaction should be as natural as possible
  - E.g. pieces don’t have to be centered in their squares, or even completely in them
- Should be easy to set up and give demonstrations
  - As little calibration as possible
  - Work in many lighting conditions
  - Although only with this board and pieces
- Camera needs to be at an angle to the board
- Board made by my father just before he met my mother
  - My brother and I learned to play on it
  - I’m not a big chess fan, I just like the project

Computer Vision ’70s to ’80s: Feature Detection
- Initial Idea: Corners are unique, look for them
- Compute the “cornerness” at each pixel by adding the values of some nearby pixels, and subtracting others, in this pattern: [Figure: mask pattern]
- Constant Image (e.g. middle of square): output = 0
- Edge between two regions (e.g. two squares side-by-side): output = 0
- Where four squares come together: output max (+ or -)
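A minimal Python sketch of such a detector, assuming NumPy and a grayscale image array. The mask figure is not preserved in this transcript, so the four-quadrant pattern used here (+ − over − +) is an assumption chosen to reproduce the three outputs listed above; `cornerness` and the radius `r` are hypothetical names.

```python
import numpy as np

def cornerness(image, r=2):
    """Response of an assumed quadrant mask: (top-left + bottom-right) - (top-right + bottom-left)."""
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    out = np.zeros_like(img)
    for y in range(r, h - r):
        for x in range(r, w - r):
            tl = img[y - r:y, x - r:x].sum()   # added
            br = img[y:y + r, x:x + r].sum()   # added
            tr = img[y - r:y, x:x + r].sum()   # subtracted
            bl = img[y:y + r, x - r:x].sum()   # subtracted
            out[y, x] = (tl + br) - (tr + bl)
    return out
```

On a constant image or across a straight edge the four sums cancel; where four squares meet, the two added quadrants are bright and the two subtracted ones dark (or the reverse), so the magnitude peaks.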

Corner Detector In Practice
Actually, the absolute value of the output

Problems With Corner Detection
- Edge effects
- Strong response for some non-corners
- Easily obscured by pieces, hand
- How to link them up when many are obscured?
- Go back to something older: Find edges

Computer Vision ’60s to Early ’70s: Line Drawings
- Memory/speed/programs too small to do much at every pixel
- First, convert to line drawing
- Line drawing captures much of the content of an image

Edge Finding
- Very common in early computer vision
  - Early computers didn’t have much power
  - Very early: enter lines by hand
  - A little later: extract line drawing from image
- Basic idea: for vertical edge, subtract pixels on left from pixels on right
  - [Figure: mask with - on the left and + on the right]
- Similarly for horizontal edge

Edge Finding
- Need separate mask for each orientation? No! Can compute intensity gradient from horizontal & vertical gradients
- Think of intensity as a (continuous) function of 2D position: f(x, y)
- Rate of change of intensity in the direction of a unit vector (u, v) is u ∂f/∂x + v ∂f/∂y
- Magnitude changes as cosine of the angle between (u, v) and (∂f/∂x, ∂f/∂y)
- Magnitude maximum when (u, v) points along (∂f/∂x, ∂f/∂y)
- (∂f/∂x, ∂f/∂y) is called the gradient
  - Magnitude is strength of line at this point
  - Direction is perpendicular to line
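As a rough illustration, the gradient magnitude and direction can be computed from the two axis-aligned derivatives. This sketch assumes NumPy and uses `np.gradient` (central differences) in place of whatever masks the original code used; `intensity_gradient` is a hypothetical name.

```python
import numpy as np

def intensity_gradient(image):
    """Return (magnitude, direction) of the intensity gradient at each pixel."""
    img = np.asarray(image, dtype=float)
    # np.gradient returns one array per axis: rows (y) first, then columns (x).
    fy, fx = np.gradient(img)
    magnitude = np.hypot(fx, fy)      # strength of the edge
    direction = np.arctan2(fy, fx)    # angle of the gradient, perpendicular to the edge
    return magnitude, direction
```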

Localizing Line
- How do we decide where lines are & where they aren’t?
- One idea: threshold the magnitude
  - Problem: what threshold to use? Depends on lighting, etc.
  - Problem: will still get multiple pixels at each image location

Laplacian
- Better idea: find the peak, i.e. where the 2nd derivative crosses zero
- Image shows magnitude
- One ridge is positive, one negative
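A one-dimensional sketch of the idea, assuming NumPy: the peak of the first derivative across an edge is where the second derivative changes sign. `zero_crossings` is a hypothetical helper, and the finite differences here stand in for a real Laplacian filter.

```python
import numpy as np

def zero_crossings(profile):
    """Indices i where the discrete 2nd derivative changes sign between i and i + 1."""
    p = np.asarray(profile, dtype=float)
    d1 = np.gradient(p)    # first derivative: peaks at the edge
    d2 = np.gradient(d1)   # second derivative: crosses zero at that peak
    return np.where(np.sign(d2[:-1]) * np.sign(d2[1:]) < 0)[0]
```

For a smooth step centred between samples 10 and 11, the only sign change reported is at index 10, with no threshold to tune.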

Extracting Whole Lines
- So Far: intensity image → “lineness” image
- Next: “lineness” image → list of lines
- Need to accumulate contributions from across image
  - Could be many gaps
- Want to extract position & orientation of lines
  - Boundaries won’t be robust, so consider lines to run across the entire image

Hough Transform
- Parameterize lines by angle and distance to center of image
- Discretize these and create a 2D grid covering the entire range
- Each unsuppressed pixel is part of a line
  - Perpendicular to the gradient
  - Add its strength to the bin for that line
- [Figure: line parameterized by angle θ and distance d]
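The voting step can be sketched as follows, assuming NumPy, a `magnitude` array that is zero at suppressed pixels, and a `direction` array holding the gradient angle in radians. The bin counts and the function name are illustrative assumptions, not the original code.

```python
import numpy as np

def hough_accumulate(magnitude, direction, n_theta=180, n_dist=200):
    """Accumulate edge strength into (angle, distance-to-center) bins."""
    h, w = magnitude.shape
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    max_dist = np.hypot(cx, cy)
    acc = np.zeros((n_theta, n_dist))
    ys, xs = np.nonzero(magnitude)
    for y, x in zip(ys, xs):
        # The gradient is the line's normal; fold it into [0, pi).
        theta = direction[y, x] % np.pi
        # Signed distance from the image center to the line through this pixel.
        d = (x - cx) * np.cos(theta) + (y - cy) * np.sin(theta)
        ti = int(theta / np.pi * n_theta) % n_theta
        di = int((d + max_dist) / (2 * max_dist) * (n_dist - 1))
        acc[ti, di] += magnitude[y, x]
    return acc
```

Because each pixel votes for exactly one bin (its gradient fixes the angle), gaps along a line only lower the peak rather than break it; the strongest lines are then the largest bins.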

Hough Transform
[Figure: Hough accumulator; axes: angle and distance to center]

Take The 24 Biggest Lines
- Show how robust it is by moving the board around

Demonstration

Improving Them
- Knowing the bin gives us the approximate orientation and location of the line
- Could then go back to the image and improve the estimate, e.g. using robust line fitting to all pixels near the approximate line
- Didn’t code this, for two reasons
  - Coding “all pixels near approximate line” is a bit of a pain
  - Wanted to develop rest of code with noisy line estimates, to make it robust

Coordinate Systems
- In 2D (u, v) (i.e. on the screen):
  - Origin at center of screen
  - u horizontal, increasing to the right
  - v vertical, increasing down
  - Maximum u and v determined by field of view
- In 3D (x, y, z) (camera’s frame):
  - Origin at eye
  - Looking along z axis
  - x & y in directions of u & v respectively

2D ↔ 3D
- This “camera works like an eye” thing is a red herring
  - Point of camera is to recreate the INPUT to the eye, i.e. light from picture recreates light from world
- Perspective projection
  - (u, v): image coordinates (2D)
  - (x, y, z): world coordinates (3D)
- Similar triangles: v / d = y / z
- Free to assume virtual screen at d = 1
- Relation: u = x/z, v = y/z
- [Figure: similar triangles with sides y, z and v, d]
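The projection relation is small enough to write down directly. A minimal sketch, assuming NumPy and the virtual screen at d = 1; `project` is a hypothetical name.

```python
import numpy as np

def project(p):
    """Perspective projection of a 3D point (x, y, z) onto the screen at d = 1."""
    x, y, z = p
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return np.array([x / z, y / z])
```

Scaling a 3D point by any positive factor leaves its projection unchanged, e.g. `project((2, 4, 2))` and `project((4, 8, 4))` give the same (u, v).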

2D ↔ 3D
u = x/z, v = y/z

| 2D | 3D |
|---|---|
| p = (u, v) | p = (x, y, z) |

Lines in 3D map to lines on screen
- Proof: the 3D line, plus the origin (eye), form a plane.
- All light rays from the 3D line to the eye are in this plane.
- The intersection of that plane and the image plane forms a line.

2D ↔ 3D
u = x/z, v = y/z

| 2D | 3D |
|---|---|
| p = (u, v) | p = (x, y, z) |
| Line | Line |

From 2D Lines To 3D Lines
- If a group of lines are parallel in 3D, what’s the corresponding 2D constraint?
- Equation of line in 2D: Au + Bv + C = 0
- Substituting in our formula for u & v:
  - Ax/z + By/z + C = 0
  - Ax + By + Cz = 0
- A 3D plane through the origin containing the 3D line
- Let L = (A, B, C) & p = (x, y, z). Then L•p = 0
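One way to see this numerically, assuming NumPy: since L is perpendicular to every 3D point on the line, it can be obtained as the cross product of any two such points. The cross-product construction and the name `image_line` are additions for illustration, not something stated on the slide.

```python
import numpy as np

def image_line(p, q):
    """L = (A, B, C) with L.p = 0 and L.q = 0, for two 3D points p, q on the line."""
    return np.cross(np.asarray(p, dtype=float), np.asarray(q, dtype=float))

# Every point p0 + t*d on the 3D line then satisfies A*u + B*v + C = 0 after projection.
p0 = np.array([1.0, 2.0, 5.0])
d = np.array([1.0, 0.0, 1.0])
L = image_line(p0, p0 + d)
```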

2D ↔ 3D
u = x/z, v = y/z

| 2D | 3D |
|---|---|
| p = (u, v) | p = (x, y, z) |
| Line | Line |
| Line: Au + Bv + C = 0 | Plane: Ax + By + Cz = 0, i.e. L•p = 0 |

Recovering 3D Direction
- Let’s represent the 3D line parametrically, i.e. as the set of p_0 + t d for all t, where d = (x_d, y_d, z_d) is the direction.
- The 9 parallel lines on the board have the same d but different p_0.
- For all t: L•(p_0 + t d) = L•p_0 + t L•d = 0
- Since p_0 is on the line, L•p_0 = 0. Therefore, L•d = 0, i.e. A x_d + B y_d + C z_d = 0
- That is, the point (x_d, y_d, z_d) (which is not necessarily on the 3D line) projects to a point on the 2D line
- d is the same for all 9 lines in a group, so its projection lies on all 9 2D lines; it must be their common intersection point, i.e. the vanishing point
- Knowing the 2D vanishing point gives us the full 3D direction!

Vanishing Point = 3D Direction
- I.e. treating the intersection point as a 3D point and normalizing it gives us the direction vector.
- Since the two groups of lines are orthogonal in 3D space, the dot product of their direction vectors must be zero.
  - We can use this as a consistency check: when the dot product is far from zero, we didn’t isolate the right groups.
  - Or, we can use it to estimate the FOV, if we don’t trust our current estimate.
- We now have our first piece of 3D information: the full 3D orientation.
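A short sketch of both steps, assuming NumPy, a finite vanishing point (u, v), and a trusted field of view so the screen sits at d = 1. The function names are hypothetical.

```python
import numpy as np

def direction_from_vanishing_point(u, v):
    """Treat the 2D vanishing point as the 3D point (u, v, 1) and normalize it."""
    d = np.array([u, v, 1.0])
    return d / np.linalg.norm(d)

def orthogonality_error(vp1, vp2):
    """|d1 . d2| for the two line groups; should be near 0 for a correct grouping."""
    d1 = direction_from_vanishing_point(*vp1)
    d2 = direction_from_vanishing_point(*vp2)
    return abs(float(np.dot(d1, d2)))
```

For example, vanishing points (1, 0) and (-1, 0) correspond to directions (1, 0, 1)/√2 and (-1, 0, 1)/√2, which are orthogonal, so the error is zero.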

Vanishing Point
- What’s the 2D constraint for lines that are parallel in 3D?
- They aren’t parallel in 2D, and their angles aren’t evenly spaced, but
- They have a common point of intersection: the vanishing point

Ames Room
- The vanishing point is a very strong constraint
- [Figure: Ames room, with the viewing direction marked]

2D ↔ 3D
u = x/z, v = y/z

| 2D | 3D |
|---|---|
| p = (u, v) | p = (x, y, z) |
| Line | Line |
| Line: Au + Bv + C = 0 | Plane: Ax + By + Cz = 0, i.e. L•p = 0 |
| Common intersection | Parallel lines |
| Vanishing point | Common direction |

Finding The Vanishing Point
- Want to find a group of 9 2D lines with a common intersection point
- For two lines A_1 u + B_1 v + C_1 = 0 and A_2 u + B_2 v + C_2 = 0, the intersection point is on both lines, therefore satisfies both linear equations:
  - A_1 u + B_1 v = -C_1
  - A_2 u + B_2 v = -C_2
- In reality, the lines only approximately intersect at a common location
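Solving the pair of equations directly looks like this, assuming NumPy and lines given as (A, B, C) triples; `intersect_lines` is a hypothetical name.

```python
import numpy as np

def intersect_lines(l1, l2):
    """Solve A1*u + B1*v = -C1, A2*u + B2*v = -C2 for the intersection (u, v)."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    m = np.array([[a1, b1], [a2, b2]], dtype=float)
    rhs = -np.array([c1, c2], dtype=float)
    # Raises np.linalg.LinAlgError when the lines are exactly parallel.
    return np.linalg.solve(m, rhs)
```

The failure for parallel lines, and the huge, unstable answers for nearly parallel ones, are what make this plain (u, v) representation awkward.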

Representing the Vanishing Point
- Problem: If camera is perpendicular to board, 2D lines are parallel → no solution
- Problem: If camera is almost perpendicular to board, small errors in line angle make for big changes in intersection location
- Euclidean distance between points is a poor measure of similarity

Desired Metric
- Sensitivity analysis for intersection point:
  - u = (B_1 C_2 - B_2 C_1) / (A_1 B_2 - A_2 B_1)
  - v = (A_2 C_1 - A_1 C_2) / (A_1 B_2 - A_2 B_1)
- Of the form u’/w, v’/w
- If -1 ≤ A, B, C ≤ 1, then -2 ≤ u’, v’, w ≤ 2
- So, the only way for the intersection point to be large is if w is small; the right metric is roughly 1/w
- A representation with that property is…

Homogeneous Coordinates
- Introduced by August Ferdinand Möbius
- An example of projective geometry
- Represent a 2D point using 3 numbers
  - The point (u, v, w) corresponds to (u/w, v/w)
- Formula for perspective projection means any 3D point is already the homogeneous coordinate of its 2D projection
- Multiplying by a scalar doesn’t change the point: (cu, cv, cw) represents same 2D point as (u, v, w)

Homogeneous Coordinates
[Figure: the plane w = 1, containing the point (u, v, 1)]

Project Intersection Point Onto Unit Sphere
- Points at infinity map to equator, keeping their direction
- Simply use normal 3D Euclidean distances between points on the sphere
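In homogeneous form the intersection of two lines is their cross product, which never divides by w. A minimal sketch, assuming NumPy; the sign convention (flip so that w ≥ 0) and the name `intersection_on_sphere` are assumptions for illustration, and points very close to the equator can still land on opposite sides.

```python
import numpy as np

def intersection_on_sphere(l1, l2):
    """Homogeneous intersection (u', v', w) of two lines, scaled to unit length."""
    p = np.cross(np.asarray(l1, dtype=float), np.asarray(l2, dtype=float))
    n = np.linalg.norm(p)
    if n == 0:
        raise ValueError("the two lines are identical")
    p = p / n
    # (u', v', w) and its negation are the same 2D point; pick the w >= 0 copy.
    return -p if p[2] < 0 else p
```

Parallel 2D lines now give a valid answer with w = 0, a point on the equator whose (u', v') part is the direction of the lines.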

Lines Become “Great Circles”

Clustering: Computer Vision In The Nineties
- 60s and 70s: Promise of human equivalence right around the corner
- 80s: Backlash against AI
  - Like an “internet startup” now
- 90s: Extensions of existing engineering techniques
  - Applied statistics: Bayesian Networks
  - Control Theory: Reinforcement Learning

K Means

K-Means [Describe the basic idea]
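For readers who have not met it, the textbook k-means loop is sketched below, assuming NumPy and plain 3D Euclidean distances. How the talk's own code initialized or applied the clustering is not recorded in this transcript, so the random initialization, the `seed` parameter, and the function name are all assumptions.

```python
import numpy as np

def k_means(points, k, iterations=20, seed=0):
    """Alternate between assigning points to the nearest center and re-averaging."""
    pts = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centers = pts[rng.choice(len(pts), size=k, replace=False)]
    for _ in range(iterations):
        dists = np.linalg.norm(pts[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pts[labels == j].mean(axis=0)
    return centers, labels
```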

Measuring Field Of View
- What happens if we get the field of view wrong?
- That’s equivalent to getting d, the distance of the image plane, wrong:
- [Figure: image planes at different distances d, each spanning -y_max to +y_max]

Measuring Field Of View
- Perspective projection becomes v = dy/z, u = dx/z
- 2D line is still Au + Bv + C = 0. Substitute:
  - A dx/z + B dy/z + C = 0
  - Ax + By + (C/d)z = 0
- Let L = (A, B, C/d), p = (x, y, z). Then L•p = 0
- If (x_i, y_i, z_i) are the 3D direction vectors, then (x_i, y_i, z_i/d) is on the 2D line. Call this 2D point (u_i, v_i, w_i).
- 3D vectors perpendicular: u_1 u_2 + v_1 v_2 + d² w_1 w_2 = 0. Can solve for d.
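Solving the perpendicularity condition for d is a one-liner. A sketch under the assumption that the two vanishing points are given in homogeneous form (u, v, w); `estimate_d` is a hypothetical name.

```python
import math

def estimate_d(vp1, vp2):
    """Solve u1*u2 + v1*v2 + d^2 * w1*w2 = 0 for the image-plane distance d."""
    u1, v1, w1 = vp1
    u2, v2, w2 = vp2
    denom = w1 * w2
    if abs(denom) < 1e-12:
        raise ValueError("w1*w2 is too small to estimate d reliably")
    d_squared = -(u1 * u2 + v1 * v2) / denom
    if d_squared <= 0:
        raise ValueError("vanishing points are inconsistent with orthogonal directions")
    return math.sqrt(d_squared)
```

For instance, orthogonal directions (1, 0, 1) and (-1, 0, 1) seen with d = 2 give image points (1, 0, 0.5) and (-1, 0, 0.5), from which the function recovers d = 2.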

Measuring Field Of View
- d & horizontal FOV are related by x_max/d = tan(FOV/2)
- Our estimate of d has the least error when w_1 w_2 is maximum, since solving for d² divides by w_1 w_2.
- The equation is symmetric in x & y, so let’s rotate the board to be in the x-w plane.
- So when d is close to 1, the direction vectors are (cos(θ), sin(θ)) and (-sin(θ), cos(θ)).
- Max at θ = 45°, image as in next slide.
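Converting between d and the horizontal field of view, using the tangent relation that holds for a flat image plane of half-width `x_max` at distance `d`. Both function names are hypothetical.

```python
import math

def horizontal_fov(x_max, d):
    """Full horizontal field of view in radians: x_max / d = tan(FOV / 2)."""
    return 2.0 * math.atan2(x_max, d)

def image_plane_distance(x_max, fov):
    """Inverse relation: d = x_max / tan(FOV / 2)."""
    return x_max / math.tan(fov / 2.0)
```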

Measuring FOV

Rigid Objects [Talk about rigid transformations and the 6 degrees of freedom?]

2D to 3D: Distance
- How do we get the distance to the board? From the spacing between lines.
- While 3D lines map to 2D lines, points equally spaced along a 3D line AREN’T equally spaced in 2D: [Figure]

Distance To Board
- Let c = (0, 0, z_c) be the point where the z axis hits the board
- For each line in group 1, we find the 3D perpendicular distance from c to the line, along d_2
- These should be equally spaced in 3D
- Find the common difference, like the Millikan oil drop experiment

Distance To Board
- Let L_i = (A_i, B_i, C_i) be the ith line in group 1.
- We want to find t_i such that the point c + t_i d_2 is on L_i, i.e.
  - L_i•(c + t_i d_2) = 0
  - L_i•c + t_i L_i•d_2 = 0
  - t_i = -L_i•c / (L_i•d_2) = -C_i z_c / (L_i•d_2)
- z_c is the only unknown, so put that on the left
  - Let s_i = t_i / z_c = -C_i / (L_i•d_2)
- If we choose our 3D units so that the squares are 1x1, then the common difference is 1/z_c
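The normalized offset for one line can be computed as below, assuming NumPy, a line given as (A, B, C), and the second group's unit direction. The name `normalized_offset` and the synthetic board used to check it are assumptions for illustration.

```python
import numpy as np

def normalized_offset(line, d2):
    """s_i = t_i / z_c = -C_i / (L_i . d_2) for one line of group 1."""
    line = np.asarray(line, dtype=float)
    denom = float(np.dot(line, np.asarray(d2, dtype=float)))
    if abs(denom) < 1e-12:
        raise ValueError("line is (nearly) parallel to d2 in the image")
    return -line[2] / denom

# Synthetic check: board through c = (0, 0, 5), lines along d1, spaced along d2.
c = np.array([0.0, 0.0, 5.0])
d1 = np.array([1.0, 0.0, 0.0])
d2 = np.array([0.0, 0.8, 0.6])
line_at_t = lambda t: np.cross(c + t * d2, d1)   # L for the line at offset t
```

With this setup `normalized_offset(line_at_t(t), d2)` equals t / 5 for any t, matching s = t / z_c with z_c = 5.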

Distance To Board
- So, given the s_i, we need to find t_0 and z_c such that:
  - s_i = t_i / z_c = (i + t_0) / z_c, i = 0…8
- But, we have outliers & occasional omissions
- So we use robust estimation

Robust Estimation
- Many existing parameter estimation algorithms optimize a continuous function
  - Sometimes there’s a closed form (e.g. MLE of center of Gaussian is just the sample mean)
  - Sometimes it’s something more iterative (e.g. Newton-Raphson)
- However, these are usually sensitive to outliers
- Data cleaning is often a big part of the analysis
  - The reason why decision trees (which are extremely robust to outliers) are the best all-round “off-the-shelf” data mining technique

Robust Estimation
- Find distance (and offset) to maximize score: [Equation]
- Involves evaluating on a fine grid
- Isn’t time consuming here, since the number of data points is small
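The score formula itself is not preserved in this transcript. The sketch below therefore substitutes an assumed score, a sum of Gaussian bumps on how far each s_i · z_c − t_0 is from an integer, so that outliers contribute almost nothing. It assumes NumPy; `fit_spacing`, `sigma`, and the grid sizes are hypothetical.

```python
import numpy as np

def fit_spacing(s, z_grid, n_offsets=50, sigma=0.05):
    """Grid search for (z_c, t_0) so that s_i * z_c - t_0 is close to an integer."""
    s = np.asarray(s, dtype=float)
    offsets = np.linspace(0.0, 1.0, n_offsets, endpoint=False)
    best_score, best_z, best_t0 = -np.inf, None, None
    for z_c in z_grid:
        for t0 in offsets:
            r = s * z_c - t0              # should be an integer line index
            resid = r - np.round(r)
            # Assumed robust score: near-integer values count ~1, outliers ~0.
            score = np.exp(-(resid / sigma) ** 2).sum()
            if score > best_score:
                best_score, best_z, best_t0 = score, z_c, t0
    return best_z, best_t0, best_score
```

With eight lines at spacing 1/10 and offset 0.3 (one line missing) plus one stray value, e.g. `[0.03, 0.13, 0.23, 0.33, 0.53, 0.63, 0.73, 0.83, 0.47]`, a search over `np.arange(5.0, 15.0, 0.1)` settles near z_c = 10 and t_0 = 0.3. Note that integer multiples of the true z_c score equally well under this assumed score, so the grid should not span them.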