Geometric Hashing: A General and Efficient Model-Based Recognition Scheme Yehezkel Lamdan and Haim J. Wolfson ICCV 1988 Presented by Budi Purnomo Nov 23rd.


1 Geometric Hashing: A General and Efficient Model-Based Recognition Scheme. Yehezkel Lamdan and Haim J. Wolfson, ICCV 1988. Presented by Budi Purnomo, Nov 23rd 2004.

2 Motivation
Object recognition (ultimate goal of most computer vision research).
Inputs:
- A database of objects.
- A scene or image to recognize.
Problems:
1. Objects in the scene undergo some transformations.
2. Objects may partially occlude each other.
3. Computationally expensive to retrieve each object from the database and compare it against the observed scene.

3 Problem Statement
Recognition under Similarity Transformation: “Is there a transformed (rotated, translated and scaled) subset of some model point-set which matches a subset of the scene point-set?”

4 Outline
1. Key Idea
2. General Framework
3. Recognition under Various Transformations
4. Recognition of 3D Objects from 2D Images
5. Recognition of Polyhedral Objects
6. Comparisons
- Alignment
- Generalized Hough Transform

5 Key Idea (1/8) Recognizing a pentagon in an image

6 Key Idea (2/8) Blue: 1

7 Key Idea (3/8) Red: 1

8 Key Idea (4/8) Green: 5

9 Key Idea (5/8) Purple: 1

10 Key Idea (6/8) Brown: 1

11 Key Idea (7/8) Blue: 1 Red: 1 Green: 5 Purple: 1 Brown: 1 Object is a pentagon!

12 Key Idea (8/8) Blue: 1 Red: 2 Green: 2 Purple: 1 Brown: 1 Object is NOT a pentagon!

13 Brute Force Recognition
Let m be the number of points on the model and n the number of points in the scene.
Recognizing a single model takes O((m x n)^2 x t), where t is the cost of verifying the model against the scene.
If m = n and t = n, recognizing a single model takes O(n^5).
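The count on this slide can be unpacked as follows. This is a sketch that assumes a 2-point basis (the similarity case from the problem statement), so that a candidate transformation is fixed by matching one pair of model points to one pair of scene points; the slide itself does not spell out this step.

```latex
\underbrace{O(m^{2})}_{\text{model pairs}} \times
\underbrace{O(n^{2})}_{\text{scene pairs}} \times
\underbrace{O(t)}_{\text{verification}}
= O\!\big((mn)^{2}\,t\big),
\qquad
m = n,\; t = n \;\Rightarrow\; O(n^{4} \cdot n) = O(n^{5}).
```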

14 General Framework (1/2)
Two-stage algorithm:
1. Preprocessing (for each model):
For each pair of feature points:
- Define a local coordinate basis on this pair.
- Compute the coordinates of all other feature points in this basis and quantize them.
- Record (model, basis) in the hash table entry indexed by each quantized coordinate.
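A minimal sketch of the preprocessing stage for the 2D similarity case, under stated assumptions: points are (x, y) tuples, the basis pair is mapped to (0, 0) and (1, 0) using complex arithmetic, and the bin width `cell` as well as the names `basis_coords`, `quantize` and `build_table` are hypothetical choices for illustration, not taken from the paper.

```python
from collections import defaultdict
from itertools import permutations


def basis_coords(p, a, b):
    """Coordinates of p in the frame where a maps to (0, 0) and b to (1, 0).

    Assumes a and b are distinct points.
    """
    z = (complex(*p) - complex(*a)) / (complex(*b) - complex(*a))
    return (z.real, z.imag)


def quantize(c, cell=0.25):
    """Map continuous coordinates to an integer bin (cell is an assumed width)."""
    return (round(c[0] / cell), round(c[1] / cell))


def build_table(models, cell=0.25):
    """Hash every (model, basis) pair under the bins of the remaining points."""
    table = defaultdict(list)
    for name, pts in models.items():
        # Every ordered pair of feature points serves as a basis once.
        for i, j in permutations(range(len(pts)), 2):
            for k, p in enumerate(pts):
                if k in (i, j):
                    continue
                key = quantize(basis_coords(p, pts[i], pts[j]), cell)
                table[key].append((name, (i, j)))
    return table


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
table = build_table({"square": square})
print(sum(len(entries) for entries in table.values()), "entries in the table")
```

Each model with n points contributes n(n-1) ordered bases times n-2 remaining points, which is where the table's memory cost comes from.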

15 General Framework (2/2)
2. Online recognition (given a scene, extract feature points):
a) Pick an arbitrary ordered pair of scene points:
- Compute the coordinates of the other points using this pair as a basis.
- For each transformed point, vote for all (model, basis) records that appear in the corresponding hash table entry, and histogram the votes.
b) Matching candidates: (model, basis) pairs with a large number of votes.
c) Recover the transformation that gives the best least-squares match between all corresponding feature points.
d) Transform the model features and verify them against the input image features (if verification fails, return to step a with another pair).
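The recognition steps can be sketched end to end on a toy example. Everything below is an illustrative assumption rather than the authors' implementation: points are complex numbers, `CELL` is an arbitrary bin width, the five-point model and the similarity transformation are made up, and the least-squares step uses correspondences by index (known here by construction) instead of recovering them from the table.

```python
import cmath
from collections import Counter, defaultdict
from itertools import permutations

CELL = 0.25  # hypothetical bin width


def key(p, a, b):
    """Quantized coordinates of p in the basis (a, b); points are complex."""
    z = (p - a) / (b - a)
    return (round(z.real / CELL), round(z.imag / CELL))


def build_table(models):
    """Preprocessing: record (model, basis) under each quantized coordinate."""
    table = defaultdict(list)
    for name, pts in models.items():
        for i, j in permutations(range(len(pts)), 2):
            for k, p in enumerate(pts):
                if k != i and k != j:
                    table[key(p, pts[i], pts[j])].append((name, (i, j)))
    return table


def vote(table, scene, i, j):
    """Step (a): use scene points i, j as basis and histogram the table hits."""
    votes = Counter()
    for k, p in enumerate(scene):
        if k != i and k != j:
            votes.update(table.get(key(p, scene[i], scene[j]), []))
    return votes


def fit_similarity(src, dst):
    """Step (c): least-squares a, b minimising sum |a*z + b - w|^2."""
    n = len(src)
    zc, wc = sum(src) / n, sum(dst) / n
    num = sum((w - wc) * (z - zc).conjugate() for z, w in zip(src, dst))
    den = sum(abs(z - zc) ** 2 for z in src)
    a = num / den
    return a, wc - a * zc


model = [0 + 0j, 4 + 0j, 4 + 2j, 1 + 3j, 0 + 1j]
table = build_table({"model": model})


def transform(z):
    """Made-up similarity: scale 2, rotate 30 degrees, translate by (5, -3)."""
    return 2 * cmath.exp(1j * cmath.pi / 6) * z + (5 - 3j)


# Scene = transformed model plus one clutter point.
scene = [transform(z) for z in model] + [transform(10 + 10j)]

votes = vote(table, scene, 0, 1)            # steps (a) and (b)
print(votes.most_common(3))

a, b = fit_similarity(model, scene[:len(model)])  # step (c)
print("scale", abs(a), "rotation", cmath.phase(a), "translation", b)
```

Step (d) would then apply the fitted (a, b) to every model point and compare against the scene; it is omitted here to keep the sketch short.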

16 Two-Stage Algorithm (1/2) [1]

17 Two-Stage Algorithm (2/2) [1]

18 Complexity
Assume m = n, and let k be the number of points needed to define the basis.
- Preprocessing: O(n^(k+1)) for a single model.
- Recognition: O(n^(k+1)) against all objects in the database.
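A short derivation of where the exponent k+1 comes from. It assumes ordered k-point bases and constant-time hash table access with a bounded number of records per entry; neither assumption is stated on the slide.

```latex
\underbrace{\frac{n!}{(n-k)!}}_{\text{ordered } k\text{-point bases}}
\times
\underbrace{(n-k)}_{\text{remaining points}}
= O(n^{k}) \cdot O(n) = O(n^{k+1})
```

For example, with k = 2 and n = 10 a single model contributes 10 x 9 x 8 = 720 table records. Recognition has the same bound in the worst case, when every one of the O(n^k) scene bases has to be tried and each trial casts O(n) votes.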

19 Under Various Transformations (1/2)
1. Translation in 2D and 3D.
- 1-point basis.
- O(n^2).
2. Similarity transformation in 2D.
- 2-point basis.
- O(n^3).
3. Similarity transformation in 3D.
- 3-point basis.
- O(n^4).

20 Under Various Transformations (2/2)
4. Affine transformation.
- 3-point basis.
- O(n^4).
5. Projective transformation.
- 4-point basis.
- O(n^5).
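To illustrate why three points suffice as a basis in the affine case, the sketch below computes the coordinates of a point relative to a 3-point basis and checks that they do not change when an affine map is applied to all points. The function names and the particular affine map are hypothetical and chosen only for this example.

```python
def affine_coords(p, e0, e1, e2):
    """Solve p - e0 = u*(e1 - e0) + v*(e2 - e0) for (u, v) by Cramer's rule."""
    ax, ay = e1[0] - e0[0], e1[1] - e0[1]
    bx, by = e2[0] - e0[0], e2[1] - e0[1]
    px, py = p[0] - e0[0], p[1] - e0[1]
    det = ax * by - ay * bx
    if det == 0:
        raise ValueError("basis points are collinear")
    return ((px * by - py * bx) / det, (ax * py - ay * px) / det)


def affine_map(p):
    """Made-up affine map: linear part [[2, 1], [0.5, 3]], translation (7, -4)."""
    x, y = p
    return (2 * x + 1 * y + 7, 0.5 * x + 3 * y - 4)


e0, e1, e2, p = (0, 0), (2, 0), (0, 4), (1, 1)

before = affine_coords(p, e0, e1, e2)
after = affine_coords(affine_map(p), affine_map(e0), affine_map(e1), affine_map(e2))
print(before, after)  # the two pairs agree up to floating-point error
```

These invariant coordinates play the same role as the 2-point-basis coordinates in the similarity case: they are what gets quantized and used as the hash key.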

21 Recognition of 3D Objects from 2D Images (1/5)
1. Correspondence of planes
- Preprocessing: consider planar sections of the 3D object which contain three or more interest points.
- Hash the (model, plane, basis) triplet.
- Use either a projective or an affine transformation.
- Once the plane correspondence has been established, the position of the entire 3D body is solved.

22 Recognition of 3D Objects from 2D Images (2/5)
2. Singular affine transformation
A x + b = U, where
- A: 2x3 affine matrix
- x: 3x1 3D vector
- b: 2x1 2D translation vector
- U: 2x1 image point

23 Recognition of 3D Objects from 2D Images (3/5)
A set of four non-coplanar points in 3D defines a 3D affine basis:
- One point as the origin.
- The vectors between the origin and the other three points as the unit (oblique) coordinate system.
Preprocess the model points in this four-point basis.

24 Recognition of 3D Objects from 2D Images (4/5)
Recognition:
- Pick four points p0, p1, p2, and p3, giving three vectors v1, v2, and v3 in the 2D image.
- Three vectors in the plane are linearly dependent, so there exist α, β, γ with α v1 + β v2 + γ v3 = 0, where (α, β, γ) ≠ 0.
- For a point p in the image, let v be the vector from p0 to p.
- Vote for all t ≠ 0 (a line with parameter t): v = (ξ + t α) v1 + (η + t β) v2 + (t γ) v3, where (ξ, η) are the coordinates of v in the v1, v2 basis.
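The line of votes can be checked numerically. In the sketch below the dependency coefficients of v1, v2, v3 are called (alpha, beta, gamma) and the coordinates of v in the v1, v2 basis are called (xi, eta); these names, the projection matrix A, the translation b and the 3D points are all assumptions made up for the example. The dependency is obtained as the cross product of the two rows of the 2x3 matrix [v1 v2 v3], which is one standard way to get a null vector and not necessarily how the paper computes it.

```python
def project(A, b, x):
    """U = A x + b, with A a 2x3 matrix, x a 3D point, b a 2D translation."""
    return tuple(sum(A[r][c] * x[c] for c in range(3)) + b[r] for r in range(2))


def sub(p, q):
    return tuple(pi - qi for pi, qi in zip(p, q))


def dependency(v1, v2, v3):
    """(alpha, beta, gamma) with alpha*v1 + beta*v2 + gamma*v3 = 0."""
    r1 = (v1[0], v2[0], v3[0])
    r2 = (v1[1], v2[1], v3[1])
    return (r1[1] * r2[2] - r1[2] * r2[1],
            r1[2] * r2[0] - r1[0] * r2[2],
            r1[0] * r2[1] - r1[1] * r2[0])


def coords_2d(v, v1, v2):
    """(xi, eta) with v = xi*v1 + eta*v2; assumes v1, v2 are independent."""
    det = v1[0] * v2[1] - v1[1] * v2[0]
    return ((v[0] * v2[1] - v[1] * v2[0]) / det,
            (v1[0] * v[1] - v1[1] * v[0]) / det)


def combine(c, v1, v2, v3):
    return tuple(c[0] * v1[i] + c[1] * v2[i] + c[2] * v3[i] for i in range(2))


A = ((1.0, 0.0, 0.5), (0.0, 1.0, 0.25))   # made-up 2x3 projection
b = (3.0, -2.0)
basis3d = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
x = (2.0, 3.0, 4.0)  # its 3D affine coordinates in this basis are (2, 3, 4)

q0, q1, q2, q3 = (project(A, b, p) for p in basis3d)
v1, v2, v3 = sub(q1, q0), sub(q2, q0), sub(q3, q0)
v = sub(project(A, b, x), q0)

alpha, beta, gamma = dependency(v1, v2, v3)
xi, eta = coords_2d(v, v1, v2)


def line_point(t):
    """Candidate 3D affine coordinates on the voting line."""
    return (xi + t * alpha, eta + t * beta, t * gamma)


for t in (-1.0, 0.5, 4.0):
    print(t, line_point(t), combine(line_point(t), v1, v2, v3))
```

Every point on the line reproduces the same image vector v, which is why a single image cannot pin down the third coordinate and the vote has to be cast along the whole line. The true 3D coordinates of x are the point on that line with t = 4 in this example.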

25 Recognition of 3D Objects from 2D Images (5/5)
3. Establishing a viewing angle with similarity transformation.
- Tessellate a viewing sphere (uniform in spherical coordinates).
- Record (model, basis, angle) in the hash table.
- 2-point basis: O(n^3) (the same order as without the viewing angle, because the viewing angle introduces only a constant factor that is independent of the scene).

26 Recognition of Polyhedral Objects
Polygonal objects:
- Choose an edge as the basis, and record (model, basis edge) in the hash table.
- Preprocessing and recognition are O(n^2).
[1]

27 Comparisons (1/2)
1. With the alignment method.
- Uses exhaustive enumeration of all possible pairs in the objects and the images.
- Geometric hashing can process all models simultaneously, while the alignment method processes models sequentially.
- The alignment method does not require any additional memory, while geometric hashing requires a large amount of memory to store the hash table.
- Geometric hashing is more efficient if:
  - The scene contains enough features (6-10) for efficient recognition by voting.
  - There are many models.

28 Comparisons (2/2)
2. With the Generalized Hough Transform (GHT).
- GHT quantizes all possible (continuous) transformations between the model and the scene into a set of bins, while geometric hashing quantizes just the (discrete) transformation represented by the basis.

29 Summary
- Ability to recognize objects that have undergone an arbitrary transformation.
- Can perform partial matching.
- Efficient and can be parallelized easily.
- Uses a transformation-invariant access key to the hash table.
- Two phases (preprocessing and recognition).
- Requires a large amount of memory to store the hash table.

30 References [1] Yehezkel Lamdan and Haim J. Wolfson, Geometric Hashing: A General and Efficient Model-Based Recognition Scheme, ICCV, 1988.

