Automatic Matching of Multi-View Images


Ed Bremer, University of Rochester

References
[1] Mikolajczyk, K., Schmid, C., 2004. A performance evaluation of local descriptors. Submitted to PAMI, October 2004. http://lear.inrialpes.fr/pubs/2004/MS04a
[2] Mikolajczyk, K., Tuytelaars, T., Schmid, C., Zisserman, A., Matas, J., Schaffalitzky, F., Kadir, T., Van Gool, L., 2004. A comparison of affine region detectors. Submitted to the International Journal of Computer Vision, August 2004. http://lear.inrialpes.fr/pubs/2004/MTSZMSKG04
[3] Lowe, D., 2004. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2), pp. 91-118.
[4] Matas, J., Chum, O., Urban, M., Pajdla, T., 2002. Robust wide baseline stereo from maximally stable extremal regions. Proc. British Machine Vision Conference (BMVC 2002), pp. 384-393.
[5] Zisserman, A., Schaffalitzky, F., 2002. Multi-view matching for unordered image sets, or "How do I organize my holiday snaps?". Proc. 7th European Conference on Computer Vision, Copenhagen, Denmark, vol. 1, pp. 414-431.
[6] Baumberg, A., 2000. Reliable feature matching across widely separated views. Proc. CVPR, pp. 774-781.
[7] Mikolajczyk, K., Schmid, C., 2001. Indexing based on scale invariant interest points. Proc. 8th ICCV, pp. 525-531.

Outline
- Motivation
- Applications
- Process Components
- Region Detectors
- Descriptors
- Matching Criteria
- Performance Evaluation
- Conclusion & Next Steps

Motivation
Multi-view / multi-image matching: multiple images of a 3D scene taken by one or more cameras under different rotation, scale, viewpoint, and illumination.

Motivation - Applications
Detecting matching regions is used in all of the following:
- Image registration
- Super-resolution
- Stereo vision
- Object detection and recognition
- Object and motion tracking
- Indexing and retrieval of objects
- 3D scene reconstruction
- Scene recognition

Examples of multi-view images [2] (figure).

Process Components [1]
- Covariant region detection: detect image regions that are covariant to the class of transformation between the reference image and the transformed image.
- Invariant descriptor: compute invariant descriptors from the covariant regions.
- Descriptor matching: compute distances between descriptors in the reference image and the transformed image.
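As a rough illustration of this detect-describe-match pipeline, the sketch below uses OpenCV's built-in SIFT detector/descriptor and a brute-force Euclidean matcher; the image filenames are placeholders, and this is not the exact detector/descriptor combination evaluated in [1].

```python
import cv2

# Placeholder filenames: any two views of the same scene.
img_ref = cv2.imread("reference.png", cv2.IMREAD_GRAYSCALE)
img_tfm = cv2.imread("transformed.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()

# Steps 1 and 2: covariant region detection + invariant description.
kp_ref, desc_ref = sift.detectAndCompute(img_ref, None)
kp_tfm, desc_tfm = sift.detectAndCompute(img_tfm, None)

# Step 3: descriptor matching by nearest neighbour in Euclidean distance.
matcher = cv2.BFMatcher(cv2.NORM_L2)
matches = sorted(matcher.match(desc_ref, desc_tfm), key=lambda m: m.distance)

print(f"{len(matches)} putative matches between the two views")
```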

Region Detectors [1]
- Support regions for the computation of descriptors.
- Determined independently in each image.
- Scale invariant or affine invariant.
- Can be points (feature points) or regions (covariant).
- Provide dense, local coverage, which makes matching robust to occlusion.
- Need to be stable and repeatable.
Five region detectors:
- Harris points -> invariant to rotation
- Harris-Laplace -> invariant to rotation and scale
- Hessian-Laplace -> invariant to rotation and scale
- Harris-Affine -> invariant to affine image transformations
- Hessian-Affine -> invariant to affine image transformations

Region Detectors [1]
Harris points:
- Maxima of the Harris function are used to locate interest points.
- The support region is fixed in size: a 41x41 neighborhood centered on the interest point.
Harris-Laplace regions:
- Based on a scale-adapted Harris function.
- The interest point is a local minimum or maximum of the Laplacian-of-Gaussian across scale space.
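For the plain Harris case, a minimal sketch using OpenCV's cornerHarris is shown below; the block size, Sobel aperture, k, and threshold are illustrative defaults, the filename is a placeholder, and non-maximum suppression is omitted for brevity.

```python
import cv2
import numpy as np

img = cv2.imread("reference.png", cv2.IMREAD_GRAYSCALE)  # placeholder filename
gray = np.float32(img)

# Harris response: blockSize = neighbourhood for the second-moment matrix,
# ksize = Sobel aperture, k = empirical Harris constant.
response = cv2.cornerHarris(gray, blockSize=2, ksize=3, k=0.04)

# Keep points whose response exceeds an illustrative threshold.
ys, xs = np.where(response > 0.01 * response.max())
print(f"{len(xs)} Harris interest points")
```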

Region Detectors [7]
Harris-Laplace performance:
- Approximately 10% better than the Laplacian, Lowe (DoG), or gradient methods.
- The standard Harris detector performs very poorly under scale changes.

Region Detectors [1][3]
Hessian-Laplace regions:
- Interest points lie at local maxima of the Hessian determinant.
- The location in scale space is selected using maxima of the Laplacian-of-Gaussian (a Difference-of-Gaussians can also be used).
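A sketch of the determinant-of-Hessian response at a single scale, using SciPy Gaussian derivatives; the scale selection via the Laplacian described above would be layered on top, and the random test image is only there to make the snippet runnable.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hessian_determinant(image, sigma):
    """Scale-normalised determinant-of-Hessian response at one scale."""
    # Second-order Gaussian derivatives; order = (row, column) derivative orders.
    Lxx = gaussian_filter(image, sigma, order=(0, 2))
    Lyy = gaussian_filter(image, sigma, order=(2, 0))
    Lxy = gaussian_filter(image, sigma, order=(1, 1))
    # sigma**4 normalisation keeps responses comparable across scales.
    return sigma**4 * (Lxx * Lyy - Lxy**2)

if __name__ == "__main__":
    img = np.random.rand(128, 128)            # placeholder image
    resp = hessian_determinant(img, sigma=2.0)
    print(resp.shape, resp.max())
```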

Region Detectors [2]
Harris-Affine regions:
- Find regions with the Harris-Laplace detector.
- Adapt each region to affine shape using the second-moment matrix.
Hessian-Affine regions:
- Find regions with the Hessian-Laplace detector.
- Adapt each region to affine shape using the second-moment matrix.

Region Detectors [2]
Regions produced by the Harris-Affine and Hessian-Affine detectors (figure).

Region Detectors [2]
Affine normalization using the second-moment matrix for corresponding regions L and R (figure).
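For reference, the second-moment matrix commonly used for this affine adaptation can be written as below; this is the standard Harris/Hessian-Affine formulation rather than an equation recovered from the slide, so treat it as a reminder of the idea.

$$
\mu(\mathbf{x}, \sigma_I, \sigma_D) = \sigma_D^2 \, g(\sigma_I) * \begin{bmatrix} L_x^2(\mathbf{x}, \sigma_D) & L_x L_y(\mathbf{x}, \sigma_D) \\ L_x L_y(\mathbf{x}, \sigma_D) & L_y^2(\mathbf{x}, \sigma_D) \end{bmatrix}
$$

Each detected region is normalized by $\mathbf{x}' = \mu^{-1/2}\mathbf{x}$; after this transformation, the two corresponding regions L and R are related by a pure rotation.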

Region Detectors [1]
Region normalization:
- Detectors produce circular or elliptical regions whose size depends on the detection scale.
- Map each region to a circular region of constant radius.
- Rotate each region to align with the dominant gradient orientation.
Illumination normalization:
- Model the illumination change as an affine transformation aI(x) + b.
- Normalize using the mean and standard deviation of the pixel intensities.
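A one-line version of this illumination normalization, as a sketch: the affine change aI(x) + b is removed by mapping each patch to zero mean and unit standard deviation.

```python
import numpy as np

def normalize_patch(patch, eps=1e-8):
    """Cancel an affine illumination change a*I(x) + b by mapping the
    patch to zero mean and unit standard deviation."""
    patch = patch.astype(np.float64)
    return (patch - patch.mean()) / (patch.std() + eps)
```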

Descriptors [1]
- A descriptor is a feature vector computed from the support region.
- Invariant to changes in scale, rotation, affine image transformations, and affine illumination.
- Needs to be distinctive, stable, and repeatable.
- Distribution (histogram) type or covariance type.
Ten descriptor types:
- Scale-Invariant Feature Transform (SIFT)
- Gradient Location and Orientation Histogram (GLOH)
- Shape Context
- PCA-SIFT (Principal Component Analysis SIFT)
- Steerable Filters
- Differential Invariants
- Complex Filters
- Moment Invariants
- Cross-Correlation
- Spin Image

Descriptors [1]
SIFT and GLOH 3D histogram descriptors:
- SIFT -> 4 x 4 location grid x 8 orientations = 128-dimensional descriptor
- GLOH -> log-polar location grid, [(2 x 8) + 1] x 16 = 272-dimensional descriptor

Matching Criteria
Distance measure:
- Used to find putative matches between images.
- Mahalanobis distance for covariance-type descriptors; Euclidean distance for distribution (histogram) descriptors.
- Direct distance comparison is not suitable for indexing or database searching.
Simple threshold:
- Two descriptors match if the distance between them is below a threshold t.
- A descriptor in the reference image can match many descriptors in the transformed image.
Nearest neighbor (NN):
- Match each descriptor in the reference image to its closest descriptor in the transformed image.
- A descriptor in the reference image has at most one match in the transformed image.
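The two strategies might be expressed on raw descriptor arrays as below, using plain NumPy and Euclidean distance; the all-pairs distance matrix and the threshold t are for illustration only.

```python
import numpy as np

def pairwise_distances(desc_ref, desc_tfm):
    """Euclidean distance between every (reference, transformed) descriptor pair."""
    return np.linalg.norm(desc_ref[:, None, :] - desc_tfm[None, :, :], axis=2)

def match_threshold(desc_ref, desc_tfm, t):
    """Threshold matching: every pair closer than t is a putative match,
    so one reference descriptor may match several transformed ones."""
    return np.argwhere(pairwise_distances(desc_ref, desc_tfm) < t)

def match_nearest_neighbor(desc_ref, desc_tfm):
    """Nearest-neighbour matching: each reference descriptor gets exactly
    one match, its closest transformed descriptor."""
    d = pairwise_distances(desc_ref, desc_tfm)
    return np.stack([np.arange(len(desc_ref)), d.argmin(axis=1)], axis=1)
```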

Performance Evaluation [1]
Criterion basis:
- Recall = #correct matches / #correspondences
- 1-precision = #false matches / (#correct matches + #false matches)
- An ideal descriptor gives recall = 1 for all precision values, given no overlap error.
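A tiny sketch of how these two numbers could be computed from match counts; the function name and the example counts are made up for illustration.

```python
def recall_and_one_minus_precision(n_correct, n_false, n_correspondences):
    """Evaluation criteria from [1]:
    recall      = #correct matches / #ground-truth correspondences
    1-precision = #false matches / (#correct matches + #false matches)"""
    recall = n_correct / n_correspondences
    one_minus_precision = n_false / (n_correct + n_false)
    return recall, one_minus_precision

# Example: 300 correct and 100 false matches, 500 possible correspondences.
print(recall_and_one_minus_precision(300, 100, 500))   # (0.6, 0.25)
```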

SIFT - Scale Invariant Feature Transform
Scale Invariant Feature Transform (SIFT), Lowe [3]
Features:
- Invariant to image scale and rotation.
- Robust to small changes in illumination and 3D camera viewpoint.
- Extracts a large number of highly distinctive features, which enables detection of small objects and improves performance in cluttered scenes.
- The algorithms are efficient: complex operations are applied to local regions or features rather than to the whole image.
Procedure:
1. Scale-space extrema detection
2. Keypoint localization
3. Orientation assignment
4. Keypoint descriptor

SIFT - Scale Invariant Feature Transform [3]
Scale-space blob detector:
- Search for stable features over all scales and image locations.
- The scale-space kernel is the Gaussian function.
- Candidate features come from the Difference-of-Gaussian function.
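The scale-space and DoG definitions from Lowe [3], written out here because the slide's formula images did not survive the transcript:

$$
L(x, y, \sigma) = G(x, y, \sigma) * I(x, y), \qquad G(x, y, \sigma) = \frac{1}{2\pi\sigma^2}\, e^{-(x^2 + y^2)/2\sigma^2}
$$
$$
D(x, y, \sigma) = \big(G(x, y, k\sigma) - G(x, y, \sigma)\big) * I(x, y) = L(x, y, k\sigma) - L(x, y, \sigma)
$$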

SIFT - Scale Invariant Feature Transform [3]
Difference of Gaussian (DoG):
- A simple subtraction of adjacent blurred images L.
- An approximation to the scale-normalized Laplacian of Gaussian.
- Maxima and minima of the scale-normalized Laplacian produce the most stable image features compared to the gradient, Hessian, or Harris corner function (Mikolajczyk 2002).
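The approximation behind this claim, also from Lowe [3]:

$$
G(x, y, k\sigma) - G(x, y, \sigma) \approx (k - 1)\, \sigma^2 \nabla^2 G
$$

so the DoG already carries the $\sigma^2$ normalization needed for scale invariance, and the constant factor $(k - 1)$ does not affect the locations of the extrema.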

SIFT - Scale Invariant Feature Transform [3]
Scale-space image set:
- Divide each octave into s intervals.
- Compute s + 3 filtered (increasingly blurred) images, with k = 2^(1/s).
- For s = 3, k = 1.26:
  6th -> 3.18σ
  5th -> 2.52σ
  4th -> 2.00σ
  3rd -> 1.59σ
  2nd -> 1.26σ
  1st -> 1.00σ
- Subtract adjacent images to produce the DoG images.
- Repeat for the next octave: take the Gaussian image with twice the initial σ (two images from the top of the stack) and downsample it by a factor of 2.
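A few lines reproducing this blur schedule (printed values match the slide up to rounding):

```python
s = 3                      # intervals per octave
k = 2 ** (1.0 / s)         # ~ 1.26
sigma0 = 1.0

# s + 3 = 6 Gaussian images per octave, so that DoG extrema can be found
# at s scales with a neighbouring scale above and below each one.
sigmas = [sigma0 * k**i for i in range(s + 3)]
print([round(x, 2) for x in sigmas])   # [1.0, 1.26, 1.59, 2.0, 2.52, 3.17]
```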

SIFT - Scale Invariant Feature Transform [3]
Scale-space pyramid (figure from Lowe).

SIFT - Scale Invariant Feature Transform [3]
Locating scale-space extrema:
- Detect the local maxima and minima of D(x, y, σ).
- Compare each sample point to its 8 neighbors in the same scale image and its 9 neighbors in the scale images above and below (26 neighbors in total).
- Mark the sample if it is greater than, or less than, all of its neighbors.
- Extrema are searched in s DoG images per octave.
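One compact way to express the 26-neighbour test over a stack of adjacent DoG images, using SciPy's 3-D maximum and minimum filters; this is a sketch rather than Lowe's optimized implementation, and ties with a neighbour are counted as extrema here.

```python
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter

def dog_extrema(dog):
    """dog: 3-D array (scale, y, x) of adjacent DoG images.
    Returns a boolean mask of samples that are >= or <= all 26 neighbours
    in space and scale."""
    footprint = np.ones((3, 3, 3), dtype=bool)
    is_max = dog == maximum_filter(dog, footprint=footprint)
    is_min = dog == minimum_filter(dog, footprint=footprint)
    mask = is_max | is_min
    mask[0, :, :] = mask[-1, :, :] = False   # need a scale above and below
    return mask
```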

SIFT - Scale Invariant Feature Transform [3]
Improving localization:
- Fit a 3D quadratic to D around each sample point, where $\mathbf{x} = (x, y, \sigma)^T$ is the offset from the sample point; the Hessian and derivatives of D are computed from differences of neighboring sample points.
- The offset of the extremum is $\hat{\mathbf{x}} = -\left(\frac{\partial^2 D}{\partial \mathbf{x}^2}\right)^{-1} \frac{\partial D}{\partial \mathbf{x}}$.
- Reject low-contrast points for which $|D(\hat{\mathbf{x}})| = \left|D + \frac{1}{2}\frac{\partial D}{\partial \mathbf{x}}^T \hat{\mathbf{x}}\right|$ falls below a threshold (0.03 in Lowe [3]).

SIFT - Scale Invariant Feature Transform [3]
Edge rejection:
- Eliminate poorly defined peaks along edges using the 2x2 Hessian matrix H of D at the keypoint.
- Keep a point only if the ratio of principal curvatures stays below a threshold r (r < 10): $\frac{\operatorname{tr}(H)^2}{\det(H)} < \frac{(r + 1)^2}{r}$
- Efficient to compute: fewer than 20 floating-point operations per keypoint.

SIFT - Scale Invariant Feature Transform [3]
Results from Lowe [3]: 832 keypoints reduced to 536 after the contrast and edge-rejection steps (233 x 189 image).

SIFT - Scale Invariant Feature Transform
Results from Lowe [3]: performance measures (figure).

SIFT - Scale Invariant Feature Transform
Results from Lowe [3]: performance measures, continued (figure).

SIFT - Scale Invariant Feature Transform [3]
Orientation assignment (rotational invariance):
- Use the scale of the keypoint to select the Gaussian-blurred image L(x, y, σ).
- Compute the gradient magnitude m(x, y) and orientation θ(x, y) at each image sample using pixel differences.
- Build an orientation histogram of the sample points, with entries weighted by gradient magnitude and by a Gaussian window centered on the keypoint; the bins cover the full 360° range.
- Peaks in the histogram correspond to the dominant directions of the local gradients.
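A sketch of this orientation histogram for a single keypoint, using pixel differences for the gradients and a Gaussian weighting window; the 36-bin count and the 1.5·σ window follow Lowe's description [3], while the square patch extraction and border handling are simplifying assumptions (the keypoint is assumed to lie at least `radius` pixels inside the image).

```python
import numpy as np

def orientation_histogram(L, x, y, sigma, radius=8, nbins=36):
    """Gradient-orientation histogram around keypoint (x, y) in the
    Gaussian-blurred image L at the keypoint's scale sigma."""
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    patch = L[y - radius:y + radius + 1, x - radius:x + radius + 1].astype(float)

    # Gradients from pixel differences.
    dy = np.gradient(patch, axis=0)
    dx = np.gradient(patch, axis=1)
    mag = np.hypot(dx, dy)
    theta = np.degrees(np.arctan2(dy, dx)) % 360.0

    # Weight magnitudes by a Gaussian window centred on the keypoint.
    weight = np.exp(-(xs**2 + ys**2) / (2.0 * (1.5 * sigma) ** 2))
    hist, _ = np.histogram(theta, bins=nbins, range=(0, 360), weights=mag * weight)
    return hist   # peaks give the dominant orientation(s)
```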

SIFT - Scale Invariant Feature Transform [3]
Descriptor - the feature vector:
- Gradient samples are pooled into sub-region histograms (8x8 samples in the figure), which tolerates small shifts in gradient positions.
- The full descriptor is a 128-element feature vector: a 4x4 array of histograms with 8 orientation bins each (the 2x2x8 case from Lowe is shown in the figure).
- Feature vectors are matched by nearest neighbor in Euclidean distance.

SIFT - Scale Invariant Feature Transform [3]
Results from Lowe [3]: two training objects recognized in a cluttered image.
- Small squares show point matches.
- Large rectangles show the borders of the training images after affine transformation.

Conclusions
- The Harris-Laplace region detector performs better than the Laplacian, DoG, and gradient scale-space operators.
- Scale-space detectors provide invariance to rotation and scale and tolerate small changes in illumination and viewpoint.
- Affine adaptation provides invariance to affine transformations.
- The GLOH and SIFT descriptors provide the best performance.
- Dense, localized descriptors perform well under occlusion.
Next steps
- Coding and testing of region detectors, descriptors, and matching.