A study of the 2D-SIFT algorithm
Dimitri Van Cauwelaert, IBBT – UGent – Telin – IPI

Introduction
SIFT: Scale-Invariant Feature Transform
A method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene
Introduced by David Lowe in 1999

Introduction
Feature: a local property of an image
Invariant to: image scaling and rotation
Robust matching across: a substantial range of affine distortion, change in 3D viewpoint, addition of noise and change in illumination

Introduction
[Figure: SIFT features]

Introduction
Based on a model of the behavior of complex cells in the mammalian visual cortex
Recent research by Edelman, Intrator and Poggio indicates that if feature position is allowed to shift over a small area, while orientation and spatial frequency are maintained, matching reliability increases significantly

The algorithm
For both the image and the training image, feature extraction is based on:
Scale-space extrema detection
Keypoint localization
Orientation assignment
Keypoint descriptor
Large numbers of features are generated => more reliable matching => detection of small objects in cluttered backgrounds
Typically around 2000 stable features in an image of 500x500 pixels
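The stages listed above are implemented in common libraries; as a minimal, hedged sketch (assuming OpenCV's cv2.SIFT_create, available in opencv-python 4.4 and later, and a hypothetical grayscale image file template.png), feature extraction for a single image could look like this:

```python
import cv2

# Load an image in grayscale; the file name is a placeholder for illustration.
image = cv2.imread("template.png", cv2.IMREAD_GRAYSCALE)

# Scale-space extrema detection, keypoint localization, orientation
# assignment and descriptor computation all happen inside detectAndCompute.
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(image, None)

# Each keypoint carries a location, scale and orientation;
# each descriptor is a 128-dimensional vector.
print(len(keypoints), descriptors.shape)
```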

The algorithm
Extraction, using a fast nearest-neighbor algorithm, of candidate matching features based on the Euclidean distance between the descriptor vectors
Clustering of matched features that agree on object location and pose
These clusters are subject to further detailed verification => a least-squares estimate of an affine approximation to the object pose => outliers are discarded to improve the reliability of the matching

The algorithm
Cascade filtering approach: the more computationally expensive operations are applied only to items that pass the initial tests
For small images, near real-time computation is possible

The algorithm – detection of scale space extrema
Building a scale-space pyramid: all scales must be examined to identify scale-invariant features
An efficient approach is to compute a Difference-of-Gaussian (DoG) pyramid (Burt & Adelson, 1983)

The algorithm – detection of scale space extrema
The scale space is processed one octave at a time
Each octave is resampled (downsampled) to limit computation; this can be done without aliasing problems because the Gaussian blurring has already removed the higher spatial frequencies
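As a minimal sketch of this stage (not Lowe's exact implementation: the number of levels per octave, the base sigma of 1.6 and the scale step k are illustrative assumptions), a Difference-of-Gaussian pyramid processed octave by octave could be built as follows:

```python
import cv2
import numpy as np

def dog_pyramid(image, num_octaves=4, levels_per_octave=5, sigma=1.6):
    """Build a Difference-of-Gaussian pyramid, one octave at a time."""
    k = 2.0 ** (1.0 / (levels_per_octave - 2))   # scale step between adjacent levels
    octaves = []
    base = image.astype(np.float32)
    for _ in range(num_octaves):
        # Progressively blur the base image within this octave.
        blurred = [cv2.GaussianBlur(base, (0, 0), sigma * (k ** i))
                   for i in range(levels_per_octave)]
        # DoG levels are differences of adjacent Gaussian levels.
        dogs = [blurred[i + 1] - blurred[i] for i in range(levels_per_octave - 1)]
        octaves.append(dogs)
        # Downsample by 2 for the next octave; the blurring has already
        # removed the higher spatial frequencies, so no aliasing occurs.
        base = blurred[-1][::2, ::2]
    return octaves
```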

The algorithm – detection of scale space extrema
Within one DoG octave, look for minima and maxima by comparing each sample to its neighbors in the current scale, the scale above and the scale below
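A hedged sketch of this extremum test (assuming dogs is the list of DoG images for one octave, as produced by the sketch above, and that the caller keeps s, y and x away from the borders):

```python
import numpy as np

def is_extremum(dogs, s, y, x):
    """Return True if the sample at scale s, row y, column x is a minimum or
    maximum over its 26 neighbors: the 3x3 regions in the current scale and
    in the scales directly above and below."""
    value = dogs[s][y, x]
    neighborhood = np.stack([dogs[s - 1][y - 1:y + 2, x - 1:x + 2],
                             dogs[s][y - 1:y + 2, x - 1:x + 2],
                             dogs[s + 1][y - 1:y + 2, x - 1:x + 2]])
    return value >= neighborhood.max() or value <= neighborhood.min()
```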

The algorithm – orientation assignment
Goal: expressing the feature descriptor relative to this orientation, thus achieving rotational invariance
A circular Gaussian-weighted window (with a radius depending on the scale of the keypoint) is taken around the keypoint
For each pixel within this window the magnitude and orientation of the gradient are determined
A 36-bin orientation histogram (covering 360 degrees) is filled, using the Gaussian window and the gradient magnitude as weights
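A minimal sketch of this histogram (assuming patch is a square grayscale window already extracted around the keypoint; the choice of Gaussian sigma is an assumption for illustration):

```python
import numpy as np

def orientation_histogram(patch, num_bins=36, sigma=None):
    """Fill a 36-bin gradient-orientation histogram, weighted by gradient
    magnitude and by a Gaussian window centered on the keypoint."""
    dy, dx = np.gradient(patch.astype(np.float32))
    magnitude = np.hypot(dx, dy)
    orientation = np.degrees(np.arctan2(dy, dx)) % 360.0

    # Gaussian weighting window centered on the patch.
    h, w = patch.shape
    if sigma is None:
        sigma = 0.5 * w
    yy, xx = np.mgrid[0:h, 0:w]
    gauss = np.exp(-(((yy - h / 2.0) ** 2 + (xx - w / 2.0) ** 2) / (2.0 * sigma ** 2)))

    hist = np.zeros(num_bins)
    bins = (orientation / (360.0 / num_bins)).astype(int) % num_bins
    np.add.at(hist, bins, magnitude * gauss)
    return hist
```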

The algorithm – orientation assignment
The highest peak in the smoothed histogram gives the assigned orientation
Peaks with more than 80% of the value of the highest peak are also assigned as additional orientations
A parabola is fit to the 3 histogram values closest to each peak to interpolate the peak position for better accuracy
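The parabolic interpolation itself is a standard three-point fit; a minimal sketch:

```python
def interpolate_peak(left, center, right):
    """Fit a parabola through three neighboring histogram values and return
    the sub-bin offset of the true peak relative to the center bin."""
    denominator = left - 2.0 * center + right
    if denominator == 0.0:
        return 0.0
    return 0.5 * (left - right) / denominator
```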

The algorithm – the local image descriptor
Again a Gaussian weighting function is taken around the keypoint location
Within this window the gradient magnitudes and orientations are rotated according to the assigned keypoint orientation
The 16x16 samples around the keypoint are grouped into a 4x4 array of subregions
In each subregion the samples are added to orientation bins (here 8), again using the Gaussian window and the gradient magnitude as weighting functions

The algorithm – the local image descriptor

The algorithm – the local image descriptor
To avoid significant changes in the descriptor vector when a sample shifts from one pixel group to another, samples are weighted in and out of neighboring groups using an additional linear weighting function
Dimensionality: using r orientation bins for each pixel group and an n x n array of pixel groups, the resulting feature vector has r x n x n dimensions (8 x 4 x 4 = 128 for the configuration above)
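A hedged sketch of the descriptor layout (simplified: it skips the linear weighting between neighboring groups described above; magnitude, orientation and gauss are assumed to be 16x16 arrays of gradient magnitude, gradient orientation in degrees already rotated relative to the keypoint orientation, and Gaussian weights):

```python
import numpy as np

def build_descriptor(magnitude, orientation, gauss, n=4, r=8):
    """Group the 16x16 gradient samples into an n x n array of subregions and
    accumulate an r-bin orientation histogram per subregion, giving an
    r * n * n = 128-dimensional vector for the default n=4, r=8."""
    cell = magnitude.shape[0] // n            # samples per subregion side
    descriptor = np.zeros((n, n, r))
    bins = (orientation / (360.0 / r)).astype(int) % r
    weights = magnitude * gauss
    for i in range(n):
        for j in range(n):
            rows = slice(i * cell, (i + 1) * cell)
            cols = slice(j * cell, (j + 1) * cell)
            np.add.at(descriptor[i, j], bins[rows, cols].ravel(),
                      weights[rows, cols].ravel())
    descriptor = descriptor.ravel()
    # Normalizing to unit length adds robustness to illumination changes.
    norm = np.linalg.norm(descriptor)
    return descriptor / norm if norm > 0 else descriptor
```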

The algorithm – matching to large databases
Matching features in two images:
Using the Euclidean distance between the two descriptor vectors and then thresholding it would be intuitive, but turns out not to give reliable results
A more effective measure is obtained by comparing the distance to the closest neighbor with the distance to the second-closest neighbor
The distance of a correct match must be significantly smaller than the distance to the second-closest neighbor in order to avoid ambiguity

The algorithm – matching to large databases
A threshold of 0.8 on this distance ratio provided excellent separation between correct and false matches
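A minimal sketch of this ratio test with OpenCV (assuming desc_template and desc_scene are SIFT descriptor arrays computed as in the earlier sketch):

```python
import cv2

# Brute-force matcher with Euclidean (L2) distance, as appropriate for SIFT.
matcher = cv2.BFMatcher(cv2.NORM_L2)

# For each template descriptor, find its two closest neighbors in the scene.
pairs = matcher.knnMatch(desc_template, desc_scene, k=2)

# Keep a match only if the closest neighbor is clearly better than the
# second closest (the 0.8 ratio threshold mentioned above).
good = [m for m, n in pairs if m.distance < 0.8 * n.distance]
```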

The algorithm – matching to large databases
No algorithms are known that can identify the exact nearest neighbor of a point in a high-dimensional space more efficiently than exhaustive search
Algorithms such as the k-d tree provide no speedup in this setting
An approximate algorithm called Best-Bin-First (BBF) is used instead:
Bins in feature space are searched in order of their distance from the query location (priority queue)
Only the first x bins are tested
The closest neighbor is returned with high probability
Drastic increase in speed
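Approximate nearest-neighbor search in this spirit is available off the shelf; a hedged sketch using OpenCV's FLANN-based matcher (which relies on randomized k-d trees rather than Lowe's original BBF, so this is a stand-in rather than the method described in the paper):

```python
import cv2

# FLANN with randomized k-d trees; 'checks' bounds how many tree leaves are
# examined, trading accuracy for speed, much like testing only the first
# few candidate bins in BBF.
index_params = dict(algorithm=1, trees=4)   # 1 = FLANN_INDEX_KDTREE
search_params = dict(checks=50)
flann = cv2.FlannBasedMatcher(index_params, search_params)

pairs = flann.knnMatch(desc_template, desc_scene, k=2)
good = [m for m, n in pairs if m.distance < 0.8 * n.distance]
```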

The algorithm – matching to large databases
The Hough transform identifies clusters of features with a consistent interpretation by using each feature to vote for all object poses that are consistent with that feature
The affine transformation has 6 degrees of freedom, so with a minimum of 3 point correspondences from a cluster we can estimate the affine transformation between the image and the training image
Clusters of fewer than 3 features are discarded
Using all the features within a cluster, a least-squares solution is determined for the fitted affine transformation
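A minimal sketch of the least-squares affine fit for one cluster (assuming src and dst are N x 2 arrays of corresponding keypoint coordinates with N >= 3; this shows only the fit, not the Hough voting itself):

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares estimate of the 6-parameter affine transform that maps
    the template points 'src' onto the scene points 'dst'."""
    n = src.shape[0]
    # Each correspondence contributes two equations in the 6 unknowns
    # (m11, m12, m21, m22, tx, ty).
    A = np.zeros((2 * n, 6))
    b = dst.reshape(-1)
    A[0::2, 0:2] = src      # x' = m11*x + m12*y + tx
    A[0::2, 4] = 1.0
    A[1::2, 2:4] = src      # y' = m21*x + m22*y + ty
    A[1::2, 5] = 1.0
    params, *_ = np.linalg.lstsq(A, b, rcond=None)
    matrix = params[:4].reshape(2, 2)
    translation = params[4:]
    return matrix, translation
```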

The algorithm – matching to large databases
Each feature in the cluster is then checked for agreement with the least-squares solution; features that deviate too much are discarded and the least-squares solution is recomputed
After several iterations (provided the number of remaining features in the cluster does not fall below 3), a reliable affine transformation is determined
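A hedged sketch of this iterative outlier rejection, built on the fit_affine sketch above (the pixel deviation threshold is an illustrative assumption):

```python
import numpy as np

def robust_affine(src, dst, threshold=5.0, max_iterations=10):
    """Iteratively re-fit the affine transform, discarding correspondences
    whose residual exceeds 'threshold' pixels, until the inlier set
    stabilizes or fewer than 3 correspondences remain."""
    keep = np.ones(len(src), dtype=bool)
    for _ in range(max_iterations):
        if keep.sum() < 3:
            return None                      # cluster rejected
        matrix, translation = fit_affine(src[keep], dst[keep])
        residuals = np.linalg.norm(src @ matrix.T + translation - dst, axis=1)
        new_keep = residuals < threshold
        if np.array_equal(new_keep, keep):
            break
        keep = new_keep
    return matrix, translation
```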

Demo – recognition of a car
We will use a template of a car and try to match it against a scene in which this car is present
[Figure: template image]

Demo – recognition of a car
t = 0 ms: five points from the template are correctly identified in the scene

Demo – recognition of a car
t = 400 ms: six points from the template are correctly identified in the scene; note, however, the incorrect match on the right of the image

Demo – recognition of a car
t = 800 ms: six points from the template are correctly identified in the scene

Demo – recognition of a car
t = 1200 ms: six points from the template are correctly identified in the scene (one point does not belong to the car, however)

Demo – recognition of a car
t = 1600 ms: more points are recognized; two points are wrongly matched

Demo – recognition of a car
t = 2000 ms: many points are correctly matched (this is to be expected, since the template was derived from this image); two points are incorrectly matched

Demo – recognition of people
Most points are reliably matched; there are some outliers, however, which could be removed by using a model for consistency in the mapping process

Demo – recognition of people
The algorithm clearly falls short in matching the person in this scene, given the large differences in viewpoint, illumination and scale
Notice that even for humans this matching task is not straightforward

Results
The technique appears to be largely robust against image rotation, scaling, a substantial range of affine distortion, addition of noise and changes in illumination
Extracting large numbers of features leads to robustness when extracting small objects among clutter
However, in-depth (out-of-plane) rotation of more than about 20 degrees results in much lower recognition rates
Computationally efficient

Applications
View matching for 3D reconstruction => structure from motion
Motion tracking and segmentation
Robot localization
Image panorama assembly
Epipolar calibration

Applications
Image panorama assembly

Applications
Robot localization, motion tracking

Applications
Sony Aibo (Evolution Robotics), SIFT usage:
Recognizing its charging station
Communicating with visual cards

Future work
Evaluation of the algorithm for matching faces in a cluttered environment, while systematically varying scale, rotation, viewpoint and illumination

Future work
Using the algorithm for long-range tracking of objects
Filtering using a priori knowledge: for example, in video we have an estimate of the velocity vector calculated from previous frames; integration gives a bounding box where the match is to be found
Integration of new techniques:
SURF (Speeded-Up Robust Features)
GLOH (Gradient Location and Orientation Histogram), which uses principal component analysis

Future work
Evaluation of other descriptors, e.g. incorporating illumination-invariant color parameters
Incorporation of texture parameters (a descriptor built over several scales rather than only the current scale)
A dynamic descriptor rather than a static one: training determines which parameters should be used
Closer study of recent achievements in biological studies of mammalian vision; mammals are still much better at recognition than computer algorithms, which points to promising opportunities