A Tutorial on using SIFT Presented by Jimmy Huff (Slightly modified by Josiah Yoder for Winter 2014-2015)

Introduction The Scale-Invariant Feature Transform (SIFT) by David Lowe is useful in many applications of object recognition. Our objective in this presentation is to understand how to extract SIFT descriptors from an image.

Introduction To extract SIFT keypoints, we use a cascaded filtering algorithm with the following four steps of filtering: –Scale-Space Extrema Detection –Keypoint Localization –Orientation Assignment –Keypoint Descriptor The algorithm is efficient because its more expensive operations are performed only on a small subset of points from the initial image.
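
For reference, OpenCV ships an implementation of this entire pipeline. A minimal sketch (the input file name truck.png is a placeholder; cv2.SIFT_create requires OpenCV 4.4 or later, where SIFT moved into the main module):

```python
import cv2

# Load a grayscale image (placeholder file name).
img = cv2.imread("truck.png", cv2.IMREAD_GRAYSCALE)

# Run the full four-step pipeline described in the slides below.
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)

# Each keypoint carries the quantities derived in the four steps:
# location (kp.pt), scale (kp.size), and orientation (kp.angle);
# descriptors is an N x 128 array, one row per keypoint.
print(len(keypoints), descriptors.shape)
```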

Scale-Space Extrema Detection – Get the Points! In order to have scale-invariant features, we must have a way to extract features from an image across all scales. This can be done using a continuous function of scale known as scale space (Witkin, 1983). Under reasonable assumptions, the only possible scale-space kernel is the Gaussian function. Lowe proposed using the Difference of Gaussians (DOG) to collect extrema as interest points.

Scale-Space Extrema Detection – Get the Points! Scale space organizes an image into octaves, each containing S levels. The smoothing is done incrementally such that the σ of the (S+1)-th image in the octave is twice that of the first image.
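
Concretely, each level within an octave multiplies σ by a constant factor k = 2^(1/S), so that after S steps σ has doubled. A small sketch of the schedule (σ₀ = 1.6 and S = 3 follow the values suggested in Lowe's paper, which also generates S + 3 blurred images per octave so extrema detection covers the full octave):

```python
# Per-level blur inside one octave: sigma_i = sigma0 * k**i with k = 2**(1/S),
# so the sigma of level S is exactly twice that of level 0.
sigma0, S = 1.6, 3          # Lowe's suggested base scale and levels per octave
k = 2 ** (1.0 / S)
sigmas = [sigma0 * k ** i for i in range(S + 3)]  # S + 3 images -> S + 2 DOGs
print(sigmas)               # [1.6, 2.0158..., 2.5398..., 3.2, ...]
```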

Scale-Space Extrema Detection – Get the Points!

DOG is used for its efficiency: each DOG image is simply the difference of two adjacent Gaussian-blurred images in the octave. Using these images, we may now find the extrema for this octave.
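
A minimal sketch of building one octave of DOG images, assuming a grayscale input `img` and the `sigmas` list from the previous sketch:

```python
import cv2
import numpy as np

img_f = img.astype(np.float32)  # work in float so subtraction cannot wrap

# Blur the image at each sigma in the schedule (ksize=(0, 0) lets OpenCV
# derive the kernel size from sigma), then subtract adjacent pairs.
gaussians = [cv2.GaussianBlur(img_f, (0, 0), s) for s in sigmas]
dogs = [g1 - g0 for g0, g1 in zip(gaussians, gaussians[1:])]
```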

Scale-Space Extrema Detection – Get the Points! If a point is greater than or less than all 26 of its neighbors (8 in its own DOG image and 9 in each of the DOG images above and below), it is regarded as an extreme point. This is a relatively inexpensive step, as most points are rejected before being compared to every neighbor. Note that this comparison cannot be done on the boundaries of an image or on the top and bottom DOG.
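
A sketch of the 26-neighbor test, assuming the `dogs` list built above, an interior level i (not the top or bottom DOG), and a point away from the image boundary:

```python
import numpy as np

def is_extremum(dogs, i, y, x):
    # 3 x 3 x 3 cube of DOG values centered on the candidate point.
    cube = np.stack([d[y - 1:y + 2, x - 1:x + 2] for d in dogs[i - 1:i + 2]])
    center = dogs[i][y, x]
    # The candidate is an extremum if no neighbor exceeds it in either direction.
    return center == cube.max() or center == cube.min()
```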

Scale-Space Extrema Detection – Get the Points! Each octave is processed separately. Each octave starts with σ twice the starting σ of the previous octave and continues to increase from there. As sample points are collected, they are stored as a three-vector p = (x, y, σ), where σ is the scale.

Refine the Points! If we were to stop after the first step, we would have too many interest points to be effective. In this second step, we eliminate points of low contrast. [Ignoring the sub-pixel localization of “real” SIFT here…] Can you see the truck??

Refine the Points! Only keep points where |DOG| > some threshold (e.g. 3% of the maximum intensity in the original image).
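
A sketch of this filter, assuming a hypothetical list `points` of (x, y, level) candidates and the `dogs` and `img` arrays from the earlier sketches:

```python
# Keep a candidate only if the magnitude of its DOG response exceeds
# 3% of the maximum intensity in the original image.
threshold = 0.03 * img.max()
points = [(x, y, s) for (x, y, s) in points if abs(dogs[s][y, x]) > threshold]
```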

Refine the Points! By applying this to our previous image, which had 8714 sample points, we reduce the number of sample points to 362.

Further Refine the Points! We may further refine the sample points by removing those that lie along edges. First, we take the Hessian matrix H computed at the location and scale of the keypoint from the DOG function D: H = [Dxx Dxy; Dxy Dyy].

The eigenvalues of the matrix H are proportional to the principal curvatures of D. If a point is on an edge, the ratio of its eigenvalues will be very high (recall the Harris corner detector). Since we are only concerned with the ratio, we may set a threshold r, where α = rβ, so that Tr(H)² / Det(H) = (α + β)² / (αβ) = (r + 1)² / r. Therefore, if Tr(H)² / Det(H) > (r + 1)² / r, the point is ignored (Lowe uses r = 10).
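
A sketch of the edge test, estimating the 2 × 2 Hessian of a DOG image at (x, y) with central finite differences (r = 10 follows Lowe's paper):

```python
def passes_edge_check(dog, y, x, r=10.0):
    # Second derivatives of the DOG image via central finite differences.
    dxx = dog[y, x + 1] + dog[y, x - 1] - 2 * dog[y, x]
    dyy = dog[y + 1, x] + dog[y - 1, x] - 2 * dog[y, x]
    dxy = (dog[y + 1, x + 1] - dog[y + 1, x - 1]
           - dog[y - 1, x + 1] + dog[y - 1, x - 1]) / 4.0
    tr, det = dxx + dyy, dxx * dyy - dxy * dxy
    if det <= 0:
        return False          # curvatures have opposite signs: reject
    return tr * tr / det < (r + 1) ** 2 / r
```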

Further Refine the Points! By applying this to our previous image, which had 362 sample points, we reduce the number of sample points to 240.

Orientation Assignment In order to be rotation invariant, each point must have a reference angle based on its neighboring points. We find the gradient magnitude and angle of every pixel in the scale space by the following equations: m(x, y) = sqrt((L(x+1, y) − L(x−1, y))² + (L(x, y+1) − L(x, y−1))²) and θ(x, y) = atan2(L(x, y+1) − L(x, y−1), L(x+1, y) − L(x−1, y)), where L is the Gaussian-smoothed image at the keypoint's scale. We are concerned with the points in the region around the keypoint.

Orientation Assignment The magnitudes are weighted according to a Gaussian function centered at the keypoint.

Orientation Assignment We then use the weighted magnitudes to populate a histogram of 36 bins, each covering 10 degrees.
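
A sketch putting the last three slides together, assuming `L` is the Gaussian-smoothed image at the keypoint's scale and (cy, cx) is the keypoint; the window radius and weighting σ here are illustrative choices, not Lowe's exact values:

```python
import numpy as np

def orientation_histogram(L, cy, cx, radius=8, weight_sigma=4.0):
    hist = np.zeros(36)                       # 36 bins of 10 degrees each
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            dx = L[y, x + 1] - L[y, x - 1]    # pixel-difference gradient
            dy = L[y + 1, x] - L[y - 1, x]
            mag = np.hypot(dx, dy)
            ang = np.degrees(np.arctan2(dy, dx)) % 360.0
            # Gaussian weight centered on the keypoint.
            w = np.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2 * weight_sigma ** 2))
            hist[int(ang // 10) % 36] += w * mag
    return hist
```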

Orientation Assignment A parabola is fit to the maximum value of the histogram and the two values nearest to it. The maximum of this parabola gives us the angle θ. Furthermore, the point now has four components: p = (x, y, σ, θ).
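
A sketch of the parabolic refinement: fitting a parabola through the peak bin and its two neighbors shifts the peak by 0.5·(left − right)/(left − 2·peak + right) bins:

```python
def refined_angle(hist):
    i = int(np.argmax(hist))
    left, peak, right = hist[i - 1], hist[i], hist[(i + 1) % 36]
    offset = 0.5 * (left - right) / (left - 2 * peak + right)
    return ((i + offset) * 10.0) % 360.0      # bin width is 10 degrees
```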

Keypoint Descriptor We now assign a descriptor to the sample point. The two points above represent sample points, with the red arrows being the points' orientation assignments. By assigning a keypoint descriptor, we will know whether these two are alike or not.

Keypoint Descriptor We again use gradients of neighboring pixels to determine the descriptor. The size of the region is a Gaussian window proportional to the scale of the keypoint.

Keypoint Descriptor We must first rotate the neighboring pixels' gradient vectors relative to the keypoint's angle θ.
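
A sketch of that rotation, expressing a neighboring gradient angle relative to the keypoint's orientation θ (both in degrees):

```python
def relative_angle(ang, theta):
    # Subtracting theta makes the descriptor independent of image rotation.
    return (ang - theta) % 360.0
```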

Keypoint Descriptor Notice that these two are (most likely) a match after this step, which is done to ensure rotation invariance!

Keypoint Descriptor We then group the rotated gradient vectors from the previous step into a 2 × 2 set of histograms with 8 bins each. However, experimentation has shown it is best to use a 4 × 4 set with 8 bins each for maximum effectiveness and efficiency. This gives a 4 × 4 × 8 = 128-feature vector.
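
A sketch of the pooling step, assuming a hypothetical list `samples` of (row_cell, col_cell, angle_deg, weight) tuples for the rotated gradients around the keypoint (real SIFT also trilinearly interpolates each sample across neighboring cells and bins, which is omitted here):

```python
import numpy as np

def build_descriptor(samples):
    desc = np.zeros((4, 4, 8))                # 4 x 4 cells, 8 angle bins each
    for r, c, ang, w in samples:
        desc[r, c, int(ang // 45) % 8] += w   # 8 bins of 45 degrees
    return desc.ravel()                       # 4 * 4 * 8 = 128 features
```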

Keypoint Descriptor By generalizing the gradient vectors in the neighboring pixels into 8 bins, this keypoint is resilient against different 3D perspectives.

Keypoint Descriptor In order to be resilient to differences in illumination, we normalize the entries of the feature vector to unit length. This makes the descriptor invariant to changes in contrast or brightness. In order to be resilient to non-linear changes in illumination, such as camera saturation, we reduce the effect of large gradient magnitudes by thresholding the feature vector so that no value is larger than 0.2. We then re-normalize.
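
A sketch of this normalization, with `desc` being the raw 128-entry vector:

```python
def normalize_descriptor(desc):
    desc = desc / (np.linalg.norm(desc) + 1e-12)  # contrast/brightness invariance
    desc = np.minimum(desc, 0.2)                  # damp large gradients (saturation)
    return desc / (np.linalg.norm(desc) + 1e-12)  # re-normalize to unit length
```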

Rotation Invariance

Scale Invariance

3D Perspective Resilience

Occlusion – with outliers

Occlusion

Tracking