Lecture 2 Mei-Chen Yeh 03/09/2010

Presentation transcript:

Lecture 2 Mei-Chen Yeh 03/09/2010

Outline Demos Image representation and feature extraction – Global features – Local features: SIFT Assignment #2 (due: 03/16)

Demos Augmented Reality Tracking – Traffic – Counting people Image search – MyFinder – Simplicity Image annotation – ALIPR Embedded face detection and recognition Tiling slide show Pivot

Multimedia Systems: A Multidisciplinary Subject Signal Processing Data Mining Machine Learning Pattern Recognition Networking … and more!

Topics (1) Image/video processing – Feature extraction – Video syntax analysis – Compression

Topics (2) Content-based image/video retrieval – Copy detection – Region-based retrieval – Multi-dimensional indexing

Topics (3) Multimodal system – Audio processing – Multimodality analysis

Topics (4) Semantic concept detection – Object detection – Object recognition

Topics (5) Tracking – Motion features – Models – Single-, multiple-object tracking

Topics (6) Quality of Service/Experience – QoE Framework – VoIP System Evaluation – Imaging System Evaluation

Sources of the readings ACM International Conference on Multimedia – The premier annual event on multimedia research, technology, and art – Held since 1993 – >400 attendees – Program: Content, Systems, Applications, HC tracks – Full papers (16%), short papers (28%) – Technical demonstrations, open source software competition, the doctoral symposium, tutorials (6), workshops (11), a brave new topic session, panels (2), Multimedia grand challenge IEEE Transactions on Multimedia

Image Representations

Multimedia file formats A list of some formats used in the popular product “Macromedia Director” These formats differ mainly in how data are compressed. Features are normally extracted from raw data.

1-bit images Each pixel is stored as a single bit (0 or 1), so such an image is also called a binary image or a 1-bit monochrome image. It contains no color.

8-bit gray-level images Each pixel has a gray value between 0 and 255 (0 => black, 255 => white). Image resolution refers to the number of pixels in a digital image. How many kB does a 640 x 480 grayscale image require? At one byte per pixel, 640 x 480 = 307,200 bytes, or about 300 kB.

24-bit color images Each pixel is represented by three bytes, usually representing RGB. This format supports 256 x 256 x 256 (16,777,216) possible colors. A 640 x 480 24-bit color image would require 640 x 480 x 3 = 921,600 bytes, or about 900 kB! (Figure: Lena, 1972 and 1997.)
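
The storage figures for 1-bit, 8-bit, and 24-bit images all follow from width x height x bits per pixel. A minimal sketch of that arithmetic (the function name `raw_image_bytes` is hypothetical, and compression and file headers are ignored):

```python
def raw_image_bytes(width, height, bits_per_pixel):
    """Uncompressed storage, in bytes, for a width x height image."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


# The 640 x 480 example from the slides, for the three pixel formats.
for name, bpp in [("1-bit", 1), ("8-bit gray", 8), ("24-bit color", 24)]:
    size = raw_image_bytes(640, 480, bpp)
    print(f"{name}: {size} bytes")
```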

Image Features

Feature types Global features – Color – Shape – Texture Local features – SIFT – SURF – Self-similarity descriptor – Shape context descriptor – … … … … A fixed-length feature vector

Color histogram A color histogram counts the pixels that have a given value in each of Red, Green, and Blue (RGB). An example of a histogram for 24-bit color images:

Color histogram (cont.) Quantization
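
A quantized color histogram can be sketched as follows. This is an illustration under assumptions, not the lecture's implementation: the function name, the uniform quantization, and the default of 4 bins per channel are all choices made here.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Count pixels per quantized (R, G, B) cell.

    pixels: iterable of (r, g, b) tuples with 8-bit channel values.
    Returns a flat list of length bins_per_channel ** 3.
    """
    n = bins_per_channel
    hist = [0] * (n ** 3)
    for r, g, b in pixels:
        # Map each 0..255 value to a bin index 0..n-1 (uniform quantization).
        rq, gq, bq = r * n // 256, g * n // 256, b * n // 256
        hist[(rq * n + gq) * n + bq] += 1
    return hist


pixels = [(0, 0, 0), (10, 20, 30), (255, 0, 0), (255, 255, 255)]
hist = color_histogram(pixels)
```

With 4 bins per channel the feature vector has 64 entries instead of the 16,777,216 needed for one bin per 24-bit color.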

Color histogram (cont.) Problem with such a representation: it ignores where the colors appear. Case 1, Case 2, and Case 3 all have the SAME histogram!
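
The point of this slide can be reproduced with two tiny synthetic images. The sketch below (the images and helper name are made up for illustration) builds a half-red, half-blue image and a red/blue checkerboard; they look different but have identical histograms.

```python
from collections import Counter


def quantized_histogram(pixels, bins=4):
    """Histogram of quantized (R, G, B) triples; pixel positions are ignored."""
    return Counter(
        (r * bins // 256, g * bins // 256, b * bins // 256) for r, g, b in pixels
    )


RED, BLUE = (255, 0, 0), (0, 0, 255)

# Two 4x4 images in row-major order with the same colors, arranged differently.
image_a = [RED if x < 2 else BLUE for y in range(4) for x in range(4)]
image_b = [RED if (x + y) % 2 == 0 else BLUE for y in range(4) for x in range(4)]

same_histogram = quantized_histogram(image_a) == quantized_histogram(image_b)
```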

Search by color histograms

Regional color Divide the image into regions Extract a color histogram for each region Put together those color histograms into a long feature vector
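
The three steps above can be sketched as below. The regular grid, the function name, and the bin counts are assumptions; the slide does not say how the regions are chosen.

```python
def regional_color_feature(image, grid=2, bins=2):
    """Concatenate one quantized color histogram per grid cell.

    image: list of rows, each row a list of (r, g, b) tuples.
    Returns a list of length grid * grid * bins ** 3.
    """
    height, width = len(image), len(image[0])
    feature = []
    for gy in range(grid):
        for gx in range(grid):
            hist = [0] * (bins ** 3)
            for y in range(gy * height // grid, (gy + 1) * height // grid):
                for x in range(gx * width // grid, (gx + 1) * width // grid):
                    r, g, b = (v * bins // 256 for v in image[y][x])
                    hist[(r * bins + g) * bins + b] += 1
            feature.extend(hist)  # region histograms laid end to end
    return feature


# 4x4 image: left half red, right half blue.
image = [[(255, 0, 0)] * 2 + [(0, 0, 255)] * 2 for _ in range(4)]
feature = regional_color_feature(image)
```

Unlike a single global histogram, this vector changes when the red and blue halves swap sides.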

Textures Many natural and man-made objects are distinguished by their texture. Man-made textures – Walls, clothes, rugs… Natural textures – Water, clouds, sand, grass, … What is this?

Examples

Texture features Structural – Describe arrangement of texture elements – E.g., “texton model”, “texel model” Statistical – Characterize texture in terms of statistics – E.g., co-occurrence matrix, Markov random field Spectral – Analyze in spatial-frequency domain – E.g., Fourier transform, Gabor filter, wavelets
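
As one concrete instance of a statistical texture feature, here is a minimal gray-level co-occurrence matrix with a contrast statistic. The offset (one pixel to the right), the number of gray levels, and the function names are illustrative assumptions.

```python
def cooccurrence_matrix(gray, levels, dx=1, dy=0):
    """Count how often level i has level j at offset (dx, dy)."""
    height, width = len(gray), len(gray[0])
    matrix = [[0] * levels for _ in range(levels)]
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                matrix[gray[y][x]][gray[ny][nx]] += 1
    return matrix


def contrast(matrix):
    """Mean squared level difference over all counted pixel pairs."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(
        (i - j) ** 2 * count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total


stripes = [[0, 1, 0, 1] for _ in range(4)]  # fine vertical stripes
flat = [[0, 0, 0, 0] for _ in range(4)]     # no texture
```

The striped image gives a high contrast value and the flat image gives zero, which is the kind of difference a statistical texture feature is meant to capture.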

Textural Properties Coarseness: coarse vs. fine Contrast: high vs. low Orientation: directional vs. non-directional Edge: line-like vs. blob-like Regularity: regular vs. random Roughness: rough vs. smooth

Shape Boundary-based feature – Use only the outer boundary of the shape – E.g. Fourier descriptor, shape context descriptor Region-based feature – Use the entire shape region – Local descriptors

Shape: Fourier descriptor

Properties Invariant to translation, scale, and rotation
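
The slides give no formula for the Fourier descriptor. One common construction, sketched here as an assumption, treats each boundary point as a complex number, takes the discrete Fourier transform, drops coefficient 0 (translation), keeps only magnitudes (rotation and starting point), and divides by the magnitude of coefficient 1 (scale). It assumes a counterclockwise boundary so that coefficient 1 is non-zero.

```python
import cmath
import math


def fourier_descriptor(boundary, n_coeffs=4):
    """Invariant descriptor from DFT magnitudes of a closed boundary."""
    n = len(boundary)
    z = [complex(x, y) for x, y in boundary]
    coeffs = [
        sum(z[k] * cmath.exp(-2j * cmath.pi * u * k / n) for k in range(n))
        for u in range(n)
    ]
    scale = abs(coeffs[1])  # assumed non-zero (counterclockwise boundary)
    return [abs(coeffs[u]) / scale for u in range(2, 2 + n_coeffs)]


# An L-shaped boundary, traversed counterclockwise.
shape = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2), (0, 1)]

# The same shape rotated by 30 degrees, scaled by 2.5, and translated.
rotation = 2.5 * cmath.exp(1j * math.radians(30))
moved = [rotation * complex(x, y) + complex(10, -3) for x, y in shape]
moved = [(p.real, p.imag) for p in moved]

d_original = fourier_descriptor(shape)
d_moved = fourier_descriptor(moved)  # equal to d_original up to rounding
```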

Feature types Global features – Color – Shape – Texture Local features – SIFT – SURF – Self-similarity descriptor – Shape context descriptor – … … … … A fixed-length feature vector

David G. Lowe. Distinctive Image Features from Scale-Invariant Keypoints, IJCV, 2004

What is SIFT? Scale Invariant Feature Transform (SIFT) is an approach for detecting and extracting local feature descriptors from an image. SIFT feature descriptors are reasonably invariant to – scaling – rotation – image noise – changes in illumination – small changes in viewpoint

Types of invariance: illumination, scale, rotation, viewing angle

(Figure: the SIFT features of an image, arranged as number of keypoints x feature dimension.)

Matching two images

Densely cover the image (an image with 500x500 pixels => 2000 feature vectors) Distinctive Invariant to image scale, rotation, and partially invariant to changing viewpoints and illumination Perform the best among local descriptors – K. Mikolajczyk and C. Schmid, “A performance evaluation of local descriptors,” PAMI 05.

Simple test (scale and rotate) Scale to 60% and rotate 30 degrees: 693 keypoints vs. 349 keypoints, 214 matches!

Simple test (illumination) 693 keypoints vs. 633 keypoints, 467 matches!

Simple test (different appearance) 693 keypoints vs. 728 keypoints, 25 matches!

693 keypoints vs. 832 keypoints, 1 match!

Simple test (different appearance with occlusion) 693 keypoints vs. 1124 keypoints, 0 matches!
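
The slides do not say how the matches in these tests were counted. A common rule, described in Lowe's paper, accepts a nearest-neighbor match only if it is clearly closer than the second-nearest neighbor. The sketch below assumes that rule with a ratio of 0.8 and uses 2-d toy vectors in place of real 128-d descriptors.

```python
import math


def match_descriptors(set_a, set_b, ratio=0.8):
    """Return (i, j) pairs where set_b[j] is a distinctive nearest neighbor of set_a[i]."""
    matches = []
    for i, da in enumerate(set_a):
        # Brute force: distance from this descriptor to every descriptor in set_b.
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(set_b))
        if len(dists) >= 2 and dists[0][0] < ratio * dists[1][0]:
            matches.append((i, dists[0][1]))
    return matches


a = [(0.0, 0.0), (5.0, 5.0)]
b = [(0.1, 0.0), (10.0, 10.0), (5.0, 5.1)]
matches = match_descriptors(a, b)

# A descriptor with two equally good candidates is rejected as ambiguous.
ambiguous = match_descriptors([(0.0, 0.0)], [(1.0, 0.0), (-1.0, 0.0)])
```

This brute-force search compares every descriptor pair, which is the cost that the later discussion question about matching complexity refers to.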

About SIFT … How to generate SIFT feature descriptors? How to use SIFT feature descriptors (for object recognition, image retrieval, etc.)?

SIFT: Overview Major stages of SIFT computation (interest point detector + descriptor), from an image to its feature vectors (128-d): 1. Scale-space extrema detection: identify potential interest points (location, scale) 2. Keypoint localization: localize candidate keypoints, giving a reduced set of (location, scale) 3. Orientation assignment: identify the dominant orientations (location, scale, orientation) 4. Keypoint descriptor: build a descriptor based on histograms of gradients in a local neighborhood

Step 1: Scale-space extrema detection How do we detect locations that are invariant to scale change of the image? Detecting extrema in scale-space – For a given image I(x, y), build its linear scale-space representation L(x, y, σ) – This can be implemented efficiently by searching for local peaks in a series of DoG (difference-of-Gaussian) images
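
For reference, the scale-space representation and the DoG function can be written as below. The definitions follow Lowe's IJCV 2004 paper; the symbols G, D, and k come from that paper rather than from the slide text.

```latex
\begin{align}
L(x, y, \sigma) &= G(x, y, \sigma) * I(x, y), \\
G(x, y, \sigma) &= \frac{1}{2\pi\sigma^{2}}
    \exp\!\left(-\frac{x^{2} + y^{2}}{2\sigma^{2}}\right), \\
D(x, y, \sigma) &= L(x, y, k\sigma) - L(x, y, \sigma).
\end{align}
```

Here * is convolution in x and y, and k is the constant multiplicative factor between two neighboring scales.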

Step 1: Scale-space extrema detection (Figure: Gaussian images at scales σ, kσ, k²σ, and the DoG images obtained from adjacent Gaussian images.)
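
A minimal sketch of how a stack of Gaussian images and the DoG images between them might be built, using only the standard library. It is a simplification under stated assumptions: every level is blurred directly from the input, there are no octaves or downsampling, and the defaults (σ = 1.6, k = √2, four levels) are illustrative.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_1d(values, kernel):
    """Convolve one row or column, replicating the border samples."""
    radius, n = len(kernel) // 2, len(values)
    return [
        sum(w * values[min(max(i + j - radius, 0), n - 1)] for j, w in enumerate(kernel))
        for i in range(n)
    ]


def gaussian_blur(image, sigma):
    """Separable Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    rows = [blur_1d(row, kernel) for row in image]
    cols = [blur_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def dog_stack(image, sigma=1.6, k=math.sqrt(2), levels=4):
    """Subtract each Gaussian image from the next, more blurred one."""
    gaussians = [gaussian_blur(image, sigma * k ** i) for i in range(levels)]
    return [
        [[hi - lo for lo, hi in zip(row_lo, row_hi)] for row_lo, row_hi in zip(g_lo, g_hi)]
        for g_lo, g_hi in zip(gaussians, gaussians[1:])
    ]


# A single bright pixel in the middle of a dark 15x15 image.
impulse = [[1.0 if (x, y) == (7, 7) else 0.0 for x in range(15)] for y in range(15)]
stack = dog_stack(impulse)
```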

Step 1: Scale-space extrema detection (cont.) In the DoG images, if X is larger or smaller than all of its neighbors, X is called a keypoint.
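
The neighbor test can be sketched as below. Following Lowe's paper, the neighbors are taken to be the 8 surrounding samples in the same DoG image plus the 9 samples in each adjacent scale (26 in total); the function names are hypothetical.

```python
def is_extremum(dog, s, y, x):
    """True if dog[s][y][x] is strictly above or below all 26 neighbors."""
    value = dog[s][y][x]
    neighbors = [
        dog[s + ds][y + dy][x + dx]
        for ds in (-1, 0, 1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (ds, dy, dx) != (0, 0, 0)
    ]
    return value > max(neighbors) or value < min(neighbors)


def find_extrema(dog):
    """Scan the interior of a DoG stack (a list of equally sized 2-D lists)."""
    return [
        (s, y, x)
        for s in range(1, len(dog) - 1)
        for y in range(1, len(dog[s]) - 1)
        for x in range(1, len(dog[s][0]) - 1)
        if is_extremum(dog, s, y, x)
    ]


# Three 5x5 DoG layers with a single peak in the middle layer.
dog = [[[0.0] * 5 for _ in range(5)] for _ in range(3)]
dog[1][2][2] = 5.0
candidates = find_extrema(dog)
```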

Why DoG? An efficient function to compute A close approximation to the scale-normalized Laplacian of Gaussian, σ²∇²G – Lindeberg showed that the normalization of the Laplacian with the factor σ² is required for true scale invariance. (1994) – Mikolajczyk found that the maxima and minima of σ²∇²G produce the most stable image features. (2002) (Figure: DoG vs. the scale-normalized Laplacian of Gaussian.)
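
The approximation can be made explicit with a short derivation, following the argument in Lowe's paper: the Gaussian satisfies the heat diffusion equation in σ, and a finite difference between nearby scales kσ and σ stands in for the derivative.

```latex
\begin{align}
\frac{\partial G}{\partial \sigma} &= \sigma \nabla^{2} G, \\
\frac{\partial G}{\partial \sigma} &\approx
    \frac{G(x, y, k\sigma) - G(x, y, \sigma)}{k\sigma - \sigma}, \\
G(x, y, k\sigma) - G(x, y, \sigma) &\approx (k - 1)\,\sigma^{2} \nabla^{2} G .
\end{align}
```

The factor (k − 1) is the same at every scale, so it does not change where the extrema are found.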

Output of Step 1: about 2000 keypoints in a 500x500 image. Too many keypoints!

Step 2: Accurate keypoint localization Reject points that have low contrast or are poorly localized along an edge Image size: 233x

Step 2: Accurate keypoint localization Another example Extrema of DoG across scales After removal of low contrast points After removal of edge responses

Step 2: Accurate keypoint localization Simple method (Lowe, ICCV 1999) – Use gradient magnitudes More sophisticated method (Brown and Lowe, BMVC 2002) – Use the Taylor expansion of the scale-space function, compare the function value at the extremum to a threshold (0.03) – Use the ratio of eigenvalues of a 2x2 Hessian matrix, eliminate keypoints with a ratio greater than 10
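
The two rejection tests can be sketched as below, using the thresholds quoted on the slide (0.03 and 10). Two simplifications are assumed: the contrast test uses the sampled DoG value rather than the value at the interpolated extremum from the Taylor expansion, and pixel values are assumed to lie in [0, 1]. The eigenvalue ratio is checked through the trace and determinant of the Hessian, as in Lowe's paper.

```python
def hessian_2x2(d, y, x):
    """Finite-difference second derivatives of a DoG image at (y, x)."""
    dxx = d[y][x + 1] - 2 * d[y][x] + d[y][x - 1]
    dyy = d[y + 1][x] - 2 * d[y][x] + d[y - 1][x]
    dxy = (d[y + 1][x + 1] - d[y + 1][x - 1] - d[y - 1][x + 1] + d[y - 1][x - 1]) / 4.0
    return dxx, dyy, dxy


def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Keep points whose principal-curvature ratio is below r."""
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:  # curvatures of opposite sign, or flat along one direction
        return False
    return trace * trace / det < (r + 1) ** 2 / r


def keep_keypoint(d, y, x, contrast_threshold=0.03, r=10.0):
    """Reject low-contrast points first, then edge-like points."""
    if abs(d[y][x]) < contrast_threshold:
        return False
    return passes_edge_test(*hessian_2x2(d, y, x), r=r)


blob = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
edge = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]
faint = [[0.0, 0.0, 0.0], [0.0, 0.01, 0.0], [0.0, 0.0, 0.0]]
```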

Step 3: Orientation assignment

To achieve invariance to rotation: Compute the gradient magnitude and orientation for each image sample L(x, y, σ). Form an orientation histogram from the gradient orientations of sample points within a region around the keypoint, with each sample weighted by its gradient magnitude and by a Gaussian-weighted window. Detect the highest peak in the histogram.
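
A minimal sketch of the orientation histogram. The 36 bins follow Lowe's paper; the window radius, the window σ, the pixel-difference gradients, and the function name are assumptions made for illustration. Angles use image coordinates, so y grows downward.

```python
import math


def dominant_orientation(L, cy, cx, radius=4, window_sigma=2.0, bins=36):
    """Return (angle in degrees at the center of the peak bin, histogram)."""
    hist = [0.0] * bins
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            if not (1 <= y < len(L) - 1 and 1 <= x < len(L[0]) - 1):
                continue  # the pixel differences need a one-sample border
            dx = L[y][x + 1] - L[y][x - 1]
            dy = L[y + 1][x] - L[y - 1][x]
            magnitude = math.hypot(dx, dy)
            angle = math.degrees(math.atan2(dy, dx)) % 360.0
            # Gaussian window centered on the keypoint.
            weight = math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2 * window_sigma ** 2))
            hist[int(angle * bins / 360.0) % bins] += weight * magnitude
    peak = max(range(bins), key=hist.__getitem__)
    return (peak + 0.5) * 360.0 / bins, hist


# A smoothed image whose intensity rises along the diagonal.
ramp = [[float(x + y) for x in range(11)] for y in range(11)]
angle, hist = dominant_orientation(ramp, 5, 5)
```

Lowe's paper also creates additional keypoints for other strong peaks and interpolates the peak position; both are omitted here.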

Step 4: Local image descriptor Use a 4x4 grid of histograms computed from a 16x16 sample array: 128-d = 4 * 4 * 8 (orientations). Example (figure): a 2x2 grid on an 8x8 sample array.
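
The 4 * 4 * 8 layout can be sketched as below. This is a simplified illustration, not Lowe's full descriptor: it omits the rotation to the keypoint orientation, the Gaussian weighting, the interpolation between bins, and the clipping step. The 18x18 input (a 16x16 sample array plus a one-pixel border for the gradients) and the function name are assumptions.

```python
import math


def sift_like_descriptor(patch, grid=4, orientations=8):
    """Gradient-orientation histograms over a grid of cells, L2-normalized."""
    size = len(patch) - 2  # 16 samples per side for an 18x18 patch
    cell = size // grid    # 4x4 samples per cell
    hist = [0.0] * (grid * grid * orientations)
    for y in range(1, size + 1):
        for x in range(1, size + 1):
            dx = patch[y][x + 1] - patch[y][x - 1]
            dy = patch[y + 1][x] - patch[y - 1][x]
            angle = math.atan2(dy, dx) % (2 * math.pi)
            o = int(angle / (2 * math.pi) * orientations) % orientations
            gy, gx = (y - 1) // cell, (x - 1) // cell
            # Each sample votes with its gradient magnitude.
            hist[(gy * grid + gx) * orientations + o] += math.hypot(dx, dy)
    norm = math.sqrt(sum(v * v for v in hist)) or 1.0
    return [v / norm for v in hist]


patch = [[float(x + 2 * y) for x in range(18)] for y in range(18)]
descriptor = sift_like_descriptor(patch)
```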

Step 4: Local image descriptor Fairly compact (128 values)

Results

Summary From an image to its feature vectors: Scale-space extrema detection, Keypoint localization, Orientation assignment, Keypoint descriptor. Invariant to: scale, rotation, illumination change, viewpoint change.

Discussions Do local features solve the object recognition problem? How do we deal with the false positives outside the object? How do we reduce the complexity of matching two sets of local features?

Assignment #2 Download the SIFT demo program – Or nts/siftDemoV4.zip Prepare at least two pairs of images which you think are similar – 1st set: SIFT can match well – 2nd set: SIFT cannot match well Send the TA your report, covering – Your experimental results – Your observations