A Study of Approaches for Object Recognition


A Study of Approaches for Object Recognition Presented by Wyman Wong 12/9/2005

Outline
- Introduction
- Model-Based Object Recognition
  - AAM
  - Inverse Compositional AAM
- View-Based Object Recognition
  - Recognition based on boundary fragments
  - Recognition based on SIFT
- Proposed Research
- Conclusion and Future Work

Introduction
- Object Recognition
  - The task of finding 3D objects in 2D images (or even video) and classifying them into one of many known object types
  - Closely related to the success of many computer vision applications: robotics, surveillance, registration, etc.
  - A difficult problem for which no general and comprehensive solution has yet been found

Introduction
- Two main streams of approaches:
  - Model-Based Object Recognition
    - A 3D model of the object being recognized is available
    - Compare the 2D representation of the structure of an object with the 2D projection of the model
  - View-Based Object Recognition
    - 2D representations of the same object, viewed at different angles and distances, are available
    - Extract features (as the representations of the object) and compare them to the features in a feature database
- The 3D model contains detailed information about the object, including the shape of its structure, the spatial relationships between its parts, and its appearance.

Introduction
- Pros and cons of each main stream:
  - Model-Based Object Recognition
    - Model features can be predicted from just a few detected features, based on geometric constraints
    - Models sacrifice generality
  - View-Based Object Recognition
    - Greater generality, and more easily trainable from visual data
    - Matching is done by comparing entire objects, so some methods may be sensitive to clutter and occlusion

Model-Based Object Recognition
- Commonly used in face recognition
- General steps:
  - Locate the object, locate and label its structure, then adjust the model's parameters until the model generates an image similar enough to the real object
- Active Appearance Models (AAMs) have proved to be highly useful models for face recognition

Active Appearance Models
- They model the shape and appearance of objects separately
  - Shape: the vertex locations of a mesh
  - Appearance: the pixel values within the mesh
- PCA is applied to both sets of parameters, so the model generalizes from the training faces to generic faces
- Fitting an AAM: a non-linear optimization is applied which iteratively solves for incremental additive updates to the shape and appearance coefficients (see the formulation after this list)
- (This is only a short description of AAMs.)
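For reference, the AAM shape and appearance are usually written as a mean plus a weighted sum of PCA modes. This is the standard formulation from the AAM literature rather than anything spelled out on the slide; s_i and A_i denote PCA basis vectors and p_i, lambda_i their coefficients:

    s = s_0 + \sum_{i=1}^{n} p_i\, s_i, \qquad
    A(\mathbf{x}) = A_0(\mathbf{x}) + \sum_{i=1}^{m} \lambda_i\, A_i(\mathbf{x})

Here s_0 is the mean shape, A_0 the mean appearance, and fitting amounts to finding the coefficients that make the synthesized image match the input.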

Inverse Compositional AAMs
- The major difference between these models and standard AAMs is the fitting algorithm
  - AAM: additive incremental updates of the shape and appearance parameters
  - ICAAM: inverse compositional update, i.e. the algorithm updates the entire warp by composing the current warp with the computed incremental warp (compare the update rules below)
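Written out in the warp notation of Matthews and Baker (which the slide itself does not use), the two update rules are usually stated as:

    \text{AAM (additive):}\quad p \leftarrow p + \Delta p
    \text{ICAAM (inverse compositional):}\quad W(\mathbf{x};\, p) \leftarrow W(\mathbf{x};\, p) \circ W(\mathbf{x};\, \Delta p)^{-1}

Composing with the inverse of the incremental warp is what lets the expensive parts of the optimization be precomputed.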

View-Based Object Recognition
- Common approaches:
  - Correlation-based template matching (Li, W. et al. 95)
    - SEA, PDE, etc.
    - Not effective when any of the following happens:
      - The illumination of the environment changes
      - The posture and scale of the object change
      - Occlusion
  - Color Histogram (Swain, M.J. 90)
    - Construct a histogram for an object and match it against the image (see the sketch after this list)
    - Robust to changes of viewpoint and to occlusion
    - But requires good isolation and segmentation of objects
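A minimal sketch of colour-histogram matching in the spirit of Swain and Ballard's histogram intersection; the 8-bins-per-channel quantisation and the function names are illustrative assumptions, not details taken from the cited work.

    import numpy as np

    def colour_hist(img, bins=8):
        # Quantise an HxWx3 uint8 image into a bins^3 colour histogram.
        h, _ = np.histogramdd(img.reshape(-1, 3), bins=(bins,) * 3,
                              range=((0, 256),) * 3)
        return h.ravel()

    def hist_intersection(image, model, bins=8):
        # Swain & Ballard-style score: intersect the two histograms and
        # normalise by the model histogram (1.0 means a perfect match).
        hi, hm = colour_hist(image, bins), colour_hist(model, bins)
        return np.minimum(hi, hm).sum() / hm.sum()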

View-Based Object Recognition
- Common approaches:
  - Feature based
    - Extract features from the image that are salient, and match only those features when searching all locations for matches
    - Feature types: groupings of edges, SIFT, etc.
    - Preferred feature properties:
      - View invariant
      - Detected frequently enough for reliable recognition
      - Distinctive
    - An image descriptor is created from each detected feature to increase matching performance
      - Image descriptor = key / index into the database of features
    - Preferred descriptor properties:
      - Invariant to scaling, rotation, illumination, affine transformation and noise
- A feature is not necessarily a corner

Nelson's Approach
- Recognition based on 2D boundary fragments
- Prepare 53 clean images for each object and build a 3D recognition database
  (Figure: camera viewpoints arranged around the object)

Nelson's Approach
- Test images used in Nelson's experiment and their features (figure)

Nelson's Approach
- Nelson's experiments showed that his approach has high accuracy
- 97.0% success rate for a 24-object database under the following conditions:
  - Large number of images
  - Clean images
  - Very different objects
  - No occlusion and clutter

Lowe's Approach
- Recognition based on the Scale Invariant Feature Transform (SIFT)
- SIFT generates distinctive invariant features
- SIFT-based image descriptors are generally the most resistant to common image deformations (Mikolajczyk 2005)
- SIFT consists of four steps:
  - Scale-space extrema detection
  - Keypoint localization
  - Orientation assignment
  - Keypoint descriptor computation

Scale-space extrema detection
- DoG ≈ LoG: the difference-of-Gaussians approximates the Laplacian of Gaussian
- Search over all sample points at all scales and find extrema that are local maxima or minima in the DoG scale space (a sketch follows this list)
- Extrema are used rather than an absolute threshold
- Small keypoints: help solve the occlusion problem
- Large keypoints: robust to noise and image blur
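A minimal, unoptimised sketch of one octave of DoG extrema detection, assuming a grayscale image scaled to [0, 1] as a 2-D NumPy array. The parameter values (sigma = 1.6, k = sqrt(2), 0.03 contrast threshold) follow Lowe's paper, while the function name and loop structure are illustrative only.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def dog_extrema(img, sigma=1.6, k=2 ** 0.5, n_scales=5, thresh=0.03):
        # Build a stack of Gaussian-blurred images; adjacent differences give the DoG.
        gauss = [gaussian_filter(img, sigma * k ** i) for i in range(n_scales)]
        dog = np.stack([g2 - g1 for g1, g2 in zip(gauss[:-1], gauss[1:])])
        keypoints = []
        # A sample is an extremum if it is the maximum or minimum of the
        # 3x3x3 neighbourhood spanning image position and scale.
        for s in range(1, dog.shape[0] - 1):
            for y in range(1, dog.shape[1] - 1):
                for x in range(1, dog.shape[2] - 1):
                    v = dog[s, y, x]
                    cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
                    if abs(v) > thresh and (v == cube.max() or v == cube.min()):
                        keypoints.append((x, y, sigma * k ** s))
        return keypoints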

Keypoint localization
- Reject keypoints with the following properties:
  - Low contrast (sensitive to noise)
  - Localized along an edge (sliding effect)
- Solution:
  - Filter out points whose DoG value D is below 0.03
  - Apply a Hessian-based edge test (see the criterion below)
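The edge test is usually written, as in Lowe (2004), in terms of the 2x2 Hessian of D at the keypoint; a keypoint is kept only if the ratio of principal curvatures is small:

    H = \begin{pmatrix} D_{xx} & D_{xy} \\ D_{xy} & D_{yy} \end{pmatrix}, \qquad
    \frac{\operatorname{Tr}(H)^2}{\operatorname{Det}(H)} < \frac{(r+1)^2}{r} \quad (r = 10 \text{ in Lowe's paper})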

Orientation assignment
- Pre-compute the gradient magnitude and orientation of the smoothed image (formulas below)
- Use them to assign each keypoint a dominant orientation and to construct the keypoint descriptor
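The gradients are computed with simple pixel differences on the Gaussian-smoothed image L at the keypoint's scale, as in Lowe (2004):

    m(x, y) = \sqrt{\big(L(x+1, y) - L(x-1, y)\big)^2 + \big(L(x, y+1) - L(x, y-1)\big)^2}
    \theta(x, y) = \tan^{-1}\!\frac{L(x, y+1) - L(x, y-1)}{L(x+1, y) - L(x-1, y)}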

Keypoint descriptor computation
- Create orientation histograms over 4x4 sample regions around the keypoint location
- Each histogram contains 8 orientation bins
- 4x4x8 = 128-element vector, distinctively representing a feature (a sketch follows)
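A minimal sketch of the 4x4x8 descriptor layout, assuming `mag` and `ori` are 16x16 arrays of gradient magnitude and orientation (in radians) already rotated to the keypoint orientation and Gaussian-weighted. The trilinear interpolation between bins used by the full method is omitted here; the normalise-clip-renormalise step follows Lowe (2004).

    import numpy as np

    def sift_descriptor(mag, ori, n_bins=8):
        # Accumulate gradient magnitude into an 8-bin orientation histogram
        # for each of the 4x4 sub-regions of the 16x16 patch.
        desc = np.zeros((4, 4, n_bins))
        bin_width = 2 * np.pi / n_bins
        for y in range(16):
            for x in range(16):
                b = int(ori[y, x] % (2 * np.pi) / bin_width) % n_bins
                desc[y // 4, x // 4, b] += mag[y, x]
        v = desc.ravel()                  # 4 * 4 * 8 = 128-element vector
        v /= np.linalg.norm(v) + 1e-7     # normalise for illumination invariance
        v = np.clip(v, 0, 0.2)            # clamp large gradient magnitudes
        v /= np.linalg.norm(v) + 1e-7     # re-normalise
        return v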

Object Recognition based on SIFT
- Nearest-neighbor algorithm (a matching sketch follows this list)
  - Matching: assign features to objects
  - There can be many wrong matches
- Solution:
  - Identify clusters of consistent features with a generalized Hough transform
  - Determine the pose of the object and then discard outliers
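A minimal sketch of nearest-neighbour matching against the feature database. The distance-ratio test (threshold 0.8) used to suppress ambiguous matches comes from Lowe's 2004 paper rather than from this slide, and the array layout and function name are assumptions.

    import numpy as np

    def match_features(query, db, ratio=0.8):
        # query, db: arrays of 128-D descriptors, one descriptor per row.
        matches = []
        for i, q in enumerate(query):
            d = np.linalg.norm(db - q, axis=1)   # distances to every database feature
            j1, j2 = np.argsort(d)[:2]           # two closest database features
            if d[j1] < ratio * d[j2]:            # keep only distinctive matches
                matches.append((i, j1))
        return matches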

Proposed Research
- Personally, I think the model-based approach has better performance
- Success of the model-based approach requires:
  - All models of the objects to be detected
  - Automatic construction of the models
  - Automatic selection of the best model
- How does the system know which 3D model to use for a specific image of an object?
  - By a view-based approach
  - A human looks at an image of an object for a moment and then realizes which model to use for that object
  - The specific model is then used to refine the identification of the specific object

Hybrid of bottom-up and top-down
- The view-based approaches just presented are bottom-up approaches:
  - Features: edges, extrema (low level)
  - Descriptors of features
  - Matching
  - Identification of the object (high level)
- Could it instead be like this?
  - Features … matching (lower level)
  - Guessing of the object (higher level)
  - Identification of the object

Hierarchy of features
- Lowe's system
  - All features have equal weight when voting for an object during identification (subject to verification by examining the open-source code)
  - Special features do not have enough voting power to shift the result to the correct one
- Consider the following scenario:
  - Two objects have many similar features (a1 to a100 are similar to b1 to b100) and just one very different feature each: a* for object A and b* for object B
  - Many of a1 to a100 may be poorly captured by the imaging device and mismatched as b1 to b100; even if we can still recognize the feature a*, the system may still decide the object is B
  (Figure: object A and object B)

Extensions of SIFT
- Color descriptors
- Local texture measures incorporated into feature descriptors
- Scale-invariant edge groupings
- Generic object class recognition

Conclusion and Future Work
- Discussed the different approaches to object recognition
- Discussed what SIFT is and how it works
- Discussed possible extensions to SIFT
- Future work:
  - Design the hybrid approach
  - Design the extensions

Q & A Thank you very much!

Things to be understood
- Finding extrema within a single scale is good; why do we need to find them across different scales?