Evaluation of Research Theme CogB


Objectives
LEAR: LEArning and Recognition in vision
Visual recognition and scene understanding
– Particular objects and scenes
– Object classes and categories
– Human motion and actions
Strategy: Robust image description + learning techniques

Axes
Robust image description
– Appropriate descriptors for objects and categories
Statistical modeling and machine learning for vision
– Selection and adaptation of existing techniques
Visual object recognition and scene understanding
– Description + learning

Overview
Presentation of the team
Positioning within INRIA and internationally
Progress towards initial goals
Main scientific contributions
Future – next four years

Team
Creation of the LEAR team in July 2003

Positioning in INRIA
Main INRIA strategic challenge: developing multimedia data and multimedia information processing
The only INRIA team with object recognition as its central goal
Expertise in image description and applied learning

INRIA teams with related themes
Imedia: indexing, navigation and browsing in large multimedia data streams
TexMex: management of multimedia databases, handling large data collections and developing multimedia and text descriptors
Vista: analysis of image sequences, motion descriptors
Ariana: image processing for remote sensing

International positioning
In France and Europe: a few groups work on the problem (Amsterdam, Oxford, Leuven, TU Darmstadt)
In the US: several groups use machine learning for visual recognition (CMU, Caltech, MIT, UBC, UCB, UCLA, UIUC)
Competitive results compared to the above groups in
– Image description (scale and affine invariant regions)
– Classification and localization of object categories; winner of 14 out of 18 tasks of the PASCAL object recognition challenge
– Learning-based human motion modeling

Progress towards initial goals
LEAR was created two and a half years ago
Significant progress towards each goal, especially
– Category classification and detection
– Machine learning
Scientific production
– Publications (65 journal, conference and book publications in 3 years, mainly in the most competitive journals and conferences)
– Software and databases available on our web page
Collaborations (INRIA team MISTIS, UIUC in the US, ANU in Australia, Oxford, Leuven, LASMEA Clermont-Ferrand …)

Progress towards initial goals
Industrial contracts (MBDA, Bertin Technologies, Thales Optronics, Techno-Vision project Robin)
Research contracts (French grant ACI “Large quantities of data” MoviStar, EU network PASCAL, EU project AceMedia, EU project CLASS, EADS and Marie Curie postdoctoral grants)
Scientific organization (editorial boards of PAMI and IJCV; program committees/area chairs of all major computer vision conferences; organization of ICCV’03 and CVPR’05; vice-head of AFRIF; co-ordination of EU project CLASS, Techno-Vision project Robin and ACI MoviStar)

Main contributions - overview
Image descriptors
– Scale- and affine-invariant detectors + descriptors
– Local dense representations
– Shape descriptors
– Color descriptors
Learning
– Clustering
– Dimensionality reduction
– Markov random fields
– SVM kernels

Main contributions - overview
Object recognition
– Texture recognition
– Bag-of-features representation
– Spatial features (semi-local parts, hierarchical spatial model)
– Multi-class hierarchical classification
– Recognition with 3D models
– Human detection
Human tracking and action recognition
– Learning dynamical models for 2D articulated human tracking
– 3D human pose and motion from monocular images

Invariant detectors and descriptors
Scale and affine-invariant keypoint detectors [IJCV’04]
– Matching in the presence of large viewpoint changes

Invariant detectors and descriptors
Evaluation of detectors and descriptors [PAMI’05, IJCV’06]
– Database with different scene types (textured and structured) and transformations
– Definition of evaluation criteria
– Collaboration with Oxford, Leuven, Prague
Database and binaries available on the web
– 4000 accesses and 1000 downloads

Image retrieval demonstration

Dense representation
Dense multi-scale local descriptors [ICCV’05]
Still local, but captures more of the available information
Clustering to obtain representative features
– Our clustering algorithm deals with very different densities
Feature selection determines the most characteristic clusters

Bag-of-features for image classification
Pipeline, in order:
– Extract regions
– Compute descriptors
– Find clusters and frequencies
– Compute distance matrix
– Classification (SVM)
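The middle steps of this pipeline (clusters, frequencies, distance matrix) can be sketched in a few lines. This is only an illustrative sketch, not the team's implementation: plain k-means stands in for the clustering step, the chi-square histogram distance is an assumed choice (the slide does not name the distance), the 2-D "descriptors" are toy values, and all function names are hypothetical. The final SVM step is omitted; the resulting distance matrix is what a kernel classifier would consume.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd k-means (assumed stand-in for the clustering step)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            groups[j].append(p)
        # Recompute each centre as the mean of its group; keep the old
        # centre if a group became empty.
        centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else centers[j]
            for j, g in enumerate(groups)
        ]
    return centers

def histogram(descriptors, centers):
    """Normalized frequency of each cluster among one image's descriptors."""
    counts = [0] * len(centers)
    for d in descriptors:
        j = min(range(len(centers)), key=lambda c: sq_dist(d, centers[c]))
        counts[j] += 1
    total = sum(counts)
    return [c / total for c in counts]

def chi2(h, g):
    """Chi-square distance between two histograms (assumed choice)."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(h, g) if a + b > 0)

def distance_matrix(hists):
    """Pairwise distances between all image histograms."""
    return [[chi2(h, g) for g in hists] for h in hists]

# Toy data: images A and B share one kind of local structure, C differs.
image_a = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]]
image_b = [[0.05, 0.05], [0.02, 0.1], [0.1, 0.1]]
image_c = [[5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]

centers = kmeans(image_a + image_b + image_c, k=2)
hists = [histogram(img, centers) for img in (image_a, image_b, image_c)]
D = distance_matrix(hists)
```

In this toy setup the distance between A and B comes out smaller than the distance between A and C, which is the property a classifier built on the matrix relies on.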

Bag-of-features for image classification
Excellent results in the presence of background clutter
Our team won all image classification tasks of the PASCAL network challenge on visual object recognition
Example categories shown: bikes, books, building, cars, people, phones, trees

Recognition with spatial relations
Approach [ICCV’05]:
– Semi-local parts: point regions and similar geometric neighborhood structure
– Validation, i.e. part selection
– Learn a probabilistic model of the object class (discriminative maximum entropy framework)

Recognition with spatial relations Improved recognition for classes with structure

Human detection [CVPR’05]
Histogram of oriented image gradients as image descriptor
SVM as classifier, importance weighted descriptors
Winner of the PASCAL challenge on human detection
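To make the descriptor concrete, here is a minimal sketch of an orientation histogram for a single image cell. It is an illustration under stated assumptions rather than the CVPR’05 implementation: the 9 unsigned orientation bins, the centred-difference gradient, the L2 normalization, and the function name `cell_hog` are all choices made here for the example. A full detector would concatenate such histograms over many cells and feed them to the SVM, which is not shown.

```python
import math

def cell_hog(img, bins=9):
    """Magnitude-weighted histogram of gradient orientations for one cell.

    img is a list of rows of grey values; orientations are unsigned
    (0 to 180 degrees) and split into `bins` equal bins (assumed setup).
    """
    h, w = len(img), len(img[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Centred differences approximate the image gradient.
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            b = int(angle / (180.0 / bins)) % bins
            hist[b] += mag  # each pixel votes with its gradient magnitude
    norm = math.sqrt(sum(v * v for v in hist)) or 1.0
    return [v / norm for v in hist]

# Toy cells: intensity steps along x (vertical edge) and along y (horizontal edge).
vertical_edge = [[0, 0, 1, 1] for _ in range(4)]
horizontal_edge = [[0, 0, 0, 0], [0, 0, 0, 0], [1, 1, 1, 1], [1, 1, 1, 1]]

hog_v = cell_hog(vertical_edge)
hog_h = cell_hog(horizontal_edge)
```

The two toy cells put their energy in different bins (the gradient points along x in one case and along y in the other), which is what lets a linear classifier separate shapes by their edge structure.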

Human detection

Evaluation of category recognition
Techno-Vision project Robin ( )
– Funded by the French ministries of defence and of research
Construction of datasets and ground truth
– Industrial partnership with MBDA, SAGEM, THALES, Bertin Tech, Cybernetix, EADS and CNES
– Production of six datasets with thousands of annotated images, from satellite images to ground level images

Evaluation of category recognition
Evaluation metrics for category classification and localization, in collaboration with ONERA and CTA/DGA
Organization of competitions in 2006; 38 registered participants (research teams) at the moment
Datasets, metrics and evaluation tools will be publicly available for benchmarking

Learning based human motion capture
[CVPR’04, ICML’04, PAMI’06]; best student paper at the Rank Foundation Symposium on Machine Understanding of People

Learning based human motion capture

Future – next four years
The major objectives remain valid
Image description [low risk]
– Learn image descriptors [PhD of D. Larlus]
– Shape descriptors [postdoc of V. Ferrari]
– Color descriptors [postdoc of J. Van de Weijer]
– Spatial relations [PhD of M. Marszalek]
Learning [medium risk]
– Semi- & unsupervised learning, automatic annotation
– Hierarchical structuring of categories
– Existing collaborations, EU project CLASS, postdoc of J. Verbeek

Future – next four years
Object recognition
– Object detection & localization [low risk]
– Large number of object categories [medium risk]
– Scene interpretation [high risk]
Human modeling and action recognition
– Pose & motion for humans in general conditions [PhD A. Agarwal]
– Recognition of actions and interactions