A Generic Approach for Image Classification Based on Decision Tree Ensembles and Local Sub-windows Raphaël Marée, Pierre Geurts, Justus Piater, Louis Wehenkel.


Abstract

Raphaël Marée, Pierre Geurts, Justus Piater, Louis Wehenkel (University of Liège, Belgium)

Problem
- Many application domains require classification of characters, symbols, faces, 3D objects, textures, ...
- Specific feature extraction methods must be manually adapted when considering a new application

Approach
- A recent, generic machine learning algorithm based on decision tree ensembles and working directly on pixel values
- Extension with local sub-window extraction

Results
- Competitive with the state of the art on four well-known datasets: MNIST, ORL, COIL-100, OUTEX
- Encouraging results for robustness (generalisation, rotation, scaling, occlusion)

Image classification

Many different kinds of problems, usually tackled using:
- Problem-specific feature extraction, i.e. extracting a reduced set of "interesting" features from the initially huge number of pixels
- plus a learning or matching algorithm

Our generic approach:
- Working directly on pixel values, i.e. without any feature extraction: images are described by the integer values (grey or RGB intensities) of all their pixels
- plus an ensemble of decision trees

Global generic approach: ensemble of extremely randomized trees (extra-trees)

Learning
Top-down induction algorithm like a classical decision tree, with tests at the internal nodes of the form $a_{k,l} < a_{th}$ that compare the value of the pixel at position $(k,l)$ to a threshold $a_{th}$, but:
- Test attributes and thresholds in internal nodes are chosen randomly;
- Each tree is fully developed until it perfectly classifies the images in the learning sample;
- Several extra-trees are built from the same learning sample.

Testing
Propagate the entire test image successively into all the trees (this involves comparing pixel values to thresholds in the test nodes) and assign to the image the majority class among the classes given by the trees.
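The learning and testing steps of the global approach can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: it assumes that each test uses a single pixel and a single threshold drawn fully at random, that images are flattened into lists of integer pixel values (so the position $(k,l)$ becomes one flat index), and the names `build_tree`, `classify_tree`, `build_ensemble` and `classify` are hypothetical.

```python
import random
from collections import Counter

def build_tree(samples, rng):
    """Grow one fully developed tree from (pixels, label) pairs."""
    labels = {label for _, label in samples}
    if len(labels) == 1:
        return labels.pop()  # pure leaf: all remaining images share one class
    # Only pixels that vary among the remaining images can split them.
    varying = [k for k in range(len(samples[0][0]))
               if len({pixels[k] for pixels, _ in samples}) > 1]
    if not varying:
        # Identical images with different classes: fall back to a majority leaf.
        return Counter(label for _, label in samples).most_common(1)[0][0]
    k = rng.choice(varying)                                # random test attribute
    values = [pixels[k] for pixels, _ in samples]
    threshold = rng.randint(min(values) + 1, max(values))  # random threshold
    left = [s for s in samples if s[0][k] < threshold]
    right = [s for s in samples if s[0][k] >= threshold]
    return {"pixel": k, "threshold": threshold,
            "left": build_tree(left, rng), "right": build_tree(right, rng)}

def classify_tree(tree, pixels):
    """Propagate one image down a tree by comparing pixels to thresholds."""
    while isinstance(tree, dict):
        branch = "left" if pixels[tree["pixel"]] < tree["threshold"] else "right"
        tree = tree[branch]
    return tree

def build_ensemble(samples, n_trees, seed=0):
    """Build several trees from the same learning sample."""
    rng = random.Random(seed)
    return [build_tree(samples, rng) for _ in range(n_trees)]

def classify(ensemble, pixels):
    """Assign the majority class among the classes given by the trees."""
    votes = Counter(classify_tree(tree, pixels) for tree in ensemble)
    return votes.most_common(1)[0][0]

# Toy learning sample: four 2x2 grey-level images, flattened (made-up data).
learning_sample = [
    ([0, 10, 5, 0], "dark"),
    ([20, 0, 10, 5], "dark"),
    ([200, 250, 240, 255], "bright"),
    ([255, 230, 210, 250], "bright"),
]
ensemble = build_ensemble(learning_sample, n_trees=10)
```

Because each tree is grown until its leaves are pure, every tree in this sketch reproduces the classes of the learning sample exactly; the ensemble vote only matters for unseen images.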

Local generic approach: Extra-trees and Sub-windows

Learning
Given a window size $w_1 \times w_2$ and a large number $N_w$:
- Extract $N_w$ sub-windows at random from the learning set images and assign to each sub-window the classification of its parent image;
- Build a model to classify these $N_w$ sub-windows by using the $w_1 \times w_2$ pixel values that characterize them.

Testing
Given the window size $w_1 \times w_2$:
- Extract all possible sub-windows of size $w_1 \times w_2$ from the test image;
- Apply the model on each sub-window;
- Assign to the image the majority class among the classes assigned to the sub-windows by the model.
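The sub-window extraction and voting steps can be sketched as follows. This is a hedged illustration rather than the authors' code: it handles grey-level images stored as lists of rows, treats $w_1$ as the window width and $w_2$ as its height (an assumption), and takes the sub-window classifier as an arbitrary callable `model`. All function names are hypothetical.

```python
import random
from collections import Counter

def subwindow(image, left, top, w1, w2):
    """Flatten the w1 x w2 sub-window whose top-left corner is (left, top)."""
    return [v for row in image[top:top + w2] for v in row[left:left + w1]]

def random_subwindows(learning_set, w1, w2, n_w, rng):
    """Draw n_w random sub-windows; each inherits the class of its parent image."""
    windows = []
    for _ in range(n_w):
        image, label = rng.choice(learning_set)
        left = rng.randint(0, len(image[0]) - w1)
        top = rng.randint(0, len(image) - w2)
        windows.append((subwindow(image, left, top, w1, w2), label))
    return windows

def all_subwindows(image, w1, w2):
    """Extract every possible w1 x w2 sub-window of a test image."""
    return [subwindow(image, left, top, w1, w2)
            for top in range(len(image) - w2 + 1)
            for left in range(len(image[0]) - w1 + 1)]

def classify_image(model, image, w1, w2):
    """Majority vote over the classes the model assigns to all sub-windows."""
    votes = Counter(model(window) for window in all_subwindows(image, w1, w2))
    return votes.most_common(1)[0][0]

# Toy data: two uniform 4x4 grey-level images (made-up values).
dark = [[10] * 4 for _ in range(4)]
bright = [[240] * 4 for _ in range(4)]
training = random_subwindows([(dark, "dark"), (bright, "bright")],
                             w1=2, w2=2, n_w=20, rng=random.Random(0))

def toy_model(window):
    """Stand-in for a learned sub-window classifier: thresholds the mean."""
    return "bright" if sum(window) / len(window) > 127 else "dark"

predicted = classify_image(toy_model, bright, 2, 2)
```

In the approach described on the slide, `model` would be an ensemble of extra-trees trained on the labelled sub-windows; the threshold rule above is only a placeholder that keeps the example self-contained.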

Experiments: description

Database specification
Every image in each database is described by all its pixel values and belongs to one class.

DBs        # images   # features      # classes
MNIST                 (28x28x1)       10
ORL                   (92x112x1)      40
COIL-100              (32x32x3)       100
OUTEX                 (128x128x3)     54

Database protocols
Each database is separated into two independent sets: the learning set (LS) of pre-classified images used to build a model, and the test set (TS) used to evaluate the model.

- MNIST: LS: first images; TS: last remaining images
- ORL: 100 random runs, each with LS: 200 images; TS: 200 remaining images
- COIL-100: LS: 1800 images (k*20°, k=0..17); TS: 5400 remaining images
- OUTEX: LS: 432 images; TS: 432 remaining images
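Since every image is described by all its pixel values, the number of features per image follows directly from the image sizes listed above, and the COIL-100 split can be checked against the stated view angles. The short sketch below does this arithmetic; it assumes that COIL-100 provides one view every 5° for each of the 100 objects, which is not stated on the slide, and the variable names are illustrative.

```python
# Image sizes as listed on the slide: (width, height, channels).
image_sizes = {
    "MNIST": (28, 28, 1),
    "ORL": (92, 112, 1),
    "COIL-100": (32, 32, 3),
    "OUTEX": (128, 128, 3),
}

def n_features(width, height, channels):
    """One integer feature per pixel and per colour channel."""
    return width * height * channels

feature_counts = {name: n_features(*size) for name, size in image_sizes.items()}

# COIL-100 protocol: the learning set keeps the views at k*20 degrees, k=0..17.
N_OBJECTS = 100                          # number of classes given on the slide
all_angles = list(range(0, 360, 5))      # assumption: one view every 5 degrees
ls_angles = [k * 20 for k in range(18)]
ts_angles = [a for a in all_angles if a not in ls_angles]

ls_size = N_OBJECTS * len(ls_angles)     # learning set size
ts_size = N_OBJECTS * len(ts_angles)     # test set size
```

Under that assumption the computed split sizes agree with the 1800 learning images and 5400 test images given in the protocol.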

Experiments: results

Error rates on test sets

DBs        Extra-trees    Extra-trees + Sub-windows          State of the art
MNIST      3.26%          2.63% ($w_1 = w_2 = 24$)           12% … 0.7% [1]
ORL        4.56% ± […]    […]% ± 1.18 ($w_1 = w_2 = 32$)     7.5% … 0% [2]
COIL-100   […]%           0.39% ($w_1 = w_2 = 16$)           12.5% … 0.1% [3]
OUTEX      64.35%         2.78% ($w_1 = w_2 = 4$)            9.5% … 0.2% [4]

Computing times
- Learning on OUTEX
  - Extra-trees: about 5 sec
  - Extra-trees + Sub-windows: about 8 min
- Testing on OUTEX (one image)
  - Extra-trees: < 1 msec
  - Extra-trees + Sub-windows: about 0.6 sec

[1] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, 1998
[2] R. Paredes and A. Perez-Cortes, Local representations and a direct voting scheme for face recognition, 2001
[3] S. Obdrzalek and J. Matas, Object Recognition using Local Affine Frames on Distinguished Regions, 2002
[4] T. Mäenpää, M. Pietikäinen, and J. Viertola, Separating color and pattern information for color texture discrimination, 2002

Evaluation of robustness

- Generalisation: considering different learning sample sizes (COIL-100)
- Rotation: image-plane rotation of the test images (COIL-100)
- Scaling: scaled versions of the test images, with the model built from 32x32 images (COIL-100)
- Occlusion: erasing right parts of the test images (COIL-100)
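The occlusion experiment erases the right part of each test image. The sketch below shows one way to simulate this and to count how many sub-windows remain untouched by the erased region. The fill value 0, the 25% occlusion and the 16x16 window on a 32x32 image are assumptions chosen for illustration, and both function names are hypothetical.

```python
def erase_right(image, n_erased, fill=0):
    """Return a copy of the image (list of rows) with its n_erased rightmost columns set to fill."""
    keep = len(image[0]) - n_erased
    return [row[:keep] + [fill] * n_erased for row in image]

def count_intact_subwindows(width, height, w1, w2, n_erased):
    """Count the w1 x w2 sub-windows that do not overlap the erased columns."""
    rows = height - w2 + 1                 # vertical positions of a window
    total = (width - w1 + 1) * rows        # all possible sub-windows
    first_erased = width - n_erased        # index of the first erased column
    intact_lefts = max(0, first_erased - w1 + 1)
    return intact_lefts * rows, total

# Erase the right quarter (8 columns) of a made-up 32x32 grey-level image.
image = [[128] * 32 for _ in range(32)]
occluded = erase_right(image, n_erased=8)
intact, total = count_intact_subwindows(32, 32, 16, 16, n_erased=8)
```

In this configuration 153 of the 289 possible sub-windows do not touch the erased columns, so more than half of the votes are computed from unmodified pixels, whereas a model applied to the whole image sees every erased pixel.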

Conclusion

Novel, generic, and simple method

Competitive accuracy
- Our local generic method (Extra-trees + Sub-windows) is close to state-of-the-art methods without any problem-specific feature extraction, but still slightly inferior to the best results
- In practice, is it necessary to develop specific methods to gain slightly better accuracy?

Invariance
- Robustness to small transformations in test images
- The local approach is more robust than the global approach (many local feature vectors are left more or less intact by a given image transformation)

Future work directions

Improving robustness
- Augmenting the learning sample with transformed versions of the original images
- Normalization of sub-window sizes and orientations

Speed/accuracy trade-off for prediction

Combining Sub-windows with other Machine Learning algorithms