Hierarchical Classification of Diatom Images using Predictive Clustering Trees
Ivica Dimitrovski 1, Dragi Kocev 2, Suzana Loskovska 1, Sašo Džeroski 2
1 Faculty of Electrical Engineering and Information Technologies, Department of Computer Science, Skopje, Macedonia
2 Jožef Stefan Institute, Department of Knowledge Technologies, Ljubljana, Slovenia

Outline
Hierarchical multi-label classification system for diatom image classification
Contour and feature extraction from images
Predictive Clustering Trees
Ensembles: Bagging and random forests
Experimental Design
Results and Discussion

Diatom image classification (1)
Diatoms: a large and ecologically important group of unicellular or colonial organisms (algae)
Variety of uses: water quality monitoring, paleoecology and forensics

Diatom image classification (2)
different diatom species, half of them still undiscovered
Automatic diatom classification:
  image processing (feature extraction from images)
  image classification (labels and groups the images)
Labels can be organized in a hierarchy and an image can be labeled with more than one label
Predict all levels in the hierarchy of taxonomic ranks: genus, species, variety, and form
Goal of the complete system: assist a taxonomist in identifying a wide range of different diatoms

Diatom image classification (3)
Set of images with their visual descriptors and annotations
Taxonomic rank with hierarchical structure

Contour extraction from images
Pre-segmentation of an image:
  separate the diatom objects from dark or light debris
  identify the regions with structured objects
  merge nested regions
Edge-based thresholding for contour extraction:
  locate the boundary between the objects and the background
  produce a binary (black and white) image with the diatom contours
Contour following:
  trace the region borders in the binary image
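A minimal sketch of an analogous pipeline using OpenCV is shown below. It is an illustration only, not the authors' implementation: the file name and parameter values are assumptions, and Otsu thresholding is used here as a simple stand-in for the edge-based thresholding described on the slide.

    # Illustrative contour extraction with OpenCV; parameters are assumptions.
    import cv2

    image = cv2.imread("diatom.png", cv2.IMREAD_GRAYSCALE)   # hypothetical input image

    # Smooth to suppress debris, then binarize to separate object from background.
    blurred = cv2.GaussianBlur(image, (5, 5), 0)
    _, binary = cv2.threshold(blurred, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Contour following: trace the region borders in the binary image (OpenCV 4 API).
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)

    # Keep the largest traced border as the diatom contour.
    diatom_contour = max(contours, key=cv2.contourArea)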

Feature extraction from images
Simple geometric properties:
  length, width, size and the length-width ratio
Simple shape descriptors:
  rectangularity, triangularity, compactness, ellipticity, and circularity
Fourier descriptors:
  30 coefficients
SIFT histograms:
  invariant to changes in illumination, image noise, rotation, scaling, and small changes in viewpoint
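For illustration, the geometric, shape and Fourier descriptors could be computed from the traced contour roughly as follows. This is a sketch using common textbook definitions that may differ in detail from the paper's descriptors; the SIFT histograms are omitted here.

    # Sketch of geometric, shape and Fourier descriptors from a contour.
    import cv2
    import numpy as np

    def describe(contour):
        area = cv2.contourArea(contour)
        perimeter = cv2.arcLength(contour, True)
        (_cx, _cy), (w, h), _angle = cv2.minAreaRect(contour)
        length, width = max(w, h), min(w, h)

        features = {
            "length": length,
            "width": width,
            "size": area,
            "length_width_ratio": length / width,
            # Simple shape descriptors (common textbook definitions)
            "rectangularity": area / (length * width),
            "compactness": perimeter ** 2 / (4 * np.pi * area),
            "circularity": 4 * np.pi * area / perimeter ** 2,
        }

        # Fourier descriptors: FFT of the complex boundary, 30 magnitude
        # coefficients normalised for translation and scale invariance.
        pts = contour.reshape(-1, 2).astype(np.float64)
        z = pts[:, 0] + 1j * pts[:, 1]
        coeffs = np.fft.fft(z)
        features["fourier"] = np.abs(coeffs[1:31]) / np.abs(coeffs[1])
        return features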

Predictive Clustering Trees (PCTs)
Standard top-down induction of decision trees
Hierarchy of clusters
Distance measure: minimization of intra-cluster variance
Instantiation of the variance for different tasks:
  multiple targets, sequences, hierarchies
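The core of top-down induction is scoring each candidate test by how much it reduces intra-cluster variance. A minimal sketch of that scoring step for vector-valued targets is given below; it illustrates the idea only and is not the CLUS implementation.

    # Variance-reduction split scoring for vector targets (illustrative sketch).
    import numpy as np

    def variance(Y):
        # Intra-cluster variance: mean squared distance to the cluster mean vector.
        if len(Y) == 0:
            return 0.0
        return float(np.mean(np.sum((Y - Y.mean(axis=0)) ** 2, axis=1)))

    def variance_reduction(Y, test_mask):
        # Variance of the node minus the size-weighted variance of the
        # two partitions induced by a binary test.
        n = len(Y)
        yes, no = Y[test_mask], Y[~test_mask]
        return variance(Y) - len(yes) / n * variance(yes) - len(no) / n * variance(no)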

CLUS
System where the PCT framework is implemented (KU Leuven & JSI)
Available for download at
The top-down induction algorithm for PCTs:
  selecting the tests: reduction in variance caused by partitioning the instances

PCTs for Hierarchical Multi-Label Classification
HMLC: an example can be labeled with multiple labels that are organized in a hierarchy
Example label set: { 1, 2, 2.2 }
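As an illustration, such a label set is typically encoded as a 0/1 vector over all classes in the hierarchy, closed upwards so that labelling a class also switches on its ancestors. The hierarchy below is hypothetical; this is a sketch, not the exact CLUS encoding.

    # Sketch: encode the label set {1, 2.2} as a 0/1 vector over the hierarchy,
    # closing it upwards so that labelling 2.2 also implies its parent 2.
    classes = ["1", "2", "2.1", "2.2", "3"]          # hypothetical class hierarchy
    labels = {"1", "2.2"}

    def ancestors(c):
        parts = c.split(".")
        return {".".join(parts[:i]) for i in range(1, len(parts) + 1)}

    closed = set().union(*(ancestors(c) for c in labels))    # {"1", "2", "2.2"}
    vector = [1 if c in closed else 0 for c in classes]      # [1, 1, 0, 1, 0]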

PCTs for Hierarchical Multi-Label Classification
Variance: average squared distance between each example's label and the set's mean label
Weighted Euclidean distance: an error at the upper levels costs more than an error at the lower levels
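One common instantiation of this variance in the PCT literature on HMLC uses depth-dependent class weights; the exact weighting scheme below is an assumption about the setup used here:

    \mathrm{Var}(S) = \frac{1}{|S|} \sum_{k=1}^{|S|} d\big(L_k, \overline{L}\big)^2,
    \qquad
    d(L_1, L_2) = \sqrt{\sum_i w(c_i)\, \big(L_{1,i} - L_{2,i}\big)^2},
    \qquad
    w(c_i) = w_0^{\,\mathrm{depth}(c_i)}, \; 0 < w_0 < 1

where L_k is the 0/1 label vector of example k, \overline{L} is the mean label vector of the set S, and c_i is the i-th class in the hierarchy; because w_0 < 1, a disagreement at an upper level of the hierarchy costs more than one at a lower level.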

Ensemble methods
Ensembles are a set of predictive models
Unstable base classifiers
Voting schemes to combine the predictions into a single prediction
Ensemble learning approaches:
  modification of the data
  modification of the algorithm
Bagging
Random Forest
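A minimal sketch of bagging with a probability-distribution vote is shown below; it captures the general scheme independently of the base learner, and the names are illustrative.

    # Bagging with averaged per-class probability votes (illustrative sketch).
    import numpy as np

    def bag(X, Y, make_model, n_models=100, seed=0):
        # Learn each base model on a bootstrap replicate of the training data.
        rng = np.random.default_rng(seed)
        n = len(X)
        models = []
        for _ in range(n_models):
            idx = rng.integers(0, n, size=n)
            models.append(make_model().fit(X[idx], Y[idx]))
        return models

    def vote(models, X):
        # Probability-distribution vote: average the per-class probabilities.
        return np.mean([m.predict_proba(X) for m in models], axis=0)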

Ensemble methods
Image from van Assche, PhD Thesis, 2008

Ensemble methods (CLUS)
Image from van Assche, PhD Thesis, 2008

ADIAC diatom image database
Three different subsets of images:
  1099 images classified in 55 different taxa
  1020 images classified in 48 different taxa
  819 images classified in 37 different taxa
The diatoms vary in shape and ornamentation

Experimental design – classifier
Random Forests and Bagging of PCTs for HMLC:
  feature subset size: 10% of the number of descriptive attributes
  number of classifiers: 100 un-pruned trees
  combining the predictions output by the base classifiers: probability distribution vote
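For orientation, an analogous single-label configuration in scikit-learn would look roughly as follows. It only mirrors the ensemble settings above (100 un-pruned trees, 10% feature subsets, averaged probability votes) and is not the HMLC PCT ensemble built with CLUS.

    # Analogous ensemble settings in scikit-learn (illustrative only).
    from sklearn.ensemble import BaggingClassifier, RandomForestClassifier

    # 100 un-pruned trees, feature subset size = 10% of the descriptive attributes
    random_forest = RandomForestClassifier(n_estimators=100, max_features=0.1)

    # Bagging with the default (un-pruned decision tree) base estimator
    bagging = BaggingClassifier(n_estimators=100)

    # Base-classifier predictions are combined by averaging their class
    # probability distributions (predict_proba), i.e. a probability distribution vote.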

Results (1)
Predictive performance of the feature extraction algorithms and their combination

Classifier     | Descriptors                                            | # features | Overall recognition rate [%]: 55 / 48 / 37 diatom taxa
Bagging        | Geometric and shape descriptors                        |            |
Bagging        | Fourier descriptors                                    |            |
Bagging        | SIFT histograms                                        |            |
Bagging        | Geometric and shape desc. + Fourier desc. + SIFT hist. |            |
Random Forests | Geometric and shape descriptors                        |            |
Random Forests | Fourier descriptors                                    |            |
Random Forests | SIFT histograms                                        |            |
Random Forests | Geometric and shape desc. + Fourier desc. + SIFT hist. |            |

Comparison of the performance of the ensembles of PCTs to the performance of the approaches from H. du Buf, M. M. Bayer (Eds.), Automatic diatom identification, World Scientific Publishing, 2002, pp. 245–257

# Images | # Taxa | Descriptors | Classifier | Evaluation | Recognition rate [%]
 |  | geometric and shape; Fourier; SIFT | Bagging of predictive clustering trees | 10-fold cross-validation |
 |  | geometric and shape; Fourier; SIFT | Bagging of predictive clustering trees | 10-fold cross-validation |
 |  | contour profiling; Legendre polynomials | Decision trees; neural networks; syntactical classifier | Random separation (50/50) into train and test set |
 |  | geometric; shape; Fourier; image moments; ornamentation and morphological | Bagging of decision trees | Leave-one-out |
 |  | geometric and shape; Fourier; SIFT | Bagging of predictive clustering trees | 10-fold cross-validation |
 |  | contour; segment; global | Nearest-mean classifier | Set swapping (complex pseudo cross-validation) |
 |  | Gabor; Legendre polynomials; ornamentation | Decision trees; Bayesian classifier | Random separation (50/50) into train and test set |
 |  | contour; ornamentation | Bagging of decision trees | 10 times random separation (75/25) into train and test |
 |  | Gabor; Legendre polynomials; ornamentation; contour; global; geometric; shape; Fourier; image moments; morphological | Bagging of decision trees | 10 times random separation (75/25) into train and test | 96.9

Results (2)
The presented approach has very high predictive performance (ranging from 96.2% to 98.7%)
Recognition rates of 100% for the majority of taxa
Lower recognition rates are achieved for taxa that are very similar to each other and difficult to distinguish: Eunotia diatoms (shown in the image), Fallacia diatoms
Our results are better than the ones obtained by human annotators (63.3% recognition rate)

Conclusion
Novel approach to taxonomic identification of diatoms from microscopic images
Different feature extraction approaches combined with hierarchical multi-label classification
Very high predictive performance: the best reported performance on this dataset