Applications of one-class classification

Searching for comparable applications for negative selection algorithms

Background
- Purpose: to find real-world applications that demonstrate the use of V-detector (a negative selection algorithm).
- The one-class classification problem differs from conventional classification: only information about one of the classes (the target class) is available.
- Original application: anomaly (outlier) detection.
- The emphasis here is on the applications rather than on the particular classification methods involved.

One-class classification
Basic concept of classification:
- A classifier is a function that outputs a class label for each input object; typically it cannot be constructed from known rules.
- In pattern recognition or machine learning, a classifier (a function) is inferred from a set of training examples.
- Usually the type of function is chosen beforehand and its parameters are determined from the data: linear classifiers, mixtures of Gaussians, neural networks, support vector classifiers.
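
As an illustration of this "choose a function family, then fit its parameters" view, here is a minimal sketch using a linear classifier on toy data; scikit-learn and the synthetic data are assumptions for illustration, not part of the original slides:

```python
# Choose the function type (a linear classifier), then determine its
# parameters w and c from labeled training examples.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)),    # class 0 examples
               rng.normal(3, 1, (50, 2))])   # class 1 examples
y = np.array([0] * 50 + [1] * 50)

clf = LogisticRegression().fit(X, y)         # parameters fit to the data
print(clf.predict([[0.2, -0.1], [3.1, 2.8]]))  # expected: [0 1]
```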

One-class classification
Basic concept of classification (continued):
- Assumptions: continuity, enough information (a sufficient number of samples, limited noise), etc.
- Multi-class classification can be decomposed into two-class classifications.

One-class classification
The same problems arise as in conventional classification:
- Definition of errors
- Atypical training data
- Measuring the complexity of a solution
- The curse of dimensionality
- The generalization of the method

A conventional and a one-class classifier applied to an example dataset containing apples and pears, represented by 2 features per object. The solid line is the conventional classifier, which distinguishes between the apples and pears, while the dashed line describes the dataset. This description can identify the outlier apple in the lower right corner, while the conventional classifier will simply classify it as a pear.

One-class classification
Additional problems:
- Most conventional classifiers assume more or less balanced data.
- It is hard to decide, on the basis of one class alone, how tightly the boundary should fit around the data in each direction.
- It is hard to find which features give the best separation.
- The false positive rate cannot be estimated without negative examples.
- The curse of dimensionality is more pronounced.
- Extra constraints apply, e.g., the boundary must be closed.

Various techniques
- Generated outlier detection: some methods require near-target objects.
- Density methods: directly estimate the density of the target objects; some require a density estimate over the complete feature space, and a typical (representative) sample is assumed.
- Reconstruction methods: based on prior knowledge about the data.
- Boundary methods: require a well-defined distance measure.
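
As a concrete illustration of the boundary approach, a minimal sketch using scikit-learn's OneClassSVM; the library, synthetic data, and parameter values are illustrative assumptions, not part of the original slides:

```python
# A boundary-type one-class classifier: trained on target data only,
# it fits a closed boundary and flags points outside it as outliers.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_target = rng.normal(loc=0.0, scale=1.0, size=(200, 2))  # target class only

# nu bounds the fraction of training points treated as boundary violations.
clf = OneClassSVM(kernel="rbf", gamma=0.5, nu=0.05).fit(X_target)

X_test = np.array([[0.1, -0.2], [4.0, 4.0]])
print(clf.predict(X_test))  # +1 = target, -1 = outlier
```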

Application 1: texture classification
- Problem: classification of texture images of polished granite (or ceramic) tiles, which are widely used as construction elements.
- Polished granite tiles are usually inspected by a human expert, using a chosen master tile as the reference; such inspection is subjective and qualitative.
- A one-class classifier is suitable: outliers cannot be used to train any method.
- Recent development: a quasi-statistical representation of binary images, used as a feature space for texture image classification.

Based on the CCR feature space (coordinated clusters representation).
Outline of the method:
- Given a master texture image of a class, estimate the statistics of its CCR histogram.
- Use the parameters of those statistics to define a closed decision boundary.

Master images

CCR feature space A binary image intensity: Sa={sa{l,m}}, where l=1, 2, …L and m=1, 2, …, M A rectangular window W = I X J Scan all over the image with one pixel steps using that window The number of all possible state of the window is 2w Coordinated clusters representation consists of a histogram Ha(I,J)(b) a is the index of the image (I,J) indicated the size of the window b = 1, 2, …, 2w

When the histogram is normalized, it can be treated as a probability distribution function of occurrence:
F_a^(I,J)(b) = (1/A) * H_a^(I,J)(b), where A = (L - I + 1) * (M - J + 1)
- The histogram H contains all the information about the n-point correlation moments of the image if and only if the separation vectors between the n pixels fit within the scanning window.
- In general, the higher the order of the statistics, the more structural information is available.
- There is a structural correspondence between a gray-level image and its thresholded counterpart.
- Provided that the binary image keeps enough structural information about the primary gray-level image to be classified, the CCR of a binary image is highly suitable for recognition and classification of gray-level texture images.
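
A minimal sketch of how the CCR histogram and its normalization might be computed; the variable names, window size, and test image are illustrative assumptions:

```python
# Compute the normalized CCR distribution F of a binary image by
# sliding an I x J window in one-pixel steps and counting each of
# the 2^(I*J) possible window states.
import numpy as np

def ccr_histogram(binary_img, I=3, J=3):
    L, M = binary_img.shape
    w = I * J
    hist = np.zeros(2 ** w, dtype=np.int64)
    weights = 2 ** np.arange(w)  # encode a window as an integer state b
    for l in range(L - I + 1):
        for m in range(M - J + 1):
            window = binary_img[l:l + I, m:m + J].ravel()
            hist[int(window @ weights)] += 1
    # Normalize by A = (L - I + 1)(M - J + 1) to obtain F.
    A = (L - I + 1) * (M - J + 1)
    return hist / A

# Example: random 32 x 32 binary texture.
img = (np.random.default_rng(0).random((32, 32)) > 0.5).astype(np.uint8)
F = ccr_histogram(img)
print(F.sum())  # 1.0
```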

Framework of classification
- Training phase: take a set of gray-level images from each texture class; threshold each image; calculate the CCR distribution function.
- Recognition phase: take an input test image; threshold it and compute its CCR distribution; compare with the class prototypes and assign it to the class of best match.
- One-class classification: define the limits of feature variation and establish the acceptance criterion.

Thresholding (binarization)
- Needed because the CCR is defined for binary images.
- The fuzzy c-means clustering method is used.
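
A minimal sketch of fuzzy c-means binarization on pixel intensities (two clusters); the update rules follow the standard FCM algorithm, and the fuzziness parameter and iteration count are assumptions:

```python
# Binarize a gray-level image with two-cluster fuzzy c-means on the
# pixel intensities; the brighter cluster becomes 1.
import numpy as np

def fcm_binarize(img, m=2.0, n_iter=50):
    x = img.astype(float).ravel()
    c = np.array([x.min(), x.max()])  # initial cluster centers
    for _ in range(n_iter):
        d = np.abs(x[:, None] - c[None, :]) + 1e-12  # distances to centers
        # Standard FCM membership update: u_ik = 1 / sum_j (d_ik/d_ij)^(2/(m-1))
        u = 1.0 / (d ** (2 / (m - 1)) *
                   np.sum(d ** (-2 / (m - 1)), axis=1, keepdims=True))
        c = (u ** m).T @ x / np.sum(u ** m, axis=0)  # center update
    labels = np.argmax(u, axis=1)   # hard assignment by membership
    bright = np.argmax(c)
    return (labels == bright).astype(np.uint8).reshape(img.shape)
```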

Training phase
- Assuming Q images of a class are available, a random set of P subimages is sampled from each; if only one image is available, Q independent random sets are sampled.
- Five measurements are calculated from the distribution functions F_a:
- F: the mass center of the subimages (not a value, but itself a function/histogram)
- D: the mean of the distances ("distance" refers to the mean distance within a set)
- s: the mean of the standard deviations ("standard deviation" refers to the standard deviation of a set)
- D: the mean distance from the q-th sample center to the center of all samples
- s2: the variance of each set with respect to the center of the samples
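
A minimal sketch of the training statistics, interpreting F, D, and s as the mean histogram and the mean and standard deviation of the subimage-to-center L1 distances; the exact definitions in the paper may differ, so this is illustrative only:

```python
# Summarize a set of normalized CCR histograms of subimages sampled
# from the master texture.
import numpy as np

def train_stats(subimage_hists):
    """subimage_hists: array of shape (P, 2**w), one normalized CCR
    histogram per sampled subimage."""
    F = subimage_hists.mean(axis=0)                 # mass center (a histogram)
    dists = np.abs(subimage_hists - F).sum(axis=1)  # L1 distances to F
    D = dists.mean()                                # mean distance
    s = dists.std()                                 # spread of the distances
    return F, D, s
```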

Criterion
- d(F_test, F) < D + C * s
- D - 2s < D_test < D + 2s
- C is the empirical adjustment parameter.
- F_test and D_test are the means over K random subimages of the texture image to be classified.
- The L1 distance is used as the measure of distinction: d(F_a, F_b) = sum_b |F_a(b) - F_b(b)|.
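
A minimal sketch of the first acceptance criterion, using the statistics from the training sketch above; the default value of C is an arbitrary illustrative choice:

```python
# Accept a test image as belonging to the class if its mean CCR
# histogram lies within D + C*s (L1 distance) of the class center F.
import numpy as np

def accepts(F_test, F, D, s, C=10):
    # F_test: the mean histogram of K random subimages of the test image.
    d = np.abs(F_test - F).sum()  # L1 distance d(F_test, F)
    return d < D + C * s
# The slides' second condition, D - 2s < D_test < D + 2s, bounds the
# test image's own mean within-set distance in the same spirit.
```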

Results
- C should be in the range 1, 2, ..., 20, based on the observation that s is approximately ten times smaller than D.
- 8 master images (training data) plus 16 testing images are used (128 x 128 pixels).
- For C = 1 or 2, only the master images are recognized; more images are recognized for larger C.
- For C < 19, there is no misclassification.
- The proper C depends on the size of the subimages (32, 24, and 64 are discussed).

Application 2: authorship
- Problem: authorship verification.
- Different from the standard text categorization problem: it is not realistic to train with negative samples.
- Different from other one-class classification problems: negative samples are not lacking; rather, it is hard to choose samples that represent the entire negative class.
- The object texts are long, so each can be chunked into multiple samples: a set instead of a single instance.

New idea: the depth of difference between two sets
- Test the rate at which accuracy degrades as the best features are iteratively dropped.

Standard method
- Choose a feature set: frequencies of function words, syntactic structures, part-of-speech n-grams, complexity and richness measures, syntactic and orthographic idiosyncrasies. Note: very different from text categorization by topic.
- Having constructed feature vectors, use a learning algorithm to construct a distinguishing model (similar to categorization by topic); linear separators are believed to work well.
- Assessment: k-fold cross-validation or bootstrapping. (A sketch of this pipeline follows below.)
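
A minimal sketch of this pipeline: function-word frequencies as features, a linear separator, and k-fold cross-validation. The word list and the randomly generated chunks are placeholders, not the paper's actual corpus or feature set:

```python
# Function-word relative frequencies + linear SVM + 5-fold CV.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

FUNCTION_WORDS = ["the", "of", "and", "to", "in", "that", "it", "with"]

# Placeholder chunks: 10 from each of two works. With identically
# distributed fake text, accuracy will hover near chance.
rng = np.random.default_rng(0)
texts = [" ".join(rng.choice(FUNCTION_WORDS, size=100)) for _ in range(20)]
labels = np.array([0] * 10 + [1] * 10)

vec = CountVectorizer(vocabulary=FUNCTION_WORDS)
X = vec.fit_transform(texts).toarray().astype(float)
X /= np.maximum(X.sum(axis=1, keepdims=True), 1)  # relative frequencies

print(cross_val_score(LinearSVC(), X, labels, cv=5).mean())
```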

One-class scenario
- Naïve approach: chunk the two works to generate two sufficiently large sets, then test whether they can be distinguished with high cross-validation accuracy. This failed in experiments: different works by the same author are just different enough to tell apart.
- New approach: unmasking. In the naïve approach, a small number of features do all the work; they are likely to stem from thematic differences, differences in genre or purpose, chronological shifts of style, or a deliberate attempt to mask identity.
- Unmasking: iteratively remove the features that are most useful for distinguishing the two sets.
- Hypothesis: if the works are by the same author, the difference will be reflected in only a relatively small number of features, so a sudden degradation in accuracy indicates the same author (see the sketch below).
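
A minimal sketch of unmasking under these assumptions: a linear SVM supplies the feature weights, the highest-weight features are dropped each round, and X and y are a feature matrix and chunk labels such as those from the pipeline sketched above. The round and drop counts are illustrative choices:

```python
# Unmasking (after Koppel & Schler): repeatedly drop the strongest
# features and record how fast cross-validation accuracy degrades.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

def unmasking_curve(X, y, n_rounds=8, drop_per_round=2):
    active = np.arange(X.shape[1])  # indices of features still in play
    curve = []
    for _ in range(n_rounds):
        acc = cross_val_score(LinearSVC(), X[:, active], y, cv=5).mean()
        curve.append(acc)
        if active.size <= drop_per_round:
            break
        clf = LinearSVC().fit(X[:, active], y)
        # Drop the features with the largest absolute weights.
        top = np.argsort(np.abs(clf.coef_[0]))[-drop_per_round:]
        active = np.delete(active, top)
    return curve  # a fast drop suggests the same author

# e.g. curve = unmasking_curve(X, labels)
```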

Results
- Corpus: 21 works of 19th-century English literature.
- Baseline: a one-class SVM.
- Extension: using negative samples to eliminate false positives.
- Solution to a literary mystery: the case of the bashful rabbi.

Bibliography
- D. M. J. Tax, "One-class classification", PhD thesis, 2001.
- D. M. J. Tax, "Data description toolbox: A Matlab toolbox for data description, outlier and novelty detection", 2005.
- M. Koppel and J. Schler, "Authorship verification as a one-class classification problem", in Proceedings of the 21st International Conference on Machine Learning, 2004.
- R. E. Sanchez-Yanez et al., "One-class texture classifier in the CCR feature space", Pattern Recognition Letters, 24, 2003.
- R. E. Sanchez-Yanez et al., "A framework for texture classification using the coordinated clusters representation", Pattern Recognition Letters, 24, 2003.