INFORMATION REPRESENTATION AND COMPRESSION


Our approach in TUT: We do not know how to describe the locations of blocks, so let us first think about GLOBAL content description, in which locations are not considered. That is, we first look at the problem in which only block STATISTICS are considered (the CAMSHIFT example illustrated that color statistics give good results).

Impact of Quantization
Distribution of DCT coefficients for a typical 8x8 DCT block. We can see that the higher-frequency coefficients are small. If we use strong quantization, they will be quantized to zero.

Under strong quantization, only the first 4x4 block of coefficients will be nonzero. This is equivalent to a 4x4 DCT transform. There is another effect too: the stronger the quantization, the smaller the number of DIFFERENT blocks. In fact, with no quantization, every block is different. Quantization rounds the coefficients to a limited number of values.

Coefficients of the 4x4 blocks
DC: zero frequency, the average light level in the block (top-left position of the block)
AC: correspond to the different frequencies (all other positions of the block)
Quantization by QP: [DC] = round[DC/QP], [AC] = round[AC/QP]
Higher QP -> more zeros in the block
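The quantization rule on this slide can be tried out with a few lines of Python. This is only a minimal sketch: the coefficient values of `block` are made up for illustration, and the helper names `quantize_block` and `count_zeros` are hypothetical, not taken from the slides.

```python
def quantize_block(block, qp):
    """Quantize every coefficient of a DCT block: [c] = round(c / QP)."""
    def quantize(c):
        # Round half away from zero so that +x and -x are treated symmetrically.
        magnitude = int(abs(c) / qp + 0.5)
        return magnitude if c >= 0 else -magnitude
    return [[quantize(c) for c in row] for row in block]


def count_zeros(block):
    """Number of coefficients that were quantized to zero."""
    return sum(1 for row in block for c in row if c == 0)


# Hypothetical 4x4 DCT block: DC in the top-left corner, AC terms that
# get smaller towards the higher frequencies (values are illustrative).
block = [
    [620.0, -41.0, 12.0, 3.0],
    [-35.0, 18.0, -6.0, 1.0],
    [10.0, -7.0, 2.0, 0.5],
    [4.0, 1.5, -0.5, 0.2],
]

for qp in (4, 16, 64):
    quantized = quantize_block(block, qp)
    print("QP =", qp, "zeros =", count_zeros(quantized))
```

Running the loop shows the number of zero coefficients growing with QP, which is the effect the slide summarizes as "Higher QP -> more zeros in the block".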

Here is an illustration for a picture. QP is the quantization parameter; we see that as it increases, the number of DCT patterns is strongly reduced.

Now we use the following idea: let's see what the histogram of the quantized DCT blocks looks like! For example, let's find which blocks appear most often in a picture and create a histogram of, e.g., the first 40 patterns.
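Counting the most frequent quantized blocks could look roughly like the sketch below. It assumes each quantized block is given as a list of rows; the function name `pattern_histogram` is hypothetical, and normalizing the counts to relative frequencies is a choice made here, not something stated on the slide.

```python
from collections import Counter


def pattern_histogram(quantized_blocks, size):
    """Return the `size` most frequent block patterns with their relative frequencies."""
    # Flattening a block into a tuple makes it hashable, so it can be counted.
    counts = Counter(
        tuple(c for row in block for c in row) for block in quantized_blocks
    )
    total = sum(counts.values())
    return [(pattern, n / total) for pattern, n in counts.most_common(size)]


# Tiny illustrative input: four quantized 2x2 blocks (real blocks would be 4x4).
blocks = [
    [[5, 1], [0, 0]],
    [[5, 1], [0, 0]],
    [[4, 0], [0, 0]],
    [[5, 1], [0, 0]],
]
for pattern, frequency in pattern_histogram(blocks, 2):
    print(pattern, frequency)
```

With the slide's example, `size` would be 40.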

The shape of this histogram obviously depends on the quantization. If the quantization is low, the histogram will tend to be flat. If the quantization is high it will tend to have a peak.

Let us see an example of histograms for two pictures.
[Figure: histograms of two face images]

The database retrieval problem based on block histograms
Assume we have a database D of pictures 1, 2, .., i, .., j, .., m. We take a picture and want to check whether it is in the database or whether there are similar pictures there. Example: a database of passport photographs. In our approach we use a similarity measure between pictures based on their quantized histograms. Histograms are treated as vectors, and similarity is based on the following formula:
$$B_{i,j} = \sum_{k} \left| H_i(k) - H_j(k) \right|, \quad i,j \in D$$
where $H_i(k)$ is the k-th coefficient of the histogram vector of picture i.

The measure is the city-block measure (the sum of absolute differences between corresponding histogram coefficients). It achieves its minimum value, 0, when the two histogram vectors are identical. The closer the value is to zero, the more similar the pictures should be. Remember that the blocks are quantized, so noise and irrelevant features are removed. The question is how well such a scheme performs, but before we can check this, we need to look into the light normalization problem.
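As a small illustration of the city-block measure and of retrieval by smallest distance, here is a sketch in Python. The names `city_block` and `retrieve`, and the toy histograms, are assumptions made for the example.

```python
def city_block(h1, h2):
    """City-block (L1) distance: sum of absolute differences between histogram bins."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(abs(a - b) for a, b in zip(h1, h2))


def retrieve(key_histogram, database):
    """Return the name of the database picture whose histogram is closest to the key."""
    return min(database, key=lambda name: city_block(key_histogram, database[name]))


# Toy database with hypothetical 3-bin histograms.
database = {
    "picture_1": [0.5, 0.25, 0.25],
    "picture_2": [0.25, 0.25, 0.5],
}
key = [0.5, 0.25, 0.25]
print(retrieve(key, database), city_block(key, database["picture_2"]))
```

A distance of 0 means the two histogram vectors are identical; it does not by itself prove that the pictures are identical, since the histogram ignores block locations.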

Light normalization problem
The values of the DCT transform coefficients depend on the light level. If the light level is higher, the values are higher. If we use the same quantization for two identical pictures with different light levels, the quantized blocks will be different. The light level can be normalized. First, let's calculate the average light level of a picture. For this we use the values of the DC coefficients in its blocks:
$$DC_{mean} = \frac{1}{N} \sum_{n=1}^{N} DC_n$$
where N is the number of blocks in the picture and $DC_n$ is the DC coefficient of block n. Here we get the average light level of a picture.

Average light level
The average light level DCall of a database is calculated in the same way, based on the values of DCmean for each picture. Next, the light level of each picture is rescaled by a factor (presumably the ratio DCall/DCmean of the database average to that picture's average). After rescaling, the values of the coefficients in the quantized blocks will be similar.
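The light normalization step might be sketched as follows. Note the assumption: the rescaling factor itself did not survive in the transcript, so the sketch uses DCall / DCmean and applies it to every coefficient of the picture. The function names are hypothetical.

```python
def mean_dc(blocks):
    """Average light level of one picture: mean of the DC (top-left) coefficients."""
    return sum(block[0][0] for block in blocks) / len(blocks)


def normalize_light(pictures):
    """Rescale every picture towards the database-wide average light level.

    `pictures` is a list of pictures; each picture is a list of DCT blocks,
    and each block is a list of rows of coefficients.
    """
    dc_means = [mean_dc(blocks) for blocks in pictures]
    dc_all = sum(dc_means) / len(dc_means)  # average light level of the database
    normalized = []
    for blocks, dc_mean in zip(pictures, dc_means):
        factor = dc_all / dc_mean  # ASSUMED form of the rescaling factor
        normalized.append(
            [[[c * factor for c in row] for row in block] for block in blocks]
        )
    return normalized


# Two "identical" pictures, the second one twice as bright (illustrative values).
dark = [[[100.0, 10.0], [5.0, 1.0]], [[200.0, 20.0], [10.0, 2.0]]]
bright = [[[2 * c for c in row] for row in block] for block in dark]
normalized_dark, normalized_bright = normalize_light([dark, bright])
print(mean_dc(normalized_dark), mean_dc(normalized_bright))
```

Under this assumption both pictures end up with the same average light level, so the same quantization produces the same quantized blocks for them.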

The DC coefficients problem
At high quantization levels, very many blocks will have only DC coefficients. The only information about such a block is its DC value, that is, the average light level in the block. What is of interest, however, is how the average light level changes between blocks, and we want to use this information. We do so by accounting for the differences between DC values in neighbouring blocks.

DC differences between blocks
In a) we see a fragment of a picture in which the DC values of the blocks are shown. Each block has 8 neighbours, as shown in b). We calculate 9 differences between the neighbours (8 for the directions and 1 for the average over all directions), as shown in c). We then order the differences and form a vector from the first k coefficients, as shown in d) for k=4.

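Combined histogram
A combined histogram for AC blocks and DC vectors is now formed: H = [ HAC , α x HDC ], where α is a numerical parameter which will be optimized later. Using the combined histogram means that there are two distance terms to minimize, and they are summed with the weight α:
$$B_{i,j} = \sum_{k} \left| H_{AC,i}(k) - H_{AC,j}(k) \right| + \alpha \sum_{k} \left| H_{DC,i}(k) - H_{DC,j}(k) \right|, \quad i,j \in D$$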
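The combined measure can be sketched in Python as the AC distance plus α times the DC distance. For a non-negative α this gives the same value as the city-block distance between the concatenated vectors [HAC, α x HDC]. The function names and the toy histograms are assumptions for illustration.

```python
def city_block(h1, h2):
    """City-block (L1) distance between two histogram vectors."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def combined_distance(hac_i, hdc_i, hac_j, hdc_j, alpha):
    """AC-histogram distance plus alpha times DC-histogram distance."""
    return city_block(hac_i, hac_j) + alpha * city_block(hdc_i, hdc_j)


def combined_histogram(hac, hdc, alpha):
    """Concatenated vector H = [HAC, alpha * HDC]."""
    return list(hac) + [alpha * v for v in hdc]


# Toy 2-bin histograms for pictures i and j (illustrative values).
hac_i, hdc_i = [0.5, 0.5], [1.0, 0.0]
hac_j, hdc_j = [0.25, 0.75], [0.5, 0.5]
alpha = 0.5

summed = combined_distance(hac_i, hdc_i, hac_j, hdc_j, alpha)
concatenated = city_block(
    combined_histogram(hac_i, hdc_i, alpha),
    combined_histogram(hac_j, hdc_j, alpha),
)
print(summed, concatenated)
```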

Optimization of database retrieval
The question is: how good can database retrieval based on the combined histogram be? That is, for example, how many errors will be made? But we can also ask another question: what is the best achievable performance of this approach? Remember that we use only statistical information, but we have several parameters which can be selected:
- quantization level
- size of histograms
- parameter α for combining histograms
- size of DC difference vectors

Optimization procedure
We can study this problem by taking some databases and optimizing the parameters for the best retrieval. This will show us the maximum achievable performance. We did this for face databases using the following scheme:
[Figure: optimization scheme]

EVALUATION OF RESULTS
Given a certain classification threshold, an input face image of person A may be falsely classified as person B. Suppose the target person is person A. The ratio of images of person A that have been classified as other persons is called the False Rejection Rate (FRR). The ratio of images of other persons that have been classified as person A is called the False Acceptance Rate (FAR).

Equal Error Rate
From the FAR and FRR, an Equal Error Rate (EER) is obtained at the threshold where both measures take equal values. The lower the EER, the better the system's performance, since the total error rate, which is the sum of the FAR and the FRR at the EER point, decreases.
[Figure: typical EER performance of the histogram method for two face databases]
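One way to estimate FAR, FRR and the EER from distance scores is sketched below. The setup is an assumption, not the authors' procedure: `genuine` holds distances between images of the same person, `impostor` holds distances between images of different persons, and a match is accepted when the distance does not exceed the threshold. With a finite number of scores the two rates rarely coincide exactly, so the sketch reports the threshold where they are closest.

```python
def far_frr(genuine, impostor, threshold):
    """False acceptance and false rejection rates at a given distance threshold."""
    frr = sum(d > threshold for d in genuine) / len(genuine)     # same person rejected
    far = sum(d <= threshold for d in impostor) / len(impostor)  # other person accepted
    return far, frr


def equal_error_rate(genuine, impostor):
    """Sweep all observed distances as thresholds; return (EER estimate, threshold)."""
    best = None
    for threshold in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, threshold)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, threshold)
    return best[1], best[2]


# Illustrative distances only, not measured data.
genuine = [0.1, 0.2, 0.3, 0.6]
impostor = [0.5, 0.7, 0.8, 0.9]
eer, threshold = equal_error_rate(genuine, impostor)
print(eer, threshold)
```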

DATABASE SELECTION
There are two cases:
1. A database in which there is only one (standard) picture of each person.
2. A database in which there are multiple pictures of each person (and they might be very different).
In case 2, the same person should be retrieved for any of his or her pictures, which can be difficult.

DATABASES SELECTED
The GTF (Georgia Tech Face) database contains face images of 50 people, both male and female, with 15 images each. Most of the images were taken in two different sessions to account for variations in illumination conditions, facial expression, appearance, scale and orientation. For the test, we store the first 11 images of each person in the database, and the remaining 4 images serve as key images for retrieval. Therefore, the total number of stored images is 550 and the total number of key images is 200.

DATABASES SELECTED
The ORL (Olivetti Research Laboratory) database contains 10 different images of each of 40 persons. The images were taken at different times, with slightly varying lighting, various facial expressions (open/closed eyes, smiling/non-smiling) and facial details (glasses/no glasses). The ORL database thus has more variation among the images of one person. For the experiment, we store the first 6 images of each person in the database, and the remaining 4 images serve as key images. Therefore, the total number of stored images is 240 and the total number of key images is 160.

RESULTS
We present results for AC only, for DC only and for the combined histogram.

             AC-Patterns Histograms   Direction-Vectors Histograms   Combined Histogram
EER - ORL    1.25%                    3.125%                         0.625%

EER - GTF: 7%, 4.5% (only two values are given)

The best result for ORL is obtained when: QP_AC=36, number of AC patterns=80, QP_DC=75, number of Direction-Vector patterns=300 and α=0.7, γ=7.
The best result for GTF is obtained when: QP_AC=10, number of AC patterns=250, QP_DC=20, number of Direction-Vector patterns=400 and α=0.9, γ=5.


ANOTHER DATABASE
The FERET database contains more than 10,000 images of more than 1000 individuals, taken in widely varying circumstances. The FERET images are divided into several sets, formed to match its evaluation methodology. Here we made a test based on the sets fa and fb. In both of them, each face has one picture, with the picture in fb taken seconds after the corresponding picture in fa. The fa set, which has 994 images, serves as the database; the fb set, which has 992 images, is used as the key images for retrieval from fa.

EVALUATION OF RESULTS
FERET is considered a difficult database, used in the evaluation of professional applications:

        AC-Patterns   Direction-Vectors   Combined Histogram
EER     4.6371%       7.06%               3.43%

The best EER result is obtained when: QP_AC=12, number of AC patterns=400, QP_DC=12, number of Direction-Vector patterns=400 and α=0.5, γ=4.

FERET METHODOLOGY OF EVALUATION
For FERET there is another methodology, based on calculating how many correct retrievals are obtained among n trials, n=1,2,…,3.

FERET EVALUATION
The FERET evaluation measure is called the cumulative match score. Results for the histogram method (red) are overlaid with those of other known good methods. Rank means how many retrievals are made; a single retrieval (rank 1) is the most demanding.
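A cumulative match score curve can be computed along the following lines. This is a sketch under the assumption that, for each key image, the retrieval system returns a list of database identities ordered from best to worst match; the function name and the toy data are hypothetical.

```python
def cumulative_match_scores(ranked_lists, true_ids, max_rank):
    """Fraction of key images whose true identity appears among the first n retrievals."""
    scores = []
    for n in range(1, max_rank + 1):
        hits = sum(
            true_id in ranked[:n] for ranked, true_id in zip(ranked_lists, true_ids)
        )
        scores.append(hits / len(true_ids))
    return scores


# Four key images; each list is the retrieval order returned for one key.
ranked_lists = [
    ["a", "b", "c"],
    ["b", "a", "c"],
    ["c", "a", "b"],
    ["b", "c", "a"],
]
true_ids = ["a", "a", "b", "a"]
print(cumulative_match_scores(ranked_lists, true_ids, 3))
```

The first value of the curve (rank 1) counts only keys whose best match is correct, which is why it is the most demanding point.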

Features based on Binary Feature Vectors
For each non-border 4x4 image block, there are eight blocks surrounding it. Such a 3x3 block matrix is used here to generate a Binary Feature Vector (BFV). Taking the DC coefficients as an example: the nine DC coefficients within this area form a 3x3 DC coefficient matrix. By measuring and thresholding the magnitude of the differences between the non-center DC coefficients and the central DC coefficient, a binary vector of length 8 is formed. Two different cases are considered here:
Case 1: 0 if current coefficient ≤ threshold, 1 if current coefficient > threshold
Case 2: 0 if current coefficient < threshold, 1 if current coefficient ≥ threshold
[Figure: example]
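A Binary Feature Vector for one 3x3 neighbourhood might be computed as in the sketch below. Two things are assumptions here: the neighbours are visited in row-major order, and the quantity compared with the threshold is the absolute difference between a neighbour's DC value and the central DC value. The flag `strict` switches between Case 1 (`>`) and Case 2 (`>=`).

```python
def binary_feature_vector(dc_grid, row, col, threshold, strict=True):
    """8-bit vector from thresholded |neighbour DC - centre DC| differences.

    strict=True  -> Case 1: bit is 1 when the difference is >  threshold
    strict=False -> Case 2: bit is 1 when the difference is >= threshold
    """
    centre = dc_grid[row][col]
    bits = []
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            if d_row == 0 and d_col == 0:
                continue  # skip the centre block itself
            difference = abs(dc_grid[row + d_row][col + d_col] - centre)
            if strict:
                bits.append(int(difference > threshold))
            else:
                bits.append(int(difference >= threshold))
    return bits


# Hypothetical DC values of a 3x3 neighbourhood; the centre block is at (1, 1).
dc_grid = [
    [10, 12, 20],
    [9, 10, 15],
    [10, 4, 10],
]
print(binary_feature_vector(dc_grid, 1, 1, 5, strict=True))
print(binary_feature_vector(dc_grid, 1, 1, 5, strict=False))
```

The two cases differ only for neighbours whose difference equals the threshold exactly, such as the value 15 in this example.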

DC-BFV Histogram (based on DC coeff.)
AC-BFV Histogram (based on AC coeff.)
[Figure: example of a DC-BFV histogram]

Performance results for the FERET database
The result is quite good if we take into account that the method uses statistical information only.

WHICH IS THE BEST METHOD?
On the FERET plot we see that the best performance is 95%. Which method is it? It is called EIGENFACES, and it is based on the calculation of eigenvectors and eigenvalues of matrices.

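EIGENFACES
Construction of Face Space
Suppose a face image consists of N pixels, so it can be represented by a vector $\Gamma$ of dimension N. Let $\{\Gamma_1, \Gamma_2, \ldots, \Gamma_M\}$ be the training set of face images. The average face of these M images is given by
$$\Psi = \frac{1}{M} \sum_{i=1}^{M} \Gamma_i$$
Then each face $\Gamma_i$ differs from the average face $\Psi$ by $\Phi_i$:
$$\Phi_i = \Gamma_i - \Psi, \quad i = 1, \ldots, M$$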

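EIGENFACES
Now the covariance matrix of the training images can be constructed:
$$C = A A^T = \sum_{i=1}^{M} \Phi_i \Phi_i^T$$
where $A = [\Phi_1, \Phi_2, \ldots, \Phi_M]$ is the N by M matrix whose columns are the differences $\Phi_i$ between each training face and the average face. The basis vectors of the face space, i.e., the eigenfaces, are then the orthogonal eigenvectors of the covariance matrix $C$. Since the number of training images is usually less than the number of pixels in an image, there will be only M-1, instead of N, meaningful eigenvectors.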

Eigenvalues, eigenvectors
x is an eigenvector of the matrix A, and $\lambda$ is the corresponding eigenvalue, if $Ax = \lambda x$ with $x \neq 0$.
If S is a nonsingular nxn matrix, then the matrix $B = SAS^{-1}$ has the same eigenvalues as A.
An nxn matrix has n eigenvalues.

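EIGENFACES
Therefore, the eigenfaces are computed by first finding the eigenvectors, $v_l$ ($l = 1, \ldots, M$), of the M by M matrix L:
$$L = A^T A$$
where $A = [\Phi_1, \ldots, \Phi_M]$ is the matrix of difference face images. The eigenvectors, $u_l$ ($l = 1, \ldots, M$), of the covariance matrix $C = AA^T$ are then expressed by a linear combination of the difference face images, $\Phi_k$ ($k = 1, \ldots, M$), weighted by the components $v_{lk}$ of $v_l$:
$$u_l = \sum_{k=1}^{M} v_{lk} \Phi_k, \quad l = 1, \ldots, M$$
In practice, a smaller set of M' (M' < M) eigenfaces is sufficient for face identification. Hence, only the M' significant eigenvectors of L, corresponding to the largest M' eigenvalues, are selected for the eigenface computation.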
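The reason the small M by M matrix can replace the N by N covariance matrix follows from a short derivation. The notation is an assumption where the transcript lost its symbols: $A$ is the N by M matrix whose columns are the difference face images, $L = A^T A$, and $v$, $\mu$ denote an eigenvector and eigenvalue of $L$.

```latex
\begin{align*}
  L v &= \mu v
      && \text{($v$ is an eigenvector of } L = A^{T}A\text{)} \\
  A^{T} A \, v &= \mu v \\
  A A^{T} (A v) &= \mu \, (A v)
      && \text{(multiply both sides by } A \text{ from the left)} \\
  u &= A v
      && \text{(so $u$ is an eigenvector of } AA^{T} \text{ with the same eigenvalue } \mu\text{)}
\end{align*}
```

So each eigenvector of the small matrix, multiplied by $A$, gives an eigenvector of the large matrix, provided $Av \neq 0$. In practice the resulting vectors $u$ are usually normalized to unit length before use.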

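Thus further data compression can be obtained. M' is determined by a threshold, $\theta_\lambda$, of the ratio of the eigenvalue summation:
$$\frac{\sum_{l=1}^{M'} \lambda_l}{\sum_{l=1}^{M} \lambda_l} \geq \theta_\lambda$$
In the training stage, the face of each known individual, $\Gamma_k$, is projected into the face space and an M'-dimensional vector, $\Omega_k$, is obtained:
$$\Omega_k = U^T (\Gamma_k - \Psi), \quad k = 1, \ldots, N_c$$
where $U = [u_1, \ldots, u_{M'}]$ is the matrix of the selected eigenfaces, $\Psi$ is the average face, and $N_c$ is the number of face classes.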
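Choosing M' from the eigenvalue ratio can be sketched as follows. The function name `choose_num_eigenfaces` and the example eigenvalues are hypothetical; the rule implemented is "take the smallest M' whose leading eigenvalues reach the given fraction of the total".

```python
def choose_num_eigenfaces(eigenvalues, ratio_threshold):
    """Smallest M' such that the M' largest eigenvalues reach the given share of the sum."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running / total >= ratio_threshold:
            return count
    return len(ordered)


# Illustrative eigenvalues only.
eigenvalues = [50.0, 30.0, 10.0, 6.0, 4.0]
print(choose_num_eigenfaces(eigenvalues, 0.85))
```

A higher threshold keeps more eigenfaces and therefore gives less compression.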

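A distance threshold, $\theta_c$, that defines the maximum allowable distance from a face class as well as from the face space, is set up by computing half the largest distance between any two face classes:
$$\theta_c = \frac{1}{2} \max_{j,k} \| \Omega_j - \Omega_k \|, \quad j, k = 1, \ldots, N_c$$
where $\Omega_k$ is the face-space vector of face class k and $N_c$ is the number of face classes. In the recognition stage, a new image, $\Gamma$, is projected into the face space to obtain a vector, $\Omega$:
$$\Omega = U^T (\Gamma - \Psi)$$
where $U$ is the matrix of eigenfaces and $\Psi$ is the average face. The distance of $\Omega$ to each face class is defined by
$$\epsilon_k^2 = \| \Omega - \Omega_k \|^2, \quad k = 1, \ldots, N_c$$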

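For the purpose of discriminating between face images and non-face-like images, the distance, $\epsilon$, between the original image, $\Gamma$, and its reconstructed image from the eigenface space, $\Gamma_f$, is also computed:
$$\epsilon^2 = \| \Gamma - \Gamma_f \|^2$$
where $\Gamma_f = U \Omega + \Psi$, with $U$ the matrix of eigenfaces, $\Omega$ the projection of $\Gamma$ into the face space and $\Psi$ the average face. These distances are compared with the threshold $\theta_c$ given in equation (8), and the input image is classified by the following rules:
IF $\epsilon \geq \theta_c$ THEN the input image is not a face image;
IF $\epsilon < \theta_c$ AND $\epsilon_k \geq \theta_c$ for all k THEN the input image contains an unknown face;
IF $\epsilon < \theta_c$ AND $\epsilon_{k^*} = \min_k \{\epsilon_k\} < \theta_c$ THEN the input image contains the face of individual $k^*$.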
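The three classification rules can be written as a small decision function. The inequalities did not survive in the transcript, so the comparisons below are an assumed reading: too far from the face space means "not a face", close to the face space but far from every class means "unknown face", otherwise the nearest class wins. The function and argument names are hypothetical.

```python
def classify(face_space_distance, class_distances, threshold):
    """Apply the three decision rules to one input image.

    face_space_distance: distance between the image and its reconstruction
    class_distances:     dict mapping each known individual to its class distance
    threshold:           the distance threshold shared by both comparisons
    """
    if face_space_distance >= threshold:
        return "not a face"
    nearest = min(class_distances, key=class_distances.get)
    if class_distances[nearest] < threshold:
        return nearest  # face of a known individual
    return "unknown face"


# Illustrative distances only.
print(classify(5.0, {"A": 1.0, "B": 2.0}, 3.0))
print(classify(1.0, {"A": 4.0, "B": 3.5}, 3.0))
print(classify(1.0, {"A": 1.0, "B": 2.0}, 3.0))
```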

EXPERIMENTAL RESULTS
The eigenface-based face recognition method was tested on the ORL face database. 150 images of 15 individuals were selected for the experiments.

EXPERIMENTAL RESULTS
In the training stage, three images of each individual were used as the training samples, forming a training set totalling 45 images.
[Figure: the average face of the training set]

EXPERIMENTAL RESULTS
[Figure: the first 15 eigenfaces, corresponding to the 15 largest eigenvalues]

EXPERIMENTAL RESULTS
Recognition rate
The recognition rate depends on the training images: when single-view images are used for training, recognition is much worse.

EXPERIMENTAL RESULTS
Faces with calm expressions were used in the training stage, and faces of the same individuals but with various expressions in the testing stage.
[Figure: training images and test images; the lower images are projections into the face space]

CONCLUSIONS
The eigenfaces method treats images globally; no local information is used. Compression is done at the global level. The method requires a lot of computation, but the results are good. Explanation of the good results: images are represented as combinations of "simple" images, and the system is trained on them.