
1 Proposed architecture of a Fully Integrated Modular Neural Network-based Automatic Facial Emotion Recognition system based on Facial Action Coding System
UNIVERSITY “POLITEHNICA” BUCURESTI, FACULTY OF ELECTRONICS, TELECOMMUNICATIONS AND INFORMATION TECHNOLOGY
Author: Mihai Gavrilescu
Bucuresti, 2014

2 Abstract
We describe the architecture of a fully integrated modular neural network-based automatic facial emotion recognition (FER) system able to recognize emotions based on the Facial Action Coding System (FACS). The proposed framework makes use of a neural network to combine the recognition results from different sources, improving the integration of different types of classifiers in order to provide better facial emotion recognition results. We present the architecture and the implementation details, as well as results and possible improvements that can be brought to the current framework.

3 Content
State of the Art
Model
Architecture
Experimental Results
Conclusions

4 State of the Art
The static approach is the most common one, using Bayesian network classifiers to assign each frame of a video to one of the facial expression categories, tracking the results of that specific frame. The dynamic approach takes into account the temporal pattern provided by the frame sequence. In this approach multi-level Hidden Markov Models (HMM) are used, which proved to offer encouraging results not only in terms of facial expression recognition, but also in segmenting the video into sequences corresponding to different expressions. In our current work we used neural networks, as a dynamic approach, in a fully integrated modular neural network-based facial emotion recognition system that also contains static-approach modules in order to improve the overall performance. The novelty of this paper comes from the fully integrated and modular architecture, which offers the possibility to use several types of classifiers in parallel, with ease of integration as well as of switching between different types of learning techniques. Therefore a neural network is used for switching between different algorithms, in order to get the best result from their outputs.

5 State of the Art
The system is based on the Facial Action Coding System (FACS) developed by Ekman and Friesen, a codification of facial expressions in which movements of the face are described through a set of action units (AUs). Tian et al. conducted research in this area, developing Automatic Face Analysis (AFA), a system that can automatically recognize 6 upper-face AUs and 10 lower-face AUs. They also used a multi-state face model, dividing the facial component models for a front face into FrontLips, FrontEyes, FrontCheeks, Nasolabial Furrows and NoseWrinkles, with the profile face including the component models Side Lips, Right eye, Right brow and Right cheek. Their recognition rate on the specific AUs they tested was about 90%, proving that this type of approach offers very good results for facial emotion recognition. Littlewort et al. developed a study comparing different machine learning methods in terms of their feasibility for the automatic recognition of facial expressions. They concluded that the best results are obtained by means of AdaBoost, with an average AU classification rate of almost 93% on the MMI and Cohn-Kanade databases.

6 State of the Art
Bartlett et al. envisaged a way to apply facial expression recognition systems to spontaneous expressions, which are more common in real applications and raise several new issues in facial emotion recognition and AU classification. Their system was able to extract information about facial actions from the MMI database, despite the differences between spontaneous expressions and the posed data contained in MMI. Valstar and Pantic worked on automatic AU recognition on image sequences in videos. They used several detectors and classifiers, dividing the face into four categories: Eyebrows, Eyes, Nostrils and Chin-Mouth. The AUs and their temporal activation models were encoded by means of Hidden Markov Models. They obtained an emotion recognition precision rate of 72% on the Cohn-Kanade database. Another recent successful study in this domain is that of Koelstra and Pantic [7]. They envisaged the first dynamic texture-based approach to recognizing facial Action Units. Their method was based on non-rigid face registration using Free-Form Deformations (FFDs) and adapted versions of the Viola-Jones face detector to locate the face.

7 Model
1. Facial Feature Extraction
Face detection may be done in two ways. In the holistic approach the face is treated as an entire unit, while in the analytic approach only some important facial features are detected. The analytic approach is used in our case, because we need to detect the specific features corresponding to the action units of FACS in order to recognize emotions. We use feature-based methods to track facial features and classify emotions, in a multi-state facial model that will be detailed in the following slides.

8 Model
2. Facial Action Coding System (FACS)
Our proposed architecture will analyze 27 of the 44 AUs in order to detect changes in localized facial features. This is performed by following these steps:
- The head orientation and the face position are detected
- Subtle changes in facial components are measured; these changes are represented as a collection of feature parameters
- The action units are classified, and each classifier feeds its output to a neural network which decides the final AU map and the corresponding recognized emotion.

9 Model
2. Multi-state face model and feature detection
A multi-state face model is used to track different facial features. One significant problem in face detection is head orientation, and a multi-state face model is known to offer the best results in this case. Our architecture divides the multi-state model into five essential components: Eye, Brow, Cheek, Wrinkle and Lip. Each of these components is associated with a certain area of the face and provides specific features used for Action Unit determination. The Eye component is specifically designed to determine the state in which the eyes are found in a picture or a video frame. In our proposed architecture, for detecting the eye-related AUs, we used three methods: a Haar-Cascade classifier, a cascade classifier via Boosted learning, and a method based on rectangle features and pixel-pattern-based texture features.

10 Model
The Brow component follows the same procedure as the Eye component, using only two types of classifiers: a Haar-Cascade classifier and a cascade classifier via Boosted learning. The Cheek component makes use of a feature point tracking algorithm for extraction, plus a Haar-Cascade classifier and a Viola-Jones face detector adapted for AU classification on cheek detection. The Lip component makes use of three different methods: the snake and active contour methods, the lip feature point tracking method of Lien, and horizontal and vertical mouth classifiers. The Wrinkle component makes use of classifiers used in automatic age estimation: the quadratic function method and the shortest distance classifier. The results of each group of classifiers are combined in a decisional Neural Network (NN), trained using backpropagation, which provides the best result by combining the partial results offered by the classification methods.

11 Architecture
The platform was created from scratch in C++. For each component (Eye, Brow, Cheek, Lip, and Wrinkle) a new thread is launched when a frame is analyzed, after the face feature decomposition. Each thread can trigger new threads for each of the classifiers used in the component. When all these classifier threads are finished, they are collected by the corresponding Component Decision Neural Network, which provides the best AU map for its classifiers, together with the number of the analyzed frame, to a 4-layer neural network; based on the recognized AUs coming from each module, the final AU map is generated.

12 Experimental Results
We used the MMI database [4], containing recordings of temporal patterns in facial expression, from neutral, through onset and offset phases, and back to the neutral face. The database consists of over 2900 videos and high-resolution still images from 75 different subjects and focuses on the 7 prototypical facial expressions, being consistently FACS-annotated. The other database we used was the Cohn-Kanade database, with 130 subjects and both spontaneous and posed expressions. This database is also FACS-oriented and all the AUs needed for training are annotated. The testbed was run at 24 frames/second with 800x600-resolution frames on a 3.4 GHz i7 processor with 8 GB of RAM.

13 Experimental Results
(results presented as figures on the original slide)

14 Experimental Results
(results presented as figures on the original slide)

15 Conclusions
We developed a fully integrated neural network-based facial emotion recognition system based on the FACS system. We presented an architecture that easily integrates different types of methods and classifiers in order to improve the overall recognition rate of the system. A specific neural network is used for each component (eyes, brows, cheeks, lips and wrinkles) in order to improve feature detection and, consequently, the detection of action units and action unit combinations. Also, a four-layer neural network is used for combining the results offered by the generated AUs and AUCs and for providing an estimated emotion. Experimental results have shown that this type of approach offers significantly better results than other facial emotion recognition systems in terms of AU classification as well as final emotion interpretation. Moreover, the proposed system offered significantly better consistency than other similar works, proving that its modularity and full integrability improve the quality of the final output. Because the architecture is modular, other classifiers can be added to the feature recognition part, further increasing the possibility of improving the final output.

16 Conclusions
Drawbacks:
- One of the main disadvantages of this type of architecture is the overall processing time. For a video sequence the processing delay forced the analysis of only 1 frame in every 3. This drawback could make the system considerably inefficient when a large number of classifiers are implemented, and it can only be improved through better parallelization algorithms as well as through improving the testbed's performance.
- Another disadvantage is that there is still a great difference between the results obtained for a frontal still face and those for a spontaneous face. Research must be conducted in this area in order to devise better algorithms for detecting AUs in spontaneous faces and to integrate them into this architecture.

17 Thank you!

