EE 492 ENGINEERING PROJECT

EE 492 ENGINEERING PROJECT LIP TRACKING Yusuf Ziya Işık & Ashat Turlibayev Advisor: Prof. Dr. Bülent Sankur

Outline IDENTIFICATION OF THE PROBLEM LIP CONTOUR EXTRACTION LIP TRACKING RESULTS AND CONCLUSION FUTURE WORK

IDENTIFICATION OF THE PROBLEM Automatic Speech Recognition (ASR) systems: 1. Systems using only acoustic information - poor performance in noisy environments. 2. Bimodal audio-visual systems - the visual signal often contains information that is complementary to the audio information; visual information is not affected by acoustic noise; the overall performance of the combined system is better.

Recognition ratio of audio, visual and audio-visual approaches

LIP READING Obtaining the visual information is known as the lip reading problem. Lip tracking is a crucial step in extracting visual features.

LIP TRACKING The lip tracking problem can be solved in 2 steps: 1. Extracting the lip boundary in the first frame with the help of the user. 2. Tracking the obtained contour through the subsequent frames automatically.

Lip Contour Extraction Fully automatic segmentation is a very difficult task, so semi-automatic methods are both unavoidable and desirable. Intelligent Scissors is a robust, accurate, and interactive semi-automatic boundary extraction tool which requires minimal user input.

Intelligent Scissors I The Intelligent Scissors tool extracts an object's contour from several seed points specified interactively by the user. The algorithm converts object boundary extraction into the problem of optimal path search in a weighted graph.

Obtaining Weighted Graph Weighted Graph: A local cost is calculated from every pixel in the image to each of its neighbouring pixels. Local Cost Functionals: Laplacian zero crossing, gradient magnitude, gradient direction. Pixels that exhibit strong edge features are given low local costs.
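
The idea that strong edges receive low local cost can be illustrated with a minimal sketch. It covers only the gradient magnitude term and uses one common choice, the inverted and normalised gradient magnitude; the Laplacian zero-crossing and gradient direction terms, and the weights that combine them, are left out. The function name `gradient_magnitude_cost` and the toy image are assumptions made for illustration and do not come from the slides.

```python
def gradient_magnitude_cost(image):
    """Return a per-pixel cost in [0, 1]; strong edges get low cost."""
    rows, cols = len(image), len(image[0])
    grad = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Difference between the neighbours on either side
            # (one-sided at the image border).
            gx = image[r][min(c + 1, cols - 1)] - image[r][max(c - 1, 0)]
            gy = image[min(r + 1, rows - 1)][c] - image[max(r - 1, 0)][c]
            grad[r][c] = (gx * gx + gy * gy) ** 0.5
    g_max = max(max(row) for row in grad)
    if g_max == 0:
        # Flat image: no edges, so every pixel gets the same (maximal) cost.
        return [[1.0] * cols for _ in range(rows)]
    return [[1.0 - g / g_max for g in row] for row in grad]


# Toy image (assumed): dark left half, bright right half,
# so there is a vertical edge between columns 2 and 3.
toy = [[0, 0, 0, 1, 1, 1] for _ in range(4)]
cost = gradient_magnitude_cost(toy)
```

In this toy image the two columns next to the edge end up with cost 0 and all other pixels with cost 1, so a minimal-cost path is drawn towards the edge.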

Optimal Path Selection User Interaction: Seed points are specified on the image after all local costs are calculated. Contour = Minimal Cost Path: The optimal path from every pixel in the image to the seed point is determined by using Dijkstra’s algorithm.
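
The minimal-cost path search can be sketched with Dijkstra's algorithm on an 8-connected pixel grid. This is an illustrative sketch, not the project's implementation: the names `live_wire_paths` and `trace_path`, the use of the destination pixel's cost as the link cost, and the sqrt(2) scaling of diagonal steps are all assumptions.

```python
import heapq


def live_wire_paths(cost, seed):
    """Dijkstra from `seed` over an 8-connected grid of per-pixel costs.

    Returns (dist, prev): cumulative cost and back-pointer for each pixel.
    """
    rows, cols = len(cost), len(cost[0])
    dist = {seed: 0.0}
    prev = {}
    heap = [(0.0, seed)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist[(r, c)]:
            continue  # stale heap entry; a cheaper path was already found
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    # Assumption: diagonal links are scaled by sqrt(2).
                    scale = 2 ** 0.5 if dr and dc else 1.0
                    nd = d + cost[nr][nc] * scale
                    if nd < dist.get((nr, nc), float("inf")):
                        dist[(nr, nc)] = nd
                        prev[(nr, nc)] = (r, c)
                        heapq.heappush(heap, (nd, (nr, nc)))
    return dist, prev


def trace_path(prev, seed, free_point):
    """Follow back-pointers from the free point to the seed."""
    path = [free_point]
    while path[-1] != seed:
        path.append(prev[path[-1]])
    return path


# Toy cost map (assumed): the middle row is a cheap "edge".
toy_cost = [[1.0] * 5, [0.1] * 5, [1.0] * 5]
dist, prev = live_wire_paths(toy_cost, seed=(1, 0))
wire = trace_path(prev, seed=(1, 0), free_point=(1, 4))
```

Because the back-pointers are computed once per seed, the path for any cursor position can be read off by following `prev`, which is what allows the wire to be redrawn as the mouse moves.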

Live-Wire Tool Live-Wire Tool: As the user moves the mouse, the optimal path from the free point to the seed point is displayed. Property of the ‘live-wire’: If the cursor comes close to an edge, the ‘live-wire’ snaps to the object boundary. Extracting the Contour: When a new seed point is specified, the live wire from this point to the previous seed point is taken as a segment of the contour.

Extraction of a Lip Contour Using Intelligent Scissors At every move of the mouse, the previous ‘live-wire’ is deleted and a new one, beginning at the current position of the cursor and ending at the seed point, is displayed.

Extraction of Outer Boundaries of Lena and a Lip Image Using Intelligent Scissors

LIP TRACKING Method 1: Non-Rigid Object Tracking Algorithm. Method 2: Tracking with “Intelligent Scissors”. Method 3: Active Shape Models.

Non-Rigid Object Tracking

Results of Non-Rigid Object Tracking (figure panels: Esra-8, Aysel-0, and Esra-6 video sequences)

Evaluation of Algorithm (figure panels: Color Edge and Color Segmentation, Frame 67 and Frame 68)

Remarks The overall performance of the algorithm is satisfactory. Advantage: Ability to track the lips through a large number of frames. Drawback: The long computation time of this algorithm in closed-loop mode makes it unsuitable for accurate tracking in real-time applications.

Lip Tracking Using Intelligent Scissors Motivations: A desire to obtain a more accurate and faster lip tracking tool. Intelligent Scissors may be extended easily from lip segmentation to lip tracking.

Lip Tracking Using Intelligent Scissors Seed points from the first frame are tracked into the following frames, and the contour of the lip may then be extracted automatically by using Intelligent Scissors. Suitable seed points are located by using a priori information about the lip image. Features used: gradient magnitude, hue value, distance between successive seed points.

Gradient Magnitude Feature The lip region has a larger gradient magnitude than its surrounding region. The N points with the highest gradient magnitudes (N << M×M, where M is the search range) are seed candidates.

Hue Values The hue value is very useful for separating the boundary from the inner lip regions. Hue triple: In addition to the hue of the seed point that is being tracked, the hues of the neighbours p pixels above and below the current point are calculated. Selected Seed Point: From the N points with the largest gradients, the one whose hue triple is most similar to the previous seed's triple is selected.
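
The two-stage selection (gradient magnitude first, hue triple second) can be sketched as follows. The slides do not state how similarity between hue triples is measured, so the sum of squared differences used here is an assumption, as are the function name `select_seed` and the toy candidate values. Hue is treated as a plain number, ignoring its circular nature.

```python
def select_seed(candidates, prev_triple, n):
    """Pick the new seed position among candidates in the search range.

    candidates: list of (position, gradient_magnitude, hue_triple)
    prev_triple: hue triple of the seed in the previous frame
    n: number of highest-gradient candidates to keep
    """
    # Stage 1: keep the n candidates with the highest gradient magnitude.
    top = sorted(candidates, key=lambda cand: cand[1], reverse=True)[:n]

    # Stage 2: among those, choose the hue triple closest to the previous one.
    # Assumption: similarity is the sum of squared hue differences.
    def hue_distance(triple):
        return sum((a - b) ** 2 for a, b in zip(triple, prev_triple))

    return min(top, key=lambda cand: hue_distance(cand[2]))[0]


# Toy candidates (assumed values): (row, col), gradient magnitude, hue triple.
candidates = [
    ((10, 12), 0.9, (0.02, 0.05, 0.30)),
    ((11, 12), 0.8, (0.03, 0.06, 0.10)),
    ((12, 12), 0.7, (0.50, 0.50, 0.50)),
    ((13, 12), 0.1, (0.03, 0.06, 0.11)),  # best hue match, but weak gradient
]
new_seed = select_seed(candidates, prev_triple=(0.03, 0.06, 0.11), n=3)
```

Here the last candidate matches the previous hue triple exactly but is discarded in the first stage because of its weak gradient, so the second candidate is selected.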

The Distance Between Seed Points The relative position of the seed points is very important during tracking. The Intelligent Scissors tool gives wrong results if they get too close to or too far away from each other. The figure shows the search range of seed point s2 in the following frame.

Result Result of the “Tracking Using Intelligent Scissors” method applied to a 20-frame lip sequence

Active Shape Models Motivations: Lip tracking is a specific case of the general object tracking problem. Therefore, taking into account knowledge about the shape of the lip will increase the performance of a tracker. Active Shape Models may be used for lip tracking on their own as well as for complementing and correcting the errors of a tracker based on Intelligent Scissors.

Lip Training Set The shape of a lip is represented by a set of n 2-D points: x={x1,x2,x3,...,xn,y1,y2,y3,...,yn}. If there are s training examples in the set, s corresponding vectors are constructed and brought into the same coordinate frame.

Active Shape Models I Shape Model: We look for a parametric model x=M(b), where b is the vector of model parameters. Principal Component Analysis: Helps to reduce the dimensionality of the data. The covariance matrix S of the shape vectors is computed.

Active Shape Models II Eigenlips: The eigenvectors of S (φi) are computed and the corresponding eigenvalues (λi) are determined. The matrix Φ is formed, which contains the t eigenvectors corresponding to the t largest eigenvalues. Each shape is then expressed through Φ and the parameter vector b. New Lip Shapes: By changing the components of the vector b in a controlled way, we may obtain new plausible lip shapes.
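
The equations on these two slides did not survive in the transcript. For orientation, the standard Active Shape Model formulation in the same notation (s training shapes x_i, covariance S, eigenvectors φ_i, eigenvalues λ_i, matrix Φ, parameters b) is given below. The mean shape symbol x̄, the 1/(s-1) normalisation and the ±3√λ_i limit are the conventional choices and are assumptions here; the original slides may have used different ones.

```latex
\begin{align}
  \bar{x} &= \frac{1}{s}\sum_{i=1}^{s} x_i,
  &
  S &= \frac{1}{s-1}\sum_{i=1}^{s} (x_i - \bar{x})(x_i - \bar{x})^{T}, \\
  S\,\phi_i &= \lambda_i\,\phi_i,
  &
  \Phi &= \left(\phi_1 \mid \phi_2 \mid \dots \mid \phi_t\right), \\
  x &\approx \bar{x} + \Phi\, b,
  &
  b &= \Phi^{T}(x - \bar{x}), \\
  |b_i| &\le 3\sqrt{\lambda_i}, & i &= 1,\dots,t .
\end{align}
```

Varying the components b_i within the last bound is the usual way of "changing b in a controlled way" so that the generated shapes stay close to those seen in the training set.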

Applications of Active Shape Models 1. Determining the visemes of a language. 2. Increasing the robustness of any tracking algorithm. 3. If the shape model of an object is extracted a priori: i) locating the object in the image; ii) tracking that object through an image sequence.

Visemes of a Language Determining the viseme of each letter: Using Active Shape Models, the parameter vector b of the lip shape corresponding to a letter of a language is obtained. Benefits to Speech Recognition: Parameter vectors obtained from an image sequence may be fused with acoustic information, thus increasing the recognition rate.

Contribution of EigenLips to Lip Tracking Algorithms Lip tracking algorithms may give wrong lip contours for frames far from the first frame. The shape vector x' of a wrong lip contour is projected into the shape space to obtain its parameter vector b'. Distribution of the parameter vector b: If p(b') is larger than a given threshold, the contour is accepted as correct. If p(b') is smaller, the closest b vector is assigned to the lip, thus correcting the wrong boundary.
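
The accept-or-correct rule can be written out as a short worked formulation. The projection is the standard one for a PCA shape model. The rest is assumed for illustration, since the slides do not specify it: the threshold symbol p_t, a Gaussian density with independent components for p(b), and reading "closest b vector" as the nearest training parameter vector b_k in Euclidean distance.

```latex
\begin{align}
  b' &= \Phi^{T}(x' - \bar{x})
     && \text{(projection of the tracked contour)} \\
  \log p(b') &= -\frac{1}{2}\sum_{i=1}^{t} \frac{{b'_i}^{2}}{\lambda_i} + \text{const}
     && \text{(assumed Gaussian model)} \\
  \hat{b} &=
  \begin{cases}
    b' & \text{if } p(b') \ge p_t, \\[4pt]
    \arg\min_{b_k} \lVert b' - b_k \rVert & \text{otherwise},
  \end{cases}
     && \text{(accept or correct)} \\
  \hat{x} &= \bar{x} + \Phi\,\hat{b}
     && \text{(corrected contour)}
\end{align}
```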

Conclusion I “Intelligent Scissors” is an interactive semi-automatic image segmentation tool. It may be used for extracting the initial lip boundary as well as for tracking that boundary through an image sequence.

Conclusion II Non-Rigid Object Tracking Algorithm: high time complexity; tracking through a large number of frames. Tracking with Intelligent Scissors: more accurate results; low time complexity; tracking through a small number of frames.

Future Works Active Shape Models: The library of lip shapes was obtained. Viseme group for the Turkish language. Correction of wrong contours. Extraction & tracking of contours.

Future Works II The method of “Lip Tracking Using Intelligent Scissors” may be made more robust by imposing a shape constraint factor. Given an image, the region of the lip may be located by using Shape Models. A lip tracking system which is fully based on Active Shape Models may be developed.