Project 1 : Eigen-Faces Applied to Speech Style Classification Brad Keserich, Senior, Computer Engineering College of Engineering and Applied Science;

Presentation transcript:

Project 1: Eigen-Faces Applied to Speech Style Classification
Brad Keserich, Senior, Computer Engineering, College of Engineering and Applied Science; University of Cincinnati; Cincinnati, Ohio
Suryadip Chakraborty, School of Computing Sciences and Informatics
Dr. Dharma Agrawal, Professor, School of Computing Sciences and Informatics
Sponsored by the National Science Foundation, Grant ID No.: DUE

Introduction
Speech recognition
Voice disorders
– Stuttering
– Pausing
– Other, less well-known forms
The research group focuses on Parkinson's patients

Techniques
Previous work
– Good results from neural network classifiers that use fuzzy values
– Wavelet transformations are effective
For this project
– Eigen-faces method adapted to audio

Goals
Investigate the usefulness of the eigen-faces method for speech classification

Objectives
Acquire data
Extract salient features
Analyze the effectiveness of eigen-faces

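Eigen-faces for audio
The recording is divided along the time axis $t$ into windows $w_1, w_2, \ldots, w_5$. Each window $w_i$ is converted to a feature vector $v_i = (f_1, f_2, \ldots, f_r)^{\mathsf T}$.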
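The windowing step can be illustrated with a minimal sketch. The slides do not say which features fill each vector, so the band-averaged magnitude spectrum used here is an assumption, as are the function name `frame_features`, its parameters, and the synthetic test tone.

```python
import numpy as np

def frame_features(signal, n_windows=5, n_features=8):
    """Split a 1-D signal into windows w_1..w_n and return one feature
    vector v_i per window (one row per window)."""
    signal = np.asarray(signal, dtype=float)
    windows = np.array_split(signal, n_windows)
    features = []
    for w in windows:
        # Magnitude spectrum of the Hann-tapered window (assumed feature choice).
        spectrum = np.abs(np.fft.rfft(w * np.hanning(len(w))))
        # Average the spectrum over n_features bands so every v_i has length r.
        bands = np.array_split(spectrum, n_features)
        features.append([band.mean() for band in bands])
    return np.array(features)

# Synthetic stand-in for a recording: a 440 Hz tone sampled at 8 kHz for 0.5 s.
t = np.arange(4000) / 8000.0
V = frame_features(np.sin(2 * np.pi * 440 * t))
print(V.shape)  # (5, 8)
```

Each row of `V` plays the role of one $v_i$; stacking the rows from many recordings would give the training set used by the classifier.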

Classifiers using Abstract Features
Training
– Training set of feature vectors
– Convert to a zero-mean truth set
– Keep the top k principal components (using principal component analysis, PCA)
Classifying
– Project new vectors onto the eigenbasis
– Residuals indicate closeness to a class
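The training and classifying steps above can be sketched as follows. This is a rough illustration rather than the project's actual code: it assumes one eigenbasis is fitted per class and that a new vector is assigned to the class with the smallest residual. The function names, the labels `style_a` and `style_b`, and the synthetic data are all hypothetical.

```python
import numpy as np

def fit_eigenbasis(X, k):
    """Return (mean, basis); basis holds the top-k principal components
    of the rows of X as orthonormal columns."""
    mean = X.mean(axis=0)
    # SVD of the zero-mean data; the rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k].T

def residual(x, mean, basis):
    """Distance from x to the subspace spanned by the eigenbasis."""
    centered = x - mean
    projection = basis @ (basis.T @ centered)
    return np.linalg.norm(centered - projection)

def classify(x, models):
    """Pick the class whose eigenbasis leaves the smallest residual."""
    return min(models, key=lambda label: residual(x, *models[label]))

# Synthetic training sets: each class varies along a different axis.
rng = np.random.default_rng(0)
t = np.linspace(-1.0, 1.0, 20)
train = {
    "style_a": np.outer(t, [1.0, 0.0, 0.0]) + rng.normal(scale=0.01, size=(20, 3)),
    "style_b": np.outer(t, [0.0, 1.0, 0.0]) + rng.normal(scale=0.01, size=(20, 3)),
}
models = {label: fit_eigenbasis(X, k=1) for label, X in train.items()}
print(classify(np.array([2.0, 0.0, 0.0]), models))  # style_a
```

A small residual means the new vector is well explained by that class's principal components, which is the sense in which residuals indicate closeness to a class.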

Data
Recorded word: “Ta-Be-Mo-No”
– Consonant + vowel sounds
– Easy to segment
– Use the “Ta” portion only
Use voice acting for data collection
– Same person
– Vary the way the word is spoken
Variance of speaking style
– Stuttering
– Pausing
– Pace
– Pitch inflections

Pipeline

Segmentation and Labeling
Automation
– Works well for slow, clear cases
– Works less well for more realistic cases
– Results for slow cases are close to hand segmentation
By hand
– More reliable segmentation at this point
– Done with sample counts in Logic 8
– Label each segment with the correct sound
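The slides do not say how the automatic segmentation works. One common approach, assumed here purely for illustration, is to threshold the short-time energy of the signal. The function name `segment_by_energy`, its parameters, and the synthetic signal are hypothetical.

```python
import numpy as np

def segment_by_energy(signal, frame_len=100, threshold=0.1):
    """Return (start, end) sample ranges whose short-time energy exceeds
    threshold * max energy. The end index is exclusive."""
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    energy = (frames ** 2).mean(axis=1)
    active = energy > threshold * energy.max()
    # Pad with False so every run of active frames has a start and an end.
    padded = np.concatenate(([False], active, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = edges[0::2], edges[1::2]
    return [(int(s) * frame_len, int(e) * frame_len) for s, e in zip(starts, ends)]

# Two tone bursts separated by silence stand in for slowly spoken syllables.
tone = np.sin(2 * np.pi * 0.05 * np.arange(1000))
silence = np.zeros(500)
recording = np.concatenate([silence, tone, silence, tone, silence])
print(segment_by_energy(recording))  # [(500, 1500), (2000, 3000)]
```

A fixed threshold like this is likely to behave as the slide describes: clean boundaries when syllables are separated by clear silence, and less reliable ones when speech is fast or continuous.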


Modifications
Use additional features in the Eigen-faces method
– Stutter detection
– Pauses and spacing within the spoken word
– Pitch inflections
Utilize Mel-Cepstrum to pick up features
Substitute Laplacian Eigenmap for PCA
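To make the last modification concrete, here is a minimal NumPy sketch of a Laplacian eigenmap that could stand in for the PCA step. It is not the project's implementation: the k-nearest-neighbour graph, the heat-kernel width `sigma`, the function name, and the helix test data are all assumptions, and the sketch expects distinct points whose neighbour graph is connected.

```python
import numpy as np

def laplacian_eigenmap(X, n_components=2, n_neighbors=5, sigma=1.0):
    """Embed the rows of X into n_components dimensions."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Pairwise squared Euclidean distances.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    # k nearest neighbours of each point (column 0 is the point itself).
    nearest = np.argsort(sq, axis=1)[:, 1 : n_neighbors + 1]
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = nearest.ravel()
    # Heat-kernel weights on the neighbour graph, then symmetrise.
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-sq[rows, cols] / (2 * sigma ** 2))
    W = np.maximum(W, W.T)
    d = W.sum(axis=1)
    # Solve L y = lambda D y through the symmetric normalised Laplacian.
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_sym = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L_sym)
    # Skip the trivial eigenvector (eigenvalue near 0) and keep the next ones.
    return d_inv_sqrt[:, None] * eigvecs[:, 1 : n_components + 1]

# Points along a helix in 3-D stand in for feature vectors.
s = np.linspace(0.0, 3.0, 40)
X = np.column_stack([np.cos(s), np.sin(s), s])
Y = laplacian_eigenmap(X, n_components=2)
print(Y.shape)  # (40, 2)
```

Unlike PCA, this embedding does not give a linear projection for unseen vectors, so a classifier built on it would need some additional way to place new samples.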

Results
Features performing well
– Blatant stutter detection
– Long durations
– Spectrum analysis
Good class separability

Conclusions
Eigen-faces work for spoken audio data
More tweaking required
Further research
– Mel-Cepstrum features
– Laplacian Eigenmapping to replace PCA
May be useful as a front end to Fuzzy-Neuro classifiers

References
1. Wu, H., Siegel, M., & Khosla, P. (1999). Vehicle sound signature recognition by frequency vector principal component analysis. IEEE Transactions on Instrumentation and Measurement, 48(5). doi:
2. Belkin, M., & Niyogi, P. (2002). Laplacian Eigenmaps for Dimensionality Reduction and Data Representation.
3. Prahallad, K. Speech Technology: A Practical Introduction. Topics: Spectrogram, Cepstrum and Mel-Frequency Analysis. /slides/03_mfcc.pdf