Neural Net Algorithms for SC Vowel Recognition Presentation for EE645 Neural Networks and Learning Algorithms Spring 2003 Diana Stojanovic.


Summary: Neural net algorithms applied to recognition of Serbo-Croatian (SC) vowels. Follows the Thubthong & Kijsirkul (2001) paper on Thai phoneme recognition. Light background will be provided.

Introduction: Speech recognition has many applications (PCs, cell phones, home appliance activation à la Dilbert, etc.).

Introduction 2: There are various algorithms for recognizing speech, some of which rely on the recognition of individual phonemes or sounds.

Block diagram of speech recognition system: Signal Processing -> Speech Recognition. For this project, Signal Processing covers segmentation and spectral analysis; Speech Recognition covers individual vowel recognition.

Previous work: Thubthong & Kijsirkul (2001) tested a multi-class Support Vector Machine (SVM) against a Multilayer Perceptron (MLP) for recognition of Thai vowels and tones. They claim that the SVM is superior, although the recognition rates differ by only 2-3% for systems of comparable complexity.
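
A binary SVM separates only two classes, so a multi-class SVM has to combine several of them. One common scheme is the decision directed acyclic graph (DDAG), which is the label that appears next to the SVM figure in the results. The sketch below shows only the graph traversal. The pairwise decider is a stand-in for a trained binary SVM, and the centroid values and function names are hypothetical, not taken from the paper or from the project data.

```python
import math

# Hypothetical (F1, F2) centroids in Hz for five vowels.
# Illustrative values only, not measurements from the project.
CENTROIDS = {
    "i": (300.0, 2300.0),
    "e": (500.0, 1900.0),
    "a": (700.0, 1300.0),
    "o": (500.0, 900.0),
    "u": (300.0, 700.0),
}


def pairwise_decide(x, a, b):
    """Stand-in for a trained binary SVM that separates class a from class b.

    Here it simply picks whichever hypothetical centroid is closer to x.
    """
    return a if math.dist(x, CENTROIDS[a]) <= math.dist(x, CENTROIDS[b]) else b


def ddag_predict(x, classes, decide=pairwise_decide):
    """Walk the DDAG: compare the first and last remaining classes,
    eliminate the loser, and repeat until one class is left."""
    remaining = list(classes)
    evaluations = 0
    while len(remaining) > 1:
        first, last = remaining[0], remaining[-1]
        winner = decide(x, first, last)
        evaluations += 1
        if winner == first:
            remaining.pop()      # the last class lost, drop it
        else:
            remaining.pop(0)     # the first class lost, drop it
    return remaining[0], evaluations
```

For N classes the traversal evaluates N-1 binary classifiers per input, even though N(N-1)/2 of them have to be trained.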

About speech sounds: A speech sound is an acoustic wave. The speaker's vocal tract shapes the spectrum of each sound. The spectrum depends on the speaker and on the properties of the particular sound (for instance /u/), so recognition in the spectral domain is possible.

Vowel Formants: Vowels can be recognized in the spectral domain by the characteristic "lines" corresponding to their properties (backness, height, lip rounding, etc.). These "lines", called formants, occur at resonant frequencies of the vocal tract.
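
To make "resonant frequencies of the vocal tract" concrete, a standard textbook approximation treats the tract for a neutral vowel as a uniform tube closed at the glottis and open at the lips, whose resonances are F_n = (2n - 1) c / (4 L). The sketch below evaluates that formula. The tube length of 17.5 cm and the speed of sound of 350 m/s are assumed round values, not figures from this project.

```python
def tube_formants(length_m, n_formants=3, speed_of_sound=350.0):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    F_n = (2n - 1) * c / (4 * L), for n = 1, 2, 3, ...
    """
    return [
        (2 * n - 1) * speed_of_sound / (4.0 * length_m)
        for n in range(1, n_formants + 1)
    ]


# Assumed vocal tract length of 17.5 cm (a typical textbook value).
neutral_vowel = tube_formants(0.175)
```

With these assumed values the first three resonances come out near 500, 1500 and 2500 Hz. Real vowels move the formants away from these positions as the tongue and lips change the tube's shape.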

Serbo-Croatian Vowel Chart

Data Used in the Project: Data collection and properties. Type of speech: speaker dependent, accented syllables. 480 isolated words were recorded and digitized at 11 kHz. Vowels in accented position were segmented manually. Vowel formants were measured by PCQuirer.

Sound Features Measured: Only the first two formants were used for training the nets, in order to reduce complexity. Given the properties of the SC sounds, performance should not suffer from this low dimensionality.

Perceptron, Backprop and Support Vector Machine: We learned about these throughout the semester. For details, please refer to the paper.
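
As a reminder of the simplest of these algorithms, here is a minimal sketch of the perceptron learning rule applied to a two-class problem in the (F1, F2) plane. The six data points (in kHz) are made up to resemble /i/-like and /u/-like vowels and are not taken from the recorded data. The function names are hypothetical, and this is not the training code used in the project.

```python
def train_perceptron(samples, labels, learning_rate=1.0, max_epochs=100):
    """Perceptron rule: on each misclassified sample, move the weights
    toward (label +1) or away from (label -1) that sample."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * activation <= 0:  # wrong side of (or on) the boundary
                weights = [w + learning_rate * y * xi
                           for w, xi in zip(weights, x)]
                bias += learning_rate * y
                mistakes += 1
        if mistakes == 0:  # every sample classified correctly, stop
            break
    return weights, bias


def predict(weights, bias, x):
    """Return +1 or -1 depending on the side of the learned line."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else -1


# Made-up (F1, F2) points in kHz: +1 is /i/-like, -1 is /u/-like.
SAMPLES = [(0.30, 2.30), (0.32, 2.20), (0.28, 2.40),
           (0.30, 0.70), (0.33, 0.80), (0.29, 0.65)]
LABELS = [1, 1, 1, -1, -1, -1]

weights, bias = train_perceptron(SAMPLES, LABELS)
```

Because the two made-up clusters are linearly separable, the rule stops after a finite number of updates. Separating all five vowels would need either several such units or the MLP and SVM approaches named above.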

Results
Results for Thai (previous work): MLP 92.28% recognition rate; SVM (DDAG) 94.99% recognition rate.
Results for SC (present work): MLP 90-95% recognition rate; SVM work in progress.

What is next? First, finish the SVM results. Examine fast, connected speech. Speaker-independent recognition.