Supervisor: Dr. Eddie Jones Electronic Engineering Department Final Year Project 2008/09 Development of a Speaker Recognition/Verification System for Security Applications


Contents
What is a Speech Recognition/Verification System?
How Speech Recognition Works
How Speech Verification Works
Basics
Milestones
Timeline
Questions

What is a Speech Recognition/Verification System? Speech recognition applications include voice dialling (e.g., "Call home"). Speech recognition converts spoken words into a digital signal; this signal is then broken into very short waveforms, which are compared against a database of known sounds, called phonemes. This is then used to recognise the word that was spoken. Speech verification applications try to verify correctness of pronunciation. Speech verification does not try to decode unknown speech; instead, knowing what speech is to be said, it attempts to match the known sentence's pitch, pronunciation, etc. with a stored speaker's pitch, pronunciation, etc.

How Speech Recognition Works Speech at its most basic level is broken down into phonemes: representations of the sounds we make and put together to form sentences. When the analogue speech is converted to a digital signal, it is divided into small segments as short as a few hundredths of a second. These segments are then matched to known phonemes (~40 in the English language). A program then examines each phoneme in the context of the phonemes around it.
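As a rough illustration of the segmentation step described above (a sketch, not the project's actual front end), the snippet below splits a sample stream into short overlapping frames; the 16 kHz rate, frame length, and hop size are assumed values chosen for illustration.

```python
def frame_signal(samples, frame_len=160, hop=80):
    """Split a sample stream into overlapping frames (~10 ms at 16 kHz).

    Each frame would then be converted to features and matched against
    the ~40 English phoneme models mentioned above.
    """
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames

signal = [0.0] * 800           # 50 ms of dummy input at 16 kHz
frames = frame_signal(signal)  # nine 160-sample frames, hopping by 80
```

Overlapping frames are a common choice here because a phoneme boundary rarely lands exactly on a frame edge.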

How Speech Verification Works There are essentially four different types of speech verification: 1. Fixed Phrase Verification – one fixed phrase is stored, and when verifying, that phrase is used to identify the user. 2. Fixed Vocabulary Verification – multiple phrases are stored, and when verifying, one chosen at random is used to identify the user. 3. Flexible Vocabulary Verification – during system training on phrases, a set of subwords is generated that can then be used during verification. 4. Text Independent Verification – the system learns the user's voice (pitch, dialect, pronunciation, tone, etc.), and in verification the user is free to say anything he/she wishes; the system recognises their voice.
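A minimal sketch of type 1, fixed phrase verification, assuming each utterance has already been reduced to a small feature vector; the cosine-similarity measure, the threshold, and the feature values are illustrative assumptions, not the project's actual method.

```python
import math

def cosine_similarity(a, b):
    """Similarity between two feature vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def verify_fixed_phrase(stored_template, attempt, threshold=0.9):
    """Accept the user only if the attempt is close to the enrolled phrase."""
    return cosine_similarity(stored_template, attempt) >= threshold

enrolled = [0.9, 0.1, 0.4]     # features of the stored fixed phrase
genuine  = [0.88, 0.12, 0.41]  # same speaker repeating the phrase
impostor = [0.1, 0.9, 0.2]     # a different voice
```

The threshold trades off false acceptances against false rejections, which is the central tuning decision in any verification system.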

Basics Training – recording multiple speakers saying many varying sentences. These recordings are then analysed and stored for future recognition. Certain speakers will have a higher security level, which can be used in security applications. Recognition – setting the system to recognition mode and reading out the sentence provided. If a match is found, access to a secured location is granted; if not, the user must try again or does not have access.
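The training and recognition modes above can be sketched as an enrol/recognise pair. The Euclidean-distance matcher, the threshold, and the per-speaker security levels below are assumptions for illustration, not the project's actual classifier.

```python
def enrol(db, speaker, template, level=1):
    """Training mode: store an analysed voice template with a security level."""
    db[speaker] = {"template": template, "level": level}

def recognise(db, features, threshold=1.0):
    """Recognition mode: return (speaker, level) on a match, (None, 0) on failure."""
    best, best_dist = None, float("inf")
    for name, entry in db.items():
        dist = sum((a - b) ** 2 for a, b in zip(entry["template"], features)) ** 0.5
        if dist < best_dist:
            best, best_dist = name, dist
    if best is not None and best_dist <= threshold:
        return best, db[best]["level"]
    return None, 0  # unknown voice: access denied

voices = {}
enrol(voices, "alice", [1.0, 2.0, 3.0], level=2)  # higher security level
enrol(voices, "bob",   [4.0, 4.0, 4.0])
```

Returning the matched speaker's security level lets the same lookup drive the access decision described in the slide.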

Basic Block Diagram of Training: Sample sentence → Analysed → Stored Voice 1, Stored Voice 2. Note: there will be multiple stored voices for speech verification.

Basic Block Diagram of Verification: Read a previously recorded sentence → Analysed → Compared to stored voices. Matches a stored voice: passes security check. Unknown voice: fails.

Milestones 1. Research into speaker recognition/verification and simulation of various FEP in Matlab. 2. Simulation of classifier(s), and baseline performance evaluation of the system. 3. Investigation of speaker recognition/verification over the Internet; simulation of channel errors and additive noise. 4. Investigation of real-time implementation, including selecting a suitable development platform; translation to C with a view to real-time implementation, and functional verification against the Matlab reference. 5. Development of a real-time version of the system.

Timeline Milestone 1: Mid-Late November Milestone 2: Mid January Milestone 3: February (Pending research) Milestone 4: Early March Milestone 5: Late March

Questions?