Voice Recognition by a Realistic Model of Biological Neural Networks. By Efrat Barak, supervised by Karina Odinaev and Igal Raichelgauz.

Structure
- Project Objective
- The Model
- The Classification Process
- Results & Analysis
- Conclusion

Project Objective: Configure a neural-network-based system for voice recognition.

The Model

The Main Principle: The readout function recognizes the basin the network has converged to and classifies the input according to that basin's indicator.

Correspondence with the Theory of Attractor Neural Networks
- The system converges to a basin
- The basins are periodic attractors
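
The convergence-to-basin idea can be sketched with a toy binary network (a Hopfield-style synchronous update, used purely as an illustration; the project's spiking model is far richer). Under synchronous updates the state settles into a fixed point or a short cycle, and the attractor an input falls into identifies it:

```python
import numpy as np

def attractor(W, s, max_steps=50):
    """Synchronously update a binary network until a state repeats;
    return the cycle of states it settles into (the attractor)."""
    seen = []
    s = s.copy()
    for _ in range(max_steps):
        for i, prev in enumerate(seen):
            if (s == prev).all():
                return [tuple(p) for p in seen[i:]]  # fixed point or cycle
        seen.append(s.copy())
        s = np.where(W @ s >= 0, 1, -1)  # synchronous threshold update
    return [tuple(s)]

# Store one pattern with a Hebbian outer product (zero diagonal)
pattern = np.array([1, -1, 1, -1, 1])
W = np.outer(pattern, pattern)
np.fill_diagonal(W, 0)

# A noisy version of the pattern falls into the pattern's basin
noisy = pattern.copy()
noisy[0] = -noisy[0]
basin = attractor(W, noisy)
```

Here the basin's attractor is a fixed point; with other weight matrices a synchronous update can settle into a short cycle, i.e. a periodic attractor.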

Correspondence with the LSM Theory
- The neural network may be treated as a liquid
- The readout function receives only the current state of the liquid and transforms it into an output signal
- The system can perform several tasks simultaneously

Neural Network Structure
- 22 input neurons
- 135 spiking neurons in a 3×3×15 formation
- LIF (leaky integrate-and-fire) model for neuron behavior
- 20% of the neurons are inhibitory, 80% excitatory
- Dynamic synapses
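
The LIF dynamics of a single neuron can be sketched in a few lines; the time constant, threshold, and reset values below are illustrative placeholders, not the project's parameters:

```python
import numpy as np

def lif_spikes(input_current, tau=20.0, v_rest=0.0, v_thresh=1.0,
               v_reset=0.0, dt=1.0):
    """Leaky integrate-and-fire: the membrane potential leaks toward rest,
    integrates the input, and fires (then resets) on crossing threshold.
    Returns a binary spike train the same length as input_current."""
    v = v_rest
    spikes = np.zeros(len(input_current), dtype=int)
    for t, i_in in enumerate(input_current):
        # Euler step of dv/dt = (-(v - v_rest) + i_in) / tau
        v += dt * (-(v - v_rest) + i_in) / tau
        if v >= v_thresh:
            spikes[t] = 1
            v = v_reset  # fire and reset
    return spikes

# A constant drive above threshold produces regular spiking
train = lif_spikes(np.full(200, 2.0))
```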

Creating the Stimulus
30 seconds of recorded speech are encoded into 1 second of spike trains using the following methods:
- Time Encoding – a straightforward conversion
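
The slides do not spell out the "straightforward conversion"; as an assumed illustration only, the sketch below uses a threshold-crossing scheme that emits a spike wherever the signal's amplitude rises through a fixed level:

```python
import numpy as np

def threshold_encode(signal, thresh=0.5):
    """Emit a spike at each sample where the amplitude rises through thresh.
    (Assumed scheme; the slides do not specify the exact conversion.)"""
    above = signal >= thresh
    rising = above & ~np.roll(above, 1)  # True where we just crossed upward
    rising[0] = above[0]                 # no wrap-around at the start
    return rising.astype(int)

# Four cycles of a pure tone yield one spike per rising crossing
tone = np.sin(np.linspace(0, 8 * np.pi, 400))
spikes = threshold_encode(tone)
```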

Creating the Stimulus
- Mel Frequency Cepstral Coefficients (MFCC) encoding – the frequency bands are positioned logarithmically, on the mel scale. A periodic spike train is added to the one-second encoding of the voice segment.
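
The hertz-to-mel conversion behind MFCC band placement is standard: mel = 2595 · log10(1 + f/700). The sketch below computes band edges that are equally spaced in mel and therefore logarithmically spaced in frequency (the band count and frequency range are arbitrary examples, not the project's settings):

```python
import numpy as np

def hz_to_mel(f):
    """Standard hertz-to-mel conversion."""
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_band_edges(f_min, f_max, n_bands):
    """Band edges equally spaced in mel, hence logarithmic in Hz."""
    mels = np.linspace(hz_to_mel(f_min), hz_to_mel(f_max), n_bands + 2)
    return mel_to_hz(mels)

edges = mel_band_edges(0.0, 8000.0, 12)
```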

Performing a Simulation
- A new network is created
- A stimulus of one speech segment is fed to the network, followed by a periodic driving force (repeated for every combination of segment and frequency)
- The basins are categorized by their activity vectors

The Classification Process

The Indicators Map
- The number of segments of the wanted voice that converged to basin b
- The number of segments of the unwanted voice that converged to basin b
- The total number of initials that converged to basin b

The Indicators Map The indicator of basin b:

The Indicators Map Examples:

The Indicators Map

Indicators’ Average:
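
The indicator formula appears only as an image in the original slides, so one plausible, assumed form is sketched below: the difference between wanted and unwanted counts normalized by the basin's total count, so a basin reached only by the wanted voice scores +1 and one reached only by the unwanted voice scores -1. The basin statistics are hypothetical examples:

```python
def basin_indicator(wanted, unwanted, total):
    """Assumed indicator: (wanted - unwanted) / total.
    The slide's actual formula is shown only graphically."""
    return (wanted - unwanted) / total

# Hypothetical basin statistics: (wanted, unwanted, total initials)
basins = {
    "A": (8, 1, 9),   # dominated by the wanted voice
    "B": (0, 6, 6),   # only unwanted segments converged here
    "C": (3, 3, 6),   # uninformative basin
}
indicators = {b: basin_indicator(*c) for b, c in basins.items()}
```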

The Classification Process

Tuning Step 1. Select frequencies

Tuning, prior to Step 2: Why do we need a threshold?

Tuning Step 2. Determine the threshold
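
The decision rule sketched below is an assumption (the slides show it only graphically): average the indicators of the basins a test segment reaches and compare the average with the tuned threshold, presumably the same kind of threshold (th) varied in the MFCC results tables:

```python
def classify(indicator_values, threshold):
    """Average the indicators of the basins a segment converged to and
    compare with the tuned threshold (assumed decision rule)."""
    avg = sum(indicator_values) / len(indicator_values)
    return "wanted" if avg > threshold else "unwanted"

print(classify([0.9, 0.7, -0.1], threshold=0.3))   # average 0.5 -> wanted
print(classify([-0.4, 0.1, -0.6], threshold=0.3))  # average -0.3 -> unwanted
```

Lowering the threshold trades misses for false alarms, which matches the trend across the threshold columns in the MFCC results.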

The Classification Process

Results – Amplitude Encoded Input
Input examples: wanted voice, unwanted voice

Results – Amplitude Encoded Input Results of a verification test

Results – Amplitude Encoded Input
Results of a Classification Test

Input    | Classified as | Our Classification | True Classification
Wanted   | Wanted        | 71%                | 100%
Wanted   | Unwanted      | 29%                | 0%
Unwanted | Wanted        | 55.9%              | 0%
Unwanted | Unwanted      | 44.1%              | 100%

Results – Amplitude Encoded Input
Results of Classification by Two Different Systems

Input    | Classified as | System 1 | System 2 | True Classification
Wanted   | Wanted        | 71%      | 94%      | 100%
Wanted   | Unwanted      | 29%      | 6%       | 0%
Unwanted | Wanted        | 55.9%    | 61.23%   | 0%
Unwanted | Unwanted      | 44.1%    | 38.77%   | 100%

Results – Amplitude Encoded Input Cross Classification

Results – Amplitude Encoded Input
Results of cross classification for systems 1 and 2: 50.2% answered, 49.8% unanswered

Input    | Classified as | System 1 | System 2 | Cross Classification
Wanted   | Wanted        | 71%      | 94%      | 97.1%
Wanted   | Unwanted      | 29%      | 6%       | 2.9%
Unwanted | Wanted        | 55.9%    | 61.23%   | 66.5%
Unwanted | Unwanted      | 44.1%    | 38.77%   | 33.5%

Results – MFCC Encoded Input
Input examples: wanted voice, unwanted voice

Results – MFCC Encoded Input
Results of a classification test; two sets of new data were used

True Classification | Classified as        | Test I (100 wanted, 400 unwanted segments) | Test II (30 wanted, 30 unwanted segments)
Wanted              | Wanted (Hit)         | 87%                                        | 86.8%
Wanted              | Unwanted (Miss-Hit)  | 13%                                        | 13.2%
Unwanted            | Wanted (False Alarm) | 55.3%                                      | 45%
Unwanted            | Unwanted (Hit)       | 44.7%                                      | 55%

Results – MFCC Encoded Input
Classification results at f = 18 Hz for four threshold values:

Data set 3 (100 wanted, 400 unwanted segments)
Classification       | th=0.3 | th=0  | th=-0.12 | th=-0.2
Wanted (Hit)         | 58%    | 87%   | 96%      | 100%
Unwanted (Miss-Hit)  | 42%    | 13%   | 4%       | 0%
Wanted (False Alarm) | 32.2%  | 55.3% | 77.5%    | 93.75%
Unwanted (Hit)       | 67.8%  | 44.7% | 22.5%    | 6.25%

Data set 4 (30 wanted, 30 unwanted segments)
Classification       | th=0.3 | th=0  | th=-0.12 | th=-0.2
Wanted (Hit)         | 47.3%  | 86.8% | 97%      | 100%
Unwanted (Miss-Hit)  | 52.7%  | 13.2% | 2.6%     | 0%
Wanted (False Alarm) | 17.5%  | 45%   | 82.5%    | 92.5%
Unwanted (Hit)       | 82.5%  | 55%   | 17%      | 7%

Basins Creation Pattern: (a) 324 initials, (b) 100 initials, (c) 60 initials

Conclusion
A system for voice recognition, based on neuro-computations, was designed. The system succeeded in recognizing the wanted voice when the input was encoded by its amplitude.

Conclusion
The MFCC method yielded very different inputs, so the system's ability to recognize such input was only partially demonstrated. The system's stability was demonstrated.

Suggestions for Future Projects
- Prepare the system for various types of inputs
- Perform automatic tuning using statistical tools
- Prove that the system can perform several tasks simultaneously

THE END