Audio Workgroup Neuro-inspired Speech Recognition Group Members: Ismail Uysal, Yoojin Chung, Ramin Pichevar, Rich Hammett, Tarek Massoud, Ross Gaylor, David Anderson.

1 Audio Workgroup Neuro-inspired Speech Recognition Group Members: Ismail Uysal, Yoojin Chung, Ramin Pichevar, Rich Hammett, Tarek Massoud, Ross Gaylor, David Anderson, Shihab Shamma, Hynek Hermanski, Shih-Chii Liu, Giacomo Indiveri, Malcolm Slaney

2 Audio Workgroup Audio Projects Localization Speech Recognition More ASR

3 Audio Workgroup Shihab is Running See Shihab arriving in Telluride in 2004 (should happen around 4PM today)

4 Audio Workgroup Localization Effort Interaural Time Difference (ITD): estimated from the time difference between spikes of two matching channels. Interaural Intensity Difference (IID): difference of spike counts between the two cochleae. Azimuth: combination of ITD and IID. [Figures: ITD estimation from pure tones; azimuth estimation from music. Labels: Speaker, Microphones]
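
The two cues on the localization slide can be sketched in a few lines. This is only an illustration, assuming spikes arrive as lists of timestamps in seconds from one matching channel pair. The function names, the nearest-spike pairing rule, and the `max_lag` cutoff are hypothetical and are not taken from the workgroup's implementation.

```python
from statistics import median

def estimate_itd(left_spikes, right_spikes, max_lag=1e-3):
    """Median lag (right minus left, in seconds) between each left spike
    and its nearest right spike; pairs further apart than max_lag are ignored."""
    if not right_spikes:
        return None
    lags = []
    for t_left in left_spikes:
        nearest = min(right_spikes, key=lambda t: abs(t - t_left))
        lag = nearest - t_left
        if abs(lag) <= max_lag:
            lags.append(lag)
    return median(lags) if lags else None

def estimate_iid(left_spikes, right_spikes):
    """Difference of spike counts between the two cochleae (left minus right)."""
    return len(left_spikes) - len(right_spikes)

# Hypothetical spike times: the right channel lags the left by 0.3 ms.
left = [0.010, 0.020, 0.030]
right = [0.0103, 0.0203]
itd = estimate_itd(left, right)
iid = estimate_iid(left, right)
```

A positive ITD here means the sound reached the left cochlea first. How the two cues are combined into an azimuth estimate is not specified on the slide and is left out of the sketch.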

5 Audio Workgroup Localization Effort

6 Audio Workgroup FPAA/Mote – Word Recognition

7 Audio Workgroup FPAA/Mote – Word Recognition Field Programmable Analog Array (FPAA)-based analog cochlea (non-spiking) with envelope detection. Mote-based pattern matching using matched filtering with receptive fields. Robosapien listens to the spoken commands...
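
As a rough picture of the pattern-matching stage, the sketch below scores a channel-by-time envelope pattern against one template ("receptive field") per word and picks the best match. It assumes the pattern is already time-aligned with the templates, so it is a fixed-alignment simplification of matched filtering. The command words, template values, and function names are hypothetical.

```python
def match_score(pattern, template):
    """Inner product of an envelope pattern with a template,
    normalized by the template's energy."""
    dot = sum(p * t
              for p_row, t_row in zip(pattern, template)
              for p, t in zip(p_row, t_row))
    norm = sum(t * t for row in template for t in row) ** 0.5
    return dot / norm if norm else 0.0

def recognize(pattern, templates):
    """Return the word whose template gives the highest score."""
    return max(templates, key=lambda word: match_score(pattern, templates[word]))

# Hypothetical 2-channel, 2-frame templates for two commands.
templates = {
    "forward": [[1.0, 0.0], [0.0, 1.0]],
    "back": [[0.0, 1.0], [1.0, 0.0]],
}
heard = [[0.9, 0.1], [0.2, 0.8]]
command = recognize(heard, templates)
```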

8 Audio Workgroup FPAA/Mote – Word Recognition Status: FPAA – (we are using a new FPAA) 2nd-order sections synthesized, but a full auditory filter bank is not yet up. MOTE – real-time communication with Matlab and sampling operational.

9 Audio Workgroup Relational Network (Simple) [Figure labels: X, Y, Z, M, M_X, M_Y, M_Z, m] Patches of neurons Each measures one quantity Bidirectional relations for feedback/feedforward Thanks to Rodney Douglas

10 Audio Workgroup Relational Network (example) [Figure labels: Input here, Relational Feedback, Relational specification, Relational feedback]

11 Audio Workgroup ASR Relational Network Cochlea Delay Phone Recognizer Word Recognizer A patch of neurons (one of N output) Note: We don't know how to represent delays Phone Recognizer Bidirectional links enforce phoneme/word constraints

12 Audio Workgroup Relational Advantages Not an HMM HMMs are great, but… Incorporate other knowledge Bottom-up perception Top-down word hypothesis Hallucinate Based on experience Hear "ba..." and know that bad, bat, bar, bass, band follow
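
The top-down word hypothesis idea can be illustrated with a toy lexicon lookup. This is only a sketch: letters stand in for phonemes, the lexicon is the slide's word list plus one hypothetical distractor ("dog"), and the function names are invented for the example. It is not the relational network itself.

```python
LEXICON = ["bad", "bat", "bar", "bass", "band", "dog"]

def word_hypotheses(heard_prefix, lexicon=LEXICON):
    """Words still consistent with what has been heard so far."""
    return [word for word in lexicon if word.startswith(heard_prefix)]

def expected_next(heard_prefix, lexicon=LEXICON):
    """Symbols the surviving word hypotheses predict next (top-down expectation)."""
    n = len(heard_prefix)
    return sorted({word[n] for word in word_hypotheses(heard_prefix, lexicon)
                   if len(word) > n})

candidates = word_hypotheses("ba")   # ['bad', 'bat', 'bar', 'bass', 'band']
predicted = expected_next("ba")      # ['d', 'n', 'r', 's', 't']
```

In a relational network the same constraint would be carried by bidirectional links between phoneme and word patches rather than by an explicit list search.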

13 Audio Workgroup Inner hair cells Silicon Cochlea Ganglion cells Basilar membrane high frequency low frequency (van Schaik, Liu, 2004) BASILAR MEMBRANE INNER HAIR CELLS GANGLION CELLS

14 Audio Workgroup Silicon Frequency Response Tone ramps into two cochleas

15 Audio Workgroup Cochlear Rate Profiles Left Cochlea Right Cochlea Spikes per utterance

16 Audio Workgroup Learning Algorithms Statistical SAS (Pick best channels for decision) Least squares (for software demo) Liquid State Machine Take input to high dimensions with spiking net Spike Timing Dependent Plasticity (STDP) Giacomo/Srinjoy Chip Brader/Fusi Vowel 1 Vowel 2 LSM Spiking Output
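
One simple way to "pick best channels for decision" is to rank cochlear channels by how far apart their mean spike counts are for the two classes. The sketch below does only that. The ranking criterion, the function name, and the spike counts are assumptions made for illustration; the slide does not say which statistic was actually used.

```python
def best_channels(rates_a, rates_b, k=2):
    """Rank channels by the absolute gap between class-mean rates.

    rates_a, rates_b: lists of per-utterance rate vectors, one value per channel.
    Returns the indices of the k most discriminative channels.
    """
    n_channels = len(rates_a[0])

    def mean(rows, ch):
        return sum(row[ch] for row in rows) / len(rows)

    gaps = [abs(mean(rates_a, ch) - mean(rates_b, ch)) for ch in range(n_channels)]
    return sorted(range(n_channels), key=lambda ch: gaps[ch], reverse=True)[:k]

# Hypothetical spikes-per-utterance for two vowels over four channels.
vowel_1 = [[10, 50, 5, 20], [12, 48, 5, 22]]
vowel_2 = [[11, 10, 40, 21], [11, 12, 42, 21]]
selected = best_channels(vowel_1, vowel_2)
```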

17 Audio Workgroup Phoneme 1 Phoneme 2 Learning Chip Architecture Immediate Cochlea Plastic synapses Delayed Cochlea Phoneme 1 Cochlea Chip Learning Chip Neurons Relational Network Nonplastic synapses Excit. Inhib. Binary synaptic weights

18 Audio Workgroup Tone Results Tone recognition Spike input from silicon cochlea Training Two tones Duplicated input Positive and negative examples Testing

19 Audio Workgroup Phoneme recognition Spike input from silicon cochlea Training Two phonemes Duplicated inputs Positive and negative examples Testing Phoneme Results

20 Audio Workgroup Behind the Curtain

21 Audio Workgroup Hardware Overview Cochlea Learning Phoneme Word PCI-AER (for remapping) Cochlea Shih-Chii Liu Giacomo Indiveri Implemented in MATLAB

22 Audio Workgroup Infrastructure Difficulties Remapper: Because of problems with the AER mapper boards, remapping the AER data from the silicon cochlea to the learning chip had to be done in Matlab (very slow). Power: Unpredictable problems caused by supply voltage variations of as much as 1 V. Sharing chips: The learning chip had to be shared with two other workgroups. PC replacement
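
The remapping step itself is conceptually a table lookup on event addresses. The sketch below assumes each AER event is a (timestamp, address) pair and that unmapped source addresses are dropped. The event layout, addresses, and function name are hypothetical and do not describe the PCI-AER board or the Matlab code that was actually used.

```python
def remap_events(events, address_map):
    """Translate source addresses to destination addresses.

    events: iterable of (timestamp_us, source_address) pairs.
    address_map: dict from cochlea channel address to learning-chip input address.
    Events whose source address has no entry in the map are discarded.
    """
    return [(t, address_map[addr]) for t, addr in events if addr in address_map]

# Hypothetical events: channel 3 maps to input 0, channel 7 is unmapped.
cochlea_events = [(100, 3), (150, 7), (220, 3)]
channel_to_input = {3: 0, 5: 1}
chip_events = remap_events(cochlea_events, channel_to_input)
```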

23 Audio Workgroup Impedance Difficulties Cochlear firing rates Cochlea: 6M spikes/second (30k channels, 200 spikes/second each) Silicon Cochlea: 30k spikes/second (30 channels, 1k spikes/second each) Learning Chip: 3k spikes/second (30 channels, 100 spikes/second each) Dynamic range
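
The totals on this slide follow from channels times per-channel rate, and the mismatch between stages is the ratio of those totals. The short check below uses only the slide's figures; the variable names are invented for the example.

```python
# (channels, spikes per second per channel), as quoted on the slide
STAGES = {
    "cochlea": (30_000, 200),
    "silicon cochlea": (30, 1_000),
    "learning chip": (30, 100),
}

# Aggregate spike rate of each stage
totals = {name: channels * rate for name, (channels, rate) in STAGES.items()}

# How much faster the silicon cochlea emits spikes than the learning chip accepts
mismatch = totals["silicon cochlea"] / totals["learning chip"]
```

With these numbers the silicon cochlea produces ten times as many spikes per second as the learning chip is quoted at, which is the "impedance" gap the slide refers to.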

24 Audio Workgroup Desired Results /A/ Phoneme Patch /I/ Phoneme Patch AI Word Patch IA Word Patch AA AI Phoneme Input Relational Feedback Without With

25 Audio Workgroup Simulation

26 Simulation 2

27 Audio Workgroup Simulation 3

28 Audio Workgroup Great Job! Student Members: Ismail Uysal, Yoojin Chung, Ramin Pichevar, Rich Hammett, Tarek Massoud, Ross Gaylor

29 Audio Workgroup

30 Silicon Cochlea Raster plot for two different tone inputs Mean firing rates for two different vowel inputs Channel Number Time in microseconds

31 Audio Workgroup Word Recognizer Four example raster plots (silence, A_, A_ with relational, AI)

32 Audio Workgroup Software Simulation

33 Audio Workgroup Software Simulation

34 Audio Workgroup Behind the Curtain

