
Audio Workgroup: Neuro-inspired Speech Recognition. Group Members: Ismail Uysal, Yoojin Chung, Ramin Pichevar, Rich Hammett, Tarek Massoud, Ross Gaylor, David Anderson.


1 Audio Workgroup Neuro-inspired Speech Recognition Group Members: Ismail Uysal, Yoojin Chung, Ramin Pichevar, Rich Hammett, Tarek Massoud, Ross Gaylor, David Anderson, Shihab Shamma, Hynek Hermanski, Shih-Chii Liu, Giacomo Indiveri, Malcolm Slaney

2 Audio Workgroup Audio Projects Localization Speech Recognition More ASR

3 Audio Workgroup Localization Effort

4 Audio Workgroup FPAA/Mote – Word Recognition

5 Audio Workgroup FPAA/Mote – Word Recognition Field Programmable Analog Array (FPAA)-based analog cochlea (non-spiking) with envelope detection. MOTE-based pattern matching using matched filtering with receptive fields. Robosapien listens to the spoken commands…
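The matched-filtering step on slide 5 can be illustrated with a minimal sketch. It assumes a single-channel envelope (the slide's receptive fields would span several cochlear channels), unit-energy templates, and hypothetical command names; none of these details come from the slides.

```python
import math


def normalize(template):
    """Scale a template to unit energy so scores are comparable across templates."""
    norm = math.sqrt(sum(v * v for v in template))
    if norm == 0:
        raise ValueError("template must contain at least one non-zero value")
    return [v / norm for v in template]


def matched_filter_score(envelope, template):
    """Return the peak sliding dot product of the template over the envelope."""
    n, m = len(envelope), len(template)
    if m == 0 or n < m:
        raise ValueError("template must be non-empty and no longer than the envelope")
    best = float("-inf")
    for start in range(n - m + 1):
        score = sum(envelope[start + k] * template[k] for k in range(m))
        best = max(best, score)
    return best


def recognize(envelope, templates):
    """Pick the label whose template gives the highest matched-filter peak."""
    return max(templates, key=lambda label: matched_filter_score(envelope, templates[label]))


# Hypothetical command templates and a noisy envelope (illustrative values only).
templates = {
    "forward": normalize([0, 1, 2, 1, 0]),
    "stop": normalize([2, 2, 0, 0, 0]),
}
envelope = [0, 0, 0.1, 1.0, 2.1, 0.9, 0, 0]
best_label = recognize(envelope, templates)
```

Normalizing the templates keeps a long or loud template from winning purely on energy; whether the workgroup's implementation did this is not stated.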

6 Audio Workgroup FPAA/Mote – Word Recognition Status: FPAA – (we are using a new FPAA) 2nd-order sections synthesized, but a full auditory filter bank is not yet up. MOTE – real-time communication with Matlab and sampling operational.

7 Audio Workgroup Relational Network (Simple) [Diagram labels: X, Y, Z, M, M X, M Y, M Z, m] Patches of neurons, each measuring one quantity. Bidirectional relations for feedback/feedforward. Thanks to Rodney Douglas

8 Audio Workgroup Relational Network (example) Input here Relational Feedback Relational specification Relational feedback

9 Audio Workgroup ASR Relational Network Cochlea Delay Phone Recognizer Word Recognizer A patch of neurons (one of N output) Note: We don't know how to represent delays Phone Recognizer Bidirectional links enforce phoneme/word constraints

10 Audio Workgroup Relational Advantages Not an HMM: HMMs are great, but… Incorporate other knowledge: bottom-up perception, top-down word hypothesis. Hallucinate based on experience: hear "ba.." and know that bad, bat, bar, bass, band follow.
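The top-down word hypothesis on slide 10 can be illustrated with a small sketch. It matches on spelling rather than phonemes, and the lexicon entries beyond the five words on the slide are assumptions added to show filtering.

```python
# Lexicon: the first five words come from the slide; "dog" and "cat" are
# hypothetical extras so that the filter has something to reject.
LEXICON = ["bad", "bat", "bar", "bass", "band", "dog", "cat"]


def word_hypotheses(heard_prefix, lexicon):
    """Return the words that remain possible after hearing the given prefix."""
    return [word for word in lexicon if word.startswith(heard_prefix)]


candidates = word_hypotheses("ba", LEXICON)
```

In a relational network the same constraint would be expressed through bidirectional links between phoneme and word patches rather than a list lookup; this sketch only shows the constraint itself.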

11 Audio Workgroup Inner hair cells Silicon Cochlea Ganglion cells Basilar membrane high frequency low frequency (van Schaik, Liu, 2004) BASILAR MEMBRANE INNER HAIR CELLS GANGLION CELLS

12 Audio Workgroup Silicon Frequency Response Tone ramps into two cochleas

13 Audio Workgroup Cochlear Rate Profiles Left Cochlea Right Cochlea Spikes per utterance

14 Audio Workgroup Learning Algorithms Statistical SAS (Pick best channels for decision) Least squares (for software demo) Liquid State Machine Take input to high dimensions with spiking net Spike Timing Dependent Plasticity (STDP) Giacomo/Srinjoy Chip Brader/Fusi Vowel 1 Vowel 2 LSM Spiking Output
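The slide does not say how the best channels were picked. One simple possibility, shown here purely as an assumption, is to rank cochlear channels by how much their mean firing rate differs between the two classes. The rate values below are made up for illustration.

```python
def pick_best_channels(rates_a, rates_b, k):
    """Return indices of the k channels whose mean rates differ most between classes."""
    if len(rates_a) != len(rates_b):
        raise ValueError("both classes need one rate per channel")
    diffs = [abs(a - b) for a, b in zip(rates_a, rates_b)]
    ranked = sorted(range(len(diffs)), key=lambda i: diffs[i], reverse=True)
    return sorted(ranked[:k])


# Hypothetical mean firing rates (spikes/second) per channel for two vowels.
vowel_1_rates = [10, 50, 12, 80]
vowel_2_rates = [11, 20, 12, 30]
best_channels = pick_best_channels(vowel_1_rates, vowel_2_rates, k=2)
```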

15 Audio Workgroup Phoneme 1 Phoneme 2 Learning Chip Architecture Immediate Cochlea Plastic synapses Delayed Cochlea Phoneme 1 Cochlea Chip Learning Chip Neurons Relational Network Nonplastic synapses Excit. Inhib. Binary synaptic weights

16 Audio Workgroup Tone Results Tone recognition Spike input from silicon cochlea Training Two tones Duplicated input Positive and negative examples Testing

17 Audio Workgroup Phoneme recognition Spike input from silicon cochlea Training Two phonemes Duplicated inputs Positive and negative examples Testing Phoneme Results

18 Audio Workgroup Behind the Curtain

19 Audio Workgroup Hardware Overview Cochlea Learning Phoneme Word PCI-AER (for remapping) Cochlea Shih-Chii Liu Giacomo Indiveri Implemented in MATLAB

20 Audio Workgroup Infrastructure Difficulties Remapper: Because of problems with the AER mapper boards, the AER data from the silicon cochlea had to be remapped to the learning chip in Matlab (very slow). Power: Unpredictable problems caused by supply-voltage variations of as much as 1 V. Sharing chips: The learning chip had to be shared with two other workgroups. PC replacement
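The remapping step on slide 20 amounts to translating each address-event from the cochlea's address space to the learning chip's. The sketch below assumes events are (timestamp, address) pairs and that the mapping is a one-to-one lookup table; the actual event format and address layout are not given in the slides, and the addresses used here are hypothetical.

```python
def remap_events(events, address_map):
    """Translate (timestamp, source_address) events to target addresses.

    Events whose source address has no entry in the map are dropped.
    """
    remapped = []
    for timestamp, source in events:
        target = address_map.get(source)
        if target is not None:
            remapped.append((timestamp, target))
    return remapped


# Hypothetical cochlea events (timestamp in microseconds, channel address).
cochlea_events = [(10, 0), (12, 5), (15, 1)]
# Hypothetical map from cochlea channels to learning-chip synapse addresses.
cochlea_to_chip = {0: 100, 1: 101}
chip_events = remap_events(cochlea_events, cochlea_to_chip)
```

Doing this per event in software is the kind of loop that dedicated mapper hardware would perform in real time, which is consistent with the slide's "very slow" remark.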

21 Audio Workgroup Impedance Difficulties Cochlear firing rates. Cochlea: 6M spikes/second (30k channels, 200 spikes/second). Silicon Cochlea: 30k spikes/second (30 channels, 1k spikes/second). Learning Chip: 3k spikes/second (30 channels, 200 spikes/second). Dynamic range
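The aggregate rates on slide 21 are the product of channel count and per-channel rate, which the short check below reproduces. The dictionary keys are labels chosen for this sketch. For the learning chip the product of the stated per-channel figures is 6k spikes/second, while the slide lists 3k; the slides do not indicate which figure is intended.

```python
def total_rate(channels, spikes_per_channel):
    """Aggregate spike rate in spikes/second across all channels."""
    return channels * spikes_per_channel


# (channels, spikes/second per channel) as stated on the slide.
stages = {
    "biological cochlea": (30_000, 200),
    "silicon cochlea": (30, 1_000),
    "learning chip": (30, 200),
}
totals = {name: total_rate(c, r) for name, (c, r) in stages.items()}

for name, rate in totals.items():
    print(f"{name}: {rate:,} spikes/second")
```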

22 Audio Workgroup Simulation

23 Simulation 2

24 Audio Workgroup Simulation 3

25 Audio Workgroup Great Job! Student Members: Ismail Uysal, Yoojin Chung, Ramin Pichevar, Rich Hammett, Tarek Massoud, Ross Gaylor

26 Audio Workgroup

27 Silicon Cochlea Raster plot for two different tone inputs Mean firing rates for two different vowel inputs Channel Number Time in microseconds

28 Audio Workgroup Word Recognizer Four example raster plots (silence, A_, A_ with relational, AI)

29 Audio Workgroup Software Simulation

30 Audio Workgroup Software Simulation

31 Audio Workgroup Behind the Curtain

