Algorithms for Unsupervised and Online Learning of Hierarchy of Features for Tuning Cochlear Implants for the Hearing Impaired
Bonny Banerjee (1,2) (PI) and Lisa Lucks Mendel (3) (co-PI)
Institute for Intelligent Systems (1), Dept. of Electrical & Computer Engineering (2), School of Communication Sciences and Disorders (3), The University of Memphis, TN

ABSTRACT
Since noteworthy events happen only occasionally in any data, it is imperative for smart sensors to learn the norms in the data so that authorities can be alerted and appropriate action taken when an abnormal or noteworthy event occurs. The aim of this project is to develop algorithms that can learn the norm, in terms of a hierarchy of meaningful features, from data in an unsupervised and online manner. The application testbed is the problem of automatically tuning the cochlear implants (CIs) of patients with severe-to-profound hearing loss by continuously monitoring their speech output. The working hypothesis is that deficiencies in hearing for people with significant hearing loss are reflected in their speech production. This project will develop and use unsupervised, online, and biologically plausible machine learning algorithms to learn feature hierarchies from the speech output of severely-to-profoundly hearing-impaired patients. The feature hierarchy learned from a patient's speech will be compared to those learned from the speech of a comparable normal-hearing population, and deficiencies in the patient's hearing will be ascertained by identifying the missing or distorted features. Algorithms will then be developed to map this information into the signal processing strategies used in CIs to enhance the audibility of speech.

DATA COLLECTION
Subjects: N=30; ages 8–80 years; native speakers of English; able to read words and sentences.
Two groups of subjects:
1. Normal hearing: pure tone average < 20 dB HL, normal speech production
2. Hearing impaired: severe sensorineural hearing loss, poor speech production
Baseline assessments:
1. Audiometry (hearing assessment)
2. Speech perception assessment (MSTB)
3. Speech production assessment (CAPES)
4. Baseline articulation and phonological assessment
Reading data:
1. Nonsense syllables
2. Six passages and short stories

MODELS AND METHODS
Figure 1. The SELP framework runs a relentless cycle: detect an unexpected or Salient event, Explain the salient event, Learn from its explanation, and Predict future events. The cycle involves both the real-world input (varying in space and time) and the model's internal expectation of it, hence the name. (Banerjee et al., 2013a)
Figure 2. The SELP framework is implemented in a multilayered neural architecture (data at the bottom, then Layer 1 through Layer L). The circles denote nodes, the canonical computational units; each node consists of a layer of neurons, each of which responds to a unique invariant feature. Feedforward connections encode features, lateral connections encode local context among neighboring nodes, and feedback connections encode global context for the lower layers. (Dutta et al., 2012)
Figure 3. Inside a node are two layers (L1, L2) of neurons: L1 neurons encode features while L2 neurons encode transformations. The input layer L0 and layers L1, L2 are linked by the weights W(0,1), W(1,1), and W(1,2).
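To make the cycle in Figure 1 concrete, below is a minimal sketch of one SELP iteration over a streaming signal, assuming a single linear feature layer with winner-take-all learning. The surprise threshold, learning rate, and update rule are illustrative assumptions, not the published implementation (see Banerjee et al., 2013a for the actual framework).

    import numpy as np

    rng = np.random.default_rng(0)
    n_features, dim = 30, 10          # e.g., 30 features over 10-sample windows
    W = rng.standard_normal((n_features, dim)) * 0.1  # feedforward weights (features)
    lr, threshold = 0.01, 1.0         # illustrative learning rate and surprise threshold

    def selp_step(x, expected):
        """One Surprise-Explain-Learn-Predict iteration on input window x."""
        surprise = np.linalg.norm(x - expected)  # Surprise: mismatch with expected input
        if surprise > threshold:                 # only salient events trigger learning
            a = W @ x                            # Explain: activations of feature neurons
            winner = int(np.argmax(np.abs(a)))
            W[winner] += lr * (x - W[winner])    # Learn: pull the winning feature toward x
            W[winner] /= np.linalg.norm(W[winner])
        return W.T @ (W @ x)                     # Predict: expected input for the next step

    # Toy usage: feed a stream window by window.
    stream = rng.standard_normal(10000)
    expected = np.zeros(dim)
    for t in range(0, len(stream) - dim + 1, dim):
        expected = selp_step(stream[t:t + dim], expected)

Predicting the next window as the reconstruction of the current one is of course a simplification; in the full architecture of Figures 2 and 3, the lateral and feedback connections supply the local and global context for prediction.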
GENERATIVE FEATURE LEARNING
We experimented with learning generative features by treating the audio signal as a time series. Our model learned three layers of features from over ten hours of speech. First-layer and some second-layer features resemble the basis functions obtained with the Karhunen-Loève transform (Hermansky, 2003), while second- and third-layer features are similar to those obtained by the shift-invariant K-SVD algorithm (Mailhe et al., 2008). Unseen speech could be successfully reconstructed from the learned features at the different layers.
(Above) Audio features learned by our model in three layers. Column (a) shows 5 of the 30 features learned by the first layer. Column (b) shows a shift-invariant feature learned by the second layer. Column (c) shows 5 of the 100 features learned by the second layer. Column (d) shows 5 of the 150 features learned by the third layer. The features are 10, 100, and 500 samples long in the three layers (sampling frequency 8.82 kHz).
Features of size 10×10, 16×16, and 28×28 are learned from the 60,000 handwritten numerals in the MNIST dataset in the first, second, and third layers of our model.
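The reconstruction of unseen speech can be pictured as shift-invariant synthesis: the signal is approximated as a sum of learned features placed at arbitrary time offsets. Below is a hypothetical matching-pursuit-style sketch of that synthesis view; it illustrates the idea, not our model's actual inference procedure.

    import numpy as np

    def reconstruct(signal, features, n_iters=200):
        """Greedily approximate `signal` as a sum of shifted, scaled features.
        `features` is a list of 1-D arrays (e.g., 10-, 100-, and 500-sample
        atoms for our three layers)."""
        residual = signal.astype(float).copy()
        approx = np.zeros_like(residual)
        for _ in range(n_iters):
            best = None
            for f in features:
                corr = np.correlate(residual, f, mode='valid')  # match at every shift
                k = int(np.argmax(np.abs(corr)))
                strength = abs(corr[k]) / np.linalg.norm(f)     # normalized match quality
                if best is None or strength > best[0]:
                    best = (strength, corr[k] / np.dot(f, f), k, f)
            _, coef, k, f = best
            approx[k:k + len(f)] += coef * f   # place the scaled feature at the best shift
            residual[k:k + len(f)] -= coef * f
        return approx

    # Toy usage with random stand-ins for learned features:
    rng = np.random.default_rng(0)
    atoms = [rng.standard_normal(n) for n in (10, 100, 500)]
    speech = rng.standard_normal(4410)         # 0.5 s at 8.82 kHz
    approx = reconstruct(speech, atoms)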
TRANSFORMATION LEARNING
Following the seminal work of Hubel and Wiesel, it is widely accepted that complex cells in the primary visual cortex (V1) respond to transformation-invariant stimuli. Our model was exposed to videos recorded at different natural locations with a CCD camera mounted on the head of a cat exploring its environment (Betsch et al., 2004). These videos provided a continuous stream of stimuli similar to what the cat's visual system is naturally exposed to, preserving its temporal structure. The complex neurons in our model learned transformations from these videos, and their activations were then invariant to those transformations, akin to the responses of complex cells in V1 (Dutta and Banerjee, 2013). It is not clear, however, which transformations are learned in the auditory cortex (Linden and Schreiner, 2003); one of the goals of this project is to discover them.

FEATURE SELECTION
The discriminative ability of feature i for class j is encoded by the feature neuron and the feedforward weights (Banerjee et al., 2013b); this allows features to be selected for discrimination, as sketched below.
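One way to make that notion concrete: a feature is unique to a class when most of its activation mass comes from samples of that class, and such a score can be read directly off the feature-neuron responses. The sketch below illustrates this idea under that assumption; it is not the specific formulation of Banerjee et al. (2013b).

    import numpy as np

    def discriminability(activations, labels, n_classes):
        """activations: (n_samples, n_features) feature-neuron responses;
        labels: (n_samples,) class index per sample.
        Returns an (n_features, n_classes) matrix of scores in [0, 1]; a score
        near 1 means feature i responds almost exclusively to class j."""
        n_features = activations.shape[1]
        mass = np.zeros((n_features, n_classes))
        for j in range(n_classes):
            mass[:, j] = np.abs(activations[labels == j]).sum(axis=0)
        return mass / np.maximum(mass.sum(axis=1, keepdims=True), 1e-12)

    # Toy usage: pick the features that fire (almost) only for class 3.
    rng = np.random.default_rng(0)
    A, y = rng.random((600, 150)), rng.integers(0, 10, 600)
    scores = discriminability(A, y, n_classes=10)
    unique_to_class_3 = np.where(scores[:, 3] > 0.8)[0]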


REFERENCES
B. Banerjee, J. K. Dutta and J. Gu (2013a). SELP: A general-purpose framework for learning the norms from saliencies in spatiotemporal data. Neurocomputing: Special Issue on Brain Inspired Models of Cognitive Memory, Elsevier. [Accepted]
B. Banerjee, J. Gu and J. K. Dutta (2013b). Assigning uniqueness to generative features for discrimination. 17th Intl. Conf. on Cognitive and Neural Systems, June 4-7, 2013, Boston University, MA.
J. K. Dutta and B. Banerjee (2013). Learning complex cell receptive field properties by explaining simple cell responses over time. Intl. Joint Conf. on Neural Networks, August 4-9, 2013, Dallas, TX.
B. Banerjee (2012). Learning lateral connections among neurons from correlations of their surprises. Center for Visual Science's 28th Symposium: Computational Foundations of Perception and Action, June 1-3, 2012, University of Rochester, NY.
J. K. Dutta, J. Gu, R. P. Kasani and B. Banerjee (2012). A multilayered neural network model for verifying the common cortical algorithm hypothesis. Center for Visual Science's 28th Symposium: Computational Foundations of Perception and Action, June 1-3, 2012, University of Rochester, NY.
H. Hermansky (2003). TRAP-TANDEM: Data-driven extraction of temporal features from speech. IEEE Workshop on Automatic Speech Recognition and Understanding, number 50, Martigny, Switzerland.
B. Mailhe et al. (2008). Shift-invariant dictionary learning for sparse representations: Extending K-SVD. Proc. European Signal Processing Conf., Lausanne, Switzerland.
J. F. Linden and C. E. Schreiner (2003). Columnar transformations in auditory cortex? A comparison to visual and somatosensory cortices. Cerebral Cortex, 13(1):83-89.

ACKNOWLEDGMENT
We gratefully acknowledge support from the National Science Foundation through NSF CISE Grant No. 1231620. For enquiries, contact Dr. Bonny Banerjee, BonnyBanerjee@yahoo.com.