Perceptual and Neural Modeling Automatic Speech Attribute Transcription (ASAT) Project Sorin Dusan Center for Advanced Information Processing Rutgers University.


Perceptual and Neural Modeling
Automatic Speech Attribute Transcription (ASAT) Project
Sorin Dusan
Center for Advanced Information Processing, Rutgers University, Piscataway, NJ
Project Kickoff Meeting – Rutgers University

Can Automatic Speech Recognition Learn from Human Speech Perception?
• The human auditory system as a model (Geisler ’98, Warren ’99, Plomp ’02, LeDoux ’02)
• The neuro-cognitive process of speech perception is still not fully understood
• Auditory processing and speech perception are better understood today than years ago, thanks to technology advances: functional magnetic resonance imaging (fMRI), positron emission tomography (PET), and magnetoencephalography (MEG)
• Better models of speech perception that explain the data (e.g., FLMP, Oden & Massaro ’78; TRACE, McClelland & Elman ’86)
• Speech perception viewed as a process related to other perceptual processes (e.g., reading, Massaro ’87)
• Take an engineering look at recent findings from neuroscience and psychology about the auditory system and speech perception
Sorin Dusan, Sept. 13, 2004, NSF ASAT Project

Automatic Speech Recognition: from Sound to Words
• What are the possible levels of perceptual representation in speech: words, phonemes, features?
• Subword units are extremely appealing for ASR because they make modeling more efficient, but …
• Any kind of subword “unit” could reduce the accuracy of the sound-to-words mapping
• Is it possible to replace the phoneme? Is it the right time to dethrone the phoneme in speech processing?
[Figure: Neural Speech Processing. Labels: Sound, Features, Phonemes, Words]

Automatic Speech Recognition: from Sound to Words
Hypothesis 1:
• ASR can be seen simply as a mapping from acoustics to words, with no hard-coded intermediate units
• Can one build a system that maps sound or features directly to lexical representations? (Marslen-Wilson & Warren ’94)
• What are the architectural implications of such a mapping for the system (levels, complexity, processing time, etc.)?
[Figure: Speech Sound, Measurements, Phonological Features, Word 1, Word 2, Word 3, Word N. Complexity: 1 -> 2 -> 3 -> 4]

Automatic Speech Recognition: from Sound to Words
Hypothesis 2:
• Speech recognition could be a heterogeneous process that uses multiple types of phonological representation simultaneously (features, phonemes, diphones, syllables, words)
• Test this hypothesis by building a hybrid system that uses, for example, both features and phonemes, and compare its performance with that of the individual systems
• Add a top-down structure for integrating context and knowledge, using the same processing principle as the bottom-up structure (Plomp ’02, Massaro ’75)
[Figure: Speech Sound feeds a Feature-Based Recognizer, a Phoneme-Based Recognizer, and a Word-Based Recognizer; their outputs go to Fusion, which outputs Word 1, Word 2, Word N]
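
The slides do not say how the Fusion stage of Hypothesis 2 would combine the recognizers. As a rough illustration only, the Python sketch below assumes that each recognizer outputs a probability for every candidate word and that fusion is a weighted log-linear combination of those probabilities. The function name `fuse_word_scores`, the weights, the probability floor, and the example numbers are all hypothetical and are not taken from the ASAT project.

```python
import math


def fuse_word_scores(recognizer_scores, weights=None, floor=1e-6):
    """Combine per-recognizer word probabilities by weighted log-linear fusion.

    recognizer_scores: dict mapping recognizer name -> {word: probability}.
    weights: optional dict mapping recognizer name -> weight (default 1.0 each).
    floor: probability assigned to a word that a recognizer did not score
           (an assumption of this sketch, used to avoid log(0)).
    Returns a dict {word: fused probability} that sums to 1.
    """
    names = list(recognizer_scores)
    if weights is None:
        weights = {name: 1.0 for name in names}

    # Candidate words are the union of every recognizer's hypotheses.
    words = set()
    for scores in recognizer_scores.values():
        words.update(scores)

    # Weighted sum of log-probabilities for each candidate word.
    log_scores = {}
    for word in words:
        log_scores[word] = sum(
            weights[name] * math.log(max(recognizer_scores[name].get(word, 0.0), floor))
            for name in names
        )

    # Normalize back to probabilities (softmax, shifted for numerical stability).
    top = max(log_scores.values())
    unnormalized = {word: math.exp(score - top) for word, score in log_scores.items()}
    total = sum(unnormalized.values())
    return {word: value / total for word, value in unnormalized.items()}


# Made-up scores for one utterance from the three recognizers in the figure.
example_scores = {
    "feature_based": {"bat": 0.5, "pat": 0.4, "cat": 0.1},
    "phoneme_based": {"bat": 0.3, "pat": 0.6, "cat": 0.1},
    "word_based": {"bat": 0.6, "pat": 0.3, "cat": 0.1},
}

fused = fuse_word_scores(example_scores)
best_word = max(fused, key=fused.get)
print(best_word, round(fused[best_word], 3))
```

With equal weights this amounts to multiplying the three scores for each word and renormalizing, so a word favored by two recognizers can win even when the third disagrees. Other fusion rules (voting, a trained combiner) would fit the same interface.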

References
• Geisler, C. D., From Sound to Synapse, Oxford University Press, 1998
• LeDoux, J., Synaptic Self: How Our Brains Become Who We Are, New York, 2002
• Massaro, D. W., Speech Perception by Ear and Eye: A Paradigm for Psychological Inquiry, LEA Publishers, Hillsdale, London, 1987
• Marslen-Wilson, W. and Warren, P., “Levels of Perceptual Representation and Process in Lexical Access: Words, Phonemes, and Features”, Psychological Review, Vol. 101, Issue 4, 1994
• Massaro, D. W., Understanding Language – An Information Processing Analysis of Speech Perception, Reading, and Psycholinguistics, Academic Press, New York, 1975
• McClelland, J. L. and Elman, J. L., “The TRACE Model of Speech Perception”, Cognitive Psychology, Vol. 18, pp. 1-86, 1986
• Oden, G. C. and Massaro, D. W., “Integration of Featural Information in Speech Perception”, Psychological Review, Vol. 85, 1978
• Plomp, R., The Intelligent Ear, LEA Publishers, Mahwah, London, 2002
• Warren, R. M., Auditory Perception – A New Analysis and Synthesis, Cambridge University Press, 1999