Automatic Transcription of Natural Speech - A Broader Perspective. Dirk Van Compernolle, ESAT. Atranos Workshop, Leuven, 12 April 2002.

Presentation transcript:

Automatic Transcription of Natural Speech - A Broader Perspective
Dirk Van Compernolle, ESAT - KULeuven
Atranos Workshop, Leuven, 12 April 2002

The Goal of Speech Recognition
may be defined as “to allow for a natural and non-intrusive user option between either speech input or text input for any application”

Applications of Speech Recognition
INFORMATION RETRIEVAL
  – GOAL: to understand a spoken query
  – ROLE of ASR: “speech recognition” is a tool; full accuracy may not be required as long as the essential elements are understood
TRANSCRIPTION
  – GOAL: to make a written version of a spoken document
  – ROLE of ASR: “speech recognition” is a goal in itself
AUDIO MINING, SPEECH TRANSLATION
  – Are in theory different from their textual cousins
  – In practice use a combination of ‘speech recognition for transcription’ and text mining, text translation, ...
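The contrast between the two roles of ASR can be made concrete with a small sketch. It scores one recognizer output in two ways: by word error rate (a standard transcription metric that the slides do not name) and by recall of the task-relevant keywords. The sentences, the keyword set, and the function names are illustrative assumptions, not material from the talk.

```python
from typing import Iterable


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def keyword_recall(keywords: Iterable[str], hypothesis: str) -> float:
    """Fraction of task-relevant keywords that appear in the hypothesis."""
    wanted = {k.lower() for k in keywords}
    if not wanted:
        raise ValueError("keywords must not be empty")
    found = wanted & set(hypothesis.lower().split())
    return len(found) / len(wanted)


# Hypothetical spoken query and recognizer output (assumptions for illustration).
reference = "i would like a train from leuven to brussels tomorrow morning"
hypothesis = "i like the train from leuven to brussels tomorrow"
keywords = {"train", "leuven", "brussels", "tomorrow"}

wer = word_error_rate(reference, hypothesis)
recall = keyword_recall(keywords, hypothesis)
print(f"WER = {wer:.2f}, keyword recall = {recall:.2f}")
```

In this example the hypothesis contains three word errors against an eleven-word reference, yet every keyword survives: the output is imperfect as a transcription but may be sufficient for retrieval.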

Specialization in Speech Recognition
Is natural because the requirements depend on the application area
Is required because speech recognition technology is still far from perfect
Is achieved by limiting
  – Vocabulary size
  – Task complexity
  – Speaking style
  – Acoustic variability
  – Speaker variability
  – Language variability

Speech Recognition Problems
[Figure: application tasks (voice commands, name dialing, directory assistance, dialog systems, office dictation, natural conversation) plotted by vocabulary size against speaking style (isolated, connected, read, fluent, spontaneous), with an arrow marking the “more difficult” direction.]

Gradually increasing task complexity in research projects
Small vocabulary ( ’s)
  – isolated words, spelling
  – digit strings (TIdigits)
  – discrete dictation
Towards large vocabulary read speech ( )
  – medium vocabulary continuous tasks (RM, ATIS)
  – large vocabulary read speech (WSJ)
Towards unconstrained speech (1996-…)
  – Transcription of Broadcast news (ABN)
  – Mixed environmental conditions
  – Mixed speaking styles & speakers
  – Spontaneous speech (Switchboard, CallHome)

Research Benchmarks: Humans vs Machines

Why are we shifting the evaluation paradigm?
We prefer to do research on what is difficult, but not on what is impossible
The older (easier) tasks are more artificial in nature and reflect poorly how speech recognition will perform in real life
Progress is hard to measure once task performance becomes reasonable; industry should take over at this point
Implicit over-training on specific test material

Transcription vs. Speech Recognition
Speech recognition always provides a transcription, but the requirements placed on that transcription may vary:
  – Conceptual / Keyword transcription
  – Textual transcription
  – Transcription of non-verbal events
      Hesitations, background noise …
      Speaker turns …
      Intonation, hidden intentions …
  – Edited / Normalized transcription
      Correction of restarts, misspeaks, …
      Grammatical corrections
      Shortened transcription for closed captioning
      Markup for document layout
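As a rough illustration of the step from a textual to an edited / normalized transcription, the sketch below removes hesitation fillers and collapses immediately repeated words, a crude stand-in for correcting restarts. The filler list, the function name, and the example utterance are assumptions made for illustration; this is not the method used in any of the projects mentioned in the talk.

```python
import re

# Hypothetical filler inventory; a real system would need a language-specific list.
FILLERS = {"uh", "um", "uhm", "eh", "er"}


def normalize_transcript(verbatim: str) -> str:
    """Drop hesitation fillers and collapse immediately repeated words."""
    tokens = re.findall(r"[\w']+", verbatim.lower())
    cleaned = []
    for token in tokens:
        if token in FILLERS:
            continue  # hesitation: not part of the edited text
        if cleaned and cleaned[-1] == token:
            continue  # same word as the previous kept word: treat as a restart
        cleaned.append(token)
    return " ".join(cleaned)


verbatim = "uh I I think we we should um start the the meeting"
normalized = normalize_transcript(verbatim)
print(normalized)
```

The repetition rule is deliberately naive: it would also collapse legitimate repetitions such as "had had", which is one reason real normalization needs grammatical knowledge rather than string matching alone.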

Applications of ‘Transcription per se’
Transcription of ‘Dictated Speech’
  – Document Generation
      Commercial packages available for public at large and specific professional groups (doctors, lawyers)
      Cooperative users, controlled acoustic conditions
      Works well if sufficiently close to ‘read speech’
Transcription of ‘Available Speech’
  – Meeting Transcription
      Examples: parliament, court hearings, …
      Spontaneous & non-formal, multi-speaker
      Attention required to non-verbal information
  – Broadcast transcription
      Multi-speaker & multi-environment
      Read speech & spontaneous dialogues
      Attention required to specific usage

“Transcription of Natural Speech”
[Figure: the same chart of application tasks (voice commands, name dialing, directory assistance, dialog systems, office dictation, natural conversation) by vocabulary size and speaking style (isolated, connected, read, fluent, spontaneous), with an arrow marking the “more difficult” direction.]

Research Projects world-wide
DARPA sponsored projects on ‘broadcast transcription’
  – Started in 1995
  – Common test sets
  – < 10 labs typically participate in the official benchmark tests
  – English language dominated
‘Other’ projects
  – Many ‘national’ projects focusing on local languages
  – Focus on different aspects of the ‘transcription problem’
  – Examples today: ALERT, DRUID, ATRANOS