Wolfgang Wahlster German Research Center for Artificial Intelligence, DFKI GmbH Stuhlsatzenhausweg 3 66123 Saarbruecken, Germany phone: (+49 681) 302-5252/4162.

Presentation transcript:

Verbmobil: Multilingual Processing of Spontaneous Speech
Wolfgang Wahlster, German Research Center for Artificial Intelligence, DFKI GmbH, Stuhlsatzenhausweg 3, 66123 Saarbruecken, Germany. Phone: (+49 681) 302-5252/4162

Verbmobil Final Symposium, 30 July 2000 © Wolfgang Wahlster, DFKI GmbH

Mobile Speech-to-Speech Translation of Spontaneous Dialogs
As the name Verbmobil suggests, the system supports verbal communication with foreign dialog partners in mobile situations:
1. face-to-face conversations
2. telecommunication

Mobile Speech-to-Speech Translation of Spontaneous Dialogs
Solution: Conference Call. The Verbmobil Speech Translation Server is accessed by GSM mobile phones.

Verbmobil is a Multilingual System
It supports bidirectional translation between:
- German and English (American)
- German and Japanese
- German and Chinese (Mandarin): Siemens, Philips, FH Konstanz, 2 Chinese universities; final industrial demo at the end of 2000

Challenges for Language Engineering
Four dimensions of increasing complexity; Verbmobil addresses the last level named in each:
- Input Conditions: close-speaking microphone/headset with push-to-talk; telephone with pause-based segmentation; open microphone, GSM quality
- Naturalness: isolated words; read continuous speech; spontaneous speech
- Adaptability: speaker dependent; speaker independent; speaker adaptive
- Dialog Capabilities: monolog dictation; information-seeking dialog; multiparty negotiation

Context-Sensitive Speech-to-Speech Translation
Verbmobil Server examples:
- Wann fährt der nächste Zug nach Hamburg ab? / When does the next train to Hamburg depart?
- Wo befindet sich das nächste Hotel? / Where is the nearest hotel?
The German word "nächste" is rendered as "next" in one case and "nearest" in the other, depending on context.
Final Verbmobil Demos: CeBIT-2000 (Hannover), COLING-2000 (Saarbrücken), ECAI-2000 (Berlin)

Verbmobil: The First Speech-Only Dialog Translation System
Devices: mobile GSM phone, mobile DECT phone
German Speaker: Verbmobil (voice dialing)
(Connect to the Verbmobil Speech-to-Speech Translation Server)
Verbmobil: Willkommen beim Verbmobil-Sprachserver. Bitte sprechen Sie nach dem Piepton. [Welcome to the Verbmobil speech server. Please speak after the beep.]
German Speaker: Verbmobil, neuer Teilnehmer hinzufügen. [Verbmobil, add new participant.] (Speech command to initiate a conference call)
Verbmobil: Bitte sprechen Sie jetzt die Telephonnummer Ihres Gesprächspartners. [Please say the telephone number of your dialog partner now.]
German Speaker: 0681/
(Foreign participant is placed into the conference call)
Verbmobil, to German participant: Verbmobil hat eine neue Verbindung aufgebaut. Bitte sprechen Sie jetzt. [Verbmobil has set up a new connection. Please speak now.]
Verbmobil, to American participant: Welcome to the Verbmobil server. Please start your input after the beep.

Verbmobil II: Three Domains of Discourse
- Scenario 1, Appointment Scheduling: When? Focus on temporal expressions. Vocabulary size: 2500/6000
- Scenario 2, Travel Planning: When? Where? How? Focus on temporal and spatial expressions. Vocabulary size: 7000/10000
- Scenario 3, Remote PC Maintenance: What? When? Where? How? Integration of special sublanguage lexica. Vocabulary size: 15000/30000

The Control Panel of Verbmobil

From a Multi-Agent Architecture to a Multi-Blackboard Architecture

Verbmobil I: Multi-Agent Architecture (modules M1 to M6 connected directly)
- Each module must know which module produces what data
- Direct communication between modules
- Each module has only one instance
- Heavy data traffic for moving copies around
- Multiparty and telecooperation applications are impossible
- Software: ICE and ICE Master
- Basic platform: PVM

Verbmobil II: Multi-Blackboard Architecture (modules M1 to M6 connected via blackboards BB 1 to BB 3)
- All modules can register for each blackboard dynamically
- No direct communication between modules
- Each module can have several instances
- No copies of representation structures (word lattice, VIT chart)
- Multiparty and telecooperation applications are possible
- Software: PCA and Module Manager
- Basic platform: PVM
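The contrast between the two architectures can be illustrated with a minimal Python sketch. It is not the PCA/Module Manager software named on the slide; the class `Blackboard`, its methods `register` and `publish`, and the helper `make_parser` are hypothetical names chosen for illustration. The sketch only shows the three properties listed for Verbmobil II: modules register dynamically, never call each other directly, and share one copy of each representation structure.

```python
from typing import Callable, List


class Blackboard:
    """A named shared data pool; modules subscribe to it instead of calling each other."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.entries: List[object] = []
        self._subscribers: List[Callable[[object], None]] = []

    def register(self, callback: Callable[[object], None]) -> None:
        # Modules can register at any time (dynamic registration).
        self._subscribers.append(callback)

    def publish(self, item: object) -> None:
        # Store one shared reference; subscribers receive the same object, not a copy.
        self.entries.append(item)
        for notify in list(self._subscribers):
            notify(item)


def make_parser(label: str, out: Blackboard) -> Callable[[object], None]:
    """Build a toy parser module that reads one blackboard and writes to another."""
    def on_word_graph(graph: object) -> None:
        out.publish((label, graph))
    return on_word_graph


word_graph = Blackboard("word hypotheses graph")
vits = Blackboard("VITs")

# Three parser modules register for the same blackboard; none knows about the others.
for label in ("chunk parser", "statistical parser", "HPSG parser"):
    word_graph.register(make_parser(label, vits))

word_graph.publish(["let", "us", "meet"])
print([label for label, _ in vits.entries])
```

Adding a second instance of a module is just another `register` call, which is the property the slide contrasts with the one-instance limit of the multi-agent design.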

A Multi-Blackboard Architecture for the Combination of Results from Deep and Shallow Processing Modules
Blackboards and the modules shown with them:
- Audio Data: Command Recognizer, Spontaneous Speech Recognizer, Channel/Speaker Adaptation, Prosodic Analysis
- Word Hypotheses Graph with Prosodic Labels: Statistical Parser, Dialog Act Recognition, Chunk Parser, HPSG Parser
- VITs (Underspecified Discourse Representations): Semantic Construction, Robust Dialog Semantics, Semantic Transfer, Generation

Verbmobil as the First Dialog Translation System that Uses Prosodic Information Systematically at All Processing Stages
Input to the Multilingual Prosody Module: speech signal, word hypotheses graph
Prosodic features: duration, pitch, energy, pause
Uses of prosodic information by processing stage:
- Parsing: search space restriction
- Dialog Understanding: dialog act segmentation and recognition
- Translation: constraints for transfer
- Generation: lexical choice
- Speech Synthesis: speaker adaptation
Information passed to these stages: boundary information, sentence mood, accented words, prosodic feature vector

Integrating Shallow and Deep Analysis Components in a Multi-Blackboard Architecture
1. Augmented Word Hypotheses Graph
2. Chunk Parser, Statistical Parser and HPSG Parser produce partial VITs
3. Chart with a combination of partial VITs
4. Robust Dialog Semantics: combination and knowledge-based reconstruction of complete VITs
5. Complete and spanning VITs

Verbmobil's Massive Data Collection Effort
The so-called Partitur (German word for musical score) orchestrates fifteen strata of annotations:
- Transliteration Variant 1
- Transliteration Variant 2
- Lexical Orthography
- Canonical Pronunciation
- Manual Phonological Segmentation
- Automatic Phonological Segmentation
- Word Segmentation
- Prosodic Segmentation
- Dialog Acts
- Noises
- Superimposed Speech
- Syntactic Category
- Word Category
- Syntactic Function
- Prosodic Boundaries
Corpus size: 3,200 dialogs (182 hours) with 1,658 speakers; 79,562 turns distributed on 56 CDs, 21.5 GB

Extracting Statistical Properties from Large Corpora
Machine learning for the integration of statistical properties into symbolic models for speech recognition, parsing, dialog processing, translation.
Corpora:
- Transcribed Speech Data
- Segmented Speech with Prosodic Labels
- Annotated Dialogs with Dialog Acts
- Treebanks & Predicate-Argument Structures
- Aligned Bilingual Corpora
Learned models:
- Hidden Markov Models
- Neural Nets, Multilayered Perceptrons
- Probabilistic Automata
- Probabilistic Grammars
- Probabilistic Transfer Rules

VHG: A Packed Chart Representation of Partial Semantic Representations
Parsers feeding Semantic Construction:
- Chunk parser using cascaded finite-state transducers (Abney, Hinrichs)
- Statistical LR parser trained on treebank (Block, Ruland)
- Very fast HPSG parser (see two papers at ACL99, Kiefer, Krieger et al.)
Semantic Construction:
- Incremental chart construction and anytime processing
- Rule-based combination and transformation of partial UDRS coded as VITs
- Selection of a spanning analysis using a bigram model for VITs (trained on a treebank of 24 k VITs)

Robust Dialog Semantics: Deep Processing of Shallow Structures
Goals of robust semantic processing (Pinkal, Worm, Rupp):
- Combination of unrelated analysis fragments
- Completion of incomplete analysis results
- Skipping of irrelevant fragments
Method: transformation rules on the VIT Hypothesis Graph:
- Conditions on VIT structures
- Operations on VIT structures
The rules are based on various knowledge sources:
- lattice of semantic types
- domain ontology
- sortal restrictions
- semantic constraints
Results: 20% of the analyses are improved, 0.6% get worse.

Robust Dialog Semantics: Combining and Completing Partial Representations
Recognized: Let us meet the late afternoon to catch the train to Frankfurt
Completed: Let us meet (in) the late afternoon to catch the train to Frankfurt
The preposition "in" is missing in all paths through the word hypothesis graph.
A temporal NP is transformed into a temporal modifier using an underspecified temporal relation:
[temporal_np(V1)] -> [typeraise_to_mod(V1, V2)] & V2
The modifier is applied to a proposition:
[type(V1, prop), type(V2, mod)] -> [apply(V2, V1, V3)] & V3
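The two rules above can be mimicked in a few lines of Python. This is a toy sketch under strong assumptions: the `Fragment` class, its string-valued `content`, and the `combine` driver are hypothetical stand-ins, not the VIT data structures or the rule engine of the actual system. Only the rule names `typeraise_to_mod` and `apply` (here `apply_mod`) come from the slide.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Fragment:
    kind: str      # "prop", "temporal_np" or "mod" (simplified stand-ins for VIT types)
    content: str   # a readable placeholder for the semantic representation


def typeraise_to_mod(frag: Fragment) -> Optional[Fragment]:
    """Rule 1: turn a temporal NP into a modifier with an underspecified relation."""
    if frag.kind != "temporal_np":
        return None
    return Fragment("mod", f"temp_rel(?, {frag.content})")


def apply_mod(prop: Fragment, mod: Fragment) -> Optional[Fragment]:
    """Rule 2: apply a modifier to a proposition, yielding a new proposition."""
    if prop.kind != "prop" or mod.kind != "mod":
        return None
    return Fragment("prop", f"{prop.content} & {mod.content}")


def combine(fragments: List[Fragment]) -> List[Fragment]:
    # First raise every temporal NP, then fold each modifier into the fragment before it.
    raised = [typeraise_to_mod(f) or f for f in fragments]
    result: List[Fragment] = []
    for frag in raised:
        merged = apply_mod(result[-1], frag) if result else None
        if merged is not None:
            result[-1] = merged
        else:
            result.append(frag)
    return result


# "Let us meet" + "the late afternoon", with the preposition missing from the input.
partial = [Fragment("prop", "meet(we)"), Fragment("temporal_np", "late_afternoon")]
print(combine(partial))
```

The `?` in `temp_rel(?, ...)` marks the relation left underspecified because the preposition was never recognized.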

The Understanding of Spontaneous Speech Repairs
Original utterance: I need a car next Tuesday oops Monday
- Reparandum: Tuesday
- Hesitation (editing phase): oops
- Reparans (repair phase): Monday
Processing: recognition of substitutions, transformation of the word hypothesis graph
Result: I need a car next Monday
Verbmobil technology: understands speech repairs and extracts the intended meaning.
Dictation systems like ViaVoice, VoiceXpress, FreeSpeech and Naturally Speaking cannot deal with spontaneous speech and transcribe the corrupted utterances.
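The substitution in the example can be sketched on a plain token list. This is a deliberately simplified illustration, not the Verbmobil method, which operates on the word hypothesis graph: the set `EDITING_TERMS` is a hypothetical cue list, and the sketch assumes that the reparandum is exactly the one word before the hesitation.

```python
from typing import Iterable, List

# Hypothetical editing terms that signal a repair; a real system would learn such cues.
EDITING_TERMS = {"oops", "uh", "sorry"}


def repair(tokens: Iterable[str]) -> List[str]:
    """Drop each hesitation and the single word before it (the assumed reparandum)."""
    out: List[str] = []
    for tok in tokens:
        if tok.lower() in EDITING_TERMS:
            if out:
                out.pop()      # remove the reparandum
            continue           # remove the hesitation itself
        out.append(tok)        # the reparans and all other words are kept
    return out


utterance = "I need a car next Tuesday oops Monday"
print(" ".join(repair(utterance.split())))  # I need a car next Monday
```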

Integrating a Deep HPSG-based Analysis with Probabilistic Dialog Act Recognition for Semantic Transfer
Components:
- Probabilistic Analysis of Dialog Acts (HMM)
- Recognition of Dialog Plans (Plan Operators)
- HPSG Analysis
- Robust Dialog Semantics
- Semantic Transfer
Information exchanged between them: Dialog Act Type, Dialog Phase, VIT

Combining Statistical and Symbolic Processing for Dialog Processing
Dialog Module components: Statistical Prediction, Context Evaluation, Plan Recognition, Dialog Memory
Information handled: Main Propositional Content, Dialog Act, Dialog Act Predictions, Dialog Phase, Focus
Connected processing: Context Evaluation, Dialog-Act based Translation, Transfer by Rules, Generation of Minutes

Using Context and World Knowledge for Semantic Transfer
Example: Platz -> room / table / seat
1. Nehmen wir dieses Hotel, ja. (Let us take this hotel.) Ich reserviere einen Platz. (I will reserve a room.)
2. Machen wir das Abendessen dort. (Let us have dinner there.) Ich reserviere einen Platz. (I will reserve a table.)
3. Gehen wir ins Theater. (Let us go to the theater.) Ich möchte Plätze reservieren. (I would like to reserve seats.)
All other dialog translation systems translate word-by-word or sentence-by-sentence.
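The effect of dialog context on the choice for "Platz" can be shown with a small lookup sketch. It only illustrates the idea of consulting the preceding utterance; the cue table `SENSES`, the function `translate_platz` and the fallback value "place" are assumptions made for this example and say nothing about how Verbmobil's semantic transfer represents context or world knowledge.

```python
from typing import List, Set, Tuple

# Hypothetical cue words in the previous utterance, paired with an English sense.
SENSES: List[Tuple[Set[str], str]] = [
    ({"hotel"}, "room"),
    ({"abendessen", "restaurant"}, "table"),
    ({"theater", "kino"}, "seat"),
]


def translate_platz(previous_utterance: str, default: str = "place") -> str:
    """Pick an English rendering of 'Platz' from cue words in the preceding utterance."""
    words = {w.strip(".,!?").lower() for w in previous_utterance.split()}
    for cues, sense in SENSES:
        if cues & words:       # any cue word present in the context
            return sense
    return default             # no cue found: fall back to a generic rendering


for context in ("Nehmen wir dieses Hotel, ja.",
                "Machen wir das Abendessen dort.",
                "Gehen wir ins Theater."):
    print(context, "->", translate_platz(context))
```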

Automatic Generation of Multilingual Protocols of Telephone Conversations
Dialog translation by Verbmobil between a German dialog partner and an American dialog partner is followed by multilingual generation of protocols:
- HTML document in German, transferred by Internet or fax
- HTML document in English, transferred by Internet or fax

Integrating Deep and Shallow Processing: Combining Results from Concurrent Translation Threads
Input segments:
- Segment 1: If you prefer another hotel,
- Segment 2: please let me know.
Concurrent translation threads, each producing alternative translations with confidence values:
- Statistical Translation
- Dialog-Act Based Translation
- Semantic Transfer
- Case-Based Translation
Selection Module:
- Segment 1: translated by Semantic Transfer
- Segment 2: translated by Case-Based Translation
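The per-segment selection can be sketched as picking the candidate with the highest confidence value for each segment. This is a simplification: the sketch assumes that confidence values from different threads are directly comparable, and the confidence numbers and German output strings below are invented for illustration. The names `Candidate` and `select` are hypothetical.

```python
from typing import List, Tuple

# (translation thread, translation, confidence value)
Candidate = Tuple[str, str, float]


def select(segments: List[List[Candidate]]) -> List[Candidate]:
    """For every segment, keep the candidate with the highest confidence value."""
    return [max(candidates, key=lambda c: c[2]) for candidates in segments]


# Invented candidates for the two segments of the example utterance.
segment_1: List[Candidate] = [
    ("statistical", "Wenn Sie anderes Hotel vorziehen,", 0.61),
    ("semantic transfer", "Wenn Sie ein anderes Hotel bevorzugen,", 0.83),
    ("case-based", "Falls ein anderes Hotel,", 0.40),
]
segment_2: List[Candidate] = [
    ("statistical", "bitte lassen mich wissen.", 0.55),
    ("semantic transfer", "lassen Sie es mich wissen.", 0.70),
    ("case-based", "sagen Sie mir bitte Bescheid.", 0.90),
]

for thread, text, confidence in select([segment_1, segment_2]):
    print(f"{thread}: {text} ({confidence})")
```

With these invented numbers the result mirrors the slide: segment 1 comes from semantic transfer and segment 2 from case-based translation.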

Verbmobil: Long-Term, Large-Scale Funding and Its Impact
Funding by the German Ministry for Education and Research BMBF:
  Phase I ( ): $ 33 M
  Phase II ( ): $ 28 M
  60% industrial funding according to shared cost model: $ 17 M
  Additional R&D investments of industrial partners: $ 11 M
  Total: $ 89 M
Impact:
> 800 Publications (>600 refereed)
> Many Patents
> 17 Commercial Spin-off Products
> 6 Spin-off Companies
> 900 trained Researchers for German Language Industry
> Product Announcement for GSM version in 2001
Philips, DaimlerChrysler and Siemens are leaders in Spoken Dialog Applications

More than 80% of Verbmobil's Translations are Approximately Correct
- Large-Scale Web-based Evaluation: Translations, 65 Evaluators
- Sentence Length Words

Percentage of Approximately Correct Translation

Translation Thread             Word Accuracy 50%   Word Accuracy 75%   Word Accuracy 80%
                               (5069 Turns)        (3267 Turns)        (2723 Turns)
Case-based Translation         37%                 44%                 46%
Statistical Translation        69%                 79%                 81%
Dialog-Act based Translation   40%                 45%                 46%
Semantic Transfer                                  47%                 49%
Substring-based Translation    65%                 75%                 79%
Automatic Selection            57% / 78% *         66% / 83% *         68% / 85% *
Manual Selection               88%                 95%                 97%

* After Training with Instance-based Learning Algorithm

Checklist for Final Verbmobil System I
- Three domains: Appointment Scheduling, Travel Planning, PC Hotline
- Bi-directional and speaker-independent translation in the domains appointment scheduling and travel planning
- Translation pairs: German-English, German-Japanese
- Vocabulary size: for German, equivalent English lexicon, 2500 for Japanese
Operational success criteria:
- Word recognition rate (16 kHz):
  German: spontaneous 75% (cooperative 85%)
  English: spontaneous 72% (cooperative 82%)
  Japanese: spontaneous 75% (cooperative 85%)
- Word recognition rate (8 kHz): spontaneous 70% (cooperative 80%)
- 80% of the translations are approximately correct and the dialog task success rate should be around 90%.
- The average end-to-end processing time should be four times real time (length of the input signal).

Checklist for Final Verbmobil System II
- The system can work in the open microphone mode and cope with speech over GSM mobile phones.
- Verbmobil can be controlled by speech commands.
- A spelling mode is integrated into the speech recognizer.
- The speech recognizers can cope with simple non-speech input (like coughing).
- Spontaneous speech phenomena like repairs, hesitations and agreement failures can be handled.
- The language identification and speech recognition components are implemented as separate components.
- A three-party conference call with Verbmobil and a foreign partner can be initiated by one speaker.
- A high-quality speech synthesis for German and American English is realized.

Checklist for Final Verbmobil System III
- Prosodic information is used for input segmentation.
- Unknown words can be identified and processed.
- Robust semantic processing integrates partial analysis results of the competing parsing approaches.
- The selection of the translation result is based on a dynamic choice function based on confidence values computed by competing translation threads.
- Some translation ambiguities can be resolved by the exploitation of world and context knowledge, so that the translation quality is improved.
- Verbmobil can generate various forms of dialog protocols in German and English.

Results of the Verbmobil Project have been used in 17 Spin-Off Products by the Industrial Partners DaimlerChrysler, Philips and Siemens
- Dictation Systems: 3
- Spoken Dialog Systems: 4
- Dialog Engines: 2
- Command & Control Systems: 5
- Text Classification Systems: 1
- Translation Systems: 2

Successful Technology Transfer: 6 High-Tech Spin-Off Companies in the Area of Language Technology have been founded by Verbmobil Researchers
- CLT Sprachtechnologie GmbH, Saarbrücken: Language Technology for Text Processing
- RETIVOX GbR, Bonn: Speech Synthesis Systems
- XTRAMIND Technologies, Saarbrücken: Language Technology for Customer Interaction Services
- SYMPALOG GmbH, Nürnberg: Spoken Dialog Systems
- GSDC GmbH, Nürnberg: Multilingual Documentation
- SCHEMA GmbH, Nürnberg: Document Engineering

Verbmobil was the Key Resource for the Education and Training of Researchers and Engineers Needed to Build Up Language Industry in Germany
- Internships: 18
- Master Students: 238
- PhD Students: 164
- Student Research Assistants: 483
- Habilitations: 16
- Total: 919

SmartKom: Intuitive Multimodal Interaction
The SmartKom Consortium:
- DFKI Saarbrücken: main contractor, project management, testbed, software integration
- MediaInterface
- European Media Lab
- IMS Institut für Maschinelle Sprachverarbeitung, Universität Stuttgart
- Ludwig-Maximilians-Universität München
Project budget: $ 34 M
Project duration: 4 years

SmartKom-Public: A Multimodal Communication Booth
- Smartcard/credit card for authentication and billing
- Docking station for PDA/notebook/camcorder
- High-speed, broad-bandwidth Internet connectivity
- Loudspeaker
- Room microphone
- Face-tracking camera
- Virtual touchscreen protected against vandalism
- Multipoint video conferencing
- High-resolution scanner

SmartKom-Mobile: A Handheld Communication Assistant
- Camera
- GPS
- Microphone
- Loudspeaker
- Stylus-Activated Sketch Pad
- Wearable Compute Server
- Docking Station for Car PC
- Biosensor for Authentication & Emotional Feedback
- GSM for Telephone, Fax, Internet Connectivity

Verbmobil is a Very Large Dialog System
- 69 modules communicate via 224 blackboards
- HPSG for German uses a hierarchy of 2,400 types
- 15,385 entries in the semantic database
- 22,783 transfer rules and 13,640 microplanning rules
- 30,000 templates for case-based translation
- 691,583 alignment templates
- 334 finite-state transducers

Additional Information about Verbmobil during COLING
- 2 Tutorials:
  - Klüter/Reithinger: Verbmobil Development and Integration
  - Müller: HPSG
- 11 Presentations at main conference (regular papers and project notes):
  - Probabilistic Parsing
  - Tense Translation
  - Selection of Translation Results
  - Statistical Translation (4)
  - HPSG Parsing
  - Semantic Construction
  - Self Corrections
- Verbmobil Demos at the COLING exhibition

Conclusion I
Real-world problems in language technology, such as the understanding of spoken dialogs, speech-to-speech translation and multimodal dialog systems, can only be cracked by the combined muscle of deep and shallow processing approaches. In a multi-blackboard architecture based on packed representations on all processing levels (speech recognition, parsing, semantic processing, translation, generation), using charts with underspecified representations (e.g. UDRS), the results of concurrent processing threads can be combined in an incremental fashion.
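
The idea of combining results from several processing threads on a blackboard can be sketched in a few lines of Python. This is a single-threaded toy under stated assumptions, not the Verbmobil architecture: the `Blackboard` class, the module names, the span encoding and the analysis strings are all invented for illustration, and real concurrency is left out.

```python
from collections import defaultdict


class Blackboard:
    """A named store to which processing modules post partial results."""

    def __init__(self, name):
        self.name = name
        self.entries = []
        self.listeners = []

    def subscribe(self, callback):
        """Register a callback that is invoked for every newly posted entry."""
        self.listeners.append(callback)

    def post(self, source, span, analysis):
        """Record one partial result and notify all subscribers immediately."""
        entry = {"source": source, "span": span, "analysis": analysis}
        self.entries.append(entry)
        for callback in self.listeners:
            callback(entry)


def make_combiner(chart):
    """Return a listener that packs analyses of the same span into one chart cell."""
    def on_entry(entry):
        chart[entry["span"]].append((entry["source"], entry["analysis"]))
    return on_entry


# The chart maps a (start, end) word span to all analyses proposed for it.
chart = defaultdict(list)
parses = Blackboard("parses")
parses.subscribe(make_combiner(chart))

# Hypothetical shallow and deep modules post results as they become available.
parses.post("shallow_chunker", (0, 3), "NP[we] VP[meet]")
parses.post("deep_hpsg", (0, 3), "S(meet(we))")
parses.post("shallow_chunker", (3, 5), "PP[on monday]")
```

Each post updates the chart at once, so a consumer can work with the shallow result for a span before the deep analysis of the same span arrives.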

Conclusion II
All results of concurrent processing modules should come with a confidence value, so that a selection module can choose the most promising result at each processing stage. Packed representations together with formalisms for underspecification capture the uncertainties in each processing phase, so that the uncertainties can be reduced by linguistic, discourse and domain constraints as soon as they become applicable.
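
How a packed set of readings might shrink as constraints become applicable can be shown with a small Python sketch. It is only an illustration under assumptions: the readings of the English word "book", the two constraint functions and the rule of skipping a constraint that would eliminate every reading are all hypothetical choices, not taken from Verbmobil.

```python
def apply_constraints(readings, constraints):
    """Narrow a packed set of readings with the constraints available so far.

    A constraint that would eliminate every remaining reading is skipped,
    so the result is never emptied by a single over-strict constraint.
    """
    remaining = list(readings)
    for constraint in constraints:
        filtered = [r for r in remaining if constraint(r)]
        if filtered:
            remaining = filtered
    return remaining


# Hypothetical packed readings for the ambiguous word "book".
readings = [
    {"pos": "noun", "sense": "volume"},
    {"pos": "verb", "sense": "reserve"},
    {"pos": "verb", "sense": "record"},
]


def is_verb(reading):
    """Linguistic constraint (assumed): the word follows an infinitive marker."""
    return reading["pos"] == "verb"


def fits_travel_domain(reading):
    """Domain constraint (assumed): the dialog is about travel planning."""
    return reading["sense"] == "reserve"


after_syntax = apply_constraints(readings, [is_verb])
after_domain = apply_constraints(readings, [is_verb, fits_travel_domain])
```

The ambiguity stays packed until a constraint applies: the linguistic constraint alone leaves two verb readings, and the domain constraint then reduces them to one.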

Conclusion III
- Deep processing can be used for merging, completing and repairing the results of shallow processing strategies.
- Shallow methods can be used to guide the search in deep processing.
- Statistical methods must be augmented by symbolic models (e.g. class-based language modelling, word order normalization as part of statistical translation).
- Statistical methods can be used to learn operators or selection strategies for symbolic processes.
It is much more than a balancing act... (see Klavans and Resnik 1996)

Additional Results (not promised in the project proposal)
- English speech recognition for telephone input (DaimlerChrysler)
- Two additional translation engines: case-based (ALI, DFKI) and substring-based translation (LTrans, Siemens)
- An additional protocol mode (baseline protocol, DFKI)
Open Problems:
- Integrating top-down knowledge into basic speech recognition processes
- Exploiting more knowledge about human interpretation strategies
- More robust translation of turns with very low word accuracy rates
- More systematic use of expert knowledge about the domain of discourse

URL of this Presentation: