
1 Wolfgang Wahlster
German Research Center for Artificial Intelligence, DFKI GmbH, Stuhlsatzenhausweg 3, 66123 Saarbruecken, Germany
phone: (+49 681) 302-5252/4162, fax: (+49 681) 302-5341, e-mail: wahlster@dfki.de, WWW: http://www.dfki.de/~wahlster
Deep Processing of Shallow Structures: The Robust Integration of Speech, Language and Translation Technology for Intelligent Interface Agents
Ninth Conference of the European Chapter of the Association for Computational Linguistics, EACL'99, Bergen, June 10, 1999

2 Outline
1. Speech-to-Speech Translation: Challenges for Language Technology
2. A Multi-Blackboard Architecture for the Integration of Deep and Shallow Processing
3. Integrating the Results of Multiple Deep and Shallow Parsers
4. Packed Chart Structures for Partial Semantic Representations
5. Robust Semantic Processing: Merging and Completing Discourse Representations
6. Combining the Results of Deep and Shallow Translation Threads
7. The Impact of Verbmobil on German Language Industry
8. SmartKom: Integrating Verbmobil Technology Into an Intelligent Interface Agent
9. Conclusion

3 © W. Wahlster, DFKI Verbmobil: Challenges for Language Engineering
Increasing complexity along four dimensions (Verbmobil targets the most demanding end of each):
- Input Conditions: Close-Speaking Microphone/Headset, Push-to-talk → Telephone, Pause-based Segmentation → Open Microphone, GSM Quality
- Naturalness: Isolated Words → Read Continuous Speech → Spontaneous Speech
- Adaptability: Speaker Dependent → Speaker Independent → Speaker Adaptive
- Dialog Capabilities: Monolog Dictation → Information-seeking Dialog → Multiparty Negotiation

4 © W. Wahlster, DFKI Context-Sensitive Speech-to-Speech Translation
Verbmobil Server examples:
- Wann fährt der nächste Zug nach Hamburg ab? → When does the next train to Hamburg depart?
- Wo befindet sich das nächste Hotel? → Where is the nearest hotel?
Final Verbmobil demos: World Expo 2000 (Hannover), CeBIT 2000 (Hannover), COLING 2000 (Saarbrücken)

5 © W. Wahlster, DFKI Dialog Translation 1
- Wenn ich den Zug um 14 Uhr bekomme, bin ich um 4 in Frankfurt. → If I get the train at 2 o'clock, I am in Frankfurt at 4 o'clock.
- Am Flughafen könnten wir uns treffen. → We could meet at the airport.

6 © W. Wahlster, DFKI Dialog Translation 2
- Abends könnten wir Essen gehen. → We could go out for dinner in the evening.
- Wann denn am Abend? → What time in the evening?

7 © W. Wahlster, DFKI Dialog Translation 3
- Ich könnte für 8 Uhr einen Tisch reservieren. → I could reserve a table for 8 o'clock.

8 © W. Wahlster, DFKI Verbmobil II: Three Domains of Discourse
- Scenario 1: Appointment Scheduling (When?); focus on temporal expressions; vocabulary size: 2500/6000
- Scenario 2: Travel Planning & Hotel Reservation (When? Where? How?); focus on temporal and spatial expressions; vocabulary size: 7000/10000
- Scenario 3: PC-Maintenance Hotline (What? When? Where? How?); integration of special sublanguage lexica; vocabulary size: 15000/30000

9 © W. Wahlster, DFKI Verbmobil Partners
Universität des Saarlandes, Ruhr-Universität Bochum (Phase 2), Universität Hamburg, Universität Karlsruhe, Universität Bielefeld, Technische Universität München, Friedrich-Alexander-Universität Erlangen-Nürnberg, Universität Stuttgart, Rheinische Friedrich-Wilhelms-Universität Bonn, Ludwig-Maximilians-Universität München, TU Braunschweig, Eberhard Karls Universität Tübingen, DaimlerChrysler

10-20 © W. Wahlster, DFKI The Control Panel of Verbmobil (slides 10-20 show a sequence of screenshots of the Verbmobil control panel)

21 © W. Wahlster, DFKI From a Multi-Agent Architecture to a Multi-Blackboard Architecture
Verbmobil I (Multi-Agent Architecture):
- Each module must know which module produces what data
- Direct communication between modules
- Each module has only one instance
- Heavy data traffic for moving copies around
- Multiparty and telecooperation applications are impossible
- Software: ICE and ICE Master; basic platform: PVM
Verbmobil II (Multi-Blackboard Architecture):
- All modules can register for each blackboard dynamically
- No direct communication between modules
- Each module can have several instances
- No copies of representation structures (word lattice, VIT chart)
- Multiparty and telecooperation applications are possible
- Software: PCA and Module Manager; basic platform: PVM
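The architectural contrast can be made concrete with a small sketch. The following Python snippet is a hypothetical illustration of the multi-blackboard style only, not the actual PCA/Module Manager software: modules register dynamically for named blackboards and are notified of new entries, and they never address each other directly, so several instances of the same module type can subscribe without any change to the producers.

```python
# Minimal sketch (invented, not Verbmobil's PCA/Module Manager) of the
# multi-blackboard idea: producers post entries to named blackboards,
# consumers register callbacks; there is no direct module-to-module call.
from typing import Any, Callable

class Blackboard:
    def __init__(self, name: str):
        self.name = name
        self.entries: list[Any] = []                      # shared, append-only pool
        self.subscribers: list[Callable[[Any], None]] = []

    def register(self, callback: Callable[[Any], None]) -> None:
        """A module instance registers dynamically for this blackboard."""
        self.subscribers.append(callback)

    def post(self, entry: Any) -> None:
        """A producer posts a result; every registered consumer is notified."""
        self.entries.append(entry)
        for notify in self.subscribers:
            notify(entry)

boards: dict[str, Blackboard] = {}
def board(name: str) -> Blackboard:
    if name not in boards:
        boards[name] = Blackboard(name)
    return boards[name]

# Two parser instances consume the same word lattice and post partial results.
board("word_lattice").register(lambda lat: board("partial_vits").post(("chunk_parser", lat)))
board("word_lattice").register(lambda lat: board("partial_vits").post(("hpsg_parser", lat)))
board("partial_vits").register(lambda vit: print("new partial VIT from", vit[0]))

board("word_lattice").post({"words": ["wir", "treffen", "uns"]})
```

Because producers only post to a board, adding or duplicating consumers never requires touching the producers, which is what makes multiparty and telecooperation settings feasible.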

22 © W. Wahlster, DFKI A Multi-Blackboard Architecture for the Combination of Results from Deep and Shallow Processing Modules
Blackboards (shared representation levels): Audio Data; Word Hypothesis Graph with Prosodic Labels; VITs; Underspecified Discourse Representations
Modules: Command Recognizer; Spontaneous Speech Recognizer; Channel/Speaker Adaptation; Prosodic Analysis; Statistical Parser; Dialog Act Recognition; Chunk Parser; HPSG Parser; Semantic Construction; Robust Dialog Semantics; Semantic Transfer; Generation

23 © W. Wahlster, DFKI Integrating Shallow and Deep Analysis Components in a Multi-Blackboard Architecture
Augmented Word Lattice → Chunk Parser / Statistical Parser / HPSG Parser → partial VITs → Chart with a combination of partial VITs → Robust Dialog Semantics (combination and knowledge-based reconstruction of complete VITs) → Complete and Spanning VITs

24 © W. Wahlster, DFKI Extracting Statistical Properties from Large Corpora
Machine learning for the integration of statistical properties into symbolic models for speech recognition, parsing, dialog processing, and translation.
Annotated corpora: Transcribed Speech Data; Segmented Speech with Prosodic Labels; Annotated Dialogs with Dialog Acts; Treebanks & Predicate-Argument Structures; Aligned Bilingual Corpora
Learned models: Hidden Markov Models; Neural Nets, Multilayered Perceptrons; Probabilistic Automata; Probabilistic Grammars; Probabilistic Transfer Rules

25 © W. Wahlster, DFKI VHG: A Packed Chart Representation of Partial Semantic Representations
Producers of partial analyses:
- Chunk parser using cascaded finite-state transducers (Abney, Hinrichs)
- Statistical LR parser trained on a treebank (Block, Ruland)
- Very fast HPSG parser (see two papers at ACL'99; Kiefer, Krieger et al.) with semantic construction
Properties of the packed chart:
- Incremental chart construction and anytime processing
- Rule-based combination and transformation of partial UDRSs coded as VITs
- Selection of a spanning analysis using a bigram model for VITs (trained on a treebank of 24k VITs)
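As an illustration of how a spanning analysis might be selected from such a packed chart, here is a toy Python sketch: chart edges carry partial analyses from different producers over spans of the input, and a Viterbi-style search picks the best-scoring sequence of edges that covers the whole input. The edge scores and the "bigram" compatibility function are invented stand-ins; the real system scores adjacent VITs with a bigram model trained on a VIT treebank.

```python
# Hypothetical sketch of selecting a spanning sequence of chart edges.
# Scores and the "bigram" compatibility function are invented for illustration.
from __future__ import annotations
from dataclasses import dataclass

@dataclass
class Edge:
    start: int      # chart position where the partial analysis begins
    end: int        # chart position where it ends
    producer: str   # e.g. "chunk", "stat", "hpsg"
    score: float    # parser-internal confidence

def bigram(prev: Edge | None, nxt: Edge) -> float:
    """Toy compatibility score between adjacent edges (stand-in for the
    bigram model over VITs)."""
    if prev is None:
        return 0.0
    return 0.5 if prev.producer == nxt.producer else 0.2

def best_spanning(edges: list[Edge], n: int) -> tuple[float, list[Edge]]:
    """Viterbi-style search for the best sequence of edges covering 0..n."""
    best: dict[int, tuple[float, list[Edge]]] = {0: (0.0, [])}
    for pos in range(1, n + 1):
        candidates = []
        for e in edges:
            if e.end == pos and e.start in best:
                prev_score, prev_path = best[e.start]
                prev_edge = prev_path[-1] if prev_path else None
                candidates.append((prev_score + e.score + bigram(prev_edge, e),
                                   prev_path + [e]))
        if candidates:
            best[pos] = max(candidates, key=lambda c: c[0])
    return best.get(n, (float("-inf"), []))

edges = [Edge(0, 3, "hpsg", 2.0), Edge(0, 2, "chunk", 1.2),
         Edge(2, 5, "stat", 1.5), Edge(3, 5, "hpsg", 1.8)]
print(best_spanning(edges, 5))
```

Since every edge stays in the chart, the search can simply be rerun whenever a new partial result arrives, which fits the incremental, anytime setting described above.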

26 © W. Wahlster, DFKI Robust Dialog Semantics: Deep Processing of Shallow Structures
Goals of robust semantic processing (Pinkal, Worm, Rupp):
- Combination of unrelated analysis fragments
- Completion of incomplete analysis results
- Skipping of irrelevant fragments
Method: transformation rules on the VIT Hypothesis Graph, of the form "conditions on VIT structures → operations on VIT structures".
The rules draw on various knowledge sources: the lattice of semantic types, the domain ontology, sortal restrictions, and semantic constraints.
Results: in 20% of the cases the analysis is improved; in 0.6% it gets worse.

27 © W. Wahlster, DFKI Semantic Correction of Recognition Errors
German (as recognized): Wir treffen uns Kaiserslautern. (We are meeting Kaiserslautern.)
English (corrected translation): We are meeting in Kaiserslautern.

28 © W. Wahlster, DFKI Robust Dialog Semantics: Combining and Completing Partial Representations
The preposition 'in' is missing in all paths through the word hypothesis graph:
Let us meet the late afternoon to catch the train to Frankfurt → Let us meet (in) the late afternoon to catch the train to Frankfurt
A temporal NP is transformed into a temporal modifier using an underspecified temporal relation:
[temporal_np(V1)] → [typeraise_to_mod(V1, V2)] & V2
The modifier is then applied to a proposition:
[type(V1, prop), type(V2, mod)] → [apply(V2, V1, V3)] & V3
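A minimal sketch of how such a condition → operation rule pair could be applied, assuming a drastically simplified fragment representation: the predicate names follow the slide, but everything else, including the "in/at/on" rendering of the underspecified temporal relation, is invented for illustration.

```python
# Minimal sketch of condition -> operation rules over simplified semantic
# fragments. The rule names follow the slide; the fragment representation
# itself is invented and far simpler than a real VIT.
from dataclasses import dataclass

@dataclass
class Fragment:
    content: str
    sem_type: str   # e.g. "temporal_np", "mod", "prop"

def typeraise_to_mod(np: Fragment) -> Fragment:
    """[temporal_np(V1)] -> [typeraise_to_mod(V1, V2)] & V2:
    turn a temporal NP into a temporal modifier with an underspecified
    temporal relation (rendered here as 'in/at/on')."""
    return Fragment(f"in/at/on({np.content})", "mod")

def apply_mod(mod: Fragment, prop: Fragment) -> Fragment:
    """[type(V1, prop), type(V2, mod)] -> [apply(V2, V1, V3)] & V3."""
    return Fragment(f"{prop.content} & {mod.content}", "prop")

# "Let us meet the late afternoon ..." -- the preposition was lost in the
# word hypothesis graph, so the temporal NP is left unconnected.
prop = Fragment("meet(we)", "prop")
np = Fragment("late_afternoon", "temporal_np")

if np.sem_type == "temporal_np":   # rule condition
    mod = typeraise_to_mod(np)     # rule operation 1
    prop = apply_mod(mod, prop)    # rule operation 2
print(prop.content)                # meet(we) & in/at/on(late_afternoon)
```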

29 © W. Wahlster, DFKI The Understanding of Spontaneous Speech Repairs
Original utterance: "I need a car next Tuesday oops Monday"
- Reparandum: "Tuesday"; editing phase (hesitation): "oops"; repair phase (reparans): "Monday"
- Recognition of the substitution and transformation of the word hypothesis graph yield: "I need a car next Monday"
Verbmobil technology understands speech repairs and extracts the intended meaning. Dictation systems like ViaVoice, VoiceXpress, FreeSpeech, and Naturally Speaking cannot deal with spontaneous speech and transcribe the corrupted utterances.
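A toy sketch of the repair pattern on a flat word string (the real system operates on the word hypothesis graph and also exploits prosodic cues; the marker set and the substitution heuristic below are assumptions for illustration):

```python
# Toy sketch of the repair pattern: reparandum + editing phase + reparans.
# Flat word lists and a fixed marker set are assumed here for illustration.
EDIT_MARKERS = {"oops", "uh", "aeh", "no"}   # hypothetical hesitation markers

def resolve_repair(words: list[str]) -> list[str]:
    out = list(words)
    for i, w in enumerate(words):
        if w in EDIT_MARKERS and 0 < i < len(words) - 1:
            # Heuristic: the reparans (words[i+1]) substitutes for the
            # reparandum (words[i-1]); drop both reparandum and marker.
            out = words[:i - 1] + words[i + 1:]
            break
    return out

print(" ".join(resolve_repair(
    "I need a car next Tuesday oops Monday".split())))
# -> I need a car next Monday
```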

30 © W. Wahlster, DFKI Automatic Understanding and Correction of Speech Repairs in Spontaneous Telephone Dialogs
German: Wir treffen uns in Mannheim, äh, in Saarbrücken. (We are meeting in Mannheim, oops, in Saarbruecken.)
English: We are meeting in Saarbruecken.

31 © W. Wahlster, DFKI Integrating a Deep HPSG-based Analysis with Probabilistic Dialog Act Recognition for Semantic Transfer
- Probabilistic analysis of dialog acts (HMM) → dialog act type
- Recognition of dialog plans (plan operators) → dialog phase
- HPSG analysis and robust dialog semantics → VIT
- Semantic transfer consumes the VIT together with the dialog act type and dialog phase
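A minimal HMM-style sketch of dialog act recognition, with dialog acts as hidden states and utterances as observations; all transition and emission numbers below are invented toy values, whereas the real models are trained on annotated Verbmobil dialogs.

```python
# Toy HMM-style dialog act tagger: dialog acts are hidden states, utterances
# are observations. All probabilities are invented for illustration.
import math

ACTS = ["GREET", "SUGGEST", "ACCEPT"]
TRANS = {("<s>", "GREET"): 0.8, ("<s>", "SUGGEST"): 0.15, ("<s>", "ACCEPT"): 0.05,
         ("GREET", "GREET"): 0.2, ("GREET", "SUGGEST"): 0.7, ("GREET", "ACCEPT"): 0.1,
         ("SUGGEST", "GREET"): 0.1, ("SUGGEST", "SUGGEST"): 0.3, ("SUGGEST", "ACCEPT"): 0.6,
         ("ACCEPT", "GREET"): 0.1, ("ACCEPT", "SUGGEST"): 0.5, ("ACCEPT", "ACCEPT"): 0.4}
CUES = {"GREET": {"hello", "hi"}, "SUGGEST": {"how", "about"}, "ACCEPT": {"fine", "ok"}}

def emit(act: str, utterance: str) -> float:
    """Toy emission score: smoothed fraction of cue words in the utterance."""
    words = utterance.lower().split()
    return (sum(w in CUES[act] for w in words) + 0.1) / (len(words) + 0.1)

def viterbi(utterances: list[str]) -> list[str]:
    """Best sequence of dialog acts for a sequence of utterances."""
    paths = {a: (math.log(TRANS[("<s>", a)]) + math.log(emit(a, utterances[0])), [a])
             for a in ACTS}
    for utt in utterances[1:]:
        new = {}
        for a in ACTS:
            score, prev = max((paths[p][0] + math.log(TRANS[(p, a)]) +
                               math.log(emit(a, utt)), p) for p in ACTS)
            new[a] = (score, paths[prev][1] + [a])
        paths = new
    return max(paths.values(), key=lambda sp: sp[0])[1]

print(viterbi(["hello there", "how about Tuesday afternoon", "ok that is fine"]))
# -> ['GREET', 'SUGGEST', 'ACCEPT']
```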

32 © W. Wahlster, DFKI The Dialog Act Hierarchy Used for Planning, Prediction, Translation and Generation
Top-level classes: CONTROL_DIALOG, MANAGE_TASK, PROMOTE_TASK
Further dialog acts include: GREETING, INTRODUCE, POLITENESS_FORMULA, THANK, DELIBERATE, BACKCHANNEL, INIT, DEFER, CLOSE, REQUEST, SUGGEST, INFORM, FEEDBACK, COMMIT, REQUEST_SUGGEST, REQUEST_CLARIFY, REQUEST_COMMENT, REQUEST_COMMIT, GREETING_BEGIN, GREETING_END, DIGRESS, EXCLUDE, CLARIFY, GIVE_REASON, DEVIATE_SCENARIO, REFER_TO_SETTING, CLARIFY_ANSWER, FEEDBACK_NEGATIVE, REJECT, EXPLAINED_REJECT, FEEDBACK_POSITIVE, ACCEPT, CONFIRM

33 © W. Wahlster, DFKI Combining Statistical and Symbolic Processing for Dialog Processing
The dialog module combines statistical prediction of dialog acts, plan recognition, context evaluation, and a dialog memory. It provides the dialog act, dialog act predictions, dialog phase, focus, and main propositional content to dialog-act based translation, transfer by rules, and the generation of minutes.

34 © W. Wahlster, DFKI Learning of Probabilistic Plan Operators from Annotated Corpora
(OPERATOR-s-10523-6
  goal [IN-TURN confirm-s-10523 ?SLASH-3314 ?SLASH-3316]
  subgoals (sequence [IN-TURN confirm-s-10521 ?SLASH-3314 ?SLASH-3315]
                     [IN-TURN confirm-s-10522 ?SLASH-3315 ?SLASH-3316])
  PROB 0.72)
(OPERATOR-s-10521-8
  goal [IN-TURN confirm-s-10521 ?SLASH-3321 ?SLASH-3322]
  subgoals (sequence [DOMAIN-DEPENDENT accept ?SLASH-3321 ?SLASH-3322])
  PROB 0.95)
(OPERATOR-s-10522-10
  goal [IN-TURN confirm-s-10522 ?SLASH-3325 ?SLASH-3326]
  subgoals (sequence [DOMAIN-DEPENDENT confirm ?SLASH-3325 ?SLASH-3326])
  PROB 0.83)
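Each learned operator can be read as a "goal, ordered subgoals, probability" triple. A hypothetical sketch of that reading, with the probability estimated as the relative frequency of each expansion in an invented annotated corpus:

```python
# Hypothetical rendering of a probabilistic plan operator and of how its
# probability could be estimated as a relative frequency over an annotated
# corpus; names mirror the slide, the corpus counts are invented.
from dataclasses import dataclass
from collections import Counter

@dataclass
class PlanOperator:
    goal: str                  # e.g. "IN-TURN confirm"
    subgoals: tuple[str, ...]  # ordered expansion
    prob: float                # P(subgoals | goal), learned from the corpus

# Invented corpus of observed expansions of the goal "IN-TURN confirm".
observed = [("accept",), ("accept",), ("confirm",), ("accept", "confirm")]
counts = Counter(observed)
total = sum(counts.values())

operators = [PlanOperator("IN-TURN confirm", expansion, n / total)
             for expansion, n in counts.items()]
for op in operators:
    print(op)
```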

35 © W. Wahlster, DFKI Automatic Generation of Multilingual Protocols of Telephone Conversations
Dialog translation by Verbmobil between a German and an American dialog partner; multilingual generation of protocols; the resulting HTML document in English is transferred by Internet or fax to both partners.

36 © W. Wahlster, DFKI Automatic Generation of Minutes
A and B greet each other.
A: (INIT_DATE, SUGGEST_SUPPORT_DATE, REQUEST_COMMENT_DATE) I would like to make a date. How about the seventeenth? Is that ok with you?
B: (REJECT_DATE, ACCEPT_DATE) The seventeenth does not suit me. I'm free for one hour at three o'clock.
A: (SUGGEST_SUPPORT_DATE) How about the sixteenth in the afternoon?
B: (CLARIFY_QUERY, ACCEPT_DATE, CONFIRM) The sixteenth at two o'clock? That suits me. Ok.
A and B say goodbye.
Minutes generated automatically on 23 May 1999, 08:35:18 h

37 © W. Wahlster, DFKI The Control Panel of Verbmobil (screenshot)

38 © W. Wahlster, DFKI Integrating Deep and Shallow Processing: Combining Results from Concurrent Translation Threads
Four concurrent translation threads produce alternative translations with confidence values: Statistical Translation, Case-Based Translation, Dialog-Act Based Translation, and Semantic Transfer. A selection module then picks the best translation per segment, e.g.:
- Segment 1: "Wenn wir den Termin vorziehen," (translated by Semantic Transfer)
- Segment 2: "das würde mir gut passen." (translated by Case-Based Translation)
- A second example turn: Segment 1 "If you prefer another hotel," / Segment 2 "please let me know."

39 © W. Wahlster, DFKI A Context-Free Approach to the Selection of the Best Translation Result
SEQ := set of all translation sequences for a turn; Seq ∈ SEQ := a sequence of translation segments s_1, s_2, ..., s_n.
Input: each translation thread provides an online confidence value confidence(thread, segment) for every segment.
Task: compute normalized confidence values for a translated Seq:
CONF(Seq) = Σ_{segment ∈ Seq} Length(segment) * (alpha(thread) + beta(thread) * confidence(thread, segment))
Output: Best(SEQ) = {Seq ∈ SEQ | Seq is a maximal element of (SEQ, CONF)}
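The scoring scheme translates directly into code. In the sketch below the alpha/beta weights, segment lengths, and per-segment confidences are invented numbers; only the formula itself is taken from the slide.

```python
# Direct transcription of the selection formula; all numbers are invented.
def conf(seq, alpha, beta, length, confidence):
    """seq is a list of (segment, thread) pairs; returns
    CONF(Seq) = sum over segments of
    Length(segment) * (alpha(thread) + beta(thread) * confidence(thread, segment))."""
    return sum(length[seg] * (alpha[thr] + beta[thr] * confidence[(thr, seg)])
               for seg, thr in seq)

alpha = {"STAT": 0.1, "CASE": 0.3, "DIAL": 0.2, "SEMT": 0.4}
beta  = {"STAT": 0.8, "CASE": 0.9, "DIAL": 0.7, "SEMT": 1.0}
length = {"s1": 4, "s2": 3}
confidence = {("SEMT", "s1"): 0.7, ("CASE", "s2"): 0.9,
              ("STAT", "s1"): 0.6, ("STAT", "s2"): 0.5}

candidates = [[("s1", "SEMT"), ("s2", "CASE")],
              [("s1", "STAT"), ("s2", "STAT")]]
best = max(candidates, key=lambda seq: conf(seq, alpha, beta, length, confidence))
print(best, conf(best, alpha, beta, length, confidence))
```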

40 © W. Wahlster, DFKI Learning the Normalizing Factors Alpha and Beta from an Annotated Corpus
Turn := segment_1, segment_2, ..., segment_n
For each turn in a training corpus, all segments translated by one of the four translation threads are manually annotated with a score for translation quality.
For the sequence of n segments resulting in the best overall translation score, at most 4^n linear inequations are generated, so that the selected sequence scores better than all alternative translation sequences.
From the set of inequations for spanning analyses (≤ 4^n), the values of alpha and beta can be determined offline by solving the constraint system.
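One way such a constraint system could be solved offline is as a linear program: each annotated turn contributes inequations saying that the best-scored sequence must beat every alternative by some margin. The sketch below, with invented data, a single turn, and only two threads, uses scipy.optimize.linprog to find a feasible (alpha, beta) assignment; the actual Verbmobil procedure may differ.

```python
# Sketch, under invented data, of solving the alpha/beta constraints as a
# linear program; not necessarily the solver used in the original system.
import numpy as np
from scipy.optimize import linprog

THREADS = ["SEMT", "STAT"]

def coeff(seq, length, confid):
    """Coefficients of CONF(seq) w.r.t. the variables
    [alpha_SEMT, beta_SEMT, alpha_STAT, beta_STAT]."""
    v = np.zeros(2 * len(THREADS))
    for seg, thr in seq:
        i = 2 * THREADS.index(thr)
        v[i] += length[seg]                           # alpha(thread) term
        v[i + 1] += length[seg] * confid[(thr, seg)]  # beta(thread) term
    return v

length = {"s1": 4, "s2": 3}
confid = {("SEMT", "s1"): 0.7, ("SEMT", "s2"): 0.4,
          ("STAT", "s1"): 0.5, ("STAT", "s2"): 0.8}
best = [("s1", "SEMT"), ("s2", "STAT")]   # manually annotated best sequence
alts = [[("s1", "SEMT"), ("s2", "SEMT")],
        [("s1", "STAT"), ("s2", "STAT")],
        [("s1", "STAT"), ("s2", "SEMT")]]

margin = 0.1
A_ub = [coeff(alt, length, confid) - coeff(best, length, confid) for alt in alts]
b_ub = [-margin] * len(alts)              # CONF(alt) - CONF(best) <= -margin
res = linprog(c=np.ones(4), A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * 4)
print(res.x)   # one feasible (alpha, beta) assignment per thread
```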

41 © W. Wahlster, DFKI Example of a Linear Inequation Used for Offline Learning
Turn := Segment_1 Segment_2 Segment_3
Threads: Statistical Translation = STAT, Case-Based Translation = CASE, Dialog-Act Based Translation = DIAL, Semantic Transfer = SEMT.
If quality(CASE, Segment_1), quality(SEMT, Segment_2), quality(STAT, Segment_3) is optimal, then
Length(Segment_1) * (alpha(CASE) + beta(CASE) * confidence(CASE, Segment_1))
+ Length(Segment_2) * (alpha(SEMT) + beta(SEMT) * confidence(SEMT, Segment_2))
+ Length(Segment_3) * (alpha(STAT) + beta(STAT) * confidence(STAT, Segment_3))
>
Length(Segment_1) * (alpha(DIAL) + beta(DIAL) * confidence(DIAL, Segment_1))
+ Length(Segment_2) * (alpha(DIAL) + beta(DIAL) * confidence(DIAL, Segment_2))
+ Length(Segment_3) * (alpha(DIAL) + beta(DIAL) * confidence(DIAL, Segment_3))

42 © W. Wahlster, DFKI The Context-Sensitive Selection of the Best Translation
Using probabilities of dialog acts in the normalization process:
CONF(Seq) = Σ_{segment ∈ Seq} Length(segment) * (alpha(thread) + dialog-act(thread, segment) + beta(thread) * confidence(thread, segment))
e.g. Greet(Statistical_Translation, Segment) > Greet(Semantic_Transfer, Segment)
Suggest(Semantic_Transfer, Segment) > Suggest(Case_Based_Translation, Segment)
Exploiting meta-knowledge: if the semantic transfer thread resolves at least x disambiguation tasks, increase the alpha and beta values for semantic transfer (e.g. einen Termin vorziehen → prefer / give priority to / bring forward).
Observation: even on the meta-control level (selection module) a hybrid approach is advantageous.
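The two context-sensitive refinements fit into the same scoring function. In this sketch the dialog-act prior table, the disambiguation threshold, and the boost value are all invented; only the structure of the formula follows the slide.

```python
# Extends the earlier selection sketch with a dialog-act dependent prior per
# thread and a meta-knowledge boost for semantic transfer. Numbers are invented.
DIALOG_ACT_PRIOR = {("GREET", "STAT"): 0.3, ("GREET", "SEMT"): 0.1,
                    ("SUGGEST", "SEMT"): 0.4, ("SUGGEST", "CASE"): 0.2}

def conf_ctx(seq, alpha, beta, length, confidence, dialog_act,
             semt_disambiguations=0, threshold=2, boost=0.2):
    """CONF(Seq) with dialog-act prior; alpha/beta for semantic transfer are
    boosted when it reports at least `threshold` resolved disambiguation tasks
    (e.g. 'einen Termin vorziehen' -> prefer / give priority to / bring forward)."""
    total = 0.0
    for seg, thr in seq:
        a, b = alpha[thr], beta[thr]
        if thr == "SEMT" and semt_disambiguations >= threshold:
            a, b = a + boost, b + boost   # meta-knowledge adjustment
        prior = DIALOG_ACT_PRIOR.get((dialog_act[seg], thr), 0.0)
        total += length[seg] * (a + prior + b * confidence[(thr, seg)])
    return total

alpha = {"SEMT": 0.4, "STAT": 0.1}; beta = {"SEMT": 1.0, "STAT": 0.8}
length = {"s1": 4}; confidence = {("SEMT", "s1"): 0.7, ("STAT", "s1"): 0.6}
dialog_act = {"s1": "SUGGEST"}
print(conf_ctx([("s1", "SEMT")], alpha, beta, length, confidence, dialog_act,
               semt_disambiguations=3))
```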

43 © W. Wahlster, DFKI Verbmobil: Long-Term, Large-Scale Funding and Its Impact
Funding by the German Ministry for Education and Research (BMBF):
- Phase I (1993-1996): $33 M
- Phase II (1997-2000): $28 M
- 60% industrial funding according to a shared-cost model: $17 M
- Additional R&D investments of industrial partners: $11 M
- Total: $89 M
Impact:
- > 400 publications (> 250 refereed)
- Many patents
- > 10 commercial spin-off products
- Many new spin-off companies
- > 100 new jobs in the German language industry
- > 50 academics transferred to industry
- Philips, DaimlerChrysler and Siemens are leaders in spoken dialog applications

44 © W. Wahlster, DFKI Spoken Dialogs about Schedules
Fielded applications:
- Train schedules (German Railway System, DB): TABA (Philips) +49 241 60 40 20, OSCAR (DaimlerChrysler) +49 1805 99 66 22
- Flight schedules (Lufthansa): ALF (Philips) +49 1803 00 00 74
Technical challenges: phone-based dialogs, many proper names, clarification subdialogs

45 © W. Wahlster, DFKI Linguatronic: Spoken Dialogs with Mercedes-Benz
Example commands (microphone, push-to-talk switch): "Please call Doris Wahlster." "Open the left window in the back." "I want to hear the weather channel." "When will I reach the next gas station?" "Where is the next parking lot?"
- Speech control of: cellular phone, radio, windows / AC, route guidance system
- Option for S-, C-, and E-Class of Mercedes and BMW
- Speaker-independent; garbage models for non-speech (blinker, AC, wheels)

46 © W. Wahlster, DFKI The Architecture of the SmartKom Agent (cf. Maybury/Wahlster 1998)
- Input processing: media analysis (language, graphics, gesture, biometrics) and media fusion
- Interaction management: discourse modeling, user modeling, intention recognition, presentation design
- Representation and inference over the user model, discourse model, domain model, task model, and media models
- Application interface to information, applications, and people
- Output rendering: media design (language, graphics, gesture, animated presentation agent)

47 © W. Wahlster, DFKI SmartKom: A Transportable and Transmutable Interface Agent
Three instantiations sharing one kernel:
- SmartKom-Public: a multimodal communication booth
- SmartKom-Home/Office: a versatile agent-based interface
- SmartKom-Mobile: a handheld communication assistant
Kernel of the SmartKom interface agent: media analysis, interaction management, application management, media design

48 © W. Wahlster, DFKI SmartKom-Public: A Multimodal Communication Booth
Smartcard/credit card for authentication and billing; docking station for PDA/notebook/camcorder; high-speed and broad-bandwidth Internet connectivity; high-resolution scanner; loudspeaker; room microphone; face-tracking camera; virtual touchscreen protected against vandalism; multipoint video conferencing

49 © W. Wahlster, DFKI SmartKom-Mobile: A Handheld Communication Assistant
Camera; GPS; microphone; loudspeaker; stylus-activated sketch pad; wearable compute server; docking station for the car PC; biosensor for authentication & emotional feedback; GSM for telephone, fax, and Internet connectivity

50 © W. Wahlster, DFKI SmartKom-Home/Office: A Versatile Agent-based Interface
SpeechMike; virtual touchscreen; natural gesture recognition

51 © W. Wahlster, DFKI SmartKom: Intuitive Multimodal Interaction
The SmartKom Consortium: project budget $34 M, project duration 4 years.
DFKI Saarbrücken: main contractor, project management, testbed software integration.
Further partners include DaimlerChrysler (Ulm), European Media Lab (Heidelberg), MediaInterface (Dresden), Univ. of Munich, Univ. of Stuttgart, and Univ. of Erlangen; additional sites on the consortium map: Aachen and Berkeley.

52 Verbmobil: Translation of Spontaneous Speech
Japanese: hatsuka no gogo wa ii desu
German: Am Zwanzigsten, am Morgen wäre in Ordnung.
Speaker-independent, robust speech recognition over analog phone, ISDN, and GSM mobile phone.

53 © W. Wahlster, DFKI Conclusion
Real-world problems in language technology, such as the understanding of spoken dialogs, speech-to-speech translation, and multimodal dialog systems, can only be cracked by the combined muscle of deep and shallow processing approaches.
In a multi-blackboard architecture based on packed representations on all processing levels (speech recognition, parsing, semantic processing, translation, generation), using charts with underspecified representations (e.g. UDRS), the results of concurrent processing threads can be combined incrementally.

54 © W. Wahlster, DFKI Conclusion
All results of concurrent processing modules should come with a confidence value, so that a selection module can choose the most promising result at each processing stage.
Packed representations together with formalisms for underspecification capture the uncertainties in each processing phase, so that the uncertainties can be reduced by linguistic, discourse and domain constraints as soon as they become applicable.

55 © W. Wahlster, DFKI Conclusion
It is much more than a balancing act (see Klavans and Resnik 1996):
- Deep processing can be used for merging, completing and repairing the results of shallow processing strategies.
- Shallow methods can be used to guide the search in deep processing.
- Statistical methods must be augmented by symbolic models (e.g. class-based language modeling, word order normalization as part of statistical translation).
- Statistical methods can be used to learn operators or selection strategies for symbolic processes.

56 © W. Wahlster, DFKI Verbmobil's Scientific Advisory Board 1993-2000
Harry Bunt, Ron Kay, Stephan Euler, Martin Kay, Susan Armstrong, Dieter Huber, Herbert Reininger

57 URL of this Presentation: www.dfki.de/~wahlster/EACL

