Scanalu2002 23.5 Jan Alexandersson Experiences from large NLP Projects Jan Alexandersson German research center for Artificial Intelligence GmbH Stuhlsatzenhausweg.

1 Scanalu Jan Alexandersson Experiences from large NLP Projects Jan Alexandersson German research center for Artificial Intelligence GmbH Stuhlsatzenhausweg 3, Geb Saarbrücken Tel.: (0681)

2 Scanalu Jan Alexandersson Overview Introduction What was VerbMobil What is SmartKom Scaling Experiences from VerbMobil Conclusion

3 Scanalu Jan Alexandersson What was... ?

4 Scanalu Jan Alexandersson VerbMobil - What was it? Speech-to-speech translation system Robust processing of spontaneous dialogs Speaker independent (adaptive) Languages: English, German, Japanese Domains: Appointment scheduling, travel planning and hotel reservation, remote PC maintenance Summary of the dialogue automatically generated by the system The system mediates between two humans, it does not play an active role There is no control of the ongoing dialog by the system

5 Scanalu Jan Alexandersson The Verbmobil Partners

6 Scanalu Jan Alexandersson The Verbmobil Partners

7 Scanalu Jan Alexandersson Facts About the Project
23 participating institutions (in Verbmobil II), from Germany and the USA
Over 900 full-time employees and students involved over the whole duration
Funded by the German Ministry for Education and Science and the participating companies:
BMBF-Funding Phase I, – Mio. DM (31.6 Mio)
BMBF-Funding Phase II, Mio. DM (27 Mio)
Industrial investment I+II: 32.6 Mio. DM (16.5 Mio)
Related industrial R & D activities: ca. 20 Mio. DM (ca. 10 Mio)
Total: 168.6 Mio. DM (85.1 Mio)

8 Scanalu Jan Alexandersson Project Organization Verbmobil Consortium Group of Module Managers Head of System Integration Group A. Klüter Module Coordinator N. Reithinger Manager Module 1 Manager Module n... Verbmobil Advisory Board Scientific Management Scientific Head W. Wahlster Deputy Scientific Head A. Waibel Head of Project Management Group R. Karger DLR G. Klein Steering Committee German Federal Ministry for Research and Education

9 Scanalu Jan Alexandersson Input Conditions Naturalness Adaptability Dialog Capabilities Increasing Complexity Close-Speaking Microphone/Headset Push-to-talk Telephone, Pause-based Segmentation Isolated Words Read Continuous Speech Speaker Independent Speaker Dependent Monolog Dictation Information- seeking Dialog Open Microphone, GSM Quality Spontaneous Speech Speaker Adaptive Multiparty Negotiation Verbmobil Challenges for Language Engineering

10 Scanalu Jan Alexandersson Classification of Machine Translation Methods Syntactic Analysis Word Structure Word Structure Direct Translation Syntactic Transfer Semantic Transfer Interlingua Semantic Structure Semantic Analysis Semantic Generation Syntactic Generation Syntactic Structure Morphologic Analysis Morphologic Generation Source Language Target Language

11 Scanalu Jan Alexandersson The VerbMobil Case Syntactic Analysis Word Structure Word Structure Direct Translation Syntactic Transfer Semantic Transfer Interlingua Semantic Structure Semantic Analysis Semantic Generation Syntactic Generation Syntactic Structure Morphologic Analysis Morphologic Generation Source Language Target Language Speech Signal Prosodic Analysis Prosodic Annotation

12 Scanalu Jan Alexandersson The Graphical User Interface

13 Scanalu Jan Alexandersson Focuses of Speech Recognition in Verbmobil Robustness Multilinguality Large Vocabulary Daimler Chrysler RWTH Aachen University of Karlsruhe

14 Scanalu Jan Alexandersson General Speech Recognition Task German English Japanese Audio Signal, Recognizers, Word Hypotheses Graph: interface between acoustic and linguistic processing

15 Scanalu Jan Alexandersson Microphone 1 Microphone open Speech input Microphone 1 Microphone closed Microphone 1 Speech output Pause detection Synchronization Translation Open Microphone Approach

16 Scanalu Jan Alexandersson What Linguistic Analysis Really Needs Syntactic Boundaries He saw ? the man ? with the telescope Prosody cannot help Dialog Act Boundaries No, I have no time at all on Thursday. D But how about on Friday? Dialog acts are pragmatic units that chunk the input into units which can be processed alone. Prosodic Syntactic Boundaries Of course ? not ? on Saturday Syntactic boundaries that correlate to the acoustic-phonetic reality; help during analysis within one chunk/dialog act. Important in spontaneous speech with elliptical utterances.

17 Scanalu Jan Alexandersson Prosody in Verbmobil Speech Signal Word Hypotheses Graph Multilingual Prosody Module Prosodic features: F0 duration energy.... Search Space Restriction Parsing Dialog Act Segmentation and Recognition Dialog Understand. Constraints for Transfer Translation Lexical Choice Generation Speech Synthesis Speaker Adaptation Boundary Information Boundary Information Boundary Information Boundary Information Sentence Mood Sentence Mood Accented Words Accented Words Prosodic Feature Vector

18 Scanalu Jan Alexandersson Facts about Repairs in the Verbmobil Corpus 21% of all turns in the Verbmobil corpus ( turns ) contain at least one self correction The syntactic category is preserved in most cases (For example: Out of a sample of 266 verb replacements, 224, about 84%, are again mapped to verbs) Repairs take place in a restricted context (in 98% of cases the reparandum consists of fewer than 5 words) Repair sequences follow certain regularities

19 Scanalu Jan Alexandersson The Understanding of Spontaneous Speech Repairs I need a car next Tuesday oops Monday Original Utterance Editing Phase Repair Phase Reparandum Editing Term Reparans Recognition of Substitutions Transformation of the Word Hypotheses Graph I need a car next Monday
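
The substitution repair on this slide (reparandum "Tuesday", editing term "oops", reparans "Monday") can be illustrated with a small sketch. It is not the Verbmobil repair module: it assumes a flat token list instead of a word hypotheses graph, a hypothetical set of editing terms, and a reparandum of fixed length, all chosen only for illustration.

```python
# Hypothetical editing terms; the real inventory is not given in the slides.
EDITING_TERMS = {"oops", "äh"}

def apply_repair(tokens, reparandum_len=1):
    """Drop the reparandum and the editing term, keeping the reparans.

    Assumes the reparandum is the `reparandum_len` words directly before
    the first editing term (an assumption made for this sketch).
    """
    for i, tok in enumerate(tokens):
        if tok in EDITING_TERMS and i >= reparandum_len and i + 1 < len(tokens):
            start = i - reparandum_len  # first word of the reparandum
            return tokens[:start] + tokens[i + 1:]
    return tokens  # no repair found: leave the utterance unchanged
```

Applied to the tokens of "I need a car next Tuesday oops Monday", the function returns the tokens of "I need a car next Monday", matching the transformed utterance on the slide.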

20 Scanalu Jan Alexandersson Architecture of Repair Processing On Thursday I cannot no I can meet äh after one

21 Scanalu Jan Alexandersson Multiple Approaches Mono-cultural approaches are dangerous –humans vs. viruses diversity –Microsoft vs. ILOVEYOU and copycats alternative software solutions Some sources of errors in a speech translation system –external spontaneous speech: not well formed, hesitations, repairs bad acoustic conditions human dialog behavior –internal knowledge gaps in modules software errors probabilistic processing Use multiple engines, varying approaches on various stages of processing

22 Scanalu Jan Alexandersson Exclusive alternatives: three different 16 kHz German speech recognizers with various capabilities Competing approaches: –three parsers: HPSG, Chunk, Statistical –five translation tracks: case-based, dialog-act based, statistical, substring- based, linguistic (deep) semantic translation Needed: selection and combination of results from competing tracks –parsers: combination of partial analyses in the semantic processing modules –translation: pre-selection module Multiple Approaches in Verbmobil

23 Scanalu Jan Alexandersson Multiple Translation Tracks - Approaches and Advantages Case-based: –Approach: uses examples from the aligned bilingual Verbmobil corpus –Advantage: good translation if input matches example in corpus Dialog-act based: –Approach: extract core intention (dialog act) and content –Advantage: robust wrt. recognition errors Statistical –Approach: use statistical language and translation models –Advantage: guaranteed translation with high approximate correctness Substring- based –Approach: combines statistical word alignment with precomputation of translation chunks and contextual clustering –Advantage: guaranteed translation with high approximate correctness Linguistic (deep) semantic translation –Approach: classic approach using semantic transfer –Advantage: high quality translation in case of success

24 Scanalu Jan Alexandersson Example Based Translation Task: Providing a translation based on translation templates and partial linguistic analysis Input: WHGs or best Hypothesis Method: Definite Clause Grammar (DCG), graph matching algorithms Result: Translation and a confidence value Benefit: Improving Verbmobil's translation capabilities through an additional translation path Responsible: DFKI, Kaiserslautern

25 Scanalu Jan Alexandersson Dialog-Act Based Translation Task: Robustly provide a translation of core intentions and contents of the domain Input: Prosodically annotated best hypothesis (flat WHG) Method: Statistical dialog-act classifier and Finite State Transducers Result: Translation and a confidence value, additionally content descriptions for the dialog module Benefit: Robust translation and content extraction even when the recognition is erroneous Responsible: DFKI, Saarbrücken

26 Scanalu Jan Alexandersson Statistical Translation Task: Provide approximately correct translations Input: Prosodically annotated best hypothesis (flat WHG) Method: Use statistical language and translation models Result: Translation and a confidence value Benefit: Approximately correct translation for spontaneous speech Responsible: RWTH Aachen

27 Scanalu Jan Alexandersson Deep Translation Task: Provide high quality translations Input: Prosodically annotated WHG and contextual information Method: Use syntactic and semantic approaches to analysis, transfer, and generation Result: Translation containing content information, suited for high quality speech synthesis Benefit: Delivers the highest quality, but is sensitive to recognition errors and spontaneous speech phenomena Responsible: Siemens AG, DFKI Saarbrücken, Universität Tübingen, Universität des Saarlandes, Universität Stuttgart, TU Berlin, CSLI Stanford
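
Several of the tracks above return a translation together with a confidence value, and the earlier slide names a selection module for the competing tracks. A minimal sketch of such a selection step follows. It assumes that confidence values are directly comparable across tracks and that the highest value wins; the slides do not state how the actual selection module decides, and the `Candidate` type and `select_translation` function are hypothetical names.

```python
from typing import NamedTuple, Optional

class Candidate(NamedTuple):
    track: str          # e.g. "case-based", "statistical", "deep"
    translation: str    # empty string if the track produced nothing
    confidence: float   # assumed comparable across tracks

def select_translation(candidates, threshold=0.0) -> Optional[Candidate]:
    """Return the non-empty candidate with the highest confidence, or None."""
    usable = [c for c in candidates
              if c.translation and c.confidence >= threshold]
    return max(usable, key=lambda c: c.confidence, default=None)
```

A track that fails (for instance the deep track on an utterance with recognition errors) simply contributes an empty translation and is skipped, so a more robust track can still deliver a result.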

28 Scanalu Jan Alexandersson Modules Involved Integrated processing comprises – search through the WHG – statistical parser – chunk parser Semantic Construction provides VITs from statistical and chunk parser output Deep Analysis: HPSG Parser Dialog Semantics: combination of parsing results, and semantic resolution Transfer: VIT to VIT transfer Generation: TAG generation from VITs Dialog+Context: provides contextual information

29 Scanalu Jan Alexandersson The Multi-Parser Approach Verbmobil uses three different syntactic parsers: an HPSG parser, a chunk parser, and a probabilistic LR parser. Each parser offers a different level of parsing accuracy, depth of syntactic analysis, and robustness of the analysis process. –Chunk parser: Most robust but least accurate analysis –HPSG parser: Most accurate but least robust analysis –Probabilistic parser: Level of accuracy and robustness between HPSG and chunk parser

30 Scanalu Jan Alexandersson Integrated Processing Gets WHGs for the English, German, or Japanese speech input and dispatches WHG information to the three parsers Provides an A* search algorithm that allows any connected parser to find the best scored path using –acoustic score of the speech recognizer –Verbmobil trigram language model Parsers analyze the same utterance simultaneously
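
The best-path search over a word hypotheses graph can be sketched as follows. This is an illustration under stated assumptions, not the Verbmobil implementation: it uses uniform-cost search (A* with a zero heuristic), treats scores as costs where lower is better, and takes the trigram language model as a caller-supplied `lm_cost` function. The graph layout and all names are hypothetical.

```python
import heapq

def best_path(edges, start, goal, lm_cost):
    """Find the cheapest word sequence from start to goal.

    edges:   {node: [(next_node, word, acoustic_cost), ...]}
    lm_cost: function (w1, w2, w3) -> cost of w3 after history w1 w2
    Returns (total_cost, words) or None if the goal is unreachable.
    """
    # The search state includes the last two words, because a trigram
    # model needs that much history to score the next word.
    queue = [(0.0, start, ("<s>", "<s>"), [])]
    seen = set()
    while queue:
        cost, node, hist, words = heapq.heappop(queue)
        if node == goal:
            return cost, words
        if (node, hist) in seen:
            continue
        seen.add((node, hist))
        for nxt, word, acoustic in edges.get(node, []):
            step = acoustic + lm_cost(hist[0], hist[1], word)
            heapq.heappush(
                queue, (cost + step, nxt, (hist[1], word), words + [word]))
    return None
```

Because the combined cost adds the acoustic and language-model terms on every edge, an acoustically attractive but linguistically unlikely word can lose against a slightly worse-sounding alternative.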

31 Scanalu Jan Alexandersson HPSG Processing Task: Thorough syntactic analysis Input: Word chains from integrated processing Method: Apply HPSG analysis Result: Source language VITs Benefit: Delivers the highest quality, but is sensitive to recognition errors and spontaneous speech phenomena Responsible: DFKI Saarbrücken, CSLI Stanford

32 Scanalu Jan Alexandersson The Result is a Syntactic Tree Alright, and that should get us there about nine in the evening.

33 Scanalu Jan Alexandersson... but analysis is not always spanning The train arise at seven thirty. We could take a cab it to the hotel problem train station.

34 Scanalu Jan Alexandersson Semantic Construction Task: Convert and extend syntax trees to VITs Input: Syntax tree from statistical and chunk parsers Method: Compositional construction using semantic lexicon Result: VITs Benefit: Providing results of shallow parser to the deep analysis track Responsible: Universität Stuttgart (IMS)

35 Scanalu Jan Alexandersson Schematic Processing Lexicon access and interpretation of the grammatical roles Intermediate representation: Application Tree Compositional semantic construction Intermediate representation: VIT Non compositional semantic construction using transfer rule engine Intermediate representation: Resulting VIT Input: Syntactic tree

36 Scanalu Jan Alexandersson Dialog Semantics Task: Combining results from various parsers, reinterpret and correct VITs, and resolve non- local ambiguities Input: VITs from different parsers Method: VIT models and rule based approaches Result: VIT ready for transfer Benefit: Enhances robustness of deep analysis and provides vital information for transfer Responsible: Universität des Saarlandes, Saarbrücken

37 Scanalu Jan Alexandersson Combining Analyses from Various Parsers Parsers deliver VITs for segments of a turn May be spanning analyses or just partial fragments Combination is necessary, both of analyses from one parser and of analyses from different parsers Combination criteria –HPSG parser is better than statistical parser, which is better than chunk parser –Integrated results are better than fragments –Longer results are better than short ones
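
The three combination criteria can be read as a preference ordering over candidate analyses. The sketch below applies them lexicographically in the order in which the slide lists them; that ordering, the tuple layout, and the names are assumptions made for illustration, since the slide does not say how the criteria are weighed against each other.

```python
# Higher rank means preferred, following the slide's first criterion.
PARSER_RANK = {"hpsg": 2, "statistical": 1, "chunk": 0}

def preference_key(analysis):
    """analysis is a tuple (parser, integrated, length).

    parser:     "hpsg", "statistical" or "chunk"
    integrated: True for an integrated result, False for a fragment
    length:     number of words covered
    """
    parser, integrated, length = analysis
    return (PARSER_RANK[parser], integrated, length)

def best_analysis(analyses):
    """Pick the most preferred analysis for one segment."""
    return max(analyses, key=preference_key)
```

With this ordering, a short HPSG fragment still wins over a long chunk-parser result; swapping the tuple elements in `preference_key` gives a different trade-off.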

38 Scanalu Jan Alexandersson Semantic Based Transfer Task: Transfer VITs from the source to the target language Input: VITs Method: Rule based transfer Result: VITs for generation Benefit: Translate VITs inside the deep translation path Responsible: Universität Stuttgart (IMS)

39 Scanalu Jan Alexandersson Context Evaluation Task: Resolving ambiguities in the dialog context during semantic transfer Input: Requests from transfer Method: Using world knowledge and rules Result: disambiguated transfer requests Benefit: Higher quality of transfer results Responsible: Technical University (TU) Berlin

40 Scanalu Jan Alexandersson Dialog Processing Task: Provides dialog context for all tracks and computes main information for dialog summaries Input: Data from a lot of modules Method: Frame-like topic structuring and rules Result: context information and dialog summaries and minutes Benefit: Verbmobil knows what happens throughout the dialog and can present it Responsible: DFKI, Saarbrücken

41 Scanalu Jan Alexandersson Probabilistic Analysis of Dialog Acts (HMM) Probabilistic Analysis of Dialog Acts (HMM) Recognition of Dialog Plans (Plan Operators) Recognition of Dialog Plans (Plan Operators) Dialog Act Dialog Phase Syntactic Analysis Robust Dialog Semantics Robust Dialog Semantics VIT Semantic Transfer Semantic Transfer Dialog Act Dialog Information in Semantic Transfer

42 Scanalu Jan Alexandersson The Intentional Structure DA Level Move Level Game Level Phase Level Dialogue Level VM_Dialogue PH_Greet G_Greet M_Greet PH_Nego G_Nego Greet Feedback Pol_Form Introduce G_Nego RequestSuggest AABB Reject Speaker M_Tr_InitM_InitM_Resp

43 Scanalu Jan Alexandersson Collaboration for a New Functionality: Summaries Provide the users with a summary of the topics that were agreed Two benefits –have a piece of information to use in calendars etc. –control the translation Approach: exploit already existing modules for –content extraction –dialog interpretation –planning the summary –generation –transfer

44 Scanalu Jan Alexandersson Summaries Dialog module keeps track of the dialog: dialog model, context extraction, translations: dialog history Three types of documents: Minutes: relevant exchanges Summary: dialog results Scripts: complete dialog script

45 Scanalu Jan Alexandersson Multilingual Summaries Multilinguality: Integration of transfer module: German Summary (HTML) Context Syndialog Dialog VM-PROTO GENGER Transfer (G E) VM-PROTO GENENG English Summary (HTML) Document structure VITs

46 Scanalu Jan Alexandersson Result Summary

47 Scanalu Jan Alexandersson Generation Task: Robustly generate the output of the semantic transfer in German, English, or Japanese Input: VITs from transfer Method: Constraint system for micro- planning, TAG grammar (reusing HPSG grammars) for syntactic realization Result: Strings, enriched with content-to-speech (CTS) information to support synthesis Benefit: Output from the semantic transfer track Responsible: DFKI, Saarbrücken

48 Scanalu Jan Alexandersson Multiple Translation Tracks – Approx. correct translation case based statistical DA based Sem. based Substring Selection (Man) Selection (Learning) Selection (Manual) case based statistical DA based Sem. based Substring Selection (Automatic) Selection (Learning) Selection (Manual) WA > 50%WA > 75%WA > 80%

49 Scanalu Jan Alexandersson Verbmobil – The Book There are over 600 refereed papers on the various aspects of and achievements in Verbmobil. Wolfgang Wahlster (ed.): " Verbmobil: Foundations of Speech-to-Speech Translation " Springer-Verlag Berlin Heidelberg New York. 679 Pages ISBN

50 Scanalu Jan Alexandersson What is... ?

51 Information, Applications, People User(s) User Modeling Discourse Management Intention Recognition Interaction Management Mode Analysis Language Graphics Gesture Sound Media Input Processing Media Output Rendering Reference Architecture for Multimodal Systems Context Management Expectation Management User ID Biometrics Application Interface Integrate Respond Request Terminate Initiate T A V G G Mode Coordination Presentation Design Multimodal Reference Resolution Multimodal Fusion A A V G G Mode Design Language Graphics Gesture Sound Animated Presentation Agent Select Content Design Allocate Coordinate Layout User Model Discourse Model Domain Model Media Models Task Model Representation and Inference, States and Histories Application Models Context Model Reference Resolution Action Planning 2 Nov Dagstuhl Seminar Fusion and Coordination in Multimodal Interaction edited by: M. Maybury

52 Scanalu Jan Alexandersson User specifies goal delegates task cooperate on problems asks questions presents results Service 1 Service 2 Service 3 IT Services Personalized Interaction Agent Situated Delegation-oriented Dialog Paradigm: Collaborative Problem Solving

53 Scanalu Jan Alexandersson The Main Modules on the Control GUI

54 Scanalu Jan Alexandersson More About the System Modules realized as independent processes Not all must be there (critical path: speech or graphic input to speech or graphic output) (Mostly) independent from display size Pool Communication Architecture (PCA) based on PVM for Linux and NT –Modules know only about their I/O pools –Literature: Andreas Klüter, Alassane Ndiaye, Heinz Kirchmann: Verbmobil From a Software Engineering Point of View: System Design and Software Integration. In Wolfgang Wahlster: Verbmobil - Foundation of Speech-To- Speech Translation. Springer, Data exchanged using M3L documents All modules and pools are visualized here...

55 Scanalu Jan Alexandersson The Real Story

56 Scanalu Jan Alexandersson Frame Languages Object-oriented Modeling Primitives Frame Languages Object-oriented Modeling Primitives NL/MM-Semantics More formal Semantics Subsumption, Inferences NL/MM-Semantics More formal Semantics Subsumption, Inferences W3C Standards XML Schema/DTDs W3C Standards XML Schema/DTDs M3L The Glue - M3L: XML based Multimodal Markup Language Domain Knowledge NL/MM Representation Pool..... XML schema

57 Scanalu Jan Alexandersson Validation of Dialogue Systems Analysis Generator Database DM ASR Synthesis Dialogue model Project ValDia (DFKI – DaimlerChrysler ULM) Tool for validation of Dialogue Models/Managers (DM) Automatic Manual

58 Scanalu Jan Alexandersson Validation of DM Even slight changes can make test suites for DM invalid (but not for parser, recognizer, …) Put persons in front of the complete system + We will eventually find errors -It is time consuming -For some scenarios impossible to exhaustively validate a DM -What module failed to perform its task? -Combination of errors? the whole system has to be put together

59 Scanalu Jan Alexandersson Validation of DM ValDia approach: Replace test person and I/O modules with ValDia Database DM Analysis Generator ASR Synthesis Dialogue model

60 Scanalu Jan Alexandersson Experiences ValDia detects errors Logical: –Combination of greet and request leads to goal conflict in DM – DM hangs! Technical: –After about 500 dialogues the DM crashed due to erroneous memory handling

61 Scanalu Jan Alexandersson What is Scalability ?

62 Scanalu Jan Alexandersson What is Scale (-able)? WordNet (1.6):
–Noun scaling has 3 senses
(grading) the act of arranging in a graduated series
act of measuring, arranging or adjusting according to a scale
ascent by or as if by a ladder
–Verb scale has 8 senses
measure by or as if by a scale; "This bike scales only 25 pounds"
pattern, make,... or estimate according to some rate or standard
take by attacking with scaling ladders
(surmount) -- reach the highest point of
climb up by means of a ladder
scale, descale -- remove the scales from; "scale fish"
measure with or as if with scales; "scale the gold"
size or measure according to a scale

63 Scanalu Jan Alexandersson Scaling what/how? Bigger Better Faster Robuster Precision Coverage Multilinguality Cheaper Depth

64 Scanalu Jan Alexandersson Coverage Linguistic constructions Domain, Task, Application Sub-Languages, Type of Lang. Multilingual, Culture Interaction style SIZE Robustness Depth Speed

65 Scanalu Jan Alexandersson Who are we scaling for? EU NSF BMBF Industry... Basic research Research Prototypes Applied research / Product development ``Real´´ Systems

66 Scanalu Jan Alexandersson Experiences VerbMobil ``Many´´ people have said: –With persons on one spot I would make a VerbMobil of my own. But muuuuuch better/cheaper/... This is not true! –Software engineering –Ex: Speech recognition -93: –Single word recognition –Push-to-talk -00: –Open microphone –Spontaneous Speech

67 Scanalu Jan Alexandersson The VerbMobil Corpus
3,200 dialogs (G: 1,454, E: 726, J: 1,020)
1,658 speakers (G: 1,022, E: 202, J: 434)
79,562 turns (G: 41,512, E: 16,104, J: 21,946)
1,520,000 running words (G: 670,000, E: 270,000, J: 580,000)
181.6 hours were recorded (G: 96.1, E: 37.9, J: 47.7) using –a close microphone, –a room microphone and –a telephone

68 Scanalu Jan Alexandersson The VerbMobil Corpus transcribed and distributed on –56 CDs (21.5 GB) Analyzing the corpus: –206,000 instances of articulatory background noise, –85,000 instances of breathing and –35,000 hesitations (voiced: 19,000, nasal: 2,500, vocalic-nasalized: 13,500) The Verbmobil data are distributed to research or commercial users via the Bavarian Archive of Speech Signals (BAS)

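69 Scanalu Jan Alexandersson Experiences from WOZ
GER142: danach könnten wir gemeinsam Abendessen gehen [afterwards we could go to dinner together]
SIM143: Bitte wiederholen Sie Ihre Äußerung. Es ist ein Fehler in der semantischen Verarbeitung aufgetreten [Please repeat your utterance. An error occurred in semantic processing]
GER144: oh, danach könnten wir gemeinsam abendessen [oh, afterwards we could have dinner together]
SIM145: Bitte wiederholen Sie Ihre Äußerung mit anderen Wörtern. Die semantische Verarbeitung war nicht erfolgreich [Please repeat your utterance using different words. Semantic processing was not successful]
GER146: äh, okay
ENG147: maybe a bit louder ?
GER148: yes, I invite you for the dinner.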

70 Scanalu Jan Alexandersson Development HPSG Starting point: HPSG for written G/E Goal: Lexical Entries for spont. spoken G/E Schema: (V1.0)

71 Scanalu Jan Alexandersson Development HPSG What factors contributed to progress? –Getting to know the challenge Spontaneous/Spoken vs Written Language –Finding a Suitable Formalism –Tools –Interface Verbmobil Interface Term (VIT) –Compilation Techniques –Test Suites –Corpora

72 Scanalu Jan Alexandersson Well Defined Interfaces Speech Recognition – Linguistic Modules: –Word Hypothesis Graph (WHG) Between (deep) Linguistic Modules –VerbMobil Interface Term (VIT) Linguistic Modules – Synthesizer –Annotated String (Concept-to-Speech)

73 Scanalu Jan Alexandersson Verbmobil From a Software Engineering Point of View System Design and Software Integration

74 Scanalu Jan Alexandersson Software Technology Challenges The goal Build an integrated system The situation Researchers do research Using different programming languages Researchers don't want to be bothered with technical details The solution Introducing: the System Group Maximal technical support for the researchers/developers

75 Scanalu Jan Alexandersson The System Architecture
[Diagram: modules M1 to M6 and blackboards BB 1 to BB 3]
Verbmobil I: Multi-Agent Architecture
–Modules know all communication partners
–Direct communication between modules
–Reconfiguration difficult
–Software: ICE and ICE Master
–Basic Platform: PVM
Verbmobil II: Multi-Blackboard Architecture
–Modules know their I/O data pools
–No direct communication between modules
–198 blackboards vs direct comm. paths
–Reconfiguration easy
–Several instances of one module/functionality
–Software: PCA and Module Manager
–Basic Platform: PVM
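
The key property of the multi-blackboard design is that a module knows only the pools it reads from and writes to, never its communication partners. The sketch below illustrates that idea in a single process. It is not the PCA or PVM API; the `PoolBoard` class, its methods, and the pool names are hypothetical.

```python
from collections import defaultdict

class PoolBoard:
    """A toy set of named data pools with subscribe/publish access."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, pool, callback):
        """Register a module callback for everything written to `pool`."""
        self._subscribers[pool].append(callback)

    def publish(self, pool, data):
        """Write `data` to `pool`; every subscribed module receives it."""
        for callback in list(self._subscribers[pool]):
            callback(data)
```

A parser module would, for example, subscribe to a "whg" pool and publish its result to a "vit" pool. Replacing the recognizer, or running a second parser on the same pool, then needs no change in any other module, which is what makes reconfiguration easy.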

76 Scanalu Jan Alexandersson Audio Data Word Hypotheses Graph with Prosodic Labels VITs Underspecified Discourse Representations Command Recognizer Spontaneous Speech Recognizer Channel/Speaker Adaptation Prosodic Analysis Statistical Parser Dialog Act Recognition Chunk Parser HPSG Parser Semantic Construction Robust Dialog Semantics Semantic Transfer Generation Sample Pool Structure

77 Scanalu Jan Alexandersson Distributed Execution Supports Distributed Development server 2 server 1 controlling terminal User 2 User 1 Pool Communication Architecture

78 Scanalu Jan Alexandersson Support from the System Group: (1) Integration framework (Testbed) with common communication mechanism for all used programming languages (C, C++, Lisp, Prolog, Java, Fortran, Tcl/Tk) Narrow interface for all used programming languages Overall system control infrastructure Standards on various levels –Installation –Compilation –Communication formats between modules –... Toolbox for recording, replaying, testing, inspecting data exchanged between modules,...

79 Scanalu Jan Alexandersson The Testbed is the Integration Framework for the Verbmobil System PCA Visualization Manager Automatic Test Module Synchronization Module User Command Mapper Arbitration of Concurrent Modules GUI Testbed Manager

80 Scanalu Jan Alexandersson The GUI Visualization and Debug Tool.... and much more

81 Scanalu Jan Alexandersson Assure high system stability and robustness in connection with large-scale testing audio modules, testbed acoustic modules 2 Weeks parsers and shallow translation modules 2 Weeks linguistic modules and synthesis 2 Weeks system delivery Weeks integration and stabilization phase Support from the System Group (2): Regular Integration Cycles

82 Scanalu Jan Alexandersson Development at different sites Communication via and FTP Server: –UPLOAD Software for integration –EXCHANGE Exchanging software between developers –ALPHA Service New integrated complete system Support from the System Group (3): The FTP Server

83 Scanalu Jan Alexandersson What contributed to the success of VerbMobil?

84 Scanalu Jan Alexandersson Important Contributions Multiple approaches Management Meetings –Project meetings, Work Shops,... Corpus collection - Massive amounts of data for –Testing, Linguistic Phenomena, Annotation System Group –Test bed, Integration Cycles,... Time The Internet...

85 Scanalu Jan Alexandersson Conclusion We still need: –a lot of manpower: Researchers Software engineers Management –a lot of data: annotate learn from All this costs a lot of money The Holy Grail of NLP (too?): Self learning systems

86 Scanalu Jan Alexandersson Thank you very much for your attention!
