
1 Analysis for Spoken Language Translation Using Phrase-Level Parsing and Domain Action Classification
Chad Langley
Language Technologies Institute, Carnegie Mellon University
June 9, 2003

2 Outline
Interlingua-Based Machine Translation
NESPOLE! MT System Overview
Interchange Format Interlingua
Hybrid Analysis Approach
Evaluation
–Domain Action Classification
–End-to-End Translation
Summary

3 Interlingua-Based Machine Translation
[Diagram: language-specific analyzers map source languages (Arabic, Chinese, English, French, German, Italian, Japanese, Korean, Spanish) into a single interlingua, and generators map the interlingua back out to each target language]

4 Interlingua-Based MT at Carnegie Mellon
Long line of research in interlingua-based machine translation of spontaneous conversational speech:
–C-STAR I (appointment scheduling)
–Enthusiast (passive Spanish → English)
–C-STAR II (travel planning)
–LingWear (wearable tourist assistance)
–Babylon (handheld medical assistance)
–NESPOLE! (travel & tourism and medical assistance)

5 NESPOLE! Overview
Human-to-human speech-to-speech machine translation over the Internet
Domains:
–Travel & Tourism
–Medical Assistance
Languages:
–English – Carnegie Mellon University
–German – Universität Karlsruhe
–Italian – ITC-irst
–French – Université Joseph Fourier
Additional partners:
–AETHRA Telecommunications
–APT Trentino Tourism Board

6 NESPOLE! Architecture
A mediator connects users to the translation servers
Language-specific servers for each language exchange the Interchange Format to perform translation

7 NESPOLE! Language Servers
Analysis chain: Speech → Text → IF
Generation chain: IF → Text → Speech
To translate, connect the source-language analysis chain to the target-language generation chain

8 NESPOLE! User Interface
[Screenshot of the NESPOLE! user interface]

9 Interchange Format Overview
The Interchange Format (IF) is a shallow semantic interlingua for task-oriented domains
Captures speaker intention rather than literal meaning
Abstracts away from language-specific syntax and predicate-argument structure
Represents utterances as sequences of Semantic Dialogue Units (SDUs)

10 Interchange Format Representation
An IF representation consists of four parts:
1. Speaker
2. Speech Act
3. Concepts
4. Arguments
speaker : speech_act +concept* (arguments*)
The Domain Action combines the domain-independent speech act and the domain-dependent concepts
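The four-part layout above can be illustrated with a small parser. This is a sketch for exposition only, not the NESPOLE! implementation: it splits off the speaker at the colon, splits the domain action on "+" into speech act and concepts, and breaks the top-level arguments on commas at parenthesis depth zero so that nested argument values stay intact.

```python
def parse_if(if_string):
    """Split an IF string into speaker, speech act, concepts, and
    top-level arguments (illustrative sketch only)."""
    speaker, rest = if_string.split(":", 1)
    head, _, arg_part = rest.strip().partition("(")
    da_parts = head.strip().split("+")
    speech_act, concepts = da_parts[0], da_parts[1:]
    # Split top-level arguments on commas at parenthesis depth 0.
    args, depth, current = [], 0, ""
    for ch in arg_part.rstrip(") "):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            args.append(current.strip())
            current = ""
        else:
            current += ch
    if current.strip():
        args.append(current.strip())
    return speaker, speech_act, concepts, args

spk, sa, cs, args = parse_if(
    "c:give-information+disposition+trip "
    "(disposition=(who=i, desire), visit-spec=(identifiability=no, vacation), "
    "location=name-val_di_fiemme_area)")
print(spk, sa, cs)   # c give-information ['disposition', 'trip']
```

On the running example this yields speaker `c`, speech act `give-information`, concepts `disposition` and `trip`, and three top-level arguments.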

11 Interchange Format Specification
Defines the sets of speech acts, concepts, and arguments:
–72 speech acts + 3 “prefix” speech acts
–144 concepts
–227 top-level arguments
Defines constraints on how components can be combined:
–Domain actions are formed compositionally based on the constraints for combining speech acts and concepts
–Arguments must be licensed by at least one element of the domain action

12 Example
“Hello. I would like to take a vacation in Val di Fiemme.”
hello i would like to take a vacation in val di fiemme
c:greeting (greeting=hello)
c:give-information+disposition+trip (disposition=(who=i, desire), visit-spec=(identifiability=no, vacation), location=name-val_di_fiemme_area)
ENG: Hello! I want to travel for a vacation at Val di Fiemme.
ITA: Salve. Io vorrei una vacanza in Val di Fiemme.

13 Why Hybrid Analysis?
Goal: a portable and robust analyzer for task-oriented IF-based speech-to-speech MT
Previous IF-based MT systems used full semantic grammars to parse complete DAs:
–Useful for parsing spoken language in restricted domains
–Difficult to port to new domains
Continue to use semantic grammars to parse small domain-independent DAs and phrase-level arguments
Train classifiers to identify DAs

14 Hybrid Analysis Approach
Use a combination of grammar-based phrase-level parsing and machine learning to produce interlingua (IF) representations

15 Hybrid Analysis Approach
hello i would like to take a vacation in val di fiemme
c:greeting (greeting=hello)
c:give-information+disposition+trip (disposition=(who=i, desire), visit-spec=(identifiability=no, vacation), location=name-val_di_fiemme_area)

16 Hybrid Analysis Approach
Argument parsing:
greeting= hello | disposition= i would like to | visit-spec= take a vacation | location= in val di fiemme

17 Hybrid Analysis Approach
Segmentation: SDU1 (hello) | SDU2 (i would like to take a vacation in val di fiemme)

18 Hybrid Analysis Approach
Domain action classification: SDU1 → greeting, SDU2 → give-information+disposition+trip

19 Argument Parsing
Parse utterances using phrase-level grammars
SOUP Parser: stochastic, chart-based, top-down robust parser designed for real-time analysis of spoken language
Separate grammars based on the type of phrases each grammar is intended to cover

20 Grammars
Argument grammar
–Identifies arguments defined in the IF
–s[arg:activity-spec=] → (*[object-ref=any] *[modifier=good] [biking])
–Covers “any good biking”, “any biking”, “good biking”, “biking”, plus synonyms for all 3 words
Pseudo-argument grammar
–Groups common phrases with similar meanings into classes
–s[=arrival=] → (*is *usually arriving)
–Covers “arriving”, “is arriving”, “usually arriving”, “is usually arriving”, plus synonyms
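The coverage claims on this slide follow from expanding the optional tokens. As a sketch, assuming a simplified rule format in which a leading `*` marks a token as optional (the real SOUP grammars also allow nonterminals and synonym classes), the covered surface strings can be enumerated:

```python
from itertools import product

def expansions(rule_tokens):
    """Enumerate the surface strings covered by a flat rule where a
    leading '*' marks an optional token (simplified illustration)."""
    choices = []
    for tok in rule_tokens:
        if tok.startswith("*"):
            choices.append(("", tok[1:]))   # optional: absent or present
        else:
            choices.append((tok,))          # required token
    return sorted({" ".join(t for t in combo if t)
                   for combo in product(*choices)})

covered = expansions(["*is", "*usually", "arriving"])
print(covered)
# ['arriving', 'is arriving', 'is usually arriving', 'usually arriving']
```

Two optional tokens yield the four variants listed for the pseudo-argument rule.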

21 Grammars
Cross-domain grammar
–Identifies simple domain-independent DAs
–s[greeting] → ([greeting=first_meeting] *[greet:to-whom=])
–Covers “nice to meet you”, “nice to meet you donna”, “nice to meet you sir”, plus synonyms
Shared grammar
–Contains low-level rules accessible by all other grammars

22 Segmentation
Goal: split utterances into Semantic Dialogue Units so Domain Actions can be assigned
Potential SDU boundaries occur between argument parse trees and/or unparsed words
An SDU boundary is present if there is a parse tree from the cross-domain grammar on either side of a potential boundary position
Otherwise, use a memory-based classifier to determine whether an SDU boundary is present
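The rule-plus-classifier cascade described above can be sketched as follows. The chunk dictionaries and the `classify` callback are assumptions for illustration, not the NESPOLE! implementation:

```python
def sdu_boundary(left_chunk, right_chunk, classify):
    """Decide whether an SDU boundary falls between two adjacent
    chunks (a parse tree or an unparsed word). Hypothetical sketch
    of the rule + classifier cascade."""
    # Rule: a cross-domain parse tree on either side forces a boundary.
    if (left_chunk.get("grammar") == "cross-domain"
            or right_chunk.get("grammar") == "cross-domain"):
        return True
    # Otherwise defer to the trained memory-based classifier.
    return classify(left_chunk, right_chunk)

# "hello" is parsed by the cross-domain grammar, so the boundary
# before "i" is decided by rule, without consulting the classifier.
hello = {"text": "hello", "grammar": "cross-domain"}
i_word = {"text": "i", "grammar": None}
print(sdu_boundary(hello, i_word, lambda l, r: False))  # True
```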

23 Segmentation Classifier
The segmentation classifier is a memory-based classifier implemented using TiMBL
Input: 10 features based on word and parse information surrounding a potential boundary
Output: binary decision about the presence of an SDU boundary
Training data: potential SDU boundaries extracted from utterances manually annotated with SDU boundaries and parsed with the phrase-level grammars

24 Segmentation Features
Preceding parse label (A-1)
Probability a boundary follows A-1 (P(A-1))
Preceding word (w-1)
Probability a boundary follows w-1 (P(w-1))
Number of words since the last boundary
Number of argument parse trees since the last boundary
Following parse label (A1)
Probability a boundary precedes A1 (P(A1))
Following word (w1)
Probability a boundary precedes w1 (P(w1))

25 Segmentation Features
Probability features are estimated using counts from the training data:
–P(A-1) = C(boundary follows A-1) / C(A-1)
–P(w-1) = C(boundary follows w-1) / C(w-1)
–P(A1) = C(boundary precedes A1) / C(A1)
–P(w1) = C(boundary precedes w1) / C(w1)
3 segmentation training examples in “hello i would like to take a vacation in val di fiemme”:
–1 positive (between “hello” and “i”)
–2 negative (between “to” and “take”; between “vacation” and “in”)
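The count-based estimates above amount to relative frequencies over the training examples. A minimal sketch for the preceding-word feature, using made-up training pairs of (preceding word, boundary-follows?):

```python
from collections import Counter

def boundary_probs(examples):
    """Estimate P(boundary | preceding word) as
    C(boundary follows w) / C(w) from (word, is_boundary) pairs.
    The training pairs below are invented for illustration."""
    word_counts, boundary_counts = Counter(), Counter()
    for word, is_boundary in examples:
        word_counts[word] += 1
        if is_boundary:
            boundary_counts[word] += 1
    return {w: boundary_counts[w] / word_counts[w] for w in word_counts}

probs = boundary_probs([("hello", True), ("hello", True),
                        ("to", False), ("vacation", False)])
print(probs["hello"], probs["to"])  # 1.0 0.0
```

The same counting scheme applies to parse labels (A-1, A1) and following words.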

26 Evaluation: Segmentation
Data: English and German in the Travel & Tourism and Medical Assistance domains
TiMBL parameters: IB1 (k-NN) algorithm with Gain Ratio feature weighting, k=5, unweighted voting
Evaluated using 20-fold cross validation with “in-turn” examples

                          Eng Travel   Ger Travel   Eng Medical   Ger Medical
Accuracy                    92.64%       93.04%       96.23%        93.26%
Training Examples           35690        46170        42187         7792
In-Turn Examples            23844        31234        27873         5522
Turn Boundary Examples      11846        14936        14314         2270

27 Domain Action Classification
Goal: identify the DA for each SDU using TiMBL memory-based classifiers
Split DA classification into two subtasks (Speech Act and Concept Sequence):
–Reduces the number of classes for each classifier
–Allows different approaches and/or feature sets for each task
–Allows DAs that did not occur in the data
Also classify the complete DA directly

28 DA Classification Data
Corpus information:

                          Eng Travel   Ger Travel   Eng Medical   Ger Medical
SDUs                        8289         8719         3664          2294
Domain Actions              972          1001         462           286
Concept Sequences           615          638          305           179
Vocabulary Size             1946         2815         1694          1112
Speech Acts: 70, 50, 43

Most frequent DAs, SAs, and concept sequences:

     Eng Travel                Ger Travel                Eng Medical                        Ger Medical
DA   19.2% acknowledge         19.7% acknowledge         25.1% give-information+exp+h-s     27.2% acknowledge
SA   41.4% give-information    40.7% give-information    59.7% give-information             35.3% give-information
CS   38.9% no concepts         40.3% no concepts         35.0% +experience+health-status    47.3% no concepts

29 DA Classifiers
SA, CS, and DA classifiers implemented using the TiMBL memory-based learner
Input: binary features indicating the presence or absence of argument and pseudo-argument labels in the phrase-level parse (200-300 features)
–The CS classifier also uses the corresponding SA
Output: best class (SA, CS, or DA)
Training data: SDUs manually annotated with IF representations and parsed with the argument parser
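The presence/absence encoding of parse labels can be sketched directly. The label inventory here is a small made-up stand-in for the 200-300 real features:

```python
def argument_features(parse_labels, feature_labels):
    """Build a binary feature vector: 1 if the argument or
    pseudo-argument label occurs in the SDU's phrase-level parse,
    else 0. Inventory below is illustrative, not the real one."""
    present = set(parse_labels)
    return [1 if label in present else 0 for label in feature_labels]

inventory = ["greeting=", "disposition=", "visit-spec=",
             "location=", "=arrival="]
vec = argument_features(["disposition=", "visit-spec=", "location="],
                        inventory)
print(vec)  # [0, 1, 1, 1, 0]
```

Such vectors, optionally extended with the predicted SA for the CS classifier, are what TiMBL stores and compares at classification time.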

30 DA Classification
Data: English and German in the Travel & Tourism and Medical Assistance domains
TiMBL parameters: IB1 (k-NN) algorithm with Gain Ratio feature weighting, k=1
20-fold cross validation

31 Comparison of Learning Approaches
Learning approaches:
–Memory-Based Learning (TiMBL)
–Decision Trees (C4.5)
–Neural Networks (SNNS)
–Naïve Bayes (Rainbow)
Important considerations:
–Accuracy
–Speed of training and classification
–Accommodation of discrete and continuous features from multiple sources
–Production of a ranked list of classes
–Online server mode

32 Comparison of Learning Approaches
20-fold cross validation setup
All classifiers used the same feature set (grammar labels)
SNNS may perform slightly better, but TiMBL is preferred when all factors are taken into account

33 Adding Word Information
Grammar label unigrams do not exploit the strengths of naïve Bayes classification
Test naïve Bayes classifiers (Rainbow) trained on word bigrams
Words provide useful information for the task, especially for Speech Act classification

34 Adding Word Information
Add word-based features to the TiMBL classifiers:
1. Binary features for the top 250 words sorted by mutual information
2. Probabilities computed by Rainbow

35 Using the IF Specification
Use knowledge from the IF specification during DA classification:
–Ensure that only legal DAs are produced
–Guarantee that the DA and arguments combine to form a valid IF representation
Strategy: find the best DA that licenses the most arguments
–Trust the parser to reliably label arguments
–Retaining detailed argument information is important for translation

36 Using the IF Specification
Check whether the best speech act and concept sequence form a legal DA
If not, test alternative combinations of speech acts and concept sequences from the ranked set of possibilities
Select the best combination that licenses the most arguments
Drop arguments not licensed by the best DA
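The fallback search above can be sketched as a loop over ranked candidates. The `is_legal` and `licenses` callbacks and the toy specification fragment are assumptions standing in for lookups into the real IF specification:

```python
def best_legal_da(ranked_sas, ranked_css, arguments, is_legal, licenses):
    """Search ranked speech acts and concept sequences for the legal
    (SA, CS) pair licensing the most arguments; ties go to the
    higher-ranked pair found first. Illustrative sketch only."""
    best, best_count = None, -1
    for sa in ranked_sas:
        for cs in ranked_css:
            if not is_legal(sa, cs):
                continue  # skip combinations the IF spec forbids
            licensed = [a for a in arguments if licenses(sa, cs, a)]
            if len(licensed) > best_count:
                best, best_count = (sa, cs, licensed), len(licensed)
    return best  # arguments outside `licensed` are dropped

# Toy IF-specification fragment (made up for illustration).
legal = {("give-information", "+disposition+trip"),
         ("request-information", "+price")}
lic = {("give-information", "+disposition+trip"):
           {"disposition=", "location="}}

result = best_legal_da(
    ["request-information", "give-information"],   # ranked SAs
    ["+disposition+trip", "+price"],               # ranked CSs
    ["disposition=", "location=", "time="],        # parsed arguments
    lambda sa, cs: (sa, cs) in legal,
    lambda sa, cs, a: a in lic.get((sa, cs), set()))
```

Here the top-ranked pair is illegal, and the legal pair licensing two of the three arguments wins; `time=` is dropped.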

37 Evaluation: IF Specification Fallback
Test set: 292 SDUs from 151 utterances; 182 SDUs required classification
–4% had illegal DAs; 29% had illegal IFs
–Mean arguments per SDU: 1.47
Changed by fallback: Speech Act 5%, Concept Sequence 26%, Domain Action 29%
Arguments dropped per SDU: 0.38 without fallback, 0.07 with fallback

38 End-to-End Translation
Speech input through text output
–Reflects the combined performance of speech recognition, analysis, and generation
Travel & Tourism domain
English-to-English and English-to-Italian
–Test set: 232 SDUs (110 utterances) from 2 unseen dialogues
German-to-German and German-to-Italian
–Test set: 356 SDUs (246 utterances) from 2 unseen dialogues
The analyzer used the Segmentation, Speech Act, and Concept Sequence classifiers with the IF specification fallback strategy

39 End-to-End Translation
Each SDU graded by 3 human graders as very good, good, bad, or very bad
–Acceptable = very good + good
–Unacceptable = bad + very bad
Majority vote among the 3 graders (i.e., a translation was considered acceptable if it received at least 2 acceptable grades)
Speech recognition hypotheses were also graded as if they were paraphrases produced by the translation system
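The grading scheme reduces to a simple majority vote over collapsed grades, which can be stated compactly (a sketch of the rule described above, not the actual evaluation scripts):

```python
def acceptable(grades):
    """A translation is acceptable if at least 2 of the 3 graders
    marked it 'very good' or 'good' (majority of collapsed grades)."""
    votes = sum(1 for g in grades if g in ("very good", "good"))
    return votes >= 2

print(acceptable(["very good", "bad", "good"]))   # True
print(acceptable(["good", "bad", "very bad"]))    # False
```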

40 End-to-End Translation (Travel & Tourism)
Speech recognition word accuracy rates: English 56.4%, German 51.0%
[Charts: acceptable end-to-end translation rates for English and German travel input]

41 Work in Progress
Evaluation of end-to-end translation in the Medical Assistance domain
Evaluation of portability from the Travel & Tourism domain to the Medical Assistance domain
Data ablation studies

42 Summary
I described an effective method for identifying domain actions that combines phrase-level parsing and machine learning
The hybrid analysis approach is fully integrated into the NESPOLE! English and German MT systems
Automatic classification of domain actions is feasible despite the large number of classes and the relatively sparse, unevenly distributed data:
–<10000 training examples
–Most frequent classes have >1000 examples
–Many classes have only 1-2 examples

43 Summary
Word and argument information can be effectively combined to improve domain action classification performance
Preliminary indications are that the approach is quite portable
–The English and German NESPOLE! systems were ported from Travel & Tourism to Medical Assistance
–Annotation: ~125 person-hours
–Grammar development: ~140 person-hours


45 Hybrid Analysis Approach
hello i would like to take a vacation in val di fiemme
c:greeting (greeting=hello)
c:give-information+disposition+trip (disposition=(who=i, desire), visit-spec=(identifiability=no, vacation), location=name-val_di_fiemme_area)
Segmentation: SDU1 (hello) | SDU2 (i would like to take a vacation in val di fiemme)
Argument parsing: greeting= hello | disposition= i would like to | visit-spec= take a vacation | location= in val di fiemme
Domain actions: SDU1 → greeting, SDU2 → give-information+disposition+trip

46 DA Classification Data

