
1 Non-Monotonic Parsing of Fluent Umm I mean Disfluent Sentences. Mohammad Sadegh Rasooli [Columbia University], Joel Tetreault [Yahoo Labs]. This work was conducted while both authors were at Nuance's NLU Research Lab in Sunnyvale, CA.

2 Mohammad Sadegh Rasooli and Joel Tetreault. "Joint Parsing and Disfluency Detection in Linear Time." EMNLP 2013.
Mohammad Sadegh Rasooli and Joel Tetreault. "Non-Monotonic Parsing of Fluent Umm I mean Disfluent Sentences." EACL 2014.

3 Motivation. Two issues for spoken language processing:  ASR errors  Speech disfluencies. ~10% of the words in conversational speech are disfluent. An extreme case of "noisy" input: http://www.youtube.com/watch?v=lj3iNxZ8Dww. Error propagation from these two sources can wreak havoc on downstream modules such as parsing and semantics.

4 Disfluencies. Three types:  Filled pauses (FP): e.g., uh, um  Discourse markers (DM) and parentheticals: e.g., I mean, you know  Reparandum (edited phrase). Example: "I want a flight to Boston uh I mean to Denver": reparandum = "to Boston"; interregnum = "uh" (FP) + "I mean" (DM); repair = "to Denver".

5 Processing Disfluent Sentences. Most approaches treat disfluency detection solely as a pre-processing step before parsing. This serialized method of disfluency detection followed by parsing can be slow. Why not parse disfluent sentences at the same time as detecting disfluencies?  Advantage: speeds up processing, especially for dialogue systems.

6 Our Approach. Dependency parsing and disfluency detection with high accuracy and processing speed. Source: I want a flight to Boston uh I mean to Denver. Output: I want a flight to Denver, with its dependency parse (arcs subj, root, det, dobj, prep, pobj from Root). Real output of our system!

7 Our Work: EMNLP '13. Our approach is based on arc-eager transition-based parsing [Nivre, 2004]. Parsing is the process of choosing the best action at a particular stack and buffer configuration. We extend the 4 standard actions {shift, reduce, left-arc, right-arc} with three additional actions:  IJ[w_i..w_j]: interjections  DM[w_i..w_j]: discourse markers  RP[w_i..w_j]: reparandum
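One way to picture the extended action inventory is as a small parameterized type. This is an illustrative sketch; the class and field names are hypothetical, not the authors' code.

```python
# Hypothetical encoding of the seven-action inventory described on this slide.
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Action:
    kind: str                               # shift, reduce, left-arc, right-arc, IJ, DM, RP
    label: Optional[str] = None             # dependency label for left-arc / right-arc
    span: Optional[Tuple[int, int]] = None  # word span removed by IJ / DM / RP

# The four standard arc-eager actions (arc labels vary per step):
shift, reduce_ = Action("shift"), Action("reduce")
left_arc, right_arc = Action("left-arc", label="subj"), Action("right-arc", label="root")

# The three disfluency actions are parameterized by the word span they remove,
# e.g. for "I_1 want_2 a_3 flight_4 to_5 Boston_6 uh_7 I_8 mean_9 to_10 Denver_11":
ij = Action("IJ", span=(7, 7))   # uh
dm = Action("DM", span=(8, 9))   # I mean
rp = Action("RP", span=(5, 6))   # to Boston
```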

8 Example. Stack: [Root_0, want_2, flight_4, to_5, Boston_6] | Buffer: [uh_7, I_8, mean_9, to_10, Denver_11]. Action: IJ[7] (remove uh_7 as an interjection).

9 Example. Stack: [Root_0, want_2, flight_4, to_5, Boston_6] | Buffer: [I_8, mean_9, to_10, Denver_11]. Action: DM[8:9] (remove I_8 mean_9 as a discourse marker).

10 Example. Stack: [Root_0, want_2, flight_4, to_5, Boston_6] | Buffer: [to_10, Denver_11]. Action: RP[5:6] (mark to_5 Boston_6 as reparandum).

11 Example. Deleting words and dependencies: to_5 and Boston_6 are removed from the stack, together with the arcs prep (flight_4 -> to_5) and pobj (to_5 -> Boston_6) already built over them. This deletion of previously built arcs is what makes the parser non-monotonic.

12 Example. Stack: [Root_0, want_2, flight_4] | Buffer: [to_10, Denver_11]. Action: Right-arc (prep): flight_4 -> to_10.

13 Example. Stack: [Root_0, want_2, flight_4, to_10] | Buffer: [Denver_11]. Action: Right-arc (pobj): to_10 -> Denver_11.

14 Example. Stack: [Root_0, want_2, flight_4, to_10, Denver_11] | Buffer: empty. Reduce until done; the final tree spans "I want a flight to Denver".
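The non-monotonic steps above can be replayed in a few lines. This is a minimal, self-contained sketch of the state changes on slides 8 through 14 (word indices, a stack, a buffer, and an arc set); it is illustrative, not the authors' implementation.

```python
# Replay of slides 8-14: IJ, DM, and RP delete words, and RP also deletes
# arcs already built over the reparandum (the non-monotonic part).
words = ["Root", "I", "want", "a", "flight", "to", "Boston",
         "uh", "I", "mean", "to", "Denver"]   # index 0 is Root

# State after the appendix trace (slides 28-36): "to_5 Boston_6" is already attached.
stack = [0, 2, 4, 5, 6]                       # Root want flight to Boston
buffer = [7, 8, 9, 10, 11]                    # uh I mean to Denver
arcs = {(0, 2, "root"), (2, 1, "subj"), (2, 4, "dobj"),
        (4, 3, "det"), (4, 5, "prep"), (5, 6, "pobj")}
removed = set()

def remove_span(i, j, tag):
    """IJ/DM/RP: delete words i..j and any arcs touching them."""
    global arcs
    span = set(range(i, j + 1))
    removed.update(span)
    buffer[:] = [w for w in buffer if w not in span]
    stack[:] = [w for w in stack if w not in span]
    arcs = {(h, d, l) for (h, d, l) in arcs if h not in span and d not in span}

remove_span(7, 7, "IJ")     # slide 8: uh
remove_span(8, 9, "DM")     # slide 9: I mean
remove_span(5, 6, "RP")     # slides 10-11: to Boston, deleting prep/pobj arcs
arcs.add((4, 10, "prep"))   # slide 12: right-arc flight_4 -> to_10
arcs.add((10, 11, "pobj"))  # slide 13: right-arc to_10 -> Denver_11
fluent = " ".join(words[i] for i in range(1, 12) if i not in removed)
# fluent == "I want a flight to Denver"
```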

15 The Cliffhanger. Our method parsed with high accuracy, but on the disfluency detection task it was 1.1 points below Qian et al. '13 (81.4 vs. 82.5 F-score). How can we improve disfluency detection performance? How can we make the model faster and more compact so it can work in real-time SLU applications?

16 EACL 2014. Two extensions to the prior work to achieve state-of-the-art disfluency detection performance:  Novel disfluency-focused features (the EMNLP '13 work used standard parse features for all classifiers)  Cascaded classifiers: use a series of nested classifiers for each action to improve speed and performance

17 Nested classifiers: two designs

18 New Features. New disfluency-specific features; some of the prominent ones:  N-gram overlap  N-grams after a reparandum ends  Number of common words and POS-tag sequences between the reparandum candidate and the repair  Distance features. Different classifiers use different combinations of features.
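A couple of these features are easy to sketch. The function below is a hedged illustration of n-gram overlap between a candidate reparandum and its repair (covering both word and POS-tag sequence overlap); the names and the exact feature definitions are assumptions, not the paper's feature templates.

```python
# Illustrative disfluency features for a candidate reparandum vs. its repair.
def ngram_overlap(cand, repair, n=1):
    """Count n-grams shared by the candidate reparandum and the repair."""
    grams = lambda s: {tuple(s[i:i + n]) for i in range(len(s) - n + 1)}
    return len(grams(cand) & grams(repair))

cand = ["to", "Boston"]            # candidate reparandum
repair = ["to", "Denver"]          # repair
word_overlap = ngram_overlap(cand, repair)              # shared unigram: "to"

cand_pos = ["IN", "NNP"]           # POS tags of the candidate
repair_pos = ["IN", "NNP"]         # POS tags of the repair
pos_overlap = ngram_overlap(cand_pos, repair_pos, n=2)  # identical POS bigram

distance = 3  # e.g. a distance feature: words between reparandum end and repair start
```

Repairs tend to be "rough copies" of their reparanda, so high word and POS overlap is strong evidence for an RP action.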

19 Joint Model: First Pass. Augmenting the arc-eager classifier with three additional actions is inaccurate and slow:  Each new action type should be modeled with its own set of features  More candidates → more memory fetches. Instead of a one-level model, use a sequence of increasingly complex models that progressively filter the space of possible outputs. Each classifier has its own unique feature space.
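The filtering idea can be sketched as a tiny two-stage cascade: a cheap first classifier decides whether any disfluency action is needed at all, and only then does a second classifier (which could use richer features) choose among the disfluency actions. This split and the rule-based "classifiers" are illustrative assumptions, not the paper's actual cascade.

```python
# Hedged sketch of cascaded classification over parser actions.
def stage1_is_disfluent(state):
    # Cheap binary decision: does the next step edit out words at all?
    return state["next_word"] in {"uh", "um"} or state["rough_copy"]

def stage2_pick_action(state):
    # Reached only in the (rarer) disfluent case; may use richer features.
    if state["next_word"] in {"uh", "um"}:
        return "IJ"
    return "RP" if state["rough_copy"] else "DM"

def predict(state):
    if not stage1_is_disfluent(state):
        return "PARSE"   # fall through to the ordinary arc-eager classifier
    return stage2_pick_action(state)
```

Because most configurations are fluent, most predictions exit at stage 1, which is where the speed and memory savings come from.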

20 Evaluation: Disfluency Detection
Model | Description | F-score
[Miller and Schuler, 2008] | Joint + PCFG parsing | 30.6
[Lease and Johnson, 2006] | Joint + PCFG parsing | 62.4
[Kahn et al., 2005] | TAG + LM rerank | 78.2
[Qian and Liu, 2013] (opt) | IOB tagging | 82.5* (previous best)
Flat Model | Arc-eager parsing | 41.5
EMNLP '13, Two Classifiers (M2) | Arc-eager parsing | 81.4
EACL '14, Two Classifiers (M2) | Arc-eager parsing | 82.2
EACL '14, Six Classifiers (M6) | Arc-eager parsing | 82.6
* We also performed 10-fold cross-validation tests on Switchboard; M2 outperforms Qian et al. by 0.6.
Corpus: parsed section of Switchboard (mrg). Conversion: Tsurgeon and Penn2Malt. Metric: F-score of detecting the reparandum. Classifier: averaged structured perceptron.

21 Other Evaluations. Speed: M6 is 4 times faster than M2. Number of features: M6 has 50% fewer features than M2. Parse score: M6 is slightly better than M2 and within 2.5 points of the gold-standard-trees upper bound.

22 Conclusions. A state-of-the-art disfluency detection algorithm that also produces accurate dependency parses:  New features + engineering → improved performance  Runs in linear time → very fast!  Incremental, so it could be coupled with incremental speech and dialogue processing  Future work: acoustic features, beam search, etc. Special note: the current approach has since been surpassed by Honnibal et al. (TACL, to appear; 84%).

23 Thanks! Mohammad S. Rasooli: rasooli@cs.columbia.edu. Joel Tetreault: tetreaul@yahoo-inc.com.

24 Evaluation: Parsing
Model | ms/Sentence | # Features | Unlabeled Acc. | F-score
Flat Model | 184 | 9M | 60.2 | 64.6
2 Classifiers | 12 | 82M | 88.1 | 87.6
6 Classifiers | 3 | 61M | 88.4 | 87.7
Lower-bound F-score: 70.2. Upper-bound F-score: 90.2.

25 Action: IJ. IJ[i]: remove the first i words from the buffer and tag them as interjections (IJ). Example: Stack: [want, flight, to, Boston] | Buffer: [uh, I, mean, to, Denver] → IJ[1] → Buffer: [I, mean, to, Denver].

26 Action: DM. DM[i]: remove the first i words from the buffer and tag them as discourse markers (DM). Example: Stack: [want, flight, to, Boston] | Buffer: [I, mean, to, Denver] → DM[2] → Buffer: [to, Denver].

27 Action: RP. RP[i:j]: from the words outside the buffer, remove the not-yet-removed words i to j and tag them as reparandum (RP). Example: Stack: [want, flight, to, Boston] | Buffer: [to, Denver] → RP[5:6] → Stack: [want, flight].
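The three action definitions above amount to simple buffer and stack edits. The snippet below replays the examples from these slides; it is a rough sketch (the paper addresses words by index span, whereas this toy version uses word strings for readability).

```python
# Toy versions of the three removal actions, replaying slides 25-27.
stack = ["want", "flight", "to", "Boston"]
buffer = ["uh", "I", "mean", "to", "Denver"]

def IJ(i):
    """Remove the first i buffer words as interjections."""
    del buffer[:i]

def DM(i):
    """Remove the first i buffer words as discourse markers."""
    del buffer[:i]

def RP(reparandum_words):
    """Remove the given already-processed words (outside the buffer) as reparandum."""
    stack[:] = [w for w in stack if w not in reparandum_words]

IJ(1)                  # buffer becomes: I mean to Denver
DM(2)                  # buffer becomes: to Denver
RP({"to", "Boston"})   # stack becomes: want flight
```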

28 Example. Target parse shown for reference: arcs subj, root, det, dobj, prep, pobj over "I_1 want_2 a_3 flight_4 to_5 Boston_6 uh_7 I_8 mean_9 to_10 Denver_11". Initial configuration: Stack: [Root_0] | Buffer: [I_1, ..., Denver_11].

29 Example. Stack: [Root_0] | Buffer: [I_1, want_2, ..., Denver_11]. Action: Shift (push I_1).

30 Example. Stack: [Root_0, I_1] | Buffer: [want_2, ..., Denver_11]. Action: Left-arc (subj): want_2 -> I_1; pop I_1.

31 Example. Stack: [Root_0] | Buffer: [want_2, a_3, ..., Denver_11]. Action: Right-arc (root): Root_0 -> want_2; push want_2.

32 Example. Stack: [Root_0, want_2] | Buffer: [a_3, flight_4, ..., Denver_11]. Action: Shift (push a_3).

33 Example. Stack: [Root_0, want_2, a_3] | Buffer: [flight_4, ..., Denver_11]. Action: Left-arc (det): flight_4 -> a_3; pop a_3.

34 Example. Stack: [Root_0, want_2] | Buffer: [flight_4, to_5, ..., Denver_11]. Action: Right-arc (dobj): want_2 -> flight_4; push flight_4.

35 Example. Stack: [Root_0, want_2, flight_4] | Buffer: [to_5, Boston_6, ..., Denver_11]. Action: Right-arc (prep): flight_4 -> to_5; push to_5.

36 Example. Stack: [Root_0, want_2, flight_4, to_5] | Buffer: [Boston_6, uh_7, ..., Denver_11]. Action: Right-arc (pobj): to_5 -> Boston_6; push Boston_6.

