A Lightweight and High Performance Monolingual Word Aligner. Xuchen Yao and Benjamin Van Durme (Johns Hopkins), Chris Callison-Burch (UPenn), and Peter Clark (Vulcan)

1 A Lightweight and High Performance Monolingual Word Aligner
Xuchen Yao and Benjamin Van Durme (Johns Hopkins), Chris Callison-Burch (UPenn), and Peter Clark (Vulcan)

2 monolingual word alignment (2013-8-6, ACL 2013, Sofia)
Aligning one sentence pair from RTE2
Premise: Linda Johnson, who lives with her husband, Charles, and two cats in..., said Katrina has...
Hypothesis: Linda Johnson is married to Charles
alignment contributed by Brockett (2007)

3 monolingual vs. bilingual alignment
less training data (labeled or unlabeled), but more lexical resources
semantic relatedness: cued by distributional word similarities
the same grammar shared by source/target sentences

6 a discriminative model first proposed by Blunsom and Cohn (2006):
s, t: source (observation) and target sentence
a: target word indices (0 to target length); state 0 is the NULL state for deletion
f(): feature functions
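The formula on this slide did not survive in the transcript. As a hedged reconstruction, the model of Blunsom and Cohn (2006) is a linear-chain CRF, which is usually written as below using the slide's s, t, a and f(). The weights \lambda_k, the position index i, the feature index k and the normalizer Z(s, t) are standard CRF notation assumed here, not taken from the slide.

```latex
p(\mathbf{a} \mid \mathbf{s}, \mathbf{t}) =
  \frac{\exp\left( \sum_{i} \sum_{k} \lambda_k \, f_k(a_{i-1}, a_i, \mathbf{s}, \mathbf{t}) \right)}
       {Z(\mathbf{s}, \mathbf{t})},
\qquad
Z(\mathbf{s}, \mathbf{t}) =
  \sum_{\mathbf{a}'} \exp\left( \sum_{i} \sum_{k} \lambda_k \, f_k(a'_{i-1}, a'_i, \mathbf{s}, \mathbf{t}) \right)
```

Here i ranges over source positions, so each source word receives exactly one state a_i: either a target index or 0 (NULL).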

10 desired Viterbi decoding path (figure not preserved in the transcript)
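The decoding step can be sketched as ordinary Viterbi search over states 0..len(target), with one state chosen per source word and state 0 standing for NULL. This is a minimal illustration, not the jacana-align code: the function name `viterbi_align` and the toy `emit` and `trans` scorers (exact string match, a small jump penalty) are assumptions made for the example, standing in for the learned feature weights.

```python
from typing import Callable, List


def viterbi_align(
    source: List[str],
    target: List[str],
    emit: Callable[[int, int], float],
    trans: Callable[[int, int], float],
) -> List[int]:
    """Return one state per source word: 0 = NULL, j >= 1 = j-th target word."""
    n_states = len(target) + 1
    if not source:
        return []
    # best[i][j]: score of the best path that ends in state j at source position i
    best = [[emit(0, j) for j in range(n_states)]]
    back: List[List[int]] = []
    for i in range(1, len(source)):
        row, ptr = [], []
        for j in range(n_states):
            prev = max(range(n_states), key=lambda p: best[-1][p] + trans(p, j))
            row.append(best[-1][prev] + trans(prev, j) + emit(i, j))
            ptr.append(prev)
        best.append(row)
        back.append(ptr)
    # Follow back-pointers from the best final state.
    state = max(range(n_states), key=lambda j: best[-1][j])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


source = "Linda Johnson is married to Charles".split()
target = "Linda Johnson lives with Charles".split()


def emit(i: int, j: int) -> float:
    """Toy score (assumption): 0 for NULL, +1 for an exact match, -1 otherwise."""
    if j == 0:
        return 0.0
    return 1.0 if source[i].lower() == target[j - 1].lower() else -1.0


def trans(p: int, j: int) -> float:
    """Toy distortion penalty (assumption): prefer moving one target word forward."""
    if p == 0 or j == 0:
        return 0.0
    return -0.1 * abs(j - p - 1)


path = viterbi_align(source, target, emit, trans)
for word, state in zip(source, path):
    print(word, "->", "NULL" if state == 0 else target[state - 1])
```

With these toy scores, "Linda", "Johnson" and "Charles" align to their identical target words and the remaining source words go to NULL.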

12 features
string similarity
– Jaro-Winkler, Dice-Sorensen, Hamming, Jaccard, Levenshtein, n-gram overlap and common prefix matching
POS tag matching
WordNet
– hypernym, hyponym, synonym, derived form, entailing, causing, members of, have member, substances of, have substances, parts of, have part
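A few of the string-similarity measures named above can be sketched in plain Python. This is an illustrative sketch only: the function names, the choice of character bigrams, and the feature dictionary layout are assumptions, and Jaro-Winkler and Hamming are left out for brevity. It does not reproduce how jacana-align computes or bins these values.

```python
from typing import Dict, Set


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insert, delete and substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute or match
        prev = cur
    return prev[-1]


def char_ngrams(word: str, n: int = 2) -> Set[str]:
    """Set of character n-grams of a word (bigrams by default)."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def jaccard(x: Set[str], y: Set[str]) -> float:
    return len(x & y) / len(x | y) if (x | y) else 1.0


def dice(x: Set[str], y: Set[str]) -> float:
    return 2 * len(x & y) / (len(x) + len(y)) if (x or y) else 1.0


def common_prefix_len(a: str, b: str) -> int:
    n = 0
    for ca, cb in zip(a, b):
        if ca != cb:
            break
        n += 1
    return n


def string_features(src: str, tgt: str) -> Dict[str, float]:
    """Bundle the measures for one (source word, target word) pair."""
    s, t = src.lower(), tgt.lower()
    gs, gt = char_ngrams(s), char_ngrams(t)
    return {
        "levenshtein": levenshtein(s, t),
        "jaccard_bigram": jaccard(gs, gt),
        "dice_bigram": dice(gs, gt),
        "common_prefix": common_prefix_len(s, t),
    }


feats = string_features("married", "marry")
print(feats)
```

Measures like these let the aligner relate morphological variants such as "married" and "marry" without a stemmer.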

15 features
positional
– offset difference between src/tgt word
context
– whether neighboring words are similar
– helps to align function words
distortion (Markov feature)
– how far apart are two aligned target words
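The three feature families above could be computed roughly as follows. The exact definitions are not given on the slide, so everything here is an assumption for illustration: relative-position difference for the positional feature, left and right neighbor checks for context, and absolute jump size between consecutive non-NULL states for distortion. The helper names and the `similar` predicate are hypothetical.

```python
from typing import Callable, List, Optional, Tuple


def positional_offset(i: int, j: int, src_len: int, tgt_len: int) -> float:
    """Difference between the relative positions of source word i and target word j (0-based)."""
    return abs(i / src_len - j / tgt_len)


def context_match(
    source: List[str],
    target: List[str],
    i: int,
    j: int,
    similar: Callable[[str, str], bool],
) -> Tuple[bool, bool]:
    """(left, right): are the neighbors of source[i] and target[j] similar?"""
    left = i > 0 and j > 0 and similar(source[i - 1], target[j - 1])
    right = (i + 1 < len(source) and j + 1 < len(target)
             and similar(source[i + 1], target[j + 1]))
    return bool(left), bool(right)


def distortion(prev_state: int, state: int) -> Optional[int]:
    """Jump size between two consecutively aligned target positions.

    States are 1-based target indices; 0 is NULL, for which no jump is defined.
    """
    if prev_state == 0 or state == 0:
        return None
    return abs(state - prev_state)


source = "Linda Johnson is married to Charles".split()
target = "Linda Johnson lives with Charles".split()


def same_word(a: str, b: str) -> bool:
    """Toy similarity predicate (assumption): case-insensitive equality."""
    return a.lower() == b.lower()


print(positional_offset(1, 1, len(source), len(target)))
print(context_match(source, target, 1, 1, same_word))
print(distortion(2, 5))
```

The context feature is what helps with function words: a word like "to" carries little evidence on its own, but similar neighbors on both sides make one candidate position far more likely than the others.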

18 Implementation: jacana-align
source code at http://code.google.com/p/jacana
lightweight: only uses a POS tagger and WordNet
written in Scala, optimized with L-BFGS
platform independent: compiles to a .jar file, fully interoperable with Java
high performance? -> evaluation

19 Baselines
GIZA++
Tree Edit Distance (with stem/WordNet matching)
MANLI
– MacCartney, B.; Galley, M. & Manning, C. D. A Phrase-Based Alignment Model for Natural Language Inference. EMNLP 2008
MANLI-constraint (decoding with ILP)
– Thadani, K. & McKeown, K. Optimal and syntactically-informed decoding for monolingual phrase-based alignment. ACL 2011

23 performance in F1 (chart not preserved in the transcript; highlighted figure: 10.3%)

24 performance in F1 (chart not preserved in the transcript; highlighted figures: 0.8%, 3.3%)
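The slides report F1 without spelling out the metric. Assuming the usual definition over aligned (source index, target index) pairs, precision, recall and F1 can be computed as in the sketch below. The function name and the toy alignments are made up for illustration and are not results from the talk.

```python
from typing import Iterable, Tuple

Pair = Tuple[int, int]  # (source index, target index)


def alignment_prf(predicted: Iterable[Pair], gold: Iterable[Pair]) -> Tuple[float, float, float]:
    """Precision, recall and F1 of predicted alignment pairs against gold pairs."""
    pred_set, gold_set = set(predicted), set(gold)
    correct = len(pred_set & gold_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data (assumption): three of four predicted pairs are in the gold set.
gold = [(0, 0), (1, 1), (3, 3), (5, 4)]
predicted = [(0, 0), (1, 1), (2, 2), (5, 4)]

p, r, f1 = alignment_prf(predicted, gold)
print(f"P={p:.2f} R={r:.2f} F1={f1:.2f}")
```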

25 performance in speed (seconds per sentence)
when sentences are more balanced, jacana-align is about 20x faster
corpus   sentence pair length   MANLI-approx.   MANLI-exact   jacana-align
RTE2     29/11                  1.67s           0.08s         0.025s
FUSION   27/27                  61.96s          2.45s         0.096s

26 performance in speed (seconds per sentence)
the speed of jacana-align is not as sensitive to sentence length increase
from RTE2 (29/11) to FUSION (27/27), MANLI-exact goes from 0.08s to 2.45s (about 30x slower), while jacana-align goes from 0.025s to 0.096s (about 4x slower)
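The "20x", "30x" and "4x" callouts can be checked directly from the timings on the speed slides. The short script below only redoes that arithmetic; the dictionary layout and variable names are assumptions made for the example.

```python
# Seconds per sentence, copied from the speed table on the slides.
timings = {
    "RTE2":   {"MANLI-approx.": 1.67,  "MANLI-exact": 0.08, "jacana-align": 0.025},
    "FUSION": {"MANLI-approx.": 61.96, "MANLI-exact": 2.45, "jacana-align": 0.096},
}

# How much faster jacana-align is than MANLI-exact on each corpus.
speedup = {
    corpus: row["MANLI-exact"] / row["jacana-align"]
    for corpus, row in timings.items()
}

# How much each system slows down when moving from RTE2 to FUSION.
slowdown = {
    system: timings["FUSION"][system] / timings["RTE2"][system]
    for system in timings["RTE2"]
}

for corpus, ratio in speedup.items():
    print(f"{corpus}: jacana-align is {ratio:.1f}x faster than MANLI-exact")
for system, ratio in slowdown.items():
    print(f"{system}: {ratio:.1f}x slower on FUSION than on RTE2")
```

On the balanced FUSION pairs the ratio against MANLI-exact comes out at roughly 25x, which the slide rounds to "about 20x".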

27 Conclusion
state-of-the-art monolingual word aligner
– in accuracy
– in speed
open source, use it and hack it!

28 thank you with a demo

