1 Automatic sense prediction for implicit discourse relations in text Emily Pitler, Annie Louis, Ani Nenkova University of Pennsylvania ACL 2009

2 Implicit discourse relations
Explicit comparison – I am in Singapore, but I live in the United States.
Implicit comparison – The main conference is over Wednesday. I am staying for EMNLP.
Explicit contingency – I am here because I have a presentation to give at ACL.
Implicit contingency – I am a little tired; there is a 13-hour time difference.

3 Related work
Soricut and Marcu (2003)
– Worked at the sentence level.
Wellner et al. (2006)
– Used GraphBank annotations, which do not differentiate between implicit and explicit relations.
– This makes it difficult to verify success on implicit relations specifically.
Marcu and Echihabi (2002)
– Artificial implicit relations: delete the connective to generate a dataset.
– [Arg1, but Arg2] => [Arg1, Arg2]
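As a rough illustration (our sketch, not code from any of these papers), the connective-deletion idea can be expressed as pattern matching: an explicit example is turned into an "artificial implicit" one by removing the connective and keeping the two arguments. The two patterns and sense labels below are toy stand-ins for the full connective inventory used by Marcu and Echihabi (2002).

import re

# Illustrative patterns only: the real approach uses a large inventory
# of connectives, each mapped to a relation sense.
PATTERNS = {
    "Comparison": re.compile(r"^(?P<arg1>.+?),\s*but\s+(?P<arg2>.+)$", re.I),
    "Contingency": re.compile(r"^(?P<arg1>.+?)\s+because\s+(?P<arg2>.+)$", re.I),
}

def make_artificial_implicit(sentence):
    """Return (sense, arg1, arg2) with the connective deleted, or None."""
    for sense, pattern in PATTERNS.items():
        match = pattern.match(sentence)
        if match:
            return sense, match.group("arg1"), match.group("arg2")
    return None

print(make_artificial_implicit("I am in Singapore, but I live in the United States."))
# -> ('Comparison', 'I am in Singapore', 'I live in the United States.')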

4 Word pairs investigation
The most easily accessible features are the words in the two text spans of the relation. Some relationship holds between the words in the two arguments:
– The recent explosion of country funds mirrors the "closed-end fund mania" of the 1920s, Mr. Foot says, when narrowly focused funds grew wildly popular. They fell into oblivion after the 1929 crash.
– "Popular" and "oblivion" are almost antonyms.
– This pair triggers the contrast relation between the sentences.
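A minimal sketch (function name ours) of the cross-product word-pair representation: every pair of one word from Arg1 and one word from Arg2 becomes a binary feature.

from itertools import product

def word_pair_features(arg1, arg2):
    """Every (word-from-Arg1, word-from-Arg2) pair becomes one binary
    feature in a sparse set-of-features representation."""
    tokens1 = arg1.lower().split()
    tokens2 = arg2.lower().split()
    return {f"{w1}|{w2}" for w1, w2 in product(tokens1, tokens2)}

pairs = word_pair_features("narrowly focused funds grew wildly popular",
                           "they fell into oblivion")
print("popular|oblivion" in pairs)  # -> True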

5 Word pairs selection
Marcu and Echihabi (2002)
– Only nouns, verbs, and other cue phrases.
– Models using all words were superior to those based on only non-function words.
Lapata and Lascarides (2004)
– Only verbs, nouns, and adjectives.
– Verb pairs are among the best features.
– No useful information was obtained from nouns and adjectives.
Blair-Goldensohn et al. (2007)
– Stemming.
– Small vocabulary.
– Cutoff on the minimum frequency of a feature.
– Filtering stop-words has a negative impact on the results.

6 Analysis of word pair features
Finding the word pairs with the highest information gain on the synthetic data:
– The government says it has reached most isolated townships by now, but because roads are blocked, getting anything but basic food supplies to people remains difficult.
– Remove "but" => a Comparison example.
– Remove "because" => a Contingency example.
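For concreteness, here is a small self-contained sketch (ours) of information gain for a binary word-pair feature over a binary class label; the toy examples at the end are invented for illustration.

from math import log2

def entropy(pos, neg):
    """Binary entropy of a (positive, negative) count pair."""
    total = pos + neg
    if total == 0:
        return 0.0
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * log2(p)
    return h

def information_gain(examples, feature):
    """examples: list of (feature_set, label) pairs with boolean labels.
    Returns the reduction in label entropy from splitting on `feature`."""
    def counts(subset):
        pos = sum(1 for _, label in subset if label)
        return pos, len(subset) - pos

    with_f = [e for e in examples if feature in e[0]]
    without_f = [e for e in examples if feature not in e[0]]
    gain = entropy(*counts(examples))
    for subset in (with_f, without_f):
        if subset:
            gain -= len(subset) / len(examples) * entropy(*counts(subset))
    return gain

examples = [({"rise|fall"}, True), ({"rise|fall"}, True),
            ({"the|the"}, False), ({"the|the"}, False)]
print(information_gain(examples, "rise|fall"))  # -> 1.0 (perfectly informative)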

7 Features for sense prediction
– Polarity tags
– Inquirer tags
– Money/Percent/Number
– Verbs
– First-last/first 3 words
– Context

8 Polarity tag pairs
Similar to word pairs, but words are replaced with polarity tags.
Each word's polarity is assigned according to its entry in the Multi-Perspective Question Answering (MPQA) Opinion Corpus (Wilson et al., 2005).
Each sentiment word is tagged as positive, negative, both, or neutral.
The features are the numbers of negated and non-negated positive, negative, and neutral sentiment words in the two spans.
– Executives at Time Inc. Magazine Co., a subsidiary of Time Warner, have said the joint venture with Mr. Lang wasn't a good one. [Negated Positive]
– The venture, formed in 1986, was supposed to be Time's low-cost, safe entry into women's magazines. [Positive]
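A minimal sketch of such counting, assuming a toy stand-in for the MPQA lexicon; the dictionaries, negation list, and function name here are ours, not the paper's.

# Toy stand-in for the MPQA subjectivity lexicon, which maps thousands
# of words to positive/negative/both/neutral polarity.
POLARITY = {"good": "positive", "safe": "positive", "low-cost": "positive",
            "crash": "negative", "difficult": "negative"}
NEGATIONS = {"not", "n't", "never", "no"}

def polarity_counts(tokens, span_name):
    """Count negated and non-negated sentiment words in one span.
    Negation carries forward to the next sentiment word (a crude
    approximation of negation scope)."""
    features = {}
    negated = False
    for token in tokens:
        word = token.lower()
        if word in NEGATIONS:
            negated = True
        elif word in POLARITY:
            key = f"{span_name}:{'negated_' if negated else ''}{POLARITY[word]}"
            features[key] = features.get(key, 0) + 1
            negated = False
    return features

print(polarity_counts("the joint venture was n't a good one".split(), "arg1"))
# -> {'arg1:negated_positive': 1}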

9 Inquirer tags
Look up the semantic categories each word falls into according to the General Inquirer lexicon (Stone et al., 1966).
We see more observations for each semantic class than for any particular word, reducing the data sparsity problem.
Complementary classes:
– "Understatement" vs. "Overstatement"
– "Rise" vs. "Fall"
– "Pleasure" vs. "Pain"
Only verbs are tagged.
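A sketch of pairing Inquirer categories across the two arguments; the tiny lexicon below is an invented stand-in for the real General Inquirer entries.

# Toy stand-in for the General Inquirer lexicon (Stone et al., 1966).
INQUIRER = {"grew": {"Rise"}, "rose": {"Rise"}, "fell": {"Fall"},
            "enjoy": {"Pleasure"}, "suffer": {"Pain"}}

def inquirer_tag_pairs(verbs1, verbs2):
    """Cross product of Inquirer categories of the verbs in the two
    arguments; a Rise|Fall pair, for instance, may signal Comparison."""
    tags1 = {t for v in verbs1 for t in INQUIRER.get(v.lower(), ())}
    tags2 = {t for v in verbs2 for t in INQUIRER.get(v.lower(), ())}
    return {f"{t1}|{t2}" for t1 in tags1 for t2 in tags2}

print(inquirer_tag_pairs(["grew"], ["fell"]))  # -> {'Rise|Fall'}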

10 Money/Percent/Num
If two adjacent sentences both contain numbers, dollar amounts, or percentages, a comparison relation may hold between them.
Features: the count of numbers, percentages, and dollar amounts in the two arguments, and the number of times each number/percent/dollar combination occurs across the two arguments.
– Newsweek's circulation for the first six months of 1989 was 3,288,453, flat from the same period last year.
– U.S. News' circulation in the same period was 2,303,328, down 2.6%.
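One plausible way to count these mentions is with regular expressions; the patterns below are our approximation, not the paper's actual extractor.

import re

MONEY = re.compile(r"\$\s?\d[\d,]*(?:\.\d+)?(?:\s?(?:million|billion))?")
PERCENT = re.compile(r"\d+(?:\.\d+)?\s?%")
NUMBER = re.compile(r"\d[\d,]*(?:\.\d+)?")

def money_percent_num(arg):
    """Counts of dollar amounts, percentages, and plain numbers in a span."""
    money = MONEY.findall(arg)
    percent = PERCENT.findall(arg)
    # Count remaining numbers after removing money and percent matches.
    stripped = PERCENT.sub(" ", MONEY.sub(" ", arg))
    numbers = NUMBER.findall(stripped)
    return {"money": len(money), "percent": len(percent), "number": len(numbers)}

print(money_percent_num("circulation in the same period was 2,303,328, down 2.6%"))
# -> {'money': 0, 'percent': 1, 'number': 1}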

11 Verbs
Number of pairs of verbs in Arg1 and Arg2 from the same verb class.
– Two verbs belong to the same class if their highest Levin verb class levels are the same.
– The more related the verbs, the more likely the relation is an Expansion.
Average length of the verb phrases in each argument:
– They [are allowed to proceed] => Contingency
– They [proceed] => Expansion, Temporal
POS tag of the main verb:
– Same tense => Expansion
– Different tense => Contingency, Temporal
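A sketch of the first of these features; the class identifiers below are invented placeholders for the highest-level Levin classes, and the verb-phrase-length and tense features would be computed separately from a parse.

# Toy stand-in for Levin verb classes (highest class level per verb).
LEVIN_CLASS = {"say": "37", "tell": "37", "report": "37",
               "proceed": "51", "go": "51"}

def same_class_verb_pairs(verbs1, verbs2):
    """Number of (v1, v2) pairs across the two arguments whose highest
    Levin class levels match; more shared classes suggests Expansion."""
    return sum(1 for v1 in verbs1 for v2 in verbs2
               if v1 in LEVIN_CLASS and LEVIN_CLASS[v1] == LEVIN_CLASS.get(v2))

print(same_class_verb_pairs(["say", "proceed"], ["tell"]))  # -> 1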

12 First-Last, First3
Prior work found the first and last words very helpful in predicting sense (Wellner et al., 2006).
These words are often explicit connectives.
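These positional features are straightforward to extract; a minimal sketch (assuming non-empty, whitespace-tokenized arguments):

def first_last_first3(arg1, arg2):
    """First word, last word, and first three words of each argument,
    as sparse string-valued features."""
    t1, t2 = arg1.lower().split(), arg2.lower().split()
    return {
        "arg1_first": t1[0], "arg1_last": t1[-1],
        "arg2_first": t2[0], "arg2_last": t2[-1],
        "arg1_first3": " ".join(t1[:3]),
        "arg2_first3": " ".join(t2[:3]),
    }

print(first_last_first3("The venture was formed in 1986",
                        "It was supposed to be safe"))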

13 Context
Some implicit relations appear immediately before or immediately after certain explicit relations.
Features indicating whether the immediately preceding/following relation is explicit, including:
– its connective
– the sense of the connective
A feature indicating whether an argument begins a paragraph.
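One way these could be assembled for a single candidate relation (names and encoding are ours, not the paper's):

def context_features(prev_relation, next_relation, arg1_starts_paragraph):
    """Context of a candidate implicit relation. `prev_relation` and
    `next_relation` are (connective, sense) tuples for surrounding
    explicit relations, or None if the neighbor is not explicit."""
    features = {"arg1_starts_paragraph": arg1_starts_paragraph}
    for name, rel in (("prev", prev_relation), ("next", next_relation)):
        features[f"{name}_is_explicit"] = rel is not None
        if rel is not None:
            connective, sense = rel
            features[f"{name}_connective"] = connective
            features[f"{name}_sense"] = sense
    return features

print(context_features(("but", "Comparison"), None, True))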

14 Dataset
Penn Discourse Treebank: the largest available annotated corpus of discourse relations.
– Penn Treebank WSJ articles.
– 16,224 implicit relations between adjacent sentences.
– I am a little tired; [because] there is a 13-hour time difference. => Contingency.cause.reason
We use only the top level of the sense annotations.

15 Top level discourse relations
Comparison (contrast) – "but", "however", "yet", "even if", "surprisingly", ...
Contingency (cause and effect) – "because", "owing to", "therefore", "thus", ...
Expansion (conjunction) – "also", "moreover", "furthermore", ...
Temporal (sequence) – "before this", "afterwards", ...

16 Discourse relations
Relation sense   Proportion of implicits
Expansion        53%
Contingency      26%
Comparison       15%
Temporal          6%

17 Experiment setting
– Developed features on sections 0-1.
– Trained on sections 2-20.
– Tested on sections 21-22.
– Binary classification task for each sense.
– Trained on equal numbers of positive and negative examples; tested on the natural distribution.
– Naïve Bayes classifier.
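A rough sketch (our code, not the authors') of the balanced-training part of this setup: the majority class is downsampled so each binary classifier trains on equal numbers of positive and negative examples, while evaluation keeps the natural class distribution.

import random

def downsample_balanced(examples, seed=0):
    """Balance a binary training set by downsampling the majority class.
    `examples` is a list of (features, label) pairs with boolean labels."""
    pos = [e for e in examples if e[1]]
    neg = [e for e in examples if not e[1]]
    n = min(len(pos), len(neg))
    rng = random.Random(seed)
    balanced = rng.sample(pos, n) + rng.sample(neg, n)
    rng.shuffle(balanced)
    return balanced

A Naïve Bayes classifier for each sense would then be trained on the balanced set and evaluated on the untouched test distribution.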

18 Results: comparison
Feature              f-score
First-Last, First3   21.01
Context              19.32
Money/Percent/Num    19.04
Polarity             16.63
Random                9.91
Polarity is actually the worst feature (16.63).

19 Distribution of opposite polarity pairs
                                               Comparison   Not Comparison
Positive-Negative or Negative-Positive pairs   30%          31%

20 Results: contingency
Feature              f-score
First-Last, First3   36.75
Verbs                36.59
Context              29.55
Random               19.11

21 Results: expansion
Feature         f-score
Polarity Tags   71.29
Inquirer Tags   70.21
Context         67.77
Random          64.74
Expansion is the majority class, so precision is more problematic than recall.
These features all help the other senses as well.

22 Results: temporal
Feature              f-score
First-Last, First3   15.93
Verbs                12.61
Context              12.34
Random                5.38
Temporals often end with words like "Monday" or "yesterday".

23 Best feature sets
Comparison – selected word pairs.
Contingency – polarity, verbs, first/last, modality, context, selected word pairs.
Expansion – polarity, Inquirer tags, context.
Temporal – first/last, selected word pairs.

24 Best results
Relation      F-score   Baseline
Comparison    21.96     17.13
Contingency   47.13     31.10
Expansion     76.41     63.84
Temporal      16.76     16.21

25 Sequence model for discourse relations
Tried a conditional random field classifier.
Model                       Accuracy
Naïve Bayes                 43.27%
Conditional Random Fields   44.58%

26 Conclusion
First study that predicts implicit discourse relations in a realistic setting.
Better understanding of word pairs:
– The features in fact do not capture opposite semantic relations, but rather give information about function word co-occurrences.
Empirical validation of new and old features:
– Polarity, verb classes, context, and some lexical features indicate discourse relations.

