Emily Pitler, Annie Louis, Ani Nenkova University of Pennsylvania.

Automatic sense prediction for implicit discourse relations in text Emily Pitler, Annie Louis, Ani Nenkova University of Pennsylvania ACL 2009.
Presentation transcript:


• I am in Singapore, but I live in the United States.
  ◦ Explicit Comparison
• The main conference is over Wednesday. I am staying for EMNLP.
  ◦ Implicit Comparison

• I am here because I have a presentation to give at ACL.
  ◦ Explicit Contingency
• I am a little tired; there is a 13 hour time difference.
  ◦ Implicit Contingency

• Focus on implicit discourse relations
  ◦ in a realistic distribution
• Better understanding of lexical features
  ◦ Showed that they do not capture semantic oppositions
• Empirical validation of new and old features
  ◦ Polarity, verb classes, context, and some lexical features indicate discourse relations

• Classify both implicits and explicits
  ◦ Same sentence [Soricut and Marcu, 2003]
  ◦ Graphbank corpus: doesn’t distinguish implicit and explicit [Wellner et al., 2006]
• Create artificial implicits by deleting connective
  ◦ I am in Singapore, but I live in the United States.
  ◦ [Marcu and Echihabi, 2001; Blair-Goldensohn et al., 2007; Sporleder and Lascarides, 2008]
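Deleting the connective to create an artificial implicit can be sketched as below. This is a minimal illustration, not the procedure used in the cited work: the `CONNECTIVE_LABELS` mapping, the function name, and the regular expression are all assumptions made for the example.

```python
import re

# Hypothetical connective-to-label mapping for illustration only;
# it is not the connective list used in the cited papers.
CONNECTIVE_LABELS = {"but": "Contrast", "because": "Cause"}

def make_synthetic_implicit(sentence):
    """Split a sentence at a known connective and drop the connective.

    Connectives are tried in the order they appear in CONNECTIVE_LABELS.
    Returns (arg1, arg2, label), or None if no connective is found.
    """
    for connective, label in CONNECTIVE_LABELS.items():
        # Whole-word match, swallowing an optional comma before the connective.
        match = re.search(r",?\s+\b" + re.escape(connective) + r"\b\s+", sentence)
        if match:
            arg1 = sentence[:match.start()].strip()
            arg2 = sentence[match.end():].strip()
            return arg1, arg2, label
    return None

example = make_synthetic_implicit(
    "I am in Singapore, but I live in the United States."
)
```

Here `example` holds the two spans with "but" removed, paired with the label the connective would have signalled.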


• Most basic feature for implicits
• I_there, I_is, …, tired_time, tired_difference

I am a little tired
there is a 13 hour time difference

Marcu and Echihabi, 2001
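The word pair feature is the cross product of the words in the two spans. A minimal sketch, assuming whitespace tokenization and a hypothetical function name:

```python
from itertools import product

def word_pair_features(arg1, arg2):
    """Return every (word from arg1)_(word from arg2) pair as a feature string.

    Whitespace tokenization is an assumption made for this sketch.
    """
    tokens1 = arg1.split()
    tokens2 = arg2.split()
    return {f"{w1}_{w2}" for w1, w2 in product(tokens1, tokens2)}

pairs = word_pair_features(
    "I am a little tired",
    "there is a 13 hour time difference",
)
```

For this example the set contains `I_there`, `I_is`, `tired_time`, and `tired_difference`, along with the other pairs.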

• The recent explosion of country funds mirrors the “closed-end fund mania” of the 1920s, Mr. Foot says, when narrowly focused funds grew wildly popular.
• They fell into oblivion after the 1929 crash.

• Using just content words reduces performance (but has steeper learning curve)
  ◦ Marcu and Echihabi, 2001
• Nouns and adjectives don’t help at all
  ◦ Lapata and Lascarides, 2004
• Filtering out stopwords lowers results
  ◦ Blair-Goldensohn et al.,

• Synthetic implicits: Cause/Contrast/None sentences
  ◦ Explicit instances from Gigaword with connective deleted
  ◦ Because → Cause, But → Contrast
  ◦ At least 3 sentences apart → None
  ◦ Blair-Goldensohn et al., 2007
• Random selection
  ◦ 5,000 Cause
  ◦ 5,000 Other
• Computed information gain of word pairs
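Information gain for a single word pair can be computed as the drop in label entropy after splitting the examples on whether the pair is present. The sketch below assumes each example is a `(feature_set, label)` tuple; the function names and data layout are illustrative, not taken from the slides.

```python
import math

def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(examples, feature):
    """Information gain of a binary feature (present or absent).

    examples: list of (feature_set, label) tuples; this layout is an assumption.
    """
    labels = sorted({label for _, label in examples})

    def label_counts(subset):
        return [sum(1 for _, l in subset if l == label) for label in labels]

    with_feature = [ex for ex in examples if feature in ex[0]]
    without_feature = [ex for ex in examples if feature not in ex[0]]

    gain = entropy(label_counts(examples))
    for subset in (with_feature, without_feature):
        if subset:  # skip empty splits to avoid dividing by zero
            weight = len(subset) / len(examples)
            gain -= weight * entropy(label_counts(subset))
    return gain
```

A pair that occurs only in Cause examples of a balanced Cause/Other sample gets a gain of 1 bit; a pair that occurs in every example gets 0.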

• The government says it has reached most isolated townships by now, but because roads are blocked, getting anything but basic food supplies to people remains difficult.
• but because → Comparison
• but because → Contingency

• Maybe even with lots and lots of data, we won’t see “popular…but…oblivion” that often
• What are we trying to get at?
• Popular, Desirable, Mollify
• Oblivion, Abhorrent, Enrage


• Multi-perspective Question Answering Opinion Corpus
  ◦ Wilson et al., 2005
• Sentiment words annotated as
  ◦ Positive
  ◦ Negative
  ◦ Both
  ◦ Neutral

• Similar to word pairs, but words replaced with polarity tags
• Arg1: Executives at Time Inc. Magazine Co., a subsidiary of Time Warner, have said the joint venture with Mr. Lang wasn’t a good one.
• Arg2: The venture, formed in 1986, was supposed to be Time’s low-cost, safe entry into women’s magazines.

Arg1: NegatePositive
Arg2: Positive
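The polarity tag features can be sketched as follows. Everything here is an assumption for illustration: the toy `POLARITY` lexicon stands in for the MPQA annotations, and the three-token negation window is a simple heuristic, not the rule used in the paper.

```python
# Hypothetical toy lexicon; the slides use MPQA sentiment annotations instead.
POLARITY = {"good": "Positive", "safe": "Positive", "bad": "Negative"}
# Hypothetical negation word list.
NEGATIONS = {"not", "no", "never", "wasn't", "isn't", "don't"}

def polarity_tags(text, window=3):
    """Tag each lexicon word in text, prefixing 'Negate' when a negation
    word appears within `window` tokens before it (an assumed heuristic)."""
    tokens = [t.strip(".,;").lower() for t in text.split()]
    tags = []
    for i, token in enumerate(tokens):
        if token in POLARITY:
            preceding = tokens[max(0, i - window):i]
            negated = any(t in NEGATIONS for t in preceding)
            tags.append(("Negate" if negated else "") + POLARITY[token])
    return tags

def polarity_pair_features(arg1, arg2):
    """Cross product of the polarity tags found in the two arguments."""
    return {
        f"Arg1{t1}_Arg2{t2}"
        for t1 in polarity_tags(arg1)
        for t2 in polarity_tags(arg2)
    }

features = polarity_pair_features(
    "the joint venture with Mr. Lang wasn't a good one.",
    "was supposed to be Time's low-cost, safe entry into women's magazines.",
)
```

With this toy lexicon the example yields the single feature `Arg1NegatePositive_Arg2Positive`, matching the tags shown on the slide.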

• General Inquirer lexicon
  ◦ Stone et al., 1966
  ◦ Semantic categories of words
• Complementary classes
  ◦ “Understatement” vs. “Overstatement”
  ◦ “Rise” vs. “Fall”
  ◦ “Pleasure” vs. “Pain”
• Features ~ Tag pairs, only verbs

• Newsweek's circulation for the first six months of 1989 was 3,288,453, flat from the same period last year
• U.S. News' circulation in the same time was 2,303,328, down 2.6%
• Probably WSJ-specific

• Levin verb class level in LCS database
  ◦ Levin, 1993; Dorr, 2001
  ◦ More related verbs ~ Expansion
• Average length of verb chunk
  ◦ They [are allowed to proceed] ~ Contingency
  ◦ They [proceed] ~ Expansion, Temporal
• POS tags of the main verb
  ◦ Same tense ~ Expansion
  ◦ Different tense ~ Contingency, Temporal
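The average verb chunk length can be illustrated with a small sketch over pre-tagged tokens. The chunk definition used here (a maximal run of Penn Treebank `VB*`, `MD`, or `TO` tags containing at least one verb) is an assumption for the example and may differ from the chunker used in the paper.

```python
def verb_chunk_lengths(tagged_tokens):
    """Lengths of verb chunks in a list of (word, POS tag) pairs.

    A chunk is approximated as a maximal run of VB*, MD, or TO tags that
    contains at least one VB* tag (an assumed definition).
    """
    lengths = []
    run = []
    # A sentinel tag flushes the final run.
    for _, tag in list(tagged_tokens) + [("", "END")]:
        if tag.startswith("VB") or tag in ("MD", "TO"):
            run.append(tag)
        else:
            if any(t.startswith("VB") for t in run):
                lengths.append(len(run))
            run = []
    return lengths

def average_verb_chunk_length(tagged_tokens):
    """Mean verb chunk length, or 0.0 when no verb chunk is found."""
    lengths = verb_chunk_lengths(tagged_tokens)
    return sum(lengths) / len(lengths) if lengths else 0.0

long_chunk = [("They", "PRP"), ("are", "VBP"), ("allowed", "VBN"),
              ("to", "TO"), ("proceed", "VB")]
short_chunk = [("They", "PRP"), ("proceed", "VBP")]
```

Under this definition `long_chunk` has one chunk of length 4 and `short_chunk` has one chunk of length 1, mirroring the two bracketed examples.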

• Prior work found first and last words very helpful in predicting sense
  ◦ Wellner et al., 2006
  ◦ Often explicit connectives
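A minimal sketch of first and last word features, assuming whitespace tokenization and non-empty arguments. The feature names are hypothetical.

```python
def first_last_features(arg1, arg2):
    """First and last word of each argument, plus the pair of first words.

    Assumes both arguments contain at least one token.
    """
    tokens1 = arg1.split()
    tokens2 = arg2.split()
    return {
        "arg1_first": tokens1[0],
        "arg1_last": tokens1[-1],
        "arg2_first": tokens2[0],
        "arg2_last": tokens2[-1],
        "first_pair": f"{tokens1[0]}_{tokens2[0]}",
    }

feats = first_last_features(
    "I am a little tired",
    "there is a 13 hour time difference",
)
```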

• Was preceding/following relation explicit?
  ◦ If so, which sense?
  ◦ If so, which connective?
• Does Arg1 begin a paragraph?

• Largest available annotated corpus of discourse relations
  ◦ Penn Treebank WSJ articles
  ◦ 16,224 implicit relations between adjacent sentences
• I am a little tired; [because] there is a 13 hour time difference.
  ◦ Contingency.cause.reason

Relation Sense    Proportion of implicits
Expansion         53%
Contingency       26%
Comparison        15%
Temporal           6%

• Developed features on sections 0-1
• Trained on sections 2-20
• Tested on sections
• Binary classification task for each sense
• Trained on equal numbers of positive and negative examples
• Tested on natural distribution
• Naïve Bayes classifier
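Training on equal numbers of positive and negative examples while scoring on the natural distribution can be sketched as below. The downsampling strategy, the `(features, label)` layout, and the function names are assumptions; the slides do not say how the balanced sample was drawn.

```python
import random

def balance(examples, positive_label, seed=0):
    """Downsample so positives and negatives are equal in number.

    examples: list of (features, label) tuples (an assumed layout).
    """
    rng = random.Random(seed)
    positives = [ex for ex in examples if ex[1] == positive_label]
    negatives = [ex for ex in examples if ex[1] != positive_label]
    n = min(len(positives), len(negatives))
    sample = rng.sample(positives, n) + rng.sample(negatives, n)
    rng.shuffle(sample)
    return sample

def f_score(gold, predicted):
    """F1 for the positive class, given parallel lists of booleans."""
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Only the training set is passed through `balance`; the test set keeps its natural class proportions, so the f-score reflects how rare each sense actually is.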


• Motivation in prior work
  ◦ Train on synthetic implicits
• What works better
  ◦ Train on actual implicits
• Synthetic examples can still help! (Comp., Cont.)
  ◦ With only best features selected from synthetic implicits

Features             f-score
First-Last, First
Context              19.32
Money/Percent/Num    19.04
Random

Polarity is actually the worst feature: 16.63

                                                Comparison    Not Comparison
Positive-Negative or Negative-Positive Pairs    30%           31%

Features             f-score
First-Last, First
Verbs                36.59
Context              29.55
Random

Features             f-score
Polarity Tags        71.29
Inquirer Tags        70.21
Context              67.77
Random

• Expansion is majority class
  ◦ precision more problematic than recall
• These features all help other senses

Features             f-score
First-Last, First
Verbs                12.61
Context              12.34
Random

Temporals often end with words like “Monday” or “yesterday”

• Comparison
  ◦ Selected word pairs
• Contingency
  ◦ Polarity, Verb, First/Last, Modality, Context, Selected word pairs

• Expansion
  ◦ Polarity, Inquirer Tags, Context
• Temporal
  ◦ First/Last + word pairs

Comparison (17.13), Contingency (31.10), Expansion (63.84), Temporal (16.21)

Comparison/Contingency baseline: synthetic implicits word pairs
Expansion/Temporal baseline: real implicits word pairs

• Results from classifying each relation independently
  ◦ Naïve Bayes, MaxEnt, AdaBoost
• Since context features were helpful, tried CRF
• 6-way classification, word pairs as features
  ◦ Naïve Bayes accuracy: 43.27%
  ◦ CRF accuracy: 44.58%

• Focus on implicit discourse relations
  ◦ in a realistic distribution
• Better understanding of word pairs
  ◦ Showed that they do not capture semantic oppositions
• Empirical validation of new and old features
  ◦ Polarity, verb classes, context, and some lexical features indicate discourse relations