Playing the Telephone Game: Determining the Hierarchical Structure of Perspective and Speech Expressions Eric Breck and Claire Cardie Department of Computer.


1 Playing the Telephone Game: Determining the Hierarchical Structure of Perspective and Speech Expressions Eric Breck and Claire Cardie Department of Computer Science Cornell University COLING 04

2 Abstract News articles report on facts, events, and opinions that are often known only second- or third-hand. As any child who has played “telephone” knows, this relaying of facts often garbles the original message. Properly understanding the information-filtering structures that govern these facts is critical to analyzing them. We present a learning approach that determines the hierarchical structure of these information-filtering expressions.

3 1 Introduction This paper introduces two kinds of expression that can filter information. A perspective expression is the word that denotes the presence of an explicit opinion, emotion, speculation, belief, sentiment, etc.  The source of a perspective expression is the person or entity whose opinion or emotion is being conveyed in the context. A speech expression is the word that conveys the words of another individual.

4 Example: (Perspective expression, speech expression, source) 1. Charlie was angry at Alice’s claim that Bob was unhappy. 2. Philip Clapp, president of the National Environment Trust, sums up well the general thrust of the reaction of environmental movements: “There is no reason at all to believe that the polluters are suddenly going to become reasonable.”

5 Given sentences 1 and 2 and their pse’s (perspective and speech expressions), we will present methods that produce the structure shown in Figure 1.

6 2 Related Work Gerard (2000) proposes a computational model of the reader of a news article. Bethard et al. (2004) seek to extract propositional opinions and their holders. Gildea and Jurafsky (2002) work on semantic role identification. Wiebe et al. (2003) present preliminary results for the automatic identification of pse’s using corpus-based techniques.

7 3 The Approach We use the NRRC corpus, which is annotated with pse’s and their hierarchical pse structure.  The corpus consists of 535 newswire documents.  We use only those sentences that contain at least two non-writer pse’s.

8 Training instances for the binary classifier are pairs of pse’s from the same sentence, (pse parent, pse target).  We assign a class value of 1 to a training instance if pse parent is the immediate parent of pse target in the manually annotated hierarchical structure for the sentence, and 0 otherwise.

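9  For sentence 1, nine training instances are generated: (writer, angry), (angry, claim), (claim, unhappy) (class 1); (claim, angry), (unhappy, angry), (writer, claim), (unhappy, claim), (writer, unhappy), (angry, unhappy) (class 0).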
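The pair generation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `training_instances`, the `"writer"` placeholder for the writer's implicit pse, and the assumed hierarchy for sentence 1 (writer, then angry, then claim, then unhappy) are all assumptions made for the example.

```python
from itertools import permutations

WRITER = "writer"  # placeholder for the writer's implicit pse (assumption)


def training_instances(pses, parent_of):
    """Return ((pse_parent, pse_target), label) for each ordered pair of distinct pse's.

    The writer pse is skipped as a target because it has no parent.
    """
    instances = []
    for parent, target in permutations(pses, 2):
        if target == WRITER:
            continue
        # Class 1 only when `parent` is the immediate parent of `target`.
        label = 1 if parent_of.get(target) == parent else 0
        instances.append(((parent, target), label))
    return instances


# Sentence 1, with the hierarchy assumed to be writer -> angry -> claim -> unhappy.
pses = [WRITER, "angry", "claim", "unhappy"]
parent_of = {"angry": WRITER, "claim": "angry", "unhappy": "claim"}
instances = training_instances(pses, parent_of)
positives = [pair for pair, label in instances if label == 1]
```

Under these assumptions, three targets with three candidate parents each yield nine instances, three of which are class 1.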

10 During testing, we construct the hierarchical pse structure of an entire sentence as follows.  For each pse in the sentence, ask the binary classifier to judge each other pse as a potential parent, and choose the pse with the highest confidence.  Finally, join these immediate-parent links to form a tree.
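A minimal sketch of this test-time procedure, assuming the classifier is exposed as a function `confidence(pse_parent, pse_target)` that returns a score. The function name, the `"writer"` placeholder, and the toy scores are hypothetical; the slides use decision trees trained with IND, which are not reproduced here.

```python
def build_hierarchy(pses, confidence, writer="writer"):
    """For each non-writer pse, choose the candidate parent with the highest confidence."""
    parent_of = {}
    for target in pses:
        if target == writer:
            continue  # the writer's implicit pse is the root and gets no parent
        candidates = [p for p in pses if p != target]
        parent_of[target] = max(candidates, key=lambda p: confidence(p, target))
    return parent_of


# Toy confidence scores standing in for a trained classifier (made up for illustration).
scores = {("writer", "angry"): 0.9, ("angry", "claim"): 0.8, ("claim", "unhappy"): 0.7}


def toy_confidence(parent, target):
    return scores.get((parent, target), 0.1)


tree = build_hierarchy(["writer", "angry", "claim", "unhappy"], toy_confidence)
```

Because each parent is chosen independently, this sketch does not check that the resulting links are acyclic; a fuller version would need to handle that case.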

11 3.1 Features Parse-based features  A pse parent -dominates-pse target feature.  A variant of the previous one that is 1 if the parent of pse parent dominates pse target.  Variants of the previous two features that apply only if the first dependency relation is an object relation.  A pse parent -dominates-pse target feature based on a partial parse.  A feature that is 1 when the parser failed.
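The first two dominance features can be illustrated with a small sketch. It assumes the dependency parse is available as a dictionary `head_of` mapping each word to its head (with `None` for the root); that representation, the function names, and the hand-made toy parse are assumptions, not output of the Collins parser.

```python
def dominates(head_of, ancestor, node):
    """True if `ancestor` is a proper ancestor of `node` in the dependency tree."""
    seen = set()
    current = head_of.get(node)
    while current is not None and current not in seen:
        if current == ancestor:
            return True
        seen.add(current)  # guard against malformed (cyclic) input
        current = head_of.get(current)
    return False


def parse_features(head_of, pse_parent, pse_target):
    """Two of the parse-based features for a (pse_parent, pse_target) pair."""
    head_of_parent = head_of.get(pse_parent)
    return {
        "parent_dominates_target": int(dominates(head_of, pse_parent, pse_target)),
        "head_of_parent_dominates_target": int(
            head_of_parent is not None and dominates(head_of, head_of_parent, pse_target)
        ),
    }


# Hand-made, simplified dependency heads for sentence 1 (illustrative only).
head_of = {"angry": None, "Charlie": "angry", "claim": "angry",
           "Alice": "claim", "unhappy": "claim", "Bob": "unhappy"}
features = parse_features(head_of, "claim", "unhappy")
```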

12 Positional features  A feature that is 1 if pse parent is the root of the parse (and similarly for pse target ).  A feature giving the ordinal position of pse parent among the pse’s in the sentence, relative to pse target (-1 means pse parent is the pse that immediately precedes pse target, 1 means immediately following).  A feature giving the total number of pse’s in the sentence.
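As a rough sketch, the positional features could be computed as below. The function name, the feature keys, and the choice of "angry" as the parse root are hypothetical; the list is assumed to hold only the pse's that actually appear in the sentence, in surface order.

```python
def positional_features(pses_in_order, pse_parent, pse_target, root=None):
    """Positional features for a (pse_parent, pse_target) pair.

    `pses_in_order` lists the sentence's pse's in surface order;
    `root` is the root word of the parse, if one is available.
    """
    offset = pses_in_order.index(pse_parent) - pses_in_order.index(pse_target)
    return {
        "parent_is_root": int(root is not None and pse_parent == root),
        "target_is_root": int(root is not None and pse_target == root),
        # -1: pse_parent immediately precedes pse_target; 1: immediately follows.
        "relative_position": offset,
        "num_pses": len(pses_in_order),
    }


pses_in_order = ["angry", "claim", "unhappy"]
pos = positional_features(pses_in_order, "angry", "claim", root="angry")
```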

13 Special parents and lexical features  Features for three types of parents: the writer pse, and the lexical items “said” (the most common non-writer pse) and “according to”.  The part of speech of pse parent and pse target (reduced to noun, verb, adjective, adverb, or other).

14 Genre-specific features  Features for a few special forms that are not always parsed accurately. Examples are: 4. “Alice disagrees with me,” Bob argued. 5. Charlie, she noted, dislikes Chinese food. For sentence 4, features pse parent -pattern-1 and pse target -pattern-1 would be 1; for sentence 5, feature pse parent -pattern-2 would be 1.  We also add features that denote whether the pse in question falls between matching quote marks.  Finally, a simple feature indicates whether pse parent is the last word in the sentence.

15 3.2 Resources The corpus is distributed with annotations automatically generated by the GATE toolkit. For parsing we use the Collins parser. For partial parses, we employ CASS. We use a simple finite-state recognizer to identify (possibly nested) quoted phrases. For classifier construction, we use the IND package to train decision trees (we use the mml tree style).
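The quoted-phrase recognizer is not described in detail, so the following is only a guess at its simplest form: a depth counter over tokens, assuming distinct opening (“) and closing (”) quote tokens. The function name and tokenization are hypothetical.

```python
def inside_quotes(tokens, open_quote="“", close_quote="”"):
    """Return one flag per token: True if the token lies inside a quoted phrase.

    Nesting is tracked with a depth counter; the quote marks themselves are
    flagged False.
    """
    depth = 0
    flags = []
    for token in tokens:
        if token == open_quote:
            depth += 1
            flags.append(False)
        elif token == close_quote:
            depth = max(depth - 1, 0)  # ignore stray closing quotes
            flags.append(False)
        else:
            flags.append(depth > 0)
    return flags


# Sentence 4 from the genre-specific examples, tokenized by hand.
tokens = ["“", "Alice", "disagrees", "with", "me", ",", "”", "Bob", "argued", "."]
flags = dict(zip(tokens, inside_quotes(tokens)))
```

This sketch treats everything after an unclosed opening quote as quoted; checking that quote marks actually match would need a second pass.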

16 4 Evaluation Three metrics  Lin  perf The fraction of sentences whose structure is determined entirely correctly.  Bin The accuracy of the binary classifier on pse pairs from the test corpus.
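Two of these metrics, perf and Bin, can be sketched directly from their definitions; Lin is left out because its definition is not given here. The function names and the representation of a sentence's structure as a child-to-parent dictionary are assumptions.

```python
def perf(predicted_structures, gold_structures):
    """Fraction of sentences whose predicted structure matches the gold one exactly.

    Each structure is a dict mapping a pse to its parent pse.
    """
    exact = sum(1 for p, g in zip(predicted_structures, gold_structures) if p == g)
    return exact / len(gold_structures)


def bin_accuracy(predicted_labels, gold_labels):
    """Accuracy of the binary classifier over (pse_parent, pse_target) pairs."""
    correct = sum(1 for p, g in zip(predicted_labels, gold_labels) if p == g)
    return correct / len(gold_labels)


gold = [{"angry": "writer", "claim": "angry"}, {"said": "writer", "noted": "said"}]
pred = [{"angry": "writer", "claim": "angry"}, {"said": "writer", "noted": "writer"}]
perf_score = perf(pred, gold)
bin_score = bin_accuracy([1, 0, 0, 1], [1, 0, 1, 1])
```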

17 Comparison with two heuristic-based approaches  heurOne All pse’s are attached to the writer’s implicit pse.  heurTwo Each pse is attached to the pse most immediately dominating it in the dependency tree. We use 10-fold cross-validation (the heuristics do not require training).
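Both baselines are simple enough to sketch. The code assumes the dependency tree is a dictionary `head_of` from each word to its head, and it assumes that heurTwo falls back to the writer's implicit pse when no dominating pse exists; that fallback, the function names, and the toy parse are not stated on the slide.

```python
def heur_one(pses, writer="writer"):
    """heurOne: attach every pse to the writer's implicit pse."""
    return {pse: writer for pse in pses}


def heur_two(pses, head_of, writer="writer"):
    """heurTwo: attach each pse to the nearest pse above it in the dependency tree."""
    pse_set = set(pses)
    parent_of = {}
    for pse in pses:
        seen = set()
        current = head_of.get(pse)
        # Walk up through heads until another pse (or the top of the tree) is reached.
        while current is not None and current not in pse_set and current not in seen:
            seen.add(current)
            current = head_of.get(current)
        # Assumed fallback: no dominating pse means the writer is the parent.
        parent_of[pse] = current if current in pse_set else writer
    return parent_of


# Hand-made dependency heads for sentence 1 (illustrative only).
head_of = {"was": None, "angry": "was", "at": "angry", "claim": "at", "unhappy": "claim"}
pses = ["angry", "claim", "unhappy"]
baseline_one = heur_one(pses)
baseline_two = heur_two(pses, head_of)
```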

18 Table 3: “Size” is the number of sentences or pse pairs

19 5 Discussion Our system performs poorly on sentences with more pse’s. This reflects a weakness in our decision to combine binary decisions, because the model has learned that a “said” or writer’s pse is likely to be the parent. Speech expressions behave differently from perspective expressions with respect to how closely syntax reflects their hierarchical structure.

20 6 Conclusions and Future Work We have shown that identifying the hierarchical structure of pse’s is amenable to automated analysis via a machine learning approach, although there is room for improvement in the results. In the future, we plan to address the related tasks discussed in section 2. We are also interested in ways of improving the machine learning formulation of the current task.

