Computational Approaches to Reference. Massimo Poesio (University of Essex). Lecture 4: Centering Theory. LSA, July 2003.


1 July 2003 LSA 1 Computational Approaches to Reference Massimo Poesio (University of Essex) Lecture 4: Centering Theory

2 July 2003 LSA 2 SALIENCE: Theoretical Models and Empirical Evidence Massimo Poesio (University of Essex) Rosemary Stevenson (University of Durham) Lecture 4: Centering Theory

3 July 2003 LSA 3 Today’s lecture
A formalization of the notion of ‘in focus’: Centering
Evidence for centering: behavioral; corpora
Centering-based anaphora resolution

4 July 2003 LSA 4 Theories of salience & focusing
Fixed number of foci: Sidner’s theory; Centering
Unbounded, but no activation: Strube’s S-List (1998); Henschel, Cheng & Poesio
Activation-based: Kantor; Alshawi / Lappin & Leass; Hajičová

5 July 2003 LSA 5 The Grosz and Sidner theory of discourse
Central idea: COHERENCE and SALIENCE go hand in hand
Reintroduces from Grosz (1977) the idea of a separation between ‘local’ and ‘global’ aspects of coherence and salience (the LOCAL FOCUS and the GLOBAL FOCUS)
Two separate theories, one for each component:
  Global focus: Grosz and Sidner, 1986
  Local focus: Centering theory (Grosz, Joshi and Weinstein, 1983, 1995)
Massimo Poesio: The Gordon survey makes it quite clear that the ‘local focus’ plays here a role similar to that of Short Term Memory (STM) in the Kintsch and van Dijk model, but interestingly, Gordon himself seems to assume that it’s the stack that corresponds to the STM! Perverse??? Maybe we should discuss Guindon’s model in the chapter on Centering, as an alternative to the idea of the local focus as a CF list? (Just as Walker’s cache model would be an alternative to the stack model?)

6 July 2003 LSA 6 The Global Focus
At this level of discourse organization, coherence has to do with INTENTIONAL STRUCTURE, i.e., a discourse is perceived as GLOBALLY COHERENT if the intentions expressed by its constituents are related
ATTENTIONAL STRUCTURE is about situations rather than events: GLOBAL ATTENTION is on FOCUS SPACES, subsets of the global knowledge base
Three levels of discourse structure:
  LINGUISTIC STRUCTURE (cfr. Van Dijk and Kintsch’s ‘linguistic structure’)
  INTENTIONAL STRUCTURE: the intentions associated with segments together with their relations, DOMINANCE and SATISFACTION-PRECEDES
  ATTENTIONAL STRUCTURE: a stack of FOCUS SPACES, each associated with an intention, and whose position reflects the relations among intentions
Massimo Poesio: Focus stack not necessarily situation-based

7 July 2003 LSA 7 Example: the Grosz 1977 ‘tent’ story
(1) P1: I’m going camping next week-end. Do you have a two-person tent I could borrow?
(2) P2: Sure. I have a two-person backpacking tent.
(3) P1: The last trip I was on there was a huge storm.
(4) It poured for two hours.
(5) I had a tent, but got soaked anyway.
(6) P2: What kind of tent was it?
(7) P1: A tube tent.
(8) P2: Tube tents don’t stand well in a real storm.
(9) P1: True.

8 July 2003 LSA 8 Example: the Grosz 1977 ‘tent’ story
(1) P1: I’m going camping next week-end. Do you have a two-person tent I could borrow?
(2) P2: Sure. I have a two-person backpacking tent.
(10) P2: Where are you going on this trip?
(11) P1: Up in the Minarets.
(12) P2: Do you need any other equipment?
(13) P1: No.
(14) P2: OK. I’ll bring the tent in tomorrow.

9 July 2003 LSA 9 Intentional structure
DSP1: P1 intends to get tent from P2
DSP2: P1 explains why P1 needs tent
DSP1 DOMINATES DSP2
Massimo Poesio: Note how the idea that discourse structure is determined by intentions is different from ideas like those of Kintsch and van Dijk and, more generally, of ‘situational models’ or ‘event structure’ (cfr. Gordon survey)

10 July 2003 LSA 10 Intentional structure
DSP1: P1 intends to get tent from P2

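11 July 2003 LSA 11 The Focus Space Stack
[Figure: a stack of two focus spaces; DSP1 DOMINATES DSP2]
Focus space for DSP1 (P1 intends to get tent from P2): referents X1, S1; conditions tent(X1), S1:of(P2,X1)
Focus space for DSP2 (P1 explains why P1 needs tent): referents X2, E1, S2; conditions tube-tent(X2), S2:of(P1,X2), E1:washed-up(X2)

12 July 2003 LSA 12 The Focus Space Stack
[Figure: a stack containing a single focus space]
Focus space for DSP1 (P1 intends to get tent from P2): referents X1, S1, E3, X3, X4, E4; conditions tent(X1), S1:of(P2,X1), Minarets(X3), E3:go(P1,X3), tent(X4), X4=?, E4:bring(P2,X4)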
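
The two focus-stack slides can be read as a push followed by a pop. The following is only a minimal sketch of that reading, not Grosz and Sidner's own formalization: the class names `FocusSpace` and `FocusStack` and the `accessible` method are hypothetical, and each focus space is reduced to a set of discourse referents.

```python
from dataclasses import dataclass, field


@dataclass
class FocusSpace:
    """Referents that are salient while one discourse segment purpose (DSP) is pursued."""
    dsp: str
    entities: set = field(default_factory=set)


class FocusStack:
    """Hypothetical stack of focus spaces; the most recent space is on top."""

    def __init__(self):
        self._spaces = []

    def push(self, space):
        self._spaces.append(space)

    def pop(self):
        return self._spaces.pop()

    def top(self):
        return self._spaces[-1]

    def accessible(self):
        """Referents in all open focus spaces, topmost space first."""
        return [e for s in reversed(self._spaces) for e in sorted(s.entities)]


stack = FocusStack()
stack.push(FocusSpace("DSP1: P1 intends to get tent from P2", {"X1"}))
stack.push(FocusSpace("DSP2: P1 explains why P1 needs tent", {"X2"}))
during_storm_story = stack.accessible()   # the tube tent X2 is on top

stack.pop()                               # the DSP2 segment is closed
stack.top().entities.update({"X3", "X4"})  # the Minarets and the tent to bring
after_storm_story = stack.accessible()    # X2 is no longer accessible
```

Under this reading, the tube tent X2 stops being a candidate antecedent once its focus space is popped, which is the kind of inaccessibility the evidence slides discuss.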

13 July 2003 LSA 13 Other formalizations of the global focus Reichman’s ‘context space model’ (1981, 1985) Context spaces very similar to focus spaces, but with levels of activation Richer repertoire of relations Walker’s cache model (1996, 1998) Replace stack with cache

14 July 2003 LSA 14 Some evidence
Clearest evidence for the distinction between global focus and local focus: the Clark and Sengul experiments discussed in Lecture 1
Evidence that discourses have a ‘global organization’ and that discourse segments (and associated episodes) become inaccessible:
  Experiments by Anderson et al 1983 suggesting that ‘temporally closed’ situations become inaccessible
  Lesgold, Roth, and Curtis, 1979
  Vonk, Hustinx and Simons 1992
  Corpus work: Grosz’s own work; Chafe 1979’s analysis of the ‘pear stories’
Evidence relevant to the claim that attentional state is a stack: O’Brien, 1987
But there is also evidence that antecedents which are ‘too far’ are no longer accessible (Walker, 1998; O’Brien et al, 1997)
Massimo Poesio: Check Rosemary’s notes: the situation is a bit nuanced. Good discussion in Garnham, p. 88-91 (although some of the experiments he mentions do not seem terribly relevant). Even better discussion in Gordon survey. But is there any evidence supporting the idea of intentional structure as opposed to event structure? (All the evidence mentioned here is from event structure, as is old work by Garrod and Sanford.) Gordon mentions correlation with prosody and cue phrases; perhaps Vonk et al? Should also mention that there is a real question whether this global structure can be reliably identified. Perhaps even mention work with Barbara? And what about ideas from Ali etc that global discourse is entity-structured in certain genres?

15 July 2003 LSA 15 Lesgold, Roth, and Curtis, 1979 Massimo Poesio: (From Garnham, p. 90) Hold on. If stack, material in between in 6.54b and c shouldn’t be on stack anymore, so there SHOULDN’T be a difference in time!

16 July 2003 LSA 16 The local discourse level
Whereas the global focus theory of Grosz and Sidner (1986) is meant to characterize INTERSEGMENTAL coherence and salience, Centering is meant to characterize INTRASEGMENTAL coherence and salience
The first claim is that what matters most at this level is ENTITY COHERENCE: discourse segments in which successive utterances keep mentioning the same entities are perceived to be more coherent than discourse segments in which different entities are mentioned each time
A second important claim is that each utterance has a main CENTER, or CB, and that utterances whose CB is the same as that of the previous utterance are easier to process
A third claim is that the entities mentioned by an utterance (‘realized’) are RANKED (cfr. Sidner’s ordering of DFLs). This ranking determines the CB of subsequent utterances, and changes in ranking also make utterances more difficult to process.
Massimo Poesio: Cfr. Knott, Oberlander and Mellish: entity coherence as a global organizing principle?

17 July 2003 LSA 17 The local focus: Centering Centering is often presented as a development of Sidner, but in fact it is radically different in outlook and fairly different in its details as well Unlike Sidner’s theory, Centering (Joshi and Weinstein, 1979; Grosz, Joshi and Weinstein, 1983; Grosz, Joshi and Weinstein, 1995) is more of a `linguistic’ theory than a computational one: its primary aim is to develop a vocabulary for talking about local salience and coherence, rather than specific algorithms The precise specification of many of the central concepts (‘ranking’, ‘utterance’, ‘realization’) is left for further research – indeed, it has been claimed that these concepts may be instantiated in different ways in different languages (Walker et al, 1994)

18 July 2003 LSA 18 Ranking and local coherence
Grosz et al (1983, 1995): texts that do not have a clear ‘central entity’ feel less coherent
(1) a. John went to his favorite music store to buy a piano.
b. He had frequented the store for many years.
c. He was excited that he could finally buy a piano.
d. He arrived just as the store was closing for the day.
(2) a. John went to his favorite music store to buy a piano.
b. It was a store John had frequented for many years.
c. He was excited that he could finally buy a piano.
d. It was closing just as John arrived.

19 July 2003 LSA 19 Local salience and pronominalization
Grosz et al (1995): the CB is also the most salient entity. Texts in which other entities are pronominalized are less felicitous
(1) a. Something must be wrong with John.
b. He has been acting quite odd.
c. He called up Mike yesterday.
d. John wanted to meet him quite urgently.
(2) a. Something must be wrong with John.
b. He has been acting quite odd.
c. He called up Mike yesterday.
d. He wanted to meet him quite urgently.

20 July 2003 LSA 20 Uniqueness of the center
Grosz et al (1995) argue against Sidner that utterances have a single CB.
(1) a. Susan gave Betsy a pet hamster.
b. She reminded her that such hamsters were quite shy.
c. She asked Betsy whether she liked the gift.
d. Betsy told her that she really liked the gift.
e. Susan asked her whether she liked the gift.
f. She told Susan that she really liked the gift.
Massimo Poesio: NB: the one bit that Sidner does not predict is a contrast between c. and e. In the cases d. and f., we have a pronoun in AGENT position referring to an entity in non-AGENT position, and vice versa, which could be claimed to result in processing difficulties. Sidner would also claim that all the pronouns in AGENT position are ambiguous (although it is not clear what she does with ambiguity). Note also that according to Strube, both entities would be equally ranked.

21 July 2003 LSA 21 Concepts and definitions
Every UTTERANCE U in a discourse (segment) DS updates the local attentional state, or local focus, which consists of a PARTIALLY RANKED set of discourse entities, or FORWARD-LOOKING CENTERS (CFs)
An utterance U in discourse segment DS updates the existing set of forward-looking centers by replacing it with the set of CFs REALIZED in U, CF(U,DS) (usually simplified to CF(U))
The most highly ranked CF realized in utterance U is CP(U)
(1) u1. Susan gave James a pet hamster. CF(u1) = [Susan, James, pet hamster]. CP(u1) = Susan
(2) u2. She gave Peter a nice scarf. CF(u2) = [Susan, Peter, nice scarf]. CP(u2) = Susan
Massimo Poesio: Add examples of utterances and CFs!

22 July 2003 LSA 22 The CB: Examples
(1) u1. Susan gave James a pet hamster. CF(u1) = [Susan, James, pet hamster]. CB = undefined. CP = Susan
(2) u2. She gave Peter a nice scarf. CF(u2) = [Susan, Peter, nice scarf]. CB = Susan. CP = Susan
NB: The CB is not always the most highly ranked entity of the PREVIOUS utterance
(2’) u2. He loves hamsters. CF(u2) = [James]. CB = James. CP = James
… or the most highly ranked entity of the CURRENT one
(2’’) u2. Peter gave her a nice scarf. CF(u2) = [Peter, Susan, nice scarf]. CB = Susan. CP = Peter

23 July 2003 LSA 23 Transitions
Grosz et al proposed that the load involved in processing an utterance depends on whether that utterance preserves the CB of the previous utterance or not, and on whether CB(U) is also CP(U). They introduce the following classification:
CENTER CONTINUATION: U_i is a continuation if CB(U_i) = CB(U_{i-1}), and CB(U_i) = CP(U_i)
CENTER RETAIN: U_i is a retain if CB(U_i) = CB(U_{i-1}), but CB(U_i) ≠ CP(U_i)
CENTER SHIFT: U_i is a shift if CB(U_i) ≠ CB(U_{i-1})
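
The three-way classification can be written as a small function. This is a sketch under one stated assumption: when CB(U_{i-1}) is undefined (represented here as `None`), the utterance is treated as if it preserved the CB, in line with the ESTABLISH / CONTINUE label used later in the deck for that case. The function name `classify_transition` is hypothetical.

```python
def classify_transition(cb, cp, prev_cb):
    """Classify U_i given CB(U_i), CP(U_i) and CB(U_{i-1}).

    Assumption of this sketch: an undefined previous CB (None) does not
    count as a shift.
    """
    if prev_cb is not None and cb != prev_cb:
        return "SHIFT"
    return "CONTINUE" if cb == cp else "RETAIN"


# Utterances following "She gave James a pet hamster." (CB = Susan)
t_scarf = classify_transition(cb="Susan", cp="Susan", prev_cb="Susan")  # She gave Peter a nice scarf.
t_loves = classify_transition(cb="James", cp="James", prev_cb="Susan")  # He loves hamsters.
t_peter = classify_transition(cb="Susan", cp="Peter", prev_cb="Susan")  # Peter gave her a nice scarf.
```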

24 July 2003 LSA 24 Utterance classification
(0) u0. Susan is a generous person. CF(u0) = [Susan]. CB = undefined. CP = Susan.
(1) u1. She gave James a pet hamster. CF(u1) = [Susan, James, pet hamster]. CB = Susan. CP = Susan.
(2) u2. She gave Peter a nice scarf. CF(u2) = [Susan, Peter, nice scarf]. CB = Susan. CP = Susan. CONTINUE
(2’) u2. He loves hamsters. CF(u2) = [James]. CB = James. CP = James. SHIFT
(2’’) u2. Peter gave her a nice scarf. CF(u2) = [Peter, Susan, nice scarf]. CB = Susan. CP = Peter. RETAIN
Massimo Poesio: Note that you need to establish the CB first (see Walker et al 1994, Kameyama 1998, etc.)

25 July 2003 LSA 25 Main claims CONSTRAINT 1: All utterances of a segment except for the first have exactly one CB RULE 1: if any CF is pronominalized, the CB is. RULE 2: (Sequences of) continuations are preferred over (sequences of) retains, which are preferred over (sequences of) shifts.

26 July 2003 LSA 26 Violations of the claims
A violation of Rule 1:
(1) a. Something must be wrong with John. b. He has been acting quite odd. CB=John c. He called up Mike yesterday. CB=John d. John wanted to meet him quite urgently. CB=John
Violations of Constraint 1:
(1) a. Something must be wrong with John. b. He has been acting quite odd. c. He called up Mike yesterday. d. It must have been four o’clock in the morning. CB=undef
(1) a. Something must be wrong with John. b. He has been acting quite odd. c. He and Susan had a fight yesterday. d. He didn’t want her to go to the party. CB=John, CB=Susan
Massimo Poesio: Emphasize what the claims say: these are preferences that make a text easier or harder to read!
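
The three violation examples can be checked mechanically once the CB(s) and the form of each realized entity are given. The sketch below assumes those annotations as input; the function names and the `"pronoun"` / `"name"` labels are hypothetical, and treating Rule 1 as vacuous when there is no CB is an assumption of the sketch rather than something the slides state.

```python
def violates_constraint1(cbs, is_first):
    """Constraint 1: every utterance except the first has exactly one CB."""
    return (not is_first) and len(cbs) != 1


def violates_rule1(cb, forms):
    """Rule 1: if any CF is pronominalized, the CB is.

    forms maps each realized entity to "pronoun" or "name".
    Assumption: with no CB (None) the rule is taken to be vacuous.
    """
    if cb is None:
        return False
    any_pronoun = "pronoun" in forms.values()
    return any_pronoun and forms.get(cb) != "pronoun"


# d. "John wanted to meet him quite urgently."  CB = John, only Mike is a pronoun
rule1_name = violates_rule1("John", {"John": "name", "Mike": "pronoun"})
# d. "He wanted to meet him quite urgently."  CB = John, both are pronouns
rule1_pro = violates_rule1("John", {"John": "pronoun", "Mike": "pronoun"})
# d. "It must have been four o'clock in the morning."  no CB
c1_no_cb = violates_constraint1([], is_first=False)
# d. "He didn't want her to go to the party."  two CBs on the slide's analysis
c1_two_cbs = violates_constraint1(["John", "Susan"], is_first=False)
```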

27 July 2003 LSA 27 The parameters of the theory Grosz et al do not provide algorithms for computing any of the notions used in the basic definitions: UTTERANCE PREVIOUS UTTERANCE REALIZATION RANKING What counts as a ‘PRONOUN’ for the purposes of Rule 1? (Only personal pronouns? Or demonstrative pronouns as well? What about second person pronouns?) One of the reasons for the success of the theory is that it provides plenty of scope for theorizing …

28 July 2003 LSA 28 The CB
A second CF is singled out as the BACKWARD-LOOKING CENTER, or CB: Centering’s implementation of the notion of ‘topic’ or, better, ‘main character’ in the sense of Garrod and Sanford (1988)
Originally, the CB was only characterized in intuitive terms. Ever since Grosz, Joshi and Weinstein (1986, 1995), the CB has been DEFINED as follows:
CONSTRAINT 3: CB(U_i) is the highest-ranked element of CF(U_{i-1}) that is realized in U_i
Note however that other characterizations of the CB have been proposed, e.g., by Gordon et al (1993) and Passonneau (1993).
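
Constraint 3 translates almost directly into code if each CF list is stored in ranking order. The following sketch assumes that representation (a Python list, highest-ranked entity first) and treats realization as simple membership; the function names `cp` and `cb` are hypothetical.

```python
def cp(cf):
    """CP(U): the most highly ranked forward-looking center of U."""
    return cf[0] if cf else None


def cb(prev_cf, cur_cf):
    """Constraint 3: the highest-ranked element of CF(U_{i-1}) realized in U_i.

    Returns None when there is no previous utterance or no shared entity.
    """
    if not prev_cf:
        return None
    realized = set(cur_cf)
    return next((e for e in prev_cf if e in realized), None)


cf_u1 = ["Susan", "James", "pet hamster"]           # Susan gave James a pet hamster.
cb_u1 = cb(None, cf_u1)                              # first utterance: undefined
cb_scarf = cb(cf_u1, ["Susan", "Peter", "nice scarf"])  # She gave Peter a nice scarf.
cb_loves = cb(cf_u1, ["James"])                      # He loves hamsters.
cf_peter = ["Peter", "Susan", "nice scarf"]         # Peter gave her a nice scarf.
cb_peter, cp_peter = cb(cf_u1, cf_peter), cp(cf_peter)
```

The last case shows why the CB and the CP have to be kept apart: the CB is read off the previous utterance's ranking, the CP off the current one.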

29 July 2003 LSA 29 Utterance and Previous Utterance
Originally, utterances were implicitly identified with sentences. Later, however, Kameyama (1998) and others suggested identifying utterances with finite clauses.
If utterances are identified with sentences, the previous utterance is generally easy to identify (except for texts with titles, etc.)
But if utterances are identified with finite clauses, there are various ways of dealing with cases like:
(u1) John wanted to leave home (u2) before Bill came home. (u3) He would be drunk as usual.
KAMEYAMA: PREV(u3) = u2. SURI and MCCOY: PREV(u3) = u1

30 July 2003 LSA 30 Realization A basic question is whether entities can be ‘indirectly’ realized in utterances by an associate (as in Sidner’s algorithm) (u1) John walked towards the house. (u2) THE DOOR was open. A second question is whether first and second person entities are realized: (u1) Before you buy this medicine, (u2) you should contact your doctor. Realization greatly affects Constraint 1.

31 July 2003 LSA 31 Ranking
The most studied parameter (below, X < Y means that X is ranked above Y)
GRAMMATICAL FUNCTION (Kameyama 1986, Grosz Joshi and Weinstein 1986, Brennan et al 1987, Hudson et al 1986, Gordon et al 1993): SUBJ < OBJ < OTHERS
  A student was here to see John today: A STUDENT < JOHN
INFORMATION STATUS (Strube and Hahn, 1999): HEARER-OLD < MEDIATED < HEARER-NEW
  A student was here to see John today: JOHN < A STUDENT
THEMATIC ROLES (Cote, 1998)
FIRST MENTION / LINEAR ORDER (Rambow, 1993; Gordon et al, 1993)
  In Lisa’s opinion, John shouldn’t have done that
Also, the one parameter that is supposed to vary across languages
  E.g., ranking in Japanese and Turkish has been claimed to have additional functions (Walker et al 1994; Turan, 1995)
Massimo Poesio: A problem for Strube: how do you account for all that evidence about subject assignment in English?
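
The contrast between the two rankings of "A student was here to see John today" can be reproduced with two sort keys. This is an illustrative sketch: the mention dictionaries, their keys, and the labels assigned to each mention (John as OBJ and HEARER-OLD, the student as HEARER-NEW) are annotations assumed for the example, and ties are broken by linear order because Python's sort is stable.

```python
GF_ORDER = {"SUBJ": 0, "OBJ": 1, "OTHER": 2}
STATUS_ORDER = {"HEARER-OLD": 0, "MEDIATED": 1, "HEARER-NEW": 2}


def rank_by_gf(mentions):
    """CF list ranked by grammatical function (ties keep linear order)."""
    return [m["entity"] for m in sorted(mentions, key=lambda m: GF_ORDER[m["gf"]])]


def rank_by_status(mentions):
    """CF list ranked by information status (ties keep linear order)."""
    return [m["entity"] for m in sorted(mentions, key=lambda m: STATUS_ORDER[m["status"]])]


# "A student was here to see John today", mentions in linear order
mentions = [
    {"entity": "a student", "gf": "SUBJ", "status": "HEARER-NEW"},
    {"entity": "John", "gf": "OBJ", "status": "HEARER-OLD"},
]
cf_gf = rank_by_gf(mentions)          # subject first
cf_status = rank_by_status(mentions)  # hearer-old entity first
```

Because the CB of the next utterance is read off this list, the choice of sort key changes which entity is predicted to be the CB.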

32 July 2003 LSA 32 Variants of the claims
Different definitions of CB: Grosz et al 1983, Gordon et al 1993, Passonneau 1993
Different versions of Rule 1: Greene McKoon and Ratcliff 1992, Gordon et al 1993
Different definitions of Rule 2: Brennan et al, 1987; Strube and Hahn, 1999; Kibble, 2001
Massimo Poesio: Skip for the moment. Eventually should also add discussion of Greene McKoon and Ratcliff, who seem to have proposed a strong version of Rule 1 themselves

33 July 2003 LSA 33 Empirical evaluations of Centering theory
Some of the evidence about the general architecture of focusing discussed in connection with Sidner’s theory is relevant for Centering as well
  In particular, evidence concerning interaction with commonsense knowledge, and concerning serial vs. parallel
Constraint 1: supported by work such as Ehrlich & Johnson-Laird (1982)
Rule 1: Quite a lot of psychological results, whose connection with Centering is, however, not always so direct:
  Hudson, Tanenhaus, and Dell, 1986
  Gordon et al, 1993 and subsequent papers (Gordon and Chan, 1995; Gordon and Scearce, 1995; Gordon et al, 1999)
  Brennan, 1995
Rule 2: no evidence for these preferences (e.g., Gordon et al 1993, Gordon and Scearce 1995)
Several algorithms based on centering theory have been proposed (Brennan et al, 1987; Strube and Hahn, 1999; Tetreault, 2001) and evaluated using annotated corpora
Poesio et al, 2000, 2002: evaluate the claims of the theory by trying various possible parameter settings and finding the one which minimizes the violations of the claims
Massimo Poesio: On Constraint 1: reference to Ehrlich and Johnson-Laird in Gordon et al 1993. Gordon survey also mentions Kintsch, Kozminsky, Streby, McKoon and Keenan 1975, and Manelis and Yekovich 1976 (p. 34). Gordon also sees the 1993 experiments as relevant for Constraint 1. For Strube / Hahn ranking: Sanford, Moar and Garrod, and our naming paper

34 July 2003 LSA 34 Ranking: Hudson, Tanenhaus, and Dell
Hudson, Tanenhaus, and Dell (1986) ran reading time experiments using materials such as the following, in which ‘Jack’ is made more highly ranked in a. both by being the subject and by the choice of an NP1 verb
a. Jack apologised profusely to Josh.
b1. He had been rude to Josh yesterday.
b2. Jack had been rude to Josh yesterday.
b3. He had been offended by Jack’s comment.
b4. Josh had been offended by Jack’s comment.
Results: RT(b1) << RT(b2) = RT(b4) << RT(b3); b3 is the version that violates Rule 1
Massimo Poesio: Note that the first entity is most highly ranked both by subject and by implicit causality verbs

35 July 2003 LSA 35 Some comments
Notice that while Centering predicts problems with b3, it does not predict the faster reading times for b1 (see also discussion of Gordon et al’s experiments, next)
In fact, in order to claim that the slow reading time for b3 is consistent with Rule 1, we have to assume that when the subject encounters the pronoun she already knows what the CB is going to be
Furthermore, the materials do not really allow us to tell whether the effect here is due to a true focusing effect, or only to a subject assignment preference
Finally, there is no difference in this case between the predictions of a theory of ranking based on grammatical function and those of the theory proposed by Strube and Hahn (which uses linear order to break ties)
Massimo Poesio: Remark on the need for an incremental version of centering to evaluate these claims (at least two exist; see also Kehler, 1997)

36 July 2003 LSA 36 Ranking and pronominalization: Gordon, Grosz and Gilliom, 1993
A series of reading time studies that revealed a REPEATED NAME PENALTY (RNP): an increased reading time when a proper name is used instead of a pronoun
PRO-PRO:
(1) a. Bruno was the bully of the neighborhood.
b. He chased Tommy all the way home one day.
c. He watched him hide behind a big tree and start to cry.
d. He yelled at him so loudly that all the neighbors came outside.
PRO-NAME:
(1) a. Bruno was the bully of the neighborhood.
b. He chased Tommy all the way home one day.
c. He watched Tommy hide behind a big tree and start to cry.
d. He yelled at Tommy so loudly that all the neighbors came outside.

37 July 2003 LSA 37 Ranking and pronominalization: Gordon, Grosz and Gilliom, 1993
NAME-PRO:
(1) a. Bruno was the bully of the neighborhood.
b. Bruno chased Tommy all the way home one day.
c. Bruno watched him hide behind a big tree and start to cry.
d. Bruno yelled at him so loudly that all the neighbors came outside.

38 July 2003 LSA 38 Repeated Name Penalty

39 July 2003 LSA 39 Ranking and pronominalization: entities subject to RNP
Gordon et al only observed an RNP for entities in SUBJECT position referring to either the FIRST MENTIONED entity or the SUBJECT of the previous utterance (Exp. 2 and 3)
Gordon et al: these results support both Constraint 1 and Rule 1
They suggest replacing the definition of CB with one based on the RNP
(1) a. Susan gave Fred a pet hamster.
b. In his / Fred’s opinion, she / Susan shouldn’t have done that.
c. She / Susan just assumed that anyone would love a hamster.
c’. He / Fred doesn’t have anywhere to put a cage.
(2) a. Lisa gave Fred a pet hamster.
b. In her / Lisa’s opinion, a hamster was the best present for him / Fred.
c. In his / Fred’s opinion, she / Lisa shouldn’t have done that.
A difference: (2b)
Massimo Poesio: Exp 2 for b, exp 3 for c. Add data supporting claim that first mention and subject have the same ranking?

40 July 2003 LSA 40 A few comments The RNP is a very interesting result, but a lot of people wonder whether it really makes sense to take it as a verification of Centering, at least in its `classical’ version Gordon et al didn’t find RNP for entities that would be CBs according to the original definition Even if we accept Gordon et al’s suggestion that we should modify the theory by dropping the definition in Constraint 3 and adopting the RNP as an operational test for the CB, we would still need to modify Rule 1 – which in its classical form does not REQUIRE the CB to be pronominalized.

41 July 2003 LSA 41 Brennan, 1995
Method: have subjects work in pairs: one, acting as ‘announcer’, narrates a basketball game seen on a videotape to the other, acting as ‘audience’.
The narration is taped and transcribed, and ‘reference events’ are marked and identified as ‘high-prominence’ or ‘low-prominence’. NPs are marked as pros, NP-subj, NP-obj, and a few other classes
Main results:
  NPs mostly introduced in subject position
  NPs introduced in object position not pronominalized until moved to subject position.
Massimo Poesio: Skip for the moment

42 July 2003 LSA 42 Other Gordon experiments
Gordon and Scearce, 1995: focusing generates hypotheses independently from commonsense knowledge
Gordon and Chan, 1995: ranking depends on subjecthood rather than agenthood
Gordon et al, 1999: ranking in complex NPs (coordinated NPs, possessive NPs) depends on structural factors rather than linear order
Massimo Poesio: Skip this for now. Check Garnham and survey paper by Gordon

43 July 2003 LSA 43 Corpus-based evaluation
Notions from Centering have been used in a number of studies, especially of the connection between status in the local focus and NP form
Passonneau (1993): comparison of uses of IT and THAT
  IT primarily used to refer to LOCAL CENTERs
  THAT to entities which are not local centers
Di Eugenio (1992, 1998): ‘weak’ vs. ‘strong’ pronouns in Italian
  ‘weak’ pronouns used to maintain CB
  ‘strong’ pronouns for shifting

44 July 2003 LSA 44 Poesio et al 2000, submitted: A corpus-based evaluation of Centering
Using the GNOME corpus to compare ‘parameter configurations’ using the number of violations of Constraint 1, Rule 1, and Rule 2 as metrics
Can be used on-line: http://cswww.essex.ac.uk/staff/poesio/cbc

45 July 2003 LSA 45 The trade-off between Constraint 1 and Rule 1: Utterance parameters

46 July 2003 LSA 46 Algorithms based on centering theory
Anaphora resolution:
  Brennan, Friedman and Pollard, 1987 (BFP)
  ‘Basic algorithm’, Strube and Hahn 1999
  Incremental algorithms: Strube, 1998; Tetreault, 1999, 2001
Generation:
  Text planning: Kibble and Power, 2000; Karamanis, 2001, 2002
  NP realization: Henschel, Cheng, and Poesio, 2000
A number of evaluation studies:
  Walker, 1989 (BFP vs. Hobbs)
  Strube and Hahn, 1999 (SH vs. BFP)
  Strube, 1998 (S-LIST vs. BFP)
  Tetreault, 2001 (History List vs. Hobbs vs. BFP vs. S-LIST vs. LRC)
Massimo Poesio: Have to add here discussion of Tetreault

47 July 2003 LSA 47 Brennan et al (1987)
The first and, arguably, still best-known algorithm for pronoun resolution based on Centering was proposed by Brennan, Friedman and Pollard (1987)
Parameter configuration:
  Utterances: sentences
  Ranking: grammatical function
  Realization: direct??
Along the way, Brennan et al also developed the formalization of Centering which is best known
  E.g., the terminology of ‘Constraints’ and ‘Rules’, or the division of ‘Shifts’ into ‘Smooth Shifts’ and ‘Rough Shifts’

48 July 2003 LSA 48 The algorithm
1. GENERATE possible Cb-Cf combinations (or anchors)
2. FILTER these anchors by constraints:
  a. Binding theory
  b. Sortal predicates
  c. Centering rules and constraints
3. RANK the remaining anchors according to transition preferences: CONTINUE < RETAIN < SMOOTH-SHIFT < ROUGH-SHIFT
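
The generate / filter / rank steps can be sketched for the simple case in which every referring expression in the utterance is a pronoun. This is a toy illustration, not Brennan et al's implementation: the function names and the gender table are hypothetical, agreement stands in for sortal filtering and is applied while generating, binding is reduced to "co-arguments must be distinct", and the Centering filter is Constraint 3 (Rule 1 is trivially satisfied when all CFs are pronouns). An undefined previous CB is assumed not to count as a shift.

```python
from itertools import product

PREFERENCE = ["CONTINUE", "RETAIN", "SMOOTH-SHIFT", "ROUGH-SHIFT"]


def transition(cb, cp, prev_cb):
    """Four-way transition; a shift is smooth when the new CB is also the CP."""
    if prev_cb is None or cb == prev_cb:
        return "CONTINUE" if cb == cp else "RETAIN"
    return "SMOOTH-SHIFT" if cb == cp else "ROUGH-SHIFT"


def resolve(pronoun_genders, prev_cf, prev_cb, gender):
    """pronoun_genders: genders of the pronouns of U_n, in grammatical-function order."""
    # 1. GENERATE: candidate CF lists (agreement applied here), paired with candidate CBs
    options = [[e for e in prev_cf if gender.get(e) == g] for g in pronoun_genders]
    cf_lists = [list(c) for c in product(*options)]
    anchors = [(cb, cf) for cf in cf_lists for cb in prev_cf + [None]]

    # 2. FILTER: crude binding check, then Constraint 3
    def keep(cb, cf):
        if len(set(cf)) < len(cf):  # co-arguments must not corefer
            return False
        realized = [e for e in prev_cf if e in cf]
        return cb == (realized[0] if realized else None)

    anchors = [(cb, cf) for cb, cf in anchors if keep(cb, cf)]

    # 3. RANK by transition preference (stable sort keeps generation order for ties)
    anchors.sort(key=lambda a: PREFERENCE.index(transition(a[0], a[1][0], prev_cb)))
    return [(cb, cf, transition(cb, cf[0], prev_cb)) for cb, cf in anchors]


# u4 "He called him at 6AM." after u3 "He wanted Tony to join him on a sailing expedition."
gender = {"Terry": "m", "Tony": "m", "a sailing expedition": "n"}
ranked = resolve(["m", "m"], ["Terry", "Tony", "a sailing expedition"], "Terry", gender)
```

On this input only two anchors survive filtering, a CONTINUE reading (he = Terry, him = Tony) and a RETAIN reading (he = Tony, him = Terry), and the CONTINUE reading is ranked first.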

49 July 2003 LSA 49 An example
(1) u1. Terry really goofs sometimes.
(2) u2. Yesterday was a beautiful day and he was excited about trying out his new sailboat.
(3) u3. He wanted Tony to join him on a sailing expedition.
(4) u4. He called him at 6AM.
(5) u5. He was sick and furious at being woken up so early.

50 July 2003 LSA 50 Analysis of the example (1) u1. Terry really goofs sometimes. CB = NIL CF = [Terry] (2) u2. Yesterday was a beautiful day and he was excited about trying out his new sailboat. Referring expressions = [yesterday, A1, A2, the sailboat] Possible CF lists: [yesterday, Terry, Terry, the sailboat] Anchors = a1. a2. Filter out a2. CB = Terry CF = [yesterday, Terry, Terry, the sailboat] transition = ESTABLISH / CONTINUE

51 July 2003 LSA 51 Example, cont’d
(3) u3. He wanted Tony to join him on a sailing expedition. CB = Terry CF = [Terry, Tony, Terry, a sailing expedition]
(4) u4. He called him at 6AM. Referring expressions = [A3, A4] Possible CF lists: [Terry, Terry] [Terry, Tony] [Tony, Terry] [Tony, Tony] Anchors = a1. a2. a3. ….. Filter out all anchors except for (CONTINUE) and (RETAIN) CB = Terry CF = [Terry, Tony] transition = CONTINUE

52 July 2003 LSA 52 Example, end (5) u5. He was sick and furious at being woken up so early. Referring expressions: [A1] Possible CF lists: [Terry], [Tony] Possible anchors: (CONTINUE) (SMOOTH-SHIFT) CB = Terry CF = [Terry]

53 July 2003 LSA 53 A simple example (1) u1. Carl works at HP on the Natural Language Project. CB: nil CF = [Carl, HP, Natural Language Project] (2) u2. He manages Lyn. CB: Carl CF = [Carl, Lyn] Transition type: ESTABLISH? Referring expressions = [A1, Lyn] Possible CF lists: [Carl, Lyn] Anchors = a1. a2. a3. Filter out a2 and a3.

54 July 2003 LSA 54 A more complex example (1) u1. Susan gave Betsy a pet hamster. CB = NIL CF = [Susan, Betsy] (2) u2. She reminded her that such hamsters were quite shy. Referring expressions = [A1, A2, hamsters] Possible CF lists: [Susan, Betsy, hamsters], [Betsy, Susan, hamsters] Anchors = a1. a2. a3. a4. a5. a6. Filter out a2, a3, a5, a6.

55 July 2003 LSA 55 Kehler, 1997
(1) u1. Terry gets really angry sometimes.
(2) u2. Yesterday was a beautiful day and he was excited about trying out his new sailboat.
(3) u3. He wanted Tony to join him on a sailing expedition, and left him a message on his answering machine.
(4) u4. Tony called him at 6AM the next morning. (RETAIN)
(5) u5. He was furious for being woken up so early. He = Terry: CONTINUE. He = Tony: SMOOTH-SHIFT.
(5’) u5. He was furious with him for being woken up so early. (CB = Tony) He = Tony, him = Terry: SMOOTH-SHIFT. He = Terry, him = Tony: ROUGH-SHIFT.
(5’’) u5. He was furious with Tony for being woken up so early. (CB = Tony) He = Terry: violation of Rule 1

56 July 2003 LSA 56 Tetreault 2001: LRC
1. Process left-to-right all references to discourse entities in utterance U_n. When a pronoun is encountered,
  a. Search for an antecedent intrasententially in the list of all processed CFs in U_n that meet feature and binding constraints.
  b. If none is found, search for an antecedent intersententially in CF(U_{n-1}) that satisfies agreement and binding constraints.
2. Create CF(U_n) by ranking its discourse entities according to grammatical function. (In the implementation, this ranking is approximated by a left-to-right, breadth-first walk of the parse tree.)
3. Compute CB(U_n).
4. Compute the transition.
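
Steps 1 and 2 of the procedure can be sketched as a single left-to-right pass. The sketch below is a simplification under stated assumptions rather than Tetreault's implementation: gender agreement is the only feature constraint, binding constraints are omitted, and CF(U_n) is approximated by order of mention instead of a walk of the parse tree. The mention dictionaries and the name `lrc_resolve` are hypothetical.

```python
def lrc_resolve(mentions, prev_cf, gender):
    """Resolve the pronouns of U_n left to right.

    mentions: dicts in order of mention, with keys "form", "pronoun" (bool),
              and either "gender" (pronouns) or "entity" (other NPs).
    Returns (resolved, cf): the (form, entity) pairs and the CF list of U_n.
    """
    processed = []  # CFs of U_n seen so far, in order of mention
    resolved = []
    for m in mentions:
        if m["pronoun"]:
            def fits(entity):
                return gender.get(entity) == m["gender"]
            # 1a. intrasentential search among the CFs already processed
            entity = next(filter(fits, processed), None)
            # 1b. otherwise intersentential search in CF(U_{n-1})
            if entity is None:
                entity = next(filter(fits, prev_cf), None)
        else:
            entity = m["entity"]
        resolved.append((m["form"], entity))
        if entity is not None and entity not in processed:
            processed.append(entity)
    return resolved, processed


# u2 "Yesterday was a beautiful day and he was excited about trying out his new sailboat."
gender = {"yesterday": "n", "Terry": "m", "sailboat": "n"}
u2 = [
    {"form": "Yesterday", "pronoun": False, "entity": "yesterday"},
    {"form": "he", "pronoun": True, "gender": "m"},
    {"form": "his", "pronoun": True, "gender": "m"},
    {"form": "new sailboat", "pronoun": False, "entity": "sailboat"},
]
resolved, cf_u2 = lrc_resolve(u2, ["Terry"], gender)
```

Here "he" finds no agreeing antecedent inside the utterance and falls back to CF(u1), while "his" is then resolved intrasententially to the entity that "he" just introduced.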

57 July 2003 LSA 57 Tetreault’s Evaluation of Pronoun Resolution Algorithms
Algorithm     Right   % right   % intra   % inter
BFP           1004    59.4      75.1      48.0
S-list        1211    71.7      74.1      67.5
LRC order     1266    74.7      72.0      81.6
LRC GF        1268    74.9      72.0      82.0
Hobbs         1298    76.8      74.2      82.0
LRC toback    1362    80.4      77.7      87.3
Massimo Poesio: These are the figures for the NYT

58 July 2003 LSA 58 Readings & Reference
Brennan, S., Friedman, M. and Pollard, C. 1987. A Centering approach to pronouns. Proc. of the 25th ACL, p. 155-162.
Gordon, P. C., Grosz, B. J., and Gilliom, L. A. 1993. Pronouns, names, and the centering of attention in discourse. Cognitive Science, 17(3), 311-347.
Grosz, B., Joshi, A., and Weinstein, S. 1995. Centering: a framework for modeling the local coherence of discourse. Computational Linguistics, 21(2). [required reading]
Grosz, B. and Sidner, C. 1986. Attention, Intention, and the Structure of Discourse. Computational Linguistics.
Kehler, A. 1997. Current theories of Centering for pronoun interpretation. Computational Linguistics, 23(3).
Tetreault, J. 2001. A corpus-based evaluation of Centering and anaphora resolution. Computational Linguistics, 27(4). [required reading]

