
1 Anaphora resolution in connectionist networks Florian Niefind University of Saarbrücken, Institute for Computational Linguistics Helmut Weldle University of Freiburg, Centre for Cognitive Science Workshop: „Representation and Processing of Language“ University of Freiburg, 20.11.2009

2 Connectionism and Language Connectionist approaches to language processing applied to various levels and processes (Christiansen & Chater, 1999, 2001) Sequence processing: Simple Recurrent Networks (SRNs: Elman, 1990, 1991, 1993) Linguistic representation in SRNs: associationist, probabilistic, distribution-sensitive; constraint satisfaction …Grammar? …Syntactic structures? –Categorization by collocation –Transitions in state space

3 Sentence processing in SRNs Feed-forward network performing word prediction Context layer provides memory for syntactic context Probability derivation: context-dependent word (transition) probabilities Internal representations: syntactic word classes, context-specific features [Slide figure: SRN predicting the next word for the input sequence "the boy saw"]
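As a sketch of the mechanism described above (not of the simulations reported in this talk), the following minimal Elman-style SRN shows how the context layer carries the hidden state forward and how the output can be read as next-word probabilities; the vocabulary, layer sizes and the simplified output-only weight update are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

class SimpleRecurrentNetwork:
    """Elman (1990) style SRN: the hidden state is copied into a context
    layer and fed back as additional input at the next time step."""

    def __init__(self, vocab_size, hidden_size, lr=0.1):
        self.W_in = rng.normal(0, 0.5, (hidden_size, vocab_size))
        self.W_ctx = rng.normal(0, 0.5, (hidden_size, hidden_size))
        self.W_out = rng.normal(0, 0.5, (vocab_size, hidden_size))
        self.context = np.zeros(hidden_size)   # memory of the preceding input
        self.lr = lr

    @staticmethod
    def _sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    @staticmethod
    def _softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    def step(self, word_vec):
        """One word in, a probability distribution over the next word out."""
        hidden = self._sigmoid(self.W_in @ word_vec + self.W_ctx @ self.context)
        self.context = hidden.copy()           # copy-back: the 'context layer'
        return self._softmax(self.W_out @ hidden), hidden

    def train_step(self, word_vec, next_word_vec):
        """Simplified delta-rule update of the output weights only
        (full simulations would backpropagate further)."""
        prediction, hidden = self.step(word_vec)
        error = prediction - next_word_vec
        self.W_out -= self.lr * np.outer(error, hidden)
        return float((error ** 2).sum())

# toy usage: 'the boy saw the ...' with a 5-word localist (one-hot) vocabulary
vocab = ["the", "boy", "girl", "saw", "dog"]
one_hot = {w: np.eye(len(vocab))[i] for i, w in enumerate(vocab)}
srn = SimpleRecurrentNetwork(vocab_size=len(vocab), hidden_size=10)
for prev, nxt in [("the", "boy"), ("boy", "saw"), ("saw", "the")]:
    srn.train_step(one_hot[prev], one_hot[nxt])
```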

4 SRNs as models for language processing? SRNs are merely semantics-free POS taggers (Steedman, 1999, 2002) Limited systematicity (but see Frank, 2006; Brakel & Frank, 2009; Frank, Haselager & van Rooij, 2009) Sensitive to irrelevant structural relations (Frank, Mathis & Badecker, 2005) No extrapolation and variable binding (Marcus, 1998; concerning the eliminative view: Holyoak & Hummel, 2000) Only structural relations –Language grounding: acquisition in a situated fashion (Harnad, 1990; Glenberg, 1997; Barsalou, 1999) –Connectionist approaches to grounded acquisition (e.g., Cangelosi, 2005; Plunkett et al., 1992; Coventry et al., 2004)

5 Anaphora Resolution Anaphora resolution factors (constraints vs. preferences): gender/number agreement, semantic consistency, salience, semantic/syntactic parallelism Global structural constraints: c-command Structurally determined complementary binding domains for pronouns and reflexives (G&B theory) a) Reflexives need a c-commanding NP as antecedent b) Pronouns must not have a c-commanding NP as antecedent (inside the boundaries of one sentence) (a) Ken_i who likes John_j saw himself_i/*j. (b) Ken_i who likes John_j saw him_j/*i.
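Purely as an illustration of the binding constraints stated in (a) and (b), the hypothetical filter below selects admissible antecedents for a reflexive or a pronoun, assuming the c-command relation over the candidate NPs has already been computed; the names and data structures are invented for the sketch.

```python
from dataclasses import dataclass

@dataclass
class NP:
    index: str                 # referential index, e.g. "i" or "j"
    gender: str                # "masc" / "fem"
    c_commands_anaphor: bool   # does this NP c-command the anaphor?

def admissible_antecedents(anaphor_type, anaphor_gender, candidates):
    """Principle A: a reflexive needs a c-commanding antecedent.
    Principle B: a pronoun must not have a c-commanding antecedent
    within the sentence. Gender agreement is checked in both cases."""
    out = []
    for cand in candidates:
        if cand.gender != anaphor_gender:
            continue
        if anaphor_type == "reflexive" and cand.c_commands_anaphor:
            out.append(cand.index)
        if anaphor_type == "pronoun" and not cand.c_commands_anaphor:
            out.append(cand.index)
    return out

# "Ken_i who likes John_j saw himself/him": Ken c-commands the anaphor, John does not
ken = NP("i", "masc", True)
john = NP("j", "masc", False)
print(admissible_antecedents("reflexive", "masc", [ken, john]))  # ['i']
print(admissible_antecedents("pronoun",   "masc", [ken, john]))  # ['j']
```

Note that the pronoun case only encodes the intra-sentential constraint; sentence-external referents remain available in principle.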

6 Anaphora Resolution Is online anaphora resolution globally structure-driven? –Pro: sensitivity to structural binding constraints (Asudeh & Keller, 2001; Badecker & Straub, 2002 …but with influences of gender marking by inaccessible antecedents) –Contra I (against the dominance of exclusively structural principles): logophors (Kaiser et al., 2009; Runner, Sussman & Tanenhaus, 2003, 2006); referential commitment (MacWhinney, 2008) –Contra II: sensitivity to structural constraints, but within a local rather than a global frame

7 Anaphora Resolution in SRNs Origins of our studies: investigation of the performance capacity of SRNs (Frank, Mathis & Badecker, 2005) –How abstract are the grammatical generalizations derived by SRNs? –Anaphora resolution (subsequently: variable binding) Acquisition of binding constraints for pronouns and reflexives Lexically complex (variable reference) Structurally complex (bridging irrelevant structures) Architecture: stepwise cascading SRNs

8 Anaphora resolution in SRNs [Slide figure: cascaded architecture combining a word prediction component and a reference assignment component]

9 Anaphora Resolution in SRNs Results (Frank, Mathis & Badecker, 2005) –Word prediction: good performance –Reference assignment: good performance for simple sentences, bad performance for complex sentences that impose long-distance constraints –Internal representations reveal the problem: assignment is based on irrelevant structural generalizations, e.g., the pronoun/reflexive position after SRCs vs. ORCs

10 New Approach SRNs are capable of integrating multiple cues (e.g., Christiansen, Allen & Seidenberg, 1998) SRNs are capable of processing anaphors (Weldle, Konieczny, Müller, Wolfer & Baumann, 2009), despite restrictions concerning variable binding –Interesting behaviour and predictions of SRNs –Behaviour and predictions for anaphora resolution!? Error correspondence of performance: locality effects, false alarms, local syntactic coherences (Konieczny, Müller & Ruh, 2009) Improved replication of Frank, Mathis & Badecker (2005): mature grammatical representations by means of complex stimuli, task-driven representations forced by integrated cascading SRNs

11 Architecture: cascading SRNs SPC: 70 hidden/context units, 27 input/output units, localist lexical encoding RAC: 35 hidden/context units, 9 output units (referents) Learning rate: 0.2–0.02 (gradually decreasing) Momentum: 0.6 Init. weight range: 0.5 Training: 10 epochs, backpropagation through time Integrative training allows the SRN to remain sensitive to the structural information required to solve the reference assignment task [Slide figure: the SENTENCE PROCESSING COMPONENT receives input word-by-word (t0) and performs word prediction (t+1); the cascaded REFERENCE ASSIGNMENT COMPONENT performs reference assignment (t0)]
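A minimal sketch of the cascaded forward pass with the unit counts listed above (SPC: 27 localist input/output units, 70 hidden/context units; RAC: 35 hidden/context units, 9 referent outputs). That the RAC reads the SPC hidden layer as its input is an assumption of this sketch, and training (backpropagation through time, decreasing learning rate, momentum) is omitted.

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def layer(n_out, n_in, scale=0.5):
    """Weight matrix initialised within the stated range (+/- 0.5)."""
    return rng.uniform(-scale, scale, (n_out, n_in))

# Sentence Processing Component (SPC): word prediction at t+1
W_spc_in, W_spc_ctx = layer(70, 27), layer(70, 70)
W_spc_out = layer(27, 70)
# Reference Assignment Component (RAC): referent activation at t0,
# assumed here to be driven by the SPC hidden layer
W_rac_in, W_rac_ctx = layer(35, 70), layer(35, 35)
W_rac_out = layer(9, 35)

spc_ctx, rac_ctx = np.zeros(70), np.zeros(35)

def process_word(word_vec):
    """One cascaded step: word in, (next-word prediction, referent activations) out."""
    global spc_ctx, rac_ctx
    spc_hidden = sigmoid(W_spc_in @ word_vec + W_spc_ctx @ spc_ctx)
    rac_hidden = sigmoid(W_rac_in @ spc_hidden + W_rac_ctx @ rac_ctx)
    spc_ctx, rac_ctx = spc_hidden, rac_hidden   # context copy-back in both components
    return sigmoid(W_spc_out @ spc_hidden), sigmoid(W_rac_out @ rac_hidden)

# toy usage: feed one localist word vector (27-dimensional)
word = np.eye(27)[3]
next_word_probs, referent_acts = process_word(word)
```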

12 Training corpus Artificial training corpus, generated with a PCFG –20,000 sentences, presented word-by-word
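To illustrate what word-by-word corpus generation from a PCFG looks like, here is a toy generator; the rules and lexicon below are invented for the sketch and are far smaller than the grammar actually used.

```python
import random

random.seed(42)

# Toy PCFG: non-terminal -> list of (probability, expansion)
GRAMMAR = {
    "S":    [(1.0, ["NP", "VP"])],
    "NP":   [(0.7, ["Det", "N"]), (0.3, ["Det", "N", "RC"])],
    "RC":   [(1.0, ["who", "VP"])],
    "VP":   [(0.5, ["V", "NP"]), (0.5, ["V", "Refl"])],
    "Det":  [(1.0, ["the"])],
    "N":    [(0.5, ["philologist"]), (0.5, ["biologist"])],
    "V":    [(0.5, ["saw"]), (0.5, ["scratched"])],
    "Refl": [(1.0, ["himself"])],
}

def expand(symbol):
    """Recursively expand a symbol according to the rule probabilities."""
    if symbol not in GRAMMAR:
        return [symbol]                                   # terminal word
    r, acc = random.random(), 0.0
    for prob, rhs in GRAMMAR[symbol]:
        acc += prob
        if r <= acc:
            return [w for s in rhs for w in expand(s)]
    return [w for s in GRAMMAR[symbol][-1][1] for w in expand(s)]  # rounding fallback

def generate_corpus(n_sentences):
    return [expand("S") for _ in range(n_sentences)]

corpus = generate_corpus(20000)    # sentences as word lists, presented word-by-word
print(" ".join(corpus[0]))
```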

13 Test corpora SRC: Während der Germanist, der den Biologen sieht, sich/ihn kratzte, … („While the philologist, who saw the biologist, scratched himself/him …“) ORC: Während der Germanist, den der Biologe sieht, sich/ihn kratzt, … („While the philologist, who the biologist saw, scratched himself/him …“) Test sets –Common test set –Complex test set: anaphora resolution and N/V-agreement in complex syntactic embeddings

14 Results Examination of –Output performance for word prediction –Output performance for reference assignment –Internal representations at anaphoric expression Grammatical Prediction Error (Christiansen & Chater, 1999)
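A simplified sketch of a GPE-style measure: the proportion of output activation that falls on ungrammatical continuations at the current position. This collapses the separate hits/false-alarms/misses bookkeeping of Christiansen & Chater's (1999) measure into a single ratio, so it is an approximation only.

```python
import numpy as np

def grammatical_prediction_error(output_activations, grammatical_indices):
    """Fraction of the network's output activation assigned to ungrammatical
    continuations (0 = all activation on grammatical words, 1 = none).
    Simplified relative to the full GPE of Christiansen & Chater (1999)."""
    total = output_activations.sum()
    hits = output_activations[list(grammatical_indices)].sum()
    return float(1.0 - hits / total) if total > 0 else 1.0

# toy usage: 5-word vocabulary, words 1 and 3 are grammatical next words
activations = np.array([0.05, 0.60, 0.05, 0.25, 0.05])
print(grammatical_prediction_error(activations, {1, 3}))  # ~0.15
```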

15 Word prediction While the philologist, who saw the biologist, scratched him/himself …

16 Word prediction While the philologist, who saw the biologist, scratched him/himself …

17 Reference: pronouns While the philologist, who saw the biologist, scratched him …

18 Reference: reflexives While the philologist, who saw the biologist, scratched himself …

19 Reference: reflexives While the philologist, who saw the biologist, scratched himself …

20 Local syntactic coherences Analysis of probability vectors at the anaphor position: activations are influenced by the antecedent directly preceding the anaphoric expression, i.e., by a locally coherent sub-sequence crossing the RC boundary (cf. converging previous simulation findings: Konieczny, Ruh & Müller, 2009) a) Enables access to normally inaccessible antecedents b) Inhibits access to normally accessible antecedents Internal representations (multivariate statistics) do not reflect dependence on the preceding phrase structure; categorization highlights the gender and agreement marking of the MC subject; the network develops trans-structural generalizations Examples: the locally coherent sub-sequence „the biologist, scratched himself/him“ inside „While the philologist_i, who saw the biologist_j, scratched himself_i/*j …“ and „While the philologist_i, who saw the biologist_j, scratched him_j/*i …“
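As an illustration of the kind of multivariate analysis of internal representations mentioned above, a PCA (via SVD) of hidden-layer activation vectors recorded at the anaphor position can be used to inspect how items cluster, e.g. by the gender and agreement marking of the MC subject; the data below are random placeholders, not recorded network states.

```python
import numpy as np

rng = np.random.default_rng(2)

def pca_2d(activations):
    """Project hidden-layer activation vectors onto their first two
    principal components (mean-centred data, SVD-based PCA)."""
    X = activations - activations.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:2].T                     # (n_items, 2) coordinates

# placeholder: 40 hidden-state snapshots (70 units each) taken at the anaphor,
# e.g. 20 from reflexive items and 20 from pronoun items
hidden_states = rng.normal(size=(40, 70))
coords = pca_2d(hidden_states)
# coords[:, 0] and coords[:, 1] can then be plotted and coloured by condition
```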

21 Conclusions SRNs with proper prerequisites are in principle capable of anaphora resolution, within the limits of interpolation Previous results (Frank, Mathis & Badecker, 2005) are most likely simulation artefacts of the architecture, training procedure and limited grammar Interference from locally coherent sub-sequences should be seen in terms of error correspondence: a prediction of local coherence effects in anaphora resolution Local syntactic coherence effects (Konieczny, 2005; Konieczny et al., 2007, 2009) also affect reference assignment (Weldle et al., 2009; Wolfer, previous talk)

