Readers maintain and act on uncertainty about past linguistic input: Evidence from eye movements Roger Levy Klinton Bicknell Tim Slattery Keith Rayner.

Presentation transcript:

Readers maintain and act on uncertainty about past linguistic input: Evidence from eye movements
Roger Levy, Klinton Bicknell, Tim Slattery, Keith Rayner
UC San Diego
CUNY Sentence Processing Conference, 26 March 2009

Incrementality and Rationality
Online sentence comprehension is hard, but lots of information sources can be usefully brought to bear to help with the task. Therefore, it would be rational for people to use all the information available, whenever possible. This is what incrementality is. We have lots of evidence that people do this often: “Put the apple on the towel in the box.” (Tanenhaus et al., 1995)

Grammar  Word modularity?
Probabilistic models are good at capturing this use of evidential information (Jurafsky, 1996; Crocker & Brants, 2000; Hale, 2001; Levy, 2008). Examples (McRae et al., 1998): “the horse raced” (good agent); “the criminal arrested” (bad agent).

Grammar  Word modularity? (II)
But there is an implicit modularity assumption: decisions made during word recognition aren’t subject to revision. Caveat: it’s clear that prior context can affect the recognition process for newly perceived words (“He swept the flour…”; “The baker needed more flour…”; slide annotation: misperceived more often as floor (Slattery, submitted)), but in most models the outcome of word recognition is “just a word”. (Figure label: the horse raced)

Grammar  Word modularity? (III)
This partition between word-level and sentence-level analysis could be real (Fodor, 1983), or it could be a practical modeling simplification. In this talk, we use eye-tracking data to suggest that it is indeed a practical simplification. We argue:
Readers retain residual uncertainty in their beliefs about the identities of words already read.
Readers change these beliefs on the basis of these words’ consistency with subsequent input.
Readers rapidly act on these changes via regressive eye movements.
We use local-coherence sentences (Tabor et al., 2004; Konieczny & Mueller, 2007) to show this.

The puzzle of local coherences
Try to understand this sentence:
(a) The coach smiled at the player tossed the frisbee.
…and contrast this with:
(b) The coach smiled at the player thrown the frisbee.
(c) The coach smiled at the player who was thrown the frisbee.
(d) The coach smiled at the player who was tossed the frisbee.
Readers boggle at “tossed” in (a), but not in (b-d) (Tabor et al., 2004): there is an RT spike in (a).

Why is tossed/thrown interesting?
As with classic garden-paths, part-of-speech ambiguity (verb? participle?) leads to misinterpretation: “The horse raced past the barn…fell”. But now context “should” rule out the garden path: “The coach smiled at the player tossed…”. This is a challenge for fully incremental probabilistic models: a failure to condition on relevant context.

Local coherences from uncertain input?
Levy (2008) suggested that local coherences could arise from the presence of “near-neighbor” sentences. Hypothesis: the boggle at “tossed” involves what the comprehender wonders whether she might have seen. For “The coach smiled at the player tossed the frisbee”, the slide marks the candidate alternatives (as?), (and?), (who?), (that?). Any of these changes makes tossed a main verb! Levy (2008) quantified this boggle as the size of the change in the probability distribution over earlier contexts, from the old distribution to the new distribution (Error Identification Signal, or EIS).
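The idea of an EIS as "size of change in a probability distribution" can be illustrated with a small sketch. It assumes the change is measured as the Kullback-Leibler divergence of the new distribution from the old one; the slide itself only says "size of change". All probabilities below are made-up illustrative numbers, not model output, and the names `kl_divergence`, `old_beliefs`, `new_after_tossed` and `new_after_thrown` are hypothetical.

```python
import math


def kl_divergence(new, old):
    """KL divergence D(new || old) in bits over a shared set of hypotheses."""
    return sum(p * math.log2(p / old[h]) for h, p in new.items() if p > 0)


# Hypothetical beliefs about which word followed "smiled", before the critical word.
old_beliefs = {"at": 0.90, "as": 0.05, "and": 0.05}

# Hypothetical beliefs after reading the critical word (illustrative values only).
new_after_tossed = {"at": 0.40, "as": 0.30, "and": 0.30}
new_after_thrown = {"at": 0.89, "as": 0.055, "and": 0.055}

eis_tossed = kl_divergence(new_after_tossed, old_beliefs)
eis_thrown = kl_divergence(new_after_thrown, old_beliefs)

print(f"EIS (tossed): {eis_tossed:.3f} bits")
print(f"EIS (thrown): {eis_thrown:.3f} bits")
```

With these assumed numbers the signal is much larger after tossed than after thrown, which is the qualitative pattern the slide describes.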

The core of the intuition
Grammar & input come together to determine two possible “paths” through the partial sentence: “the coach smiled… at (likely) …the player…” and “the coach smiled… as/and (unlikely) …the player…”, the latter being the bottom path in the figure (line thickness ≈ probability).
tossed is more likely to happen along the bottom path. This creates a large shift in belief in the tossed condition.
thrown is very unlikely to happen along the bottom path. As a result, there is no corresponding shift in belief.
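The two-path intuition can be sketched as a Bayesian update. The prior and likelihood values are purely illustrative assumptions chosen to match the qualitative description on the slide (at is likely a priori; tossed is more likely on the as/and path; thrown is very unlikely there). The function and variable names are hypothetical.

```python
def posterior(prior, likelihood):
    """Bayes' rule over a small set of paths: normalise prior * likelihood."""
    joint = {path: prior[path] * likelihood[path] for path in prior}
    total = sum(joint.values())
    return {path: mass / total for path, mass in joint.items()}


# Assumed prior belief about the word after "smiled" (illustrative numbers).
prior = {"at": 0.95, "as/and": 0.05}

# Assumed probability of the critical word along each path (illustrative numbers).
likelihood_tossed = {"at": 0.002, "as/and": 0.05}    # main-verb reading fits as/and
likelihood_thrown = {"at": 0.002, "as/and": 0.0001}  # participle cannot be a main verb

post_tossed = posterior(prior, likelihood_tossed)
post_thrown = posterior(prior, likelihood_thrown)

shift_tossed = abs(post_tossed["at"] - prior["at"])
shift_thrown = abs(post_thrown["at"] - prior["at"])

print(f"belief in 'at' after tossed: {post_tossed['at']:.3f} (shift {shift_tossed:.3f})")
print(f"belief in 'at' after thrown: {post_thrown['at']:.3f} (shift {shift_thrown:.3f})")
```

Under these assumptions, tossed moves a large share of belief onto the unlikely path, while thrown leaves the belief in at close to where it started.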

Results for classic local-coherence sentence
EIS is greater for the variant humans boggle more on:
The coach smiled at the player tossed
The coach smiled at the player thrown

Near-neighbor manipulation
Since the EIS depends on near-neighbor sentences, vary the neighborhood of the context? Substituting toward for at should reduce the EIS. In free reading, we should see less tendency to regress from tossed when the EIS is small.
The coach smiled at the player tossed the frisbee (candidate alternatives marked on the slide: (as?), (and?), (who?), (that?))
The coach smiled toward the player tossed the frisbee
(Figure labels: at…thrown, at…tossed)

Model predictions
Indeed we do see this behavior predicted in the model. (Figure conditions: at…tossed, at…thrown, toward…thrown, toward…tossed)

Experimental design
In a free-reading eye-tracking study, we crossed at/toward with tossed/thrown:
The coach smiled at the player tossed the frisbee
The coach smiled at the player thrown the frisbee
The coach smiled toward the player tossed the frisbee
The coach smiled toward the player thrown the frisbee
Prediction: interaction between preposition & ambiguity in some subset of:
Early-measure RTs at critical region tossed/thrown
First-pass regressions out of critical region
Go-past time for critical region
Regressions into at/toward

Procedure
24 items + 36 fillers, 40 participants. Isolated-sentence reading. Each sentence followed by a comprehension question. Eye movements monitored with an Eyelink 2000 tower setup.
Regions of analysis: The coach | smiled | {at,toward} | the player | {tossed,thrown} (critical) | the frisbee | …

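Experimental results
Condition | First-pass RT | Go-past RT | Regressions out | Regressions in
at/tossed | … | … | …% | 36%
at/thrown | … | … | …% | 31%
toward/tossed | … | … | …% | 31%
toward/thrown | … | … | …% | 31%
First-pass RT, go-past RT and regressions out are measured at the critical region; regressions in are measured at at/toward.
First-pass RT: main effect of category ambiguity (p<0.05).
Go-past RT: interaction, with at/tossed slowest (p<0.05).
Regressions out: interaction, with most regressions in at/tossed (p<0.01).
Regressions in: n.s., but right trend.
(Figure conditions: at…tossed, at…thrown, toward…thrown, toward…tossed)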

Where did readers regress to?
First regressive saccade from the critical region rarely goes straight back to at:
1 (Subj) The coach | 2 (MV) smiled | 3 (Prep) {at,toward} | 4 (Obj) the player | 5 (critical) {tossed,thrown} | …

Where did readers regress to? Not much action going on from critical region:

Where did readers regress to? (II)
But readers jump farther in the first regressive saccade from the spillover region. Main effect of ambiguity; maybe an interaction too (needs further analysis before CUNY – random intercepts or random interactions?). Significant main effect of ambiguity; significant interaction between ambiguity & at/toward. Interaction t=1.934 in a random-intercepts multi-level model (no such pattern in regressions from the critical region).

How far did readers continue to regress?
More information is contained in regression sequences. First-pass rate of ultimately regressing to Prep after fixating on Critical and before passing Critical:
2 smiled | 3 (Prep) {at,toward} | 4 the player | 5 (critical) {tossed,thrown} | …
Interaction p<0.01 in several types of multi-level analyses. (Figure annotation: This bar is 1/3 of all first-pass regressions from the critical region!)

Question-answering accuracy
16 of 24 questions were about the RRC, equally divided. Example: The coach smiled at the player tossed a frisbee by the opposing team
Did the player toss/throw a frisbee? [NO]
Did someone toss/throw the player a frisbee? [YES]
Did the player toss the opposing team a frisbee? [NO]
Did the opposing team toss the player a frisbee? [YES]
Significant interaction of question type w/ ambiguity (p_z < 0.05). Significant main effect of question type (p_z < 0.01).

What this result tells us
Readers must have residual uncertainty about word identity. Word misidentification alone won’t get this result in a fully incremental model:
The coach smiled toward the player…thrown
The coach smiled at the player…thrown
The coach smiled as the player…thrown
The coach smiled toward the player…tossed
The coach smiled at the player…tossed
The coach smiled as the player…tossed
(Slide annotations on these sentences: “Should be easier, if anything”; “Should be about equally hard”)
Also, readers respond to changes in uncertainty in a sensible way.

What this result does not tell us
It doesn’t show that Levy (2008)’s error-identification model is the only model that can get this result. But any model without uncertainty about the identities of prior words has its work cut out for it.
It doesn’t show that local-coherence effects are exclusively a consequence of rational inference. But it provides more supporting evidence that rational inference may be at least part of the story.

Conclusion
First experimental results indicating full bidirectional interactivity between word identification & sentence comprehension.
Readers…
…maintain uncertain beliefs about already-read word identities
…question these beliefs as a function of coherence with subsequent input
…act rapidly on changes in these beliefs through saccadic behavior
This strengthens the case for sentence comprehension as rational probabilistic inference.
Methodological point: pay attention to the near-neighbors of your stimuli! Your participants will…

Acknowledgments
Computational Psycholinguistics Lab; UCSD Academic Senate; NIH HD26765; Jennifer Arnold, for kindly lending me an AC adapter

Thank you for listening!

Acknowledgements
Thank you for all your help! Collaborators: Klinton Bicknell, Tom Griffiths, Keith Rayner, Florencia Reali, Tim Slattery. UCSD: Computational Psycholinguistics Lab & CRL. Liang Huang & David Chiang. Cyril Allauzen and the rest of the developers of the OpenFst ( and GRM libraries ( the people that r

Behavioral correlates (Tabor et al., 2004)
tossed is harder than thrown. Also, Konieczny (2006, 2007) found compatible results in stops-making-sense and visual-world paradigms. These results are problematic for theories requiring global contextual consistency (Frazier, 1987; Gibson, 1991, 1998; Jurafsky, 1996; Hale, 2001, 2006).

Eye movements & question answering
Hard to find much of anything. But there was a significant relationship with first-pass regressions out of the critical region, only for RRC questions. Significant main effect of ambiguity (p_z < 0.001). Small but significant interaction of ambiguity w/ regression (p_z < 0.05). The predictions could have gone either way, though.

How often did readers skip the prep?
They definitely skipped “at” a lot more than “toward”! (main effect of at/toward only)
Preposition | tossed | thrown
at | 40.4% | 33.8%
toward | 2.9% | 2.5%
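As a reading aid, the descriptive contrasts in this 2×2 table of skipping rates can be computed directly. This is only arithmetic on the four percentages reported on the slide; it says nothing about statistical significance, which comes from the authors' own analysis. The variable names are hypothetical.

```python
# Skipping rates (%) of the preposition, as reported on the slide.
skip_rate = {
    ("at", "tossed"): 40.4,
    ("at", "thrown"): 33.8,
    ("toward", "tossed"): 2.9,
    ("toward", "thrown"): 2.5,
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Difference between marginal means for each factor, in percentage points.
prep_effect = (
    mean(v for (prep, _), v in skip_rate.items() if prep == "at")
    - mean(v for (prep, _), v in skip_rate.items() if prep == "toward")
)
verb_effect = (
    mean(v for (_, verb), v in skip_rate.items() if verb == "tossed")
    - mean(v for (_, verb), v in skip_rate.items() if verb == "thrown")
)

# Difference of differences: tossed-thrown gap under "at" minus the gap under "toward".
interaction = (
    (skip_rate[("at", "tossed")] - skip_rate[("at", "thrown")])
    - (skip_rate[("toward", "tossed")] - skip_rate[("toward", "thrown")])
)

print(f"preposition contrast: {prep_effect:.1f} points")
print(f"verb contrast: {verb_effect:.1f} points")
print(f"interaction contrast: {interaction:.1f} points")
```

The preposition contrast dwarfs the other two, consistent with the slide's note that only at/toward showed a main effect; raw percentage-point differences alone do not settle whether the smaller contrasts are reliable.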

Bad comprehenders? Well, filler QA accuracy was 89.6%