Cohesion and Learning in a Tutorial Spoken Dialog System. Art Ward, Diane Litman.

1 Cohesion and Learning in a Tutorial Spoken Dialog System. Art Ward, Diane Litman

2 Outline
- Tutoring
- Goals
- 4 issues in measuring cohesion
  - Why they’re interesting
  - How we test them
- Results

3 Natural Language Dialog Tutoring
- Human tutors are better than classroom instruction (Bloom 84)
- Intelligent Tutoring Systems (ITSs) hope to replicate this advantage
- Is dialog important to learning?
  - Dialog acts: question answering, explanatory reasoning, deep student answers (Graesser et al. 95, Forbes-Riley et al. 05)
  - Difficult to automatically tag dialog input, so:
  - Automatically detectable dialog features: average turn length, etc. (Litman et al. 04)
- We look at cohesion: lexical co-occurrence between turns

4 Goals and Results
- Goals
  - Want to find if cohesion is correlated with learning in our tutoring dialogs (if it is, this may inform ITS design)
  - Want to find a computationally tractable measure of cohesion (so it can be used in a real-time tutor)
- Results
  - Do find strong correlations with learning
    - For low pre-testers
    - For interactive (tutor to student) measures of cohesion
  - Robust to multiple measures of lexical cohesion

5 Four Issues
- Why/how to identify cohesion in dialogs?
- Do students of different skill levels respond to cohesion in the same way? (Is there an aptitude/treatment interaction?)
- Is interactivity important?
- What other processing steps help?

6 Issue 1: How to identify cohesion in dialogs?
- Why might cohesion be important in tutoring?
- McNamara & Kintsch (96)
  - Students read high and low coherence text
  - High coherence text was the low coherence version altered to:
    - Use consistent referring expressions
    - Identify anaphora
    - Supply background information
  - Interaction between pre-test score and response to textual coherence
    - Low pre-testers learned more from more coherent text
    - High pre-testers learned LESS from more coherent text

7 Measuring Cohesion
- Measurements from computational linguistics
  - Hearst (94): topic segmentation, text. Word-count similarity of spans of text
  - Olney & Cai (05): topic segmentation, tutorial dialog. Several measures, including Hearst’s
  - Morris & Hirst (91): lexical chains. Thesaurus entries
  - Barzilay & Elhadad (97): automatic lexical chains. WordNet senses
- We develop measures similar to Hearst’s
  - But novel in that they are applied to dialog rather than text, and used to find correlations with learning

8 Issue 1: How to identify cohesion in dialogs?
- Defining cohesion: Halliday and Hasan (76)
  - Grammatical vs lexical cohesion
  - Lexical cohesion: reiteration
    - Exact word repetition
    - Synonym repetition
    - Near-synonym repetition
    - Super-ordinate class
    - General referring noun
- Cohesion measured by counting “cohesive ties”
  - Two words joined by a cohesive device (i.e. reiteration)

10 Issue 1: How to identify cohesion in dialogs?
- How we measure lexical cohesion: we count cohesive ties between turns
  - Tokens (with stop words) (token = “word”)
  - Tokens (stop words removed) (stops = high frequency, low information words)
  - Stems (stop words removed)

11 Stems
- Stem = non-inflected core of a word
- Porter Stemmer
- Allows us to find ties between various inflected forms of the same word in adjacent turns
- “Turns” are tutor and student contributions to tutoring dialogs collected by the ITSPOKE group
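
As a rough illustration of stem-level matching, the sketch below collects the stems shared by two turns. The names `stem_ties` and `toy_stem` are assumptions made for this example, and `toy_stem` is a crude suffix stripper, not the Porter stemmer the slide refers to. Any function that maps a word to its stem (a real Porter implementation, for instance) could be passed in instead. Stop word removal is left out to keep the sketch short.

```python
import re

# Hypothetical suffix list for the toy stemmer, longest suffix first.
_SUFFIXES = ("ation", "ates", "ate", "ing", "ed", "es", "s")


def toy_stem(word):
    """Crude suffix stripper for illustration only; NOT the Porter algorithm."""
    for suffix in _SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stem_ties(turn_a, turn_b, stem=toy_stem):
    """Return the set of stems that occur in both turns."""
    stems_a = {stem(w) for w in re.findall(r"[a-z]+", turn_a.lower())}
    stems_b = {stem(w) for w in re.findall(r"[a-z]+", turn_b.lower())}
    return stems_a & stems_b


student = "The packet accelerates vertically"
tutor = "What is the acceleration of the packet?"
shared = stem_ties(student, tutor)
print(sorted(shared))
```

Here "accelerates" and "acceleration" reduce to the same stem, so they count as a tie even though the tokens differ.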

12 Applying Cohesion measures to our Corpora: example
Turn | Contribution
Student Essay | No. The airplane and the packet have the same horizontal velocity. When the packet is dropped, the only force acting on it is g, and the net force is zero. The packet accelerates vertically down, but does not accelerate horizontally. The packet keeps moving at the same velocity while it is falling as it had when it was on the airplane. There will be displacement because the packet still moves horizontally after it is dropped. The packet will keep moving past the center of the swimming pool because of its horizontal velocity.
ITSPOKE | Uh huh. There is more still that your essay should cover. Maybe this will help you remember some of the details need in the explanation. After the packet is released, the only force acting on it is gravitational force, which acts in the vertical direction. What is the magnitude of the acceleration of the packet in the horizontal direction?
Cohesive Ties | Matches | Count
Token w/stop | packet, horizontal, the, it, is, of, only, force, acting, on, there, will, still, after | 14
Token, no stop | packet, horizontal, only, force, acting, there, will, still, after | 9
Stem, no stop | packet, horizont, onli, forc, act, acceler, vertic, there, will, still, after | 11
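
The two token-level counts in this example can be reproduced with a small sketch, under two assumptions that the slides do not state: a tie is counted once per word type shared by the two turns, and the stop list is the minimal hypothetical set below, chosen only because it is consistent with the matches listed on the slide. The function names are illustrative.

```python
import re

# Hypothetical minimal stop list; the stop list actually used is not given.
STOP_WORDS = {"the", "it", "is", "of", "on"}


def tokens(text):
    """Lowercased word types in a turn, ignoring punctuation."""
    return set(re.findall(r"[a-z]+", text.lower()))


def cohesive_ties(turn_a, turn_b, remove_stops=False):
    """Word types that appear in both turns, optionally ignoring stop words."""
    shared = tokens(turn_a) & tokens(turn_b)
    if remove_stops:
        shared -= STOP_WORDS
    return shared


essay = (
    "No. The airplane and the packet have the same horizontal velocity. "
    "When the packet is dropped, the only force acting on it is g, and the "
    "net force is zero. The packet accelerates vertically down, but does not "
    "accelerate horizontally. The packet keeps moving at the same velocity "
    "while it is falling as it had when it was on the airplane. There will be "
    "displacement because the packet still moves horizontally after it is "
    "dropped. The packet will keep moving past the center of the swimming "
    "pool because of its horizontal velocity."
)
tutor = (
    "Uh huh. There is more still that your essay should cover. Maybe this "
    "will help you remember some of the details need in the explanation. "
    "After the packet is released, the only force acting on it is "
    "gravitational force, which acts in the vertical direction. What is the "
    "magnitude of the acceleration of the packet in the horizontal direction?"
)

with_stops = cohesive_ties(essay, tutor)
no_stops = cohesive_ties(essay, tutor, remove_stops=True)
print(len(with_stops), sorted(with_stops))
print(len(no_stops), sorted(no_stops))
```

The stem-level row would additionally need a stemmer, which is why "acceler" and "vertic" appear only there.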

16 Issue 2: Is there an aptitude/treatment interaction?
- Why there might be: McNamara & Kintsch
- How we test it: mean pre-test split
  - All students
  - Above-mean pretest students (“high” pre-testers)
  - Below-mean pretest students (“low” pre-testers)
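
A mean pre-test split can be sketched as follows. The function name, the student ids and the scores are all hypothetical. Placing a student who scores exactly at the mean in the low group is also an assumption, since the slide does not say how such cases were handled.

```python
from statistics import mean


def split_by_mean_pretest(pretest_scores):
    """Split student ids into (low, high) groups around the mean pre-test score.

    Students exactly at the mean go to the low group (an assumption).
    """
    cutoff = mean(pretest_scores.values())
    low = sorted(s for s, score in pretest_scores.items() if score <= cutoff)
    high = sorted(s for s, score in pretest_scores.items() if score > cutoff)
    return low, high


# Hypothetical scores, not taken from the corpus.
scores = {"s1": 0.30, "s2": 0.45, "s3": 0.60, "s4": 0.85}
low, high = split_by_mean_pretest(scores)
print(low, high)
```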

17 Issue 3: Is interactivity important?
- Why it might be:
  - Chi et al. (01): tutor centered, student centered, interactive
    - Deep learning through self construction
    - Not tutor actions alone
  - Litman & Forbes-Riley (05): learning correlated with both:
    - student utterances that display reasoning
    - tutor questions that require reasoning
- How we test it:
  - Interactive corpus: compare tutor to student turns
  - Tutor-only corpus
  - Student-only corpus
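
One way to read the three corpora is as three ways of choosing which turns get compared. The sketch below is an interpretation, not the authors' procedure: it assumes a dialog is a list of `(speaker, text)` tuples, that the interactive corpus pairs adjacent turns by different speakers, and that the single-speaker corpora pair consecutive turns of that speaker. The function name and data layout are hypothetical.

```python
def turn_pairs(dialog, corpus="interactive"):
    """Return (earlier_text, later_text) pairs of turns to compare.

    corpus is "interactive", "tutor" or "student".
    """
    if corpus == "interactive":
        turns = dialog
    else:
        # Keep only one speaker's turns, in their original order.
        turns = [turn for turn in dialog if turn[0] == corpus]
    pairs = list(zip(turns, turns[1:]))
    if corpus == "interactive":
        # Only compare across speakers.
        pairs = [(a, b) for a, b in pairs if a[0] != b[0]]
    return [(a[1], b[1]) for a, b in pairs]


dialog = [
    ("tutor", "T1"),
    ("student", "S1"),
    ("tutor", "T2"),
    ("student", "S2"),
]
print(turn_pairs(dialog))
print(turn_pairs(dialog, "tutor"))
print(turn_pairs(dialog, "student"))
```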

18 Issue 4: What other processing steps help?
- Tried several on training corpus:
  - Removing stop words
  - N-turn spans
  - Selecting “substantive” turns
  - TF-IDF normalization
  - Turn-normalized counts (raw tie count / # of turns in dialog)
- Found final options on training corpus:
  - One-turn spans, turn normalization, no TF-IDF, no substantive turn selection
  - All reported results use these options
- Tested options on new corpus
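
The chosen options (one-turn spans with turn normalization) might be sketched like this. It assumes ties are shared word types between each turn and the next, and that the raw total is divided by the number of turns in the dialog, as the slide's formula states. The function names and the sample turns are hypothetical.

```python
import re


def word_types(text):
    """Lowercased word types in a turn."""
    return set(re.findall(r"[a-z]+", text.lower()))


def turn_normalized_cohesion(turns):
    """Raw tie count over adjacent turns divided by the number of turns."""
    if not turns:
        return 0.0
    raw_ties = sum(
        len(word_types(a) & word_types(b)) for a, b in zip(turns, turns[1:])
    )
    return raw_ties / len(turns)


# Hypothetical three-turn dialog.
turns = [
    "the packet falls",
    "the packet accelerates down",
    "down how fast",
]
print(turn_normalized_cohesion(turns))
```

Dividing by the number of turns keeps long dialogs from scoring higher simply because they contain more turn pairs.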

19 Where did the corpora come from?
- ITSPOKE is a speech-enabled version of Why2-Atlas (VanLehn et al. 02)
  - Qualitative physics
- Tutoring cycle
  - Student reads instructional materials
  - Takes a pre-test
  - Starts interactive tutoring cycle: problem, essay, tutor evaluates essay and engages in dialog, revise essay, repeat
  - Takes a post-test

20 Tutoring Corpora
- Transcripts of tutoring sessions
- Training corpus (fall 2003):
  - 20 students, 5 problems each
  - 95 dialogs (5 had no dialog)
  - 13 low pre-testers, 7 high pre-testers
- Testing corpus (spring 2005):
  - 34 students, 5 problems each
  - 163 dialogs (7 had no dialog)
  - 18 low pre-testers, 16 high pre-testers

21 Results: Aptitude/Treatment
- Test: partial correlation of post-test & cohesion count, controlling for pre-test
- Cohesion correlated with learning for low pre-test students
- Not for high pre-test students
- Little difference between types of measurement
- Less significant on testing data, “token with stops” level reduced to a trend
[Table: R and P-Value on the training (2003) and testing (2005) data for All Students, Low Pretest and High Pretest, grouped by Token (with stop words), Token (stop words removed) and Stem (stop words removed). The values are not preserved in this transcript.]

22 Results: Aptitude/Treatment
- Test: partial correlation of post-test & cohesion count, controlling for pre-test
- Cohesion correlated with learning for low pre-test students
- Not for high pre-test students
- Little difference between types of measurement
- Slightly less significant on testing data
[Table as on slide 21; the values are not preserved in this transcript.]
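
For readers unfamiliar with the test named on these slides, a first-order partial correlation can be computed from three Pearson correlations. The sketch below uses the standard textbook formula; it is not the authors' analysis code, it does not compute p-values, and the numbers are made up so that post-test is exactly pre-test plus cohesion.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation of x and y with the linear effect of z removed.

    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    """
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical data: post-test is pre-test plus cohesion, so once pre-test
# is controlled for, cohesion explains all remaining variation in post-test.
pre = [1, 2, 3, 4, 5]
cohesion = [2, 1, 4, 3, 6]
post = [p + c for p, c in zip(pre, cohesion)]

r = partial_correlation(cohesion, post, pre)
print(round(r, 6))
```

In the setting of the slides, `x` would be a student's cohesion count, `y` the post-test score and `z` the pre-test score.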

28 Results: Aptitude/Treatment (2003 data)
- No significant difference between amounts of (turn-normalized) cohesion in high and low pre-test groups
- Difference in correlation between high and low pre-testers is not due to different amounts of cohesion

29 Results: Interactivity (2003)
- Cohesion between tutor utterances is not correlated with learning

30 Results: Interactivity (2003)
- No evidence that cohesion between student productions is correlated with learning (but student utterances are very short with a computer tutor)

31 Discussion
- Both high and low pre-testers successfully learned from these dialogs
- Our measure of lexical cohesion seems to reflect only what the low pre-testers do to learn; it is not correlated with what high pre-testers do
- McNamara & Kintsch also found a positive correlation for low pre-testers, but a negative correlation for high pre-testers

32 Discussion
- Our measures are slightly different:
- McNamara & Kintsch: manipulated coherence in text
  - Reader does not contribute to coherence
  - Coherence is the extent to which semantic relations are spelled out in the text, rather than inferred by the reader
  - Low pre-testers probably learned because high coherence text allowed them to make inferences they couldn’t from the low cohesion text
  - Low pre-testers & low coherence: didn’t know the terms
  - High coherence may allow a greater number of successful inferences for their low pre-testers
- Our work: dialog
  - Student does contribute to cohesion
  - Higher cohesion means using more of the same terms
  - Speculation: high cohesion may indicate the number of successful inferences our low pre-testers already made
  - High pre-testers already know the terms, so new inferences are not involved in using them

33 Summary
- We have taken automatically computable measures of cohesion from computational linguistics
- Applied them to tutorial dialog
- Found correlations with student learning

34 Conclusions
- Simple, automatically computable measures of lexical cohesion correlate with learning
- But only for students with low pre-test scores, even though low and high pre-testers showed similar amounts of cohesion
- Correlation is robust to differences in type of measurement
- It’s the cohesion between student and tutor that’s important

35 Future Work
- Short term:
  - Cohesion may also be related to learning in high pre-testers, but we’re measuring the wrong kind of cohesion
    - Work underway to try “sense” level measures
    - Halliday & Hasan’s “synonym” levels of reiteration: “acceleration” & “speeding up”
    - New issues: word sense disambiguation (one sense per discourse?)
  - Or measuring it in the wrong places
    - Try finding cohesion at impasses (VanLehn 03)
    - Try finding change in cohesion over time (Pickering & Garrod 04)
    - Is it the dialog, or the essay?
- Long term: test by manipulating cohesion in ITSPOKE

36 Thanks: Diane Litman, ITSPOKE group

37 Questions?

42 Cohesion vs Coherence
- Cohesive devices
  - Things that “tie” different parts of a discourse together: anaphora, repetition, etc.
  - But still may not make sense: “John hid Bill’s car keys. He likes spinach.” (Jurafsky & Martin 00)
- Coherence relations
  - Semantic relations between utterances: result, explanation, elaboration, etc. (Hobbs 79)

43 Britton & Gülgöz 91
- Original text: “Air war in the North, 1965. By the fall of 1964, Americans in both Saigon and Washington had begun to focus on Hanoi as the source of the continuing problem in the south.”
- Modified text: “Air war in North Vietnam, 1965. By the beginning of 1965, Americans in both Saigon and Washington had begun to focus on Hanoi, capital of North Vietnam, as the source of the continuing problems in the south.”

