CS 4705 Discourse Structure and Text Coherence
What makes a text/dialogue coherent? Incoherent? “Consider, for example, the difference between passages (18.71) and (18.72). Almost certainly not. The reason is that these utterances, when juxtaposed, will not exhibit coherence. Do you have a discourse? Assume that you have collected an arbitrary set of well-formed and independently interpretable utterances, for instance, by randomly selecting one sentence from each of the previous chapters of this book.” vs….
“Assume that you have collected an arbitrary set of well-formed and independently interpretable utterances, for instance, by randomly selecting one sentence from each of the previous chapters of this book. Do you have a discourse? Almost certainly not. The reason is that these utterances, when juxtaposed, will not exhibit coherence. Consider, for example, the difference between passages (18.71) and (18.72).” (J&M:695)
What makes a text coherent? Appropriate use of coherence relations between subparts of the discourse -- rhetorical structure Appropriate sequencing of subparts of the discourse -- discourse/topic structure Appropriate use of referring expressions
Rhetorical Structure Theory (Mann, Matthiessen, and Thompson ‘89) One theory of discourse structure, based on identifying relations between parts of the text –How many rhetorical relations are there? –MMT say 23 but… Nucleus (N)/satellite (Sat) notion encodes asymmetry Some rhetorical relations: –Elaboration (set/member, class/instance, whole/part…) –Contrast: multinuclear –Condition: Sat presents precondition for N –Purpose: Sat presents goal of the activity in N –Sequence: multinuclear
–Result: N results from something presented in Sat –Evidence: Sat provides evidence for something claimed in N A sample definition: –Relation: evidence –Constraints on N: the hearer (H) might not believe N as much as the speaker (S) thinks s/he should –Constraints on Sat: H already believes or will believe Sat An example: George Bush supports Big Business. He is sure to veto House Bill 1711.
1) Title: Bouquets in a basket – with living flowers 2) There is a gardening revolution going on 3) People are planting flower baskets with living plants 4) Mixing many types in one container for a summer of floral beauty 5) To create your own “Victorian” bouquet of flowers 6) Choose varying shapes, sizes and forms, besides a variety of complementary colors 7) Plants that grow tall should be surrounded by smaller ones and filled with others that tumble over the side of a hanging basket 8) Leaf textures and colors will also be important 9) There is the silver-white foliage of dusty miller, the feathery threads of lotus vine floating down from above, the deep greens, or chartreuse, even the widely varied foliage colors of the coleus.
1) Title: Bouquets in a basket – with living flowers 2) There is a gardening revolution going on 3) People are planting flower baskets with living plants (S:Evidence,N2) 4) Mixing many types in one container for a summer of floral beauty (S:Elaboration,N3) 5) To create your own “Victorian” bouquet of flowers 6) Choose varying shapes, sizes and forms, besides a variety of complementary colors (S:Condition,5?) 7) Plants that grow tall should be surrounded by smaller ones and filled with others that tumble over the side of a hanging basket 8) Leaf textures and colors will also be important 9) There is the silver-white foliage of dusty miller, the feathery threads of lotus vine floating down from above, the deep greens, or chartreuse, even the widely varied foliage colors of the coleus.
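The annotations on the bouquet text can be read as a small set of satellite-to-nucleus links. Below is a minimal sketch of that reading, assuming a hypothetical `Relation` record; the class, field, and function names are illustrative and do not come from RST or from any RST toolkit. Only the two unambiguous annotations are encoded. The slide's "(S:Condition,5?)" on unit 6 does not say clearly which unit is the satellite, so it is left out rather than guessed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """One satellite -> nucleus link (hypothetical record, for illustration)."""
    name: str
    satellite: int  # unit number of the satellite
    nucleus: int    # unit number of the nucleus


# Taken from the slide: unit 3 is Evidence for unit 2,
# and unit 4 is an Elaboration of unit 3.
RELATIONS = [
    Relation("Evidence", satellite=3, nucleus=2),
    Relation("Elaboration", satellite=4, nucleus=3),
]


def satellites_of(nucleus, relations):
    """Return (relation name, satellite unit) pairs attached to a nucleus."""
    return [(r.name, r.satellite) for r in relations if r.nucleus == nucleus]


def support_chain(unit, relations):
    """Follow satellite -> nucleus links upward, starting from `unit`."""
    nucleus_of = {r.satellite: r.nucleus for r in relations}
    chain = [unit]
    while chain[-1] in nucleus_of:
        chain.append(nucleus_of[chain[-1]])
    return chain


print(satellites_of(2, RELATIONS))
print(support_chain(4, RELATIONS))
```

Storing each link with an explicit satellite and nucleus keeps the asymmetry visible; multinuclear relations such as Contrast or Sequence would need a different record and are not covered by this sketch.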
Some Problems with RST (cf. Moore & Pollack ‘92) How many Rhetorical Relations are there? How can we use RST in dialogue as well as monologue? How do we incorporate speaker intentions into RST? RST does not allow for multiple relations holding between parts of a discourse RST does not model overall structure of the discourse
What’s the Rhetorical Structure? System: Hello. How may I help you? User: I would like to find out why I was charged for a call? System: What call would you like to inquire about? User: My bill says I made a call to Syncamaloo, Texas, but I’ve never even heard of this town. System: May I have the date of the call that appears on your bill?
Identifying RS Automatically (Marcu ’99) Train a parser on a discourse treebank –90 RS trees, hand-annotated for rhetorical relations (RR) –Elementary discourse units (edu’s) linked by RR –Parser learns to identify N and S and their RR –Features: Wordnet-based similarity, lexical, structural Uses discourse segmenter to id edu’s –Trained to segment on hand-labeled corpus (C4.5) –Features: 5-word POS window, presence of discourse markers, punctuation, seen a verb?,… –Eval: 96-98% accuracy
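The segmenter features listed above can be made concrete with a short sketch. Everything here is an assumption made for illustration: the function name `boundary_features`, the marker list, the choice to centre the 5-word window on the token being classified, and the use of Penn Treebank tags (verbs start with "VB"). It is not Marcu's feature extractor, only one plausible shape for the input to a decision-tree learner such as C4.5.

```python
# Hypothetical marker and punctuation lists, for illustration only.
DISCOURSE_MARKERS = {"because", "but", "although", "so", "however"}
PUNCTUATION = {",", ".", ";", ":", "?", "!"}


def boundary_features(tokens, i, window=5):
    """Build features for deciding whether token i starts a new edu.

    tokens: list of (word, pos_tag) pairs, Penn Treebank tags assumed.
    i:      index of the token being classified.
    """
    half = window // 2
    feats = {}
    # POS tags in a window centred on token i; "PAD" outside the sentence.
    for offset in range(-half, half + 1):
        j = i + offset
        inside = 0 <= j < len(tokens)
        feats[f"pos[{offset:+d}]"] = tokens[j][1] if inside else "PAD"
    word = tokens[i][0].lower()
    feats["is_marker"] = word in DISCOURSE_MARKERS
    feats["is_punct"] = word in PUNCTUATION
    # "Seen a verb?": has any verb occurred before this token?
    feats["seen_verb"] = any(pos.startswith("VB") for _, pos in tokens[:i])
    return feats


sentence = [("He", "PRP"), ("left", "VBD"), (",", ","),
            ("because", "IN"), ("it", "PRP"), ("rained", "VBD")]
print(boundary_features(sentence, 3))
```

Each feature dictionary would be paired with a hand-labeled boundary/no-boundary decision to form one training instance.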
Eval of parser: –Id edu’s: Recall 75%, Precision 97% –Id hierarchical structure (2 edu’s related): Recall 71%, Precision 84% –Id nucleus/satellite labels: Recall 58%, Precision 69% –Id RR: Recall 38%, Precision 45% Later errors due mostly to edu mis-identification –Id of hierarchical structure and n/s status comparable to human when hand-labeled edu’s used Hierarchical structure is easier to id than RR
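The slide gives recall and precision separately. Combining each pair into an F1 score (the harmonic mean of the two) makes the drop from segmentation to relation labeling easier to compare. The F1 values are derived here and are not figures reported on the slide; the recall and precision inputs are copied from it.

```python
def f1(recall, precision):
    """Harmonic mean of recall and precision (0.0 if both are zero)."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# (recall, precision) pairs as given on the slide.
REPORTED = {
    "edu identification": (0.75, 0.97),
    "hierarchical structure": (0.71, 0.84),
    "nucleus/satellite labels": (0.58, 0.69),
    "rhetorical relations": (0.38, 0.45),
}

scores = {task: f1(r, p) for task, (r, p) in REPORTED.items()}
for task, score in scores.items():
    print(f"{task}: F1 = {score:.2f}")
```

The scores decrease at every step down the list, which matches the slide's point that hierarchical structure is easier to identify than the rhetorical relations themselves.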
What Can Hierarchical Structure Tell Us? Welcome to word processing. That’s using a computer to type letters and reports. Make a typo? No problem. Just back up, type over the mistake, and it’s gone. And, it eliminates retyping.
Structures of Discourse Structure (Grosz & Sidner ‘86) Leading alternative theory of discourse structure –Provides for multiple levels of analysis: S’s purpose as well as content of utterances and S and H’s attentional state –Identifies only a few, general relations that hold among intentions Three components: –Linguistic structure –Intentional structure –Attentional structure
Linguistic Structure What is actually said/written How is this represented? –Assume discourse is segmented into Discourse Segments (DS) -- how? What is the basic unit of analysis? Segmentation agreement? Automatic segmentation? –Embedding relations: topic structure –Cue phrases
Intentional Structure Discourse purpose (DP): basic purpose of the discourse Discourse segment purposes (DSPs): how this segment contributes to the overall DP Segment relations: –Satisfaction-precedence: DSP1 must be satisfied before DSP2 (e.g. ds1 satp ds2) –Dominance: DSP1 dominates DSP2 if fulfilling DSP2 constitutes part of fulfilling DSP1 (e.g. ds3 dom ds4)
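The two segment relations can be sketched as plain sets of ordered pairs. The pairs below are the slide's own examples (ds1 satp ds2, ds3 dom ds4); the function names are hypothetical, and treating dominance as transitive is an assumption of this sketch rather than something the slide states.

```python
# Relations as (first, second) pairs, using the slide's examples.
SATISFACTION_PRECEDES = {("ds1", "ds2")}  # ds1 must be satisfied before ds2
DOMINATES = {("ds3", "ds4")}              # fulfilling ds4 is part of fulfilling ds3


def respects_precedence(order, satp):
    """Check that an order of satisfied segments obeys every satp pair.

    Pairs that mention a segment missing from `order` are ignored.
    """
    position = {segment: i for i, segment in enumerate(order)}
    return all(
        position[first] < position[second]
        for first, second in satp
        if first in position and second in position
    )


def dominated_by(segment, dominates):
    """Collect every segment dominated by `segment`, directly or indirectly."""
    found, frontier = set(), [segment]
    while frontier:
        parent = frontier.pop()
        for upper, lower in dominates:
            if upper == parent and lower not in found:
                found.add(lower)
                frontier.append(lower)
    return found


print(respects_precedence(["ds1", "ds2"], SATISFACTION_PRECEDES))
print(respects_precedence(["ds2", "ds1"], SATISFACTION_PRECEDES))
print(dominated_by("ds3", DOMINATES))
```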
Attentional State Focus stack: –Stack of focus spaces, each containing objects, properties and relations salient during each DS, plus the DSP (content plus purpose) –State changes modeled by transition rules controlling the addition/deletion of focus spaces Information at lower levels may or may not be available at higher levels Focus spaces are –pushed onto the stack when a new DS or an embedded DS (i.e. a DS that is dominated by another DS) is begun –popped when that DS is completed
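The push/pop behaviour of the focus stack can be sketched in a few lines. The class names, the `accessible` method, and the example segments are all hypothetical. Whether lower focus spaces stay accessible is left as a flag, since the slide says only that such information "may or may not" be available.

```python
from dataclasses import dataclass, field


@dataclass
class FocusSpace:
    """Salient entities for one discourse segment, plus its purpose (DSP)."""
    segment: str
    purpose: str
    entities: set = field(default_factory=set)


class FocusStack:
    def __init__(self):
        self._spaces = []

    def push(self, space):
        """Called when a new or embedded discourse segment begins."""
        self._spaces.append(space)

    def pop(self):
        """Called when the current discourse segment is completed."""
        return self._spaces.pop()

    def accessible(self, include_lower=True):
        """List salient entities, topmost focus space first."""
        spaces = reversed(self._spaces) if include_lower else self._spaces[-1:]
        seen, ordered = set(), []
        for space in spaces:
            for entity in sorted(space.entities):
                if entity not in seen:
                    seen.add(entity)
                    ordered.append(entity)
        return ordered


# Hypothetical segments, for illustration only.
stack = FocusStack()
stack.push(FocusSpace("outer", "explain the route", {"Red Line", "Harvard Square"}))
stack.push(FocusSpace("inner", "describe the station", {"music sculpture"}))
inner_view = stack.accessible()
stack.pop()  # the embedded segment is finished
outer_view = stack.accessible()
print(inner_view)
print(outer_view)
```

After the pop, entities introduced only in the embedded segment are no longer listed, which is the stack model's account of why they become harder to refer back to.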
Limits of G&S ‘86 Assumes that discourses are task-oriented Assumes there is a single, hierarchical structure shared by S and H How do we identify entities that are salient (on the focus stack)? Do people really build such structures when they converse? Use them in interpreting what others say?
How are these structures recognized from a discourse? Linguistic markers: –tense and aspect –cue phrases –intonational variation Inference of S intentions Inference from task structure Intonational Information
Acoustic and Prosodic Cues to Discourse Structure Intuition: –Speakers vary acoustic and prosodic cues to convey variation in discourse structure –Systematic? In read or spontaneous speech? Evidence: –Observations from recorded corpora –Laboratory experiments –Machine learning of discourse structure from acoustic/prosodic features
Boston Directions Corpus (Hirschberg & Nakatani ’96) Experimental Design 12 speakers: 4 used Spontaneous and read versions of 9 direction-giving tasks Corpus: 50 min read; 67 min spontaneous Labeling –Prosodic: ToBI intonational labeling –Discourse: Grosz & Sidner Discourse Features used in analysis
–F0 max and mean –Energy (rms) max and mean –Speaking rate (syllables per sec) –Duration of preceding and subsequent pause Correlations with SBEG, SCONT and SF phrases Results (significant differences) –SBEG higher in f0 and RMS max and mean, with longer preceding and shorter succeeding pauses –SF lower in f0 and RMS max and mean, with shorter preceding and longer succeeding pauses, and are spoken more rapidly Discourse structure is signaled by acoustic/prosodic variation: do people use it?
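The per-phrase measurements listed above could be computed as follows. The `Phrase` record and its field names are assumptions: they suppose that F0 and RMS energy have already been extracted frame by frame, with unvoiced frames stored as 0.0 Hz. The numbers in the example are toy values and are not taken from the corpus.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Phrase:
    """One intonational phrase with pre-extracted frame values (hypothetical)."""
    start: float    # seconds
    end: float      # seconds
    syllables: int
    f0: list        # Hz per frame, 0.0 for unvoiced frames
    rms: list       # RMS energy per frame


def phrase_features(phrase, prev_end, next_start):
    """Compute the acoustic/prosodic features named on the slide."""
    voiced = [value for value in phrase.f0 if value > 0]
    duration = phrase.end - phrase.start
    return {
        "f0_max": max(voiced) if voiced else 0.0,
        "f0_mean": mean(voiced) if voiced else 0.0,
        "rms_max": max(phrase.rms) if phrase.rms else 0.0,
        "rms_mean": mean(phrase.rms) if phrase.rms else 0.0,
        "rate": phrase.syllables / duration if duration > 0 else 0.0,
        "pause_before": phrase.start - prev_end,
        "pause_after": next_start - phrase.end,
    }


# Toy values for illustration only.
example = Phrase(start=1.0, end=3.0, syllables=8,
                 f0=[0.0, 200.0, 220.0], rms=[0.1, 0.3])
features = phrase_features(example, prev_end=0.5, next_start=3.25)
print(features)
```

Feature dictionaries like this one, grouped by whether the phrase is labeled SBEG, SCONT or SF, are the kind of input the reported significance tests would compare.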
Boston Directions Corpus: Describe how to get to MIT from Harvard ds1 (step 1, enter and get token): “first enter the Harvard Square T stop and buy a token” ds2 (inbound on red line): “then proceed to get on the inbound um Red Line uh subway”
ds3 (take subway from hs, to cs to ks): “and take the subway from Harvard Square to Central Square and then to Kendall Square” ds4 (describe ks station): “you’ll see a music sculpture there which will tell you it’s Kendall Square it’s very nice” ds5 (get off T.): “then get off the T”
Next Class Dialogue Systems (J&M 22, new version) HW3 due