April 26th, 2007 Workshop on Treebanking, HLT/NAACL, Rochester 1 Layering of Annotations in the Penn Discourse TreeBank (PDTB) Rashmi Prasad Institute.

Slide 1: Layering of Annotations in the Penn Discourse TreeBank (PDTB)
Rashmi Prasad, Institute for Research in Cognitive Science, University of Pennsylvania
Workshop on Treebanking, HLT/NAACL, Rochester, April 26th, 2007

Slide 2: Discourse Relations in the PDTB
- Argument structure of Explicit/Implicit connectives (spans):
  - She hasn’t played any music since the earthquake hit.
  - “We asked police to investigate why they are allowed to distribute the flag in this way. Implicit=because It should be considered against the law,” said Danny Leish, a spokesman for the association.
- Semantics (labels) of connectives: Temporal, Causal
- Attribution (spans and 4 features (labels)):
  - Source = Writer (implicit), span = unmarked
  - Source = Other agent, span = marked
  - 3 other attribution features:
    - Type: Assertion, Belief, Factive, Intention
    - Scopal Polarity: I don’t think X > I think NOT X
    - Determinacy: I might think X !> I think X

Slide 3: Layering with the PTB
- Stand-off annotations of connective, argument and attribution spans:
  - Character offsets in the WSJ raw texts: generated during the annotation
  - Tree node addresses of constituents in PTB trees (constituent sets for spans not dominated by a single node and for discontinuous text spans): generated in the post-annotation phase
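The character-offset half of the stand-off scheme can be illustrated with a minimal Python sketch. The `Span` and `Relation` classes and the helper names are hypothetical, chosen only for this illustration; they are not the PDTB file format or any PDTB tool's API. Each part of a relation is a list of spans, so that discontinuous text can be covered. Tree node addresses are not modelled here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Span:
    """Half-open character range [start, end) into the raw text (hypothetical)."""
    start: int
    end: int


@dataclass
class Relation:
    """One discourse relation kept apart from the text it annotates.

    Every part is a list of spans so discontinuous text can be represented.
    """
    connective: List[Span]
    arg1: List[Span]
    arg2: List[Span]


def find_span(raw: str, phrase: str) -> Span:
    """Locate the first occurrence of phrase and return its offsets."""
    start = raw.index(phrase)
    return Span(start, start + len(phrase))


def text_of(raw: str, spans: List[Span]) -> str:
    """Recover the annotated text from the raw text and the offsets."""
    return " ... ".join(raw[s.start:s.end] for s in spans)


# Example sentence from the slides; the raw text itself is never modified.
RAW = "She hasn't played any music since the earthquake hit."

relation = Relation(
    connective=[find_span(RAW, "since")],
    arg1=[find_span(RAW, "She hasn't played any music")],
    arg2=[find_span(RAW, "the earthquake hit")],
)

print(text_of(RAW, relation.connective))
print(text_of(RAW, relation.arg1))
print(text_of(RAW, relation.arg2))
```

Because the annotation stores only offsets, the same raw text can carry other layers without any of them altering it.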

Slide 4: PTB Affecting PDTB Choices
- Distinct POS marking of connectives in the PTB could have allowed for automatic identification of connectives:
  - "For example," (discourse connective): (PP (IN For) (NP (NN example)))
  - "For John," (not a discourse connective): (PP (IN For) (NP (NN John)))
  - Subordinating conjunctions marked as adverbs: When: (WHADVP (WRB When))
- Effect of phrase-structure (PS) vs. dependency annotation: none
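The point about POS marking can be checked mechanically. The sketch below assumes the standard PTB-style bracketing `(PP (IN For) (NP (NN ...)))` for both phrases; `parse` and `skeleton` are hypothetical helpers written for this illustration, not part of any treebank toolkit. It shows that, once the words are masked, the two trees are identical, so the labels alone cannot separate the connective from the non-connective.

```python
def parse(bracketed):
    """Parse a bracketed tree string into nested (label, children) tuples."""
    tokens = bracketed.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])  # a word at a leaf
                pos += 1
        pos += 1  # consume the closing ")"
        return (label, children)

    return node()


def skeleton(tree):
    """Keep the category/POS labels but mask every word with '_'."""
    label, children = tree
    return (label, tuple(
        skeleton(child) if isinstance(child, tuple) else "_"
        for child in children
    ))


connective = parse("(PP (IN For) (NP (NN example)))")
non_connective = parse("(PP (IN For) (NP (NN John)))")

# Same labels in the same configuration: only the words differ.
print(skeleton(connective) == skeleton(non_connective))
```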

Slide 5: PTB Affecting PDTB Choices
- Discourse relations occurring intra-sententially could have been marked in the underlying annotation if not constrained by certain syntactic choices.
[Tree diagram: PTB parse over the words "When John was hired", "he", "says", "Sue had already left", with node labels S, SBAR-TMP, WHADVP-1 (WRB When), NP-SBJ (PRP he), VP and VBZ (says).]
- Syntax incorrectly forces attribution to be the temporally modified element.
- Syntax assumption: all words/phrases must be connected in a tree!

Slide 6: What else could be annotated
- Attribution phrases: since they often lead to a mismatch with discourse arguments of connectives
  - When Max was hired, he says Sue had already left.
  - Representative list obtainable from PDTB.
  - Directly observable during syntactic annotation.
- Alternative Lexicalizations (AltLex): lexical realizations of discourse relations with non-connective expressions
  - Mary has been depressed lately. The reason: she failed
  - Representative list obtainable from PDTB.
  - May involve some multi-sentence processing.

Slide 7: Methodology and Quality Control
- Choices made at more basic levels should make the task easier for discourse-level annotations.
  - Do some annotations at more basic levels if it prevents a reassessment of annotator choices/judgements.
  - Quality control can be done by checking existing annotations (or representative samples thereof).
- Stand-off annotation: prevents incompatibilities in representation where unavoidable
- Alignments with other layers to check for incompatibilities, e.g., attribution in PDTB and PTB
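An alignment check of the kind named in the last bullet might look like the following Python sketch. It is an illustration under stated assumptions, not the PDTB project's procedure: spans are plain character-offset pairs, the three spans are picked by hand from the example sentence on slide 6, and the function names are hypothetical. The check flags a case where the clause that syntax treats as modified includes an attribution phrase that the discourse argument leaves out.

```python
from typing import Tuple

Span = Tuple[int, int]  # half-open character offsets [start, end)


def span_of(raw: str, phrase: str) -> Span:
    """Offsets of the first occurrence of phrase in raw."""
    start = raw.index(phrase)
    return (start, start + len(phrase))


def contains(outer: Span, inner: Span) -> bool:
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def overlaps(a: Span, b: Span) -> bool:
    return a[0] < b[1] and b[0] < a[1]


def attribution_mismatch(syntactic_clause: Span,
                         discourse_arg: Span,
                         attribution: Span) -> bool:
    """True when the syntactic clause includes the attribution phrase
    but the discourse argument excludes it."""
    return (contains(syntactic_clause, attribution)
            and not overlaps(discourse_arg, attribution))


RAW = "When Max was hired, he says Sue had already left."

# Hand-picked spans (assumptions made for this illustration only).
syntactic_clause = span_of(RAW, "he says Sue had already left")  # syntax layer
discourse_arg = span_of(RAW, "Sue had already left")             # discourse layer
attribution = span_of(RAW, "he says")                            # attribution span

print(attribution_mismatch(syntactic_clause, discourse_arg, attribution))
```

Running such a check over aligned layers would produce a list of candidate incompatibilities for manual review rather than an automatic correction.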
