Corpus-based Induction of an LFG Syntax-Semantics Interface for Frame Semantic Processing Anette Frank, Jiří Semecký

1 Corpus-based Induction of an LFG Syntax-Semantics Interface for Frame Semantic Processing Anette Frank, Jiří Semecký frank@coli.uni-sb.de semecky@ufal.ms.mff.cuni.cz

2 Overview
LFG 2004, Christchurch, July 10, 2004
- State of the art
  - Frame Semantics and FrameNet project
  - SALSA frame annotation project
  - LFG syntax-semantics interface for Frame Semantics
- Our work
  - Porting SALSA frame annotations to LFG
  - Special phenomena
  - Extraction of frame assignment rules
- Conclusion
  - Current data and results
  - Summary
  - Next steps [and Application]

3 Frame Semantics
- Frame Semantics (Fillmore 1976, 1977, ...)
  - Frame: a conceptual structure or prototypical situation, e.g. "SPD requests that coalition talk about reform."
  - The sentence evokes the frame REQUEST, with frame elements (frame semantic roles) that identify the participants
    - SPEAKER: SPD
    - ADDRESSEE: coalition
    - MESSAGE: talk about reform
  - Frame evoking elements (verbs, nouns, adjectives, ...) introduce frames
- FrameNet
  - Berkeley FrameNet II Project
  - Database of frames for a lexicon of English
  - Definition of frames and frame semantic roles
  - Inheritance relations among frames
  - Selected and manually annotated example sentences

4 SALSA: Saarbrücken Lexical Semantics Annotation and Analysis Project
- German FrameNet "light"
  - Creating a large semantically annotated corpus of German
  - Building on FrameNet DB definitions of frames and roles
  - Strongly corpus-oriented
- Methods and aims
  - Manual annotation on top of the syntactically annotated TIGER corpus
  - (Semi-)automatic semantic annotation of larger corpora
  - Automatic acquisition of a lexical semantic resource
  - Semantics-based information access in NLP applications
- Focus of our work
  - Induction of an LFG syntax-semantics interface for frame semantics from a manually annotated corpus

7 SALSA Example
SPD fordert Koalition zu Gespräch über Reform auf.
"SPD requests that coalition talk about reform."
- TIGER
  - Newspaper corpus
  - 1.5 million words
- TIGER annotation scheme
  - Syntactic constituents
  - Functional role labels (SB, HD, ...)
  - Crossing edges (word order)
- SALSA frame annotation
  - The frame evoking element (FEE), here "fordert ... auf", projects the frame
  - Frame elements (FEs) of the frame are connected to syntactic constituents

9 From SALSA to LFG
- Automatic semantic frame assignment
  - Broad-coverage grammar
  - High accuracy
  - Portability of manual SALSA/TIGER frame annotations
- German LFG grammar (IMS, Univ. Stuttgart)
  - Used for TIGER annotation: 50% coverage, 70% precision
  - Further extension of coverage
  - OT-based and statistical disambiguation
- A general syntax-semantics interface
  - LFG f-structures provide a good level of abstraction
  - PARGRAM: common f-structure design principles for different languages allow the study of generalizations across languages

10 An LFG Frame Semantics Projection
- Projection from f-structure
SPD fordert Koalition zu Gespräch über Reform auf.
"SPD requests that coalition talk about reform."

13 An LFG Frame Semantics Projection
- Co-description: lexicon entry for frame projection

  auffordern V, (↑ PRED) = ‘AUFFORDERN ’ ...
    (σ(↑) FRAME) = REQUEST
    (σ(↑) FEE) = (↑ PRED FN)
    (σ(↑) SPEAKER) = σ(↑ SUBJ)
    (σ(↑) ADDRESSEE) = σ(↑ OBJ)
    (σ(↑) MESSAGE) = σ(↑ OBL OBJ)

- Description by Analysis: transfer rule for frame projection

  pred(X, auffordern), subj(X, A), obj(X, B), obl(X, C), obj(C, D)
  ==>
  +σ(X, SemX), +frame(SemX, request), +fee(SemX, auffordern),
  +σ(A, SemA), +speaker(SemX, SemA),
  +σ(B, SemB), +addressee(SemX, SemB),
  +σ(D, SemD), +message(SemX, SemD).
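The transfer rule for auffordern can be read as a pattern match over f-structure facts followed by the addition of semantic facts. The following Python sketch imitates that reading on a toy f-structure. It is not XLE transfer syntax, and the node names (`f0`, `f1`, ...), the dictionary layout, and the helper names `follow` and `project_frame` are assumptions made for illustration only.

```python
# Toy f-structure for "SPD fordert Koalition zu Gespräch über Reform auf."
# Node names and layout are hypothetical, not the output of a real parser.
fstructure = {
    "f0": {"PRED": "auffordern", "SUBJ": "f1", "OBJ": "f2", "OBL": "f3"},
    "f1": {"PRED": "SPD"},
    "f2": {"PRED": "Koalition"},
    "f3": {"PRED": "zu", "OBJ": "f4"},
    "f4": {"PRED": "Gespräch"},
}

# Frame assignment for auffordern: each role maps to a path of grammatical functions.
REQUEST_RULE = {
    "pred": "auffordern",
    "frame": "REQUEST",
    "roles": {
        "SPEAKER": ["SUBJ"],
        "ADDRESSEE": ["OBJ"],
        "MESSAGE": ["OBL", "OBJ"],
    },
}


def follow(fs, node, path):
    """Follow a path of grammatical functions; return None if it breaks off."""
    for gf in path:
        node = fs.get(node, {}).get(gf)
        if node is None:
            return None
    return node


def project_frame(fs, rule):
    """Return one frame per node whose PRED and role paths all match the rule."""
    frames = []
    for node, feats in fs.items():
        if feats.get("PRED") != rule["pred"]:
            continue
        targets = {role: follow(fs, node, path) for role, path in rule["roles"].items()}
        if None in targets.values():
            continue  # a full rule fires only if every path is present
        frame = {"FRAME": rule["frame"], "FEE": feats["PRED"]}
        frame.update({role: fs[target]["PRED"] for role, target in targets.items()})
        frames.append(frame)
    return frames


frames = project_frame(fstructure, REQUEST_RULE)
print(frames)
```

The sketch stores the PRED of each role filler rather than a separate semantic node, which keeps the output readable but simplifies the σ projection used on the slide.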

14 Corpus-based induction of frame assignment rules
- Step 1: Porting SALSA annotations to LFG
  - Using the "parallel" LFG corpus of TIGER
  - To obtain an LFG-frame corpus
- Step 2: Induction of general frame assignment rules from the LFG-frame corpus
  - Can be applied to the f-structure output of LFG parsing of new sentences

15 Porting SALSA Annotations to LFG
- Frame evoking elements (FEE) and frame elements (FE) are connected to syntactic constituents identified by IDs
- Extracting frame-constituting information from SALSA/TIGER annotations
  - FRAME, TIGER constituent identifiers of FEE and FEs

  FRAME: Request
  FEE ID: {2, 8}
  SPEAKER: 1
  ADDRESSEE: 3
  MESSAGE: 501

16 Porting SALSA Annotations to LFG
- "Parallel" TIGER corpus consisting of automatically derived LFG f-structures (Forst 2003)
  - Using treebank conversion methods
  - Preserves TIGER constituent information (ID)

17 Porting SALSA Annotations to LFG: An LFG Corpus with Frame Semantic Projection
- Identify f-structure nodes of FEE and FEs, using IDs as anchors
- Define the semantic projection for the frame and all its frame elements
  - Using rewrite rules of the XLE transfer system

18 Special Phenomena
- Multiple constituents
  - Asymmetric embedding
  - Coordination
- Multiword expressions
- Underspecification

19 Special Phenomena: Multiword Expressions
- An idiomatic expression evokes a frame for its non-literal meaning
  - "über die Ladentheke gehen": "sell"
- Project the individual components to a set-valued FEE-MWE
Vier Artikel gingen über die Ladentheke.
Four items went over the counter.
"Four items were sold."
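One way to picture the set-valued FEE-MWE is to check how many distinct f-structure nodes the annotated FEE tokens map to. In the sketch below a particle verb such as "fordert ... auf" maps to a single node and gets an ordinary FEE, while the idiom maps to several nodes and gets a set. The token IDs for the idiom, the node names, the frame label `SELL`, and the function `fee_projection` are all hypothetical.

```python
def fee_projection(fee_ids, id_to_node, node_pred, frame):
    """Project a frame from FEE tokens; use a set-valued FEE-MWE for multiword FEEs."""
    nodes = {id_to_node[token_id] for token_id in fee_ids}
    sem = {"FRAME": frame}
    if len(nodes) == 1:
        # All FEE tokens belong to one f-structure node (e.g. verb plus particle).
        sem["FEE"] = node_pred[next(iter(nodes))]
    else:
        # Components of an idiom live in different nodes: collect them in a set.
        sem["FEE-MWE"] = {node_pred[node] for node in nodes}
    return sem


# "Vier Artikel gingen über die Ladentheke." (token IDs are invented)
idiom = fee_projection(
    fee_ids={3, 4, 6},
    id_to_node={3: "f0", 4: "f1", 6: "f2"},
    node_pred={"f0": "gehen", "f1": "über", "f2": "Ladentheke"},
    frame="SELL",  # hypothetical frame label for the non-literal meaning
)

# "SPD fordert ... auf.": tokens 2 and 8 map to the same node.
particle_verb = fee_projection(
    fee_ids={2, 8},
    id_to_node={2: "f0", 8: "f0"},
    node_pred={"f0": "auffordern"},
    frame="REQUEST",
)

print(idiom)
print(particle_verb)
```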

20 Special Phenomena: Underspecification
- Underspecification in the SALSA annotation scheme
  - FEE: verlangen (demand) may evoke the COMMERCIAL TRANSACTION or the REQUEST frame
  - FE: Antrag (request) in the REQUEST frame may be SPEAKER or MEDIUM
  - Frames and FEs may be marked "underspecified" or optional
- LFG: model underspecification as disjunction (via optional transfer rules)
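Treating underspecification as disjunction means that an underspecified annotation stands for a set of fully specified readings. The sketch below enumerates those readings with a Cartesian product over the alternatives. It only illustrates the idea and is not the optional-rule mechanism of the XLE transfer system; the slot name `ROLE_OF_ANTRAG` and the function `expand_readings` are made up for the example.

```python
from itertools import product


def expand_readings(slots):
    """Enumerate every fully specified reading of an underspecified annotation."""
    names = list(slots)
    return [
        dict(zip(names, combo))
        for combo in product(*(slots[name] for name in names))
    ]


# verlangen: the frame itself is underspecified.
verlangen = expand_readings({"FRAME": ["COMMERCIAL_TRANSACTION", "REQUEST"]})

# Antrag in the REQUEST frame: the role is underspecified.
antrag = expand_readings({
    "FRAME": ["REQUEST"],
    "ROLE_OF_ANTRAG": ["SPEAKER", "MEDIUM"],
})

print(verlangen)
print(antrag)
```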

21 Corpus-based induction of frame rules
- Step 1: Porting SALSA annotations to LFG
  - Using the "parallel" LFG corpus of TIGER
  - To obtain an LFG-frame corpus
  - Rules anchored to node IDs
- Step 2: Induction of general frame assignment rules from the LFG-frame corpus
  - Can be applied to the f-structure output of LFG parsing of new sentences
  - Rules anchored to functional descriptions

  FE assignment (auffordern)
    (↑ SUBJ) – SPEAKER
    (↑ OBJ) – ADDRESSEE
    (↑ OBL OBJ) – MESSAGE

22 Extraction of Functional Paths
- FE assignment paths
  - Paths relative to the FEE
  - Local and non-local
  - Non-local = with inside-out relative path
  - Prefer local to non-local
SPD verspricht Wählern, Beschlüsse mitzuteilen.
"SPD promises voters to report decisions."

            Relative f-path
  FEE       ↑
  MESSAGE   ↑ OBJ             (local)
  SPEAKER   (XCOMP ↑) SUBJ    (non-local)
            ↑ SUBJ            (local)

23 Extraction of Functional Paths
- Prefer local to non-local
  - SPEAKER => choose ↑ SUBJ
- Among ambiguous non-local paths, choose the "shortest non-local sub-path"
  - Prefer (XCOMP ↑) SUBJ to (XCOMP XCOMP ↑) SUBJ
- Non-local paths of equal length are considered equally good
  - Choose both (XCOMP ↑) OBJ and (ADJ ↑) OBJ
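The three preferences above can be sketched as a small selection function. Each candidate path is written as a pair of tuples: the inside-out part (empty for a local path) and the outside-in part. This encoding and the name `select_paths` are assumptions for illustration, and keeping all local paths when several exist is a choice of the sketch, since the slide does not cover that case.

```python
def select_paths(candidates):
    """Pick FE assignment paths: local first, else the shortest inside-out part."""
    if not candidates:
        return []
    local = [path for path in candidates if not path[0]]
    if local:
        return local  # prefer local to non-local
    shortest = min(len(path[0]) for path in candidates)
    # Non-local paths of equal (minimal) length are all kept.
    return [path for path in candidates if len(path[0]) == shortest]


# SPEAKER of mitteilen: (XCOMP ↑) SUBJ versus ↑ SUBJ
speaker = select_paths([(("XCOMP",), ("SUBJ",)), ((), ("SUBJ",))])

# (XCOMP ↑) SUBJ versus (XCOMP XCOMP ↑) SUBJ
nested = select_paths([(("XCOMP", "XCOMP"), ("SUBJ",)), (("XCOMP",), ("SUBJ",))])

# (XCOMP ↑) OBJ and (ADJ ↑) OBJ have equal length, so both survive.
tie = select_paths([(("XCOMP",), ("OBJ",)), (("ADJ",), ("OBJ",))])

print(speaker, nested, tie)
```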

24 Applying rules to new sentences
- mitteilen: COMMUNICATION; SUBJ → SPEAKER, OBJ → MESSAGE
- Complete frames with all frame elements
  - As instantiated in the corpus

  pred(X, mitteilen), 'SUBJ'(X, A), 'OBJ'(X, B)
  ==>
  +σ(X, SemX), +frame(SemX, communication), +fee(SemX, mitteilen),
  +σ(A, SemA), +speaker(SemX, SemA),
  +σ(B, SemB), +message(SemX, SemB).

- Problem: unseen configurations (sparse data problem)
- Partial annotation
  - Individual rule for the FEE

    pred(X, mitteilen)
    ==>
    +σ(X, SemX), +frame(SemX, communication), +fee(SemX, mitteilen).

  - Individual rules for each FE of the frame (conditioned on the FEE)

    pred(X, mitteilen), σ(X, SemX), frame(SemX, communication), 'SUBJ'(X, A)
    ==>
    +σ(A, SemA), +speaker(SemX, SemA).
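The difference between complete-frame rules and partial rules shows up on a configuration that was not seen in the corpus. The sketch below applies both strategies to a flat, hypothetical f-structure for mitteilen that has a SUBJ but no OBJ. The dictionary layout and the functions `apply_full` and `apply_partial` are assumptions, not the transfer rules themselves.

```python
RULE = {
    "pred": "mitteilen",
    "frame": "COMMUNICATION",
    "roles": {"SPEAKER": "SUBJ", "MESSAGE": "OBJ"},
}


def apply_full(node, rule):
    """Complete-frame rule: fires only if the FEE and every FE function are present."""
    if node.get("PRED") != rule["pred"]:
        return None
    if any(gf not in node for gf in rule["roles"].values()):
        return None
    sem = {"FRAME": rule["frame"], "FEE": node["PRED"]}
    sem.update({role: node[gf] for role, gf in rule["roles"].items()})
    return sem


def apply_partial(node, rule):
    """Partial rules: one rule for the FEE, then one independent rule per FE."""
    if node.get("PRED") != rule["pred"]:
        return None
    sem = {"FRAME": rule["frame"], "FEE": node["PRED"]}  # FEE rule
    for role, gf in rule["roles"].items():  # one FE rule per role
        if gf in node:
            sem[role] = node[gf]
    return sem


seen = {"PRED": "mitteilen", "SUBJ": "SPD", "OBJ": "Beschlüsse"}
unseen = {"PRED": "mitteilen", "SUBJ": "SPD"}  # no OBJ in this configuration

print(apply_full(seen, RULE))
print(apply_full(unseen, RULE))     # the complete-frame rule does not fire
print(apply_partial(unseen, RULE))  # frame and SPEAKER are still assigned
```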

25 Current Data and Results
- Data used:
  - 12127 frame assignment rules
  - 10009 sentences
- Successfully ported frames: 11612
- Compiled transfer rules after path extraction: 9334
- Local vs. non-local FE assignments: 87.18% vs. 12.82%
- Ambiguity rate:
  - Average 8.83 rules per FEE
  - Average 41.27 rules per frame

26 Current Data and Results
- Re-applying syntax-semantics mapping rules to the TIGER-LFG corpus

                  Recall    Precision   Ambiguity rate / sentence
  Full frame      93.98%    25.94%      8.46
  Partial frame   94.98%    45.52%      7.83

- Applying syntax-semantics mapping rules to free LFG parsing (without statistical disambiguation)

                  Recall    Precision   Ambiguity rate / sentence
  Full frame      52.21%    6.93%       13.35
  Partial frame   76.41%    18.32%      15.79

27 Summary
- Modeling frame semantics in the LFG framework
- Porting frame annotations from TIGER/SALSA to an LFG corpus
- Extracting general frame assignment rules for LFG parsing
- Applying frame assignment rules in an LFG parsing architecture

28 Next steps
- Semantically driven syntactic disambiguation
  - Reduce ambiguity of syntactic parses
  - Prefer parses with corresponding semantic annotation
- Stochastic modeling for semantic role assignment
  - Training stochastic models on the basis of corpus annotations
  - For disambiguation of disjunctive frame assignments
  - XLE: statistical ME package for training and online disambiguation

29 LFG for Frame Annotation
- Abstraction from surface properties
  - The woman who had come in [to sell flowers to the customers] overheard their conversation.
  - We decided to sink some of our capital, buy a car, and [sell it again before leaving].
  - Don't sell the factory to another company.
  - They persuaded him [to sell Newmont shares quickly].
- Localisation of arguments in long-distance constructions and coordination
- F-structure representation of non-overt material (imperative, control/raising)
- SELLER corresponds to the local SUBJ in f-structure

30 Porting SALSA Annotations to LFG
- Using rewrite rules of the XLE transfer system
  - Parameters are the node IDs for FEE and FEs from the SALSA annotations
- First project the frame from the FEE

  ti_id(X, FeeID), pred(X, Pred)
  ==>
  +'s::'(X, SemX), +frame(SemX, FRAME), +fee(SemX, Pred).

- Then define semantic projections for all FEs of the FEE

  ti_id(X, FeeID), 's::'(X, SemX), frame(SemX, FRAME), ti_id(Y, RoleID), pred(Y, Pred)
  ==>
  +'s::'(Y, SemY), +'Role'(SemX, SemY), +rel(SemY, Pred).
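The two rewrite rules depend on each other: the FE rule only matches once the FEE rule has added the `s::` and `frame` facts. The Python sketch below mimics that ordering on a set of fact tuples. It is not XLE transfer syntax; the node names, the assignment of TIGER IDs to f-structure nodes, the naming of semantic nodes (`s_f0`, ...), and the helper functions are all assumptions.

```python
# Facts of a toy f-structure; which node carries which TIGER ID is invented here.
facts = {
    ("ti_id", "f0", 2), ("ti_id", "f0", 8), ("pred", "f0", "auffordern"),
    ("ti_id", "f1", 1), ("pred", "f1", "SPD"),
    ("ti_id", "f2", 3), ("pred", "f2", "Koalition"),
    ("ti_id", "f4", 501), ("pred", "f4", "Gespräch"),
}


def nodes_with_id(facts, tiger_id):
    """Return the f-structure nodes anchored to a TIGER ID."""
    return [node for kind, node, value in facts if kind == "ti_id" and value == tiger_id]


def pred_of(facts, node):
    """Return the PRED value of a node, or None if it has none."""
    for kind, n, value in facts:
        if kind == "pred" and n == node:
            return value
    return None


def project_fee(facts, fee_id, frame):
    """Rule 1: project the frame from the node anchored to the FEE ID."""
    new = set(facts)
    for x in nodes_with_id(facts, fee_id):
        sem_x = "s_" + x
        new |= {("s::", x, sem_x), ("frame", sem_x, frame), ("fee", sem_x, pred_of(facts, x))}
    return new


def project_role(facts, fee_id, frame, role, role_id):
    """Rule 2: add a role, but only where rule 1 has already projected the frame."""
    new = set(facts)
    for x in nodes_with_id(facts, fee_id):
        sem_x = "s_" + x
        if ("s::", x, sem_x) not in facts or ("frame", sem_x, frame) not in facts:
            continue
        for y in nodes_with_id(facts, role_id):
            sem_y = "s_" + y
            new |= {("s::", y, sem_y), (role, sem_x, sem_y), ("rel", sem_y, pred_of(facts, y))}
    return new


ported = project_fee(facts, 2, "REQUEST")
for role, role_id in [("speaker", 1), ("addressee", 3), ("message", 501)]:
    ported = project_role(ported, 2, "REQUEST", role, role_id)

print(sorted(fact for fact in ported if fact[0] not in ("ti_id", "pred")))
```

Calling `project_role` on the original `facts` returns them unchanged, which mirrors the dependence of the second rule on the output of the first.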

31 Special Phenomena: Asymmetric Embedding
- Multiple constituents are marked with a single semantic role (MESSAGE)
- Frequent with two constituents: asymmetric embedding
Der ... Geschäftsführer ... gab [als Grund für die Absage] [Terminnöte Schmidts] an.
"The director ... mentioned [time conflicts of Schmidt] [as a reason for cancelling the appointment]."

33 Special Phenomena: Asymmetric Embedding
- Transfer rules create an embedded frame structure
  - One constituent within the other
  - In both possible ways
- For the embedding, we introduce a new underspecified frame, similar to functional uncertainty
Der ... Geschäftsführer ... gab [als Grund für die Absage] [Terminnöte Schmidts] an.
"The director ... mentioned [time conflicts of Schmidt] [as a reason for cancelling the appointment]."
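A minimal sketch of "in both possible ways": given two constituents that share one role, build one reading per embedding direction, each wrapped in an underspecified frame. The feature names `HEAD` and `EMBEDDED`, the frame label `UNSPEC`, and the function `embed_both_ways` are hypothetical and only stand in for whatever structure the transfer rules actually create.

```python
def embed_both_ways(role, first, second):
    """Return both embeddings of two constituents that share a single role."""
    return [
        {role: {"FRAME": "UNSPEC", "HEAD": outer, "EMBEDDED": inner}}
        for outer, inner in ((first, second), (second, first))
    ]


# MESSAGE is realised by "als Grund für die Absage" and "Terminnöte Schmidts".
readings = embed_both_ways("MESSAGE", "Grund", "Terminnöte")
for reading in readings:
    print(reading)
```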

34 Semantically driven syntactic disambiguation
- Reduce ambiguity of syntactic parses
  - Consider only parses with corresponding semantic annotation
  - Prefer parses with richer semantic annotation (probabilistic parsing)

35 Current Data and Results
- Data used: 7041 frame assignments for 6335 sentences
- Special phenomena

          coord   usp     MWE      AE      >dbl    all
  abs     382     395     1287     421     97      12436
  in %    5.43%   3.82%   10.35%   3.39%   0.78%   100%

- Successfully ported frames:
- Compiled transfer rules after path extraction:
- Local vs. non-local FE assignments: xx.yy% vs. xx.yy%
- Ambiguity rate:
  - Rules per FEE: avg. / min / max
  - Rules per frame: avg. / min / max
  - Only the entries "1 1" are filled in

