Learning to Transform Natural to Formal Languages


1 Learning to Transform Natural to Formal Languages
Rohit J. Kate, Yuk Wah Wong, Raymond J. Mooney
July 13, 2005

2 Introduction
Semantic parsing: transforming natural language sentences into complete, executable formal representations
Different from semantic role labeling, which involves only shallow semantic analysis
Two application domains:
CLang: RoboCup Coach Language
GeoQuery: a database query application

3 CLang: RoboCup Coach Language
In the RoboCup Coach competition, teams compete to coach simulated soccer players. The coaching instructions are given in a formal language called CLang.
Example: the coach's advice "If the ball is in our penalty area, then all our players except player 4 should stay in our half." is semantically parsed into the CLang expression:
((bpos (penalty-area our)) (do (player-except our {4}) (pos (half our))))

4 GeoQuery: A Database Query Application
Query application for a U.S. geography database containing about 800 facts [Zelle & Mooney, 1996].
Example: the user's question "How many cities are there in the US?" is semantically parsed into the query:
answer(A, count(B, (city(B), loc(B, C), const(C, countryid(USA))), A))

5 Outline
Semantic Parsing using Transformation Rules
Learning Transformation Rules
Experiments
Conclusions

6 Semantic Parsing using Transformation Rules
SILT (Semantic Interpretation by Learning Transformations)
Uses pattern-based transformation rules that map natural language phrases to formal language constructs
Transformation rules are repeatedly applied to the sentence to construct its formal language expression

7 Formal Language Grammar
NL: If our player 4 has the ball, our player 4 should shoot.
CLang: ((bowner our {4}) (do our {4} shoot))
CLang parse:
Non-terminals: RULE, CONDITION, ACTION, ...
Terminals: bowner, our, 4, ...
Productions:
RULE → CONDITION DIRECTIVE
DIRECTIVE → do TEAM UNUM ACTION
ACTION → shoot
[Parse-tree figure: RULE branches to CONDITION and DIRECTIVE; DIRECTIVE branches to do, TEAM, UNUM, ACTION; leaves are bowner, our, 4, shoot]

8 Transformation Rule Representation
A rule has two components: a natural language pattern and an associated formal language template.
Two versions of SILT:
String-based rules: convert the natural language sentence directly to the formal language
Tree-based rules: convert the syntactic tree to the formal language
String pattern: TEAM UNUM has [1] ball    ([1] is a word gap)
Template: CONDITION → (bowner TEAM {UNUM})
Tree pattern: [figure: S branching to NP (TEAM UNUM) and VP (VBZ "has", NP (DT "the", NN "ball"))]
Template: CONDITION → (bowner TEAM {UNUM})
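As a minimal sketch, the two rule kinds could be represented as plain records. The field names and the SyntaxTree type are hypothetical; the slides do not show SILT's actual data structures.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class StringRule:
    pattern: List[str]    # words, non-terminals, and word gaps like "[1]"
    nonterminal: str      # template left-hand side, e.g. "CONDITION"
    template: str         # formal-language right-hand side

@dataclass
class SyntaxTree:
    label: str                                  # e.g. "NP", "VBZ", or "TEAM"
    children: List["SyntaxTree"] = field(default_factory=list)

@dataclass
class TreeRule:
    pattern: SyntaxTree   # syntactic tree fragment with non-terminal leaves
    nonterminal: str
    template: str

# The string rule from this slide:
rule = StringRule(["TEAM", "UNUM", "has", "[1]", "ball"],
                  "CONDITION", "(bowner TEAM {UNUM})")
print(rule.nonterminal, "->", rule.template)
# -> CONDITION -> (bowner TEAM {UNUM})
```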

9 Example of Semantic Parsing (slides 9–20 step through this derivation)
Sentence: If our player 4 has the ball, our player 4 should shoot.
Rules:
our    TEAM → our
player 4    UNUM → 4
shoot    ACTION → shoot
TEAM UNUM has [1] ball    CONDITION → (bowner TEAM {UNUM})
TEAM UNUM should ACTION    DIRECTIVE → (do TEAM {UNUM} ACTION)
If CONDITION , DIRECTIVE .    RULE → (CONDITION DIRECTIVE)
Derivation:
If our player 4 has the ball, our player 4 should shoot.
If TEAM player 4 has the ball, TEAM player 4 should shoot.    (TEAM = our)
If TEAM UNUM has the ball, TEAM UNUM should shoot.    (UNUM = 4)
If TEAM UNUM has the ball, TEAM UNUM should ACTION.    (ACTION = shoot)
If CONDITION, TEAM UNUM should ACTION.    (CONDITION = (bowner our {4}))
If CONDITION, DIRECTIVE.    (DIRECTIVE = (do our {4} shoot))
RULE    (RULE = ((bowner our {4}) (do our {4} shoot)))
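The derivation above can be sketched as a rewrite loop. This is an illustrative toy, not the actual SILT implementation: the [1] word gap is approximated by the literal word "the", and the rule and token encodings are invented for the example.

```python
# Toy sketch of transformation-rule application (hypothetical encoding).
# A token is a plain word or a (NONTERMINAL, meaning) pair. A rule is
# (pattern, nonterminal, template); pattern items are literal words or
# non-terminal names, and the template fills in matched meanings.

def apply_rule(tokens, pattern, nonterm, template):
    """Replace the first match of pattern in tokens with (nonterm, meaning)."""
    m = len(pattern)
    for i in range(len(tokens) - m + 1):
        bindings = {}
        for j, p in enumerate(pattern):
            tok = tokens[i + j]
            if isinstance(tok, tuple) and tok[0] == p:
                bindings[p] = tok[1]          # non-terminal item: bind meaning
            elif tok != p:
                break                         # literal word item: must match
        else:
            meaning = template.format(**bindings)
            return tokens[:i] + [(nonterm, meaning)] + tokens[i + m:]
    return None

def semantic_parse(words, rules):
    """Repeatedly apply rules until no rule fires anywhere."""
    tokens, changed = list(words), True
    while changed:
        changed = False
        for pattern, nonterm, template in rules:
            out = apply_rule(tokens, pattern, nonterm, template)
            if out is not None:
                tokens, changed = out, True
    return tokens

rules = [
    (["our"], "TEAM", "our"),
    (["player", "4"], "UNUM", "4"),
    (["shoot"], "ACTION", "shoot"),
    (["TEAM", "UNUM", "has", "the", "ball"],          # "[1]" gap ~ "the"
     "CONDITION", "(bowner {TEAM} {{{UNUM}}})"),
    (["TEAM", "UNUM", "should", "ACTION"],
     "DIRECTIVE", "(do {TEAM} {{{UNUM}}} {ACTION})"),
    (["if", "CONDITION", ",", "DIRECTIVE", "."],
     "RULE", "({CONDITION} {DIRECTIVE})"),
]

sentence = "if our player 4 has the ball , our player 4 should shoot .".split()
print(semantic_parse(sentence, rules))
# -> [('RULE', '((bowner our {4}) (do our {4} shoot))')]
```

Each pass applies each rule at most once, so the loop runs to a fixpoint; on the slide's sentence it terminates with the single RULE token shown in the derivation.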

21 Learning Transformation Rules
SILT induces rules from a corpora of NL sentences paired with their formal representations Patterns are learned for each production by bottom-up rule learning For every production: Call those sentences positives whose formal representations’ parses use that production Call the remaining sentences negatives

22 Rule Learning for a Production
SILT applies a greedy-covering, bottom-up rule induction method that repeatedly generalizes positives until they start covering negatives.
Production: CONDITION → (bpos REGION)
Positives:
The ball is in REGION , our player 7 is in REGION and no opponent is around our player 7 within 1.5 distance.
If the ball is in REGION and not in REGION then player 3 should intercept the ball.
During normal play if the ball is in the REGION then player 7 , 9 and 11 should dribble the ball to the REGION .
When the play mode is normal and the ball is in the REGION then our player 2 should pass the ball to the REGION .
All players except the goalie should pass the ball to REGION if it is in RP18.
If the ball is inside rectangle ( -54 , -36 , 0 , 36 ) then player 10 should position itself at REGION with a ball attraction of REGION .
Player 2 should pass the ball to REGION if it is in REGION .
Negatives:
If our player 6 has the ball then he should take a shot on goal.
If player 4 has the ball , it should pass the ball to player 2 or 10.
If the condition DR5C3 is true , then player 2 , 3 , 7 and 8 should pass the ball to player 3.
During play on , if players 6 , 7 or 8 is in REGION , they should pass the ball to players 9 , 10 or 11.
If "Clear_Condition" , players 2 , 3 , 7 or 5 should clear the ball REGION .
If it is before the kick off , after our goal or after the opponent's goal , position player 3 at REGION .
If the condition MDR4C9 is met , then players 4-6 should pass the ball to player 9.
If Pass_11 then player 11 should pass to player 9 and no one else.

23 Generalization of String Patterns
ACTION → (pos REGION)
Pattern 1: Always position player UNUM at REGION .
Pattern 2: Whenever the ball is in REGION , position player UNUM near the REGION .
Find the highest scoring common subsequence:

24 Generalization of String Patterns (continued)
Generalization: position player UNUM [2] REGION .

25 Generalization of Tree Patterns
REGION → (penalty-area TEAM)
Pattern 1: [tree: NP branching to PRP$ (TEAM), NN "penalty", NN "area"]
Pattern 2: [tree: NP branching to NP (TEAM, POS "'s"), NN "penalty", NN "box"]
Find common subgraphs.

26 Generalization of Tree Patterns (continued)
Generalization: [tree: NP branching to * (TEAM), NN "penalty", NN *]

27 Rule Learning for a Production
CONDITION  (bpos REGION) positives negatives The ball is in REGION , our player 7 is in REGION and no opponent is around our player 7 within 1.5 distance. If the ball is in REGION and not in REGION then player 3 should intercept the ball. During normal play if the ball is in the REGION then player 7 , 9 and 11 should dribble the ball to the REGION . When the play mode is normal and the ball is in the REGION then our player 2 should pass the ball to the REGION . All players except the goalie should pass the ball to REGION if it is in REGION. If the ball is inside REGION then player 10 should position itself at REGION with a ball attraction of REGION . Player 2 should pass the ball to REGION if it is in REGION . If our player 6 has the ball then he should take a shot on goal. If player 4 has the ball , it should pass the ball to player 2 or 10. If the condition DR5C3 is true , then player 2 , 3 , 7 and 8 should pass the ball to player 3. During play on , if players 6 , 7 or 8 is in REGION , they should pass the ball to players 9 , 10 or 11. If "Clear_Condition" , players 2 , 3 , 7 or 5 should clear the ball REGION . If it is before the kick off , after our goal or after the opponent's goal , position player 3 at REGION . If the condition MDR4C9 is met , then players 4-6 should pass the ball to player 9. If Pass_11 then player 11 should pass to player 9 and no one else. Bottom-up Rule Learner ball is [2] REGION CONDITION  (bpos REGION) it is in REGION CONDITION  (bpos REGION)

28 Rule Learning for a Production
CONDITION  (bpos REGION) positives negatives The CONDITION , our player 7 is in REGION and no opponent is around our player 7 within 1.5 distance. If the CONDITION and not in REGION then player 3 should intercept the ball. During normal play if the CONDITION then player 7 , 9 and 11 should dribble the ball to the REGION . When the play mode is normal and the CONDITION then our player 2 should pass the ball to the REGION . All players except the goalie should pass the ball to REGION if CONDITION. If the CONDITION then player 10 should position itself at REGION with a ball attraction of REGION . Player 2 should pass the ball to REGION if CONDITION . If our player 6 has the ball then he should take a shot on goal. If player 4 has the ball , it should pass the ball to player 2 or 10. If the condition DR5C3 is true , then player 2 , 3 , 7 and 8 should pass the ball to player 3. During play on , if players 6 , 7 or 8 is in REGION , they should pass the ball to players 9 , 10 or 11. If "Clear_Condition" , players 2 , 3 , 7 or 5 should clear the ball REGION . If it is before the kick off , after our goal or after the opponent's goal , position player 3 at REGION . If the condition MDR4C9 is met , then players 4-6 should pass the ball to player 9. If Pass_11 then player 11 should pass to player 9 and no one else. Bottom-up Rule Learner ball is [2] REGION CONDITION  (bpos REGION) it is in REGION CONDITION  (bpos REGION)

29 Rule Learning for All Productions
Transformation rules for different productions should cooperate globally to generate complete semantic parses
Redundantly cover every positive example with the β = 5 best rules
Find the subset of these rules that best cooperates, trading off coverage against accuracy, to generate complete semantic parses on the training data
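The subset-selection step can be sketched as a greedy covering loop. Everything here is hypothetical scaffolding: parses_correctly stands in for actually running the learned rules as a parser, and the toy example below encodes each training pair simply as the set of rules it needs.

```python
def select_rules(pool, examples, parses_correctly):
    """Greedily pick rules whose addition completes the most remaining parses."""
    chosen, remaining = [], set(range(len(examples)))
    while remaining:
        best, best_gain = None, 0
        for rule in pool:
            if rule in chosen:
                continue
            gain = sum(parses_correctly(chosen + [rule], examples[i])
                       for i in remaining)
            if gain > best_gain:
                best, best_gain = rule, gain
        if best is None:                # no rule helps any remaining example
            break
        chosen.append(best)
        remaining = {i for i in remaining
                     if not parses_correctly(chosen, examples[i])}
    return chosen

# Toy stand-in: an "example" is just the set of rules its parse requires.
pool = ["r1", "r2", "r3", "r4"]
examples = [{"r1"}, {"r1", "r2"}, {"r3"}]
covers = lambda chosen, ex: ex <= set(chosen)
print(select_rules(pool, examples, covers))
# -> ['r1', 'r2', 'r3']
```

The unused rule r4 is dropped, which is the point of the step: only rules that cooperate to complete parses survive.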

30 Experimental Corpora
CLang: 300 randomly selected pieces of coaching advice from the log files of the 2003 RoboCup Coach Competition; NL sentences average 22.52 words and formal expressions average 14.24 tokens
GeoQuery [Zelle & Mooney, 1996]: 250 queries for the U.S. geography database; NL sentences average 6.87 words and formal expressions average 5.32 tokens (example query: How many cities are there in the US?)

31 Experimental Methodology
Evaluated using standard 10-fold cross validation
Syntactic parses needed by the tree-based version were obtained by training Collins' parser [Bikel, 2004] on the WSJ treebank plus gold-standard parses of the training sentences
Correctness:
CLang: the output exactly matches the correct representation
GeoQuery: the resulting query retrieves the same answer as the correct representation (the same evaluation method as previous papers)
Metrics: precision and recall
Partial match is not used: a partially correct parse can be unusable (e.g., player 2 instead of player 3, or left instead of right)
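For concreteness, here are the two metrics as commonly defined in this line of work (an assumption; the slides do not spell out the formulas): precision is the fraction of produced parses that are correct, and recall is the fraction of all test sentences parsed correctly. They differ because the parser may produce no output for some sentences.

```python
def metrics(n_correct, n_produced, n_total):
    """Precision, recall, and F-measure for a semantic parser that may
    fail to produce any parse for some sentences."""
    precision = n_correct / n_produced if n_produced else 0.0
    recall = n_correct / n_total if n_total else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

# E.g., 80 correct parses out of 90 produced, on 100 test sentences:
p, r, f = metrics(80, 90, 100)
print(round(p, 3), round(r, 3), round(f, 3))
# -> 0.889 0.8 0.842
```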

32 Compared Systems
CHILL: learns control rules for shift-reduce parsing using Inductive Logic Programming (ILP); two versions use the CHILLIN [Zelle & Mooney, 1996] and COCKTAIL [Tang & Mooney, 2001] induction algorithms
GEOBASE: hand-built parser for GeoQuery [Borland International, 1988]
SILT-string: maps the string directly to the meaning representation (MR)
SILT-tree: maps the syntactic tree to the MR
(the tree-based version takes syntax into account; the string-based one does not)

33 Precision Learning Curves for CLang

34 Recall Learning Curves for CLang

35 Precision Learning Curves for GeoQuery

36 Recall Learning Curves for GeoQuery

37 Related Work
SCISSOR [Ge & Mooney, 2005]: integrates semantic and syntactic statistical parsing; requires extensive annotation but gives better results
PRECISE [Popescu et al., 2003]: designed specifically for NL database interfaces
Speech recognition community [Zue & Glass, 2000]: handles simpler queries, as in the ATIS corpus

38 Conclusions
SILT: a new approach to semantic parsing that uses transformation rules
SILT learns transformation rules by bottom-up rule induction, exploiting the target language grammar
Tested on two very different domains, it performs better than previous ILP-based approaches

39 Thank You! Our corpora can be downloaded from:
Questions?

40 F-measure Learning Curves for CLang

41 F-measure Learning Curves for GeoQuery

42 Extra Slide: Average Training Time in Minutes
             CLang   GeoQuery
SILT-string   3.2     0.35
CHILLIN      10.4     6.3
SILT-tree    81.4    21.5
COCKTAIL      -      39.6

43 Extra Slide: Variations of Rule Representation
Context in the patterns:
in REGION    CONDITION → (bpos REGION)

44 Extra Slide: Variations of Rule Representation (continued)
More context in the patterns:
the ball in REGION    CONDITION → (bpos REGION)
TEAM UNUM has [1] ball    CONDITION → (bowner TEAM {UNUM})
(both patterns can match in "TEAM UNUM has the ball in REGION")

45 Extra Slide: Variations of Rule Representation (continued)
Templates with multiple productions:
TEAM UNUM has the ball in REGION    CONDITION → (and (bowner TEAM {UNUM}) (bpos REGION))

46 Extra Slide: Experimental Methodology
Correctness:
CLang: the output exactly matches the correct representation
GeoQuery: the resulting query retrieves the same answer as the correct representation
Example: If the ball is in our penalty area, all our players except player 4 should stay in our half.
Correct: ((bpos (penalty-area our)) (do (player-except our {4}) (pos (half our))))
Output: ((bpos (penalty-area opp)) (do (player-except our {4}) (pos (half our))))    (incorrect: opp instead of our)

47 Extra Slide: Future Work
Hard-matching symbolic patterns are sometimes too brittle; exploit string and tree kernels as classifiers [Lodhi et al., 2002]
A unified implementation of the string- and tree-based versions, for direct comparison

