David Chen Advisor: Raymond Mooney Research Preparation Exam August 21, 2008 Learning to Sportscast: A Test of Grounded Language Acquisition.


1 David Chen Advisor: Raymond Mooney Research Preparation Exam August 21, 2008 Learning to Sportscast: A Test of Grounded Language Acquisition

2 Semantics of Language The meaning of words, phrases, etc Crucial in communications 2

3 Semantics of Language
The meaning of words, phrases, etc.
Crucial in communication
Example: “Spanish goalkeeper Iker Casillas blocks the ball”
–Merriam-Webster: (transitive verb) to interfere usually legitimately with (as an opponent) in various games or sports
–WordNet: (v) parry, deflect

4 Language Grounding
Problem: We are circularly defining the meanings of words in terms of other words.
The meanings of many words are grounded in our perception of the physical world: red, ball, cup, run, hit, fall, etc.
–Symbol Grounding: Harnad (1990)
Even many abstract words and meanings are metaphorical abstractions of terms grounded in the physical world: up, down, over, in, etc.
–Lakoff and Johnson’s Metaphors We Live By
–Examples: “It’s difficult to put my ideas into words.” “Interest in competitions is up.”

5 Grounding Language Casillas blocks the ball 4

6 Grounding Language Casillas blocks the ball Block(Casillas) 5

8 Natural Language and Meaning Representation Casillas blocks the ball Block(Casillas) 6

10 Natural Language and Meaning Representation
NL: A language that has evolved naturally, such as English, German, French, Chinese, etc.
MRL: Formal languages such as logic or any computer-executable code
Natural Language (NL): Casillas blocks the ball
Meaning Representation Language (MRL): Block(Casillas)

12 Semantic Parsing and Tactical Generation
Semantic Parsing (NL → MRL): maps a natural-language sentence to a complete, detailed semantic representation
Tactical Generation (MRL → NL): generates a natural-language sentence from a meaning representation
NL: Casillas blocks the ball
MRL: Block(Casillas)

13 Learning Approach
Manually Annotated Training Corpora (NL/MRL pairs) → Semantic Parser Learner → Semantic Parser (NL → MRL)

14 Learning Approach
Manually Annotated Training Corpora (NL/MRL pairs) → Tactical Generator Learner → Tactical Generator (MRL → NL)

15 Example of Annotated Training Corpus
Natural Language (NL) | Meaning Representation Language (MRL)
Alice passes the ball to Bob | Pass(Alice, Bob)
Bob turns the ball over to John | Turnover(Bob, John)
John passes to Fred | Pass(John, Fred)
Fred shoots for the goal | Kick(Fred)
Paul blocks the ball | Block(Paul)
Paul kicks off to Nancy | Pass(Paul, Nancy)
… | …

16 Example of Annotated Training Corpus
Natural Language (NL) | Meaning Representation Language (MRL)
Alice passes the ball to Bob | P1(C1, C2)
Bob turns the ball over to John | P2(C2, C3)
John passes to Fred | P1(C3, C4)
Fred shoots for the goal | P3(C4)
Paul blocks the ball | P4(C5)
Paul kicks off to Nancy | P5(C5, C6)
… | …

17 Learning Language from Perceptual Context Constructing annotated corpora for language learning is difficult Children acquire language through exposure to linguistic input in the context of a rich, relevant, perceptual environment Ideally, a computer system can learn language in the same manner 12

18 Goals Learn to ground the semantics of language Learn language through correlated linguistic and visual inputs 13 Casillas blocks the ball

19 Challenge 14

23 Challenge
A linguistic input may correspond to many possible events
Input: “西班牙守門員擋下了球” (Chinese: “The Spanish goalkeeper blocked the ball”)
Candidate events: Pass(GermanyPlayer1, GermanyPlayer2)? Kick(GermanyPlayer2)? Block(SpanishGoalie)?

24 Overview Sportscasting task Related works Tactical generation Strategic generation Human evaluation 17

25 Learning to Sportscast
Robocup Simulation League games
No speech recognition
–Record commentaries in text form
No computer vision
–Rule-based system to automatically extract game events in symbolic form
Concentrate on linguistic issues

26 Robocup Simulation League 19

27 Robocup Simulation League 19 Purple goalie blocked the ball

28 Learning to Sportscast Learn to sportscast by observing sample human sportscasts Build a function that maps between natural language (NL) and meaning representation (MR) –NL: Textual commentaries about the game –MR: Predicate logic formulas that represent events in the game 20

29 Robocup Sportscaster Trace
Natural Language Commentary:
–Purple goalie turns the ball over to Pink8
–Pink11 looks around for a teammate
–Pink8 passes the ball to Pink11
–Purple team is very sloppy today
–Pink11 makes a long pass to Pink8
–Pink8 passes back to Pink11
Meaning Representation:
–badPass ( Purple1, Pink8 )
–turnover ( Purple1, Pink8 )
–pass ( Pink11, Pink8 )
–pass ( Pink8, Pink11 )
–ballstopped
–pass ( Pink8, Pink11 )
–kick ( Pink11 )
–kick ( Pink8 )
–kick ( Pink11 )
–kick ( Pink8 )

32 Robocup Sportscaster Trace
Natural Language Commentary:
–Purple goalie turns the ball over to Pink8
–Pink11 looks around for a teammate
–Pink8 passes the ball to Pink11
–Purple team is very sloppy today
–Pink11 makes a long pass to Pink8
–Pink8 passes back to Pink11
Meaning Representation:
–P6 ( C1, C19 )
–P5 ( C1, C19 )
–P2 ( C22, C19 )
–P2 ( C19, C22 )
–P0
–P2 ( C19, C22 )
–P1 ( C22 )
–P1 ( C19 )
–P1 ( C22 )
–P1 ( C19 )

33 Robocup Data Collected human textual commentary for the 4 Robocup championship games from 2001-2004. –Avg # events/game = 2,613 –Avg # sentences/game = 509 Each sentence matched to all events within previous 5 seconds. –Avg # MRs/sentence = 2.5 (min 1, max 12) Manually annotated with correct matchings of sentences to MRs (for evaluation purposes only). 23
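The pairing rule on this slide (each sentence matched to all events within the previous 5 seconds) can be sketched as follows. This is only an illustration: the `Event` and `Comment` classes, the timestamps, and the inclusive window boundaries are assumptions made for the example, not details taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    time: float  # seconds into the game (made-up values below)
    mr: str      # meaning representation, e.g. "pass(Pink8, Pink11)"


@dataclass(frozen=True)
class Comment:
    time: float
    text: str


def candidate_events(comment, events, window=5.0):
    """MRs of all events that occurred in the `window` seconds before the comment."""
    return [e.mr for e in events
            if comment.time - window <= e.time <= comment.time]


def build_ambiguous_corpus(comments, events, window=5.0):
    """Pair every sentence with all of its candidate MRs (ambiguous supervision)."""
    return [(c.text, candidate_events(c, events, window)) for c in comments]


# Hypothetical timestamps, chosen only to exercise the window.
events = [
    Event(10.0, "badPass(Purple1, Pink8)"),
    Event(10.5, "turnover(Purple1, Pink8)"),
    Event(17.0, "kick(Pink8)"),
    Event(18.0, "pass(Pink8, Pink11)"),
]
comments = [
    Comment(12.0, "Purple goalie turns the ball over to Pink8"),
    Comment(19.0, "Pink8 passes the ball to Pink11"),
]
corpus = build_ambiguous_corpus(comments, events)
for text, mrs in corpus:
    print(text, "->", mrs)
```

Each sentence ends up with several candidate MRs, which is the ambiguity the learning systems in the following slides have to resolve.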

34 Overview Sportscasting task Related works Tactical generation Strategic generation Human evaluation 24

35 Semantic Parser Learners
Learn a function from NL to MR
NL: “Purple3 passes the ball to Purple5”
MR: Pass ( Purple3, Purple5 )
Semantic Parsing (NL → MR), Tactical Generation (MR → NL)
We experiment with two semantic parser learners
–WASP (Wong & Mooney, 2006; 2007)
–KRISP (Kate & Mooney, 2006)

36 WASP: Word Alignment-based Semantic Parsing
Uses statistical machine translation techniques
–Synchronous context-free grammars (SCFG) [Wu, 1997; Melamed, 2004; Chiang, 2005]
–Word alignments [Brown et al., 1993; Och & Ney, 2003]
Capable of both semantic parsing and tactical generation

37 KRISP: Kernel-based Robust Interpretation by Semantic Parsing Productions of MR language are treated like semantic concepts SVM classifier is trained for each production with string subsequence kernel These classifiers are used to compositionally build MRs of the sentences More resistant to noisy supervision but incapable of tactical generation 27

38 KRISPER: KRISP with EM-like Retraining
Extension of KRISP that learns from ambiguous supervision [Kate & Mooney, 2007]
Uses an iterative EM-like method to gradually converge on a correct meaning for each sentence.

39-45 KRISPER
Natural Language Commentary:
–Purple goalie turns the ball over to Pink8
–Pink11 looks around for a teammate
–Pink8 passes the ball to Pink11
–Purple team is very sloppy today
–Pink11 makes a long pass to Pink8
–Pink8 passes back to Pink11
Meaning Representation:
–badPass ( Purple1, Pink8 )
–turnover ( Purple1, Pink8 )
–pass ( Pink11, Pink8 )
–pass ( Pink8, Pink11 )
–ballstopped
–pass ( Pink8, Pink11 )
–kick ( Pink11 )
–kick ( Pink8 )
–kick ( Pink11 )
–kick ( Pink8 )
Steps:
1. Assume every possible meaning for a sentence is correct
2. Resulting NL-MR pairs are weighted and given to semantic parser learner (edge weights shown on the slide: 1/2, 1/3, 1/2)
3. Estimate the confidence of each NL-MR pair using the resulting trained semantic parser (edge confidences shown on the slide: 0.65 0.87 0.22 0.35 0.13 0.85 0.81 0.37 0.76 0.49 0.76 0.67 0.86)
4. Use maximum weighted matching on a bipartite graph to find the best NL-MR pairs [Munkres, 1957]
5. Give the best pairs to the semantic parser learner in the next iteration, and repeat until convergence
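The five KRISPER steps can be sketched as a small loop. This is a toy under stated assumptions, not KRISP itself: `train` and `confidence` are hypothetical stand-ins for the semantic parser learner (a weighted word/MR co-occurrence table instead of kernel-based classifiers), the 1/n initial weighting is an assumption, and the matching is found by exhaustive search rather than the Munkres algorithm cited on the slide. The example corpus is invented.

```python
from collections import defaultdict
from itertools import product


def train(weighted_pairs):
    """Toy stand-in for a parser learner: weighted word/MR co-occurrence counts."""
    counts = defaultdict(lambda: defaultdict(float))
    for sentence, mr, weight in weighted_pairs:
        for word in sentence.lower().split():
            counts[word][mr] += weight
    return counts


def confidence(model, sentence, mr):
    """Mean, over the words, of the share of each word's mass that falls on `mr`."""
    words = sentence.lower().split()
    if not words:
        return 0.0
    total = 0.0
    for word in words:
        row = model.get(word, {})
        mass = sum(row.values())
        if mass > 0:
            total += row.get(mr, 0.0) / mass
    return total / len(words)


def best_matching(model, corpus):
    """Step 4 by brute force: at most one MR per sentence, each MR used at most once."""
    options = [cands + [None] for _, cands in corpus]
    best, best_score = None, -1.0
    for choice in product(*options):
        used = [mr for mr in choice if mr is not None]
        if len(used) != len(set(used)):
            continue  # an MR was assigned to two sentences
        score = sum(confidence(model, s, mr)
                    for (s, _), mr in zip(corpus, choice) if mr is not None)
        if score > best_score:
            best, best_score = choice, score
    return list(best)


def em_like_retraining(corpus, max_iters=10):
    # Steps 1-2: treat every candidate as correct, weighted 1/n (assumed weighting).
    pairs = [(s, mr, 1.0 / len(c)) for s, c in corpus for mr in c]
    matching = None
    for _ in range(max_iters):
        model = train(pairs)                         # train on the current pairs
        new_matching = best_matching(model, corpus)  # steps 3-4
        if new_matching == matching:                 # converged
            break
        matching = new_matching
        # Step 5: only the best pairs are used in the next iteration.
        pairs = [(s, mr, 1.0)
                 for (s, _), mr in zip(corpus, matching) if mr is not None]
    return matching


# Invented example: MR strings are assumed unique within the corpus.
corpus = [
    ("Pink8 passes to Pink11", ["kick(Pink8)", "pass(Pink8, Pink11)"]),
    ("Pink11 passes to Pink8", ["pass(Pink11, Pink8)", "kick(Pink11)"]),
    ("Pink8 shoots", ["kick(Pink8)"]),
]
matching = em_like_retraining(corpus)
for (sentence, _), mr in zip(corpus, matching):
    print(sentence, "->", mr)
```

The exhaustive search is exponential in the number of sentences and is only usable on toy inputs; for realistic sizes a Hungarian-style solver such as `scipy.optimize.linear_sum_assignment` would be the usual replacement.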

46 Overview Sportscasting task Related works Tactical generation Strategic generation Human evaluation 34

47 Tactical Generation
Learn how to generate NL from MR
Example: Pass(Pink2, Pink3) → “Pink2 kicks the ball to Pink3”
Two steps
1. Disambiguate the training data
2. Learn a language generator

48 WASPER WASP with EM-like retraining to handle ambiguous training data. Same augmentation as added to KRISP to create KRISPER. 36

49 KRISPER-WASP
First train KRISPER to disambiguate the data
Then train WASP on the resulting unambiguously supervised data.

50 WASPER-GEN
Determines the best matching based on generation (MR → NL).
Score each potential NL/MR pair by using the currently trained WASP⁻¹ generator.
Compute NIST MT score [NIST report, 2002] between the generated sentence and the potential matching sentence.

51-55 NIST scores
Target: Purple2 quickly passes to Purple3
Candidate: Purple2 passes to Purple3
Candidate n-grams, with the fraction that also occur in the target:
1-grams: Purple2, passes, to, Purple3 (4/4)
2-grams: Purple2 passes, passes to, to Purple3 (2/3)
3-grams: Purple2 passes to, passes to Purple3 (1/2)
4-gram: Purple2 passes to Purple3 (0/1)
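The n-gram counts on these slides can be reproduced with a few lines of code. This sketch covers only the matching step shown on the slide; the full NIST metric additionally weights n-grams by how informative they are and applies a brevity penalty, and neither is implemented here. The function names are made up for the example, and repeated n-grams are not clipped.

```python
def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_matches(target, candidate, max_n=4):
    """For n = 1..max_n: (candidate n-grams found in target, candidate n-grams)."""
    t, c = target.split(), candidate.split()
    result = {}
    for n in range(1, max_n + 1):
        cand = ngrams(c, n)
        if not cand:
            break  # candidate is shorter than n tokens
        ref = set(ngrams(t, n))
        result[n] = (sum(g in ref for g in cand), len(cand))
    return result


matches = ngram_matches("Purple2 quickly passes to Purple3",
                        "Purple2 passes to Purple3")
for n, (hit, total) in matches.items():
    print(f"{n}-grams: {hit}/{total}")
```

Run on the slide's target and candidate, this gives 4/4, 2/3, 1/2 and 0/1 for n = 1 to 4.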

56-62 System Overview (animated diagram)
Robocup Simulator and Sportscaster → Ambiguous Training Data → Semantic Parser Learner → Initial Semantic Parser → Unambiguous Training Data → Semantic Parser Learner → Semantic Parser → Unambiguous Training Data → …
Commentary in the ambiguous training data:
–Purple7 loses the ball to Pink2
–Pink2 kicks the ball to Pink5
–Pink5 makes a long pass to Pink8
–Pink8 shoots the ball
Candidate events in the ambiguous training data: Turnover ( purple7, pink2 ), Pass ( pink5, pink8 ), Pass ( purple5, purple7 ), Kick ( pink2 ), Pass ( pink2, pink5 ), Kick ( pink5 ), Ballstopped, Kick ( pink8 )
Events selected by the initial semantic parser: Kick ( pink8 ), Pass ( pink2, pink5 ), Kick ( pink2 ), Kick ( pink5 )
Events selected in the final build: Kick ( pink8 ), Pass ( pink2, pink5 ), Turnover ( purple7, pink2 ), Pass ( pink5, pink8 )

63 KRISPER and WASPER
Same training loop as the System Overview diagram, with KRISP or WASP as the semantic parser learner.

64 WASPER-GEN
Same training loop as the System Overview diagram, with a tactical generator learner (WASP) and a tactical generator in place of the semantic parser learner and the semantic parser.

65-67 Systems
System | Disambiguation | Learning language generator
WASP (lower baseline) | Random | WASP
KRISPER (Kate & Mooney, 2007) | KRISP | N/A
WASPER | WASP | WASP
KRISPER-WASP | KRISP | WASP
WASPER-GEN | WASP’s language generator | WASP
WASP with gold matching (upper baseline) | N/A | WASP

68 Matching 4 Robocup championship games from 2001-2004. –Avg # events/game = 2,613 –Avg # sentences/game = 509 Leave-one-game-out cross-validation Metric: –Precision: % of system’s annotations that are correct –Recall: % of gold-standard annotations produced –F-measure: Harmonic mean of precision and recall 45
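The three matching metrics can be computed as below. This is a minimal sketch that assumes the system output and the gold standard are both given as sets of (sentence, MR) pairs; the example pairs are invented.

```python
def precision_recall_f(system_pairs, gold_pairs):
    """Precision, recall and F-measure of system annotations against the gold standard."""
    system, gold = set(system_pairs), set(gold_pairs)
    correct = len(system & gold)
    precision = correct / len(system) if system else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure: harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Invented example: 3 system annotations, 2 of them correct, 4 gold annotations.
system = [("s1", "pass(a, b)"), ("s2", "kick(b)"), ("s3", "turnover(b, c)")]
gold = [("s1", "pass(a, b)"), ("s2", "pass(b, c)"),
        ("s3", "turnover(b, c)"), ("s4", "kick(c)")]
p, r, f = precision_recall_f(system, gold)
print(f"precision={p:.3f} recall={r:.3f} F={f:.3f}")
```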

69 Matching Results 46

72 Tactical Generation
4 Robocup championship games from 2001-2004.
–Avg # events/game = 2,613
–Avg # sentences/game = 509
Leave-one-game-out cross-validation
NIST score [NIST report, 2002]
–Evaluate the quality of machine translations based on matching n-grams

73 Tactical Generation Results 49

74 Overview Sportscasting task Related works Tactical generation Strategic generation Human evaluation 50

75 Strategic Generation Generation requires not only knowing how to say something (tactical generation) but also what to say (strategic generation). For automated sportscasting, one must be able to effectively choose which events to describe. 51

76 Example of Strategic Generation pass ( purple7, purple6 ) ballstopped kick ( purple6 ) pass ( purple6, purple2 ) ballstopped kick ( purple2 ) pass ( purple2, purple3 ) kick ( purple3 ) badPass ( purple3, pink9 ) turnover ( purple3, pink9 ) 52

78 Strategic Generation For each event type (e.g. pass, kick) estimate the probability that it is described by the sportscaster. Requires correct NL/MR matching –Use estimated matching from tactical generation –Iterative Generation Strategy Learning 53
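The first option on this slide, estimating per-event-type probabilities from a matching, can be sketched as a relative-frequency count. This is not IGSL, whose details the slides do not give. The event list is taken from the strategic generation example slide; the set of commented events is a hypothetical matching chosen for illustration.

```python
from collections import Counter


def event_type(mr):
    """Predicate name of an MR string, e.g. 'pass' for 'pass(purple7, purple6)'."""
    return mr.split("(")[0].strip()


def comment_probabilities(all_events, commented_events):
    """Relative frequency with which each event type was commented on."""
    total = Counter(event_type(e) for e in all_events)
    commented = Counter(event_type(e) for e in commented_events)
    return {t: commented[t] / n for t, n in total.items()}


all_events = [
    "pass(purple7, purple6)", "ballstopped", "kick(purple6)",
    "pass(purple6, purple2)", "ballstopped", "kick(purple2)",
    "pass(purple2, purple3)", "kick(purple3)",
    "badPass(purple3, pink9)", "turnover(purple3, pink9)",
]
# Hypothetical matching: the events assumed to have been described by a human.
commented_events = [
    "pass(purple6, purple2)", "pass(purple2, purple3)",
    "turnover(purple3, pink9)",
]
probs = comment_probabilities(all_events, commented_events)
for etype, p in probs.items():
    print(f"{etype}: {p:.2f}")
```

A generator could then choose to describe an event when the probability for its type is high enough; the selection rule and any threshold are left open here.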

79 Iterative Generation Strategy Learning (IGSL) Directly estimates the likelihood of an event being commented on Self-training iterations to improve estimates Uses events not associated with any NL as negative evidence 54

80 Strategic Generation Performance Evaluate how well the system can predict which events a human comments on Metric: –Precision: % of system’s annotations that are correct –Recall: % of gold-standard annotations correctly produced –F-measure: Harmonic mean of precision and recall 55

81 Strategic Generation Results 56

82 Overview Sportscasting task Related works Tactical generation Strategic generation Human evaluation 57

83 Human Evaluation (Quasi Turing Test)
4 fluent English speakers as judges
8 commented game clips
–2-minute clips randomly selected from each of the 4 games
–Each clip commented once by a human, and once by the machine
Presented in random counter-balanced order
Judges were not told which ones were human or machine generated

84 Demo Clip
Game clip commentated using WASPER-GEN with IGSL, since this gave the best results for generation.
FreeTTS was used to synthesize speech from textual output.

85 Human Evaluation
Score | English Fluency | Semantic Correctness | Sportscasting Ability
5 | Flawless | Always | Excellent
4 | Good | Usually | Good
3 | Non-native | Sometimes | Average
2 | Disfluent | Rarely | Bad
1 | Gibberish | Never | Terrible

86 Human Evaluation
Commentator | English Fluency | Semantic Correctness | Sportscasting Ability
Human | 3.94 | 4.25 | 3.63
Machine | 3.44 | 3.56 | 2.94
Difference | 0.5 | 0.69 | 0.69

87 Future Work
Expand MRs beyond simple logic formulas
Apply approach to learning situated language in a computer video-game environment (Gorniak & Roy, 2005)
Apply approach to captioned images or video using computer vision to extract objects, relations, and events from real perceptual data (Fleischman & Roy, 2007)

88 Conclusion Current language learning work uses expensive, unrealistic training data. We have developed a language learning system that can learn from language paired with an ambiguous perceptual environment. We have evaluated it on the task of learning to sportscast simulated Robocup games. The system learns to sportscast almost as well as humans. 62

89 Backup Slides

90 Robocup Sportscaster Trace
Natural Language Commentary: purple6 passes to purple2; purple3 loses the ball to pink9; purple2 makes a short pass to purple3
Meaning Representation: ballstopped, kick ( purple6 ), pass ( purple6, purple2 ), turnover ( purple3, pink9 ), kick ( purple2 ), pass ( purple2, purple3 ), kick ( purple3 )

91 Robocup Sportscaster Trace
Natural Language Commentary: purple6 passes to purple2; purple3 loses the ball to pink9; purple2 makes a short pass to purple3
Meaning Representation: pass ( purple6, purple2 ), pass ( purple2, purple3 )

92 Robocup Sportscaster Trace
Natural Language Commentary: purple6 passes to purple2; purple3 loses the ball to pink9; purple2 makes a short pass to purple3
Meaning Representation: kick ( purple6 ), kick ( purple2 ), kick ( purple3 )

93 Robocup Sportscaster Trace
Natural Language Commentary: purple6 passes to purple2; purple3 loses the ball to pink9; purple2 makes a short pass to purple3
Meaning Representation: kick ( purple3 ), ballstopped, kick ( purple6 ), pass ( purple6, purple2 ), kick ( purple2 ), turnover ( purple3, pink9 ), kick ( purple2 ), pass ( purple2, purple3 ), kick ( purple3 ), kick (purple 3

94 Robocup Sportscaster Trace
Natural Language Commentary: purple6 passes to purple2; purple3 loses the ball to pink9; purple2 makes a short pass to purple3
Meaning Representation: kick ( purple3 ), kick ( purple6 ), kick ( purple2 ), kick ( purple3 ), kick (purple 3

95 Robocup Sportscaster Trace
Natural Language Commentary: purple6 passes to purple2; purple3 loses the ball to pink9; purple2 makes a short pass to purple3
Meaning Representation (Negative Evidence): kick ( purple3 ), kick ( purple6 ), kick ( purple2 ), kick ( purple3 )

