CS 182 Sections 103–104. Slide credit to Eva Mok and Joe Makin; updated by Leon Barrett, April 25, 2007.

Announcements
– a9 due Tuesday, May 1st, in class
– final exam Tuesday, May 8th, in class
– final paper due Friday, May 11th, 11:59pm
– final review sometime next week

Schedule
Last Week
– Constructions, ECG
This Week
– Models of language learning
– Embodied Construction Grammar learning
Next Week
– Open lecture
– Wrap-Up

Questions
1. Why is learning language difficult?
2. What are the “paucity of the stimulus” and the “opulence of the substrate”?
3. What is Gold's Theorem?
4. How does the analyzer use the constructions to parse a sentence?
5. How can we learn new ECG constructions?
6. What are ways to re-organize and consolidate the current grammar?
7. What metric is used to determine when to form a new construction?

Difficulty of learning language
What makes learning language difficult?
– How many sentences do children hear?
– How often are those sentences even correct?
– Even when they're correct, how often are they complete?
– How often are children corrected when they say something wrong?
– How many possible languages are there?

Larger context
War!
– Is language innate?
– Covered in the book

Questions
1. Why is learning language difficult?
2. What are the “paucity of the stimulus” and the “opulence of the substrate”?
3. What is Gold's Theorem?
4. How does the analyzer use the constructions to parse a sentence?
5. How can we learn new ECG constructions?
6. What are ways to re-organize and consolidate the current grammar?
7. What metric is used to determine when to form a new construction?

Paucity and Opulence
Poverty of the stimulus
– Coined to suggest how little information children have to learn from
Opulence of the substrate
– Opulence = “richness”
– Coined in response, to suggest how much background information children have

Questions
1. Why is learning language difficult?
2. What are the “paucity of the stimulus” and the “opulence of the substrate”?
3. What is Gold's Theorem?
4. How does the analyzer use the constructions to parse a sentence?
5. How can we learn new ECG constructions?
6. What are ways to re-organize and consolidate the current grammar?
7. What metric is used to determine when to form a new construction?

Gold's Theorem
Suppose that you have an infinite number of languages
– language = “set of legal sentences”
Suppose that for every language Ln there is a bigger language Ln+1
– Ln+1 generates every sentence of Ln, and then some
There is some language L∞
– it contains all the sentences of all the other languages
You can arrange the data so that no learner ever identifies L∞
– on.GoldsTheorem.pdf
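A toy illustration of the argument (not from the slides): take the family Ln = {"a"·k : k ≤ n} together with L∞ = all strings of a's. Given any learner, modeled as a function from the strings seen so far to a guess, an adversary can either keep the learner changing its mind forever on a presentation of L∞, or let it settle on L∞ while only ever showing data from some finite Ln. The sketch below is a hypothetical demonstration of that diagonalization; adversary_stream and the "cautious" learner are made-up names.

```python
# Sketch of the adversary behind Gold's argument, for the toy family
# L_n = {"a"*k : 1 <= k <= n} plus L_inf = {"a"*k : k >= 1}.

def adversary_stream(learner, steps=20):
    """Yield (next_string, guess_before_seeing_it) pairs, all strings drawn
    from L_inf, constructed so that the given learner never safely converges."""
    seen = []
    n = 1
    for _ in range(steps):
        guess = learner(seen)
        if guess == "L_inf":
            # Learner has jumped to the infinite language: keep repeating data
            # from the finite L_n, on which that guess would be the wrong answer.
            nxt = "a" * n
        else:
            # Learner guesses some finite L_n: show a longer string,
            # forcing yet another change of mind.
            n += 1
            nxt = "a" * n
        seen.append(nxt)
        yield nxt, guess

def cautious(seen):
    """A natural learner: guess the smallest finite L_n consistent with the data."""
    return "L_%d" % max((len(s) for s in seen), default=1)

for s, g in adversary_stream(cautious):
    print("guess before seeing %r: %s" % (s, g))
```

Running the sketch shows the cautious learner revising its guess at every step, which is exactly the non-convergence the theorem exploits.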

Questions
1. Why is learning language difficult?
2. What are the “paucity of the stimulus” and the “opulence of the substrate”?
3. What is Gold's Theorem?
4. How does the analyzer use the constructions to parse a sentence?
5. How can we learn new ECG constructions?
6. What are ways to re-organize and consolidate the current grammar?
7. What metric is used to determine when to form a new construction?

Analyzing “You Throw The Ball”
[Diagram: lexical constructions pair FORM (sound) with MEANING (stuff): “you” → schema Addressee (subcase of Human); “throw” → schema Throw (roles: thrower, throwee); “ball” → schema Ball (subcase of Object); “block” → schema Block (subcase of Object); plus “the”. The Thrower-Throw-Object construction adds form constraints t1 before t2 and t2 before t3, and meaning constraints t2.thrower ↔ t1 and t2.throwee ↔ t3, binding the Addressee as thrower and the Ball as throwee of Throw.]

Another way to think of the SemSpec

Analyzing in ECG
create a recognizer for each construction in the grammar
for each level j (in ascending order)
    repeat
        for each recognizer r in level j
            for each position p of the utterance
                initiate r starting at p
    until we don't find anything new
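As a rough illustration (assumed data shapes, not the course analyzer), the levelwise loop above might look like this in Python; try_match is a hypothetical helper that attempts to match one recognizer at one position, given the matches already found at lower levels.

```python
# Minimal sketch of the levelwise recognition loop described in the pseudocode.

def analyze(grammar_by_level, utterance, try_match):
    """grammar_by_level maps level -> list of recognizers (one per construction);
    try_match(recognizer, utterance, pos, found) is assumed to return a new
    match (or None) given the matches found so far."""
    found = set()
    for level in sorted(grammar_by_level):       # level 0 = lexical constructions
        changed = True
        while changed:                           # repeat until nothing new is found
            changed = False
            for recognizer in grammar_by_level[level]:
                for pos in range(len(utterance)):
                    match = try_match(recognizer, utterance, pos, found)
                    if match is not None and match not in found:
                        found.add(match)
                        changed = True
    return found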

Recognizer for the Transitive-Cn
– an example of a level-1 construction is Red-Ball-Cn
– each recognizer looks for its constituents in order (the ordering constraints on the constituents can be a partial ordering)
[Diagram: the Transitive-Cn recognizer looks for its agt, v, and obj constituents over the lexical (level-0) constructions of the utterance “I get cup”, with levels 0 through 2 marked.]

Learning-Analysis Cycle (Chang, 2004)
1. Learner passes input (Utterance + Situation) and current grammar to Analyzer.
2. Analyzer produces SemSpec and Constructional Analysis.
3. Learner updates grammar:
   a. Hypothesize new map.
   b. Reorganize grammar (merge or compose).
   c. Reinforce (based on usage).
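A skeletal rendering of this cycle, with hypothetical function arguments standing in for the model's components (this is a sketch, not Chang's implementation):

```python
# Sketch of the learning-analysis cycle: analyze each input with the current
# grammar, hypothesize new mappings, then reorganize and reinforce.

def learning_cycle(grammar, corpus, analyze, hypothesize, reorganize, reinforce):
    for utterance, situation in corpus:
        # Steps 1-2: analyze the (utterance, situation) pair with the current grammar.
        semspec, analysis = analyze(grammar, utterance, situation)
        # Step 3a: hypothesize new form-meaning maps from what the analysis leaves out.
        grammar = hypothesize(grammar, semspec, analysis, situation)
        # Step 3b: reorganize the grammar (merge or compose constructions).
        grammar = reorganize(grammar)
        # Step 3c: reinforce constructions based on usage.
        grammar = reinforce(grammar, analysis)
    return grammar
```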

Questions 1.Why is learning language difficult? 2.What are the “paucity of the stimulus” and the “opulence of the substrate”? 3.What is Gold's Theorem? 4.How does the analyzer use the constructions to parse a sentence? 5.How can we learn new ECG constructions? 6.What are ways to re-organize and consolidate the current grammar? 7.What metric is used to determine when to form a new construction?

Usage-based Language Learning
[Diagram: Comprehension analyzes an (Utterance, Situation) pair with the current Constructions, producing a (possibly partial) Analysis; Acquisition hypothesizes and reorganizes constructions from that analysis; Production generates an Utterance from (Comm. Intent, Situation).]

Basic Learning Idea
– The learner’s current grammar produces a certain analysis for an input sentence
– The context contains richer information (e.g. bindings) that is unaccounted for in the analysis
– Find a way to account for these meaning relations (by looking for corresponding form relations)
[Diagram: the SemSpec covers only some of the role bindings (r1, r2, r3) that are present in the Context.]
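In the simplest terms, the unaccounted-for part is a set difference between context bindings and SemSpec bindings. A minimal sketch with illustrative bindings (the particular role-filler pairs below are chosen for this example, not taken from the slides):

```python
# Sketch: role-filler bindings present in the context but missing from the
# SemSpec are the meaning relations the learner then tries to pair with a
# corresponding form relation.

def unexplained_bindings(context_bindings, semspec_bindings):
    """Both arguments are sets of (role, filler) pairs."""
    return context_bindings - semspec_bindings

context = {("throw.thrower", "addressee"), ("throw.throwee", "ball")}
semspec = {("throw.thrower", "addressee")}
print(unexplained_bindings(context, semspec))   # {('throw.throwee', 'ball')}
```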

Initial Single-Word Stage
[Diagram: lexical constructions pairing FORM (sound) with MEANING (stuff): “you” → schema Addressee (subcase of Human); “throw” → schema Throw (roles: thrower, throwee); “ball” → schema Ball (subcase of Object); “block” → schema Block (subcase of Object).]

New Data: “You Throw The Ball”
[Diagram: the single-word lexical constructions analyze the utterance piece by piece, but the SITUATION additionally binds the Addressee as thrower and the Ball as throwee of Throw (with Self as the speaker). The form relation “throw” before “ball” lines up with the role-filler relation Throw.throwee ↔ Ball, suggesting a new throw-ball mapping.]

Relational Mapping Scenarios
[Diagram: three ways a form relation between constituents A and B can line up with a role-filler relation between their meanings Am and Bm:]
– “throw ball”: throw.throwee ↔ ball
– “put ball down”: put.mover ↔ ball and down.tr ↔ ball (the mapping also involves an element X)
– “Nomi ball”: possession.possessor ↔ Nomi and possession.possessed ↔ ball (the mapping goes through a schema Y, here possession)

Questions
1. Why is learning language difficult?
2. What are the “paucity of the stimulus” and the “opulence of the substrate”?
3. What is Gold's Theorem?
4. How does the analyzer use the constructions to parse a sentence?
5. How can we learn new ECG constructions?
6. What are ways to re-organize and consolidate the current grammar?
7. What metric is used to determine when to form a new construction?

Merging Similar Constructions
From instances such as:
– “throw the block”: throw before block, Throw.throwee = Block
– “throw the ball”: throw before ball, Throw.throwee = Ball
– “throw-ing the ball”: throw before -ing, Throw.aspect = ongoing
merge into the general THROW-OBJECT construction:
– throw before Object_f, THROW.throwee = Object_m

Resulting Construction
construction THROW-OBJECT
  constructional
    constituents
      t : THROW
      o : OBJECT
  form
    t_f before o_f
  meaning
    t_m.throwee ↔ o_m

Composing Co-occurring Constructions
From co-occurring instances:
– “throw the ball”: throw before ball, Throw.throwee = Ball
– “ball off”: ball before off, Motion m with m.mover = Ball and m.path = Off
compose the THROW-BALL-OFF construction:
– throw before ball, ball before off; THROW.throwee = Ball; Motion m with m.mover = Ball and m.path = Off

Resulting Construction
construction THROW-BALL-OFF
  constructional
    constituents
      t : THROW
      b : BALL
      o : OFF
  form
    t_f before b_f
    b_f before o_f
  meaning
    evokes MOTION as m
    t_m.throwee ↔ b_m
    m.mover ↔ b_m
    m.path ↔ o_m

Questions
1. Why is learning language difficult?
2. What are the “paucity of the stimulus” and the “opulence of the substrate”?
3. What is Gold's Theorem?
4. How does the analyzer use the constructions to parse a sentence?
5. How can we learn new ECG constructions?
6. What are ways to re-organize and consolidate the current grammar?
7. What metric is used to determine when to form a new construction?

Minimum Description Length
Choose grammar G to minimize cost(G|D):
– cost(G|D) = α · size(G) + β · complexity(D|G)
– approximates Bayesian learning: lower cost(G|D) corresponds to higher posterior probability P(G|D)
Size of grammar, size(G), plays the role of the prior P(G) (smaller grammar ↔ higher prior)
– favor fewer/smaller constructions/roles; isomorphic mappings
Complexity of the data given the grammar plays the role of the likelihood P(D|G) (lower complexity ↔ higher likelihood)
– favor simpler analyses (fewer, more likely constructions)
– based on derivation length + score of derivation
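A minimal sketch of the selection rule; size and complexity are assumed to be supplied as callables, and this is an illustration of the cost formula above rather than the course implementation:

```python
# Sketch of MDL-style grammar selection: prefer the grammar with the lowest
# combined cost(G|D) = alpha * size(G) + beta * complexity(D|G).

def mdl_cost(grammar, data, size, complexity, alpha=1.0, beta=1.0):
    return alpha * size(grammar) + beta * complexity(data, grammar)

def choose_grammar(candidate_grammars, data, size, complexity):
    # Pick the candidate grammar that minimizes the combined description length.
    return min(candidate_grammars,
               key=lambda g: mdl_cost(g, data, size, complexity))
```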

Size Of Grammar
Size of the grammar G is the sum of the sizes of its constructions:
  size(G) = Σ_{c ∈ G} size(c)
Size of each construction c is:
  size(c) = n_c + m_c + Σ_{e ∈ c} length(e)
where
– n_c = number of constituents in c
– m_c = number of constraints in c
– length(e) = slot chain length of element reference e

Example: The Throw-Ball Cxn
construction THROW-BALL
  constructional
    constituents
      t : THROW
      b : BALL
  form
    t_f before b_f
  meaning
    t_m.throwee ↔ b_m

size(THROW-BALL) = 2 + 2 + (2 + 3) = 9
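Under the size formula as reconstructed above, the arithmetic can be spelled out as below; the individual slot-chain lengths (1 each for t_f, b_f, and b_m, and 2 for t_m.throwee) are an assumption chosen to match the stated total of 9.

```python
# Sketch of the size computation for THROW-BALL, assuming
# size(c) = n_c + m_c + sum of slot-chain lengths of element references.

def construction_size(n_constituents, n_constraints, slot_chain_lengths):
    return n_constituents + n_constraints + sum(slot_chain_lengths)

# THROW-BALL: constituents t, b; constraints "t_f before b_f" and
# "t_m.throwee <-> b_m"; assumed chain lengths: t_f=1, b_f=1, t_m.throwee=2, b_m=1.
print(construction_size(2, 2, [1, 1, 2, 1]))   # 9
```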

Complexity of Data Given Grammar
Complexity of the data D given grammar G is the sum of the analysis scores of the input tokens:
  complexity(D|G) = Σ_{d ∈ D} score(d)
The analysis score of each input token d combines, for each construction c used in the analysis of d:
– weight_c ≈ relative frequency of c
– |type_r| = number of ontology items of type r used
together with
– height_d = height of the derivation graph
– semfit_d = semantic fit provided by the analyzer

Final Remark
The goal here is to build a cognitively plausible model of language learning.
A very different game one could play is unsupervised / semi-supervised language learning using shallow or no semantics:
– statistical NLP
– automatic extraction of syntactic structure
– automatic labeling of frame elements
These approaches give fairly reasonable results for use in tasks such as information retrieval, but the semantic representation is very shallow.