Modeling Language Acquisition with Neural Networks: A preliminary research plan. Steve R. Howell.


1 Modeling Language Acquisition with Neural Networks A preliminary research plan Steve R. Howell

2 Presentation Overview Goals & challenges Previous & related research Model Overview Implementation details

3 Project Goals Model two aspects of human language (grammar and semantics) Use single neural network performing word prediction Use orthographic representation Use small but functional word corpus –e.g. child’s basic functional vocabulary?

4 Challenges Need a network architecture capable of modeling both grammar and semantics What if phonology is required? Computational limitations

5 Previous Research

6 Elman (1990) Mozer (1987) Seidenberg & McClelland (1989) Landauer et al. (LSA) Rao & Ballard (1997)

7 Elman (1990) Simple recurrent network (context units) No built-in representational constraints Predicts next input from current input plus context Discovers word boundaries in a continuous stream of phonemes (slide) Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14, 179-211. FOR MORE INFO...
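A minimal sketch of the Elman-style prediction step (numpy; the function and weight names are illustrative, not from the source):

```python
import numpy as np

def srn_step(x, context, W_in, W_ctx, W_out):
    """One step of a simple recurrent network with context units."""
    # Context units hold a copy of the previous hidden state.
    hidden = np.tanh(W_in @ x + W_ctx @ context)
    # The network predicts its next input from current input plus context.
    prediction = W_out @ hidden
    # The returned hidden state is copied back into the context units.
    return prediction, hidden
```

After training on prediction error, the error spikes at word onsets, which is what lets the network discover word boundaries in the phoneme stream.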

8 Mozer (1987) - BLIRNET Interesting model of spatially parallel processing of independent words Possible input representation - letter triples – e.g. MONEY = **M, *MO, MON, … EY*, Y** Encodes beginnings and ends of words well, as well as relative letter position, which is important to fit human relative-position priming data Mozer, M. C. (1987). Early parallel processing in reading: A connectionist approach. In M. Coltheart (Ed.), Attention and Performance XII: The psychology of reading. FOR MORE INFO...

9 Seidenberg & McClelland (1989) Model of word pronunciation Again, relevant for the input representations used in its orthographic input - triples: – MAKE = **MA, MAK, AKE, KE** (** = word space) Distributed coding scheme for triples = distributed internal lexicon Seidenberg, M. S., & McClelland, J. L. (1989). A distributed, developmental model of word recognition and naming. Psychological Review, 96, 523-568. FOR MORE INFO...
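A sketch of extracting boundary-marked letter triples in the spirit of the two preceding slides (one plausible padding scheme; the slides' exact boundary conventions may differ):

```python
def letter_triples(word, boundary="*"):
    """All contiguous 3-grams of a word padded with word-space markers."""
    padded = boundary * 2 + word.upper() + boundary * 2
    return [padded[i:i + 3] for i in range(len(padded) - 2)]

print(letter_triples("make"))
# ['**M', '*MA', 'MAK', 'AKE', 'KE*', 'E**']
```

A scheme like this encodes word beginnings and endings explicitly (the triples containing boundary markers), while the interior triples carry relative letter position.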

10 Landauer et al. “LSA” (Latent Semantic Analysis) - a statistical model of semantic learning Very large word corpus Significant computation required Good performance Data set apparently proprietary FOR MORE INFO... Don’t call them, they’ll call you.

11 Rao & Ballard (1997) Basis for the present network Algorithms based on the extended Kalman filter Internal state variable is the output of input * feedforward weights (Internal state) * (transpose of feedforward weights) feeds back to predict the next input Rao, R. P. N., & Ballard, D. H. (1997). Dynamic model of visual recognition predicts neural response properties in the visual cortex. Neural Computation, 9, 721-763. FOR MORE INFO...
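A toy illustration of the feedforward/feedback relation just described (numpy; this simplifies the extended Kalman filter to a plain gradient step on the prediction error, so it sketches the structure, not Rao & Ballard's actual update):

```python
import numpy as np

rng = np.random.default_rng(0)
n_input, n_state = 8, 4
W = rng.normal(scale=0.1, size=(n_state, n_input))  # feedforward weights

def predict_step(x, W, lr=0.01):
    r = W @ x                      # internal state = input * feedforward weights
    x_hat = W.T @ r                # state * transpose(W) feeds back as the prediction
    error = x - x_hat              # prediction error on the input
    W += lr * np.outer(r, error)   # toy error-driven update (stand-in for the Kalman gain)
    return x_hat, r
```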

12 Model Overview

13 Architecture as in Rao & Ballard Recurrent structure excellent for temporal variability Starting with a single-layer network Moving to a multi-layer Rao & Ballard net

14 Model Overview (cont’d) High-level input representations First layer of net performs word prediction from letters Second layer adds word prediction from previous words – Words predict next words - simple grammar?

15 Model Overview (cont’d) Additional higher levels should add a larger temporal range of context Words in a large temporal range around the current word help to predict it – Implies semantic linkage? – Analogous to LSA “Bag of words” approach at these levels
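A small sketch of the “bag of words” context such a level would see: an order-free count of the words within a temporal window around the current word (the window size and sentence are illustrative):

```python
from collections import Counter

def bag_of_words_context(tokens, position, window=5):
    """Order-free counts of words within +/- window of the current word."""
    lo = max(0, position - window)
    hi = min(len(tokens), position + window + 1)
    context = tokens[lo:position] + tokens[position + 1:hi]
    return Counter(context)

tokens = "the dog chased the cat because the cat ran".split()
print(bag_of_words_context(tokens, 4))  # context around the first 'cat'
# Counter({'the': 3, 'dog': 1, 'chased': 1, 'because': 1, 'cat': 1, 'ran': 1})
```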

16 Possible Advantages Lower-level units learn grammar Higher-level units learn semantics Combines grammar-learning methods with the “bag of words” approach Possibly modifiable for language generation

17 Disadvantages Complex mathematical implementation Unclear how well higher levels will actually perform As yet unclear how to modify the net for language generation

18 Implementation Details

19 Implementation Challenges Locating a basic functional vocabulary of the English language – 600-800 words? Compare to children’s language learning/usage, not adults’ – Locating child data?

20 Model Evaluation (basic) Test grammar learning as per Elman Test semantic regularities as for LSA
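The semantic check could look like the LSA-style comparison below: cosine similarity between learned word vectors, expecting higher scores for related words (the vectors here are made-up stand-ins for whatever the trained net would produce):

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two word vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical learned vectors; related words should score higher.
vecs = {"cat": np.array([0.9, 0.1, 0.3]),
        "dog": np.array([0.8, 0.2, 0.4]),
        "car": np.array([0.1, 0.9, 0.2])}
print(cosine(vecs["cat"], vecs["dog"]))  # ~0.98, semantically related
print(cosine(vecs["cat"], vecs["car"]))  # ~0.27, unrelated
```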

21 Model Evaluation (optimistic) If generative modifications are possible: Ability to output words/phrases semantically linked to the input? – ElizaNet? Child Turing Test? – Human judges compare model output to real children’s output for the same input?

22 Current Status Continuing to review previous research Working through implementation details of the Rao & Ballard algorithm Considering different types of high-level input representations Need to develop/acquire a basic English vocabulary/grammar data set

23 Thank you. Questions and comments are sincerely welcomed. Thoughts on any of the questions raised herein will be extremely valuable. FOR MORE INFO... Please see my web page at: http://www.the-wire.com/~showell/

