SeidenbergNCPW13 7/2012 PDP Models and American Health Care Reform Mark S. Seidenberg NCPW13 BCBL San Sebastian 2012.

1 PDP Models and American Health Care Reform. Mark S. Seidenberg, NCPW13, BCBL, San Sebastian, 2012.

2 Observation: Many core concepts of the PDP approach have been broadly assimilated into cognitive science/neuroscience. But the modeling, not so much (present distinguished company notwithstanding): skepticism about relevance/adequacy in areas such as language acquisition, arising from close analyses of specific PDP models; availability of alternative approaches. (7/14/2012, Seidenberg NCPW13 talk)

3 Observation II: People will endorse PDP concepts as long as you call them something else. Like the health care debate in the US.

4 Obamacare vs. PDP
Obamacare: maintain current insurance; promote access to affordable health care; no denial based on pre-existing conditions; individual mandate (the thing that pays for the good stuff).
PDP: distributed representations; interactive processing; computation of best fits; PDP models.

5 Why? Let us look. Three case studies, all alike:
1. Diagnosis of a fatal problem: models behave differently from people; broad implications, widely repeated; attention heads elsewhere.
2. Diagnosis turns out to be wrong: critiques don't support broader implications; less widely known (like getting an audience for a failure to replicate).
3. There is, however, a related problem of considerable interest: producing exciting work, but need more; have to overcome (1).

6 Case 1: Catastrophic interference
1. Diagnosis of problem: McCloskey & Cohen, 1989. Unlike people, simple feedforward nets exhibit unwanted retroactive interference. Based on close analyses of models of a simple arithmetic task.
2. Solution is interleaving: hippocampus in the complementary systems model (MMO, 1995); life (in which experience is massively interleaved).
The type of catastrophic interference that McCloskey-Cohen focused on does occur occasionally: certain verbal learning experiments; unusual circumstances like Korea → France émigrés (Pallier et al.).
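
The contrast between blocked and interleaved training can be sketched with a toy example. This is a minimal illustration under stated assumptions, not the McCloskey & Cohen arithmetic network or the complementary systems model: a single linear unit trained with the delta rule on two hypothetical associations (`task_a`, `task_b`) whose inputs overlap. All names and values are invented for illustration.

```python
def train(weights, patterns, epochs, lr=0.1):
    """Delta-rule training of one linear unit; returns the updated weights."""
    w = list(weights)
    for _ in range(epochs):
        for x, target in patterns:
            out = sum(wi * xi for wi, xi in zip(w, x))
            err = target - out
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
    return w


def abs_error(w, pattern):
    """Absolute output error of the unit on one (input, target) pattern."""
    x, target = pattern
    return abs(target - sum(wi * xi for wi, xi in zip(w, x)))


# Two hypothetical associations that share the middle input feature.
task_a = ((1.0, 1.0, 0.0), 1.0)
task_b = ((0.0, 1.0, 1.0), 0.0)

# Blocked training: learn A to criterion, then train only on B.
w_seq = train([0.0, 0.0, 0.0], [task_a], epochs=500)
err_a_before = abs_error(w_seq, task_a)      # A is learned at this point
w_seq = train(w_seq, [task_b], epochs=500)
err_a_sequential = abs_error(w_seq, task_a)  # A is degraded by learning B

# Interleaved training: A and B alternate within every epoch.
w_int = train([0.0, 0.0, 0.0], [task_a, task_b], epochs=500)
err_a_interleaved = abs_error(w_int, task_a)

print(err_a_before, err_a_sequential, err_a_interleaved)
```

In this toy the interference is partial rather than catastrophic, since a single linear unit is far simpler than the networks discussed in the talk; the point is only that the same two items are retained under interleaving and disrupted under blocked training.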

7 3. The interesting related problem: massive entrenchment! Reduction in plasticity associated with expertise. Example: critical periods in language learning. Paradox of success (Seidenberg & Zevin, 2006): expertise with L1 makes it difficult to absorb L2. Some models in this area (Ping Li, others). Haven't gone that far. A recent example:

8 Impact of Dialect Variation in the US on Learning to Read. Achievement gap in reading: 1. African Americans (and other minorities) perform less well on tests of reading and other subjects compared to whites; 2. the gap has been persistent for many years; 3. poor reading skills are a problem for individuals and society.

9 Why? Not just poverty, school/teacher quality. Possibly related to language experience? Major US dialects: Standard American English (SAE), African American English (AAE). These dialects overlap more than two languages do, but they also differ a lot: phonology, morphology, syntax, discourse.

10 Dialect mismatch effects. Home dialect AAE vs. school dialect SAE. When schooling starts, the child has to learn more of the second dialect, and learn in a less familiar dialect, in a noisy environment, using books written in SAE. Dialect differences make learning a more difficult task than for a child who uses the same dialect at home and in school. But all are judged against the same achievement milestones. Gap ensues. Other factors like SES may exacerbate it further.

11 We wanted to examine impact on reading. Obvious area: how differences in pronunciation affect acquiring basic decoding skills.

12 Pronunciation differences. Many words pronounced the same (at phonemic level). Many words pronounced differently. Percentage varies with dialect density: 30% of words and higher. GOLD, FLOOR, and LOW rhyme in AAE.

13 Teacher: "G-O-L-D, that's gold." [child searches spoken language vocabulary for "gold"] Child: "Ohhh, gole."

14 Thus: Spelling-sound correspondences are more complex for AAE speakers. We have models for that…

15 Contrastive words (different pronunciations in SAE, AAE): bound, old, toast. Non-contrastive words (same pronunciation in both dialects): brush, air, stage. Latencies do not differ in the ELP database.

16 Naming latencies as a function of AAE density. Children (N = 22, M age = 11.4 years). Adults (N = 32, M age = 35.5 years).

17 Modeling. Once you see the set-up, effects are obvious. orth → phon model. Learns phonology first, then learns to map spellings onto phonology. SAE: learn to map spellings onto known SAE pronunciations. AAE: learn to pronounce words in SAE while continuing to use AAE phonology in speech.

18 Model (based on Harm & Seidenberg, 1999). Training corpus: 1700 words from 2nd grade norms. SAE version; AAE version: about half the pronunciations are different.

19 SAE match: SAE-SAE. AAE match: AAE-AAE. Mismatch: AAE-SAE.
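
The three training conditions can be written down as (spoken dialect, reading-target dialect) pairs. The sketch below is only an illustration of the set-up, not the Harm & Seidenberg-based model: it uses the six example words from the contrastive/non-contrastive slide, and the names `CONTRASTIVE`, `CONDITIONS`, and `mismatched_words` are hypothetical. It counts, per condition, how many words have a reading target that differs from the pronunciation the learner already knows from speech.

```python
# Example words from the talk; True means the SAE and AAE pronunciations differ.
CONTRASTIVE = {
    "bound": True, "old": True, "toast": True,
    "brush": False, "air": False, "stage": False,
}

# Each condition pairs the dialect learned in speech with the dialect used as the reading target.
CONDITIONS = {
    "SAE match": ("SAE", "SAE"),
    "AAE match": ("AAE", "AAE"),
    "Mismatch": ("AAE", "SAE"),
}


def mismatched_words(spoken, target, lexicon=CONTRASTIVE):
    """Words whose reading target differs from the already-known spoken form."""
    if spoken == target:
        return []
    return sorted(word for word, differs in lexicon.items() if differs)


counts = {name: len(mismatched_words(*pair)) for name, pair in CONDITIONS.items()}
print(counts)
```

Only the mismatch condition yields items where spelling must be mapped onto a pronunciation that conflicts with the known one, which is where the extra learning burden would be expected to arise.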

20 Training on both dialects

21 Summary. About the achievement gap: dialect mismatch slows learning. Playing field is not level. Models suggest ways to fix this. About models: entrenchment, proactive interference.

22 Case 2: Language acquisition
1. Diagnosis of the problem: language has properties that can't be captured by NNs. Rules (Pinker), algebraic rules (Marcus), procedural knowledge (Ullman). Demonstrations: Marcus et al. Lather, rinse, repeat.
2. Second opinions: plenty of people have taken issue with these claims. Rule-governed only under idealization of data; competence theory of performance: Seidenberg & Plaut (in press?); semantic-phonological theory of the past tense (not rules-exceptions); improved models (Altmann, others).

23 3. The interesting related problem: What is statistical learning? Language learners learn from statistics of the input. Process starts in infancy. Many studies examining what kinds of statistics are learned. Little of the research makes contact with PDP/connectionist models/concepts. Newport (2010) sees progress in "the movement in many parts of psycholinguistics from rules to connectionism to statistical learning" (p. 369). Statistical learning is not Obamacare!

24 Irony: Linguists' early criticism of connectionist/PDP models: languages exhibit lots of regularities, depending on how you count; models are too powerful, can learn any arbitrary association; can't explain why languages exhibit some regularities and not others, or why people can learn some things and not others. Current research on statistical learning in language acquisition: same issues! Lots of different statistics can be studied in artificial language studies. What are the general principles? Why are some regularities learnable and not others?

25 They've thrown the theory of how the child learns out with the connectionist bathwater. Need more models, not fewer. Recent example: Willits (2012) thesis, UW.

26 Here's what Jon did. Studies of non-adjacent dependencies, which are everywhere in NL: drink, drank, drunk; was cooking; The woman gave the book to the boy; The key(s) to the cabinets is/are on the table; S -> NP + (S) + VP. Challenging learning problem. Many recent behavioral studies of infants, toddlers using artificial grammar methods. Not much connection to earlier AGL research.

27 Representative studies: learning an AxB pattern (auditory presentation). Gomez, Maye, Newport & Aslin.
Pel Wadim Rud; Pel Kicey Rud; Pel Puser Rud
Vot Wadim Jic; Vot Kicey Jic; Vot Puser Jic
Vary number of As, Bs, Xs. Surprisingly hard to learn.
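
One way to see why the AxB pattern is hard is to compare adjacent and non-adjacent statistics in the six strings on this slide. The sketch below is an illustration only; the helper names (`generate_strings`, `is_grammatical`, `conditional_prob`) are hypothetical and not taken from any of the cited studies.

```python
from itertools import product

# Frames (first word -> required last word) and middle items from the slide.
FRAMES = {"Pel": "Rud", "Vot": "Jic"}
MIDDLES = ["Wadim", "Kicey", "Puser"]


def generate_strings(frames=FRAMES, middles=MIDDLES):
    """All A x B strings: every frame combined with every middle item."""
    return [f"{a} {x} {b}" for (a, b), x in product(frames.items(), middles)]


def is_grammatical(string, frames=FRAMES):
    """A string is grammatical if its last word is the one its first word requires."""
    words = string.split()
    return len(words) >= 2 and frames.get(words[0]) == words[-1]


def conditional_prob(strings, given_index, given_word, target_index, target_word):
    """P(word at target_index == target_word | word at given_index == given_word)."""
    matching = [s.split() for s in strings if s.split()[given_index] == given_word]
    hits = sum(1 for words in matching if words[target_index] == target_word)
    return hits / len(matching)


strings = generate_strings()
# The adjacent cue (middle word -> last word) is uninformative.
p_adjacent = conditional_prob(strings, 1, "Wadim", 2, "Rud")
# The non-adjacent cue (first word -> last word) is perfectly predictive.
p_nonadjacent = conditional_prob(strings, 0, "Pel", 2, "Rud")
print(p_adjacent, p_nonadjacent)
```

A learner that tracks only adjacent transitions has nothing to go on here; the regularity lives entirely in the relation between the first and last words.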

28 Willits (2012). Used SRNs to address 4 phenomena:
1. Learning distance-invariant nonadjacent dependencies: AxB with 0-3 intervening items.
2. Impact of a correlated semantic cue (A and B are both animals or both foods).
3. Impact of a consistent but semantically unrelated cue (A animal, B food).
4. Abstract rule-like knowledge (Marcus). Learn: ABA. Test: ABA (same pattern, new items) vs. ABB.
Key change: let the model learn during the test phase (like babies do). Then the model can learn the test pattern with new items, with savings.
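
The ABA vs. ABB contrast in point 4 is defined over positions, not over particular items, which is what makes generalization to new items the critical test. A minimal sketch of that distinction follows; the function `pattern_of` and the syllables are hypothetical placeholders, not the stimuli or the SRN from Willits (2012).

```python
def pattern_of(triple):
    """Classify a three-item sequence by its identity structure, ignoring the items themselves."""
    first, second, third = triple
    if first == third and first != second:
        return "ABA"
    if second == third and first != second:
        return "ABB"
    return "other"


# Hypothetical learning items (ABA) and test items built from entirely new syllables.
learn_items = [("ga", "ti", "ga"), ("li", "na", "li")]
test_same_pattern = ("wo", "fe", "wo")   # ABA with new items
test_other_pattern = ("wo", "fe", "fe")  # ABB with new items

learned_vocab = {syllable for item in learn_items for syllable in item}
test_vocab = set(test_same_pattern) | set(test_other_pattern)

print(pattern_of(test_same_pattern), pattern_of(test_other_pattern))
```

Because the test vocabulary does not overlap with the learning vocabulary, any preference between the two test items has to come from the abstract pattern rather than from memory for specific syllables.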

29 Conclusions
1. Overcoming purported limitations of SRNs, yes. Behavior is similar to humans, yes.
2. More important: analysis shows reasons why models work. Implications re: learnability of other abstract, rule-like properties of language; un-learnability of some types of problems, which should be unlearnable for people too.

30 Case 3: Linking Brain and Behavior. Problem: PDP models motivated by linkage to brain, neurally inspired, etc. But most models have not been very constrained by brain data (PDP, neuroimaging developed in parallel at about the same time).
1. Diagnosis: poor fit because the brain doesn't work that way, e.g., backprop, units vs. neurons, etc.
2. Second opinion: things are moving along fine. Recent models that are more closely tied to brain: Plaut, Lambon Ralph, Taiji Ueno, McClelland, others here.

31 3. Interesting related problem: more please! Integrate PDP models with brain data. Otherwise: differences in activation for words vs. nonwords = word-level representations. Grain of neuroimaging data is like grain of behavioral data. Models can indeed apply to both.

32 Recent example from our group: Jeff Binder (Medical College of Wisconsin), Will Graves (now at Rutgers), me, Tim Rogers (Wisconsin).

33 How many ways are there to be a skilled reader? Do skilled readers (e.g., of English) read the same way? Old question: Baron & Strawson (1976), "Chinese" vs. "Phoenician" readers. Visual: orth → sem. Phonological: orth → phon → sem.

34 Maybe different division of labor? Computing a code depends on input from various parts of the system. Efficiency arises from division of labor between sources. Affected by type of word, type of writing system. Plaut et al., 1996: computing phonology. Harm & Seidenberg, 2004: computing semantics. Individual differences could be related to reading skill, experience.

35 New work looked at impact of semantics on reading words aloud. In principle, words can be read without using semantics (as in the DRC and CDP+ models). However, in our model, orth → sem → phon is available and could facilitate performance for some words or readers. Semantic effects on naming: are there any? YES: Strain et al., 1995; Hino & Lupker, 1996; Lichacz et al., 1999; Strain & Herdman, 1999; Hino et al., 2001; Shibahara et al., 2003, and several others. NO: Monaghan & Ellis, 2001; Brown & Watson, 1987; de Groot, 1989; Baayen et al., 2006.

36 Perhaps there are individual differences… Study: examined use of semantics in reading aloud among skilled readers (college graduates, med students). Determine if individual differences are associated with neuroanatomical variation in relevant parts of the reading network.

37
1. Graves et al. (2010): 18 subjects read 465 words aloud in scanner.
2. Effect of semantics on naming indexed by impact of imageability. Also looked at frequency, consistency, bigrams, number of letters, other factors.
3. Graves et al. (2012): left hemisphere semantic and phonological ROIs based on results of 2010 study. Semantic: AG, ITG/ITS. Phonological: pSTG, pMTG.
4. DTI tractography to measure volumes of pathways.
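
As a rough sketch of what "indexed by impact of imageability" could mean computationally, a per-reader index might be the slope of naming latency on imageability. This is an assumption for illustration: the original analysis included other predictors and may have been computed differently, the function `ols_slope` is hypothetical, and the numbers below are invented, not data from Graves et al.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys on xs (single predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Invented values for one hypothetical reader: imageability ratings and naming latencies (ms).
imageability = [1.0, 2.0, 3.0, 4.0]
latency_ms = [640.0, 630.0, 620.0, 610.0]

# A negative slope means more imageable words are named faster for this reader.
semantic_effect = ols_slope(imageability, latency_ms)
print(semantic_effect)
```

Under this reading, each reader gets one such index, and the indices can then be related to anatomical measures across readers.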


39 Semantic effects on naming correlated with white matter volume in sem-phon pathways. Anatomy, not strategy.


41 Everything A-OK? Some reasons why models get a bad name:
1. We take credit for good behavior and discount the bad behavior (implementations limited, etc.). Like: model learns something that people learn but takes 10 million trials. Heads I win, tails you lose. Properties that hold over many models? Requires doing a lot of models. Like doing replication experiments. Takes lots of time, analysis. Could be hard to build a career around.

42 2. What about taking learning seriously? Problem wasn't that backprop wasn't neurally realistic. It isn't behaviorally realistic. What is learning really like? Conditions vary: explicit, externally provided teacher; external or self-generated error signals that are noisy, partial, inconsistent, wrong, general rather than specific, etc. Can be addressed (h/t O'Reilly). Maybe models would learn on the human order of magnitude.

43 So, there is progress, there are obstacles, there are future directions. Why is this important to recognize? In the famous words of the philosopher: "Those who fail to remember history are doomed to fail to remember repeating it." (Carlos Santana)

44 Thanks for listening! Thanks also to collaborators. Dialect research: Julie Washington (GSU), Daragh Sibley (Haskins). Acquisition research: Jon Willits (Indiana), Jenny Saffran (Wisconsin). Reading brain: Jeff Binder (MCW), Will Graves (MCW). And Jay for introducing me to PDP.
