Psych 85-419: Introduction to Parallel Distributed Processing Michael Harm, Professor Anthony Cate, TA
Course Objectives Solid background in the philosophical and computational underpinnings of modern connectionist (PDP) research Experience with the construction and analysis of PDP models Appreciation of the benefits (and limitations!) of PDP approaches to psychological research
By May, You All Should Be Able To: Recognize when a PDP model may be useful to your research Build a model of a phenomenon that interests you Understand the contributions of models you see in the literature … and/or critique them!
Course Will Be Geared Towards Two Communities Modelers who plan to use these techniques in their work Researchers who want to better understand these models and their implications, even if they don’t want to be modelers Straw poll: which group do you fall into?
Grading Four homeworks, each of which counts for 10% of your final grade One exam, worth 15% of your grade A project proposal, worth 5% A final project, worth 30% of your grade Class participation, worth 10% of your grade No final exam
Class Web Page www.cnbc.cmu.edu/~mharm/courses/pdp_spring2001/ Watch for updates
What is Expected of You Readings assigned for each class. Read them! Come prepared with thoughtful questions Participate in class discussions Complete assignments on time –Come to us if you need help! Don’t wait until the last minute!
Overview of Class What is PDP, anyway? (That’s next) Processing and Constraint Satisfaction Simple learning and distributed representations Learning internal representations Unsupervised learning Psychological phenomena –Language, vision, higher level cognition
So What is PDP, Anyway? Start by describing more traditional approaches Why would one want a different approach? PDP defined A case study History of the approach
Traditional Approach to Studying Cognition The mind is like a computer There are rules, facts and propositions There is a logic engine that operates over these rules and propositions –Generates new propositions, new facts, new rules The Name of the Game: Identify the rules and propositions for a given phenomenon
Who Uses This Method (Implicitly or Otherwise)? Traditional AI, e.g. unification –if (not (married X)) -> (bachelor X) –(not (married JOHN)) implies JOHN is a bachelor Traditional linguistics (Chomsky, etc.) Philosophy of Mind (Fodor, etc.) Psychologists (some, at least)
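To make the rule-and-proposition style concrete, here is a minimal sketch of a "logic engine" applying the bachelor rule from this slide to a set of facts. The tuple encoding of facts and the function name `apply_bachelor_rule` are illustrative assumptions, not part of any particular AI system.

```python
# Facts are (predicate, individual) pairs; this encoding is an assumption
# made only for illustration.
facts = {("not-married", "JOHN")}

def apply_bachelor_rule(known_facts):
    """Apply the rule (not (married X)) -> (bachelor X) to every matching fact."""
    derived = {("bachelor", who)
               for (predicate, who) in known_facts
               if predicate == "not-married"}
    # The engine returns the old facts plus whatever new facts the rule generated.
    return known_facts | derived

new_facts = apply_bachelor_rule(facts)
print(new_facts)
```

The rule is explicit and symbolic: the system either matches it or does not, with nothing graded in between.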
Why Would One Question This Approach? Descriptive versus explanatory –An equation for an ellipse describes planetary motion. –But planets do not compute the equation for an ellipse to decide where to go! –Has an air of Greek Mythology about it Creating theories to account for data, with no external validation
Why Would One Question This Approach (More) Doesn’t seem to be how the mind actually works –Robust to damage –Graded degradation in performance –Doesn’t seem to be a single “logic engine” shared across all domains
Why Would One Question This Approach (Yet More) No obvious link to neuroscience –Single cell recordings, systems neuroscience –Impairments that have different effects on cells Method is typically grounded in symbolic rules –What about phenomena that aren’t rule governed?
So, Fine. Now Will You Tell Us What PDP Is? The idea that cognition can arise through the interactions of simple processing units –Blind to the global task at hand –Output activity based on state and summed input –… kind of like neurons … and that this may be a good way to study cognition
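As a rough sketch of what one such processing unit might compute, the snippet below sums weighted inputs and squashes the result. The logistic squashing function, the `bias` parameter, and the name `unit_output` are assumptions chosen for illustration; the slide does not commit to a specific activation function, and this version ignores any internal state the unit carries over time.

```python
import math

def unit_output(inputs, weights, bias=0.0):
    """One simple unit: sum the weighted inputs, then squash into (0, 1)."""
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-net))  # logistic function (assumed)

# The unit sees only its own incoming activity, never the global task.
activity = unit_output([1.0, 0.0, 1.0], [0.5, -0.3, 0.8])
print(activity)
```

Nothing in the unit refers to reading, vision, or any other task; whatever task-level behavior emerges has to come from the pattern of connections among many such units.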
The Name of the Game Construct a model consisting of processing units and connections between them –Guided by theory, observation, hypothesis Explore the behavior of the model. Relate to behavioral data Use model to gain insights into causes of behavioral data
A Case Study: Frequency by Regularity in Reading Regular words are words whose spelling-to-sound correspondences are predictable from other words, like gave, save, wave, pave. Exception words are ones that violate the normal rules of pronunciation, like have, yacht, sergeant. Word frequency is how often a word is seen: compare "the" with "yacht".
Frequency by Regularity Exception words affected by frequency Regular words not (more or less)
Traditional Account (Coltheart and colleagues) One cognitive module is responsible for reading exception words. It is frequency sensitive Another module can only read regular items. It is rule governed, frequency insensitive.
An Alternative Account, Part I: The Existence Proof Seidenberg & McClelland ‘89 constructed a large-scale connectionist model of reading Mapped spelling patterns onto pronunciation Observed the same frequency by regularity interaction Therefore, the data do not necessitate separate systems for rules and exceptions
An Alternative Account, Part II: Analysis Plaut et al ‘96 analyzed a network that exhibited frequency by regularity interaction Accounted for effect through mathematical analysis of network This is a different kind of theorizing. –Rooted in computational principles –Discovered, rather than designed
History I: The Age of Discovery McCulloch & Pitts (1943) –Networks of simple logic gates can compute any finite logic proposition Hebb (1949) –Clear definition of a learning rule for neurons Selfridge (1958) –Intelligent behavior from interactions of many agents And many others...
History II: The Cold Years Minsky & Papert ‘69: Simple associators cannot compute problems that are not linearly separable –The XOR problem Many problems aren’t linearly separable Led to scarcity of funding for such research. Golden years of artificial intelligence.
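The linear separability limit can be seen directly with a single threshold unit trained by the perceptron rule. This is a minimal sketch: the function name `train_perceptron`, the learning rate, and the epoch cap are illustrative assumptions. AND is linearly separable, so training settles on a solution; XOR is not, so no setting of two weights and a bias can classify all four patterns correctly.

```python
def train_perceptron(samples, epochs=100, lr=1.0):
    """Train one threshold unit; return (weights, bias, converged)."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), target in samples:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            delta = target - out
            if delta != 0:
                # Perceptron rule: nudge weights toward the correct answer.
                errors += 1
                w[0] += lr * delta * x1
                w[1] += lr * delta * x2
                b += lr * delta
        if errors == 0:
            return w, b, True  # a full pass with no mistakes
    return w, b, False         # never found a separating line

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_converged = train_perceptron(AND)[2]
xor_converged = train_perceptron(XOR)[2]
print(and_converged, xor_converged)
```

Raising the epoch cap does not help with XOR: an error-free pass would require a straight line separating the two classes, and none exists.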
History III: Renaissance of the Mid ‘80s Discovery of training algorithms that are more powerful than simple associators –Could compute problems that are not linearly separable –Resurgence in interest in use of these models for theory construction
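To see why an extra layer removes the XOR limitation, the sketch below wires two hidden threshold units by hand. The weights and the names `step` and `xor_net` are assumptions picked for illustration; they are not learned here. Training algorithms such as backpropagation made it possible to find weights like these automatically, but this example shows only that such a solution exists.

```python
def step(net):
    """Threshold unit: on if the summed input is positive."""
    return 1 if net > 0 else 0

def xor_net(x1, x2):
    h_or = step(x1 + x2 - 0.5)        # on if at least one input is on
    h_and = step(x1 + x2 - 1.5)       # on only if both inputs are on
    return step(h_or - h_and - 0.5)   # on for "or, but not and"

outputs = [xor_net(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(outputs)
```

The hidden units re-represent the input so that the output unit faces a linearly separable problem, which is the sense in which the network forms an internal representation.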
History IV: The Counter Attack Pinker & Prince ‘88 launched attack on PDP account of inflectional morphology Fodor & Pylyshyn ‘88 attacked connectionist enterprise as a whole Besner et al., Coltheart et al. attacked findings of Seidenberg & McClelland ‘89 model McCloskey: Networks are not theories!
Where We Are Today [Figure: topics placed along an axis running between Low Level and High Level: Classical Conditioning, Priming; Reading; Morphology; Parsing Sentences; Reasoning, Creativity]
For Next Class Read PDP1, Chapter 2 Optional: Read PDP1, Chapter 1