Similarity and Attribution: Contrasting Approaches to Semantic Knowledge Representation and Inference
Jay McClelland, Stanford University



Emergent vs. Stipulated Structure: Old London vs. Midtown Manhattan

Where does structure come from?
– It's built in
– It's learned from experience
– There are constraints built in that shape what's learned

The Rumelhart Model and the Quillian Model

DER's Goals for the Model
1. Show how learning could capture the emergence of hierarchical structure
2. Show how the model could make inferences as in the Quillian model

Experience: Early, Later, Later Still

Start with a neutral representation on the representation units. Use backprop to adjust the representation to minimize the error.

The result is a representation similar to that of the average bird…

Use the representation to infer what this new thing can do.
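The procedure on the last three slides (start from a neutral representation, use backprop to adjust it, then read off inferences) can be sketched in a few lines. The two representation units and hand-set weights below are illustrative assumptions, not the actual Rumelhart network:

```python
# Sketch: infer a representation for a novel item by gradient descent,
# then use it to predict unobserved attributes.
# Weights are hand-set for illustration (NOT a trained Rumelhart net):
# attributes 0 and 1 both depend on representation unit 0; attribute 2 on unit 1.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

W = np.array([[4.0, 4.0, 0.0],
              [0.0, 0.0, 4.0]])   # 2 representation units -> 3 attributes

observed = {0: 1.0}   # we are told only that the new thing has attribute 0
rep = np.zeros(2)     # start with a neutral representation

for _ in range(500):  # backprop adjusts the representation, not the weights
    out = sigmoid(rep @ W)
    err = np.zeros(3)
    for i, target in observed.items():
        err[i] = out[i] - target        # cross-entropy error, observed attrs only
    rep -= 0.1 * (W @ err)              # gradient step on the representation

pred = sigmoid(rep @ W)
# attribute 1 shares structure with attribute 0, so it is inferred to be present;
# attribute 2 is unconstrained and stays at 0.5
```

Because attributes 0 and 1 load on the same representation unit, observing one licenses the inference of the other; this is the sense in which the settled representation resembles that of "the average bird" and supports inferences about what the new thing can do.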

Questions About the Rumelhart Model
– Does the model offer any advantages over other approaches?
– Can the mechanisms of learning and representation in the model tell us anything about development, or about the effects of neurodegeneration?

Phenomena in Development
– Progressive differentiation
– Overgeneralization of typical properties and frequent names
– Emergent domain-specificity of representation
– Basic-level advantage
– Expertise and frequency effects
– Conceptual reorganization

Neural Networks and Probabilistic Models
– The Rumelhart model is learning to match the conditional probability structure of the training data: P(attribute_i = 1 | item_j, context_k) for all i, j, k.
– The adjustments to the connection weights move them toward values that minimize a measure of the divergence between the network's estimates of these probabilities and their values as reflected in the training data.
– It does so subject to strong constraints imposed by the initial connection weights and the architecture. These constraints produce progressive differentiation, overgeneralization, etc.
– Depending on the structure in the training data, it can behave as though it is learning something much like one of K&T's structure types, as well as many structures that cannot be captured exactly by any of the structures in K&T's framework.
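The probability-matching claim can be illustrated minimally (with a made-up target probability, not the actual training corpus): a single sigmoid unit trained by the expected cross-entropy gradient settles at the conditional probability of its attribute.

```python
# A sigmoid unit trained with cross-entropy converges to the conditional
# probability of its target in the training data (here a made-up 0.8).
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

p_target = 0.8   # hypothetical P(attribute_i = 1 | item_j, context_k)
w = 0.0          # net input to the output unit

for _ in range(2000):
    out = sigmoid(w)
    # the target is 1 with probability p_target, so the expected
    # cross-entropy gradient is E[out - target] = out - p_target
    w -= 0.5 * (out - p_target)

# sigmoid(w) now matches the data's conditional probability (about 0.8)
```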

The Hierarchical Naïve Bayes Classifier Model (with R. Grosse and J. Glick)
– The world consists of things that belong to categories; each category in turn may consist of things in several sub-categories.
– The features of members of each category are treated as independent: P({f_i} | C_j) = Π_i P(f_i | C_j)
– Knowledge of the features is acquired for the most inclusive category first. Successive layers of sub-categories emerge as evidence accumulates supporting the presence of co-occurrences violating the independence assumption.
(Figure: category tree – Living Things; Animals, Plants; Birds, Fish, Flowers, Trees.)
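The independence assumption can be made concrete with a tiny classifier. The feature probabilities below are invented for illustration; only the product rule P({f_i} | C_j) = Π_i P(f_i | C_j) comes from the model:

```python
# Naive Bayes over two hypothetical classes with made-up feature probabilities.

p_feat = {
    "plant":  {"has_roots": 0.9,  "can_move": 0.05, "is_living": 0.99},
    "animal": {"has_roots": 0.05, "can_move": 0.9,  "is_living": 0.99},
}
prior = {"plant": 0.5, "animal": 0.5}

def likelihood(features, cls):
    # independence assumption: P({f_i} | C_j) = product_i P(f_i | C_j)
    like = 1.0
    for f, present in features.items():
        p = p_feat[cls][f]
        like *= p if present else (1.0 - p)
    return like

obs = {"has_roots": True, "can_move": False, "is_living": True}
post = {c: prior[c] * likelihood(obs, c) for c in prior}
z = sum(post.values())
post = {c: v / z for c, v in post.items()}   # normalize to a posterior
# post["plant"] is close to 1: roots without movement point to "plant"
```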

A One-Class and a Two-Class Naïve Bayes Classifier Model
(Table: P(feature | class) for the properties Can Grow, Is Living, Has Roots, Has Leaves, Has Branches, Has Bark, Has Petals, Has Gills, Has Scales, Can Swim, Can Fly, Has Feathers, Has Legs, Has Skin, Can See, under a one-class model and under each class of a two-class model.)

Accounting for the network's feature attributions with mixtures of classes at different levels of granularity
Property attribution model: P(f_i | item) = π_k P(f_i | c_k) + (1 - π_k)[π_j P(f_i | c_j) + (1 - π_j)[…]]
(Figure: regression beta weights as a function of epochs of training.)
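The nested mixture can be evaluated from the inside out. This is only a sketch of the displayed formula; the class probabilities and mixing weights below are hypothetical numbers:

```python
# Evaluate P(f_i | item) = pi_k P(f_i|c_k) + (1 - pi_k)[pi_j P(f_i|c_j) + ...],
# where p_levels runs from the most specific class to the most general,
# and weights holds one mixing proportion per non-root level.

def attribute(p_levels, weights):
    p = p_levels[-1]                     # innermost fallback: most general class
    for p_c, pi in zip(reversed(p_levels[:-1]), reversed(weights)):
        p = pi * p_c + (1.0 - pi) * p    # mix in the next, more specific class
    return p

# hypothetical numbers: P(f|bird) = 0.9, P(f|animal) = 0.6,
# P(f|living thing) = 0.5, mixing weights 0.7 (bird) and 0.8 (animal)
p = attribute([0.9, 0.6, 0.5], [0.7, 0.8])   # about 0.804
```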

Should we replace the PDP model with the naïve Bayes classifier?
– It explains a lot of the data, and offers a succinct abstract characterization.
– But it only characterizes what's learned when the data actually has hierarchical structure.
– So it may be a useful approximate characterization in some cases, but it can't really replace the real thing.

An exploration of these ideas in the domain of mammals
– What is the best representation of domain content?
– How do people make inferences about different kinds of object properties?

Structure Extracted by a Structured Statistical Model

Predictions
– Similarity ratings and patterns of inference will violate the hierarchical structure
– Patterns of inference will vary by context

Experiments
– Size, predator/prey, and other properties affect similarity across birds, fish, and mammals
– Property inferences show clear context specificity
– Property inferences for blank biological properties violate predictions of K&T's tree model for weasels, hippos, and other animals

Learning the Structure in the Training Data
– Progressive sensitization to successive principal components captures learning of the mammals dataset.
– This subsumes the naïve Bayes classifier as a special case when there is no real cross-domain structure (as in the Quillian training corpus).
– So are PDP networks, and our brain's neural networks, simply performing familiar statistical analyses? No!
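The principal-components claim can be illustrated with a toy item-by-feature matrix (invented here, not the actual mammals dataset): the first component extracted by SVD captures the broadest domain split.

```python
# Toy demonstration: the leading principal component of an item-feature
# matrix captures the broadest (plant vs. animal) split first.
import numpy as np

# rows: oak, pine, robin, salmon
# cols: has_roots, has_leaves, can_move, can_fly   (invented toy data)
X = np.array([[1, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 0]], dtype=float)

Xc = X - X.mean(axis=0)                    # center each feature
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
pc1 = U[:, 0] * S[0]                       # item scores on the first component

# the two plants score on one side of PC1 and the two animals on the other
```

Later components pick up the finer within-domain contrasts (leaves, flight), mirroring the network's progressive differentiation.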

Extracting Cross-Domain Structure


(Figure: input similarity vs. learned similarity.)

A Newer Direction
– Exploiting knowledge sharing across domains:
  – Lakoff: abstract cognition is grounded in concrete reality
  – Boroditsky: cognition about time is grounded in our conceptions of space
– Can we capture these kinds of influences through knowledge sharing across contexts?
– Work by Thibodeau, Glick, Sternberg & Flusberg shows that the answer is yes.

Summary
– Distributed representations provide the substrate for learned semantic representations in the brain that develop gradually with experience and degrade gracefully with damage.
– Succinctly stated probabilistic models can sometimes nicely approximate the structure learned, if the training data is consistent with them, but what is really learned is not likely to match any such structure exactly.
– The structure should be seen as a useful approximation rather than a Procrustean bed to which actual knowledge must be made to conform.