

Topic Models for Semantic Memory and Information Extraction Mark Steyvers Department of Cognitive Sciences University of California, Irvine Joint work with: Tom Griffiths, UC Berkeley Padhraic Smyth, UC Irvine Dave Newman, UC Irvine Chaitanya Chemudugunta, UC Irvine

Human Memory ↔ Information Retrieval
– Finding relevant memories ↔ finding relevant documents
– Extracting meaning ↔ extracting content

Semantic Representations. What are suitable representations? How can they be learned from experience?

Two approaches to semantic representation
Semantic networks (e.g. Collins & Quillian, 1969): structured representations; encoding of propositions; hand-coded examples (MONEY – CASH – BILL – LOAN – BANK – RIVER – STREAM); not inferred from data
Semantic spaces (e.g. Landauer & Dumais, 1997): relatively unstructured representations; can be learned from data (words such as LOAN, CASH, RIVER, BANK, BILL placed as points in a space)

Overview I Probabilistic Topic Models II Associative Memory III Episodic Memory IV Applications V Conclusion

Probabilistic Topic Models
– Extract topics from large text collections: unsupervised, using Bayesian statistical techniques
– Topics provide a quick summary of content / gist

What are topics? A topic represents a probability distribution over words; related words get high probability in the same topic. Example topics extracted from psychology grant applications (probability distribution over words, most likely words listed at the top).

Model Input
Matrix of counts: number of times words occur in documents
Note:
– word order is lost: “bag of words” approach
– some function words are deleted: “the”, “a”, “in”
[Example count matrix: rows = words (FOOD, ITALIAN, PASTA, PIZZA), columns = documents (Doc1, Doc2, Doc3)]
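The count-matrix input can be sketched in a few lines of Python; the toy corpus and the stop-word list here are my own illustration, not the authors' preprocessing:

```python
from collections import Counter

# Assumed stop-word list for illustration; word order within a
# document is discarded ("bag of words").
STOP_WORDS = {"the", "a", "in"}

def count_matrix(documents):
    """Return (vocabulary, counts) where counts[d][i] is the number of
    times vocabulary word i occurs in document d."""
    bags = [Counter(w for w in doc.lower().split() if w not in STOP_WORDS)
            for doc in documents]
    vocab = sorted(set().union(*bags))
    counts = [[bag[w] for w in vocab] for bag in bags]
    return vocab, counts

docs = ["the pizza in the italian restaurant",
        "pasta pizza pizza"]
vocab, counts = count_matrix(docs)
# vocab  -> ['italian', 'pasta', 'pizza', 'restaurant']
# counts -> [[1, 0, 1, 1], [0, 1, 2, 0]]
```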

Model Assumptions Each topic is a probability distribution over words Each document is modeled as a mixture of topics

Generative Model for Documents (“LDA”: Blei, Ng, & Jordan, 2002; “pLSI”: Hofmann, 1999; Griffiths & Steyvers, 2002 & 2004)
1. for each document, choose a mixture of topics
2. sample a topic [1..T] from the mixture
3. sample a word from the topic
[Diagram: TOPICS → TOPIC MIXTURE → TOPIC → WORD]
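The three generative steps can be sketched directly; the topics, vocabulary, and Dirichlet parameter below are made-up illustration values, not fitted quantities:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy topics (assumed for illustration): each row is P(word | topic).
vocab = ["money", "bank", "loan", "river", "stream"]
topics = np.array([[0.4, 0.3, 0.3, 0.0, 0.0],   # a "finance" topic
                   [0.0, 0.2, 0.0, 0.4, 0.4]])  # a "nature" topic

def generate_document(n_words, alpha=1.0):
    # 1. choose a mixture of topics for the document (Dirichlet prior)
    theta = rng.dirichlet([alpha] * len(topics))
    words = []
    for _ in range(n_words):
        z = rng.choice(len(topics), p=theta)      # 2. sample a topic
        w = rng.choice(len(vocab), p=topics[z])   # 3. sample a word
        words.append(vocab[w])
    return theta, words

theta, words = generate_document(10)
```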

Document = mixture of topics. [Figure: example documents shown with their topic mixture percentages; one document mixes two topics, another is drawn from a single topic]

The Generative Model
Conditional probability of word w in document d:
P(w | d) = Σ_j P(w | z = j) · P(z = j | d)
where P(w | z = j) is the word probability in topic j and P(z = j | d) is the probability of topic j in the document.
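As a check on the mixture formula, here is a two-topic arithmetic sketch with made-up probabilities:

```python
# P(w|d) = sum_j P(w|z=j) * P(z=j|d), with invented numbers for
# a single word "bank" under two topics.
p_w_given_z = {"bank": [0.30, 0.20]}   # P(bank | topic j)
p_z_given_d = [0.75, 0.25]             # P(topic j | document)

p_bank = sum(pw * pz for pw, pz in
             zip(p_w_given_z["bank"], p_z_given_d))
# 0.30 * 0.75 + 0.20 * 0.25 = 0.275
```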

Inverting the generative model. The generative model gives a procedure for obtaining a corpus from topics and mixing proportions. Inverting the model means extracting the topics and the per-document mixing proportions from a corpus.

Inverting the generative model
Estimate topic assignments
– each occurrence of a word is assigned to one topic [1..T]
Large state space
– with T = 300 topics and 6,000,000 words, the size of the discrete state space is 300^6,000,000
Need efficient sampling techniques
– Markov chain Monte Carlo (MCMC) with Gibbs sampling
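The MCMC step can be sketched as a minimal collapsed Gibbs sampler; this is a toy version, not the authors' implementation, and the hyperparameters `alpha` and `beta` are assumed values:

```python
import numpy as np

def gibbs_lda(docs, n_topics, vocab_size, alpha=0.1, beta=0.01,
              n_iter=50, seed=0):
    """Minimal collapsed Gibbs sampler for LDA.
    `docs` is a list of lists of integer word ids."""
    rng = np.random.default_rng(seed)
    nzw = np.zeros((n_topics, vocab_size))   # topic-word counts
    ndz = np.zeros((len(docs), n_topics))    # doc-topic counts
    nz = np.zeros(n_topics)                  # topic totals
    z = []                                   # assignment of every token
    for d, doc in enumerate(docs):
        zd = rng.integers(n_topics, size=len(doc))
        z.append(zd)
        for w, t in zip(doc, zd):
            nzw[t, w] += 1; ndz[d, t] += 1; nz[t] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                t = z[d][i]                  # remove current assignment
                nzw[t, w] -= 1; ndz[d, t] -= 1; nz[t] -= 1
                # conditional P(z = t | all other assignments)
                p = (nzw[:, w] + beta) / (nz + vocab_size * beta) \
                    * (ndz[d] + alpha)
                t = rng.choice(n_topics, p=p / p.sum())
                z[d][i] = t
                nzw[t, w] += 1; ndz[d, t] += 1; nz[t] += 1
    phi = (nzw + beta) / (nzw.sum(axis=1, keepdims=True)
                          + vocab_size * beta)
    theta = (ndz + alpha) / (ndz.sum(axis=1, keepdims=True)
                             + n_topics * alpha)
    return phi, theta

docs = [[0, 1, 0, 1], [1, 0, 1, 0], [2, 3, 2, 3], [3, 2, 3, 2]]
phi, theta = gibbs_lda(docs, n_topics=2, vocab_size=4)
```

Each sweep removes one token's assignment, recomputes its conditional distribution from the remaining counts, and resamples it; `phi` and `theta` are the smoothed estimates of P(w|z) and P(z|d).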

Summary
INPUT: word-document counts (word order is irrelevant)
OUTPUT:
– topic assignments to each word: P(z_i)
– likely words in each topic: P(w | z)
– likely topics in each document (“gist”): P(z | d)

A selection from 500 topics [P(w | z = j)], most likely words first:
– THEORY, SCIENTISTS, EXPERIMENT, OBSERVATIONS, SCIENTIFIC, EXPERIMENTS, HYPOTHESIS, EXPLAIN, SCIENTIST, OBSERVED, EXPLANATION, BASED, OBSERVATION, IDEA, EVIDENCE, THEORIES, BELIEVED, DISCOVERED, OBSERVE, FACTS
– SPACE, EARTH, MOON, PLANET, ROCKET, MARS, ORBIT, ASTRONAUTS, FIRST, SPACECRAFT, JUPITER, SATELLITE, SATELLITES, ATMOSPHERE, SPACESHIP, SURFACE, SCIENTISTS, ASTRONAUT, SATURN, MILES
– ART, PAINT, ARTIST, PAINTING, PAINTED, ARTISTS, MUSEUM, WORK, PAINTINGS, STYLE, PICTURES, WORKS, OWN, SCULPTURE, PAINTER, ARTS, BEAUTIFUL, DESIGNS, PORTRAIT, PAINTERS
– STUDENTS, TEACHER, STUDENT, TEACHERS, TEACHING, CLASS, CLASSROOM, SCHOOL, LEARNING, PUPILS, CONTENT, INSTRUCTION, TAUGHT, GROUP, GRADE, SHOULD, GRADES, CLASSES, PUPIL, GIVEN
– BRAIN, NERVE, SENSE, SENSES, ARE, NERVOUS, NERVES, BODY, SMELL, TASTE, TOUCH, MESSAGES, IMPULSES, CORD, ORGANS, SPINAL, FIBERS, SENSORY, PAIN, IS
– CURRENT, ELECTRICITY, ELECTRIC, CIRCUIT, IS, ELECTRICAL, VOLTAGE, FLOW, BATTERY, WIRE, WIRES, SWITCH, CONNECTED, ELECTRONS, RESISTANCE, POWER, CONDUCTORS, CIRCUITS, TUBE, NEGATIVE

Words can have high probability in multiple topics (e.g. FIELD appears in all four):
– FIELD, MAGNETIC, MAGNET, WIRE, NEEDLE, CURRENT, COIL, POLES, IRON, COMPASS, LINES, CORE, ELECTRIC, DIRECTION, FORCE, MAGNETS, BE, MAGNETISM, POLE, INDUCED
– SCIENCE, STUDY, SCIENTISTS, SCIENTIFIC, KNOWLEDGE, WORK, RESEARCH, CHEMISTRY, TECHNOLOGY, MANY, MATHEMATICS, BIOLOGY, FIELD, PHYSICS, LABORATORY, STUDIES, WORLD, SCIENTIST, STUDYING, SCIENCES
– BALL, GAME, TEAM, FOOTBALL, BASEBALL, PLAYERS, PLAY, FIELD, PLAYER, BASKETBALL, COACH, PLAYED, PLAYING, HIT, TENNIS, TEAMS, GAMES, SPORTS, BAT, TERRY
– JOB, WORK, JOBS, CAREER, EXPERIENCE, EMPLOYMENT, OPPORTUNITIES, WORKING, TRAINING, SKILLS, CAREERS, POSITIONS, FIND, POSITION, FIELD, OCCUPATIONS, REQUIRE, OPPORTUNITY, EARN, ABLE

Disambiguation: an ambiguous word such as FIELD (magnetic field, playing field, field of work) is disambiguated by the topics of the other words in its context.

Topics versus LSA
Latent Semantic Analysis (LSI/LSA):
– projects words into a K-dimensional hidden space
– less interpretable
– not generalizable
– not as accurate
[Figure: MONEY, LOAN, CASH, RIVER, BANK, BILL as points in a high-dimensional space]

Modeling Word Association

Word Association (norms from Nelson et al. 1998) CUE: PLANET

Word Association (norms from Nelson et al. 1998). CUE: PLANET (vocabulary = words).
People's associates, by associate number: EARTH, STARS, SPACE, SUN, MARS, UNIVERSE, SATURN, GALAXY

Topic model for Word Association
Association as a problem of prediction: given that a single word (the cue) is observed, predict what other words (responses) might occur in that context.
Under a single-topic assumption:
P(response | cue) = Σ_j P(response | z = j) · P(z = j | cue)
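A sketch of this prediction rule with toy numbers; the vocabulary, topic probabilities, and uniform topic prior are all assumptions for illustration:

```python
import numpy as np

# Toy topic-word probabilities (invented): rows = topics.
vocab = ["planet", "earth", "stars", "soccer", "goal"]
phi = np.array([[0.4, 0.3, 0.3, 0.0, 0.0],    # "space" topic
                [0.0, 0.0, 0.0, 0.5, 0.5]])   # "sports" topic
p_topic = np.array([0.5, 0.5])                # prior P(z)

def predict_associates(cue):
    """P(w2 | w1) under the single-topic assumption:
    infer P(z | cue) by Bayes' rule, then average over topics."""
    c = vocab.index(cue)
    p_z_given_cue = p_topic * phi[:, c]
    p_z_given_cue /= p_z_given_cue.sum()
    p_w = p_z_given_cue @ phi
    ranked = sorted(zip(vocab, p_w), key=lambda t: -t[1])
    return [w for w, p in ranked if w != cue]

top = predict_associates("planet")
```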

Word Association (norms from Nelson et al. 1998). CUE: PLANET.
People's associates, by associate number: EARTH, STARS, SPACE, SUN, MARS, UNIVERSE, SATURN, GALAXY
Model's associates: STARS, STAR, SUN, EARTH, SPACE, SKY, PLANET, UNIVERSE
The first associate, EARTH, is in the set of 8 associates from the model.

[Chart: P(set contains first associate), comparing LSA and TOPICS]

Why would LSA perform worse? Cosine similarity measure imposes unnecessary constraints on representations, e.g. –Symmetry –Triangle inequality

Violation of the triangle inequality
For distances, AC ≤ AB + BC. Word associations can violate this: A → B and B → C can both be strong while A → C is weak, as with SOCCER – FIELD – MAGNETIC.

No triangle inequality with topics
[Diagram: SOCCER and FIELD share TOPIC 1; FIELD and MAGNETIC share TOPIC 2]
The topic structure easily explains violations of the triangle inequality.
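A tiny numeric illustration of the point, with made-up topic probabilities: the conditional association is strong from SOCCER to FIELD and from FIELD to MAGNETIC, yet zero from SOCCER to MAGNETIC, a pattern no metric space allows:

```python
import numpy as np

# Two toy topics (invented numbers): FIELD has high probability in
# both, SOCCER only in topic 1, MAGNETIC only in topic 2.
vocab = ["soccer", "field", "magnetic"]
phi = np.array([[0.5, 0.5, 0.0],    # sports topic
                [0.0, 0.5, 0.5]])   # physics topic

def assoc(w1, w2):
    """P(w2 | w1) with a uniform topic prior, as in the
    single-topic association model."""
    i, j = vocab.index(w1), vocab.index(w2)
    p_z = phi[:, i] / phi[:, i].sum()
    return float(p_z @ phi[:, j])

strong1 = assoc("soccer", "field")     # 0.5
strong2 = assoc("field", "magnetic")   # 0.25
weak = assoc("soccer", "magnetic")     # 0.0
```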

Modeling Episodic Memory

Semantic Isolation Effect Study this list: PEAS, CARROTS, BEANS, SPINACH, LETTUCE, HAMMER, TOMATOES, CORN, CABBAGE, SQUASH HAMMER, PEAS, CARROTS,...

Semantic Isolation Effect
Verbal explanations: attention, surprise, distinctiveness
Our approach: the memory system is trading off two encoding resources
– storing specific words (e.g. “hammer”)
– storing the general theme of the list (e.g. “vegetables”)

Computational Problem
How to trade off specificity and generality, remembering both detail and gist? → the special word topic model

Special word topic (SW) model
Each word can be generated via one of two routes:
– topics
– a special-words distribution (unique to a document)
Conditional probability of a word under a document:
P(w | d) = λ_d · P(w | special, d) + (1 − λ_d) · Σ_j P(w | z = j) · P(z = j | d)
where λ_d is the probability of the special-words route for document d.
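The two-route mixture can be sketched in arithmetic; every number below (the route weight `lam`, the special-words and topic probabilities) is invented for illustration:

```python
# P(w|d) = lam * P(w|special,d) + (1-lam) * sum_j P(w|z=j) P(z=j|d)
lam = 0.3                                  # P(special route), assumed
p_special = {"hammer": 0.9}                # document-specific distribution
p_w_given_z = {"hammer": [0.01, 0.20]}     # P(hammer | topic j)
p_z_given_d = [0.9, 0.1]                   # mostly the "vegetables" topic

topic_route = sum(pw * pz for pw, pz in
                  zip(p_w_given_z["hammer"], p_z_given_d))
p_hammer = lam * p_special["hammer"] + (1 - lam) * topic_route
# 0.3 * 0.9 + 0.7 * 0.029 = 0.2903
```

The isolated word HAMMER gets most of its probability from the special-words route, which is how the model stores detail that the list's topic cannot explain.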

Study words: PEAS, CARROTS, BEANS, SPINACH, LETTUCE, HAMMER, TOMATOES, CORN, CABBAGE, SQUASH
[Diagram: ENCODING and RETRIEVAL, with HAMMER stored via the special-words route]

Hunt & Lamb (2001 exp. 1) OUTLIER LIST PEAS CARROTS BEANS SPINACH LETTUCE HAMMER TOMATOES CORN CABBAGE SQUASH CONTROL LIST SAW SCREW CHISEL DRILL SANDPAPER HAMMER NAILS BENCH RULER ANVIL

Model Predictions

False memory effects (lure = ANGER), Robinson & Roediger (1997). Lists with 3, 6, or 9 associates of the lure:
– 3 associates: MAD, FEAR, HATE, SMOOTH, NAVY, HEAT, SALAD, TUNE, COURTS, CANDY, PALACE, PLUSH, TOOTH, BLIND, WINTER
– 6 associates: MAD, FEAR, HATE, RAGE, TEMPER, FURY, SALAD, TUNE, COURTS, CANDY, PALACE, PLUSH, TOOTH, BLIND, WINTER
– 9 associates: MAD, FEAR, HATE, RAGE, TEMPER, FURY, WRATH, HAPPY, FIGHT, CANDY, PALACE, PLUSH, TOOTH, BLIND, WINTER

Relation to Dual Process Models
The gist/verbatim distinction (e.g. Reyna & Brainerd, 1995) maps onto topics and special words.
Our approach specifies both encoding and retrieval representations and processes:
– the routes are not independent
– the model explains performance for actual word lists

Applications I

Topics provide quick summary of content What is in this corpus? What is in this document? What are the topical trends over time? Who writes on this topic?

Analyzing the New York Times: 330,000 articles

Three investigations began Thursday into the securities and exchange_commission's choice of william_webster to head a new board overseeing the accounting profession. house and senate_democrats called for the resignations of both judge_webster and harvey_pitt, the commission's chairman. The white_house expressed support for judge_webster as well as for harvey_pitt, who was harshly criticized Thursday for failing to inform other commissioners before they approved the choice of judge_webster that he had led the audit committee of a company facing fraud accusations. “The president still has confidence in harvey_pitt,” said dan_bartlett, bush's communications director … Extracted Named Entities Used standard algorithms to extract named entities: - People - Places - Organizations

Standard Topic Model with Entities

Topic Trends: Tour-de-France, Anthrax, Quarterly Earnings
[Plots: proportion of words assigned to each topic for each time slice]
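Given per-token topic assignments with timestamps (the model's output), the trend curves are simple proportions; the token stream below is invented for illustration:

```python
from collections import Counter, defaultdict

# Hypothetical stream of (time slice, topic assignment) pairs,
# as produced by a fitted topic model over dated articles.
tokens = [(2001, "anthrax"), (2001, "earnings"), (2001, "anthrax"),
          (2002, "earnings"), (2002, "earnings"), (2002, "anthrax")]

def topic_trends(tokens):
    """Proportion of words assigned to each topic in each time slice."""
    by_slice = defaultdict(Counter)
    for t, z in tokens:
        by_slice[t][z] += 1
    return {t: {z: n / sum(c.values()) for z, n in c.items()}
            for t, c in by_slice.items()}

trends = topic_trends(tokens)
# trends[2001]["anthrax"] -> 2/3, trends[2002]["anthrax"] -> 1/3
```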

Example of Extracted Entity-Topic Network

Prediction of Missing Entities in Text
Test article with entities removed:
Shares of XXXX slid 8 percent, or $1.10, to $12.65 Tuesday, as major credit agencies said the conglomerate would still be challenged in repaying its debts, despite raising $4.6 billion Monday in taking its finance group public. Analysts at XXXX Investors service in XXXX said they were keeping XXXX and its subsidiaries under review for a possible debt downgrade, saying the company “will continue to face a significant debt burden,” with large slices of debt coming due over the next 18 months. XXXX said …
Actual missing entities: fitch, goldman-sachs, lehman-brother, moody, morgan-stanley, new-york-stock-exchange, standard-and-poor, tyco, tyco-international, wall-street, worldco
Predicted entities given observed words (matches in blue): wall-street, new-york, nasdaq, securities-exchange-commission, sec, merrill-lynch, new-york-stock-exchange, goldman-sachs, standard-and-poor

Applications II

Faculty Browser
– System collects PDF files from websites (UCI, UCSD)
– Applies the topic model to text extracted from the PDFs
– Displays faculty research with topics
Demo:

[Screenshot: one topic, with the most prolific researchers for this topic]

[Screenshot: one researcher, the topics this researcher works on, and other researchers with similar topical interests]

Conclusions

Human Memory ↔ Information Retrieval
– Finding relevant memories ↔ finding relevant documents

Software Public-domain MATLAB toolbox for topic modeling on the Web:

Hidden Markov Topic Model

Hidden Markov Topics Model (Griffiths, Steyvers, Blei, & Tenenbaum, 2004)
– syntactic dependencies → short-range dependencies
– semantic dependencies → long-range dependencies
[Graphical model: a chain of hidden states s emitting words w; the semantic state generates words from the topic model (topic variables z), the syntactic states generate words from an HMM]

Transition between the semantic state and syntactic states. Example distributions:
– semantic topic z = 1: HEART 0.2, LOVE 0.2, SOUL 0.2, TEARS 0.2, JOY 0.2
– semantic topic z = 2: SCIENTIFIC 0.2, KNOWLEDGE 0.2, WORK 0.2, RESEARCH 0.2, MATHEMATICS 0.2
– syntactic class x = 1: THE 0.6, A 0.3, MANY 0.1
– syntactic class x = 3: OF 0.6, FOR 0.3, BETWEEN 0.1

Combining topics and syntax: the model generates a sentence one word at a time, alternating between syntactic classes and the semantic (topic) state:
THE → THE LOVE → THE LOVE OF → THE LOVE OF RESEARCH → …
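The combined HMM-plus-topics generator on these slides can be sketched as follows; the transition matrix and the state numbering are my own assumptions (the slides only show the two topics and two syntactic classes):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy distributions from the slides. State 0 is the semantic state
# (emits from a topic); states 1 and 2 are syntactic classes.
topics = {1: {"HEART": .2, "LOVE": .2, "SOUL": .2, "TEARS": .2, "JOY": .2},
          2: {"SCIENTIFIC": .2, "KNOWLEDGE": .2, "WORK": .2,
              "RESEARCH": .2, "MATHEMATICS": .2}}
syntax = {1: {"THE": .6, "A": .3, "MANY": .1},
          2: {"OF": .6, "FOR": .3, "BETWEEN": .1}}
# Assumed HMM transitions over states [semantic, syntactic 1, syntactic 2].
trans = np.array([[0.0, 0.0, 1.0],   # semantic -> "of/for/between"
                  [1.0, 0.0, 0.0],   # "the/a/many" -> semantic
                  [1.0, 0.0, 0.0]])  # "of/for/between" -> semantic

def sample(dist):
    words, probs = zip(*dist.items())
    return words[rng.choice(len(words), p=np.array(probs))]

def generate(theta, n_words, start=1):
    """Walk the HMM; the semantic state (0) emits from a
    document-specific mixture of topics `theta`."""
    state, out = start, []
    for _ in range(n_words):
        if state == 0:
            z = rng.choice(list(topics), p=theta)
            out.append(sample(topics[z]))
        else:
            out.append(sample(syntax[state]))
        state = rng.choice(3, p=trans[state])
    return out

sentence = generate(theta=np.array([0.5, 0.5]), n_words=4)
```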

Semantic topics:
– FOOD, FOODS, BODY, NUTRIENTS, DIET, FAT, SUGAR, ENERGY, MILK, EATING, FRUITS, VEGETABLES, WEIGHT, FATS, NEEDS, CARBOHYDRATES, VITAMINS, CALORIES, PROTEIN, MINERALS
– MAP, NORTH, EARTH, SOUTH, POLE, MAPS, EQUATOR, WEST, LINES, EAST, AUSTRALIA, GLOBE, POLES, HEMISPHERE, LATITUDE, PLACES, LAND, WORLD, COMPASS, CONTINENTS
– DOCTOR, PATIENT, HEALTH, HOSPITAL, MEDICAL, CARE, PATIENTS, NURSE, DOCTORS, MEDICINE, NURSING, TREATMENT, NURSES, PHYSICIAN, HOSPITALS, DR, SICK, ASSISTANT, EMERGENCY, PRACTICE
– BOOK, BOOKS, READING, INFORMATION, LIBRARY, REPORT, PAGE, TITLE, SUBJECT, PAGES, GUIDE, WORDS, MATERIAL, ARTICLE, ARTICLES, WORD, FACTS, AUTHOR, REFERENCE, NOTE
– GOLD, IRON, SILVER, COPPER, METAL, METALS, STEEL, CLAY, LEAD, ADAM, ORE, ALUMINUM, MINERAL, MINE, STONE, MINERALS, POT, MINING, MINERS, TIN
– BEHAVIOR, SELF, INDIVIDUAL, PERSONALITY, RESPONSE, SOCIAL, EMOTIONAL, LEARNING, FEELINGS, PSYCHOLOGISTS, INDIVIDUALS, PSYCHOLOGICAL, EXPERIENCES, ENVIRONMENT, HUMAN, RESPONSES, BEHAVIORS, ATTITUDES, PSYCHOLOGY, PERSON
– CELLS, CELL, ORGANISMS, ALGAE, BACTERIA, MICROSCOPE, MEMBRANE, ORGANISM, FOOD, LIVING, FUNGI, MOLD, MATERIALS, NUCLEUS, CELLED, STRUCTURES, MATERIAL, STRUCTURE, GREEN, MOLDS
– PLANTS, PLANT, LEAVES, SEEDS, SOIL, ROOTS, FLOWERS, WATER, FOOD, GREEN, SEED, STEMS, FLOWER, STEM, LEAF, ANIMALS, ROOT, POLLEN, GROWING, GROW

Syntactic classes:
– GOOD, SMALL, NEW, IMPORTANT, GREAT, LITTLE, LARGE, *, BIG, LONG, HIGH, DIFFERENT, SPECIAL, OLD, STRONG, YOUNG, COMMON, WHITE, SINGLE, CERTAIN
– THE, HIS, THEIR, YOUR, HER, ITS, MY, OUR, THIS, THESE, A, AN, THAT, NEW, THOSE, EACH, MR, ANY, MRS, ALL
– MORE, SUCH, LESS, MUCH, KNOWN, JUST, BETTER, RATHER, GREATER, HIGHER, LARGER, LONGER, FASTER, EXACTLY, SMALLER, SOMETHING, BIGGER, FEWER, LOWER, ALMOST
– ON, AT, INTO, FROM, WITH, THROUGH, OVER, AROUND, AGAINST, ACROSS, UPON, TOWARD, UNDER, ALONG, NEAR, BEHIND, OFF, ABOVE, DOWN, BEFORE
– SAID, ASKED, THOUGHT, TOLD, SAYS, MEANS, CALLED, CRIED, SHOWS, ANSWERED, TELLS, REPLIED, SHOUTED, EXPLAINED, LAUGHED, MEANT, WROTE, SHOWED, BELIEVED, WHISPERED
– ONE, SOME, MANY, TWO, EACH, ALL, MOST, ANY, THREE, THIS, EVERY, SEVERAL, FOUR, FIVE, BOTH, TEN, SIX, MUCH, TWENTY, EIGHT
– HE, YOU, THEY, I, SHE, WE, IT, PEOPLE, EVERYONE, OTHERS, SCIENTISTS, SOMEONE, WHO, NOBODY, ONE, SOMETHING, ANYONE, EVERYBODY, SOME, THEN
– BE, MAKE, GET, HAVE, GO, TAKE, DO, FIND, USE, SEE, HELP, KEEP, GIVE, LOOK, COME, WORK, MOVE, LIVE, EAT, BECOME

NIPS syntax (syntactic classes):
– MODEL, ALGORITHM, SYSTEM, CASE, PROBLEM, NETWORK, METHOD, APPROACH, PAPER, PROCESS
– IS, WAS, HAS, BECOMES, DENOTES, BEING, REMAINS, REPRESENTS, EXISTS, SEEMS
– SEE, SHOW, NOTE, CONSIDER, ASSUME, PRESENT, NEED, PROPOSE, DESCRIBE, SUGGEST
– USED, TRAINED, OBTAINED, DESCRIBED, GIVEN, FOUND, PRESENTED, DEFINED, GENERATED, SHOWN
– IN, WITH, FOR, ON, FROM, AT, USING, INTO, OVER, WITHIN
– HOWEVER, ALSO, THEN, THUS, THEREFORE, FIRST, HERE, NOW, HENCE, FINALLY
– #, *, I, X, T, N, -, C, F, P (single-character tokens)
NIPS semantics (topics):
– EXPERTS, EXPERT, GATING, HME, ARCHITECTURE, MIXTURE, LEARNING, MIXTURES, FUNCTION, GATE
– DATA, GAUSSIAN, MIXTURE, LIKELIHOOD, POSTERIOR, PRIOR, DISTRIBUTION, EM, BAYESIAN, PARAMETERS
– STATE, POLICY, VALUE, FUNCTION, ACTION, REINFORCEMENT, LEARNING, CLASSES, OPTIMAL, *
– MEMBRANE, SYNAPTIC, CELL, *, CURRENT, DENDRITIC, POTENTIAL, NEURON, CONDUCTANCE, CHANNELS
– IMAGE, IMAGES, OBJECT, OBJECTS, FEATURE, RECOGNITION, VIEWS, #, PIXEL, VISUAL
– KERNEL, SUPPORT, VECTOR, SVM, KERNELS, #, SPACE, FUNCTION, MACHINES, SET
– NETWORK, NEURAL, NETWORKS, OUTPUT, INPUT, TRAINING, INPUTS, WEIGHTS, #, OUTPUTS

Random sentence generation (LANGUAGE):
– [S] RESEARCHERS GIVE THE SPEECH
– [S] THE SOUND FEEL NO LISTENERS
– [S] WHICH WAS TO BE MEANING
– [S] HER VOCABULARIES STOPPED WORDS
– [S] HE EXPRESSLY WANTED THAT BETTER VOWEL

Nested Chinese Restaurant Process

Topic Hierarchies
In the regular topic model there are no relations between topics (topic 1 … topic 7).
Nested Chinese Restaurant Process (Blei, Griffiths, Jordan, & Tenenbaum, 2004):
– learns a hierarchical structure, as well as the topics within that structure
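The nested CRP prior behind such hierarchies can be sketched as a path-sampling routine: each document walks from the root to a leaf, at each level reusing a popular branch or opening a new one. The tuple-based node encoding and the `gamma` value here are illustrative choices, not the authors' implementation:

```python
import random
from collections import defaultdict

def ncrp_path(counts, depth, gamma, rng):
    """Sample one root-to-leaf path of the nested CRP. Nodes are tuples
    (the sequence of branch choices); counts[node] = documents through it.
    At each level: pick child c with prob. counts[c] / (n + gamma),
    or a new child with prob. gamma / (n + gamma)."""
    node = ()
    path = [node]
    for _ in range(depth - 1):
        children = [c for c in counts
                    if c[:-1] == node and len(c) == len(node) + 1]
        total = sum(counts[c] for c in children) + gamma
        r = rng.random() * total
        chosen = None
        for c in children:
            r -= counts[c]
            if r < 0:
                chosen = c
                break
        if chosen is None:                  # open a new branch
            chosen = node + (len(children),)
        counts[chosen] += 1
        path.append(chosen)
        node = chosen
    return path

rng = random.Random(0)
counts = defaultdict(int)
paths = [ncrp_path(counts, depth=3, gamma=1.0, rng=rng)
         for _ in range(5)]
```

The first document always creates a fresh branch at every level; later documents tend to follow well-trodden paths, which is what makes popular topics sit high in the tree.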

Example: Psych Review Abstracts
[Topic hierarchy; most likely words at each node:]
– THE, OF, AND, TO, IN, A, IS (root)
– A, MODEL, MEMORY, FOR, MODELS, TASK, INFORMATION, RESULTS, ACCOUNT
– SELF, SOCIAL, PSYCHOLOGY, RESEARCH, RISK, STRATEGIES, INTERPERSONAL, PERSONALITY, SAMPLING
– RESPONSE, STIMULUS, REINFORCEMENT, RECOGNITION, STIMULI, RECALL, CHOICE, CONDITIONING
– SPEECH, READING, WORDS, MOVEMENT, MOTOR, VISUAL, WORD, SEMANTIC
– ACTION, SOCIAL, SELF, EXPERIENCE, EMOTION, GOALS, EMOTIONAL, THINKING
– GROUP, IQ, INTELLIGENCE, SOCIAL, RATIONAL, INDIVIDUAL, GROUPS, MEMBERS
– SEX, EMOTIONS, GENDER, EMOTION, STRESS, WOMEN, HEALTH, HANDEDNESS
– REASONING, ATTITUDE, CONSISTENCY, SITUATIONAL, INFERENCE, JUDGMENT, PROBABILITIES, STATISTICAL
– IMAGE, COLOR, MONOCULAR, LIGHTNESS, GIBSON, SUBMOVEMENT, ORIENTATION, HOLOGRAPHIC
– CONDITIONING, STRESS, EMOTIONAL, BEHAVIORAL, FEAR, STIMULATION, TOLERANCE, RESPONSES
– MOTION, VISUAL, SURFACE, BINOCULAR, RIVALRY, CONTOUR, DIRECTION, CONTOURS, SURFACES
– DRUG, FOOD, BRAIN, AROUSAL, ACTIVATION, AFFECTIVE, HUNGER, EXTINCTION, PAIN

Generative Process
[Same hierarchy as the previous slide: to generate a document, choose a path from the root to a leaf, then draw each word from one of the topics along that path]