CONCEPTUAL KNOWLEDGE: EVIDENCE FROM CORPORA, THE MIND, AND THE BRAIN Massimo Poesio, Uni Trento, Center for Mind / Brain Sciences; Uni Essex, Language & Computation (joint work with A. Almuhareb, E. Barbu, M. Baroni, B. Murphy)

MOTIVATIONS Research on conceptual knowledge is carried out in Computational Linguistics, Neural Science, and Psychology But there is limited interchange between CL and the other disciplines studying concepts –Except indirectly through the use of WordNet This work: use data from Psychology and Neural Science to evaluate (vector-space) models produced in CL

OUTLINE Vector space representations A 'semantic' vector space model How to evaluate such models Attribute extraction and Feature norms Category distinctions and Brain data

CONCEPTUAL SEMANTICS IN VECTOR SPACE

LEXICAL ACQUISITION IN CORPUS / COMP LING Vectorial representations of lexical meaning derived from IR WORD-BASED vector models: –vector dimensions are words –Schuetze 91, 98; HAL, LSA, Turney, Rapp GRAMMATICAL RELATION models: –vector dimensions are ⟨relation, word⟩ pairs –Grefenstette 93, Lin 98, Curran & Moens, Pantel, Widdows, Pado & Lapata, …
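A word-based model of the kind above can be sketched in a few lines: each target word is represented by counts of the words occurring near it. The toy corpus and the ±2-token window are illustrative assumptions, not the setup of any of the cited models.

```python
from collections import Counter, defaultdict

def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of context words within +/- window tokens."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, target in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[target][tokens[j]] += 1
    return vectors

# toy corpus standing in for real data
corpus = [["the", "cat", "chased", "the", "mouse"],
          ["the", "dog", "chased", "the", "cat"]]
vecs = cooccurrence_vectors(corpus)
```

Grammatical-relation models differ only in the choice of dimensions: instead of neighbouring words, each dimension is a ⟨relation, word⟩ pair produced by a parser.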

FEATURES IN VECTOR SPACE MODELS: WORDS vs. GRAMMATICAL RELATIONS

STRENGTHS OF THIS APPROACH: CATEGORIZATION

LIMITATIONS OF THIS WORK Very simplistic view of concepts –In fact, typically extract lexical representations for WORDS (non-disambiguated) Limited evaluation –Typical evaluation: judges' opinions about correctness of distances / comparing with WordNet Most work not connected with work on concepts in Psychology / Neural Science

OUR WORK Acquire richer, more semantic-oriented concept descriptions by exploiting relation extraction techniques Develop task-based methods for evaluating the results Integrate results from corpora with results from psychology & neural science

THIS TALK Acquire richer, more semantic-oriented concept descriptions by exploiting relation extraction techniques Develop task-based methods for evaluating the results Integrate results from corpora with results from psychology & neural science

OUTLINE Vector space representations A 'semantic' vector space model How to evaluate such models Attribute extraction and Feature norms Category distinctions and Brain data

OUTLINE

MORE ADVANCED THEORIES OF CONCEPTS In Linguistics: –Pustejovsky In AI: –Description Logics –Formal ontologies In Psychology: –Theory Theory (Murphy, 2002) –FUSS (Vigliocco, Vinson et al)

SEMANTIC CONCEPT DESCRIPTIONS PUSTEJOVSKY (1991, 1995) Lexical entries have a QUALIA STRUCTURE consisting of four roles –FORMAL role: what type of object it is (shape, color, …) –CONSTITUTIVE role: what it consists of (parts, stuff, etc.) E.g., for books: chapters, index, paper … –TELIC role: what is the purpose of the object (e.g., for books, READING) –AGENTIVE role: how the object was created (e.g., for books, WRITING)
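As a concrete illustration, a qualia structure can be encoded as a simple record type. The class below and the filler lists for "book" are a sketch following the slide's examples; the field contents other than those examples are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class QualiaStructure:
    formal: list        # what type of object it is (shape, color, ...)
    constitutive: list  # what it consists of (parts, stuff, ...)
    telic: list         # the purpose of the object
    agentive: list      # how the object was created

book = QualiaStructure(
    formal=["physical_object"],   # assumed filler, for illustration
    constitutive=["chapters", "index", "paper"],
    telic=["reading"],
    agentive=["writing"],
)
```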

BEYOND BUNDLES OF ATTRIBUTES: DESCRIPTION LOGICS, THEORY THEORY We know much more about concepts than the fact that they have certain attributes: –We know that cars have 4 wheels whereas bicycles have 2 –We don't just know that people have heads, bodies and legs, but that heads are attached in certain positions whereas legs are attached in other ones –Facts of this type can be expressed even in the simplest concept description languages, those of description logics

BEYOND SIMPLE RELATIONS: DESCRIPTION LOGICS Bear ≡ (and Animal (≥ 4 Paw) …) Strawberry ≡ (and Fruit (fills Color red) …) Female ≡ (and Human (not Male))

WORD SENSE DISCRIMINATION The senses of palm in WordNet 1. the inner surface of the hand from the wrist to the base of the fingers 2. a linear unit based on the length or width of the human hand 3. any plant of the family Palmae having an unbranched trunk crowned by large pinnate or palmate leaves 4. an award for winning a championship or commemorating some other event

CONCEPT ACQUISITION MEETS RELATION EXTRACTION –We developed methods to identify SEMANTIC properties of concepts ('deep' lexical relations) ATTRIBUTES and their VALUES –Almuhareb & Poesio 2004, 2005 Extracting QUALIA –Poesio & Almuhareb 2005 Letting relations emerge from the data: STRUDEL –Baroni et al, Cognitive Science, to appear Extracting Wu & Barsalou-style relations –Poesio, Barbu, Giuliano & Romano, 2008 We showed that for a variety of tasks such conceptual descriptions are better than word-based or grammatical function-based descriptions

ALMUHAREB & POESIO 2005: USING A PARSER, LOOKING ONLY FOR (POTENTIAL) ATTRIBUTES AND THEIR VALUES IS BETTER THAN USING ALL GRs, EVEN IF ATTRIBUTES ARE OBTAINED USING TEXT PATTERNS ('THE X OF THE Y')

ATTRIBUTES AND VALUES VS. ALL CORPUS FEATURES Accuracy by description: values only 64.86% / 94.59%; attributes only 97.30% / 97.30%; attributes (1522) & values (1522) 100%.

SUPERVISED EXTRACTION OF CONCEPT DESCRIPTIONS Using a theory of attributes merging ideas from Pustejovsky and Guarino (Poesio and Almuhareb, 2005) Using Wu and Barsalou's theory of attributes (Poesio, Barbu, Romano & Giuliano, 2008)


THE CLASSIFICATION SCHEME FOR ATTRIBUTES OF POESIO & ALMUHAREB 2005 PART –(cf. Guarino's non-relational attributes, Pustejovsky's constitutive roles) RELATED OBJECT –Non-relational attributes other than parts, relational roles QUALITY –Guarino's qualities, Pustejovsky's formal roles ACTIVITY –Pustejovsky's telic and agentive roles RELATED AGENT NOT AN ATTRIBUTE (= everything else)

A SUPERVISED FEATURE CLASSIFIER We developed a supervised feature classifier that relies on 4 types of information –Morphological info (Dixon, 1991) –Question patterns –Features of features –Feature use Some nouns are used more commonly as features than as concepts: i.e., 'the F of the C' is more frequent than 'the * of the F' (These last four methods all rely on info extracted from the Web)

THE EXPERIMENT We created a BALANCED DATASET –~400 concepts –representing all 21 WordNet classes, including both ABSTRACT and CONCRETE concepts –balanced as to ambiguity and frequency We collected from the Web 20,000 candidate features of these concepts using patterns We hand-classified 1,155 candidate features We used these data to train –A binary classifier (feature / non-feature) –A 5-way classifier

OUTLINE Vector space representations An example of 'semantic-based' vector space model Evaluating such models Attribute extraction and Feature norms Category distinctions and Brain data

EVALUATION –Qualitative: Visual inspection Ask subjects to assess correctness of the classification of the attributes –Quantitative: Use conceptual descriptions for CLUSTERING (CATEGORIZATION)

QUANTITATIVE EVALUATION: PROBLEMS Attribute extraction –WordNet only contains ISA and PART attributes –Attribute extraction can only be evaluated by hand Categorical distinctions –The WordNet category structure is highly subjective

VISUAL EVALUATION: TOP 400 FEATURES OF DEER ACCORDING TO OUR CLASSIFIER Class: Parts & Related Objects. Attributes: antlers, leg, carcass, head, eyes, skin, body, blood, track, neck, horns, flesh, meat, legs, hide, loin, chest, throat, tongue, heart, horn, coat, trail, tail, bones, ears, scent, home, coverts, nose, feet, shoulder, stomach, foot, sight, rack, skull, hair, intestines, necks, line, brain, belly, tendons, step, heads, mind, entrails, skins, hooves, cells, bell, cavity, picture, testicles, photo, forehead, genitals, knees, innards, rump, butt, fur, face, shank, brains, image, ear, statue, path, corpse, jaw, bladder, muzzle, calf, hoofs, abdomen, hill, quarters, shit, senses, paths, wound, stream, feces, varieties, protector, hoof, pics, nostrils, portrait, liver, flanks, pen, forest, vitals, side, hips, garden, food, muscle, muscles, guts, droppings, bodies, veil, footprints, wounds, hearts, homes, teeth, underside, breast, turn, haunches, forelegs, brow, hip, figure, torso, village, spring, chin, tails, organs, enclosure, lips, hindquarters, valley, cave, flank, figures, tissues, spot, insides, dick, backs, skeleton, bark

VISUAL EVALUATION: QUALITIES Class: Quality. Attributes: death, age, beauty, sex, form, sense, reaction, cry, terror, curiosity, health, fleetness, appetite, survival, condition, thirst, life, ways, swiftness, plight, whereabouts, fate, grace, gender, sickness, need, perspective, slaughter, capture, modesty, ecology, preservation, detriment, agility, heartbeat, greed, gentleness, behaviour, behavior, aggressiveness, screams, favor, predicament, genetics, honours, elegance, propensity, reactions, harvest, rescue, curse, mercy, gaze, sustainability, intelligence, lives, thoughts

QUANTITATIVE EVALUATION ATTRIBUTES –PROBLEM: can't compare against WordNet –Precision / recall against hand-annotated datasets –Human judges (ourselves): We used the classifiers to classify the top 20 features of 21 randomly chosen concepts We separately evaluated the results CATEGORIES: –Clustering of the balanced dataset –PROBLEM: The WordNet category structure is highly subjective
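Purity and entropy, the two clustering measures used in the quantitative evaluation, can be computed as follows. The two-cluster example over gold labels is a toy stand-in for the balanced dataset, and entropy here is the weighted average of per-cluster label entropies, one common formulation.

```python
import math

def purity(clusters):
    """Fraction of items falling in their cluster's majority gold class."""
    n = sum(len(c) for c in clusters)
    return sum(max(c.count(lbl) for lbl in set(c)) for c in clusters) / n

def entropy(clusters):
    """Size-weighted average of per-cluster label entropies (bits)."""
    n = sum(len(c) for c in clusters)
    total = 0.0
    for c in clusters:
        h = -sum((c.count(lbl) / len(c)) * math.log2(c.count(lbl) / len(c))
                 for lbl in set(c))
        total += (len(c) / n) * h
    return total

# toy clustering: cluster 1 is mostly ANIMAL, cluster 2 is pure FRUIT
clusters = [["ANIMAL"] * 3 + ["FRUIT"], ["FRUIT"] * 4]
```

Higher purity and lower entropy both indicate clusters that line up better with the gold categories.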

ATTRIBUTE CLASSIFICATION Correctly classified instances (accuracy): cross-validation 928 of 1155 (80.35%); human judge (AA) 244 of 365 (66.85%); human judge (MP) 182 of 260 (70.00%). Per-class precision / recall / F measures were reported for Activity, Part/Related-Object, Quality, Related-Agent, and Not-Attribute.

CLUSTERING WITH 2-WAY CLASSIFIER Purity, entropy, and clustered concepts were compared across three conditions: all candidate attributes (vector size 24,178), heuristic filtering (4,296), and filtering by classification (3,824).

CLUSTERING: ERROR ANALYSIS ANIMAL: bear, bull, camel, cat, cow, deer, dog, elephant, horse, kitten, lion, monkey, puppy, rat, sheep, tiger, turtle EDIBLE FRUIT: apple, banana, berry, cherry, fig, grape, kiwi, lemon, lime, mango, melon, olive, orange, peach, pear, pineapple, strawberry, watermelon, (pistachio, oyster) ILLNESS: acne, anthrax, arthritis, asthma, cancer, cholera, cirrhosis, diabetes, eczema, flu, glaucoma, hepatitis, leukemia, malnutrition, meningitis, plague, rheumatism, smallpox, (superego, lumbago, neuralgia, sciatica, gestation, menopause, quaternary, pain) IN WORDNET: PAIN

LIMITS OF THIS TYPE OF EVALUATION No way of telling how complete / accurate our concept descriptions are –Both in terms of relations and in terms of their relative importance No way of telling whether the category distinctions we get from WordNet are empirically founded

BEYOND JUDGES / EVALUATION AGAINST WORDNET Task-based evaluation Evidence from other areas of cognitive science (ESSLLI 2008 Workshop - Baroni / Evert / Lenci)

TASK-BASED (BLACK-BOX) EVALUATION Tasks requiring lexical knowledge: –Lexical tests: TOEFL test (Rapp 2001, Turney 2005) –NLP tasks: E.g., anaphora resolution (Poesio et al 2004) –Actual applications: E.g., language models (Mitchell & Lapata ACL 2009, Lapata invited talk)
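A TOEFL-style synonym item reduces to a nearest-neighbour test under cosine similarity: given a probe word, the model must pick the candidate whose vector is closest. The four hand-made vectors below are illustrative assumptions, not real corpus counts.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def toefl_choice(probe, candidates, vectors):
    """Pick the candidate whose vector is closest to the probe's."""
    return max(candidates, key=lambda w: cosine(vectors[probe], vectors[w]))

# toy vectors over three imaginary context dimensions
vectors = {
    "enormous": [9, 1, 0],
    "huge":     [8, 2, 0],
    "tiny":     [1, 9, 0],
    "wooden":   [0, 1, 9],
}
```

Accuracy on a batch of such items gives a single black-box score for a vector space model.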

EVIDENCE FROM OTHER AREAS OF COGNITIVE SCIENCE Attributes: evidence from psychology –Association lists (priming) E.g., use results of association tests to evaluate proximity (Lund et al, 1995; Pado and Lapata, 2008) Comparison against feature norms (Schulte im Walde, 2008) –Feature norms Category distinctions: evidence from neural science

OUTLINE Vector space representations An example of 'semantic-based' vector space model How to evaluate such models Attribute extraction and Feature norms Category distinctions and Brain data

FEATURE-BASED REPRESENTATIONS IN PSYCHOLOGY Feature-based concept representations assumed by many cognitive psychology theories (Smith and Medin, 1981; McRae et al, 1997) Underpin development of prototype theory (Rosch et al) Used, e.g., to account for semantic priming (McRae et al, 1997; Plaut, 1995) Underlie much work on category-specific deficits (Warrington and Shallice, 1984; Caramazza and Shelton, 1998; Tyler et al, 2000; Vinson and Vigliocco, 2004)

FEATURE NORMS Subjects produce lists of features for a concept Features weighted by the number of subjects that produce them Several norms exist (Rosch and Mervis, Garrard et al, McRae et al, Vinson and Vigliocco) Substantial differences in collection methodology and results

SPEAKER-GENERATED FEATURES (VINSON AND VIGLIOCCO)

COMPARING CORPUS FEATURES WITH FEATURE NORMS (Almuhareb et al 2005, Poesio et al 2007) 35 concepts in common between the Almuhareb & Poesio dataset and the dataset produced by Vinson and Vigliocco (2002, 2003) –ANIMALS: bear, camel, cat, cow, dog, elephant, horse, lion, mouse, sheep, tiger, zebra –FRUIT: apple, banana, cherry, grape, lemon, orange, peach, pear, pineapple, strawberry, watermelon –VEHICLE: airplane, bicycle, boat, car, helicopter, motorcycle, ship, truck, van We compared the features we obtained for these concepts with the speaker-generated features collected by Vinson and Vigliocco

RESULTS Best recall: ~52% (using all attributes and values) Best precision: ~19% But: high correlation (ρ = .777) between the distances between concept representations obtained from corpora and the distances between the representations for the same concepts obtained from subjects (using the cosine as a measure of similarity)
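A sketch of the evaluation logic behind that correlation figure: compute the pairwise cosine distances between concepts in the corpus-derived space and in the norm-derived space, then rank-correlate the two distance lists. Everything here (the four concepts, the 2-d vectors, the tie-naive Spearman) is an illustrative assumption, not the actual data or tooling.

```python
import math

def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return 1 - dot / (math.sqrt(sum(a * a for a in u)) *
                      math.sqrt(sum(b * b for b in v)))

def pairwise_distances(space, concepts):
    """All unordered concept-pair distances, in a fixed order."""
    return [cosine_distance(space[a], space[b])
            for i, a in enumerate(concepts) for b in concepts[i + 1:]]

def spearman(xs, ys):
    """Spearman rank correlation (no tie averaging, for simplicity)."""
    def ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0.0] * len(vals)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

concepts = ["dog", "cat", "apple", "pear"]
corpus_space = {"dog": [5, 1], "cat": [4, 2], "apple": [1, 5], "pear": [2, 4]}
norm_space   = {"dog": [6, 0], "cat": [5, 1], "apple": [0, 6], "pear": [1, 5]}
rho = spearman(pairwise_distances(corpus_space, concepts),
               pairwise_distances(norm_space, concepts))
```

Even when feature-by-feature overlap is poor, the two spaces can still place the same concept pairs at similar relative distances, which is what the rank correlation captures.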

DISCUSSION –Substantial differences in features and overlap, but correlation similar –Problems: Each feature norm is slightly different They have been normalized by hand: LOUD, NOISY, NOISE all mapped to LOUD

AN EXAMPLE: STRAWBERRY Speaker-generated features (with frequency) and matching features collected using our text patterns:
red (20) → red (5), colour (5), color (1)
fruit (18) → fruit (5)
sweet (13) → sweetness (8)
has seeds (12) → seeds (6), seed (2)
grows (10) → growth (1), ripening (10)
small (6) → size (19)
taste (6) → taste (6), flavor (6), flavour (2)
food (5) → nutrition (1)
from garden (5) → cultivation (7), harvest (6), harvester (2)
juice (5) → juice (10), juices (3)
dessert (3) → sweetness (8)
eat (3) → nutrition (1)

Problem: differences between feature norms motorcycle –Vinson & Vigliocco: wheel, motor, loud, vehicle, wheel, fast, handle, ride, transport, bike, human, danger, noise, seat, brake, drive, fun, gas, machine, object, open, small, travel, wind –Garrard et al: vehicle, wheel, fast, handlebar, light, seat, make a noise, tank, metal, unstable, tyre, coloured, sidecar, indicator, pannier, pedal, speedometer, manoeuvrable, race, brakes, stop, move, engine, petrol, economical, gears –McRae et al: wheels, 2_wheels, dangerous, engine, fast, helmets, Harley_Davidson, loud, 1_or_2_people, vehicle, leather, transportation, 2_people, fun, Hell's_Angels, gasoline –Mutual correlation of ranks ranges from 0.4 to 0.7

FEATURE NORMS (GARRARD ET AL 2001)

DISCUSSION –Preliminary conclusions: need to collect new feature norms for CL E.g., use similar techniques to collect attributes for WordNet See Kremer & Baroni 2008 –For more work on using feature norms for conceptual acquisition, see Schulte im Walde 2008, Baroni et al to appear –For the correlation between feature norms and information in WordNet (meronymy, isa, plus info from glosses): Barbu & Poesio GWC 2008

OUTLINE Vector space representations An example of 'semantic-based' vector space model How to evaluate such models Attribute extraction and Feature norms Category distinctions and brain data

USING BRAIN DATA TO IDENTIFY CATEGORY DISTINCTIONS Studies of brain-damaged patients have been shown to provide useful insights into the organization of conceptual knowledge in the brain –Warrington and Shallice 1984, Caramazza & Shelton 1998 fMRI has been used to identify these distinctions in healthy participants as well –E.g., Martin & Chao See, e.g., Capitani et al 2003 for a survey

CATEGORY DISTINCTIONS IN THE BRAIN: ANIMALS vs. TOOLS

CORPUS DATA AND BRAIN DATA Can brain data (from healthy participants) be used to get an objective picture of categorical distinctions in the brain? Can our findings be useful to understand the neurological results better? Ongoing project: using EEG and fMRI to identify such distinctions

EEG Spectral Analysis of Concepts Participants presented with aural or visual concept stimuli EEG apparatus records electrical activity on the scalp Waveforms can be reduced to frequency components
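Reducing a waveform to frequency components is a discrete Fourier transform. The sketch below recovers the dominant frequency of a synthetic 10 Hz oscillation; the sampling rate and signal are made-up stand-ins for real EEG recordings.

```python
import numpy as np

fs = 250                       # sampling rate in Hz (assumed)
t = np.arange(fs) / fs         # one second of signal
# synthetic 10 Hz oscillation plus a little noise
sig = (np.sin(2 * np.pi * 10 * t)
       + 0.1 * np.random.default_rng(1).standard_normal(fs))

power = np.abs(np.fft.rfft(sig)) ** 2       # power per frequency bin
freqs = np.fft.rfftfreq(len(sig), d=1 / fs)  # bin centres in Hz
dominant = freqs[np.argmax(power)]
```

The power in bands like the 3-17 Hz window used in the experiment below is obtained by summing such bins over the band.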

EEG vs. fMRI

EEG pros and cons Pros: –Lighter –Cheaper –Better temporal resolution (ms) Cons: –Coarser spatial resolution (cm) –Noisy (e.g., very sensitive to skull depth)

A CATEGORY DISTINCTION EXPERIMENT WITH EEG
Murphy, Poesio, Bovolo, DalPonte & Bruzzone, CogSci 2008
Seven Italian native speakers
Image stimuli only:
–30 tools
–30 animals
Each stimulus presented six times
Optimal time / frequency window identified automatically
– ms
–3-17 Hz

Stimuli: Images from Web

Data analysis
Preprocessing
–Artefact removal
Feature extraction
–CSSD (Common Spatial Subspace Decomposition): a form of supervised component analysis
Classification
–Using SVMs
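The feature-extraction and classification steps can be sketched as below. CSSD itself is not a standard library routine, so this toy version substitutes a simple class-dependent-variance construction for the spatial filtering step and keeps the slide's log-variance features and SVM classifier; all array shapes, trial counts, and variances are illustrative, not the study's actual parameters.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Toy stand-in for preprocessed EEG trials: 60 trials x 4 components x 128 samples
# (in the real pipeline the components would come from CSSD)
n_trials = 60
labels = np.repeat([0, 1], n_trials // 2)  # 0 = tool, 1 = animal

# Simulate components whose variance depends on the stimulus category,
# which is exactly the structure CSSD is designed to expose
scales = np.where(labels[:, None] == 0, [2.0, 0.5, 1.0, 1.0], [0.5, 2.0, 1.0, 1.0])
trials = rng.standard_normal((n_trials, 4, 128)) * scales[:, :, None]

# Feature vector per trial: log-variance of each component,
# mirroring the var(tool) / var(animal) features in the schematic
features = np.log(trials.var(axis=2))

# Classify with an SVM, as on the slide
clf = SVC(kernel="linear")
acc = cross_val_score(clf, features, labels, cv=5).mean()
print(acc)
```

On this synthetic data the classes are well separated by component variance, so accuracy is near ceiling; the reported experimental accuracies (next slides) are of course far lower.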

EEG SIGNALS: TIME-FREQUENCY (PER CHANNEL)

Extraction of features (EEGOXELS) from EEG data

Data analysis: Classification System Schematic
64 channels of preprocessed data → filter by time, frequency and electrode (X channels of filtered data) → CSSD decomposition (tool component, animal component) → vector transform (feature vector: var(tool), var(animal)) → support vector machine → answer

RESULTS

Classification results: Animals vs tools

Participant    Visual    Auditory
A              74.6%     88.2%
B              72.4%     65.4%
C              82.6%     92.7%
D              81.8%     77.7%

RESULTS: across participants

Representation of categories in CSSD spaces
Component analysis identifies 2-dimensional spaces
Analysis of these spaces may provide useful data against which to compare our corpus models

CSSD-derived conceptual spaces

BRAIN DATA AND CORPUS DATA
What is the relation between the conceptual spaces induced from corpora and the conceptual spaces elicited using EEG?

PREDICTING BRAIN (fMRI) ACTIVATION USING CONCEPT DESCRIPTIONS
T. Mitchell, S. Shinkareva, A. Carlson, K. Chang, V. Malave, R. Mason and M. Just. Predicting human brain activity associated with the meanings of nouns. Science 320, 1191–1195, 2008

MITCHELL ET AL 2008: METHODS
Record fMRI activation for 60 nominal concepts
–And extract the 200 best features, or voxels
Build conceptual descriptions for these concepts from corpora (the Web)
–25 features for each concept
–25 verbs expressing typical properties of living things / tools
–Collect the strength of association between these features and each concept
Learn the association between each voxel and the 25 verbal features using 58 concepts
Use the learned model to predict the activation for the 2 held-out concepts (compared using Euclidean distance)
–Accuracy: 77%
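The leave-two-out evaluation described above can be sketched on synthetic data: learn a linear map from the 25 corpus features to the voxels on 58 concepts, predict the two held-out activation images, and score a pair as correct when Euclidean distance matches each prediction to the right image rather than the swapped one. The data here are randomly generated with an assumed linear structure, and only a subsample of pairs is evaluated for speed; it illustrates the procedure, not the paper's results.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)

# Toy stand-in for the setup: 60 concepts, 25 verb-feature strengths,
# 200 selected voxels, with an assumed linear feature-to-voxel mapping
n_concepts, n_feats, n_voxels = 60, 25, 200
X = rng.standard_normal((n_concepts, n_feats))
true_W = rng.standard_normal((n_feats, n_voxels))
Y = X @ true_W + 0.1 * rng.standard_normal((n_concepts, n_voxels))

pairs = list(combinations(range(n_concepts), 2))[:50]  # subsample for speed
correct = 0
for i, j in pairs:
    train = [k for k in range(n_concepts) if k not in (i, j)]
    # Learn the association between voxels and the 25 features on 58 concepts
    W, *_ = np.linalg.lstsq(X[train], Y[train], rcond=None)
    pred_i, pred_j = X[i] @ W, X[j] @ W
    # Match predictions to the held-out images by Euclidean distance:
    # correct if the true pairing is closer than the swapped pairing
    right = np.linalg.norm(pred_i - Y[i]) + np.linalg.norm(pred_j - Y[j])
    wrong = np.linalg.norm(pred_i - Y[j]) + np.linalg.norm(pred_j - Y[i])
    correct += right < wrong
print(correct / len(pairs))  # pairwise matching accuracy
```

Chance on this task is 50%; Mitchell et al.'s 77% on real fMRI data is measured against that baseline.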

MITCHELL ET AL 2008

MITCHELL ET AL 2008: VERB FEATURES

MITCHELL ET AL: LEARNING ASSOCIATIONS

OUR EXPERIMENTS
Replicate the Mitchell et al study using EEG data instead of fMRI
–Different feature selection mechanisms
Compare different methods for building concept descriptions
–In addition to hand-picked features, also a variety of standard corpus models
For Italian
B. Murphy, M. Baroni, M. Poesio. EEG responds to conceptual stimuli and corpus semantics. EMNLP 2009

RESULTS USING THE HAND-PICKED FEATURES

RESULTS USING AUTOMATICALLY SELECTED FEATURES
[Figure: result panels, labeled MITCHELL ET AL and AA-MP]

COMPARISON BETWEEN CORPUS MODELS

RECAP
We need to relate the evidence from corpora with evidence about concepts coming from empirical work in neuroscience and psychology
Feature norm databases could be used to evaluate attribute extraction
–But: we need to find better ways of collecting them
Brain data may give us information about the real conceptual categories
–Results still preliminary

COLLABORATORS
ABDULRAHMAN ALMUHAREB (Essex PhD 2006, now at KACST, Saudi Arabia)
HEBA LAKANY (formerly Essex, now Strathclyde)
BRIAN MURPHY (Trento)
EDUARD BARBU (Trento PhD forthcoming)
MARCO BARONI (Trento)

THANKS
To the audience…
And to Galja, Ruslan & the other organizers for yet another splendid RANLP!