Sense clusters versus sense relations
Irina Chugur, Julio Gonzalo
UNED (Spain)

Sense clusters vs. sense relations

Arguments for sense clustering:
– Subtle distinctions produce noise in applications
– WordNet is too fine-grained
– Remove predictable sense extensions?

But...
– Clusters are not absolute (e.g., are metaphors close? In IR/MT, not really!)
– Better: use them to infer and study systematic polysemy
– Polysemy relations are more informative and predictive
– WordNet's rich sense distinctions permit empirical/quantitative studies of polysemy phenomena
– Annotation of semantic relations in 1000 WordNet nouns

Sense distinctions for IR

Helpful distinctions:
Spring
– Season
– Fountain
– Metal device
– To jump
Bank
– River bank
– Seat
– Financial institution

Useless distinctions:
Bet
– Act of gambling
– Money risked on a gamble
– To gamble
Bother
– Something or someone who causes trouble, a source of unhappiness
– An angry disturbance

1) Cluster evidence from Semcor

Hypothesis: if two senses tend to co-occur in the same documents, they are not good IR discriminators.
Criterion: cluster senses that co-occur frequently in the IR-Semcor collection.
Example: fact 1 and fact 2 co-occur in 13 out of 171 documents.
– Fact 1: a piece of information about circumstances that exist or events that have occurred
– Fact 2: a statement or assertion of verified information about something that is the case or has happened

Clusters from Semcor: results

Positive clusters: 507 (630 sense pairs)
– Threshold: #docs ≥ 2 with similar distribution of senses
– Precision: 70% (directly related to threshold)
Negative clusters: 530
– Threshold: #sense occurrences ≥ 8
– Precision: 80%
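A minimal sketch of this co-occurrence criterion, assuming the sense-tagged collection is represented as a mapping from document ids to the set of (lemma, sense) pairs occurring in each document; the data layout and threshold handling are illustrative, not the original implementation:

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence_clusters(doc_senses, min_docs=2):
    """Propose sense clusters: pairs of senses of the same word that
    co-occur in at least `min_docs` documents.

    doc_senses: dict mapping doc_id -> set of (lemma, sense_id) pairs
    observed in that document (a sense-tagged collection such as Semcor).
    """
    # Count, for every pair of senses of the same lemma, in how many
    # documents both senses appear.
    pair_docs = defaultdict(int)
    for senses in doc_senses.values():
        by_lemma = defaultdict(set)
        for lemma, sense_id in senses:
            by_lemma[lemma].add(sense_id)
        for lemma, sense_ids in by_lemma.items():
            for s1, s2 in combinations(sorted(sense_ids), 2):
                pair_docs[(lemma, s1, s2)] += 1

    # Senses that co-occur often are poor IR discriminators: cluster them.
    return {pair for pair, n in pair_docs.items() if n >= min_docs}

# Toy example: 'fact' senses 1 and 2 co-occur in two documents.
docs = {
    "doc1": {("fact", 1), ("fact", 2), ("bank", 1)},
    "doc2": {("fact", 1), ("fact", 2)},
    "doc3": {("bank", 1), ("bank", 2)},
}
print(cooccurrence_clusters(docs, min_docs=2))  # {('fact', 1, 2)}
```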

2) Cluster evidence from parallel polysemy

English:
– band 1: instrumentalists not including string players
– band 2: a group of musicians playing popular music for dancing
Both senses are lexicalized by the same word in other languages:
– Spanish: orquesta (sense 4)
– French: groupe (senses 9 and 6)
– German: Band (sense 2)

Parallel polysemy in EuroWordNet

English                     Spanish                French              German
{child, kid}                {niño, crío, menor}    {enfant, mineur}    {Kind}
{male child, boy, child}    {niño}                 {enfant}            {Kind, Spross}
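A sketch of the parallel-polysemy criterion in the same spirit: two senses of an English word are cluster candidates if they are lexicalized by the same word in other languages. The nested-dictionary representation stands in for the EuroWordNet cross-lingual mappings and is purely illustrative:

```python
def parallel_polysemy(translations, min_languages=2):
    """Cluster two senses of a word if they share a lexicalization
    in at least `min_languages` other languages.

    translations: dict mapping sense_id -> {language: set of translation lemmas}
    for the senses of a single English word.
    """
    sense_ids = sorted(translations)
    clusters = []
    for i, s1 in enumerate(sense_ids):
        for s2 in sense_ids[i + 1:]:
            shared = [
                lang
                for lang in translations[s1]
                if translations[s1][lang] & translations[s2].get(lang, set())
            ]
            if len(shared) >= min_languages:
                clusters.append((s1, s2, shared))
    return clusters

# 'band' senses 1 and 2 share a lexicalization in Spanish, French and German.
band = {
    1: {"es": {"orquesta"}, "fr": {"groupe"}, "de": {"Band"}},
    2: {"es": {"orquesta", "banda"}, "fr": {"groupe"}, "de": {"Band"}},
}
print(parallel_polysemy(band))  # [(1, 2, ['es', 'fr', 'de'])]
```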

Comparison of clustering criteria

Clusters vs. semantic relations

Polysemy relations are more predictive!

Characterization of sense inventories for WSD

Given two senses of a word:
– How are they related? (polysemy relations)
– How closely? (sense proximity)
– In which applications should they be distinguished?
Given an individual sense of a word:
– Should it be split into subsenses? (sense stability)

Cross-linguistic evidence

Example for the word "fine":
"Mountains on the other side of the valley rose from the mist like islands, and here and there flecks of cloud as pale and fine as sea-spray, trailed across their sombre, wooded slopes."
TRANSLATION: * *

Sense proximity (Resnik & Yarowsky)

$$P_L(\text{same lexicalization} \mid w_i, w_j) = \frac{1}{|w_i|\,|w_j|} \sum_{x \in \{w_i\ \mathrm{examples}\}} \; \sum_{y \in \{w_j\ \mathrm{examples}\}} \mathbf{1}\left[\, tr_L(x) = tr_L(y) \,\right]$$

$$\mathrm{Proximity}(w_i, w_j) = \frac{1}{|\mathrm{languages}|} \sum_{L \in \mathrm{languages}} P_L(\text{same lexicalization} \mid w_i, w_j)$$

where $|w_i|$ and $|w_j|$ are the numbers of tagged examples of each sense and $tr_L(x)$ is the lexicalization chosen for the target word of example $x$ when translated into language $L$.
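The proximity measure can be computed directly from the translations informants give for each tagged example. A minimal sketch, assuming translations are stored per language as a dictionary from example id to the lexicalization chosen for the target word (the data layout is hypothetical):

```python
def sense_proximity(examples_i, examples_j, translations):
    """Resnik & Yarowsky-style proximity of two senses w_i and w_j.

    examples_i, examples_j: lists of example ids tagged with each sense.
    translations: dict language -> dict example_id -> translation of the
    target word in that example.
    """
    proximity = 0.0
    for lang_tr in translations.values():
        same = sum(
            1
            for x in examples_i
            for y in examples_j
            if lang_tr[x] == lang_tr[y]
        )
        # P_L(same lexicalization | w_i, w_j)
        proximity += same / (len(examples_i) * len(examples_j))
    return proximity / len(translations)

# Toy example: two distant senses of 'fine' (delicate vs. penalty), two languages.
tr = {
    "es": {"e1": "fino", "e2": "fino", "e3": "multa"},
    "ru": {"e1": "тонкий", "e2": "тонкий", "e3": "штраф"},
}
print(sense_proximity(["e1", "e2"], ["e3"], tr))  # 0.0
```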

Sense stability

$$\mathrm{Stability}(w_i) = \frac{1}{|\mathrm{languages}|} \sum_{L \in \mathrm{languages}} \frac{1}{|\{x, y \in \{w_i\ \mathrm{examples}\},\ x \neq y\}|} \sum_{\substack{x, y \in \{w_i\ \mathrm{examples}\} \\ x \neq y}} \mathbf{1}\left[\, tr_L(x) = tr_L(y) \,\right]$$
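Stability can be sketched in the same way: for a single sense, count how often two distinct examples of that sense receive the same translation, averaged over languages (same hypothetical data layout as above):

```python
from itertools import permutations

def sense_stability(examples, translations):
    """Stability of a single sense w_i: average, over languages, of the
    proportion of pairs of distinct examples of the sense that are
    translated by the same word.

    examples: list of example ids tagged with the sense.
    translations: dict language -> dict example_id -> translation.
    """
    pairs = list(permutations(examples, 2))  # all ordered pairs with x != y
    if not pairs:
        return 1.0
    stability = 0.0
    for lang_tr in translations.values():
        same = sum(1 for x, y in pairs if lang_tr[x] == lang_tr[y])
        stability += same / len(pairs)
    return stability / len(translations)

# Toy example: three examples of one sense of 'fine', two languages.
tr = {
    "es": {"e1": "fino", "e2": "fino", "e3": "delicado"},
    "ru": {"e1": "тонкий", "e2": "тонкий", "e3": "тонкий"},
}
print(sense_stability(["e1", "e2", "e3"], tr))  # (2/6 + 6/6) / 2 ≈ 0.67
```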

Experiment design

Main set:
– 44 Senseval-2 words (nouns and adjectives)
– 182 senses
– 508 examples
– 11 native/bilingual speakers of 4 languages: Bulgarian, Russian, Spanish, Urdu
(Control set: 12 languages, 5 families, 28 subjects)

Results: distribution of proximity indexes

Average proximity = 0.29: same as Hector in Senseval-1!

Results: distribution of stability indexes

Average stability = 0.80

distribution of homonyms

distribution of metaphors

distribution of metonymy

Average proximity: target in source 0.64, source in target 0.37

Systematic polysemy → sense proximity

Onion:
1. Bulbous plant (kind of alliaceous plant)
2. Pungent bulb (kind of vegetable)
3. Edible bulb of an onion plant (kind of bulb)
→ plant / food pattern

Systematic polysemy → IR cluster? Positive and negative rules?
Other patterns: container / quantity, music / dance, animal / food, language / people
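One way to operationalize such positive/negative rules is to look at the WordNet lexicographer files of the two senses and check the pair against hand-written pattern lists. A sketch using NLTK's WordNet interface; the rule lists below are illustrative examples, not the inventory used in this work:

```python
from itertools import combinations
from nltk.corpus import wordnet as wn  # requires: nltk.download('wordnet')

# Illustrative rule lists over pairs of WordNet lexicographer files.
POSITIVE_RULES = {
    frozenset({"noun.plant", "noun.food"}),        # plant / food
    frozenset({"noun.quantity", "noun.artifact"}), # container / quantity
    frozenset({"noun.animal", "noun.food"}),       # animal / food
}
NEGATIVE_RULES = {
    frozenset({"noun.communication", "noun.group"}),  # e.g. language / people
}

def systematic_polysemy_vote(synset_a, synset_b):
    """Return 'cluster', 'keep apart' or 'unknown' for a sense pair,
    based only on the lexicographer files of the two synsets."""
    pair = frozenset({synset_a.lexname(), synset_b.lexname()})
    if pair in POSITIVE_RULES:
        return "cluster"
    if pair in NEGATIVE_RULES:
        return "keep apart"
    return "unknown"

# Toy check on the noun senses of 'onion' (plant vs. food senses).
onion = wn.synsets("onion", pos=wn.NOUN)
print([(s.name(), s.lexname()) for s in onion])
for a, b in combinations(onion, 2):
    print(a.name(), b.name(), "->", systematic_polysemy_vote(a, b))
```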

distribution of specialization/generalization

Annotation of 1000 WordNet nouns

Relation         % sense pairs
Homonymy         41.2
Metonymy         32.5
Metaphor         13.0
Specialization    7.7
Generalization    1.7
Equivalence       3.1
Fuzzy             0.8

Need for cluster here!

Typology of sense relations

Homonymy, Metonymy, Metaphor, Specialization, Generalization, Equivalence, Fuzzy

Typology of sense relations: metonymy

Target in source: Animal-meat, Animal-fur, Tree-wood, Object-color, Plant-fruit, People-language, Action-duration, Recipient-quantity, ...
Source in target: Action-object, Action-result, Shape-object, Plant-food/beverage, Material-product, ...
Co-metonymy: Substance-agent

Typology of sense relations: metaphor (182)

Action/state/entity (source domain) → action/state/entity (target domain):
– object → object / person (47)
– person → person (21)
– physical action → abstract action (16)
– physical property → abstract property (11)
– animal → person (10)
– ...

Person → person metaphors:
– Source: historical, mythological, biblical character...; profession, occupation, position...
– Target: prototype person
– e.g. Adonis (Greek mythology / handsome)

Conclusions

Let's annotate semantic relations between WordNet word senses!