WordNet: Connecting words and concepts Christiane Fellbaum Cognitive Science Laboratory Princeton University.

What is WordNet?
-- A large lexical database, or “electronic dictionary”
-- Covers most English nouns, verbs, adjectives, and adverbs
-- Electronic format makes it amenable to automatic manipulation
-- Used in many applications (document retrieval and sorting, machine translation, ...)

What’s so special about WordNet?
-- Traditional paper dictionaries are organized alphabetically, so words that are grouped together (on the same page) are unrelated
-- WordNet is organized by meaning, so words in close proximity are related

What’s so special...?
-- Users can browse WordNet and find words related to their queries (as in a thesaurus)

Basic Design of WordNet
-- WordNet entries are word-concept mappings
-- Natural languages map many-to-many
-- One concept can be expressed by many words (synonymy):
   {car, auto, automobile}
   {close, shut}

Basic Design of WordNet
-- One word can express many concepts (polysemy):
   {club, stick}
   {club, nightclub}
   {club, playing card}

Basic Design of WordNet
-- Added problem in natural language: the words we use most frequently are the most polysemous (have the most meanings)!

Basic Design of WordNet
-- WordNet harnesses synonymy and polysemy
-- Represents words and concepts unambiguously
-- Meaningfully relates words and concepts

Basic Design of WordNet
-- WordNet’s building blocks: sets of synonyms (synsets)
   {hit, beat}
   {big, large}
   {queue, line}
-- Each synset expresses a distinct concept
-- Currently, WordNet contains approximately 117,000 synsets

Basic Design of WordNet
WordNet stores, and allows one to retrieve,
-- all concepts that a given word can express
-- all words that express a given concept
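
The two-way retrieval described above can be sketched with a pair of lookup tables. This is only a toy illustration, not WordNet's actual storage format: the synset identifiers (`car.1`, `club.2`, ...) and the function names are hypothetical, and the data is limited to the example synsets from the slides.

```python
from collections import defaultdict

# Toy synsets copied from the slides. The IDs on the left are
# hypothetical labels, not real WordNet identifiers.
SYNSETS = {
    "car.1": {"car", "auto", "automobile"},
    "close.1": {"close", "shut"},
    "club.1": {"club", "stick"},
    "club.2": {"club", "nightclub"},
    "club.3": {"club", "playing card"},
}

# Invert the concept -> words table to obtain word -> concepts.
WORD_INDEX = defaultdict(set)
for synset_id, words in SYNSETS.items():
    for word in words:
        WORD_INDEX[word].add(synset_id)


def concepts_for(word):
    """All concepts (synset IDs) that a given word can express."""
    return sorted(WORD_INDEX.get(word, set()))


def words_for(synset_id):
    """All words that express a given concept."""
    return sorted(SYNSETS.get(synset_id, set()))


print(concepts_for("club"))
print(words_for("car.1"))
```

A polysemous word such as "club" maps to several synset IDs, while a synset such as `car.1` maps back to several synonymous words, which is the many-to-many mapping in miniature.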

But wait--there’s more!
-- Words and synsets are connected via meaning-based relations
-- Result: a large semantic network (as opposed to a flat list in a paper dictionary)

Relations among WN noun synsets
-- Hyperonymy/hyponymy relates super/subordinate synsets (denoting more/less general concepts):

                 {vehicle}
                /         \
   {car, automobile}   {bicycle, bike}
      /        \              \
{convertible}  {SUV}     {mountain bike}

-- Transitivity:
   A car is a kind of vehicle
   An SUV is a kind of car
   => An SUV is a kind of vehicle
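
The transitivity of the "kind of" relation can be illustrated by following direct links upward. The sketch below is a simplification under stated assumptions: each concept is given exactly one superordinate (real WordNet synsets may have more than one), synsets are abbreviated to a single word, and the names `HYPERNYM`, `hypernym_chain`, and `is_kind_of` are made up for this example.

```python
# Direct "is a kind of" links from the tree on the slide.
# Assumption: one superordinate per concept, for simplicity.
HYPERNYM = {
    "car": "vehicle",
    "bicycle": "vehicle",
    "convertible": "car",
    "SUV": "car",
    "mountain bike": "bicycle",
}


def hypernym_chain(concept):
    """Follow direct links upward and return every superordinate in order."""
    chain = []
    while concept in HYPERNYM:
        concept = HYPERNYM[concept]
        chain.append(concept)
    return chain


def is_kind_of(specific, general):
    """True if `general` is reachable from `specific` via hypernym links."""
    return general in hypernym_chain(specific)


print(hypernym_chain("SUV"))
print(is_kind_of("SUV", "vehicle"))
```

Only the direct links are stored; "An SUV is a kind of vehicle" is derived by walking the chain rather than being recorded explicitly.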

Relations among noun synsets
-- Meronymy/holonymy (part/whole):

      {car, automobile}
              |
          {engine}
          /      \
{spark plug}   {cylinder}

-- Inheritance:
   A car has an engine
   An engine has spark plugs
   => A car has spark plugs
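
Part/whole inheritance can be sketched the same way, this time walking downward from a whole to all of its direct and indirect parts. The table and the function name `all_parts` are hypothetical and cover only the slide's example.

```python
# Direct "has part" links from the slide (whole -> list of parts).
HAS_PART = {
    "car": ["engine"],
    "engine": ["spark plug", "cylinder"],
}


def all_parts(whole):
    """Collect direct and inherited parts of `whole`."""
    found = []
    stack = list(HAS_PART.get(whole, []))
    while stack:
        part = stack.pop()
        if part not in found:
            found.append(part)
            # Parts of a part are inherited by the whole.
            stack.extend(HAS_PART.get(part, []))
    return sorted(found)


print(all_parts("car"))
```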

Relations among verb synsets
-- Verbs denote events
-- Related by a “manner” relation:

     {communicate}
           |
        {talk}
        /    \
{stammer}   {whisper}

Relations among verb synsets
-- Semantics of events (verbs) are very different from semantics of entities (nouns)
-- WordNet captures this fact with different relations
-- Relations refer to temporal properties of events:
   -- partial and complete overlap of two events
   -- prior or posterior events

WordNet
-- Relations among synsets create an interconnected network
-- Different senses of polysemous words are members of distinct synsets that are related to different synsets (i.e., occupy different locations in the network), e.g.,
   {stock, broth} has superordinate synset {dish}
   {stock, breed} has superordinate {variety}
-- These different synsets are also linked to different part/whole synsets

WordNet
-- A word’s meaning can be defined in terms of its position in the network:
   club 1 is a kind of association/has members
   club 2 is a kind of stick
-- Relatedness between words or synsets can be quantified in terms of path length (number of connections among synsets)

WordNet
-- How closely related are {zebra} and {horse}?
   Very: both share the direct superordinate {equine}
-- What about {horse, sawhorse} and {horse, gymnastic horse}?
   Related, but less so: joint superordinate {artifact} is 4-5 levels up
-- What about {zebra} and {horse, gymnastic horse}?
   Unrelated: the trees containing them never intersect!
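
Path length through a shared superordinate can be computed with a few lines of code. The hierarchy below is an illustrative assumption: only {zebra}, {horse}, {equine}, and {artifact} come from the slide, while the intermediate nodes between the two artifact senses and {artifact} are placeholders chosen so that {artifact} sits 4 and 5 levels up. They are not claimed to match the real WordNet hierarchy.

```python
# Toy superordinate links. Intermediate artifact nodes are placeholders.
PARENT = {
    "zebra": "equine",
    "horse (animal)": "equine",
    "sawhorse": "framework",
    "framework": "supporting structure",
    "supporting structure": "structure",
    "structure": "artifact",
    "gymnastic horse": "gymnastic apparatus",
    "gymnastic apparatus": "sports equipment",
    "sports equipment": "equipment",
    "equipment": "instrumentality",
    "instrumentality": "artifact",
}


def ancestors_with_depth(node):
    """Map each node on the upward chain (including `node`) to its distance."""
    depths = {node: 0}
    steps = 0
    while node in PARENT:
        node = PARENT[node]
        steps += 1
        depths[node] = steps
    return depths


def path_length(a, b):
    """Number of links between a and b via their closest shared superordinate.

    Returns None when the two chains never meet.
    """
    up_a = ancestors_with_depth(a)
    up_b = ancestors_with_depth(b)
    shared = set(up_a) & set(up_b)
    if not shared:
        return None
    return min(up_a[n] + up_b[n] for n in shared)


print(path_length("zebra", "horse (animal)"))
print(path_length("sawhorse", "gymnastic horse"))
print(path_length("zebra", "gymnastic horse"))
```

In this toy network the zebra/horse pair is two links apart, the two artifact senses are nine links apart, and zebra and gymnastic horse have no connecting path at all, mirroring the three cases on the slide.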

WordNet for Word Sense Disambiguation
-- Word sense disambiguation (WSD) is a major problem in Natural Language Processing
-- Assumption: words in a context (phrase, sentence, discourse) are semantically related
-- So, horse in the neighborhood of zebra is likely to mean “equine”; in the neighborhood of gym it likely means “gymnastic horse.”

WordNet for WSD
-- If you want to disambiguate “horse” in the context of “zebra,” look for all WordNet paths from “zebra” to “horse.”
-- The shortest one is likely to give you the correct sense of “horse.”
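
The shortest-path heuristic can be sketched as a breadth-first search from each candidate sense to the context word. Everything below is a toy under stated assumptions: the graph holds only a handful of links, the link between "gym" and "gymnastic apparatus" is invented for illustration, sense labels such as `horse#animal` are hypothetical, and links are treated as undirected.

```python
from collections import deque

# Tiny undirected network. The "gym" link is an invented illustration.
EDGES = [
    ("zebra", "equine"),
    ("horse#animal", "equine"),
    ("horse#gymnastic", "gymnastic apparatus"),
    ("gymnastic apparatus", "gym"),
]

# Candidate senses per ambiguous word (hypothetical labels).
SENSES = {"horse": ["horse#animal", "horse#gymnastic"]}


def build_graph(edges):
    """Turn an edge list into an adjacency map, adding both directions."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path_length(graph, start, goal):
    """Breadth-first search; returns the link count or None if unreachable."""
    if start not in graph or goal not in graph:
        return None
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def disambiguate(word, context_node, graph):
    """Pick the sense of `word` with the shortest path to the context node."""
    best_sense, best_dist = None, None
    for sense in SENSES[word]:
        dist = shortest_path_length(graph, sense, context_node)
        if dist is not None and (best_dist is None or dist < best_dist):
            best_sense, best_dist = sense, dist
    return best_sense


GRAPH = build_graph(EDGES)
print(disambiguate("horse", "zebra", GRAPH))
print(disambiguate("horse", "gym", GRAPH))
```

Senses with no path to the context word are skipped, so the function returns None when the context gives no evidence either way.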

WordNet for WSD
-- Can take advantage of WordNet classes (trees of hierarchically related synsets), e.g.,
   run 1 co-occurs with nouns that are all hyponyms (subordinate, more specific concepts) of office (mayor, congresswoman, President, ...)
   run 2 co-occurs with nouns that are hyponyms of machine (computer, washer, printing press, engine, ...)
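
Class-based disambiguation of "run" can be sketched by climbing from a co-occurring noun until a known class is reached. This is a minimal sketch: the nouns are linked directly to their class for brevity (in a real hierarchy the chain would likely be longer), and the names `CLASS_TO_SENSE` and `sense_of_run` are hypothetical.

```python
# Toy hyponym -> superordinate links, using the nouns listed on the slide.
HYPERNYM = {
    "mayor": "office",
    "congresswoman": "office",
    "President": "office",
    "computer": "machine",
    "washer": "machine",
    "printing press": "machine",
    "engine": "machine",
}

# Which sense of "run" each class of co-occurring nouns points to.
CLASS_TO_SENSE = {"office": "run 1", "machine": "run 2"}


def sense_of_run(noun):
    """Climb the hierarchy from `noun` until a known class is found."""
    concept = noun
    while concept is not None:
        if concept in CLASS_TO_SENSE:
            return CLASS_TO_SENSE[concept]
        concept = HYPERNYM.get(concept)
    return None  # the noun belongs to neither class


print(sense_of_run("mayor"))
print(sense_of_run("printing press"))
```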

Topics/Domain in WordNet
-- Hierarchical organization leaves many related concepts unconnected
-- Solution: link synsets across “trees” in terms of their membership in a “domain” or topic
-- E.g., synsets {contraindication}, {surgery}, {physician}, ... are all linked to {medicine}, the concept that defines a domain or topic

Topics/Domain in WordNet
-- Customizable: user can define new topics
-- Topics can be as coarse- or fine-grained as desired
-- By using synsets as topic labels, the concepts subsumed under the new topic(s) will continue to be part of the network
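
Domain links and user-defined topics can be sketched as an extra table that sits alongside the hierarchy. The {medicine} members come from the slide; the "cycling" topic and the function names are hypothetical, added only to illustrate how a user might define a new topic.

```python
# Domain membership: synset -> set of topic synsets it is linked to.
DOMAIN_OF = {
    "contraindication": {"medicine"},
    "surgery": {"medicine"},
    "physician": {"medicine"},
}


def add_topic(topic, members):
    """Link each member synset to a (possibly new) topic synset."""
    for member in members:
        DOMAIN_OF.setdefault(member, set()).add(topic)


def members_of(topic):
    """All synsets linked to the given topic."""
    return sorted(s for s, topics in DOMAIN_OF.items() if topic in topics)


def share_domain(a, b):
    """True if the two synsets have at least one topic in common."""
    return bool(DOMAIN_OF.get(a, set()) & DOMAIN_OF.get(b, set()))


# Hypothetical user-defined topic.
add_topic("cycling", ["bicycle", "mountain bike"])

print(members_of("medicine"))
print(share_domain("surgery", "physician"))
```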

Current and Future Work
-- Increase density of WordNet
-- More links, new relations
-- E.g., “role” relation among nouns: distinguish {poodle}-{dog} (a “type” relation) from {poodle}-{pet} (a “role” relation)
   poodle is a type of dog, but not a type of pet
   poodle can (but need not) play the “role” of pet

Work just completed... (sponsored by ARDA/AQUAINT)
-- Manually link nouns, verbs, adjectives, adverbs in the definitions (“glosses”) to the appropriate synset:
   {bank (a financial institution that accepts deposits...)}
   {bank (sloping land..)}

Gloss Disambiguation
{bank (a financial institution that accepts deposits...)}
   {financial, fiscal}
   {institution, establishment}
   {institution, custom}
{bank (sloping land..)}
   {slope, incline}
   {land, ground, earth}
   {land, country}

Gloss Disambiguation: Results
-- A closed system linking glosses and synsets (and a more densely connected network)
-- Each gloss is more informative as it adds synset information for the words in the gloss
-- Glosses are examples of contexts for many word-sense pairs, telling us how words with specific senses are being used in context
-- Glosses can be used as training data for machine learning systems that want to “learn” to disambiguate words automatically
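
The idea of using disambiguated glosses as training data can be sketched as a flattening step that turns annotated glosses into (word, sense, context) records. The annotations below are an assumption: the slides list candidate synsets for the two "bank" glosses without marking the chosen one, so the links shown here reflect one plausible reading, and the record layout is invented for illustration.

```python
# Annotated glosses: headword -> list of (gloss word, linked synset).
# Assumed annotations; the slides do not mark which candidate was chosen.
ANNOTATED_GLOSSES = {
    "bank (financial institution)": [
        ("financial", "{financial, fiscal}"),
        ("institution", "{institution, establishment}"),
    ],
    "bank (sloping land)": [
        ("sloping", "{slope, incline}"),
        ("land", "{land, ground, earth}"),
    ],
}


def training_pairs(glosses):
    """Flatten annotated glosses into (word, synset, context) records."""
    pairs = []
    for headword, links in glosses.items():
        for word, synset in links:
            pairs.append((word, synset, headword))
    return pairs


for record in training_pairs(ANNOTATED_GLOSSES):
    print(record)
```

Each record pairs a word with the sense it carries in one specific context, which is the kind of labeled example a supervised disambiguation system would need.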

Where to find WordNet Freely downloadable: Database, browser, documentation

Global WordNet
-- Currently, wordnets exist for some 40 languages, including Arabic, Basque, Bulgarian, Estonian, Hebrew, Icelandic, Italian, Kannada, Latvian, Persian, Romanian, Sanskrit, Tamil, Thai, Turkish, ...

Thank you! For questions, comments, and papers: