SYMPOSIUM ON SEMANTICS IN SYSTEMS FOR TEXT PROCESSING
September 22-24, 2008 - Venice, Italy
Combining Knowledge-based Methods and Supervised Learning for Effective Word Sense Disambiguation
Pierpaolo Basile, Marco de Gemmis, Pasquale Lops and Giovanni Semeraro
Department of Computer Science, University of Bari (Italy)

Outline
- Word Sense Disambiguation (WSD)
- Knowledge-based methods
- Supervised methods
- Combined WSD strategy
- Evaluation
- Conclusions and Future Work

Word Sense Disambiguation
- Word Sense Disambiguation (WSD) is the problem of selecting a sense for a word from a set of predefined possibilities
- the sense inventory usually comes from a dictionary or thesaurus
- knowledge-intensive methods, supervised learning, and (sometimes) bootstrapping approaches

Knowledge-based Methods
- Use external knowledge sources
  - Thesauri
  - Machine Readable Dictionaries
- Exploiting
  - dictionary definitions
  - measures of semantic similarity
  - heuristic methods

Supervised Learning
- Exploits machine learning techniques to induce models of word usage from large text collections
- annotated corpora are tagged manually using semantic classes chosen from a sense inventory
- each sense-tagged occurrence of a particular word is transformed into a feature vector, which is then used in an automatic learning process

Problems & Motivation  Knowledge-based methods  outperformed by supervised methods  high coverage: applicable to all words in unrestricted text  Supervised methods  good precision  low coverage: applicable only to those words for which annotated corpora are available

Solution
- Combination of knowledge-based methods and supervised learning can improve WSD effectiveness
  - knowledge-based methods can improve coverage
  - supervised learning can improve precision
- WordNet-like dictionaries as sense inventory

JIGSAW
- Knowledge-based WSD algorithm
- Disambiguation of words in a text by exploiting WordNet senses
- Combination of three different strategies to disambiguate nouns, verbs, adjectives and adverbs
- Main motivation: the effectiveness of a WSD algorithm is strongly influenced by the POS-tag of the target word

JIGSAW_nouns
- Based on the Resnik algorithm for disambiguating noun groups
- Given a set of nouns N = {n_1, n_2, ..., n_n} from document d:
  - each n_i has an associated sense inventory S_i = {s_i1, s_i2, ..., s_ik} of possible senses
- Goal: assign each n_i the most appropriate sense s_ih ∈ S_i, maximizing the similarity of n_i with the other nouns in N

JIGSAW_nouns (example)
- N = [n_1, n_2, ..., n_n] = {cat, mouse, ..., bat}, each noun with its candidate senses [s_i1, s_i2, ..., s_ik]
- [Diagram: in the WordNet noun hierarchy, cat#1 (feline mammal) descends from feline/felid, carnivore and placental mammal, while mouse#1 (rodent) descends from rodent and placental mammal; their most specific subsumer (MSS) is "placental mammal", and similarity is computed with the Leacock-Chodorow measure]

JIGSAW_nouns (example, continued)
- [Diagram: MSS(cat#1, mouse#1) = placental mammal; since bat#1 is a hyponym of this MSS, the credit of bat#1 is increased]
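
The noun strategy can be pictured with a short sketch. This is only an illustrative simplification, not the authors' implementation: it uses NLTK's English WordNet instead of ItalWordNet, scores each candidate sense by its best Leacock-Chodorow similarity to the senses of the other nouns, and skips the MSS/hyponym credit and weighting steps shown on the slide. The function name disambiguate_nouns is made up for this example.

```python
from nltk.corpus import wordnet as wn

def disambiguate_nouns(nouns):
    """Assign each noun the sense most similar (LCH) to the other nouns' senses."""
    senses = {n: wn.synsets(n, pos=wn.NOUN) for n in nouns}
    chosen = {}
    for n in nouns:
        best_sense, best_score = None, float("-inf")
        for s in senses[n]:
            # credit of s: sum over the other nouns of the best pairwise similarity
            score = sum(
                max((s.lch_similarity(t) or 0.0) for t in senses[m])
                for m in nouns if m != n and senses[m]
            )
            if score > best_score:
                best_sense, best_score = s, score
        chosen[n] = best_sense
    return chosen

# Example from the slide: the animal senses of cat, mouse and bat win,
# since they share a close common subsumer (placental mammal).
print(disambiguate_nouns(["cat", "mouse", "bat"]))
```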

JIGSAW_verbs
- Tries to establish a relation between verbs and nouns (distinct IS-A hierarchies in WordNet)
- A verb w_i is disambiguated using:
  - nouns in the context C of w_i
  - nouns in the description (gloss + WordNet usage examples) of each candidate synset for w_i

JIGSAW_verbs
- For each candidate synset s_ik of w_i:
  - compute nouns(i, k): the set of nouns in the description of s_ik
  - for each w_j in C and each synset s_ik, compute the highest similarity max_jk
  - max_jk is the highest similarity value for w_j with respect to the nouns related to the k-th sense of w_i (using the Leacock-Chodorow measure)

JIGSAW_verbs (example)
- Sentence: "I play basketball and soccer" → w_i = play, C = {basketball, soccer}
- WordNet senses of play:
  1. (70) play -- (participate in games or sport; "We played hockey all afternoon"; "play cards"; "Pele played for the Brazilian teams in many important matches")
  2. (29) play -- (play on an instrument; "The band played all night long")
  3. ...
- nouns(play, 1): game, sport, hockey, afternoon, card, team, match
- nouns(play, 2): instrument, band, night
- nouns(play, 3): ...

JIGSAW_verbs (example, continued)
- nouns(play, 1): game, sport, hockey, afternoon, card, team, match
- [Diagram: each noun in nouns(play, 1) (game_1 ... game_k, sport_1 ... sport_m, ...) and each context noun (basketball_1 ... basketball_h, ...) is expanded into its senses; then MAX_basketball = max over w_i ∈ nouns(play, 1) of Sim(w_i, basketball)]
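
A hedged sketch of the verb strategy follows. It is simplified and not the authors' code: the score of each candidate synset of the verb is how similar the nouns in its gloss and usage examples are to the context nouns, using Leacock-Chodorow on the noun hierarchy. The names gloss_nouns, max_noun_sim and disambiguate_verb are illustrative, and NLTK's tagger and English WordNet stand in for the resources actually used.

```python
from nltk import pos_tag, word_tokenize
from nltk.corpus import wordnet as wn

def gloss_nouns(synset):
    """Nouns occurring in a synset's gloss and WordNet usage examples."""
    text = synset.definition() + " " + " ".join(synset.examples())
    return [w for w, tag in pos_tag(word_tokenize(text)) if tag.startswith("NN")]

def max_noun_sim(context_noun, gloss_noun):
    """Highest LCH similarity over all noun-sense pairs of the two words."""
    scores = [a.lch_similarity(b) or 0.0
              for a in wn.synsets(context_noun, pos=wn.NOUN)
              for b in wn.synsets(gloss_noun, pos=wn.NOUN)]
    return max(scores, default=0.0)

def disambiguate_verb(verb, context_nouns):
    """Pick the verb synset whose description nouns best match the context nouns."""
    def score(synset):
        nouns = gloss_nouns(synset)
        if not nouns:
            return 0.0
        # for each context noun keep only its best match (the max_jk of the slide)
        return sum(max(max_noun_sim(c, g) for g in nouns) for c in context_nouns)
    return max(wn.synsets(verb, pos=wn.VERB), key=score)

# "I play basketball and soccer" -> the sport sense of "play"
print(disambiguate_verb("play", ["basketball", "soccer"]))
```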

JIGSAW_others
- Based on the WSD algorithm proposed by Banerjee and Pedersen (inspired by Lesk)
- Idea: compute the overlap between the glosses of each candidate sense (including related synsets) of the target word and the glosses of all words in its context
  - assign the synset with the highest overlap score
  - if ties occur, the most common synset in WordNet is chosen
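
For comparison, here is a minimal gloss-overlap sketch in the spirit of the Lesk/Banerjee-Pedersen idea described above. It is deliberately simpler than the actual algorithm: it only compares the candidate's own gloss with the glosses of the context words (no related synsets, no extended-overlap scoring) and relies on WordNet's frequency ordering to break ties. disambiguate_by_gloss is an illustrative name.

```python
from nltk.corpus import wordnet as wn

def overlap(gloss_a, gloss_b):
    """Number of word types shared by two glosses."""
    return len(set(gloss_a.lower().split()) & set(gloss_b.lower().split()))

def disambiguate_by_gloss(target, context_words):
    context_glosses = [s.definition() for w in context_words for s in wn.synsets(w)]
    best_sense, best_score = None, -1
    # wn.synsets() lists senses by frequency, so ">" keeps the most common sense on ties
    for sense in wn.synsets(target):
        score = sum(overlap(sense.definition(), g) for g in context_glosses)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

print(disambiguate_by_gloss("bank", ["river", "water", "shore"]))
```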

Supervised Learning Method (1/2)
- Features:
  - nouns: the first noun, verb or adjective before the target noun, within a window of at most three words to the left, and its PoS-tag
  - verbs: the first word before and the first word after the target verb, and their PoS-tags
  - adjectives: six nouns (before and after the target adjective)
  - adverbs: the same as for adjectives, but adjectives rather than nouns are used

Supervised Learning Method (2/2)
- K-NN algorithm
- Learning: build a vector for each annotated word
- Classification:
  - build a vector v_f for each word in the text
  - compute the similarity between v_f and the training vectors
  - rank the training vectors in decreasing order of similarity
  - choose the most frequent sense among the first K vectors
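
A compact sketch of this classification step, assuming features have already been extracted as lists of strings (e.g. "prev_word=ball", "prev_pos=NN"); the cosine similarity and the names to_vector and knn_sense are illustrative placeholders, not the authors' exact setup.

```python
from collections import Counter

def to_vector(features):
    """Bag-of-features vector built from strings like "prev_word=ball"."""
    return Counter(features)

def cosine(u, v):
    dot = sum(u[k] * v.get(k, 0) for k in u)
    norm_u = sum(x * x for x in u.values()) ** 0.5
    norm_v = sum(x * x for x in v.values()) ** 0.5
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def knn_sense(target_features, training_examples, k=5):
    """training_examples: list of (feature list, sense label) for the same lemma."""
    v_f = to_vector(target_features)
    ranked = sorted(training_examples,
                    key=lambda ex: cosine(v_f, to_vector(ex[0])),
                    reverse=True)
    top_senses = [label for _, label in ranked[:k]]
    # most frequent sense among the K nearest training vectors
    return Counter(top_senses).most_common(1)[0][0]
```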

Evaluation (1/3)
- Dataset
  - EVALITA WSD All-Words Task dataset
  - Italian texts from newspapers (about 5,000 words)
  - sense inventory: ItalWordNet
  - MultiSemCor as annotated corpus (the only available semantically annotated resource for Italian); a MultiWordNet-ItalWordNet mapping is required
- Two strategies
  - integrating JIGSAW into a supervised learning method
  - integrating supervised learning into JIGSAW

Evaluation (2/3)
- Integrating JIGSAW into a supervised learning method
  1. the supervised method is applied to words for which training examples are provided
  2. JIGSAW is applied to words not covered by the first step

Evaluation (3/3)
- Integrating supervised learning into JIGSAW
  1. JIGSAW is applied to assign a sense to the words that can be disambiguated with a high level of confidence
  2. the remaining words are disambiguated by the supervised method
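
The two combination strategies amount to choosing which disambiguator runs first and which one handles the remaining words. The sketch below shows both orders; the callables knn, jigsaw, has_training_examples and the 0.8 default threshold are hypothetical placeholders, not values from the paper.

```python
def supervised_then_jigsaw(words, knn, jigsaw, has_training_examples):
    """Strategy 1: K-NN where training data exists, JIGSAW for the rest."""
    senses = {}
    for w in words:
        if has_training_examples(w):
            senses[w] = knn(w)          # step 1: supervised method
        else:
            senses[w] = jigsaw(w)[0]    # step 2: knowledge-based fallback
    return senses

def jigsaw_then_supervised(words, knn, jigsaw, confidence_threshold=0.8):
    """Strategy 2: keep JIGSAW's high-confidence answers, K-NN for the rest."""
    senses = {}
    for w in words:
        sense, confidence = jigsaw(w)   # assume jigsaw returns (sense, confidence)
        if confidence > confidence_threshold:
            senses[w] = sense
        else:
            senses[w] = knn(w)
    return senses
```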

Evaluation: results

Run                             Precision   Recall   F
1st sense                       58.45       48.58    53.06
Random                          43.55       35.88    39.34
JIGSAW                          55.14       45.83    50.05
K-NN                            59.15       11.46    19.20
K-NN + 1st sense                57.53       47.81    52.22
K-NN + JIGSAW                   56.62       47.05    51.39
K-NN + JIGSAW (threshold > 0.90)  61.88     26.16    36.77
K-NN + JIGSAW (threshold > 0.80)  61.40     32.21    42.25
JIGSAW + K-NN (threshold > 0.90)  61.48     27.42    37.92
JIGSAW + K-NN (threshold > 0.80)  61.17     32.59    42.52
JIGSAW + K-NN (threshold > 0.70)  59.44     36.56    45.27

Conclusions
- PoS-tagging and lemmatization introduce errors (~15%) → low recall
- MultiSemCor does not contain enough annotated words
- the MultiWordNet-ItalWordNet mapping reduces the number of examples
- Gloss quality affects verb disambiguation
- No other Italian WSD systems are available for comparison

Future Work
- Use the same sense inventory for training and test
- Improve the pre-processing step
  - PoS-tagging, lemmatization
- Exploit several combination methods
  - voting strategies
  - combination of several unsupervised/supervised methods
  - unsupervised output as a feature in the supervised system

Thank you for your attention!