A Robust Approach to Aligning Heterogeneous Lexical Resources Mohammad Taher Pilehvar Roberto Navigli MultiJEDI ERC 259234.

Slides:



Advertisements
Similar presentations
Building Wordnets Piek Vossen, Irion Technologies.
Advertisements

A centralized approach to language resources Piek Vossen S&T Forum on Multilingualism, Luxembourg, June 6th 2005.
1 Extended Gloss Overlaps as a Measure of Semantic Relatedness Satanjeev Banerjee Ted Pedersen Carnegie Mellon University University of Minnesota Duluth.
Proposed concepts illustrated well on sets of face images extracted from video: Face texture and surface are smooth, constraining them to a manifold Recognition.
Sentiment Lexicon Creation from Lexical Resources BIS 2011 Bas Heerschop Erasmus School of Economics Erasmus University Rotterdam
Gimme’ The Context: Context- driven Automatic Semantic Annotation with CPANKOW Philipp Cimiano et al.
A System for A Semi-Automatic Ontology Annotation Kiril Simov, Petya Osenova, Alexander Simov, Anelia Tincheva, Borislav Kirilov BulTreeBank Group LML,
 Manmatha MetaSearch R. Manmatha, Center for Intelligent Information Retrieval, Computer Science Department, University of Massachusetts, Amherst.
Creating a Bilingual Ontology: A Corpus-Based Approach for Aligning WordNet and HowNet Marine Carpuat Grace Ngai Pascale Fung Kenneth W.Church.
Semantic (Language) Models: Robustness, Structure & Beyond Thomas Hofmann Department of Computer Science Brown University Chief Scientist.
Designing clustering methods for ontology building: The Mo’K workbench Authors: Gilles Bisson, Claire Nédellec and Dolores Cañamero Presenter: Ovidiu Fortu.
Semantic Video Classification Based on Subtitles and Domain Terminologies Polyxeni Katsiouli, Vassileios Tsetsos, Stathes Hadjiefthymiades P ervasive C.
Multi-view Exploratory Learning for AKBC Problems Bhavana Dalvi and William W. Cohen School Of Computer Science, Carnegie Mellon University Motivation.
Word sense induction using continuous vector space models
Xiaomeng Su & Jon Atle Gulla Dept. of Computer and Information Science Norwegian University of Science and Technology Trondheim Norway June 2004 Semantic.
Enhance legal retrieval applications with an automatically induced knowledge base Ka Kan Lo.
OMAP: An Implemented Framework for Automatically Aligning OWL Ontologies SWAP, December, 2005 Raphaël Troncy, Umberto Straccia ISTI-CNR
Aiding WSD by exploiting hypo/hypernymy relations in a restricted framework MEANING project Experiment 6.H(d) Luis Villarejo and Lluís M à rquez.
Evaluating the Contribution of EuroWordNet and Word Sense Disambiguation to Cross-Language Information Retrieval Paul Clough 1 and Mark Stevenson 2 Department.
A Fully Unsupervised Word Sense Disambiguation Method Using Dependency Knowledge Ping Chen University of Houston-Downtown Wei Ding University of Massachusetts-Boston.
Classifying Tags Using Open Content Resources Simon Overell, Borkur Sigurbjornsson & Roelof van Zwol WSDM ‘09.
CLEF Ǻrhus Robust – Word Sense Disambiguation exercise UBC: Eneko Agirre, Oier Lopez de Lacalle, Arantxa Otegi, German Rigau UVA & Irion: Piek Vossen.
Exploiting Wikipedia as External Knowledge for Document Clustering Sakyasingha Dasgupta, Pradeep Ghosh Data Mining and Exploration-Presentation School.
“How much context do you need?” An experiment about context size in Interactive Cross-language Question Answering B. Navarro, L. Moreno-Monteagudo, E.
Automatic Lexical Annotation Applied to the SCARLET Ontology Matcher Laura Po and Sonia Bergamaschi DII, University of Modena and Reggio Emilia, Italy.
Jiuling Zhang  Why perform query expansion?  WordNet based Word Sense Disambiguation WordNet Word Sense Disambiguation  Conceptual Query.
Combining Lexical Semantic Resources with Question & Answer Archives for Translation-Based Answer Finding Delphine Bernhard and Iryna Gurevvch Ubiquitous.
Machine Learning in Spoken Language Processing Lecture 21 Spoken Language Processing Prof. Andrew Rosenberg.
LREC 2008 AWN 1 Arabic WordNet: Semi-automatic Extensions using Bayesian Inference H. Rodríguez 1, D. Farwell 1, J. Farreres 1, M. Bertran 1, M. Alkhalifa.
Jennie Ning Zheng Linda Melchor Ferhat Omur. Contents Introduction WordNet Application – WordNet Data Structure - WordNet FrameNet Application – FrameNet.
Word Sense Disambiguation UIUC - 06/10/2004 Word Sense Disambiguation Another NLP working problem for learning with constraints… Lluís Màrquez TALP, LSI,
Paper Review by Utsav Sinha August, 2015 Part of assignment in CS 671: Natural Language Processing, IIT Kanpur.
WORD SENSE DISAMBIGUATION STUDY ON WORD NET ONTOLOGY Akilan Velmurugan Computer Networks – CS 790G.
Improving Subcategorization Acquisition using Word Sense Disambiguation Anna Korhonen and Judith Preiss University of Cambridge, Computer Laboratory 15.
SYMPOSIUM ON SEMANTICS IN SYSTEMS FOR TEXT PROCESSING September 22-24, Venice, Italy Combining Knowledge-based Methods and Supervised Learning for.
Unsupervised Constraint Driven Learning for Transliteration Discovery M. Chang, D. Goldwasser, D. Roth, and Y. Tu.
2014 EMNLP Xinxiong Chen, Zhiyuan Liu, Maosong Sun State Key Laboratory of Intelligent Technology and Systems Tsinghua National Laboratory for Information.
Mining fuzzy domain ontology based on concept Vector from wikipedia category network.
Very Large Cross-lingual Resources at OAEI 2008 Laura Hollink Véronique Malaisé Vrije Universiteit Amsterdam.
Wikipedia as Sense Inventory to Improve Diversity in Web Search Results Celina SantamariaJulio GonzaloJavier Artiles nlp.uned.es UNED,c/Juan del Rosal,
Evgeniy Gabrilovich and Shaul Markovitch
Element Level Semantic Matching Pavel Shvaiko Meaning Coordination and Negotiation Workshop, ISWC 8 th November 2004, Hiroshima, Japan Paper by Fausto.
Online Multiple Kernel Classification Steven C.H. Hoi, Rong Jin, Peilin Zhao, Tianbao Yang Machine Learning (2013) Presented by Audrey Cheong Electrical.
Improving Named Entity Translation Combining Phonetic and Semantic Similarities Fei Huang, Stephan Vogel, Alex Waibel Language Technologies Institute School.
1 Chen Yirong, Lu Qin, Li Wenjie, Cui Gaoying Department of Computing The Hong Kong Polytechnic University Chinese Core Ontology Construction from a Bilingual.
Iterative similarity based adaptation technique for Cross Domain text classification Under: Prof. Amitabha Mukherjee By: Narendra Roy Roll no: Group:
Accurate Cross-lingual Projection between Count-based Word Vectors by Exploiting Translatable Context Pairs SHONOSUKE ISHIWATARI NOBUHIRO KAJI NAOKI YOSHINAGA.
1 Measuring the Semantic Similarity of Texts Author : Courtney Corley and Rada Mihalcea Source : ACL-2005 Reporter : Yong-Xiang Chen.
1 Gloss-based Semantic Similarity Metrics for Predominant Sense Acquisition Ryu Iida Nara Institute of Science and Technology Diana McCarthy and Rob Koeling.
AIFB Ontology Mapping I3CON Workshop PerMIS August 24-26, 2004 Washington D.C., USA Marc Ehrig Institute AIFB, University of Karlsruhe.
From Words to Senses: A Case Study of Subjectivity Recognition Author: Fangzhong Su & Katja Markert (University of Leeds, UK) Source: COLING 2008 Reporter:
Semantic Grounding of Tag Relatedness in Social Bookmarking Systems Ciro Cattuto, Dominik Benz, Andreas Hotho, Gerd Stumme ISWC 2008 Hyewon Lim January.
Neural decoding of Visual Imagery during sleep PRESENTED BY: Sandesh Chopade Kaviti Sai Saurab T. Horikawa, M. Tamaki et al.
Second Language Learning From News Websites Word Sense Disambiguation using Word Embeddings.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Enhancing Text Clustering by Leveraging Wikipedia Semantics.
Sentiment Analysis Using Common- Sense and Context Information Basant Agarwal 1,2, Namita Mittal 2, Pooja Bansal 2, and Sonal Garg 2 1 Department of Computer.
Big Data: Every Word Managing Data Data Mining TerminologyData Collection CrowdsourcingSecurity & Validation Universal Translation Monolingual Dictionaries.
Mapping the NCI Thesaurus and the Collaborative Inter-Lingual Index Amanda Hicks University of Florida HealthInsight Workshop, Oslo, Norway.
Word Sense Disambiguation Algorithms in Hindi
SERVICE ANNOTATION WITH LEXICON-BASED ALIGNMENT Service Ontology Construction Ontology of a given web service, service ontology, is constructed from service.
Exploiting Wikipedia as External Knowledge for Document Clustering
Cross-Ontological Relationships
Automatically Extending NE coverage of Arabic WordNet using Wikipedia

Cross-language Information Retrieval
CS 620 Class Presentation Using WordNet to Improve User Modelling in a Web Document Recommender System Using WordNet to Improve User Modelling in a Web.
Information Networks: State of the Art
Unsupervised Pretraining for Semantic Parsing
Natural Language to SQL(nl2sql)
Unsupervised Word Sense Disambiguation Using Lesk algorithm
Presentation transcript:

A Robust Approach to Aligning Heterogeneous Lexical Resources Mohammad Taher Pilehvar Roberto Navigli MultiJEDI ERC

Lexical Resource WordNet BabelNet UBY

Lexical Resource WordNet BabelNet UBY

Lexical Resource WordNet BabelNet UBY

Improved word and concept coverage  e.g., named entities, new senses Improved domain coverage Improved multilinguality  dozens of new languages Expert-made relations preserved  e.g., Hypernymy, meronymy, etc. Why combine resources?

Provides complementary knowledge Applications: Semantic Parsing Shi and Mihalcea, 2005 Semantic Role Labeling Palmer et al., 2010 WSD and entity linking Moro et al., TACL 2014 Why combine resources?

A synset from BabelNet

Why combine resources? A synset from BabelNet for plant (living organism) a case example BabelNet

Why combine resources? A synset from BabelNet for plant (living organism)

Why combine resources?

A synset from BabelNet for plant (living organism)

Why combine resources? A synset from BabelNet for plant (living organism)

Difficulty of resource alignment Fine granularity of lexical resources plant WordNet 4 senses15 senses

How resource alignment works? WordNet plant#n#1 Usually measures the similarity of two concepts WKT: WN:

How resource alignment works? Usually measures the similarity of two concepts And aligns two concepts if their similarity exceeds a certain threshold ? WordNet plant#n#1 WKT: WN:

How resource alignment works? Alignment approaches differ in the way they calculate this similarity ? WordNet plant#n#1 WKT: WN:

Gloss similarity WordNet Denfinitional similarity

Strong baseline Fall short when Different wordings are used for same concepts When two words lack quality glosses key -- Metal device shaped in such a way that when it is inserted into the appropriate lock the lock's mechanism can be rotated Key -- An object designed to open and close a lock Denfinitional similarity

Strong baseline Fall short when Different wordings are used for same concepts When two words lack quality glosses plant -- Buildings for carrying on industrial labor. plant -- The necessary infrastructure used in support and maintenance of a given facility. Denfinitional similarity

Contributions A novel concept similarity measure Denfinitional similarity A robust technique for alignment of resources A robust technique for alignment of heterogeneous resources An effective ontologization approach

Our approach: SemAlign

Definition similarity

Our approach: SemAlign Structural similarity

SemAlign: structural similarity WordNet

SemAlign: core Modeling concepts through Semantic Signatures

SemAlign: core Semantic Signatures of concepts

some Personalized PageRank Semantic Signature of a concept

Distributional representation over all concepts in the semantic network Personalized PageRank Semantic Signature of a concept

Distributional representation over all concepts in the semantic network...

Semantic Signature Example WordNet concept for Importance of concept_3 for our concept...

Semantic Signature Importance of concept_3 for our concept... cooking#1 frying_pan#1 oil#4 kitchen#3 Example WordNet concept for -----

Semantic Signature Importance of concept_3 for our concept... table#3 carpet#2 sugar#2 natural_gas#2 Example WordNet concept for -----

Semantic Signature Importance of concept_3 for our concept... read#3 bank#2 Example WordNet concept for -----

SemAlign Definition similarity WordNet

SemAlign WordNet

SemAlign: signature unification

WordNet SemAlign: signature unification Find concepts associated with monosemous words

WordNet SemAlign: signature unification Truncate vectors to the overlapping concepts

SemAlign: signature unification WordNet Synsets containing at least one monosemous word ~ 60% (72,000) The reliability of leveraging monosemous words

Semantic Signature Comparison

Weighted Overlap (Pilehvar et al., ACL 2013) Semantic Signature Comparison

Comparing Semantic Signatures Weighted Overlap

SemAlign: score combination

Ontologization of lexical resources

WordNet Ontologization of lexical resources

WordNet Ontologization of lexical resources

WordNet Ontologization of lexical resources

WordNet Ontologization of lexical resources

Definition page for sail

WordNet Ontologization of lexical resources

WordNet Ontologization of lexical resources

WordNet Ontologization of lexical resources

Definition page for windmill 1.A machine which translates linear motion of wind to rotational motion by means of adjustable vanes called sails. Ontologization of lexical resources

Definition page for windmill 1.A machine which translates linear motion of wind to rotational motion by means of adjustable vanes called sails. windmill.n.1 linear motion.n.1 rotational motion.n.1 translate.v.3 wind.n.1 machine.n.1 sail.n.3 adjustable vanes.v.4 Ontologization of lexical resources

Ontologization : similarity-based disambiguation Definition page for windmill 1.A machine which translates linear motion of wind to rotational motion by means of adjustable vanes called sails.

Ontologization : similarity-based disambiguation 1.A trip in a boat, especially a sailboat. 2.A sailing vessel; a vessel of any kind; a craft. 3.The blade of a windmill. 4.A tower-like structure found on the dorsal (topside) surface of submarines. 1.A machine which translates linear motion of wind to rotational motion by means of adjustable vanes called sails. Definition page for sail Definition page for windmill

Ontologization : similarity-based disambiguation 1.A trip in a boat, especially a sailboat. 2.A sailing vessel; a vessel of any kind; a craft. 3.The blade of a windmill. 4.A tower-like structure found on the dorsal (topside) surface of submarines. 1.A machine which translates linear motion of wind to rotational motion by means of adjustable vanes called sails. Definition page for sail Definition page for windmill

Ontologization : similarity-based disambiguation 1.A trip in a boat, especially a sailboat. 2.A sailing vessel; a vessel of any kind; a craft. 3.The blade of a windmill. 4.A tower-like structure found on the dorsal (topside) surface of submarines. 1.A machine which translates linear motion of wind to rotational motion by means of adjustable vanes called sails. Definition page for sail Definition page for windmill

Ontologization : similarity-based disambiguation 1.A trip in a boat, especially a sailboat. 2.A sailing vessel; a vessel of any kind; a craft. 3.The blade of a windmill. 4.A tower-like structure found on the dorsal (topside) surface of submarines. 1.A machine which translates linear motion of wind to rotational motion by means of adjustable vanes called sails. Definition page for sail Definition page for windmill

Ontologization: evaluation (Meyer & Gurevych, 2012)

Alignment Experiments: Datasets (Matuschek & Gurevych, TACL 2013) WordNet WN synsets manually mapped to their corresponding concepts

Alignment Experiments: System configurations Parameter 1: score combination parameter

Alignment Experiments: System configurations Parameter 2: similarity threshold

Alignment Experiments: System configurations Setting and a. Unsupervised system

Alignment Experiments: System configurations 01 a. Unsupervised system Setting and b. Tuning on subset

Alignment Experiments: System configurations a. Unsupervised system Setting and b. Tuning on subset c. Cross validation

Comparison system Dijkstra-WSA (Matuschek & Gurevych, TACL 2013)

Alignment Experiments SB SB+DWSA F1

Alignment Experiments F1 SB SB+DWSA

SemAlign: structural similarity

Alignment Experiments F1

Conclusions A novel approach for aligning lexical resources –Accurate even in the absence of training data –Robust across different resources An effective ontologization approach Experiments on aligning –WN to WP, WT, and OW

Future directions Integrating the approach into BabelNet for boosting the alignment accuracy Alignment across different languages Updating lexicons with novel terms

1. thanks -- an acknowledgment of appreciation 2. thanks -- with the help of or owing to; 1.thanks -- an expression of gratitude. 2. thanks -- because of; normally used with a positive connotation, though it can be used sarcastically. WordNet Thanks!