Faculty of Applied Science, Simon Fraser University
CMPT 825 Presentation
Corpus-Based PP Attachment Ambiguity Resolution with a Semantic Dictionary
Jiri Stetina, Makoto Nagao
Presented by: Xianghua Jiang

Agenda
Introduction
PP-Attachment & Word Sense Ambiguity
Word Sense Disambiguation
PP-Attachment: Decision Tree Induction, Classification
Evaluation and Experimental Results
Conclusion and Future Work

The PP-Attachment Ambiguity Problem: a prepositional phrase can attach at more than one site.
Buy books for money: adverbial, the PP attaches to the verb buy.
Buy books for children: adjectival, the PP attaches to the object noun books.

PP-Attachment Ambiguity: the back-off model (Collins and Brooks [C&B95]).
Overall accuracy: 84.5%
Accuracy of full quadruple matches: 92.6%
Accuracy of matches on three words: 90.1%
Idea: increase the percentage of full quadruple and triple matches by employing a semantic distance measure instead of word-string matching.

PP-Attachment Ambiguity Example: Buy books for children / Buy magazines for children. The two sentences should be matched because of the small conceptual distance between books and magazines.

PP-Attachment Ambiguity: Two Problems. First, the limit distance at which two concepts may still be matched is unknown. Second, most words are semantically ambiguous, and unless they are disambiguated it is difficult to establish distances between them.

Word Sense Ambiguity: why does it matter? Because we want to match two different words based on their semantic distance. To determine the position of a word in the semantic hierarchy, we have to determine the sense of the word from the context in which it appears.

Semantic Hierarchy: the hierarchy used for semantic matching is the semantic network of WordNet. Nouns are organized into 11 topical hierarchies, where each root represents the most general concept of its topic. Verbs are formed into 15 groups and have altogether 337 possible roots.

Semantic Distance: D = 1/2 (L1/D1 + L2/D2), where L1, L2 are the lengths of the paths between the two concepts and their nearest common ancestor, and D1, D2 are the depths of each concept in the hierarchy.
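A minimal sketch of this distance in Python, using NLTK's WordNet interface (the NLTK API and the example synsets are assumptions for illustration; the paper computes the distance over WordNet directly):

from nltk.corpus import wordnet as wn  # assumes nltk and its wordnet data are installed

def semantic_distance(s1, s2):
    # Nearest common ancestor of the two concepts
    ancestors = s1.lowest_common_hypernyms(s2)
    if not ancestors:
        return None                          # no common ancestor: distance undefined
    lcs = ancestors[0]
    l1 = s1.shortest_path_distance(lcs)      # L1: path length from concept 1 to the ancestor
    l2 = s2.shortest_path_distance(lcs)      # L2: path length from concept 2 to the ancestor
    d1, d2 = s1.min_depth(), s2.min_depth()  # D1, D2: depths of the concepts in the hierarchy
    if d1 == 0 or d2 == 0:
        return None                          # root concepts have depth 0; avoid division by zero
    return 0.5 * (l1 / d1 + l2 / d2)

# e.g. one sense of "book" against one sense of "magazine"
print(semantic_distance(wn.synset('book.n.01'), wn.synset('magazine.n.01')))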

Semantic Distance 2 [figure illustrating the distance computation; not captured in the transcript]

Why Word Sense Disambiguation? Disambiguated senses are the input to PP attachment resolution.

Word Sense Disambiguation Algorithm 1
1. From the training corpus, extract all the sentences which contain a prepositional phrase, forming verb-object-preposition-description quadruples. Mark each quadruple with the corresponding PP attachment.

Word Sense Disambiguation Algorithm 2
2. Set the Similarity Distance Threshold SDT = 0. The SDT defines the limit matching distance between two quadruples: two quadruples are similar if their distance is less than or equal to the current SDT. The matching distance between two quadruples Q1 = v1-n1-p-d1 and Q2 = v2-n2-p-d2 is defined as follows:
Dqv(Q1, Q2) = (D(v1,v2)^2 + D(n1,n2) + D(d1,d2)) / P
Dqn(Q1, Q2) = (D(v1,v2) + D(n1,n2)^2 + D(d1,d2)) / P
Dqd(Q1, Q2) = (D(v1,v2) + D(n1,n2) + D(d1,d2)^2) / P
where P is the number of pairs of words in the quadruples which have a common semantic ancestor.
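A sketch of the three quadruple distances, built on semantic_distance above. The helper word_distance (the minimum distance over all sense pairs of two words, matching the min(dist(...)) used in the worked example later) and the choice to treat undefined pairwise distances as 0 are assumptions the slide leaves open:

def word_distance(w1, w2, pos):
    # Minimum semantic distance over all sense pairs of the two words
    dists = [semantic_distance(s1, s2)
             for s1 in wn.synsets(w1, pos)
             for s2 in wn.synsets(w2, pos)]
    dists = [d for d in dists if d is not None]
    return min(dists) if dists else None

def quadruple_distances(q1, q2):
    # q = (verb, noun, preposition, description); assumes the prepositions match
    dv = word_distance(q1[0], q2[0], wn.VERB)
    dn = word_distance(q1[1], q2[1], wn.NOUN)
    dd = word_distance(q1[3], q2[3], wn.NOUN)
    p = sum(d is not None for d in (dv, dn, dd))  # P: pairs with a common ancestor
    if p == 0:
        return None
    dv, dn, dd = [(d if d is not None else 0.0) for d in (dv, dn, dd)]
    dqv = (dv**2 + dn + dd) / p  # stresses the verb pair
    dqn = (dv + dn**2 + dd) / p  # stresses the noun pair
    dqd = (dv + dn + dd**2) / p  # stresses the description pair
    return dqv, dqn, dqd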

Word Sense Disambiguation Algorithm 3
3. Repeat:
   For each quadruple Q in the training set:
      For each ambiguous word in the quadruple:
         Among the remaining quadruples, find the set S of similar quadruples.
         For each non-empty set S:
            Choose the nearest similar quadruple from the set S.
            Disambiguate the ambiguous word to the nearest sense of the corresponding word of the chosen nearest quadruple.
   Increase the Similarity Distance Threshold (SDT = SDT + 0.1).
   Until all the quadruples are disambiguated or SDT = 3.
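A compact sketch of this loop. The Quadruple interface (ambiguous_words, distance_to, fix_sense) is a hypothetical placeholder for the paper's bookkeeping, and the 0.1 step is inferred from the SDT = 0.0 to SDT = 0.1 move in the worked example below:

def disambiguate_corpus(quadruples, step=0.1, max_sdt=3.0):
    sdt = 0.0
    while any(q.ambiguous_words() for q in quadruples) and sdt <= max_sdt:
        for q in quadruples:
            for word in q.ambiguous_words():
                # S: the remaining quadruples within the current threshold
                similar = [(q.distance_to(r), r) for r in quadruples if r is not q]
                similar = [(d, r) for d, r in similar if d is not None and d <= sdt]
                if similar:
                    _, nearest = min(similar, key=lambda t: t[0])
                    # fix 'word' to the nearest sense of the corresponding
                    # word of the chosen nearest quadruple
                    q.fix_sense(word, nearest)
        sdt += step  # relax the threshold and repeat
    return quadruples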

Word Sense Disambiguation Algorithm 4
Example:
Q1. Shut plant for week
Q2. Buy company for million
Q3. Acquire business for million
Q4. Purchase company for million
Q5. Shut facility for inspection
Q6. Acquire subsidiary for million
SDT = 0: match only quadruples in which all the words have semantic distance 0.

Word Sense Disambiguation Algorithm 6
Example (continued):
Q1. Shut plant for week
Q2. Buy company for million
Q3. Acquire business for million
Q4. Purchase company for million
Q5. Shut facility for inspection
Q6. Acquire subsidiary for million
SDT = 0.0: min(dist(buy, purchase)) = dist(BUY-1, PURCHASE-1) = 0.0, so Dqv(Q2, Q4) = 0.0 and the verbs are disambiguated to those senses. Then the threshold is relaxed: SDT = 0.1.

PP-Attachment Algorithm: (1) Decision Tree Induction, (2) Classification.

PP-Attachment Algorithm 2: Decision Tree Induction. The induction algorithm uses the concepts of the WordNet hierarchy as attribute values and creates the decision tree; classification then applies the tree to new quadruples.

Decision Tree Induction
Let T be a training set of classified quadruples.
1. If all the examples in T are of the same PP attachment type, the result is a leaf labelled with this type. Otherwise:
2. Select the most informative attribute A among verb, noun and description.
3. For each possible value Aw of the selected attribute A, construct recursively a subtree Sw, calling the same algorithm on the set of quadruples for which A belongs to the same WordNet class as Aw.
4. Return a tree whose root is A, whose subtrees are Sw, and whose links between A and Sw are labelled Aw.
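A sketch of the recursion; most_informative_attribute (the heterogeneity measure of the next slide) and the WordNet-class membership tests are hypothetical helpers:

def induce_tree(examples):
    # examples: classified quadruples with .attachment in {'adjectival', 'adverbial'}
    labels = {e.attachment for e in examples}
    if len(labels) == 1:                           # step 1: homogeneous set -> leaf
        return labels.pop()
    attr = most_informative_attribute(examples)    # step 2: verb, noun or description
    node = {'attribute': attr, 'children': {}}
    for value in wordnet_classes(examples, attr):  # step 3: one subtree per class Aw
        subset = [e for e in examples if in_same_wordnet_class(e, attr, value)]
        node['children'][value] = induce_tree(subset)
    return node                                    # step 4: root A, subtrees Sw, links Aw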

Decision Tree Induction 2
The most informative attribute is the one which splits the set T into the most homogeneous subsets; the attribute with the lowest overall heterogeneity is selected for decision tree expansion. Heterogeneity is computed from the conditional probabilities of adverbial and of adjectival attachment in each subset.
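The slides do not reproduce the heterogeneity formula itself; as a stand-in, a weighted class entropy computed from those conditional probabilities behaves as described (lower values mean more homogeneous subsets). This measure is an assumption, not the paper's exact definition:

import math

def heterogeneity(subsets):
    # subsets: the partition of T induced by one candidate attribute
    total = sum(len(s) for s in subsets)
    h = 0.0
    for s in subsets:
        if not s:
            continue
        p_adv = sum(e.attachment == 'adverbial' for e in s) / len(s)
        for p in (p_adv, 1.0 - p_adv):  # conditional probabilities of the two types
            if p > 0:
                h -= (len(s) / total) * p * math.log2(p)
    return h  # the attribute with the lowest overall heterogeneity is selected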

Decision Tree Induction 3 [example tree figure; not captured in the transcript]

Decision Tree Induction 4
At first, all the training examples are split into subsets which correspond to the topmost concepts of WordNet. Each subset is then further split by the attribute which provides the least heterogeneous splitting.

PP-Attachment Algorithm 4: Classification. A path is traversed in the decision tree, starting at its root and ending at a leaf. The quadruple is assigned the attachment type associated with that leaf, i.e. adjectival or adverbial.
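Classification as a walk over the node structure from the induction sketch above; matching_wordnet_class is again a hypothetical helper:

def classify(tree, quadruple):
    # Traverse from the root to a leaf, at each node following the branch
    # whose WordNet class covers the quadruple's value of that attribute
    while isinstance(tree, dict):
        branch = matching_wordnet_class(quadruple, tree['attribute'],
                                        tree['children'].keys())
        tree = tree['children'][branch]
    return tree  # 'adjectival' or 'adverbial'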

Evaluation and Experimental Results [result tables; not captured in the transcript]

Conclusion and Future Work
Word sense disambiguation can be accompanied by PP attachment resolution; the two tasks complement each other.
The most computationally expensive part of the system is the word sense disambiguation of the training corpus.
There is still room for improvement: more training data and/or more accurate sense disambiguation.

Thank you!