1
A method for WSD on Unrestricted Text
Authors: Rada Mihalcea and Dan Moldovan
Presenter: Marian Olteanu
2
Introduction
WSD methods:
Information in MRDs (machine-readable dictionaries)
Supervised training (information from a disambiguated corpus)
Unsupervised training (information from a raw corpus)
Hybrid methods
3
Approach
Unsupervised learning
Tag all content words (nouns, verbs, adjectives, adverbs)
Use the Web as a corpus (AltaVista search engine)
Use semantic density (using WordNet)
4
Algorithm
Use word pairs (one word in the context of the other)
Verb-noun pairs (syntactically linked)
E.g.: investigate report
Senses of "report": {report#1, study}, {report#2, news report, story, account, write up}
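A minimal sketch of how the verb-noun pair above could be expanded into one set of search phrases per noun sense. The function name `queries_per_sense` and the dictionary layout are assumptions made for illustration; only the two synonym sets for "report" come from the slide.

```python
def queries_per_sense(verb, noun_senses):
    """Map each sense label to the verb-noun phrases built from its synonyms."""
    return {
        sense: [f"{verb} {synonym}" for synonym in synonyms]
        for sense, synonyms in noun_senses.items()
    }


# The two senses of "report" listed on the slide; each sense keeps the
# original word plus its synonyms.
report_senses = {
    "report#1": ["report", "study"],
    "report#2": ["report", "news report", "story", "account", "write up"],
}

queries = queries_per_sense("investigate", report_senses)
for sense, phrases in queries.items():
    print(sense, phrases)
```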
5
Algorithm (cont.)
First sense: search for “investigate report” and “investigate study”
Second sense: search for “investigate report”, “investigate news report”, …, “investigate write up”
Rank the senses by hit counts
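The ranking step can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the hit counts in `FAKE_HITS` are invented, the lookup function stands in for a real search engine call, and summing the counts of all phrases belonging to a sense is an assumption, since the slide does not say how the counts for one sense are combined.

```python
# Invented hit counts standing in for search engine results (assumption).
FAKE_HITS = {
    "investigate report": 120,
    "investigate study": 80,
    "investigate news report": 5,
    "investigate story": 15,
    "investigate account": 10,
    "investigate write up": 1,
}


def hit_count(phrase):
    """Hypothetical stand-in for querying a search engine."""
    return FAKE_HITS.get(phrase, 0)


def rank_senses(queries_by_sense, count_fn):
    """Return (sense labels ordered by total hits, totals per sense)."""
    totals = {
        sense: sum(count_fn(phrase) for phrase in phrases)
        for sense, phrases in queries_by_sense.items()
    }
    ranking = sorted(totals, key=totals.get, reverse=True)
    return ranking, totals


queries_by_sense = {
    "report#1": ["investigate report", "investigate study"],
    "report#2": [
        "investigate report",
        "investigate news report",
        "investigate story",
        "investigate account",
        "investigate write up",
    ],
}

ranking, totals = rank_senses(queries_by_sense, hit_count)
print(ranking, totals)
```

Passing the counting function as a parameter keeps the ranking logic independent of whichever search backend is used.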
6
Algorithm (cont.)
Repeat for verbs
Both exact phrases and the NEAR operator were used, with similar results
Keep the top 4 senses for nouns (N) and verbs (V), and the top 2 for adjectives (J) and adverbs (R)
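The cut-off described above might look like this in code. The names `SENSES_TO_KEEP` and `keep_top_senses` are hypothetical; the limits (4 for N and V, 2 for J and R) are the ones given on the slide, and the input is assumed to be already ordered by hit count.

```python
# Number of top-ranked senses kept per part of speech, as stated on the slide:
# N = noun, V = verb, J = adjective, R = adverb.
SENSES_TO_KEEP = {"N": 4, "V": 4, "J": 2, "R": 2}


def keep_top_senses(ranked_senses, pos):
    """Keep only the best-ranked senses for a word of the given part of speech."""
    return ranked_senses[:SENSES_TO_KEEP[pos]]


# Hypothetical rankings produced by step 1 (best sense first).
noun_ranking = ["report#1", "report#2", "report#5", "report#3", "report#4", "report#6"]
adjective_ranking = ["small#2", "small#1", "small#3"]

print(keep_top_senses(noun_ranking, "N"))
print(keep_top_senses(adjective_ranking, "J"))
```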
7
Algorithm – step 2
Compute conceptual density
Applied only to noun-verb pairs (because WordNet doesn’t have adequate hierarchies for adjectives and adverbs)
Computed between the senses selected in step 1
Count the matches between the nouns in the sub-glosses of the verb and the noun together with all of its hyponyms
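The counting step can be illustrated with a toy example. Everything in the data below is invented for illustration (these are not real WordNet glosses or hyponym lists), and the sketch returns only the raw match count: any weighting or normalization applied by the original formula is left out because it is not reproduced here.

```python
def shared_noun_count(verb_gloss_nouns, noun_hierarchy):
    """Count nouns that appear both in the verb's sub-glosses and in the
    noun's hierarchy (the noun itself plus its hyponyms)."""
    return len(set(verb_gloss_nouns) & set(noun_hierarchy))


def score_pairs(verb_senses, noun_senses):
    """Score every verb-sense / noun-sense combination by raw match count."""
    return {
        (verb_sense, noun_sense): shared_noun_count(gloss_nouns, hierarchy)
        for verb_sense, gloss_nouns in verb_senses.items()
        for noun_sense, hierarchy in noun_senses.items()
    }


# Toy data (assumption): nouns found in each verb sense's sub-glosses.
verb_senses = {
    "revise#1": ["text", "law", "draft"],
    "revise#2": ["lesson", "exam"],
}
# Toy data (assumption): each noun sense with the noun itself and its hyponyms.
noun_senses = {
    "law#1": ["law", "statute", "draft"],
    "law#2": ["law", "principle"],
}

scores = score_pairs(verb_senses, noun_senses)
best = max(scores, key=scores.get)
print(best, scores[best])
```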
8
Algorithm – step 2 (cont.)
Formula:
Presenter's comment: I find it flawed (the log part)
Example: revise law:
9
Evaluation
SemCor
Step 1:
Step 2:
10
Comparison