
1  Gloss-based Semantic Similarity Metrics for Predominant Sense Acquisition
Ryu Iida (Nara Institute of Science and Technology)
Diana McCarthy and Rob Koeling (University of Sussex)
IJCNLP 2008, January 10, 2008

2  Word Sense Disambiguation
Predominant sense acquisition:
- Exploited as a powerful back-off strategy for word sense disambiguation
- McCarthy et al. (2004) achieved 64% precision on the Senseval-2 all-words task
- Relies strongly on linguistic resources such as WordNet for calculating semantic similarity
  → Difficulty: porting it to other languages

3  Focus
- How to calculate the semantic similarity score without semantic relations such as hyponymy
- Explore the potential use of word definitions (glosses), instead of WordNet-style resources, for porting McCarthy et al.'s method to other languages

4  Table of contents
1. Task
2. Related work: McCarthy et al. (2004)
3. Gloss-based semantic similarity metrics
4. Experiments: WSD on two datasets (EDR and the Japanese Senseval-2 task)
5. Conclusion and future directions

5  Word Sense Disambiguation (WSD) task
Select the correct sense of a word appearing in context:
  "I ate fried chicken last Sunday."

sense id | gloss
1        | a common farm bird that is kept for its meat and eggs
2        | the meat from this bird eaten as food
3        | (informal) someone who is not at all brave
4        | a game in which children must do something dangerous to show that they are brave

Supervised approaches have mainly been applied to learn the sense from the context.

6  Word Sense Disambiguation (WSD) task (cont'd)
Estimate the most predominant sense of a word regardless of its context.
English coarse-grained all-words task (2007):
- Choosing the most frequent sense: 78.9%
- Best performing system: 82.5%
Systems using a first-sense heuristic have relied on sense-tagged data.
However, sense-tagged data is expensive.

7  McCarthy et al. (2004)'s unsupervised approach
1. Extract the top N neighbour words of the target word according to a distributional similarity score (sim_ds)
2. Calculate a prevalence score for each sense:
   - Weight each neighbour's sim_ds by the semantic similarity score (sim_ss) between the neighbour and the sense
   - Sum the weighted sim_ds over the top N neighbours
   - Semantic similarity is estimated from linguistic resources (e.g. WordNet)
3. Output the sense with the maximum prevalence score

8  McCarthy et al. (2004)'s approach: An example
Target word: chicken
sense2: the meat from this bird eaten as food.
sense3: (informal) someone who is not at all brave.

neighbour | sim_ds | sim_ss(neighbour, sense2) | weighted sim_ds
turkey    | 0.1805 | 0.20                      | 0.0365
meat      | 0.1781 | 0.15                      | 0.0271
...       | ...    | ...                       | ...
tomato    | 0.1573 | 0.10                      | 0.0157

(sim_ss is the semantic similarity score, here taken from WordNet; weighted sim_ds = sim_ds × sim_ss)

prevalence(sense2) = 0.0365 + 0.0271 + ... + 0.0157 = 0.152

9  McCarthy et al. (2004)'s approach: An example (cont'd)

neighbour | sim_ds | sim_ss(neighbour, sense3) | weighted sim_ds
turkey    | 0.1805 | 0.01                      | 0.0018
meat      | 0.1781 | 0.02                      | 0.0037
...       | ...    | ...                       | ...
tomato    | 0.1573 | 0.01                      | 0.0016

prevalence(sense3) = 0.0018 + 0.0037 + ... + 0.0016 = 0.023
prevalence(sense2) = 0.152 > prevalence(sense3) = 0.023
→ predominant sense: sense2
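The prevalence computation on the two slides above can be sketched as follows. This is a minimal illustration using the slides' example numbers, not the authors' code: the helper name `prevalence` is ours, and the real system sums over the top 50 neighbours rather than three.

```python
def prevalence(sim_ds, sim_ss):
    """Sum each neighbour's distributional similarity (sim_ds),
    weighted by its semantic similarity (sim_ss) to the candidate sense."""
    return sum(sim_ds[n] * sim_ss[n] for n in sim_ds)

# Distributional neighbours of "chicken" (truncated to the three shown).
sim_ds = {"turkey": 0.1805, "meat": 0.1781, "tomato": 0.1573}

# Semantic similarity of each neighbour to two senses of "chicken".
sim_ss_sense2 = {"turkey": 0.20, "meat": 0.15, "tomato": 0.10}
sim_ss_sense3 = {"turkey": 0.01, "meat": 0.02, "tomato": 0.01}

p2 = prevalence(sim_ds, sim_ss_sense2)  # ~0.079 on these three neighbours
p3 = prevalence(sim_ds, sim_ss_sense3)  # ~0.007
assert p2 > p3  # sense2 ("the meat ...") is predicted as predominant
```

With all 50 neighbours the sums grow to the 0.152 and 0.023 shown on the slides; the ranking of the senses is the same.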

10  Problem
While McCarthy et al.'s method works well for English, other inventories do not always have WordNet-style resources to tie the nearest neighbours to the sense inventory.
While traditional dictionaries do not organise senses into synsets, they do typically have sense definitions (glosses) associated with the senses.

11  Gloss-based similarity
Calculate the similarity between two glosses in a dictionary as the semantic similarity:
- sim_lesk: simply count the overlap of the content words in the glosses of the two word senses
- sim_DSlesk: use distributional similarity as an approximation of the semantic distance between the words in the two glosses

12  lesk: Example

word    | gloss
chicken | the meat from this bird eaten as food
turkey  | the meat from a turkey eaten as food

sim_lesk(chicken, turkey) = 2  ("meat" and "food" overlap in the two glosses)

13  lesk: Example (cont'd)

word    | gloss
chicken | the meat from this bird eaten as food
tomato  | a round soft red fruit eaten raw or cooked as a vegetable

sim_lesk(chicken, tomato) = 0  (no overlap between the two glosses)
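The lesk overlap on the two example slides can be sketched in a few lines. Following the slides, only the content nouns of each gloss are compared; here the nouns are listed by hand rather than extracted by a POS tagger, and the function name is ours.

```python
def sim_lesk(nouns1, nouns2):
    """Count the content words shared by two glosses,
    given as lists of their content nouns."""
    return len(set(nouns1) & set(nouns2))

# Content nouns of each gloss, listed manually for this sketch.
chicken = ["meat", "bird", "food"]       # the meat from this bird eaten as food
turkey  = ["meat", "turkey", "food"]     # the meat from a turkey eaten as food
tomato  = ["fruit", "vegetable"]         # a round soft red fruit ... as a vegetable

print(sim_lesk(chicken, turkey))  # 2 ("meat" and "food")
print(sim_lesk(chicken, tomato))  # 0 (no overlap)
```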

14  DSlesk
- Calculate distributional similarity scores for all pairs of nouns in the two glosses
- Output the average, over the nouns in the target word's gloss, of the maximum distributional similarity

sim_ds(meat, fruit) = 0.1625   sim_ds(meat, vegetable) = 0.1843
sim_ds(bird, fruit) = 0.1001   sim_ds(bird, vegetable) = 0.0717
sim_ds(food, fruit) = 0.1857   sim_ds(food, vegetable) = 0.1772

sim_DSlesk(chicken, tomato) = 1/3 × (0.1843 + 0.1001 + 0.1857) = 0.1567

15  DSlesk: Definition

  sim_DSlesk(w, s) = 1/|N(g_w)| × Σ_{n ∈ N(g_w)} max_{n' ∈ N(g_s)} sim_ds(n, n')

where N(g) is the set of nouns appearing in gloss g, g_w is the gloss of word w, and g_s is the gloss of word sense s.
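The definition above can be sketched directly, assuming the distributional similarities are available as a precomputed lookup table (the table below holds the slide's example values; the function name is illustrative, not the authors' code).

```python
def sim_DSlesk(nouns_w, nouns_s, sim_ds):
    """For each noun in the target gloss, take its maximum distributional
    similarity to any noun in the other gloss, then average."""
    return sum(max(sim_ds[(a, b)] for b in nouns_s) for a in nouns_w) / len(nouns_w)

# Pairwise sim_ds values between the gloss nouns of "chicken" and "tomato".
sim_ds = {
    ("meat", "fruit"): 0.1625, ("meat", "vegetable"): 0.1843,
    ("bird", "fruit"): 0.1001, ("bird", "vegetable"): 0.0717,
    ("food", "fruit"): 0.1857, ("food", "vegetable"): 0.1772,
}

score = sim_DSlesk(["meat", "bird", "food"], ["fruit", "vegetable"], sim_ds)
print(round(score, 4))  # 0.1567 = (0.1843 + 0.1001 + 0.1857) / 3
```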

16  Applying gloss-based similarity to McCarthy et al.'s approach
Target word: chicken
sense2: the meat from this bird eaten as food.
sense3: (informal) someone who is not at all brave.

neighbour | sim_ds | sim_DSlesk(neighbour, sense2) | weighted sim_ds
turkey    | 0.1805 | 0.3453                        | 0.0623
meat      | 0.1781 | 0.2323                        | 0.0414
...       | ...    | ...                           | ...
tomato    | 0.1573 | 0.1567                        | 0.0246

prevalence(sense2) = 0.0623 + 0.0414 + ... + 0.0246 = 0.2387

17  Table of contents
1. Task
2. Related work: McCarthy et al. (2004)
3. Gloss-based semantic similarity metrics
4. Experiments: WSD on two datasets (EDR and the Japanese Senseval-2 task)
5. Conclusion and future directions

18  Experiment 1: EDR
- Dataset: EDR corpus — 3,836 polysemous nouns (183,502 instances)
- Distributional similarity: the score proposed by Lin (1998)
  - Corpus: 9 years of Mainichi newspaper articles and 10 years of Nikkei newspaper articles
  - Parsed with the Japanese dependency parser CaboCha (Kudo and Matsumoto, 2002)
- Use the 50 nearest neighbours, in line with McCarthy et al. (2004)

19  Methods
- Baseline: select one word sense at random for each word token and average the precision over 100 trials
- Unsupervised: McCarthy et al. (2004), with semantic similarity given by Jiang and Conrath (1997) (jcn), lesk, or DSlesk
- Supervised (majority): use hand-labelled training data to obtain the predominant sense of the test words

20  Results: EDR

method      | recall | precision
baseline    | 0.402  | 0.402
jcn         | 0.495  | 0.495
lesk        | 0.474  | 0.488
DSlesk      | 0.495  | 0.495
supervised  | 0.731  | 0.731
upper bound | 0.745  | 0.745

DSlesk is comparable to jcn, without requiring semantic relations such as hyponymy.

21  Results: EDR (cont'd)

method      | all   | freq ≤ 10 | freq ≤ 5
baseline    | 0.402 | 0.405     | 0.402
jcn         | 0.495 | 0.445     | 0.431
lesk        | 0.474 | 0.448     | 0.426
DSlesk      | 0.495 | 0.453     | 0.433
upper bound | 0.745 | 0.674     | 0.639
supervised  | 0.731 | 0.519     | 0.367

All the unsupervised methods for finding a predominant sense outperform the supervised one for items with little data (freq ≤ 5), indicating that these methods work robustly even for low-frequency data, where hand-tagged data is unreliable.

22  Experiment 2 and results: Senseval-2 in Japanese
- 50 nouns (5,000 instances); precision = recall
- Fine-grained vs coarse-grained evaluation, based on the sense ids (e.g. 105-0-0-2-0)
- 4 methods: lesk, DSlesk, baseline, supervised

method      | fine-grained | coarse-grained
baseline    | 0.282        | 0.399
lesk        | 0.344        | 0.501
DSlesk      | 0.386        | 0.593
upper bound | 0.747        | 0.834
supervised  | 0.742        | 0.842

23  Conclusion
- We examined different measures of semantic similarity for automatically finding a first-sense heuristic for WSD in Japanese.
- We defined a new gloss-based similarity (DSlesk) and evaluated it on two Japanese WSD datasets (EDR and Senseval-2). DSlesk outperformed lesk and achieved performance comparable to the jcn method, which relies on hyponym links that are not always available.

24  Future directions
- Explore other information in the glosses, such as words of other parts of speech and predicate-argument relations
- Group fine-grained word senses into clusters, making the task suitable for NLP applications (Ide and Wilks, 2006)
- Use the results of predominant sense acquisition as prior knowledge for other approaches, e.g. graph-based approaches (Mihalcea 2005; Nastase 2008)
