Using Semantic Relatedness for Word Sense Disambiguation


1 Using Semantic Relatedness for Word Sense Disambiguation
Siddharth Patwardhan 10/24/2002

2 The Lesk1 Algorithm
Two hypotheses:
The intended sense of the target word in a given context is semantically related to the other word senses in the context.
Semantically related words have a greater number of overlapping words in their dictionary definitions.
1[Lesk 1986]
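The overlap-counting idea can be sketched in a few lines. This is a hypothetical illustration, not Lesk's original implementation: the glosses, the stopword list, and the sense labels are invented for the example.

```python
# Minimal sketch of the Lesk idea: pick the sense whose dictionary
# definition shares the most content words with the definitions of the
# surrounding words. Glosses and stopwords here are illustrative.

STOP = {"a", "an", "the", "of", "and", "that", "for", "in", "into", "to"}

def content_words(gloss: str) -> set:
    """Lowercased gloss tokens with stopwords removed."""
    return {w for w in gloss.lower().split() if w not in STOP}

def overlap(gloss_a: str, gloss_b: str) -> int:
    """Count content words shared by two definitions."""
    return len(content_words(gloss_a) & content_words(gloss_b))

def lesk(target_glosses: dict, context_glosses: list) -> str:
    """Return the sense label whose gloss overlaps most with the context."""
    def score(sense):
        return sum(overlap(target_glosses[sense], g) for g in context_glosses)
    return max(target_glosses, key=score)

senses = {
    "financial": "an institution that accepts deposits and channels money into lending",
    "river": "sloping land beside a body of water",
}
context = ["a fixed charge for borrowing money usually a percentage of the amount borrowed"]
print(lesk(senses, context))  # financial
```

The shared word "money" is enough to pull the financial sense ahead of the river sense, which shares nothing with the context gloss.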

3 An Example
The rate of interest at this bank is high.
Candidate senses:
rate: charge per unit; change with time; pace
interest: involvement; interestingness; sake; charge for loan; pastime; stake
bank: financial institution; river; stock; building; arrangement; container
Glosses of the intended senses:
rate: amount of a charge or payment relative to some basis; "a 10-minute phone call at that rate would cost $5"
interest: a fixed charge for borrowing money; usually a percentage of the amount borrowed; "how much interest do you pay on your mortgage?"
bank: a financial institution that accepts deposits and channels the money into lending activities; "he cashed a check at the bank"; "that bank holds the mortgage on my home"

4 Adapting Lesk to WordNet2
Banerjee and Pedersen [2002] adapt the Lesk algorithm to use the rich source of knowledge in WordNet: before overlaps are counted, the gloss of each sense (e.g. of bank, interest, rate) is extended with the glosses of its hypernyms and hyponyms.
2[Fellbaum 1998]
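A sketch of the gloss-extension step, using a tiny invented dictionary as a stand-in for WordNet (the sense labels, glosses, and relation lists below are all hypothetical):

```python
# Sketch of the Banerjee-Pedersen extension: a sense's gloss is
# augmented with the glosses of its WordNet neighbours (hypernyms,
# hyponyms) before overlap counting. `toy_wordnet` is a stand-in for
# the real database.

toy_wordnet = {
    "bank#1": {"gloss": "a financial institution that accepts deposits",
               "hypernyms": ["institution#1"], "hyponyms": []},
    "institution#1": {"gloss": "an organization founded for a purpose",
                      "hypernyms": [], "hyponyms": []},
}

def extended_gloss(sense: str) -> str:
    """Concatenate a sense's gloss with the glosses of its neighbours."""
    entry = toy_wordnet[sense]
    parts = [entry["gloss"]]
    for rel in ("hypernyms", "hyponyms"):
        parts += [toy_wordnet[n]["gloss"] for n in entry[rel]]
    return " ".join(parts)

print(extended_gloss("bank#1"))
```

The extended gloss for bank#1 now also contains the wording of its hypernym's gloss, giving the overlap counter much more material to match against.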

5 Semantic Relatedness by Counting Edges
Rada et al. [1989] introduce a notion of relatedness between words by counting the number of edges between them in a "broader-than" hierarchy (MeSH: a hierarchy of medical terms).
Leacock and Chodorow [1998] use a similar approach to measure semantic relatedness between concepts by finding the length of the shortest path between the two concepts in the is-a hierarchy of WordNet. They scale this value by the maximum depth of the taxonomy, giving the formula:
relatedness = -log(pathLength / (2 · maxDepth))
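The formula above is straightforward to compute once the shortest path and the taxonomy depth are known; the numbers in the example call are made up for illustration.

```python
import math

# Leacock-Chodorow relatedness, assuming the shortest is-a path length
# and the maximum depth of the taxonomy are already known. Shorter
# paths (relative to taxonomy depth) give higher relatedness.
def lch_relatedness(path_length: int, max_depth: int) -> float:
    return -math.log(path_length / (2.0 * max_depth))

# e.g. two concepts 4 edges apart in a taxonomy of depth 16
print(round(lch_relatedness(4, 16), 3))  # 2.079
```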

6 Information Content
Introduced by Resnik [1995].
Indicates the specificity or generality of a concept. More specific concepts have higher information content, while more general concepts have less.
For example, concepts like dime, clinker, and hayfork are rather specific or topical; they would be localized in a discourse and would greatly restrict the choice of concepts that can be used around them (in the context).
Computed from large (ideally sense-tagged) corpora:
IC(concept) = -log(probability of occurrence of the concept in a large corpus)
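A toy computation of IC from raw corpus frequencies (the eight-word "corpus" is invented; a real computation would also propagate counts up the taxonomy, which this sketch omits):

```python
import math
from collections import Counter

# Information content from corpus counts: IC(c) = -log P(c).
# Rarer (more specific) concepts get higher IC.
corpus = "the bank raised the interest rate the bank".split()
counts = Counter(corpus)
total = len(corpus)

def information_content(concept: str) -> float:
    return -math.log(counts[concept] / total)

# "bank" occurs 2/8 times, "rate" only 1/8, so "rate" has higher IC
print(information_content("bank") < information_content("rate"))  # True
```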

7 Information Content
[Figure: counting concept frequencies in a taxonomy. Each occurrence of a concept such as minicab also adds +1 to the counts of all its ancestors (cab, car, motor vehicle, *Root*), so frequency counts propagate up the hierarchy.]

8 Measures of Semantic Relatedness Based on a Concept Hierarchy
[Figure: two concepts c1 and c2 in the hierarchy, with their lowest common subsumer (lcs) above them.]

9 Measures of Semantic Relatedness
Resnik [1995]: relatedness = IC(lcs)
Jiang-Conrath [1997]: distance = IC(c1) + IC(c2) − 2 · IC(lcs)
Lin [1998]: relatedness = 2 · IC(lcs) / (IC(c1) + IC(c2))
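The three measures reduce to simple arithmetic once IC values are available; the numbers in the example are illustrative, not from a real corpus.

```python
# The three IC-based measures, given precomputed information content
# for two concepts and their lowest common subsumer (lcs).

def resnik(ic_lcs: float) -> float:
    # more informative (deeper) lcs = more related
    return ic_lcs

def jiang_conrath_distance(ic_c1: float, ic_c2: float, ic_lcs: float) -> float:
    # a distance: smaller means more related
    return ic_c1 + ic_c2 - 2.0 * ic_lcs

def lin(ic_c1: float, ic_c2: float, ic_lcs: float) -> float:
    # a ratio in [0, 1]: 1 when the two concepts coincide with their lcs
    return 2.0 * ic_lcs / (ic_c1 + ic_c2)

print(round(lin(6.0, 8.0, 5.0), 3))  # 0.714
```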

10 Measures of Semantic Relatedness
The Hirst-St.Onge [1998] measure distinguishes three levels of relation:
(1) Extra-strong relation: between two occurrences of the same word.
(2) Strong relation, by one of three rules: synonyms, a horizontal relation, or a compound-word relation.
(3) Medium-strong relation: there exists an allowable path between the two concepts.
[Figure: links between concepts c1–c4 classified by direction: is-a-kind-of (upward), has-part (downward), opposite (horizontal).]

11 Word Sense Disambiguation using Measures of Semantic Relatedness
LOCAL APPROACH
[Figure: a context window W11 W12 [target] W21 W22; the target word has candidate senses T1 and T2, and each is compared against the senses of the context words, yielding relatedness scores S11–S14 for T1 and S21–S24 for T2.]
Score(T1) = S11 + S12 + S13 + S14
Score(T2) = S21 + S22 + S23 + S24
The target sense with the higher score is selected.
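A minimal sketch of the local approach: each candidate sense of the target is scored by summing its relatedness to every sense of every context word. The sense labels and relatedness values below are hypothetical, and `relatedness` stands in for any of the measures above.

```python
# Local scoring: score each target sense independently against all
# senses of the context words, then keep the best-scoring sense.

def disambiguate_local(target_senses, context_word_senses, relatedness):
    """Return the target sense with the highest total relatedness."""
    def score(t_sense):
        return sum(relatedness(t_sense, c_sense)
                   for word_senses in context_word_senses
                   for c_sense in word_senses)
    return max(target_senses, key=score)

# Toy relatedness table (hypothetical values).
rel = {("bank#finance", "interest#money"): 3,
       ("bank#river", "interest#money"): 0}
best = disambiguate_local(["bank#finance", "bank#river"],
                          [["interest#money"]],
                          lambda a, b: rel.get((a, b), 0))
print(best)  # bank#finance
```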

12 Word Sense Disambiguation using Measures of Semantic Relatedness
GLOBAL APPROACH
[Figure: every combination of sense assignments over the window is scored, e.g. the combination (W11, T1, W21) receives score S1 = a + b + c, the sum of the pairwise relatedness values a, b, c among its members; combinations S1–S8 cover all sense choices for the window.]
The combination with the highest score is selected.
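The global approach can be sketched as an exhaustive search over sense combinations; it is exponential in the window size, which is why the local approximation on the previous slide exists. Sense labels and relatedness values are again hypothetical.

```python
from itertools import product

# Global scoring: enumerate every combination of sense assignments for
# the window, score each by its total pairwise relatedness, and keep
# the highest-scoring combination.

def disambiguate_global(senses_per_word, relatedness):
    """Return the sense combination with the highest total pairwise score."""
    def score(assignment):
        return sum(relatedness(a, b)
                   for i, a in enumerate(assignment)
                   for b in assignment[i + 1:])
    return max(product(*senses_per_word), key=score)

# Toy symmetric relatedness table (hypothetical values).
rel = {frozenset(("bank#finance", "interest#money")): 3}
best = disambiguate_global(
    [["bank#finance", "bank#river"], ["interest#money", "interest#hobby"]],
    lambda a, b: rel.get(frozenset((a, b)), 0))
print(best)  # ('bank#finance', 'interest#money')
```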

13 Some Results
The experiments were performed on the noun instances of the SENSEVAL-2 data (1723 instances). A context window size of 3 with the local scoring approach was considered. Information content was calculated from the SemCor semantically tagged corpus and from the Brown corpus.

Measure             SemCor    Brown
Resnik              0.295     0.290
Jiang-Conrath       0.330     0.331
Lin                 0.328     0.363
Leacock-Chodorow    0.305     –
Hirst-St.Onge       0.316     –
Adapted Lesk        0.391     –

(The last three measures do not use information content, so a single score is reported for each.)

14 References
[Lesk 1986] M. Lesk. Automatic sense disambiguation using machine readable dictionaries: How to tell a pine cone from an ice cream cone. In Proceedings of the 5th Annual International Conference on Systems Documentation (SIGDOC '86), Toronto, June 1986.
[Budanitsky Hirst 2001] A. Budanitsky and G. Hirst. Semantic distance in WordNet: An experimental, application-oriented evaluation of five measures. In Workshop on WordNet and Other Lexical Resources, Second Meeting of the North American Chapter of the Association for Computational Linguistics, Pittsburgh, June 2001.
[Banerjee Pedersen 2002] S. Banerjee and T. Pedersen. An adapted Lesk algorithm for word sense disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, Mexico City, February 2002.
[Resnik 1995] P. Resnik. Using information content to evaluate semantic similarity in a taxonomy. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, Montreal, August 1995.

15 References
[Jiang Conrath 1997] J. Jiang and D. Conrath. Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of the International Conference on Research in Computational Linguistics, Taiwan, 1997.
[Lin 1998] D. Lin. An information-theoretic definition of similarity. In Proceedings of the International Conference on Machine Learning, Madison, Wisconsin, August 1998.
[Leacock Chodorow 1998] C. Leacock and M. Chodorow. Combining local context and WordNet similarity for word sense identification. In Fellbaum, pp. 265–283, 1998.
[Hirst St-Onge 1998] G. Hirst and D. St-Onge. Lexical chains as representations of context for the detection and correction of malapropisms. In Fellbaum, pp. 305–332, 1998.
[Fellbaum 1998] C. Fellbaum, editor. WordNet: An Electronic Lexical Database. MIT Press, 1998.
[Rada et al. 1989] R. Rada, H. Mili, E. Bicknell and M. Blettner. Development and application of a metric on semantic nets. IEEE Transactions on Systems, Man and Cybernetics, 19(1):17–30, February 1989.

