Introduction to NLP. Thanks to Hongning Wang@UVa for his Text Mining course slides; slides are slightly modified by Lei Chen.


1 Introduction to NLP. Thanks to Hongning Wang@UVa for his Text Mining course slides; slides are slightly modified by Lei Chen

2 What is NLP? كلب هو مطاردة صبي في الملعب. (Arabic text)
How can a computer make sense out of this string?
- Morphology: What are the basic units of meaning (words)? What is the meaning of each word?
- Syntax: How are words related with each other?
- Semantics: What is the "combined meaning" of words?
- Pragmatics: What is the "meta-meaning"? (speech act)
- Discourse: Handling a large chunk of text
- Inference: Making sense of everything
CS6501: Text Mining

3 An example of NLP
"A dog is chasing a boy on the playground."
- Lexical analysis (part-of-speech tagging): Det + Noun + Aux + Verb + Det + Noun + Prep + Det + Noun
- Syntactic analysis (parsing): Noun Phrase, Complex Verb, Prep Phrase, Verb Phrase, Sentence
- Semantic analysis: Dog(d1). Boy(b1). Playground(p1). Chasing(d1,b1,p1).
- Inference: Scared(x) if Chasing(_,x,_). => Scared(b1)
- Pragmatic analysis (speech act): A person saying this may be reminding another person to get the dog back…
CS6501: Text Mining

4 If we can do this for all the sentences in all languages, then …
- Automatically answer our emails
- Translate languages accurately
- Help us manage, summarize, and aggregate information
- Use speech as a UI (when needed)
- Talk to us / listen to us
BAD NEWS: Unfortunately, we cannot do this right now. General NLP = "Complete AI"
CS6501: Text Mining

5 NLP is difficult! Ambiguity is a "killer"!
Natural language is designed to make human communication efficient. Therefore,
- We omit a lot of "common sense" knowledge, which we assume the hearer/reader possesses
- We keep a lot of ambiguities, which we assume the hearer/reader knows how to resolve
This makes EVERY step in NLP hard: ambiguity is a "killer", and common sense reasoning is a prerequisite.
CS6501: Text Mining

6 An example of ambiguity
Get the cat with the gloves. CS6501: Text Mining

7 Examples of challenges
Word-level ambiguity
- "design" can be a noun or a verb (ambiguous POS)
- "root" has multiple meanings (ambiguous sense)
Syntactic ambiguity
- "natural language processing" (modification)
- "A man saw a boy with a telescope." (PP attachment)
Anaphora resolution
- "John persuaded Bill to buy a TV for himself." (himself = John or Bill?)
Presupposition
- "He has quit smoking." implies that he smoked before.
CS6501: Text Mining

8 Despite all the challenges, research in NLP has also made a lot of progress…
CS6501: Text Mining

9 A brief history of NLP
Early enthusiasm (1950's): machine translation
- Too ambitious: the Bar-Hillel report (1960) concluded that fully-automatic high-quality translation could not be accomplished without knowledge (dictionary + encyclopedia)
Less ambitious applications (late 1960's & early 1970's): limited success, failed to scale up
- Speech recognition
- Dialogue (Eliza)
- Inference and domain knowledge (SHRDLU = "block world")
Real-world evaluation (late 1970's – now)
- Story understanding (late 1970's & early 1980's)
- Large-scale evaluation of speech recognition, text retrieval, information extraction (1980 – now)
- Statistical approaches enjoy more success (first in speech recognition & retrieval, later in other areas)
Current trend
- The boundary between statistical and symbolic approaches is disappearing; we need to use all the available knowledge
- Application-driven NLP research (bioinformatics, Web, question answering, …)
- Deep understanding in limited domains; shallow understanding elsewhere
(Timeline labels from the original slide: knowledge representation, robust component techniques, statistical language models, applications)
CS6501: Text Mining

10 The state of the art
"A dog is chasing a boy on the playground" (parse-tree figure in the original slide)
- POS tagging: 97%
- Parsing: partial, >90%
- Semantics: some aspects only (entity/relation extraction, word sense disambiguation, anaphora resolution)
- Inference: ???
- Speech act analysis: ???
CS6501: Text Mining

11 Machine translation CS6501: Text Mining

12 Dialog systems: Apple's Siri, Google search
CS6501: Text Mining

13 Information extraction
Google Knowledge Graph Wiki Info Box CS6501: Text Mining

14 Information extraction
YAGO Knowledge Base CMU Never-Ending Language Learning CS6501: Text Mining

15 Building a computer that ‘understands’ text: The NLP pipeline
CS6501: Text Mining

16 Tokenization/Segmentation
Split text into words and sentences.
Task: what is the most likely segmentation/tokenization?
"There was an earthquake near D.C. I've even felt it in Philadelphia, New York, etc."
There + was + an + earthquake + near + D.C.
I + 've + even + felt + it + in + Philadelphia, + New + York, + etc.
CS6501: Text Mining
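A minimal sketch of this step, not part of the original slides, using NLTK's off-the-shelf tokenizers; it assumes NLTK is installed and the punkt sentence-boundary models have been downloaded (newer NLTK releases name the resource punkt_tab).

```python
# Hedged sketch: sentence segmentation + word tokenization with NLTK.
import nltk
from nltk.tokenize import sent_tokenize, word_tokenize

nltk.download("punkt", quiet=True)  # sentence-boundary models (one-time download)

text = ("There was an earthquake near D.C. "
        "I've even felt it in Philadelphia, New York, etc.")

for sentence in sent_tokenize(text):   # split the text into sentences
    print(word_tokenize(sentence))     # split each sentence into word tokens
```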

17 Part-of-Speech tagging
Marking up a word in a text (corpus) as corresponding to a particular part of speech.
Task: what is the most likely tag sequence?
A + dog + is + chasing + a + boy + on + the + playground
Det + Noun + Aux + Verb + Det + Noun + Prep + Det + Noun
CS6501: Text Mining
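For illustration only (not from the original slides), an off-the-shelf tagger can be run in a few lines; this sketch assumes NLTK's default English tagger model is available (named averaged_perceptron_tagger, or averaged_perceptron_tagger_eng in newer NLTK releases).

```python
# Hedged sketch: tag the example sentence with NLTK's default English POS tagger.
import nltk
from nltk import word_tokenize, pos_tag

nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

tokens = word_tokenize("A dog is chasing a boy on the playground")
print(pos_tag(tokens))
# e.g. [('A', 'DT'), ('dog', 'NN'), ('is', 'VBZ'), ('chasing', 'VBG'), ...]
```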

18 Named entity recognition
Determine which spans of text map to proper names.
Task: what is the most likely mapping?
"Its initial Board of Visitors included U.S. Presidents Thomas Jefferson, James Madison, and James Monroe."
Board of Visitors (Organization), U.S. (Location), Thomas Jefferson / James Madison / James Monroe (Person)
CS6501: Text Mining
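A hedged sketch of running an off-the-shelf named-entity recognizer; spaCy is used here purely as an example and the en_core_web_sm model is assumed to be downloaded.

```python
# Hedged sketch: NER with spaCy's small English model.
# Assumes `pip install spacy` and `python -m spacy download en_core_web_sm`.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Its initial Board of Visitors included U.S. Presidents "
          "Thomas Jefferson, James Madison, and James Monroe.")

for ent in doc.ents:
    print(ent.text, "->", ent.label_)   # e.g. "Thomas Jefferson -> PERSON"
```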

19 Syntactic parsing
Grammatical analysis of a given sentence, conforming to the rules of a formal grammar.
Task: what is the most likely grammatical structure?
A + dog + is + chasing + a + boy + on + the + playground
Det Noun Aux Verb Det Noun Prep Det Noun, grouped into Noun Phrase, Complex Verb, Prep Phrase, Verb Phrase, Sentence
CS6501: Text Mining
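To make the idea concrete, here is a toy sketch (not from the slides) that parses the example sentence with a small hand-written context-free grammar in NLTK; the grammar itself is an illustrative assumption.

```python
# Hedged sketch: chart parsing with a tiny hand-written CFG in NLTK.
import nltk

grammar = nltk.CFG.fromstring("""
S   -> NP VP
NP  -> Det N
VP  -> Aux V NP PP | Aux V NP
PP  -> P NP
Det -> 'a' | 'the'
N   -> 'dog' | 'boy' | 'playground'
Aux -> 'is'
V   -> 'chasing'
P   -> 'on'
""")

parser = nltk.ChartParser(grammar)
tokens = "a dog is chasing a boy on the playground".split()
for tree in parser.parse(tokens):
    tree.pretty_print()   # prints the parse tree(s) for the sentence
```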

20 Relation extraction
Identify the relationships among named entities (shallow semantic analysis).
"Its initial Board of Visitors included U.S. Presidents Thomas Jefferson, James Madison, and James Monroe."
1. Thomas Jefferson Is_Member_Of Board of Visitors
2. Thomas Jefferson Is_President_Of U.S.
CS6501: Text Mining

21 Logic inference
Convert chunks of text into more formal representations.
Deep semantic analysis: e.g., first-order logic structures.
"Its initial Board of Visitors included U.S. Presidents Thomas Jefferson, James Madison, and James Monroe."
∃x (Is_Person(x) & Is_President_Of(x, 'U.S.') & Is_Member_Of(x, 'Board of Visitors'))
CS6501: Text Mining

22 Towards understanding of text
Who is Carl Lewis? Did Carl Lewis break any records? CS6501: Text Mining

23 Major NLP applications
- Speech recognition: e.g., automatic telephone call routing
- Text mining (our focus): text clustering, text classification, text summarization, topic modeling
- Question answering
- Language tutoring: spelling/grammar correction
- Machine translation: cross-language retrieval, restricted natural language
- Natural language user interface
CS6501: Text Mining

24 NLP & text mining Better NLP => Better text mining
Bad NLP => Bad text mining? Robust, shallow NLP tends to be more useful than deep, but fragile NLP. Errors in NLP can hurt text mining performance… CS6501: Text Mining

25 How much NLP is really needed?
(Figure in the original slide: tasks arranged along two axes, dependency on NLP vs. scalability.) Tasks: classification, clustering, summarization, extraction, topic modeling, translation, dialogue, question answering, inference, speech act.
CS6501: Text Mining

26 So, what NLP techniques are the most useful for text mining?
Statistical NLP in general. The need for high robustness and efficiency implies the dominant use of simple models CS6501: Text Mining

27 What is POS tagging?
Raw text: Pierre Vinken , 61 years old , will join the board as a nonexecutive director Nov. 29 .
POS tagger ↓
Tagged text: Pierre_NNP Vinken_NNP ,_, 61_CD years_NNS old_JJ ,_, will_MD join_VB the_DT board_NN as_IN a_DT nonexecutive_JJ director_NN Nov._NNP 29_CD ._.
Tag set (examples): NNP: proper noun; CD: numeral; JJ: adjective
CS 6501: Text Mining

28 Why POS tagging?
POS tagging is a prerequisite for further NLP analysis:
- Syntax parsing: POS tags are the basic units for parsing
- Information extraction: indication of names, relations
- Machine translation: the meaning of a particular word depends on its POS tag
- Sentiment analysis: adjectives are the major opinion holders (Good vs. Bad, Excellent vs. Terrible)
CS 6501: Text Mining

29 Challenges in POS tagging
Words often have more than one POS tag:
- The back door (adjective)
- On my back (noun)
- Promised to back the bill (verb)
A simple solution with dictionary look-up does not work in practice: one needs to determine the POS tag for a particular instance of a word from its context.
CS 6501: Text Mining

30 Define a tagset
We have to agree on a standard inventory of word classes:
- Taggers are trained on labeled corpora
- The tagset needs to capture semantically or syntactically important distinctions that can easily be made by trained human annotators
CS 6501: Text Mining

31 Word classes
Open classes: nouns, verbs, adjectives, adverbs
Closed classes: auxiliaries and modal verbs; prepositions, conjunctions; pronouns, determiners; particles, numerals
CS 6501: Text Mining

32 Public tagsets in NLP
Brown corpus (Francis and Kucera, 1961)
- 500 samples, distributed across 15 genres in rough proportion to the amount published in 1961 in each of those genres
- 87 tags
Penn Treebank (Marcus et al., 1993)
- Hand-annotated corpus of the Wall Street Journal, 1M words
- 45 tags, a simplified version of the Brown tag set
- The standard for English now; most statistical POS taggers are trained on this tagset
CS 6501: Text Mining

33 How much ambiguity is there?
Statistics of word–tag pairs in the Brown Corpus and the Penn Treebank (table in the original slide; the values 11% and 18% are shown).
CS 6501: Text Mining

34 Is POS tagging a solved problem?
Baseline: tag every word with its most frequent tag; tag unknown words as nouns
- Word-level accuracy: 90%
- Sentence-level accuracy (average English sentence length 14.3 words): 0.90^14.3 ≈ 22%
State-of-the-art POS tagger
- Word-level accuracy: 97%
- Sentence-level accuracy: 0.97^14.3 ≈ 65%
CS 6501: Text Mining

35 Building a POS tagger: rule-based solution
- Take a dictionary that lists all possible tags for each word
- Assign to every word all its possible tags
- Apply rules that eliminate impossible/unlikely tag sequences, leaving only one tag per word
Example: she (PRP) promised (VBN, VBD) to (TO) back (VB, JJ, RB, NN!!) the (DT) bill (NN, VB)
- R1: a pronoun should be followed by a past-tense verb (so promised = VBD)
- R2: a verb cannot follow a determiner (so bill = NN)
Rules can be learned via inductive learning.
CS 6501: Text Mining
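A toy sketch of the dictionary-plus-rules idea above; the lexicon and the two rules (R1, R2) are illustrative assumptions, not a real rule-based tagger.

```python
# Hedged sketch: dictionary lookup followed by hand-written elimination rules.
lexicon = {
    "she": {"PRP"}, "promised": {"VBN", "VBD"}, "to": {"TO"},
    "back": {"VB", "JJ", "RB", "NN"}, "the": {"DT"}, "bill": {"NN", "VB"},
}

def rule_tag(words):
    # Steps 1+2: look up all possible tags for every word
    candidates = [set(lexicon[w]) for w in words]
    # Step 3: apply elimination rules based on the previous word's tags
    for i in range(1, len(words)):
        prev, cur = candidates[i - 1], candidates[i]
        if prev == {"PRP"} and "VBD" in cur:   # R1: pronoun followed by past-tense verb
            cur.intersection_update({"VBD"})
        if prev == {"DT"}:                     # R2: a verb cannot follow a determiner
            cur.discard("VB")
    return [(w, sorted(c)) for w, c in zip(words, candidates)]

print(rule_tag("she promised to back the bill".split()))
# -> "promised" resolves to VBD, "bill" resolves to NN; "back" stays ambiguous
```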

36 Building a POS tagger: statistical POS tagging
What is the most likely sequence of tags $\mathbf{t}$ for the given sequence of words $\mathbf{w}$?
$\mathbf{t} = t_1 t_2 t_3 t_4 t_5 t_6$, $\mathbf{w} = w_1 w_2 w_3 w_4 w_5 w_6$
$\mathbf{t}^* = \arg\max_{\mathbf{t}} p(\mathbf{t} \mid \mathbf{w})$
CS 6501: Text Mining

37 POS tagging with generative models
Bayes rule: $\mathbf{t}^* = \arg\max_{\mathbf{t}} p(\mathbf{t} \mid \mathbf{w}) = \arg\max_{\mathbf{t}} \frac{p(\mathbf{w} \mid \mathbf{t})\, p(\mathbf{t})}{p(\mathbf{w})} = \arg\max_{\mathbf{t}} p(\mathbf{w} \mid \mathbf{t})\, p(\mathbf{t})$
- $p(\mathbf{w} \mid \mathbf{t})\, p(\mathbf{t})$ is the joint distribution of tags and words
- Generative model: a stochastic process that first generates the tags, and then generates the words based on these tags
CS 6501: Text Mining

38 Hidden Markov models
Two assumptions for POS tagging:
- The current tag only depends on the previous $k$ tags: $p(\mathbf{t}) = \prod_i p(t_i \mid t_{i-1}, t_{i-2}, \ldots, t_{i-k})$; when $k = 1$, this is a first-order HMM
- Each word in the sequence depends only on its corresponding tag: $p(\mathbf{w} \mid \mathbf{t}) = \prod_i p(w_i \mid t_i)$
CS 6501: Text Mining
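A tiny sketch (made-up probabilities, purely illustrative) of how these two assumptions turn the joint score of a tag sequence into a product of transition and emission terms:

```python
# Hedged sketch: score one (tags, words) pair under a first-order HMM
# using toy transition and emission probabilities.
transition = {("<s>", "DT"): 0.6, ("DT", "NN"): 0.7, ("NN", "VBZ"): 0.3}
emission   = {("DT", "a"): 0.3, ("NN", "dog"): 0.001, ("VBZ", "is"): 0.4}

def score(tags, words):
    p, prev = 1.0, "<s>"                                    # "<s>" marks the sentence start
    for t, w in zip(tags, words):
        p *= transition[(prev, t)] * emission[(t, w)]       # p(t_i | t_{i-1}) * p(w_i | t_i)
        prev = t
    return p

print(score(["DT", "NN", "VBZ"], ["a", "dog", "is"]))       # = 0.6*0.3 * 0.7*0.001 * 0.3*0.4
```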

39 Graphical representation of HMMs
- Transition probability $p(t_i \mid t_{i-1})$, defined over all the tags in the tagset
- Emission probability $p(w_i \mid t_i)$, defined over all the words in the vocabulary
- In the graph: light circles are latent random variables, dark circles are observed random variables, and arrows denote probabilistic dependencies
CS 6501: Text Mining

40 Finding the most probable tag sequence
$\mathbf{t}^* = \arg\max_{\mathbf{t}} p(\mathbf{t} \mid \mathbf{w}) = \arg\max_{\mathbf{t}} \prod_i p(w_i \mid t_i)\, p(t_i \mid t_{i-1})$
Complexity analysis
- Each word can have up to $T$ tags
- For a sentence with $N$ words, there are up to $T^N$ possible tag sequences
- Key: exploit the special structure of HMMs!
CS 6501: Text Mining

41 (Trellis figure: words $w_1 \ldots w_5$ across the columns and tags $t_1 \ldots t_7$ down the rows; two example paths, $\mathbf{t}^{(1)} = t_4 t_1 t_3 t_5 t_7$ and $\mathbf{t}^{(2)} = t_4 t_1 t_3 t_5 t_2$; word $w_1$ takes tag $t_4$.)
CS 6501: Text Mining

42 Trellis: a special structure for HMMs
The two example paths $\mathbf{t}^{(1)} = t_4 t_1 t_3 t_5 t_7$ and $\mathbf{t}^{(2)} = t_4 t_1 t_3 t_5 t_2$ share the prefix $t_4 t_1 t_3 t_5$ over words $w_1 \ldots w_4$: computation can be reused!
CS 6501: Text Mining

43 Viterbi algorithm
Store the score of the best tag sequence for $w_1 \ldots w_i$ that ends in tag $t_j$ in $T[j][i]$:
$T[j][i] = \max p(w_1 \ldots w_i, t_1 \ldots, t_i = t_j)$
Recursively compute trellis[j][i] from the entries in the previous column trellis[j][i-1]:
$T[j][i] = P(w_i \mid t_j) \cdot \max_k \big( T[k][i-1] \cdot P(t_j \mid t_k) \big)$
- $P(w_i \mid t_j)$: generating the current observation
- $P(t_j \mid t_k)$: transition from the previous best ending tag
- $T[k][i-1]$: the best tag sequence for the first $i-1$ words ending in $t_k$
CS 6501: Text Mining

44 Viterbi algorithm
Dynamic programming: $O(T^2 N)$!
$T[j][i] = P(w_i \mid t_j) \cdot \max_k \big( T[k][i-1] \cdot P(t_j \mid t_k) \big)$
(Trellis figure: columns $w_1 \ldots w_5$, rows $t_1 \ldots t_7$; the order of computation is column by column.)
CS 6501: Text Mining

45 Decode $\arg\max_{\mathbf{t}} p(\mathbf{t} \mid \mathbf{w})$
- Take the highest-scoring entry in the last column of the trellis
- Keep backpointers in each trellis cell to keep track of the most probable sequence
$T[j][i] = P(w_i \mid t_j) \cdot \max_k \big( T[k][i-1] \cdot P(t_j \mid t_k) \big)$
CS 6501: Text Mining
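A compact Viterbi sketch, not from the slides: it fills the trellis column by column and follows backpointers to recover the best sequence; the toy tags and probabilities at the bottom are made up for illustration.

```python
# Hedged sketch: Viterbi decoding with backpointers, O(T^2 * N).
def viterbi(words, tags, trans, emit, init):
    """trans[(t_prev, t)], emit[(t, w)], init[t] are probabilities; missing keys count as 0."""
    V = [{t: init.get(t, 0) * emit.get((t, words[0]), 0) for t in tags}]
    back = [{}]
    for i in range(1, len(words)):
        V.append({}); back.append({})
        for t in tags:
            best_prev = max(tags, key=lambda k: V[i-1][k] * trans.get((k, t), 0))
            V[i][t] = V[i-1][best_prev] * trans.get((best_prev, t), 0) * emit.get((t, words[i]), 0)
            back[i][t] = best_prev
    last = max(tags, key=lambda t: V[-1][t])         # highest-scoring entry in the last column
    path = [last]
    for i in range(len(words) - 1, 0, -1):           # follow backpointers
        path.append(back[i][path[-1]])
    return list(reversed(path))

# Toy usage with made-up probabilities:
tags = ["DT", "NN", "VB"]
trans = {("DT", "NN"): 0.8, ("NN", "VB"): 0.6, ("DT", "VB"): 0.1, ("NN", "NN"): 0.2}
emit = {("DT", "the"): 0.5, ("NN", "dog"): 0.4, ("VB", "runs"): 0.3, ("NN", "runs"): 0.05}
init = {"DT": 0.7, "NN": 0.2, "VB": 0.1}
print(viterbi(["the", "dog", "runs"], tags, trans, emit, init))   # -> ['DT', 'NN', 'VB']
```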

46 Train an HMM tagger
Parameters in an HMM tagger:
- Transition probabilities $p(t_i \mid t_j)$: $T \times T$
- Emission probabilities $p(w \mid t)$: $V \times T$
- Initial state probabilities $p(t \mid \pi)$: $T \times 1$ (for the first tag in a sentence)
CS 6501: Text Mining

47 Train an HMM tagger: maximum likelihood estimation
Given a labeled corpus, e.g., the Penn Treebank, count how often each tag pair $(t_i, t_j)$ and each word-tag pair $(w_i, t_j)$ occurs:
$p(t_j \mid t_i) = \frac{c(t_i, t_j)}{c(t_i)}$, $p(w_i \mid t_j) = \frac{c(w_i, t_j)}{c(t_j)}$
Proper smoothing is necessary!
CS 6501: Text Mining
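A hedged sketch of the counting step on a toy labeled corpus; a real tagger would train on a treebank and add smoothing, as the slide notes.

```python
# Hedged sketch: maximum-likelihood estimates of HMM parameters from tagged sentences.
from collections import Counter

tagged_sents = [  # toy labeled corpus; in practice use e.g. the Penn Treebank
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("a", "DT"), ("boy", "NN"), ("runs", "VBZ")],
]

tag_count, trans_count, emit_count = Counter(), Counter(), Counter()
for sent in tagged_sents:
    prev = "<s>"                            # sentence-start pseudo-tag
    for word, tag in sent:
        tag_count[tag] += 1
        trans_count[(prev, tag)] += 1
        emit_count[(word, tag)] += 1
        prev = tag
    tag_count["<s>"] += 1

p_trans = {(ti, tj): c / tag_count[ti] for (ti, tj), c in trans_count.items()}
p_emit  = {(w, t):  c / tag_count[t]  for (w, t), c in emit_count.items()}
print(p_trans[("DT", "NN")], p_emit[("dog", "NN")])   # 1.0 and 0.5 on this toy corpus
```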

48 Viterbi Algorithm Example
CS 6501: Text Mining

49 Viterbi Algorithm Example (Cont.)
CS 6501: Text Mining

50 CS 6501: Text Mining

51 Public POS taggers
- Brill's tagger
- TnT tagger
- Stanford tagger
- SVMTool
- GENIA tagger
- More complete list at
CS 6501: Text Mining

52 What you should know
- Definition of the POS tagging problem: properties & challenges
- Public tag sets
- Generative models for POS tagging: HMMs
CS 6501: Text Mining

53 Lexical semantics
- Lexical semantics: meaning of words, relations between different meanings
- WordNet: an ontology structure of word senses, similarity between words
- Distributional semantics
- Word sense disambiguation
CS 6501: Text Mining

54 What is the meaning of a word?
Most words have many different senses:
- dog = animal or sausage?
- lie = to be in a horizontal position, or a false statement made with deliberate intent
What are the relations of different words in terms of meaning?
- Specific relations between senses: "animal" is more general than "dog"
- Semantic fields: "money" is related to "bank"; a semantic field is "a set of words grouped, referring to a specific subject … not necessarily synonymous, but are all used to talk about the same general phenomenon" (Wikipedia)
CS 6501: Text Mining

55 Word senses: what does 'bank' mean?
- A financial institution: e.g., "US bank has raised interest rates."
- A particular branch of a financial institution: e.g., "The bank on Main Street closes at 5pm."
- The sloping side of any hollow in the ground, especially when bordering a river: e.g., "In 1927, the bank of the Mississippi flooded."
- A 'repository': e.g., "I donate blood to a blood bank."
CS 6501: Text Mining

56 Lexicon entries
(Figure in the original slide: a dictionary entry, annotated with its lemma and its senses.)
CS 6501: Text Mining

57 Some terminology
- Word forms: runs, ran, running; good, better, best. Any, possibly inflected, form of a word.
- Lemma (citation/dictionary form): run; good. A basic word form (e.g., the infinitive or the singular nominative noun) that is used to represent all forms of the same word.
- Lexeme: RUN(V), GOOD(A), BANK1(N), BANK2(N). An abstract representation of a word (and all its forms), with a part of speech and a set of related word senses. Often just written (or referred to) as the lemma, perhaps in a different FONT.
- Lexicon: a (finite) list of lexemes.
CS 6501: Text Mining

58 Make sense of word senses
Polysemy A lexeme is polysemous if it has different related senses bank = financial institution or a building CS 6501: Text Mining

59 Make sense of word senses
Homonyms Two lexemes are homonyms if their senses are unrelated, but they happen to have the same spelling and pronunciation bank = financial institution or river bank CS 6501: Text Mining

60 Relations between senses
Symmetric relations
- Synonyms: couch/sofa. Two lemmas with the same sense.
- Antonyms: cold/hot, rise/fall, in/out. Two lemmas with opposite senses.
Hierarchical relations
- Hypernyms and hyponyms: pet/dog. The hyponym (dog) is more specific than the hypernym (pet).
- Holonyms and meronyms: car/wheel. The meronym (wheel) is a part of the holonym (car).
CS 6501: Text Mining

61 WordNet
A very large lexical database of English (George Miller, Cognitive Science Laboratory of Princeton University, 1985):
- 117K nouns, 11K verbs, 22K adjectives, 4.5K adverbs
- Word senses grouped into synonym sets ("synsets"), linked into a conceptual-semantic hierarchy
- 82K noun synsets, 13K verb synsets, 18K adjective synsets, 3.6K adverb synsets
- Avg. # of senses: 1.23/noun, 2.16/verb, 1.41/adjective, 1.24/adverb
- Conceptual-semantic relations: hypernym/hyponym
CS 6501: Text Mining

62 A WordNet example: http://wordnet.princeton.edu/
CS 6501: Text Mining

63 Hierarchical synset relations: nouns
Hypernym/hyponym (between concepts)
- The more general 'meal' is a hypernym of the more specific 'breakfast'
Instance hypernym/hyponym (between concepts and instances)
- Austen (Jane Austen, 1775–1817, English novelist) is an instance hyponym of author
Member holonym/meronym (groups and members)
- professor is a member meronym of (a university's) faculty
Part holonym/meronym (wholes and parts)
- wheel is a part meronym of (is a part of) car
Substance meronym/holonym (substances and components)
- flour is a substance meronym of (is made of) bread
CS 6501: Text Mining

64 WordNet hypernyms & hyponyms
CS 6501: Text Mining

65 Hierarchical synset relations: verbs
Hypernym/troponym (between events): the presence of a 'manner' relation between two lexemes
- travel/fly, walk/stroll: flying is a troponym of traveling, i.e., it denotes a specific manner of traveling
Entailment (between events)
- snore/sleep: snoring entails (presupposes) sleeping
CS 6501: Text Mining

66 WordNet similarity
Path-based similarity measures between words:
- Shortest path between two concepts (Leacock & Chodorow, 1998): sim = 1 / |shortest path|
- Path length to the root node from the least common subsumer (LCS) of the two concepts, i.e., the most specific concept that is an ancestor of both A and B (Wu & Palmer, 1994): sim = 2 * depth(LCS) / (depth(w1) + depth(w2))
CS 6501: Text Mining
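These measures are available off the shelf; a hedged sketch using NLTK's WordNet interface (assumes the wordnet data has been downloaded).

```python
# Hedged sketch: path-based and Wu-Palmer similarity via NLTK's WordNet interface.
import nltk
from nltk.corpus import wordnet as wn

nltk.download("wordnet", quiet=True)

dog = wn.synset("dog.n.01")
cat = wn.synset("cat.n.01")

print(dog.path_similarity(cat))   # path-based: 1 / (shortest path distance + 1) in NLTK's definition
print(dog.wup_similarity(cat))    # 2 * depth(LCS) / (depth(dog) + depth(cat))
```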

67 WordNet::Similarity CS 6501: Text Mining

68 WordNet::Similarity CS 6501: Text Mining

69 Distributional hypothesis
What is tezgüino?
- A bottle of tezgüino is on the table.
- Everybody likes tezgüino.
- Tezgüino makes you drunk.
- We make tezgüino out of corn.
The contexts in which a word appears tell us a lot about what it means.
CS 6501: Text Mining

70 Distributional semantics
Use the contexts in which words appear to measure their similarity.
- Assumption: similar contexts => similar meanings
- Approach: represent each word w as a vector of its contexts c
Vector space representation
- Each dimension corresponds to a particular context c_n
- Each element in the vector of w captures the degree to which the word w is associated with the context c_n
Similarity metric: cosine similarity
CS 6501: Text Mining
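A minimal sketch (made-up context counts) of the vector-space representation and the cosine similarity between two word vectors:

```python
# Hedged sketch: cosine similarity between context-count vectors of two words.
import math

contexts = ["drink", "bottle", "corn", "table", "book"]   # fixed context vocabulary (toy)
tezguino = [5, 4, 3, 2, 0]   # made-up co-occurrence counts for "tezgüino"
wine     = [6, 5, 0, 3, 1]   # made-up co-occurrence counts for "wine"

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

print(cosine(tezguino, wine))   # close to 1 => the two words occur in similar contexts
```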

71 How to define the contexts
Nearby words
- w appears near c if c occurs within ±k words of w (e.g., within a sentence); this yields fairly broad thematic relations
- Decide on a fixed vocabulary of N context words c_1 … c_N; prefer words that occur frequently enough in the corpus but are not too frequent (i.e., avoid stopwords)
- Use the co-occurrence count of word w and context c as the corresponding element in the vector, or Pointwise Mutual Information (PMI)
Grammatical relations
- E.g., how often is w used as the subject of the verb c? This yields fine-grained thematic relations
CS 6501: Text Mining

72 Mutual information
Relatedness between two random variables:
$I(X;Y) = \sum_{y \in Y} \sum_{x \in X} p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)}$
CS 6501: Text Mining

73 Pointwise mutual information
PMI between w and c, using a fixed window of ±k words (e.g., within a sentence):
$\mathrm{PMI}(w;c) = \log \frac{p(w,c)}{p(w)\,p(c)}$
- $p(w,c)$: how often w and c co-occur inside a window
- $p(w)$: how often w occurs
- $p(c)$: how often c occurs
CS 6501: Text Mining
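A hedged sketch of estimating PMI from window co-occurrence counts; the counts are made up, and a real system would also apply smoothing.

```python
# Hedged sketch: PMI(w;c) = log[ p(w,c) / (p(w) * p(c)) ] from window co-occurrence counts.
import math

total_windows = 10000     # toy corpus statistics
count_w  = 50             # windows containing the word w
count_c  = 200            # windows containing the context word c
count_wc = 20             # windows containing both

p_w, p_c, p_wc = count_w / total_windows, count_c / total_windows, count_wc / total_windows
pmi = math.log(p_wc / (p_w * p_c))
print(pmi)   # > 0: w and c co-occur more often than chance
```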

74 Word sense disambiguation
What does this word mean?
- "This plant needs to be watered each day." → living plant
- "This plant manufactures 1000 widgets each day." → factory
Word sense disambiguation (WSD): identify the sense of content words (noun, verb, adjective) in context (assuming a fixed inventory of word senses). The disambiguating context cues here are "watered" and "manufactures".
CS 6501: Text Mining

75 Dictionary-based methods
A dictionary/thesaurus contains glosses and examples of a word.
bank1
- Gloss: a financial institution that accepts deposits and channels the money into lending activities
- Examples: "he cashed the check at the bank", "that bank holds the mortgage on my home"
bank2
- Gloss: sloping land (especially the slope beside a body of water)
- Examples: "they pulled the canoe up on the bank", "he sat on the bank of the river and watched the current"
CS 6501: Text Mining

76 Lesk algorithm
Compare the context with the dictionary definitions of the senses:
- Construct the signature of a word in context and the signatures of its senses in the dictionary; a signature is the set of content words (in the gloss/examples, or in the context)
- Assign the dictionary sense whose gloss and examples are most similar to the context in which the word occurs
- Similarity = size of the intersection of the context signature and the sense signature
CS 6501: Text Mining

77 Sense signatures
bank1
- Gloss: a financial institution that accepts deposits and channels the money into lending activities
- Examples: "he cashed the check at the bank", "that bank holds the mortgage on my home"
- Signature(bank1) = {financial, institution, accept, deposit, channel, money, lend, activity, cash, check, hold, mortgage, home}
bank2
- Gloss: sloping land (especially the slope beside a body of water)
- Examples: "they pulled the canoe up on the bank", "he sat on the bank of the river and watched the current"
- Signature(bank2) = {slope, land, body, water, pull, canoe, sit, river, watch, current}
CS 6501: Text Mining

78 Signature of the target word
"The bank refused to give me a loan."
- Simplified Lesk: the signature is the set of words in the context, Signature(bank) = {refuse, give, loan}
- Original Lesk: an augmented signature of the target word, Signature(bank) = {refuse, reject, request, ..., give, gift, donate, ..., loan, money, borrow, ...}
CS 6501: Text Mining
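A hedged sketch of simplified Lesk as overlap counting; the signatures are the ones from the previous slides, and a real system would lemmatize, remove stopwords, and (for the original Lesk) augment the signatures.

```python
# Hedged sketch: simplified Lesk = pick the sense whose signature overlaps the context most.
signatures = {
    "bank1": {"financial", "institution", "accept", "deposit", "channel", "money",
              "lend", "activity", "cash", "check", "hold", "mortgage", "home"},
    "bank2": {"slope", "land", "body", "water", "pull", "canoe", "sit",
              "river", "watch", "current"},
}

def simplified_lesk(context_words, signatures):
    context = set(context_words)
    # score each sense by the size of the intersection with the context signature
    return max(signatures, key=lambda sense: len(signatures[sense] & context))

# Context from "he sat on the bank of the river and watched the current":
print(simplified_lesk({"sat", "river", "watched", "current"}, signatures))   # -> "bank2"
# Context from "The bank refused to give me a loan" overlaps neither raw signature,
# which is why the original Lesk augments signatures with related words (money, borrow, ...).
print(simplified_lesk({"refuse", "give", "loan"}, signatures))               # tie -> "bank1"
```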

79 What you should know
- Lexical semantics: relationships between words
- WordNet
- Distributional semantics: similarity between words
- Word sense disambiguation
CS 6501: Text Mining

