Lecture 16: Lexical Semantics, WordNet, etc.

Presentation transcript:

1 Lecture 16: Lexical Semantics, WordNet, etc.
LING 138/238 / SYMBSYS 138: Intro to Computer Speech and Language Processing
November 23, 2004
Dan Jurafsky
Thanks to Jim Martin for many of these slides!

2 Meaning
Traditionally, meaning in language has been studied from three perspectives:
- The meanings of individual words
- How those meanings combine to make meanings for individual sentences or utterances
- How those meanings combine to make meanings for a text or discourse
We are going to focus today on word meaning, also called lexical semantics.

3 Outline: Lexical Semantics
Concepts about word meaning, including:
- Homonymy, polysemy, synonymy
- Thematic roles
Two computational areas:
- Enabling resource: WordNet
- Enabling technology: Word sense disambiguation

4 Preliminaries: What's a word?
Definitions we've used over the quarter: types, tokens, stems, roots, inflected forms, etc.
- Lexeme: an entry in a lexicon consisting of a pairing of a form with a single meaning representation
- Lexicon: a collection of lexemes

5 Relationships between word meanings
- Homonymy
- Polysemy
- Synonymy
- Antonymy
- Hypernymy
- Hyponymy
- Meronymy

6 Homonymy
Homonymy: lexemes that share a form (phonological, orthographic, or both) but have unrelated, distinct meanings.
Clear examples:
- bat (wooden stick-like thing) vs. bat (flying scary mammal thing)
- bank (financial institution) vs. bank (riverside)
Homonyms can be homophones, homographs, or both.
Homophones: write and right; piece and peace

7 Homonymy causes problems for NLP applications
- Text-to-speech: same orthographic form but different phonological form (bass vs. bass)
- Information retrieval: different meanings, same orthographic form (QUERY: bat care)
- Machine translation
- Speech recognition: why?

8 Polysemy
The bank is constructed from red brick.
I withdrew the money from the bank.
Are those the same sense? Or consider the following WSJ example:
While some banks furnish sperm only to married women, others are less restrictive.
Which sense of bank is this? Is it distinct from (homonymous with) the river bank sense? How about the savings bank sense?

9 Polysemy
A single lexeme with multiple related meanings (bank the building, bank the financial institution).
- Most non-rare words have multiple meanings.
- The number of meanings is related to the word's frequency.
- Verbs tend to be more polysemous than other parts of speech.
- Distinguishing polysemy from homonymy isn't always easy (or necessary).

10 Metaphor and Metonymy
Both are specific types of polysemy.
Metaphor:
- Germany will pull Slovenia out of its economic slump.
- I spent 2 hours on that homework.
Metonymy:
- The White House announced yesterday.
- This chapter talks about part-of-speech tagging.
- bank (building) and bank (financial institution)

11 How do we know when a word has more than one sense?
ATIS examples:
- Which flights serve breakfast?
- Does America West serve Philadelphia?
The "zeugma" test: ?Does United serve breakfast and San Jose?

12 Synonyms
Words that have the same meaning in some or all contexts:
- filbert / hazelnut
- youth / adolescent
- big / large
- automobile / car
Two lexemes are synonyms if they can be successfully substituted for each other in all situations.

13 Synonyms
But there are few (or no) examples of perfect synonymy. Why should that be?
Even if many aspects of meaning are identical, substitution may still not preserve acceptability, based on notions of politeness, slang, register, genre, etc.
Example: big and large?
- That's my big sister
- ?That's my large sister

14 Antonyms
Words that are opposites with respect to one feature of their meaning; otherwise, they are very similar!
- dark / light
- boy / girl
- hot / cold
- up / down
- in / out

15 Word Similarity Computation
For various computational applications it's useful to find words which are similar to another word:
- Machine translation (to find near-synonyms)
- Information retrieval (to do "query expansion")
Two ways to do this (a code sketch of the second follows):
- Automatic computation based on distributional similarity (lsa.colorado.edu)
- Use a thesaurus which lists similar words: WordNet (we'll return to this in a few slides)
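The thesaurus route can be made concrete with NLTK's WordNet interface, a tool that postdates these slides; the synset names below are assumptions about the intended senses. A minimal sketch:

```python
# Minimal sketch of thesaurus-based word similarity using NLTK's
# WordNet interface (requires nltk plus its wordnet data package).
from nltk.corpus import wordnet as wn

car = wn.synset('car.n.01')
boat = wn.synset('boat.n.01')

# path_similarity scores two senses by the length of the shortest
# hypernym path connecting them (1.0 = same synset).
print(car.path_similarity(wn.synsets('automobile')[0]))  # 1.0: synonyms share a synset
print(car.path_similarity(boat))                         # smaller: related, not synonymous
```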

16 Hyponymy
Hyponymy: the meaning of one lexeme is a subset of the meaning of another.
Since dogs are canids:
- dog is a hyponym of canid, and
- canid is a hypernym of dog.
Similarly:
- car is a hyponym of vehicle, and
- vehicle is a hypernym of car.
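The dog/canid example can be checked directly in WordNet; a small sketch, again via NLTK (synset names assumed):

```python
# Sketch: hyponym/hypernym structure for dog in WordNet via NLTK.
from nltk.corpus import wordnet as wn

dog = wn.synset('dog.n.01')
# The direct hypernyms of this sense include the canid synset
# (lemmas 'canine', 'canid'), making dog its hyponym.
print(dog.hypernyms())
# Walk every hypernym path from dog up to the root:
for path in dog.hypernym_paths():
    print(' -> '.join(s.name() for s in path))
```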

17 WordNet
A hierarchically organized lexical database: an on-line thesaurus plus aspects of a dictionary. Versions for other languages are under development.

Category    Unique Forms    # of Senses
Noun        114,648         141,690
Verb        11,306          24,632
Adjective   21,436          31,015
Adverb      4,669           5,808

18 WordNet
Demo: http://www.cogsci.princeton.edu/cgi-bin/webwn

19 Format of WordNet Entries
[Image slide; sample entry layout not preserved in this transcript.]

20 WordNet Noun Relations
[Image slide; relation table not preserved in this transcript.]

21 WordNet Verb and Adjective Relations
[Image slide; relation table not preserved in this transcript.]

22 WordNet Hierarchies
[Image slide; hierarchy figure not preserved in this transcript.]

23 How is "sense" defined in WordNet?
The critical thing to grasp about WordNet is the notion of a synset; it's their version of a sense or a concept.
Example: table as a verb meaning defer -> {postpone, hold over, table, shelve, set back, defer, remit, put off}
For WordNet, the meaning of this sense of table is this list.
Another example: give has 45 senses in WordNet; one of them is {supply, provide, render, furnish}.
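Listing the verb synsets for table with NLTK shows exactly this list-as-meaning view (a sketch; sense inventories vary slightly across WordNet versions):

```python
# Sketch: print each verb synset of "table" with its member lemmas.
from nltk.corpus import wordnet as wn

for syn in wn.synsets('table', pos=wn.VERB):
    print(syn.name(), '=', [lemma.name() for lemma in syn.lemmas()])
# One line groups postpone, hold_over, table, shelve, set_back,
# defer, remit, put_off: for WordNet, that set IS the sense.
```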

24 The Internal Structure of Words
The previous section talked about relations between words. Now we'll look at the "internal structure" of word meaning. We'll look at only a few aspects:
- Thematic roles in predicate-bearing lexemes
- Selection restrictions on thematic roles
- Decompositional semantics of predicates
- Feature structures for nouns

25 Thematic Roles
Thematic roles are semantic generalizations over the specific roles that occur with specific verbs.
E.g., takers, givers, eaters, makers, doers, and killers all have something in common: the -er.
They're all the AGENTs of the actions.
We can generalize across other roles as well to come up with a small finite set of such roles.

26 Thematic Roles
[Image slide; role inventory table not preserved in this transcript.]

27 Thematic Role Examples
[Image slide; example table not preserved in this transcript.]

28 Thematic Roles
Thematic roles take some of the work away from the verbs: it's not the case that every verb is unique and has to completely specify how all of its arguments uniquely behave.
They provide a locus for organizing semantic processing, and they permit us to distinguish near surface-level semantics from deeper semantics.

29 Linking
Thematic roles, syntactic categories, and their positions in larger syntactic structures are all intertwined in complicated ways. For example:
- AGENTs are often subjects.
- In a VP -> V NP NP rule, the first NP is often a GOAL and the second a THEME.

30 More on Linking
John opened the door                 (AGENT, THEME)
The door was opened by John          (THEME, AGENT)
The door opened                      (THEME)
John opened the door with the key    (AGENT, THEME, INSTRUMENT)

31 Inference
Given an event expressed by a verb that expresses a transfer, what can we conclude about the thing labeled THEME with respect to the thing labeled GOAL?

32 Deeper Semantics
From the WSJ:
He melted her reserve with a husky-voiced paean to her eyes.
If we label the constituents He and her reserve as the Melter and the Melted, then those labels lose any meaning they might have had. If we make them AGENT and THEME, then we can do more inference.

33 Problems
- What exactly is a role?
- What's the right set of roles?
- Are such roles universals?
- Are these roles atomic? E.g., are AGENTs animate, volitional, direct causers, etc.?
- Can we automatically label syntactic constituents with thematic roles?

34 Selection Restrictions
I want to eat someplace near campus.
Using thematic roles we can now say that eat is a predicate that has an AGENT and a THEME. What else?
The AGENT must be capable of eating, and the THEME must be something typically capable of being eaten.

35 As Logical Statements
For eat:
Eating(e) ∧ Agent(e,x) ∧ Theme(e,y) ∧ Isa(y, Food)
(adding in all the right quantifiers and lambdas; a reconstruction follows)
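Spelled out fully, one reconstruction, an assumption about the intended notation in the event-based style of the earlier semantics lectures, is:

```latex
% Hedged reconstruction of the eat predicate with the quantifiers
% and lambdas made explicit (lambda order is an assumption):
\lambda y\, \lambda x\, \exists e\;
  Eating(e) \land Agent(e,x) \land Theme(e,y) \land Isa(y,\mathit{Food})
```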

36 Back to WordNet
Use WordNet hyponyms (the IS-A/type hierarchy) to encode the selection restrictions; a code sketch follows.
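A minimal sketch of the idea with NLTK: a noun satisfies eat's THEME restriction if some sense of it has the food synset in its hypernym closure. The synset name food.n.01 and the helper satisfies are assumptions, not from the slides:

```python
# Sketch: selection restrictions as WordNet hyponymy checks.
from nltk.corpus import wordnet as wn

def satisfies(noun, restriction='food.n.01'):
    """True if some sense of `noun` is a hyponym of `restriction`."""
    target = wn.synset(restriction)
    for sense in wn.synsets(noun, pos=wn.NOUN):
        # closure() walks the hypernym relation transitively.
        if target in sense.closure(lambda s: s.hypernyms()):
            return True
    return False

print(satisfies('hamburger'))  # True: hamburger IS-A food
print(satisfies('gravel'))     # False: no food sense
```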

37 Selectional Restrictions as a loose approximation of deeper semantics
Unfortunately, verbs are polysemous and language is creative. WSJ examples:
- ... ate glass on an empty stomach accompanied only by water and tea
- you can't eat gold for lunch if you're hungry
- ... get it to try to eat Afghanistan

38 Solutions
Eat glass: this is actually about an eating event, so we might change our model of "edible".
Eat gold: also about eating, and the can't creates a scope that permits the THEME to not be edible.
Eat Afghanistan: this is harder; it's not really about eating at all.

39 Word Sense Disambiguation (WSD)
Given a word in context, decide which sense of the word this is.

40 Selection Restrictions for WSD
Semantic selection restrictions can be used to help in WSD. They can help disambiguate:
- Ambiguous arguments to unambiguous predicates
- Ambiguous predicates with unambiguous arguments
- Ambiguity all around

41 WSD and Selection Restrictions
Ambiguous arguments:
- Prepare a dish
- Wash a dish
Ambiguous predicates:
- Serve Denver
- Serve breakfast
Both:
- Serves vegetarian dishes

42 Problems
As we saw earlier, selection restrictions are violated all the time. This doesn't mean that the sentences are ill-formed or preferred less than others. This approach needs some way of categorizing and dealing with the various ways that restrictions can be violated.

43 Supervised Machine Learning Approaches
A training corpus of words tagged in context with their senses is used to train a classifier that can tag words in new text, just as we saw for part-of-speech tagging, speech recognition, and statistical MT.

44 WSD Tags
What's a tag? A dictionary sense? For example, for WordNet an instance of "bass" in a text has 8 possible tags or labels (bass1 through bass8).

45 WordNet Bass
The noun "bass" has 8 senses in WordNet:
1. bass - (the lowest part of the musical range)
2. bass, bass part - (the lowest part in polyphonic music)
3. bass, basso - (an adult male singer with the lowest voice)
4. sea bass, bass - (flesh of lean-fleshed saltwater fish of the family Serranidae)
5. freshwater bass, bass - (any of various North American lean-fleshed freshwater fishes especially of the genus Micropterus)
6. bass, bass voice, basso - (the lowest adult male singing voice)
7. bass - (the member with the lowest range of a family of musical instruments)
8. bass - (nontechnical name for any of numerous edible marine and freshwater spiny-finned fishes)

46 Representations
Most supervised ML approaches require a very simple representation for the input training data: vectors of feature/value pairs, i.e., files of comma-separated values.
So our first task is to extract training data from a corpus: for each instance of a target word, a characterization of the window of text surrounding the target.

47 Representations
This is where ML and NLP intersect:
- If you stick to trivial surface features that are easy to extract from a text, then most of the work is in the ML system.
- If you decide to use features that require more analysis (say, parse trees), then the ML part may be doing less work (relatively), if these features are truly informative.

48 Surface Representations
Collocational and co-occurrence information:
- Collocational: features about words at specific positions near the target word; often limited to just word identity and POS.
- Co-occurrence: features about words that occur anywhere in the window (regardless of position); typically limited to frequency counts.

49 Examples
Example text (WSJ):
An electric guitar and bass player stand off to one side not really part of the scene, just as a sort of nod to gringo expectations perhaps
Assume a window of +/- 2 from the target.


51 Collocational
Position-specific information about the words in the window:
guitar and bass player stand
-> [guitar, NN, and, CJC, player, NN, stand, VVB]
i.e., [word_n-2, POS_n-2, word_n-1, POS_n-1, word_n+1, POS_n+1, word_n+2, POS_n+2]: a vector of the word and part-of-speech at each position around the target. A sketch follows.
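A sketch of extracting that collocational vector from a POS-tagged window; the tag strings follow the slide's example, and the helper itself is hypothetical:

```python
# Sketch: position-specific collocational features for a +/-2 window.
def collocational_features(tagged, i, window=2):
    """tagged: list of (word, POS) pairs; i: index of the target word."""
    feats = []
    for offset in range(-window, window + 1):
        if offset == 0:
            continue  # skip the target word itself
        j = i + offset
        if 0 <= j < len(tagged):
            feats.extend(tagged[j])  # word, then its POS tag
    return feats

# The target word's own tag (NN1) is an assumption; it never appears
# in the feature vector anyway.
tagged = [('guitar', 'NN'), ('and', 'CJC'), ('bass', 'NN1'),
          ('player', 'NN'), ('stand', 'VVB')]
print(collocational_features(tagged, 2))
# ['guitar', 'NN', 'and', 'CJC', 'player', 'NN', 'stand', 'VVB']
```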

52 Co-occurrence
Information about the words that occur within the window:
- First derive a set of terms to place in the vector.
- Then note how often each of those terms occurs in a given window.

53 Co-Occurrence Example
Assume we've settled on a possible vocabulary of 12 words that includes guitar and player but not and and stand:
guitar and bass player stand
-> [0,0,0,1,0,0,0,0,0,1,0,0]
These are the counts of the words in the predefined vocabulary, e.g., [fish, fishing, viol, guitar, double, cello, ...] (a sketch follows).

54 Classifiers
Once we cast the WSD problem as a classification problem, all sorts of techniques are possible:
- Naïve Bayes (the right thing to try first)
- Decision lists
- Decision trees
- Neural nets
- Support vector machines
- Nearest-neighbor methods

55 Classifiers
The choice of technique, in part, depends on the set of features that have been used:
- Some techniques work better or worse with features that have numerical values.
- Some techniques work better or worse with features that have large numbers of possible values. For example, the feature "the word to the left" has a fairly large number of possible values.

56 Naïve Bayes
Pick the most probable sense ŝ given the feature vector V = (v_1, ..., v_n):
ŝ = argmax_s P(s | V)
Rewriting with Bayes' rule and removing the denominator P(V) (constant across senses):
ŝ = argmax_s P(V | s) P(s)
Assuming independence of the features:
ŝ ≈ argmax_s P(s) ∏_j P(v_j | s)

57 Naïve Bayes
- P(s): just the prior of that sense. Just as with part-of-speech tagging, not all senses will occur with equal frequency.
- P(v_j | s): conditional probability of some particular feature/value combination given a particular sense.
You can get both of these from a tagged corpus with the features encoded (a sketch follows).
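A toy sketch of both estimates and the resulting classifier; the add-one smoothing, helper names, and two-example corpus are all assumptions, not from the slides:

```python
# Sketch: Naive Bayes WSD trained from a sense-tagged corpus.
import math
from collections import Counter, defaultdict

def train(examples):
    """examples: list of (feature_list, sense). Returns raw counts."""
    sense_counts, feat_counts = Counter(), defaultdict(Counter)
    for feats, sense in examples:
        sense_counts[sense] += 1          # for P(s)
        feat_counts[sense].update(feats)  # for P(v_j | s)
    return sense_counts, feat_counts

def predict(feats, sense_counts, feat_counts, vocab_size):
    total = sum(sense_counts.values())
    best, best_lp = None, float('-inf')
    for sense, n in sense_counts.items():
        lp = math.log(n / total)  # log P(s)
        denom = sum(feat_counts[sense].values()) + vocab_size  # add-one
        for f in feats:
            lp += math.log((feat_counts[sense][f] + 1) / denom)
        if lp > best_lp:
            best, best_lp = sense, lp
    return best

data = [(['guitar', 'player'], 'music'), (['fishing', 'lake'], 'fish')]
sc, fc = train(data)
print(predict(['guitar', 'stand'], sc, fc, vocab_size=12))  # 'music'
```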

58 Naïve Bayes Test
On a corpus of examples of uses of the word line, naïve Bayes achieved about 73% correct. Good?

59 Decision Lists
Another popular method…

60 Learning Decision Lists
- Restrict the lists to rules that test a single feature (1-decision-list rules).
- Evaluate each possible test and rank them based on how well they work.
- Glue the top N tests together and call that your decision list.

61 Yarowsky
On a binary (homonymy) distinction, Yarowsky used the following metric to rank the tests (reconstructed below). This gives about 95% on this test.

62 Bootstrapping
What if you don't have enough data to train a system?
- Pick a word that you as an analyst think will co-occur with your target word in a particular sense.
- Grep through your corpus for your target word and the hypothesized word.
- Assume that the target tag is the right one.

63 Bootstrapping
For bass: assume play occurs with the music sense and fish occurs with the fish sense (a sketch follows).
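A minimal sketch of this seed-based labeling for bass; the play/fish seed pairs are from the slide, while the corpus lines and helper name are invented:

```python
# Sketch: label contexts of "bass" by which seed word they contain.
SEEDS = {'play': 'music', 'fish': 'fish'}

def seed_label(contexts, target='bass'):
    labeled = []
    for ctx in contexts:
        words = ctx.lower().split()
        if target in words:
            for seed, sense in SEEDS.items():
                if seed in words:
                    labeled.append((ctx, sense))  # assume the seed's sense
                    break
    return labeled

corpus = ['He will play the bass tonight',
          'We caught a big bass fish at the lake']
print(seed_label(corpus))
# [('He will play the bass tonight', 'music'),
#  ('We caught a big bass fish at the lake', 'fish')]
```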

64 Bass Results
[Image slide; results table not preserved in this transcript.]

65 Bootstrapping
Perhaps better:
- Use the little training data you have to train an inadequate system.
- Use that system to tag new data.
- Use that larger set of training data to train a new system.
A sketch of this loop follows.
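A sketch of that loop, reusing the train/predict helpers from the Naïve Bayes sketch above; the round count and the decision to keep all machine tags are simplifying assumptions (Yarowsky's algorithm keeps only confident ones):

```python
# Sketch: self-training -- train, tag unlabeled data, retrain.
def self_train(seed_data, unlabeled, vocab_size, rounds=3):
    data = list(seed_data)
    for _ in range(rounds):
        sc, fc = train(data)                      # train on current set
        machine_tagged = [(feats, predict(feats, sc, fc, vocab_size))
                          for feats in unlabeled]  # tag new data
        data = list(seed_data) + machine_tagged   # grow the training set
    return train(data)                            # final model counts
```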

66 Problems
Given these general ML approaches, how many classifiers do I need to perform WSD robustly? One for each ambiguous word in the language.
How do you decide what set of tags/labels/senses to use for a given word? Depends on the application.

67 WordNet Bass
Tagging with this set of senses is an impossibly hard task that's probably overkill for any realistic application:
1. bass - (the lowest part of the musical range)
2. bass, bass part - (the lowest part in polyphonic music)
3. bass, basso - (an adult male singer with the lowest voice)
4. sea bass, bass - (flesh of lean-fleshed saltwater fish of the family Serranidae)
5. freshwater bass, bass - (any of various North American lean-fleshed freshwater fishes especially of the genus Micropterus)
6. bass, bass voice, basso - (the lowest adult male singing voice)
7. bass - (the member with the lowest range of a family of musical instruments)
8. bass - (nontechnical name for any of numerous edible marine and freshwater spiny-finned fishes)

68 Summary
Concepts about word meaning, including:
- Homonymy, polysemy, synonymy
- Thematic roles
Two computational areas:
- Enabling resource: WordNet
- Enabling technology: Word sense disambiguation

