Natural Language Processing Lecture 19: Lexical Semantics.

2 Word Associations One data-driven approach says…
gazelle 0.105, zebra 0.099, leopard 0.098, antelope 0.090, rhino 0.083, baboon 0.082, deer 0.079, ostrich 0.077, rhinoceros 0.076, chimpanzee 0.074, bison 0.072, moose 0.068, alligator 0.067, gorilla 0.066, kangaroo 0.066

3 Relations between Words/Senses: Synonymy, Antonymy, Hyponymy/Hypernymy, Meronymy/Holonymy

4 Lexical Mini-Ontology
[Figure: graph over the word senses wall.v.1, wall.n.1, surround.v.2, fence.n.1, build.v.1, door.n.1, building.n.1, enclosure.n.1, destroy.v.1. Edge types: hypernymy (is-a), between hypernym and hyponym; synonymy, between synonyms; meronymy (has-a), between holonym (whole) and meronym (part); antonymy, between antonyms.]

5 Synsets for dog (n)
S: (n) dog, domestic dog, Canis familiaris (a member of the genus Canis (probably descended from the common wolf) that has been domesticated by man since prehistoric times; occurs in many breeds) "the dog barked all night"
S: (n) frump, dog (a dull unattractive unpleasant girl or woman) "she got a reputation as a frump"; "she's a real dog"
S: (n) dog (informal term for a man) "you lucky dog"
S: (n) cad, bounder, blackguard, dog, hound, heel (someone who is morally reprehensible) "you dirty dog"
S: (n) frank, frankfurter, hotdog, hot dog, dog, wiener, wienerwurst, weenie (a smooth-textured sausage of minced beef or pork usually smoked; often served on a bread roll)
S: (n) pawl, detent, click, dog (a hinged catch that fits into a notch of a ratchet to move a wheel forward or prevent it from moving backward)
S: (n) andiron, firedog, dog, dog-iron (metal supports for logs in a fireplace) "the andirons were too hot to touch"

6 What’s a fish? (According to WordNet)
Hypernym chain, most specific first:
fish (any of various mostly cold-blooded aquatic vertebrates usually having scales and breathing through gills)
aquatic vertebrate (animal living wholly or chiefly in or on water)
vertebrate, craniate (animals having a bony or cartilaginous skeleton with a segmented spinal column and a large brain enclosed in a skull or cranium)
chordate (any animal of the phylum Chordata having a notochord or spinal column)
animal, animate being, beast, brute, creature, fauna (a living organism characterized by voluntary movement)
organism, being (a living thing that has (or can develop) the ability to act or function independently)
living thing, animate thing (a living (or once living) entity)
whole, unit (an assemblage of parts that is regarded as a single entity)
object, physical object (a tangible and visible entity; an entity that can cast a shadow)
entity (that which is perceived or known or inferred to have its own distinct existence (living or nonliving))
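The hypernym chain on this slide can be treated as a child-to-parent map and walked from a word up to the root. The following is a minimal sketch using a plain Python dictionary rather than the WordNet API; the names `HYPERNYM` and `hypernym_chain` are hypothetical, and each synset is abbreviated to its first lemma for brevity.

```python
# Child -> parent (hypernym) links, copied from the chain on the slide.
# Each synset is abbreviated to its first lemma (an assumption for brevity).
HYPERNYM = {
    "fish": "aquatic vertebrate",
    "aquatic vertebrate": "vertebrate",
    "vertebrate": "chordate",
    "chordate": "animal",
    "animal": "organism",
    "organism": "living thing",
    "living thing": "whole",
    "whole": "object",
    "object": "entity",
}


def hypernym_chain(word):
    """Return the list of nodes from `word` up to the root."""
    chain = [word]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain


chain = hypernym_chain("fish")
print(" -> ".join(chain))
print("edges between 'fish' and the root:", len(chain) - 1)
```

In this toy representation every node has exactly one parent; real WordNet synsets can have several hypernyms, so a full implementation would return a set of paths rather than a single chain.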

7 Thesaurus-based Word Similarity
[Figure: taxonomy tree. Node labels: Class Mammalia; Order Artiodactyla, Order Carnivora; Genus Bovidae, Genus Giraffidae, Genus Felidae, Genus Caniformia; leaves giraffe, gazelle, lion; …]
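The slide does not state a formula, so the following is only a sketch of one common thesaurus-based measure: path similarity, taken here as 1 / (1 + number of edges on the shortest path between two nodes). The node names come from the slide's tree; the parent links (giraffe under Giraffidae, gazelle under Bovidae, lion under Felidae) and the function names are assumptions made for illustration.

```python
# Toy taxonomy, child -> parent. Node names come from the slide; the
# specific parent links are assumed, since the figure is not recoverable.
PARENT = {
    "giraffe": "Giraffidae",
    "gazelle": "Bovidae",
    "lion": "Felidae",
    "Giraffidae": "Artiodactyla",
    "Bovidae": "Artiodactyla",
    "Felidae": "Carnivora",
    "Caniformia": "Carnivora",
    "Artiodactyla": "Mammalia",
    "Carnivora": "Mammalia",
}


def ancestors(node):
    """Return [node, parent, grandparent, ...] up to the root."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def path_length(a, b):
    """Number of edges on the shortest path between a and b in the tree."""
    up_a = ancestors(a)
    steps_from_b = {node: i for i, node in enumerate(ancestors(b))}
    for steps_from_a, node in enumerate(up_a):
        if node in steps_from_b:  # first shared ancestor on the way up
            return steps_from_a + steps_from_b[node]
    raise ValueError("nodes are not in the same tree")


def path_similarity(a, b):
    """Path-based similarity: 1 for identical nodes, smaller when farther apart."""
    return 1.0 / (1.0 + path_length(a, b))


for pair in [("giraffe", "gazelle"), ("giraffe", "lion")]:
    print(pair, path_length(*pair), round(path_similarity(*pair), 3))
```

Under these assumed links, giraffe and gazelle meet at Artiodactyla while giraffe and lion meet only at Mammalia, so the first pair scores higher.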

8 Information Content (Adapted from Lin. 1998. An Information-Theoretic Definition of Similarity. ICML.)
$$\mathrm{IC}(c) = -\log \frac{\#\,\text{words that are equivalent to or are hyponyms of } c}{\#\,\text{words in corpus}}$$
IC values by concept: Entity 0.93, Inanimate-object 1.79, Natural-object 4.12, Geological formation 6.34, Natural-elevation 9.09, Shore 9.39, Hill 10.88, Coast 10.74
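The definition of IC(c) can be sketched in a few lines: count the words that map to a concept or to any of its hyponyms, divide by the corpus size, and take the negative log. The concept names below come from the slide; the tree shape (hill under natural-elevation, coast under shore), all counts, the corpus size, and the choice of natural log are assumptions made for illustration.

```python
import math

# Child -> parent links for the concepts on the slide. Placing hill under
# natural-elevation and coast under shore is an assumption.
PARENT = {
    "inanimate-object": "entity",
    "natural-object": "inanimate-object",
    "geological-formation": "natural-object",
    "natural-elevation": "geological-formation",
    "shore": "geological-formation",
    "hill": "natural-elevation",
    "coast": "shore",
}

# Hypothetical counts of corpus words that map directly to each concept.
DIRECT_COUNT = {
    "entity": 500,
    "inanimate-object": 300,
    "natural-object": 120,
    "geological-formation": 40,
    "natural-elevation": 10,
    "shore": 12,
    "hill": 6,
    "coast": 8,
}
CORPUS_SIZE = 10_000  # hypothetical total number of words in the corpus


def subtree_count(concept):
    """Count words equivalent to `concept` or to any of its hyponyms."""
    total = DIRECT_COUNT[concept]
    for child, parent in PARENT.items():
        if parent == concept:
            total += subtree_count(child)
    return total


def information_content(concept):
    """IC(c) = -log(count of c and its hyponyms / corpus size), natural log."""
    return -math.log(subtree_count(concept) / CORPUS_SIZE)


for concept in ["entity", "geological-formation", "hill"]:
    print(concept, round(information_content(concept), 2))
```

Because a concept's count includes all of its hyponyms, IC can only stay the same or grow when moving down the hierarchy, which matches the pattern of the values on the slide.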

9 Two Views: Ontological and Distributional

10 Where’s the beef? Sentences from the Brown Corpus, extracted from the concordancer in The Compleat Lexical Tutor.

11 chicken

12 Context Vectors

13 Hypothetical Counts based on Syntactic Dependencies
Context features (columns): Modified-by-ferocious(adj), Subject-of-devour(v), Object-of-pet(v), Modified-by-African(adj), Modified-by-big(adj)
Rows (five counts per row, digits run together in the source): Lion 15506; Dog 738012; Cat 11619; Elephant 0001015; …
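A row of dependency counts like these can be read as a context vector for the word, and two words can then be compared by the angle between their vectors. The sketch below uses cosine similarity, which is one common choice and not necessarily the measure used in the lecture. The feature names follow the slide, but the counts are illustrative values invented for this example, not the slide's numbers, and the function names are hypothetical.

```python
import math

# Feature order follows the slide's columns.
FEATURES = ["mod-by-ferocious", "subj-of-devour", "obj-of-pet",
            "mod-by-African", "mod-by-big"]

# Illustrative counts only; these are NOT the values from the slide.
VECTORS = {
    "lion":     [12, 8, 0, 5, 9],
    "dog":      [4, 2, 20, 0, 10],
    "cat":      [2, 3, 18, 0, 6],
    "elephant": [0, 0, 1, 9, 14],
}


def cosine(u, v):
    """Cosine of the angle between two count vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(word):
    """Rank all other words by cosine similarity to `word`, best first."""
    scores = [(other, cosine(VECTORS[word], VECTORS[other]))
              for other in VECTORS if other != word]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


for other, score in most_similar("dog"):
    print(f"dog ~ {other}: {score:.3f}")
```

With these made-up counts, dog and cat share the heavily weighted Object-of-pet context, so cat ranks as the nearest neighbour of dog.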

14 Pointwise Mutual Information
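The formula on this slide did not survive, so the sketch below uses the standard definition, PMI(w, f) = log2( P(w, f) / (P(w) P(f)) ), with probabilities estimated from a word-by-context count table. The base-2 log, the table of counts (invented for illustration), and the names `COUNTS` and `pmi` are all assumptions.

```python
import math

# Illustrative word-by-context counts; invented for this example.
COUNTS = {
    "lion":     {"ferocious": 12, "devour": 8, "pet": 0,  "African": 5, "big": 9},
    "dog":      {"ferocious": 4,  "devour": 2, "pet": 20, "African": 0, "big": 10},
    "cat":      {"ferocious": 2,  "devour": 3, "pet": 18, "African": 0, "big": 6},
    "elephant": {"ferocious": 0,  "devour": 0, "pet": 1,  "African": 9, "big": 14},
}

# Marginal totals used to estimate P(w), P(f), and P(w, f).
N = sum(sum(row.values()) for row in COUNTS.values())
WORD_TOTAL = {w: sum(row.values()) for w, row in COUNTS.items()}
FEATURE_TOTAL = {}
for row in COUNTS.values():
    for feature, count in row.items():
        FEATURE_TOTAL[feature] = FEATURE_TOTAL.get(feature, 0) + count


def pmi(word, feature):
    """log2 of P(w, f) / (P(w) * P(f)); -inf when the pair was never observed."""
    joint = COUNTS[word][feature]
    if joint == 0:
        return float("-inf")
    return math.log2(joint * N / (WORD_TOTAL[word] * FEATURE_TOTAL[feature]))


for word in COUNTS:
    print(word, "ferocious", pmi(word, "ferocious"))
```

A positive PMI means the word and context co-occur more often than independence would predict; a negative value means less often. Unobserved pairs give negative infinity here, which is why many systems clip PMI at zero (positive PMI) before using it as a vector weight.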

15 Distributionally Similar Words
Rum: vodka, cognac, brandy, whisky, liquor, detergent, cola, gin, lemonade, cocoa, chocolate, scotch, noodle, tequila, juice
Write: read, speak, present, receive, call, release, sign, offer, know, accept, decide, issue, prepare, consider, publish
Ancient: old, modern, traditional, medieval, historic, famous, original, entire, main, indian, various, single, african, japanese, giant
Mathematics: physics, biology, geology, sociology, psychology, anthropology, astronomy, arithmetic, geography, theology, hebrew, economics, chemistry, scripture, biotechnology
(From an implementation of the method described in Lin. 1998. Automatic Retrieval and Clustering of Similar Words. COLING-ACL. Trained on newswire text.)

16 Word Associations
Stimulus: wall. Number of different answers: 39. Total count of all answers: 98.
(response, count, proportion of all answers)
BRICK 16 0.16; STONE 9 0.09; PAPER 7 0.07; GAME 5 0.05; BLANK 4 0.04; BRICKS 4 0.04; FENCE 4 0.04; FLOWER 4 0.04; BERLIN 3 0.03; CEILING 3 0.03; HIGH 3 0.03; STREET 3 0.03; ...
Stimulus: giraffe. Number of different answers: 26. Total count of all answers: 98.
(response, count, proportion of all answers)
NECK 33 0.34; ANIMAL 9 0.09; ZOO 9 0.09; LONG 7 0.07; TALL 7 0.07; SPOTS 5 0.05; LONG NECK 4 0.04; AFRICA 3 0.03; ELEPHANT 2 0.02; HIPPOPOTAMUS 2 0.02; LEGS 2 0.02; ...
From the Edinburgh Word Association Thesaurus.
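The third number for each response is consistent with the response count divided by the total count of all answers (98 for both stimuli), rounded to two places. A small sketch of that arithmetic, using a few of the counts listed on the slide; the function name `association_strengths` is hypothetical.

```python
def association_strengths(counts, total):
    """Map each response to count / total, rounded to two decimal places."""
    return {response: round(count / total, 2) for response, count in counts.items()}


# A few of the response counts listed on the slide.
wall = {"BRICK": 16, "STONE": 9, "PAPER": 7, "GAME": 5}
giraffe = {"NECK": 33, "ANIMAL": 9, "ZOO": 9, "LONG": 7}

print(association_strengths(wall, total=98))
print(association_strengths(giraffe, total=98))
```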

