2003.09.09 - SLIDE 1IS 202 – FALL 2003 Lecture 5: Lexical Relations & WordNet Prof. Ray Larson & Prof. Marc Davis UC Berkeley SIMS Tuesday and Thursday.


SLIDE 1IS 202 – FALL 2003 Lecture 5: Lexical Relations & WordNet Prof. Ray Larson & Prof. Marc Davis UC Berkeley SIMS Tuesday and Thursday 10:30 am - 12:00 pm Fall SIMS 202: Information Organization and Retrieval

SLIDE 2IS 202 – FALL 2003 Lecture Overview Review Lexical Relations WordNet Demo Discussion Questions Action Items for Next Time Credit for some of the slides in this lecture goes to Marti Hearst and Warren Sack

SLIDE 3IS 202 – FALL 2003 Lecture Overview Review Lexical Relations WordNet Demo Discussion Questions Action Items for Next Time Credit for some of the slides in this lecture goes to Marti Hearst and Warren Sack

SLIDE 4IS 202 – FALL 2003 Definition of AI “... artificial intelligence [AI] is the science of making machines do things that would require intelligence if done by [humans]” (Minsky, 1963)

SLIDE 5IS 202 – FALL 2003 The Goals of AI Are Not New Ancient Greece –Daedalus’ automata Judaism’s myth of the Golem 18th century automata –Singing, dancing, playing chess? Mechanical metaphors for mind –Clock –Telegraph/telephone network –Computer

SLIDE 6IS 202 – FALL 2003 Some Areas of AI Knowledge representation Programming languages Natural language understanding Speech understanding Vision Robotics Planning Machine learning Expert systems Qualitative simulation

SLIDE 7IS 202 – FALL 2003 AI or IA? Artificial Intelligence (AI) –Make machines as smart as (or smarter than) people Intelligence Amplification (IA) –Use machines to make people smarter

SLIDE 8IS 202 – FALL 2003 Furnas: The Vocabulary Problem People use different words to describe the same things –“If one person assigns the name of an item, other untutored people will fail to access it on 80 to 90 percent of their attempts.” –“Simply stated, the data tell us there is no one good access term for most objects.”

SLIDE 9IS 202 – FALL 2003 The Vocabulary Problem How is it that we come to understand each other? –Shared context –Dialogue How can machines come to understand what we say? –Shared context? –Dialogue?

SLIDE 10IS 202 – FALL 2003 Vocabulary Problem Solutions? Furnas et al. –Make the user memorize precise system meanings –Have the user and system interact to identify the precise referent –Provide infinite aliases to objects Minsky and Lenat –Give the system “commonsense” so it can understand what the user’s words can mean

SLIDE 11IS 202 – FALL 2003 CYC Decades-long effort to build a commonsense knowledge base Storied past 100,000 basic concepts 1,000,000 assertions about the world The validity of Cyc’s assertions is context-dependent (default reasoning)

SLIDE 12IS 202 – FALL 2003 Cyc Examples Cyc can find the match between a user's query for "pictures of strong, adventurous people" and an image whose caption reads simply "a man climbing a cliff" Cyc can notice if an annual salary and an hourly salary are inadvertently being added together in a spreadsheet Cyc can combine information from multiple databases to guess which physicians in practice together had been classmates in medical school When someone searches for "Bolivia" on the Web, Cyc knows not to offer a follow-up question like "Where can I get free Bolivia online?"

SLIDE 13IS 202 – FALL 2003 Cyc Applications Applications currently available or in development –Integration of Heterogeneous Databases –Knowledge-Enhanced Retrieval of Captioned Information –Guided Integration of Structured Terminology (GIST) –Distributed AI –WWW Information Retrieval Potential applications –Online brokering of goods and services –"Smart" interfaces –Intelligent character simulation for games –Enhanced virtual reality –Improved machine translation –Improved speech recognition –Sophisticated user modeling –Semantic data mining

SLIDE 14IS 202 – FALL 2003 Cyc’s Top-Level Ontology Fundamentals Top Level Time and Dates Types of Predicates Spatial Relations Quantities Mathematics Contexts Groups "Doing" Transformations Changes Of State Transfer Of Possession Movement Parts of Objects Composition of Substances Agents Organizations Actors Roles Professions Emotion Propositional Attitudes Social Biology Chemistry Physiology General Medicine Materials Waves Devices Construction Financial Food Clothing Weather Geography Transportation Information Perception Agreements Linguistic Terms Documentation

SLIDE 15IS 202 – FALL 2003 Lecture Overview Review Lexical Relations WordNet Demo Discussion Questions Action Items for Next Time Credit for some of the slides in this lecture goes to Marti Hearst and Warren Sack

SLIDE 16IS 202 – FALL 2003 Syntax The syntax of a language is to be understood as a set of rules which accounts for the distribution of word forms throughout the sentences of a language These rules codify permissible combinations of classes of word forms

SLIDE 17IS 202 – FALL 2003 Semantics Semantics is the study of linguistic meaning Two standard approaches to lexical semantics (cf., sentential semantics; and, logical semantics): –(1) compositional –(2) relational

SLIDE 18IS 202 – FALL 2003 Lexical Semantics: Compositional Approach Compositional lexical semantics, introduced by Katz & Fodor (1963), analyzes the meaning of a word in much the same way a sentence is analyzed into semantic components. The semantic components of a word are not themselves considered to be words, but are abstract elements (semantic atoms) postulated in order to describe word meanings (semantic molecules) and to explain the semantic relations between words. For example, the representation of bachelor might be ANIMATE and HUMAN and MALE and ADULT and NEVER MARRIED. The representation of man might be ANIMATE and HUMAN and MALE and ADULT; because all the semantic components of man are included in the semantic components of bachelor, it can be inferred that bachelor → man. In addition, there are implicational rules between semantic components, e.g. HUMAN → ANIMATE, which also look very much like meaning postulates. –George Miller, “On Knowing a Word,” 1999
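The inclusion test in the quote above can be sketched with ordinary sets. This is a minimal illustration, not Katz & Fodor's formalism: the component names come from the slide, while the `woman` entry and the function name `entails` are assumptions added for contrast.

```python
# Semantic components per word. "bachelor" and "man" follow the slide;
# "woman" is a hypothetical extra entry added only to show a failed inference.
COMPONENTS = {
    "bachelor": {"ANIMATE", "HUMAN", "MALE", "ADULT", "NEVER_MARRIED"},
    "man": {"ANIMATE", "HUMAN", "MALE", "ADULT"},
    "woman": {"ANIMATE", "HUMAN", "FEMALE", "ADULT"},
}


def entails(word_a, word_b):
    """Return True if every component of word_b is also a component of word_a."""
    return COMPONENTS[word_b] <= COMPONENTS[word_a]


print(entails("bachelor", "man"))    # True: man's components are a subset
print(entails("man", "bachelor"))    # False: NEVER_MARRIED is missing from man
print(entails("bachelor", "woman"))  # False: FEMALE is missing from bachelor
```

The inference runs in one direction only because set inclusion is not symmetric.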

SLIDE 19IS 202 – FALL 2003 Lexical Semantics: Relational Approach Relational lexical semantics was first introduced by Carnap (1956) in the form of meaning postulates, where each postulate stated a semantic relation between words. A meaning postulate might look something like dog → animal (if x is a dog then x is an animal) or, adding logical constants, bachelor → man and never married [if x is a bachelor then x is a man and not(x has married)] or tall → not short [if x is tall then not(x is short)]. The meaning of a word was given, roughly, by the set of all meaning postulates in which it occurs. –George Miller, “On Knowing a Word,” 1999
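One way to picture meaning postulates is as implication edges between words that can be chained. The sketch below is only an illustration under that assumption: the dog, bachelor, and tall postulates are from the quote, the `poodle` postulate is a hypothetical addition to show chaining, and `consequences` is an invented helper name.

```python
# Each key implies every term in its value set.
POSTULATES = {
    "dog": {"animal"},
    "bachelor": {"man", "never married"},
    "tall": {"not short"},
    "poodle": {"dog"},  # hypothetical postulate, not from the slide
}


def consequences(word):
    """Collect everything that follows from `word` by chaining postulates."""
    seen = set()
    stack = [word]
    while stack:
        current = stack.pop()
        for implied in POSTULATES.get(current, ()):
            if implied not in seen:
                seen.add(implied)
                stack.append(implied)
    return seen


print(sorted(consequences("poodle")))    # ['animal', 'dog']
print(sorted(consequences("bachelor")))  # ['man', 'never married']
```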

SLIDE 20IS 202 – FALL 2003 Pragmatics Deals with the relation between signs or linguistic expressions and their users Deixis (literally “pointing out”) –E.g., “I’ll be back in an hour” depends upon the time of the utterance Conversational implicature –A: “Can you tell me the time?” –B: “Well, the milkman has come.” [I don’t know exactly, but perhaps you can deduce it from some extra information I give you.] Presupposition –“Are you still such a bad driver?” Speech acts –Constatives vs. performatives –E.g., “I second the motion.” Conversational structure –E.g., turn-taking rules

SLIDE 21IS 202 – FALL 2003 Language Language only hints at meaning Most meaning of text lies within our minds and common understanding –“How much is that doggy in the window?” How much: social system of barter and trade (not the size of the dog) “doggy” implies childlike, plaintive, probably cannot do the purchasing on their own “in the window” implies behind a store window, not really inside a window, requires notion of window shopping

SLIDE 22IS 202 – FALL 2003 Semantics: The Meaning of Symbols Semantics versus Syntax –add(3,4) –3 + 4 –(different syntax, same meaning) Meaning versus Representation –What a person’s name is versus who they are A rose by any other name... –What the computer program “looks like” versus what it actually does
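The add(3,4) versus 3 + 4 contrast can be checked directly in Python, as a rough sketch: the standard `ast` module shows that the two expressions parse to different syntax trees, while evaluating them gives the same value. Binding the name `add` to `operator.add` is an assumption made here so that the prefix form can run.

```python
import ast
from operator import add

prefix = "add(3, 4)"
infix = "3 + 4"

# Syntax: compare the parsed tree structures (a function call vs. a binary operation).
same_syntax = ast.dump(ast.parse(prefix, mode="eval")) == ast.dump(
    ast.parse(infix, mode="eval")
)

# Semantics: compare the values the two expressions denote.
same_meaning = eval(prefix, {"add": add}) == eval(infix, {})

print(same_syntax)   # False
print(same_meaning)  # True
```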

SLIDE 23IS 202 – FALL 2003 Semantics Semantics: assigning meanings to symbols and expressions –Usually involves defining: Objects Properties of objects Relations between objects –More detailed versions include Events Time Places Measurements (quantities)

SLIDE 24IS 202 – FALL 2003 The Role of Context The concept associated with the symbol “21” means different things in different contexts –Examples? The question “Is there any salt?” –Asked of a waiter at a restaurant –Asked of an environmental scientist at work

SLIDE 25IS 202 – FALL 2003 What’s in a Sentence? “A sentence is not a verbal snapshot or movie of an event. In framing an utterance, you have to abstract away from everything you know, or can picture, about a situation, and present a schematic version which conveys the essentials. In terms of grammatical marking, there is not enough time in the speech situation for any language to allow for the marking of everything which could possibly be significant to the message.” Dan Slobin, in Language Acquisition: The state of the art, 1982

SLIDE 26IS 202 – FALL 2003 Lexical Relations Conceptual relations link concepts –Goal of Artificial Intelligence Lexical relations link words –Goal of Linguistics

SLIDE 27IS 202 – FALL 2003 Major Lexical Relations Synonymy Polysemy Metonymy Hyponymy/Hypernymy Meronymy/Holonymy Antonymy

SLIDE 28IS 202 – FALL 2003 Synonymy Different ways of expressing related concepts Examples –cat, feline, Siamese cat Overlaps with basic and subordinate levels Synonyms are almost never truly substitutable –Used in different contexts –Have different implications This is a point of contention

SLIDE 29IS 202 – FALL 2003 Polysemy Most words have more than one sense –Homonym: same sound and/or spelling, different meaning: bank (river) vs. bank (financial) –Polysemy: different senses of the same word: “That dog has floppy ears.” vs. “She has a good ear for jazz.” bank (financial) has several related senses –the building, the institution, the notion of where money is stored

SLIDE 30IS 202 – FALL 2003 Metonymy Use one aspect of something to stand for the whole –The building stands for the institution of the bank. –Newscast: “The White House released new figures today.” –Waitperson: “The ham sandwich spilled his drink.”

SLIDE 31IS 202 – FALL 2003 Hyponymy/Hyperonymy ISA relation Related to Superordinate and Subordinate level categories –hyponym(robin,bird) –hyponym(emu,bird) –hyponym(bird,animal) –hypernym(animal,bird) A is a hypernym of B if B is a type of A A is a hyponym of B if A is a type of B
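The ISA facts on this slide can be stored as a child-to-parent table and followed upward. This is a minimal sketch assuming each word has at most one direct hypernym, which real lexical databases do not guarantee; the function names are invented for illustration.

```python
# Direct hypernym of each word, taken from the slide's hyponym(...) facts.
HYPERNYM_OF = {"robin": "bird", "emu": "bird", "bird": "animal"}


def hypernym_chain(word):
    """Return the hypernyms of `word`, nearest first."""
    chain = []
    while word in HYPERNYM_OF:
        word = HYPERNYM_OF[word]
        chain.append(word)
    return chain


def is_hyponym(a, b):
    """True if a is a type of b, directly or through intermediate levels."""
    return b in hypernym_chain(a)


print(hypernym_chain("robin"))        # ['bird', 'animal']
print(is_hyponym("robin", "animal"))  # True
print(is_hyponym("animal", "robin"))  # False
```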

SLIDE 32IS 202 – FALL 2003 Basic-Level Categories (Review) Brown 1958, 1965, Berlin et al., 1972, 1973 Folk biology: –Unique beginner: plant, animal –Life form: tree, bush, flower –Generic name: pine, oak, maple, elm –Specific name: Ponderosa pine, white pine –Varietal name: Western Ponderosa pine No overlap between levels Level 3 is basic –Corresponds to genus –Folk biological categories correspond accurately to scientific biological categories only at the basic level

SLIDE 33IS 202 – FALL 2003 Psychologically Primary Levels SUPERORDINATE animal furniture BASIC LEVEL dog chair SUBORDINATE terrier rocker Children take longer to learn superordinate Superordinate not associated with mental images or motor actions

SLIDE 34IS 202 – FALL 2003 Meronymy/Holonymy Part/Whole relation –meronym(beak,bird) –meronym(bark,tree) –holonym(tree,bark) Transitive conceptually but not lexically –The knob is a part of the door. –The door is a part of the house. –? The knob is a part of the house ? Holonyms are (approximately) the inverse of meronyms
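The part/whole facts above can be held as pairs and indexed in both directions, which makes the "holonym is the inverse of meronym" point concrete. The sketch below deliberately returns only direct relations, so the knob is a part of the door but is not reported as a part of the house; the helper names are assumptions.

```python
from collections import defaultdict

# (part, whole) pairs from the slide.
MERONYM_PAIRS = [
    ("beak", "bird"),
    ("bark", "tree"),
    ("knob", "door"),
    ("door", "house"),
]

parts_index = defaultdict(set)   # whole -> its direct parts
wholes_index = defaultdict(set)  # part -> its direct wholes
for part, whole in MERONYM_PAIRS:
    parts_index[whole].add(part)
    wholes_index[part].add(whole)


def parts_of(whole):
    """Direct meronyms of `whole`."""
    return set(parts_index.get(whole, set()))


def holonyms_of(part):
    """Direct holonyms of `part` (the inverse lookup)."""
    return set(wholes_index.get(part, set()))


print(parts_of("tree"))     # {'bark'}
print(holonyms_of("knob"))  # {'door'}
```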

SLIDE 35IS 202 – FALL 2003 Antonymy Lexical opposites –antonym(large, small) –antonym(big, small) –antonym(big, little) –but not large, little Many antonymous relations can be reliably detected by looking for statistical correlations in large text collections. (Justeson & Katz 91)
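Because antonymy holds between particular word forms rather than between concepts, a plain lookup of unordered pairs is enough to reproduce the slide's examples. This is a sketch with an invented helper name, not a description of how any real lexicon stores antonyms.

```python
# Unordered antonym pairs listed on the slide.
ANTONYM_PAIRS = {
    frozenset(pair)
    for pair in [("large", "small"), ("big", "small"), ("big", "little")]
}


def are_antonyms(a, b):
    """True if the two word forms are recorded as lexical opposites."""
    return frozenset((a, b)) in ANTONYM_PAIRS


print(are_antonyms("small", "large"))   # True: order does not matter
print(are_antonyms("large", "little"))  # False: not a recorded pair
```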

SLIDE 36IS 202 – FALL 2003 Thesauri and Lexical Relations Polysemy: same word, different senses of meaning –Slightly different concepts expressed similarly Synonyms: different words, related senses of meanings –Different ways to express similar concepts Thesauri help draw all these together Thesauri also commonly define a set of relations between terms that is similar to lexical relations –BT, NT, RT More on Thesauri next week…

SLIDE 37IS 202 – FALL 2003 What is an Ontology? From Merriam-Webster’s Collegiate –A branch of metaphysics concerned with the nature and relations of being –A particular theory about the nature of being or the kinds of existence More prosaically –A carving up of the world’s meanings –Determine what things exist, but not how they interrelate Related terms –Taxonomy, dictionary, category structure Commonly used now in CS literature to describe structures that function as Thesauri

SLIDE 38IS 202 – FALL 2003 Lecture Overview Review Lexical Relations WordNet Demo Discussion Questions Action Items for Next Time Credit for some of the slides in this lecture goes to Marti Hearst and Warren Sack

SLIDE 39IS 202 – FALL 2003 WordNet Started in 1985 by George Miller, students, and colleagues at the Cognitive Science Laboratory, Princeton University –Miller also known as the author of the paper “The Magical Number Seven, Plus or Minus Two: Some Limits on our Capacity for Processing Information” (1956) Can be downloaded for free: –

SLIDE 40IS 202 – FALL 2003 Miller on WordNet “In terms of coverage, WordNet’s goals differ little from those of a good standard college-level dictionary, and the semantics of WordNet is based on the notion of word sense that lexicographers have traditionally used in writing dictionaries. It is in the organization of that information that WordNet aspires to innovation.” –(Miller, 1998, Chapter 1)

SLIDE 41IS 202 – FALL 2003 Presuppositions of WordNet Project Separability hypothesis –The lexical component of language can be separated and studied in its own right Patterning hypothesis –People have knowledge of the systematic patterns and relations between word meanings Comprehensiveness hypothesis –Computational linguistics programs need a store of lexical knowledge that is as extensive as that which people have

SLIDE 42IS 202 – FALL 2003 WordNet: Size [Table with columns POS, Unique Strings, Synsets and rows Noun, Verb, Adjective, Adverb, Totals; the counts are not preserved in this transcript] WordNet uses “synsets”: sets of synonymous terms

SLIDE 43IS 202 – FALL 2003 Structure of WordNet

SLIDE 44IS 202 – FALL 2003 Structure of WordNet

SLIDE 45IS 202 – FALL 2003 Structure of WordNet

SLIDE 46IS 202 – FALL 2003 Unique Beginners Entity, something –(anything having existence (living or nonliving)) Psychological_feature –(a feature of the mental life of a living organism) Abstraction –(a general concept formed by extracting common features from specific examples) State –(the way something is with respect to its main attributes; "the current state of knowledge"; "his state of health"; "in a weak financial state") Event –(something that happens at a given place and time)

SLIDE 47IS 202 – FALL 2003 Unique Beginners Act, human_action, human_activity –(something that people do or cause to happen) Group, grouping –(any number of entities (members) considered as a unit) Possession –(anything owned or possessed) Phenomenon –(any state or process known through the senses rather than by intuition or reasoning)

SLIDE 48IS 202 – FALL 2003 Lecture Overview Review Lexical Relations WordNet Demo Discussion Questions Action Items for Next Time Credit for some of the slides in this lecture goes to Marti Hearst and Warren Sack

SLIDE 49IS 202 – FALL 2003 WordNet Demo Available online (from Unix) if you wish to try it… –Login to irony and type “wn word” for any word you are interested in –Demo…

SLIDE 50IS 202 – FALL 2003 Lecture Overview Review Lexical Relations WordNet Demo Discussion Questions Action Items for Next Time Credit for some of the slides in this lecture goes to Marti Hearst and Warren Sack

SLIDE 51IS 202 – FALL 2003 Discussion Questions Joe Hall on Lexical Relations and WordNet –Which method of linguistic analysis do you think will be more fruitful... the painstaking process involved with building WordNet or the relatively easy output afforded by Church et al.'s computational method that, however, requires much work to decipher the results?

SLIDE 52IS 202 – FALL 2003 Discussion Questions Joe Hall on Lexical Relations and WordNet –What are the problems/advantages of using the World Wide Web itself as a "corpus"? (If you were to incorporate the current digital copies of all newspapers, journals, etc. wouldn't you very quickly exceed the 15 Million words of the largest corpus in the Church article?)

SLIDE 53IS 202 – FALL 2003 Discussion Questions Joe Hall on Lexical Relations and WordNet –With the diversity of dialects of the English language, how much does this type of computational analysis get confused by phrases such as "What up?" (i.e., slang)? Aren't these some of the more interesting parts of language (i.e., how language evolves)?

SLIDE 54IS 202 – FALL 2003 Lecture Overview Review Lexical Relations WordNet Demo Discussion Questions Action Items for Next Time Credit for some of the slides in this lecture goes to Marti Hearst and Warren Sack

SLIDE 55IS 202 – FALL 2003 Homework Read Chapters 3 and 5 of The Organization of Information (Textbook) Discussion Question volunteers? –Tu Tran –Hong Qu

SLIDE 56IS 202 – FALL 2003 Next Time Introduction to Metadata