
Slide 1 (2004.10.19, IS 202, Fall 2004): Lecture 15: Categorization
Prof. Ray Larson & Prof. Marc Davis, UC Berkeley SIMS
Tuesday and Thursday 10:30 am - 12:00 pm, Fall 2004
SIMS 202: Information Organization and Retrieval
Credits to Marti Hearst and Warren Sack for some of the slides in this lecture

Slide 2: Agenda
Information Organization Overview
Categorization
Discussion Questions
Action Items for Next Time

Slide 3: Agenda
Information Organization Overview
Categorization
Discussion Questions
Action Items for Next Time

Slide 4: Information Organization Overview
Tuesday, October 19, 2004: Categorization
Thursday, October 21, 2004: Knowledge Representation
Tuesday, October 26, 2004: Project Introduction
Thursday, October 28, 2004: Lexical Relations and WordNet
Tuesday, November 02, 2004: Semantic Web and RDF
Thursday, November 04, 2004: Controlled Vocabularies Introduction
Tuesday, November 09, 2004: Facetted Classification and Thesaurus Design and Construction
Thursday, November 11, 2004: No Class -- Veteran's Day

Slide 5: Information Organization Overview
Tuesday, November 16, 2004: Metadata Standards
Thursday, November 18, 2004: Multimedia Information Organization and Retrieval
Tuesday, November 23, 2004: Metadata for Motion Pictures: Media Streams and MPEG-7
Thursday, November 25, 2004: No Class -- Thanksgiving Day
Tuesday, November 30, 2004: Mobile and Context-Aware Multimedia Information Systems
Thursday, December 02, 2004: Project Presentations
Tuesday, December 07, 2004: Looking Backward Looking Forward: Future of Information Systems
Thursday, December 09, 2004: Final Review

Slide 6: Agenda
Information Organization Overview
Categorization
Discussion Questions
Action Items for Next Time

Slide 7: Categorization
Tuesday, October 19, 2004: Categorization
Thursday, October 21, 2004: Knowledge Representation
Tuesday, October 26, 2004: Project Introduction
Thursday, October 28, 2004: Lexical Relations and WordNet
Tuesday, November 02, 2004: Semantic Web and RDF
Thursday, November 04, 2004: Controlled Vocabularies Introduction
Tuesday, November 09, 2004: Facetted Classification and Thesaurus Design and Construction

Slide 8: Foucault on Borges
This passage quotes “a certain Chinese encyclopedia” in which it is written that ‘animals are divided into: (a) belonging to the Emperor, (b) embalmed, (c) tame, (d) suckling pigs, (e) sirens, (f) fabulous, (g) stray dogs, (h) included in the present classification, (i) frenzied, (j) innumerable, (k) drawn with a very fine camelhair brush, (l) et cetera, (m) having just broken the water pitcher, (n) that from a long way off look like flies.’
  – Michel Foucault, The Order of Things, 1970

Slide 9: Yahoo! Categorization

Slide 10: Yahoo! Categorization Detail

Slide 11: Why Study Categorization?
Categorization is central to how we organize information and the world
Categorization is a core cognitive process
In recent years, centuries-old views of categorization have been revised
Understanding how people categorize can help us design information systems that do a better job at organization and retrieval

Slide 12: Why Read Lakoff?
Very influential figure in recent thinking about human categorization, metaphor, and cognition
Provides summary of historical work and develops syncretic model of cognition and categorization
Clear explanations using examples
Professor at UC Berkeley (Department of Linguistics)

Slide 13: George Lakoff
Lakoff’s research covers many areas of Conceptual Analysis within Cognitive Linguistics
  – The nature of human conceptual systems, especially metaphor systems for concepts such as time, events, causation, emotions, morality, the self, politics, etc.
  – The development of Cognitive Social Science, which applies ideas of Cognitive Semantics to the Social Sciences
  – The implications of Cognitive Science for Philosophy, in collaboration with Mark Johnson, Chair of Philosophy at the University of Oregon
  – Neural foundations of conceptual systems and language, in collaboration with Jerome Feldman, of the International Computer Science Institute, seeking to develop biologically-motivated structured connectionist systems to model both the learning of conceptual systems and their neural representations
  – The cognitive structure, especially the metaphorical structure, of mathematics, in collaboration with Rafael Núñez

Slide 14: George Lakoff
Selected publications
  – Metaphors We Live By (with Mark Johnson). Univ. of Chicago Press, 1980.
  – Women, Fire, and Dangerous Things. University of Chicago Press, 1987.
  – More Than Cool Reason (with Mark Turner). Univ. of Chicago Press, 1989.
  – Moral Politics. University of Chicago Press, 1996.
  – Philosophy in The Flesh. Basic Books, 1999.
  – Where Mathematics Comes From: How the Embodied Mind Brings Mathematics into Being (with Rafael Núñez). Basic Books, 2000.
  – Moral Politics: How Liberals and Conservatives Think. Second Edition. University of Chicago Press, 2002.

Slide 15: Objectivist Views
Thought is mechanical manipulation of symbols
The mind is an abstract machine
Symbols get their meaning from correspondences to the external world
Symbols are internal representations
Abstract symbols stand in correspondence with the external world independent of the interpreting organism
The human mind is a mirror of nature
Human bodies play no role in characterizing concepts
Thought is abstract and disembodied
Exclusively symbolic machines are capable of thought
Thought can be broken down into simple “building blocks”
Thought is defined by mathematical logic

Slide 16: Experientialist Views
Thought is embodied
Thought is imaginative
Thought has gestalt properties
Thought utilizes basic-level categorization and basic-level primacy
Thought uses prototypes and family resemblances as organizing structures
Conceptual structure can be described using cognitive models that have the above properties
The theory of cognitive models incorporates what was right about the traditional view of categorization, meaning, and reason, while accounting for the empirical data on categorization and fitting the new view overall

Slide 17: Central Conceptual Issue
Do meaningful thought and reason concern merely the manipulations of abstract symbols and their correspondence to an objective reality, independent of any embodiment (except, perhaps, for limitations imposed by the organism)?
Do meaningful thought and reason essentially concern the nature of the organism doing the thinking, including the nature of its body, its interaction in its environment, its social character, and so on?

Slide 18: Categorization
Classical categorization
  – Necessary and sufficient conditions for membership
  – Generic-to-specific monohierarchical structure
Modern categorization
  – Characteristic features (family resemblances)
  – Centrality/typicality (prototypes)
  – Basic-level categories

Slide 19: Defining Category Membership
Necessary and sufficient conditions
  – Every condition must be met
  – No other conditions can be required
Example: A prime number
  – An integer divisible only by itself and 1. Source: Webster's Revised Unabridged Dictionary, © 1996, 1998 MICRA, Inc.
Example: mother
  – A woman who has given birth to a child.
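A classical category of this kind can be written as a single yes/no membership test. The sketch below is illustrative only and is not from the lecture; it assumes the usual mathematical convention that a prime is an integer greater than 1, which the dictionary wording on the slide leaves implicit.

```python
def is_prime(n: int) -> bool:
    """Classical membership test: True exactly when n is prime.

    Assumption (not stated on the slide): only integers greater than 1
    are candidates, following the usual mathematical convention.
    """
    if n < 2:
        return False
    divisor = 2
    # Trial division: any factor pair has one member <= sqrt(n).
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False  # divisible by something other than 1 and itself
        divisor += 1
    return True


# Membership is all-or-nothing: the test never returns "somewhat prime".
small_primes = [n for n in range(18) if is_prime(n)]
```

The function returns only True or False, which is what "necessary and sufficient conditions" amounts to in code: there are no borderline members.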

Slide 20: Defining Category Membership
Necessary and sufficient conditions for Mother?
  – mother(A,B) -> female(A), gave-birth-to(A,B), same-species(A,B)
What about
  – Birth mother vs. adoptive mother
  – Surrogate mother
  – Transgenic mother
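The rule on this slide can be sketched as a predicate to see where it breaks down. This is a minimal illustration, not part of the lecture: the `Individual` record, its field names, and the sample people are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Individual:
    # Hypothetical record type, invented for this illustration.
    name: str
    sex: str
    species: str
    gave_birth_to: set = field(default_factory=set)  # names of offspring
    raised: set = field(default_factory=set)         # names of children raised


def is_mother(a: Individual, b: Individual) -> bool:
    """mother(A,B) -> female(A), gave-birth-to(A,B), same-species(A,B)."""
    return (
        a.sex == "female"
        and b.name in a.gave_birth_to
        and a.species == b.species
    )


child = Individual("Sam", "male", "human")
birth_mother = Individual("Ann", "female", "human", gave_birth_to={"Sam"})
adoptive_mother = Individual("Bea", "female", "human", raised={"Sam"})

birth_result = is_mother(birth_mother, child)        # meets every condition
adoptive_result = is_mother(adoptive_mother, child)  # fails gave-birth-to
```

Under the strict rule, the adoptive mother is simply not a mother, which is the difficulty the slide's "What about" list points to.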

Slide 21: Can Category Membership Be Defined?
What are the necessary and sufficient conditions for something to be a game?
Famous example by Wittgenstein
  – Classic categories assume clear boundaries defined by common properties (necessary and sufficient conditions)
How do we categorize games?

Slide 22: Definition of Game
Counterexample: “Game”
  – No common properties shared by all games
    Card games, ball games, Olympic games, children’s games
  – Competition? Not in ring-around-the-rosy
  – Skill? Not in dice games
  – Luck? Not in chess
  – No fixed boundary to category
    Can be extended to new games (e.g., video games)
Alternative notion of category membership
  – Concepts related by family resemblances

Slide 23: Properties of Categorization
Family resemblance
  – Members of a category may be related to one another without all members having any property in common
    Instead, they may share a large subset of traits
    Some attributes are more likely given that others have been seen
  – Example: feathers, wings, twittering, ...
    Likely to be a bird, but not all features apply to “emu”
    Unlikely to see an association with “barks”
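Family resemblance can be made concrete with sets of features. The following is a toy sketch, not data from the lecture or from Wittgenstein: the games and their feature sets are invented so that no feature is shared by every member, yet every member overlaps with at least one other.

```python
# Hypothetical feature sets, chosen only to illustrate the idea.
GAMES = {
    "chess": {"competition", "skill"},
    "dice game": {"competition", "luck", "amusement"},
    "solitaire": {"skill", "luck", "amusement"},
    "ring-around-the-rosy": {"amusement", "group play"},
}

# Classical view: look for properties common to ALL members.
common_to_all = set.intersection(*GAMES.values())


def has_family_resemblance(category: dict) -> bool:
    """True if every member shares at least one feature with another member."""
    for name, features in category.items():
        others = [f for other, f in category.items() if other != name]
        if not any(features & other_features for other_features in others):
            return False
    return True


resembles = has_family_resemblance(GAMES)
```

Here `common_to_all` is empty, so no necessary-and-sufficient definition can be read off the features, while `resembles` is true because the members are linked by overlapping subsets.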

Slide 24: Properties of Categorization
Example: Prime numbers
  – Definition: An integer divisible only by itself and 1
  – Examples: 2, 3, 5, 7, 11, 13, 17, …
A very clear-cut category. Or is it?
  – Can one number be “more prime” than another?
Centrality
  – Some members of a category may be “better examples” than others, i.e., “prototypical” members
    Example: robins vs. chickens vs. emus

Slide 25: Properties of Categorization
Characteristic features
  – Perceived degree of category membership has to do with which features help define the category
  – Members usually do not have ALL the necessary features, but have some subset
  – Those members that have more of the central features are seen as more central members
  – People have conceptions of typical members

Slide 26: Testing for Centrality/Typicality
Ask a series of questions, compare how long it takes people to answer
  – True or false:
    An apple is a fruit
    A plum is a fruit
    A coconut is a fruit
    An olive is a fruit
    A tomato is a fruit
Rosch and Mervis
  – The more features a fruit shares with the other fruits, the more typical a member of the class it is
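The idea that typicality grows with shared features can be sketched as a simple score: for each feature a member has, count how many other members also have it, and add the counts up. This is a simplified illustration in the spirit of the slide, not Rosch and Mervis's actual procedure or data; the feature lists are invented.

```python
# Hypothetical feature lists, invented for illustration only.
FRUITS = {
    "apple": {"sweet", "grows_on_tree", "eaten_raw", "juicy"},
    "plum": {"sweet", "grows_on_tree", "eaten_raw", "juicy"},
    "coconut": {"sweet", "grows_on_tree", "hard_shell"},
    "olive": {"grows_on_tree", "savory"},
    "tomato": {"juicy", "savory", "eaten_raw"},
}


def typicality(member: str, category: dict) -> int:
    """Sum, over the member's features, of how many OTHER members share each."""
    score = 0
    for feature in category[member]:
        score += sum(
            1
            for other, features in category.items()
            if other != member and feature in features
        )
    return score


scores = {name: typicality(name, FRUITS) for name in FRUITS}
# Highest score first: the most "prototypical" members under this toy measure.
ranking = sorted(scores, key=scores.get, reverse=True)
```

With these made-up features, apple and plum score highest and olive lowest, which mirrors the intuition behind the reaction-time questions, though real results depend entirely on the features people actually list.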

Slide 27: Characteristic Features
Is a cat on a mat a cat?
Is a dead cat a cat?
Is a photo of a cat a cat?
Is a cat with three legs a cat?
Is a cat that barks a cat?
Is a cat with a dog’s brain a cat?
Is a cat with every cell replaced by a dog’s cells a cat?

Slide 28: Properties of Categorization
Basic-level categories
  – Categories are organized into a hierarchy from the most general to the most specific, but the level that is most cognitively basic is “in the middle” of the hierarchy
Basic-level primacy
  – Basic-level categories are functionally primary with respect to factors including ease of cognitive processing (learning, reasoning, recognition, etc.)

Slide 29: Basic-Level Categories
Brown 1958, 1965; Berlin et al., 1972, 1973
Folk biology:
  – Unique beginner: plant, animal
  – Life form: tree, bush, flower
  – Generic name: pine, oak, maple, elm
  – Specific name: Ponderosa pine, white pine
  – Varietal name: Western Ponderosa pine
No overlap between levels
Level 3 is basic
  – Corresponds to genus
  – Folk biological categories correspond accurately to scientific biological categories only at the basic level

Slide 30: Psychologically Primary Levels
SUPERORDINATE: animal, furniture
BASIC LEVEL: dog, chair
SUBORDINATE: terrier, rocker
Children take longer to learn superordinate categories above the basic level
Superordinate categories above the basic level are not associated with mental images or motor actions

Slide 31: Basic-Level Categorization
Perception
  – Overall perceived shape
  – Single mental image
  – Fast identification
Function
  – General motor program
Communication
  – Shortest, most commonly used and contextually neutral words
  – First learned by children
Knowledge Organization
  – Most attributes of category members stored at this level

Slide 32: Middle-Out Categorization
Top down
  – Object
  – Writing implement
  – Pen
Bottom up
  – Sanford Uniball Black Pen
  – Ink Pen
  – Pen
Middle out
  – Writing implement
  – Pen
  – Ink Pen
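The three directions on this slide can be pictured as walks over one small hierarchy. The sketch below is an assumption-laden illustration: it takes the slide's terms and arranges them in a single chain (object, writing implement, pen, ink pen, Sanford Uniball Black Pen), which the slide suggests but does not state outright. The function names are invented.

```python
# Child -> parent links; the chain itself is an assumption drawn from the slide's terms.
PARENT = {
    "Sanford Uniball Black Pen": "ink pen",
    "ink pen": "pen",
    "pen": "writing implement",
    "writing implement": "object",
}


def superordinates(term: str) -> list:
    """Walk upward from a term toward the most general category."""
    chain = []
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain


def subordinates(term: str) -> list:
    """Walk downward from a term toward more specific categories."""
    found = []
    for child, parent in PARENT.items():
        if parent == term:
            found.append(child)
            found.extend(subordinates(child))
    return found


def middle_out(basic_term: str) -> tuple:
    """Start at a basic-level term and expand in both directions."""
    return superordinates(basic_term), subordinates(basic_term)


up, down = middle_out("pen")
```

Starting from "pen", `up` lists the more general categories and `down` the more specific ones, so a middle-out design begins with the term people reach for first and fills in the rest around it.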

Slide 33: Summary
Processes of categorization underlie many of the issues having to do with information organization
Categorization is messier than our computer systems would like
Human categories have graded membership, consisting of family resemblances
  – Family resemblance is expressed in part by which subset of features is shared
  – It is also determined by underlying understandings of the world that do not get represented in most systems
Basic-level categories, as well as subordinate and superordinate categories, seem to be cognitively real and therefore important in the design of information organization and retrieval systems

Slide 34: Agenda
Information Organization Overview
Categorization
Discussion Questions
Action Items for Next Time

Slide 35: Discussion Questions (Lakoff)
Sarita Yardi on Lakoff
  – Doesn’t Lakoff’s prototype theory completely debunk IR (as we have learned it so far in this course)? The success of IR relies on its ability to categorize queries and documents such that they can be best matched for the user’s needs. However, if categorization is necessarily dependent on prototypes, embodiment, and human thought, then isn’t it impossible to develop an IR system that can be universally applicable?
  – I think Boolean matching might avoid the problem because it fits into the abstract and mathematic classic theory. All the other types (vector, probabilistic, relevance feedback, etc.) would require custom design after the effects of human reason had been factored in to each individual usage.
  – Do you agree? Disagree? Couldn’t care less? Agree but nothing we can do about it? Feel that we should never ever try to apply theories to practical applications?

Slide 36: Discussion Questions (Lakoff)
Sarita Yardi on Lakoff
  – Categorization plays an important role on the web. It sure would be nice if we had some super power index with categorized links to everything we could ever want to know.
  – Is there any hope for categorizing anything on the web at all or will increasing entropy (randomness and disorder) forever dominate the way we manage information on the web? Does the prototype theory give us more or less hope for the role of categorization on the web? (some random subjects to spur thought… blogs, personal home pages, university websites, google versus yahoo, media)

Slide 37: Agenda
Information Organization Overview
Categorization
Discussion Questions
Action Items for Next Time

Slide 38: Next Time
Knowledge Representation

Slide 39: George Furnas Lecture
Towards Framing the Convergence
  – Wednesday, October 20, 2004, 4:00 - 5:30 pm
  – 202 South Hall
Abstract
  – The past several years have seen an increased convergence in disciplines working towards the goal of bringing people, information and technology together in more valuable ways. The goal has engaged participants ranging from Computer Science to Library Science, from Organizational Theory to Public Policy, and from Economics to Sociology, among many others. At the University of Michigan's School of Information we have been working for several years in an interdisciplinary effort, not just to participate in this convergence, but to understand it: Why is this broad suite of disciplines needed? What intellectual frameworks might we use for trying to understand how they fit together? How might we use such frameworks to leverage the disparate contributions better? This talk will describe those efforts and one take on some emerging results.

Slide 40: Homework (!)
Course Reader
  – “The Vocabulary Problem in Human-System Communication” (G. W. Furnas, T. K. Landauer, L. M. Gomez, S. T. Dumais) (Steve)
  – “CYC: A Large-Scale Investment in Knowledge Infrastructure” (D. B. Lenat) (Rupa)
  – “Commonsense-Based Interfaces” (M. Minsky) (Andrew)
  – Lakoff redux (Morgan)

