Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions.

1 Conceptual Spaces P.D. Bruza Information Ecology Project Distributed Systems Technology Centre Part 1: Fundamental notions

2 Opening remarks This tutorial is more about cognitive science than IR, is fragmented, and offers a somewhat personal interpretation. The content is drawn mostly from Gärdenfors, Conceptual Spaces: The Geometry of Thought, MIT Press, 2000. It is also driven by some personal intuition: – The model theory for IR should be rooted in cognitive semantics – How do you capture these cognitive semantics in a computational form, and what can you do with them?

3 Gärdenfors' point of departure How can representations (information) in a cognitive system be modelled in an appropriate way? – Symbolic perspective: representation via symbols; a cognitive system is described by a Turing machine (cognition = computation = symbol manipulation) – Associationist perspective: representation via associations between different kinds of information elements (e.g. connectionism, where associations are modelled by artificial neural networks)

4 The problem with the symbolic and associationist perspectives Mechanisms of concept acquisition, which are paramount for the understanding of many cognitive phenomena, cannot be given a satisfactory treatment in any of these representational forms. – Concept acquisition (learning) is closely tied to similarity – Geometric representation: similarity can be modelled in a natural way

5 Gärdenfors' cognitive model Three levels of representation: – Symbolic: propositional representation – Conceptual: geometric representation – Associationist (sub-conceptual): connectionist representation

6 Conceptual spaces outline – Quality dimension – Domain – Concept – Property Conceptual spaces are a framework for a number of empirical theories: concept formation, induction, semantics (Context) How can conceptual spaces be realized (e.g., for IR)?

7 Quality dimensions Represent various qualities of an object: – Temperature – Weight – Brightness – Pitch – Height – Width – Depth A distinction is made between scientific and phenomenal (psychological) dimensions

8 Quality dimensions (cont) Each quality dimension is endowed with certain geometrical structures (in some cases topological or ordering relations) Weight: isomorphic to the non-negative reals (a half-line starting at 0)

9 Quality dimensions may have a discrete geometric structure Discrete structure divides objects into disjoint classes Kinship relation: father, mother, sister, etc. (geometric structure = discrete points) Even for discrete dimensions we can distinguish a rudimentary geometric structure

10 Phenomenal vs. scientific interpretations of dimensions Phenomenal interpretation: dimensions originate from cognitive structures (perception, memories) of humans or other organisms – E.g. (height, width, depth), hue, pitch Scientific interpretation: dimensions are treated as part of a scientific theory – E.g., weight

11 Example: colour Hue: the particular shade of colour – Geometric structure: circle – Value: polar coordinate Chromaticity: the saturation of the colour, from grey to higher intensities – Geometric structure: segment of reals – Value: real number Brightness: black to white – Geometric structure: reals in [0,1] – Value: real number

12 Example: colour (hue, chromaticity, brightness) N.B. the geometric structure allows phenomenologically complementary and opposite hues to be distinguished
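
The circular structure of the hue dimension can be sketched in code. This is a minimal illustration, assuming hue is encoded as an angle in degrees (the slides only say the value is a polar coordinate); the function names are hypothetical.

```python
def hue_distance(a_deg, b_deg):
    """Shortest angular distance between two hues on the colour circle."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def opposite_hue(a_deg):
    """Hue on the far side of the circle from a_deg."""
    return (a_deg + 180.0) % 360.0


# The distance wraps around 0 degrees rather than spanning the whole circle.
print(hue_distance(350.0, 10.0))
print(opposite_hue(270.0))
```

On a linear scale, 350 and 10 would be far apart; on the circle they are neighbours, and every hue has a well-defined opposite.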

13 Integral and separable dimensions Dimensions are integral if an object cannot be assigned a value in one dimension without giving it a value in another: – E.g., one cannot distinguish hue without brightness, or pitch without loudness Dimensions that are not integral are said to be separable Psychologically, integral and separable dimensions are assumed to differ in cross-dimensional similarity: – Integral dimensions are higher in cross-dimensional similarity than separable dimensions – (This point will motivate how similarities in the conceptual space are calculated depending on whether dimensions are integral or separable. N.B. IR matching functions treat all dimensions equally)

14 Where do dimensions originate from? Scientific dimensions: tightly connected to the measurement methods used Psychological dimensions: – Some dimensions appear innate, or are developed very early; e.g. inside/outside, dangerous/not-dangerous. (These appear to be pre-conscious) – Dimensions are necessary for learning, to make sense of "blooming, buzzing confusion". Dimensions are added by the learning process to expand the conceptual space: – E.g., young children have difficulty in identifying whether two objects differ w.r.t. brightness or size, even though they can see the objects differ in some way. Both differentiation and dimensionalization occur throughout one's lifetime.

15 In summary, quality dimensions are the building blocks of representations within a conceptual space Gärdenfors' rebuttal of logical positivism: – Humans and other animals can represent the qualities of objects, for example, when planning an action, without presuming an internal language or another symbolic system in which these qualities are expressed. As a consequence, I claim that the quality dimensions of conceptual spaces are independent of symbolic representations and more fundamental than these

16 Conceptual spaces outline – Quality dimension – Domain – Concept – Property Conceptual spaces are a framework for a number of empirical theories: concept formation, induction, semantics (Context) How can conceptual spaces be realized (e.g., for IR)?

17 Domains and conceptual space A domain is a set of integral dimensions that form a separable subspace (e.g., hue, chromaticity, brightness) A conceptual space is a collection of one or more domains – Cognitive structure is defined in terms of domains, as it is assumed that an object can be ascribed certain properties independently of other properties Not all domains are assumed to be metric: a domain may be an ordering with no distance defined Domains are not independent, but may be correlated, e.g., the ripeness and colour domains co-vary in the space of fruits

18 Conceptual spaces outline – Quality dimension – Domain – Concept – Property Conceptual spaces are a framework for a number of empirical theories: concept formation, induction, semantics (Context) How can conceptual spaces be realized (e.g., for IR)?

19 Properties and concepts: general idea A property is a region in a subspace (domain) A concept is based on several separable subspaces

20 Example property: red (Figure: the region for red in the colour domain, with axes hue, chromaticity and brightness.) Criterion P: A natural property is a convex region of a domain (subspace) "Natural" properties are those that are natural for the purposes of problem solving, planning, communicating, etc.

21 Motivation for convex regions (Figure: two regions, one convex and one not convex, each containing points x and y.) x and y are points (objects) in the conceptual space If x and y both have property P, then any object between x and y is assumed to have property P
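
The convexity requirement can be illustrated with a small check. This is a sketch under stated assumptions: regions are given as membership predicates, "between" is taken in the Euclidean sense (points on the straight segment), and the check only samples the segment, so it can refute convexity for a pair of points but not prove it in general. The two example regions are hypothetical.

```python
def point_between(x, y, t):
    """Point at fraction t along the straight segment from x to y."""
    return tuple(a + t * (b - a) for a, b in zip(x, y))


def segment_stays_inside(region, x, y, steps=10):
    """True if every sampled point between x and y lies in the region."""
    if not (region(x) and region(y)):
        raise ValueError("both endpoints must have the property")
    return all(region(point_between(x, y, k / steps)) for k in range(steps + 1))


def in_box(p):
    """A convex region: the unit square."""
    return 0.0 <= p[0] <= 1.0 and 0.0 <= p[1] <= 1.0


def in_ring(p):
    """A non-convex region: a ring with a hole in the middle."""
    r2 = p[0] ** 2 + p[1] ** 2
    return 0.25 <= r2 <= 1.0


print(segment_stays_inside(in_box, (0.0, 0.0), (1.0, 1.0)))
print(segment_stays_inside(in_ring, (-1.0, 0.0), (1.0, 0.0)))
```

In the ring, two points on opposite sides both have the property, but the point midway between them falls in the hole, so the region would not qualify as a natural property under Criterion P.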

22 Remarks about Criterion P Criterion P: A natural property is a convex region of a domain (subspace) Assumption: Most properties expressed by simple words in natural languages can be analyzed as natural properties The semantics of the linguistic constituents (e.g. "red") is severely constrained by the underlying conceptual space (i.e., no "bleen") Criterion P provides an account of properties that is independent of both possible worlds and objects Strong connection between convex regions and prototype theory (categorization) (Easier to understand how inductive inferences are made)

23 Example concept: apple (Figure: table defining "apple"; contents not preserved in the transcript.) Criterion C: A natural concept is represented as a set of regions in a number of domains together with an assignment of salience weights to the domains and information about how the regions in the different domains are correlated

24 Concepts and inference (in passing) The salience of different domains determines which associations can be made, and which inferences can be triggered – Context: moving a piano leads to the association "heavy" More about this next time...

25 How to model relevance: concept? – Topicality: about my topic – Novelty: unique or the only source; familiar – Currency: up-to-date – Quality: well written, credible – Presentation: comprehensive – Source aspects: prominent author – Info aspects: theoretical paper – Appeal: enjoyable Table from Yuan, Belkin and Kim, ACM SIGIR 2002 Poster

26 How to model a document(s): ? An exosomatic memory is a computerized system that operates as an extension to human memory. Ideally, use of an exosomatic system would be transparent, so that finding information would seem the same as remembering it to the human user (B.C. Brookes, 1975) – To create computerized representations of data sets that are consistent with human perception of the data sets – To enable personalized relations to representations of data sets – To provide natural interfaces for interaction with exosomatic memory Newby, G. Cognitive space and information space. JASIST 52(12), 2001

27 Term = dimension Since many of the fundamental quality dimensions are determined by our perceptual mechanisms, there is a direct link between properties described by regions of such dimensions and perceptions (rats!) However, dimensional spaces based on terms have shown marked correlation with human information processing: – HAL and note (It is difficult to know how to encode abstract concepts with traditional semantic features. Global co-occurrence models, such as HAL, may provide a solution to part of this problem) – So, using terms as dimensions in a global co-occurrence model leads to useful vector representations of abstract concepts – HAL's results seem to be echoed by Newby using Principal Component Analysis on a term-term co-occurrence matrix
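
To make "term = dimension" concrete, here is a minimal sketch of a HAL-style global co-occurrence count. It is a simplification under stated assumptions, not the published HAL implementation: only preceding words are counted, the weight falls off linearly with distance inside the window, and the window size and toy sentence are arbitrary. The function name is hypothetical.

```python
from collections import defaultdict


def cooccurrence_vectors(tokens, window=2):
    """Map each word to weighted counts of the words that precede it.

    A neighbour at distance 1 gets weight `window`, a neighbour at
    distance `window` gets weight 1 (closer words count more).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(tokens):
        for offset in range(1, window + 1):
            j = i - offset
            if j < 0:
                break
            counts[word][tokens[j]] += window - offset + 1
    return counts


vectors = cooccurrence_vectors("the cat sat on the mat".split(), window=2)
# Each inner dict is a sparse vector whose dimensions are terms.
print(dict(vectors["sat"]))
```

Each word ends up as a point in a space whose dimensions are the vocabulary terms, which is the sense in which terms act as dimensions on this slide.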

28 Text fragment = dimension For example, a (term x document) matrix Latent semantic analysis produces vector representations of words in a reduced dimensional space: – LSA correlates with human information processing on a number of tasks, e.g., semantic priming – Landauer et al. often use short fragments (dimension = 1 or 2 sentences) Dimensional reduction is apparently successful in reproducing cognitive compatibility, but the reason for this is unknown Determining the appropriate dimensional structure for IR models is still an open question, especially in light of cognitive aspects

29 Similarity: introductory remarks Similarity is central to many aspects of cognition: concept formation (learning), memory and perceptual organization Similarity is not an absolute notion but relative to a particular domain (or dimension) – An apple and an orange are similar as they have the same shape – Similarity defined in terms of the number of shared properties leads to arbitrary similarity: a writing desk is like a raven Similarity is an exponentially decreasing function of distance N.B. clustering in IR often uses an absolute notion of similarity

30 Metric spaces A real-valued function d(x,y) is said to be a distance function for space S if it satisfies the following conditions for all points x, y and z in S: – $d(x,y) \ge 0$, and $d(x,y) = 0$ if and only if $x = y$ (minimality) – $d(x,y) = d(y,x)$ (symmetry) – $d(x,y) + d(y,z) \ge d(x,z)$ (triangle inequality) A space that has a distance function is called a metric space (There is debate about whether distance is symmetric from a psychological viewpoint. E.g., Tversky et al.: Tel Aviv is judged more similar to New York than vice versa. Gärdenfors accepts the symmetry axiom)

31 Equi-distance under the Euclidean metric The set of points at distance d from a point x forms a circle Points between x and y are on a straight line

32 Equi-distance under the city-block metric The set of points at distance d from a point x forms a diamond The set of points between x and y is a rectangle generated by x and y and the directions of the axes

33 Between-ness in the city-block metric (Figure: rectangle generated by x and y.) All points in the rectangle are considered to be between x and y
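
The difference in between-ness under the two metrics can be checked numerically. The sketch below assumes the usual metric definition of between-ness, namely that b is between x and y when d(x,b) + d(b,y) = d(x,y); the function names and sample points are illustrative.

```python
import math


def city_block(x, y):
    """Sum of absolute differences along each axis."""
    return sum(abs(a - b) for a, b in zip(x, y))


def euclidean(x, y):
    """Straight-line distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def is_between(x, b, y, dist, tol=1e-9):
    """b is between x and y if passing through b adds no extra distance."""
    return abs(dist(x, b) + dist(b, y) - dist(x, y)) <= tol


x, y = (0.0, 0.0), (4.0, 2.0)
corner = (1.0, 2.0)    # inside the rectangle spanned by x and y, off the line
midpoint = (2.0, 1.0)  # on the straight line from x to y

print(is_between(x, corner, y, city_block))
print(is_between(x, corner, y, euclidean))
print(is_between(x, midpoint, y, euclidean))
```

Any point inside the axis-aligned rectangle spanned by x and y passes the city-block test, whereas only points on the straight segment pass the Euclidean one.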

34 Metrics: integral and separable dimensions For separable dimensions, calculate the distance using the city-block metric: – If two dimensions are separable, the dissimilarity of two stimuli is obtained by adding the dissimilarity along each of the two dimensions For integral dimensions, calculate distance using the Euclidean metric: – When two dimensions are integral, the dissimilarity is determined by both dimensions taken together
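
One way to read this rule computationally is to use the Euclidean metric inside each domain (integral dimensions) and add the per-domain results across domains (separable dimensions). The sketch below is an interpretation, not something the slides spell out; the domain names, coordinates and function names are hypothetical.

```python
import math


def euclidean(x, y):
    """Distance within one domain, where dimensions are integral."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def object_distance(obj_a, obj_b):
    """Add per-domain distances, treating domains as separable.

    Each object maps a domain name to its coordinates in that domain.
    """
    if obj_a.keys() != obj_b.keys():
        raise ValueError("objects must be described in the same domains")
    return sum(euclidean(obj_a[d], obj_b[d]) for d in obj_a)


# Hypothetical objects with a 3-D colour domain and a 1-D weight domain.
a = {"colour": (0.0, 0.0, 0.0), "weight": (1.0,)}
b = {"colour": (3.0, 4.0, 0.0), "weight": (3.0,)}
print(object_distance(a, b))
```

Here the colour difference is measured as one joint quantity, while the weight difference is simply added on top.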

35 Minkowski metrics Euclidean and city-block are special cases of Minkowski metrics: $d(x,y) = \left(\sum_i |x_i - y_i|^r\right)^{1/r}$ City-block: r = 1 Euclidean: r = 2
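
The Minkowski family, in its standard form d(x,y) = (sum over i of |x_i - y_i|^r)^(1/r), can be written as a single function with r as a parameter. A minimal sketch with an illustrative pair of points:

```python
def minkowski(x, y, r):
    """Minkowski distance of order r (r = 1 city-block, r = 2 Euclidean)."""
    if r < 1:
        raise ValueError("r must be at least 1 for this to be a metric")
    return sum(abs(a - b) ** r for a, b in zip(x, y)) ** (1.0 / r)


x, y = (0.0, 0.0), (3.0, 4.0)
print(minkowski(x, y, 1))  # city-block: adds the two axis differences
print(minkowski(x, y, 2))  # Euclidean: straight-line distance
```

For the same pair of points the city-block distance is never smaller than the Euclidean one, which is why the choice of r matters when distances are turned into similarity judgements.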

36 Scaling dimensions Due to context, the scales of the different dimensions cannot be assumed identical, so each dimension is given a dimensional scaling factor
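
A common way to express dimensional scaling is to multiply each dimension's contribution by a weight before combining them. The sketch below assumes that weighted Minkowski form; the slide's own equation did not survive in the transcript, so the exact formula, the weights and the sample points are assumptions.

```python
def weighted_minkowski(x, y, weights, r=2):
    """Minkowski distance with a per-dimension scaling factor."""
    total = sum(w * abs(a - b) ** r for w, a, b in zip(weights, x, y))
    return total ** (1.0 / r)


origin = (0.0, 0.0)
p = (1.0, 0.0)  # differs from origin on the first dimension only
q = (0.0, 2.0)  # differs from origin on the second dimension only

neutral = (1.0, 1.0)
first_dimension_salient = (9.0, 0.04)

# With equal weights p is the nearer neighbour of origin ...
print(weighted_minkowski(origin, p, neutral), weighted_minkowski(origin, q, neutral))
# ... but stressing the first dimension makes q the nearer one.
print(weighted_minkowski(origin, p, first_dimension_salient),
      weighted_minkowski(origin, q, first_dimension_salient))
```

Changing the weights models a change of context: the same two objects can swap places in the similarity ordering without any change to their coordinates.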

37 Similarity as a function of distance A common assumption in psychological literature is that similarity is an exponentially decaying function of distance: $s(x,y) = e^{-c \cdot d(x,y)}$ The constant c is a sensitivity parameter. The similarity between x and y drops quickly when the distance between the objects is relatively small, while it drops more slowly when the distance is relatively large. The formula captures the similarity-based generalization performances of human subjects in a variety of settings
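
The exponential decay of similarity with distance, s = exp(-c * d), is easy to tabulate. A minimal sketch, with c = 1 chosen arbitrarily for illustration:

```python
import math


def similarity(distance, c=1.0):
    """Similarity as an exponentially decaying function of distance."""
    return math.exp(-c * distance)


for d in range(5):
    print(d, round(similarity(d), 4))

# The same step in distance costs more similarity near 0 than far away.
near_drop = similarity(0.0) - similarity(1.0)
far_drop = similarity(3.0) - similarity(4.0)
print(near_drop > far_drop)
```

A larger c makes the function more sensitive, so that similarity falls away from 1 more sharply as distance grows.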

38 IR-related comments on similarity In the vector-space model, similarity is determined by the cosine function, which is not exponentially decaying IR models don't distinguish between integral and separable dimensions, even though this distinction is significant from a cognitive point of view Experience so far with computational cognitive models is mixed: – LSA uses cosine similarity (not exponentially decaying)!! – HAL used Minkowski (r = 1) to measure semantic distance, i.e., a non-Euclidean distance metric was employed – (Non-Euclidean metrics should perhaps be explored)
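
The contrast between cosine similarity and distance-based similarity shows up clearly on vectors that point the same way but differ in length. The sketch below is illustrative only: the vectors are made up, and the distance-based measure uses the Euclidean metric with sensitivity c = 1 as an assumption.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two vectors (ignores their lengths)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def exp_similarity(x, y, c=1.0):
    """Exponentially decaying similarity over Euclidean distance."""
    return math.exp(-c * math.dist(x, y))


short_vec = (1.0, 1.0)
long_vec = (10.0, 10.0)

print(cosine_similarity(short_vec, long_vec))  # same direction
print(exp_similarity(short_vec, long_vec))     # far apart in the space
```

Cosine treats the two vectors as essentially identical, while the distance-based measure treats them as very dissimilar, so the two notions can disagree strongly about which items are alike.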

39 Prototypes and categorical perception: introductory remarks Human subjects judge a robin as a more prototypical bird than a penguin Classifying an object is accomplished by determining its similarity to the prototype: – Similarity is judged w.r.t a reference object/region – Similarity is context-sensitive: a robin is a prototypical bird, but a canary is a prototypical pet bird Continuous perception: membership to a category is graded

40 Prototype regions in animal space (Figure labels: reptile, mammal, bird; bat, platypus, penguin, robin, emu, archaeopteryx. Based on Gärdenfors & Williams, IJCAI 2001.) Categorical perception: stimuli between categories are distinguished with more ease and accuracy than those within them

41 Computing categories in conceptual space: Voronoi tessellations Given a set of prototypes, require that any point q be in the same category as its most similar prototype. Consequence: partitioning of the space into convex regions
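
The nearest-prototype rule behind a Voronoi tessellation fits in a few lines. This sketch assumes the Euclidean metric, under which "most similar" means "nearest"; the prototype labels follow the animal example, but the 2-D coordinates are invented purely for illustration.

```python
import math


def categorize(q, prototypes):
    """Return the label of the prototype nearest to point q."""
    return min(prototypes, key=lambda label: math.dist(q, prototypes[label]))


# Hypothetical prototype positions in an abstract two-dimensional space.
PROTOTYPES = {
    "bird": (0.0, 1.0),
    "mammal": (1.0, 0.0),
    "reptile": (-1.0, 0.0),
}

print(categorize((0.2, 0.7), PROTOTYPES))
print(categorize((0.8, 0.1), PROTOTYPES))
```

The set of points assigned to each label is that prototype's Voronoi cell, so the category boundaries are never stored explicitly; they fall out of the prototypes and the metric.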

42 Voronoi Tessellations (cont) Much psychological data concords with tessellating conceptual spaces into star-shaped (and sometimes convex) regions around prototypes (e.g., stop consonants in phoneme classification) Boundaries produced by Voronoi tessellations provide the threshold of similarity and support a mechanism explaining categorical perception Gärdenfors & Williams, Reasoning about categories in conceptual spaces, Proceedings IJCAI 2001

43 Part II – Concept combination – Induction – Semantics – Non-monotonic aspects of concepts – Realizing (approximating) conceptual spaces
