Universiteit van amsterdam | 1 Reinhard Blutner Geometric models of meaning and compositionality Institute for Logic,


1 universiteit van amsterdam | 1 Reinhard Blutner http://www.blutner.de blutner@uva.nl Geometric models of meaning and compositionality Institute for Logic, Language and Computation

2 universiteit van amsterdam, Tandem Workshop Berlin, December 11-13, 2010
0 Introduction
- NL comprehension
  - semantic interpretation: automatic etc.
  - pragmatic interpretation: indirect, reflective, normally based on reasoning (inferentialism)
- Perception
  - direct interpretation of sensory input
  - automatic, unreflective, instinctive
  - normally not based on any kind of reasoning

3 Thesis
- NL interpretation (semantic + pragmatic) is basically as direct as perception.
- Most current models of NL interpretation are wrong.
- Therefore, look for alternative models.
- Framework of geometric representations of meaning

4 Outlook
1 Millikan’s thesis of direct interpretation
2 Modern contextualism
3 Context-sensitivity
4 Geometric models of meaning
5 Adjectival modification
6 Compositionality for geometric models

5 1 Millikan’s thesis of direct interpretation
- DPL-thesis
- General arguments
- Experimental evidence

6 Direct Perception through Language
- We directly perceive a red chair here. We derive this automatically, without deliberating about the reliability of the source (reflected light).
- DPL: NL interpretation is as direct as perception
  - Derivation of literal meaning does not proceed by conscious inference
  - The content of heard utterances is integrated automatically into our "belief boxes"

7 Descartes vs. Spinoza
- Can people comprehend assertions without believing them?
  - Descartes suggested that people can and should.
  - Spinoza suggested that people should but cannot.
- Burge (1993): We may invoke (conscious) justification for not believing the content of some utterance. The default position, however, is to accept such contents as true.
- Evolutionary arguments (Axelrod, Millikan, Burge)

8 Gilbert’s (1993) experiment
- Subjects read a crime report that contained both true and false statements. The color of critical parts of the text indicated whether a particular statement was true or false.
- Some subjects performed a concurrent digit-search task as they read the critical statements [interrupted], and others did not [uninterrupted].
- Finally, subjects completed a recognition memory test for the critical sentences contained in the report.

9 Results
[Figure: statements recognized as true / false]
- In the recognition test, subjects responded correctly above chance level.
- The participants in the interrupted group reported false information as true, but not true information as false.
- This indicates that Spinoza is right: people automatically take presented information as true; it takes conscious attention not to believe it.

10 Resumé
- The assumption that NL interpretation is as direct as perception (DPL-thesis) has important consequences for constructing psychologically adequate models of interpretation.
- I propose to take DPL as a serious challenge for computational models of NL interpretation.
- DPL, properly generalized, asks for a default mode of NL interpretation which is fully compositional and runs automatically.

11 2 Modern contextualism
- The neo-/post-Gricean picture of contextualism
- Classification scheme
- The silent assumptions

12 The neo-/post-Gricean picture: Contextualism
- Using the meanings of the words plus the syntactic structure of the sentence, it is not possible to calculate the literal meaning of the sentence. Only some kind of underdetermined representation can be computed.
- Semantic underdetermination and the existence of unarticulated constituents are postulated.
- The mechanism of pragmatic enrichment is crucial both for determining what the speaker says and what she means.

13 Variants of contextualism
- Relevance Theory
- Presumptive Meanings
- Neo-Gricean Theories (Horn, Atlas)
- OT-Pragmatics

14 Silent assumptions
- Language of thought hypothesis
  - Cognitive activities require a language-like representational medium
  - Rule-governed processes operate on representations (inferences)
  - Symbolic representations have a combinatorial syntax and semantics
  - Propositions form a Boolean lattice; Kolmogorov probabilities
- This contrasts with connectionist and geometric models of semantic interpretation

15 Disjunction puzzle
- $(A \wedge C) \vee (A \wedge \neg C) \Leftrightarrow A$ (distributivity)
- $A \mid C$: $p$
- $A \mid \neg C$: $q$
- $A \mid (C \vee \neg C)$: between $p$ and $q$
- Tversky and Shafir (1992) show that significantly more students report that they would purchase a nonrefundable Hawaiian vacation if they knew that they had passed or failed an important exam than report that they would purchase it if they did not know the outcome of the exam.

16 Interference (Franco, Khrennikov, …)
- In QM the superposition of two or more states can lead to interference effects
- Classical: $P(A \mid C \vee C') = \tfrac{1}{2} P(A \mid C) + \tfrac{1}{2} P(A \mid C')$, if $P(C)/P(C \vee C') = P(C')/P(C \vee C') = \tfrac{1}{2}$
- $\pi(A \mid C + C') = \tfrac{1}{2} |\langle A \mid C + C' \rangle|^2 = \tfrac{1}{2} |\langle A \mid C \rangle|^2 + \tfrac{1}{2} |\langle A \mid C' \rangle|^2 + |\langle A \mid C \rangle| \, |\langle A \mid C' \rangle| \cos(\theta) = \tfrac{1}{2} \pi(A \mid C) + \tfrac{1}{2} \pi(A \mid C')$ + interference term
- A disjunction effect of -.16 (Tversky and Shafir 1992) can be described by $\cos \theta = -0.35$
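
The relation between the classical mixture and the superposition with its interference term can be checked numerically. The following is only a minimal sketch: the function names, the values of `p`, `q` and `observed`, and the choice to put the whole relative phase on the second amplitude are illustrative assumptions, not the Tversky and Shafir data.

```python
import cmath
import math


def classical_mixture(p, q):
    """Equal-weight classical mixture: 1/2 P(A|C) + 1/2 P(A|C')."""
    return 0.5 * p + 0.5 * q


def interference_term(p, q, theta):
    """Cross term |<A|C>| |<A|C'>| cos(theta)."""
    return math.sqrt(p * q) * math.cos(theta)


def superposed_probability(p, q, theta):
    """1/2 |<A|C> + <A|C'>|^2, with relative phase theta between the amplitudes."""
    amp_c = complex(math.sqrt(p), 0.0)
    amp_c_prime = cmath.rect(math.sqrt(q), theta)
    return 0.5 * abs(amp_c + amp_c_prime) ** 2


def phase_cosine(p, q, observed):
    """cos(theta) needed to reproduce an observed probability in the unknown-outcome case."""
    return (observed - classical_mixture(p, q)) / math.sqrt(p * q)


# Hypothetical values, chosen only for illustration.
p, q, observed = 0.6, 0.5, 0.4
cos_theta = phase_cosine(p, q, observed)
theta = math.acos(cos_theta)
print(cos_theta, superposed_probability(p, q, theta))
```

A negative `cos_theta` corresponds to a probability below the classical mixture, which is the direction of the disjunction effect.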

17 The holistic character of decisions
- Decision processes are not “rational” in the sense of rational decision theory (checking different cases and calculating expected utilities)
- The decision processes are led by simple but powerful global heuristics, which act in a fast and frugal way (Gigerenzer)
- Models of bounded rationality can be formulated in agreement with a weak form of rationalism.

18 Resumé
- Most variants of modern contextualism ignore the challenge of the DPL-thesis
- Issue of performance (automatic/controlled): Inferences are controlled processes
- Issue of competence: The proposed models cannot handle puzzles of bounded rationality.

19 3 Context-sensitivity
- Some examples
- Why a ‘tall boy’ is a big problem
- The problem with ‘absolute’ adjectives
- How to calculate truth-conditions

20 Some examples 1
- John ate breakfast [this morning; in the normal way]
- Every boy [in the class] is seated
- Peter began a novel [to read / to write]
- I’m parking outside [my car]
- Max is tall [for a fifth grader]

21 Some examples 2
- What color is a red nose, red flag, red bean?
- This apple is red [on the outside]

22 More examples
- Quine (1960) was the first to note the contrast between red apple (red on the outside) and pink grapefruit (pink on the inside).
- In a similar vein, Lahav (1993) argues that an adjective such as brown doesn’t make a simple and fixed contribution to any composite expression in which it appears:
  In order for a cow to be brown most of its body’s surface should be brown, though not its udders, eyes, or internal organs. A brown crystal, on the other hand, needs to be brown both inside and outside. A brown book is brown if its cover, but not necessarily its inner pages, are mostly brown, while a newspaper is brown only if all its pages are brown. For a potato to be brown it needs to be brown only outside,... (Lahav 1993: 76).

23 Why a ‘tall boy’ is still a problem
- Quasi-deictic elements: tall boy: x [tall*(x,N) & boy(x)] (Sag, Bartsch, Bosch)
- Minimal proposition: tall boy: x N [tall*(x,N) & boy(x)] (Cappelen & Lepore, Hobbs)
- Underdetermination: tall boy: x N [tall*(x,N) & boy(x)] (Alshawi, Pinkal)
* with tall(x,N) iff size(x) > N

24 The problem with ‘absolute’ adjectives
- red apple: x [part(Y,x) & red(Y) & apple(x)]
  - Requires rather clumsy lexical entries
  - How much of the peel of an apple has to be red in order to call it a red peel?
  - This theory does not really clarify how the border line between semantics and pragmatics is ever to be determined
- The best that the symbolic tradition has to offer is something like Montague’s adnominal functors: red apple: x [(red(apple))(x)]

25 A theory based on adnominal functors
- Montague (1970) as starting-point: adjectives as adnominal functors (worst case)
- red(X) means roughly the property
  - (a) of having a red inner volume if X denotes fruits only the inside of which is edible
  - (b) of having a red surface if X denotes fruits with edible outside
  - (c) of having a functional part that is red if X denotes tools, …
- This approach is compositional but not systematic

26 How to calculate truth conditions?
- The mechanism of adnominal functors requires idiosyncratic lexical entries for fixing the interpretations of complex expressions.
- Alternative suggestions from Cognitive Linguistics
  - Blending theory* (Fauconnier & Turner)
  - Modulation (Recanati)
- What is the computational mechanism? A lovely notation does not yet provide a real mechanism.
* In blending theory the part of a concept for which a given modification is relevant is referred to as an ‘active zone’, first discussed as such in Langacker (1991). In the case of an apple, the color is only relevant for the skin of the apple, which is its active zone.

27 Resumé
- Compositional semantics based on standard symbolic models cannot account for systematicity
- Alternative proposal: division of labor between underdetermined semantics and pragmatic theories of contextual enrichment
- Inferential theories of contextual enrichment fail for reasons of explanatory adequacy.

28 4 Geometric models of meaning
- Geometric models of meaning
- Possible worlds and conceptual states
- Comparing models of meaning
- Prototype semantics

29 Geometric Models of Meaning 1
- Basic claim: An understanding of problem solving, categorization, memory retrieval, inductive reasoning, and other cognitive processes requires that we understand how humans assess similarity.
- W. S. Torgerson (1965): Multidimensional scaling of similarity. Psychometrika 30: 379–393.

30 Geometric Models of Meaning 2
- A. Tversky (1977): Features of similarity. Psychological Review 84: 327–352.
- P. Gärdenfors: The Geometry of Thought (2000). Concepts as convex spaces
- D. Widdows: Geometry and Meaning (2004). Distributional semantics

31 Possible worlds and conceptual states
- Possible worlds: Isolated entities which are used for modeling propositions (sets of possible worlds)
- Conceptual states: geometrical objects which form vector spaces. The addition of two vectors is an operation which describes the superposition of states; this gives rise to interference phenomena

32 Comparing Models of Meaning
Standard symbolic models | Geometric models
Formal semantics (Montague 1978); Partee, Kamp | Mental spaces (Gärdenfors 2000); Lakoff, Fauconnier
Concepts as set-theoretic constructions | Natural concepts as convex subspaces
Qualitative aspects of meaning, feature and tree structures, compositionality | Quantitative aspects of meaning, similarity structures, problems with compositionality
Boolean algebra | Orthoalgebra

33 Conceptual states: sim & prob
- A conceptual state is a vector indicating how diagnostic each component (instance, feature) is for the whole state (frozen statistics)
- sim: The scalar product of two normalized vectors is a measure for their similarity
- prob: If $|i\rangle$ designates a component, then $|\langle i \mid s \rangle|^2$ gives the probability that the conceptual state $|s\rangle$ triggers component $|i\rangle$. [Born rule]
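
The two measures on conceptual states can be sketched in a few lines. This is a minimal illustration assuming real-valued vectors over a fixed list of components; the vectors `apple` and `pear` and their weights are hypothetical.

```python
import math


def normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def sim(state_a, state_b):
    """Similarity: scalar product of the two normalized state vectors."""
    return sum(a * b for a, b in zip(normalize(state_a), normalize(state_b)))


def prob(state, component):
    """Born rule: squared coefficient of the normalized state on one basis component."""
    return normalize(state)[component] ** 2


# Hypothetical diagnosticity weights over three components (features).
apple = [3.0, 1.0, 0.5]
pear = [2.0, 2.0, 0.5]

print(sim(apple, pear))
print([prob(apple, i) for i in range(3)])
```

Because the state is normalized first, the probabilities over all components sum to one.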

34 Resumé
- Geometric models of meaning can account for similarity ratings and ratings of probability and typicality
- Geometric approaches to meaning have problems with handling compositionality.

35 5 Adjectival modification
- Three kinds of adjectives
- Conjunction puzzle
- The modification rule
- Examples

36 Three kinds of adjectives (based on B. Partee)
- Intersective: $\mathrm{Adj}'(Q)(x) \Leftrightarrow P_{\mathrm{Adj}'}(x) \wedge Q(x)$
- Subsective: $\mathrm{Adj}'(Q)(x) \Rightarrow Q(x)$
- Privative: $\mathrm{Adj}'(Q)(x) \Rightarrow \neg Q(x)$

37 Natural classification? (B. Partee)
- intersective: blond, rectangular, French
- subsective: tall, good, typical, recent
- privative: spurious, imaginary, fake
- modal: alleged, potential, arguable

38 The problem
- to give a uniform account for each of the big syntactic classes
- to explain the pragmatic differences within the classes
- to account for both
  - graded membership function
  - typicality function
- to explain the puzzles of typicality
- to conform to the DPL thesis.

39 Conjunction puzzle
[Figure: three pictures labeled (a), (b), (c)]
Let (a) be a particular apple. There is no doubt that it is psychologically less prototypical of an apple (whose prototype looks more like (b)) than of an apple-with-stripes; hence
$c_{\text{striped apple}}(a) > c_{\text{apple}}(a)$
$\mathrm{CE} = c_{\text{striped apple}}(x) - c_{\text{apple}}(x)$
(Osherson & Smith 1981)

40 Typicality ratings (11-point scale, 0-10)
Good match     | Adjective       | Noun         | Adj-Noun              | CE
unsliced apple | 8.71 (unsliced) | 7.25 (apple) | 8.65 (unsliced apple) | 1.4
red apple      | 8.5 (red)       | 7.81 (apple) | 8.87 (red apple)      | 1.06
brown apple    | 6.93 (brown)    | 3.54 (apple) | 8.52 (brown apple)    | 4.98
Smith & Osherson 1984: Conceptual combination with prototype concepts
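
The CE column can be recomputed from the ratings as the difference between the Adj-Noun rating and the Noun rating. The sketch below uses the values reported in the table; the comparison with a minimum rule is an added illustration of the fuzzy-logic style of conjunction that Osherson and Smith argue against, and the function names are assumptions.

```python
# Ratings from the table: (adjective, noun, adjective-noun) per item.
ratings = {
    "unsliced apple": (8.71, 7.25, 8.65),
    "red apple": (8.5, 7.81, 8.87),
    "brown apple": (6.93, 3.54, 8.52),
}


def conjunction_effect(noun_rating, adj_noun_rating):
    """CE: how much more typical the item is of the combination than of the noun."""
    return round(adj_noun_rating - noun_rating, 2)


def min_rule(adjective_rating, noun_rating):
    """Minimum rule for conjunction, used here only as a point of comparison."""
    return min(adjective_rating, noun_rating)


for item, (adj, noun, adj_noun) in ratings.items():
    print(item, conjunction_effect(noun, adj_noun), min_rule(adj, noun))
```

In all three rows the observed Adj-Noun rating exceeds the value the minimum rule would allow.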

41 The modification rule
- How does one vector modify another vector?
- The answer depends on the nature of the vectors:
  - A. Vectors as superpositions of instances
  - B. Distributional semantics: vectors as document-based word-vectors (Schütze)
- Many proposals in the literature: Aerts, Zadeh, Plate, Smolensky, …

42 A. Prototypes as superposed instances
- Even if the prototype is not one of the presented instances, it is recognized as such.
- Modification rule + recalibrating to unit length

43 B. Distributional Semantics
- Document 1 is about musical instruments, document 2 about fishermen, and document 3 about financial institutions (applying LSA as pre-processing)
- Columns: document 1, document 2, document 3
- Rows: bank, bass, commercial, cream, guitar, fisherman, money
- Table entries (flattened): 0 0.447 0 1 0 0.894 0.707 0 1 0.447 1 0 0.707 0 0.894
- Modification rule*
* See Mitchell & Lapata (2008): Vector-based models of semantic composition. (Using circular convolution)

44 Modification
- Build the tensor product
- Apply a linear operator for reducing the dimension by 1

45 Conjunction effect
[Figure labels: striped apple; striped apple; fuzzy logic]

46 Striped apple in 2D
[Figure labels: Form; Texture; Apple; striped]

47 Red and White Beans
[Figure panels: General Distribution Red; Color Distribution Beans; Color Distribution Red Beans]

48 Red and White Beans
[Figure panels: Color Distribution White Beans; Color Distribution Beans; General Distribution White]

49 Tall Boy
[Figure labels: tall; boy; tall boy]

50 Red apple: color of peel
[Figure labels: red; apple; red o apple]
Kullback-Leibler information = 0.25

51 Red apple: color of pulp
[Figure labels: red; apple; red o apple]
Kullback-Leibler information = 0.06

52 Stone lion
[Figure labels: stone; lion; stone o lion]
Kullback-Leibler information

53 Resumé
- Concerning adjectival modification, a vector operation is proposed which accounts for a compositional analysis
- One consequence is that we have to give up the separation between semantics and pragmatics
- The proposed model handles the traditional classes of intersective, subsective and privative adjectives in a uniform way, accounts for the conjunction puzzle of typicality and conforms to the DPL-thesis.

54 6 Compositionality for geometric models
- Typicality as cue validity
- Reduction by tracing
- General composition rule
- The conjunction effect

55 Typicality as cue validity
- How typical is $x$ for $A$? Write $c_x(A)$ for it.
- Cue validity: $c_x(A) = P(x \mid A)$ with classical probability $P$
- It predicts a conjunction effect
- Fact: if $x_1, x_2 \in A$, then $c_{x_1}(A) \geq c_{x_2}(A)$ iff $P(x_1) \geq P(x_2)$
- Let $x_1$ be a typical apple, and $x_2$ a typical striped apple. Then $P(x_1) > P(x_2)$, but $c_{x_1}(\text{striped apple}) < c_{x_2}(\text{striped apple})$
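
Why the classical account fixes the ranking of two instances independently of the concept can be seen in a toy model. This sketch assumes that instances are atomic events and concepts are sets of instances, so that P(x|A) = P(x)/P(A) for x in A; the instance names and prior values are hypothetical.

```python
def cue_validity(instance, concept, prior):
    """Classical cue validity P(x | A) for an atomic instance x and a concept A (a set)."""
    if instance not in concept:
        return 0.0
    return prior[instance] / sum(prior[member] for member in concept)


# Hypothetical prior probabilities of three instances.
prior = {"x1": 0.4, "x2": 0.1, "x3": 0.5}

# Two hypothetical concepts that both contain x1 and x2.
concept_a = {"x1", "x2", "x3"}
concept_b = {"x1", "x2"}

for concept in (concept_a, concept_b):
    print(cue_validity("x1", concept, prior), cue_validity("x2", concept, prior))
```

In both concepts x1 outranks x2, because the normalizing factor P(A) is shared; a reversal of the ranking between two concepts that contain both instances cannot be expressed this way.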

56 Typicality in the geometric model
- Let instance $x$ be a vector of the Hilbert space, and let $|A\rangle$ be a superposition of instances, $|A\rangle = \sum_i a_i |x_i\rangle$
- Definition: $c_x(A) = |\langle x \mid A \rangle|^2$ (Born rule)
- Assuming that all instances are independent of each other (orthogonal), then $c_{x_i}(A) = |a_i|^2$
- Use vector modification “o” instead of “∩”.

57 Trace and the  -operator
- A standard method of dimension reduction is to apply tracing: $\mathrm{Tr}(X) = \sum_i X_{ii}$ (reducing the order of the tensor by 2)
- Can we apply this method of reduction in order to get a general method of composing meanings? Yes: P. beim Graben, S. Clark
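
Tracing can be illustrated on the smallest case, the tensor product of two vectors. This is a minimal sketch with hypothetical integer vectors; it shows that tracing the outer product of u and v reduces the order-2 object to a scalar, which coincides with their inner product.

```python
def outer(u, v):
    """Tensor (outer) product of two vectors, as a nested list."""
    return [[a * b for b in v] for a in u]


def trace(matrix):
    """Tr(X) = sum_i X_ii for a square matrix."""
    return sum(matrix[i][i] for i in range(len(matrix)))


def inner(u, v):
    """Scalar product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical vectors, integers to keep the arithmetic exact.
u = [1, 2, 3]
v = [4, 5, 6]

print(trace(outer(u, v)), inner(u, v))
```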

58 Composition for simple sentences
dogs chase cats: n (n\s/n) n
[Reduction diagram]

59 Type reduction and semantic reduction
- Lambek calculus for grammatical types
- The reduction diagram indicates the semantic operations (sequence of linear operators in the underlying vector space).
- The sequence of operations is applied to an initial tensor product and reduces it to the semantic category of sentences
[Reduction diagram for n (n\s/n) n]

60 Composition for adjectival modification
red nose: (n/n) n
[Reduction diagram: product rule]

61 Common nouns of type n\s
- Lifting common nouns to type n\s
- It accounts for typicality and graded truth-value!
- The composition for adjectival modification can be extended straightforwardly
- Both for typicality and for graded truth values we get an explanation of the conjunction puzzle

62 Computational aspects
- The tensor product collecting the information from the semantic entries never needs to be calculated.
- The computation of the sentence representation is simply a matter of building a vector using inner products, a computationally simple operation.
- In some sense the processing assumptions conform to the slogan (Millikan, Recanati) that NL interpretation is as direct as perception (DPL thesis)
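
The claim that the full tensor product never has to be built can be illustrated for a transitive sentence of type n (n\s/n) n. The sketch below is an assumption-laden toy: the verb is represented as a hypothetical order-3 array `chase[i][k][j]` (subject index, sentence index, object index), and all numbers are invented. It compares the direct computation by inner products with the explicit route of first building the full tensor product and then contracting matching indices.

```python
def sentence_vector(subject, verb, obj):
    """Sentence vector by direct contraction: s_k = sum_ij subject_i * verb[i][k][j] * obj_j."""
    n, s = len(subject), len(verb[0])
    return [
        sum(subject[i] * verb[i][k][j] * obj[j] for i in range(n) for j in range(n))
        for k in range(s)
    ]


def sentence_vector_via_tensor(subject, verb, obj):
    """Same result, but building the full tensor product first and then contracting."""
    n, s = len(subject), len(verb[0])
    full = {}
    for a in range(n):
        for i in range(n):
            for k in range(s):
                for j in range(n):
                    for b in range(n):
                        full[(a, i, k, j, b)] = subject[a] * verb[i][k][j] * obj[b]
    # Contract the subject index with the verb's left index, and the verb's right index with the object index.
    return [
        sum(full[(i, i, k, j, j)] for i in range(n) for j in range(n))
        for k in range(s)
    ]


# Hypothetical two-dimensional noun space and two-dimensional sentence space.
dogs = [1, 2]
cats = [3, 1]
chase = [[[1, 0], [2, 1]], [[0, 3], [1, 1]]]

print(sentence_vector(dogs, chase, cats))
print(sentence_vector_via_tensor(dogs, chase, cats))
```

The direct version touches each verb entry once, while the explicit version stores n·n·s·n·n numbers only to discard most of them in the contraction.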

63 Resumé
- With common nouns of type n\s the present approach separates typicality from graded truth-value
- Both for typicality and graded truth values we get an explanation of the conjunction puzzle.
- Truth functionally, modification leads to a product t-norm (with recalibration effect)
- The processing assumptions conform to the DPL thesis.
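
How a product rule with recalibration can yield a conjunction effect may be sketched as follows. This is not the formula from the slides, which did not survive in the transcript; it assumes, for illustration only, that modification is the componentwise product of two vectors over a shared set of orthogonal instances, followed by rescaling to unit length. All weights are hypothetical.

```python
import math


def normalize(vector):
    """Recalibrate a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def modify(adjective, noun):
    """Assumed modification rule: componentwise product, then recalibration."""
    return normalize([a * n for a, n in zip(adjective, noun)])


def typicality(instance_index, concept):
    """Born-rule typicality of one basis instance for a concept vector."""
    return normalize(concept)[instance_index] ** 2


# Hypothetical weights over three instances:
# index 0 = a striped apple, index 1 = an ordinary apple, index 2 = a striped non-apple.
apple = [0.2, 0.9, 0.1]
striped = [0.7, 0.05, 0.7]
striped_apple = modify(striped, apple)

print(typicality(0, apple), typicality(0, striped_apple))
```

Before recalibration each component of the product is no larger than either factor, as a product t-norm requires; the rescaling step is what lets the striped instance become more typical of the combination than of the noun alone.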

64 Conclusions
- Several examples of context-sensitivity can be treated in a straightforward way by using a compositional operation on conceptual states
- Since conceptual states contain (frozen) usage information, they combine semantic and pragmatic information
- Does this make ‘truth-conditional pragmatics’ (TCP) superfluous?
  - Yes: if TCP is a kind of “inferential theory”
  - No: if TCP is “as direct as perception”

65 What is the connection to QM?
- The objects in which we have situated word meanings are Hilbert spaces. Hilbert spaces are the basis of the mathematical theory of (classical) QM.
- Word meanings are combined by using the tensor product. Composite systems in QM are likewise represented using tensor products.
- For calculating typicality we use a probability measure based on vector (sub)spaces (Born rule). In connection with several puzzles of bounded rationality, there are good arguments why classical probabilities cannot be used. The theory of quantum probabilities forms the basis of QM.
- Further: non-commutativity of questions in the context of survey research (attitude questions). Observables in QM.

