1
Geometric models of meaning and compositionality
Reinhard Blutner
Institute for Logic, Language and Computation, Universiteit van Amsterdam
http://www.blutner.de
blutner@uva.nl
2
Tandem Workshop Berlin, December 11-13, 2010
0 Introduction
NL comprehension:
- semantic interpretation: automatic etc.
- pragmatic interpretation: indirect, reflective, normally based on reasoning (inferentialism)
Perception:
- direct interpretation of sensory input
- automatic, unreflective, instinctive
- normally not based on any kind of reasoning
3
Thesis
- NL interpretation (semantic + pragmatic) is basically as direct as perception.
- Most current models of NL interpretation are wrong.
- Therefore, look for alternative models.
- Framework: geometric representations of meaning
4
Outlook
1 Millikan's thesis of direct interpretation
2 Modern contextualism
3 Context-sensitivity
4 Geometric models of meaning
5 Adjectival modification
6 Compositionality for geometric models
5
1 Millikan's thesis of direct interpretation
- DPL-thesis
- General arguments
- Experimental evidence
6
Direct Perception through Language
- We directly perceive a red chair here. We derive that automatically, without deliberating about the reliability of the sources (reflected light).
- DPL: NL interpretation is as direct as perception.
- Derivation of literal meaning does not proceed by conscious inference.
- The content of heard utterances is automatically integrated into our "belief boxes".
7
Descartes vs. Spinoza
- Can people comprehend assertions without believing them?
- Descartes suggested that people can and should.
- Spinoza suggested that people should but cannot.
- Burge (1993): We may invoke (conscious) justification for not believing the content of some utterance. The default position, however, is to accept such contents as true.
- Evolutionary arguments (Axelrod, Millikan, Burge)
8
Gilbert's (1993) experiment
- Subjects read a crime report that contained both true and false statements.
- The color of critical parts of the text indicated whether a particular statement was true or false.
- Some subjects performed a concurrent digit-search task as they read the critical statements [interrupted], and others did not [uninterrupted].
- Finally, subjects completed a recognition memory test for the critical sentences contained in the report.
9
Results
[Figure: statements recognized as true / false]
- In the recognition test, subjects responded correctly above chance level.
- The participants of the interrupted group reported false information as true, but not true information as false.
- This indicates that Spinoza is right: people automatically take presented information as true; it takes conscious attention not to believe it.
10
Resumé
- The assumption that NL interpretation is as direct as perception (DPL-thesis) has important consequences for constructing psychologically adequate models of interpretation.
- I propose to take DPL as a serious challenge for computational models of NL interpretation.
- DPL, properly generalized, asks for a default mode of NL interpretation which is fully compositional and runs automatically.
11
2 Modern contextualism
- The neo-/post-Gricean picture of contextualism
- Classification scheme
- The silent assumptions
12
The neo-/post-Gricean picture: Contextualism
- Using the meanings of the words plus the syntactic structure of the sentence, it is not possible to calculate the literal meaning of the sentence. Only some kind of underdetermined representation can be computed.
- Semantic underdetermination and the existence of unarticulated constituents are postulated.
- The mechanism of pragmatic enrichment is crucial for determining both what the speaker says and what she means.
13
Variants of contextualism
- Relevance Theory
- Presumptive Meanings
- Neo-Gricean Theories (Horn, Atlas)
- OT-Pragmatics
14
Silent assumptions
Language of thought hypothesis:
- Cognitive activities require a language-like representational medium
- Rule-governed processes operate on representations (inferences)
- Symbolic representations have a combinatorial syntax and semantics
- Propositions form a Boolean lattice; Kolmogorov probabilities
This contrasts with connectionist and geometric models of semantic interpretation.
15
Disjunction puzzle
- $(A \wedge C) \vee (A \wedge \neg C) \leftrightarrow A$ (distributivity)
- $A \mid C$: $p$
- $A \mid \neg C$: $q$
- $A \mid (C \vee \neg C)$: between $p$ and $q$
- Tversky and Shafir (1992) show that significantly more students report that they would purchase a nonrefundable Hawaiian vacation if they knew that they had passed or failed an important exam than report that they would purchase it if they did not know the outcome of the exam.
16
Interference (Franco, Khrennikov, …)
- In QM the superposition of two or more states can lead to interference effects.
- Classical: $P(A \mid C \vee C') = \tfrac{1}{2} P(A \mid C) + \tfrac{1}{2} P(A \mid C')$, if $P(C)/P(C \vee C') = P(C')/P(C \vee C') = \tfrac{1}{2}$
- Quantum: $P(A \mid C + C') = \tfrac{1}{2}\,|\psi(A \mid C) + \psi(A \mid C')|^2 = \tfrac{1}{2}\,|\psi(A \mid C)|^2 + \tfrac{1}{2}\,|\psi(A \mid C')|^2 + |\psi(A \mid C)|\,|\psi(A \mid C')| \cos\theta = \tfrac{1}{2} P(A \mid C) + \tfrac{1}{2} P(A \mid C') + \text{interference term}$
- A disjunction effect of -.16 (Tversky and Shafir 1992) can be described by $\cos\theta = -0.35$.
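The difference between the classical mixture and the superposition formula can be sketched numerically. This is a minimal illustration, assuming equal weights for the two contexts; the function names and the input probabilities are hypothetical and are not taken from the cited studies.

```python
import math

def classical_mixture(p_c: float, p_c_prime: float) -> float:
    """Equal-weight law of total probability: 1/2 P(A|C) + 1/2 P(A|C')."""
    return 0.5 * p_c + 0.5 * p_c_prime

def superposition(p_c: float, p_c_prime: float, cos_theta: float) -> float:
    """Classical mixture plus the interference term |psi||psi'| cos(theta),
    where |psi| = sqrt(P(A|C)) and |psi'| = sqrt(P(A|C'))."""
    interference = math.sqrt(p_c * p_c_prime) * cos_theta
    return classical_mixture(p_c, p_c_prime) + interference

# Hypothetical conditional probabilities (illustrative only).
p, q = 0.6, 0.5
print(classical_mixture(p, q))      # always lies between p and q
print(superposition(p, q, 0.0))     # no interference: equals the classical value
print(superposition(p, q, -0.35))   # negative interference lowers the value
```

With a negative cosine the result can fall below both conditional probabilities, which is the pattern the disjunction puzzle describes and which no classical mixture can produce.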
17
The holistic character of decisions
- Decision processes are not "rational" in the sense of rational decision theory (checking different cases and calculating expected utilities).
- The decision processes are led by simple but powerful global heuristics, which act in a fast and frugal way (Gigerenzer).
- Models of bounded rationality can be formulated in agreement with a weak form of rationalism.
18
Resumé
- Most variants of modern contextualism ignore the challenge of the DPL-thesis.
- Issue of performance (automatic/controlled): inferences are controlled processes.
- Issue of competence: the proposed models cannot handle puzzles of bounded rationality.
19
3 Context-sensitivity
- Some examples
- Why a 'tall boy' is a big problem
- The problem with 'absolute' adjectives
- How to calculate truth-conditions
20
Some examples 1
- John ate breakfast [this morning; in the normal way]
- Every boy [in the class] is seated
- Peter began a novel [to read / to write]
- I'm parking outside [my car]
- Max is tall [for a fifth grader]
21
Some examples 2
- What color is a red nose, red flag, red bean?
- This apple is red [on the outside]
22
More examples
- Quine (1960) was the first to note the contrast between red apple (red on the outside) and pink grapefruit (pink on the inside).
- In a similar vein, Lahav (1993) argues that an adjective such as brown doesn't make a simple and fixed contribution to any composite expression in which it appears:
  "In order for a cow to be brown most of its body's surface should be brown, though not its udders, eyes, or internal organs. A brown crystal, on the other hand, needs to be brown both inside and outside. A brown book is brown if its cover, but not necessarily its inner pages, are mostly brown, while a newspaper is brown only if all its pages are brown. For a potato to be brown it needs to be brown only outside,..." (Lahav 1993: 76).
23
Why a 'tall boy' is still a problem
- Quasi-deictic elements: tall boy: $\lambda x\,[\mathrm{tall}^*(x,N) \wedge \mathrm{boy}(x)]$ (Sag, Bartsch, Bosch)
- Minimal proposition: tall boy: $\lambda x\,\exists N\,[\mathrm{tall}^*(x,N) \wedge \mathrm{boy}(x)]$ (Cappelen & Lepore, Hobbs)
- Underdetermination: tall boy: $\lambda x\,\lambda N\,[\mathrm{tall}^*(x,N) \wedge \mathrm{boy}(x)]$ (Alshawi, Pinkal)
* with $\mathrm{tall}^*(x,N) \Leftrightarrow \mathrm{size}(x) > N$
24
The problem with 'absolute' adjectives
- red apple: $\lambda x\,[\mathrm{part}(Y,x) \wedge \mathrm{red}(Y) \wedge \mathrm{apple}(x)]$
- Requires rather clumsy lexical entries.
- How much of the peel of an apple has to be red in order to call it a red peel?
- This theory does not really clarify how the border line between semantics and pragmatics is ever to be determined.
- The best that the symbolic tradition offers is something like Montague's adnominal functors: red apple: $\lambda x\,[(\mathrm{red}(\mathrm{apple}))(x)]$
25
A theory based on adnominal functors
- Montague (1970) as starting point: adjectives as adnominal functors (worst case)
- red(X) means roughly the property
  (a) of having a red inner volume if X denotes fruits only the inside of which is edible
  (b) of having a red surface if X denotes fruits with edible outside
  (c) of having a functional part that is red if X denotes tools, …
- This approach is compositional but not systematic.
26
How to calculate truth conditions?
- The mechanism of adnominal functors requires idiosyncratic lexical entries for fixing the interpretations of complex expressions.
- Alternative suggestions from Cognitive Linguistics:
  - Blending theory* (Fauconnier & Turner)
  - Modulation (Recanati)
- What is the computational mechanism? A lovely notation does not yet provide a real mechanism.
*In blending theory the part of a concept for which a given modification is relevant is referred to as an 'active zone', first discussed as such in Langacker (1991). In the case of an apple, the color is only relevant for the skin of the apple, which is its active zone.
27
Resumé
- Compositional semantics based on standard symbolic models cannot account for systematicity.
- Alternative proposal: division of labor between underdetermined semantics and pragmatic theories of contextual enrichment.
- Inferential theories of contextual enrichment fail for reasons of explanatory adequacy.
28
4 Geometric models of meaning
- Geometric models of meaning
- Possible worlds and conceptual states
- Comparing models of meaning
- Prototype semantics
29
Geometric Models of Meaning 1
- Basic claim: An understanding of problem solving, categorization, memory retrieval, inductive reasoning, and other cognitive processes requires that we understand how humans assess similarity.
- W. S. Torgerson (1965): Multidimensional scaling of similarity. Psychometrika 30: 379–393.
30
Geometric Models of Meaning 2
- A. Tversky (1977): Features of similarity. Psychological Review 84: 327–352.
- P. Gärdenfors: The Geometry of Thought (2000). Concepts as convex spaces.
- D. Widdows: Geometry and Meaning (2004). Distributional semantics.
31
Possible worlds and conceptual states
- Possible worlds: isolated entities which are used for modeling propositions (sets of possible worlds).
- Conceptual states: geometrical objects which form vector spaces. The addition of two vectors is an operation which describes the superposition of states and thereby allows for interference phenomena.
32
Comparing Models of Meaning
Standard symbolic models | Geometric models
Formal semantics (Montague 1978); Partee, Kamp | Mental spaces (Gärdenfors 2000); Lakoff, Fauconnier
Concepts as set-theoretic constructions | Natural concepts as convex subspaces
Qualitative aspects of meaning, feature and tree structures, compositionality | Quantitative aspects of meaning, similarity structures, problems with compositionality
Boolean algebra | Orthoalgebra
33
Conceptual states: sim & prob
- A conceptual state is a vector $\psi$ indicating how diagnostic each component (instance, feature) is for the whole state (frozen statistics).
- sim: The scalar product of two normalized vectors is a measure of their similarity.
- prob: If $u$ designates a component, then $|\langle u \mid \psi \rangle|^2$ gives the probability that the conceptual state $\psi$ triggers component $u$. [Born rule]
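The two measures on conceptual states can be sketched in a few lines. This is a minimal illustration, assuming real-valued vectors whose components are indexed by an orthonormal basis of instances or features; the function names and the example state are hypothetical.

```python
import math

def normalize(vector):
    """Rescale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def similarity(u, v):
    """sim: scalar product of the two normalized vectors."""
    u, v = normalize(u), normalize(v)
    return sum(a * b for a, b in zip(u, v))

def born_probabilities(state):
    """prob: squared amplitude of each component of the normalized state."""
    return [x * x for x in normalize(state)]

# Hypothetical conceptual state over three components.
state = [3.0, 4.0, 0.0]
print(born_probabilities(state))             # probabilities sum to 1
print(similarity([1.0, 1.0], [2.0, 2.0]))    # parallel vectors: maximal similarity
print(similarity([1.0, 0.0], [0.0, 1.0]))    # orthogonal vectors: no similarity
```

Because the state is normalized first, the squared components always sum to one, so they can be read as a probability distribution over components.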
34
Resumé
- Geometric models of meaning can account for similarity ratings and ratings of probability and typicality.
- Geometric approaches to meaning have problems with handling compositionality.
35
5 Adjectival modification
- Three kinds of adjectives
- Conjunction puzzle
- The modification rule
- Examples
36
Three kinds of adjectives (based on B. Partee)
- Intersective: $\mathrm{Adj}'(Q)(x) \leftrightarrow P_{\mathrm{Adj}'}(x) \wedge Q(x)$
- Subsective: $\mathrm{Adj}'(Q)(x) \rightarrow Q(x)$
- Privative: $\mathrm{Adj}'(Q)(x) \rightarrow \neg Q(x)$
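Read extensionally, the three classes correspond to simple set relations between the denotation of the adjective-noun phrase and the denotation of the noun. The following sketch checks these relations on toy extensions; the individuals, the sets, and the function names are hypothetical illustrations.

```python
def is_intersective(adj_noun: set, adj: set, noun: set) -> bool:
    """[[Adj N]] equals the intersection of [[Adj]] and [[N]]."""
    return adj_noun == (adj & noun)

def is_subsective(adj_noun: set, noun: set) -> bool:
    """[[Adj N]] is a subset of [[N]]."""
    return adj_noun <= noun

def is_privative(adj_noun: set, noun: set) -> bool:
    """[[Adj N]] shares no member with [[N]]."""
    return adj_noun.isdisjoint(noun)

# Hypothetical toy extensions.
boys = {"b1", "b2", "b3"}
blond = {"b2", "w1"}
blond_boys = {"b2"}
tall_boys = {"b1"}        # "tall" is relative to the noun, so no fixed set [[tall]]
guns = {"g1", "g2"}
fake_guns = {"toy1"}

print(is_intersective(blond_boys, blond, boys))  # True
print(is_subsective(tall_boys, boys))            # True
print(is_privative(fake_guns, guns))             # True
```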
37
Natural classification? (B. Partee)
- intersective: blond, rectangular, French
- subsective: tall, good, typical, recent
- privative: spurious, imaginary, fake
- modal: alleged, potential, arguable
38
The problem
- to give a uniform account for each of the big syntactic classes
- to explain the pragmatic differences within the classes
- to account for both
  - graded membership function
  - typicality function
- to explain the puzzles of typicality
- to conform to the DPL thesis.
39
Conjunction puzzle
[Figure: three apples labeled (a), (b), (c)]
- Let (a) be a particular apple. There is no doubt that it is psychologically less prototypical of an apple (whose prototype looks more like (b)) than of an apple-with-stripes; hence $c_{\text{striped apple}}(a) > c_{\text{apple}}(a)$ (Osherson & Smith 1981).
- Conjunction effect: $CE = c_{\text{striped apple}}(x) - c_{\text{apple}}(x)$
40
Typicality ratings
Good match | Adjective | Noun | Adj-Noun | CE
unsliced apple | 8.71 (unsliced) | 7.25 (apple) | 8.65 (unsliced apple) | 1.40
red apple | 8.5 (red) | 7.81 (apple) | 8.87 (red apple) | 1.06
brown apple | 6.93 (brown) | 3.54 (apple) | 8.52 (brown apple) | 4.98
Smith & Osherson 1984: Conceptual combination with prototype concepts (11-point scale, 0-10)
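The CE column is the difference between the adjective-noun rating and the noun rating for the same item. A small sketch that recomputes it from the ratings listed on the slide; the dictionary layout and the function name are illustrative choices.

```python
# Ratings from the slide: item -> (noun rating, adjective-noun rating).
ratings = {
    "unsliced apple": (7.25, 8.65),
    "red apple": (7.81, 8.87),
    "brown apple": (3.54, 8.52),
}

def conjunction_effect(noun_rating: float, adj_noun_rating: float) -> float:
    """CE = typicality in the combined concept minus typicality in the noun concept."""
    return round(adj_noun_rating - noun_rating, 2)

for item, (noun, adj_noun) in ratings.items():
    print(item, conjunction_effect(noun, adj_noun))
```

The effect is largest for the brown apple, the item that is least typical of the plain noun concept.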
41
The modification rule
- How does an adjective vector modify a noun vector?
- The answer depends on the nature of the vectors:
  A. Vectors as superpositions of instances
  B. Distributional semantics: vectors as document-based word vectors (Schütze)
- Many proposals in the literature: Aerts, Zadeh, Plate, Smolensky, …
42
A. Prototypes as superposed instances
- Even if the prototype is not one of the presented instances, it is recognized as such.
- Modification rule + recalibrating to unit length
43
B. Distributional Semantics
- Document 1 is about musical instruments, document 2 about fishermen, and document 3 about financial institutions (applying LSA as pre-processing).
- Modification rule*: [formula not preserved]
*See Mitchell & Lapata (2008): Vector-based models of semantic composition. (Using circular convolution)
[Table: columns document 1, document 2, document 3; rows bank, bass, commercial, cream, guitar, fisherman, money. Extracted values, alignment not preserved: 0 0.447 0 1 0 0.894 0.707 0 1 0.447 1 0 0.707 0 0.894]
44
Modification:
- Build the tensor product.
- Apply a linear operator for reducing the dimension by 1. [example formula not preserved]
45
Conjunction effect
[Figure: striped apple; comparison with fuzzy logic]
46
Striped apple in 2D
[Figure: axes Form and Texture; vectors Apple and striped]
47
Red and White Beans
[Figure panels: General Distribution Red; Color Distribution Beans; Color Distribution Red Beans]
48
Red and White Beans
[Figure panels: Color Distribution White Beans; Color Distribution Beans; General Distribution White]
49
Tall Boy
[Figure: tall, boy, tall boy]
50
Red apple: color of peel
[Figure: red, apple, red apple]
Kullback-Leibler information = 0.25
51
Red apple: color of pulp
[Figure: red, apple, red apple]
Kullback-Leibler information = 0.06
52
Stone lion
[Figure: stone, lion, stone lion; Kullback-Leibler information]
53
Resumé
- Concerning adjectival modification, a vector operation is proposed which accounts for a compositional analysis.
- One consequence is that we have to give up the separation between semantics and pragmatics.
- The proposed model handles the traditional classes of intersective, subsective and privative adjectives in a uniform way, accounts for the conjunction puzzle of typicality, and conforms to the DPL-thesis.
54
6 Compositionality for geometric models
- Typicality as cue validity
- Reduction by tracing
- General composition rule
- The conjunction effect
55
Typicality as cue validity
- How typical is $x$ for $A$? Write $c_x(A)$ for it.
- Cue validity: $c_x(A) = P(x \mid A)$ with classical probability $P$.
- It predicts a conjunction effect.
- Fact: if $x_1, x_2 \in A$, then $c_{x_1}(A) \geq c_{x_2}(A)$ iff $P(x_1) \geq P(x_2)$.
- Let $x_1$ be a typical apple, and $x_2$ a typical striped apple. Then $P(x_1) > P(x_2)$, but $c_{x_1}(\text{striped apple}) < c_{x_2}(\text{striped apple})$.
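The cue-validity argument can be checked on a toy distribution. In this sketch the instances, their prior probabilities, and the category extensions are hypothetical; only the definition $c_x(A) = P(x \mid A)$ comes from the slide.

```python
def cue_validity(x: str, category: set, prior: dict) -> float:
    """c_x(A) = P(x | A): prior of x renormalized over the members of A."""
    if x not in category:
        return 0.0
    return prior[x] / sum(prior[y] for y in category)

# Hypothetical prior over four apple instances (sums to 1).
prior = {"x1": 0.4, "x2": 0.1, "x3": 0.3, "x4": 0.2}
apple = {"x1", "x2", "x3", "x4"}
striped_apple = {"x2", "x4"}   # x1 is a plain apple, x2 a striped one

print(cue_validity("x1", apple, prior))          # typical apple
print(cue_validity("x2", apple, prior))          # striped apple rated as an apple
print(cue_validity("x2", striped_apple, prior))  # same instance rated as a striped apple
```

Restricting the category shrinks the normalizing sum, so a striped instance scores higher for "striped apple" than for "apple", which is the conjunction effect.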
56
Typicality in the geometric model
- Let instance $x$ be a vector of the Hilbert space, and let $A = \sum_i a_i x_i$ be a superposition of instances.
- Definition: $c_x(A) = |\langle x \mid A \rangle|^2$ (Born rule)
- Assuming that all instances are independent of each other (orthogonal), then $c_{x_i}(A) = |a_i|^2$.
- Use vector modification "∘" instead of classical conjunction.
57
Trace and the [symbol not preserved]-operator
- A standard method of dimension reduction is to apply tracing: $\mathrm{Tr}(X) = \sum_i X_{ii}$ (reducing the dimension by 2)
- Can we apply this method of reduction in order to get a general method of composing meanings?
- Yes: P. beim Graben, S. Clark
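Tracing sums over a pair of matching indices, so each application removes two indices from a tensor. The sketch below shows the plain matrix trace and one contraction of a three-index tensor; the nested-list representation and the function names are illustrative assumptions.

```python
def trace(matrix):
    """Tr(X) = sum_i X[i][i]: a two-index object becomes a scalar."""
    return sum(matrix[i][i] for i in range(len(matrix)))

def contract_first_two(tensor):
    """Sum over equal values of the first two indices of T[i][j][k],
    leaving a one-index object v[k]."""
    size = len(tensor)
    k_dim = len(tensor[0][0])
    return [sum(tensor[i][i][k] for i in range(size)) for k in range(k_dim)]

matrix = [[1, 2],
          [3, 4]]
tensor = [[[1, 2], [3, 4]],
          [[5, 6], [7, 8]]]

print(trace(matrix))               # 1 + 4
print(contract_first_two(tensor))  # [1 + 7, 2 + 8]
```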
58
Composition for simple sentences
Example: dogs chase cats, with types n (n\s/n) n
[Figure: reduction diagram]
59
Type reduction / Semantic reduction
- Lambek calculus for grammatical types
- The reduction diagram indicates the semantic operations (a sequence of linear operators in the underlying vector space).
- The sequence of operations is applied to an initial tensor product and reduces it to the semantic category of sentences.
[Figure: reduction diagram for n (n\s/n) n]
60
Composition for adjectival modification
Example: red nose, with types (n/n) n
[Figure: reduction diagram; product rule]
61
Common nouns of type n\s
- Lifting common nouns to type n\s, e.g. [example not preserved]
- It accounts for typicality and graded truth-value!
- The composition for adjectival modification can be extended straightforwardly.
- Both for typicality and for graded truth values we get an explanation of the conjunction puzzle.
62
Computational aspects
- The tensor product collecting the information from the semantic entries never needs to be calculated.
- The computation of the sentence representation is simply a matter of building a vector using inner products, a computationally simple operation.
- In some sense the processing assumptions conform to the slogan (Millikan, Recanati) that NL interpretation is as direct as perception (DPL thesis).
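One way to see why the full tensor product is never needed is to store a transitive verb as a short list of weighted (subject-side vector, sentence dimension, object-side vector) terms and to combine it with its arguments by inner products only. This is a rough sketch under that assumed representation; the data layout, the function names, and the toy vectors are hypothetical and are not taken from the slides.

```python
def dot(u, v):
    """Inner product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))

def compose_transitive(subject, verb_terms, obj, sentence_dim):
    """Reduce n (n\\s/n) n to s.

    verb_terms is a list of (weight, left_noun_vec, sentence_index, right_noun_vec).
    Each term contributes weight * <subject|left> * <right|object> to one
    coordinate of the sentence vector; no tensor product is materialized.
    """
    sentence = [0.0] * sentence_dim
    for weight, left, s_index, right in verb_terms:
        sentence[s_index] += weight * dot(subject, left) * dot(right, obj)
    return sentence

# Hypothetical two-dimensional noun space and one-dimensional sentence space.
dogs = [1.0, 0.0]
cats = [0.0, 1.0]
chase = [(1.0, [1.0, 0.0], 0, [0.0, 1.0])]   # "dog-like things chase cat-like things"

print(compose_transitive(dogs, chase, cats, 1))  # [1.0]
print(compose_transitive(cats, chase, dogs, 1))  # [0.0]
```

The cost grows with the number of verb terms and the noun dimension, not with the size of the tensor product space.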
63
Resumé
- With common nouns of type n\s, the present approach separates typicality from graded truth-value.
- Both for typicality and for graded truth values we get an explanation of the conjunction puzzle.
- Truth-functionally, modification leads to a product t-norm (with recalibration effect).
- The processing assumptions conform to the DPL thesis.
64
Conclusions
- Several examples of context-sensitivity can be treated in a straightforward way by using a compositional operation on conceptual states.
- Since conceptual states contain (frozen) usage information, they combine semantic and pragmatic information.
- Does this make 'truth-conditional pragmatics' (TCP) superfluous?
  - Yes: if TCP is a kind of "inferential theory"
  - No: if TCP is "as direct as perception"
65
What is the connection to QM?
- The objects in which we have situated word meanings are Hilbert spaces. Hilbert spaces are the basis of the mathematical theory of (classical) QM.
- Word meanings are combined by using the tensor product. Composite systems in QM are likewise represented using tensor products.
- For calculating typicality we use a probability measure based on vector (sub)spaces (Born rule). In connection with several puzzles of bounded rationality, there are good arguments why classical probabilities cannot be used. The theory of quantum probabilities forms the basis of QM.
- Further: non-commutativity of questions in the context of survey research (attitude questions). Compare observables in QM.