© CvR1 The Geometry of IR Keith van Rijsbergen Tampere 15th August, 2002 (lost in Hilbert space!)


1 © CvR1 The Geometry of IR Keith van Rijsbergen Tampere 15th August, 2002 (lost in Hilbert space!)

2 © CvR2 Unscripted comments I States Observables Measurement => Reality? Projection Postulates Cognitive State Changes

3 © CvR3 Unscripted comments II (quoting John von Neumann) However, all quantum mechanical probabilities are defined by inner products of vectors. Essentially if a state of a system is given by one vector, the transition probability in another state is the inner product of the two which is the square of the cosine of the angle between them. In other words, probability corresponds precisely to introducing the angles geometrically. Furthermore, there is only one way to introduce it. The more so because in the quantum mechanical machinery the negation of a statement, so the negation of a statement which is represented by a linear set of vectors, corresponds to the orthogonal complement of this linear space. Unsolved problems in mathematics, typescript, September, 1954

4 © CvR4 What is this talk about? Not about quantum computation. see Nielsen and Chuang, CUP, 2000 Not about Logic see Engesser and Gabbay, AI, 2002 History (von Neumann, Dirac, Schroedinger) Motivation (complementarity) Duality (Syntax/Semantics) Measurement (Incompatibility) Projections (subspaces) Probability (inner products) IR application (feedback, clusters, ostension)

5 © CvR5

6 6

7 7 Images not Text: how might that make a difference? no visual keywords (yet) - tf/idf issue aboutness revisable (eg Maron) relevance revisable (eg Goffman) feedback requires salience aboutness -> relevance -> aboutness

8 © CvR8 This is not new! Goffman, 1969:..that the relevance of the information from one document depends upon what is already known about the subject, and in turn affects the relevance of other documents subsequently examined. Maron, : Just because a document is about the subject sought by a patron, that fact does not imply that he would judge it relevant.

9 © CvR9 Maron's theory of indexing … in the case where the query consists of a single term, call it B, the probability that a given document will be judged relevant by a patron submitting B is simply the ratio of the number of patrons who submit B as their query and judge that document as relevant, to the number of patrons who submit B as their search query
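
Maron's ratio above can be sketched as a simple count over a log of relevance judgements. This is only an illustration: the log format (tuples of patron, query term, document, judgement) and the function name `relevance_probability` are assumptions made here, not something given in the talk.

```python
def relevance_probability(judgements, term, doc_id):
    """Estimate P(doc judged relevant | query is `term`) as Maron's ratio.

    `judgements` is a list of (patron, query_term, doc_id, judged_relevant)
    tuples. This log format is hypothetical, chosen only for the sketch.
    """
    # Patrons who submitted `term` as their query (the denominator).
    submitters = {p for p, q, _, _ in judgements if q == term}
    # Those among them who judged this particular document relevant.
    satisfied = {
        p for p, q, d, rel in judgements
        if q == term and d == doc_id and rel
    }
    if not submitters:
        return None  # no evidence at all for this query term
    return len(satisfied) / len(submitters)


log = [
    ("p1", "B", "d7", True),
    ("p2", "B", "d7", False),
    ("p3", "B", "d7", True),
    ("p4", "B", "d9", True),
    ("p5", "C", "d7", True),
]
prob = relevance_probability(log, "B", "d7")
```

Four patrons submit B and two of them judge d7 relevant, so the estimate for d7 under query B is 2/4.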

10 © CvR10 In 1949 D. M. MacKay wrote a paper, Quantal aspects of scientific information, SER, vol 41, no. 314, in which he alluded to applying the quantum mechanics paradigm to IR

11 © CvR11 Expectation Catalogue It ($\psi$-function) is now the means for predicting probability of measurement results. In it is embodied the momentarily-attained sum of theoretically based future expectation, somewhat as laid down in a catalogue. It is the relation-and-determinacy-bridge between measurements and measurements ... It is, in principle, determined by a finite number of suitably chosen measurements on the object ... Thus the catalogue of expectations is initially compiled. Schrödinger, 1935 & 1980

12 © CvR12 Hypotheses Cluster Hypothesis: closely associated documents tend to be relevant to the same requests. (1971) [co-ordination is positively correlated with external relevance, Jackson, 1969] Association Hypothesis: If an index term is good at discriminating relevant from non-relevant documents then any closely associated index term is also likely to be good at this. (1979) [co-occurrence of terms within documents is a suitable measure of similarity between terms, Jackson,1971]

13 © CvR13 Navigation - Browsing T-space D-space

14 © CvR14 DUALITY Direct file/Inverted file State space/Space of Projections $d_1 = (x,y,z,u,v,w)$, $d_2 = (u,v,w,k,l,m)$; $[[u]] = \{d_1,d_2\}$; $[[x]] = \{d_1\}$; $[[m]] = \{d_2\}$ Boolean Logic: $[[u \wedge x]] = \{d_1\}$; $[[x \vee m]] = \{d_1,d_2\}$ Quantum Logic: $[[u \wedge x]]$ = same; $[[x \vee m]]$ = different
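
The direct file/inverted file duality on this slide can be sketched with plain sets. The sketch assumes the two documents are d1 = (x,y,z,u,v,w) and d2 = (u,v,w,k,l,m) and that the Boolean connectives are conjunction and disjunction; the subscripts and operators were lost in the transcript, so these readings are assumptions. Only the Boolean semantics is shown; the quantum-logic disjunction (a span of subspaces rather than a set union) is not modelled here.

```python
from collections import defaultdict

# Direct file: each document lists its index terms.
direct_file = {
    "d1": ("x", "y", "z", "u", "v", "w"),
    "d2": ("u", "v", "w", "k", "l", "m"),
}


def invert(direct):
    """Build the inverted file: each term maps to the set of documents it indexes."""
    inverted = defaultdict(set)
    for doc, terms in direct.items():
        for term in terms:
            inverted[term].add(doc)
    return dict(inverted)


inverted_file = invert(direct_file)

# Boolean semantics: conjunction is set intersection, disjunction is set union.
u_and_x = inverted_file["u"] & inverted_file["x"]
x_or_m = inverted_file["x"] | inverted_file["m"]
```

Under these assumptions `inverted_file["u"]` holds both documents, `u_and_x` holds only d1, and `x_or_m` holds both documents, matching the Boolean row of the slide.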

15 © CvR15 The mathematics you need: Hilbert space (complex!!!); inner product $\langle x|y\rangle$; norms $\|x\|^2 = \langle x|x\rangle$; operator (linear); Hermitian $A^* = A$; trace $\mathrm{tr}(A) = \sum_i a_{ii}$; eigenvalues $Ax = \lambda x$
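
Each item in this toolbox has a direct finite-dimensional counterpart. The following is a minimal NumPy sketch in a two-dimensional complex space; the particular vectors and the matrix `A` are arbitrary illustrative values, not taken from the talk.

```python
import numpy as np

# Two vectors in a 2-dimensional complex Hilbert space.
x = np.array([1 + 1j, 2 - 1j])
y = np.array([0.5j, 1.0])

# Inner product <x|y>: np.vdot conjugates its first argument.
inner_xy = np.vdot(x, y)

# Squared norm ||x||^2 = <x|x>, which is always real and non-negative.
norm_sq = np.vdot(x, x).real

# A Hermitian operator equals its own conjugate transpose: A* = A.
A = np.array([[2.0, 1 - 1j],
              [1 + 1j, 3.0]])
is_hermitian = np.allclose(A, A.conj().T)

# Trace: the sum of the diagonal entries a_ii.
trace_A = np.trace(A)

# Eigenvalues and eigenvectors, A v = lambda v. For Hermitian input
# np.linalg.eigh returns real eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(A)
```

The trace equals the sum of the eigenvalues, which is a quick sanity check on any Hermitian operator built later.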

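16 © CvR16 Crash course on Dirac notation $|x\rangle$: vector (called ket); $\langle x| = |x\rangle^*$: functional (bra); $\langle x|y\rangle$ = (row vector)(column vector) $= \sum_i x_i^* y_i$; $|x\rangle\langle y|$: linear operator; $|x\rangle\langle x|$: a projector onto ray $x$; $\mathrm{tr}(|x\rangle\langle y|) = \langle y|x\rangle$; $I = \sum_i |i\rangle\langle i|$: universal projector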
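
In finite dimensions the Dirac objects map onto column vectors, row vectors and matrices. The sketch below is one way to write that mapping in NumPy; the variable names and the example kets are illustrative assumptions.

```python
import numpy as np

# Kets are column vectors; the bra is the conjugate transpose of the ket.
ket_x = np.array([[1.0], [1.0j]]) / np.sqrt(2)
ket_y = np.array([[1.0], [0.0]])
bra_x = ket_x.conj().T

# <x|y> = (row vector)(column vector) = sum_i conj(x_i) * y_i
braket_xy = (bra_x @ ket_y).item()

# |x><y| is a linear operator (a matrix); |x><x| projects onto the ray of x.
outer_xy = ket_x @ ket_y.conj().T
proj_x = ket_x @ bra_x

# Summing |i><i| over an orthonormal basis gives the identity operator.
identity = sum(np.outer(e, e.conj()) for e in np.eye(2))
```

Because `ket_x` is normalised, `proj_x` is idempotent, and the trace of the outer product `outer_xy` equals the inner product with the two arguments swapped.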

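17 © CvR17 Hierarchy of Projectors $P_0 = 0$, $P_n = I$; $P_1 = |1\rangle\langle 1|$; $P_2 = |1\rangle\langle 1| + |2\rangle\langle 2|$; $\ldots$; $P_n = |1\rangle\langle 1| + \cdots + |n\rangle\langle n|$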
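
One reading of this hierarchy is a nested chain of projectors, each obtained by adding the next basis projector to the previous one, starting from the zero operator and ending at the identity. The sketch assumes that reading and uses the standard basis; `projector_hierarchy` is a name introduced here for illustration.

```python
import numpy as np


def projector_hierarchy(n):
    """Return [P_0, P_1, ..., P_n] with P_k = |1><1| + ... + |k><k|.

    Uses the standard basis of an n-dimensional space (an assumption
    of this sketch). P_0 is the zero operator and P_n is the identity.
    """
    basis = np.eye(n)
    projectors = [np.zeros((n, n))]
    for k in range(n):
        projectors.append(projectors[-1] + np.outer(basis[k], basis[k]))
    return projectors


P = projector_hierarchy(3)
```

Each `P[k]` is idempotent with trace `k`, and the chain is nested in the sense that `P[j] @ P[k]` equals `P[j]` whenever `j <= k`.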

18 © CvR18 Summary Relevance/Aboutness Documents Queries Observables Operators State function Operators can be applied to state function; and operators can be decomposed into projectors. $A = \sum_i a_i P_i$
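
The decomposition of an operator into projectors can be checked numerically for a small Hermitian matrix. In this sketch the observable `A` and the `state` vector are arbitrary illustrative values; the eigenvalues play the role of the coefficients and the eigenvector outer products play the role of the projectors.

```python
import numpy as np

# A Hermitian observable (values chosen only for illustration).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh returns eigenvalues a_i and orthonormal eigenvectors as columns.
eigenvalues, eigenvectors = np.linalg.eigh(A)

# One rank-one projector P_i = |v_i><v_i| per eigenvector.
projectors = [np.outer(v, v.conj()) for v in eigenvectors.T]

# Spectral decomposition: A = sum_i a_i P_i
reconstructed = sum(a * P for a, P in zip(eigenvalues, projectors))

# Applying the operator to a (unit) state: each outcome a_i is observed
# with probability <state|P_i|state>, and <state|A|state> is the mean.
state = np.array([1.0, 0.0])
probabilities = [float(state @ P @ state) for P in projectors]
expectation = float(state @ A @ state)
```

The outcome probabilities sum to one, and the expectation equals the probability-weighted sum of the eigenvalues.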

19 © CvR19 That is, the relevance or irrelevance of a given retrieved document may affect the user's current state of knowledge, resulting in a change of the user's information need, which may lead to a change of the user's perception/interpretation of the subsequent retrieved documents…. Borlund, 2000

20 © CvR20 [Diagram: nodes labelled T and R with Y/N branches] Relevance/Aboutness is Interaction/User dependent

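21 © CvR21 probability as inner product $|t\rangle\langle t|x\rangle\langle x|t\rangle\langle t| = |\langle t|x\rangle|^2\,|t\rangle\langle t| = \cos^2\theta\,|t\rangle\langle t|$ (in real Hilbert space, with $\theta$ the angle between the unit vectors $t$ and $x$)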
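
The probability read off from an inner product can be sketched in a real two-dimensional space, where it reduces to the squared cosine of the angle between the two unit vectors. The names `t`, `x` and `projection_probability` are assumptions of this sketch, with `x` standing for the state vector.

```python
import numpy as np


def projection_probability(t, x):
    """Return |<t|x>|^2 after normalising both vectors."""
    t = t / np.linalg.norm(t)
    x = x / np.linalg.norm(x)
    return abs(np.vdot(t, x)) ** 2


# A real 2-dimensional example with a 60 degree angle between t and x.
t = np.array([1.0, 0.0])
theta = np.pi / 3
x = np.array([np.cos(theta), np.sin(theta)])

p = projection_probability(t, x)

# The same number appears as the coefficient in P_t |x><x| P_t = p * P_t.
P_t = np.outer(t, t)
sandwiched = P_t @ np.outer(x, x) @ P_t
```

Here `p` is the squared cosine of 60 degrees, and `sandwiched` is that same coefficient times the projector onto `t`.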

22 © CvR22 [Figure: state vector $x$ drawn against the two basis pairs $|t=1\rangle$, $|t=0\rangle$ and $|r=1\rangle$, $|r=0\rangle$]

23 © CvR23 An operator $T$ is of trace-class provided that $T$ is positive ($\langle x|Tx\rangle \ge 0$ for all $x$) and the trace of $T$ is finite ($\mathrm{tr}(T) < \infty$). $T$ is a density operator if $T$ is trace-class and $\mathrm{tr}(T) = 1$. $T = \sum_i a_i P_i$ is a density operator if $0 \le a_i$ and $\sum_i a_i = 1$
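
In finite dimensions every operator has a finite trace, so the density-operator conditions reduce to positivity and unit trace. The sketch below checks those conditions and builds a density operator as a weighted mixture of rank-one projectors; the function names `is_density_operator` and `mixture` are introduced here for illustration, and the projectors are assumed to be rank one.

```python
import numpy as np


def is_density_operator(T, tol=1e-9):
    """Check that T is Hermitian, positive (eigenvalues >= 0) and has trace 1."""
    if not np.allclose(T, T.conj().T, atol=tol):
        return False
    eigenvalues = np.linalg.eigvalsh(T)
    positive = eigenvalues.min() >= -tol
    unit_trace = abs(np.trace(T).real - 1.0) <= tol
    return bool(positive and unit_trace)


def mixture(weights, vectors):
    """Build T = sum_i a_i |v_i><v_i| from weights a_i and (normalised) vectors v_i."""
    dim = len(vectors[0])
    T = np.zeros((dim, dim), dtype=complex)
    for a, v in zip(weights, vectors):
        v = np.asarray(v, dtype=complex)
        v = v / np.linalg.norm(v)
        T += a * np.outer(v, v.conj())
    return T


# Non-negative weights summing to one give a density operator,
# even when the vectors are not orthogonal.
T = mixture([0.25, 0.75], [[1.0, 0.0], [1.0, 1.0]])
```

The identity on a two-dimensional space fails the check because its trace is 2, and an operator with a negative eigenvalue fails on positivity.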

24 © CvR24 Theorem Let $\mu$ be any measure on the closed subspaces of a separable (real or complex) Hilbert space $H$ of dimension at least 3. There exists a positive self-adjoint operator $T$ of trace class such that, for all closed subspaces $L$ of $H$, $\mu(L) = \mathrm{Tr}(T P_L)$. If $\mu$ is to be a probability measure, thus requiring that $\mu(H) = 1$, then $\mathrm{Tr}(T) = 1$, that is, $T$ is a density operator.
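
The trace formula in the theorem can be tried out in three dimensions. This sketch only evaluates the measure that a given density operator induces on subspaces; it does not prove anything about the converse direction. The helper names and the diagonal `T` are assumptions, and `subspace_projector` assumes its input vectors are linearly independent.

```python
import numpy as np


def subspace_projector(vectors):
    """Orthogonal projector onto the span of linearly independent vectors."""
    M = np.column_stack(vectors)
    Q, _ = np.linalg.qr(M)  # columns of Q form an orthonormal basis of the span
    return Q @ Q.conj().T


def measure(T, P_L):
    """Measure of the subspace L under the density operator T: tr(T P_L)."""
    return float(np.trace(T @ P_L).real)


# A density operator on a 3-dimensional space (illustrative values).
T = np.diag([0.5, 0.3, 0.2])
e = np.eye(3)

P_1 = subspace_projector([e[0]])               # a ray
P_23 = subspace_projector([e[1], e[2]])        # its orthogonal complement
P_H = subspace_projector([e[0], e[1], e[2]])   # the whole space

mu_1 = measure(T, P_1)
mu_23 = measure(T, P_23)
mu_H = measure(T, P_H)
```

The whole space gets measure 1, and the measures of the two orthogonal subspaces add up to it, as a probability measure on subspaces should.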

25 © CvR25 Conditional Probability $P(L_A \mid L_B) = \mathrm{tr}(P_B D P_B P_A) / \mathrm{tr}(D P_B)$ Note that $P_A$ could be $E \to F$
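
The conditional probability formula on this slide translates directly into a few matrix products. In the sketch below `D` is a density operator and `P_A`, `P_B` are projectors; the function name and the example values are illustrative assumptions.

```python
import numpy as np


def conditional_probability(D, P_A, P_B):
    """P(L_A | L_B) = tr(P_B D P_B P_A) / tr(D P_B)."""
    denominator = np.trace(D @ P_B).real
    if np.isclose(denominator, 0.0):
        raise ValueError("conditioning subspace has zero probability")
    return float(np.trace(P_B @ D @ P_B @ P_A).real / denominator)


# Illustrative 2-dimensional example.
D = np.diag([0.6, 0.4])                 # density operator
e1 = np.array([1.0, 0.0])
diag = np.array([1.0, 1.0]) / np.sqrt(2)
P_B = np.outer(e1, e1)                  # condition on the ray of e1
P_A = np.outer(diag, diag)              # event: the ray at 45 degrees to e1

p_a_given_b = conditional_probability(D, P_A, P_B)

# With the trivial condition P_B = I the formula reduces to tr(D P_A).
p_a = conditional_probability(D, P_A, np.eye(2))
```

Conditioning on `P_B` first collapses the state onto the ray of `e1`, so the answer depends only on the angle between the two rays. Because `P_A` and `P_B` do not commute, the order of the projectors in the formula matters.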

26 © CvR26 What is T? – without blinding you with science -Relevance Feedback ( a mixture with log weights) -Pseudo relevance feedback (a mixture with similarity weights) -Clustering (superposition of members?) -Ostension (a history)

27 © CvR27 Conclusions? Is it worth it? Does it matter? - images - logic/probability/information/vectors - language

28 © CvR28 Useful References Readings in Information Retrieval, Morgan Kaufmann, edited by Sparck Jones and Willett. Advances in Information Retrieval: Recent Research from CIIR, edited by Bruce Croft. Information Retrieval: Uncertainty and Logics, Advanced Models for the Representation and Retrieval of Information, edited by Crestani, Lalmas, Van Rijsbergen. Finding Out About, Richard Belew.

