Conceptual structures in modern information retrieval Claudio Carpineto Fondazione Ugo Bordoni


1 Conceptual structures in modern information retrieval
Claudio Carpineto
Fondazione Ugo Bordoni, Roma
carpinet@fub.it

2 Overview
- Keyword-based IR and early conceptual approaches
- Context and concepts in modern topical IR
- Emerging IR tasks requiring knowledge structures
- Research at FUB
- Conclusions

3 Vector-based IR
Documents are represented as vectors of weighted keywords and the query as a vector of weighted keywords; matching the query vector against the document vectors yields the retrieved documents.
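A minimal sketch of the matching step in vector-based IR, assuming cosine similarity as the matching function (the slide does not name one). The document names, terms, and weights are made up for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as {term: weight}."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical weighted-keyword vectors; the weights are invented.
documents = {
    "d1": {"bank": 0.8, "account": 0.6},
    "d2": {"bank": 0.5, "river": 0.9},
    "d3": {"credit": 0.7, "finance": 0.7},
}
query = {"bank": 1.0, "account": 1.0}

def retrieve(query, documents):
    """Return the names of documents with a non-zero match, best first."""
    scored = [(cosine(query, vec), name) for name, vec in documents.items()]
    return [name for score, name in sorted(scored, reverse=True) if score > 0]

ranking = retrieve(query, documents)
print(ranking)
```

Here d3 shares no keyword with the query, so it is never retrieved, which is the vocabulary problem in its simplest form.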

4 Term weighting
- tf.idf and the vector space model (Salton) were very popular in the 70’s and 80’s
- BM25 (Robertson) was the state of the art in the 90’s
- Several recent term-weighting functions are based on statistical language modeling (Ponte, Lafferty)
- A new weighting framework based on deviation from randomness + information gain (FUB + UG)
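To make the first two weighting schemes concrete, here is a small sketch of one common textbook form of tf.idf and of BM25. The toy collection, the parameter values k1=1.2 and b=0.75, and the non-negative idf variant are assumptions for illustration, not values taken from the talk.

```python
import math
from collections import Counter

# Hypothetical toy collection: each document is a list of tokens.
docs = {
    "d1": "bank account finance bank".split(),
    "d2": "river bank water".split(),
    "d3": "finance credit".split(),
}
N = len(docs)
avgdl = sum(len(d) for d in docs.values()) / N
# Document frequency: number of documents containing each term.
df = Counter(t for d in docs.values() for t in set(d))

def tfidf(term, doc):
    """Raw term frequency times log inverse document frequency."""
    if not df[term]:
        return 0.0
    return doc.count(term) * math.log(N / df[term])

def bm25(term, doc, k1=1.2, b=0.75):
    """BM25 term weight with tf saturation and document-length normalization."""
    if not df[term]:
        return 0.0
    tf = doc.count(term)
    idf = math.log(1 + (N - df[term] + 0.5) / (df[term] + 0.5))
    return idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avgdl))

for name, doc in docs.items():
    print(name, round(tfidf("bank", doc), 3), round(bm25("bank", doc), 3))
```

With these definitions tf.idf grows linearly with the term count, while the BM25 weight saturates: doubling the occurrences of a term less than doubles its weight.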

6 Inherent limitations of keyword-based IR
- Vocabulary problem
- Relations are ignored

7 Early approaches to conceptual IR
- n-grams (Salton 1975, Maarek 1989)
- parse tree (Dillon 1983, Metzler 1989)
- case relations (Fillmore 1968, Somers 1987)
- conceptual graphs (Dick 1991)

8 Why early conceptual IR was not successful
- No best representation scheme
- Manual coding too costly
- Automated coding too hard
- Training required both for the indexer and the user
- Effectiveness not clearly demonstrated
- Retrieval task often not appropriate

9 Overview
- Vector-based IR and early conceptual approaches
- Context and concepts in modern topical IR
- Emerging IR tasks requiring knowledge structures
- Research at FUB
- Conclusions

10 Evolution of topical IR
- Very short queries
- Heterogeneous collections
- Unreliable sources
- Interactive sessions

11 Model of modern topical IR
Diagram components: Docs, Query, Context, Indexing, Ranking, Visualization, Interaction, Use

13 Performance of retrieval feedback versus query difficulty

14 Ranking based on interdocument similarity
Cluster hypothesis (van Rijsbergen 1978)
Approaches
- Matching the query against document clusters (Willett 1988)
- Matching the query against transformed document representations (GVSM, Wong 1987; LSI, Deerwester 1990)
- Computing the conceptual distance between query and documents (Order-theoretical ranking, Carpineto 2000)

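15 Order-theoretical ranking
Concept lattice figure; nodes are listed with the number attached to them in the figure.
Query and documents:
- 0 NNS FINANCE (Query)
- 1 NNS FINANCE BANK ACCOUNT (D1)
- 1 NNS FINANCE CREDIT KBS (D7)
- 2 NNS BANK ACCOUNT (D3)
- 2 FINANCE CREDIT KBS (D4)
- 3 NNS BANK RIVER (D2)
- 3 CREDIT KBS (D5)
- 4 KBS WATERS (D6)
Other nodes:
- 1 NNS
- 1 FINANCE
- 2 NNS BANK
- 3 BANK
- 4 KBS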
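The example on this slide can be reproduced with a short sketch. It rests on two assumptions that the slide does not state: the number attached to each document is the length of the shortest path from the query node in the lattice diagram, and the lattice has no artificial bottom node containing all terms. Under those assumptions the sketch yields the document numbers shown on the slide. The function names are illustrative and are not taken from the original system.

```python
from collections import deque
from itertools import combinations

# Term sets copied from the slide's example.
docs = {
    "D1": {"NNS", "FINANCE", "BANK", "ACCOUNT"},
    "D2": {"NNS", "BANK", "RIVER"},
    "D3": {"NNS", "BANK", "ACCOUNT"},
    "D4": {"FINANCE", "CREDIT", "KBS"},
    "D5": {"CREDIT", "KBS"},
    "D6": {"KBS", "WATERS"},
    "D7": {"NNS", "FINANCE", "CREDIT", "KBS"},
}
query = frozenset({"NNS", "FINANCE"})

def closed_intents(term_sets):
    """Close the term sets under intersection; each result is a lattice node."""
    intents = {frozenset(s) for s in term_sets}
    changed = True
    while changed:
        changed = False
        for a, b in combinations(list(intents), 2):
            if a & b not in intents:
                intents.add(a & b)
                changed = True
    return intents

def hasse_neighbours(intents):
    """Link every node to the nodes directly above and below it."""
    adj = {i: set() for i in intents}
    for a in intents:
        for b in intents:
            # a < b tests proper subset; keep the pair only if no node lies between.
            if a < b and not any(a < c < b for c in intents):
                adj[a].add(b)
                adj[b].add(a)
    return adj

def distances_from(start, adj):
    """Breadth-first search: shortest path length from start to each node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

# The query is added as an extra term set so that it is a node of the lattice.
intents = closed_intents(list(docs.values()) + [query])
dist = distances_from(query, hasse_neighbours(intents))
doc_distance = {name: dist[frozenset(terms)] for name, terms in docs.items()}
ranking = sorted(docs, key=lambda name: (doc_distance[name], name))
print(ranking)
```

Note that D5 and D6 share no term with the query yet still receive a finite distance, which is how this kind of ranking can reach non-matching documents. The brute-force closure and cover computations are quadratic or worse in the number of nodes, so this is only suitable for toy examples.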

16 Performance of order-theoretical ranking
- Better than hierarchic clustering and comparable to best matching on the whole collection
- Markedly better than both hierarchic clustering and best matching on non-matching relevant documents
- Order-theoretical ranking does not scale up well, but it is synergistic with best matching document ranking

17 Overview
- Vector-based IR and early conceptual approaches
- Context and concepts in modern topical IR
- Emerging IR tasks requiring knowledge structures
- Research at FUB
- Conclusions

18 Question Answering
Task: closed-class questions in unrestricted domains, with no guarantee of an answer and with results possibly scattered over multiple documents

19 Question Answering
Approach:
1. Recognize type of queries
2. Retrieve relevant documents
3. Find sought entities near question words
4. Fall back to best-matching passage retrieval in case of failure
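The four steps can be sketched as a toy pipeline. Everything here is a simplifying assumption: question types are recognized from the first word only, the entity patterns are crude regular expressions, "near question words" is approximated as "in a passage that shares at least one question word", and the passages are invented.

```python
import re

# Hypothetical mapping from question word to a pattern for the sought entity.
ANSWER_PATTERNS = {
    "when": re.compile(r"\b(?:1[0-9]{3}|20[0-9]{2})\b"),      # a four-digit year
    "who": re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b"),   # a capitalized name
}
STOPWORDS = {"when", "who", "was", "did", "do", "the", "a", "of", "is", "in"}

def question_type(question):
    """Step 1: recognize the question type from its first word."""
    words = question.lower().split()
    return words[0] if words and words[0] in ANSWER_PATTERNS else None

def keywords(text):
    """Lowercased content words of a text."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def answer(question, passages):
    terms = keywords(question)
    # Step 2: rank passages by keyword overlap with the question.
    ranked = sorted(passages, key=lambda p: len(terms & keywords(p)), reverse=True)
    qtype = question_type(question)
    if qtype:
        # Step 3: look for an entity of the expected type in matching passages.
        for passage in ranked:
            if not terms & keywords(passage):
                break
            match = ANSWER_PATTERNS[qtype].search(passage)
            if match:
                return match.group(0)
    # Step 4: fall back to the best-matching passage.
    return ranked[0]

passages = [
    "The library opened in 1975 after two years of construction.",
    "Rivers carry water to the sea.",
]
print(answer("When did the library open?", passages))
print(answer("Where do rivers carry water?", passages))
```

The second question has an unrecognized type, so the sketch returns the whole best-matching passage rather than a short answer.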

20 Web Information Retrieval

21 Current tasks:
- named-entity finding task
- topic distillation task
Approach:
1. Use of multiple methods
2. Combination of results via interpolation and normalization schemes
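One simple way to combine the results of several methods is to normalize each score list and then interpolate linearly. The sketch below assumes min-max normalization and fixed weights; the slide does not say which schemes were actually used, and the two score lists are invented.

```python
def min_max(scores):
    """Rescale a {doc: score} mapping to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}

def interpolate(runs, weights):
    """Weighted sum of normalized scores; returns documents best first."""
    combined = {}
    for run, weight in zip(runs, weights):
        for doc, score in min_max(run).items():
            combined[doc] = combined.get(doc, 0.0) + weight * score
    return sorted(combined, key=combined.get, reverse=True)

# Hypothetical scores from two methods on different scales.
content_scores = {"a": 12.0, "b": 8.0, "c": 4.0}
link_scores = {"a": 0.1, "b": 0.2, "c": 0.9}

print(interpolate([content_scores, link_scores], [0.7, 0.3]))
print(interpolate([content_scores, link_scores], [0.3, 0.7]))
```

Normalization matters because the raw scores live on different scales; without it, the method with the larger numbers would dominate regardless of the weights.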

22 XML document retrieval
Goal: Use document structure to improve precision and recall of unstructured queries
Example: “concerts this weekend at Sofia under 20 euros”
Approaches:
- Automatic inference of query structure
- Semi-automatic query annotation
- Hybrid query languages
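As a rough illustration of automatic inference of query structure, the sketch below maps words of the example query onto XML elements by looking them up among the values found in the collection. The element names (`event`, `type`, `city`, `price`), the sample data, and the "under N euros" pattern are all hypothetical, and the date constraint ("this weekend") is deliberately ignored.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical collection; element names and values are invented.
XML = """<events>
  <event><type>concert</type><city>Sofia</city><price currency="EUR">15</price></event>
  <event><type>concert</type><city>Sofia</city><price currency="EUR">35</price></event>
  <event><type>concert</type><city>Rome</city><price currency="EUR">10</price></event>
</events>"""

def infer_structure(query, root):
    """Guess which element each query word refers to, using collection values."""
    cities = {e.text for e in root.iter("city")}
    types = {e.text for e in root.iter("type")}
    constraints = {}
    for word in re.findall(r"\w+", query):
        singular = word.rstrip("s").lower()   # naive plural stripping
        if word in cities:
            constraints["city"] = word
        if singular in types:
            constraints["type"] = singular
    match = re.search(r"under (\d+) euros", query)
    if match:
        constraints["max_price"] = float(match.group(1))
    return constraints

def search(query, root):
    """Return the events that satisfy every inferred constraint."""
    c = infer_structure(query, root)
    hits = []
    for event in root.iter("event"):
        if "city" in c and event.findtext("city") != c["city"]:
            continue
        if "type" in c and event.findtext("type") != c["type"]:
            continue
        if "max_price" in c and float(event.findtext("price")) >= c["max_price"]:
            continue
        hits.append(event)
    return hits

root = ET.fromstring(XML)
query = "concerts this weekend at Sofia under 20 euros"
constraints = infer_structure(query, root)
hits = search(query, root)
print(constraints, [h.findtext("price") for h in hits])
```

A plain keyword match would treat "20" and "euros" as ordinary terms; interpreting them as a bound on the price element is what lets structure improve precision here.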

23 Overview
- Vector-based IR and early conceptual approaches
- Context and concepts in modern topical IR
- Emerging IR tasks requiring knowledge structures
- Research at FUB
- Conclusions

24 Recommender systems
“Related keyword” feature versus context-dependent query reformulation

27 Combining text retrieval and text mining with concept lattices
Goal: Integration of multiple search strategies (querying, browsing, thesaurus climbing, bounding) into a single Web interface

28 Conclusions
The use of conceptual structures surfaces in traditional topic relevance retrieval, and it is at the heart of many non-topical retrieval tasks.
Towards conceptual search:
- Understand term meaning
- Adapt to the user
- Can translate between applications
- Explainable
- Capable of filtering and summarization

