Evaluating XML retrieval: The INEX initiative Mounia Lalmas Queen Mary University of London


1 Evaluating XML retrieval: The INEX initiative Mounia Lalmas Queen Mary University of London

2 Outline
- Information retrieval
- (Content-oriented) XML retrieval
- Evaluating information retrieval
- Evaluating XML retrieval: INEX

3 Information retrieval
Example of a user information need: Find all documents about sailing charter agencies that (1) offer sailing boats in the Greek islands, and (2) are registered with the RYA. The documents should contain boat specifications, price per week, and other contact details.
A formal representation of an information need constitutes a query.

4 Information retrieval
IR is concerned with the representation, storage, organisation of, and access to repositories of information, usually in the form of documents.
Primary goal of an IR system: retrieve all the documents which are relevant (useful) to a user query, while retrieving as few non-relevant documents as possible.

5 Conceptual model for IR
[Figure: documents are indexed into a document representation; the user's need is formulated into a query representation; a retrieval function matches the two to produce retrieval results, which can be refined via relevance feedback]

6 Structured Document Retrieval
- Traditional IR is about finding documents relevant to a user's information need, e.g. an entire book.
- SDR allows users to retrieve document components that are more focused on their information needs, e.g. a chapter of a book instead of the entire book.
- The structure of documents is exploited to identify which document components to retrieve.

7 Structured Documents
- Linear order of words, sentences, paragraphs, ...
- Hierarchy or logical structure of a book's chapters, sections, ...
- Links (hyperlinks), cross-references, citations, ...
- Temporal and spatial relationships in multimedia documents
[Figure: a book decomposed into chapters, sections and paragraphs, alongside a World Wide Web page]

8 Structured Documents
Explicit structure is formalised through document representation standards (mark-up languages):
- Layout: LaTeX (publishing), HTML (Web publishing)
- Structure: SGML, XML (Web publishing, engineering), MPEG-7 (broadcasting)
- Content/semantics: RDF, DAML+OIL, OWL (Semantic Web)

9 XML: eXtensible Mark-up Language
- Meta-language (user-defined tags) currently being adopted as the document format language by the W3C
- Used to describe content and structure (and not layout)
- Grammar described in a DTD (used for validation)
[Figure: an example XML article with author "Smith John" and title "Introduction into XML retrieval"]

10 XML: eXtensible Mark-up Language
Use of XPath notation to refer to the XML structure:
- chapter/title: title is a direct sub-component of chapter
- //title: any title
- chapter//title: title is a direct or indirect sub-component of chapter
- chapter/paragraph[2]: the second paragraph directly within a chapter
- chapter/*: all direct sub-components of a chapter
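The XPath patterns above can be tried out with Python's standard-library ElementTree, which supports a limited XPath subset (the document below, and its element names, are made up for illustration):

```python
import xml.etree.ElementTree as ET

# A small, invented XML document mirroring the slide's book structure.
doc = """
<article>
  <chapter>
    <title>Introduction</title>
    <section><title>Background</title></section>
    <paragraph>First paragraph.</paragraph>
    <paragraph>Second paragraph.</paragraph>
  </chapter>
</article>
"""
root = ET.fromstring(doc)

# chapter/title: title directly under chapter
print([t.text for t in root.findall("chapter/title")])         # ['Introduction']
# .//title: any title at any depth (ElementTree's form of //title)
print([t.text for t in root.findall(".//title")])              # ['Introduction', 'Background']
# chapter/paragraph[2]: the second paragraph directly within a chapter
print([p.text for p in root.findall("chapter/paragraph[2]")])  # ['Second paragraph.']
# chapter/*: all direct sub-components of a chapter
print(len(root.findall("chapter/*")))                          # 4
```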

11 Querying XML documents
- Content-only (CO) queries: 'open standards for digital video in distance learning'
- Content-and-structure (CAS) queries: //article[about(., 'formal methods verify correctness aviation systems')]/body//section[about(., 'case study application model checking theorem proving')]
- Structure-only (SA) queries: /article//*section/paragraph[2]

12 Conceptual model for XML retrieval
[Figure: the IR model of slide 5, with each stage made structure-aware: documents carry content + structure; indexing builds an inverted file + structure index (tf, idf, acc); the retrieval function matches on content + structure; results are presented as related components]

13 Content-oriented XML retrieval
Return document components of varying granularity (e.g. a book, a chapter, a section, a paragraph, a table, a figure), relevant to the user's information need with regard to both content and structure.

14 Content-oriented XML retrieval
Retrieve the best components according to content and structure criteria:
- INEX: the most specific component that satisfies the query, while being exhaustive to the query
- Shakespeare study: best entry points, which are components from which many relevant components can be reached through browsing

15 Challenges
[Figure: an article with a title and two sections, each component carrying term weights, e.g. 0.9/0.5/0.2 for 'XML', 0.4 for 'retrieval', 0.7 for 'authoring']
No fixed retrieval unit + nested document components + different types of document components:
- how to obtain document and collection statistics?
- which component is a good retrieval unit?
- which components contribute best to the content of the article?
- how to estimate? how to aggregate?
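One simple, purely illustrative answer to the aggregation question is to propagate term weights from child components up to their parent, discounting nested contributions by a damping factor. The weights and the factor below are invented; this is a sketch, not the INEX-prescribed method:

```python
# Illustrative sketch: aggregate term weights of nested components into
# their parent, discounting child evidence by a damping factor alpha.
# All weights below are invented for illustration.

def aggregate(own_weights, child_weight_lists, alpha=0.6):
    """Combine a component's own term weights with its children's,
    discounting each child contribution by alpha."""
    agg = dict(own_weights)
    for child in child_weight_lists:
        for term, w in child.items():
            agg[term] = agg.get(term, 0.0) + alpha * w
    return agg

title    = {"XML": 0.9}
section1 = {"XML": 0.5, "retrieval": 0.4}
section2 = {"XML": 0.2, "authoring": 0.7}

# The article has no text of its own here; its content is aggregated.
article = aggregate({}, [title, section1, section2])
print({t: round(w, 2) for t, w in article.items()})
```

The design choice here (sum of damped child weights) is only one option; length normalisation or picking the best child are equally plausible answers to the slide's open question.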

16 Approaches
- Models: vector space model, probabilistic model, Bayesian network, language model, extending DB models, Boolean model, logistic regression, belief model, natural language processing, cognitive model, ontology
- Techniques: parameter estimation, tuning, smoothing, fusion, phrases, term statistics, collection statistics, component statistics, proximity search, relevance feedback

17 Evaluation
- The goal of an IR system: retrieve as many relevant documents as possible and as few non-relevant documents as possible
- Comparative evaluation of the technical performance of IR systems = effectiveness: the ability of the IR system to retrieve relevant documents and suppress non-relevant documents
- Effectiveness: a combination of recall and precision

18 Relevance
- A document is relevant if it has significant and demonstrable bearing on the matter at hand.
- Common assumptions: objectivity, topicality, binary nature, independence

19 Recall / Precision
[Figure: Venn diagram of the document collection, the retrieved set, the relevant set, and their intersection (retrieved and relevant)]
Precision = |retrieved and relevant| / |retrieved|
Recall = |retrieved and relevant| / |relevant|
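Precision and recall over these sets translate directly into code (the document ids below are invented):

```python
# Set-based precision and recall for a single query.
retrieved = {"d1", "d3", "d5", "d7"}   # what the system returned (invented)
relevant  = {"d3", "d5", "d9"}         # the ground truth (invented)

hits = retrieved & relevant            # retrieved AND relevant

precision = len(hits) / len(retrieved)   # 2/4 = 0.5
recall    = len(hits) / len(relevant)    # 2/3

print(precision, round(recall, 2))     # 0.5 0.67
```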

20 Recall / Precision
Relevant documents for a given query: {d3, d5, d9, d25, d39, d44, d56, d71, d89, d123}
[Table: a ranked result list with precision and recall computed at each rank]
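Walking down a ranked list and computing precision and recall at each rank, as the slide's table does, can be sketched as follows (the ranking below is invented; the relevant set is the one given above):

```python
# Precision@k and recall@k down a ranked result list.
relevant = {"d3", "d5", "d9", "d25", "d39", "d44", "d56", "d71", "d89", "d123"}
ranking = ["d123", "d84", "d56", "d6", "d8", "d9"]   # invented system output

hits = 0
for k, doc in enumerate(ranking, start=1):
    if doc in relevant:
        hits += 1
    precision_at_k = hits / k              # fraction of retrieved-so-far that is relevant
    recall_at_k = hits / len(relevant)     # fraction of all relevant found so far
    print(k, doc, round(precision_at_k, 2), round(recall_at_k, 2))
```

At rank 6 this reaches precision 0.5 and recall 0.3: three of the six retrieved documents are relevant, out of ten relevant in total.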

21 Test collection
- Document collection: the documents themselves; depends on the task, e.g. evaluating web retrieval requires a collection of HTML documents
- Queries / requests: simulate real user information needs
- Relevance judgements: state, for each query, which documents are relevant
- See TREC, CLEF, etc.

22 Evaluation of XML retrieval: INEX
- Evaluating the effectiveness of content-oriented XML retrieval approaches
- Collaborative effort: participants contribute to the development of the collection, the queries, and the relevance assessments
- Similar methodology to that of TREC, but adapted to XML retrieval
- 40+ participants worldwide
- Workshop in Schloss Dagstuhl in December (20+ institutions)

23 INEX Test Collection
- Documents (~500MB): 12,107 articles in XML format from the IEEE Computer Society; 8 million elements
- INEX 2002: CO and 30 CAS queries; inex_eval metric
- INEX 2003: CO and 30 CAS queries; CAS queries defined according to an enhanced subset of XPath; inex_eval and inex_eval_ng metrics
- INEX 2004 is just starting

24 Relevance in XML
- An element is relevant if it has significant and demonstrable bearing on the matter at hand
- Common assumptions in IR: objectivity, topicality, binary nature, independence
[Figure: an article containing a section containing a paragraph]

25 Relevance in INEX
- Exhaustivity: how exhaustively a document component discusses the query (0, 1, 2, 3)
- Specificity: how focused the component is on the query (0, 1, 2, 3)
- Relevance: a pair of the two, e.g. (3,3), (2,3), (1,1), (0,0), ...
[Figure: if all sections of an article are relevant, the article is very relevant and a better answer than its sections; if only one section is relevant, the article is less relevant and the section is the better answer]

26 Relevance assessment task
- Completeness: for each assessed element, its parent and children elements must also be assessed
- Consistency: the parent of a relevant element must also be relevant, although to a different extent; exhaustivity can only increase going up the tree, specificity can only decrease going up the tree
- Use of an online interface
- Assessing a query takes a week!
- On average, 2 topics per participant
- Only participants that complete the assessment task have access to the collection
[Figure: an article containing a section containing a paragraph]

27 Metrics
- Recall/precision-based: quantisation functions to obtain one relevance value; expected search length; penalise overlap; consider size
- Others: expected ratio of relevant; cumulated gain-based metrics; tolerance to irrelevance

28 Lessons learnt
- Good definition of relevance
- Expressing CAS queries was not easy
- The relevance assessment process must be improved
- Further development on metrics is needed
- User studies are required

29 Conclusion
XML retrieval is not just about the effective retrieval of XML documents, but also about how to evaluate effectiveness.
INEX 2004 tracks:
- Relevance feedback
- Interactive
- Heterogeneous collection
- Natural language query

30 Evaluating XML retrieval: The INEX initiative Mounia Lalmas Queen Mary University of London

