Evaluating content-oriented XML retrieval: The INEX initiative Mounia Lalmas Queen Mary University of London


1 Evaluating content-oriented XML retrieval: The INEX initiative Mounia Lalmas Queen Mary University of London http://qmir.dcs.qmul.ac.uk

2 Outline
- Information retrieval
- XML retrieval
- Evaluating information retrieval
- Evaluating XML retrieval: INEX

3 Information retrieval
Example of a user information need (e.g. on the WWW): Find all documents about sailing charter agencies that (1) offer sailing boats in the Greek islands, and (2) are registered with the RYA. The documents should contain boat specification, price per week, e-mail and other contact details.
A formal representation of an information need constitutes a query.

4 Information retrieval
IR is concerned with the representation, storage and organisation of, and access to, repositories of information, usually in the form of documents.
Primary goal of an IR system: retrieve all the documents that are relevant (useful) to a user query, while retrieving as few non-relevant documents as possible.

5 Conceptual model for IR
[Diagram labels: Documents; Query; Document representation; Query representation; Retrieval results; Indexing; Formulation; Retrieval function; Relevance feedback.]

6 XML Retrieval
- Traditional IR is about finding documents relevant to a user's information need, e.g. an entire book.
- XML allows users to retrieve document components that are more focused on their information needs, e.g. a chapter of a book instead of an entire book.
- The structure of documents is exploited to identify which document components (XML elements) to retrieve.

7 XML: eXtensible Mark-up Language
- Meta-language (user-defined tags) currently being adopted as the document format language by W3C
- Used to describe content and structure (and not layout)
- Grammar described in a DTD (used for validation)
[Example XML document, markup lost in extraction. Surviving text: Structured Document Retrieval; Smith John; Introduction into XML retrieval; …]

8 XML: eXtensible Mark-up Language
Use of XPath notation to refer to the XML structure:
- chapter/title: title is a direct sub-component of chapter
- //title: any title
- chapter//title: title is a direct or indirect sub-component of chapter
- chapter/paragraph[2]: the second direct paragraph of any chapter
- chapter/*: all direct sub-components of a chapter
[Example XML document, markup lost in extraction. Surviving text: Structured Document Retrieval; Smith John; Introduction into SDR; …]
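The XPath expressions on slide 8 can be tried out with Python's standard-library `xml.etree.ElementTree`, which supports a limited XPath subset. This is a minimal sketch: the `BOOK` document and its element names are hypothetical (the slide's own example markup did not survive), and `//title` is written as `.//title` because ElementTree paths are evaluated relative to an element.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the element names are illustrative only.
BOOK = """
<book>
  <title>Structured Document Retrieval</title>
  <chapter>
    <title>Introduction</title>
    <paragraph>First paragraph.</paragraph>
    <paragraph>Second paragraph.</paragraph>
    <section>
      <title>Background</title>
    </section>
  </chapter>
</book>
"""

root = ET.fromstring(BOOK)

# chapter/title: titles that are direct children of a chapter
direct_titles = [e.text for e in root.findall("chapter/title")]

# //title: any title below the root (relative form ".//title")
any_titles = [e.text for e in root.findall(".//title")]

# chapter//title: titles at any depth below a chapter
nested_titles = [e.text for e in root.findall("chapter//title")]

# chapter/paragraph[2]: the second paragraph child of each chapter
second_paragraphs = [e.text for e in root.findall("chapter/paragraph[2]")]

# chapter/*: all direct children of each chapter
chapter_children = [e.tag for e in root.findall("chapter/*")]
```

Each list holds the text (or tag names) of the matched elements in document order, so the difference between the direct (`/`) and descendant (`//`) steps can be inspected directly.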

9 Queries
- Content-only (CO) queries
  - Standard IR queries, but here we are retrieving document components
  - E.g. "London tube strikes"
- Content-and-structure (CAS) queries
  - Put constraints on which types of components are to be retrieved
  - E.g. "Sections of an article in the Times about congestion charges"
  - E.g. "Articles that contain sections about congestion charges in London, and that contain a picture of Ken Livingstone"

10 Conceptual model for XML retrieval
[Diagram: the conceptual model for IR (Documents; Query; Document representation; Query representation; Retrieval results; Indexing; Formulation; Retrieval function; Relevance feedback) annotated with: Structured documents; Content + structure; Inverted file + structure index; tf, idf, acc; Matching content + structure; Presentation of related components.]

11 Example of XML approaches
The representation of a composite element (e.g. article or section) is defined as the aggregated representation of its sub-elements.
- p1 is about XML, retrieval
- p2 is about XML, authoring
- Sec3 is then also about XML (in fact very much about XML), retrieval, authoring
[Figure: tree of an article with numbered sections and paragraphs.]

12 Example of XML approaches
Document: {? t1, ? t2, ? t3}
  Title: {0.9 t1, 0.4 t2}
  Section_1: {0.5 t1}
  Section_2: {0.2 t1, 0.7 t3}
? = aggregated weight of ti in Document, based on the instances of ti in the sub-elements (Title, Section_1 and Section_2)
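Slide 12 leaves the aggregation function open (the "?" weights). As a rough sketch only, the snippet below fills it in with a plain arithmetic mean over the sub-elements, counting a missing term as weight 0. The mean is an assumption chosen for illustration, not the method the slides describe, and the name `aggregate_mean` is hypothetical. The term weights are the ones read from the slide.

```python
def aggregate_mean(sub_elements):
    """Average each term's weight over all sub-elements (absent term = 0.0).

    Illustrative assumption: the slides do not say which aggregation is used.
    """
    terms = sorted({t for weights in sub_elements.values() for t in weights})
    n = len(sub_elements)
    return {
        t: sum(weights.get(t, 0.0) for weights in sub_elements.values()) / n
        for t in terms
    }


# Term weights of the three sub-elements, as read from slide 12.
SUB_ELEMENTS = {
    "Title": {"t1": 0.9, "t2": 0.4},
    "Section_1": {"t1": 0.5},
    "Section_2": {"t1": 0.2, "t3": 0.7},
}

document_weights = aggregate_mean(SUB_ELEMENTS)
```

With this choice t1, which occurs in all three sub-elements, ends up with the largest aggregated weight. Other aggregations (for example ones that weight sub-elements by type or length) would give different values.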

13 Evaluation
- The goal of an IR system
  - retrieve as many relevant documents as possible and as few non-relevant documents as possible
- Comparative evaluation of technical performance of IR systems = effectiveness
  - ability of the IR system to retrieve relevant documents and suppress non-relevant documents
- Effectiveness
  - combination of recall and precision

14 Relevance
- A document is relevant if it has significant and demonstrable bearing on the matter at hand.
- Common assumptions:
  - Objectivity
  - Topicality
  - Binary nature
  - Independence

15 Recall / Precision
[Venn diagram labels: Document collection; Retrieved; Relevant; Retrieved and relevant.]

16 Recall / Precision
Relevant documents for a given query: {d3, d5, d9, d25, d39, d44, d56, d71, d89, d123}

rank  doc   precision  recall
1     d123  1/1        1/10
2     d84
3     d56   2/3        2/10
4     d6
5     d8
6     d9    3/6        3/10
7     d511
8     d129
9     d187
10    d25   4/10       4/10
11    d48
12    d250
13    d113
14    d3    5/14       5/10
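The figures on slide 16 follow from the usual definitions: precision at a rank is the number of relevant documents seen so far divided by the rank, and recall is that same count divided by the total number of relevant documents. A small sketch, using the relevant set and the ranking as read from the slide (the function name `precision_recall_points` is hypothetical):

```python
RELEVANT = {"d3", "d5", "d9", "d25", "d39", "d44", "d56", "d71", "d89", "d123"}

# Ranked result list as read from slide 16 (ranks 1 to 14).
RANKING = ["d123", "d84", "d56", "d6", "d8", "d9", "d511",
           "d129", "d187", "d25", "d48", "d250", "d113", "d3"]


def precision_recall_points(ranking, relevant):
    """Return (rank, doc, precision, recall) at each rank holding a relevant doc."""
    points = []
    hits = 0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            points.append((rank, doc, hits / rank, hits / len(relevant)))
    return points


points = precision_recall_points(RANKING, RELEVANT)
for rank, doc, precision, recall in points:
    print(f"rank {rank:2d}  {doc:5s} precision={precision:.2f} recall={recall:.2f}")
```

Only five of the ten relevant documents appear in the ranking, so recall never rises above 5/10 for this result list.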

17 Comparison of systems

18 Test collection
- Document collection = the documents themselves
  - depends on the task, e.g. evaluating web retrieval requires a collection of HTML documents.
- Queries / requests
  - simulate real user information needs.
- Relevance judgements
  - state, for each query, the relevant documents.
- See TREC

19 Evaluation of XML retrieval: INEX
- Evaluating the effectiveness of content-oriented XML retrieval approaches
- Collaborative effort = participants contribute to the development of the collection
  - queries
  - relevance assessments
- Similar methodology to TREC, but adapted to XML retrieval.

20 INEX Test Collection
- The INEX test collection (2002)
  - Documents (~500MB), consisting of 12,107 articles in XML format from the IEEE Computer Society
  - 30 CO and 30 CAS queries
  - Relevance assessments per retrieved component, by participating groups
  - Relevance defined in terms of relevance and coverage
  - Participants: 36 active groups worldwide
- In 2003, INEX has 36 CO and 30 CAS queries
  - Same document collection
  - CAS queries are defined according to a subset of XPath.
  - Relevance assessments per retrieved component, by participating groups
  - Relevance defined in terms of exhaustivity and specificity
  - Participants: 40 active groups worldwide
- INEX 2004 is just starting

21 Example of CO topic
Open standards for digital video in distance learning
Open technologies behind media streaming in distance learning projects
I am looking for articles/components discussing methodologies of digital video production and distribution that respect free access to media content through internet or via CD-ROMs or DVDs in connection to the learning process. Discussions of open versus proprietary standards of storing and sending digital video will be appreciated.
media streaming, video streaming, audio streaming, digital video, distance learning, open standards, free access

22 Example of CAS topic
//article[about(.,'formal methods verify correctness aviation systems')]/body//*[about(.,'case study application model checking theorem proving')]
Find documents discussing formal methods to verify correctness of aviation systems. From those articles extract parts discussing a case study of using model checking or theorem proving for the verification.
To be considered relevant a document must be about using formal methods to verify correctness of aviation systems, such as flight traffic control systems, airplane or helicopter parts. From those documents a body-part must be returned (I do not want the whole body element, I want something smaller). That part should be about a case study of applying a model checker or a theorem prover to the verification.
SPIN, SMV, PVS, SPARK, CWB

23 Relevance in XML
- An element is relevant if it has significant and demonstrable bearing on the matter at hand
- Common assumptions in IR
  - Objectivity
  - Topicality
  - Binary nature
  - Independence
[Figure: tree of an article with numbered sections and paragraphs.]

24 Relevance in XML
- Exhaustivity
  - how exhaustively a document component discusses the topic of request
- Specificity
  - how focused the component is on the topic of request (i.e. discusses no other, irrelevant topics)
- 4-graded: 0, 1, 2, 3
- needed because of the structure
- Relevance: (3,3), (2,3), (1,1), (0,0), etc.

25 Relevance assessment task
- Exhaustivity
  - Element, parent element, children elements
- Consistency
  - Parent of a relevant element must also be relevant, although to a different extent
  - Exhaustivity increases going (direction arrow lost in extraction)
  - Specificity decreases going (direction arrow lost in extraction)
- Use of an online interface
- Assessing a query takes a week!
- Average of 2 topics per participant
- Only participants that complete the assessment task have access to the collection
[Figure: tree of an article with numbered sections and paragraphs.]
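The consistency rule on slide 25 (the parent of a relevant element must also be relevant) lends itself to a mechanical check. The sketch below is illustrative only: the `Assessed` class, the `inconsistencies` function and the sample tree are hypothetical, and treating an element as relevant when both grades are above 0 is an assumption, not a definition taken from the slides.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Assessed:
    """An assessed element with graded exhaustivity and specificity (0 to 3)."""
    name: str
    exhaustivity: int
    specificity: int
    children: List["Assessed"] = field(default_factory=list)

    def is_relevant(self) -> bool:
        # Assumption for this sketch: relevant means both grades are non-zero.
        return self.exhaustivity > 0 and self.specificity > 0


def inconsistencies(node: Assessed) -> List[Tuple[str, str]]:
    """Return (parent, child) name pairs where a relevant child has a non-relevant parent."""
    problems = []
    for child in node.children:
        if child.is_relevant() and not node.is_relevant():
            problems.append((node.name, child.name))
        problems.extend(inconsistencies(child))
    return problems


# Hypothetical assessments: the section is judged (0,0) although its paragraph is (3,3).
paragraph = Assessed("paragraph[1]", 3, 3)
section = Assessed("section[1]", 0, 0, [paragraph])
article = Assessed("article", 2, 1, [section])

violations = inconsistencies(article)
```

An assessment interface could run a check of this kind before accepting a set of judgements. The sketch covers only the parent rule, not how the grades themselves change between levels.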

26 Metrics
Recall/precision can be used but must take into consideration:
- near misses (we do not retrieve the best component, e.g. p[4], but one near enough, e.g. p[2])
- overlap (we retrieve a component, e.g. doc[23], and one of its sub-components, e.g. sec[3])
[Figure: element tree with nodes doc[23], sec[3], p[2], p[4].]
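Overlap as described on slide 26 can be detected from element paths alone: two results overlap when one path is an ancestor of the other. A minimal sketch, assuming results are identified by slash-separated paths such as `/doc[23]/sec[3]` (this path format and the function names are assumptions made for illustration):

```python
def is_ancestor(a, b):
    """True if path a is a proper ancestor of path b."""
    return b.startswith(a + "/")


def overlapping_pairs(results):
    """Return every pair of results where one element contains the other."""
    pairs = []
    for i, a in enumerate(results):
        for b in results[i + 1:]:
            if is_ancestor(a, b) or is_ancestor(b, a):
                pairs.append((a, b))
    return pairs


# Hypothetical result list containing a document, one of its sections, and a paragraph.
results = ["/doc[23]", "/doc[23]/sec[3]", "/doc[7]/sec[1]", "/doc[23]/sec[3]/p[2]"]
overlaps = overlapping_pairs(results)
```

Appending "/" before the prefix test keeps `/doc[2]` from being mistaken for an ancestor of `/doc[23]`. How a metric should then treat overlapping results (ignore, discount, or penalise them) is a separate design decision that this sketch does not address.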

27 Conclusion
- XML retrieval is not just about the effective retrieval of XML documents, but also about how to evaluate the effectiveness
- INEX 2004
  - More rigorous query topic format (e.g. parser)
  - New metrics (e.g. not based on precision/recall)
  - Tracks
    - Relevance feedback
    - Interactive
    - Heterogeneous collection
    - Natural language query

28 Thank you http://inex.is.informatik.uni-duisburg.de:2004/

