User Centered and Ontology Based Information Retrieval System for Life Sciences. Sylvie Ranwez, Vincent Ranwez, Mohameth-François Sy, Jacky Montmain, Michel Crampes.

1 User Centered and Ontology Based Information Retrieval System for Life Sciences
Sylvie Ranwez, Vincent Ranwez, Mohameth-François Sy, Jacky Montmain, Michel Crampes
LGI2P Research Center / Ecole des Mines d'Alès, France
ISEM – CNRS / Montpellier II University, France

2 Overview
  • Context and objectives
  • Ontology based information retrieval
  • Relevance calculus between a document index and a query
      • Similarity between two concepts
      • Relevance of a document with respect to a concept
      • Relevance of a document with respect to a query
  • Results visualization
  • Conclusion and perspectives
User Centered and Ontology Based Information Retrieval System for Life Sciences – S. Ranwez

3 Context: usual information retrieval engine
Boolean search
+ Results are easy to understand
- Exact terms matching
- Number of results
- Rough measurement: "match" or "does not match"
- Limited interaction
- Aggregating operators are not used (AND, OR, …)
→ Hard to grasp even with clustering

4 Context: information retrieval based on a concept hierarchy
Boolean search + specialization
+ Extend the query
- Number of results
- Results are difficult to understand
  → Which concepts are taken into account?
  → Which ones have been added?
- No relevance assessment
→ Loss of the first query context

5 Context: information retrieval using ontologies
Query: Organelle organization (GO_ ) ? Cardiac muscle fiber development (GO_ )
Number of retrieved genes (GenBank, Ensembl):
  AND: 0, 0
  OR: 1315

6 Objectives
Make better use of ontologies during the information retrieval process (indexing/query matching)
  → Expand the query if necessary
  → Measure document/query adequacy by identifying added concepts
Help the user grasp the overall results
  → Explain why a document has been selected
  → Give an overall view of the results
  → If a selected document is not relevant, identify why, in order to reformulate the query appropriately
Take user preferences into account
  → Favor interaction and an iterative querying process

8 Ontology based information retrieval
Hyponyms and hypernyms are used to avoid silence (missed relevant documents)
→ Mixes documents that match the query more or less closely
→ The selection may be difficult to understand
Concepts shown in the ontology excerpt: Biological process (GO_ ), Cellular component organization (GO_ ), Organelle organization (GO_ ), Mitochondrion organisation (GO_ ), Cytoskeleton organization (GO_ ), Cellular process (GO_ ), …, Muscle fiber development (GO_ ), Cardiac muscle fiber development (GO_ ), Skeletal muscle fiber development (GO: )
Example: Mitochondrion organisation (GO_ ), Muscle fiber development (GO_ ) ? Organelle organization (GO_ ), Cardiac muscle fiber development (GO_ )
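The idea of expanding a query with hyponyms and hypernyms can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `IS_A` table is a toy hierarchy built from concept names on the slide (not the real Gene Ontology graph), and the function names `hypernyms`, `hyponyms`, `expand` and `boolean_or_match` are hypothetical.

```python
# Toy is-a edges (child -> parent). An illustrative assumption, not the real GO graph.
IS_A = {
    "mitochondrion organisation": "organelle organization",
    "cytoskeleton organization": "organelle organization",
    "cardiac muscle fiber development": "muscle fiber development",
    "skeletal muscle fiber development": "muscle fiber development",
}


def hypernyms(concept):
    """Return every ancestor (more general concept) of `concept`."""
    found = set()
    while concept in IS_A:
        concept = IS_A[concept]
        found.add(concept)
    return found


def hyponyms(concept):
    """Return every descendant (more specific concept) of `concept`."""
    return {c for c in IS_A if concept in hypernyms(c)}


def expand(query):
    """Return the query concepts plus all their hyponyms and hypernyms."""
    expanded = set(query)
    for concept in query:
        expanded |= hyponyms(concept) | hypernyms(concept)
    return expanded


def boolean_or_match(index, concepts):
    """Boolean OR: the document matches if it shares at least one concept."""
    return bool(set(index) & set(concepts))


query = {"organelle organization", "cardiac muscle fiber development"}
index = {"mitochondrion organisation", "muscle fiber development"}
print(boolean_or_match(index, query))          # exact matching only
print(boolean_or_match(index, expand(query)))  # after query expansion
```

In this toy setting the document is missed by exact matching and retrieved after expansion, but nothing in the result says which concepts were added, which is the difficulty the slide points out.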

10 Relevance calculus between a document index and a query
[Diagram: interface, query Q, domain ontology, semantic map, selected documents]

11 Relevance calculus between a document index and a query
Three-level relevance calculus
  • Similarity between two concepts: a concept from the document index and a concept from the query
  • Relevance of a document (i.e. the set of its indexing concepts) with respect to a concept from the query
  • Relevance of a document with respect to a query: fuzzy aggregation of relevance measures
→ Advantages
  • Ranking of documents with respect to their relevance
  • Detailed explanation of the document selection

12 Relevance calculus between a document index and a query
Three-level relevance calculus, level 1: similarity between two concepts (a concept from the document index and a concept from the query)
Several similarity measures have been proposed in the literature; the one used here is easy to understand (percentage of mutual hyponyms).
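The slide does not reproduce the formula, so the following is only one plausible reading of "percentage of mutual hyponyms": the number of hyponyms shared by the two concepts divided by the number of hyponyms of either one (a Jaccard ratio), with each concept counted among its own hyponyms. The `CHILDREN` table is a toy hierarchy and the names `hyponym_set` and `similarity` are hypothetical.

```python
# Toy hierarchy (parent -> children). An illustrative assumption, not the real GO graph.
CHILDREN = {
    "organelle organization": [
        "mitochondrion organisation",
        "cytoskeleton organization",
    ],
    "muscle fiber development": [
        "cardiac muscle fiber development",
        "skeletal muscle fiber development",
    ],
}


def hyponym_set(concept):
    """Return the concept itself plus all of its descendants."""
    result = {concept}
    for child in CHILDREN.get(concept, []):
        result |= hyponym_set(child)
    return result


def similarity(a, b):
    """Share of mutual hyponyms: |H(a) & H(b)| / |H(a) | H(b)| (assumed formula)."""
    ha, hb = hyponym_set(a), hyponym_set(b)
    return len(ha & hb) / len(ha | hb)


print(similarity("organelle organization", "mitochondrion organisation"))
print(similarity("mitochondrion organisation", "cytoskeleton organization"))
```

Under this reading a concept is fully similar to itself, partially similar to its ancestors and descendants, and not similar at all to a sibling leaf; other measures from the literature would behave differently.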

13 Relevance calculus between a document index and a query
Three-level relevance calculus, level 2: relevance of a document (i.e. the set of its indexing concepts) with respect to a concept from the query
Best relevance between the indexing concepts of document D and a query concept Q_t
May be generalized by weighting the concepts D_i (using evidence codes in the Gene Ontology, for example)
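A minimal sketch of this second level, assuming "best relevance" means the maximum similarity between any indexing concept D_i and the query concept Q_t, optionally multiplied by a per-concept weight. The similarity values in `TOY_SIM`, the weights and the function names are made up for illustration.

```python
# Made-up similarity values between (index concept, query concept) pairs.
TOY_SIM = {
    ("mitochondrion organisation", "organelle organization"): 0.5,
    ("muscle fiber development", "cardiac muscle fiber development"): 0.25,
}


def toy_sim(index_concept, query_concept):
    """Look up a similarity; identical concepts score 1, unknown pairs score 0."""
    if index_concept == query_concept:
        return 1.0
    return TOY_SIM.get((index_concept, query_concept), 0.0)


def relevance_to_concept(index, query_concept, sim, weights=None):
    """Best (weighted) similarity between the document's index and one query concept."""
    weights = weights or {}
    return max(
        (weights.get(d, 1.0) * sim(d, query_concept) for d in index),
        default=0.0,  # a document with an empty index is not relevant
    )


index = ["mitochondrion organisation", "muscle fiber development"]
print(relevance_to_concept(index, "organelle organization", toy_sim))
print(relevance_to_concept(index, "organelle organization", toy_sim,
                           weights={"mitochondrion organisation": 0.5}))
```

Keeping one score per query concept, rather than a single distance between two sets, is what later allows each query term's contribution to be shown to the user.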

14 Relevance calculus between a document index and a query
Three-level relevance calculus, level 3: relevance of a document with respect to a query (fuzzy aggregation of relevance measures)
Combine individual relevance scores to estimate an overall relevance of the document
→ Take user preferences into account: decision theory
→ Yager operator (with parameter q):
  q = 1: arithmetic mean
  q → 0: geometric mean
  q = -1: harmonic mean
  q → +∞: max (OR generalization)
  q → -∞: min (AND generalization)
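The special cases listed above are those of the generalized (power) mean, ((x_1^q + ... + x_n^q) / n)^(1/q). The sketch below assumes that this is the operator meant on the slide and that all scores are non-negative; the function name `yager_mean` is hypothetical.

```python
import math


def yager_mean(scores, q):
    """Generalized mean of non-negative scores with exponent q (assumed formula)."""
    n = len(scores)
    if n == 0:
        raise ValueError("at least one score is required")
    if q == math.inf:    # limit case: max, an OR-like aggregation
        return max(scores)
    if q == -math.inf:   # limit case: min, an AND-like aggregation
        return min(scores)
    if q <= 0 and any(s == 0 for s in scores):
        return 0.0       # the limit of the formula when a score is zero
    if q == 0:           # limit case: geometric mean
        return math.exp(sum(math.log(s) for s in scores) / n)
    return (sum(s ** q for s in scores) / n) ** (1.0 / q)


scores = [0.2, 0.8]  # relevance of one document to two query concepts
for q in (-math.inf, -1, 0, 1, math.inf):
    print(q, yager_mean(scores, q))
```

Moving q from -∞ towards +∞ slides the aggregation from a strict AND (every query concept must be well covered) to a tolerant OR (one well-covered concept is enough), which is how a user preference can be turned into a single parameter.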

16 Visualization
→ A document may be selected even if its index contains no terms of the query
Explain the selection to the user: pictograms
→ Each concept of the query is associated with a bar:
  • Its height is proportional to its relevance
  • Its color indicates the kind of match: indexes the document ( ), specializes (is a hyponym of), generalizes (is a hypernym of)
Example: Mitochondrion organisation (GO_ ), Muscle fiber development (GO_ ) ? Organelle organization (GO_ ), Cardiac muscle fiber development (GO_ )

17 Visualization
Pictograms are displayed on a semantic map
→ Their physical distance to the query is proportional to their relevance score
→ Visualization and navigation: fit human cognitive limits (lens, number of results, relevance threshold, …) and help the user (selection of concepts for the query, …)

19 Conclusion and perspectives
Results
  • Find more documents (avoid silence)
  • Improve relevance: document ranking
  • Explain the relevance calculus (diagnosis)
  • Visualize overall results
  • Interaction with the list of retrieved documents: customize user preferences
  • Iterative improvement of the query
Perspectives
  • Improve the computer-human interaction (CHI)
  • Suggest query reformulation
      • From the user's selection of documents (weighting + complement)
      • Highlight the query terms that are discriminating
  • Test several semantic distance measures on different benchmarks (TREC, MuchMore, …)
  • Improve visualization
      • Filter the displayed results by extracting sub-ontologies
      • Propose a view of the results that highlights clusters
  • Propose an online version

20 User Centered and Ontology Based Information Retrieval System for Life Sciences

21 Visualization
OBIRS on line:

22 Ontology based information retrieval
[Diagram labels: Documents, Indexing, Analysis/Indexing, Documents' index, Query, Index, Match, Relevant documents, Reformulation, Domain ontology]

23 Visualization

24 Foreword: feedback on the relevance of results
  • Relevance learned from annotations
  • Can pool the indexings of several databases
  • Tagging is possible with personal, hierarchically organized keywords
but
  • The list of results can be long
  • No justification of the relevance
  • Filters, but not semantic ones
  • No overall view

25 Relevance calculus of a document with respect to a query
Distance measures between sets of concepts exist
  → Between indexings made with GO, for example
However, the relevance measure of a document with respect to a query
  → Must not be symmetric
  → Must make it possible to detail the score of each query term

26 [Table: Ensembl Gene IDs (ENSG) listed for two GO terms]

27 Ontology: example taken from MeSH
Concepts shown: Nervous System Diseases; Central Nervous System Diseases; Brain Diseases; Headache Disorder, Primary; Migraine = Migraine Disorder; Signs and Symptoms; Headache; Neurologic Manifestations; Migraine Disorder with Aura; Migraine Disorder without Aura; Headache Disorder; Pain; …; Pathological Conditions, Signs and Symptoms

