
MANAGING, QUERYING AND EXTRACTING BIOMEDICAL KNOWLEDGE. Research Work: Knowledge Extraction for the Semantic Web (1241119), Computer Systems.


1 MANAGING, QUERYING AND EXTRACTING BIOMEDICAL KNOWLEDGE
Research Work: Knowledge Extraction for the Semantic Web (1241119), Advanced Computer Systems
Ernesto Jiménez-Ruiz
Supervisor: Rafael Berlanga

2 Outline
PhD Research Work - Ernesto Jiménez Ruiz - TKBG Group
Context and Motivation
- Application of Ontologies
- Biomedical issues and the Health-e-Child Project
Proposed Methodologies
Ontology Management System
Conclusions and Future Work

3 Application of Ontologies (Context and Motivation)
«An ontology is a formal specification of a shared conceptualization» (Borst, 1997)
Applications:
- E-Commerce
- Web Browsers
- Digital Libraries
- Biomedicine
- etc.

4 Health-e-Child Project (Context and Motivation)
Aims to develop an integrated healthcare platform for European paediatrics that gives a comprehensive view of children's health.
Grid architecture.
Main upper-level applications: KDS, DSS.
Our tasks: integration of biomedical data, information, and knowledge.

5 Health-e-Child Project (Context and Motivation)
The biomedical information sources will cover six distinct vertical levels:
- Molecular
- Cellular
- Tissue
- Organ
- Individual
- Population
They will focus on three representative paediatric diseases:
- Heart diseases
- Inflammatory diseases
- Brain tumours

6 Application of Current Ontologies in HeC (Context and Motivation)
The HeC vertical levels are expressed by ontologies.
Several large biomedical ontologies and taxonomies are available, e.g. GO, GALEN, FMA, NCI Thesaurus, TAMBIS and BioPAX.
They are difficult to apply in concrete applications like HeC:
- Scalability in reasoning
- Specificity: local view of the domain
- Visualization and treatment

7 View Mechanism (Context and Motivation)
Operation through views or user-defined fragments/modules.
Working on the development of OntoPath.
Future: formalize the extracted views.

8 Outline
Context and Motivation
Proposed Methodologies
- Development of Ontologies in a Collaborative Way
- From Domain Ontologies to Application Views
Ontology Management System
Conclusions and Future Work

9 Development Requirements (Development of Ontologies in a Collaborative Way)
Development methodologies with new dimensions:
- Dynamism
- Distribution of team structure
- Partially controlled development
Proposed requirements:
- Modularization/Partitioning
- Local Adaptation
- Knowledge Abstraction
- User-defined modules (Views)
- Argumentation and Consensus

10 Development Phases (Development of Ontologies in a Collaborative Way)
We distinguish five phases:
- Requirements
- Development
- Publication and Argumentation
- Evaluation and Maintenance
- Application

11 Knowledge Spaces (Development of Ontologies in a Collaborative Way)

12 View Derivation Hierarchy (Development of Ontologies in a Collaborative Way)

13 Outline
Context and Motivation
Proposed Methodologies
- Development of Ontologies in a Collaborative Way
- From Domain Ontologies to Application Views
Ontology Management System
Conclusions and Future Work

14 ‘Current’ Biomedical Sources (From Domain Ontologies to Applications)
Creation of large and important biomedical ontologies and taxonomies:
- GALEN, FMA, NCI Thesaurus, TAMBIS, BioPAX, etc.
- Open Biomedical Ontologies (OBO)
Metathesauri, dictionaries and lexicons:
- Unified Medical Language System (UMLS)
- MeSH (Medical Subject Headings)
- SNOMED
Bioinformatics public databases (OMIM, UniProt, DrugBank, etc.)
Hospital resources (databases, texts, forms, images, etc.)

15 New Issues and Challenges (From Domain Ontologies to Applications)
Many domain ontologies in biomedicine do not completely cover the requirements of specific applications.
Concepts may involve different abstraction levels (e.g. molecular, organ, disease) that can be in the same or in different domain ontologies.
Domain ontologies are normally rather large:
- Users find them hard to use for annotating and querying information sources
- Only a subset of each ontology is used by system applications

16 New Issues and Challenges (From Domain Ontologies to Applications)
The work presented here mainly focuses on these issues, each paired with the proposed response:
- Domain ontologies do not cover requirements: integration and enrichment
- They involve different abstraction levels: integration, enrichment and definition of user-defined modules (views)
- They are rather large (hard to use, and only a subset is used): user-defined modules (views)

17 From Ontologies to Applications (From Domain Ontologies to Applications)

18 Enrichment from Textual Sources (From Domain Ontologies to Applications)
Automatic Instance Generation (Danger, 2004)
- Looks for aggregation paths (i.e. concept-relation-concept…) in texts
- Requires a well-built and consistent ontology
UMLS-based Biomedical Entity Recognizer
- A processed version of UMLS as a dictionary
- Entity relations → co-occurrences
- Problem: how to discover named relations between entities
- Technical report: Jimeno-Yepes and Jiménez-Ruiz et al. (2007)

19 Outline
Context and Motivation
Proposed Methodologies
Ontology Management System
- Parser and Storage in G database
- OntoPath and Builder
- Protégé Plugin
Conclusions and Future Work

20 System Architecture (Ontology Management System)

21 OWL Parser (Ontology Management System)
Greater flexibility in OWL treatment and storage capabilities (e.g. indexes).
From the OWL file, the OWL parser creates a set of structures for classes, properties, nominals and individuals.
These structures are stored in the graph-based database G.
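As a rough illustration of the parsing step, the sketch below collects class and object-property names from a tiny RDF/XML file using only Python's standard `xml.etree.ElementTree`. It is not the parser described in the slides: the function name `parse_owl`, the returned structures, and the sample file are assumptions made for this example, and it handles only named classes with direct `rdfs:subClassOf` references.

```python
import xml.etree.ElementTree as ET

# Standard W3C namespace IRIs used in OWL RDF/XML files.
NS = {
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}
RDF_ABOUT = "{%s}about" % NS["rdf"]
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# Hypothetical minimal ontology, invented for this example.
SAMPLE = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="#Thing"/>
  <owl:Class rdf:about="#Person">
    <rdfs:subClassOf rdf:resource="#Thing"/>
  </owl:Class>
  <owl:ObjectProperty rdf:about="#hasFriend"/>
</rdf:RDF>
"""


def local_name(iri):
    """Return the fragment after '#', or the whole IRI if there is none."""
    return iri.rsplit("#", 1)[-1]


def parse_owl(xml_text):
    """Return (classes, properties): class name -> parent names, and property names."""
    root = ET.fromstring(xml_text.strip())
    classes = {}
    for cls in root.findall("owl:Class", NS):
        about = cls.get(RDF_ABOUT)
        if about is None:
            continue  # anonymous class expressions are skipped in this sketch
        parents = [
            local_name(parent.get(RDF_RESOURCE))
            for parent in cls.findall("rdfs:subClassOf", NS)
            if parent.get(RDF_RESOURCE)
        ]
        classes[local_name(about)] = parents
    properties = [
        local_name(prop.get(RDF_ABOUT))
        for prop in root.findall("owl:ObjectProperty", NS)
        if prop.get(RDF_ABOUT)
    ]
    return classes, properties


classes, properties = parse_owl(SAMPLE)
```

A full parser would also need to cover nominals, individuals and anonymous class expressions, which this sketch deliberately leaves out.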

22 G Semi-structured Database (Ontology Management System)
Backend to store, index and retrieve the OWL ontologies as graphs.
Four database object types are needed: ontology, property, concept, and enumeration (nominals).
- O = ontology(name='Simple.owl', rootConcept=C1, rootProperty=P1)
- C1 = concept(name='Thing')
- P1 = property(name='PropertyThing')
- C2 = concept(name='Person', subClassOf=C1)
- P2 = property(name='hasFriend', range=C2, domain=C2, subPropertyOf=P1)
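The four object types can be mirrored with plain Python dataclasses. This is only an in-memory analogy of the objects listed on the slide, not the G database API: the class and field names (`Concept`, `Property`, `Enumeration`, `Ontology`, `range_`, and so on) are chosen for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Concept:
    name: str
    sub_class_of: List["Concept"] = field(default_factory=list)


@dataclass
class Property:
    name: str
    domain: Optional[Concept] = None
    range_: Optional[Concept] = None  # trailing underscore avoids shadowing built-in range
    sub_property_of: Optional["Property"] = None


@dataclass
class Enumeration:
    name: str
    values: List[str] = field(default_factory=list)


@dataclass
class Ontology:
    name: str
    root_concept: Concept
    root_property: Property


# The Simple.owl example from the slide, rebuilt with the classes above.
c1 = Concept(name="Thing")
p1 = Property(name="PropertyThing")
c2 = Concept(name="Person", sub_class_of=[c1])
p2 = Property(name="hasFriend", domain=c2, range_=c2, sub_property_of=p1)
o = Ontology(name="Simple.owl", root_concept=c1, root_property=p1)
```

Holding object references rather than names makes the ontology a graph that can be walked directly from `o.root_concept` or `o.root_property`, which matches the slide's description of storing ontologies as graphs.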

23 OWL Parser: DL Treatment (I) (Ontology Management System)
Inferred parents: C ≡ C1 ⊓ … ⊓ Cn → C.subClassOf = C1, …, Cn
Inferred children: C ≡ C1 ⊔ … ⊔ Cn → C1.subClassOf.append(C), …, Cn.subClassOf.append(C)
Inferred domains: C ⊓ ∃R.D → i-property(name=R, domain=C, range=D)
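The three rules on this slide can be sketched as small update functions over in-memory tables. The tables `sub_class_of` and `i_properties`, the function names, and the example class names (`Mother`, `Woman`, `Parent`, `Father`) are all hypothetical stand-ins for illustration; the real system stores these objects in G.

```python
from collections import defaultdict

# Hypothetical stand-ins for the stored objects, not the system's real API.
sub_class_of = defaultdict(list)  # class name -> list of parent class names
i_properties = []                 # inferred properties, one dict each


def add_intersection_definition(c, conjuncts):
    """Inferred parents: C defined as an intersection makes each conjunct a parent of C."""
    for ci in conjuncts:
        if ci not in sub_class_of[c]:
            sub_class_of[c].append(ci)


def add_union_definition(c, disjuncts):
    """Inferred children: C defined as a union makes C a parent of each disjunct."""
    for ci in disjuncts:
        if c not in sub_class_of[ci]:
            sub_class_of[ci].append(c)


def add_restriction(c, role, filler):
    """Inferred domains: a restriction on role R with filler D attached to C
    is recorded as an inferred property with domain C and range D."""
    i_properties.append({"name": role, "domain": c, "range": filler})


add_intersection_definition("Mother", ["Woman", "Parent"])
add_union_definition("Parent", ["Mother", "Father"])
add_restriction("Person", "hasFriend", "Person")
```

After these calls `Mother` has parents `Woman` and `Parent`, `Father` has parent `Parent`, and one inferred `hasFriend` property links `Person` to itself.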

24 OWL Parser: DL Treatment (II) (Ontology Management System)
Creation of new classes:
- C ⊑ ∃R.(D ⊓ ∃S.E), with C and D named classes
- D' is created, with D'.subClassOf = D
- i-property(name=R, domain=C, range=D') is also created
- D'.name = D_with_S_E
Nominals:
- (C ⊓ ∃R.{i1, i2, …, in})… → i-property(name=R, domain=C, enumeration=E)
- (C ⊓ ∃R.{i1})… → i-property(name=R, domain=C, enumeration=E)
- E = enumeration(name=R-nominals, list-of-values=i1, …, in)

25 OWL Parser: DL Treatment (III) (Ontology Management System)
ResistanceToInsulin ⊑ ∃isWithReferenceTo.(presence ⊓ ∃isPresenceAbsenceOf.Insulin)
- C = concept(name=ResistanceToInsulin)
- P = concept(name=presence, …)
- I = concept(name=Insulin, …)
- A = concept(name=presence_with_isPresenceAbsenceOf_Insulin, subClassOf=P)
- P1 = i-property(name=isWithReferenceTo, domain=C, range=A, …)
- P2 = i-property(name=isPresenceAbsenceOf, domain=A, range=I, …)
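The creation of a new named class for a nested restriction, as in the ResistanceToInsulin example, can be sketched as a short recursive routine. The data layout (a dict of parents per concept, a list of inferred properties, and a tuple `(D, S, E)` for a filler of the form "D with a restriction on S to E") and the function names are assumptions for this example; only the `D_with_S_E` naming scheme comes from the slides.

```python
def materialize_filler(filler, concepts, i_properties):
    """Return the name of a named class standing for the filler.

    A filler is either a class name (str) or a tuple (base, role, nested_filler),
    read as 'base with a restriction on role to nested_filler'.
    """
    if isinstance(filler, str):
        concepts.setdefault(filler, [])
        return filler
    base, nested_role, nested_filler = filler
    nested_target = materialize_filler(nested_filler, concepts, i_properties)
    new_name = f"{base}_with_{nested_role}_{nested_target}"
    concepts.setdefault(base, [])
    concepts.setdefault(new_name, [base])  # the new class D' is a subclass of D
    i_properties.append(
        {"name": nested_role, "domain": new_name, "range": nested_target}
    )
    return new_name


def store_restriction(domain, role, filler, concepts, i_properties):
    """Record an inferred property from the domain class to the (possibly new) filler class."""
    target = materialize_filler(filler, concepts, i_properties)
    i_properties.append({"name": role, "domain": domain, "range": target})


concepts = {"ResistanceToInsulin": []}
i_properties = []
store_restriction(
    "ResistanceToInsulin",
    "isWithReferenceTo",
    ("presence", "isPresenceAbsenceOf", "Insulin"),
    concepts,
    i_properties,
)
```

Running it yields the intermediate class `presence_with_isPresenceAbsenceOf_Insulin` as a subclass of `presence`, plus the two inferred properties listed on the slide. Because the routine is recursive, deeper nestings produce a chain of such intermediate classes.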

26 OWL Parser: DL Treatment (IV) (Ontology Management System)
Not properly stored (or queried):
- Some union cases
- Negation
Expressivity:
- A subset of SHIF(D), the DL closest to the one underlying OWL Lite
- OWL DL ontologies use a subset of the DL SHOIN(D)
The system will therefore produce approximate answers to OntoPath queries.

27 OntoPath Query Language (Ontology Management System)
Retrieves consistent fragments (personalized modules or views) from domain ontologies.
Simple, XPath-like syntax.
Results are returned as a new OWL ontology.
Example:
- Disease / related_to / Rheumatoid_Factor
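To make the path idea concrete, the sketch below splits a query such as the example above into concept/relation/concept steps and matches them against a list of stored properties. The slides do not give the OntoPath grammar, so this assumes only the alternating `concept / relation / concept` shape of the example; `parse_path`, `match_path` and the sample property table are hypothetical.

```python
def parse_path(query):
    """Split 'A / r / B / s / C' into [(A, r, B), (B, s, C)].

    Assumes steps alternate concept, relation, concept, as in the slide's example.
    """
    steps = [step.strip() for step in query.split("/")]
    if len(steps) < 3 or len(steps) % 2 == 0 or not all(steps):
        raise ValueError(f"expected concept/relation/concept steps, got: {query!r}")
    return [
        (steps[i], steps[i + 1], steps[i + 2])
        for i in range(0, len(steps) - 2, 2)
    ]


def match_path(query, i_properties):
    """Return the stored properties matching every step, or [] if any step fails."""
    matched = []
    for domain, name, range_ in parse_path(query):
        hits = [
            prop for prop in i_properties
            if (prop["domain"], prop["name"], prop["range"]) == (domain, name, range_)
        ]
        if not hits:
            return []
        matched.extend(hits)
    return matched


# Hypothetical stored properties, invented for this example.
stored = [
    {"name": "related_to", "domain": "Disease", "range": "Rheumatoid_Factor"},
    {"name": "located_in", "domain": "Disease", "range": "Organ"},
]
fragment = match_path("Disease / related_to / Rheumatoid_Factor", stored)
```

This exact-match version ignores subclass relationships and any operators the real language may offer. It only shows how a path expression selects a fragment of the stored graph.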

28 Protégé Extensions (Ontology Management System)
- Storing ontologies
- Retrieving full ontologies or fragments
- Representation in a definition hierarchy
- Connection with Python code

29 Storing Ontologies (Ontology Management System)
- OWL file selection
- References to other ontologies (views)
- Biomedical (HeC) coverage

30 Retrieving Full Ontologies or Fragments (Ontology Management System)
- Several fragments
- Source ontology
- Set of OntoPath queries
- Metadata

31 Representation in a Definition Hierarchy (Ontology Management System)
- Organization of views in a definition hierarchy
- Classification by biomedical level
- New tab created

32 Conclusions and Future Work
The system is still a work in progress:
- Adaptation to new versions of Protégé
- Further tests of the OntoPath language
- Formalization of the connections between fragments and source knowledge. E-connections? → Manchester
- Enrichment by text mining techniques. Work at EBI: from text to ontologies
- Apply the ontologies in HeC: evaluation and validation

33 Questions and Feedback
Thank you very much!
Ernesto Jiménez-Ruiz
- ejimenez@uji.es
- http://www3.uji.es/~ejimenez
- http://krono.act.uji.es/people/Ernesto/

