The methodology for building an ontology in four steps Building Medical Ontologies by Terminology Extraction from Texts: Pneumology, a sample methodology.


Audrey Baneyx, INSERM ERM 202 (audrey.baneyx@spim.jussieu.fr)
Jean Charlet, STIM AP-HP & INSERM ERM 202 (jc@biomath.jussieu.fr)

The methodology for building an ontology in four steps (overview diagram)
Texts corpus: it is made of 1,000 hospitalization reports.
1. Set of terms of the domain: the meaning is still ambiguous and context dependent.
2. Tree of primitive notions in the medical specialty: the context of interpretation is now set.
3. Formal ontology: lattice of the defined concepts and the relations they have.
4. Computerized ontology.
Sample report text shown in the diagram: "Bidule Hospital. The patient Donald Duck, born on 22/5/2064, suffers from Quincke's œdema, type II."

General situation
Access to medical knowledge is a major issue for health professionals. Faced with the growth of information sources, and of text production in particular, data-processing tools have been unable to take the specificities of the users' vocabularies into account. Developing ontological resources that ease the use of national and international terminologies in medicine is thus a major issue for data gathering and for access to medical knowledge (for example, coding assistance for diagnoses and treatments, or the Medline bibliographic database as an instance of medical knowledge). We must stress the urgent need for such research in our media-centered society, in the context of the Semantic Web. Developing terminological or ontological resources (TOR) from corpora is necessary for the expected interoperability of health data systems within the Semantic Web.

Merits and uses of ontologies
The Common Classification of Medical Treatments (CCAM) is a thesaurus built on the ontological model.
Even though it has some of the properties of an ontology (exhaustiveness, bijectivity, unambiguity and a concern for consensus), the CCAM still allows neither the recording of the causal links that lead from medical hypothesis to diagnosis, nor the attachment of relevant data to the concepts.

Objective
Developing an ontology of pneumology makes it possible to remedy those failings. It will allow diagnoses to be described in a precise and homogeneous way, and will keep track of the examinations that led to them. Physicians will thus have at their disposal a measure of trust in the diagnosis.

Methodology
We recently tested a corpus-based methodology in the surgical intensive care domain [4]. The methodology we will use to build our ontology guarantees the preservation of the axioms and the integrity of the system, as well as the extensibility of the representation without structural change, and maximal independence from the syntax and semantics of the field's terms. We want to refine it so that it can be used in a knowledge engineering process.

Medical ontologies: Why and how?
This work of designing medical ontologies is part of the larger PERTOMed national research project, financed by the CNRS. The objective of the project is to develop a structure offering a range of methods and operational tools for the production and use of TOR in the medical field. My task is to design and develop a terminology server and its associated linguistic services. This means devising an optimal architecture for the server and identifying the logical issues raised by the different alternatives for knowledge representation, together with their consequences. My first task will be to create ontological resources in various medical specialties, including pneumology. These ontologies are created in close partnership with user groups, who will take part in evaluating the resources in their real environment of use.
Collaborations:
- Department of pneumology and intensive care, Pitié-Salpêtrière Hospital, UPRES EA2397.
- French society of pneumology.

The PERTOMed project
Production and evaluation of terminological and ontological resources in the medical field
http://www.spim.jussieu.fr/pertomed

Bibliography
[1] Zweigenbaum P., Bachimont B., Bouaud J., Charlet J. Issues in the structuring and acquisition of an ontology for medical language understanding. Methods of Information in Medicine, 1995.
[2] Bouaud J., Bachimont B., Charlet J., Zweigenbaum P. Methodological principles for structuring an "ontology". In Proceedings of the IJCAI'95 Workshop on "Basic Ontological Issues in Knowledge Sharing", Canada, 1995.
[3] Bachimont B., Isaac A., Troncy R. Semantic Commitment for Designing Ontologies: A Proposal. In Asuncion Gomez-Pérez and V. Richard Benjamins, editors, 13th International Conference on Knowledge Engineering and Knowledge Management, EKAW'2002.
[4] Le Moigno S., Charlet J., Bourigault D., Jaulent M.-C. Terminology extraction from text to build an ontology in surgical intensive care. In Proceedings of the AMIA 2002, USA, 2002.

Step 1 (from the texts corpus to the set of terms)
[Figure: sample candidate terms: lung, œdema, asthma, pathology]
1. Constituting the corpus
We implement analysis programs that select, from the discharge reports, the information relevant to the specialty (pneumology) and to the target application, in preparation for the terminological extraction. The information is then converted into XML.
2. Analysis and terms extraction
We use two NLP tools in succession to develop the ontology: SYNTEX, a syntactic parser for specialized corpora, which produces a network of words and syntagms, and UPERY, a distributional analysis module that exploits all the network data to cluster the needed terms.
3. Results
We obtain a network of candidate terms, their contextual associations and their links to the corpus.
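The idea behind distributional analysis in the term-extraction step above can be illustrated with a small Python sketch. It is not SYNTEX or UPERY, whose interfaces are not described here. It only assumes that a parser has already produced (term, syntactic context) pairs, and it groups terms that occur in the same contexts using a Jaccard overlap. The sample pairs, the function names and the 0.5 threshold are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (term, syntactic context) pairs, standing in for parser output.
# These values are illustrative only; they are not produced by SYNTEX.
PAIRS = [
    ("edema", "lung"),
    ("edema", "acute"),
    ("infection", "lung"),
    ("infection", "acute"),
    ("report", "hospitalization"),
]


def contexts_by_term(pairs):
    """Map each term to the set of contexts it was observed in."""
    contexts = defaultdict(set)
    for term, context in pairs:
        contexts[term].add(context)
    return dict(contexts)


def jaccard(a, b):
    """Share of contexts two terms have in common (0.0 when both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_terms(pairs, threshold=0.5):
    """Return (term1, term2, score) for term pairs sharing enough contexts."""
    contexts = contexts_by_term(pairs)
    result = []
    for t1, t2 in combinations(sorted(contexts), 2):
        score = jaccard(contexts[t1], contexts[t2])
        if score >= threshold:
            result.append((t1, t2, score))
    return result
```

Terms that come out as similar are only candidates for the same neighbourhood of the term network; deciding what they mean remains the job of the normalization steps.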
Step 2 (from the set of terms to the tree of primitive notions)
1. Semantic normalization
To normalize the meaning of the candidate terms, we need to fix a unique context. To do so, we turn each candidate term into a primitive notion, both by analysing its occurrences in the corpus and by using differential semantics [2][3].
2. Differential semantics
Applying the four principles of differential semantics builds the tree of primitive notions:
- the principle of community with the parent (the child node inherits the properties of the parent node);
- the principle of difference from the parent (the child node has properties distinct from the parent's);
- the principle of community with the brothers (the properties shared by the nodes within a group of brothers);
- the principle of difference from the brothers (the properties that distinguish a node from its brothers).
3. Results
We then obtain a tree of the field's primitive notions (concepts and relations) in which the well-defined meaning of each notion expresses a piece of knowledge. The conceptualization task is not finished yet, since these primitive notions are linguistic in nature and always remain subject to interpretation.

Step 3 (from the tree of primitive notions to the formal ontology)
1. Referential and extensional semantics
We now move from primitive notions to concepts. To do this, we have to apply a formal, extensional semantics: in other words, we link each primitive notion to a set of objects in the specialty. The ontological tree will then be made of formal concepts defined in extension. The user will easily be able to build new formal concepts by combining those already available.
2. Results
The hierarchical structure thus obtained is no longer a tree but a lattice.

Step 4 (from the formal ontology to the computerized ontology)
1. Finalization of the process
In this last step, we try to make the concepts we defined understandable to the system as a whole, by specifying the behaviour of these new items. To accomplish this, we choose to express the concepts in a formal implementation language for knowledge representation. This language will use specific inference features as needed by this ontology.
2. Description Logics
The language will be based on description logics, which will allow us, among other things, to test subsumption links. The ontology will then be made available in the OWL standard.

The Differential Ontology Editor: DOE
http://opales.ina.fr/public/

Excerpt of the hierarchical structure of concepts: PATHOLOGICAL_STATE, PATHOLOGICAL_LOCAL_STATE, LESION, effusion, liquid effusion, hemorrhagic effusion
Excerpt of the hierarchical structure of relations: LOCALIZATION, near, outside, above, …, located in
Excerpt of the hierarchical structure of concepts: ANATOMY, ANA_TISSUE_COVER, dura mater, mesentery, skin, peritoneum

Sample of defined concept
Hemoperitoneum: « hemorrhagic effusion located in peritoneum »
[Hemorrhagic effusion] (located in) [peritoneum]
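As a rough illustration of how a defined concept such as Hemoperitoneum combines primitive notions, and of what testing a subsumption link means, here is a minimal Python sketch. It is a simplified structural check, not a description logic reasoner and not the DOE or OWL representation. The parent links are an assumption read off the flattened excerpt above (the real nesting may differ), and the `effusion_in_cover` concept and all function names are hypothetical.

```python
from dataclasses import dataclass

# Parent links between primitive notions. ASSUMPTION: this nesting is guessed
# from the order of the labels in the poster excerpt; it is not confirmed.
PARENT = {
    "PATHOLOGICAL_LOCAL_STATE": "PATHOLOGICAL_STATE",
    "LESION": "PATHOLOGICAL_LOCAL_STATE",
    "effusion": "LESION",
    "liquid effusion": "effusion",
    "hemorrhagic effusion": "liquid effusion",
    "ANA_TISSUE_COVER": "ANATOMY",
    "peritoneum": "ANA_TISSUE_COVER",
    "skin": "ANA_TISSUE_COVER",
}


def is_a(child, ancestor):
    """True if `ancestor` is `child` itself or lies on its path to the root."""
    node = child
    while node is not None:
        if node == ancestor:
            return True
        node = PARENT.get(node)
    return False


@dataclass(frozen=True)
class Defined:
    """A defined concept of the form [genus] (relation) [filler]."""
    genus: str
    relation: str
    filler: str


def subsumes(general, specific):
    """Simplified test: same relation, and genus and filler are both more general."""
    return (
        general.relation == specific.relation
        and is_a(specific.genus, general.genus)
        and is_a(specific.filler, general.filler)
    )


# The poster's sample: [Hemorrhagic effusion] (located in) [peritoneum]
hemoperitoneum = Defined("hemorrhagic effusion", "located in", "peritoneum")
# Hypothetical, more general concept used only to exercise the test.
effusion_in_cover = Defined("effusion", "located in", "ANA_TISSUE_COVER")
```

Under these assumptions `subsumes(effusion_in_cover, hemoperitoneum)` holds while the converse does not. Because a defined concept inherits from both its genus and its filler, such concepts can have several parents, which is why the structure becomes a lattice rather than a tree.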

