Discovering Disease Associations using a Biomedical Semantic Web: Integration and Ranking


Ranga Chandra Gudivada (1,2), Xiaoyan A. Qu (1,2), Anil G. Jegga (2,3,4), Eric K. Neumann (5), Bruce J. Aronow (1,2,3,4)

Departments of Biomedical Engineering (1) and Pediatrics (2), University of Cincinnati; Center for Computational Medicine (3) and Division of Biomedical Informatics (4), Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA; Teranode Corporation (5), Seattle, WA

Abstract

One of the principal goals of biomedical research is to elucidate the complex network of gene interactions underlying common human diseases. Although integrative genomics approaches have been shown to be successful in understanding the underlying pathways and biological processes in normal and disease states, most current biomedical knowledge is spread across different databases in different formats. Semantic Web principles, standards, and technologies provide an ideal platform to integrate such heterogeneous information and to bring forth implicit relations hitherto embedded in these large integrated biomedical and genomic datasets. Semantic Web query languages such as SPARQL can be used to mine the biological entities underlying complex diseases through richer and more complex queries on this integrated data. However, the result sets are frequently large and unmanageable. There is therefore a great need for techniques that rank resources on the Semantic Web, which can later be used to retrieve and rank query results and prevent information overload. Such ranking can be used to prioritize the discovered novel disease-gene, disease-pathway, or disease-process relationships. We implemented an existing Semantic Web based knowledge mining technique which not only discovers the genes, processes, and pathways underlying diseases but also determines the importance of the resources, so that search results can be ranked while the semantic associations are determined.

Computational Problem

Data integration: biological feature complexity is deep, heterogeneous, and extensive. Data complexity poses a formidable challenge to efforts to integrate, formally model, and simulate the behavior of biological systems.

Likelihood ranking: requires mining and prioritization of entities and events that function in the context of biological networks.

Biological Problem

Disease genes discovered to date likely represent the easy ones. Discovering the genetic basis of the remaining Mendelian and complex gene-X-gene-X-environment disorders will be challenging and will require consideration of many more features and causal relationships.

No gene operates in a vacuum: all gene, protein, and pathway interactions can lead to modifier gene effects.

Identifying modifier genes, i.e. the gene networks (pathways, biological processes, and functions) underlying diseases, is challenging.

Benefits of Semantic Web

Semantic Web standards such as the Resource Description Framework (RDF) and the Web Ontology Language (OWL) facilitate semantic integration of heterogeneous multi-source data.

SPARQL, a Semantic Web query language capable of querying higher-order relationships in multi-dimensional data, can be used to mine Bio-RDF graphs.

Prioritization of biological entities on the Semantic Web can be accomplished by extending [2] and applying existing graph algorithms, such as Kleinberg's algorithm [1].

Data Integration: RDF Model

Integrated sources, by category:
Disease: OMIM, Mammalian Phenotype, others.
Gene / protein annotations: Entrez Gene, SwissProt, Gene Ontology, others.
Pathways and molecular interactions: BIOCARTA, KEGG, BIOCYC, REACTOME, Nature Pathway Interaction Database, BIND.

[Figure: RDF data model. Node types: Disease (CUI, name), Gene (symbol), Pathway (ID, description), Biological Process, Molecular Function, and Cellular Component (GO ID, description), Anatomy (CUI, name), Mouse Phenotype (ID, description). Properties: hasAssociatedGene, hasAssociatedDisease, hasAssociatedAnatomy, occursInPathway, inBiologicalProcess, inMolecularFunction, hasMousePhenoType, rdfs:label.]

Ranking on Semantic Web

Kleinberg algorithm [1]

Authoritative node (high authority score): when a node is pointed to by good hubs, its authority score increases.

Hub node (high hub score): when a node points to many authoritative nodes, its hub score increases.

Extending the Kleinberg algorithm [2] for the Semantic Web

Each property carries a subjectivity weight (applied to the subject of a statement) and an objectivity weight (applied to the object).

Non-symmetric properties get different weights. Example: gene, associatedPathway, pathway, with subjectivity weight > objectivity weight. A single gene participating in multiple biological pathways is considered more sensitive to perturbation than a single pathway having a large number of nodes.

Symmetric properties get equal weights. Example: geneA, interacts, geneB, with subjectivity weight = objectivity weight. GeneA interacting with various genes has the same significance as GeneB interacting with various genes.

SPARQL Query

The prefix IRIs are not shown in the transcript.

    PREFIX CCHMC:
    PREFIX rdf:
    SELECT DISTINCT ?pathway
    WHERE {
      ?pathway rdf:type CCHMC:Pathway .
      ?resource ?PROPERTY ?pathway .
    }

Case Study: Prioritizing Modifier Genes, Pathways and Biological Processes for CARDIOMYOPATHY, DILATED

Disease: CARDIOMYOPATHY, DILATED, X-LINKED

Step 1, primary genes (1): DMD. Pathways (1), biological processes (4), interacting partners (16).

Step 2, primary genes + interacting partners (1+16): pathways (28), biological processes (27).

Query result with prioritization: modifier genes (16), pathways (28), biological processes (27).

References

[1] Kleinberg, J. M. Authoritative sources in a hyperlinked environment. J. ACM 46, 5 (Sep. 1999).
[2] Bamba, B., Mukherjea, S. Utilizing Resource Importance for Ranking Semantic Web Query Results. SWDB 2004.

Conclusion

We have shown that related yet heterogeneous information can be integrated using RDF-OWL and that this approach can support mechanistic analyses of diseases. Specifically, we have uncovered additional genes and pathways that could play a role in the onset and treatment of cardiomyopathy. We intend to expand our analyses into additional modalities such as anatomy, cellular type, and symptoms/phenotypes.
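The ranking idea in "Ranking on Semantic Web" can be illustrated with a small sketch. This is not the authors' implementation and not a faithful reproduction of [2]. It is a minimal hub/authority iteration in the style of Kleinberg's algorithm, in which each property contributes with its own subjectivity and objectivity weight. The function name `rank_resources`, the toy triples (`geneA`, `pathway1`, and so on), and the weight values are all assumptions chosen for illustration.

```python
from math import sqrt


def normalize(scores):
    """Scale scores to unit Euclidean length (left unchanged if all are zero)."""
    norm = sqrt(sum(v * v for v in scores.values()))
    if norm == 0:
        return dict(scores)
    return {k: v / norm for k, v in scores.items()}


def rank_resources(triples, weights, iterations=50):
    """Weighted hub/authority iteration over RDF-like triples.

    triples: list of (subject, property, object)
    weights: property -> (subjectivity_weight, objectivity_weight)
    Returns (subjectivity_scores, objectivity_scores), the hub-like and
    authority-like scores of every resource.
    """
    nodes = sorted({n for s, _, o in triples for n in (s, o)})
    subj = dict.fromkeys(nodes, 1.0)
    obj = dict.fromkeys(nodes, 1.0)
    for _ in range(iterations):
        # Authority-like update: an object gains from the subjects pointing to it.
        new_obj = dict.fromkeys(nodes, 0.0)
        for s, p, o in triples:
            new_obj[o] += weights[p][1] * subj[s]
        obj = normalize(new_obj)
        # Hub-like update: a subject gains from the objects it points to.
        new_subj = dict.fromkeys(nodes, 0.0)
        for s, p, o in triples:
            new_subj[s] += weights[p][0] * obj[o]
        subj = normalize(new_subj)
    return subj, obj


# Hypothetical weights: non-symmetric property with subjectivity > objectivity,
# symmetric property with equal weights.
WEIGHTS = {
    "associatedPathway": (0.8, 0.2),
    "interacts": (0.5, 0.5),
}

# Hypothetical toy graph; a symmetric property is stated in both directions.
TRIPLES = [
    ("geneA", "associatedPathway", "pathway1"),
    ("geneA", "associatedPathway", "pathway2"),
    ("geneA", "associatedPathway", "pathway3"),
    ("geneB", "associatedPathway", "pathway1"),
    ("geneA", "interacts", "geneC"),
    ("geneC", "interacts", "geneA"),
]

subj_scores, obj_scores = rank_resources(TRIPLES, WEIGHTS)

# Print resources from highest to lowest subjectivity (hub-like) score.
for node in sorted(subj_scores, key=subj_scores.get, reverse=True):
    print(f"{node}: subjectivity={subj_scores[node]:.3f} objectivity={obj_scores[node]:.3f}")
```

In this toy graph the gene that participates in three pathways receives a higher subjectivity score than the gene that participates in one, and the pathway shared by both genes receives a higher objectivity score than the others. Real weights would have to be chosen per property from domain knowledge, as the poster does for associatedPathway and interacts.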

