Knowledge Enabled Information and Services Science: Semantic Web for Health Care and Biomedical Informatics. Keynote at NSF Biomed Web Workshop, December 4-5, 2007.


1 Knowledge Enabled Information and Services Science
Semantic Web for Health Care and Biomedical Informatics
Keynote at NSF Biomed Web Workshop, December 4-5, 2007
Amit P. Sheth, amit.sheth@wright.edu
Thanks: Pablo Mendes, Satya Sahoo and the Kno.e.sis team; Kno.e.sis collaborators at Athens Heart Center (Dr. Agrawal), NLM (Olivier Bodenreider), CCRC, UGA (Will York), CCHMC (Bruce Aronow)

2 Outline
Semantic Web: very brief intro
Scenarios to demonstrate the applications and benefits of semantic web technologies
–Health care
–Biomedical Research

3 Biomedical Informatics...
Medical Informatics: Etiology, Pathogenesis, Clinical findings, Diagnosis, Prognosis, Treatment
Bioinformatics: Genome, Transcriptome, Proteome, Metabolome, Physiome, ...ome
Genbank, Uniprot, PubMed, ClinicalTrials.gov
...needs a connection
Biomedical Informatics: Hypothesis Validation, Experiment design, Predictions, Personalized medicine
Semantic Web research aims at providing this connection! More advanced capabilities for search, integration, analysis, linking to new insights and discoveries!

4 Evolution of the Web (1997 to 2007)
–Web of pages: text, manually created links, extensive navigation
–Web of databases: dynamically generated pages, web query interfaces
–Web of services: data = service = data, mashups, ubiquitous computing
–Web of people: social networks, user-created content (GeneRIF, Connotea)
–Web as an oracle / assistant / partner: “ask the Web”, using semantics to leverage text + data + services + people

5 Semantic Web Enablers and Techniques
–Ontology: agreement on a common vocabulary and domain knowledge; schema + knowledge base
–Semantic annotation (metadata extraction): manual, semi-automatic (automatic with human verification), automatic
–Reasoning/computation: semantics-enabled search, integration, complex queries, analysis (paths, subgraphs), pattern finding, mining, hypothesis validation, discovery, visualization

6 Maturing capabilities and ongoing research
–Text mining: entity recognition, relationship extraction
–Integrating text, experimental data, curated and multimedia data
–Clinical and scientific workflows with semantic web services
–Hypothesis-driven retrieval of scientific literature, undiscovered public knowledge

7 Metadata and Ontology: Primary Semantic Web enablers
[Figure: spectrum from shallow semantics to deep semantics; axis: expressiveness, reasoning]

8 Characteristics of Semantic Web: Self Describing; Machine & Human Readable; Issued by a Trusted Authority; Easy to Understand; Convertible; Can be Secured
The Semantic Web: XML, RDF & Ontology
Adapted from William Ruh (CISCO)

9 Open Biomedical Ontologies
Many ontologies exist: Open Biomedical Ontologies, http://obo.sourceforge.net/

10 Drug Ontology Hierarchy (showing is-a relationships)
[Figure: class hierarchy rooted at owl:thing. Classes shown: prescription_drug, prescription_drug_brand_name, prescription_drug_generic, prescription_drug_property, brandname_undeclared, brandname_composite, brandname_individual, generic_composite, generic_individual, monograph_ix_class, cpnum_group, non_drug_reactant, property, indication_property, formulary_property, interaction_property, formulary, indication, interaction, interaction_with_prescription_drug, interaction_with_non_drug_reactant, interaction_with_monograph_ix_class]

11 N-Glycosylation metabolic pathway
–GNT-I attaches GlcNAc at position 2: UDP-N-acetyl-D-glucosamine + alpha-D-Mannosyl-1,3-(R1)-beta-D-mannosyl-R2 → UDP + N-Acetyl-beta-D-glucosaminyl-1,2-alpha-D-mannosyl-1,3-(R1)-beta-D-mannosyl-R2
–GNT-V attaches GlcNAc at position 6: UDP-N-acetyl-D-glucosamine + G00020 → UDP + G00021
N-acetyl-glucosaminyl_transferase_V, N-glycan_beta_GlcNAc_9, N-glycan_alpha_man_4

12 Opportunity: exploiting clinical and biomedical data
–Health Information Services: Elsevier iConsult
–Scientific Literature: PubMed, 300 documents published online each day
–User-contributed Content (informal): GeneRIFs
–NCBI Public Datasets: genome and protein DBs, new sequences daily
–Laboratory Data: lab tests, RT-PCR, mass spec
–Clinical Data: personal health history
Data forms range from text to binary.
Search, browsing, complex query, integration, workflow, analysis, hypothesis validation, decision support.

13 Scenario 1
Status: In use today
Where: Athens Heart Center
What: Use of Semantic Web technologies for clinical decision support

14 Operational since January 2006

15 Active Semantic Electronic Medical Records (ASEMR)
Goals:
–Increase efficiency with decision support: formulary, billing, reimbursement; real-time chart completion; automated linking with billing
–Reduce errors, improve patient satisfaction and reporting: drug interactions, allergy, insurance
–Improve profitability
Technologies:
–Ontologies, semantic annotations and rules
–Service Oriented Architecture
Thanks: Dr. Agrawal, Dr. Wingate, and others. ISWC2006 paper

16 Demonstration

17 ASEMR Efficiency
[Figures: chart completion before the preliminary deployment; chart completion after the preliminary deployment]

18 Scenario 2
Status: Demonstration
Where: W3C Health Care and Life Sciences (HCLS) interest group, http://www.w3.org/2001/sw/hcls/
What: Using the Semantic Web to aggregate and query data about Alzheimer’s

19 Scenario 2: Scientific Data Sets for Alzheimer’s

20 SPARQL Query spanning multiple sources

21 Scenario 3
Status: Completed research
Where: NIH
What: Understanding the genetic basis of nicotine dependence. Integrate gene and pathway information and show how three complex biological queries can be answered by the integrated knowledge base.
How: Semantic Web technologies (especially RDF, OWL, and SPARQL) support information integration and make it easy to create semantic mashups (semantically integrated resources).

22 Motivation
–NIDA study on nicotine dependency
–List of candidate genes in humans
–Analysis objectives include:
  o Find interactions between genes
  o Identify active genes (those in the maximum number of pathways)
  o Identify genes based on anatomical locations
–Requires integration of genome and biological pathway information
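
Once gene-to-pathway links are integrated, the "active genes" objective reduces to counting distinct pathways per gene. A minimal Python sketch, assuming a hypothetical list of (gene, pathway) pairs; the gene and pathway names are placeholders, not data from the NIDA study:

```python
from collections import defaultdict

# Hypothetical (gene, pathway) links; placeholder names, not study data.
gene_pathway_links = [
    ("geneA", "pathway1"),
    ("geneA", "pathway2"),
    ("geneB", "pathway2"),
    ("geneA", "pathway3"),
    ("geneB", "pathway2"),  # duplicate link, e.g. reported by a second source
]

def pathways_per_gene(links):
    """Group distinct pathways by gene, ignoring duplicate links."""
    grouped = defaultdict(set)
    for gene, pathway in links:
        grouped[gene].add(pathway)
    return grouped

def most_connected_genes(links):
    """Return (sorted gene names, pathway count) for the genes in the most pathways."""
    grouped = pathways_per_gene(links)
    best = max(len(pathways) for pathways in grouped.values())
    genes = sorted(g for g, pathways in grouped.items() if len(pathways) == best)
    return genes, best
```

Using sets makes the count robust to the same link arriving from several integrated sources.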

23 Genome and pathway information integration
Sources: Entrez Gene, Reactome, KEGG, HumanCyc, GeneOntology, HomoloGene
[Figure: linking fields shown are pathway, protein and pmid (repeated for three sources), GO ID, and HomoloGene ID]

24 JBI

25 BioPAX ontology; Entrez Knowledge Model (EKoM)

26 Deductive Reasoning: Protein-Protein Interaction
RULE: given that two genes interact with each other, and given that a certain number of parameters are met, we can assert that the gene products also interact with each other.
IF (x have_common_pathway y) AND (x rdf:type gene) AND (y rdf:type gene) AND (x has_product m) AND (y has_product n) AND (m rdf:type gene_product) AND (n rdf:type gene_product) THEN (m interacts_with n)
[Figure labels: gene1, gene2, gene_product, has_product, have_common_pathway, associated_with, database_identifier 1, database_identifier 2, interacts_with]
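
The rule on this slide can be sketched as one pass of forward chaining over a set of triples. This is a minimal Python illustration, not the authors' rule engine; it assumes the rule's conclusion is the interacts_with relation shown in the slide's diagram, and the instance names (gene1, product1, ...) are hypothetical:

```python
# Hypothetical instance data shaped like the rule's premises.
TRIPLES = {
    ("gene1", "rdf:type", "gene"),
    ("gene2", "rdf:type", "gene"),
    ("gene1", "have_common_pathway", "gene2"),
    ("gene1", "has_product", "product1"),
    ("gene2", "has_product", "product2"),
    ("product1", "rdf:type", "gene_product"),
    ("product2", "rdf:type", "gene_product"),
}

def infer_product_interactions(triples):
    """Apply the rule once and return the inferred interacts_with triples."""
    types = {(s, o) for s, p, o in triples if p == "rdf:type"}
    products = [(s, o) for s, p, o in triples if p == "has_product"]
    inferred = set()
    for x, predicate, y in triples:
        if predicate != "have_common_pathway":
            continue
        if (x, "gene") not in types or (y, "gene") not in types:
            continue
        for gene_x, m in products:
            for gene_y, n in products:
                if (gene_x == x and gene_y == y
                        and (m, "gene_product") in types
                        and (n, "gene_product") in types):
                    # Assumed conclusion: the gene products interact.
                    inferred.add((m, "interacts_with", n))
    return inferred
```

Every premise is checked explicitly, so removing any one type assertion from the data stops the inference.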

27 Scenario 4
Status: Completed research
Where: NIH
What: queries across integrated data sources
–Enriching data with ontologies for integration, querying, and automation
–Ontologies beyond vocabularies: the power of relationships

28 Use data to test hypothesis
Link between glycosyltransferase activity and congenital muscular dystrophy?
[Figure: a gene record linked to GO, PubMed, gene name, OMIM, sequence and interactions; endpoints: Glycosyltransferase, Congenital muscular dystrophy]
Adapted from: Olivier Bodenreider, presentation at HCLS Workshop, WWW07

29 In a Web pages world…
(GeneID: 9215) has_associated_disease: Congenital muscular dystrophy, type 1D
(GeneID: 9215) has_molecular_function: Acetylglucosaminyltransferase activity
Adapted from: Olivier Bodenreider, presentation at HCLS Workshop, WWW07

30 With the semantically enhanced data
EG:9215 (LARGE) has_molecular_function GO:0008375 (acetylglucosaminyltransferase)
EG:9215 (LARGE) has_associated_phenotype MIM:608840 (Muscular dystrophy, congenital, type 1D)
isa hierarchy: GO:0008375 (acetylglucosaminyltransferase) falls under GO:0016757 (glycosyltransferase), via GO:0008194 and GO:0016758
SELECT DISTINCT ?t ?g ?d {
  ?t is_a GO:0016757 .
  ?g has_molecular_function ?t .
  ?g has_associated_phenotype ?b2 .
  ?b2 has_textual_description ?d .
  FILTER regex(?d, "muscular dystrophy", "i")
  FILTER regex(?d, "congenital", "i")
}
From medinfo paper. Adapted from: Olivier Bodenreider, presentation at HCLS Workshop, WWW07
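
To make the query's join logic concrete, here is a minimal pure-Python sketch that evaluates the same pattern over the handful of triples named on the slide. It is an illustration only, not the authors' RDF store: the exact is_a edges between the intermediate GO terms are an assumption, and is_a is treated as transitive (as if inferred edges had been materialized).

```python
import re

# Triples transcribed from the slide; the order of the is_a chain is assumed.
TRIPLES = [
    ("GO:0008375", "is_a", "GO:0008194"),
    ("GO:0008194", "is_a", "GO:0016758"),
    ("GO:0016758", "is_a", "GO:0016757"),
    ("EG:9215", "has_molecular_function", "GO:0008375"),
    ("EG:9215", "has_associated_phenotype", "MIM:608840"),
    ("MIM:608840", "has_textual_description",
     "Muscular dystrophy, congenital, type 1D"),
]

def objects(triples, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the data."""
    return [o for s, p, o in triples if s == subject and p == predicate]

def descendants(triples, root):
    """All terms that reach `root` through one or more is_a edges."""
    found, frontier = set(), [root]
    while frontier:
        parent = frontier.pop()
        for s, p, o in triples:
            if p == "is_a" and o == parent and s not in found:
                found.add(s)
                frontier.append(s)
    return found

def glycosyltransferase_query(triples, root="GO:0016757"):
    """Return sorted (?t, ?g, ?d) bindings matching the slide's query."""
    results = set()
    for t in descendants(triples, root):
        for g, p, o in triples:
            if p != "has_molecular_function" or o != t:
                continue
            for b2 in objects(triples, g, "has_associated_phenotype"):
                for d in objects(triples, b2, "has_textual_description"):
                    # Case-insensitive filters, like regex(?d, ..., "i").
                    if (re.search("muscular dystrophy", d, re.I)
                            and re.search("congenital", d, re.I)):
                        results.add((t, g, d))
    return sorted(results)
```

The single result binds the gene EG:9215 to both a glycosyltransferase function and a congenital muscular dystrophy phenotype, which is the link the hypothesis asks about.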

31 Scenario 5
Status: Research prototype and in progress; workflow with semantic annotation of experimental data already in use
Where: UGA
What:
–Knowledge driven query formulation
–Semantic Problem Solving Environment (PSE) for Trypanosoma cruzi (Chagas Disease)

32 Knowledge driven query formulation
Complex queries can also include:
–on-the-fly Web services execution to retrieve additional data
–inference rules to make implicit knowledge explicit

33 T. cruzi PSE Query Interface
Figure 4: Semantic annotation of MS scientific data

34 N-Glycosylation Process (NGP)
[Workflow figure. Materials and data: Cell Culture, Glycoprotein Fraction, Glycopeptides Fraction, Peptide Fraction, ms data, ms/ms data, ms peaklist, ms/ms peaklist, Peptide list, N-dimensional array. Steps: extract, proteolysis, Separation technique I, PNGase, Separation technique II, Mass spectrometry, Data reduction, binning, Peptide identification, Signal integration, Data correlation, Glycopeptide identification and quantification. Multiplicities: 1, n, n*m]

35 Semantic Web Process to incorporate provenance
[Workflow figure. Steps, each with an input (I) and output (O): Biological Sample Analysis by MS/MS; Raw Data to Standard Format; Data Pre-process; DB Search (Mascot/Sequest); Results Post-process (ProValt). Stored data: Raw Data, Standard Format Data, Filtered Data, Search Results, Final Output. Other labels: Agent, Storage, Biological Information, Semantic Annotation Applications]

36 ProPreO: Ontology-mediated provenance. Mass Spectrometry (MS) Data
ms/ms peaklist data:
830.9570 194.9604 2
580.2985 0.3592
688.3214 0.2526
779.4759 38.4939
784.3607 21.7736
1543.7476 1.3822
1544.7595 2.9977
1562.8113 37.4790
1660.7776 476.5043
First line: parent ion m/z, parent ion abundance, parent ion charge. Following lines: fragment ion m/z, fragment ion abundance.
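
The peaklist on this slide has a simple positional layout, which a short parser makes explicit. A minimal Python sketch, assuming the first three numbers are the parent ion's m/z, abundance and charge (as the slide's labels suggest) and that the rest are fragment (m/z, abundance) pairs; the `PeakList` type and `parse_peaklist` name are hypothetical:

```python
from dataclasses import dataclass

# Numbers copied from the slide, in their original order.
PEAKLIST = (
    "830.9570 194.9604 2 580.2985 0.3592 688.3214 0.2526 "
    "779.4759 38.4939 784.3607 21.7736 1543.7476 1.3822 "
    "1544.7595 2.9977 1562.8113 37.4790 1660.7776 476.5043"
)

@dataclass
class PeakList:
    parent_mz: float
    parent_abundance: float
    parent_charge: int
    fragments: list  # list of (fragment m/z, fragment abundance) pairs

def parse_peaklist(text):
    """Split a whitespace-separated ms/ms peaklist into labeled fields."""
    tokens = text.split()
    if len(tokens) < 3 or (len(tokens) - 3) % 2 != 0:
        raise ValueError("expected 3 parent values followed by m/z-abundance pairs")
    values = [float(token) for token in tokens[3:]]
    fragments = list(zip(values[0::2], values[1::2]))
    return PeakList(float(tokens[0]), float(tokens[1]), int(tokens[2]), fragments)
```

Without such labels the raw numbers carry no meaning on their own, which is the gap the ontology-mediated annotation is meant to fill.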

37 ProPreO: Ontology-mediated provenance. Semantically Annotated MS Data
<parameter instrument="micromass_QTOF_2_quadropole_time_of_flight_mass_spectrometer" mode="ms-ms"/>
Ontological Concepts: the instrument and mode attribute values
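
A consumer of such annotated data can read the ontological concepts straight from the attributes. A minimal Python sketch using the standard library's ElementTree; the function name is hypothetical, and the sketch assumes straight quotes around attribute values:

```python
import xml.etree.ElementTree as ET

# The annotated element from the slide, with straight quotes.
ANNOTATED = (
    '<parameter instrument="micromass_QTOF_2_quadropole_time_of_flight_mass_spectrometer" '
    'mode="ms-ms"/>'
)

def ontology_concepts(xml_text):
    """Return the attribute-to-concept mapping of one annotated element."""
    element = ET.fromstring(xml_text)
    return dict(element.attrib)
```

Each returned value is a concept name that could then be looked up in the ontology.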

38 Scenario 6
When: Research in progress
Where: Athens Heart Center and Cincinnati Children’s Hospital Medical Center
What: scientific literature mining
–Dealing with unstructured information
–Extracting knowledge from text
–Complex entity recognition
–Relationship extraction

39 Heart Failure Clinical Pathway
[Figure labels: Ontology; Angiotensin Receptor Blocker (ARB); causes; Disease]
A Framework for Schema-Driven Relationship Discovery from Unstructured Text, Ramakrishnan et al., ISWC 2006, LNCS 4273, pp. 583-596

40 Contextual delivery of information

41 Two technical challenges
–Text mining
–Workflow adaptation

42 Extracting the Relationship
Abstract: Diabetes mellitus adversely affects the outcomes in patients with myocardial infarction (MI), due in part to the exacerbation of left ventricular (LV) remodeling. Although angiotensin II type 1 receptor blocker (ARB) has been demonstrated to be effective in the treatment of heart failure, information about the potential benefits of ARB on advanced LV failure associated with diabetes is lacking. To induce diabetes, male mice were injected intraperitoneally with streptozotocin (200 mg/kg). At 2 weeks, anterior MI was created by ligating the left coronary artery. These animals received treatment with olmesartan (0.1 mg/kg/day; n = 50) or vehicle (n = 51) for 4 weeks. Diabetes worsened the survival and exaggerated echocardiographic LV dilatation and dysfunction in MI. Treatment of diabetic MI mice with olmesartan significantly improved the survival rate (42% versus 27%, P < 0.05) without affecting blood glucose, arterial blood pressure, or infarct size. It also attenuated LV dysfunction in diabetic MI. Likewise, olmesartan attenuated myocyte hypertrophy, interstitial fibrosis, and the number of apoptotic cells in the noninfarcted LV from diabetic MI. Post-MI LV remodeling and failure in diabetes were ameliorated by ARB, providing further evidence that angiotensin II plays a pivotal role in the exacerbated heart failure after diabetic MI.
Relationship: ARB causes heart failure
Source: Angiotensin II type 1 receptor blocker attenuates exacerbated left ventricular remodeling and failure in diabetes-associated myocardial infarction, Matsusaka H, et al.

43 Problem – Extracting relationships between MeSH terms from PubMed
–UMLS Semantic Network: Biologically active substance, Lipid, Disease or Syndrome; relationships: affects, causes, complicates
–MeSH: Fish Oils, Raynaud’s Disease (linked to the Semantic Network by instance_of); relationship between them: ???????
–PubMed: 9284 documents, 4733 documents, 5 documents

44 Background knowledge used
UMLS – a high-level schema of the biomedical domain
–136 classes and 49 relationships
–Synonyms of all relationships, using variant lookup (tools from NLM)
–49 relationships + their synonyms = ~350, mostly verbs
–T147 synonyms: effect, induce, etiology, cause, effecting, induced
MeSH
–22,000+ topics organized as a forest of 16 trees
–Used to query PubMed
PubMed
–Over 16 million abstracts
–Abstracts annotated with one or more MeSH terms
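
The relationship synonyms can be pictured as a lookup table from surface forms to relationship IDs. A minimal Python sketch containing only the six T147 forms listed on this slide; the plural-stripping fallback is an assumption added for illustration and is not the NLM variant-lookup tooling:

```python
# Surface forms listed on the slide for UMLS relationship T147.
RELATION_SYNONYMS = {
    "effect": "T147",
    "induce": "T147",
    "etiology": "T147",
    "cause": "T147",
    "effecting": "T147",
    "induced": "T147",
}

def relation_id(word, synonyms=RELATION_SYNONYMS):
    """Return the relationship ID for a word, or None if it is unknown."""
    token = word.lower()
    if token in synonyms:
        return synonyms[token]
    # Assumed fallback: strip a trailing "s" ("induces" -> "induce").
    if token.endswith("s") and token[:-1] in synonyms:
        return synonyms[token[:-1]]
    return None
```

For example, `relation_id("induces")` resolves to T147 through the fallback, while a verb outside the table returns None.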

45 Method – Parse Sentences in PubMed
Tools: SS-Tagger (University of Tokyo), SS-Parser (University of Tokyo)
(TOP (S (NP (NP (DT An) (JJ excessive) (ADJP (JJ endogenous) (CC or) (JJ exogenous) ) (NN stimulation) ) (PP (IN by) (NP (NN estrogen) ) ) ) (VP (VBZ induces) (NP (NP (JJ adenomatous) (NN hyperplasia) ) (PP (IN of) (NP (DT the) (NN endometrium) ) ) ) ) ) )
–Entities (MeSH terms) in sentences occur in modified forms: “adenomatous” modifies “hyperplasia”; “An excessive endogenous or exogenous stimulation” modifies “estrogen”
–Entities can also occur as composites of 2 or more other entities: “adenomatous hyperplasia” and “endometrium” occur as “adenomatous hyperplasia of the endometrium”
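
To show how a bracketed parse like this one can be consumed, here is a minimal Python sketch that reads the string into nested lists and pulls out the subject phrase, the verb, and the object phrase. It is an illustration under simplifying assumptions (one S node with one NP and one VP child), not the authors' extraction method; all function names are hypothetical.

```python
# The bracketed parse from the slide.
SENTENCE_PARSE = (
    "(TOP (S (NP (NP (DT An) (JJ excessive) (ADJP (JJ endogenous) (CC or) "
    "(JJ exogenous) ) (NN stimulation) ) (PP (IN by) (NP (NN estrogen) ) ) ) "
    "(VP (VBZ induces) (NP (NP (JJ adenomatous) (NN hyperplasia) ) "
    "(PP (IN of) (NP (DT the) (NN endometrium) ) ) ) ) ) )"
)

def parse_tree(text):
    """Parse a bracketed tree into nested [label, child, ...] lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        pos += 1                    # skip "("
        node = [tokens[pos]]        # node label
        pos += 1
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                node.append(read())
            else:
                node.append(tokens[pos])
                pos += 1
        pos += 1                    # skip ")"
        return node

    return read()

def leaves(node):
    """Words under a node, left to right."""
    words = []
    for child in node[1:]:
        words.extend(leaves(child) if isinstance(child, list) else [child])
    return words

def child_with_label(node, prefix):
    """First direct child whose label starts with `prefix`."""
    return next(c for c in node[1:] if isinstance(c, list) and c[0].startswith(prefix))

def subject_verb_object(tree):
    """Return (subject phrase, verb, object phrase) of the top-level sentence."""
    sentence = child_with_label(tree, "S")
    subject = child_with_label(sentence, "NP")
    verb_phrase = child_with_label(sentence, "VP")
    verb = child_with_label(verb_phrase, "VB")
    obj = child_with_label(verb_phrase, "NP")
    return " ".join(leaves(subject)), " ".join(leaves(verb)), " ".join(leaves(obj))
```

The subject and object phrases returned here still contain modifiers and composites, which is why the slide's two observations about modified and composite entities matter for mapping them to MeSH terms.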

46 Method – Identify entities and Relationships in Parse Tree
[Figure: parse tree of the estrogen sentence (TOP, S, NP, VP, PP, ADJP nodes), annotated with MeSHID D004967, MeSHID D006965, MeSHID D004717 and UMLS ID T147; legend: Modifiers, Modified entities, Composite Entities]

47 What can we do with the extracted knowledge? Semantic browser demo

48 Evaluating hypotheses
Keyword query: Migraine[MH] + Magnesium[MH]
[Figure: a complex query over PubMed and the supporting document sets retrieved; nodes: Migraine, Stress, Patient, Magnesium, Calcium Channel Blockers; edges: affects, isa, inhibit]

49 Workflow Adaptation: Why and How
Volatile nature of execution environments
–May have an impact on multiple activities/tasks in the workflow
HF Pathway
–New information about diseases and drugs becomes available
–Affects treatment plans, drug-drug interactions
Need to incorporate the new knowledge into execution
–Capture the constraints and relationships between different tasks/activities

50 Workflow Adaptation: Why?
–New knowledge about treatment found during the execution of the pathway
–New knowledge about drugs, drug-drug interactions

51 Workflow Adaptation: How
Decision theoretic approaches
–Markov Decision Processes (MDPs)
Given the state S of the workflow when an event E occurs
–What is the optimal path to a goal state G?
–Greedy approaches rely on local optimization
Need to choose actions based on optimality across the entire horizon, not just the current best action
–Model the horizon and use an MDP to find the best path to a goal state
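
The contrast between greedy choice and horizon-wide optimality can be shown with value iteration on a toy MDP. This is a minimal Python sketch; the states, actions, probabilities and rewards are all invented for illustration and do not come from the heart failure pathway:

```python
# Hypothetical workflow MDP:
# TRANSITIONS[state][action] = [(probability, next_state, reward), ...]
TRANSITIONS = {
    "S": {
        "quick_fix": [(1.0, "dead_end", 5.0)],  # best immediate reward
        "replan": [(1.0, "revised", 0.0)],      # no immediate reward
    },
    "revised": {"continue": [(0.9, "G", 10.0), (0.1, "S", 0.0)]},
    "dead_end": {},  # terminal state with no path to the goal
    "G": {},         # goal state, terminal
}

def action_value(outcomes, values, gamma):
    """Expected reward plus discounted value of the successor states."""
    return sum(p * (r + gamma * values[n]) for p, n, r in outcomes)

def value_iteration(transitions, gamma=0.95, tolerance=1e-9):
    """Compute state values by repeated Bellman backups until they settle."""
    values = {state: 0.0 for state in transitions}
    while True:
        delta = 0.0
        for state, actions in transitions.items():
            if not actions:
                continue  # terminal states keep value 0
            best = max(action_value(o, values, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tolerance:
            return values

def best_action(transitions, values, state, gamma=0.95):
    """Action that is optimal over the whole horizon."""
    actions = transitions[state]
    return max(actions, key=lambda a: action_value(actions[a], values, gamma))

def greedy_action(transitions, state):
    """Action with the highest expected immediate reward only."""
    actions = transitions[state]
    return max(actions, key=lambda a: sum(p * r for p, n, r in actions[a]))
```

In this toy model the greedy policy picks quick_fix at S for its immediate reward, while the MDP solution picks replan because it keeps the goal state reachable.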

52 Conclusion
Semantic web technologies can help with:
–Fusion of data: semi-structured, structured, experimental, literature, multimedia
–Analysis and mining of data, extraction, annotation, capturing provenance of data through annotation, workflows with SWS
–Querying of data at different levels of granularity, complex queries, knowledge-driven query interface
–Inference across data sets

53 Take home points
Shift of paradigm: from browsing to querying
Machine understanding:
–Extracting knowledge from text
–Inference, software interoperation
Semantic-enabled interfaces towards hypothesis validation

54 References
1. A. Sheth, S. Agrawal, J. Lathem, N. Oldham, H. Wingate, P. Yadav, and K. Gallagher, Active Semantic Electronic Medical Record, Intl Semantic Web Conference, 2006.
2. Satya Sahoo, Olivier Bodenreider, Kelly Zeng, and Amit Sheth, An Experiment in Integrating Large Biomedical Knowledge Resources with RDF: Application to Associating Genotype and Phenotype Information, WWW2007 HCLS Workshop, May 2007.
3. Satya S. Sahoo, Kelly Zeng, Olivier Bodenreider, and Amit Sheth, From "Glycosyltransferase" to "Congenital Muscular Dystrophy": Integrating Knowledge from NCBI Entrez Gene and the Gene Ontology, Amsterdam: IOS, August 2007, PMID: 17911917, pp. 1260-4.
4. Satya S. Sahoo, Olivier Bodenreider, Joni L. Rutter, Karen J. Skinner, Amit P. Sheth, An ontology-driven semantic mash-up of gene and biological pathway information: Application to the domain of nicotine dependence, submitted, 2007.
5. Cartic Ramakrishnan, Krzysztof J. Kochut, and Amit Sheth, "A Framework for Schema-Driven Relationship Discovery from Unstructured Text", Intl Semantic Web Conference, 2006, pp. 583-596.
6. Satya S. Sahoo, Christopher Thomas, Amit Sheth, William S. York, and Samir Tartir, "Knowledge Modeling and Its Application in Life Sciences: A Tale of Two Ontologies", 15th International World Wide Web Conference (WWW2006), Edinburgh, Scotland, May 23-26, 2006.
Demos at: http://knoesis.wright.edu/library/demos/

