Information Management for the Life Sciences M. Scott Marshall Marco Roos Adaptive Information Disclosure University of Amsterdam.


1 Information Management for the Life Sciences M. Scott Marshall Marco Roos Adaptive Information Disclosure University of Amsterdam

2 Practice: OWL modeling of a statement COI demo: Bridging CDISC and HL7 with query federation Terminology and SKOS Demonstration Toward Query Federation – putting it all together A few loose ends

3 Towards RDF/OWL (1) ALL instances of PeptideHormone are an instance of Peptide that has_role SOME instance of HormoneActivity Source: Alan Ruttenberg

4 Towards RDF/OWL (2) ALL instances of PeptideHormone are an instance of Peptide that has_role SOME instance of HormoneActivity Source: Alan Ruttenberg

5 Towards RDF/OWL (3) - Instances Source: Alan Ruttenberg

6 Towards RDF/OWL (4) URIs chebi:25905 = Source: Alan Ruttenberg

7 Towards OWL (5): triples
chebi:25905 rdfs:subClassOf chebi:16670 .
chebi:25905 rdfs:subClassOf _:1 .
_:1 owl:onProperty ro:hasRole .
_:1 owl:someValuesFrom go:GO_00179 .
…
Source: Alan Ruttenberg
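
The step from the English statement ("ALL instances of PeptideHormone are an instance of Peptide that has_role SOME instance of HormoneActivity") to the triples on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `existential_subclass_axioms` is hypothetical, and the `rdf:type owl:Restriction` triple is not shown on the slide (the listing is elided with "…"). It is added here as an assumption about what a complete OWL serialization would carry.

```python
def existential_subclass_axioms(cls, parent, prop, filler, bnode="_:1"):
    """Return triples saying: every `cls` is a `parent` that has `prop` SOME `filler`.

    Hypothetical helper for illustration; terms are plain prefixed-name strings.
    """
    return [
        (cls, "rdfs:subClassOf", parent),
        (cls, "rdfs:subClassOf", bnode),
        # Assumption: not on the slide, but part of the usual OWL restriction pattern.
        (bnode, "rdf:type", "owl:Restriction"),
        (bnode, "owl:onProperty", prop),
        (bnode, "owl:someValuesFrom", filler),
    ]


triples = existential_subclass_axioms(
    "chebi:25905",   # PeptideHormone
    "chebi:16670",   # Peptide
    "ro:hasRole",
    "go:GO_00179",   # identifier copied from the slide as-is
)

for s, p, o in triples:
    print(f"{s} {p} {o} .")
```

The blank node `_:1` stands for the anonymous class "things that have some HormoneActivity role"; the second `rdfs:subClassOf` triple is what ties the named class to that restriction.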

8 SPARQLing: put ?variables where you are looking for matches
Data:
chebi:25905 rdfs:subClassOf chebi:16670 .
chebi:25905 rdfs:subClassOf _:1 .
_:1 owl:onProperty ro:hasRole .
_:1 owl:someValuesFrom go:GO_00179 .
Query:
select ?moleculeClass where {
  ?moleculeClass rdfs:subClassOf chebi:16670 .
  ?moleculeClass rdfs:subClassOf ?res .
  ?res owl:onProperty ro:hasRole .
  ?res owl:someValuesFrom go:GO_00179 .
}
Result: ?moleculeClass = chebi:25905
Source: Alan Ruttenberg
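
To make the "put ?variables where you are looking for matches" idea concrete, here is a minimal Python sketch of basic graph pattern matching over the four triples from this slide. It is a toy matcher written for illustration, not how a real SPARQL engine works; the names `match_pattern` and `solve` are hypothetical.

```python
# The four triples from the slide, as (subject, predicate, object) strings.
TRIPLES = [
    ("chebi:25905", "rdfs:subClassOf", "chebi:16670"),
    ("chebi:25905", "rdfs:subClassOf", "_:1"),
    ("_:1", "owl:onProperty", "ro:hasRole"),
    ("_:1", "owl:someValuesFrom", "go:GO_00179"),
]

# The query patterns: terms starting with "?" are variables.
QUERY = [
    ("?moleculeClass", "rdfs:subClassOf", "chebi:16670"),
    ("?moleculeClass", "rdfs:subClassOf", "?res"),
    ("?res", "owl:onProperty", "ro:hasRole"),
    ("?res", "owl:someValuesFrom", "go:GO_00179"),
]


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep any value it was already bound to.
            if result.get(term, value) != value:
                return None
            result[term] = value
        elif term != value:
            return None
    return result


def solve(patterns, triples):
    """Return every variable binding that satisfies all patterns."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


answers = sorted({b["?moleculeClass"] for b in solve(QUERY, TRIPLES)})
print(answers)  # ['chebi:25905']
```

The second pattern alone would also bind `?res` to `chebi:16670`, but that candidate is discarded by the third pattern, which is why only the restriction node survives.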

9 Current Task Forces
BioRDF – integrated neuroscience knowledge base – Kei Cheung (Yale University)
Clinical Observations Interoperability – patient recruitment in trials – Vipul Kashyap (Cigna Healthcare)
Linking Open Drug Data – aggregation of Web-based drug data – Chris Bizer (Free University Berlin)
Pharma Ontology – high level patient-centric ontology – Christi Denney (Eli Lilly)
Scientific Discourse – building communities through networking – Tim Clark (Harvard University)
Terminology – Semantic Web representation of existing resources – John Madden (Duke University)

10 Background of the HCLS IG
Originally chartered in 2005
– Chairs: Eric Neumann and Tonya Hongsermeier
Re-chartered in 2008
– Chairs: Scott Marshall and Susie Stephens
– Team contact: Eric Prud’hommeaux
Broad industry participation
– Over 100 members
– Mailing list of over 600
Background Information
– http://www.w3.org/2001/sw/hcls/
– http://esw.w3.org/topic/HCLSIG

11 COI Task Force Task Lead: Vipul Kashyap Participants: Eric Prud’hommeaux, Helen Chen, Jyotishman Pathak, Rachel Richesson, Holger Stenzhorn

12 COI: Bridging Bench to Bedside
How can existing Electronic Health Records (EHR) formats be reused for patient recruitment?
Quasi-standard formats for clinical data:
– HL7/RIM/DCM – healthcare delivery systems
– CDISC/SDTM – clinical trial systems
How can we map across these formats?
– Can we ask questions in one format when the data is represented in another format?
Source: Holger Stenzhorn

13 COI: Use Case
Pharmaceutical companies pay a lot to test drugs
Pharmaceutical companies express protocol in CDISC
-- precipitous gap --
Hospitals exchange information in HL7/RIM
Hospitals have relational databases
Source: Eric Prud’hommeaux

14 Inclusion Criteria
Type 2 diabetes on diet and exercise therapy or monotherapy with metformin, insulin secretagogue, or alpha-glucosidase inhibitors, or a low-dose combination of these at 50% maximal dose. Dosing is stable for 8 weeks prior to randomization. …
?patient takes metformin.
Source: Holger Stenzhorn

15 Exclusion Criteria
Use of warfarin (Coumadin), clopidogrel (Plavix) or other anticoagulants. …
?patient doesNotTake anticoagulant.
Source: Holger Stenzhorn

16 Criteria in SPARQL
?medication1 sdtm:subject ?patient ;
             spl:activeIngredient ?ingredient1 .
?ingredient1 spl:classCode 6809 . #metformin
OPTIONAL {
  ?medication2 sdtm:subject ?patient ;
               spl:activeIngredient ?ingredient2 .
  ?ingredient2 spl:classCode 11289 . #anticoagulant
}
FILTER (!BOUND(?medication2))
Source: Holger Stenzhorn
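
The OPTIONAL plus FILTER (!BOUND(...)) construction on this slide expresses "takes metformin and has no anticoagulant medication". The same selection logic can be sketched in plain Python. The patient records below are invented for illustration; only the two class codes (6809 and 11289) come from the slide, and the function name `eligible` is hypothetical.

```python
METFORMIN = 6809        # class code shown on the slide
ANTICOAGULANT = 11289   # class code shown on the slide

# Hypothetical records: (patient, medication, class code of the active ingredient).
MEDICATIONS = [
    ("patient1", "med-a", 6809),
    ("patient2", "med-b", 6809),
    ("patient2", "med-c", 11289),
    ("patient3", "med-d", 11289),
]


def eligible(medications):
    """Patients with a metformin medication and no anticoagulant medication."""
    takes_metformin = {p for p, _, code in medications if code == METFORMIN}
    # Plays the role of the OPTIONAL block: patients for whom ?medication2 would be bound.
    takes_anticoagulant = {p for p, _, code in medications if code == ANTICOAGULANT}
    # Plays the role of FILTER (!BOUND(?medication2)): drop those patients.
    return sorted(takes_metformin - takes_anticoagulant)


print(eligible(MEDICATIONS))  # ['patient1']
```

Here patient2 is excluded because an anticoagulant record exists alongside the metformin record, and patient3 never satisfies the inclusion criterion in the first place.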

17 Terminology Task Force Task Lead: John Madden Participants: Chimezie Ogbuji, M. Scott Marshall, Helen Chen, Holger Stenzhorn, Mary Kennedy, Xiashu Wang, Rob Frost, Jonathan Borden, Guoqian Jiang

18 Features: the “bridge” to meaning
Concepts | Features                | Data
Ontology | Keyword Vectors         | Literature
Ontology | Image Features          | Image(s)
Ontology | Gene Expression Profile | Microarray
Ontology | Detected Features       | Sensor Array

19 Terminology: Overview
– Goal is to identify use cases and methods for extracting Semantic Web representations from existing, standard medical record terminologies, e.g. UMLS
– Methods should be reproducible and, to the extent possible, not lossy
– Identify and document issues along the way related to identification schemes, expressiveness of the relevant languages
– Initial effort will start with SNOMED-CT and UMLS Semantic Networks and focus on a particular sub-domain (e.g. pharmacological classification)
Source: John Madden

20 Medical terminologies: today
– Moderate number of large, evolved terminologies
– Adapted for specific business-process contexts
– Each separately, centrally curated
– Typically hierarchical, various expressivities
– Uncommon to mix vocabularies
Outpatient billing - CPT
Inpatient billing - ICD
Laboratory results - LOINC
Clinical findings - SNOMED
Journal indexing - MEDLARS
Pharmacy - MedDRA
Process - HL7
Clinical trials - CDISC
Others...
Source: John Madden

21 SKOS & the 80/20 principle: map “down”
– Minimal assumptions about expressiveness of source terminology
– No assumed formal semantics (no model theory)
– Treat it as a knowledge “map”
– Extract 80% of the utility without risk of falsifying intent
Source: John Madden
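
A minimal sketch of what mapping "down" to SKOS could look like: each term of a hierarchical terminology becomes a `skos:Concept` with a `skos:prefLabel`, and the parent link becomes `skos:broader` rather than `rdfs:subClassOf`, so no class semantics are asserted on behalf of the source. The three-term hierarchy, the `ex:` prefix, and the function name `to_skos` are all assumptions made up for this example and do not come from any real terminology.

```python
# Hypothetical three-term hierarchy: (code, label, parent code or None).
TERMS = [
    ("T1", "Drug", None),
    ("T2", "Antidiabetic agent", "T1"),
    ("T3", "Biguanide", "T2"),
]


def to_skos(terms, prefix="ex:"):
    """Turn (code, label, parent) rows into SKOS triples as plain strings."""
    triples = []
    for code, label, parent in terms:
        concept = prefix + code
        triples.append((concept, "rdf:type", "skos:Concept"))
        triples.append((concept, "skos:prefLabel", f'"{label}"'))
        if parent is not None:
            # skos:broader records the hierarchy without claiming subclass semantics.
            triples.append((concept, "skos:broader", prefix + parent))
    return triples


skos_triples = to_skos(TERMS)
for s, p, o in skos_triples:
    print(f"{s} {p} {o} .")
```

The design choice mirrors the slide: the output treats the terminology as a navigable map and leaves any stronger logical reading to a later, separate step.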

22 The AIDA toolbox for knowledge extraction and knowledge management in a Virtual Laboratory for e-Science

23 SNOMED CT/SKOS under AIDA: retrieve

24 Putting it all together
– Choosing valid terms for use in the SPARQL query by browsing/searching the knowledge base.
– Create single SPARQL endpoint for a federation of knowledge bases (SWObjects)
– Apply bridging technique to bridge MeSH terms and terms in HCLS Knowledge Base.
– Use terms from Terminology Server in Scientific Discourse
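
The idea of a single query entry point over several knowledge bases, with term bridging between vocabularies, can be illustrated with a toy Python sketch. It does not reflect how SWObjects is implemented; the source names, identifiers, patient data, and the function `federated_patients_taking` are all hypothetical.

```python
# Hypothetical sources, each with its own local identifier for a shared term.
SOURCES = {
    "hospital_db": {
        "terms": {"metformin": "rx:6809"},
        "facts": [("patient1", "rx:6809"), ("patient2", "rx:11289")],
    },
    "trial_db": {
        "terms": {"metformin": "drug:metformin"},
        "facts": [("patient7", "drug:metformin")],
    },
}


def federated_patients_taking(shared_term, sources):
    """Ask every source the same question, translating the term per source."""
    results = []
    for name, source in sorted(sources.items()):
        local_term = source["terms"].get(shared_term)
        if local_term is None:
            continue  # this source has no bridge for the shared term
        for patient, drug in source["facts"]:
            if drug == local_term:
                results.append((name, patient))
    return results


print(federated_patients_taking("metformin", SOURCES))
# [('hospital_db', 'patient1'), ('trial_db', 'patient7')]
```

The caller only ever uses the shared term; the per-source `terms` table stands in for the bridging step, and the merged result list stands in for the single endpoint.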

25

26

27 Task Force Resources to federate
BioRDF – knowledge base, aTags (stored in KB)
Clinical Observations Interoperability – drug ontology
Linking Open Drug Data – LOD data
Pharma Ontology – ontology
Scientific Discourse – SWAN ontology, SWAN SKOS, myexperiment ontology
Terminology – SNOMED-CT, MeSH, UMLS

28 Someday, we should be able to find this as evidence for a fact in a Knowledge Base

29 Getting Involved
Benefits to getting involved include:
– Early access to use cases and best practice
– Influence standard recommendations
– Cost effective exploration of new technology through collaboration
– Network with others working on the Semantic Web
Get involved:
– Email chairs and team contact: team-hcls-chairs@w3.org
– Participate in the next F2F (last one was here): http://esw.w3.org/topic/HCLSIG/Meetings/2009-04-30_F2F

30 A Few Announcements
– Still unofficial but almost set: Semantic Web Applications and Tools for the Life Sciences Workshop (SWAT4LS) in Amsterdam 2009 (tentative date: Nov 20)
– Possibly W3C Semantic Web Health Care and Life Sciences Interest Group (HCLSIG) F2F in Fall in Amsterdam
– Shared Names (http://sharednames.org) workshop likely in the Fall, location unknown
– Protégé Conference in Amsterdam June 23 - 26

