©2011 MFMER | slide-1 The Linked Clinical Data Project Jyotishman Pathak, PhD Rick Kiefer SemTIG November 4, 2011.

1 ©2011 MFMER | slide-1 The Linked Clinical Data Project Jyotishman Pathak, PhD Rick Kiefer SemTIG November 4, 2011

2 ©2011 MFMER | slide-2 Purpose The Linked Clinical Data (LCD) project aims to investigate emerging Semantic Web technologies for developing an ontology-driven framework for high-throughput phenotyping using Electronic Medical Records (EMRs) to analyze multi-factorial phenotypes. Investigate ontology-based techniques. Develop a framework for publishing and integrating. Propose and validate semantic reasoning techniques to support rapid cohort identification.

3 ©2011 MFMER | slide-3 LCD Architecture (diagram; labels only) MCLSS Databases: Health Quest, MICS, MRIS, NRAF, Med Index. Virtuoso RDF View (SQL). MCLSS Endpoint (SPARQL). Linked Open Drug Data Endpoints. Web Server, Virtual Server. Linked Data API: Selector, Viewer, Formatter, Request, Response. Thick Client Application, Thin Client Application, Mobile Client Application.

4 ©2011 MFMER | slide-4 Project – Automated SNPedia SNPedia contains a wealth of data, but the information in the wiki is manually curated. The focus of this project is to automate the results using patient data. Using MCLSS, identify patients with specific conditions. Join with OMIM to determine the genetic locus associated with those conditions. Join with dbSNP to identify potentially associated SNPs. Each of the joins will be done using a single federated SPARQL query. Results will then be compared to data in SNPedia.
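The join-and-compare steps on this slide can be sketched in Python with small in-memory tables standing in for MCLSS, OMIM, dbSNP, and SNPedia. This is only an illustration of the data flow: every patient, condition, gene, and rsID below is a made-up placeholder, the function names are hypothetical, and the project itself performs the joins in a federated SPARQL query rather than in application code.

```python
from collections import defaultdict

# Placeholder data only; none of these identifiers are real associations.
patient_conditions = [("patient-1", "condition-X"), ("patient-2", "condition-Y")]
omim_condition_genes = [("condition-X", "GENE_A"), ("condition-Y", "GENE_B")]
dbsnp_gene_snps = [
    ("GENE_A", "rs0000001"),
    ("GENE_A", "rs0000002"),
    ("GENE_B", "rs0000003"),
]
snpedia_condition_snps = {("condition-X", "rs0000001"), ("condition-Y", "rs0000009")}


def index_by_key(pairs):
    """Group (key, value) pairs into a dict mapping key -> set of values."""
    index = defaultdict(set)
    for key, value in pairs:
        index[key].add(value)
    return index


def candidate_snps(patients, condition_genes, gene_snps):
    """Join patient conditions -> genes -> SNPs, returning (condition, rsID) pairs."""
    genes_for = index_by_key(condition_genes)
    snps_for = index_by_key(gene_snps)
    found = set()
    for _patient, condition in patients:
        for gene in genes_for.get(condition, ()):
            for snp in snps_for.get(gene, ()):
                found.add((condition, snp))
    return found


def compare_with_curated(candidates, curated):
    """Split the derived pairs against a manually curated set."""
    return {
        "in_both": candidates & curated,
        "only_derived": candidates - curated,
        "only_curated": curated - candidates,
    }


derived = candidate_snps(patient_conditions, omim_condition_genes, dbsnp_gene_snps)
report = compare_with_curated(derived, snpedia_condition_snps)
```

The three buckets in `report` correspond to pairs confirmed by the curated source, pairs only the automated pipeline found, and curated pairs the pipeline missed.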

5 ©2011 MFMER | slide-5 Disease to SNP architecture (diagram; labels only) Endpoints: MCLSS Databases, OMIM, dbSNP. Data path: Patient, Disease (SNOMED/ICD9), Gene, SNP. Other labels: RDF View Mapping, SPARQL Query, Request, Results.

6 ©2011 MFMER | slide-6 dbSNP/OMIM federated query

    PREFIX omim: <http://bio2rdf.org/omim_resource>
    PREFIX dbsnp: <http://edison.mayo.edu:8890/schemas/dbsnp2/>
    SELECT DISTINCT ?rsID ?geneSymbol ?alleleName {
      SERVICE <http://omim.bio2rdf.org/sparql> {
        SELECT ?geneSymbol ?alleleName WHERE {
          ?alleleVariant rdf:type omim:AllelicVariant;
            <http://purl.org/dc/terms/title> ?alleleName;
            omim:symbol ?geneSymbol.
          FILTER(regex(str(?alleleName), "Diabetes", "i")).
        }
      }
      SERVICE <http://edison.mayo.edu:8890/sparql> {
        SELECT ?rsID WHERE {
          ?s dbsnp:symbol ?geneSymbol;
            dbsnp:rsid ?rsID.
        }
      }
    }
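As a rough sketch of how a client might parameterize this query and prepare it for submission, the Python below fills in the search keyword and encodes the query as the `query` parameter of an HTTP GET URL, following the SPARQL protocol. Several things here are assumptions rather than part of the slides: the function names are hypothetical, the `rdf:` prefix is declared explicitly instead of relying on the server to predefine it, and the second sub-select also projects `?geneSymbol` so that the two SERVICE blocks share a join variable. No request is actually sent, since the 2011 endpoints may no longer be reachable.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Query text adapted from the slide; __KEYWORD__ is replaced at build time.
QUERY_TEMPLATE = """\
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX omim: <http://bio2rdf.org/omim_resource>
PREFIX dbsnp: <http://edison.mayo.edu:8890/schemas/dbsnp2/>
SELECT DISTINCT ?rsID ?geneSymbol ?alleleName {
  SERVICE <http://omim.bio2rdf.org/sparql> {
    SELECT ?geneSymbol ?alleleName WHERE {
      ?alleleVariant rdf:type omim:AllelicVariant;
        <http://purl.org/dc/terms/title> ?alleleName;
        omim:symbol ?geneSymbol.
      FILTER(regex(str(?alleleName), "__KEYWORD__", "i"))
    }
  }
  SERVICE <http://edison.mayo.edu:8890/sparql> {
    SELECT ?rsID ?geneSymbol WHERE {
      ?s dbsnp:symbol ?geneSymbol;
        dbsnp:rsid ?rsID.
    }
  }
}
"""


def escape_literal(text):
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_federated_query(keyword):
    """Return the federated query with the condition keyword filled in."""
    return QUERY_TEMPLATE.replace("__KEYWORD__", escape_literal(keyword))


def request_url(endpoint, query):
    """Encode the query as a SPARQL-protocol GET URL (nothing is sent)."""
    return endpoint + "?" + urlencode({"query": query})


query = build_federated_query("Diabetes")
# Assumption: the outer query is submitted to the local Virtuoso endpoint.
url = request_url("http://edison.mayo.edu:8890/sparql", query)
```

The keyword is used as a regular expression by the endpoint, so a caller passing user input would also need to decide how to treat regex metacharacters; this sketch only handles string-literal escaping.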

7 ©2011 MFMER | slide-7 Partial SPARQL results

8 ©2011 MFMER | slide-8 Process – Creating dbSNP endpoint No endpoint could be found, so one had to be created. Download the dbSNP database from a Sybase dump. Use Perl to filter the tables, isolating the desired data and rewriting it in tab-delimited form. Create tables in MySQL and import the files. Use Virtuoso to link to the tables. Create RDF views by mapping the table columns to the desired endpoint subjects.
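The filter-and-map idea behind these steps can be illustrated with a short Python sketch: keep only the wanted columns as tab-delimited text, then turn each row into triples. This is not the project's Perl script or a Virtuoso RDF view definition (Virtuoso declares views in its own mapping language). The column names `rsid` and `symbol` are borrowed from the predicates in the federated query, while the `snp/` subject URI pattern, the function names, and the sample row are assumptions made for the example.

```python
import csv
import io

# Namespace taken from the dbsnp: prefix on the federated query slide.
DBSNP_NS = "http://edison.mayo.edu:8890/schemas/dbsnp2/"


def filter_to_tsv(rows, wanted_columns):
    """Keep only the wanted columns and write them as tab-delimited text."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(wanted_columns)
    for row in rows:
        writer.writerow([row[column] for column in wanted_columns])
    return out.getvalue()


def tsv_to_ntriples(tsv_text):
    """Map each TSV row to N-Triples lines, one triple per mapped column.

    Assumes values contain no quotes or backslashes, so no literal
    escaping is applied. The snp/ subject pattern is hypothetical.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    triples = []
    for record in reader:
        subject = f"<{DBSNP_NS}snp/{record['rsid']}>"
        triples.append(f'{subject} <{DBSNP_NS}rsid> "{record["rsid"]}" .')
        triples.append(f'{subject} <{DBSNP_NS}symbol> "{record["symbol"]}" .')
    return triples


# Placeholder row; the values are not real dbSNP data.
sample_rows = [{"rsid": "rs0000001", "symbol": "GENE_A", "unused": "x"}]
tsv = filter_to_tsv(sample_rows, ["rsid", "symbol"])
triples = tsv_to_ntriples(tsv)
```

In the workflow described on the slide, the tab-delimited files are loaded into MySQL and the mapping is evaluated by Virtuoso at query time, so no triples are materialized the way this sketch does.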

9 ©2011 MFMER | slide-9 Hurdles Endpoints: difficult to find; unreliable uptime; unknown age of data; schema documentation. Environment: Linux, could not find an ODBC driver for Virtuoso; Virtuoso Bridge did not work with DB2; virtual server, no admin permissions; Windows 2008 server, bug in WebDAV access.

10 ©2011 MFMER | slide-10 Hurdles Virtuoso: did not support federated queries until March; the March release has bugs; unable to run SPARQL queries against non-local endpoints; federated queries of mixed location crash the server; the beta fix release has performance issues. Documentation: outdated, with poor navigation.

11 ©2011 MFMER | slide-11 Next steps MCLSS: identify small MCLSS views; federated query with SIDER and RxNorm; use TMO/etc. for RDBMS -> RDF mapping. dbSNP: RDF view; standardized RDBMS -> RDF mapping; visual graph for dbSNP/OMIM. SNPedia: alter Bob’s Perl script to download data; upload into MySQL for comparisons.

12 ©2011 MFMER | slide-12 Questions? Website: http://informatics.mayo.edu/LCD/index.php/Main_Page Thank you! Bob Freimuth: Perl scripts to filter and transform the dbSNP database, as well as invaluable sharing of genomic knowledge and advice.

