1 © 2014 Cypher Genomics, Inc. Page 1. Proprietary and Confidential.
The information disclosed in this document, including all designs and related materials, is the valuable property of Cypher Genomics, Inc. and its licensors. Cypher Genomics, Inc. and its licensors, as appropriate, reserve all patent, copyright and other proprietary rights to this document, including all design, manufacturing, reproduction, use, and sales rights, except to the extent said rights are expressly granted to others.
Choosing a Cloud Provider for your Genomic Applications
Phillip Pham
Director, Technology Development
phillip@cyphergenomics.com
November 13th, 2014

2 Biography
Developed Cypher Genomics technology at the Scripps Translational Science Institute
Designed and deployed high-performance and parallel computing systems for annotation and genome interpretation
Expertise in genome analysis workflows
Interested in the genetic architecture influencing drug efficacy and response in clinical trials
Bioinformatics: Bioengineering, B.S.
Patents:
  – Pham P, Deshpande S, 2013. Systems and Methods for Genomic Variant Annotation, U.S. Patent 13/841,575, filed March 2013. Patent Pending.

3 Agenda
Cypher Genomics Problem Space
Simplified Logical Architecture
Pain Points
Evaluating Alternatives
Lessons Learned
Summary


5 Our Problem Space
Cypher Genomics provides automated Annotation, Interpretation and Biomarker Discovery for Whole Genomes, Exomes, CNV, etc.
  – Sequencing = AGCTTGAGGATCAACTAGTGCATGCTATACCTGC…
  – Alignment & Variant Calling
      Align the appropriate sets of nucleotides to their place in the genome.
        – 150 GB BAM files (compressed)
      Compare the aligned input genome to the reference genome.
      Identify variants (i.e. mutations) in the input genome.
  – Annotation
      Each variant is annotated with 90+ attributes from 50+ reference data sources as well as Cypher’s proprietary data and prediction algorithms.
      Web-based UI for accessing all data, applying analytic filters, etc.
  – Interpretation
      Human-readable PDF report with Cypher Synthesis summary of the most important findings.
        – Includes all variants with Cypher Synthesis and references to supporting evidence
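Editor's illustration: the "compare the aligned input genome to the reference genome" step can be pictured with a toy sketch. This is not Cypher's pipeline. Real variant callers read BAM files and model sequencing error, insertions, deletions and coverage. The function name `find_snvs` and the sample sequence are hypothetical, and the sketch assumes both sequences are already aligned and of equal length.

```python
from typing import List, Tuple


def find_snvs(reference: str, sample: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) per mismatch.

    Assumption: both sequences are already aligned and equal in length,
    so only single-nucleotide substitutions can be detected.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, ref_base, alt_base)
        for pos, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != alt_base
    ]


# Hypothetical 12-base sequences, not real genomic data.
reference = "AGCTTGAGGATC"
sample = "AGCTTGCGGATA"
variants = find_snvs(reference, sample)
```

Each tuple in `variants` corresponds to one candidate variant that a later annotation stage would look up in reference databases.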

6 Mantis™ Workflow – Clinical Use Case
Patient Sample Sent to Lab
Sample is Sequenced, Aligned & Variant Called
  – 150 GB BAM
  – 125 MB VCF
Next Gen Sequencing Data Uploaded to Cypher
Mantis Interprets Genome Data
Report Generated
  – PDF, 3 MB
Clinical Summary Delivered
Scale:
  – 200M+ Variants Annotated and growing
  – 90+ Annotations per Variant
  – 50+ Reference DBs
  – 40+ TB (and growing)

7 Automated solutions are essential to provide precision medicine at population scales
YESTERDAY: Nicholas Volker, 2010
  – Medical College of Wisconsin
  – Life-threatening bowel disease
  – Whole exome sequencing (1% of genome)
  – Cost: $15,000
  – Single mutation in XIAP
TODAY: Lilly Grossman, 2013
  – Scripps IDIOM / Cypher Genomics
  – Debilitating neuromuscular disease
  – Whole genome sequencing
  – Automated interpretation
  – Mutations in ADCY5
TOMORROW: Every Patient, 2018
  – Driven by decrease in sequencing cost (e.g. Illumina X10, $1,000 genome)
  – Whole genome as baseline
  – Updated interpretation over time
  – Genomic clinical decision support

8 Coral™ Workflow – Pharma Use Case
Patient Data Collected
Study Samples Sequenced
Next Gen Sequencing Data to Cypher
Cypher Runs Genomes at Scale
Coral Produces Predictive Models
Biomarker Identified
Data and compute usage grow in population studies

9 Logical Deployment Components
Web Front-End
  – User Interface
  – Apache Web Server
  – Zend Framework Server
Cypher Core Services (CCS)
  – REST API Implementation to Service
      User Interface
      Automated Integration
  – CDH – Hadoop Ecosystem
      HDFS – Holds raw reference data
      HBase – Variant Annotation DB
      MapReduce – Process variant data and annotation information at scale
  – MongoDB
      Consolidated Reference Data
      Demographics Data
  – Vertica
      Analytics DB for real-time, interactive analytic investigation
Cypher Analytic Pipeline (CAP) – Novel Variant Annotation Pipeline
  – Penguin on Demand – HPC with Torque scheduler
  – Custom parallelization algorithm for annotating, running predictive algorithms
  – Final annotation of Variant goes into CCS Annotation DB

10 Simplified Logical Architecture
Diagram labels, as recovered from the slide:
HPC Provider
Cypher Annotation Pipeline (Novel Variants)
CAP Master Application Server: Annotation of Novel Variants
CAP HPC Job Scheduler
HPC Environment: TORQUE
Cloud Provider
Cypher Core Services
CCS Master Application Server: Annotation of Known Variants, Interactive Analytics Back-End, Interpretation & Reporting
Consolidated Reference DB (Mongo)
Annotation DB (HBase)
HDFS
Hadoop & Map/Reduce Cluster
CCS Demographic Data (Mongo)
Analytics DB (Vertica)
Source Reference Data
Front-end Applications
REST APIs
Web Server, Web App Server (shown twice)

11 Pain Points
Downtime
  – Unscheduled system reboots and upgrades
      Temporary loss of data and configurations
  – Poor communication
Difficult to manage temporary cluster instantiation
Firewall architecture imposes inflexible rules on our naming conventions
Limits to cluster and storage quota (either programmatically or through the Management Console)
  – Results in requiring us to call the vendor to raise our limits
Ease of getting to HIPAA Compliance

12 Evaluating Alternatives
Comparable (or better) Performance per $
Better Uptime / Less Dramatic Impact during Maintenance
Start and stop temporary clusters with ease
Flexible firewall rules
Unlimited storage scaling on demand
Strategic options for utilizing service offerings
HIPAA compliance and business associate agreement

13 What did we test?
Most Time Consuming Stages of our Processing
  – Reformat Input Files
  – Query Annotations
  – Process Annotations
  – Split Results
  – Organize Results
And
  – Total End-to-End processing time

14 Test Configurations – Hadoop (HBase/MapReduce) Cluster
Current Cloud Vendor
  – Test Configuration – “Original Apple”
      6 Nodes
      8 CPU
      32 GB RAM
      40 TB Block Storage (Magnetic)
      No Encryption
NOTE: The above configuration is smaller than our entire product and is just for performance comparison purposes. You should draw NO conclusions regarding the performance of our product from these comparisons.
New Cloud Vendor
  – Configuration 1 – “Pear”
      6 Nodes
      8 CPU
      32 GB RAM
      40 TB Block Storage (SSD)
      All Disks Encrypted
  – Configuration 2 – “Melon”
      6 Nodes
      32 CPUs
      60 GB RAM
      40 TB Local Disk (Magnetic)
      All Disks Encrypted
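Editor's illustration: the three test configurations can be written down as data to make the size difference between them explicit. The sketch below assumes that the CPU and RAM figures are per node and that the 40 TB figure is per cluster. The slide does not state either, so the totals are only valid under that reading. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterConfig:
    name: str
    nodes: int
    cpus_per_node: int      # assumption: the slide's CPU count is per node
    ram_gb_per_node: int    # assumption: the slide's RAM figure is per node
    storage_tb: int         # assumption: storage is the cluster-wide total
    storage_type: str
    encrypted: bool

    @property
    def total_cpus(self) -> int:
        return self.nodes * self.cpus_per_node

    @property
    def total_ram_gb(self) -> int:
        return self.nodes * self.ram_gb_per_node


CONFIGS = {
    "Original Apple": ClusterConfig("Original Apple", 6, 8, 32, 40, "block, magnetic", False),
    "Pear": ClusterConfig("Pear", 6, 8, 32, 40, "block, SSD", True),
    "Melon": ClusterConfig("Melon", 6, 32, 60, 40, "local, magnetic", True),
}

for cfg in CONFIGS.values():
    print(f"{cfg.name}: {cfg.total_cpus} CPUs, {cfg.total_ram_gb} GB RAM, "
          f"{cfg.storage_tb} TB {cfg.storage_type}, encrypted={cfg.encrypted}")
```

Under these assumptions "Pear" matches "Original Apple" in CPU and RAM and differs in storage type and encryption, while "Melon" has four times the CPUs, so timing differences between them cannot be attributed to a single factor.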

15 Apples to Pears to Melons… Oh my! (Reformat)

16 Apples to Pears to Melons… Oh my! (Query)

17 Apples to Pears to Melons… Oh my! (Process Annots)

18 Apples to Pears to Melons… Oh my! (Split Results)

19 Apples to Pears to Melons… Oh my! (Organize)

20 Apples to Pears to Melons… Oh my! (Overall)

21 Lessons Learned
Environment set-up and tweaking
  – 80% of time spent here
  – Be prepared to iterate on this as you test larger and larger batches.
Work with your vendor to provide experts in Hadoop, Performance and Capacity Planning to aid in your evaluation.
Work with your vendor to cost out expenditures
  – Don’t forget security, load balancing, auto-scaling services, etc.
  – Look closely at the advantages of paying a portion up-front to get good discounts on monthly costs.

22 Summary – Key Requirements of a Genomics Cloud Provider
Find a Cloud Provider that will be a Strategic Partner!
  – Resources to aid in your evaluation.
  – Resources to aid in optimizing costs.
Instrument your applications to capture detailed performance stats per logical computational step.
  – Gives delta between current and new Cloud Provider infrastructure.
During your testing, establish how responsive the Cloud Partner is to issues unrelated to your pilot (e.g. machine upgrades, failures in nodes, networking and/or storage).
Ensure that you can stop clusters that you are not testing so that you aren’t billed more than once.
Make sure they will sign a HIPAA BAA and that you know exactly what is covered.
Pick a Cloud Partner that offers services for automating deployment, scaling, load balancing, security and log management (even if you don’t plan to use them now).
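Editor's illustration: one minimal way to capture timings per logical computational step and then compute the delta between two providers is sketched below. It is not taken from the presentation. `StageTimer`, `percent_change` and the stage names are hypothetical, and the two placeholder statements stand in for real pipeline stages.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator


class StageTimer:
    """Accumulates wall-clock seconds per named pipeline stage."""

    def __init__(self) -> None:
        self.seconds: Dict[str, float] = {}

    @contextmanager
    def stage(self, name: str) -> Iterator[None]:
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the elapsed time even if the stage raises.
            elapsed = time.perf_counter() - start
            self.seconds[name] = self.seconds.get(name, 0.0) + elapsed

    def total(self) -> float:
        # Sum of instrumented stages; untimed gaps between stages are excluded.
        return sum(self.seconds.values())


def percent_change(current: Dict[str, float],
                   candidate: Dict[str, float]) -> Dict[str, float]:
    """Percent change per stage; negative means the candidate was faster."""
    return {
        stage: 100.0 * (candidate[stage] - current[stage]) / current[stage]
        for stage in current
        if stage in candidate and current[stage] > 0
    }


timer = StageTimer()
with timer.stage("reformat_input"):
    sum(range(1000))                    # placeholder for real work
with timer.stage("query_annotations"):
    sorted(range(1000), reverse=True)   # placeholder for real work
```

Running the same instrumented job on each provider yields one `seconds` dictionary per environment, and `percent_change(current, candidate)` then gives the per-stage delta.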

23 Acknowledgements
Javier Velazquez-Muriel – Bioinformatics Engineer
Patrick Ravenel – CTO & VP of Engineering
Ashley Van Zeeland – CEO & Founder
Ali Torkamani – CSO & Founder
Nicholas Schork – Founder
Eric Topol – Founder

24 Questions?

