Integrative Informatics. Life Sciences Conference + Expo, April 3rd, 2006 – Boston, MA. John Reynders, Information Officer, LRL Discovery and Development Informatics.

Presentation transcript:

Integrative Informatics
Life Sciences Conference + Expo, April 3rd, 2006 – Boston, MA
John Reynders, Information Officer, LRL Discovery and Development Informatics

Life Sciences C+E 2006, Boston MA Copyright © 2006 Eli Lilly and Company
Outline
– Can't we all just get along?
– Navigating silos of silos
– Integrative Informatics

Rapid Application Development with the Parallel Object-Oriented Methods and Applications (POOMA) Framework
Advanced Computing Lab, Los Alamos National Lab
– Post-Doc Challenge: Write a 3D pseudo-spectral code to simulate two colliding vortices using the Navier-Stokes equations
– Result: One post-doc with no parallel experience wrote this application in 5 weeks with POOMA
Figure: Navier-Stokes simulation, iso-surface of vorticity

Encapsulation in the POOMA Framework
Advanced Computing Lab, Los Alamos National Lab
Diagram labels: STL, Expression Templates, User Threads, RefCount & Data Pooling, MPI/PVM, Domain Decomp, RTS Scheduling, Load Balancing, Fields, Matrices, Particles, Meshes, FFT, Elliptic Solvers, Stencil Operations, DP MonteCarlo, ER Plasmas, DP Hydro, ER Ocean, Stencil Operators, Interpolators
Layer and axis labels: Global, Local, Parallel; Application, Physics, Algorithm, Computer Science

Outline
– Can't we all just get along?
– Navigating silos of silos
– Integrative Informatics

The Problem: Silos of Silos
– Tools, applications, and data are standalone, with limited interaction
– Scientists have great difficulty finding their data and associated tools
– Asking cross-domain questions (e.g. bio+chem) is very difficult
– Support is becoming very impractical: an estimated 400+ individual tools across silos
Silos: Chem, Bio, PR&D/ADMET
Diagram labels: LLYDB, BioSel, Jockyss, ELIAS, Beacon, ICARIS, Results Star, Jubilant, BioGeMs, Sig3, PathArt, TV-GAME, PubDBs, Proteome, Xrep, Nautilus, Conformia, Intellichem, MCPACT, Watson, PRDB, LIMS, IDW

Going from the vertical to the horizontal
PRESENTATION LAYER (.NET)
WORK FLOW / BUSINESS LAYER (some .NET)
DATA LAYER: BioGems
DATA LAYER: Biosel/TINS
DATA LAYER: Process Tracking
DATA LAYER: Data Warehouse
DATA LAYER: .Net? ...

Lilly Science Grid (LSG) Architecture - Systems
Diagram labels: TDC-TAT, Plug-In Manager, Plug-In A, Plug-In B, … Plug-In N, Biology, Chemistry, Toxicology, BioGEMS, TV-GAME, BioSel, System X, SAP Portfolio, Portfolio Sys, WS Provider, WS Consumer (WS Cons), Event Communication, TAO

LSG Architecture - Tools View
Diagram labels: TDC-TAT, .NET API, SRS/Oracle, Oracle, Perl CGI, Java, SOAP::Lite, Perl, SOAP::Lite, Axis, .NET Proxy, SOAP::Lite, Flat Files/Oracle, Java, Axis, .NET User Ctrl, .NET User Ctrl, .NET User Ctrl, …, .NET Proxy, .NET API, Oracle, .NET C#, IIS Web Server, Visual Studio, Apache Web Server, Linux, Tomcat Web Server, Linux, WSDL, Common XML Schemas (XSD)

Architecture enables encapsulation and division of labor
Diagram labels: Thick Client POC Interface; Webparts Thin Client POC Interface; Portfolio WS (Spreadsheet of Portfolio); Active Cpds WS (BIOSEL/TINS); Prot. Express. WS (GeneAtlas/BioGems); Prot. Express. WS (Proteome/BioGems), shown three times; Data WebLinks WS (BioGems/TV-GAMES); …; Gene ID Mapping WS (Custom Spreadsheet); Data Integration/Mapping

Grid Architecture Points: The User
– MyScience: Enable scientists to dynamically compose their environment from a set of components
– Orchestration: Components communicate so that an action/question in one component yields results/answers from multiple components
– Organic: New capabilities can be added by simply adding a new component
– Scalable: The combinatorics of using 4 out of 12 components yields 495 configurations
It is much easier to maintain a framework and 12 associated components than 495 separate tools!
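The 495 figure on this slide is the number of ways to choose 4 components out of 12 when order does not matter, that is, the binomial coefficient C(12, 4). A minimal check, using only the Python standard library (the variable names are illustrative and not from the talk):

```python
from math import comb

# Numbers taken from the slide: 12 available components, 4 used at a time.
total_components = 12
components_in_use = 4

# Unordered selections of 4 out of 12: 12! / (4! * 8!)
configurations = comb(total_components, components_in_use)

# What has to be maintained in each world: one framework plus 12
# components, versus one standalone tool per configuration.
maintained_with_framework = 1 + total_components
maintained_as_separate_tools = configurations

print(configurations)
print(maintained_with_framework, "vs", maintained_as_separate_tools)
```

The count only covers configurations of exactly four components; allowing any number of components at once would make the gap larger still.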

Grid Architecture Points: The Developer
– Get SEs and scientists out of silos and into layers so they may do what they do best: data, applications, algorithms, presentation
– Plug-in architecture to factor business/science and framework development
– Crisp abstraction barriers between and within layers to enable modular development
– Rationalize tool set within layers to improve developer productivity

Some Observations/Thoughts
Where is the value?
Data fusion in the military: information supremacy
– F15 heads-up display
– Aegis cruiser
– Bob and Tom from the NSA
What makes the drug hunter effective in an informatics cockpit?
– It's partly the quality, speed, and accuracy of any given tool (e.g. the altimeter)
– It's mostly how the instruments work as an integrated whole: I can ask questions I could not ask before!
– Integrate to this point of innovation before spending significant time on optimizations
Some lessons from Los Alamos
How can one go wrong having an application framework built by a team of A+ students?
– By building a framework that can only be used by A+ students
Surely everyone knowing as much as they can about all aspects of the framework will produce the best framework!
– Nope. By knowing the implementation behind the abstraction, a team fails to program through interface contracts
– Also, the team has trouble scaling its development efforts because it is not functioning as a team (everyone runs to the soccer ball)

The old vs. the new architecture
Benefits: rapid development, customizable environment, integration of tools for cross-domain inquiry, reduced support load…
Diagram labels: Discovery Informatics Integration Kernel, plugin, Jubilant, BioGeMs, Sig3, PathArt, TV-GAME, PubDBs, Proteome, Xrep

Can accelerate staging the old into the Lilly Science Grid
Diagram labels: Discovery Integration Kernel, plugin, Web-Service Layer - LSG, Integrated Data Layer - LSG, BioGeMs, PathArt

Scaling efforts: Divide & Conquer
The Kernel
– Composability, Integration, Interaction, Scalability
– Clear contract with plug-ins
The Plug-in
– Clear contract with kernel and web-service layer
– Domain-specific tool
– Limited knowledge of Kernel required to build plug-in
Web-Services
– Clear contract with plug-ins and data layer
– Insert web-service layers into tools, preserving the legacy interface and creating a service on which to build a plug-in
Integrated Data Layer
– Clear contract with Web-Services
– Design for integration first, optimization next
– Automate ETL
Diagram labels: Discovery Integration Kernel, plugin, Web-Service Layer, Integrated Data Layer
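The kernel/plug-in contract described on this slide can be illustrated with a small sketch. Everything named here (`PlugIn`, `Kernel`, `handle`, the `gene_selected` event, the two example plug-ins) is hypothetical and chosen for illustration; the talk does not show the actual LSG interfaces, which were built on .NET and web services rather than Python. The point is only that the kernel knows nothing about a plug-in beyond its contract, and a plug-in needs only limited knowledge of the kernel.

```python
from abc import ABC, abstractmethod
from typing import Dict


class PlugIn(ABC):
    """Hypothetical plug-in contract: the only surface the kernel relies on."""

    name: str

    @abstractmethod
    def handle(self, event: str, payload: dict) -> dict:
        """React to an event from the kernel and return this plug-in's answer."""


class Kernel:
    """Hypothetical integration kernel: registers plug-ins and relays events."""

    def __init__(self) -> None:
        self._plugins: Dict[str, PlugIn] = {}

    def register(self, plugin: PlugIn) -> None:
        # Adding a capability means adding a component; nothing else changes.
        self._plugins[plugin.name] = plugin

    def broadcast(self, event: str, payload: dict) -> Dict[str, dict]:
        # One question in one component yields answers from all components.
        return {name: p.handle(event, payload) for name, p in self._plugins.items()}


class ExpressionPlugIn(PlugIn):
    name = "expression"

    def handle(self, event: str, payload: dict) -> dict:
        if event != "gene_selected":
            return {}
        # A real plug-in would call its web service here; this is a stub.
        return {"expression_for": payload["gene_id"]}


class PathwayPlugIn(PlugIn):
    name = "pathway"

    def handle(self, event: str, payload: dict) -> dict:
        if event != "gene_selected":
            return {}
        return {"pathways_for": payload["gene_id"]}


kernel = Kernel()
kernel.register(ExpressionPlugIn())
kernel.register(PathwayPlugIn())
answers = kernel.broadcast("gene_selected", {"gene_id": "GENE-1"})
print(answers)
```

Because each plug-in is written against `PlugIn` alone, teams can build domain tools in parallel without reading the kernel's implementation, which is the "program through interface contracts" lesson from the Los Alamos slide.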

Indications view

Drug Hunting Team view

Outline
– Can't we all just get along?
– Navigating silos of silos
– Integrative Informatics

Similarity?
~ Graph: yes. Text: yes. Assay: yes.

Similarity, and adding a magical methyl
~ Graph: yes. Text: yes. Assay: yes.
~ Graph: yes. Text: maybe. Assay: no.

Gene Objects
Representation
– Name
– String
Filters
– PathwaySet
– GeneFamily
– GO
Measures
– Alignment (Algorithm)
– Text (DocumentSet)
– GeneExpression (SampleSet, MoleculeSet)
ATGAGCCTCCCCAATTCCTCCTGCCTCTTAGAAGACAAGATGTGTGAGGGATGCCA

Protein Objects
Representation
– Gene
– String
– State (e.g., Phosphorylated)
Filters
– GeneFamily
– GO
Measures
– Alignment (Algorithm)
– Pathway (PathwaySet)
– Text (DocumentSet)
– ProteinExpression (SampleSet, MoleculeSet)
– Assay (ExperimentSet)
– 3D Structure (Algorithm)

SNP Objects
Representation
– Gene
– Locus
Filters
– GeneSet
– SNP characterization
  – Coding/Non-Coding
  – BLOSUM Score
  – Exon/Intron
  – Transcriptional
Measures
– Linkage disequilibrium
  – D-Prime
  – R-Squared
– Haplotype Block
– Association
– Text (DocumentSet)
ATGAGCCTCCCCAA TTCCTCCTACCTCT TCGGAGACAAGATG TGTCAGGGATGCCA
ATGAGCCTCCCCAA TTCCTCCTGCCGCT TCGAAGACAAGATG TGTCAGGGATGCCA
ATGAGCCTCCCCAA TTCCTCCTACCTCT TAGGAGACAAGATG TGTCAGGGATGCCA
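The two linkage disequilibrium measures named on this slide, D-Prime and R-Squared, have standard textbook definitions for a pair of biallelic loci. The sketch below uses those standard definitions; the function name, argument names, and example frequencies are illustrative assumptions, and the talk does not say how Lilly's tools computed them.

```python
from math import isclose


def linkage_disequilibrium(p_ab: float, p_a: float, p_b: float):
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab: frequency of haplotypes carrying allele A at locus 1 and B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    # Raw disequilibrium: observed haplotype frequency minus the
    # frequency expected if the two loci were independent.
    d = p_ab - p_a * p_b

    # D' rescales D by the largest magnitude it could take
    # given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the two loci.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r_squared


# Perfect association: A always travels with B.
perfect = linkage_disequilibrium(p_ab=0.5, p_a=0.5, p_b=0.5)

# Independence: haplotype frequency equals the product of allele frequencies.
independent = linkage_disequilibrium(p_ab=0.25, p_a=0.5, p_b=0.5)

print(perfect, independent)
```

In the slide's object model these would sit under Measures: functions that take a pair of SNP objects (here reduced to frequencies) and return a number that downstream tools can filter or cluster on.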

Image Objects
Representation: 2D Matrix
Filter: Laws Texture Convolution
– L5 = [ ]
– E5 = [ ]
– S5 = [ ]
– W5 = [ ]
– R5 = [ ]
Measure: Density Functional Signature (DFS)
Measure: Target DFS + L2 Measure
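The vector values for L5, E5, S5, W5, and R5 did not survive in this transcript. The sketch below uses the commonly published Laws texture vectors, which is an assumption: the slide itself does not confirm which values were shown. It illustrates how a 2D texture mask is formed as the outer product of two 1D vectors and then convolved with an image held as a 2D matrix. The helper names (`outer`, `convolve_valid`) are hypothetical.

```python
# Commonly published Laws vectors (assumed; not recovered from the slide).
L5 = [1, 4, 6, 4, 1]      # Level
E5 = [-1, -2, 0, 2, 1]    # Edge
S5 = [-1, 0, 2, 0, -1]    # Spot
W5 = [-1, 2, 0, -2, 1]    # Wave
R5 = [1, -4, 6, -4, 1]    # Ripple


def outer(col, row):
    """Build a 5x5 mask as the outer product of two 1D vectors."""
    return [[c * r for r in row] for c in col]


def convolve_valid(image, mask):
    """2D convolution over the region where the mask fits entirely.

    The mask is flipped in both axes, which is what distinguishes
    convolution from correlation.
    """
    k = len(mask)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    out = []
    for i in range(rows):
        out_row = []
        for j in range(cols):
            total = 0
            for u in range(k):
                for v in range(k):
                    total += image[i + u][j + v] * mask[k - 1 - u][k - 1 - v]
            out_row.append(total)
        out.append(out_row)
    return out


# A flat 5x5 image: the level mask responds, the edge mask does not.
flat = [[1] * 5 for _ in range(5)]
level_response = convolve_valid(flat, outer(L5, L5))
edge_response = convolve_valid(flat, outer(E5, L5))
print(level_response, edge_response)
```

All vectors except L5 sum to zero, so any mask built with one of them gives no response on a featureless region; only texture produces a signal.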

Molecule Objects
Representation
– Name
– Graph
– 3D Structure
Filters
– Library Compounds
– Similarity Search
Measures
– Text (DocumentSet)
– Fingerprints (Algorithm)
– 2D/3D Similarity (Algorithm)
– HTS (GeneSet)
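The Representation / Filters / Measures pattern used across these object slides can be sketched for molecules as follows. This is an illustration under stated assumptions: the class, the set-of-bit-indices fingerprint, and the choice of the Tanimoto coefficient as the fingerprint measure are not taken from the talk, which names only "Fingerprints (Algorithm)" and "Similarity Search".

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Molecule:
    """Hypothetical molecule object; only the parts needed here are modeled."""

    name: str
    # Indices of set bits; a stand-in for a real structural fingerprint.
    fingerprint: FrozenSet[int]


def tanimoto(a: Molecule, b: Molecule) -> float:
    """Measure: shared fingerprint bits divided by bits set in either molecule."""
    union = a.fingerprint | b.fingerprint
    if not union:
        return 0.0
    return len(a.fingerprint & b.fingerprint) / len(union)


def similarity_search(query: Molecule, library: List[Molecule],
                      threshold: float) -> List[Molecule]:
    """Filter: keep library compounds at or above the similarity threshold."""
    return [m for m in library if tanimoto(query, m) >= threshold]


query = Molecule("cpd-A", frozenset({1, 2, 3, 4}))
library = [
    Molecule("cpd-B", frozenset({1, 2, 3, 5})),   # shares 3 of 5 distinct bits
    Molecule("cpd-C", frozenset({7, 8})),         # shares nothing
]
score = tanimoto(query, library[0])
hits = similarity_search(query, library, threshold=0.5)
print(score, [m.name for m in hits])
```

A filter here is just a measure plus a threshold. That is also why the "magical methyl" slide matters: a graph or fingerprint measure can call two compounds similar while an assay measure on the same pair does not.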

Compound Profiling
Bioprint Target Chemoprint Chemogenomic Selectivity profiles

Comparison of kinase dendrograms
Panels: Assay Sim., Sequence Sim.

HTS as a Mapping Object
Representation
– ProteinSet
– MoleculeSet
– HTS Array
Filters
– Protein Filters
– Molecule Filters
Measures
– Cluster Analysis
– Self-Organizing Maps
– Support Vector Machines
– Neural Networks

Gene Expression as a Mapping Object
Representation
– GeneSet
– SampleSet
– 2D Expression Matrix
Filters
– Gene Filters
– Sample Filters
Measures
– Cluster Analysis
– Self-Organizing Maps
– Support Vector Machines
– Neural Networks
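A mapping object, as these slides use the term, relates two object sets through a matrix and inherits the filters of both sides. A minimal sketch for gene expression follows; the class and method names are hypothetical, and Euclidean distance stands in for whatever distance the cluster analysis on the slide would use.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Callable, List


@dataclass
class ExpressionMap:
    """Hypothetical mapping object: GeneSet x SampleSet -> expression value."""

    genes: List[str]
    samples: List[str]
    matrix: List[List[float]]  # rows follow genes, columns follow samples

    def filter_genes(self, keep: Callable[[str], bool]) -> "ExpressionMap":
        """Gene filter: keep the rows whose gene passes the predicate."""
        rows = [i for i, g in enumerate(self.genes) if keep(g)]
        return ExpressionMap([self.genes[i] for i in rows],
                             self.samples,
                             [self.matrix[i] for i in rows])

    def filter_samples(self, keep: Callable[[str], bool]) -> "ExpressionMap":
        """Sample filter: keep the columns whose sample passes the predicate."""
        cols = [j for j, s in enumerate(self.samples) if keep(s)]
        return ExpressionMap(self.genes,
                             [self.samples[j] for j in cols],
                             [[row[j] for j in cols] for row in self.matrix])

    def gene_distance(self, gene_a: str, gene_b: str) -> float:
        """Measure: Euclidean distance between two expression profiles."""
        a = self.matrix[self.genes.index(gene_a)]
        b = self.matrix[self.genes.index(gene_b)]
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


expr = ExpressionMap(
    genes=["g1", "g2", "g3"],
    samples=["s1", "s2"],
    matrix=[[1.0, 2.0], [1.0, 2.0], [4.0, 6.0]],
)
same_profile = expr.gene_distance("g1", "g2")
different_profile = expr.gene_distance("g1", "g3")
only_s1 = expr.filter_samples(lambda s: s == "s1")
print(same_profile, different_profile, only_s1.matrix)
```

The HTS mapping object on the preceding slide has the same shape, with ProteinSet and MoleculeSet in place of genes and samples.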

Text as a Mapping Object
Representation
– ObjectSet A
– ObjectSet B
– RDF Triplets
Filters
– DocumentSet
– A Filters
– B Filters
Measures
– QR Factorization
– Text-based Classifiers

Pathway as a Mapping Object
Representation
– ProteinSet
– MoleculeSet
– VertexSet
Filters
– Protein Filters
– Molecule Filters
Measures
– Graph Algorithms
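The slide lists "Graph Algorithms" as the measure on a pathway without naming one. As one possible example, the sketch below treats a pathway as an undirected graph over proteins and molecules and measures the number of steps between two vertices with breadth-first search. The function name, the vertex labels, and the choice of shortest-path distance are all assumptions made for illustration.

```python
from collections import deque
from typing import Iterable, Optional, Tuple


def graph_distance(edges: Iterable[Tuple[str, str]],
                   start: str, goal: str) -> Optional[int]:
    """Return the fewest edges between two vertices, or None if unconnected."""
    # Build an undirected adjacency map from the edge list.
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    if start not in adjacency or goal not in adjacency:
        return None

    # Breadth-first search visits vertices in order of distance from start,
    # so the first time the goal is reached the distance is minimal.
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# Hypothetical pathway: proteins (P*) linked through molecules (M*).
pathway_edges = [("P1", "M1"), ("M1", "P2"), ("P2", "P3"), ("P4", "M2")]
near = graph_distance(pathway_edges, "P1", "P3")
unconnected = graph_distance(pathway_edges, "P1", "P4")
print(near, unconnected)
```

A distance of this kind gives pathways the same role as the other mapping objects: a number relating two proteins that can be compared with their sequence, assay, or text similarity.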

Putting it all together…
Diagram labels: Objects, Measure, MTS, Literature, Binding, Coding, Clinical DB, Compounds, Images, Genes, SNPs, Expression, Linkage D, Signature, Fingerprint, Map 1, Map 2

The goal… find wormholes
Diagram labels: Objects; 20 text-based relations; Text, Pathway, HTS, Expression, Image; 16 objects; 120 heterogeneous relations