BioCoRE and GEMS: Cyber Infrastructure for Cyber Chemistry Jesús A. Izaguirre Computer Science & Engineering University of Notre Dame with Kirby Vandivort.

Presentation transcript:

BioCoRE and GEMS: Cyber Infrastructure for Cyber Chemistry Jesús A. Izaguirre Computer Science & Engineering University of Notre Dame with Kirby Vandivort NIH Resource for Macromolecular Modeling and Bioinformatics University of Illinois

BioCoRE and GEMS 3 October 2004
Overview I
Chemical applications such as virtual screening, protein kinetics and structure, and analysis and validation of molecular simulations require enormous resources that can be provided by CyberInfrastructure
Successful solution of these problems requires collaborative approaches, also facilitated by CyberInfrastructure

Overview II
To make CyberInfrastructure effective, the following issues must be addressed:
Users of CyberInfrastructure need a data-centric way of managing their computations and data
Distributed databases on the grid need to address the reliability and fault tolerance of data

Overview III
We will study examples of collaborative software that address these issues, primarily:
– BioCoRE: A Collaboratory for Structural Biology
– GEMS: Grid Enabled Molecular Simulations Toolset and Database

Sample CyberScience Projects
Collaborative Biophysics: BioCoRE (K. Schulten, Illinois)
Virtual Screening: The Screensaver Project (W.G. Richards, Oxford)
Protein (V. Pande, Stanford)
Distributed Database of Molecular Simulations: BioSimGrid (M. Sansom, Oxford)

What is BioCoRE?
BioCoRE: a collaborative work environment for biomedical research, research management, and training.
BioCoRE assists the entire research process, from talking with collaborators, to performing simulations and collecting data, to preparing papers and reports.

Sharing Documents
With the BioFS and WebDAV, scientists can exchange and edit files from anywhere with a web connection.

Setting Up and Running Simulations
NAMDCFG: A “Simulation Setup Wizard”
Online help and error checking for NAMD input files
Job submission to supercomputers simplified
Job status monitored for easy retrieval
Job data archived for future reference

Sharing Molecular Views
Using VMD and BioCoRE, collaborators may exchange and manipulate 3-D models of molecules
Emphasis on collaborative sessions
Streamlined process of sharing views

Communicating
Control Panel provides instant messaging and notifications
BioCoRE also provides message boards, a Web site library, and a lab book

Programming Interface
Provides a way for users to interact with BioCoRE programmatically
Communication (Control Panel), shared states (VMD)
WebDAV

Availability
Free
Can be accessed from the Illinois site, or the server software can be installed locally
Server software can be modified if necessary

Virtual Screening
Combinatorial Complexity
Lead Exploration
Screen docking affinities based on a scoring function (interaction energies, RMSD, etc.)
Modeled as an all-pairs problem
Logically independent computational requirements are well suited for wide-area grid distribution
Leads (ligands): L0001, L0002, L0003, L0004, L0005
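
The all-pairs formulation above can be sketched in a few lines of Python. This is only an illustration, not the code used in any of the projects named in the talk: the receptor names, the `toy_score` function, and the "lower score is better" ranking are all assumptions made for the example. The point is that every (receptor, ligand) pair is an independent task, which is what makes the problem easy to spread over a grid.

```python
from itertools import product


def toy_score(receptor: str, ligand: str) -> float:
    """Hypothetical stand-in for a docking scoring function.

    A real scorer would evaluate interaction energies or RMSD; this one
    only produces a deterministic number so the sketch can run.
    """
    return float(sum(map(ord, receptor + ligand)) % 100)


def enumerate_tasks(receptors, ligands):
    """Return every (receptor, ligand) pair; each pair is an independent task."""
    return list(product(receptors, ligands))


def screen(receptors, ligands, scorer=toy_score):
    """Score all pairs and return them sorted by score (lowest first, by assumption)."""
    results = {pair: scorer(*pair) for pair in enumerate_tasks(receptors, ligands)}
    return sorted(results.items(), key=lambda item: item[1])


if __name__ == "__main__":
    ligands = ["L0001", "L0002", "L0003", "L0004", "L0005"]
    receptors = ["R0001", "R0002"]  # hypothetical receptor conformations
    for (receptor, ligand), value in screen(receptors, ligands):
        print(receptor, ligand, value)
```

Because no task depends on another, the dictionary comprehension in `screen` could be replaced by any distributed map without changing the result.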

CyberInfrastructure Needs for Virtual Screening I
Incorporate protein (receptor) flexibility
– Use multiple protein structures (hierarchical representations and algorithms)
Iterative refinement of results
– Add new protein conformations to improve docking
– Use higher resolution models for promising hits (integration of data and work flow)
– Monitor status of results (not just jobs running)

CyberInfrastructure Needs for Virtual Screening II
Manage computation and storage in the grid
– Declarative rather than imperative specification
Automate usage of algorithms / tools
– Select software and optimal parameters for algorithms (recommender system)
– Example: MDSimAid selects an optimal MD simulation protocol (limited options)

BioSimGrid
Mark S. P. Sansom, Oxford
Trajectory data stored in relational database tables per Data Schema
Semi-automated deposition of trajectory files for certain formats (CHARMM, NAMD, etc.)
Trajectory analysis modules
Future goal to distribute database
Database for biomolecular simulations
Specifically: molecular dynamics trajectories
Facilitate validation and analysis of simulations
Provides “independence” from the specific simulation semantics (configuration parameters, architecture, simulation tools, etc.)

CyberInfrastructure Needs for Distributed Databases I
Metadata for trajectories
– Simulation protocol, software, etc.
Distribution on the grid
– Storage fault tolerance / reliability
– Scalable solution: reduce storage requirements and centralization

CyberInfrastructure Needs for Distributed Databases II
Data-driven model for the user
– Data organized around key themes (trajectories, molecules)
Generic tools for developers
– Applicable to different applications

Solving Integration Problem
We need to capture the data flow and the work flow
– Ecce project
– XML metadata
– Component architectures (e.g., JavaBeans, Common Component Architecture)

Solving Integration Problem
BioCoRE (K. Schulten, Illinois)
– Use of programming interface
– Provides multiple services to applications (web file system, job management, shared visualization)

Solving Grid Management
Current grid tools are task oriented: run this particular simulation code with these input files, etc.
– Web portals are an incremental improvement over command-line or stand-alone applications
Problem: Controlling multiple resources
– For example, create 10,000 tasks and keep track of the data, as might be needed for virtual screening applications

Solving Grid Management with GIPSE
GIPSE: Grid Interface for Parameter-driven Simulation Environments
– Shift focus from management to research
– Result-driven interface
– Scripting capabilities

Solving Grid Management with GIPSE
Simplify process
– XML data format
– Missing “glue”
Powerful searches
– Optimizations
– Control loops
Figure labels: GEMS Toolset, HIV-1 Protease

Solving Grid Management with GIPSE
Manage data
– Storage
– Database retrieval
Monitor progress
– Status
– Application-specific
Figure labels: GEMS Toolset, HIV-1 Protease

GEMS Database Toolset
Grid Enabled Molecular Simulation
– Data centric
– Wide-area distributed storage
– Researchers have data and resource autonomy
– Simulation configuration, input data files, and output data files identified via XML
– Centralized SQL locator
– Availability via replication
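
One way to picture the combination of XML descriptors, a centralized SQL locator, and replication is the sketch below. It is not the GEMS schema or API, which the slides do not show: the element names, the `replica` table, the site names, and every function here are assumptions chosen for illustration, and an in-memory SQLite database stands in for the central locator.

```python
import sqlite3
import xml.etree.ElementTree as ET


def describe_run(run_id, config, inputs, outputs):
    """Build a hypothetical XML descriptor naming a run's parameters and files."""
    root = ET.Element("simulation", id=run_id)
    cfg = ET.SubElement(root, "configuration")
    for name, value in sorted(config.items()):
        ET.SubElement(cfg, "param", name=name, value=str(value))
    for tag, files in (("input", inputs), ("output", outputs)):
        for path in files:
            ET.SubElement(root, tag, file=path)
    return ET.tostring(root, encoding="unicode")


def open_locator():
    """Create the central locator: one row per (run, file, storage site)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE replica (run_id TEXT, file TEXT, site TEXT, "
        "PRIMARY KEY (run_id, file, site))"
    )
    return db


def register(db, run_id, file, sites):
    """Record that a file is replicated at each of the given sites."""
    db.executemany(
        "INSERT INTO replica VALUES (?, ?, ?)",
        [(run_id, file, site) for site in sites],
    )


def locate(db, run_id, file, down=()):
    """Return the sites that hold the file, skipping sites known to be down."""
    rows = db.execute(
        "SELECT site FROM replica WHERE run_id = ? AND file = ? ORDER BY site",
        (run_id, file),
    )
    return [site for (site,) in rows if site not in down]
```

With two replicas registered, `locate` still returns a usable site when one of them is unavailable, which is the sense in which replication provides availability in this sketch.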

Reliability and Leveraged Availability via Runtime Imaging
Reliability of data storage is increased
User can trade off availability versus storage volume
Workspace data has 2-way redundancy by default
Archival data has a 2-way redundancy of fewer snapshots, but saves the computational images
For each computational run through the GEMS portal, a comprehensive runtime image is created from which the simulation can automatically be regenerated.
Runtime images include executable version and location, library requirements, hardware requirements, input files, and configuration parameters
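
The contents of a runtime image listed above map naturally onto a small record type. The following is a minimal sketch under stated assumptions: the class name, field names, JSON serialization, and the `can_run_on` check are all hypothetical and are not taken from GEMS. It shows only that the listed items are enough to store a run description and to decide whether a host could regenerate it.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class RuntimeImage:
    """Hypothetical record of what is needed to regenerate one run."""
    executable_path: str      # executable location
    executable_version: str   # executable version
    libraries: list           # library requirements
    hardware: dict            # minimum hardware requirements, e.g. {"cpus": 4}
    input_files: list         # input files
    parameters: dict          # configuration parameters

    def to_json(self) -> str:
        """Serialize the image so it can be archived alongside the data."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "RuntimeImage":
        """Rebuild an image from its archived form."""
        return cls(**json.loads(text))

    def can_run_on(self, host: dict) -> bool:
        """True if the host meets every numeric hardware requirement."""
        return all(host.get(key, 0) >= need for key, need in self.hardware.items())
```

Archiving the serialized image instead of every snapshot is one way to read the trade-off between availability and storage volume: data that is cheap to regenerate need not be stored in full.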

Integration of Distributed Data Into New Simulations
A grid-distributed “make” based on a computational requirement over a set parameter sweep
– Example: optimize MD simulation protocol
Before starting the sweep, a query determines data points that are up to date and those that require computation (including regeneration)
– Example: keep current list of results of virtual screening as more computations are performed or targets and ligands added
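
The “make”-style query described above can be sketched as follows. This is an assumption-laden illustration rather than the GEMS implementation: the parameter names, the `input_version` field used to decide staleness, and the in-memory `results` dictionary standing in for the database are all invented for the example.

```python
from itertools import product


def sweep_points(grid):
    """Expand {"param": [values, ...]} into one dict per point of the sweep."""
    names = sorted(grid)
    return [dict(zip(names, values)) for values in product(*(grid[n] for n in names))]


def point_key(point):
    """Hashable identity for a sweep point, independent of dict ordering."""
    return tuple(sorted(point.items()))


def plan(grid, results, input_version):
    """Split the sweep into up-to-date points and points that need computing.

    A stored result counts as up to date only if it was produced from the
    current input version (a hypothetical staleness rule).
    """
    up_to_date, to_compute = [], []
    for point in sweep_points(grid):
        record = results.get(point_key(point))
        if record is not None and record["input_version"] == input_version:
            up_to_date.append(point)
        else:
            to_compute.append(point)
    return up_to_date, to_compute
```

Only the points returned in `to_compute` would be submitted to the grid; adding a new value to one parameter enlarges the sweep without invalidating results that are already current.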

Example: Validating Simulations
Locate specific published simulation configurations for benchmarking
Select pertinent input data files (pdb, psf, force fields, etc.) for direct use in a new simulation for the purpose of comparison/contrast
Researcher B wants to vary certain parameters of Researcher A’s published simulation to test her new MD integrator

Acknowledgments
Collaborators in GIPSE and GEMS:
– Aaron Striegel
– Doug Thain
– Jeff Peng
Students
– Paul Brenner
– Santanu Chatterjee
Funding from NSF CAREER and Biocomplexity
Klaus Schulten
BioCoRE Team:
– Robert Brunner
– Michael Bach
– David Brandon
BioCoRE funding from NIH