GeWorkbench caGrid TeraGrid Integration Scott Oster Ohio State University – Dept. of Biomedical Informatics Christine Hung Columbia University – JCSB/C2B2.

Presentation transcript:

geWorkbench caGrid TeraGrid Integration
Scott Oster, Ohio State University – Dept. of Biomedical Informatics
Christine Hung, Columbia University – JCSB/C2B2
caBIG Architecture Face-to-Face, Salt Lake City, UT, January 2008

Agenda
- Overview (5 min)
  - Introduction to the TeraGrid Workgroup
  - Background on geWorkbench and the geWorkbench/caGrid/TeraGrid Project
- Technology (10 min)
  - Steps to establishing geWorkbench/caGrid/TeraGrid Interface
  - Use of caGrid Security (GTS, Grid Grouper, Dorian, CDS)
  - Workflow and communications between services
- Demo (5 min)
- Discussion (5 min)

Team Members
- geWorkbench (Columbia University): Christine Hung, Kiran Keshav
- caGrid (Ohio State University): Scott Oster, Stephen Langella
- caGrid/TeraGrid (Argonne National Laboratory): Ravi Madduri
- TeraGrid (Argonne National Laboratory): Stuart Martin
- Management: Aris Floratos (Columbia University), Krishnakant Shanbhag (Argonne National Laboratory), Michael Keller (Booz Allen Hamilton), Patrick McConnell (Duke University), Nancy Wilkins-Diehr (San Diego Supercomputer Center)

Overview
- Primary problem to address
  - Lack of infrastructure and operating procedures to support the high-performance computing needs of caBIG
- Overarching goals
  - Regular caGrid services will run as caGrid/TeraGrid gateway services
  - Virtualize TeraGrid resources (both compute and storage)
- Approach: labor divided between domain and technical tasks
  - Use cases will be drafted to identify the needs of the community
  - Existing TeraGrid Gateway projects will be surveyed to identify lessons learned and potential technology for reuse
  - Demonstrate the approach through a working prototype
  - Document best practices and develop a “cookbook”

TeraGrid Overview
- Characteristics:
  - > 250 teraflops of computing capability
  - > 30 petabytes of online and archival data storage
  - High-performance networks
- Mechanics:
  - Prospective users request an allocation of HPC resources from a review committee
  - Allocations are granted, and credentials are issued
  - Jobs are run with those credentials, and resource usage is billed to the allocation
“TeraGrid is an open scientific discovery infrastructure combining leadership class resources at nine partner sites to create an integrated, persistent computational resource.”
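The allocation mechanics above (request, grant, run, bill) can be sketched as a small accounting model. This is only an illustration: the `Allocation` class, its field names, and the choice of core-hours as the billing unit are assumptions made for the example and do not come from the slides or from any TeraGrid interface.

```python
from dataclasses import dataclass, field


@dataclass
class Allocation:
    """Hypothetical model of an HPC allocation granted by a review committee."""

    project: str
    granted_units: float  # compute units awarded (assumed unit: core-hours)
    used_units: float = 0.0
    ledger: list = field(default_factory=list)  # (job name, cost) per billed job

    @property
    def remaining(self) -> float:
        """Units still available to the project."""
        return self.granted_units - self.used_units

    def charge(self, job_name: str, cores: int, hours: float) -> float:
        """Bill one job to the allocation and return its cost in core-hours."""
        cost = cores * hours
        if cost > self.remaining:
            raise RuntimeError(
                f"job {job_name!r} needs {cost} units, only {self.remaining} left"
            )
        self.used_units += cost
        self.ledger.append((job_name, cost))
        return cost
```

In this model a job that would exceed the remaining balance is rejected before it runs; a real resource provider may handle overruns differently.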

caGrid Gateway Service Overview
- A caGrid service running in the caBIG™ environment that acts as a bridge or proxy to TeraGrid resources for a subset of caBIG™ users
  - Should meet Gold compatibility requirements
- Created for a specific scientific scenario:
  - Abstracts away the details of leveraging TeraGrid for performance-intensive operations
  - Uses domain-specific operations and data types
  - Has access to a TeraGrid allocation
- Alleviates the need for caBIG™ users to:
  - Understand the complexities of TeraGrid (or HPC systems)
  - Obtain TeraGrid accounts/allocations
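The proxy idea described above can be illustrated with a minimal sketch: the gateway checks that the caller belongs to the authorized subset of users, then submits the work under a shared community account so the caller never handles HPC credentials. Every name here (`CommunityAccountGateway`, `backend`, the job dictionary keys) is hypothetical, and the `backend` callable merely stands in for whatever job-submission mechanism a real gateway would use.

```python
class CommunityAccountGateway:
    """Hypothetical gateway that hides an HPC community account behind a
    domain-specific operation."""

    def __init__(self, community_account, authorized_users, backend):
        self._community_account = community_account
        self._authorized = set(authorized_users)
        self._backend = backend  # callable standing in for HPC job submission

    def run(self, caller, operation, payload):
        """Run a domain-specific operation on behalf of an authorized caller."""
        if caller not in self._authorized:
            raise PermissionError(f"{caller} is not authorized for this gateway")
        job = {
            "account": self._community_account,  # shared allocation, not the caller's
            "on_behalf_of": caller,              # kept so usage can be audited
            "operation": operation,
            "payload": payload,
        }
        return self._backend(job)
```

Recording the original caller alongside the community account is a design choice in this sketch; it lets usage be traced back to individual users even though they share one allocation.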

geWorkbench – a Platform for Integrated Genomics
- Integrated genomics analysis application
  - Support for gene expression data, sequences, pathways, and structure
  - 50+ visualization and analysis modules
  - Access to local and remote data sources and analytical services
  - Integration with biological annotation sources
- Development platform
  - Open source
  - Java based
  - Component architecture
  - Facilitates customization

geWorkbench – a Platform for Integrated Genomics
Large collection of components
- Data parsers: Affy MAS/GCOS (txt and CEL), Genepix, RMA, FASTA, caArray, PDB.
- Data management: Project folders, marker/sequence/array groups.
- Visualization: Dendrograms, color mosaics, scatter plots, SOM clusters, BLAST results, dot matrices.
- Analyses: Hierarchical clustering, t-test, SVM, ARACNE, MEDUSA, MatrixREDUCE.
- 3rd-party components: Cytoscape, GoMiner, GeneWays, GenePattern, MEV.
Complete list at

geWorkbench – a Platform for Integrated Genomics

geWorkbench – Graphical User Interface
- Projects Area
- Selection Area
- Visualization Area
- Command Area

Clustering

caGrid Service

TeraGrid Aware caGrid Service

Creating the Gateway Service
- Manually stage the binary (jar file) on TeraGrid
  - Takes in .ser files as input
  - Produces results also in a .ser file
- Used the RAVi plugin for Introduce to create the gateway service
- Gateway gridFTPs input data and parameters from geWorkbench to TeraGrid
  - geWorkbench passes input to the gateway in geWorkbench’s native format (caDSR compliant)
  - Gateway serializes the input before gridFTPing to TeraGrid
- Gateway invokes the staged binary
- Gateway gridFTPs results back to geWorkbench
  - Gateway deserializes the result file
  - Gateway returns results to geWorkbench in its native format
- Gateway service is a secured caGrid service which in turn invokes TeraGrid with a caBIG community account
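The round trip listed above (serialize, transfer, invoke, transfer back, deserialize) can be mocked locally to make the order of steps concrete. This sketch is not the project's implementation: Python's `pickle` stands in for the Java serialization behind the .ser files, a plain file copy stands in for GridFTP, two temporary directories stand in for the gateway host and TeraGrid, and `staged_binary` is a toy analysis (scaled row means) invented for the example.

```python
import pickle
import shutil
import tempfile
from pathlib import Path


def staged_binary(input_path, output_path):
    """Stand-in for the pre-staged analysis jar: serialized input in,
    serialized result out. The analysis itself is a toy (scaled row means)."""
    with open(input_path, "rb") as f:
        request = pickle.load(f)
    scale = request["params"].get("scale", 1.0)
    result = [sum(row) / len(row) * scale for row in request["data"]]
    with open(output_path, "wb") as f:
        pickle.dump(result, f)


def transfer(src, dst):
    """Stand-in for a GridFTP transfer: a local file copy."""
    shutil.copyfile(src, dst)
    return dst


def gateway_invoke(data, params, binary=staged_binary):
    """Mock of the gateway round trip for one analysis request."""
    with tempfile.TemporaryDirectory() as tmp:
        gateway_dir = Path(tmp) / "gateway"  # plays the gateway host
        remote_dir = Path(tmp) / "remote"    # plays the HPC resource
        gateway_dir.mkdir()
        remote_dir.mkdir()

        # 1. Serialize the native-format input and parameters.
        local_in = gateway_dir / "input.ser"
        with open(local_in, "wb") as f:
            pickle.dump({"data": data, "params": params}, f)

        # 2. Transfer the serialized input to the remote side.
        remote_in = transfer(local_in, remote_dir / "input.ser")

        # 3. Invoke the staged binary on the remote side.
        remote_out = remote_dir / "result.ser"
        binary(remote_in, remote_out)

        # 4. Transfer the result file back.
        local_out = transfer(remote_out, gateway_dir / "result.ser")

        # 5. Deserialize and return the result in native form.
        with open(local_out, "rb") as f:
            return pickle.load(f)
```

Calling `gateway_invoke([[1.0, 3.0], [2.0, 4.0]], {"scale": 2.0})` walks through all five steps and returns the toy result as an ordinary Python list; the caller never touches the intermediate files.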

Steps to establishing geWorkbench/caGrid/TeraGrid Interface

caGrid Security: GAARDS (GTS, Grid Grouper, Dorian, CDS)

Workflow and Communications Between Services

Special Thanks
- caGrid (Security Services): Scott Oster, Stephen Langella
- caGrid (RAVi Plugin, Gateway Service): Ravi Madduri

Demo and Discussions
