1 IBM – IU Recent Research Activities and Collaboration Opportunities Craig Stewart Associate Vice President for Research & Academic Computing (Acting)

Presentation transcript:

1 IBM – IU Recent Research Activities and Collaboration Opportunities Craig Stewart Associate Vice President for Research & Academic Computing (Acting) Chief Operational Officer, Pervasive Technology Labs 18 May 2005 ©Trustees of Indiana University 2005

License Terms
Please cite as: Stewart, C.A. IBM – IU Recent Research Activities and Collaboration Opportunities. Presentation. (Bloomington, IN, 18 May 2005). Available from:
Except where otherwise noted by inclusion of a source URL or a copyright notice, the contents of this presentation are © by the Trustees of Indiana University. This content is released under the Creative Commons Attribution 3.0 Unported license. This license includes the following terms:
You are free to share – to copy, distribute and transmit the work – and to remix – to adapt the work – under the following conditions:
Attribution – you must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work).
For any reuse or distribution, you must make clear to others the license terms of this work.

3 IU in a nutshell
$2B annual budget
One university with 8 campuses
90,000 students
3,900 faculty
878 degree programs
Nation’s 2nd largest school of medicine
>$100M annual IT budget

4 OVPIT Michael A. McRobbie Vice President for Research and Information Technology, Chief Information Officer

5 Research & Academic Computing Division of UITS
Craig A. Stewart, Associate Vice President (Acting); Chief Operational Officer, Pervasive Technology Labs
Thomas Hacker, Associate Director
Eric Wernert, Associate Director
Scott McCaulay, TeraGrid Site Lead
Leigh Grundhoeffer, OSG Site Lead
Research & Technical Services
Distributed Storage Systems Group
Stat/Math Center
Digital Library POC
Advanced Visualization Lab
High Performance Computing Support Team
Unix System Support Group
Center for Computational Cytomics

6

7 IBM Research SP (Aries/Orion Complex)
632 CPUs, TeraFLOPS
First university-owned supercomputer in the US to exceed 1 TFLOPS aggregate peak theoretical processing capacity
Geographically distributed at IUB and IUPUI
Initially 50th on the Top 500
Photo: Tyagan Miller. May be reused by IU for noncommercial purposes. To license for commercial use, contact the photographer.

8 AVIDD: Analysis and Visualization of Instrument-Driven Data
Distributed Linux cluster in three locations: IUN, IUPUI, IUB
TFLOPS (peak theoretical), 0.5 TB RAM, 10 TB disk
First distributed Linux cluster to achieve > 1 TFLOPS on the Linpack benchmark
Small cluster at IU Northwest, as well as other components of the system, funded through SUR grant

9 Massive Data Storage System
Reliable and robust HPSS (High Performance Storage System)
Automatic replication of data between Indianapolis and Bloomington, via I-light
180 TB capacity with existing tapes; total capacity of 2.4 PB
>100 TB currently in use; >5 TB for biomedical data
Photo: Tyagan Miller. May be reused by IU for noncommercial purposes. To license for commercial use, contact the photographer.

10 John-E-Box Design licensed to central Indiana manufacturer

11

12 IU-IBM history with collaborative research
1998 – SUR grant: Linear System Analyzer
1999 – SUR grant: Infrastructure for deep computing: High Performance Computing component technologies and transparent, unlimited I/O in support of data-intensive research
2000 – SUR grant: Tightly integrated distributed supercomputing - the Indiana TeraCloud
2000 – Announcement of Indiana Genomics Initiative by IU - $105M grant from Lilly Endowment, Inc.
SUR grant: Information Technology Applications for the Life Sciences
$5M investment in IU’s IBM SP, expansion to more than 1 TFLOPS
2002/3 – SUR grant: Grid and Data Intensive Computing in the Life Sciences
2003 – AVIDD cluster becomes first distributed cluster to achieve > 1 TFLOPS on HPLinpack
2003 – IU named IBM Institute of Innovation
2003 – HPC Challenge Award at SC
Announcement of Indiana METACyt Initiative by IU - $53M grant from Lilly Endowment, Inc.
Spirit of Philanthropy Award given by OVPIT to IBM, Inc.
Proposals and reports available online at

13 A sampling of the results of SUR grants
Open source software distributed by IU:
– Linear Systems Analyzer
– PINY_MD
– Parallel fastDNAml
– GeneIndex
– AIX Batch Scripts
– Reference site for IBM Information Integrator; wrappers for II contributed as open source
– (see for more information)

14 Intellectual Results Hundreds of publications based on SUR grants and use of IBM equipment

15 TeraGrid

16 IU in major Grid & Networking Projects
iVDGL, ATLAS, OSG, CCA, web services, NMI
Participant in three HPC Challenge projects (I-Way in ’95, international grid for mold-filling simulation in ’98, Global Grid for analysis of arthropod evolution in 2003)
GlobalNOC - operations for I2/Abilene, NLR, TransPAC, etc.
Statewide networking and grid (I-light & IP-grid)

17 TeraGrid Network Connection

18 Infrastructure Contributions
Common infrastructure contributions
– AVIDD Linux cluster: 2 TFLOPS overall; 32P Itanium2, 384P Prestonia; 412 GB RAM; ~10 TB disk
– IBM SP: 1 TFLOPS (Power4 Regatta & Power3+)
– Significant expansion planned
– Storage: 32 TB spinning disk; 150 TB (initially) of IU’s HPSS system, with capacity to 2.4 PB
Unique contributions
– MD-GRAPE
– SMBL/Condor pool
– Barco MoVE Lite
– CAVE
– John-E-Boxes

19 Data Resources
Life Sciences
– CLSD (Centralized Life Science Data Service): integrated queries to an assortment of life science data sets, including many of the most widely used public biomedical datasets and a few less common ones
– Flybase and EuGenes: IU-curated data sources
– Portal development and building a “critical mass” of data resources accessible via the TeraGrid are a key contribution
Crystallography
GIS data

20 TeraGrid DEEP
LEAD (Linked Environments for Atmospheric Discovery; Dennis Gannon, IU Lead)
– Goal: create an integrated, scalable framework that lets analysis tools, forecast models, and data repositories be used as dynamically adaptive, on-demand systems to predict tornadoes
– Real-time control of sensors informed by current weather
– Requires TeraGrid
– Demonstrated at SC2004
[Diagram labels: Factory; myLEAD “agent” instance; WRF model; data mining task; workflow; myLEAD service; Storage Repository Service; myLEAD portlet as component of LEAD portal; IU-TG; NCSA-TG; /var/tmp/wrf_tmp]
Service Crystallography at Advanced Photon Sources (SCRaPS)
– Provides remote instrument access & real-time collaboration
– Uses XPORT platform
– Step toward NSF-funded CIMA

21 TeraGrid Wide
Helping to Create a 21st Century Workforce
– Inspiring and educating children before high school!
– Embracing and encouraging diversity through active outreach (e.g. Grace Hopper Celebration sponsor)
– Education and training: internship program
Broadening TeraGrid User Base
– Enabling Life Science applications (Nov 2004 CACM)
– Identifying and grid-enabling existing applications
– Creating new applications
– TeraGrid-oriented tutorials, education, & online support

22 Data Capacitor

23 Data behaves as an incompressible fluid

24 Roles of Data Capacitor
Catching the data deluge
Parallel high-speed I/O of files transported in serial
Temporary storage between different parts of a workflow
Highly reliable metadata services for online data grids

25 Collaborative Life Science Projects

26 Indiana Genomics Initiative
$105M grant to Indiana University from Lilly Endowment, Inc.
Announced December 2000
Created new Programs and Cores within IU School of Medicine
Followed by $50M additional for building
Runs until 2008; has transformed IU School of Medicine
INGEN’s IT core is a critical part of the infrastructure for the initiative as a whole
– Supercomputing
– Massive Data Storage
– Visualization

27

28 Example INGEN IT development projects
Parallel techniques for rendering PET scan images
Protein Family Annotator (possible IBM partnership project)
Computational phylogenetics
Biologist’s Portal
DiscoveryLink as a means to interconnect IU School of Medicine databases (possible IBM partnership project)

29 IU-IBM Life Sciences MOU
Significant portions of INGEN funding expended on IBM technology for supercomputing and massive data storage
Implementation of IBM Information Integrator at IU: early demonstration of success, cornerstone of our data strategy going forward
Many collaborative information dissemination activities

30 Hereditary Diseases and Family Studies Division, Dept. of Medical and Molecular Genetics, IU School of Medicine. Supported in part by NIH R01 NS37167.

31 Hereditary Diseases and Family Studies Division, Dept. of Medical and Molecular Genetics, IU School of Medicine. Supported in part by NIH R01 NS37167.

32 HPC Challenge 2003
Are hexapods (animals with six legs) a single evolutionary group?
Are ecdysozoans (animals that shed their skins) a single evolutionary group?

33

34

35 GleiderfüsslerGrid

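36 The Metacomputers
Metacomputer One
– SGI Origin, CEBPA (Spain)
– Linux cluster, 64 processors, AIST (Japan)
– Linux cluster, 12 processors, ANU (Australia)
Metacomputer Two
– T3E, 128 processors, HLRS (Germany)
– IBM SP, 64 processors, IUB (US)
– DEC Alpha, 4 processors, USP (Brazil)
– Sunfire, NUS (Singapore)
Metacomputer Three
– Hitachi SR, Germany
– Cray T3E, 128 processors, MCC (UK)
– Cray T3E, 32 processors, PSC (US)
– IBM SP (Blue Horizon), 32 processors, SDSC (US)
Metacomputer Four
– DEC Alpha (Lemieux), 64 processors, PSC (US)
Metacomputer Five
– Linux system, 1 processor, ISET’com (Tunisia)
Totals: 8 types of systems (several on Top500 list & TeraGrid); 6+ vendors; 641 processors; 9 countries; 6 continents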

37 Results of one run

38 IBM Life Sciences Institute of Innovation Biocomplexity Institute Center for Cell and Virus Theory

39 Biocomplexity Institute
Department of Physics
James Glazier, Director
Compucell: using methods of statistical physics to study complex tissues
– Cell diffusion, sorting and adhesion
18 other affiliated faculty, 3 post-docs [incl. Debasis Dan, IBM Institute of Innovation Research Associate], 3 students and 4 staff [incl. Maciej Swat, Compucell Developer]
Fang Liu, IBM Institute of Innovation Graduate Assistant

40 Cellular Potts Model
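
The slide gives only the model's name. As a rough illustration, the sketch below shows one Metropolis copy attempt in a generic two-dimensional Cellular Potts Model with an adhesion term and a volume constraint. It is not CompuCell code: the function names, the parameters `j`, `lam`, and `temperature`, the periodic boundary, and the convention that cell ID 0 is unconstrained medium are all assumptions made for this example.

```python
import math
import random


def neighbors(x, y, n):
    """Four nearest neighbors of (x, y) on an n x n periodic lattice."""
    return [((x + 1) % n, y), ((x - 1) % n, y), (x, (y + 1) % n), (x, (y - 1) % n)]


def adhesion_energy(grid, x, y, value, j):
    """Boundary energy at (x, y) if the site held `value`: j per unlike neighbor."""
    n = len(grid)
    return sum(j for nx, ny in neighbors(x, y, n) if grid[nx][ny] != value)


def metropolis_step(grid, volumes, target, j=1.0, lam=1.0, temperature=1.0, rng=random):
    """Try to copy a random neighbor's cell ID into a random site.

    grid    -- square list of lists of integer cell IDs (0 = medium, an assumption)
    volumes -- dict mapping every cell ID to its current number of sites
    Returns True if the copy was accepted.
    """
    n = len(grid)
    x, y = rng.randrange(n), rng.randrange(n)
    sx, sy = rng.choice(neighbors(x, y, n))
    old, new = grid[x][y], grid[sx][sy]
    if old == new:
        return False

    def volume_energy(cell, volume):
        # The medium (ID 0) has no volume constraint in this sketch.
        return 0.0 if cell == 0 else lam * (volume - target) ** 2

    # Only bonds touching (x, y) and the two affected volumes change.
    d_adhesion = adhesion_energy(grid, x, y, new, j) - adhesion_energy(grid, x, y, old, j)
    d_volume = (
        volume_energy(old, volumes[old] - 1) + volume_energy(new, volumes[new] + 1)
        - volume_energy(old, volumes[old]) - volume_energy(new, volumes[new])
    )
    d_energy = d_adhesion + d_volume

    # Metropolis rule: always accept downhill moves, uphill with Boltzmann probability.
    if d_energy <= 0 or rng.random() < math.exp(-d_energy / temperature):
        grid[x][y] = new
        volumes[old] -= 1
        volumes[new] += 1
        return True
    return False
```

Calling `metropolis_step` repeatedly on a small grid seeded with one or two cells lets boundaries fluctuate while the volume term pulls each cell back toward `target` sites.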

41 Center for Cell and Virus Theory
Department of Chemistry
Peter Ortoleva, Director
Karyote: a genomic, proteomic, metabolic cell simulator based on the numerical solution of a set of reaction, transport and mechanical equations underlying cell dynamics
Information theory used to integrate models with data for automated model development and calibration
2 Research Associates [incl. Kagan Tuncay, IBM Institute of Innovation Associate Scientist], 9 students and 6 staff [incl. Maciej Swat, Compucell Developer]
Yu Ma, IBM Institute of Innovation Graduate Assistant

42 Center for Cell and Virus Theory
KAGAN (KAryote Genome ANalyzer) is a software package that takes raw time-series microarray data and the list of factors that regulate each gene, and yields the timecourse of thermodynamic activities within the nucleus or prokaryotic cell.
Figure: comparison of the predicted transcription factor timecourses with the actual ones. The solid line is the actual timecourse. Square, diamond and hollow circle markers represent the solutions obtained using the actual control network, 36 additional nonzero elements, and the full interaction matrix, respectively.

43 IBM Life Sciences Institute of Innovation: SBML Interoperability
Yu Ma and Fang Liu (III-funded GAs)
SBML Generator: generates an SBML (Systems Biology Markup Language) description of the metabolic pathways for a given organism, harvested from public resources such as the KEGG (Kyoto Encyclopedia of Genes and Genomes), LIGAND and PATHWAY databases.
– Includes a LIGAND database parser that maps compound, enzyme and reaction IDs to the set of information that Karyote uses, and outputs the mapping results in XML format.
– Queries KEGG through its WSDL API and looks up the parsed LIGAND database to generate the complete metabolic pathway information in an SBML file that conforms to the local extension of the SBML standard.
BSMLParser: takes the KEGG gene database file for a given organism as input and outputs Karyote-related gene information in BSML (Bioinformatic Sequence Markup Language) format.
Transfac Loader: populates the Karyote database with gene regulation information from the TRANSFAC database.
SBMLConverter: component within CompuCell3D that parses the configuration file and outputs CenterOfMass, Surface, Volume, and CellType data during runtime.
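
To make the SBML Generator's output step more concrete, here is a minimal sketch of writing a reaction list out as an SBML-style XML document using only Python's standard library. It is not the IU tool: `build_sbml`, the single `cell` compartment, and the IDs `cpd_A`, `cpd_B`, `cpd_C`, and `rxn_1` are hypothetical, the input mapping stands in for whatever the LIGAND parser and KEGG queries would return, and the slide's "local extension" of SBML is not modeled. The skeleton follows the basic SBML Level 2 layout (compartments, species, reactions).

```python
import xml.etree.ElementTree as ET

# Namespace used by SBML Level 2 Version 1 documents.
SBML_NS = "http://www.sbml.org/sbml/level2"


def build_sbml(model_id, reactions):
    """Return an SBML-style XML string for a set of reactions.

    reactions -- dict mapping a reaction ID to a pair
                 (list of reactant species IDs, list of product species IDs).
                 This input shape is an assumption for the example.
    """
    sbml = ET.Element("sbml", {"xmlns": SBML_NS, "level": "2", "version": "1"})
    model = ET.SubElement(sbml, "model", {"id": model_id})

    # One hypothetical compartment holds every species.
    compartments = ET.SubElement(model, "listOfCompartments")
    ET.SubElement(compartments, "compartment", {"id": "cell"})

    # Declare each species once, in a stable order.
    species_ids = sorted({s for lhs, rhs in reactions.values() for s in lhs + rhs})
    species_list = ET.SubElement(model, "listOfSpecies")
    for species_id in species_ids:
        ET.SubElement(species_list, "species", {"id": species_id, "compartment": "cell"})

    # Emit each reaction with its reactant and product references.
    reaction_list = ET.SubElement(model, "listOfReactions")
    for reaction_id, (lhs, rhs) in reactions.items():
        reaction = ET.SubElement(reaction_list, "reaction", {"id": reaction_id})
        for tag, ids in (("listOfReactants", lhs), ("listOfProducts", rhs)):
            if ids:
                holder = ET.SubElement(reaction, tag)
                for species_id in ids:
                    ET.SubElement(holder, "speciesReference", {"species": species_id})

    return ET.tostring(sbml, encoding="unicode")


if __name__ == "__main__":
    # Placeholder IDs; a real run would use IDs mapped from the parsed databases.
    print(build_sbml("demo_pathway", {"rxn_1": (["cpd_A", "cpd_B"], ["cpd_C"])}))
```

Building the tree in memory and serializing once keeps the ID-mapping step separate from the output format, which is roughly the split the slide describes between the LIGAND parser and the SBML file generation.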

44 Indiana METACyt Initiative

45 METACyt: Indiana Metabolomics and Cytomics Initiative
$53M, 5-year project
~50% of resources into IU Bloomington research infrastructure (computing, greenhouses, transgenic mouse facility, 800 MHz NMR)
~50% into Research Nodes and Integrating Science and Technology Centers
METACyt Innovation Fund
Goal: self-sustainability based on grants after 5 years

46

47 Center for Computational Cytomics
Virtual center: collaboration of UITS/RAC and Center for Genomics and Bioinformatics
Will span the range from gene expression wet lab to cell models
Computational aspects will focus on:
– Phylogenetics
– Cell modeling
– Archiving, management, and curation of cell model data and results
– Creating an information architecture for METACyt

48 Pervasive Technology Labs

49

50 The overall climate
Economic/funding outlook poor in comparison to recent years
SOMEONE is winning grants
IU has solid track record
Long-time goal set for Indiana University: to be a leader, in absolute terms, in the development, deployment and use of information technology
President Herbert’s inaugural address (“Extending the Reach of Knowledge”): “It is essential that we continue to strengthen our research productivity. In addition to increased scholarly publications, works of art, concerts and other forms of creative scholarly activity, our goal must be to double Indiana University's externally funded research grants and contracts by the end of the decade.”

51 Questions and discussion?
For further information, see:
metacyt.indiana.edu
rac.uits.indiana.edu
pervasivetechnologylabs.iu.edu
i-light.org
ip-grid.org
research-indiana.org