1 IBM – IU Recent Research Activities and Collaboration Opportunities Craig Stewart Associate Vice President for Research & Academic Computing (Acting)

1 1 IBM – IU Recent Research Activities and Collaboration Opportunities Craig Stewart Associate Vice President for Research & Academic Computing (Acting) Chief Operational Officer, Pervasive Technology Labs 18 May 2005 ©Trustees of Indiana University 2005

2 License Terms Please cite as Stewart, C.A. IBM – IU Recent Research Activities and Collaboration Opportunities. 2005. Presentation. (Bloomington, IN, 18 May 2005). Available from: http://hdl.handle.net/2022/14774 Except where otherwise noted by inclusion of a source url or a copyright notice, the contents of this presentation are © by the Trustees of Indiana University. This content is released under the Creative Commons Attribution 3.0 Unported license (http://creativecommons.org/licenses/by/3.0/). This license includes the following terms: You are free to share – to copy, distribute and transmit the work and to remix – to adapt the work under the following conditions: attribution – you must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work). For any reuse or distribution, you must make clear to others the license terms of this work.

3 3 IU in a nutshell $2B Annual Budget One university with 8 campuses 90,000 students 3,900 faculty 878 degree programs Nation’s 2nd largest school of medicine >$100M annual IT budget

4 4 OVPIT Michael A. McRobbie Vice President for Research and Information Technology, Chief Information Officer

5 5 Research & Academic Computing Division of UITS
Craig A. Stewart, Associate Vice President (Acting), Chief Operational Officer, Pervasive Technology Labs
Thomas Hacker, Associate Director
Eric Wernert, Associate Director
Scott McCaulay, TeraGrid Site Lead
Leigh Grundhoeffer, OSG Site Lead
Research & Technical Services
Distributed Storage Systems Group
Stat/Math Center
Digital Library POC
Advanced Visualization Lab
High Performance Computing Support Team
Unix System Support Group
Center for Computational Cytomics

6 6

7 7 IBM Research SP (Aries/Orion Complex) 632 cpus, 1.005 TeraFLOPS. First University-owned supercomputer in US to exceed 1 TFLOPS aggregate peak theoretical processing capacity. Geographically distributed at IUB and IUPUI. Initially 50th on Top 500. Photo: Tyagan Miller. May be reused by IU for noncommercial purposes. To license for commercial use, contact the photographer

8 8 AVIDD Analysis and Visualization of Instrument-Driven Data Distributed Linux cluster. Three locations: IUN, IUPUI, IUB 2.164 TFLOPS (peak theoretical), 0.5 TB RAM, 10 TB Disk First distributed Linux cluster to achieve > 1 TFLOPS on Linpack benchmark Small cluster at IU Northwest, as well as other components of system, funded through SUR grant

9 9 Massive Data Storage System Reliable and robust HPSS (High Performance Storage System) Automatic replication of data between Indianapolis and Bloomington, via I-light. 180 TB capacity with existing tapes; total capacity of 2.4 PB. >100 TB currently in use; >5 TB for biomedical data Photo: Tyagan Miller. May be reused by IU for noncommercial purposes. To license for commercial use, contact the photographer

10 10 John-E-Box Design licensed to central Indiana manufacturer

11 11

12 12 IU-IBM history with collaborative research
1998 – SUR grant: Linear System Analyzer
1999 – SUR grant: Infrastructure for deep computing: High Performance Computing component technologies and transparent, unlimited I/O in support of data-intensive research
2000 – SUR grant: Tightly integrated distributed supercomputing - the Indiana TeraCloud
2000 – Announcement of Indiana Genomics Initiative by IU - $105M grant from Lilly Endowment, Inc.
2001 – SUR grant: Information Technology Applications for the Life Sciences
2001 – $5M investment in IU’s IBM SP, expansion to more than 1 TFLOPS
2002/3 – SUR grant: Grid and Data Intensive Computing in the Life Sciences
2003 – AVIDD cluster first distributed cluster to achieve > 1 TFLOPS on HPLinpack
2003 – IU named IBM Institute of Innovation
2003 – HPC Challenge Award at SC2003
2004 – Announcement of Indiana METACyt Initiative by IU - $53M grant from Lilly Endowment, Inc.
2005 – Spirit of Philanthropy Award given by OVPIT to IBM, Inc.
Proposals and reports available online at http://www.indiana.edu/~rac/ibm/

13 13 A sampling of the results of SUR grants Open source software distributed by IU: –Linear Systems Analyzer –PINY_MD –Parallel fastDNAml –GeneIndex –AIX Batch Scripts –Reference site for IBM Information Integrator, wrappers for II contributed as open source –(see http://rac.uits.iu.edu/ for more information)

14 14 Intellectual Results http://racinfo.uits.iu.edu/publications/ Hundreds of publications based on SUR grants and use of IBM equipment

15 15 TeraGrid

16 16 IU in major Grid & Networking Projects iVDGL, ATLAS, OSG, CCA, web services, NMI. Participant in three HPC Challenge projects (I-Way in ’95, International grid for mold-filling simulation in ’98, Global Grid for analysis of arthropod evolution in 2003) GlobalNOC - operations for I2/Abilene, NLR, TransPAC, etc. Statewide networking and grid (I-light & IP-grid)

17 17 TeraGrid Network Connection

18 18 Infrastructure Contributions
Common Infrastructure Contributions
–AVIDD – Linux Cluster: 2 TFLOPS overall; 32P Itanium2, 384P Prestonia; 412 GB RAM; ~10 TB disk
–IBM SP: 1 TFLOPS (Power4 Regatta & Power3+)
–Significant expansion planned
–Storage: 32 TB spinning disk; 150 TB (initially) of IU’s HPSS system, with capacity to 2.4 PB
Unique contributions
–MD-GRAPE
–SMBL/Condor pool
–Barco MoVE Lite
–CAVE
–John-E-Boxes

19 19 Data Resources
Life Sciences
–CLSD (Centralized Life Science Data Service) – integrated queries to an assortment of life science data sets, including many of the most widely used public biomedical datasets and a few less common
–Flybase and EuGenes – IU-curated data sources
–Portal development and building “critical mass” of data resources accessible via the TeraGrid, a key contribution
Crystallography
GIS data

20 20 TeraGrid DEEP
LEAD (Linked Environments for Atmospheric Discovery – Dennis Gannon, IU Lead)
–Goal: create an integrated, scalable framework that lets analysis tools, forecast models, and data repositories be used as dynamically adaptive, on-demand systems to predict tornadoes.
–Real-time control of sensors informed by current weather
–Requires TeraGrid
–Demonstrated at SC2004
[Diagram labels: Factory; myLEAD “agent” instance; WRF model; data mining task; workflow; myLEAD service; Storage Repository Service; myLEAD portlet as component of LEAD portal; IU-TG; NCSA-TG; /var/tmp/wrf_tmp]
Service Crystallography at Advanced Photon Sources (SCRaPS)
Provides remote instrument access & real-time collaboration
Uses XPORT platform
Step toward NSF-funded CIMA

21 21 TeraGrid Wide Helping to Create a 21st Century Workforce –Inspiring and educating children before high school! –Embracing and encouraging diversity through active outreach (e.g. Grace Hopper Celebration sponsor) –Education and training – internship program Broadening TeraGrid User Base –Enabling Life Science applications (Nov 2004 CACM) –Identifying and grid-enabling existing applications –Creating new applications –TeraGrid–oriented tutorials, education, & online support

22 22 Data Capacitor

23 23 Data behaves as an incompressible fluid

24 24 Roles of Data Capacitor Catching the data deluge Parallel high speed I/O of files transported in serial Temporary storage between different parts of a work flow Highly reliable metadata services for online data grids

25 25 Collaborative Life Science Projects

26 26 Indiana Genomics Initiative $105M Grant to Indiana University from Lilly Endowment, Inc. Announced December 2000 Created new Programs and Cores within IU School of Medicine Followed by $50M additional for building Runs till 2008; has transformed IU School of Medicine InGen’s IT core is a critical part of the infrastructure for the initiative as a whole –Supercomputing –Massive Data Storage –Visualization

27 27

28 28 Example INGEN IT development projects Parallel techniques for rendering PET scan images Protein Family Annotator (Possible IBM partnership project) Computational phylogenetics Biologist’s Portal DiscoveryLink as a means to interconnect IU School of Medicine databases (Possible IBM partnership project)

29 29 IU-IBM Life Sciences MOU Significant portions of INGEN funding expended on IBM technology for supercomputing and massive data storage Implementation of IBM Information Integrator at IU – early demonstration of success, cornerstone of our data strategy going forward Many collaborative information dissemination activities

30 30 Hereditary Diseases and Family Studies Division, Dept. of Medical and Molecular Genetics, IU School of Medicine. Supported in part by NIH R01 NS37167.

31 31 Hereditary Diseases and Family Studies Division, Dept. of Medical and Molecular Genetics, IU School of Medicine. Supported in part by NIH R01 NS37167.

32 32 HPC Challenge 2003 Are Hexapods (animals with six legs) a single evolutionary group? Are ecdysozoans (animals that shed their skins) a single evolutionary group?

33 33

34 34

35 35 GleiderfüsslerGrid

36 36 The Metacomputers
Metacomputer | System | Processors | Site
One | SGI Origin 2000 | 32 | CEBPA (Spain)
One | Linux cluster | 64 | AIST (Japan)
One | Linux cluster | 12 | ANU (Australia)
Two | T3E | 128 | HLRS (Germany)
Two | IBM SP | 64 | IUB (US)
Two | Dec Alpha | 4 | USP (Brazil)
Two | Sunfire 6800 | 16 | NUS (Singapore)
Three | Hitachi SR8000 | 32 | Germany
Three | Cray T3E | 128 | MCC (UK)
Three | Cray T3E | 32 | PSC (US)
Three | IBM SP (Blue Horizon) | 32 | SDSC (US)
Four | Dec Alpha (Lemieux) | 64 | PSC (US)
Five | Linux system | 1 | ISET’com (Tunisia)
8 types of systems (several on Top500 list & TeraGrid); 6+ vendors; 641 processors; 9 countries; 6 continents

37 37 Results of one run

38 38 IBM Life Sciences Institute of Innovation Biocomplexity Institute Center for Cell and Virus Theory

39 39 Biocomplexity Institute Department of Physics James Glazier, Director Compucell: using methods of statistical physics to study Complex Tissues –Cell Diffusion, Sorting and Adhesion 18 other affiliated faculty, 3 post-docs [incl Debasis Dan, IBM Inst of Innovation Research Associate], 3 students and 4 staff [incl Maciej Swat, Compucell Developer] Fang Liu, IBM Institute of Innovation Graduate Assistant

40 40 Cellular Potts Model http://www.nd.edu/~lcls/compucell/examples/example1.htm#animpics
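As a rough illustration of the kind of model named on this slide, the following is a minimal sketch of a textbook-style Cellular Potts Model update on a small periodic 2D lattice. It is not CompuCell's implementation. The function names, the parameters `J` (adhesion cost per unlike neighbor pair), `lam` (volume-constraint strength), `target` (target cell volume) and `T` (fluctuation temperature), and the choice to recompute the full energy on every attempt are all assumptions made for brevity.

```python
import math
import random

MEDIUM = 0  # lattice value meaning "no cell here"


def energy(grid, cells, J, lam, target):
    """Total energy: adhesion over unlike neighbor pairs plus a volume penalty."""
    n = len(grid)
    volume = {c: 0 for c in cells}
    bonds = 0
    for x in range(n):
        for y in range(n):
            s = grid[x][y]
            if s != MEDIUM:
                volume[s] += 1
            # Looking only right and down counts each neighbor pair once.
            for nx, ny in (((x + 1) % n, y), (x, (y + 1) % n)):
                if grid[nx][ny] != s:
                    bonds += 1
    return J * bonds + lam * sum((v - target) ** 2 for v in volume.values())


def attempt_copy(grid, cells, J, lam, target, T, rng):
    """One Metropolis attempt: copy a random neighbor's value into a random site."""
    n = len(grid)
    x, y = rng.randrange(n), rng.randrange(n)
    dx, dy = rng.choice(((1, 0), (-1, 0), (0, 1), (0, -1)))
    source = grid[(x + dx) % n][(y + dy) % n]
    old = grid[x][y]
    if source == old:
        return False
    # Recomputing the whole energy is slow but keeps the sketch short.
    before = energy(grid, cells, J, lam, target)
    grid[x][y] = source
    delta = energy(grid, cells, J, lam, target) - before
    if delta <= 0 or rng.random() < math.exp(-delta / T):
        return True
    grid[x][y] = old  # rejected: restore the previous value
    return False


# Hypothetical usage: two 8-site cells side by side on a 6x6 lattice.
rng = random.Random(0)
grid = [[MEDIUM] * 6 for _ in range(6)]
for y in range(1, 5):
    grid[1][y] = grid[2][y] = 1
    grid[3][y] = grid[4][y] = 2
for _ in range(500):
    attempt_copy(grid, (1, 2), J=1.0, lam=0.5, target=8, T=1.0, rng=rng)
```

A production simulator would compute the energy change locally around the modified site and would typically use cell-type-dependent adhesion values.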

41 41 Center for Cell and Virus Theory Department of Chemistry Peter Ortoleva, Director Karyote: a genomic, proteomic, metabolic cell simulator based on the numerical solution of a set of reaction, transport and mechanical equations underlying cell dynamics. Information theory used to integrate models with data for automated model development, calibration 2 Research Associates [incl Kagan Tuncay, IBM Institute of Innovation Associate Scientist], 9 students and 6 staff [incl Maciej Swat, Compucell Developer] Yu Ma, IBM Institute of Innovation Graduate Assistant

42 42 Center for Cell and Virus Theory KAGAN (KAryote Genome ANalyzer) is a software package that receives raw time series microarray data, the list of factors that regulate each gene, and yields the timecourse of thermodynamic activities within the nucleus or prokaryotic cell. Comparison of the predicted transcription factor timecourses with the actual ones. The solid line is the actual timecourse. Square, diamond and hollow circle markers represent the solution obtained using the actual control network, 36 additional nonzero elements, and full interaction matrix, respectively.

43 43 IBM Life Sciences Institute of Innovation SBML Interoperability
Yu Ma and Fang Liu (III-funded GAs)
SBML Generator: generates an SBML (Systems Biology Markup Language) description of the metabolic pathways for a given organism, harvested from public resources such as the KEGG (Kyoto Encyclopedia of Genes and Genomes), LIGAND and PATHWAY databases.
–Includes a LIGAND database parser that maps compound, enzyme and reaction IDs to the information Karyote requires, and outputs the mapping results in XML format.
–Queries KEGG through its WSDL API and looks up the parsed LIGAND database to generate the complete metabolic pathway information in an SBML file that conforms to the local extension of the SBML standard.
BSMLParser: takes the KEGG gene database file for a given organism as input and outputs Karyote-related gene information in BSML (Bioinformatic Sequence Markup Language) format.
Transfac Loader: populates the Karyote database with gene regulation information from the TRANSFAC database.
SBMLConverter: component within CompuCell3D that parses the configuration file and outputs CenterOfMass, Surface, Volume, and CellType data during runtime.
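The mapping step described for the SBML Generator can be pictured with a minimal sketch, assuming the compound and reaction records have already been parsed into dictionaries. The record layout, the IDs (`CPD1`, `RXN1`, `ENZ1`), the single `cell` compartment, and the function name `build_sbml` are hypothetical. The output uses plain SBML Level 2 element names and does not attempt the local SBML extension mentioned on the slide, nor is it validated against the SBML schema.

```python
import xml.etree.ElementTree as ET

SBML_NS = "http://www.sbml.org/sbml/level2"

# Hypothetical parsed records; real database entries carry many more fields.
COMPOUNDS = {"CPD1": "substrate A", "CPD2": "product B"}
REACTIONS = {
    "RXN1": {"enzyme": "ENZ1", "reactants": ["CPD1"], "products": ["CPD2"]},
}


def build_sbml(model_id, compounds, reactions):
    """Return an SBML-like XML string for the given compounds and reactions."""
    sbml = ET.Element("sbml", {"xmlns": SBML_NS, "level": "2", "version": "1"})
    model = ET.SubElement(sbml, "model", {"id": model_id})

    compartments = ET.SubElement(model, "listOfCompartments")
    ET.SubElement(compartments, "compartment", {"id": "cell"})

    species = ET.SubElement(model, "listOfSpecies")
    for cid, name in sorted(compounds.items()):
        ET.SubElement(
            species, "species", {"id": cid, "name": name, "compartment": "cell"}
        )

    rxn_list = ET.SubElement(model, "listOfReactions")
    for rid, rec in sorted(reactions.items()):
        # The enzyme ID is stored as the reaction name purely for illustration.
        rxn = ET.SubElement(rxn_list, "reaction", {"id": rid, "name": rec["enzyme"]})
        for tag, key in (("listOfReactants", "reactants"),
                         ("listOfProducts", "products")):
            refs = ET.SubElement(rxn, tag)
            for cid in rec[key]:
                if cid not in compounds:
                    raise KeyError(f"reaction {rid} references unknown compound {cid}")
                ET.SubElement(refs, "speciesReference", {"species": cid})

    return ET.tostring(sbml, encoding="unicode")


xml_text = build_sbml("demo_model", COMPOUNDS, REACTIONS)
```

Keeping the ID lookup separate from the XML writing mirrors the slide's split between the database parser and the file generator, and makes a missing compound fail loudly instead of producing a dangling reference.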

44 44 Indiana METACyt Initiative

45 45 METACyt Indiana Metabolomics and Cytomics Initiative $53M, 5-year project ~50% of resources into IU Bloomington research infrastructure (computing, greenhouses, transgenic mouse facility, 800 MHz NMR) ~50% into Research Nodes and Integrating Science and Technology Centers METACyt Innovation Fund Goal: self-sustainability based on grants after 5 years

46 46

47 47 Center for Computational Cytomics Virtual Center – collaboration of UITS/RAC and Center for Genomics and Bioinformatics Will span range from gene expression wet lab to cell models Computational aspects will focus on: –Phylogenetics –Cell modeling –Archiving, management, and curation of cell model data and results –Creating an information architecture for METACyt

48 48 Pervasive Technology Labs

49 49

50 50 The overall climate Economic/funding outlook poor in comparison to recent years SOMEONE is winning grants IU has solid track record Long-time goal set for Indiana University: to be a leader, in absolute terms, in the development, deployment and use of information technology President Herbert’s inaugural address (“Extending the Reach of Knowledge”): “It is essential that we continue to strengthen our research productivity. In addition to increased scholarly publications, works of art, concerts and other forms of creative scholarly activity, our goal must be to double Indiana University's externally funded research grants and contracts by the end of the decade."

51 51 Questions and discussion? For further information, see: www.indiana.edu/~rac/ibm/ metacyt.indiana.edu rac.uits.indiana.edu pervasivetechnologylabs.iu.edu i-light.org Ip-grid.org research-indiana.org

