INDIANA UNIVERSITY April 2002 Implementing advanced IT facilities for the Indiana Genomics Initiative Craig A. Stewart


1 INDIANA UNIVERSITY HPC@IDC April 2002 Implementing advanced IT facilities for the Indiana Genomics Initiative Craig A. Stewart stewart@iu.edu HPC@IDC meeting April 23-24, 2002, HPC User Forum meeting, Santa Fe, New Mexico

2 License terms Please cite as: Stewart, C.A. Implementing advanced IT facilities for the Indiana Genomics Initiative. 2002. Presentation. Presented at: HPC User Forum (Santa Fe, New Mexico, 23 Apr 2002). Available from: http://hdl.handle.net/2022/15220 Except where otherwise noted, by inclusion of a source url or some other note, the contents of this presentation are © by the Trustees of Indiana University. This content is released under the Creative Commons Attribution 3.0 Unported license (http://creativecommons.org/licenses/by/3.0/). This license includes the following terms: You are free to share – to copy, distribute and transmit the work and to remix – to adapt the work under the following conditions: attribution – you must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work). For any reuse or distribution, you must make clear to others the license terms of this work.

3 Indiana University’s Goals IT Goal: “To be a leader in absolute terms in information technology.” IU president Myles Brand, 1996 Goals of the Indiana Genomics Initiative: To advance understanding of life’s processes, develop new therapies for human diseases, improve the quality of human health in Indiana, and enhance the strength of the central Indiana high-tech economy

4 IU in a nutshell Founded in 1820 $2B Annual Budget 8 campuses Campuses well connected; esp. IUB, IUPUI, and Purdue’s campus at W. Lafayette connected by I-light IU Operates TransPAC, GlobalNOC

5 IT@IU in a nutshell Academic programs in IT through computer science, library and information sciences, engineering and technology, and most notably through new School of Informatics CIO: Vice President Michael A. McRobbie ~$100M annual budget Technology services offered university-wide Pervasive Technology Labs

6 School of Medicine in a nutshell 2nd largest School of Medicine in the US IU Cancer Center nationally recognized leader Regenstrief Institute longstanding leader in medical informatics National leader in optical and tomographic imaging Longstanding leader in genetically influenced diseases including Huntington’s (Conneally), alcoholism (Li); currently lead institution in national study of bipolar disorder

7 INGEN Created by $105M grant from the Lilly Endowment to Indiana University Involves IU School of Medicine (IUPUI), Departments of Biology and Chemistry (IUB), Center for Genomics and Proteomics (IUB), and University Information Technology Services Comprised of “Programs” (central research areas) and “Cores” (supporting units that are also generally research areas)


9 IT and INGEN INGEN’s IT core is a critical part of the infrastructure for the initiative as a whole –Networking (using I-light facility) –Supercomputing –Massive Data Storage –Visualization –Support IT is one of the paths by which INGEN should enhance the Indiana Economy

10 Supercomputing - Oct 17 IU/IBM announcement IU tripled the capacity of its IBM SP, to > 1 TFLOPS (a trillion mathematical operations per second). IU’s SP is very large when considered within the set of supercomputers owned by individual universities Large part of this acquisition made possible via funding from INGEN IU and IBM also announced a partnership in developing new supercomputer applications for the life sciences

11 Photo: Tyagan Miller. May be reused by IU for noncommercial purposes. To license for commercial use, contact the photographer

12 Sun E10000 IU is a Sun “Center of Excellence” and is pursuing collaborative research with Sun in the area of Chemical Informatics Photo: Tyagan Miller. May be reused by IU for noncommercial purposes. To license for commercial use, contact the photographer

13 AVIDD Analysis & Visualization of Instrument-Driven Data Large, distributed Intel-compatible Linux cluster Distributed data storage/data staging Distributed visualization Education a key component of this initiative – distributed education (IUB, IUPUI, IUN) taught via Access Grids at advanced undergrad/beginning grad level

14 Massive Data Storage IU operates a massive data storage system based on IBM and STK tape robotic systems. IU’s massive data storage system is based on HPSS (High Performance Storage System), which provides excellent security. >300 TB current capacity Mirrored storage in Indianapolis and Bloomington should provide safety in data storage IU was the first installation to implement remote HPSS movers over long haul networks

15 Photo: Tyagan Miller. May be reused by IU for noncommercial purposes. To license for commercial use, contact the photographer

16 Advanced Visualization UITS, IU School of Medicine, and IUPUI Computer & Information Science have already collaborated to create 3DIVE (3-D Interactive Volume Explorer) CAVE Immersadesk IU-designed passive 3D environments (4' sq screen, 5' sq footprint)

17 Accomplishments & Challenges Past accomplishments –fastDNAml –3DIVE Challenges –Broader engagement with life scientists –Data heterogeneity –New application areas

18 fastDNAml

19 Building Phylogenetic Trees Goal: an objective means by which phylogenetic trees can be estimated in tolerable amounts of wall-clock time, producing phylogenetic trees with measures of their uncertainty

20 Why is tree-building an HPC problem? The number of bifurcating unrooted trees for n taxa is $(2n-5)! / \{2^{n-3} (n-3)!\}$. For 100 taxa the number of possible trees is $\sim 10^{182}$
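
The tree-count formula on this slide can be checked numerically. The following is a minimal sketch, not part of the original presentation; the function names `unrooted_tree_count` and `order_of_magnitude` are illustrative assumptions. It evaluates the formula with exact integer arithmetic and reports the order of magnitude for a few values of n.

```python
from math import factorial


def unrooted_tree_count(n_taxa: int) -> int:
    """Count bifurcating unrooted trees for n_taxa labeled taxa.

    Uses the formula from the slide: (2n-5)! / (2^(n-3) * (n-3)!).
    The function name is hypothetical, chosen for this sketch.
    """
    if n_taxa < 3:
        raise ValueError("the formula applies for n_taxa >= 3")
    n = n_taxa
    # The division is exact, so integer floor division loses nothing.
    return factorial(2 * n - 5) // (2 ** (n - 3) * factorial(n - 3))


def order_of_magnitude(value: int) -> int:
    """Return the base-10 exponent of a positive integer (digit count minus 1)."""
    return len(str(value)) - 1


if __name__ == "__main__":
    for n in (4, 5, 10, 50, 100):
        count = unrooted_tree_count(n)
        print(f"{n:>3} taxa: on the order of 10^{order_of_magnitude(count)} trees")
```

Because the count grows faster than exponentially in n, exhaustive enumeration is out of reach well before 100 taxa, which is why tree-building relies on heuristic search and parallel computing.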

21 fastDNAml Developed by Gary Olsen Derived from Felsenstein’s PHYLIP programs One of the more commonly used ML methods The first phylogenetic software implemented in a parallel program (at Argonne National Laboratory, using P4 libraries) Olsen, G.J., et al. 1994. fastDNAml: a tool for construction of phylogenetic trees of DNA sequences using maximum likelihood. Computer Applications in the Biosciences 10: 41-48 MPI version available from IU now (development supported by IBM SUR grant)

22 Performance of fastDNAml

23 Current projects Data integration Gamma knife Pedigree analysis PET scan analysis Protein families AMASS – shotgun sequence assembly Data, data, data

24 HPC and life sciences HPC hardware and software market set to dramatically expand thanks to life sciences HPC and life sciences communities don’t share common language Biomedical researchers are no more conservative than anyone else Biomedical researchers not alone in creating bad code Both communities have lots to offer each other, but it seems at present up to the HPC community to reach out (when was the last time an astronomer saved your life?) HPC community has been slow to take advantage of opportunities offered via collaboration with life scientists This will be like the dot-com bust – sort of. The key question is: how great will be the similarities?

25 Challenges: creating collaborations with life scientists Need to challenge “I can do it on my desktop” mentality when appropriate Go for the low hanging fruit Remember that physics, astronomy, and other traditional HPC codes have a head start of many years Need to recognize the complexity of the life sciences

26 Current approaches @ IU Really clever batch scripts… then portals Appropriate documentation Door to door consulting Proof of concept projects Contributions to open source/community code efforts

27 Keys to success in partnerships @ IU Long history of openness, diversity in HPC uses Accountability and service philosophy Supercomputing time and programming support baseline services Central computing center staff hired from several disciplines (including biology) Computer scientists who actually care about applications History and a certain amount of luck

28 Summary IU has thus far been very successful in implementing advanced IT infrastructure for life scientists Reaching out has been essential to formation of partnerships Industry partnerships have been essential to success So far, so good…

29 Acknowledgements IBM research relationships & SUR grants Sun and Center of Excellence relationships Compaq relationship Computer scientists at IU (esp. Randall Bramley, Dennis Gannon, Shiaofen Fang) State of Indiana Lilly Endowment

30 Important URLs University Information Technology Services: www.indiana.edu/~uits/ UITS Research & Academic Computing Division: www.indiana.edu/~uits/rac InGen IT Core: www.indiana.edu/~rac/bioinformatics/ingen.html IU Teraflop SP announcement: www.indiana.edu/~rac/outreach.html IT@IU: it.iu.edu

