SDSC RP Update
TeraGrid Roundtable, January 14, 2010
Reviewing Dash

Unique characteristics:
–A pre-production/evaluation "data-intensive" supercomputer based on SSD flash memory and virtual shared memory
–Nehalem processors

Integrating into TeraGrid:
–Add to the TeraGrid Resource Catalog
–Target friendly users interested in exploring its unique capabilities
–Available initially for start-up allocations (March 2010)
–As the system stabilizes, and depending on user interest, evaluate more routine allocations at the TRAC level
–Appropriate CTSS kits will be installed
–Planned to support TeraGrid wide-area filesystem efforts (GPFS-WAN, Lustre-WAN)
Introducing Gordon (SDSC's Track 2d System)

Unique characteristics:
–A "data-intensive" supercomputer based on SSD flash memory and virtual shared memory; emphasizes memory and I/O over FLOPS
–Designed to accelerate access to the massive databases being generated in all fields of science, engineering, medicine, and social science
–Sandy Bridge processors

Integrating into TeraGrid:
–Will be added to the TeraGrid Resource Catalog
–Appropriate CTSS kits will be installed
–Planned to support TeraGrid wide-area filesystem efforts
–Coming summer 2011
The Memory Hierarchy

Flash SSD sits between DRAM and spinning disk in the hierarchy: O(TB) capacity at roughly 1,000-cycle access latency.
Potential 10x speedup for random I/O to large files and databases.
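The speedup claim above follows from per-access latency: small random reads are latency-bound, so a medium with lower access latency wins roughly in proportion. A minimal sketch, using hypothetical order-of-magnitude latency figures (the specific numbers are illustrative assumptions, not measurements from Dash or Gordon):

```python
# Illustrative per-access latencies (assumed, order-of-magnitude only):
# DRAM ~100 ns, flash SSD ~100 us, spinning-disk seek ~10 ms.
LATENCY_S = {"dram": 100e-9, "flash_ssd": 100e-6, "disk": 10e-3}

def random_read_time(n_reads: int, medium: str) -> float:
    """Total time for small random reads, dominated by per-access latency."""
    return n_reads * LATENCY_S[medium]

# One million 4 KB random reads:
t_disk = random_read_time(1_000_000, "disk")       # ~10,000 s
t_flash = random_read_time(1_000_000, "flash_ssd")  # ~100 s
speedup = t_disk / t_flash  # ~100x in this idealized model
```

Real workloads mix random and sequential access and pay software overheads, which is why the slide quotes a more conservative 10x for random I/O.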
Gordon Architecture: "Supernode"

32 Appro Extreme-X compute nodes:
–Dual-processor Intel Sandy Bridge
–240 GFLOPS and 64 GB RAM per node

2 Appro Extreme-X I/O nodes:
–Intel SSD drives, 4 TB each
–560,000 IOPS

ScaleMP vSMP virtual shared memory:
–2 TB RAM aggregate
–8 TB SSD aggregate

[Diagram: compute nodes (240 GF, 64 GB RAM each) and I/O nodes (4 TB SSD each) joined by vSMP memory virtualization]
Gordon Architecture: Full Machine

32 supernodes = 1,024 compute nodes
Dual-rail QDR InfiniBand network:
–3D torus (4x4x4)
4 PB rotating-disk parallel file system:
–>100 GB/s
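The full-machine figures follow directly from the per-supernode numbers on the previous slide. A back-of-the-envelope check (all inputs are the slide's own figures):

```python
# Per-supernode figures from the "Supernode" slide:
# 32 compute nodes at 240 GFLOPS / 64 GB DRAM each,
# 2 I/O nodes at 4 TB of flash each.
NODES_PER_SUPERNODE = 32
SUPERNODES = 32

compute_nodes = NODES_PER_SUPERNODE * SUPERNODES   # 1024 nodes
peak_tflops = compute_nodes * 240 / 1000           # ~245 TFLOPS
dram_tb = compute_nodes * 64 / 1024                # 64 TB of DRAM
flash_tb = SUPERNODES * 2 * 4                      # 256 TB of flash
```

These aggregates match the Dash/Gordon comparison table that follows (1,024 nodes, 245 TFLOPS peak, 64 TB DRAM, 256 TB flash).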
Comparing Dash and Gordon Systems

Note: doubling capacity halves accessibility to any random data on a given medium.

System Component                              Dash                       Gordon
Node characteristics (sockets, cores, DRAM)   2 sockets, 8 cores, 48 GB  2 sockets, TBD cores, 64 GB
Compute nodes (#)                             64                         1,024
Processor type                                Nehalem                    Sandy Bridge
Clock speed (GHz)                             2.4                        TBD
Peak speed (TFLOPS)                           4.9                        245
DRAM (TB)                                     3                          64
I/O nodes (#)                                 2                          64
I/O controllers per node                      2 with 8 ports             1 with 16 ports
Flash (TB)                                    2                          256
Total memory: DRAM + flash (TB)               5                          320
vSMP                                          Yes, 32-node               Yes, 32-node
Supernodes                                    2                          32
Interconnect                                  InfiniBand                 InfiniBand
Disk (PB)                                     0.5                        4.5
Data Mining Applications Will Benefit from Gordon

De novo genome assembly from sequencer reads, and analysis of galaxies from cosmological simulations and observations:
–Will benefit from large shared memory

Federations of databases, and interaction-network analysis for drug discovery, social science, biology, epidemiology, etc.:
–Will benefit from low-latency I/O from flash
Data-Intensive Predictive Science Will Benefit from Gordon

Solution of inverse problems in oceanography, atmospheric science, and seismology:
–Will benefit from a balanced system, especially large RAM per core and fast I/O

Modestly scalable codes in quantum chemistry and structural engineering:
–Will benefit from large shared memory
We Won the SC09 Data Challenge with Dash!

With these numbers (IOR, 4 KB transfers):
–RAMFS: 4 million+ IOPS on up to 0.75 TB of DRAM (one supernode's worth)
–Flash: 88K+ IOPS on up to 1 TB of flash (one supernode's worth)
–Sped up Palomar Transients database searches 10x to 100x
–Best IOPS per dollar

Since then we have boosted flash IOPS to 540K, hitting our 2011 performance targets.
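To put the 4 KB IOPS figures above in more familiar bandwidth terms, effective random-read bandwidth is simply IOPS times transfer size. A small sketch using the slide's own numbers (the conversion itself is standard; the helper name is ours):

```python
KB = 1024

def iops_to_bandwidth_mb_s(iops: float, xfer_bytes: int = 4 * KB) -> float:
    """Effective bandwidth in MB/s for a given IOPS rate and transfer size."""
    return iops * xfer_bytes / (1024 * 1024)

flash_sc09 = iops_to_bandwidth_mb_s(88_000)   # ~344 MB/s at SC09
flash_2011 = iops_to_bandwidth_mb_s(540_000)  # ~2,109 MB/s after tuning
```

So the jump from 88K to 540K IOPS is roughly a 6x improvement in effective random-read bandwidth at 4 KB transfers.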
Deployment Schedule

Summer 2009 - present:
–Internal evaluation and testing with internal apps (SSD and vSMP)

Starting ~March 2010:
–Dash allocated via start-up requests by friendly TeraGrid users

Summer 2010:
–Expect to change status to an allocable system starting ~October 2010 via TRAC requests
–Preference given to applications that target the unique technologies of Dash

October 2010 - June 2011:
–Operate Dash as an allocable TeraGrid resource, available through the normal POPS/TRAC cycles, with appropriate caveats about preferred applications and friendly-user status
–Help fill the SMP gap created by the Altix systems being retired in 2010

March 2011 - July 2011:
–Gordon build and acceptance

July 2011 - June 2014:
–Operate Gordon as an allocable TeraGrid resource, available through the normal POPS/TRAC cycles
Consolidating Archive Systems

SDSC has historically operated two archive systems: HPSS and SAM-QFS.
Due to budget constraints, we are consolidating to one: SAM-QFS.
We are currently migrating HPSS user data to SAM-QFS.

[Timeline diagram, July 2009 / mid-2010 / March 2011 / June 2013 / TBD: HPSS moves from read/write to read-only and is retired; SAM-QFS remains, with legacy data read-only and allocated data read/write. Hardware shown: 6 silos, 12 PB, 64 tape drives, and 2 silos, 6 PB, 32 tape drives, each marked "no change" across the later stages.]