Petascale System Requirements for the Geosciences
Richard Loft, SCD Deputy Director for R&D
Petascale Collaboratory Reports
• Charges:
  – Identify science drivers for petascale geoscience computing. (Bryan)
  – Determine technical feasibility and cost of creating such a system by 2010. (Loft)
• Outcomes:
  – Collaboratory is in NSF Facility Plan as a facility “under study”.
  – Initiative to bring vendors and scientists together to discuss petascale geoscience computing.
Petascale Geoscience
• World-class computational capability should be devoted to understanding the earth.
  – Earthquakes & tsunamis
  – Severe weather (e.g., Katrina)
  – Global warming
  – Land use policies
• It doesn’t stop at 1 PFLOPS.
• Easy part: achieving ~1 PFLOPS peak
  – Nothing exceptional has to happen technologically to do it
  – Possible by 2008, first by 2010, multiple by 2012
  – Highly parallel system: single-thread speed gains are slowing
• Hard part: infrastructure
  – Facilities to house them
  – Efficient applications
  – Systems and tools to help us understand the data they produce
Geoscience really does need Petascale Computing (and Beyond)
Geoscience: from the sun’s surface to the earth’s core…
[Figure panels: Space Weather, Turbulence, Atmospheric Chemistry, Climate, Weather, The Sun]
Data Driver
• A petascale system will produce 100 PB of geoscience data per year.
• That’s about 3.2 GB/s sustained, 24x7x365.
• Hard: mining that data torrent for science nuggets.
• How do we drive petascale requirements for the data analysis and visualization (DAV) systems?
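The 3.2 GB/s figure can be checked with a few lines of arithmetic. This is a minimal sketch assuming decimal units (1 PB = 10^15 bytes, 1 GB = 10^9 bytes) and a 365-day year; the function name is illustrative only.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 s, assuming a 365-day year


def sustained_rate_gb_per_s(petabytes_per_year: float) -> float:
    """Average data rate in GB/s needed to write the given volume in one year.

    Assumes decimal units: 1 PB = 1e15 bytes, 1 GB = 1e9 bytes.
    """
    bytes_per_year = petabytes_per_year * 1e15
    return bytes_per_year / SECONDS_PER_YEAR / 1e9


rate = sustained_rate_gb_per_s(100)
print(f"{rate:.2f} GB/s")  # 3.17 GB/s, which the slide rounds to 3.2 GB/s
```

This is an average rate over the year; any real system would need headroom above it to absorb bursts.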
Taming the Torrent: Visualization Example
• Key concepts
  1. Integrate visualization into the analysis process
  2. Interactively steer the analysis to subsets
  3. Employ multiresolution data representation as a data reduction technique
[Figure: data size at each resolution: Full, 1/8, 1/64, 1/512]
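The sizes 1/8, 1/64, and 1/512 are consistent with halving the resolution along each axis of a 3D grid at every level. The sketch below illustrates that arithmetic with plain strided subsampling. The slide does not say how the multiresolution hierarchy is actually built (a wavelet transform is another common choice), so both the factor of two per axis and the subsampling method are assumptions, and the helper names are hypothetical.

```python
from fractions import Fraction


def size_fraction(level: int, dims: int = 3) -> Fraction:
    """Fraction of the full data volume kept after `level` halvings per axis."""
    return Fraction(1, 2 ** (dims * level))


def subsample(volume, stride: int):
    """Keep every `stride`-th sample along each axis of a nested-list 3D grid."""
    return [[row[::stride] for row in plane[::stride]]
            for plane in volume[::stride]]


def count_points(volume) -> int:
    """Total number of samples in a nested-list 3D grid."""
    return sum(len(row) for plane in volume for row in plane)


# Fractions for the full-resolution data and three coarser levels.
fractions_by_level = [size_fraction(k) for k in range(4)]
print([str(f) for f in fractions_by_level])  # ['1', '1/8', '1/64', '1/512']

# A small 8x8x8 test grid, coarsened once: 512 points become 64.
volume = [[[x + 8 * y + 64 * z for x in range(8)]
           for y in range(8)] for z in range(8)]
coarse = subsample(volume, 2)
print(count_points(volume), count_points(coarse))  # 512 64
```

Browsing at the 1/512 level and refining only the subsets of interest is what makes interactive steering plausible at these data volumes.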
Facility Driver
Uncertainty in fuel efficiency and power density leads to uncertainty in petascale facility planning.
You can’t turn a facility on a dime.
Efficiency is good.
How do we craft power-efficiency benchmarks?
Petascale TCO: Fuel Efficiency
[Figure: Top 20 systems, based on processor power rating only]
Blue Gene/L: How do we use this stuff?
Petascale Data Center Design
Petascale estimate: 2-5 MW by 2010
Reliable: Tier 2+, redundant power/cooling systems
Modular design: expandable, mitigates risk
Floor space (per module):
  – 20,000 sq. ft. machine room
  – 20,000 sq. ft. mechanical space
Power (per module): 4 MW systems plus 4 MW mechanical
Land: 13 acres. A second-order term in cost.
Application Driver: Collecting Geoscience Application Requirements
How do we extract system requirements for systems that haven’t been built from applications that haven’t been written?
No single set of requirements in fact exists.
There are very few good models of application performance.
Application experts’ intuition is often wrong.
How should application benchmarks be selected?
1/10 Degree POP Ocean Model
[Figure: Agulhas Rings. Credit: Frank Bryan, NCAR]
POP X1 Performance
6x: System Architecture Does Matter!
After Worley, et al.
Discussion - Open Forum
How do we drive petascale requirements for DAV systems?
How do we extract system requirements for systems that haven’t been built from applications that haven’t been written?
How should application benchmarks be selected?
Are the HPC Challenge Benchmarks good metrics?
How do we craft power-efficiency benchmarks?
Ready or not, the petascale is coming
Proof by extrapolation…
Growth Rate Results - Top 10 Geoscience Centers on the Top500 List
• Individual sites come and go.
• Individual sites rise and fall.
• Collectively, things are smoother.
• Annual growth rate, 1st ranked: 1.81.
• Annual mean growth rate: 1.84.
• Annual growth rate, 10th ranked: 1.90.
• Currently:
  – 5 use IBM POWER processors
  – 2 use vector systems
  – 3 use other
• Pack is slowly converging.
• Use this rate to chart computing future…
[Figure: log-linear fit (mean)]
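A constant annual growth factor makes the extrapolation a one-line logarithm. The sketch below uses the mean rate of 1.84 from the slide; the 10 TFLOPS starting point is a hypothetical value chosen only to illustrate the calculation, not a figure from the talk, and the function names are illustrative.

```python
import math

ANNUAL_GROWTH = 1.84  # mean annual growth factor quoted on the slide


def doubling_time_years(rate: float = ANNUAL_GROWTH) -> float:
    """Years for capability to double at a constant annual growth factor."""
    return math.log(2) / math.log(rate)


def years_to_reach(current: float, target: float,
                   rate: float = ANNUAL_GROWTH) -> float:
    """Years until `current` grows to `target` at a constant annual factor."""
    return math.log(target / current) / math.log(rate)


def projected(current: float, years: float,
              rate: float = ANNUAL_GROWTH) -> float:
    """Capability after `years` of compound growth from `current`."""
    return current * rate ** years


# Hypothetical starting point (an assumption for illustration only).
hypothetical_start_tflops = 10.0
target_tflops = 1000.0  # 1 PFLOPS

print(f"Doubling time: {doubling_time_years():.2f} years")
print(f"Years to 1 PFLOPS from {hypothetical_start_tflops:g} TFLOPS: "
      f"{years_to_reach(hypothetical_start_tflops, target_tflops):.1f}")
```

Because the fit is log-linear, the result is sensitive to the rate: rerunning `years_to_reach` with the slide's 1.81 and 1.90 gives a rough sense of the spread between the first- and tenth-ranked sites.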