Exascale Evolution www.openfabrics.org 1 Brad Benton, IBM March 15, 2010.


1 Exascale Evolution, Brad Benton, IBM, March 15, 2010

2 Agenda
–Exascale Challenges
–On the Path to Exascale: A Look at Blue Waters

3 Exascale Challenges

4 Exascale Challenges
Challenges at every level of system design
–Managing 500M to 1B (most likely heterogeneous) cores
–Programming models to exploit multi-core + accelerators
–Interconnect
  How will IB/RC scale to exascale?
  How do we “get off the bus”?
  How can we put more capability in the interconnect?
–Power Management
  Power vs. performance tradeoffs

5 Exascale Challenges
Challenges at every level of system design
–Resilience/Fault-Tolerance
  At this scale, something will always be broken or in the process of breaking
–Development Environment/Performance Tuning
–Workflow Management/Process Steering
–Data Management/Storage/Visualization

6 Exascale Challenges
Resiliency/Fault-Tolerance
–F/T Model
  Fault Detection
  Fault Isolation
  Fault Containment
  Fault Recovery
  Re-integration
–Software Resiliency
  More than just checkpoint/restart
  Containers/virtualization
  suspend/migrate/resume
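
The F/T model on this slide reads naturally as an ordered cycle. The following is only a sketch under that assumption: the slide lists the five stages but does not say they are strictly sequential, and the `HEALTHY` state, `Stage`, `next_stage`, and `handle_fault` are hypothetical names added for illustration.

```python
from enum import Enum


class Stage(Enum):
    HEALTHY = 0        # assumption: not on the slide, added as the resting state
    DETECTION = 1
    ISOLATION = 2
    CONTAINMENT = 3
    RECOVERY = 4
    REINTEGRATION = 5


def next_stage(stage):
    """Advance one step; after re-integration the component is healthy again."""
    members = list(Stage)
    return members[(members.index(stage) + 1) % len(members)]


def handle_fault():
    """Return the stages visited from fault detection until healthy again."""
    stage = Stage.DETECTION
    visited = [stage]
    while stage is not Stage.HEALTHY:
        stage = next_stage(stage)
        visited.append(stage)
    return visited
```

Calling `handle_fault()` walks a single component through every stage in slide order and ends in the assumed healthy state.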

7 Programming Models
MPI
–Will it survive in an exascale world? (its demise was predicted at petascale, but it seems to be doing okay)
Evolve hybrid language models: MPI + “What?”
–OpenMP
–GPU Accelerators (CUDA, OpenCL)
–PGAS languages
Greater Exploitation of Autotuning, i.e., programs that write programs
–ATLAS
–FFTW
–IBM HPC Toolkit has some of this
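
The autotuning idea can be illustrated with a toy example. This sketch only selects among hand-written variants by measurement; it does not generate code the way the slide's "programs that write programs" phrase implies, and it is not how ATLAS or FFTW are implemented. The names `autotune`, `sum_builtin`, and `sum_loop` are hypothetical.

```python
import time


def sum_builtin(data):
    """Candidate 1: use the built-in sum."""
    return sum(data)


def sum_loop(data):
    """Candidate 2: explicit accumulation loop."""
    total = 0
    for x in data:
        total += x
    return total


def autotune(variants, sample, repeats=3):
    """Return the variant with the lowest best-of-N wall time on `sample`."""
    best_fn, best_time = None, float("inf")
    for fn in variants:
        elapsed = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            fn(sample)
            elapsed = min(elapsed, time.perf_counter() - start)
        if elapsed < best_time:
            best_fn, best_time = fn, elapsed
    return best_fn


# Tune once on representative input, then reuse the winner.
chosen = autotune([sum_builtin, sum_loop], list(range(100_000)))
```

Which variant wins depends on the machine and the input, which is the point: the choice is made empirically on the target system rather than fixed in advance.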

8 On the Path to Exascale: A Look at Blue Waters

9 NCSA Blue Waters
Joint effort between NCSA and University of Illinois: http://www.ncsa.illinois.edu/BlueWaters/
First deliverable of a system based on PERCS technology (2011)
Will be the world’s first sustained petascale system for open scientific research
For more detailed information: http://www.ncsa.illinois.edu/BlueWaters/pdfs/snir-power7.pdf

10 Blue Waters Overview
–Approximately 10 PF/s peak
–More than 300,000 cores (homogeneous)
–More than 1 petabyte of memory
–More than 10 petabytes of disk storage
–More than 0.5 exabyte of archival storage
–More than 1 PF/s sustained on scientific applications

11 Building Blue Waters
Power7 Chip
–8 cores, 32 threads
–L1, L2, L3 cache (32 MB)
–Up to 256 GF (peak)
–45 nm technology
Multi-chip Module
–4 Power7 chips
–128 GB memory
–512 GB/s memory bandwidth
–1 TF (peak)
Router
–1,128 GB/s bandwidth
IH Server Node
–8 MCMs (256 cores)
–1 TB memory
–8 TF (peak)
–Fully water cooled
Blue Waters Building Block
–32 IH server nodes
–32 TB memory
–256 TF (peak)
–4 storage systems
–10 tape drive connections
Blue Waters
–~1 PF sustained
–>300,000 cores
–>1 PB of memory
–>10 PB of disk storage
–~500 PB of archival storage
–>100 Gbps connectivity
Blue Waters is built from components that can also be used to build systems with a wide range of capabilities, from deskside to beyond Blue Waters.
Blue Waters will be the most powerful computer in the world for scientific research when it comes on line in Summer of 2011.
CI Days, 22 February 2010, University of Kentucky
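
The per-level figures on this slide follow from multiplying up the hierarchy. A small sketch of that roll-up, assuming the slide rounds with 1 TF = 1024 GF and 1 TB = 1024 GB (the slide does not state its rounding convention); the constant names are chosen here for illustration.

```python
# Figures taken from the slide.
CORES_PER_CHIP = 8
CHIP_PEAK_GF = 256
CHIPS_PER_MCM = 4
MCM_MEMORY_GB = 128
MCMS_PER_NODE = 8
NODES_PER_BLOCK = 32

# Multi-chip module
mcm_peak_gf = CHIP_PEAK_GF * CHIPS_PER_MCM            # 1024 GF, the slide's "1 TF"

# IH server node
node_cores = CORES_PER_CHIP * CHIPS_PER_MCM * MCMS_PER_NODE   # 256 cores
node_peak_gf = mcm_peak_gf * MCMS_PER_NODE            # 8192 GF, "8 TF"
node_memory_gb = MCM_MEMORY_GB * MCMS_PER_NODE        # 1024 GB, "1 TB"

# Building block
block_peak_gf = node_peak_gf * NODES_PER_BLOCK        # 262144 GF, "256 TF"
block_memory_gb = node_memory_gb * NODES_PER_BLOCK    # 32768 GB, "32 TB"
block_cores = node_cores * NODES_PER_BLOCK            # 8192 cores
```

Under the stated rounding assumption, each computed value matches the corresponding number on the slide.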

12 Power7 Chip: Computational Heart of Blue Waters
Base Technology
–45 nm, 576 mm²
–1.2 B transistors
Chip
–8 cores
–12 execution units/core
–1, 2, 4 way SMT/core
–Up to 4 FMAs/cycle
–Caches
  32 KB I, D-cache, 256 KB L2/core
  32 MB L3 (private/shared)
–Dual DDR3 memory controllers
  128 GB/s peak memory bandwidth (1/2 byte/flop)
–Clock range of 3.5 to 4 GHz
Figure labels: Quad-chip MCM, Power7 Chip
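
The chip's peak rate and its bytes-per-flop ratio can be derived from the figures above. This sketch assumes that "up to 4 FMAs/cycle" is a per-core figure and that each fused multiply-add counts as two floating-point operations; the function and constant names are illustrative.

```python
CORES = 8
FMAS_PER_CYCLE_PER_CORE = 4   # assumption: the slide's figure is per core
FLOPS_PER_FMA = 2             # one multiply plus one add
MEM_BW_GBS = 128              # peak memory bandwidth from the slide


def chip_peak_gflops(clock_ghz):
    """Peak GF/s for one chip at the given clock frequency."""
    return CORES * FMAS_PER_CYCLE_PER_CORE * FLOPS_PER_FMA * clock_ghz


peak_low = chip_peak_gflops(3.5)          # bottom of the clock range
peak_high = chip_peak_gflops(4.0)         # top of the clock range
bytes_per_flop = MEM_BW_GBS / peak_high   # memory balance at 4 GHz
```

At 4 GHz this gives 256 GF and 0.5 byte/flop, consistent with the "1/2 byte/flop" note on the slide; at 3.5 GHz the peak drops to 224 GF.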

13 High-End Server Resilience

14 Feeds and Speeds per MCM
–32 cores
–8 Flop/cycle per core
–4 threads per core max
–3.5 to 4 GHz
–1 TF/s
–32 MB L3
–512 GB/s memory BW (0.5 Byte/flop)
–800 W (0.8 W per GF/s)
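
The power figure is easiest to read as watts per GF/s, since 800 W spread over roughly 1 TF/s is about 0.8 W for each GF/s of peak. A quick check using only the slide's numbers, with the 4 GHz end of the clock range assumed for peak; variable names are illustrative.

```python
CORES = 32
FLOPS_PER_CYCLE_PER_CORE = 8
CLOCK_GHZ = 4.0          # assumption: top of the 3.5 to 4 GHz range
MEM_BW_GBS = 512
POWER_W = 800

mcm_peak_gflops = CORES * FLOPS_PER_CYCLE_PER_CORE * CLOCK_GHZ   # 1024.0
bytes_per_flop = MEM_BW_GBS / mcm_peak_gflops                    # 0.5
watts_per_gflops = POWER_W / mcm_peak_gflops                     # 0.78125
```

The result, about 0.78 W per GF/s, rounds to the 0.8 on the slide.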

15 First Level Interconnect
–L-Local
–HUB to HUB copper wiring
–256 cores
One drawer: 8 MCMs, 32 chips, 256 cores

16 Interconnect: 1.1 TB/s HUB
–192 GB/s host connection
–336 GB/s to 7 other local nodes in the same drawer
–240 GB/s to local-remote nodes in the same supernode (4 drawers)
–320 GB/s to remote nodes
–40 GB/s to general purpose I/O
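
The 1.1 TB/s headline is the sum of the five link classes listed on this slide. A short tally, with dictionary keys chosen here for illustration; the per-neighbor figure assumes the 336 GB/s is split evenly across the 7 local nodes, which the slide does not state.

```python
# Bandwidth per link class in GB/s, as listed on the slide.
hub_links_gbs = {
    "host": 192,
    "local": 336,          # to 7 other nodes in the same drawer
    "local_remote": 240,   # to nodes in the same supernode
    "remote": 320,
    "general_io": 40,
}

total_gbs = sum(hub_links_gbs.values())           # aggregate hub bandwidth
per_local_neighbor_gbs = hub_links_gbs["local"] / 7   # assumes an even split
```

The total comes to 1,128 GB/s, which matches the router bandwidth quoted on the "Building Blue Waters" slide.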

18 Second Level Interconnect
–Optical ‘L-Remote’ links from HUB
–Construct Super Node (4 CECs)
–1,024 cores
One supernode: 4 drawers, 32 MCMs, 128 chips, 1,024 cores

19 Rack Components
Compute, storage, switch; 100% cooling; PDU eliminated
BPA
–200 to 480 Vac
–370 to 575 Vdc
–Redundant power
–Direct site power feed
–PDU elimination
WCU
–Facility water input
–100% heat to water
–Redundant cooling
–CRAH eliminated
Storage Unit
–4U
–0-6 / rack
–Up to 384 SFF DASD / unit
–File system
CECs
–2U
–1-12 CECs / rack
–256 cores
–128 SN DIMM slots / CEC
–8, 16, (32) GB DIMMs
–17 PCI-e slots
–Embedded switch
–Redundant DCA
–NW fabric
–Up to: 3072 cores, 24.6 TB (49.2 TB)
Rack
–990.6 mm w x 1828.8 mm d x 2108.2 mm h
–39” w x 72” d x 83” h
–~2948 kg (~6500 lbs)
Input: 8 water lines, 4 power cords
Out: ~100 TFLOPs / 24.6 TB / 153.5 TB
192 PCI-e 16x / 12 PCI-e 8x
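
The rack-level "up to" figures follow from a fully populated rack of 12 CECs. A sketch of that arithmetic, assuming decimal terabytes (1 TB = 1000 GB) and assuming the 17 PCI-e slots per CEC split into 16 x16 slots and 1 x8 slot; that split is inferred from the rack totals and is not stated on the slide.

```python
MAX_CECS_PER_RACK = 12
CORES_PER_CEC = 256
DIMM_SLOTS_PER_CEC = 128
PCIE_X16_PER_CEC = 16   # assumption inferred from 192 / 12
PCIE_X8_PER_CEC = 1     # assumption inferred from 12 / 12


def rack_memory_tb(dimm_gb):
    """Memory of a fully populated rack in decimal TB for a given DIMM size."""
    return MAX_CECS_PER_RACK * DIMM_SLOTS_PER_CEC * dimm_gb / 1000


rack_cores = MAX_CECS_PER_RACK * CORES_PER_CEC        # 3072
mem_16gb_tb = rack_memory_tb(16)                      # 24.576
mem_32gb_tb = rack_memory_tb(32)                      # 49.152
rack_pcie_x16 = MAX_CECS_PER_RACK * PCIE_X16_PER_CEC  # 192
rack_pcie_x8 = MAX_CECS_PER_RACK * PCIE_X8_PER_CEC    # 12
```

With 16 GB DIMMs the rack holds about 24.6 TB, and with the parenthesized 32 GB DIMMs about 49.2 TB, which is how the two memory figures on the slide relate.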

20 How does this affect OFA?
Blue Waters can connect externally via PCIe devices (e.g., InfiniBand) as needed
Blue Waters interconnect
–Is RDMA based
–Is not InfiniBand (or iWARP or RoCEE)
–Has hardware support for Global Shared Memory
Pendulum is swinging back to proprietary interconnects (at least at IBM)
Is there a path to OFA compatibility?
–How can/should OFA accept/support new/different RDMA interconnects?
–How can/should IBM work with OFA to embrace new interconnect technologies?

21 Exascale Evolution
Technical evolution does not always follow a straight line
Different technologies evolve at different times and rates
–e.g., Blue Waters is not a direct descendant of RoadRunner/Cell, but rather of POWER/Federation/SP
Reaching exascale levels will require the consolidation and continued evolution of multiple technologies

