
03.05.2015 SHORT OVERVIEW OF CURRENT STATUS A. A. Moskovsky Program Systems Institute, Russian Academy of Sciences IKI - MSR Research Workshop Moscow,


1 SHORT OVERVIEW OF CURRENT STATUS
A. A. Moskovsky, Program Systems Institute, Russian Academy of Sciences
IKI - MSR Research Workshop, Moscow, June 2009
“SKIF-GRID” SUPERCOMPUTING PROJECT OF THE UNION STATE OF RUSSIA AND BELARUS

2 Slide 2 Pereslavl-Zalessky
- Russian Golden Ring city: 857 years old
- Hometown of Grand Dukes of Russia
- The first building site of Peter the Great's navy
- Ancient capital of the Russian Orthodox Church
Map: Moscow to Pereslavl-Zalessky, 120 km

3 Slide 3 “SKIF-GRID” PROJECT TIMELINE
1. SKIF project, SKIF K-1000 is #98 in Top500
2. June 2004 – first proposal filed for “SKIF-GRID” project
3. March 2007 – approved by Government
4. March – SKIF-MSU supercomputer deployed (#36 in June 08 Top 500)
5. May – “SKIF-Testbed” federation created
6. March 2009 – alliance agreement signed for SKIF series 4 development

4 Slide 4 PROJECT ORGANIZATION: Project directions
1. Grid technology
2. Supercomputers: SW, HW
3. Security
4. Pilot projects – applications of HPC and grid technology

5 Slide 5 «SKIF MSU»

6 Slide 6 SKIF MSU
- Theoretical peak performance: 60 TFlops
- Linpack: 47 TFlops
- Advanced clustering solutions:
  - Diskless computational nodes
- Original blade design

Parameter            Value
CPU architecture     x86-64
CPU model            Intel Xeon E5472, 3.0 GHz (4 cores)
Nodes (dual CPU)     625
CPU cores total      5000
Interconnect         InfiniBand DDR, fat tree
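The peak figure on this slide can be cross-checked from the node count and clock rate. The sketch below is only a sanity check, not part of the original presentation; the value of 4 double-precision floating-point operations per core per cycle is an assumption about this CPU generation and does not appear on the slide.

```python
# Figures taken from the slide.
NODES = 625
CPUS_PER_NODE = 2          # "dual CPU" nodes
CORES_PER_CPU = 4          # Xeon E5472 is listed as 4-core
CLOCK_GHZ = 3.0
LINPACK_TFLOPS = 47.0

# Assumption (not on the slide): 4 double-precision flops per core per cycle.
FLOPS_PER_CYCLE = 4


def total_cores(nodes, cpus_per_node, cores_per_cpu):
    """Number of CPU cores in the whole machine."""
    return nodes * cpus_per_node * cores_per_cpu


def peak_tflops(cores, clock_ghz, flops_per_cycle):
    """Theoretical peak: GHz * flops/cycle gives GFlops per core."""
    return cores * clock_ghz * flops_per_cycle / 1000.0


cores = total_cores(NODES, CPUS_PER_NODE, CORES_PER_CPU)
peak = peak_tflops(cores, CLOCK_GHZ, FLOPS_PER_CYCLE)
efficiency = LINPACK_TFLOPS / peak

print(f"cores: {cores}")
print(f"peak: {peak:.1f} TFlops")
print(f"Linpack efficiency: {efficiency:.1%}")
```

Under that assumption the result matches the 5000 cores and 60 TFlops quoted on the slide, and the 47 TFlops Linpack figure corresponds to roughly 78% of peak.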

7 Slide 7 «SKIF-Testbed» a/k/a “SKIF-Polygon”
- Federation of HPC centers, ~100 TFlops
- 4 computers in the current Top 500:
  - MSU (#35 in Top500)
  - South Ural State University
  - Tomsk State University
  - Ufa State Technical University

8 Slide 8 Middleware platform – UNICORE 6.1
- X.509 for security
- Certificate Authority at Pereslavl-Zalessky (PyCA)
- Site platform:
  - UNICORE 6.1
  - Java 1.5
  - Linux
  - Torque
- Experimental sites: UNICORE is complemented with additional services/modules

9 Slide 9 Applications ( )
- HPC applications:
  - Drug design (MSU Belozersky Institute, SRCC, Chelyabinsk SU)
  - Inverse problems in soil remote sensing (SRCC)
  - Computational chemistry (MSU Chemistry department)
- Geophysical data services
- Mammography database prototype (N.N. Semenov Chemical Physics Institute, RAS)
- Text mining (PSI RAS)
- Engineering (South Ural University …)
- Space Research Institute...
- …

10 SKIF-Aurora: second phase of SKIF-GRID project

11 Slide 11 SKIF Series 4: original R&D goals
- Highest performance density (the largest possible number of CPUs per 1U):
  - Lower latency
  - Fewer cables and connectors, hence better reliability
  - Increased heat output per 1U: we need a new cooling technology… How to?
- Improved interconnect: we need better scalability, bandwidth and latency than the best available solutions provide (e.g. InfiniBand QDR)
- New approach to monitoring and management of the supercomputer
- Combining standard CPUs and accelerators in the computational nodes of the supercomputer

12 Slide 12 Spring 2008: SKIF Series 4 – How To?

13 Slide 13 Summer 2008: SKIF Series 4 – Know How!
- Italian-Russian cooperation
- «SKIF Series 4» == «SKIF-AURORA Project»
- Designed by an alliance of Eurotech, PSI RAS and RSC SKIF, with support from Intel
- To be presented at ISC 09
Program Systems Institute of RAS

14 Slide 14 SKIF-Aurora distinctive features
- No moving parts
- Liquid cooling – power efficiency
- x86_64 processors (Intel Nehalem)
- 3-D torus interconnect
- Redundant management/monitoring subsystem
- FPGA on board (optional)
- SSD disks (optional)
- QDR InfiniBand

15 Slide 15 SKIF-Aurora
- 32 nodes per chassis
- 64 CPUs in 6U
- Up to 8 chassis per rack
- Up to 512 CPUs per rack
- Up to 2048 cores
- To build 500 TFlops:
  - 21 racks in 2009
- Scalable due to 3-D torus
- 10 kW per chassis
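The density figures on this slide follow from one another, which the short sketch below tries to make explicit. It rests on several assumptions that are not stated in the presentation: that the CPUs are quad-core, that "up to 2048 cores" is a per-rack figure, and that the 21 racks are the configuration intended to reach 500 TFlops.

```python
# Figures taken from the slide.
NODES_PER_CHASSIS = 32
CPUS_PER_CHASSIS = 64       # "64 CPUs in 6U"
CHASSIS_PER_RACK = 8
KW_PER_CHASSIS = 10
RACKS = 21
TARGET_TFLOPS = 500

# Assumption (not on the slide): quad-core CPUs.
CORES_PER_CPU = 4

cpus_per_node = CPUS_PER_CHASSIS // NODES_PER_CHASSIS
cpus_per_rack = CPUS_PER_CHASSIS * CHASSIS_PER_RACK
cores_per_rack = cpus_per_rack * CORES_PER_CPU

# Assumption: the 21 racks together deliver the 500 TFlops target.
total_cores = cores_per_rack * RACKS
gflops_per_core = TARGET_TFLOPS * 1000 / total_cores

# Power drawn by one fully populated rack, and by all racks.
rack_kw = KW_PER_CHASSIS * CHASSIS_PER_RACK
total_kw = rack_kw * RACKS

print(f"CPUs per node: {cpus_per_node}")
print(f"CPUs per rack: {cpus_per_rack}")
print(f"cores per rack: {cores_per_rack}")
print(f"cores in {RACKS} racks: {total_cores}")
print(f"required per-core peak: {gflops_per_core:.1f} GFlops")
print(f"power per rack: {rack_kw} kW, total: {total_kw} kW")
```

With these assumptions the per-rack numbers reproduce the 512 CPUs and 2048 cores quoted on the slide, and the 500 TFlops target works out to about 11.6 GFlops of peak per core.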

16 Slide 16 SKIF-AURORA: Designed by the alliance of Eurotech, PSI RAS and RSC SKIF
- PCBs, mechanics, power supply, cooling, levels 1 and 2 of the management system
- Level 3 of the management system, interconnect (3D-torus: firmware, routing, drivers, MPI-2…), FPGA as accelerator

17 Slide 17 SKIF-AURORA Management Subsystem

18 Slide 18 3-D torus interconnect implementation
[Diagram labels: System Interconnect (3D-torus); Subsidiary Interconnect (InfiniBand); FPGA...; CPU; standard part; non-standard part]
- Only the QCD-specific part is implemented by the Italian team
- Russian teams are to upgrade the network to a general-purpose interconnect (MPI 2.0), due to appear in fall 2009
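For readers unfamiliar with the topology, the sketch below illustrates what a 3-D torus means for addressing: every node has six direct neighbors, and links wrap around at the edges. It is a generic illustration, not the SKIF-Aurora firmware or routing code; the function names and the 4 x 4 x 4 dimensions are hypothetical.

```python
def torus_neighbors(coord, dims):
    """Return the six direct neighbors of a node in a 3-D torus.

    coord is an (x, y, z) tuple and dims the torus size along each axis.
    Links wrap around, so a node on one face connects to the opposite face.
    """
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            n = list(coord)
            n[axis] = (n[axis] + step) % dims[axis]
            neighbors.append(tuple(n))
    return neighbors


def torus_distance(a, b, dims):
    """Minimum hop count between two nodes, using wrap-around links."""
    hops = 0
    for ai, bi, size in zip(a, b, dims):
        delta = abs(ai - bi)
        hops += min(delta, size - delta)
    return hops


# Hypothetical torus dimensions, chosen only for illustration.
dims = (4, 4, 4)
print(torus_neighbors((0, 0, 0), dims))
print(torus_distance((0, 0, 0), (3, 3, 3), dims))
```

Because of the wrap-around links, the corner nodes (0, 0, 0) and (3, 3, 3) in this example are 3 hops apart rather than 9, which is the property that lets a torus scale by adding nodes along any axis without a central switch.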

19 Slide 19 R&D Directions Using FPGA
- Collective MPI operations using FPGA
- FPGA to facilitate support of PGAS languages (UPC, Titanium, etc.)
- FPGA+CPU hybrid computing

20 Slide 20 Conclusions
- Is based on collaboration between international teams
- Harnesses shared expertise and results
- Aims to develop a family of petascale-level supercomputers with innovative techniques:
  - Higher density of CPUs (flops per volume)
  - Efficient water cooling system
  - Scalable, powerful 3D-torus interconnect
  - Etc.

21 Slide 21 Datacenter visualization

22 Slide 22 Datacenter visualization

23 Slide 23 THANKS SKIF-GRID web site

