Published by Meryl Stevens. Modified over 9 years ago.
1
Program Systems Institute of the Russian Academy of Sciences
Supercomputer Projects SKIF and SKIF-GRID of Russia and Belorussia
Sergei Abramov
Workshop introducing the AURORA project
4 June 2009, Conference Room FBK, via Sommarive, 18 - Povo, Trento, Italy
SKIF-AURORA Project
2
Program Systems Institute of RAS
Supercomputer Project «SKIF-GRID»
Outline
  Pereslavl-Zalessky and Program Systems Institute of the RAS: Short introduction
  Supercomputer Projects SKIF and SKIF-GRID of Russia and Belorussia
  2008–2010: Series 4 of SKIF supercomputers
    Series 4 of SKIF supercomputers == SKIF-AURORA
  SKIF-AURORA Selected Topics
    Management Subsystem
    3D-torus Interconnect
    Combining standard CPUs and FPGA-accelerators
  Conclusion
1 October 2015 © PSI RAS, SKIF-GRID, 2009. All rights reserved.
3
Pereslavl-Zalessky and Program Systems Institute of the RAS: Short introduction
4
Pereslavl-Zalessky
  Beautiful ancient Russian town, 860 years old
  The center of the Russian Golden Ring
  City
    Hometown of Great Dukes of Russia
    The first building site of Peter the Great's navy
    Ancient capital of the Russian Orthodox Church
  Map: Moscow to Pereslavl-Zalessky, 120 km
5
PSI RAS, Pereslavl-Zalessky
6
Foundation of the Institute
The Program Systems Institute was founded in 1984 by a decree of the USSR government. Its foundation was aimed at developing computer science in the country. The first director of the Institute (1984–2003) was Prof. A. Ailamazyan.
7
2009: Organization of the Institute
  Artificial Intelligence Research Center
  Medical Informatics Research Center
  Research Center for Multiprocessor Systems
  System Analysis Research Center
  Control Processes Research Center
  Scientific and Educational Center: International Children’s Computer Center
8
Ailamazyan University of Pereslavl
9
Supercomputer Projects SKIF and SKIF-GRID of Russia and Belorussia
10
SKIF and SKIF-GRID Supercomputing Projects
  Joint supercomputing projects of the Russian Federation and the Republic of Belarus
  R&D in all directions and at all levels of supercomputer and grid technologies: hardware, operating systems, parallel programming systems, applications, etc.
  SKIF: 2000–2004, 10 + 10 = 20 organizations
  SKIF-GRID: 2007–2010, 12 + 23 = 35 organizations
  PSI RAS is the lead organization from the Russian Federation
11
SKIF-GRID Project organization
Project directions
  1. Grid technology
  2. Supercomputers (SW, HW)
  3. Security
  4. Pilot projects: applications of HPC and grid technology
12
Series 1, 2, and 3 of the SKIF supercomputers
(peak/Linpack performance, Tflops)
Series 1 (2000–2003)
  2000: SKIF Firstborn, 0.02/0.011
  2001: SKIF VM-5100, 0.048/0.026
  2003: SKIF ES1710.03, 0.04/0.023
Series 2 (2003–2007)
  2003: SKIF -Forge-32, 0.1/0.074
  2003: SKIF K-500, 0.717/0.417
  2004: SKIF K-1000, 2.53/2.03
Series 3 (2007–2008)
  2007: SKIF Cyberia, 12/9.01
  2008: SKIF Ural, 15.94/12.2
  2008: SKIF MSU, 60/47.17
13
Flagship of SKIF supercomputers: SKIF MSU (March 2008)
  June 2008: #36 in Top500
  Peak performance: 60 Tflops; Linpack: 47 Tflops
  Original blade design
  CPU model: 4-core Intel Xeon E5472, 3.0 GHz
  Nodes (dual CPU): 625
  CPU cores total: 5,000
  Interconnect: InfiniBand DDR, fat tree
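The core count and the peak figure on this slide can be cross-checked with simple arithmetic. The sketch below is a back-of-the-envelope check, not the official derivation of the peak number. It assumes 4 double-precision floating-point operations per core per clock cycle, which is not stated on the slide.

```python
# Back-of-the-envelope check of the SKIF MSU figures quoted on the slide.
NODES = 625            # from the slide
CPUS_PER_NODE = 2      # "dual CPU" nodes, from the slide
CORES_PER_CPU = 4      # 4-core Xeon E5472, from the slide
CLOCK_GHZ = 3.0        # from the slide
FLOPS_PER_CYCLE = 4    # ASSUMPTION: double-precision flops per core per cycle


def total_cores(nodes, cpus_per_node, cores_per_cpu):
    """Number of CPU cores in the whole machine."""
    return nodes * cpus_per_node * cores_per_cpu


def peak_tflops(cores, clock_ghz, flops_per_cycle):
    """Theoretical peak: cores * GHz * flops/cycle gives Gflops; divide by 1000 for Tflops."""
    return cores * clock_ghz * flops_per_cycle / 1000.0


cores = total_cores(NODES, CPUS_PER_NODE, CORES_PER_CPU)
peak = peak_tflops(cores, CLOCK_GHZ, FLOPS_PER_CYCLE)

print(f"cores: {cores}")
print(f"peak:  {peak:.1f} Tflops")
```

Under that assumption the result matches the 5,000 cores and 60 Tflops peak quoted above.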
14
Only six supercomputers developed in Russia have been ranked in the Top500. Five of them are SKIFs.
(Linpack/peak performance, TFlops)
  June 2002: MVS 1000M, 0.734/1.024
  November 2003: SKIF K-500, 0.423/0.717
  November 2004: SKIF K-1000, 2.032/2.534
  February 2007: SKIF Cyberia, 9.013/12.002
  May 2008: SKIF Ural, 12.2/15.9
  May 2008: SKIF MSU, 47.1/60
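Each entry on this slide pairs a Linpack result with a peak figure, so the Linpack efficiency (Linpack divided by peak) follows directly. The sketch below only restates the slide's numbers; the function and variable names are illustrative.

```python
# Linpack efficiency (Linpack / peak) for the six Top500 entries on the slide.
# All figures are in TFlops and are copied from the slide as (name, linpack, peak).
TOP500_ENTRIES = [
    ("MVS 1000M", 0.734, 1.024),
    ("SKIF K-500", 0.423, 0.717),
    ("SKIF K-1000", 2.032, 2.534),
    ("SKIF Cyberia", 9.013, 12.002),
    ("SKIF Ural", 12.2, 15.9),
    ("SKIF MSU", 47.1, 60.0),
]


def linpack_efficiency(linpack, peak):
    """Fraction of the theoretical peak that the Linpack run achieved."""
    if peak <= 0:
        raise ValueError("peak performance must be positive")
    return linpack / peak


efficiencies = {
    name: linpack_efficiency(linpack, peak)
    for name, linpack, peak in TOP500_ENTRIES
}

for name, eff in efficiencies.items():
    print(f"{name:13s} {eff:.1%}")
```

Efficiency depends on the interconnect and problem size as well as on the CPUs, so these ratios are only a rough comparison between systems.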
15
Series 1, 2, 3 and 4 of SKIF supercomputers (Linpack)
Completed: Series 1–3
  Firstborn: 11 Gflops
  VM5100: 26 Gflops
  Firstborn-M: 57 Gflops
  SKIF K-500: 472 Gflops
  SKIF K-1000: 2032 Gflops
  SKIF Cyberia: 9 Tflops
  SKIF Ural: 12.2 Tflops
  SKIF MSU: 47.17 Tflops
Nearest plan: Series 4
  3Q 2009: SKIF P-0.5
  3Q 2010: SKIF P-1.0
  1Q 2012: SKIF P~5.0
16
2008–2010: Series 4 of SKIF supercomputers
17
SKIF Series 4: Aims of R&D
  Highest density of performance (largest possible number of CPUs per 1U)
    Lower latency
    Fewer cables and connectors, hence better reliability
    Increased heat emission per 1U
    We need a new cooling technology… How?
  Improved interconnect: we need better scalability, bandwidth and latency than the best available solutions (e.g. InfiniBand QDR) provide
  New approach to monitoring and management of the supercomputer
  Combining standard CPUs and accelerators in the computational nodes of the supercomputer
18
Spring 2008, SKIF Series 4: How To?
  How to increase the number of CPUs per 1U?
  How to cool supercomputer nodes?
  How to develop an improved interconnect?
  How to combine standard CPUs and accelerators?
  How to develop the management subsystem?
SKIF Series 4 is an extremely complex project. We need strong partners!
19
Summer 2008, SKIF Series 4: Know How!
  Italian-Russian cooperation
  «SKIF Series 4» == «SKIF-AURORA Project»
  Designed by an alliance of Eurotech, PSI RAS and RSC SKIF, with support from Intel
20
SKIF-AURORA: Designed by the alliance of Eurotech, PSI RAS and RSC SKIF
  PCBs, schematics, mechanics, power supply, cooling, 1st and 2nd levels of the management system
  3rd level of the management system, interconnect (3D-torus: firmware, routing, drivers, MPI-2…), FPGA as accelerator
21
SKIF-AURORA: State of the Project
22
Node Card
23
PSU Card
24
Root Card
25
Chassis
26
Chassis
27
Chassis
28
ISC’09, Hamburg, June 23–25, 2009
29
Rack
30
System Project: SKIF-Aurora, 500 Tflops
31
SKIF-AURORA: Management Subsystem
32
Subjects of Management Subsystem
  1 Pflops = 42 racks = 10,752 nodes + 672 DC/DC trays + 672 root nodes
  For scalability we need a robust and redundant management subsystem
  Comprehensive monitoring and control in all situations
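The component counts on this slide divide evenly, which suggests a regular per-rack and per-chassis layout. The sketch below works through that arithmetic. It assumes that each chassis holds exactly one root node and one DC/DC tray; the slide gives only the totals, so the per-chassis figures are an inference.

```python
# Per-rack and per-chassis breakdown of the 1 Pflops configuration on the slide.
RACKS = 42
NODES = 10_752
DCDC_TRAYS = 672
ROOT_NODES = 672

# ASSUMPTION: one root node (and one DC/DC tray) per chassis, so the number
# of chassis equals the number of root nodes. The slide does not say this.
CHASSIS = ROOT_NODES


def split_evenly(total, parts):
    """Divide total by parts and fail loudly if it does not divide evenly."""
    quotient, remainder = divmod(total, parts)
    if remainder:
        raise ValueError(f"{total} does not divide evenly by {parts}")
    return quotient


nodes_per_rack = split_evenly(NODES, RACKS)
chassis_per_rack = split_evenly(CHASSIS, RACKS)
nodes_per_chassis = split_evenly(NODES, CHASSIS)
managed_components = NODES + DCDC_TRAYS + ROOT_NODES

print(f"nodes per rack:      {nodes_per_rack}")
print(f"chassis per rack:    {chassis_per_rack}")
print(f"nodes per chassis:   {nodes_per_chassis}")
print(f"components to manage: {managed_components}")
```

The last figure, more than twelve thousand monitored components, is the scale the management subsystem has to cope with.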
33
1st Level of Management Subsystem
  Standard solution: IPMI over TCP/IP (InfiniBand)
  Available when the nodes, root card, and IB network are powered on and working properly
  Root cards and DC/DC trays are not covered by monitoring and control
34
2nd Level of Management Subsystem
  The Catalyst module on the root card implements node power control and a serial console for the nodes
  Available when the root card and IB network are powered on and working properly
  Root cards and DC/DC trays are not covered by monitoring and control
35
3rd Level of Management Subsystem
  SKIF Servnet: independent sensor network
  Always available; uses a dedicated power network; power consumption: 3 W per chassis
  Accessible over a dedicated network: Ethernet + CAN bus + I²C
  Monitors temperature, humidity, and supply voltages on the node cards, root card, and DC/DC tray; transfers this information to the 2nd level (to Catalyst)
  Can turn off the DC/DC PSU in case of emergency
  The turn-off decision is made locally by an ARM microcontroller located on the root card
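The local turn-off decision described on this slide can be pictured as a threshold check over the monitored quantities. The sketch below is purely illustrative and is not the Servnet firmware. The data structures, the function names and every limit value are hypothetical, since the slide gives no thresholds.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class SensorReading:
    """One sample of the quantities the slide says are monitored."""
    temperature_c: float
    humidity_pct: float
    supply_voltage_v: float


@dataclass(frozen=True)
class Limits:
    """HYPOTHETICAL safe operating limits; the slide gives no numbers."""
    max_temperature_c: float = 70.0
    max_humidity_pct: float = 80.0
    min_voltage_v: float = 11.0
    max_voltage_v: float = 13.0


def violations(reading: SensorReading, limits: Limits) -> List[str]:
    """Names of the monitored quantities that are outside their limits."""
    problems = []
    if reading.temperature_c > limits.max_temperature_c:
        problems.append("temperature")
    if reading.humidity_pct > limits.max_humidity_pct:
        problems.append("humidity")
    if not limits.min_voltage_v <= reading.supply_voltage_v <= limits.max_voltage_v:
        problems.append("supply voltage")
    return problems


def must_turn_off(readings: Iterable[SensorReading], limits: Limits) -> bool:
    """Local decision: cut the DC/DC PSU if any reading is out of range."""
    return any(violations(r, limits) for r in readings)


normal = SensorReading(temperature_c=45.0, humidity_pct=40.0, supply_voltage_v=12.0)
overheated = SensorReading(temperature_c=85.0, humidity_pct=40.0, supply_voltage_v=12.0)

print(must_turn_off([normal], Limits()))
print(violations(overheated, Limits()))
```

Keeping the check this simple is in the spirit of making the decision locally on a microcontroller, without depending on the nodes or the main network being up.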
36
SKIF-AURORA Management Subsystem: Total monitoring and control
  3-way redundant
  Designed for a “dark datacenter”
  Robust management subsystem
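The three management levels differ in what has to be working before each one is reachable. The sketch below encodes the availability conditions stated on the three level slides and lists which levels remain for every combination of component states. The function name and the boolean flags are illustrative, not part of any SKIF-AURORA software.

```python
from itertools import product


def available_levels(nodes_ok, root_card_ok, ib_network_ok):
    """Return the set of management levels reachable in the given state.

    Conditions as stated on the slides:
      level 1 needs the nodes, the root card and the IB network;
      level 2 needs the root card and the IB network;
      level 3 has its own power and network, so it is always available.
    """
    levels = {3}
    if root_card_ok and ib_network_ok:
        levels.add(2)
        if nodes_ok:
            levels.add(1)
    return levels


# Enumerate all eight combinations of component states.
for nodes_ok, root_ok, ib_ok in product([True, False], repeat=3):
    levels = sorted(available_levels(nodes_ok, root_ok, ib_ok))
    print(f"nodes={nodes_ok!s:5} root={root_ok!s:5} ib={ib_ok!s:5} -> levels {levels}")
```

In every state at least level 3 remains, which is what makes the subsystem usable for emergency shutdown even when the nodes and the main network are down.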
37
SKIF-AURORA Management Subsystem: Total monitoring and control
38
3D-torus Interconnect
39
3D-torus Interconnect
  Only the QCD-specific functionality is implemented by the Italian team
  Russian teams are to upgrade the network to a general-purpose interconnect (MPI 2.0)
    Due to appear in fall 2009
    Support and improvements in 2010–2012
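For readers unfamiliar with the topology: in a 3D torus every node is linked to its two neighbours along each of three axes, and the links wrap around at the edges. The sketch below shows the neighbour and hop-count arithmetic this implies. The 4 x 4 x 4 size is a hypothetical example; the slides do not give the torus dimensions.

```python
def neighbors(node, dims):
    """The six direct neighbours of a node in a 3D torus (with wrap-around).

    With fewer than 3 nodes along an axis the two neighbours on that axis
    coincide, so the list may then contain duplicates.
    """
    result = []
    for axis in range(3):
        for step in (-1, 1):
            coord = list(node)
            coord[axis] = (coord[axis] + step) % dims[axis]
            result.append(tuple(coord))
    return result


def hop_distance(a, b, dims):
    """Minimum number of links between two nodes, going the short way round each ring."""
    total = 0
    for x, y, size in zip(a, b, dims):
        direct = abs(x - y)
        total += min(direct, size - direct)
    return total


def diameter(dims):
    """Largest hop distance between any two nodes of the torus."""
    return sum(size // 2 for size in dims)


DIMS = (4, 4, 4)  # HYPOTHETICAL torus size, for illustration only

print(neighbors((0, 0, 0), DIMS))
print(hop_distance((0, 0, 0), (3, 3, 3), DIMS))
print(diameter(DIMS))
```

The wrap-around links are what keep the hop count low: the opposite corner of the example torus is three hops away rather than nine.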
40
3D-torus Interconnect. Current status
  Simple routing implemented on a prototype (SKIFino)
  Routing on a single-FPGA prototype is working
  MPI is based on the MPICH2 codebase (prototyped)
  MPICH2 self-test implemented
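The slide does not say which routing scheme the prototype uses. As an illustration of what a simple routing scheme on a torus can look like, the sketch below implements dimension-order routing, a common textbook approach that corrects one coordinate at a time. It is an assumption chosen for illustration, not the SKIFino algorithm, and the torus size is hypothetical.

```python
def step_toward(src, dst, size):
    """Direction (-1, 0 or +1) of the shorter way from src to dst on a ring of the given size."""
    if src == dst:
        return 0
    forward = (dst - src) % size
    backward = (src - dst) % size
    return 1 if forward <= backward else -1


def dimension_order_route(src, dst, dims):
    """List of nodes visited from src to dst, fixing X first, then Y, then Z."""
    current = list(src)
    path = [tuple(current)]
    for axis in range(len(dims)):
        while current[axis] != dst[axis]:
            direction = step_toward(current[axis], dst[axis], dims[axis])
            current[axis] = (current[axis] + direction) % dims[axis]
            path.append(tuple(current))
    return path


DIMS = (4, 4, 4)  # HYPOTHETICAL torus size, for illustration only

route = dimension_order_route((0, 0, 0), (3, 1, 0), DIMS)
print(route)
print(f"hops: {len(route) - 1}")
```

In the example the X coordinate is reached through the wrap-around link, so the route takes two hops. Schemes of this kind are deterministic and easy to implement in hardware, but they do not adapt to congestion or failed links.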
41
R&D Directions Using FPGA Resources
  Collective MPI operations using FPGA
  FPGA to facilitate support of PGAS languages (UPC, Titanium, etc.)
  FPGA+CPU hybrid computing
  Diagram labels: System Interconnect (3D-torus); Subsidiary Interconnect (InfiniBand); FPGA; CPU; standard part; non-standard part
42
Conclusions
43
Conclusions
The SKIF-AURORA project
  Is based on collaboration between international teams
  Harnesses shared expertise and results
  Aims to develop a family of top-level supercomputers with innovative techniques:
    Higher density of CPUs (flops per volume)
    Efficient water cooling system
    Efficient power supply system
    Scalable, powerful 3D-torus interconnect
    Most modern standard CPUs for computation and FPGAs for its acceleration
    Redundant, robust management subsystem
    Etc.
44
Conclusions
The collaboration between Italian and Russian teams
  Makes it possible to obtain world-class supercomputer technologies
  Provides leading positions in the supercomputer industry (at least in the near future) for all participants of the collaboration
  Makes all results available in reasonable time and with reasonable effort and resources
45
Thank you for your attention!