Presentation is loading. Please wait.

Presentation is loading. Please wait.

Two Types of Supercomputer developments Yutaka Ishikawa RIKEN AICS University of Tokyo 1 2014/09/03 Session.

Similar presentations

Presentation on theme: "Two Types of Supercomputer developments Yutaka Ishikawa RIKEN AICS University of Tokyo 1 2014/09/03 Session."— Presentation transcript:

1 Two Types of Supercomputer developments Yutaka Ishikawa RIKEN AICS University of Tokyo 1 2014/09/03 Session 2: Deployed Ecosystems and Roadmaps for the Future Smoky Mountains Computational Sciences and Engineering Conference

2 Supercomputers in Japan 2014/09/03 FLAGSHIP Machine K Computer PF Riken 9 Universities and National Laboratories HPCI (High Performance Computing Infrastructure) is formed from those machines, called leading machines Features:Single sign-on Shared storage (Distributed file system) As of Jun 2012  Each supercomputer center has one, two or more supercomputers.  Each supercomputer center replace their machines every 4.5 to 6 years. 2

3 Procurement Policies in Supercomputer Centers 2014/09/03 Flagship-Aligned Commercial Machine (FAC) –Acquiring a machine whose architecture is the same of the flagship machine. Complimentary Function Leading Machine (CFL-M, CFL-D) –Acquiring a machine whose architecture is different than the flagship machine, e.g. vector machine. –CFL-M: a commercial machine provided by a vendor –CFL-D: a new machine developed by both a vendor and supercomputer center. Upscale Commodity Cluster Machine (UCC) – Acquiring a large-scale commodity cluster Technology Path-Forward Machine (TPF) –Design and development of future advanced machine 3

4 Supercomputer Centers located at Japanese Universities Fiscal Year 201220132014201520162017201820192020202120222023 Hokkaido Tohoku Tsukuba Tokyo Tokyo Tech. Nagoya Kyoto Osaka Kyushu T2K Todai (140 TF) 50+ PF (FAC) 3MW 100+ PF (UCC + TPC) 4MW 100+ PF (UCC + TPC) 4MW Post T2K -- 30 PF (UCC + TPF) 4MW Post T2K -- 30 PF (UCC + TPF) 4MW (Manycore system) (700+ TF) HA-PACS (800 TF) NEC SX-9 + Exp5800 (31TF) 50-100 Pflops (FAC + UCC) 50-100 Pflops (FAC + UCC) Fujitsu FX10 (90.8TF, 31.8 TB/s), CX400(470.6TF, 55 TB/s) Fujitsu FX10 (1PFlops, 150TiB, 408 TB/s), Hitachi SR16000/M1 (54.9 TF, 10.9 TiB, 5.376 TB/s) Fujitsu M9000(3.8TF, 1TB/s) HX600(25.6TF, 6.6TB/s) FX1(30.7TF, 30 TB/s) Upgrade (3.6PF) 3MW -50 PF (TPF) 2MW 100 ~ 200 PF (FAC/TPF + UCC) 100 ~ 200 PF (FAC/TPF + UCC) 4MW Hitachi SR16000/M1 (172 TF, 22TB) Cloud System Hitachi BS2000 (44TF, 14TB) Hitachi SR16000/M1 (172 TF, 22TB) Cloud System Hitachi BS2000 (44TF, 14TB) 10+ PF (CFL-M/TPF + UCC) 1.5 MW 10+ PF (CFL-M/TPF + UCC) 1.5 MW 100 PF 2 MW (CFL-M/TPF+UCC) ~1PF,~1PB/s(CFL-M) ~2MW 30+PF, 30+PB/s (CFL-D) ~5.5MW(max) Tsubame 3.0 (20~30 PF, 2~6PB/s) 1.8MW (Max 3MW) Tsubame 3.0 (20~30 PF, 2~6PB/s) 1.8MW (Max 3MW) Tsubame 4.0 (100~200 PF, 20~40PB/s), 2.3~1.8MW (Max 3MW) Tsubame 2.5 (5.7 PF, 110+ TB, 1160 TB/s), 1.8MW Tsubame 2.0 (2.4PF, 97TB, 744 TB/s)1.8MW Cray XC30 (400TF) 600TF 6-10 PF (FAC/TPF + UCC) 1.8 MW 6-10 PF (FAC/TPF + UCC) 1.8 MW 100+ PF (FAC/TPF + UCC) 1.8-2.4 MW 100+ PF (FAC/TPF + UCC) 1.8-2.4 MW Cray XE6 (300TF, 92.6TB/s), GreenBlade 8000 (243TF, 61.5 TB/s) SX-8 + SX-9 (21.7 TF, 3.3 TB, 50.4 TB/s) 500+ TB/s (CFL-M) 1.2 MW 5+ PB/s (TPF) 1.8 MW Hitachi SR1600(25TF) Fujitsu FX10 ( 270TF)+FX10 相当 (180TF), CX400/GPGPU (766TF, 183 TB) Fujitsu FX10 ( 270TF)+FX10 相当 (180TF), CX400/GPGPU (766TF, 183 TB) 5-10 PF (FAC) Hitachi HA8000tc/ Xeon Phi (712TF, 242 TB), SR16000(8.2TF, 6 TB) 100-150 PF (FAC/TPF + UCC) 100-150 PF (FAC/TPF + UCC) 10-20 PF (UCC + TPF) 10-20 PF (UCC + TPF) 3MW 2.6MW 2.0MW 42014/09/03

5 Towards the Next Flagship Machine 2014/09/035 Post K Computer U. of Tsukuba U. of Tokyo PostT2K T2K PF 2008201020122014201620182020 U. of Tsukuba U. of Tokyo Kyoto U. RIKEN 9 Universities and National Laboratories PostT2K Arch: UPCC (Upscale Commodity Cluster Machine) Soft: TPF (Technology Path-Forward Machine) Manycore architecture O(10K) nodes PostT2K is a production system operated by both Tsukuba and Tokyo PostK Flagship Machine Manycore architecture O(100K-1M) nodes System software and parallel programming language in PostT2K will be employed in a part of Post K’s software environment Machine resources will be used to develop system software stack in PostK

6 PostT2K Hardware –Latest CPU technology is assumed –Specifying Node Performance, Memory Capacity/Bandwidth, Interconnect performance, File I/O performance, Storage Capacity Software –Specifying Operating System (Linux and McKernel) Programming Languages (Fortran, C/C++, Xcalable MP) Communication Library (MPI-3) Math Libraries File System Batch Job System 2014/09/036 Development Procurement McKernel –Light Weight Microkernel Xcalable MP –Parallel Programming Language MPICH with Low-level Communication Facility

7 Linux + McKernel Concerns –Reducing memory contention –Reducing data movement among cores –Providing new memory management –Providing fast communication –Parallelizing OS functions achieving less data movement New OS mechanisms and APIs are revolutionarily/evolutionally created and examined, and selected Linux with Light Weight Micro Kernel –IHK (Interface for Heterogeneous Kernel) Loading a kernel into cores Communication between Linux and the kernel –McKernel Customizable OS environment –E.g. environment without CPU scheduler (without timer interrupt) 2014/09/037 Core McKernel Linux Kernel Daemon Core User process Daemon Core Interface for Hetero. Kernels System call to LMK System call to Linux Running on both Xeon and Xeon-phi environments IHK and McKernel have been developed at the University of Tokyo and Riken with Hitachi, NEC, and Fujitsu

8 PostT2K OS Environment being developped Linux Kerne l+ McKernel –Several variations of McKernel are provided for applications –Linux Kernel resides, but an McKernel is selectively loaded for each application 2014/09/038 Linux kernel is resident App A on McKernel without CPU scheduler Is invoked Finish App C on McKernel with Segmentation is invoked Finish App B on McKernel with CPU scheduler Is invoked Finish App D on Linux Is invoked Finish

9 XcalableMP(XMP) What’s XcalableMP (XMP for short)? A PGAS programming model and language for distributed memory, proposed by XMP Spec WG XMP Spec WG is a special interest group to design and draft the specification of XcalableMP language. It is now organized under PC Cluster Consortium, Japan. Mainly active in Japan, but open for everybody. Project status (as of Nov. 2013) XMP Spec Version 1.2 is available at XMP site. new features: mixed OpenMP and OpenACC, libraries for collective communications. Reference implementation by U. Tsukuba and Riken AICS: Version 0.7 (C and Fortran90) is available for PC clusters, Cray XT and K computer. Source-to- Source compiler to code with the runtime on top of MPI and GasNet. 9 Language Features Directive-based language extensions for Fortran and C for PGAS model Global view programming with global-view distributed data structures for data parallelism SPMD execution model as MPI pragmas for data distribution of global array. Work mapping constructs to map works and iteration with affinity to data explicitly. Rich communication and sync directives such as “gmove” and “shadow”. Many concepts are inherited from HPF Co-array feature of CAF is adopted as a part of the language spec for local view programming (also defined in C). XMP provides a global view for data parallel program in PGAS model Code example

10 Roles of PC Cluster Consortium Development, Maintenance and Promotion 102014/09/03 Members: Univ. of Tsukuba, Univ. of Tokyo, Titech, AMD, Intel, Fujitsu, Hitachi, NEC, Cray, … IHK, McKernel, LLC, XMP Integration of other open sources, e.g., MPICH Distribution as open source Promotion PostT2K PostK Vendor Contribution Vendor Contribution Support PC cluster consortium was established in 2001. The original mission was to contribute to the PC cluster market through the development, maintenance, and promotion of cluster system software based on the SCore cluster system software and Omni OpenMP compiler, developed by the Real World Computing Partnership funded by the Japanese government from 1992 for 10 years.

11 International Collaboration between DOE and MEXT 2014/09/0311 PROJECT ARRANGEMENT UNDER THE IMPLEMENTING ARRANGEMENT BETWEEN THE MINISTRY OF EDUCATION, CULTURE, SPORTS, SCIENCE AND TECHNOLOGY OF JAPAN AND THE DEPARTMENT OF ENERGY OF THE UNITED STATES OF AMERICA CONCERNING COOPERATION IN RESEARCH AND DEVELOPMENT IN ENERGY AND RELATED FIELDS CONCERNING COMPUTER SCIENCE AND SOFTWARE RELATED TO CURRENT AND FUTURE HIGH PERFORMANCE COMPUTING FOR OPEN SCIENTIFIC RESEARCH Yoshio Kawaguchi (MEXT, Japan) and William Harrod(DOE, USA) Purpose: Work together where it is mutually beneficial to expand the HPC ecosystem and improve system capability –Each country will develop their own path for next generation platforms –Countries will collaborate where it is mutually beneficial Joint Activities –Pre-standardization interface coordination –Collection and publication of open data –Collaborative development of open source software –Evaluation and analysis of benchmarks and architectures –Standardization of mature technologies Kernel System Programming Interface Low-level Communication Layer Task and Thread Management to Support Massive Concurrency Power Management and Optimization Data Staging and Input/Output (I/O) Bottlenecks File System and I/O Management Improving System and Application Resilience to Chip Failures and other Faults Mini-Applications for Exascale Component-Based Performance Modelling Technical Areas of Cooperation

12 Concluding Remarks Ecosystem –Co-development of system software stack for a leading machine (PostT2K) and the flagship machine (PostK) –Beneficial to users Continuity of System Software and Programming Language from leading machines to the flagship machine –Contribution to open source community Shared and Enhanced by the community Schedule 2014/09/0312 PostT2K PostK Procurement Software Development Operation Basic DesignDesign and Implementation Manufacturing, Installation, and Tuning Operation

13 2014/09/0313

14 1.The overall theme of SMC2014 is "Integration of Computing and Data into Instruments of Science and Engineering". 2.Our session is focused on "Deployed Ecosystems and Roadmaps for the Future ". We will be focusing on current experiences and challenges in deploying large scale computing capabilities and our plans and expectations on how future systems will be made available to our scientists and engineers. 3.Consistent with this topic, we are inviting you share your vision for how the computational ecosystem may continue to develop to serve the scientific and engineering challenges of the future. 4.The three other panels in our conference will focus on "Strategic Science: Drivers of Future Innovation", "Future Architectures to Co- Design for Science", and "Math and Computer Science Challenges for Big Data, Analytics, and Scalable Applications". 2014/09/0314

Download ppt "Two Types of Supercomputer developments Yutaka Ishikawa RIKEN AICS University of Tokyo 1 2014/09/03 Session."

Similar presentations

Ads by Google