1
COLUMBIA SUPERCOMPUTER SGI Altix 3700 Architecture
2
COLUMBIA SUPERCOMPUTER NASA's most powerful supercomputer, named Columbia in honor of the Space Shuttle astronauts who died in the Columbia accident, was designed specifically to run high-fidelity simulations that help avoid further accidents.
3
Presentation Outline
Architectural overview of the SGI Altix 3000
Main approach of the Altix supercomputer design
Components of the hardware
Details of the processing units
How it works
More about Columbia
Current configuration of Columbia
Logical domains
Projects running on Columbia
Software applications
4
Columbia System Architectural Details
Based on SGI® NUMAflex™ architecture
20 SGI® Altix™ 3700 superclusters, each with 512 processors
Global shared memory across each set of 512 processors
10,240 Intel Itanium® 2 processors (current processor speed: 1.5 gigahertz; current cache: 6 megabytes)
1 terabyte of memory per 512 processors, 20 terabytes of total memory
Interconnect: SGI® NUMAlink™ and InfiniBand networks
Storage online: 440 terabytes of Fibre Channel RAID storage; archive storage capacity: 10 petabytes
Columbia is installed at NASA Ames Research Center near San Jose, Calif., and became fully operational on October 26, 2004. Columbia sustains 51.9 teraflop/s on the Linpack benchmark.
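A rough cross-check of these figures, assuming the Itanium 2's peak rate of four floating-point operations per cycle (two fused multiply-adds):

```latex
R_{\text{peak}} \approx 10{,}240 \times 1.5\,\text{GHz} \times 4\,\frac{\text{flop}}{\text{cycle}}
\approx 61.4\ \text{Tflop/s},
\qquad
\frac{51.9\ \text{Tflop/s}}{61.4\ \text{Tflop/s}} \approx 0.85
```

so the sustained Linpack result corresponds to roughly 85% of theoretical peak.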
5
SGI Altix 3000 Architecture
Designed specifically to solve memory bottlenecks
Processor-level parallelism
High-bandwidth interconnection between processors
Global shared memory
Up to 512 processors and up to 6 TB of shared memory per node
6
Understanding the Differences...
Common cluster structure: interconnection over a bus; memory control handled by software running on each processor; packet-based I/O communication.
Altix node structure: global shared memory; a separate memory controller device for every node; communication within a single memory domain.
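To make the contrast concrete, here is a minimal runnable sketch (illustrative C, not SGI's API): in a packet-based cluster, software must explicitly pack, ship, and unpack a remote value, while under global shared memory an ordinary load suffices and the hardware fetches the line.

```c
/* Illustrative sketch: message-style access vs. global shared memory. */
#include <stdio.h>
#include <string.h>

/* "Remote" node memory, plus a one-slot mailbox standing in for the
 * interconnect's packet transport. */
static double remote_mem[4] = { 1.0, 2.0, 3.0, 4.0 };
static double mailbox;

/* Cluster style: software on each side packs, sends, and unpacks data. */
static void send_packet(const void *src, size_t n) { memcpy(&mailbox, src, n); }
static void recv_packet(void *dst, size_t n)       { memcpy(dst, &mailbox, n); }

static double get_cluster(size_t i) {
    send_packet(&remote_mem[i], sizeof(double)); /* remote side replies */
    double local_copy;
    recv_packet(&local_copy, sizeof(double));    /* explicit data movement */
    return local_copy;
}

/* Shared-memory style: one address space, so a plain load suffices; the
 * memory controller fetches the (possibly remote) cache line itself. */
static double get_shared(const double *global, size_t i) { return global[i]; }

int main(void) {
    printf("cluster: %.1f  shared: %.1f\n",
           get_cluster(2), get_shared(remote_mem, 2));
    return 0;
}
```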
7
Example: Data Sharing Between Processor Local Memories. The factor X can be read as a ratio of bandwidth, or of the time spent exchanging data.
8
Shared Memory Architecture: fixed address-bus and data-bus bandwidths can be chosen, and no explicit data synchronization is needed.
9
Modular Design of SGI Altix
The NUMAflex design enables the CPU, memory, I/O, interconnect, graphics, and storage to be packaged into modular components, or "bricks":
I/O bricks: IX and PX
Storage: D-bricks
Interconnect: router bricks (R-bricks)
Processor: C-bricks
10
Basic Node: C-Brick
A memory brick is essentially a C-brick without processors, used as buffer memory for nearby C-bricks.
The intraconnection layer handles the processors' access to global shared memory through directory-based communication, and also manages memory spoofing for the processors; the cache-coherency protocols run here (a toy sketch follows below).
The interconnection interfaces handle communication between superclusters, which may include various numbers and types of bricks.
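A toy model of that directory-based scheme (illustrative only; the real NUMAflex protocol is considerably more elaborate): each memory line gets a directory entry recording its state and which nodes hold copies, so the hub knows exactly which caches to invalidate when one node writes.

```c
/* Toy directory-based coherence entry: one per memory line. */
#include <stdint.h>
#include <stdio.h>

enum line_state { UNCACHED, SHARED, EXCLUSIVE };

struct dir_entry {
    enum line_state state;
    uint64_t sharers;            /* bit i set => node i caches the line */
};

/* A node requests write access: the directory invalidates every other
 * sharer, then grants exclusive ownership. */
static void handle_write(struct dir_entry *e, int node) {
    uint64_t others = e->sharers & ~(1ULL << node);
    for (int i = 0; i < 64; i++)
        if (others & (1ULL << i))
            printf("invalidate copy held by node %d\n", i);
    e->sharers = 1ULL << node;   /* writer becomes the sole owner */
    e->state = EXCLUSIVE;
}

int main(void) {
    struct dir_entry line = { SHARED, 0x0B }; /* nodes 0, 1, 3 share it */
    handle_write(&line, 0);                   /* node 0 wants to write */
    return 0;
}
```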
11
Router Brick
Enables directory access to global shared memory for C-bricks over the intraconnection. Scalable hubs manage the intra- and interconnections, which makes data exchange possible between superclusters as well. Router bricks extend the address information of C-bricks up to 6 TB and provide virtual channels to processors that act like a high-bandwidth data bus between memory modules.
12
ALTIX SUPERCLUSTER
13
On each 512-processor node, the primary features and benefits are:
Low latency to memory (less than 1 microsecond), which reduces communication overhead
High memory bisection bandwidth: the first system (in November 2003) to exceed 1 terabyte/second on the STREAM benchmark (its triad kernel is sketched below)
Global shared memory and cache-coherency, which enable simpler and more efficient programming
Large shared memory (up to 6 TB), which allows large problems to remain resident
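For reference, STREAM measures bandwidth with simple vector kernels; its "triad" loop looks roughly like this (a generic sketch of the benchmark's core, not the tuned code run on the Altix):

```c
#include <stdlib.h>

#define N (1L << 24)   /* array length; illustrative, not Columbia's size */

/* STREAM "triad": a[i] = b[i] + q*c[i]. Reported bandwidth is the bytes
 * moved (three arrays per iteration) divided by the elapsed time. */
static void triad(double *a, const double *b, const double *c, double q) {
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < N; i++)
        a[i] = b[i] + q * c[i];
}

int main(void) {
    double *a = malloc(N * sizeof *a);
    double *b = malloc(N * sizeof *b);
    double *c = malloc(N * sizeof *c);
    for (long i = 0; i < N; i++) { b[i] = 1.0; c[i] = 2.0; }
    triad(a, b, c, 3.0);
    free(a); free(b); free(c);
    return 0;
}
```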
14
ALTIX SUPERCOMPUTER ARCHITECTURE
15
Details of... S-Hub Memory Access
In the SGI Altix architecture, memory is physically distributed around the machine. In this 4-CPU example system there are two "nodes", each of which has a separate memory controller managing half of the total physical memory (in the usual homogeneously populated configuration).
When a program runs and allocates memory, the default mode of operation on these systems is the "first touch" algorithm: memory is allocated in the physical memory closest to the processor running the thread that first references a particular memory location. So if all of the data is initialized by the master thread, all of it is placed on the node where the master thread is located.
NUMAflex intraconnects link that master node to the rest of the nodes in the supercluster. Scalable hubs (running the NUMAflex protocols) manage the passing of reference addresses for target data in the remote global shared memory, and the copying of local data to remote memory in the multiple-worker case. S-hubs also inform other nodes (C-bricks, IX-bricks, etc.) about the global data references of a common job. For supercluster interconnection, S-hubs additionally pass messages to R-bricks announcing the global data position and promoting router-brick nodes to interconnection master for a given piece of data.
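The practical upshot of first-touch placement is that data should be initialized by the same threads, under the same schedule, that will later compute on it. A minimal OpenMP sketch (standard OpenMP, nothing Altix-specific; the size is illustrative):

```c
#include <stdlib.h>

#define N 100000000L   /* illustrative array size */

int main(void) {
    double *x = malloc(N * sizeof *x);

    /* First touch: each page is placed on the node of the thread that
     * writes it first, so initialize in parallel with the same static
     * schedule the compute loop will use... */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < N; i++)
        x[i] = 0.0;

    /* ...then each thread computes on pages local to its own node. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < N; i++)
        x[i] += 1.0;

    free(x);
    return 0;
}
```

If the first loop ran only on the master thread, every page would land on the master's node and all other threads would pay remote-access latency on every reference.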
16
Details of... Processor-Level Parallelism
17
Details of... Instruction-Level Optimization
18
Details of...
19
Other Advantages of Itanium Processors
Special registers for branch prediction
Both 32-bit and 64-bit pointers: can manage separate memories of different sizes simultaneously
RSE (Register Stack Engine): saves and restores registers without software intervention
20
Architectural Summary
Exploits the advantages of shared memory
Processor-level parallelism
Tightly coupled units make up a "brick"
Modular, flexible design: systems are configured simply by using different numbers and types of bricks
Processing power can be increased by increasing the number of clusters
21
But...
High hardware costs
Tightly coupled units within bricks mean individual components can hardly be upgraded
Extra sub-layers for intra-communication
22
MORE ABOUT COLUMBIA
23
What is more?
System facts
Logical domains of Columbia
Computational projects running on Columbia
Software applications
Conclusion
24
Columbia System Facts
20 SGI® Altix™ 3700 superclusters
10,240 Intel Itanium® 2 processors
20 terabytes total memory
440 terabytes of Fibre Channel RAID storage
Archive storage capacity: 10 petabytes
Linux® based operating system
25
Logical Domains of Columbia
Columbia is partitioned into three domains to better meet the computational needs of various NASA missions:
SpaceOps-Exploration-Aero-Safety (called "SEAS")
Science
The 2048-CPU national leadership computing system (called "2048")
The machine "cfe1" serves the SEAS domain, "cfe2" serves Science, and "cfe3" serves the 2048.
27
ALTIX SUPERCOMPUTER ARCHITECTURE
28
Computational Projects Running on Columbia
Many exciting projects are being run on Columbia, from aeroelastic analysis to weather-climate modeling systems:
Aeronautics Research Mission Directorate projects
Exploration Systems Mission Directorate projects
Science Mission Directorate projects
Space Operations Mission Directorate projects
NASA Safety & Engineering Center projects
29
Aeronautics Research Projects The goal of the Aeronautics Research Mission Directorate is to pursue research and technology development that: protects air travelers and the public; protects the environment; increases mobility; enhances national security; and pioneers revolutionary aeronautical concepts for science and exploration. The objective of this mission directorate is to pioneer and validate high-value technologies that enable new exploration and discovery, and improve the quality of life through practical applications.
30
Exploration Systems Projects The Exploration Systems Mission Directorate is responsible for creating a constellation of new capabilities and supporting technologies that enable sustained and affordable human and robotic exploration. This mission directorate is also responsible for effective utilization of International Space Station facilities and other platforms for research that supports long-duration human exploration.
31
Science Mission Projects The Science Mission Directorate carries out research, flight, and robotics missions, and development of advanced technologies to expand our understanding of the Earth and the universe. Activities include: scientific exploration of the Earth, moon, Mars and beyond; exploration of the origins and evolution of the solar system and life within it; transferring the knowledge gained from Earth studies to the exploration of the solar system, and vice versa; and showcasing the amazing and unexpected discoveries we make every day to inspire the next generation of explorers.
32
Space Operations Projects The Space Operations Mission Directorate supports NASA's science, research, and exploration achievements by providing many critical enabling capabilities such as direct space flight operations, launches, and communications, as well as the operation of integrated systems in low-Earth orbit and beyond. These goals are accomplished through the following programs: the International Space Station program, the Space Shuttle program, and the Flight Support program.
33
NASA Safety and Engineering Center Projects The NASA Safety and Engineering Center (NESC) is a NASA initiative that will help ensure the safety and engineering excellence of NASA's programs and institutions. The objective of the NESC is to improve safety by performing various independent technical assessments, which include testing, analysis, and evaluation to determine appropriate preventive and corrective actions for recognized problems, trends, or issues within NASA programs.
34
NAS Software Applications
Cart3D
Cart3D is a high-fidelity inviscid analysis package for conceptual and preliminary aerodynamic design. It allows users to perform automated computational fluid dynamics analysis on complex geometry. Cart3D is currently playing an integral role in NASA's Return to Flight (RTF) effort. Simulations of tumbling debris from foam and other sources, generated using Cart3D, are being used to assess the threat that shedding such debris poses to various elements of the Space Shuttle Launch Vehicle.
Image above: This image shows an unsteady Cart3D simulation used to predict the trajectory of a piece of tumbling foam debris released during ascent. The colors represent surface pressure.
35
debris
The debris software package includes programs for computing debris trajectories relative to a vehicle in flight, for detecting possible debris impacts on any part of the flight vehicle, and for filtering, sorting, and managing very large databases of debris impacts. The debris code is being used to compute debris trajectories, which characterize the debris environment experienced by the Space Shuttle Launch Vehicle during ascent. Understanding this debris environment is critical to NASA's Return-to-Flight effort.
Image above: This image shows a number of debris trajectories computed by the debris code, which is being used to characterize the debris environment experienced by the Space Shuttle Launch Vehicle during ascent.
36
Estimating the Circulation and Climate of the Ocean (ECCO)
This application is used to conduct large-scale, high-resolution ocean modeling and analysis. Researchers from the NASA Advanced Supercomputing Division, JPL, and MIT have partnered to dramatically accelerate development of a global eddy-resolving ocean and sea-ice reanalysis. Estimates of time-evolving ocean and sea-ice circulations are obtained by constraining the MIT general circulation model with both satellite and in-situ observations such as sea level, sea-ice extent, and hydrographic profiles. Scientists use these realistic, full-ocean-depth circulation estimates to understand how ocean currents and sea-ice affect climate, to study air-sea exchanges, to improve seasonal and long-term climate predictions, and for many other applications.
Image above: Simulated near-surface current speed and sea-ice cover illustrate the tremendous complexity of the global-ocean and sea-ice circulations. Color scale, black to red to white, indicates current speed and ranges from 0 to 50 cm/s. Land masses are overlaid with NASA satellite imagery. White areas at the Poles depict land-ice and sea-ice.
37
The NASA Finite Volume General Circulation Model (fvGCM)
fvGCM is a global climate and weather prediction model traditionally used for long-term climate simulations at a coarse (approximately 100 km) horizontal resolution. The fvGCM code has been running on Columbia, producing real-time, high-resolution (approximately 25 km) weather forecasts focused on improving hurricane track and intensity forecasts. The code has been remarkably successful during the active 2004 Atlantic hurricane season, providing landfall forecasts with an accuracy of approximately 100 km up to five days in advance. This record marks an improvement in advanced warning beyond the typical two- to three-day lead time.
Image above: A snapshot of clouds from the fvGCM as hurricane Frances makes landfall on the Gulf coast of Florida and hurricane Ivan intensifies in the tropical Atlantic.
38
INS3D
This code solves the incompressible Navier-Stokes equations in three-dimensional generalized coordinates for both steady-state and time-varying flow. During long-duration space missions, astronauts must adapt to the altered circumstances of microgravity, and blood circulation undergoes significant adaptation during and after space flight. The blood flow through an anatomical Circle of Willis configuration is simulated using the INS3D code to provide a means of studying gravitational effects on the brain's circulation.
Image above: The brain uses the connective arterial tree, called the Circle of Willis, to distribute oxygenated blood throughout the brain mass. To assess the impact of changing gravitational forces on human space flight, it is essential to quantify the blood flow characteristics in the brain under varying gravity conditions.
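For reference, the incompressible Navier-Stokes equations that INS3D solves can be written in standard vector form as:

```latex
\nabla \cdot \mathbf{u} = 0,
\qquad
\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u} \cdot \nabla)\mathbf{u}
  = -\frac{1}{\rho}\nabla p + \nu \nabla^{2}\mathbf{u}
```

where $\mathbf{u}$ is the velocity field, $p$ the pressure, $\rho$ the (constant) density, and $\nu$ the kinematic viscosity.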
39
Overflow
A computational fluid dynamics program for solving complex flow problems. Overflow is widely used by NASA and industry for designing launch and re-entry vehicles, rotorcraft, ships, and commercial aircraft, among others. The Overflow code is being used to compute the flowfield around the Space Shuttle Launch Vehicle to study the air loads acting on the vehicle due to several design changes, and to study the potential impacts from any debris that might be shed during the ascent.
Image above: This image depicts the flowfield around the Space Shuttle Launch Vehicle traveling at Mach 2.46 and at an altitude of 66,000 feet. The surface of the vehicle is colored by the pressure coefficient, and the gray contours represent the density of the surrounding air.
40
The Parallel Ocean Program (POP)
POP is the oceanic component of the Community Climate System Model (CCSM), a fully coupled global climate model that enables accurate simulations of the Earth's past, present, and future climate states. The POP code was ported and optimized to scale almost linearly to 512 processors of Columbia. This test case will feature a North Atlantic Ocean model at 1/10th-degree resolution being simulated at about six years per day.
Image above: Surface velocity of the North Atlantic based on a simulation using the Parallel Ocean Program (POP) Version 1.4.3 with a 0.1-degree resolution.
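Ocean models such as POP typically reach this kind of scaling by decomposing the horizontal grid into blocks, one per processor. A minimal MPI sketch of such a 2D decomposition (illustrative only, not POP's actual code):

```c
/* Illustrative 2D domain decomposition over a Cartesian process grid. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int nprocs, rank;
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Let MPI pick a near-square process grid; periodic east-west,
     * since the ocean wraps around in longitude. */
    int dims[2] = {0, 0}, periods[2] = {1, 0}, coords[2];
    MPI_Dims_create(nprocs, 2, dims);

    MPI_Comm cart;
    MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 0, &cart);
    MPI_Cart_coords(cart, rank, 2, coords);

    printf("rank %d owns block (%d,%d) of a %dx%d grid\n",
           rank, coords[0], coords[1], dims[0], dims[1]);

    MPI_Finalize();
    return 0;
}
```

Each rank then holds only its block of the ocean grid and exchanges halo rows with its Cartesian neighbors, which is what lets the work scale out across processors.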
41
PHANTOM
PHANTOM is a three-dimensional, unsteady, all-speed flow code developed for turbomachinery applications. The code, written using the Generalized Equation Set, can be applied to both gases and liquids. Recently, the PHANTOM code has been used for analyses in the Flow Liner Crack Investigation, for example, to analyze the surface pressure on the Space Shuttle Main Engine's low-pressure fuel pump inducer operating in liquid hydrogen.
Image above: Plot of the surface pressure on the Space Shuttle Main Engine's low-pressure fuel pump inducer, operating in liquid hydrogen.
42
Conclusion
The Columbia supercomputer is making it possible for NASA to achieve breakthroughs in science and engineering for the agency's missions and Vision for Space Exploration. Columbia's highly advanced architecture will also be made available to a broader national science and engineering community.