Published by Aliya Goreham. Modified over 9 years ago.
1
Algorithm and Scaling (Issues) for Aerospace (CFD) Codes
Sukumar Chakravarthy
src@metacomptech.com
www.metacomptech.com
2
Scope of Presentation
- Range of aerospace CFD and related applications
- Hierarchy of simulation approaches
- Hierarchy of algorithmic approaches
- Algorithm and scalability issues and considerations
3
Presentation Approach & Goals
- A picture is worth a thousand words
- We will use ten thousand words and 1 picture == eleven thousand word-equivalents
- Catalog, serve as collective conscience
- Discuss relationship between application needs, algorithms, modeling approaches and HPC issues and possibilities
4
CFD++ Aerospace Applications
- External aerodynamics
- Propulsion integration
- Component integration
- Systems
- Cabin airflow
- FADEC
- Icing
- Fuel tank purge
- Thrust reverser
- Propulsion
- Nozzle design
- Jet noise
5
CFD++ Aerospace Applications
- Plumes
- Trajectory
- Aerodynamic coefficients
- Drag polar
- Dynamic derivatives
- Store separation
- Canopy separation
- Sabot separation
- Stage separation
- Pilot seat ejection
- Projectiles
- Spinning projectiles
6
CFD++ Aerospace Applications
- Synthetic jets
- Turbomachinery
- Blade design
- Blade cooling
- Pulsed detonation
- Flapping wings
- Flexible wings
- Entomopters
- Helicopters
- Propellers, rotors
- Parachutes
- Parachutists, sky-diving
7
CFD++ Aerospace Applications
- Spacecraft launch
- Reentry vehicles
- Rocket assisted landings (Earth, Mars, Venus)
- X-Prize vehicles
- Land speed record vehicles
- Bullets, artillery rounds
- Liquid fuel breakup
- Liquid fuel sloshing, feed
- Acceleration, deceleration effects
- Aeroacoustics
- Flow Structure Interaction (FSI)
8
What’s special about Aerospace CFD?
- Extremes of scales, operating conditions, physics and chemistry, speeds, application-specific needs (extraction of useful information)
- Nonlinearity is most often inherent
- It is not just the simulation itself that counts
- If no information output is required, there is no need to do the simulation
9
Hierarchy of problem classes
- Steady state/unsteady problems
- Small, medium and large scale problems
- Entire configurations as well as analysis of components
- Engineering analysis, scientific analysis, troubleshooting
- All speeds, atmospheric conditions, diverse fluids and their properties
10
Common Elements of Simulations
- Physics (nature)
- Math Model of Physics
- Numerical Model of Math Model
- Computational Model
- Human(s) in the loop
- Simulation Results
11
Common Underlying Physical Processes
- Convection:
- Production:
- Dissipation:
- Redistribution:
- Diffusion:
- Evolution:
12
Summary of some HPC issues
- Loading the problem, saving final results
- Checkpointing
- Computational vs. communications performance (scalability)
- Data extraction issues
- Robustness (10000-way parallel should be as robust as serial algorithm)
- Data-center issues (throughput, storage)
- Visualization, interaction with running case
13
Modeling Hierarchy
- Potential flow assumption
- Small-disturbance approaches
- Inviscid flows taken separately, and hybridized with boundary layer theory
- Reynolds/Favre-averaged N-S equations with phenomenological turbulence models
- LES and hybrid RANS-LES approaches
- Special equations and models
14
Mesh possibilities
- Surface mesh only (panel methods)
- Cartesian mesh, almost Cartesian mesh
- Structured mesh: hex (3D) and quad (2D)
- Unstructured: all cell types
- Hybrid structured and unstructured meshes, hex-core meshes
- Patched and overset meshes
- Moving (dynamic) meshes
- Flexible boundaries and meshes
15
“Extreme Grids”
- Aspect ratios of 10000 to 1 or more (boundary layer resolution with Y+ < 1)
- Mesh sizes of hundreds of millions of cells and more
- Extreme grid spacings present in mesh
16
Numerical approaches
- Explicit and implicit
- Fractional steps and factored schemes
- Finite volume, finite difference schemes
- Finite element schemes
- Spectral and spectral element schemes
- “Local” schemes and “global” schemes
17
Some HPC algorithmic challenges
- Making implicit schemes truly implicit in multi-CPU computations
- Ensuring that results are insensitive to the number of parallel processes used
- How to make the 10000-way parallel computation as robust as the serial algorithm
- How to make the 10000-way parallel computation converge as well, but in much less time
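One source of sensitivity to the number of processes is that floating-point addition is not associative, so a global sum assembled from per-process partial sums can change with the partitioning. The sketch below is only an illustration of that effect, not the approach used in CFD++: the function names are hypothetical, the "processes" are simulated by slicing a list, and exact rational arithmetic is used as one (slow) way to make the result independent of the partitioning.

```python
from fractions import Fraction


def chunk_bounds(n, nparts):
    """Boundaries of nparts contiguous chunks covering n items."""
    return [n * p // nparts for p in range(nparts + 1)]


def naive_sum(values):
    """Plain left-to-right floating-point accumulation."""
    total = 0.0
    for v in values:
        total += v
    return total


def naive_parallel_sum(values, nparts):
    """Each simulated process sums its chunk; partials are then combined."""
    b = chunk_bounds(len(values), nparts)
    partials = [naive_sum(values[b[p]:b[p + 1]]) for p in range(nparts)]
    return naive_sum(partials)


def exact_parallel_sum(values, nparts):
    """Same reduction pattern, but partial sums are kept as exact rationals.

    Exact arithmetic is associative, so the rounded final result cannot
    depend on how the data was split.
    """
    b = chunk_bounds(len(values), nparts)
    partials = [
        sum((Fraction(v) for v in values[b[p]:b[p + 1]]), Fraction(0))
        for p in range(nparts)
    ]
    return float(sum(partials, Fraction(0)))


# Hypothetical data chosen so that rounding is visible.
values = [1e16, 1.0, -1e16, 1.0]
for nparts in (1, 2, 4):
    print(nparts, naive_parallel_sum(values, nparts), exact_parallel_sum(values, nparts))
```

With this data the naive result differs between one and two partitions, while the exact variant returns the same value for every partition count. Fixed-order reductions are a cheaper alternative when only reproducibility, not exactness, is required.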
18
Adaptive meshes
- Adaptive elements (cells)
- Adaptive grids
- H-adaptation, P-adaptation, H-P-adaptation
19
Classification of Algorithms
- Low information density schemes: expand stencil to improve accuracy
- High information density schemes: expand information content per cell (e.g. use values and derivatives, or values at multiple collocation points)
- Homogeneity (or lack of it) of discretization and solution methodology
- Homogeneity (or lack of it) of underlying physics models
20
The usual scalability considerations
- Computation and communication
- Computation versus communication
- Overlap of computation and communication
- Bulk of communication for local schemes can follow a one-to-a-few connectivity pattern
- Global operations: global reductions often determine scalability
21
Recent Scalability Improvements
- CFD++ now scales well to very large numbers of cores
- The scalability improvements are universal: they apply to all modern HPC platforms from all vendors
- Tests have shown effective performance all the way up to 4096 cores
- Even relatively small grids (e.g. 16 million cells) scale well to 2048 or even 4096 cores, depending on computer and type of case run
- Goal: to demonstrate similar performance on 10000 to 40000 cores
Figure panels: Ex 1: 33M cells, Computer 1, Case 1; Ex 2: 16M cells, Computer 2, Case 2
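To put the grid sizes and core counts quoted on this slide in perspective, the arithmetic below computes the number of cells each core owns. The strong-scaling efficiency helper is a standard definition added for illustration; the slide gives no timings, so the function is defined but not applied to any measured data, and both function names are hypothetical.

```python
def cells_per_core(cells, cores):
    """Average number of grid cells owned by each core."""
    return cells / cores


def strong_scaling_efficiency(ref_cores, ref_time, cores, time):
    """Measured speedup relative to a reference run, divided by the ideal speedup.

    A value of 1.0 means perfect strong scaling from ref_cores to cores.
    """
    speedup = ref_time / time
    ideal = cores / ref_cores
    return speedup / ideal


# Grid sizes and core counts taken from the slide.
for cells, cores in [(16e6, 2048), (16e6, 4096), (33e6, 4096)]:
    print(f"{cells:.0f} cells on {cores} cores: "
          f"{cells_per_core(cells, cores):.1f} cells per core")
```

At 4096 cores the 16-million-cell case leaves only about four thousand cells per core, which is why good scaling at that size is notable.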
22
Some Influences on Scalability
- Effect of physics: increased sophistication means more computation, often more scalability
- Effect of numerics: increased accuracy means more computation, and more communication, often more scalability
- Effect of grid: more grid means more computation and less communication for “local” algorithms
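The grid effect for local schemes can be illustrated with a surface-to-volume estimate. This is a rough sketch under assumptions that are not in the slides: each process owns a cubic block of hexahedral cells, exchanges a one-cell-deep halo across all six faces, and the work per cell is uniform. Real unstructured partitions will differ.

```python
def halo_to_interior_ratio(total_cells, nprocs):
    """Ratio of halo cells (communication) to owned cells (computation).

    Assumes a cubic block per process with a one-cell-deep halo on six faces.
    For a block of side n the ratio is 6 * n**2 / n**3 = 6 / n.
    """
    owned = total_cells / nprocs
    side = owned ** (1.0 / 3.0)
    halo = 6.0 * side ** 2
    return halo / owned


# Same core count, growing grid: relative communication shrinks.
for cells in (16e6, 33e6, 330e6):
    print(f"{cells:.0f} cells on 4096 cores: "
          f"halo/owned = {halo_to_interior_ratio(cells, 4096):.3f}")
```

Under these assumptions the communication volume grows with the block surface while the computation grows with the block volume, so larger grids on a fixed number of cores spend relatively less time communicating.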
23
Additional thoughts on Parallel Processing
- Two ways of using multiple compute engines: parallel computations and pipelined computations
- Pipelined algorithms have not been exploited much at the HPC level
- Process-level and thread-level parallelism are beginning to be combined (e.g. to exploit GPGPUs)
24
Load balancing issues
- Structured vs. unstructured grids (usually solved by weighted domain decomposition)
- Adaptive algorithms and adaptive meshes
- Different physics in different regions
- Moving meshes and overset meshes
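The idea behind weighting a decomposition can be sketched with a simple greedy assignment: each block (or zone) carries a cost weight reflecting, for example, its cell count or the physics solved in it, and blocks are handed out heaviest-first to whichever process currently has the least work. This is a minimal illustration with hypothetical names and weights; it ignores communication between blocks, which a production graph partitioner would also account for.

```python
import heapq


def balance(weights, nparts):
    """Greedy weighted assignment of blocks to processes.

    weights: per-block cost estimates (hypothetical units of work).
    Returns (owner, loads): the process owning each block and the
    total weight assigned to each process.
    """
    # Min-heap of (current load, process id): pop gives the least-loaded process.
    heap = [(0.0, p) for p in range(nparts)]
    heapq.heapify(heap)
    owner = [None] * len(weights)
    # Place the heaviest blocks first so small ones can fill the gaps.
    for idx in sorted(range(len(weights)), key=lambda i: -weights[i]):
        load, p = heapq.heappop(heap)
        owner[idx] = p
        heapq.heappush(heap, (load + weights[idx], p))
    loads = [0.0] * nparts
    for idx, p in enumerate(owner):
        loads[p] += weights[idx]
    return owner, loads


# Six blocks with uneven cost, split across two processes.
owner, loads = balance([5, 4, 3, 3, 2, 1], 2)
print(owner, loads)
```

When mesh adaptation or moving meshes change the weights during a run, the same assignment would have to be repeated, which is part of what makes those cases harder to balance.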
25
Optimization considerations
- Parallel algorithms for optimization
- How to use large numbers of processors
- E.g. do many cases in parallel
- Pre-compute cases matrix, sensitivity, etc. and then train neural networks or tabulate sensitivity before applying optimization procedure
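Running many independent cases in parallel and tabulating the results before optimizing can be sketched as follows. Everything specific here is an assumption made for illustration: `run_case` is a placeholder that returns a toy number rather than a flow solution, the Mach and angle-of-attack values are arbitrary, and a thread pool stands in for whatever job scheduler would launch the real solver runs.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def run_case(mach, alpha_deg):
    """Placeholder for one independent solver run (hypothetical).

    The returned 'response' is a toy expression, not physics; a real
    workflow would launch the solver and extract a force coefficient.
    """
    return {"mach": mach, "alpha_deg": alpha_deg, "response": mach * alpha_deg}


def run_case_matrix(machs, alphas, max_workers=4):
    """Run every (Mach, alpha) combination concurrently and tabulate results."""
    cases = list(product(machs, alphas))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda case: run_case(*case), cases))
    # Lookup table keyed by the case parameters, ready for a later
    # optimization or surrogate-fitting step.
    return {case: result["response"] for case, result in zip(cases, results)}


table = run_case_matrix([0.6, 0.8], [0.0, 2.0, 4.0])
print(len(table), "cases tabulated")
```

Because the cases do not communicate with each other, this kind of workload uses large processor counts without the scalability limits of a single large run.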
26
Multi-physics considerations
- Communications between non-homogeneous simulation tools
- Communications between diverse hardware platforms
- Tight coupling vs. loose coupling considerations
27
Need for Parallel I/O and File systems
- Very large scale problems
- Very large number of processors
- Initial load and final save + intermediate data output
- Asymmetric data extraction needs
28
Typical “post-processing” needs
- Global information (forces and moments, lift, drag, torque)
- Semi-global information (forces and moments along wing span, along fuselage)
- Reduced subsets: iso-surfaces, surface data, cut-planes
- Time-averages versus instantaneous values
- In-situ “post”-processing can be very useful
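Two of these needs, a global force and its time average, can be computed in situ without storing the full solution history. The sketch below is illustrative only: the face data layout and names are hypothetical, only the pressure contribution to the force is included, face normals are assumed to point out of the body into the fluid, and time steps are assumed uniform.

```python
def pressure_force(faces):
    """Pressure force on a body from its surface faces.

    faces: iterable of (pressure, area, (nx, ny, nz)) with unit normals
    pointing from the body into the fluid, so F = -sum(p * A * n).
    In a parallel run each process would sum its own faces and the
    partial results would be combined with a global reduction.
    """
    fx = fy = fz = 0.0
    for p, area, (nx, ny, nz) in faces:
        fx -= p * area * nx
        fy -= p * area * ny
        fz -= p * area * nz
    return (fx, fy, fz)


class RunningAverage:
    """Time average updated one sample at a time (uniform time steps)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


# Hypothetical instantaneous axial-force samples from successive time steps.
avg = RunningAverage()
for sample in [1.0, 2.0, 3.0, 6.0]:
    avg.update(sample)
print(avg.mean)
```

Only a few numbers per time step leave the solver in this approach, in contrast to writing full solution files and averaging afterwards.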
29
Single and Distributed File Parallel I/O
- Parallel I/O (PIO) can be accomplished in two ways
- In Single-File mode, PIO reads and writes from the current full-mesh/full-solution files
- In Distributed-File mode, PIO reads and writes from a set of files (e.g. placed in subdirectories) associated with each parallel process
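The difference between the two modes can be sketched with ordinary files. This is not the CFD++ file format or API: the directory and file names are invented, the data is a flat array of doubles, and the parallel processes are simulated by a loop. In single-file mode each process needs to know the byte offset of its slice; in distributed-file mode each process simply owns a file.

```python
import os
import struct
from itertools import accumulate

ITEM_SIZE = 8  # bytes per double


def byte_offsets(counts):
    """Starting byte of each process's slice in the shared file (prefix sum)."""
    starts = [0] + list(accumulate(counts))[:-1]
    return [s * ITEM_SIZE for s in starts]


def write_single_file(path, chunks):
    """Single-file mode: every process writes its slice at its own offset."""
    counts = [len(c) for c in chunks]
    offsets = byte_offsets(counts)
    with open(path, "wb") as f:
        f.truncate(sum(counts) * ITEM_SIZE)  # pre-size the shared file
    for rank, chunk in enumerate(chunks):  # one iteration per simulated process
        with open(path, "r+b") as f:
            f.seek(offsets[rank])
            f.write(struct.pack(f"<{len(chunk)}d", *chunk))


def write_distributed_files(root, chunks):
    """Distributed-file mode: one file per process, each in its own subdirectory."""
    paths = []
    for rank, chunk in enumerate(chunks):
        subdir = os.path.join(root, f"rank{rank:05d}")  # hypothetical naming
        os.makedirs(subdir, exist_ok=True)
        path = os.path.join(subdir, "solution.bin")
        with open(path, "wb") as f:
            f.write(struct.pack(f"<{len(chunk)}d", *chunk))
        paths.append(path)
    return paths
```

The single file is directly usable by serial tools and by runs with a different process count, while the distributed files avoid contention on one file but tie the data to the decomposition that wrote it.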
30
Interactive massively parallel computing
- Steady state versus transient (unsteady) computations
- Links with front-end and graphical processing
- Even post-processing of large scale problems may require substantial parallel computing resources
- One should not just focus on the “batch” computing model
31
Some elements of the balancing act
- Computation
- Communication
- Memory requirements
- I/O requirements
- Accuracy requirements
- Robustness requirements
- In-situ solution processing requirements
32
Bandwidths to consider
- Number of cores vs. number of I/O channels
- Memory bandwidth from core to memory
- Memory access conflicts
33
Some old ideas revisited
- Paying more attention to connectivity architecture
- Minimization of hops
- Domain decomposition that minimizes traffic between switches
- How many switches or hops (groups of nodes), how many nodes, how many processors in a node, how many cores per processor
34
Final thoughts
- The challenge of producing codes that work in the user’s hands and computing facilities
- Ease of use
- Scalability and effectiveness vs. just scalability
- Resource maximization versus minimization
- What can be done with less
- What can be done with more
- What more can be done with less
Thank you