Algorithm and Scaling (Issues) for Aerospace (CFD) Codes Sukumar Chakravarthy 1.

Algorithm and Scaling (Issues) for Aerospace (CFD) Codes Sukumar Chakravarthy src@metacomptech.com www.metacomptech.com

Scope of Presentation  Range of aerospace CFD and related applications  Hierarchy of simulation approaches  Hierarchy of algorithmic approaches  Algorithm and scalability issues and considerations

Presentation Approach & Goals  A picture is worth a thousand words  We will use ten thousand words and 1 picture == eleven thousand word-equivalents  Catalog, serve as collective conscience  Discuss relationship between application needs, algorithms, modeling approaches and HPC issues and possibilities

CFD++ Aerospace Applications  External aerodynamics  Propulsion integration  Component integration  Systems  Cabin airflow  FADEC  Icing  Fuel tank purge  Thrust reverser  Propulsion  Nozzle design  Jet noise

CFD++ Aerospace Applications  Plumes  Trajectory  Aerodynamic coefficients  Drag polar  Dynamic derivatives  Store separation  Canopy separation  Sabot separation  Stage separation  Pilot seat ejection  Projectiles  Spinning projectiles

CFD++ Aerospace Applications  Synthetic jets  Turbomachinery  Blade design  Blade cooling  Pulsed detonation  Flapping wings  Flexible wings  Entomopters  Helicopters  Propellers, rotors  Parachutes  Parachutists, sky-diving

CFD++ Aerospace Applications  Spacecraft launch  Reentry vehicles  Rocket assisted landings (Earth, Mars, Venus)  X-Prize vehicles  Land speed record vehicles  Bullets, artillery rounds  Liquid fuel breakup  Liquid fuel sloshing, feed  Acceleration, deceleration effects  Aeroacoustics  Flow Structure Interaction (FSI)

What’s special about Aerospace CFD?  Extremes of scales, operating conditions, physics and chemistry, speeds, application-specific needs (extraction of useful information)  Nonlinearity is most often inherent  It is not just the simulation itself that counts  If there is no information output required, no need to do the simulation

Hierarchy of problem classes  Steady state/unsteady problems  Small, medium and large scale problems  Entire configurations as well as analysis of components  Engineering analysis, scientific analysis, trouble shooting  All speeds, atmospheric conditions, diverse fluids and their properties

Common Elements of Simulations  Physics (nature)  Math Model of Physics  Numerical Model of Math Model  Computational Model  Human(s) in the loop  Simulation Results

Common Underlying Physical Processes  Convection  Production  Dissipation  Redistribution  Diffusion  Evolution

Summary of some HPC issues  Loading the problem, saving final results  Checkpointing  Computational vs. communications performance (scalability)  Data extraction issues  Robustness (10000-way parallel should be as robust as serial algorithm)  Data-center issues (throughput, storage)  Visualization, interaction with running case

Modeling Hierarchy  Potential flow assumption  Small-disturbance approaches  Inviscid-flows taken separately, and hybridized with boundary layer theory  Reynolds/Favre-averaged N-S equations with phenomenological turbulence models  LES and hybrid RANS-LES approaches  Special equations and models

Mesh possibilities  Surface mesh only (panel methods)  Cartesian mesh, almost Cartesian mesh  Structured mesh – hex (3D) & quad (2D)  Unstructured – all cell types  Hybrid structured and unstructured meshes, hex-core meshes  Patched and overset meshes  Moving (dynamic) meshes  Flexible boundaries and meshes

“Extreme Grids”  Aspect ratios of 10000 to 1 or more (boundary layer resolution with Y+ < 1)  Mesh sizes of hundreds of millions of cells and more  Extreme grid spacings present in mesh
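A rough sense of where such aspect ratios come from can be had from a flat-plate estimate of the first-cell height needed for a target Y+. The sketch below is an illustration only: the skin-friction correlation (Cf = 0.026 / Re^(1/7)), the function name, and the sample flow conditions are assumptions, not values from the presentation.

```python
import math

def first_cell_height(y_plus, u_inf, rho, mu, length):
    """Estimate wall-normal spacing for a target y+ (flat-plate approximation).

    Assumes the empirical turbulent correlation Cf = 0.026 / Re**(1/7);
    other correlations give somewhat different values.
    """
    re = rho * u_inf * length / mu          # Reynolds number based on length
    cf = 0.026 / re ** (1.0 / 7.0)          # assumed skin-friction correlation
    tau_w = 0.5 * cf * rho * u_inf ** 2     # wall shear stress
    u_tau = math.sqrt(tau_w / rho)          # friction velocity
    return y_plus * mu / (rho * u_tau)      # height giving the requested y+

# Hypothetical conditions: sea-level air, 250 m/s, 5 m reference length.
h = first_cell_height(y_plus=1.0, u_inf=250.0, rho=1.225, mu=1.8e-5, length=5.0)
streamwise_spacing = 0.02                   # hypothetical 2 cm surface spacing
print(f"first cell height ~ {h:.2e} m, aspect ratio ~ {streamwise_spacing / h:.0f}")
```

With these assumed inputs the first cell is a few micrometres thick, so even a modest centimetre-scale surface spacing gives an aspect ratio of roughly ten thousand to one.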

Numerical approaches  Explicit and implicit  Fractional steps and factored schemes  Finite volume, finite difference schemes  Finite element schemes  Spectral and spectral element schemes  “Local” schemes and “global” schemes

Some HPC algorithmic challenges  Challenges of keeping implicit schemes truly implicit in multi-CPU computations  Ensure insensitivity of results to variations in number of parallel processes used  How to make the 10000-way parallel computation as robust as the serial algorithm  How to make the 10000-way parallel computation converge as well but in much less time
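One concrete reason results can depend on the number of processes is that floating-point addition is not associative, so a reduction grouped per partition can round differently from the serial sum. The sketch below demonstrates this with a deliberately ill-conditioned list; all function names and data are hypothetical, and the exact-rational variant is shown only to illustrate partition independence, not as a practical production technique.

```python
from fractions import Fraction

def running_sum(values):
    """Plain left-to-right floating-point accumulation."""
    total = 0.0
    for v in values:
        total += v
    return total

def split(values, nparts):
    """Split a list into nparts contiguous chunks."""
    n = len(values)
    bounds = [n * k // nparts for k in range(nparts + 1)]
    return [values[bounds[k]:bounds[k + 1]] for k in range(nparts)]

def naive_parallel_sum(values, nparts):
    """Per-partition partial sums, then a sum of the partials."""
    return running_sum([running_sum(chunk) for chunk in split(values, nparts)])

def exact_parallel_sum(values, nparts):
    """Same grouping, but partials are exact rationals; round once at the end."""
    partials = [sum((Fraction(v) for v in chunk), Fraction(0))
                for chunk in split(values, nparts)]
    return float(sum(partials, Fraction(0)))

values = [1.0e16, 1.0, -1.0e16, 1.0]        # hypothetical residual contributions
for nparts in (1, 2, 4):
    print(nparts, naive_parallel_sum(values, nparts), exact_parallel_sum(values, nparts))
```

The naive sum changes with the partition count, while the exactly accumulated sum does not. Exact arithmetic is expensive, so this only illustrates the property one would want from a reproducible reduction.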

Adaptive meshes  Adaptive elements (cells)  Adaptive grids  H-adaptation, P-adaptation, H-P-adaptation

Classification of Algorithms  Low information density schemes – expand stencil to improve accuracy  High information density schemes – expand information content per cell (e.g. use values and derivatives, or values at multiple collocation points)  Homogeneity (or lack of) of discretization and solution methodology  Homogeneity (or lack of) of underlying physics models

The usual scalability considerations  Computation and communication  Computation versus communication  Overlap of computation and communication  Bulk of communication for local schemes can follow pattern of one to a few connectivity  Global operations – global reductions often determine scalability
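These considerations can be made concrete with a toy cost model for one iteration of a "local" scheme: compute time proportional to cells per process, neighbor (halo) exchange proportional to the subdomain surface, and a global reduction that grows with the logarithm of the process count. The model form and every coefficient below are hypothetical assumptions chosen for illustration; they are not measurements of any code.

```python
import math

def modeled_step_time(total_cells, nranks,
                      t_cell=1.0e-6, t_halo=2.0e-6, t_reduce=2.0e-5):
    """Toy per-iteration time: compute + halo exchange + global reduction."""
    cells_per_rank = total_cells / nranks
    compute = t_cell * cells_per_rank
    if nranks == 1:
        return compute                      # no communication in the serial case
    # Halo cost scales with the surface of a roughly cubic subdomain.
    halo = t_halo * 6.0 * cells_per_rank ** (2.0 / 3.0)
    # Tree-style reduction cost grows with log2 of the process count.
    reduction = t_reduce * math.log2(nranks)
    return compute + halo + reduction

def parallel_efficiency(total_cells, nranks):
    """Strong-scaling efficiency T(1) / (p * T(p)) under the toy model."""
    serial = modeled_step_time(total_cells, 1)
    return serial / (nranks * modeled_step_time(total_cells, nranks))

for p in (1, 64, 512, 4096):
    print(p, round(parallel_efficiency(16_000_000, p), 3))
```

In this model the halo term shrinks as processes are added, but the reduction term does not, which is one way to see why global operations tend to set the scalability limit.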

Recent Scalability Improvements  CFD++ now scales well to very large numbers of cores  The scalability improvements are universal – they apply to all modern HPC platforms from all vendors  Tests have shown effective performance all the way up to 4096 cores  Even relatively small grids (e.g. 16 million cells) scale well to 2048 or even 4096 cores, depending on computer and type of case run  Goal – to demonstrate similar performance on 10000 to 40000 cores  Ex 1: 33M cells, Computer 1, Case 1  Ex 2: 16M cells, Computer 2, Case 2

Some Influences on Scalability  Effect of physics – increased sophistication means more computation, often more scalability  Effect of numerics – increased accuracy means more computation, and more communication, often more scalability  Effect of grid – more grid means more computation and less communication for “local” algorithms
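The grid effect can be illustrated with a surface-to-volume estimate: if each process owns a roughly cubic block of cells, the cells it must exchange lie on the block's six faces, so the communication share falls as the block grows. The sketch assumes idealized cubic subdomains and a one-cell-deep halo, which real partitions of unstructured meshes only approximate.

```python
def halo_to_interior_ratio(total_cells, nranks):
    """Ratio of face (halo) cells to owned cells for an idealized cubic block."""
    cells_per_rank = total_cells / nranks
    side = cells_per_rank ** (1.0 / 3.0)    # cells along one edge of the block
    return 6.0 * side ** 2 / cells_per_rank # six faces, one cell deep

for cells in (16_000_000, 33_000_000, 300_000_000):
    print(cells, round(halo_to_interior_ratio(cells, 4096), 3))
```

At a fixed process count the ratio drops as the mesh grows, and at a fixed mesh size it rises as processes are added.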

Additional thoughts on Parallel Processing  Two ways of using multiple compute engines  Parallel computations  Pipelined computations  Pipelined algorithms have not been exploited too much at the HPC level  Process level and thread level parallelism beginning to be combined (e.g. to exploit GPGPUs)

Load balancing issues  Structured vs. unstructured grids (usually solved by weighted domain decomposition)  Adaptive algorithms and adaptive meshes  Different physics in different regions  Moving meshes and overset meshes
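The idea behind weighted decomposition can be sketched with a greedy assignment: give each block of cells a cost weight (for example, higher where extra physics is active) and repeatedly hand the heaviest remaining block to the least-loaded process. This is a simplified stand-in, since it ignores connectivity and communication volume, which graph-based partitioners also take into account. The weights and names are hypothetical.

```python
import heapq

def greedy_weighted_partition(weights, nparts):
    """Assign weighted blocks to nparts ranks, heaviest block first."""
    heap = [(0.0, rank) for rank in range(nparts)]   # (current load, rank)
    heapq.heapify(heap)
    owner = [None] * len(weights)
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    for idx in order:
        load, rank = heapq.heappop(heap)             # least-loaded rank so far
        owner[idx] = rank
        heapq.heappush(heap, (load + weights[idx], rank))
    loads = [0.0] * nparts
    for idx, rank in enumerate(owner):
        loads[rank] += weights[idx]
    return owner, loads

# Hypothetical per-block costs (e.g. cell count times a physics cost factor).
weights = [5.0, 3.0, 8.0, 2.0, 7.0, 1.0, 4.0, 6.0]
owner, loads = greedy_weighted_partition(weights, 3)
imbalance = max(loads) / (sum(loads) / len(loads))
print(owner, loads, round(imbalance, 3))
```

The imbalance factor (maximum load over mean load) is the quantity a load balancer tries to keep close to one. With adaptive or moving meshes the weights change during the run, so the assignment has to be revisited.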

Optimization considerations  Parallel algorithms for optimization  How to use large numbers of processors  E.g. do many cases in parallel  Pre-compute case matrix, sensitivity, etc. and then train neural networks or tabulate sensitivity before applying optimization procedure

Multi-physics considerations  Communications between non-homogeneous simulation tools  Communications between diverse hardware platforms  Tight coupling vs. loose coupling considerations

Need for Parallel I/O and File systems  Very large scale problems  Very large number of processors  Initial load and final save + intermediate data output  Asymmetric data extraction needs

Typical “post-processing” needs  Global information (forces and moments, lift, drag, torque)  Semi-global information (forces and moments along wing span, along fuselage)  Reduced subsets – iso-surfaces, surface data, cut-planes  Time-averages versus instantaneous values  In-situ “post”-processing can be very useful

Single and Distributed File Parallel I/O  Parallel I/O (PIO) can be accomplished in two ways  In Single-File mode, PIO reads from and writes to the current full-mesh/full-solution files  In Distributed-File mode, PIO reads from and writes to a set of files (e.g. placed in subdirectories) associated with each parallel process
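The bookkeeping difference between the two modes can be sketched as follows. In a single shared file, each process needs a byte offset, typically an exclusive prefix sum of the cell counts owned by lower-numbered processes; in distributed mode, each process only needs its own path. The record size, header size, and directory naming below are hypothetical and do not describe the actual CFD++ file layout.

```python
from itertools import accumulate

def single_file_offsets(cells_per_rank, bytes_per_cell, header_bytes=0):
    """Byte offset at which each rank starts writing in one shared file."""
    # Exclusive prefix sum: rank r starts after all cells of ranks 0..r-1.
    starts = list(accumulate(cells_per_rank, initial=0))[:-1]
    return [header_bytes + s * bytes_per_cell for s in starts]

def distributed_file_paths(nranks, root="case", name="solution.bin"):
    """One file per rank, each in its own subdirectory (hypothetical layout)."""
    return [f"{root}/rank{r:05d}/{name}" for r in range(nranks)]

print(single_file_offsets([3, 5, 2], bytes_per_cell=8, header_bytes=16))
print(distributed_file_paths(2))
```

Single-file mode keeps the data layout independent of the process count, at the price of coordinated access to one file; distributed mode avoids that coordination but ties the files to a particular decomposition.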

Interactive massively parallel computing  Steady state versus Transient (unsteady) computations  Links with front-end and graphical processing  Even post processing of large scale problems may require substantial parallel computing resources  One should not just focus on the “batch” computing model

Some elements of the balancing act  Computation  Communication  Memory requirements  I/O requirements  Accuracy requirements  Robustness requirements  In-situ solution processing requirements

Bandwidths to consider  Number of cores vs. number of I/O channels  Memory bandwidth from core to memory  Memory access conflicts

Some old ideas revisited  Paying more attention to connectivity architecture  Minimization of hops  Domain decomposition that minimizes traffic between switches  How many switches or hops (groups of nodes), how many nodes, how many processors in a node, how many cores per processor

Final thoughts  The challenge of producing codes that work in the user’s hands and computing facilities  Ease of use  Scalability and effectiveness vs. just scalability  Resource maximization versus minimization  What can be done with less  What can be done with more  What more can be done with less  Thank you
