
1 Scalable System for Large Unstructured Mesh Simulation Miguel A. Pasenau, Pooyan Dadvand, Jordi Cotela, Abel Coll and Eugenio Oñate

2 29th Nov 2012 / 2 Overview Introduction Preparation and Simulation – More Efficient Partitioning – Parallel Element Splitting Post Processing – Results Cache – Merging Many Partitions – Memory usage – Off-screen mode Conclusions, Future lines and Acknowledgements

3 29th Nov 2012 / 3 Overview Introduction Preparation and Simulation – More Efficient Partitioning – Parallel Element Splitting Post Processing – Results Cache – Merging Many Partitions – Memory usage – Off-screen mode Conclusions, Future lines and Acknowledgements

4 29th Nov 2012 / 4 Introduction Education: Master's in Numerical Methods, training courses, seminars, etc. Publishers: magazines, books, etc. Research: PhDs, congresses, projects, etc. One of the International Centers of Excellence on Simulation-Based Engineering and Sciences [Glotzer et al., WTEC Panel Report on International Assessment of Research and Development in Simulation Based Engineering and Science. World Technology Evaluation Center (wtec.org), 2009].

5 29th Nov 2012 / 5 Introduction Simulation: structures

6 29th Nov 2012 / 6 Introduction CFD: Computational Fluid Dynamics

7 29th Nov 2012 / 7 Introduction Geomechanics Industrial forming processes Electromagnetism Acoustics Bio-medical engineering Coupled problems Earth sciences

8 29th Nov 2012 / 8 Introduction [Diagram: GiD workflow. Geometry description, provided by CAD or created in GiD → preparation of analysis data → computer analysis (simulation) → visualization of results]

9 29th Nov 2012 / 9 Introduction Analysis data generation: read in and correct CAD data, assignment of boundary conditions, definition of analysis parameters, assignment of material properties, generation of analysis data, etc.

10 29th Nov 2012 / 10 Introduction Visualization of Numerical Results – Deformed shapes, temperature distributions, pressures, etc. – Vector, contour plots, graphs, – Line diagrams, results surfaces – Animated sequences – Particle line flow diagrams

11 29th Nov 2012 / 11

12 29th Nov 2012 / 12 Introduction Goal: run a CFD simulation with 100 million elements using in-house tools. Hardware: cluster with – Master node: 2 x Intel Quad Core E5410, 32 GB RAM – 3 TB disk with a dedicated Gigabit link to the master node – 10 nodes: 2 x Intel Quad Core E5410 and 16 GB RAM – 2 nodes: 2 x AMD Opteron Quad Core 2356 and 32 GB – Total of 96 cores, 224 GB RAM available – Infiniband 4x DDR, 20 Gbps

13 29th Nov 2012 / 13 Introduction Airflow around an F1 car model

14 29th Nov 2012 / 14 Introduction Kratos: – Multi-physics, open source framework – Parallelized for shared and distributed memory machines GiD: – Geometry handling and data management – First coarse mesh – Merging and post-processing results

15 29th Nov 2012 / 15 Introduction [Workflow diagram: geometry, conditions and materials → coarse mesh generation → partition → distribution and communication plan → parts 1..n → refinement → calculation → results 1..n → merge → visualize]

16 29th Nov 2012 / 16 Overview Introduction Preparation and Simulation – More Efficient Partitioning – Parallel Element Splitting Post Processing – Results Cache – Merging Many Partitions – Memory usage – Off-screen mode Conclusions, Future lines and Acknowledgements

17 29th Nov 2012 / 17 Preparation and simulation [Workflow diagram: geometry, conditions and materials → coarse mesh generation → partition → distribution and communication plan → parts 1..n → refinement → calculation → results 1..n → merge → visualize]

18 29th Nov 2012 / 18 Meshing Single workstation: limited memory and time. Three steps: – Single node: GiD generates a coarse mesh with 13 million tetrahedra – Single node: Kratos + Metis divide and distribute (see the partitioning sketch below) – In parallel: Kratos refines the mesh locally
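The divide step can be pictured with a small graph-partitioning example. This is a minimal sketch, assuming the pymetis binding to Metis is available; Kratos has its own Metis integration, so the calls below only illustrate the idea: elements that share a face become neighbours in a graph, and Metis assigns each element to a partition.

    import pymetis

    # Toy element-adjacency graph: element i lists the elements it shares a face with.
    adjacency = [
        [1, 2],     # element 0
        [0, 3],     # element 1
        [0, 3],     # element 2
        [1, 2, 4],  # element 3
        [3],        # element 4
    ]

    n_parts = 2
    edge_cuts, membership = pymetis.part_graph(n_parts, adjacency=adjacency)
    # membership[i] is the partition index of element i; each partition is later
    # written to its own file and refined independently.
    print("partition of each element:", membership, "edge cuts:", edge_cuts)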

19 29th Nov 2012 / 19 Preparation and simulation [Workflow diagram: geometry, conditions and materials → coarse mesh generation → partition → distribution and communication plan → parts 1..n → refinement → calculation → results 1..n → merge → visualize]

20 29th Nov 2012 / 20 Efficient partitioning: before. Rank 0 reads the model, partitions it and sends the partitions to the other ranks. [Diagram: Rank 0, Rank 1, Rank 2, Rank 3]

21 29th Nov 2012 / 21 Efficient partitioning: before. Rank 0 reads the model, partitions it and sends the partitions to the other ranks. [Diagram: Rank 0, Rank 1, Rank 2, Rank 3]

22 29th Nov 2012 / 22 Efficient partitioning: before. Requires large memory on node 0. Uses cluster time for partitioning, which could be done outside the cluster. Each rerun needs repartitioning. Same working procedure for OpenMP and MPI runs.

23 29th Nov 2012 / 23 Efficient partitioning: now. The partitions are divided and written on another machine, and each rank reads its own data separately.
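A minimal sketch of the new reading scheme using mpi4py; the file-name pattern is an assumption for illustration, not the actual GiD/Kratos format. Each rank opens only its own pre-written partition file, so node 0 no longer needs to hold the whole model or scatter it.

    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    # Before: rank 0 read the whole model, partitioned it and sent the pieces.
    # Now: the partitions were written beforehand on another machine, one file
    # per rank, and every rank reads its own data independently.
    with open(f"model_part_{rank}.dat", "rb") as f:   # hypothetical file name pattern
        local_data = f.read()

    print(f"rank {rank}: read {len(local_data)} bytes of its own partition")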

24 29th Nov 2012 / 24 Preparation and simulation [Workflow diagram: geometry, conditions and materials → coarse mesh generation → partition → distribution and communication plan → parts 1..n → refinement → calculation → results 1..n → merge → visualize]

25 29th Nov 2012 / 25 Local refinement: triangle [Diagram: the splitting cases for a triangle (i, j, k) with new nodes l, m, n on its marked edges, producing 2, 3 or 4 child triangles depending on how many edges are split]

26 29th Nov 2012 / 26 Local refinement: triangle. The splitting case is selected according to the node Ids. The decision is not made for best element quality, but it is very good for parallelization (OpenMP and MPI). [Diagram: the two possible triangulations of the remaining quad when two edges are split]
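A minimal sketch, not Kratos's actual splitting code, of why choosing the case from the node Ids is so convenient for parallelization: the rule depends only on global Ids, so OpenMP threads or MPI ranks refining neighbouring elements independently always take the same decision, even if the resulting triangles are not the best-quality ones.

    def refine_triangle_two_edges(i, j, k, l, m):
        """Refine triangle (i, j, k) whose edges (i, j) and (j, k) were split at
        the new nodes l and m. The corner triangle (l, j, m) is fixed; the
        remaining quad (i, l, m, k) is split along the diagonal chosen from the
        global node Ids, so no communication is needed to agree on it."""
        children = [(l, j, m)]
        if min(i, m) < min(l, k):                  # deterministic, Id-based choice
            children += [(i, l, m), (i, m, k)]     # diagonal i-m
        else:
            children += [(i, l, k), (l, m, k)]     # diagonal l-k
        return children

    # Two processes refining the same shared triangle reach the same 3 children:
    print(refine_triangle_two_edges(i=12, j=40, k=7, l=55, m=61))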

27 29th Nov 2012 / 27 Local refinement: tetrahedron [Diagram: father element and its child elements]

28 29th Nov 2012 / 28 Local refinement: examples

29 29th Nov 2012 / 29 Local refinement: examples

30 29th Nov 2012 / 30 Local refinement: examples

31 29th Nov 2012 / 31 Local refinement: uniform. Uniform refinement can be used to obtain a mesh with 8 times more elements, but it does not improve the geometry representation.

32 29th Nov 2012 / 32 Introduction [Workflow diagram: geometry, conditions and materials → coarse mesh generation → partition → distribution and communication plan → parts 1..n → refinement → calculation → results 1..n → merge → visualize]

33 29th Nov 2012 / 33 Parallel calculation. Calculated using 12 x 8 MPI processes. Less than 1 day for 400 time steps. About 180 GB memory usage. Single volume mesh of 103 million tetrahedra, split into 96 files (each holding its mesh portion and its results).

34 29th Nov 2012 / 34 Overview Introduction Preparation and Simulation – More Efficient Partitioning – Parallel Element Splitting Post Processing – Results Cache – Merging Many Partitions – Memory usage – Off-screen mode Conclusions, Future lines and Acknowledgements

35 29th Nov 2012 / 35 Post processing [Workflow diagram: geometry, conditions and materials → coarse mesh generation → partition → distribution and communication plan → parts 1..n → refinement → calculation → results 1..n → merge → visualize]

36 29th Nov 2012 / 36 Post-process Challenges to face: – Single node – Big files: tens or hundreds of GB – Merging: Lots of files – Batch post-processing – Maintain generality

37 29th Nov 2012 / 37 Big Files: results cache. Uses a user-definable memory pool to store results; it is used to cache results stored in files. [Diagram: mesh information is kept apart from the memory pool; the pool holds results from files (single, multiple, merge), created results (cuts, extrusions, Tcl), and temporal results]

38 29th Nov 2012 / 38 Big Files: results cache [Diagram: results cache data structures. A results cache table holds one (RC entry, timestamp) pair per cached result. Each result points to an RC info record listing, per file, the offset and type, plus the result's memory footprint. An open files table keeps the (file, handle, type) entries. The granularity of the cache is one result.]

39 29th Nov 2012 / 39 Big Files: results cache. Verifies the result's file(s) and gets the result's position in the file and its memory footprint. The results of the latest analysis step are kept in memory; older results are loaded on demand, touched on use, and the oldest ones are unloaded if needed.
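A minimal sketch of this policy, with hypothetical class and parameter names rather than GiD's actual data structures: results are loaded on demand into a user-defined memory pool, every access touches the entry, and the oldest entries are unloaded when the pool would overflow.

    from collections import OrderedDict

    class ResultsCache:
        def __init__(self, pool_size_bytes):
            self.pool_size = pool_size_bytes
            self.used = 0
            self.entries = OrderedDict()            # key -> (data, footprint), oldest first

        def get(self, key, load_from_file):
            if key in self.entries:
                self.entries.move_to_end(key)       # touch on use
                return self.entries[key][0]
            data, footprint = load_from_file(key)   # seek to the stored offset and read
            while self.entries and self.used + footprint > self.pool_size:
                _, (_, old_size) = self.entries.popitem(last=False)   # unload oldest
                self.used -= old_size
            self.entries[key] = (data, footprint)
            self.used += footprint
            return data

    cache = ResultsCache(pool_size_bytes=2 * 1024**3)   # e.g. a 2 GB results cache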

40 29th Nov 2012 / 40 Big Files: results cache. Chinese harbour: 104 GB results file, 7.6 million tetrahedra, 2,292 time steps, 3.16 GB memory usage (2 GB results cache).

41 29th Nov 2012 / 41 Big Files: results cache. Chinese harbour: 104 GB results file, 7.6 million tetrahedra, 2,292 time steps, 3.16 GB memory usage (2 GB results cache).

42 29th Nov 2012 / 42 Merging many partitions. Before: 2, 4, ... 10 partitions. Now: 32, 64, 128, ... partitions of a single volume mesh. Postpone any calculation until after the merge (see the sketch below): – Skin extraction – Finding boundary edges – Smoothed normals – Neighbour information – Graphical objects creation
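A minimal sketch of the postponing idea, with hypothetical names: while partitions are being appended nothing derived is built, and an expensive product such as the skin mesh is computed only once, the first time it is actually requested after the merge.

    class MergedMesh:
        def __init__(self):
            self.elements = []
            self._skin = None                 # nothing derived is computed yet

        def append_partition(self, elements):
            self.elements.extend(elements)
            self._skin = None                 # invalidate anything computed earlier

        @property
        def skin(self):
            if self._skin is None:            # built once, after all partitions are merged
                self._skin = self._extract_skin()
            return self._skin

        def _extract_skin(self):
            # Placeholder for the real algorithm (keep the faces owned by a single element).
            return [e for e in self.elements if e.get("on_boundary")]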

43 29th Nov 2012 / 43 Merging many partitions. Telescope example, 23,870,544 tetrahedra:
Before: 32 partitions: 24' 10"
After: 32 partitions: 4' 34"; 128 partitions: 10' 43"; single file: 2' 16"

44 29th Nov 2012 / 44 Merging many partitions

45 29th Nov 2012 / 45 Merging many partitions. Racing car example, 103,671,344 tetrahedra:
Before: 96 partitions: > 5 hours
After: 96 partitions: 51' 21"; single file: 13' 25"

46 29th Nov 2012 / 46 Memory usage. Around 12 GB of memory used, with a spike of 15 GB (MS Windows) or 17.5 GB (Linux), including: – Volume mesh (103 Mtetras) – Skin mesh (6 Mtriangs) – Several surface and cut meshes – Stream line search tree – 2 GB of results cache – Animations

47 29th Nov 2012 / 47 Pictures

48 29th Nov 2012 / 48 Pictures

49 29th Nov 2012 / 49 Pictures

50 29th Nov 2012 / 50 Batch post-processing: off-screen. GiD with no interaction and no window. Command line: gid -offscreen [WxH] -b+g batch_file_to_run. Useful to: – launch costly animations in the background or in a queue – use GiD as a template generator – use GiD behind a web server (Flash Video animation). Animation window: a button was added to generate the batch file for off-screen GiD, to be sent to a batch queue.
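A minimal sketch of submitting such a job from a script. Only the command line shown above comes from the slides; the resolution, batch-file name and use of Python's subprocess module are illustrative assumptions.

    import subprocess

    batch_file = "make_animation.bch"   # hypothetical batch file saved from the Animation window

    # Run GiD off-screen, with no window, e.g. from a batch-queue job script.
    subprocess.run(["gid", "-offscreen", "1280x720", "-b+g", batch_file], check=True)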

51 29th Nov 2012 / 51 Animation

52 29th Nov 2012 / 52 Overview Introduction Preparation and Simulation – More Efficient Partitioning – Parallel Element Splitting Post Processing – Results Cache – Merging Many Partitions – Memory usage – Off-screen mode Conclusions, Future lines and Acknowledgements

53 29th Nov 2012 / 53 Conclusions. The implemented improvements helped us achieve the milestone: prepare, mesh, calculate and visualize a CFD simulation with 103 million tetrahedra. GiD: modest machines also benefit from these improvements.

54 29th Nov 2012 / 54 Future lines Faster tree creation for stream lines. – Now: ~ 90 s. creation time, 2-3 s. per stream line Mesh simplification, LOD – geometry and results criteria – Surface meshes, iso-surfaces, cuts: faster drawing – Volume meshes: faster cuts, stream lines – Near real-time Parallelize other algorithms in GiD: – Skin and boundary edges extraction – Parallel cuts and stream lines creation

55 29th Nov 2012 / 55 Challenges. 10^9 – 10^10 tetrahedra, 6·10^8 – 6·10^9 triangles. A large workstation with Infiniband to the cluster and 80 GB or 800 GB of RAM? Hard disk? Post-processing as the backend of a web server in the cluster? Security issues? Post-processing embedded in the solver? Output of both the original mesh and a simplified one?

56 29th Nov 2012 / 56 Acknowledgements Ministerio de Ciencia e Innovación, E-DAMS project European Commission, Real-time project

57 29th Nov 2012 / 57 Comments, questions...... ?

58 Thanks for your attention Scalable System for Large Unstructured Mesh Simulation

