Published by Javier Layfield. Modified about 1 year ago.

1
Staggered mesh methods for MHD- and charged particle simulations of astrophysical turbulence Åke Nordlund Niels Bohr Institute for Astronomy, Physics, and Geophysics University of Copenhagen

2
Star Formation The IMF is a result of statistics of MHD-turbulence Planet Formation Gravitational fragmentation (or not!) Stars Turbulent convection determines structure BCs Stellar coronae & chromospheres Heated by magnetic dissipation Context examples

3
Charged particle contexts Solar Flares To what extent is MHD OK? Particle acceleration mechanisms? Reconnection & dissipation? Gamma-Ray Bursts Relativistic collisionless shocks? Weibel-instability creates B? Synchrotron radiation or jitter radiation?

4
Overview MHD methods Godunov-like vs. direct Staggered mesh vs. centered method Radiative transfer Fast & cheap methods Charged particle dynamics Methods & examples

5
Solving the (M)HD Partial Differential Equations (PDEs) Godunov-type methods Solve the local Riemann problem (approx.) OK in ideal gas hydro MHD: 7 waves, 648 combos (cf. Schnack’s talk) Constrained Transport (CT) Gets increasingly messy when adding gravity... non-ideal equation of state (ionization)... radiation...

6
Direct methods Evaluate right hand sides (RHS) High order spatial derivatives & interpolations Spectral Compact Local stencils e.g. 6th order derivatives, 5th order interpolations Step solution forward in time Runge-Kutta type methods (e.g. 3rd order): Adams-Bashforth Hyman’s method RK3-2N Saves memory – uses only F and dF/dt (hence 2N)
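The 2N-storage idea on this slide can be sketched as follows. This is a minimal illustration, assuming the Williamson-type low-storage coefficients (the slides do not list the coefficients actually used); the names `rk3_2n_step`, `integrate` and `decay` are hypothetical.

```python
import math

# Williamson-type low-storage RK3 coefficients.
# Assumption: the slides name "RK3-2N" but do not give the coefficients.
ALPHA = (0.0, -5.0 / 9.0, -153.0 / 128.0)
BETA = (1.0 / 3.0, 15.0 / 16.0, 8.0 / 15.0)


def rk3_2n_step(f, u, dt):
    """Advance the list u in place by one step, keeping only u and w.

    A production code would accumulate the right-hand side directly into w;
    here f(u) returns a temporary list to keep the sketch short.
    """
    w = [0.0] * len(u)
    for alpha, beta in zip(ALPHA, BETA):
        rhs = f(u)
        for i in range(len(u)):
            w[i] = alpha * w[i] + dt * rhs[i]   # running dF/dt combination
            u[i] += beta * w[i]                 # update the solution F
    return u


def integrate(f, u0, t_end, n_steps):
    """Integrate from t = 0 to t_end with n_steps equal steps."""
    u = list(u0)
    dt = t_end / n_steps
    for _ in range(n_steps):
        rk3_2n_step(f, u, dt)
    return u


def decay(u):
    """Right-hand side of the test problem du/dt = -u."""
    return [-x for x in u]


if __name__ == "__main__":
    exact = math.exp(-1.0)
    for n in (10, 20, 40):
        print(n, abs(integrate(decay, [1.0], 1.0, n)[0] - exact))
```

Halving the step should reduce the error by roughly a factor 8, as expected for a 3rd order method.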

7
Which variables? Conservative! Mass Momentum Internal energy not total energy consider cases where magnetic or kinetic energy dominates total energy is well conserved e.g. Mach 5 supersonic 3D-turbulence test (Wengen) less than 0.5% change in total energy

8
Dissipation Working with internal energy also means that all dissipation (kinetic to thermal, magnetic to thermal) must be explicit Shock- and current sheet-capturing schemes Negative part of divergence captures shocks Ditto for cross-field velocity captures current sheets
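A shock-capturing viscosity built from the negative part of the velocity divergence might look like the 1-D sketch below. The quadratic scaling with the mesh size and the constant `c_shock` are assumptions for illustration, not values taken from the slides.

```python
def shock_viscosity(u_face, dx, c_shock=1.0):
    """Cell-centred artificial viscosity from face-centred velocities.

    1-D, periodic. Only compression (negative divergence) contributes:
        nu_i = c_shock * dx**2 * max(-div(u)_i, 0)
    c_shock is a hypothetical tuning constant.
    """
    n = len(u_face)
    nu = []
    for i in range(n):
        # Divergence at the centre of cell i from its two faces.
        div_u = (u_face[(i + 1) % n] - u_face[i]) / dx
        nu.append(c_shock * dx * dx * max(-div_u, 0.0))
    return nu


if __name__ == "__main__":
    # Converging flow in cell 1, diverging flow in cell 3.
    print(shock_viscosity([1.0, 1.0, 0.0, 0.0], dx=0.5))
```

The viscosity switches on only where the flow converges, so smooth or expanding regions are left undamped.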

9
Advantages Much simpler HD ~ 700 flops / point (6th/5th order in space) ENZO ~ 10,000 flops / point FLASH ~ 20,000 flops / point MHD ~ 1100 flops / point Trivial to extend Non-ideal equation-of-state Radiative energy transfer Relativistic

10
Direct method: Disadvantages? Smaller Courant numbers allowed 3 sub-step limit ~ 0.6 (runs at 0.5) 2 sub-step limit ~ 0.4 (runs at 0.333) PPM typically runs at 0.8 factor 1.6 further per full step (unless directionally split) Comparison of hydro flops ~2,000 (direct, 3 sub-steps) ~10,000 (ENZO/PPM, FLASH/PPM) Need to also compare flops per second Cache use?
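The arithmetic behind the comparison can be written out as below. It uses only the figures quoted on the slide; the function name and the "flops per unit Courant time" measure are assumptions made for this sketch, and the slide's caveats about flops per second and cache use still apply.

```python
def cost_comparison(courant_direct=0.5, courant_ppm=0.8,
                    flops_direct=2000.0, flops_ppm=10000.0):
    """Return (step ratio, flop-cost ratio) of PPM relative to the direct method.

    Defaults are the slide's figures: Courant numbers as run, and flops per
    point per full time step.
    """
    # How much further PPM advances per full step.
    step_ratio = courant_ppm / courant_direct
    # Flops needed to advance the solution by the same amount of time.
    cost_direct = flops_direct / courant_direct
    cost_ppm = flops_ppm / courant_ppm
    return step_ratio, cost_ppm / cost_direct


if __name__ == "__main__":
    print(cost_comparison())
```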

11
Perhaps much more diffusive? 2D implosion test indicates not so square area with central, rotated low pressure square generates thin ’jet’ with vortex pairs moves very slowly, in ~ pressure equilibrium essentially a wrinkled 2D contact discontinuity see Jim Stone’s test pages, with references

12
2D Implosion Test

13
Imagine: non-ideal EOS + shocks + radiation + conduction along B Ionization: large to small across a shock Radiation: thick to thin across a shock Heat conduction only along B... Riemann solver? Any volunteers? Operator and/or direction split? With anisotropic resistivity & heat conduction?!

14
Non-ideal EOS + radiation + MHD: Validation? Godunov-type methods No exact solutions to check against Difficult to validate Direct methods Need only check conservation laws mass & momentum, no direct change energy conservation; easy to verify Valid equations + stable methods → valid results

15
Staggered Mesh Code (Nordlund et al) Cell centered mass and thermal energy densities Face-centered momenta and magnetic fields Edge-centered electric fields and electric currents Advantages: simplicity; OpenMP (MPI btw boxes) consistency (e.g., divB=0) conservative, handles extreme Mach
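The divB=0 consistency of the staggered layout can be checked with a small 2-D sketch: face-centred B, corner-centred Ez (the 2-D counterpart of edge centering), and simple 2nd order differences. The low order and all function names are assumptions for illustration; the code described here uses 6th order operators.

```python
import random


def induction_step(bx, by, ez, dx, dy, dt):
    """Update face-centred (bx, by) in place from corner-centred ez.

    Implements dB/dt = -curl(E) on a periodic 2-D mesh:
        dBx/dt = -dEz/dy,   dBy/dt = +dEz/dx
    """
    nx, ny = len(ez), len(ez[0])
    for i in range(nx):
        for j in range(ny):
            bx[i][j] -= dt * (ez[i][(j + 1) % ny] - ez[i][j]) / dy
            by[i][j] += dt * (ez[(i + 1) % nx][j] - ez[i][j]) / dx


def max_abs_div_b(bx, by, dx, dy):
    """Largest |div B| over all cells, using the same staggered differences."""
    nx, ny = len(bx), len(bx[0])
    worst = 0.0
    for i in range(nx):
        for j in range(ny):
            div = ((bx[(i + 1) % nx][j] - bx[i][j]) / dx
                   + (by[i][(j + 1) % ny] - by[i][j]) / dy)
            worst = max(worst, abs(div))
    return worst


def demo(nx=8, ny=6, n_steps=20, seed=1):
    """Drive B with arbitrary electric fields and report the worst |div B|."""
    rng = random.Random(seed)
    bx = [[0.0] * ny for _ in range(nx)]
    by = [[0.0] * ny for _ in range(nx)]
    for _ in range(n_steps):
        ez = [[rng.uniform(-1.0, 1.0) for _ in range(ny)] for _ in range(nx)]
        induction_step(bx, by, ez, dx=1.0, dy=1.0, dt=0.1)
    return max_abs_div_b(bx, by, 1.0, 1.0)


if __name__ == "__main__":
    print(demo())
```

Because the discrete divergence of the discrete curl cancels term by term, div B stays at round-off level whatever electric field is applied.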

16
Code Philosophy Simplicity F90/95 for ease of development Simplicity minimizes operator count Conservative (per volume variables) Can nevertheless handle SNe in the ISM Accuracy 6th/5th order in space, 3rd order in time Speed About 650,000 zone-updates/sec on laptop

17
Code Development Stages 1. Simplest possible code Dynamic allocation No need to recompile for different resolutions F95 array valued function calls P4 speed is the SAME as with subroutine calls 2. SMP/OMP version Open MP directives added Uses auto-parallelization and/or OMP on SUN, SGI & IBM 3. MPI version for clusters Implemented with CACTUS (see Scales to arbitrary number of CPUs

18
CACTUS Provides “flesh” (application interface) Handles cluster-communication E.g. MPI (but not limited to MPI) Handles GRID computing Presently experimental Handles grid refinement and adaptive meshes AMR not yet available “thorns” (applications and services) Parallel I/O Parameter control (live!) Diagnostic output X-Y plots JPEG slices Isosurfaces

19
mhd.f90 MHD

20
Example Code Induction Equation
stagger-code/src-simple
    Makefile (with includes for OS- and host-dep)
    Subdirectories with optional code: INITIAL (initial values) BOUNDARIES EOS (equation of state) FORCING EXPLOSIONS COOLING EXPERIMENTS
stagger-code/src (SMP production)
    Ditto Makefile and subdirs
CACTUS_Stagger_Code
    Code becomes a “thorn” in the CACTUS “flesh”

Induction equation, array-valued function form:

    !
    !  Magnetic field's time derivative, dBdt = - curl(E)
    !
      dBxdt = dBxdt + ddzup(Ey) - ddyup(Ez)
      dBydt = dBydt + ddxup(Ez) - ddzup(Ex)
      dBzdt = dBzdt + ddyup(Ex) - ddxup(Ey)

Induction equation, OpenMP form:

    !
    !  Magnetic field's time derivative, dBdt = - curl(E)
    !
      call ddzup_set(Ey, scr1) ; call ddyup_set(Ez, scr2)
    !$omp parallel do private(iz)
      do iz=1,mz
        dBxdt(:,:,iz) = dBxdt(:,:,iz) + scr1(:,:,iz) - scr2(:,:,iz)
      end do
      call ddxup_set(Ez, scr1) ; call ddzup_set(Ex, scr2)
    !$omp parallel do private(iz)
      do iz=1,mz
        dBydt(:,:,iz) = dBydt(:,:,iz) + scr1(:,:,iz) - scr2(:,:,iz)
      end do
      call ddyup_set(Ex, scr1) ; call ddxup_set(Ey, scr2)
    !$omp parallel do private(iz)
      do iz=1,mz
        dBzdt(:,:,iz) = dBzdt(:,:,iz) + scr1(:,:,iz) - scr2(:,:,iz)
      end do

Subroutine header, with HPF directives:

    SUBROUTINE mhd(eta,Ux,Uy,Uz,Bx,By,Bz,dpxdt,dpydt,dpzdt,dedt,dBxdt,dBydt,dBzdt)
      USE params
      USE stagger
      real, dimension(mx,my,mz) :: &
        eta,Ux,Uy,Uz,Bx,By,Bz,dpxdt,dpydt,dpzdt,dedt,dBxdt,dBydt,dBzdt
    !hpf$ distribute (*,*,block) :: &
    !hpf$   eta,Ux,Uy,Uz,Bx,By,Bz,dpxdt,dpydt,dpzdt,dedt,dBxdt,dBydt,dBzdt
      real, allocatable, dimension(:,:,:) :: &
        Jx,Jy,Jz,Ex,Ey,Ez, &
        Bx_y,Bx_z,By_x,By_z,Bz_x,Bz_y,scr1,scr2
    !hpf$ distribute (*,*,block) :: &
    !hpf$   Jx,Jy,Jz,Ex,Ey,Ez, &
    !hpf$   Bx_y,Bx_z,By_x,By_z,Bz_x,Bz_y,scr1,scr2

Subroutine header, CACTUS form:

    SUBROUTINE mhd(CCTK_ARGUMENTS)
      USE hd_params
      USE stagger_params
      USE stagger
      IMPLICIT NONE
      DECLARE_CCTK_ARGUMENTS
      DECLARE_CCTK_PARAMETERS
      DECLARE_CCTK_FUNCTIONS
      CCTK_REAL, allocatable, dimension(:,:,:) :: &
        Jx, Jy, Jz, Ex, Ey, Ez, &
        Bx_y, Bx_z, By_x, By_z, Bz_x, Bz_y
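What an operator such as `ddxup` does can be sketched in 1-D: a 6th order derivative evaluated half a cell "up" from the input points. The sketch below uses the standard 6th order staggered-difference weights on a periodic mesh; that these match the production routine exactly is an assumption, and the Python function is only a stand-in for the Fortran one.

```python
import math

# Standard 6th order weights for a derivative evaluated midway between points.
A = 75.0 / 64.0
B = -25.0 / 384.0
C = 3.0 / 640.0


def ddxup(f, dx):
    """Derivative of the periodic sequence f at the points x_i + dx/2."""
    n = len(f)
    return [
        (A * (f[(i + 1) % n] - f[i])
         + B * (f[(i + 2) % n] - f[(i - 1) % n])
         + C * (f[(i + 3) % n] - f[(i - 2) % n])) / dx
        for i in range(n)
    ]


def max_error(n):
    """Largest error of ddxup applied to sin(x) on n points over one period."""
    dx = 2.0 * math.pi / n
    f = [math.sin(i * dx) for i in range(n)]
    df = ddxup(f, dx)
    return max(abs(df[i] - math.cos((i + 0.5) * dx)) for i in range(n))


if __name__ == "__main__":
    for n in (16, 32, 64):
        print(n, max_error(n))
```

Doubling the resolution should reduce the error by a factor close to 64, which is the signature of 6th order accuracy.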

21
Physics (staggered mesh code) Equation of state Qualitative: H+He+Me Accurate: Lookup table Opacity Qualitative: H-minus Accurate: Lookup table Radiative energy transfer Qualitative: Vertical + a few (4) Accurate: Comprehensive set of rays

22
Staggered Mesh Code Details Dynamic memory allocation Any grid size; no recompilation Parallelized Shared memory: OpenMP (and auto-) parallelization MPI: Direct (Galsgaard) or via CACTUS Organization – Makefile includes Experiments EXPERIMENTS/$(EXPERIMENT).mkf Selectable features Eq. of state Cooling & conduction Boundaries OS and compiler dependencies hidden OS/$(MACHTYPE).f90 OS/$(HOST).mkf OS/$(COMPILER).mkf

23
Radiative Transfer Requirements Comprehensive Need at least (double) rays 4-5 frequency bins (recent paper) At least 5 directions Speed issue Would like 25 rays to add negligible time

24
Benchmark Timing Results

25
Altix Itanium-2 Scaling

26
Star Formation Planet Formation Stars Stellar coronae & chromospheres Applications

27
Star Formation Nordlund & Padoan 2002

28
Key feature: intermittency! What does it mean in this context? Low density, high velocity gas fills most of the volume! High density, low velocity features occupy very little space, but carry much of the mass! How does it influence star formation? It greatly simplifies understanding it! Inertial dynamics in most of the volume! Collapsing features are relatively well defined!

29
Turbulence Diagnostics of Molecular Clouds Padoan, Boldyrev, Langer & Nordlund, ApJ 2002 (astro-ph/ )

30
Numerical (250^3 sim) & Analytical IMF Padoan & Nordlund (astro-ph/ )

31
Low Mass IMF Padoan & Nordlund, ApJ 2004 (astro-ph/ )

32
Planet formation; gas collapse

33
Coronal Heating Initial Magnetic Field Potential extrapolation of AR 9114

34
Coronal Heating: TRACE 195 Loops

35
Current sheet hierarchy

36
Current sheet hierarchy: close-up

37
Scan through hierarchy: dissipation Hm, the dissipation looks pretty intermittent: large, nice empty areas to ignore with an AMR code, right? Note that all features rotate as we scan through; this means that these current sheets are all curved in the 3rd dimension.

38
Electric current J This is still the dissipation. Let's replace it by the electric current, as a check! Hm, not quite as empty, but the electric current is at least mostly weak, right?

39
J log(J) So, let’s replace the current with the log of current, to see the levels of the hierarchy better!

40
Log of the electric current Not really much to win with AMR here, if we want to cover the hierarchy!

41
Solar & stellar surface MHD Faculae Sunspots Chromospheres Coronae

42
Faculae: Center-to-Limb Variation

43
Radiative transfer ’Exact’ radiative energy transfer is not expensive allows up to ~100 rays per point for 2 x CPU-time parallelizes well (with MPI or OpenMP) Reasons for not using Flux Limited Diffusion Not the right answer (e.g. missing shadows) Is not cheaper

44
Radiative Transfer: Significance Cosmology End of Dark Ages Star Formation Feedback: evaporation of molecular clouds Dense phases of the collapse Planet Formation External illumination of discs Structure and cooling of discs Stellar surfaces Surface cooling: the driver of convection

45
Radiative transfer methods Fast local solvers Feautrier schemes; the fastest (often) Optimized integral solutions; the simplest A new approach to parallelizing RT Solve within each domain, with no bdry radiation Propagate and accumulate solutions globally

46
Moments of the radiation field

47
Give up, adopting some approximation? Flux Limited Diffusion Did someone say ”shadows”?? Or, solve as it stands? Fast solvers Parallelize Did someone say ”difficult”? Phew, 7 variables!?!

48
Rays Through Each Grid Point Interpolate source function to rays in each plane

49
How many rays are needed? Depends entirely on the geometry For stellar surfaces, surprisingly few! 1 vertical + 4 slanted, rotating 1% accuracy in the mean Q a few % in fluctuating Q 8 rays / 48 rays see plots

50
8 rays / 48 rays

51
Radiative transfer steps Interpolate source function(s) and opacity Simple translation of planes – fast Solve along rays May be done in parallel (distribute rays) Interpolate back to rectangular mesh Inverse of 1st interpolation (negative shift) Add up Integrate over angles (and possibly frequencies or bins)
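The "simple translation of planes" in the first and third steps can be sketched as a periodic shift of each horizontal row by a fractional number of cells. Linear interpolation is assumed here for brevity (the slides do not state the interpolation order), and `shift_periodic` and `ray_offset` are hypothetical names.

```python
import math


def ray_offset(dz, dx, theta, phi):
    """Horizontal x-shift, in cells, of a ray between planes dz apart.

    theta is the inclination from the vertical and phi the azimuth
    (both conventions are assumptions of this sketch).
    """
    return dz * math.tan(theta) * math.cos(phi) / dx


def shift_periodic(row, s):
    """Sample the periodic sequence row at positions i + s, linearly."""
    n = len(row)
    k = math.floor(s)        # whole-cell part of the shift
    frac = s - k             # fractional part, in [0, 1)
    return [
        (1.0 - frac) * row[(i + k) % n] + frac * row[(i + k + 1) % n]
        for i in range(n)
    ]


if __name__ == "__main__":
    n = 64
    source = [math.sin(2.0 * math.pi * i / n) for i in range(n)]
    s = ray_offset(dz=1.0, dx=1.0, theta=math.radians(30.0), phi=0.0)
    on_ray = shift_periodic(source, s)      # interpolate to the ray
    back = shift_periodic(on_ray, -s)       # negative shift: back to the mesh
    print(max(abs(a - b) for a, b in zip(back, source)))
```

The round trip is not exact for a fractional shift, which is one reason a higher order interpolation may be preferable in practice.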

52
Along straight rays, solve

53
Or actually, solve directly for the cooling (I-S)! Source Function (input) New Source Function (input)

54
Formal (and useful) solutions For simplicity, let’s consider the standard formulation Has the formal solution:
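The equations on this slide did not survive extraction. For reference, assuming the standard form of the transfer equation with the optical depth $\tau$ increasing along the ray, intensity $I$ and source function $S$, the formulation and its formal solution read:

```latex
\frac{dI}{d\tau} = S - I
\qquad\Longrightarrow\qquad
I(\tau) = I(0)\,e^{-\tau} + \int_{0}^{\tau} S(t)\,e^{-(\tau - t)}\,dt
```

The sign convention and the choice of $\tau = 0$ at the start of the ray are assumptions; the original slide may have used a different orientation.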

55
Doubly useful As a direct method Very accurate, if S(τ) is piecewise parabolic The slowness of exp() can be largely avoided As a basis for domain decomposition Add ’remote’ contributions separately!

56
Direct solution, integral form

57
How to parallelize (Heinemann, Dobler, Nordlund & Brandenburg – in prep.) Solve for the intensity generated internally in each domain, separately and in parallel Then propagate and accumulate the boundary intensities, modified only by trivial optical depth factors
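The two-stage scheme can be sketched along a single ray. This is a minimal illustration assuming dI/dτ = S − I with a piecewise-constant source function in each cell (the method described here allows piecewise-parabolic S); all function names are hypothetical.

```python
import math


def solve_segment(dtau, source, i_in=0.0):
    """Integrate dI/dtau = S - I through a run of cells.

    Returns (outgoing intensity, total optical depth of the run).
    S is taken as constant within each cell (a simplifying assumption).
    """
    intensity = i_in
    tau_total = 0.0
    for dt, s in zip(dtau, source):
        att = math.exp(-dt)
        intensity = intensity * att + s * (1.0 - att)
        tau_total += dt
    return intensity, tau_total


def solve_decomposed(dtau, source, n_domains, i_boundary=0.0):
    """Same ray, split into n_domains pieces along the ray direction."""
    n = len(dtau)
    bounds = [n * d // n_domains for d in range(n_domains + 1)]
    # Stage 1: intrinsic solution of each domain with no incoming radiation.
    # These calls are independent, so they could run in parallel.
    local = [solve_segment(dtau[a:b], source[a:b])
             for a, b in zip(bounds[:-1], bounds[1:])]
    # Stage 2: propagate the boundary intensity from domain to domain,
    # attenuated only by each domain's total optical depth.
    intensity = i_boundary
    for i_local, tau_dom in local:
        intensity = intensity * math.exp(-tau_dom) + i_local
    return intensity


if __name__ == "__main__":
    dtau = [0.1 + 0.05 * (i % 7) for i in range(40)]
    source = [1.0 + math.sin(0.3 * i) for i in range(40)]
    print(solve_segment(dtau, source, 2.0)[0],
          solve_decomposed(dtau, source, 4, 2.0))
```

Only one number per ray (the boundary intensity) crosses each domain interface in stage 2, which keeps the communication cost low.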

58
Putting it together

59
The Transfer Equation & Parallelization Analytic Solution: Processors

60
The Transfer Equation & Parallelization Analytic Solution: Ray direction Intrinsic Calculation Processors

61
The Transfer Equation & Parallelization Analytic Solution: Ray direction Communication Processors

69
The Transfer Equation & Parallelization Analytic Solution: Ray direction Processors Intrinsic Calculation

70
Pencil Code (Brandenburg et al) CPU-time per ray-point Ignore! (bad node distribution) about 160 nsec / pt / ray Can be improved w factor 4-5!

71
CPU-time per point (Pencil Code)

72
Timing Results, Stagger Code

73
Radiative Transfer Conclusions The methods are conceptually simple fast robust scale well in parallel environments

74
Collisionless shocks Not an artist's rendering! Shows electrical current filaments in a collisionless shock simulation with ~10^9 particles and ~ mesh zones

75
Particle-in-Cell (PIC) code Steps Relativistic particle move, using B & E Uses - relativistic momenta About particle updates / sec on P4 laptop Parallelizes nearly linearly (OpenMP on Altix) Gather fields; n_i, n_e, j_i, j_e 2nd order; Triangular Shaped Clouds (TSC) Push B & E – staggered in space and time Electrostatic solver Based on original 2-D, non-relativistic code by Michael Hesse, GSFC 3-D, relativistic version developed by Frederiksen, Haugbølle, Hededal & Nordlund, Copenhagen
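The Triangular Shaped Cloud weighting used when gathering particle densities and currents onto the mesh can be sketched in 1-D. The weights are the standard 2nd order TSC weights; the periodic mesh and the function names are assumptions of this sketch, not the actual routine.

```python
import math


def tsc_weights(x, dx=1.0):
    """Return (nearest grid index j, weights for points j-1, j, j+1)."""
    xi = x / dx
    j = math.floor(xi + 0.5)     # nearest grid point
    d = xi - j                   # offset from it, in [-0.5, 0.5)
    return j, (0.5 * (0.5 - d) ** 2, 0.75 - d * d, 0.5 * (0.5 + d) ** 2)


def deposit(mesh, x, amount=1.0, dx=1.0):
    """Add a particle's contribution (e.g. charge) to a periodic 1-D mesh."""
    n = len(mesh)
    j, weights = tsc_weights(x, dx)
    for k, w in zip((-1, 0, 1), weights):
        mesh[(j + k) % n] += amount * w


if __name__ == "__main__":
    density = [0.0] * 8
    for xp in (0.2, 3.0, 7.6):
        deposit(density, xp)
    print(density, sum(density))
```

The three weights always sum to one, so the deposited total is conserved regardless of where the particle sits within a cell.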

76
Use of Maxwell’s Equations in the code Fields on mesh Sampled particles Basic tests: wave propagation, etc

77
Example: Single electron Electron & proton circling in separate orbits Relativistic; γ=10 NOTE: resolution implications of high γ! Far field: Synchrotron radiation
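A single-particle orbit test of this kind can be sketched with a relativistic Boris push. The Boris scheme is an assumption here (the slides do not name the particle mover); units with c = 1, the momentum variable u = γv, and all function names are choices made for the sketch.

```python
import math


def cross(a, b):
    """Vector product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def lorentz_factor(u):
    """Gamma for momentum per unit mass u = gamma * v, with c = 1."""
    return math.sqrt(1.0 + sum(c * c for c in u))


def boris_push(x, u, e, b, qm, dt):
    """One relativistic Boris step; qm is the charge-to-mass ratio."""
    h = 0.5 * qm * dt
    # First half of the electric acceleration.
    um = tuple(ui + h * ei for ui, ei in zip(u, e))
    # Magnetic rotation; it preserves |u| and hence gamma.
    gamma = lorentz_factor(um)
    t = tuple(h * bi / gamma for bi in b)
    t2 = sum(c * c for c in t)
    uprime = tuple(ui + ci for ui, ci in zip(um, cross(um, t)))
    s = tuple(2.0 * ti / (1.0 + t2) for ti in t)
    uplus = tuple(ui + ci for ui, ci in zip(um, cross(uprime, s)))
    # Second half of the electric acceleration.
    unew = tuple(ui + h * ei for ui, ei in zip(uplus, e))
    # Move the particle with the updated velocity.
    g = lorentz_factor(unew)
    xnew = tuple(xi + ui / g * dt for xi, ui in zip(x, unew))
    return xnew, unew


if __name__ == "__main__":
    # Electron-like particle (qm = -1) with gamma = 10 in a uniform B along z.
    x, u = (0.0, 0.0, 0.0), (math.sqrt(99.0), 0.0, 0.0)
    for _ in range(700):
        x, u = boris_push(x, u, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), -1.0, 0.1)
    print(lorentz_factor(u), x)
```

In these units the gyroradius is |u|/(|qm| B), about 10 for γ = 10, which illustrates the resolution implication: the orbit grows with the Lorentz factor while the plasma scales that must be resolved do not.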

78
The Weibel Instability Well known and understood First principles; anisotropic PDFs Weibel 1959, Fried 1959, Yoon & Davidson 1987 Numerical studies, electron-positron, 2-D Wallace & Epperlein 1991, Yang et al 1994 Kazimura et al 1998 (ApJ) Numerical studies, relativistic, ion-electron Califano et al 1997, ‘98, ‘99, ‘00, ‘01, ’02,.. Application to GRBs Medvedev & Loeb 1999, Medvedev 2000, ’01, …

79
The Weibel Instability (two-stream) (Weibel 1959, Medvedev & Loeb 1999)

80
Experiments 3-D Of the order 200x200x800 mesh, 10^9 part. Cold beam from the left Carries negligible magnetic field Hits denser plasma, initially field free Weibel instability B, E

81
So, what is this? A Weibel-like instability at high γ Initial scales ~ skin depth Conventional expectation: restricted to skin depth Generated fields propagate at v~c Fluctuations ‘ride’ on the beam Losses supported by beam population Scales grow down the line!!

82
Along Across Electron and ion current channels Coherent Structures in Collisionless Shocks

83
Ion and electron structures

84
A non-Fermi acceleration scenario Hededal, Haugbølle, Frederiksen and Nordlund (2004) astro-ph/ Electrons are accelerated instantaneously inside the Debye cylinder surrounding the ion current channels.

85
Electron path near ion channel CH note: 10%-40% optical dark (HETE, BeppoSax). 50% detected in radio. Hededal, Haugbølle, Frederiksen and Nordlund (2004) astro-ph/

86
Perspectives for the future Star Formation Is turbulent fragmentation the main mechanism? How important are magnetic fields for the IMF? Include radiative transfer during collapse! Magnetic fields are also important during collapse! Planet Formation RT important for initial conditions as well as for disc structure and cooling Stellar surfaces Include approx. RT in simulations of chromosphere

87
Solar Plans Convection: from granulation to supergranulation scales Sunspots Faculae Chromosphere Corona 20 Mm 30 Mm 50 Mm
