N-Body Problems
An N-body problem is a problem involving N “bodies” (that is, particles such as stars or atoms), each of which applies some force to all of the others. For example, if you have N stars, then each of the N stars exerts a force (gravity) on all of the other N – 1 stars. Likewise, if you have N atoms, then every atom exerts a force (electrostatic, for example) on all of the other N – 1 atoms.
1-Body Problem
When N is 1, you have a simple 1-Body Problem: a single particle, with no forces acting on it. Given the particle’s position P and velocity V at some time $t_0$, you can trivially calculate the particle’s position and velocity at time $t_0 + \Delta t$:
$P(t_0 + \Delta t) = P(t_0) + V \Delta t$
$V(t_0 + \Delta t) = V(t_0)$
2-Body Problem
When N is 2, you have – surprise! – a 2-Body Problem: exactly 2 particles, each exerting a force that acts on the other. The relationship between the 2 particles can be expressed as a differential equation that can be solved analytically, producing a closed-form solution. So, given the particles’ initial positions and velocities, you can trivially calculate their positions and velocities at any later time.
3-Body Problem
When N is 3, you have – surprise! – a 3-Body Problem: exactly 3 particles, each exerting a force that acts on the other two. The relationship between the 3 particles can be expressed as a differential equation that can be solved using a convergent infinite series, due to Karl Fritiof Sundman in 1912. However, in practice, the number of terms of the infinite series that you need to calculate to get a reasonable solution is so large that the series is impractical, so you’re stuck with the generalized formulation, which has to be solved numerically.
N-Body Problems (N > 3)
For N greater than 3 (and for N of 3 in practice), no one knows how to solve the equations to get a closed-form solution. So, numerical simulation is pretty much the only way to study groups of 3 or more bodies. Popular applications of N-body codes include:
  astronomy (for example, galaxy formation, cosmology);
  chemistry (for example, protein folding, molecular dynamics).
Note that, for N bodies, there are on the order of $N^2$ forces, denoted $O(N^2)$.
N-Body Problems
Given N bodies, each body exerts a force on all of the other N – 1 bodies. Therefore, there are N (N – 1) forces in total. You can also think of this as N (N – 1) / 2 forces, in the sense that the force from particle A on particle B is equal in magnitude and opposite in direction to the force from particle B on particle A.
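The two ways of counting forces can be checked by brute-force enumeration. This is a minimal sketch; the function name `count_forces` is made up for illustration.

```python
from itertools import combinations, permutations

def count_forces(n):
    """Return (ordered, unordered) counts of pairwise forces among n bodies."""
    bodies = range(n)
    # Ordered pairs (A acts on B): N (N - 1) of them.
    ordered = sum(1 for _ in permutations(bodies, 2))
    # Unordered pairs {A, B}: N (N - 1) / 2 of them.
    unordered = sum(1 for _ in combinations(bodies, 2))
    return ordered, unordered

for n in (2, 3, 10):
    print(n, count_forces(n))
```

Either way, the count grows with the square of N, which is what matters for the cost of the calculation.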
Aside: Big-O Notation
Let’s say that you have some task to perform on a certain number of things, and that the task takes a certain amount of time to complete. Let’s say that the amount of time can be expressed as a polynomial in the number of things to perform the task on. For example, the amount of time it takes to read a book might be proportional to the number of words, plus the amount of time it takes to settle into your favorite easy chair:
$C_1 N + C_2$
Big-O: Dropping the Low Term
$C_1 N + C_2$
When N is very large, the time spent settling into your easy chair becomes such a small proportion of the total time that it’s virtually zero. So from a practical perspective, for large N, the polynomial reduces to:
$C_1 N$
In fact, for any polynomial, if N is large, then all of the terms except the highest-order term are irrelevant.
Big-O: Dropping the Constant
$C_1 N$
Computers get faster and faster all the time. And there are many different flavors of computers, having many different speeds. So, computer scientists don’t care about the constant, only about the order of the highest-order term of the polynomial. This separates the cost of the fundamental algorithm from computer-specific details. They indicate this with Big-O notation:
$O(N)$
This is often said as: “of order N.”
N-Body Problems
Given N bodies, each body exerts a force on all of the other N – 1 bodies. Therefore, there are N (N – 1) forces in total. In Big-O notation, that’s $O(N^2)$ forces. So, calculating the forces takes $O(N^2)$ time to execute. But, there are only N particles, each taking up the same amount of memory, so we say that N-body codes are of:
  $O(N)$ spatial complexity (memory)
  $O(N^2)$ time complexity
$O(N^2)$ Forces
[Figure: forces between body A and each of the other bodies.]
Note that this picture shows only the forces between A and everyone else.
How to Calculate?
Whatever your physics is, you have some function, F(A,B), that expresses the force between two bodies A and B. For example, for stars and galaxies:
$F(A,B) = G \, m_A \, m_B / \mathrm{dist}(A,B)^2$
where G is the gravitational constant and $m_A$ and $m_B$ are the masses of the two bodies. If you have all of the forces for every pair of particles, then for each particle you can sum the forces acting on it, obtaining the total force on every particle. From that, you can calculate every particle’s new velocity and position.
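As an illustration, the gravitational F(A,B) might be written as a vector-valued function. This is a sketch under assumptions: the name `gravity_on`, the use of lists for positions, and the `g` keyword (which lets you work in units where G = 1) are choices made here and are not from the slides.

```python
import math

G = 6.674e-11  # gravitational constant in SI units (m^3 kg^-1 s^-2)

def gravity_on(pos_a, mass_a, pos_b, mass_b, g=G):
    """Force vector on body A due to body B; it points from A toward B."""
    delta = [b - a for a, b in zip(pos_a, pos_b)]
    dist = math.sqrt(sum(d * d for d in delta))
    magnitude = g * mass_a * mass_b / dist**2
    # Scale the unit vector from A to B by the force magnitude.
    return [magnitude * d / dist for d in delta]

# Two bodies 2 units apart, in units where G = 1.
print(gravity_on([0.0, 0.0, 0.0], 2.0, [2.0, 0.0, 0.0], 3.0, g=1.0))
```

Swapping the roles of A and B gives the same magnitude with the opposite sign, which is why only half of the pairs need to be evaluated.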
Algorithm
Set up initial positions and velocities of all particles
FOR time steps from 1 to T
  FOR each particle p from 1 to N
    Initialize force on p to 0.
    FOR each other particle q from 1 to N
      calculate force on p from q
      add to p’s forces
    Calculate the velocity of p based on forces
    Calculate the position of p based on velocity
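A direct-sum time step could be sketched in Python as follows. Several things here are assumptions rather than part of the slides: the function names, the default of G = 1, and the integrator (a semi-implicit Euler update). The sketch also computes every force before moving any particle, so that all forces in a step use positions from the same time.

```python
import math

def step(positions, velocities, masses, dt, g=1.0):
    """Advance all particles by one time step using direct summation."""
    n = len(positions)
    dim = len(positions[0])
    forces = [[0.0] * dim for _ in range(n)]
    # O(N^2) part: force on each particle p from every other particle q.
    for p in range(n):
        for q in range(n):
            if q == p:
                continue
            delta = [positions[q][k] - positions[p][k] for k in range(dim)]
            dist = math.sqrt(sum(d * d for d in delta))
            magnitude = g * masses[p] * masses[q] / dist**2
            for k in range(dim):
                forces[p][k] += magnitude * delta[k] / dist
    # O(N) part: update velocity from force, then position from velocity.
    for p in range(n):
        for k in range(dim):
            velocities[p][k] += forces[p][k] / masses[p] * dt
            positions[p][k] += velocities[p][k] * dt
    return forces

def simulate(positions, velocities, masses, dt, steps, g=1.0):
    """Run `steps` time steps, updating positions and velocities in place."""
    for _ in range(steps):
        step(positions, velocities, masses, dt, g)

# Two equal masses at rest: they should fall toward each other.
pos = [[-1.0, 0.0], [1.0, 0.0]]
vel = [[0.0, 0.0], [0.0, 0.0]]
simulate(pos, vel, [1.0, 1.0], dt=0.01, steps=10)
print(pos)
```

The sketch has no softening, so it will divide by zero if two particles ever occupy exactly the same position.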
HOMEWORK!
HW #10 is posted.
Last one of the semester!
1 Monte Carlo problem.
Class in here on Thursday and Monday.
Algorithm – Parallel Version
Set up initial positions and velocities of all particles
FOR time steps from 1 to T
  FOR some sub-set of particles p from 1 to N
    Initialize force on p to 0.
    FOR each particle q from 1 to N (excluding p)
      calculate force on p from q
      add to p’s forces
    Calculate the velocity of p based on forces
    Calculate the position of p based on velocity
  Send position information of my subset to other CPUs
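The decomposition can be illustrated in Python with a thread pool standing in for the separate CPUs. This is only a sketch of the data flow: a real code would more likely use message passing between processes, and the names `advance_subset` and `parallel_step`, the round-robin split, and the semi-implicit Euler update are all assumptions made for the example.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def advance_subset(subset, positions, velocities, masses, dt, g=1.0):
    """Compute new (position, velocity) for the particles in `subset`.

    Reads every particle's old position but writes nothing shared.
    """
    updated = {}
    for p in subset:
        force = [0.0] * len(positions[p])
        for q in range(len(positions)):
            if q == p:
                continue
            delta = [b - a for a, b in zip(positions[p], positions[q])]
            dist = math.sqrt(sum(d * d for d in delta))
            scale = g * masses[p] * masses[q] / dist**3
            force = [f + scale * d for f, d in zip(force, delta)]
        new_v = [v + f / masses[p] * dt for v, f in zip(velocities[p], force)]
        new_x = [x + v * dt for x, v in zip(positions[p], new_v)]
        updated[p] = (new_x, new_v)
    return updated

def parallel_step(positions, velocities, masses, dt, workers=2, g=1.0):
    """One time step with the particles split round-robin across workers."""
    n = len(positions)
    subsets = [range(w, n, workers) for w in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(
            lambda s: advance_subset(s, positions, velocities, masses, dt, g),
            subsets))
    # Merging the results plays the role of "send position information".
    new_pos, new_vel = list(positions), list(velocities)
    for part in parts:
        for p, (x, v) in part.items():
            new_pos[p], new_vel[p] = x, v
    return new_pos, new_vel
```

Because each worker only reads the old positions, the result does not depend on how many workers are used or on the order in which they finish.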
Parallelization of the Direct Force Algorithm
The steps needed for each calculation:
Single set of Instructions over Multiple Data
  Each process calculates some of the accelerations (calculate)
  Each process calculates some of the new positions (calculate)
  Each process shares its position information (communicate!)
The point of diminishing returns: the more you split up the problem, the less work each processor does, so the ratio of concurrent work to communication decreases.
Amdahl's Law and optimal efficiency
General Law
  Best case simulation time = (serial time) + (parallel time) / P
  Speedup approaches a limit as P grows
N-body is worse:
  time = $a N^2 / P + c N + d P$
  Speedup falls off as 1/P for large P (time increases linearly)
  A larger N or less communication increases the value of P that can be used before speedup falls off.
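The N-body time model can be explored numerically. In this sketch the coefficient values are invented purely for illustration, and the function names are hypothetical. Setting the derivative of the model with respect to P to zero, $-a N^2 / P^2 + d = 0$, gives the process count with the shortest time, $P = N \sqrt{a/d}$.

```python
import math

def run_time(n, p, a, c, d):
    """Modelled time: a*N^2/P (compute) + c*N (serial) + d*P (communication)."""
    return a * n**2 / p + c * n + d * p

def best_process_count(n, a, d):
    """P that minimises run_time, from -a*N^2/P^2 + d = 0."""
    return n * math.sqrt(a / d)

# Made-up coefficients, chosen only to show the shape of the curve.
N, A, C, D = 1000, 1e-6, 1e-6, 1e-2
for procs in (1, 2, 5, 10, 20, 50, 100):
    speedup = run_time(N, 1, A, C, D) / run_time(N, procs, A, C, D)
    print(procs, round(speedup, 2))
print("best P is about", best_process_count(N, A, D))
```

Since the best P scales with N and with $1/\sqrt{d}$, a larger problem or cheaper communication moves the turnover point to more processes.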
Particle-Mesh and Particle-Particle Particle-Mesh
Particle Mesh Algorithm
Replace the calculation of forces through particle-particle interactions with the calculation of forces as the gradient of a potential that satisfies Poisson’s equation for a gridded density function.
Particle Mesh Algorithm
Step 1: Generate Density Distribution Function from point sources
Step 2: Take FFT of density distribution function
Step 3: Solve Poisson’s equation for gravitational potential in Fourier space
Step 4: Transform back to Euclidean space
Step 5: Compute force from potential
Step 1: Generate Density Distribution Function
For each body, determine which grid sites are near the body, and determine how to apply a density distribution to those nearby grid points.
Simplest approach: assume the point mass fills some radius R and volume V, so that any grid point within that radius has its density increased by M/V.
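The simplest approach can be sketched on a one-dimensional periodic grid. The 1-D setting, the periodic box, and the name `assign_density` are all assumptions made to keep the example short; in 1-D the "volume" V of the smeared mass is just the length 2R.

```python
def assign_density(body_positions, body_masses, n_grid, box_length, radius):
    """Spread each point mass over the grid points within `radius` of it."""
    spacing = box_length / n_grid
    volume = 2.0 * radius  # 1-D stand-in for the volume V
    density = [0.0] * n_grid
    for x, mass in zip(body_positions, body_masses):
        # Naive search: test every grid point against every body.
        for i in range(n_grid):
            d = abs(i * spacing - x)
            d = min(d, box_length - d)  # shortest distance in a periodic box
            if d <= radius:
                density[i] += mass / volume
    return density

# One body of mass 6 at x = 3 on an 8-point grid with unit spacing.
print(assign_density([3.0], [6.0], n_grid=8, box_length=8.0, radius=1.5))
```

With R equal to 1.5 grid spacings, the body touches three grid points and the total mass on the grid (density times spacing, summed) equals the body's mass.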
Step 2/3/4: Solving Poisson’s equation using Fourier Transform
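For reference, the standard form of this calculation for gravity is sketched below. The symbols are assumptions about notation: $\phi$ is the gravitational potential, $\rho$ the gridded density, hats denote Fourier transforms, and $\mathbf{k}$ is the wave vector with magnitude $k$. Taking the transform turns the Laplacian into multiplication by $-k^2$, so Poisson's equation becomes an algebraic equation for each mode:

```latex
\nabla^2 \phi(\mathbf{x}) = 4 \pi G \, \rho(\mathbf{x})
\quad\Longrightarrow\quad
-k^2 \, \hat{\phi}(\mathbf{k}) = 4 \pi G \, \hat{\rho}(\mathbf{k})
\quad\Longrightarrow\quad
\hat{\phi}(\mathbf{k}) = -\frac{4 \pi G \, \hat{\rho}(\mathbf{k})}{k^2},
\qquad k \neq 0
```

An inverse transform of $\hat{\phi}$ then gives the potential on the grid. The $k = 0$ mode only sets a constant offset of the potential and is usually set to zero.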
Step 5: Solve for the force using the potential
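Steps 2 through 5 can be sketched end to end in one dimension. The force on a body of mass m is $F = -m \nabla \phi$, so the sketch returns the acceleration $-\nabla \phi$ at each grid point. Everything here is an assumption for illustration: the function names, the periodic 1-D box, and the naive $O(n^2)$ transform that stands in for a real FFT so that the example needs no external library.

```python
import cmath
import math

def dft(values, inverse=False):
    """Naive discrete Fourier transform (stand-in for an FFT)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for m in range(n):
        total = 0j
        for idx, v in enumerate(values):
            total += v * cmath.exp(sign * 2j * math.pi * idx * m / n)
        out.append(total / n if inverse else total)
    return out

def wavenumbers(n, box_length):
    """Signed wavenumber k for each DFT mode of a periodic box."""
    return [2 * math.pi * (m if m <= n // 2 else m - n) / box_length
            for m in range(n)]

def pm_acceleration_1d(density, box_length, g=1.0):
    """Return (potential, acceleration) on the grid for a periodic density."""
    ks = wavenumbers(len(density), box_length)
    rho_hat = dft(density)                        # Step 2: transform density
    phi_hat = [0j if k == 0.0 else -4 * math.pi * g * r / (k * k)
               for k, r in zip(ks, rho_hat)]      # Step 3: solve Poisson
    acc_hat = [-1j * k * p for k, p in zip(ks, phi_hat)]  # -d(phi)/dx
    phi = [z.real for z in dft(phi_hat, inverse=True)]    # Step 4: back
    acc = [z.real for z in dft(acc_hat, inverse=True)]    # Step 5: force/m
    return phi, acc
```

For a density of the form cos(kx), the sketch reproduces the analytic potential $-4 \pi G \cos(kx) / k^2$, which is a convenient check. Here the gradient is taken in Fourier space; taking finite differences of the potential on the grid is a common alternative.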
Win with PM
Now we only have to communicate based on the number of grid nodes, instead of the number of bodies. There are very few point-to-point force calculations. Very fast! What’s the bad news?
PM Issues
Because we’re “softening” the particles to a grid, we have no calculations reflecting “close” (local/short-range) forces. Those are the biggest forces! Recall that gravitational force diminishes as $1/r^2$.
Other Concerns when using PM
Limitations on size of grid (NG grid points per dimension, NP particles):
  Memory requirements, particularly in 3-D: NG*NG*NG
  Time requirements to map points to grid: NP*NG*NG*NG
  Time requirements to solve FFT: NG*NG*NG*log(NG)
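To get a feel for the memory limit, the grid size can be turned into bytes. This is a rough sketch that assumes one 8-byte floating-point value per grid point; a real code stores several arrays (density, potential, force components), so actual usage is a multiple of this. The function name is hypothetical.

```python
def grid_memory_bytes(ng, bytes_per_value=8, dims=3):
    """Bytes needed for one array on a grid with ng points per dimension."""
    return ng**dims * bytes_per_value

for ng in (128, 256, 512, 1024):
    print(ng, grid_memory_bytes(ng) / 2**30, "GiB")
```

Doubling the resolution in each dimension multiplies the memory by 8, which is why 3-D grids run out of memory quickly.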
Improvements for nearest neighbors
Use the PM method for long-range forces only, and calculate short-range forces using direct calculation of nearest neighbors: Particle-Particle Particle-Mesh, or P3M.
(Or use tree-based hybrid methods, e.g. Barnes-Hut.)