1 Monte Carlo Linear Algebra Techniques and Their Parallelization
Ashok Srinivasan, Computer Science, Florida State University
www.cs.fsu.edu/~asriniva

2 Outline
Background
– Monte Carlo matrix-vector multiplication
– MC linear solvers
Non-diagonal splitting
– Dense implementation
– Sparse implementation
Parallelization
Conclusions and future work

3 Background
MC linear solvers are old!
– von Neumann and Ulam (1950)
– They were not competitive with deterministic techniques
Advantages of MC
– Can give approximate solutions fast; feasible in applications such as preconditioning, graph partitioning, information retrieval, pattern recognition, etc.
– Can yield selected components fast
– Are very latency tolerant

4 Matrix-vector multiplication
Compute $C^j h$, where $C \in \mathbb{R}^{n \times n}$
– Choose probability and weight matrices such that $C_{ij} = P_{ij} W_{ij}$ and $h_i = p_i w_i$
– Take a random walk $k_0 \to k_1 \to k_2 \to \cdots$ based on these probabilities: $k_0$ is drawn with probability $p_{k_0}$, and each transition $k_{t-1} \to k_t$ occurs with probability $P_{k_t k_{t-1}}$
– Define random variables $X_t$: $X_0 = w_{k_0}$, and $X_t = X_{t-1} W_{k_t k_{t-1}}$
– Then $E(X_j \delta_{i k_j}) = (C^j h)_i$, so each random walk estimates the $k_j$th component of $C^j h$
The convergence rate is independent of $n$
[Figure: a random walk $k_0 \to k_1 \to k_2 \to \cdots \to k_j$ updating $(C^j h)_{k_j}$]
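To make the estimator concrete, here is a minimal dense Python sketch, assuming probabilities chosen proportional to the absolute values of the entries (the function name and walk count are illustrative, not from the slides):

```python
import numpy as np

def mc_matvec_power(C, h, j, num_walks=100_000, seed=None):
    """Estimate C^j h by random walks: C_ik = P_ik * W_ik, h_i = p_i * w_i.

    Assumes h has at least one nonzero entry and C has no zero columns.
    """
    rng = np.random.default_rng(seed)
    n = len(h)
    p = np.abs(h) / np.abs(h).sum()        # initial distribution for h
    P = np.abs(C) / np.abs(C).sum(axis=0)  # P[i, k] = prob of step k -> i
    est = np.zeros(n)
    for _ in range(num_walks):
        k = rng.choice(n, p=p)
        X = h[k] / p[k]                    # X_0 = w_{k_0}
        for _ in range(j):
            k_new = rng.choice(n, p=P[:, k])
            X *= C[k_new, k] / P[k_new, k] # multiply by W_{k_t k_{t-1}}
            k = k_new
        est[k] += X                        # contributes to component k_j
    return est / num_walks
```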

5 Matrix-vector multiplication... continued
$\sum_{j=0}^{m} C^j h$ too can be similarly estimated
We will need $\sum_{j=0}^{m} (BC)^j B h$
– It can be estimated using probabilities on both matrices, B and C
– The length of each random walk is twice that of the previous case
[Figure: a walk $k_0 \to k_1 \to \cdots \to k_{2m+1}$ updating the $j$th term $(BC)^j B h$ at the odd-numbered states $k_{2j+1}$]
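A sketch of the doubled walk under the same proportional-probability assumption; the $j$th term is accumulated at the odd-numbered states $k_{2j+1}$ (names illustrative):

```python
import numpy as np

def mc_sum_bc(B, C, h, m, num_walks=100_000, seed=None):
    """Estimate sum_{j=0..m} (BC)^j B h with walks of length 2m+1,
    alternating steps through B (even step index) and C (odd step index)."""
    rng = np.random.default_rng(seed)
    n = len(h)
    p = np.abs(h) / np.abs(h).sum()
    PB = np.abs(B) / np.abs(B).sum(axis=0)
    PC = np.abs(C) / np.abs(C).sum(axis=0)
    est = np.zeros(n)
    for _ in range(num_walks):
        k = rng.choice(n, p=p)
        X = h[k] / p[k]
        for step in range(2 * m + 1):
            A, P = (B, PB) if step % 2 == 0 else (C, PC)
            k_new = rng.choice(n, p=P[:, k])
            X *= A[k_new, k] / P[k_new, k]
            k = k_new
            if step % 2 == 0:      # state k_{2j+1}: add the (BC)^j B h term
                est[k] += X
    return est / num_walks
```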

6 MC linear solvers
Solve $Ax = b$
– Split $A$ as $A = N - M$
– Write the fixed-point iteration $x_{m+1} = N^{-1} M x_m + N^{-1} b = C x_m + h$
– If we choose $x_0 = h$, then $x_m = \sum_{j=0}^{m} C^j h$
– Estimate the above using the Markov chain technique mentioned earlier

7 Current techniques
Choose N = a diagonal matrix
– Ensures efficient computation of C
– C is sparse when A is sparse
– Example: N = the diagonal of A yields the Jacobi iteration, and the corresponding MC estimate (see the sketch below)
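For concreteness, a minimal NumPy sketch of the Jacobi splitting named above; the MC method then estimates $x_m = \sum_j C^j h$ using these C and h (dense for clarity):

```python
import numpy as np

def jacobi_splitting(A, b):
    """Diagonal splitting N = diag(A), M = N - A: returns the iteration
    matrix C = N^{-1} M (zero diagonal, sparse whenever A is) and
    h = N^{-1} b. Assumes A has no zero diagonal entries."""
    d = np.diag(A)
    C = -(A - np.diag(d)) / d[:, None]
    h = b / d
    return C, h

# x_{m+1} = C x_m + h converges to the solution of A x = b
# when the spectral radius of C is below 1.
```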

8 Properties of MC linear solvers
MC techniques estimate the result of a stationary iteration, so there are two error sources:
– Errors from the iterative process
– Errors from MC sampling
Reduce the error by
– Variance reduction techniques
– Residual correction (sketched below)
– Choosing a better iterative scheme!
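Residual correction can be written as an outer loop around any noisy MC solve; `mc_solve` and the sweep count below are placeholders, not specified on the slides:

```python
def residual_correction(A, b, mc_solve, sweeps=3):
    """Refine a Monte Carlo estimate: repeatedly solve A e = r for the
    current residual r and update the iterate. `mc_solve(A, rhs)` is a
    hypothetical stand-in returning an approximate solution."""
    x = mc_solve(A, b)
    for _ in range(sweeps):
        r = b - A @ x           # residual of the current estimate
        x = x + mc_solve(A, r)  # correct with an MC solve of A e = r
    return x
```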

9 Non-diagonal splitting
Observations
– It is possible to construct an efficient MC technique for specific splittings, even if explicitly constructing C would be computationally expensive
– It may be possible to represent C implicitly and sparsely, even if C is not actually sparse

10 Our example
Choose N to be the diagonal and sub-diagonal of A:

$N = \begin{pmatrix} d_1 & & & \\ s_2 & d_2 & & \\ & \ddots & \ddots & \\ & & s_n & d_n \end{pmatrix}$

$N^{-1}$ is then a full lower-triangular matrix.
Computing $C = N^{-1} M$ explicitly is too expensive
– Compute $x_m = \sum_{j=0}^{m} (N^{-1} M)^j N^{-1} b$ instead

11 Computing N^{-1}
Using O(n) storage and precomputation time, any element of $N^{-1}$ can be computed in constant time
– Define $T(1) = 1$, $T(i+1) = T(i)\, s_{i+1}/d_{i+1}$
– Then $N^{-1}_{ij} = 0$ if $i < j$; $\;1/d_i$ if $i = j$; $\;(-1)^{i-j}\, T(i)/(T(j)\, d_j)$ otherwise
The entire $N^{-1}$, if needed, can be computed in O(n^2) time
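A direct transcription of this recurrence into Python (0-based indices; as slide 14 notes, the case $s_i = 0$ needs minor modifications, which this sketch omits):

```python
import numpy as np

def make_ninv_entry(d, s):
    """O(n) precomputation for constant-time access to entries of N^{-1},
    where N is lower bidiagonal with diagonal d[0..n-1] and subdiagonal
    s[1..n-1] (s[0] is unused). Assumes all d[i] and s[i] are nonzero."""
    n = len(d)
    T = np.ones(n)
    for i in range(1, n):
        T[i] = T[i - 1] * s[i] / d[i]   # T(i+1) = T(i) s_{i+1} / d_{i+1}

    def entry(i, j):
        if i < j:
            return 0.0
        if i == j:
            return 1.0 / d[i]
        return (-1) ** (i - j) * T[i] / (T[j] * d[j])

    return entry
```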

12 Dense implementation
Compute $N^{-1}$ and store it in O(n^2) space
Choose probabilities proportional to the weights (magnitudes) of the elements
Use the alias method to sample (sketched below)
– Precomputation time proportional to the number of elements
– Constant time to generate each sample
Estimate $\sum_{j=0}^{m} (N^{-1} M)^j N^{-1} b$
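The alias method (Walker/Vose) gives exactly the advertised costs: linear-time setup per distribution and constant time per sample. A standard sketch, not taken from the slides:

```python
import numpy as np

def alias_setup(probs):
    """Build alias tables in O(n) for a discrete distribution `probs`."""
    n = len(probs)
    prob = np.zeros(n)
    alias = np.zeros(n, dtype=int)
    scaled = np.asarray(probs, dtype=float) * n
    small = [i for i, q in enumerate(scaled) if q < 1.0]
    large = [i for i, q in enumerate(scaled) if q >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        prob[s], alias[s] = scaled[s], l   # pair a small cell with a large one
        scaled[l] -= 1.0 - scaled[s]
        (small if scaled[l] < 1.0 else large).append(l)
    for i in small + large:                # numerical leftovers
        prob[i] = 1.0
    return prob, alias

def alias_draw(prob, alias, rng):
    """Draw one sample in O(1): pick a cell, then keep it or take its alias."""
    i = rng.integers(len(prob))
    return i if rng.random() < prob[i] else alias[i]
```

One table is built per transition distribution, e.g., per column of the probability matrix, so the total setup cost is proportional to the number of elements.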

13 Experimental results (walk length = 2)
[Figure: results plot not preserved in the transcript]

14 Sparse implementation
We cannot use O(n^2) space or time!
A sparse implementation for M is simple
Sparse representation of $N^{-1}$:
– Choose $P_{ij} = 0$ if $i < j$, and $P_{ij} = 1/(n-j+1)$ otherwise, i.e., the row index is sampled from the uniform distribution over the nonzero positions of column $j$
– Choose $W_{ij} = N^{-1}_{ij} / P_{ij}$
– Constant time to determine any $W_{ij}$ (see the sketch below)
– Minor modifications are needed when some $s_i = 0$
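A sketch of this column sampling, reusing the constant-time $N^{-1}$ accessor from slide 11 (0-based indices, so the uniform probability becomes $1/(n-j)$; names are illustrative):

```python
import numpy as np

def sample_ninv_column(j, n, ninv_entry, rng):
    """Sample a row index i uniformly over the lower-triangular positions
    of column j of N^{-1} and return (i, W_ij), where W_ij is the entry
    divided by its sampling probability so the estimate stays unbiased."""
    i = rng.integers(j, n)      # uniform over i = j, ..., n-1
    p = 1.0 / (n - j)
    return i, ninv_entry(i, j) / p

# Example wiring (hypothetical): entry = make_ninv_entry(d, s)
# i, w = sample_ninv_column(2, len(d), entry, np.random.default_rng(0))
```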

15 Parallelization
MC is "embarrassingly" parallel: identical algorithms are run independently on each processor, with only the random number sequences differing
[Figure: Proc 1/RNG 1, Proc 2/RNG 2, Proc 3/RNG 3 producing independent estimates 23.2, 23.6, 23.7, 23.5]
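A minimal sketch of this replicated scheme in Python, with one independent RNG stream per process; `mc_solve_once` is a hypothetical stand-in for the full solver:

```python
import numpy as np
from multiprocessing import Pool

def mc_solve_once(rng, num_walks):
    # Hypothetical stand-in: in practice this runs `num_walks` random
    # walks as on slide 4 and returns the resulting estimate.
    return rng.normal(loc=23.5, scale=0.2)  # dummy value for illustration

def run_replica(seed, num_walks=25_000):
    # Identical algorithm on every processor; only the RNG stream differs.
    rng = np.random.default_rng(seed)
    return mc_solve_once(rng, num_walks)

if __name__ == "__main__":
    seeds = [1, 2, 3]                       # one independent stream per process
    with Pool(len(seeds)) as pool:
        estimates = pool.map(run_replica, seeds)
    print(np.mean(estimates))               # combine the replicated estimates
```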

16 MPI vs OpenMP on Origin 2000
Cache misses cause poor performance of the OpenMP parallelization

17 Conclusions and future work
Demonstrated that it is possible to have effective MC implementations with non-diagonal splittings too
Need to extend this to better iterative schemes
Non-replicated parallelization needs to be considered

