1 Resolution of large symmetric eigenproblems on a world-wide grid Laurent Choy, Serge Petiton, Mitsuhisa Sato CNRS/LIFL HPCS Lab. University of Tsukuba.

1 Resolution of large symmetric eigenproblems on a world-wide grid
Laurent Choy, Serge Petiton, Mitsuhisa Sato
CNRS/LIFL; HPCS Lab., University of Tsukuba
2nd NEGST workshop, Tokyo, May 28-29, 2007

2 Outline
- Introduction
- Distribution of the numerical method
- Experiments
  - Experiments on world-wide grids: platforms, numerical settings
  - Experiments on Grid'5000: motivations, platforms, numerical settings
  - Results
- YML
  - Progress of YML
  - YvetteML workflow of the real symmetric eigenproblem
  - First experiments
- Conclusion

3 Outline: Introduction

4 Introduction
- Huge number of nodes connected to the Internet
  - Clusters and NOWs of institutions, PCs of individual users
  - Volunteer
- Constant availability of nodes, on-demand access
- HPC and large Grid Computing are complementary
  - We do not target the highest performance
  - We target a different community of users
- Why the real symmetric eigenproblem?
  - Requires a lot of resources on the nodes
  - Communications, synchronization points
  - Useful problem
  - Few similar studies for very large Grid Computing

5 Outline: Distribution of the numerical method

6 Distribution of the numerical method (1/2)
- Real symmetric eigenproblem
  - Au = λu, A real symmetric
- Main steps:
  - Lanczos tridiagonalization
    - T = Q^T A Q, T real symmetric tridiagonal
    - Data accessed by means of matrix-vector products (MVP)
  - Bisection and Inverse Iteration
    - Tv = λv: the eigenvalues of T (Ritz eigenvalues) approximate eigenvalues of A
    - Communication-free parallelism: task-farming
  - Ritz eigenvectors computations (u)
  - Accuracy tests: ||Au - λu||_2 < eps
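The tridiagonalization step listed on this slide can be sketched in a few lines. The code below is only an illustration, not the authors' implementation: it is plain, single-process Python, it uses full reorthogonalization rather than the restarted scheme of the talk, and the names `lanczos`, `matvec`, and `dot` as well as the 4x4 test matrix are hypothetical. Its one point of contact with the slide is that A is touched only through a matrix-vector product callback.

```python
import math
import random


def dot(x, y):
    """Euclidean inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(x, y))


def lanczos(matvec, n, m, seed=0):
    """Run at most m Lanczos steps on a symmetric operator of order n.

    Returns (alpha, beta, Q): the diagonal and off-diagonal of the
    tridiagonal matrix T = Q^T A Q, and the Lanczos vectors (columns of Q).
    """
    rng = random.Random(seed)
    q = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    norm = math.sqrt(dot(q, q))
    Q = [[x / norm for x in q]]
    alpha, beta = [], []
    for j in range(m):
        w = matvec(Q[j])                 # the only access to A (one MVP)
        alpha.append(dot(w, Q[j]))
        # Full reorthogonalization against all stored columns, applied twice.
        for _ in range(2):
            for qi in Q:
                c = dot(w, qi)
                w = [wk - c * qk for wk, qk in zip(w, qi)]
        b = math.sqrt(dot(w, w))
        if j == m - 1 or b < 1e-12:      # last step, or invariant subspace found
            break
        beta.append(b)
        Q.append([x / b for x in w])
    return alpha, beta, Q


# Small symmetric test matrix (illustrative values only).
A = [[4.0, 1.0, 0.5, 0.0],
     [1.0, 3.0, 0.2, 0.1],
     [0.5, 0.2, 2.0, 0.3],
     [0.0, 0.1, 0.3, 1.0]]


def matvec(x):
    """Dense MVP; on the grid this would be the distributed product with A."""
    return [dot(row, x) for row in A]


alpha, beta, Q = lanczos(matvec, n=4, m=4)
```

Passing the product as a callback keeps the iteration independent of how A is stored or distributed, which is the property the slide relies on.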

7 Distribution of the numerical method (2/2)
- Reducing the memory usage
  - Out-of-core
  - Restarted scheme
    - Reorthogonalization
    - Bisection, Inverse Iteration
    - Reduces the disk usage too
- Volume of communications
  - Data persistence (A and Q)
- Number of communications
  - Task-farming
- Other issue to be improved
  - Distribution of A
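Task-farming suits the bisection step because each eigenvalue of the tridiagonal matrix T can be located independently of the others, using only the diagonal and off-diagonal of T. The sketch below illustrates this under stated assumptions: it is a textbook Sturm-count bisection, not the code used in the experiments, the function names are hypothetical, and a local `ThreadPoolExecutor` merely stands in for remote workers (the talk uses OmniRPC). The example T is a small Toeplitz matrix whose eigenvalues have a known closed form, so the result can be compared against it.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def count_below(alpha, beta, x):
    """Number of eigenvalues of T strictly below x (Sturm sequence count)."""
    count, d = 0, 1.0
    for i in range(len(alpha)):
        off = beta[i - 1] ** 2 if i > 0 else 0.0
        d = alpha[i] - x - off / d
        if d == 0.0:
            d = 1e-300               # avoid dividing by zero at the next step
        if d < 0.0:
            count += 1
    return count


def kth_eigenvalue(alpha, beta, k, tol=1e-12):
    """k-th smallest eigenvalue of T (k = 0, 1, ...) by bisection."""
    n = len(alpha)
    # Gershgorin discs give an interval containing every eigenvalue.
    radii = [(abs(beta[i - 1]) if i > 0 else 0.0)
             + (abs(beta[i]) if i < n - 1 else 0.0) for i in range(n)]
    lo = min(a - r for a, r in zip(alpha, radii))
    hi = max(a + r for a, r in zip(alpha, radii))
    while hi - lo > tol * max(1.0, abs(lo), abs(hi)):
        mid = 0.5 * (lo + hi)
        if count_below(alpha, beta, mid) > k:
            hi = mid                 # at least k+1 eigenvalues lie below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Example T with diagonal 2 and off-diagonal 1 (illustrative values only).
alpha = [2.0, 2.0, 2.0, 2.0]
beta = [1.0, 1.0, 1.0]

# One independent task per eigenvalue; no task needs data from another.
with ThreadPoolExecutor(max_workers=4) as pool:
    ritz_values = list(pool.map(lambda k: kth_eigenvalue(alpha, beta, k),
                                range(len(alpha))))

# Closed form for this Toeplitz matrix: 2 - 2*cos(k*pi/(n+1)), k = 1..n.
exact = [2.0 - 2.0 * math.cos(k * math.pi / 5) for k in range(1, 5)]
max_error = max(abs(r - e) for r, e in zip(ritz_values, exact))
```

Each task needs only `alpha`, `beta`, and its index `k`, so the tasks exchange no data once the tridiagonal matrix has been sent.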

8 Outline: Experiments (platforms and numerical settings on world-wide grids and on Grid'5000)

9 World-wide grid experiments: experimental platforms, numerical settings (1/2)
- Computing and network resources
  - University of Tsukuba
    - Homogeneous dedicated clusters
    - Dual Xeon ~3 GHz, 1 to 4 GB
  - University of Lille 1
    - Heterogeneous NOWs
    - Celeron 1.4 GHz to P4 3.2 GHz
    - 128 MB to 1 GB
    - Shared with students
  - Internet

10 World-wide grid experiments: experimental platforms, numerical settings (2/2)
- 4 platforms
  - OmniRPC
  - 2 local platforms: 29 / 58 nodes, Lille
  - 2 world-wide platforms
    - 58 (29 Lille + 29 Tsukuba dual-proc.)
    - 116 (58 Lille + 58 Tsukuba dual-proc.)
- Matrix
  - N = 47792
  - 2.5 million elements, avg 48 nnz/row
- Parameters
  - m = 10, 15, 20, 25
  - k = 1, 2, 3, 4

11 Grid'5000 experiments: presentation, motivations
- Up to 9 sites distributed in France
  - Dedicated PCs with reservation policy
  - Fast and dedicated network
    - RENATER (1 Gbit/s to 10 Gbit/s)
  - PCs are homogeneous (few exceptions)
  - Homogeneous environment
    - (deployment strategy)
- For those experiments
  - Orsay: up to 300 single-CPU nodes
  - Lille: up to 60 single-CPU nodes
  - Nice: up to 60 dual-CPU nodes
  - Rennes: up to 70 dual-CPU nodes

12 Grid'5000 experiments: platforms and numerical settings (1/2)
- Step 1
  - Goal: improving previous analysis
  - Platforms
    - 29 Orsay, single-proc
    - 58 Orsay, single-proc
    - 58 Lille, Sophia dual-proc
    - 116 Orsay, Sophia dual-proc (1 core/proc)
    - + 116 Orsay, Lille, Sophia dual-proc (1 core/proc)
      - 1 process/dual-processor
  - Numerical settings
    - Matrix: N = 47792, 2.5 million elements, avg 48 nnz/row
    - Parameters
      - m = 10, 15, 20, 25
      - k = 1, 2, 3, 4

13 Grid'5000 experiments: platforms and numerical settings (2/2)
- Step 2
  - Goal: increasing the size of the problem (in progress)
  - N = 430128, 193 million elements
  - 7 OmniRPC relay nodes, 206 CPUs
    - 3 sites
  - 11 OmniRPC relay nodes, 412 CPUs
    - 4 sites
  - k = 1, m = 15
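A rough storage estimate gives a feel for the scale of step 2. The arithmetic below rests on assumptions that are not in the slides: the 193 million "elements" are taken to be stored nonzeros, values are 8-byte doubles, indices are 4-byte integers, and the matrix is held in a CSR-like layout. The variable names are illustrative.

```python
# Figures quoted on the slide.
N = 430_128            # matrix order
NNZ = 193_000_000      # "193 million elements", assumed to be stored nonzeros
M = 15                 # Lanczos subspace size m

# Assumptions made for this estimate only.
BYTES_VALUE = 8        # double precision
BYTES_INDEX = 4        # 32-bit indices

avg_nnz_per_row = NNZ / N

# CSR-like layout: one value and one column index per nonzero, plus row pointers.
csr_bytes = NNZ * (BYTES_VALUE + BYTES_INDEX) + (N + 1) * BYTES_INDEX

# Lanczos basis Q: m dense columns of length N.
q_bytes = N * M * BYTES_VALUE

print(f"average nonzeros per row: {avg_nnz_per_row:.1f}")
print(f"matrix A: {csr_bytes / 1e9:.2f} GB")
print(f"basis Q:  {q_bytes / 1e6:.1f} MB")
```

Under these assumptions A occupies a little over 2.3 GB, more than the 128 MB to 1 GB per node reported for the Lille machines, while the m = 15 basis vectors take about 52 MB.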

14 Outline: Experiments (results)

15 World-wide grid experiments: results
[Figure: results for each platform. Legend, with node counts:]
- 116: single-proc. Orsay + dual-proc. Tsukuba (all proc. used)
- 58: single-proc. Lille + dual-proc. Tsukuba (all proc. used)
- 58: single-proc. Lille
- 29: single-proc. Lille

16 Grid'5000 experiments: step 1 results
[Figure: results for each platform. Legend, with node counts:]
- 116: single-proc. Orsay + single-proc. Lille + dual-proc. Sophia (1 proc. used)
- 116: single-proc. Orsay + dual-proc. Sophia (all proc. used)
- 58: single-proc. Lille + dual-proc. Sophia (all proc. used)
- 58: single-proc. Orsay
- 29: single-proc. Orsay

17 Grid'5000 experiments: step 2 results
Details for N = 430128, m = 15, k = 1 (wall-clock times in seconds)

Number of CPUs                      206      412
Lanczos tridiagonalization
  Send new column of Q               22       20
  MVP                             10106    12311
  Reorthogonalization               129      159
Bisection + Inverse Iteration        <1        9
Ritz eigenvector                   119 (the two values are fused in the transcript)
|Au - λu| < eps                     691      810
Wall-clock time                   10962    13150

- Evaluation of the wall-clock time for 1 MVP with the matrix A
  - In the tridiagonalization: 15 (m) * 5 (number of restarts) = 75 MVPs
    - 134 s (206 CPUs) and 164 s (412 CPUs) per MVP
  - In the convergence tests: 5 (number of restarts) MVPs
    - 138 s (206 CPUs) and 162 s (412 CPUs) per MVP
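The per-MVP figures on this slide can be checked directly from the totals. The short calculation below divides the MVP time of the tridiagonalization by m * restarts = 75 products, and the time of the accuracy tests by the 5 restarts. Which total belongs to which CPU count is inferred from the per-MVP values quoted on the slide (10106 s and 691 s for 206 CPUs, 12311 s and 810 s for 412 CPUs); the dictionary layout and key names are illustrative.

```python
m, restarts = 15, 5

# Totals in seconds, read from the slide's table.
runs = {
    206: {"mvp_total": 10106, "test_total": 691},
    412: {"mvp_total": 12311, "test_total": 810},
}

for cpus, t in runs.items():
    # Tridiagonalization: m MVPs per restart.
    t["per_mvp_lanczos"] = t["mvp_total"] / (m * restarts)
    # Accuracy test |Au - lambda*u| < eps: one MVP per restart.
    t["per_mvp_test"] = t["test_total"] / restarts
    print(cpus, round(t["per_mvp_lanczos"], 1), round(t["per_mvp_test"], 1))
```

Both estimates agree to within a few seconds for each configuration, and one product with A costs more on 412 CPUs than on 206.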

18 Outline: YML (progress of YML, YvetteML workflow of the real symmetric eigenproblem, first experiments)

19 Progress of YML
- YML 1.0.5
  - Stability, error reporting
  - Collections of data
    - Out-of-core
  - Variable lists of parameters
  - Parameters in/out of the workflow
- Mainly developed at the PRiSM laboratory, University of Versailles
  - Olivier Delannoy, Nahid Emad

20 Resolution of the eigenproblem with YML
- No data persistence
- Future work: binary cache
- Re-usability / aggregation of components

21 Experiments with YML & OmniRPC back-end
[Table: columns "YML + OmniRPC back-end (wall-clock times in min)", "OmniRPC (wall-clock times in min)", "Overhead (in %)"; the values are not preserved in the transcript]
- Sources of overhead
  - No computation in the YvetteML workflow
  - Scheduler, (un)packing the parameters
  - Transfers of binaries
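The transcript does not say how the "Overhead (in %)" column was computed. A common convention, assumed here, is the extra wall-clock time of the YML + OmniRPC run relative to the plain OmniRPC run. The function name and the sample timings below are hypothetical and are not results from the talk.

```python
def overhead_percent(t_yml_min, t_omnirpc_min):
    """Relative overhead of the YML + OmniRPC run over plain OmniRPC, in percent.

    Assumed definition: 100 * (t_yml - t_omnirpc) / t_omnirpc.
    """
    return 100.0 * (t_yml_min - t_omnirpc_min) / t_omnirpc_min


# Made-up timings in minutes, for illustration only.
example = overhead_percent(12.0, 10.0)
print(f"{example:.1f} %")
```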

22 Outline: Conclusion

23 Conclusion (1/3)
- Reminder of the scope of this work
  - Large grid computing and HPC: complementary tools
    - Used by people who have no access to HPC
    - Significant computations (size of the problem)
  - We do not (cannot) target high performance
    - The resources are not dedicated
    - Slow networks, heterogeneous machines, external perturbations, etc.
  - Linear algebra problems are useful for many general applications
- Differences with HPC and cluster computing
  - We must not take a "speed-up" approach to the computations
  - Recommendations to save resources on nodes

24 Conclusion (2/3)
- We propose
  - A scalable real symmetric eigensolver for large grids
    - Next expected limit: disk space for much larger or very dense matrices
  - Before implementing the method, key choices must be made
    - Numerical methods and programming paradigms
    - Bisection (task-farming)
    - Restarted scheme (memory and disk)
    - Out-of-core (memory)
    - Data persistence (communication)
  - New version of YML
    - Workflow of the eigensolver and re-usable components
    - In progress

25 Conclusion (3/3)
- Topics of study for the eigensolver
  - Improving the distribution of A
  - Testing more matrices
    - Different kinds of matrices (e.g. sparse, dense)
    - Larger matrices
  - Scheduling level
    - Adapting the workload balancing to the heterogeneity of the platforms
- Current and future work on YML
  - Finishing the multi back-end support
  - Binary cache
