Flexible Control of Data Transfer between Parallel Programs Joe Shang-chieh Wu Alan Sussman Department of Computer Science University of Maryland, USA.

1 Flexible Control of Data Transfer between Parallel Programs Joe Shang-chieh Wu Alan Sussman Department of Computer Science University of Maryland, USA

2 Grid 2004
[Figure: coupled models: corona and solar wind; global magnetospheric MHD; thermosphere-ionosphere model; Rice convection model; particle and hybrid model]

3 What is the problem?
Coupling existing (parallel) programs
  – for physical simulations, more accurate answers can be obtained
  – for visualization, flexible transmission of data between simulation and visualization codes
Exchange data across shared or overlapped regions in multiple parallel programs
Couple multi-scale (space & time) programs
Focus on multiple time scale problems (when to exchange data)

4 Roadmap
Motivation
Approximate Matching
Matching properties
Performance results
Conclusions and future work

5 Is it important?
Petroleum reservoir simulations: multi-scale, multi-resolution code
Special issue of IEEE Computing in Science & Engineering, May/Jun 2004
“It’s then possible to couple several existing calculations together through an interface and obtain accurate answers.”
Earth System Modeling Framework: several US federal agencies and universities (http://www.esmf.ucar.edu)

6 Solving multiple space scales
1. Appropriate tools
2. Coordinate transformation
3. Domain knowledge

7 Matching is OUTSIDE components
Separate matching (coupling) information from the participating components
  – Maintainability: components can be developed/upgraded individually
  – Flexibility: change participants/components easily
  – Functionality: support variable-sized time interval numerical algorithms or visualizations
Matching information is specified separately by the application integrator
Runtime match via simulation time stamps

8 Separate codes from matching

Exporter Ap0
    define region Sr12
    define region Sr4
    define region Sr5
    ...
    Do t = 1, N, Step0
      ...  // computation jobs
      export(Sr12,t)
      export(Sr4,t)
      export(Sr5,t)
    EndDo

Importer Ap1
    define region Sr0
    ...
    Do t = 1, M, Step1
      import(Sr0,t)
      ...  // computation jobs
    EndDo

[Figure: exported regions Ap0.Sr12, Ap0.Sr4, Ap0.Sr5 connected to imported regions Ap1.Sr0, Ap2.Sr0, Ap4.Sr0]

Configuration file
    #
    Ap0 cluster0 /bin/Ap0 2 ...
    Ap1 cluster1 /bin/Ap1 4 ...
    Ap2 cluster2 /bin/Ap2 16 ...
    Ap4 cluster4 /bin/Ap4 4
    #
    Ap0.Sr12 Ap1.Sr0 REGL 0.05
    Ap0.Sr12 Ap2.Sr0 REGU 0.1
    Ap0.Sr4 Ap4.Sr0 REG 1.0
    #
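The last section of the configuration file reads naturally as a list of (exported region, imported region, policy, precision) rules. A minimal Python sketch of reading those lines follows. It assumes whitespace-separated fields and that the fourth field is the precision used by the matching policy. The names `MatchRule` and `parse_rules` are hypothetical and are not part of the authors' library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchRule:
    exporter: str     # "program.region", e.g. "Ap0.Sr12"
    importer: str     # "program.region", e.g. "Ap1.Sr0"
    policy: str       # e.g. "REGL", "REGU", "REG"
    precision: float  # assumed to be the tolerance p of the policy


def parse_rules(text):
    """Collect the matching rules, skipping '#' separators and program lines."""
    rules = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4 or fields[0].startswith("#"):
            continue
        exporter, importer, policy, precision = fields
        # Assumption: rule lines name both endpoints as "program.region".
        if "." not in exporter or "." not in importer:
            continue
        rules.append(MatchRule(exporter, importer, policy, float(precision)))
    return rules


SAMPLE = """\
#
Ap4 cluster4 /bin/Ap4 4
#
Ap0.Sr12 Ap1.Sr0 REGL 0.05
Ap0.Sr12 Ap2.Sr0 REGU 0.1
Ap0.Sr4 Ap4.Sr0 REG 1.0
#
"""

for rule in parse_rules(SAMPLE):
    print(rule)
```

Keeping the rules in a separate file like this is what lets the integrator change a policy or a participant without touching the exporter or importer code.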

9 Matching implementation
Library is implemented with POSIX threads
Each process in each program uses library threads to exchange control information in the background, while applications are computing in the foreground
One process in each parallel program runs an extra representative thread to exchange control information between parallel programs
  – Minimize communication between parallel programs
  – Keep collective correctness in each parallel program
  – Improve overall performance
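The background/foreground split can be illustrated with a small Python sketch. It uses Python's `threading` and `queue` modules as a stand-in for POSIX threads, and a single worker that answers match requests while the main thread computes. The function names, the sentinel protocol, and the "latest stamp not after x" rule are all assumptions made for illustration, not the library's actual design.

```python
import queue
import threading


def control_worker(requests, replies, exported):
    """Background thread: answer match requests until a None sentinel arrives."""
    while True:
        x = requests.get()
        if x is None:
            break
        # Hypothetical rule for this sketch: latest exported stamp not after x.
        match = max((t for t in exported if t <= x), default=None)
        replies.put((x, match))


def run(exported, wanted):
    requests, replies = queue.Queue(), queue.Queue()
    worker = threading.Thread(
        target=control_worker, args=(requests, replies, exported)
    )
    worker.start()
    for x in wanted:
        requests.put(x)                           # hand control work to the background
    foreground = sum(i * i for i in range(1000))  # stand-in for the computation
    requests.put(None)                            # tell the worker to stop
    worker.join()
    results = [replies.get() for _ in wanted]
    return foreground, results


foreground, results = run([1.1, 1.2, 1.5, 1.9], [1.3, 1.0])
print(results)
```

The foreground loop never blocks on a reply until it actually needs the data, which is the property the slide relies on to hide control traffic behind computation.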

10 Approximate Matching
Exporter Ap0 produces a sequence of data object A at simulation times 1.1, 1.2, 1.5, and 1.9
  – A@1.1, A@1.2, A@1.5, A@1.9
Importer Ap1 requests the same data object A at time 1.3
  – A@1.3
Is there a match for A@1.3? If yes, which one and why?

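11 Supported matching policies
Here x is the time stamp requested by the importer, f(x) is the time stamp of the exported data chosen as the match, and p is the precision given in the configuration file.

Policy   Definition
LUB      minimum f(x) with f(x) ≥ x
GLB      maximum f(x) with f(x) ≤ x
REG      f(x) minimizes |f(x)-x| with |f(x)-x| ≤ p
REGU     f(x) minimizes f(x)-x with 0 ≤ f(x)-x ≤ p
REGL     f(x) minimizes x-f(x) with 0 ≤ x-f(x) ≤ p
FASTR    any f(x) with |f(x)-x| ≤ p
FASTU    any f(x) with 0 ≤ f(x)-x ≤ p
FASTL    any f(x) with 0 ≤ x-f(x) ≤ p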
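The policy definitions can be applied directly to the A@1.3 request from the approximate matching slide. The Python sketch below is a reading of the table and makes three assumptions. All exported time stamps are known up front, whereas the real library matches at runtime as data is produced. Ties are broken by list order. The FAST policies return the first acceptable stamp. The function name `match` is hypothetical.

```python
def match(policy, exported, x, p=None):
    """Return the exported time stamp f(x) matched to request x, or None."""
    above = [t for t in exported if t >= x]
    below = [t for t in exported if t <= x]
    if policy == "LUB":
        return min(above, default=None)
    if policy == "GLB":
        return max(below, default=None)
    if p is None:
        raise ValueError("policy %s needs a precision p" % policy)
    if policy in ("REG", "FASTR"):
        candidates = [t for t in exported if abs(t - x) <= p]
    elif policy in ("REGU", "FASTU"):
        candidates = [t for t in above if t - x <= p]
    elif policy in ("REGL", "FASTL"):
        candidates = [t for t in below if x - t <= p]
    else:
        raise ValueError("unknown policy: %s" % policy)
    if not candidates:
        return None
    if policy.startswith("FAST"):
        return candidates[0]  # any acceptable stamp will do
    return min(candidates, key=lambda t: abs(t - x))  # closest acceptable stamp


exported = [1.1, 1.2, 1.5, 1.9]
for policy, p in [("LUB", None), ("GLB", None), ("REG", 0.25),
                  ("REGU", 0.1), ("REGL", 0.05)]:
    print(policy, p, match(policy, exported, 1.3, p))
```

Under these assumptions the answer to "is there a match for A@1.3?" depends on the rule: LUB picks A@1.5, GLB picks A@1.2, and REGL with precision 0.05 finds no match, because the closest earlier stamp is about 0.1 away.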

12 Acceptable ≠ Matchable
[Figure: timeline marking export time stamps t_e' and t_e'']

13 Region-type matches
[Figure: timeline marking export time stamps t_e']

14 Experimental setup
Question: how much overhead is introduced by runtime matching?
6 PIII-600 processors, connected by channel-bonded Fast Ethernet
u_tt = u_xx + u_yy + f(t,x,y): solve the 2-d diffusion equation by the finite element method
  – u(t,x,y): 512x512 array, on 4 processors (Ap1)
  – f(t,x,y): 32x512 array, on 2 processors (Ap2)
All data in Ap2 is sent (exported) to Ap1 using the matching criterion
Ap1 receives (imports) data in 3 different scenarios
1001 matches made for each scenario (results averaged over multiple runs)

15 Experiment result 1
Ap1 execution time (average), per process P10, P11, P12, P13
Case A: P10 341ms, P11 336ms, P12 610ms, P13 614ms
Case B: 620ms, 618ms
Case C: P10 624ms, P11 612ms, P12 340ms, P13 339ms

16 Experiment result 2

Ap1 pseudo code
    Do t = 1, N
      import(data, t)
      compute u
    EndDo

which expands to
    Do t = 1, N
      Request a match for data@t
      Receive data
      compute u
    EndDo

Ap1 overhead in the slowest process
          Matching time   Data transfer time   Computation time   Matching overhead
Case A    944us           6.1ms                605ms              13%
Case B    708us           2.9ms                613ms              20%
Case C    535us           6.8ms                614ms              7%
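The slide does not say how the matching overhead column is computed. The reported percentages are consistent with matching time divided by the sum of matching time and data transfer time, so the Python sketch below uses that formula as an inference from the numbers. The function name `matching_overhead` is hypothetical.

```python
def matching_overhead(matching_us, transfer_us):
    """Fraction of import time (matching + transfer) spent on matching."""
    return matching_us / (matching_us + transfer_us)


# (matching time, data transfer time) in microseconds, from the table above
cases = {
    "Case A": (944, 6100),
    "Case B": (708, 2900),
    "Case C": (535, 6800),
}

percent = {
    name: round(100 * matching_overhead(m, t)) for name, (m, t) in cases.items()
}
print(percent)  # {'Case A': 13, 'Case B': 20, 'Case C': 7}
```

Read this way, the overhead is relative to the cost of the import itself. Relative to the roughly 600ms of computation per case, the matching time is far smaller.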

17 Experiment result 3
Comparison of matching time
          Slowest process   Fastest process
Case A    944us (P13)       4394us (P11)
Case B    708us (P10)       3468us (Others)
Case C    535us (P10)       3703us (P13)
Fastest process (P11): high cost, remote match
Slowest process (P13): low cost, local match
High cost match can be hidden

18 Conclusions & Future work
Conclusions
  – Low overhead approach for flexible data exchange between different time scale e-Science components
Ongoing & future work
  – Performance experiments in Grid environment
  – Caching strategies to efficiently deal with slow importers
  – Real applications: space weather is the first one

19 End of Talk

20 Main components

21 Local and Remote requests

22 Space Science Application

