Parallel Simulation. Past, Present and Future C.D. Pham Laboratoire RESAM Université Claude Bernard Lyon 1


1 Parallel Simulation. Past, Present and Future C.D. Pham Laboratoire RESAM Université Claude Bernard Lyon 1

2 Past •Introduction –Discrete Event Simulation (DES) –Parallel DES and the synchronization problems •Chandy-Misra-Bryant rules –Architecture of a conservative LP –The « Safe is better » approach –The lookahead ability •Jefferson's point of view –Architecture of an optimistic LP –Time Warp •Mixed/adaptive approaches

3 Present •Ongoing projects –SSF, –TeD/GTW, –GloMoSim, –CSAM.

4 Future •Challenges & Perspectives –Ultra-large scale simulations, –Wide-area federation-based simulations, –WEB-based simulations.

5 PAST: the algorithms, only the algorithms!

6 Simulation •To simulate is to reproduce the behavior of a physical system with a model •In practice, computers are used to numerically simulate a logical model •Simulations are used for performance evaluation and prediction of complex systems –fluid dynamics, chemical reactions (continuous) –communication network models: routing, congestion avoidance, mobile... (discrete) •Simulation is more flexible than analytical methods

7 Discrete Event Simulation (DES) •assumption that a system changes its state at discrete points in simulation time [Figure: events a1 to a4 and d1 to d3 move the system between states S1, S2 and S3 along a time axis marked in time steps 0, t, 2t, ..., 6t]

8 DES concepts •fundamental concepts: –system state (variables) –state transitions (events) –simulation time: totally ordered set of values representing time in the system being modeled •the system state can only be modified upon reception of an event •modeling can be –event-oriented –process-oriented

9 Life cycle of a DES •a DES system can be viewed as a collection of simulated objects and a sequence of event computations •each event computation contains a time stamp indicating when that event occurs in the physical system •each event computation may: –modify state variables –schedule new events into the simulated future •events are stored in a local event list –events are processed in time stamp order –usually, no more events = termination

10 A simple DES model •link model delay = 5 •send processing time = 5 •receive processing time = 1 •packet arrival: P1 at 5, P2 at 12, P3 at 22 [Figure: nodes A and B joined by a link, with the local event list: e1 A receives packet P1; e2 A sends P1 to B; e3 A receives packet P2; e4 B receives P1 from A; e5 B sends ACK(P1) to A; e6 A sends P2 to B; e7 A receives ACK(P1); e8 B receives P2 from A; e9 A receives packet P3]

11 Why does it work? •events are processed in time stamp order •an event at time t can only generate future events with a timestamp greater than or equal to t (no event in the past) •generated events are inserted into the event list and sorted according to their timestamp –the event with the smallest timestamp is always processed first, –causality constraints are implicitly maintained.
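
The event-list mechanism described on slides 9 and 11 can be sketched in a few lines. This is only an illustrative sketch, not the code of any simulator named in this talk: the `Simulator` class, the `send`/`receive` handlers and the `LINK_DELAY` value are hypothetical names chosen for the example.

```python
import heapq
import itertools


class Simulator:
    """Minimal sequential discrete event simulator (illustrative sketch)."""

    def __init__(self):
        self.now = 0
        self._events = []               # the event list, kept as a min-heap
        self._seq = itertools.count()   # tie-breaker for equal timestamps

    def schedule(self, timestamp, handler, *args):
        # An event can only be scheduled in the present or the future.
        if timestamp < self.now:
            raise ValueError("cannot schedule an event in the past")
        heapq.heappush(self._events, (timestamp, next(self._seq), handler, args))

    def run(self):
        # The event with the smallest timestamp is always processed first;
        # the simulation terminates when the event list is empty.
        while self._events:
            timestamp, _, handler, args = heapq.heappop(self._events)
            self.now = timestamp
            handler(self, *args)


def demo():
    trace = []
    LINK_DELAY = 5  # hypothetical link delay

    def send(sim, packet):
        trace.append((sim.now, "send", packet))
        sim.schedule(sim.now + LINK_DELAY, receive, packet)

    def receive(sim, packet):
        trace.append((sim.now, "receive", packet))

    sim = Simulator()
    sim.schedule(12, send, "P2")   # scheduled out of order on purpose
    sim.schedule(5, send, "P1")
    sim.run()
    return trace
```

Because every handler can only schedule at `sim.now` or later, the trace returned by `demo()` is in non-decreasing timestamp order even though the two initial events were scheduled out of order.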

12 Why change? It's so simple! •models become larger and larger •the simulation time is overwhelming or the simulation is simply intractable •example: –parallel programs with millions of lines of code, –mobile networks with millions of mobile hosts, –ATM networks with hundreds of complex switches, –multicast model with thousands of sources, –ever-growing Internet, –and much more...

13 Some figures to convince... •ATM network models –Simulation at the cell-level, –200 switches, –1000 traffic sources, 50 Mbit/s, –155 Mbit/s links, –1 simulation event per cell arrival. •simulation time increases as link speed increases, •usually more than 1 event per cell arrival, •how scalable is traditional simulation? •More than 26 billion events to simulate 1 second! 30 hours if 1 event is processed in 1 µs

14 Parallel simulation - principles •execution of a discrete event simulation on a parallel or distributed system with several physical processors. •the simulation model is decomposed into several sub-models that can be executed in parallel –spatial partitioning, –temporal partitioning, •radically different from simple simulation replications.

15 Parallel simulation - pros & cons •pros –reduction of the simulation time, –increase of the model size, •cons –causality constraints are difficult to maintain, –need for special mechanisms to synchronize the different processors, –increases both the model and the simulation kernel complexity. •challenges –ease of use, transparency.

16 Parallel simulation - example [Figure: a network model partitioned into logical processes (LPs) that execute in parallel; packets are modeled as events]

17 A simple PDES model •link model delay = 5 •send processing time = 5 •receive processing time = 1 •packet arrival: P1 at 5, P2 at 12, P3 at 22 [Figure: the event list of the simple DES model split into one local event list per node. A: e1 A rec. packet P1; e2 A sends P1 to B; e3 A rec. packet P2; e6 A sends P2 to B; e7 A rec. ACK(P1); e9 A rec. packet P3. B: e4 B rec. P1 from A; e5 B sends ACK(P1); e8 B rec. P2 from A. A causality error (violation) is marked on the time axis t]

18 Synchronization problems •fundamental concepts –each Logical Process (LP) can be at a different simulation time –local causality constraints: events in each LP must be executed in time stamp order •synchronization algorithms –Conservative: avoids local causality violations by waiting until it's safe –Optimistic: allows local causality violations but provides mechanisms to recover from them at runtime

19 Chandy-Misra-Bryant rules Architecture of a conservative LP The « Safe is better » approach The lookahead ability

20 Architecture of a conservative LP –LPs communicate by sending messages with non-decreasing timestamps –each LP keeps a static FIFO channel for each LP with incoming communication –each FIFO channel (input channel, IC) has a clock c_i that ticks according to the timestamp of the topmost message, if any, otherwise it keeps the timestamp of the last message [Figure: LP A with one input channel from each of LP B, LP C and LP D, carrying messages tB1, tB2; tC3, tC4, tC5; tD4; channel clocks c_1 = tB1, c_2 = tC3, c_3 = tD3]

21 A simple conservative algorithm •each LP has to process events in time stamp order to avoid local causality violations •The Chandy-Misra-Bryant algorithm:
  while (simulation is not over) {
    determine the IC_i with the smallest clock c_i
    if (IC_i is empty) wait for a message
    else {
      remove topmost event from IC_i
      process event
    }
  }
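
One iteration of the loop above, together with the channel clock rule of slide 20, might look as follows. This is a minimal sketch under stated assumptions: `InputChannel` and `step` are hypothetical names, blocking is represented by returning `None` instead of actually waiting, and all clocks start at 0.

```python
from collections import deque


class InputChannel:
    """FIFO input channel whose clock follows the rule of slide 20."""

    def __init__(self):
        self.queue = deque()
        self.last_timestamp = 0   # timestamp of the last message removed

    def put(self, timestamp, payload):
        # Senders must emit non-decreasing timestamps on a given channel.
        tail = self.queue[-1][0] if self.queue else self.last_timestamp
        assert timestamp >= tail, "timestamps must be non-decreasing"
        self.queue.append((timestamp, payload))

    @property
    def clock(self):
        # Timestamp of the topmost message if any, else of the last message.
        return self.queue[0][0] if self.queue else self.last_timestamp


def step(channels):
    """Run one loop iteration; return the event, or None if the LP must block."""
    channel = min(channels, key=lambda c: c.clock)
    if not channel.queue:
        return None               # smallest clock belongs to an empty channel
    timestamp, payload = channel.queue.popleft()
    channel.last_timestamp = timestamp
    return timestamp, payload
```

With two channels holding timestamps (1, 4) and (3), `step` returns the events stamped 1 and 3, then `None`: the event stamped 4 is available but not yet known to be safe, because the second channel could still deliver a message with a timestamp between 3 and 4.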

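22 Safe but has to block [Figure: LP A with input channels IC1, IC2 and IC3 fed by LP B, LP C and LP D; a table lists, step by step, the channel with the minimum clock and the event processed, until the channel with the minimum clock is empty and LP A has to BLOCK]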

23 Blocks and even deadlocks! [Figure: source S, LPs A and B, and merge point M connected in a cycle; S sends all its messages to B, so M stays BLOCKED waiting on its other input channel]

24 How to solve deadlock: null-messages •null-messages for artificial propagation of simulation time •What frequency? [Figure: the same S, A, B, M topology; null-messages advance the channel clocks and M is UNBLOCKED]

25 How to solve deadlock: null-messages •a null-message indicates a Lower Bound Time Stamp •minimum delay between links is 4 •LP C initially at simulation time 0 •LP C sends a null-message with time stamp 4 •LP A sends a null-message with time stamp 8 •LP B sends a null-message with time stamp 12 •LP C can process event with time stamp 7 [Figure: LPs A, B and C in a cycle, with pending events stamped 11, 9, 10 and 7]

26 The lookahead ability •null-messages are sent by an LP to indicate a lower bound time stamp on the future messages that will be sent •null-messages rely on the « lookahead » ability –communication link delays –server processing time (FIFO) •lookahead is very dependent on the application model and needs to be explicitly identified

27 Lookahead for concurrent processing [Figure: LP A at simulation time T_A with lookahead L_A, alongside LP B, LP C and LP D; events are marked safe (s) or unsafe relative to the bound T_A + L_A]

28 What if lookahead is small? •a null-message indicates a Lower Bound Time Stamp •minimum delay between links is 1 •LP C initially at simulation time 0 •LP C sends a null-message with time stamp 1 •LP A sends a null-message with time stamp 2 •LP B sends a null-message with time stamp 3 •further rounds of null-messages are needed, with time stamps advancing up to 7, before LP C can process event with time stamp 7 [Figure: LPs A, B and C in a cycle, with pending events stamped 11, 9, 10 and 7]
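
The effect of lookahead on null-message traffic can be estimated with a small counting sketch. The model below is a deliberate simplification and not taken from the slides: `null_messages_until_safe` is a hypothetical helper, every LP in the ring is assumed to have the same lookahead, all LPs start at time 0, and an event is considered safe once the bound received by the blocked LP is greater than or equal to its time stamp.

```python
def null_messages_until_safe(lookahead, event_timestamp, ring_size=3):
    """Count null-messages sent around a ring before the event becomes safe.

    Each LP forwards a null-message stamped with the bound it has just
    received plus its own lookahead (assumed identical for all LPs).
    Returns (number of null-messages sent, bound finally received).
    """
    if lookahead <= 0:
        raise ValueError("null-messages need a strictly positive lookahead")
    bound = 0   # lower bound carried by the circulating null-message
    sent = 0
    while True:
        bound += lookahead
        sent += 1
        # Every ring_size-th message arrives back at the blocked LP.
        if sent % ring_size == 0 and bound >= event_timestamp:
            return sent, bound


if __name__ == "__main__":
    print(null_messages_until_safe(4, 7))   # large lookahead
    print(null_messages_until_safe(1, 7))   # small lookahead
```

With a lookahead of 4, three null-messages (stamped 4, 8, 12) are enough for the event stamped 7, as on slide 25. With a lookahead of 1 this simplified model needs three full rounds, that is nine null-messages, for the same event.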

29 Conservative: pros & cons •pros –simple, easy to implement –good performance when lookahead is large (communication networks, FIFO queue) •cons –pessimistic in many cases –large lookahead is essential for performance –no transparent exploitation of parallelism –performance may drop even with small changes in the model (adding preemption, adding one small lookahead link...)

30 Jefferson's point of view Architecture of an optimistic LP Time Warp

31 Architecture of an optimistic LP –LPs send timestamped messages, not necessarily in non-decreasing time stamp order –no static communication channels between LPs, dynamic creation of LPs is easy –each LP processes events as they are received, no need to wait for safe events –local causality violations are detected and corrected at runtime [Figure: LP A receiving messages tB1, tB2, tC3, tC4, tC5 and tD4 from LP B, LP C and LP D in a single input queue]

32 Processing events as they arrive [Figure: LP A has processed the messages stamped 11 (LP B), 13 (LP D), 18 (LP B), 22 (LP C), 25 (LP D), 28 (LP C) and 36 (LP B) when a message stamped 32 arrives from LP D] •what to do with late messages?

33 Time Warp. Rollback? How? •Late messages are handled with a rollback mechanism –undo false/incorrect local computations: state saving (save the state variables of an LP) or reverse computation –undo false/incorrect remote computations: anti-messages (anti-messages and (real) messages annihilate each other) –process late messages –re-process previous messages: processed events are NOT discarded!
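
The local part of this rollback mechanism can be sketched as below. It is a toy illustration only, with several assumptions that are not in the slides: `OptimisticLP` is a hypothetical class, the state is saved after every event (state saving period of 1), the event handler just appends the event to a trace, and anti-messages for remote computations are left out entirely.

```python
import copy


class OptimisticLP:
    """Toy Time Warp LP: state saving, late-message detection, rollback."""

    def __init__(self, state):
        self.state = state
        self.lvt = 0                                   # local virtual time
        self.processed = []                            # executed (timestamp, event)
        self.snapshots = [(0, copy.deepcopy(state))]   # (time, saved state)
        self.rollbacks = 0

    def _execute(self, timestamp, event):
        self.state["trace"].append(event)              # hypothetical handler
        self.lvt = timestamp
        self.processed.append((timestamp, event))
        self.snapshots.append((timestamp, copy.deepcopy(self.state)))

    def receive(self, timestamp, event):
        if timestamp >= self.lvt:
            self._execute(timestamp, event)            # process as it arrives
            return
        # Late message: restore the last state saved at or before its
        # timestamp, then re-process the events that were rolled back.
        self.rollbacks += 1
        redo = [e for e in self.processed if e[0] > timestamp]
        self.processed = [e for e in self.processed if e[0] <= timestamp]
        self.snapshots = [s for s in self.snapshots if s[0] <= timestamp]
        self.lvt, saved = self.snapshots[-1]
        self.state = copy.deepcopy(saved)
        for ts, ev in sorted([(timestamp, event)] + redo, key=lambda e: e[0]):
            self._execute(ts, ev)
```

Feeding events stamped 11, 13, 18 and 22 and then a late message stamped 15 triggers one rollback to the state saved at 13; the events stamped 18 and 22 are kept and re-processed after 15, which matches the remark that processed events are not discarded.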

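34 A pictured view of a rollback –The real rollback distance depends on the state saving period: a short period reduces rollback overhead but increases state saving overhead [Figure: an input queue with events stamped 11, 13, 18, 22, 25, 28, 32, 36, 43, 45, split into processed and unprocessed events, with the saved state points; a late message forces a rollback to an earlier state point, anti-messages are sent for the messages produced by the undone events, and the rolled-back events 28, 32 and 36 return to the unprocessed part of the queue]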

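35 Reception of an anti-message –may initiate a rollback if the corresponding positive message has already been processed, –may annihilate the corresponding positive message if it is still unprocessed, –may wait in the input queue if the corresponding positive message has not been received yet. [Figure: three input queues holding events stamped 22, 25, 28, 36, 43, 45 (and 48), one for each case: an anti-message for 25 triggers a rollback, an anti-message for 43 annihilates its unprocessed positive message, and an anti-message for 48 waits in the queue]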
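
The three cases above amount to a simple classification, sketched here. `receive_anti_message` and its set-based bookkeeping are hypothetical simplifications: a real kernel would also restore state and cancel the messages sent by the undone events.

```python
def receive_anti_message(msg_id, processed, unprocessed, pending_anti):
    """Classify the reception of an anti-message (three cases of slide 35).

    processed / unprocessed: sets of ids of positive messages already
    executed / still queued; pending_anti: anti-messages waiting for
    their positive counterpart.
    """
    if msg_id in processed:
        # The positive message was already executed: undo it, then annihilate.
        processed.discard(msg_id)
        return "rollback"
    if msg_id in unprocessed:
        # The positive message is still queued: both simply disappear.
        unprocessed.discard(msg_id)
        return "annihilate"
    # The positive message has not arrived yet: keep the anti-message.
    pending_anti.add(msg_id)
    return "wait"
```

A matching check would be needed on the arrival of a positive message, so that it is annihilated immediately when its anti-message is already waiting in `pending_anti`.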

36 Need for a Global Virtual Time •Motivations –an indicator that the simulation time advances –reclaim memory (fossil collection) •Basically, GVT is the minimum of –all LPs' logical simulation time –timestamp of all messages in transit •GVT guarantees that –events below GVT are definitive events (I/O) –no rollback can occur before the GVT –state points before GVT can be reclaimed –anti-messages before GVT can be reclaimed
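
Given a consistent snapshot of the system, the GVT definition and the fossil collection it enables reduce to a few lines. This sketch assumes that such a snapshot is already available, which is precisely the hard part in a distributed setting; `compute_gvt` and `fossil_collect` are hypothetical helper names, and keeping one state point older than GVT is an assumption made here so that a rollback to GVT itself stays possible.

```python
def compute_gvt(lp_times, in_transit_timestamps):
    """GVT = min over all LP simulation times and in-transit message timestamps."""
    return min(list(lp_times) + list(in_transit_timestamps))


def fossil_collect(snapshots, gvt):
    """Reclaim state points that can no longer be needed by a rollback.

    snapshots: list of (time, state) pairs sorted by time. The most recent
    state point strictly before GVT is kept; older ones are dropped.
    """
    older = [s for s in snapshots if s[0] < gvt]
    newer = [s for s in snapshots if s[0] >= gvt]
    return older[-1:] + newer
```

For example, with LP times 25, 30 and 18 and in-transit messages stamped 15 and 40, the GVT is 15: the message in transit, not the slowest LP, determines the bound.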

37 A pictured view of the GVT [Figure: time lines of LP A, LP B, LP C and LP D between the old GVT and the new GVT; events before the GVT are definitive events (D), events after it are conditional events (c)]

38 Optimistic overheads •Periodic state savings –states may be large, very large! –copies are very costly •Periodic GVT computations –difficult in a distributed architecture, –may block computations, •Rollback thrashing –cascaded rollback, no simulation progress! •Memory! –memory is THE limitation

39 Optimistic: pros & cons •pros –exploits all the parallelism in the model, lookahead is less important, –transparent to the end-user, –interactive simulations possible, –can be general-purpose. •cons –very complex, needs lots of memory, –large overheads (state saving, GVT, rollbacks...)

40 Mixed/adaptive approaches •General framework that (automatically) switches to conservative or optimistic •Adaptive approaches may determine at runtime the amount of conservatism or optimism [Figure: a spectrum from conservative to optimistic with mixed approaches in between, and a plot of performance against messages for the optimistic and conservative approaches]

41 PRESENT: how to survive? … and how to get money?

42 Parallel simulation today •Lots of algorithms have been proposed –variations on conservative and optimistic –adaptive approaches •Paradoxically few end-users –impossible to compete with sequential simulators in terms of user interface, generality, ease of use... •Ongoing research mainly focuses on –ultra-large scale simulations of networks, –tools and execution environments –composability and interoperability issues

43 Ongoing projects •DOMAINS/GloMoSim •SSF •TeD/GTW •CSAM

44 DOMAINS/GloMoSim project •Design of Mobile Adaptive Networks, DARPA/DAAB07-97-C-D321 •Provides a library for simulating millions of mobile nodes •Proves the efficiency of parallel simulation for scalability issues •Based on the PARSEC simulation language •Conservative or optimistic execution

45 Glomo objectives

46 Glomo libraries

47 SSF-Scalable Simulation Framework •DARPA/ITO (Next Generation Internet Program) and NSF/ANIR (Special Projects in Networking) •SSF proposes discrete event simulations of large complex systems, with serial and scalable parallel implementations •SSFNet is a collection of SSF-based models for simulating Internet protocols and networks •Based on YAWNS, a conservative kernel

48 SSFNet, modeling the Internet

49 TeD/GTW •TeD (Telecommunications Description Language) is a language for modeling telecommunication network elements and protocols (PNNI). •GTW is a general purpose parallel discrete event simulation executive using optimistic synchronization techniques. •The TeD compiler translates TeD models into C++ code which uses GTW for parallel simulation

50 Modeling with TeD

51 CSAM •CSAM: Conservative Simulator for ATM network Model •Simulation at the cell-level •C++ programming-style, predefined generic model of sources, switches, links... •Test-bed for parallel simulations on high-performance clusters.

52 Test case: 78-switch ATM network Distance-Vector Routing with dynamic link cost functions Connection setup, admission control protocols

53 CSAM - Some results... Routing protocol's reconfiguration time

54 CSAM - visualization tool

55 FUTURE: the great challenges! Ultra-Large scale simulations, Wide-area federation-based simulations, WEB-based simulations.

56 Ultra-large scale simulations •Millions of mobile nodes, •Thousands of multicast connections, •Full Internet simulation, •Ultra-large scale simulations require –lots of memory! lots of CPU! –new modeling techniques: reuse of model description, decoupling state from model description; –advanced memory management schemes: shared events, application memory regulation. •« Out of core » simulations.

57 Federation-based simulations •The cost of model development is increasing at a high rate. •Reuse and interoperability are key issues in the development phase of new models. •Need for a unified framework so that independent simulators can run together and achieve a given goal. •The DoD has proposed the High-Level Architecture framework for federation-based simulations

58 The HLA framework •The High Level Architecture calls for a federation of simulations to achieve interoperability and reuse of software. •Without HLA, simulations are mostly independent and interoperability is not easy. •HLA components: –10 Rules for the federation and the federates behavior –An Object Model Template to describe the simulation objects –An Interface Specification •Runtime Infrastructure (RTI): –Federation Management –Declaration Management –Object Management –Ownership Management –Time Management –Data Distribution Management [Figure: a federation in which logical simulations, hardware and human-in-the-loop real-time simulators, and display/statistics tools are connected through the RTI]

59 Wide-area interactive simulations [Figure: a human-in-the-loop flight simulator, a battlefield simulation display and a computer-based submarine simulator interconnected through the INTERNET]

60 WEB-Based simulation •Users build models and submit them on the web (meta-computing) –Hides the complexity of parallel simulation techniques –Provides computing resources for the users [Figure: candidate computing resources: ASCI Red (1st rank in the TOP500 list), Cplant?]

61 JTeD project, the Java-based TeD

62 Summary •Parallel simulation is a mature field •Applications, especially communication network models, are the centre of interest •The challenges are very-large scale simulations and re-usability of models •Real-time interactive simulations are desirable on a wide-area interconnection •As-fast-as-possible simulations will likely remain « indoor » (cluster, SMP)

63 Requirements put on networking •In wide-area simulation, data distribution relies mainly on multicast and broadcast operations •Near real-time behaviors are desirable for interactive simulation

64 References •Parallel simulation –K. M. Chandy and J. Misra, Distributed Simulation: A Case Study in Design and Verification of Distributed Programs, IEEE Trans. on Soft. Eng., 1979, pp. 440-452 –R. Fujimoto, Parallel Discrete Event Simulation, Comm. of the ACM, Vol. 33(10), Oct. 90, pp. 31-53 •HLA •Projects –GloMoSim –SSF –TeD/GTW –CSAM
