P-GAS: Parallelizing a Many-Core Processor Simulator Using PDES Huiwei Lv, Yuan Cheng, Lu Bai, Mingyu Chen, Dongrui Fan, Ninghui Sun Institute of Computing.


1 P-GAS: Parallelizing a Many-Core Processor Simulator Using PDES Huiwei Lv, Yuan Cheng, Lu Bai, Mingyu Chen, Dongrui Fan, Ninghui Sun Institute of Computing Technology lvhuiwei@ncic.ac.cn PADS 2010, May 18, 2010

2 Motivation: Multi-core platforms are now common (images courtesy of Sun® UltraSPARC T2, AMD® Opteron 6000, Intel® Nehalem). System simulators are still sequential.

3 Motivation: Multi-core platforms are now common (images courtesy of Sun® UltraSPARC T2, AMD® Phenom, Intel® Nehalem). System simulators are still sequential, so the extra cores are wasted and simulation speed is limited by single-core performance.

4 Poor Scalability of Single-threaded Simulator: Slowdown grows exponentially. A single-threaded simulator is too slow to simulate future many-core systems with 1000+ cores.

5 Goal: fast and accurate computer system simulation. [Figure: accuracy (functional to cycle-accurate) against speed (slowdown)] Target: a 10x speedup without loss of accuracy.

6 Outline: Motivation; Implementation (Background, From DES to PDES, Optimization); Evaluation; Conclusion

7 Godson-T Architecture Simulator (GAS): built on Discrete Event Simulation (DES). There is one global event queue; each event is assigned to a sinker; new events are inserted back into the event queue. Events are fine-grained (figure: EVENT A, EVENT B).
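The single-queue DES loop described on this slide can be sketched as follows. This is a minimal illustration, not GAS code: the `Simulator` class, the `core` and `cache` sinkers, and the latencies are all hypothetical names and values chosen for the example.

```python
import heapq
import itertools


class Simulator:
    """Sequential DES: one global queue ordered by (time, insertion order)."""

    def __init__(self):
        self.now = 0
        self._queue = []
        self._seq = itertools.count()  # tie-breaker for equal timestamps

    def schedule(self, time, sinker, payload):
        # Insert a new event into the single global event queue.
        heapq.heappush(self._queue, (time, next(self._seq), sinker, payload))

    def run(self):
        # Pop the earliest event and hand it to its sinker until the queue is empty.
        while self._queue:
            time, _, sinker, payload = heapq.heappop(self._queue)
            self.now = time
            sinker(self, payload)  # a sinker may schedule further events


log = []


def core(sim, payload):
    # Hypothetical sinker: a core that issues one load after it starts.
    log.append((sim.now, "core", payload))
    if payload == "start":
        sim.schedule(sim.now + 1, cache, "load")


def cache(sim, payload):
    # Hypothetical sinker: a cache that answers two cycles later.
    log.append((sim.now, "cache", payload))
    sim.schedule(sim.now + 2, core, "data")


sim = Simulator()
sim.schedule(0, core, "start")
sim.run()
```

Because every event passes through the same heap, the loop is inherently sequential, which is the limitation the later slides address.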

8 SimK: PDES Framework. Open source. Conservative PDES. Highly optimized: pthreads, lock-free, user-level thread scheduling. Modularized: use the SimK API to implement an LP. [Figure labels: LPs; API; SimK core (schedule, exec; commu, sync, buffer, deploy); Host]

9 From DES to PDES: Separate the global queue. Group sinkers into logical processes (LPs), with one queue per LP. Events that cross LPs are wrapped with a PDES time wrapper. [Figure labels: router, core, cache, PDES time wrapper, LP]

10 E.g. Router Event: Router 0 sends an event to router 1. Before: router 0, core 0, cache 0, router 1, core 1 and cache 1 share one event queue. After: LP 0 holds router 0, core 0 and cache 0; LP 1 holds router 1, core 1 and cache 1; the event crosses between LPs through the PDES time wrapper.
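The router example can be sketched in code as below. This is an illustrative guess at the structure, not the SimK API: `LogicalProcess`, `TimedEvent` and `send` are hypothetical names, and the "time wrapper" is modelled simply as an event that carries its receive timestamp into another LP's inbox.

```python
import heapq
from collections import deque
from dataclasses import dataclass


@dataclass
class TimedEvent:
    """Hypothetical stand-in for the PDES time wrapper: an event plus its timestamp."""
    recv_time: int
    target: str
    payload: str


class LogicalProcess:
    def __init__(self, name, sinkers):
        self.name = name
        self.sinkers = set(sinkers)
        self.queue = []        # local event queue (one per LP)
        self.inbox = deque()   # wrapped events arriving from other LPs
        self._seq = 0

    def push_local(self, time, target, payload):
        heapq.heappush(self.queue, (time, self._seq, target, payload))
        self._seq += 1

    def drain_inbox(self):
        # Unwrap remote events and merge them into the local queue.
        while self.inbox:
            ev = self.inbox.popleft()
            self.push_local(ev.recv_time, ev.target, ev.payload)


def send(lps, src, time, target, payload):
    """Insert locally if the target lives in the sending LP, otherwise wrap the event."""
    for lp in lps:
        if target in lp.sinkers:
            if lp is src:
                lp.push_local(time, target, payload)
            else:
                lp.inbox.append(TimedEvent(time, target, payload))
            return
    raise KeyError(target)


lp0 = LogicalProcess("LP 0", ["router0", "core0", "cache0"])
lp1 = LogicalProcess("LP 1", ["router1", "core1", "cache1"])
lps = [lp0, lp1]

send(lps, lp0, 5, "cache0", "fill")    # stays inside LP 0
send(lps, lp0, 5, "router1", "flit")   # crosses to LP 1, so it is wrapped
pending_remote = len(lp1.inbox)
lp1.drain_inbox()
```

Only the cross-LP path pays for wrapping; events between sinkers in the same LP go straight into that LP's own queue.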

11 Events from DES to PDES: Single thread → multiple threads, using conservative PDES. [Figure: simulation time of Threads 1 to 4, with a 1-cycle event dependence between them]
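One simple way to honour a 1-cycle dependence conservatively is to advance all threads in lockstep windows of one cycle, since no LP can affect another sooner than that. The sketch below uses Python threads, locks and barriers purely for illustration; the slides describe SimK as lock-free with user-level scheduling, so this is not its mechanism. `run_conservative`, the ring topology and the `"token"` payload are assumptions made for the example.

```python
import threading

LOOKAHEAD = 1  # cycles; a cross-LP event arrives at least this much later


def run_conservative(num_lps, end_time):
    """Barrier-synchronised conservative PDES over a ring of LPs."""
    inbox = [[] for _ in range(num_lps)]       # events for the current window
    next_inbox = [[] for _ in range(num_lps)]  # events for the next window
    locks = [threading.Lock() for _ in range(num_lps)]
    barrier = threading.Barrier(num_lps)
    trace = [[] for _ in range(num_lps)]
    inbox[0].append((0, "token"))              # initial event at cycle 0

    def worker(i):
        dst = (i + 1) % num_lps                # each LP forwards to its neighbour
        for _window_start in range(0, end_time, LOOKAHEAD):
            # Safe: nothing sent during this window can land inside it.
            for time, payload in inbox[i]:
                trace[i].append((time, payload))
                with locks[dst]:
                    next_inbox[dst].append((time + LOOKAHEAD, payload))
            barrier.wait()                     # every LP finished the window
            inbox[i], next_inbox[i] = next_inbox[i], []
            barrier.wait()                     # every LP swapped its buffers

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_lps)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return trace


trace = run_conservative(num_lps=4, end_time=6)
```

With a lookahead this small, every simulated cycle costs two barrier waits, which illustrates why synchronisation overhead dominates when there are many LPs.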

12 Grouping Into Big LPs. Problem: avg. speedup is only 1.8x with 16 threads (prototype with 16 1-core LPs). Cause: too many LPs and an extremely small lookahead lead to high sync cost. Solution: group adjacent LPs into one big LP.
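To see why grouping adjacent LPs helps, the sketch below counts how many neighbour links cross an LP boundary before and after grouping. It assumes a 2D mesh of cores and rectangular tiles; the 8x8 mesh size, the 2x2 tile size and the function names are hypothetical and are not taken from the slides.

```python
def big_lp_of(x, y, tile_w, tile_h, mesh_w):
    """Map mesh coordinate (x, y) to the index of the big LP (tile) that owns it."""
    tiles_per_row = mesh_w // tile_w
    return (y // tile_h) * tiles_per_row + (x // tile_w)


def cross_lp_links(mesh_w, mesh_h, tile_w, tile_h):
    """Count mesh links whose two endpoints belong to different LPs."""
    count = 0
    for y in range(mesh_h):
        for x in range(mesh_w):
            here = big_lp_of(x, y, tile_w, tile_h, mesh_w)
            # Link to the right-hand neighbour.
            if x + 1 < mesh_w and big_lp_of(x + 1, y, tile_w, tile_h, mesh_w) != here:
                count += 1
            # Link to the neighbour below.
            if y + 1 < mesh_h and big_lp_of(x, y + 1, tile_w, tile_h, mesh_w) != here:
                count += 1
    return count


one_core_lps = cross_lp_links(8, 8, 1, 1)  # every core is its own LP
grouped_lps = cross_lp_links(8, 8, 2, 2)   # 2x2 adjacent cores per big LP
```

Under these assumptions grouping cuts the number of links that need cross-LP synchronisation from 112 to 48, and traffic between cores in the same tile no longer needs wrapping at all.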

13 Final Parallelized Version: Parallel Discrete Event Simulation. Sinkers are grouped into big LPs; LPs are bound to threads using the SimK API; time sync between LPs uses PDES; scheduling and execution run under the SimK framework. [Figure labels: API; SimK core (schedule, exec; commu, sync, buffer, deploy); Host]

14 Outline: Motivation; Implementation; Evaluation (Accuracy, Speedup); Conclusion

15 Evaluation Setup: GAS vs. P-GAS. Host: 4 quad-core AMD Opteron 8347 SMP, 16 cores total, 64 GB memory. Benchmark: SPLASH-2 kernels. Benchmark computing time is measured in wall-clock time.

16 Cycle Count Error: Avg. cycle count error: 0.04%

17 P-GAS Speedup (16 threads, SPLASH-2 kernels): Avg. speedup is 9.8x; best speedup is 13.6x (LU, 16 threads); 5.3x super-linear speedup with 4 threads.
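These figures are easier to compare as parallel efficiency, i.e. speedup divided by thread count, where a value above 1 marks super-linear speedup. The helper below is only arithmetic on the numbers quoted on the slide; the function name and labels are illustrative.

```python
def parallel_efficiency(speedup, threads):
    """Speedup per thread; above 1.0 means super-linear."""
    return speedup / threads


# (speedup, threads) pairs as quoted on the slide.
results = {
    "average, 16 threads": (9.8, 16),
    "best (LU), 16 threads": (13.6, 16),
    "4 threads": (5.3, 4),
}

efficiency = {
    label: parallel_efficiency(speedup, threads)
    for label, (speedup, threads) in results.items()
}
super_linear = [label for label, eff in efficiency.items() if eff > 1.0]
```

Only the 4-thread result exceeds an efficiency of 1; the 16-thread results are well below it.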

18 Why super-linear speedup? (5.3x with 4 threads) More cores means more caches to use. The insert-to-queue time is shorter.

19 Conclusion: P-GAS uses PDES to speed up a cycle-accurate many-core processor simulator: 9.8x speedup on a 16-core SMP, cycle error < 0.04%. Highly optimized conservative PDES could be used in fast and accurate system simulation: multi-core/many-core processor simulation; SMP cluster, many-core cluster...

20 P-GAS: Parallelizing a Many-Core Processor Simulator Using PDES. Please email questions to lvhuiwei@ncic.ac.cn; the open source release of our PDES framework is at http://simk.sf.net

