Time dilation in RAMP. Zhangxi Tan and David Patterson, Computer Science Division, UC Berkeley.

1 Time dilation in RAMP
Zhangxi Tan and David Patterson
Computer Science Division, UC Berkeley

2 A time machine
Using RAMP as a datacenter simulator
–Vary DC configurations: processors, disks, network, etc.
–Evaluate different system implementations: MapReduce with a 10 Gbps, 2 ms delay or a 100 Gbps, 80 ms delay interconnect
–Explore and predict what happens if you upgrade the hardware in your cluster: more powerful CPUs, faster/larger disks
–Try things in the future!
RAMP inside

3 The problems
Emulate many fast computers in FPGAs
What are the problems?
–First comment half a year ago at the RadLab retreat: 100 MHz is too slow and can't reflect a GHz machine
–Targets are becoming more and more complex
Implementing them in an FPGA is desired, and so is cycle accuracy
How many cores can we put in an FPGA? (Original vision: 16-24 cores per chip. Now: 1 Leon on a V2P30, 2-3 on a V2P70)

4 Methodologies
RDL
–Target cycle, host cycle, start, stop, channel model…
Transfer data between units with extra start/stop control
Replace the original transfer logic with RDL control of the target clock: if there is no data, still send something to keep the target time “running”
A bad control logic implementation may cause deadlock
RDLize units (build channels, units) if you want them to talk with each other
–Compared to porting apps for MicroBlaze?
–Is RDLizing obvious and simple??
Model: event driven or clock driven?
Time dilation
–Remove target cycle control
Is stepping every clock cycle the way to debug a 1000-node system?
–Use a standard data transfer interface
–Rescale everything to a “virtual wall clock” and “slow down” events accordingly
Events: timer interrupts, data sent/received, etc.

5 Basic Idea
“Slow down” the passage of time to make the target appear faster
–10 ms wall clock time = 2 ms target time
Network: shorter time to send a packet -> BW increases, latency decreases
Disk: shorter time to read/write
CPU: shorter time to do computation
–The virtual wall clock is the time coordinate in the target; the implementation only controls event intervals
[Figure: wall clock and virtual wall clock timelines, comparing “No time dilation” with “Time dilation”; labels show 10 ms and 2 ms perceived event intervals]
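The scaling on this slide can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not code from the RAMP project: the function names are hypothetical, and the 100 Mbps link and 1000 µs latency are made-up inputs. Only the 10 ms = 2 ms ratio comes from the slide.

```python
from fractions import Fraction

def time_dilation_factor(wall_clock_us: int, target_us: int) -> Fraction:
    """Ratio of wall clock time to the time the target believes has passed."""
    return Fraction(wall_clock_us, target_us)

def perceived_interval_us(wall_clock_us: int, tdf: Fraction) -> Fraction:
    """An interval measured on the wall clock looks shorter to the target."""
    return Fraction(wall_clock_us) / tdf

def perceived_bandwidth_mbps(physical_mbps: int, tdf: Fraction) -> Fraction:
    """Same bits in less perceived time, so bandwidth appears multiplied."""
    return physical_mbps * tdf

def perceived_latency_us(physical_latency_us: int, tdf: Fraction) -> Fraction:
    """Latency appears divided by the same factor."""
    return Fraction(physical_latency_us) / tdf

# From the slide: 10 ms of wall clock time = 2 ms of target time.
tdf = time_dilation_factor(10_000, 2_000)

# Hypothetical physical link: 100 Mbps, 1000 us latency.
bw = perceived_bandwidth_mbps(100, tdf)
lat = perceived_latency_us(1000, tdf)
print(f"TDF={tdf}, perceived BW={bw} Mbps, perceived latency={lat} us")
```

Exact fractions are used instead of floats so that ratios such as 10 ms / 2 ms come out as exactly 5.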

6 Real world examples
Network
–Sending 100 Mb of data between two events, at the same rate with the same logic
–Real: 1 sec between the events, perceived BW: 100 Mbps
–Time dilation: 100 ms between the events, perceived BW: 1 Gbps
CPU and OS
–Timer interrupt before time dilation: every 10 ms
–Timer interrupt after time dilation: every 50 ms in wall clock time, 10 ms perceived in the target
–The OS updates its timer every 10 ms (jiffies), once per timer interrupt
–Reprogram the timer to slow the interrupt down
–No OS modifications
–No HW changes
–Speeds up the processor by x5
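A minimal sketch of the timer example follows, assuming the timer counts cycles of a 50 MHz clock (an assumption made for illustration; the slide gives only the 10 ms and 50 ms periods). All function names are hypothetical.

```python
def dilated_timer_period_ms(target_tick_ms: int, tdf: int) -> int:
    """Wall clock period to program so the target still perceives target_tick_ms."""
    return target_tick_ms * tdf

def timer_reload_cycles(period_ms: int, timer_clock_hz: int) -> int:
    """Reload value for a down-counting timer (assumed timer model)."""
    return period_ms * timer_clock_hz // 1000

def perceived_elapsed_ms(wall_clock_ms: int, tick_ms: int, tdf: int) -> int:
    """Target time after wall_clock_ms: the OS adds tick_ms per interrupt."""
    interrupts = wall_clock_ms // dilated_timer_period_ms(tick_ms, tdf)
    return interrupts * tick_ms

TICK_MS = 10             # OS tick from the slide
TDF = 5                  # 50 ms wall clock perceived as 10 ms
TIMER_HZ = 50_000_000    # assumption: timer clocked at 50 MHz

period = dilated_timer_period_ms(TICK_MS, TDF)
reload_value = timer_reload_cycles(period, TIMER_HZ)
elapsed = perceived_elapsed_ms(1000, TICK_MS, TDF)
print(period, reload_value, elapsed)
```

Over one second of wall clock time the OS sees 20 interrupts and so believes 200 ms have passed, while the processor has done a full second of work: the x5 speedup on the slide.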

7 Experiments
HW emulator (FPGA): 32-bit Leon3 at 50 MHz, 90 MHz DDR memory, 8K L1 cache (4K inst and 4K data)
–Target system: Linux 2.6 kernel, Leon @ 50 MHz / 250 MHz / 500 MHz / 1 GHz / 2 GHz
–Run the Dhrystone benchmark
–Tomorrow: HW/SW co-simulation example
Concept
Time Dilation Factor = wall clock time / emulated clock time
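As a worked example of the definition, the sketch below computes the time dilation factor for each target frequency listed on the slide. It assumes one host cycle emulates one target cycle, so the factor reduces to target frequency over host frequency; that assumption and the names are ours, not stated in the slides.

```python
from fractions import Fraction

HOST_HZ = 50_000_000  # Leon3 on the FPGA, from the slide

TARGETS_HZ = {
    "50 MHz": 50_000_000,
    "250 MHz": 250_000_000,
    "500 MHz": 500_000_000,
    "1 GHz": 1_000_000_000,
    "2 GHz": 2_000_000_000,
}

def time_dilation_factor(target_hz: int, host_hz: int = HOST_HZ) -> Fraction:
    """TDF = wall clock time / emulated clock time for the same cycle count.

    Assumes one host cycle emulates one target cycle:
    wall time per cycle = 1 / host_hz, emulated time per cycle = 1 / target_hz.
    """
    return Fraction(target_hz, host_hz)

tdfs = {name: time_dilation_factor(hz) for name, hz in TARGETS_HZ.items()}
for name, tdf in tdfs.items():
    # Emulating 1 s of target time takes tdf seconds of wall clock time.
    print(f"{name}: TDF = {tdf}")
```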

8 Dhrystone result (w/o memory TD)
How close to a 3 GHz x86 (~8000 Dhrystone MIPS)?
Memory, Cache, CPI

9 Problems
Similar to time dilation in VMs
–To Infinity and Beyond: Time-Warped Network Emulation, NSDI 06
Everything is scaled linearly, including memory!
–VM is lucky: networking code can fit in cache easily.
–RAMP has more knobs to tweak.
Solution: slow down the memory and redo the experiment

10 Dhrystone w. Memory TD
Keep the memory access latency constant
- 90 MHz DDR DRAM w. 200 ns latency in all targets (50 MHz to 2 GHz)
- Latency is pessimistic, but reflects the trend
RAMP Blue result + time dilation vs. real system?
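To see what a constant 200 ns latency implies, the sketch below converts it into target cycles at each target frequency and compares it with naive linear scaling. The naive baseline assumes the memory costs 200 ns at the 50 MHz host and keeps the same cycle count at every target frequency; that baseline and the function names are illustrative assumptions.

```python
HOST_HZ = 50_000_000
LATENCY_NS = 200  # constant target-time memory latency from the slide
TARGETS_HZ = [50_000_000, 250_000_000, 500_000_000,
              1_000_000_000, 2_000_000_000]

def latency_in_target_cycles(latency_ns: int, target_hz: int) -> int:
    """Cycles the target CPU must wait so memory still appears latency_ns away."""
    return latency_ns * target_hz // 1_000_000_000

def naive_perceived_latency_ns(latency_ns: int, target_hz: int,
                               host_hz: int = HOST_HZ) -> int:
    """Without memory TD the cycle count stays fixed, so latency shrinks with TDF."""
    return latency_ns * host_hz // target_hz

cycles = [latency_in_target_cycles(LATENCY_NS, hz) for hz in TARGETS_HZ]
naive_ns = [naive_perceived_latency_ns(LATENCY_NS, hz) for hz in TARGETS_HZ]
for hz, c, n in zip(TARGETS_HZ, cycles, naive_ns):
    print(f"{hz // 1_000_000} MHz: {c} cycles with memory TD, "
          f"{n} ns perceived without it")
```

At 2 GHz the memory costs 400 cycles with memory time dilation, whereas linear scaling would make it look like 5 ns memory.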

11 Limitation of Naïve time dilation
Fixed CPI (memory/CPU) model
Next step
–Variable time dilation factor: distribution and state (statistical model)
–Emulate OOO with time dilation
Peek at each instruction and dilate it
–Going deterministic? No, I’ll do statistical
Proposed model
No extra control between units
Reprogram the Time Dilation Counter (TDC) in each unit to get a different target configuration
[Figure: proposed model; each unit has a time dilation counter]

12 Discussions!

