
1 The StageNet Fabric for Constructing Resilient Multicore Systems
Shantanu Gupta, Shuguang Feng, Amin Ansari, Jason Blome and Scott Mahlke
University of Michigan, Ann Arbor
University of Michigan Electrical Engineering and Computer Science

2 Journey of Silicon Technology
[Figure: CPU performance (log scale) across processor generations: 486, Pentium, Pentium II, Pentium III, Pentium 4, Core Duo, Core 2 Quad. Region labels: Perfect transistors; Rising Variability and Defects; Unreliable Silicon. Annotations: Memory redundancy; IBM z servers; Cell]

3 Reliability Threats
► Transient Faults
► Hard Faults (manufacturing defects and device wear-out)
► Manufacturing Defects That Escape Testing (inefficient burn-in testing)
► Thermal Runaway (Increased Heating, Higher Transistor Leakage, Higher Power Dissipation)
► Parametric Variability (uncertainty in device and environment)
[Figure: Intra-die variations in ILD thickness]
[Todd Austin, GSRC Sep 08]

4 Goal of this Research
Reliability is developing into a first-class design constraint
Design a computing substrate
► Provides scalable fault tolerance
► Highly reconfigurable
► Marginal overheads
Enable CMP designs capable of facing 100s of faults while maintaining useful throughput

5 Reconfiguration Granularity
[Figure: pipeline (FETCH, DEC, EXEC, MEM, WB) with three reconfiguration granularities: CORE level, STAGE level, MODULE level. Axis labels: Lower complexity; Better resource utilization]
Prior work: ElastIC, DT'06; Reunion, MICRO'06; Configurable Isolation, ISCA'07; Online Diagnosis of Hard Faults, MICRO'05; Ultra Low-Cost Defect Protection, ASPLOS'06
For 100% area overhead (redundancy), in slide order:
-- Poor MTTF gains
+ Easy to implement
+ Good MTTF gains
+ Circuit / Architectural boundary
+ Full coverage
+ Best MTTF gains
-- Complex implementation
MTTF improvements shown: 100% MTTF ↑, 170% MTTF ↑, 200% MTTF ↑

6 [Figure: a CMP with Core 0, Core 1, Core 2 and Core 3, shown as a CMP Fabric of four pipelines, each with Stage1, Stage2, Stage3, ..., StageN; stages are separated by latches]

7 StageNet (SN) Fabric
[Figure: four pipelines (Stage1, Stage2, Stage3, ..., StageN) connected through Crossbar Switches. Labels: StageNet Slice (SNS); Configuration Manager; Wearout Sensors (Delay, Temperature, Current)]

8 SN – Benefits
[Figure: four pipelines (Stage1, Stage2, Stage3, ..., StageN) with the Configuration Manager]

9 Outline
► SN Slice (SNS) architecture
► SNS performance results
► SN architecture
► Lifetime Reliability Evaluation

10 StageNet Slice (SNS) – Decoupled uArch
[Figure: 5 stage pipeline (Gen PC, Branch Predictor, Fetch, Decode, Issue, Register File, Ex/Mem, WB) with latches between stages and feedback paths for register wb, branch resolution and bypass]
[Figure: SNS (Gen PC, Branch Predictor, Fetch, Decode, Issue, Register File, Scoreboard, Ex/Mem) with double buffers between stages]

11 SNS Performance Hit
[Figure: commit timeline for instructions 1 to 10, including a branch (BR) and a register dependency, on the 5 stage pipeline and on the SNS pipeline]
Sources of slowdown:
1. Control stall
2. Data forwarding
3. Transmission delays
Result: > 5X slowdown

12 1. Control Handling using Stream ID
Stream-ID: 1 bit to represent the execution path
► Toggled upon a branch misprediction
► Wrong-path instructions are squashed
[Figure: SNS pipeline (Gen PC, Branch Predictor, Fetch, Decode, Issue, Register File, Scoreboard, Ex/Mem) with double buffers between stages and SID registers showing 0, 0, 0]

13 Stream-ID Example
[Figure: SNS pipeline with SID registers; instructions tagged with SID 0 and SID 1 flow past a branch (BR) and are marked squashed or committed]
Callouts, in slide order: Branch mispredict; Toggle Stream-ID; Squash the wrong ones; Continue on the right path; Toggle Stream-ID
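
The Stream-ID idea on slides 12 and 13 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' hardware design: the names `Instr` and `ExecStage` are hypothetical, a single execute stage is modeled, and fetch is assumed to tag instructions with the new stream ID after it is redirected.

```python
from dataclasses import dataclass, field

@dataclass
class Instr:
    name: str
    sid: int                    # 1-bit stream ID attached when the instruction is fetched
    mispredicted: bool = False  # True for a branch that resolves as mispredicted

@dataclass
class ExecStage:
    sid: int = 0                # stream ID register held by this stage
    committed: list = field(default_factory=list)
    squashed: list = field(default_factory=list)

    def process(self, instr):
        """Handle one instruction; return True if fetch must toggle its stream ID."""
        if instr.sid != self.sid:
            # The tag does not match the current path: wrong-path instruction.
            self.squashed.append(instr.name)
            return False
        self.committed.append(instr.name)
        if instr.mispredicted:
            # Toggle the 1-bit stream ID; older-tagged instructions in flight now mismatch.
            self.sid ^= 1
            return True
        return False

def demo():
    ex = ExecStage()
    stream = [
        Instr("i1", 0),
        Instr("br", 0, mispredicted=True),
        Instr("w1", 0),   # wrong-path instructions already in flight
        Instr("w2", 0),
        Instr("c1", 1),   # fetched after the redirect, tagged with the new stream ID
        Instr("c2", 1),
    ]
    for ins in stream:
        ex.process(ins)
    return ex.committed, ex.squashed
```

In this sketch no explicit flush signal has to reach the wrong-path instructions; they are dropped when their tag fails to match the stage's stream ID.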

14 SNS with Stream-ID
[Figure: SNS pipeline with double buffers between stages, Scoreboard and SID registers; commit timeline for instructions 1 to 10, including a branch (BR) and a register dependency, on the 5 stage pipeline and on the SNS pipeline]
Timeline labels: 1. Branch induced stall; 2. Data forwarding; 3. Transmission delays

15 SNS - Challenges and Solutions [CASES 08]
1. Control Handling: Stream-ID takes care of this
2. Data Forwarding: Bypass$ emulates data forwarding
- Store previous results
- Pass them on to new instructions
3. Transmission Delay: Macro-ops are used to amortize delay
- Bundles of instructions
- Increases system utilization
Reduce Feedback Links; Conserve Bandwidth; Decentralized Control

16 Simulation Infrastructure
[Figure: tool flow with Benchmarks, Trimaran Compiler, Rebel, Trimaran Assembler, HPL-PD Assembly, HPL-PD Emulator (FUNCTIONAL) and SN Architecture (TIMING); the emulator and timing model sit in the Liberty Simulation Framework (Liberty Simulation Environment)]
Branch predictor: Global, 16-bit, gshare predictor
Level 1 I/D cache: 4-way, 16KB, 1 cycle latency
Level 2 unified cache: 8-way, 64KB, 5 cycle latency

17 Final SNS Performance
[Figure: Normalized Runtime (axis 0 to 6) for 3des, g721decode, g721encode, idct, rawcaudio, rawdaudio, rijndael, mcf, eqn, grep, wc and Mean. Series: SNS + StreamID; SNS + StreamID + Bypass$; SNS + StreamID + Bypass$ + MOPs]

18 SNS – Design Summary
1. StreamID – SID registers
2. Bypass$ – Bypass$, Scoreboard
3. Macro-ops – Packer, Buffer sizes
~12% area overhead, ~10% perf. overhead
[Figure: SNS pipeline (Gen PC, Branch Predictor, Fetch, Decode, Issue, Register File, Scoreboard, Ex/Mem) with double buffers between stages, SID registers, Bypass $ and Packer]

19 SN – Architecture
5 SNSs combined to form SN
SN architecture is resilient
► Broken stages can be isolated
► Crossbar switches are redundant
► Interconnection wires are relatively reliable
Configuration manager acts upon failures
► Stage borrowing / lending
► Stage sharing

20 SN – Stage Borrowing
► Pipelines borrow / lend stages to form SNSs
► Exclusive use of stages by SNSs
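
A small sketch of why borrowing helps, under simplifying assumptions that are not taken from the slides: every working stage can reach every other stage through the crossbar, the interconnect itself never fails, and each stage is used by exactly one SNS. The function names and the `alive` matrix are hypothetical.

```python
def working_cores_traditional(alive):
    """alive[c][s] is True if stage s of core c still works.
    A traditional core is usable only if all of its own stages work."""
    return sum(1 for core in alive if all(core))

def working_slices_borrowing(alive):
    """With exclusive borrowing, the number of slices that can be formed is
    limited by the stage type with the fewest surviving instances."""
    if not alive:
        return 0
    n_stages = len(alive[0])
    return min(sum(1 for core in alive if core[s]) for s in range(n_stages))

# Four 5-stage pipelines with three faults, each in a different pipeline and stage.
alive = [
    [False, True, True, True, True],
    [True, False, True, True, True],
    [True, True, False, True, True],
    [True, True, True, True, True],
]
traditional = working_cores_traditional(alive)
borrowed = working_slices_borrowing(alive)
```

With these three faults the traditional count drops to one working core, while the borrowing count stays at three slices, because each stage type still has three working instances.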

21 SN – Stage Sharing
► Allow SNSs to share stages
► Degree of sharing is tunable (2-way, 3-way, ...)

22 Lifetime Reliability Experiments
Monte Carlo simulation of ~300 lifetime experiments, where each experiment involves:
► Assigning a TTF (time to failure) to all the components
► Killing components at their failure times
► Reconfiguring system to isolate broken components
► Computing instantaneous throughput
Evaluation for three designs
► Traditional CMP
► SN + borrowing
► SN + borrowing + sharing
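
The experiment loop described on this slide can be sketched as follows. Everything numeric here is an assumption made for illustration: the Weibull TTF distribution and its parameters, the 4-core by 5-stage system, and throughput measured as the count of working pipelines. Only the traditional CMP and SN + borrowing are modeled; sharing is left out.

```python
import random

def throughput(alive, borrowing):
    """Number of working pipelines for a given alive[c][s] matrix."""
    if borrowing:
        n_stages = len(alive[0])
        return min(sum(1 for core in alive if core[s]) for s in range(n_stages))
    return sum(1 for core in alive if all(core))

def lifetime_work(ttf, borrowing):
    """ttf[c][s] is the failure time of stage s in core c.
    Returns cumulative work: throughput integrated over the system lifetime."""
    events = sorted({t for core in ttf for t in core})
    work, prev = 0.0, 0.0
    for t in events:
        # During [prev, t) exactly the components with ttf >= t are still alive.
        alive = [[x >= t for x in core] for core in ttf]
        work += throughput(alive, borrowing) * (t - prev)
        prev = t
    return work

def monte_carlo(trials=300, cores=4, stages=5, seed=0):
    """Average cumulative work over many random lifetimes (hypothetical parameters)."""
    rng = random.Random(seed)
    totals = {"cmp": 0.0, "sn_borrowing": 0.0}
    for _ in range(trials):
        # Assumed TTF model: Weibull with scale 10.0 and shape 2.0, arbitrary time units.
        ttf = [[rng.weibullvariate(10.0, 2.0) for _ in range(stages)]
               for _ in range(cores)]
        totals["cmp"] += lifetime_work(ttf, borrowing=False)
        totals["sn_borrowing"] += lifetime_work(ttf, borrowing=True)
    return {name: total / trials for name, total in totals.items()}
```

Calling `monte_carlo()` returns the mean cumulative work of both designs for the same sampled failure times, so the two numbers can be compared directly. The values depend entirely on the assumed distribution and are not the results reported in the talk.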

23 SN – Throughput
[Figure: throughput comparison; annotation: 4X]

24 SN – Cumulative Work
[Figure: cumulative work comparison; annotation: 50%]

25 SN Many-core Vision
SN, as presented, cannot scale to many cores.
How to deploy SN in a 64 core system?
► Create SN blocks: an optimal number of cores tied together
► Deploy a sparse network between blocks
[Figure: Traditional many-core versus SN many-core built from SN blocks]

26 Conclusions
Architectural innovations will be crucial in tackling the high failure rates.
SN is a potential solution
► 50% more cumulative work
► Low overheads (10% performance, 12% area)
SNS, a decoupled pipeline microarchitecture, forms its basis
► Stream-ID
► Bypass$ (not presented)
► Macro-ops (not presented)
Ongoing work
► SNS design for aggressive cores
► Optimal SN configuration for many-core systems

27 Thank You
http://cccp.eecs.umich.edu

28 Backup

29 SN – Defect Tolerance
[Figure: comparison of a Traditional CMP and a StageNet CMP against # Faults; surviving labels: 0, 5, 5, 35, 2, 4, 1]

30 Scoreboard
Scoreboard entries: REG ID, Valid
► Scoreboard to handle RAW dependencies
► Stalls generate backpressure
[Figure: SNS pipeline (Gen PC, Branch Predictor, Fetch, Decode, Issue, Register File, Ex/Mem) with double buffers between stages and the Scoreboard at Issue]
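
A minimal sketch of a valid-bit scoreboard in the spirit of this slide. The class and method names are hypothetical, and the sketch only tracks read-after-write (RAW) hazards: a register is marked invalid when an instruction that writes it is issued and valid again when the result is written back.

```python
class Scoreboard:
    def __init__(self, num_regs):
        # One valid bit per register ID; True means the value is up to date.
        self.valid = [True] * num_regs

    def can_issue(self, srcs):
        """An instruction may issue only if all of its source registers are valid."""
        return all(self.valid[r] for r in srcs)

    def issue(self, dest):
        """Mark the destination as pending until its result is written back."""
        self.valid[dest] = False

    def writeback(self, dest):
        """The result has arrived; dependent instructions may now issue."""
        self.valid[dest] = True

sb = Scoreboard(num_regs=8)
sb.issue(dest=1)                       # r1 is being produced
stalled = not sb.can_issue([1, 2])     # a consumer of r1 must wait
sb.writeback(dest=1)
ready = sb.can_issue([1, 2])           # the consumer can now issue
```

While `can_issue` returns False the issue stage holds the instruction, and in a decoupled pipeline that stall fills the upstream buffers, which is the backpressure the slide refers to.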

31 Area overhead breakdown
[Figure: Router area for 32 and 64 bit configurations]

32 Architectural Details

33 Stage modifications for SNS

34 2. Bypass$ for data forwarding
Bypass Cache (entries: REG ID, VALUE)
- Fully associative structure
- FIFO replacement policy
Key benefits
- Reduced stalls
- Lower bandwidth consumption
[Figure: SNS pipeline with double buffers between stages, Scoreboard, SID registers and the Bypass $]
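
The two stated properties of the Bypass$ (fully associative, FIFO replacement) can be sketched as below. The class name, the capacity and the choice to update an existing entry in place without changing its FIFO position are assumptions for illustration, not details from the slides.

```python
from collections import OrderedDict

class BypassCache:
    """Small fully associative cache of recent results, keyed by register ID."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.entries = OrderedDict()   # insertion order doubles as FIFO order

    def write(self, reg, value):
        if reg in self.entries:
            # Assumed policy: refresh the value, keep the original FIFO position.
            self.entries[reg] = value
            return
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)   # evict the oldest entry
        self.entries[reg] = value

    def read(self, reg):
        """Return the cached value, or None on a miss (fall back to the register file)."""
        return self.entries.get(reg)

bc = BypassCache(capacity=2)
bc.write(1, 10)
bc.write(2, 20)
bc.write(3, 30)   # evicts register 1, the oldest entry
```

A hit lets a dependent instruction pick up a recent result locally instead of waiting for a transfer from the register file, which is how the cache stands in for data forwarding in this sketch.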

35 SNS with Stream-ID, Bypass$
[Figure: SNS pipeline with double buffers between stages, Scoreboard, SID registers and Bypass $; commit timeline for instructions 1 to 10, including a branch (BR) and a register dependency, on the 5 stage pipeline and on the SNS pipeline]
Timeline labels: 2. Data forwarding; 3. Transmission delays

36 3. Transmission delay
Multiple cycles for instruction transfer → Low utilization
[Figure: SNS pipeline with double buffers between stages, Scoreboard, SID registers and Bypass $]

37 Hide delay with Macro-ops
Need to improve utilization
► Balance transfer and compute time
Send instruction bundles
► Macro-ops (MOP)
► Greedy selection policy
Advantages
► Removes temporary intermediates
► Parallelizes transfer and compute
MOP constraints: Max length 4, Max live-ins 2
[Figure: dataflow graph of operations (>>, ST, LD, +, /, &, <<) grouped into macro-ops]
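
A greedy packer that respects the two limits on this slide (max length 4, max live-ins 2) might look like the sketch below. It is a simplification under stated assumptions: instructions are taken in program order as hypothetical `(dest, srcs)` pairs, a live-in is any source not produced earlier in the same bundle, and a bundle is closed as soon as the next instruction would break a limit. The selection policy used by the authors may differ.

```python
def pack_macro_ops(instrs, max_len=4, max_live_ins=2):
    """Greedily group (dest, srcs) instructions into macro-op bundles."""
    bundles, current = [], []
    defined, live_ins = set(), set()
    for dest, srcs in instrs:
        # Live-ins if this instruction joined the current bundle.
        new_live = live_ins | {s for s in srcs if s not in defined}
        if current and (len(current) >= max_len or len(new_live) > max_live_ins):
            # Close the current bundle and start a new one with this instruction.
            bundles.append(current)
            current, defined = [], set()
            new_live = set(srcs)
        current.append((dest, srcs))
        live_ins = new_live
        if dest is not None:           # stores have no destination register
            defined.add(dest)
    if current:
        bundles.append(current)
    return bundles

program = [
    ("r1", ["a", "b"]),
    ("r2", ["r1"]),
    ("r3", ["r2", "c"]),   # would raise live-ins to {a, b, c}, so a new bundle starts
    ("r4", ["r3"]),
    ("r5", ["r4"]),
]
bundle_sizes = [len(b) for b in pack_macro_ops(program)]
```

Values such as `r1` and `r4` that are produced and consumed inside one bundle never need to cross the interconnect, which matches the slide's point about removing temporary intermediates.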

38 SNS with Stream-ID, Bypass$, MOP
[Figure: SNS pipeline with double buffers between stages, Scoreboard, SID registers, Bypass $ and Packer; commit timeline for instructions 1 to 10, including a branch (BR) and a register dependency, on the 5 stage pipeline and on the SNS pipeline]
Timeline label: 3. Transmission delays

39 Tolerating Permanent Faults
Traditional solutions
► TMR
► Tandem / HP Non-stop
► IBM zSeries
...are impractical
► Cost
► Power
► Low gain
Current approach
1. Detection
2. Diagnosis
► Using sensors
► Redundant Computation
► BIST
3. Repair
► Replacement
► Reconfiguration
[Figure label: K-pos DP-31/32]

