1 Billion Transistor Architectures Interconnect design for low power – Naveen & Karthik Computational unit design for low temperature – Karthik Increased.


1 Billion Transistor Architectures
- Interconnect design for low power – Naveen & Karthik
- Computational unit design for low temperature – Karthik
- Increased reliability and power-efficiency – Niti
- Hardware for raytracing and OS co-processing – led by Pete and Erik/Dave
- Interconnect design for high performance

2 Partitioned Architectures
[Figure: partitioned architecture; labels: Instr Fetch, L1 D Cache]

3 Interconnect Design
- Delay Optimized
- Bandwidth Optimized
- Power Optimized
- Power and B/W Optimized

4 Tuning Wire Properties
- Wire delay $\propto \sqrt{RC}$ (Ho, Mai, Horowitz, Proc. of IEEE, 2001)
- $R_{wire} = \dfrac{\rho}{(\text{thickness} - \text{barrier})\,(\text{width} - 2\,\text{barrier})}$
- $C_{wire} = 2K\,\epsilon_{horiz}\,\dfrac{\text{thickness}}{\text{spacing}} + 2\,\epsilon_{vert}\,\dfrac{\text{width}}{\text{layerspacing}} + \text{fringe}(\epsilon_{horiz}, \epsilon_{vert})$
- Wide wires → reduced resistance, slightly higher capacitance
- Wide spacing → reduced capacitance
- Example (Banerjee et al., IEEE Trans. on Electron Devices, Feb 2004): a factor of 8 increase in width and spacing → $R_L = 0.125\,R_B$, $C_L = 0.74\,C_B$, $\text{Delay}_L = 0.43\,\text{Delay}_B$
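
As a rough illustration of the formulas on this slide, the Python sketch below evaluates the resistance and capacitance expressions and a first-order sqrt(RC) delay ratio. The function names, argument names, and unit-less sample dimensions are hypothetical; only the formulas and the 0.125 and 0.74 ratios come from the slide. The first-order estimate need not match the 0.43 delay ratio quoted from Banerjee et al., which comes from their own study.

```python
import math

def r_wire(rho, thickness, width, barrier):
    """Resistance per unit length: rho / ((thickness - barrier) * (width - 2*barrier))."""
    return rho / ((thickness - barrier) * (width - 2 * barrier))

def c_wire(k, eps_horiz, eps_vert, thickness, spacing, width, layer_spacing, fringe=0.0):
    """Capacitance per unit length: sidewall term + vertical term + fringe term.

    The fringe term is passed in as a plain number (an assumption of this sketch).
    """
    sidewall = 2 * k * eps_horiz * thickness / spacing
    vertical = 2 * eps_vert * width / layer_spacing
    return sidewall + vertical + fringe

def relative_delay(r_ratio, c_ratio):
    """First-order delay ratio, assuming delay is proportional to sqrt(R * C)."""
    return math.sqrt(r_ratio * c_ratio)

# Hypothetical, unit-less dimensions with the barrier ignored (barrier = 0).
base_r = r_wire(rho=1.0, thickness=1.0, width=1.0, barrier=0.0)
wide_r = r_wire(rho=1.0, thickness=1.0, width=8.0, barrier=0.0)
r_ratio = wide_r / base_r

# Widening the spacing shrinks the sidewall capacitance term.
base_c = c_wire(1.0, 1.0, 1.0, thickness=1.0, spacing=1.0, width=1.0, layer_spacing=1.0)
spaced_c = c_wire(1.0, 1.0, 1.0, thickness=1.0, spacing=8.0, width=1.0, layer_spacing=1.0)

# First-order estimate using the R and C ratios quoted on the slide.
estimate = relative_delay(0.125, 0.74)
print(f"R ratio: {r_ratio:.3f}, first-order delay ratio: {estimate:.3f}")
```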

5 Transmission Lines
- Test chips have demonstrated the potential of transmission lines: three-quarters the latency of an equally wide RC wire (at 0.18 µm)
- High associated costs: transmitter/receiver circuits; large width, thickness, and vertical and horizontal spacing; power and ground reference planes; and shielding lines

6 Latency-Bandwidth Trade-Off
- Bottom line: low-latency wires are possible, but the area and associated costs are high
- High area cost → few wires can be accommodated → useful only for low-bandwidth communication
- Problem: microarchitectural applications of 3 sets of wires
  - B-Wires: high-bandwidth, high-latency, 64-wide
  - L-Wires: low-bandwidth, low-latency, 8-wide
  - PW-Wires: high-bandwidth, high-latency, low-power, 128-wide
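
To make the bandwidth side of the trade-off concrete, here is a minimal Python sketch that counts how many transfers a payload needs on each wire set. The `WireSet` type and `transfers_needed` helper are hypothetical names, and the sketch assumes that a payload wider than the wire set is simply serialised over several transfers, which the slide does not state.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class WireSet:
    name: str
    width_bits: int  # number of wires in the set, as listed on the slide

B_WIRES = WireSet("B", 64)
L_WIRES = WireSet("L", 8)
PW_WIRES = WireSet("PW", 128)

def transfers_needed(payload_bits, wires):
    """Transfers required if the payload is split into width-sized chunks (assumption)."""
    return math.ceil(payload_bits / wires.width_bits)

for wires in (B_WIRES, L_WIRES, PW_WIRES):
    print(wires.name, transfers_needed(64, wires))
```

Under this assumption a 64-bit value fits in one transfer on B-Wires or PW-Wires but needs eight on L-Wires, which is why the narrow set suits only small payloads.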

7 Interconnect Design
- Delay Optimized
- Bandwidth Optimized
- Power Optimized
- Power and B/W Optimized

8 Hybrid Interconnects
- Each link on the network consists of a combination of B, L, and PW-Wires
[Figure labels: Instr Fetch, L1 D Cache]

9 L1 Cache Pipeline
[Figure: L1 D Cache and LSQ]
- Eff. Address Transfer: 10c
- Mem. Dep Resolution: 5c
- Cache Access: 5c
- Data return at 20c

10 Exploiting L-Wires
[Figure: L1 D Cache and LSQ]
- Eff. Address Transfer: 10c
- Partial Mem. Dep Resolution: 3c
- Cache Access: 5c
- 8-bit Transfer: 5c
- Data return at 14c
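
The cycle counts on slides 9 and 10 can be compared with a few lines of Python. This sketch only sums the baseline stages and takes the 14-cycle L-Wire figure as given from the slide; it does not model how the stages overlap, and the variable names are hypothetical.

```python
# Baseline L1 pipeline stages and their latencies in cycles (slide 9).
BASELINE_STAGES = {
    "effective address transfer": 10,
    "memory dependence resolution": 5,
    "cache access": 5,
}

# Serial baseline: each stage waits for the previous one.
baseline_latency = sum(BASELINE_STAGES.values())

# Data-return latency reported on slide 10 when L-Wires carry 8 bits early.
L_WIRE_LATENCY = 14

saved_cycles = baseline_latency - L_WIRE_LATENCY
reduction = saved_cycles / baseline_latency

print(f"baseline {baseline_latency}c, with L-Wires {L_WIRE_LATENCY}c, "
      f"saved {saved_cycles}c ({reduction:.0%})")
```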

11 Exploiting Choice
- Narrow bit-width operands (integers < 256) and narrow control signals (branch mispredicts) can also use L-Wires
- High-bandwidth power-efficient PW-Wires can transmit non-critical or bursty traffic
- L-Wires can improve performance by 10%
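
One way to picture the steering choices on this slide is a small selection function. The Python sketch below is a hypothetical policy, not the authors' hardware logic: the function name, its parameters, and the priority order between the rules are all assumptions made for illustration.

```python
L_WIRE_WIDTH_BITS = 8  # L-Wires are 8-wide, so values below 2**8 = 256 fit

def choose_wires(value=None, is_control_signal=False, is_critical=True):
    """Pick a wire class ("L", "PW", or "B") for one transfer.

    Assumed priority: narrow control signals and narrow operands go to
    L-Wires, non-critical traffic goes to PW-Wires, everything else
    uses B-Wires.
    """
    if is_control_signal:
        return "L"   # e.g. a branch mispredict signal
    if value is not None and 0 <= value < (1 << L_WIRE_WIDTH_BITS):
        return "L"   # narrow bit-width operand (integer < 256)
    if not is_critical:
        return "PW"  # non-critical or bursty traffic
    return "B"

print(choose_wires(value=200))                       # narrow operand
print(choose_wires(value=100000, is_critical=False)) # wide, non-critical
print(choose_wires(value=100000))                    # wide, critical
```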

12 Results Summary

Configuration       Metal Area  IPC   Relative Dyn Energy  Relative Leakage Energy  Relative Energy-Delay  Comments
64 B                1.0         0.91  100                                                                  Hi-Perf
128 PW              1.0         0.83  39                   81                       89
64 PW, 8 L          1.5         0.90  36                   47                       79                     Low EDP
128 B               2.0         0.96  99                   190                      102
64 B, 128 PW        2.0         0.93  84                   169                      98
128 PW, 8 L         2.0         0.93  36                   81                       79                     Low EDP
64 B, 8 L           2.0         1.00  89                   99                       88                     Hi-Perf
192 B               3.0         0.99  98                   275                      105
128 B, 8 L          3.0         1.02  88                   187                      93                     Hi-Perf
64 B, 128 PW, 8 L   3.0         0.98  61                   170                      88                     Low EDP
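
The Comments column can be cross-checked against the numbers with a short Python sketch. It groups the rows by metal area and picks the highest IPC and the lowest relative energy-delay in each group; the tuple layout and the `best_per_area` helper are hypothetical, and the baseline row's energy-delay is left as `None` because it is not clearly recoverable from this transcript.

```python
# (configuration, metal area, IPC, relative energy-delay) from the results table.
ROWS = [
    ("64 B", 1.0, 0.91, None),  # energy-delay not recoverable from the transcript
    ("128 PW", 1.0, 0.83, 89),
    ("64 PW, 8 L", 1.5, 0.90, 79),
    ("128 B", 2.0, 0.96, 102),
    ("64 B, 128 PW", 2.0, 0.93, 98),
    ("128 PW, 8 L", 2.0, 0.93, 79),
    ("64 B, 8 L", 2.0, 1.00, 88),
    ("192 B", 3.0, 0.99, 105),
    ("128 B, 8 L", 3.0, 1.02, 93),
    ("64 B, 128 PW, 8 L", 3.0, 0.98, 88),
]

def best_per_area(rows):
    """For each metal area, return the highest-IPC and lowest-energy-delay configs."""
    best = {}
    for config, area, ipc, ed in rows:
        entry = best.setdefault(area, {"hi_perf": (config, ipc), "low_ed": None})
        if ipc > entry["hi_perf"][1]:
            entry["hi_perf"] = (config, ipc)
        if ed is not None and (entry["low_ed"] is None or ed < entry["low_ed"][1]):
            entry["low_ed"] = (config, ed)
    return best

summary = best_per_area(ROWS)
for area in sorted(summary):
    print(area, summary[area])
```

For metal areas 2.0 and 3.0 the picks agree with the Hi-Perf and Low EDP labels in the table.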


