Published by Leonardo Turkel. Modified over 2 years ago.

1
Feb. 17, 2011
Midterm overview
Real life examples of built chips
– Clock Skew
Arithmetic
Data Centers
Power reduction techniques
– Dynamic Voltage / Frequency Scaling
– Clock Throttling
– Power Gating
– Others?
Project
– 4b adder with Razor recovery

2

3
Go Over Problems 1c 2a; 2b 3c

4
Crossbar Design

5

6
Mirror Adder Stick Diagram

7
The Mirror Adder
The NMOS and PMOS chains are completely symmetrical. A maximum of two series transistors can be observed in the carry-generation circuitry.
When laying out the cell, the most critical issue is the minimization of the capacitance at node C_o. The reduction of the diffusion capacitances is particularly important.
The capacitance at node C_o is composed of four diffusion capacitances, two internal gate capacitances, and six gate capacitances in the connecting adder cell.
The transistors connected to C_i are placed closest to the output.
Only the transistors in the carry stage have to be optimized for optimal speed. All transistors in the sum stage can be minimal size.
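
As a quick check of the logic behind this cell, the sketch below evaluates the carry and sum in the form usually associated with the mirror adder, C_o = A·B + C_i·(A + B) and S = A·B·C_i + NOT(C_o)·(A + B + C_i), against plain integer addition. These equations are not written on the slide; they are the standard textbook form, and the function names are illustrative only.

```python
from itertools import product


def mirror_adder(a: int, b: int, ci: int) -> tuple[int, int]:
    """Return (sum, carry_out) for one full-adder bit.

    The sum reuses the inverted carry, which is why only the carry
    stage sits on the critical path of a ripple adder.
    """
    co = (a & b) | (ci & (a | b))
    s = (a & b & ci) | ((1 - co) & (a | b | ci))
    return s, co


def check_all_inputs() -> bool:
    """Compare the gate-level equations with a + b + ci for all 8 cases."""
    for a, b, ci in product((0, 1), repeat=3):
        s, co = mirror_adder(a, b, ci)
        if 2 * co + s != a + b + ci:
            return False
    return True


print(check_all_inputs())  # True
```

Because S is built from NOT(C_o), the carry output drives gates in its own sum stage as well as the next cell, which is consistent with the slide's emphasis on minimizing the capacitance at node C_o.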

8
Transmission Gate Full Adder

9
Manchester Carry Chain

10
Manchester Carry Chain

11
Carry-Bypass Adder
Also called Carry-Skip

12
Carry-Bypass Adder (cont.)

13
Carry Ripple versus Carry Bypass

14
Carry-Select Adder

15
Carry-Select Adder: Critical Path

16
Linear Carry Select

17
Square Root Carry Select

18
Adder Delays - Comparison

19
Look-Ahead - Basic Idea

20
Look-Ahead: Topology
Expanding Lookahead equations:
All the way:
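
The equations themselves are not present in this transcript. A minimal sketch, assuming the usual generate/propagate definitions G_k = A_k·B_k and P_k = A_k XOR B_k and the recurrence C_o,k = G_k + P_k·C_o,k-1, is given below: expanding that recurrence all the way yields C_o,k = G_k + P_k·G_k-1 + ... + P_k···P_1·G_0 + P_k···P_0·C_in. The function names are hypothetical; the code checks the expanded form against a ripple carry.

```python
from itertools import product


def lookahead_carries(a_bits, b_bits, c_in):
    """Carry out of each bit (LSB first) from the fully expanded equations."""
    g = [a & b for a, b in zip(a_bits, b_bits)]  # generate
    p = [a ^ b for a, b in zip(a_bits, b_bits)]  # propagate
    carries = []
    for k in range(len(g)):
        c = g[k]
        prefix = p[k]  # running product P_k * P_(k-1) * ...
        for j in range(k - 1, -1, -1):
            c |= prefix & g[j]
            prefix &= p[j]
        c |= prefix & c_in  # term P_k ... P_0 * C_in
        carries.append(c)
    return carries


def ripple_carries(a_bits, b_bits, c_in):
    """Reference: carries computed one bit at a time."""
    carries = []
    c = c_in
    for a, b in zip(a_bits, b_bits):
        c = (a & b) | (c & (a ^ b))
        carries.append(c)
    return carries


def forms_agree(width=4):
    """Exhaustively compare both forms for all operands of the given width."""
    for bits in product((0, 1), repeat=2 * width + 1):
        a, b, c_in = bits[:width], bits[width:2 * width], bits[-1]
        if lookahead_carries(a, b, c_in) != ripple_carries(a, b, c_in):
            return False
    return True


print(forms_agree())  # True
```

Each expanded carry depends only on the inputs, not on the previous carry, at the cost of terms that grow with the bit position.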

21
Carry Lookahead Trees
Can continue building the tree hierarchically.

22

23
Power Reduction Techniques
Stop the clock
– Dynamic power reduction
Power gating
– Reduce the leakage
How fast can you turn something on/off?
– Nothing to do: sleep
How can you save power while in operation?
– Near-threshold design
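
The techniques listed above act on different terms of the power budget. A rough sketch, assuming the first-order models P_dyn = alpha·C·V_DD²·f and P_leak = V_DD·I_leak, is shown below. Every numeric value (activity factor, capacitance, leakage current, the 5% residual leakage under power gating) is an illustrative assumption, not a figure from the lecture.

```python
def dynamic_power(alpha, c_farads, vdd, f_hz):
    """Switching power: activity factor * capacitance * Vdd^2 * frequency."""
    return alpha * c_farads * vdd ** 2 * f_hz


def leakage_power(vdd, i_leak_amps):
    """Static power drawn even when nothing switches."""
    return vdd * i_leak_amps


# Assumed block parameters, chosen only to make the comparison concrete.
ALPHA, C, VDD, F, I_LEAK = 0.1, 1e-9, 1.0, 1e9, 10e-3


def total_power(mode):
    if mode == "active":
        return dynamic_power(ALPHA, C, VDD, F) + leakage_power(VDD, I_LEAK)
    if mode == "clock_stopped":
        # Stopping the clock removes switching power; leakage remains.
        return leakage_power(VDD, I_LEAK)
    if mode == "power_gated":
        # Assumption: the sleep transistor cuts leakage to 5% of nominal.
        return leakage_power(VDD, I_LEAK * 0.05)
    if mode == "dvfs":
        # Assumption: 0.7 V at half frequency; leakage current held
        # constant for simplicity (in practice it also drops with Vdd).
        return dynamic_power(ALPHA, C, 0.7, F / 2) + leakage_power(0.7, I_LEAK)
    raise ValueError(f"unknown mode: {mode}")


for mode in ("active", "dvfs", "clock_stopped", "power_gated"):
    print(f"{mode:14s} {total_power(mode) * 1e3:8.2f} mW")
```

The ordering of the results, not the absolute numbers, is the point: clock stopping and power gating only help while the block is idle, whereas voltage and frequency scaling reduce power during operation.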

24
Power Gating

25

26

27
Kevin Nowka, IBM

28

29

30

31

32

33
Gate Leakage

34

35
Digital Parallelization
Y[n] = X[n] + X[n-1]
Clk = 5 GHz
[Figure: analog signal input feeding a clocked digital stage. Labels: Input (5 GS/s), (Or 100 MHz), clk, X[n], X[n-1], Y[n], ANALOG, DIGITAL]

36
DSP Parallelization
Y[n] = X[n] + X[n-1]
Y[n-1] = X[n-1] + X[n-2]
[Figure: two-way parallel datapath. Labels: Input (5 GS/s), clk, clkb, X[n], X[n-1], X[n-2], Y[n], Y[n-1], CLK = 5 GHz, CLK = 2.5 GHz]

37
DSP Parallelization
Clock speed reduced by ½
– Can parallelize further
– Increase number of MACs (multiply/accumulates) by 2
Intuition?
– Area goes up by 2
– Power decreases (clock rate down by 2, computations up by 2, but easier timing constraints)
– What about clock power?
Save a little power, but double the area?
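
The two-way parallelization of Y[n] = X[n] + X[n-1] can be sketched in software as follows. This is only a behavioral illustration: each loop iteration of the parallel version stands for one tick of the half-rate clock, during which two adders each produce one output. The function names and the zero initial condition X[-1] = 0 are assumptions.

```python
def serial_filter(x):
    """Y[n] = X[n] + X[n-1], one output per full-rate clock tick."""
    y = []
    prev = 0  # assumed initial condition X[-1] = 0
    for sample in x:
        y.append(sample + prev)
        prev = sample
    return y


def two_way_parallel_filter(x):
    """Same filter with two adders, each running at half the clock rate."""
    if len(x) % 2:
        raise ValueError("expects an even number of samples")
    y = [0] * len(x)
    prev = 0  # assumed initial condition X[-1] = 0
    for n in range(0, len(x), 2):  # one iteration = one half-rate tick
        even, odd = x[n], x[n + 1]
        y[n] = even + prev      # adder 1: older output of the pair
        y[n + 1] = odd + even   # adder 2: newer output of the pair
        prev = odd
    return y


samples = [3, 1, 4, 1, 5, 9]
print(serial_filter(samples))            # [3, 4, 5, 5, 6, 14]
print(two_way_parallel_filter(samples))  # [3, 4, 5, 5, 6, 14]
```

The outputs match, but the parallel version needs twice the adder hardware, which is the area-for-clock-rate trade the slide asks about.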

38
Razor: A Low-Power Pipeline Based on Circuit-Level Timing Speculation MICRO36-Razor.pdf

39

40

41

42

43

44

45
Project Description
Minimal: 4b Adder, Implemented with Razor
– Simulations into near-threshold domain
Grad. Student: requires more advanced design
– Analog: Opamps built using inverters
– Digital: Adiabatic Near-Threshold
– Power Gating: add power gating to your design
Undergrad: extra credit if you do any of the above
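
For orientation, a behavioral sketch of a 4-bit adder followed by a Razor-style register is given below. It is not the project solution and not a circuit model. It assumes that timing is measured in carry-stage delays, that a late result leaves stale data in the main flip-flop, and that the shadow latch on the delayed clock always captures the correct value. The class and function names are hypothetical.

```python
def carry_chain_length(a, b):
    """Longest run of stages a carry ripples through for 4-bit operands."""
    longest = run = carry = 0
    for i in range(4):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        gen, prop = ai & bi, ai ^ bi
        if gen:
            run = 1            # a new carry is generated at this bit
        elif prop and carry:
            run += 1           # the incoming carry ripples one more stage
        else:
            run = 0            # the carry chain ends here
        carry = gen | (prop & carry)
        longest = max(longest, run)
    return longest


class RazorStage:
    """Main flip-flop plus shadow latch with mismatch detection."""

    def __init__(self, stages_per_cycle):
        # Fewer stages per cycle models a lower supply voltage at a fixed clock.
        self.stages_per_cycle = stages_per_cycle
        self.main_ff = 0

    def clock(self, a, b):
        correct = (a + b) & 0x1F  # 5-bit result including carry out
        on_time = carry_chain_length(a, b) <= self.stages_per_cycle
        sampled = correct if on_time else self.main_ff  # assumed: stale data
        shadow = correct          # assumed: delayed clock always meets timing
        error = sampled != shadow
        self.main_ff = shadow     # recovery: restore the value from the shadow latch
        return shadow, error


stage = RazorStage(stages_per_cycle=2)
print(stage.clock(0b0101, 0b0101))  # (10, False)
print(stage.clock(0b1111, 0b0001))  # (16, True)
```

In the second call the carry ripples through all four bits, exceeds the assumed two-stage budget, and the mismatch between the main flip-flop and the shadow latch raises the error flag. A real Razor pipeline would spend an extra cycle on that recovery.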

46
Problem 1: On-Chip Wires Consume Energy
On-chip wire power does not scale
– Dominated by interconnect capacitance (C·V_DD^2)
ON-CHIP (Status Quo): fJ/bit/mm
NOTE: Sub/Near-Threshold doesn't help this problem!
OUR GOAL: < 5 fJ/bit/mm [DOE, Exascale Workshop]
V_DD    E_b
1 V     150 fJ/mm

47
Data Center Design

48

49

50
