Low Power Digital Design

Presentation on theme: "Low Power Digital Design"— Presentation transcript:

1 Low Power Digital Design
Low-power finite-state machine (FSM) design: design model for partitioned FSMs based on mixed synchronous/asynchronous state memory. Comparative study of low-voltage performance of standard-cell flip-flops. Department of Information Technology, Electronics Group

2 Low Power Design
Design model for partitioned finite-state machines based on mixed synchronous/asynchronous state memory. Bengt Oelmann

3 Design Model for Partitioned FSMs
Objective of this work: provide a design model that enables low-power FSM design with low area overhead. Outline: background to low-power FSM design; basic principles of the proposed design model; outline of the design procedure; improvements compared to other work; future work.

4 Power consumption in static digital CMOS
Power consumption in static CMOS has three components: leakage, short-circuit, and dynamic.
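The dynamic component is commonly modeled as proportional to switching activity, switched capacitance, the square of the supply voltage, and clock frequency. The sketch below uses that textbook model; the function name and all numeric values are illustrative assumptions, not figures from the slides, and the constant factor depends on how activity is defined.

```python
def dynamic_power(activity, capacitance_f, vdd_v, freq_hz):
    """Textbook switching-power model: P = a * C * Vdd^2 * f (watts)."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz

# Illustrative values only (assumed, not taken from the presentation).
p_high_vdd = dynamic_power(activity=0.5, capacitance_f=1e-12, vdd_v=3.3, freq_hz=100e6)
p_low_vdd = dynamic_power(activity=0.5, capacitance_f=1e-12, vdd_v=1.8, freq_hz=100e6)

# With everything else fixed, power scales with the square of the supply voltage.
ratio = p_low_vdd / p_high_vdd
print(f"power at 1.8 V relative to 3.3 V: {ratio:.2f}")
```

The quadratic dependence on supply voltage is what makes voltage scaling attractive, and the activity and capacitance terms are what shutdown techniques target.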

5 Dynamic power management
To keep in mind for the further discussion: in static CMOS circuits, quiescent portions of the circuit dissipate a minimal amount of power. Average power can be reduced even if more hardware is added, provided we can guarantee that the active hardware is, on average, smaller. The goal is to reduce the effective capacitance.
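As a rough illustration of why adding hardware can still lower average power, the sketch below compares the average switched capacitance of a monolithic block with that of a partitioned design in which only one part is active at a time. The helper name, the activation probabilities, and the capacitance values (arbitrary units) are all assumptions made up for the example.

```python
def average_switched_capacitance(blocks):
    """blocks: list of (probability_active, capacitance) pairs."""
    return sum(p * c for p, c in blocks)

# Monolithic design: one block of 10 units, always active (assumed values).
monolithic = average_switched_capacitance([(1.0, 10.0)])

# Partitioned design: a small block active 90% of the time, a large block
# active 10% of the time, plus always-active shutdown overhead (assumed values).
partitioned = average_switched_capacitance([(0.9, 3.0), (0.1, 8.0), (1.0, 0.5)])

total_hardware = 3.0 + 8.0 + 0.5  # more hardware than the monolithic design
print(monolithic, partitioned, total_hardware)
```

Here the partitioned design contains 11.5 units of capacitance instead of 10, yet switches about 4 units on average.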

6 Functional overhead in power management
Systems are in most cases designed for worst-case conditions, here meaning full utilization. They are fully utilized only during a small fraction of their operational time, so a shutdown mechanism shuts down the parts of the design that are not needed when full utilization is not required. Functional overhead: shutdown circuits are not needed for functional correctness. The power management system should never degrade the design to the point of violating its performance constraints: area, timing, power(!).

7 Where is power management used?
Algorithm level. Architectural level. Dynamic power management techniques. Register-transfer level. Logic level.

8 Implementation of Power Management
The design must be partitioned: idle states of the different units must be detected so that these units can be shut down. Partitioning the design according to the functional units can be done manually and intuitively: the functional units are well separated and easy to identify, and clock gating is introduced in a small number of places. Partitioning of one single functional unit: it is less obvious how to partition the unit, so automated procedures supported by CAD tools are needed.

9 Low Power FSM design
Approaches to low-power FSM design: state encoding; shut-down techniques (gated clock, input disabling). Gated-clock approaches result in FSMs with 35% lower power consumption than state-encoding techniques [Chow96].

10 Partitioned FSMs
[Figure: partitioned FSM with input X, output Y, states S0 to S4, and edge labels 0.6, 0.2, 0.005, 0.01, 0.04, 0.1.]

11 Organization of the states
Local states: states from the original FSM, stored in state memory based on flip-flops triggered by the active edge of the clock signal. Global states: the global state points out which of the sub-FSMs is active. It is updated independently of the clock signal (asynchronously); entering a g-state changes the global state. Restrictions on state assignment: coupled states (gi, si) must have identical codes; other states may share the same codes if they reside in different sub-FSMs.

12 Basic principles of the proposed design model
Partitioning the original FSM. Separate sub-FSMs with coupled states indicated.

13 Crossing transitions
Original transition from s5 to s1 in one clock cycle. In the transformed graph the s5 to s1 “transition” requires two transitions. In a fully synchronous operation this takes 2 cycles, or simultaneous clocking of F1 and F2 in the transition cycle, or the crossing transition is handled asynchronously. [Figure: crossing transition between sub-FSMs F1 and F2 involving states s5, g1, and s1.]

14 Example, coupled states
Let there be a partition p = {S1, S2, S3, S4} resulting in the following sets of local states: U1 = {s1, s2, s3, g4}; U2 = {s4, s5, s6, g7}; U3 = {s7, g1}; U4 = {s8, s9, s10, s11, s12, s13, g1, g5}. Clustering of the coupled states.

15 Example, clustering the remaining states
States that are not coupled may be freely placed in any cluster. U1 = {s1, s2, s3, g4}; U2 = {s4, s5, s6, g7}; U3 = {s7, g1}; U4 = {s8, s9, s10, s11, s12, s13, g1, g5}.

16 Example, state encoding
The procedure ensures that the minimum number of bits in the state memory is needed for each sub-FSM. Binary state encoding in the re-ordered state table.

17 Saving flip-flops
States in different sub-FSMs share the same code. Total number of bits in local state memory: sharing, 3; not sharing, 8.
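The two totals can be reproduced with a short sketch, assuming this slide refers to the example sets U1 to U4 given earlier and that each sub-FSM uses a minimum-length binary code. The dictionary layout and the helper name `bits_needed` are assumptions introduced for illustration.

```python
from math import ceil, log2

# Local state sets from the coupled-states example (slides 14 and 15).
local_states = {
    "U1": ["s1", "s2", "s3", "g4"],
    "U2": ["s4", "s5", "s6", "g7"],
    "U3": ["s7", "g1"],
    "U4": ["s8", "s9", "s10", "s11", "s12", "s13", "g1", "g5"],
}

def bits_needed(n_states):
    """Minimum number of state bits for a binary encoding of n_states."""
    return max(1, ceil(log2(n_states)))

per_fsm = {name: bits_needed(len(states)) for name, states in local_states.items()}

# Shared local state memory: sized for the largest sub-FSM only.
shared_bits = max(per_fsm.values())
# Separate state memory per sub-FSM: the individual widths add up.
separate_bits = sum(per_fsm.values())

print(per_fsm, shared_bits, separate_bits)
```

Under these assumptions the sub-FSMs need 2, 2, 1, and 3 bits, giving 3 bits when the memory is shared and 8 bits when it is not, which matches the totals on the slide.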

18 Improvements compared to others ...
Performance of the previously presented mixed synchronous/asynchronous approach. Estimated performance of this approach. Significant reduction in both area and power for the output logic.

19 Future work
Develop a suitable implementation architecture: performance evaluation using standard benchmarks. Design automation: develop a CAD tool for automatic synthesis; low-level optimization, e.g. optimal state encoding. Apply this approach to “real-world” problems, e.g. high-speed, low-power protocol processors; partitioning of the FSM along with the data path. ... this work is well suited for a PhD thesis project.

20 Comparative study of flip-flops
Comparative study of low-voltage performance of standard-cell flip-flops. Xue Shang and Bengt Oelmann

21 Comparative study of flip-flops
Objective: characterize and compare the power consumption and speed performance of flip-flops designed as standard cells; propose a suitable combination of flip-flop types to be included in a cell library used in power-driven synthesis. Outline: motivation; background to standard cell design; different types of flip-flops; characterization of the flip-flops; simulation results; conclusions.

22 Motivation
Pipelining together with voltage scaling is an efficient way to get high speed and low power. Increased pipelining means that flip-flops will occupy a larger part of the design. Case 1: the delay in the critical path is T at Vdd = V1. Case 2: pipeline registers are introduced to shorten the delay when Vdd = V1; Vdd is then reduced to V2 so that the critical-path delay becomes T again.

23 Standard-cell based design flow
[Figure: design flow.] HDL code (RTL) is fed to automatic synthesis, which uses the standard cell library and constraints on power (max power), timing (max cycle time), and area (max circuit area) to produce a gate netlist that goes on to physical design.

24 Flip-flops for timing-driven synthesis
In timing-driven synthesis only one type of flip-flop is needed in the library. The synthesis tool picks the flip-flop with the best driving capability for the actual loading condition. [Figure: combinational logic between flip-flops of drive strengths FFx1 and FFx3.]

25 Flip-flops for power-driven synthesis
The synthesis tool must compute the switching probability (a) and signal probability (p) at every node in the network. The synthesis tool will pick the flip-flop with the lowest power consumption for the actual a and p. [Figure: network with nodes annotated (a0,p0) to (a8,p8) and flip-flops of type 1 and type 2.]
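The selection step can be sketched as a lookup over per-cell power models. The linear model (idle power plus an activity-dependent term), its coefficients, and the function names below are hypothetical; the coefficients were chosen only so that the SRIS cell uses half the power of strongARM at a = 0 and twice the power at a = 1, in line with the trend reported in the conclusions.

```python
# Hypothetical power models in arbitrary units: power(a) = idle + slope * a.
# The coefficients are made up for illustration, not measured values.
POWER_MODELS = {
    "strongARM": {"idle": 2.0, "slope": 1.0},
    "SRIS": {"idle": 1.0, "slope": 5.0},
}

def estimated_power(ff_type, a):
    """Estimated power of one flip-flop type at switching probability a."""
    model = POWER_MODELS[ff_type]
    return model["idle"] + model["slope"] * a

def pick_flip_flop(a):
    """Return the flip-flop type with the lowest estimated power at activity a."""
    return min(POWER_MODELS, key=lambda ff_type: estimated_power(ff_type, a))

for a in (0.0, 0.1, 0.5, 1.0):
    print(a, pick_flip_flop(a))
```

With these assumed coefficients the crossover lies at a = 0.25: nodes with lower activity get the SRIS cell and nodes with higher activity get strongARM. A real tool would take the models from the characterized cell library and would also account for the signal probability p.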

26 Design criteria on standard cell flip-flops
Static CMOS: robustness, no restrictions on the lowest allowable clock frequency. Single clock phase: facilitates the automated design process supported by CAD tools. Single-ended data inputs. Primary inputs of the flip-flop cell must only be connected to gate terminals of transistors: source or drain connections are not well suited for simple timing calculations.

27 Types of flip-flops
Two different types of master-slave flip-flops. [Figure labels: D, Q, C; differential data input; two-phase clock.]

28 Characterization of flip-flops
Parameters to control for comparison: common technology (0.6 µm); transistor sizing; input signal transition time; loading conditions; data input sequence. Simulation testbench.

29 Timing
Cycle time calculation in synchronous designs. Performance measure of interest: tD-Q. [Figure: flip-flop with terminals D, Q, and C.]

30 How to determine tD-Q?
tC-Q = f(tsetup). Stable region (tC-Q = constant). Meta-stable region (tC-Q = f(tC-C)). Failure region. [Figure labels: stable, meta-stable, failure; minimum D-Q delay.]
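One way to read the procedure is as a sweep over setup time: for each setup time, measure the clock-to-Q delay, form the D-Q delay as their sum, and take the minimum over all points where the flip-flop still captures the data. The sketch below assumes that definition (tD-Q = tsetup + tC-Q); the sweep values are invented picosecond figures, not simulation results from the study.

```python
# Hypothetical sweep: (setup time in ps, measured clock-to-Q delay in ps).
# None marks the failure region, where the data is not captured.
sweep = [
    (500, 300),   # stable region: tC-Q constant
    (400, 300),
    (300, 300),
    (200, 310),   # meta-stable region: tC-Q grows as setup time shrinks
    (150, 340),
    (100, 420),
    (50, None),   # failure region
]

def minimum_d_to_q(points):
    """Return (minimum D-Q delay, setup time at which it occurs)."""
    valid = [(t_su + t_cq, t_su) for t_su, t_cq in points if t_cq is not None]
    return min(valid)

t_dq_min, t_su_opt = minimum_d_to_q(sweep)
print(f"minimum D-Q delay {t_dq_min} ps at setup time {t_su_opt} ps")
```

In this made-up sweep the minimum falls inside the meta-stable region, where a shorter setup time still outweighs the growing clock-to-Q delay.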

31 Power
Power dissipation is separated into: data power (power to switch the data input); clock power (power to switch the clock input); internal power (power dissipation within the flip-flop). Data input pattern: worst power (a=1): [pattern lost in extraction]; average power (a=0.5): pseudo-random pattern; minimum power (a=0): [pattern lost in extraction] or [pattern lost in extraction].
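The three activity levels can be illustrated with simple bit sequences. The concrete patterns are not preserved in the transcript, so the sketch below assumes an alternating sequence for a = 1 and a constant sequence for a = 0; the function names and the seed are likewise assumptions.

```python
import random

def switching_activity(bits):
    """Fraction of consecutive cycle pairs in which the data input toggles."""
    if len(bits) < 2:
        return 0.0
    toggles = sum(1 for prev, cur in zip(bits, bits[1:]) if prev != cur)
    return toggles / (len(bits) - 1)

def alternating(n):
    """Assumed worst-case pattern: 0, 1, 0, 1, ... (a = 1)."""
    return [i % 2 for i in range(n)]

def constant(n, value=0):
    """Assumed minimum-power pattern: input held at one level (a = 0)."""
    return [value] * n

def pseudo_random(n, seed=0):
    """Pseudo-random pattern; activity approaches 0.5 for long sequences."""
    rng = random.Random(seed)
    return [rng.randint(0, 1) for _ in range(n)]

for name, pattern in [("alternating", alternating(10000)),
                      ("constant", constant(10000)),
                      ("pseudo-random", pseudo_random(10000))]:
    print(name, round(switching_activity(pattern), 3))
```

A constant input held at either 0 or 1 gives zero activity, which would be consistent with the slide listing two alternatives for the minimum-power case.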

32 Flip-flops
C2MOS: based on clocked CMOS stages (traditional). MUX: multiplexer based, static combinational gates (traditional). SRIS: Static Ratio-Insensitive Latch (Yuan and Svensson 1997). SSTC: Static Single Transistor Clocked (Yuan and Svensson 1997). strongARM: used in ARM RISC processors (Advanced RISC Machines). TG: transmission gate based (PowerPC).

33 Delays
Minimum D-Q delay

34 Power-Delay-Product

35 Power consumption

36 Where is the power dissipated?

37 Conclusions
Voltage scaling is efficient down to 1.8 V. Speed: strongARM is the fastest flip-flop, with low power consumption. Power: for low switching activity, SRIS has half the power consumption of strongARM; for high switching activity, SRIS has up to twice the power consumption of strongARM. For a standard cell library, use both strongARM and SRIS and let the synthesis tool pick the most suitable one.

38 Concluding remarks
A large number of simulations must be carried out, so an automated characterization procedure is a must. The simulation environment is built on: PowerMill (a fast analog simulator); Matlab (test pattern generation, automatic analysis of simulation results, graphical user interface, graphical presentation of simulation results); Perl (netlist conversions, processing of simulation results).

