Presentation on theme: "VADA Lab.SungKyunKwan Univ. 1 L3: Lower Power Design Overview (2), Sungkyunkwan University, Prof. Jun-Dong Cho". Presentation transcript:
VADA Lab.SungKyunKwan Univ. 1 L3: Lower Power Design Overview (2) Sungkyunkwan University, Prof. Jun-Dong Cho http://vlsicad.skku.ac.kr
VADA Lab.SungKyunKwan Univ. 2 Low-Power Design Flow developed at LIS
VADA Lab.SungKyunKwan Univ. 3 Low Power Design Flow I
VADA Lab.SungKyunKwan Univ. 4 Low Power Design Flow II
VADA Lab.SungKyunKwan Univ. 5 Execution unit idle time(PowerPC 603)
VADA Lab.SungKyunKwan Univ. 6 System Integration
VADA Lab.SungKyunKwan Univ. 7 Power Consumption in Multimedia Systems
Power breakdown: LCD 54.1%, HDD 16.8%, CPU 10.7%, VGA/VRAM 9.6%, SysLogic 4.5%, DRAM 1.1%, Others 3.2%
5-55 mode:
– Display mode (55 minutes): CPU is in sleep mode; LCD (VRAM + LCDC) is active
– CPU mode (5 minutes): display is idle; looking up (data retrieval)
Handwriting recognition draws the most power (memory and system bus active)
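The breakdown above suggests where optimization effort pays off most. The following is a minimal sketch, using only the percentages quoted on the slide, that estimates how much total system power is saved when one subsystem's power is cut by a given fraction. The function name `total_saving` and the 50% reduction are illustrative assumptions, not part of the original material.

```python
# Share of total system power per subsystem, in percent (figures from the slide).
BREAKDOWN = {
    "LCD": 54.1,
    "HDD": 16.8,
    "CPU": 10.7,
    "VGA/VRAM": 9.6,
    "SysLogic": 4.5,
    "DRAM": 1.1,
    "Others": 3.2,
}


def total_saving(subsystem, reduction):
    """Percent of total system power saved when `subsystem` power
    is cut by the fraction `reduction` (0.0 to 1.0)."""
    return BREAKDOWN[subsystem] * reduction


if __name__ == "__main__":
    # Hypothetical scenario: halve each subsystem in turn, largest consumer first.
    for name in sorted(BREAKDOWN, key=BREAKDOWN.get, reverse=True):
        saved = total_saving(name, 0.5)
        print(f"{name:9s} halved: {saved:5.2f}% of total power saved")
```

Under these figures, halving the LCD power saves far more of the total than halving the CPU power, which is why the display dominates the analysis.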
VADA Lab.SungKyunKwan Univ. 8 Reducing Waste
– Locality of reference
– Demand-driven / data-driven computation
– Application-specific processing
– Preservation of data correlations
– Distributed processing
VADA Lab.SungKyunKwan Univ. 9 Energy-Efficient Design
1) Reduce the supply voltage
– Switching energy drops quadratically with the supply voltage
– This drop is accompanied by reduced circuit speed
2) Minimize switching capacitance
– Exploit locality of reference with distributed computational structures, minimizing global interactions
– Enforce a demand-driven policy that eliminates switching activity in unused modules
– Preserve temporal correlation in data streams by minimizing the degree of hardware sharing
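The quadratic dependence on supply voltage can be made concrete with the standard CMOS relation, in which the energy drawn from the supply per 0-to-1 output transition is C * Vdd^2. The sketch below is only an illustration: the function names and the example capacitance are assumptions, and the 5 V and 3.3 V values are chosen simply as familiar supply levels.

```python
def switching_energy(c_farads, vdd_volts):
    """Energy (joules) drawn from the supply for one 0->1 transition
    of a node with capacitance `c_farads`: E = C * Vdd^2."""
    return c_farads * vdd_volts ** 2


def relative_energy(v_new, v_ref):
    """Switching energy at `v_new` relative to `v_ref`, same capacitance."""
    return (v_new / v_ref) ** 2


if __name__ == "__main__":
    c_load = 50e-15  # hypothetical 50 fF node
    for vdd in (5.0, 3.3):
        print(f"Vdd = {vdd} V: {switching_energy(c_load, vdd):.3e} J per transition")
    print(f"3.3 V relative to 5 V: {relative_energy(3.3, 5.0):.4f}")
```

The speed penalty mentioned on the slide is not modeled here; the sketch covers only the energy side of the trade-off.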
VADA Lab.SungKyunKwan Univ. 10 Switching Activity
VADA Lab.SungKyunKwan Univ. 11 Eliminating Redundant Computations
VADA Lab.SungKyunKwan Univ. 12 Power saving concepts
– Work with parallel computation and low frequency.
– Reduce pipe stages to save registers (try to avoid hazards).
– Disable input toggling when the block is in the idle state.
– Work with minimum gate size to reduce the toggle current.
– For outputs with large fanout, speed up the transition to reduce the short-circuit current (invest toggle current in order to save short-circuit current).
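The first concept, parallel computation at low frequency, can be sketched with the dynamic power relation P = C * Vdd^2 * f. With n parallel units each clocked at f/n, throughput is preserved and the slack can be spent on a lower supply voltage. Everything below is an assumption for illustration: the function names, the capacitance overhead factor, and the scaled voltage are hypothetical and not taken from the slides.

```python
def dynamic_power(c, vdd, f):
    """Dynamic switching power: P = C * Vdd^2 * f."""
    return c * vdd ** 2 * f


def parallel_power_ratio(n, v_ref, v_scaled, cap_overhead=1.0):
    """Power of an n-way parallel design relative to the single-unit reference.

    The parallel design has n copies of the capacitance (times `cap_overhead`
    for extra routing and multiplexing), runs each copy at f/n, and uses the
    lower supply `v_scaled` instead of `v_ref`.
    """
    c_ref, f_ref = 1.0, 1.0  # normalized reference values
    p_ref = dynamic_power(c_ref, v_ref, f_ref)
    p_par = dynamic_power(n * cap_overhead * c_ref, v_scaled, f_ref / n)
    return p_par / p_ref


if __name__ == "__main__":
    # Hypothetical numbers: 2-way parallel, 15% capacitance overhead,
    # supply lowered from 5.0 V to 2.9 V.
    ratio = parallel_power_ratio(2, 5.0, 2.9, cap_overhead=1.15)
    print(f"parallel / reference power: {ratio:.3f}")
```

Without voltage scaling the ratio is at best 1.0, so the benefit comes entirely from the lower supply that the relaxed clock permits.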
VADA Lab.SungKyunKwan Univ. 13 Power Management
– DPM (Dynamic Power Management): stops the clock switching of a specific unit. The clocks are produced by clock regenerators, which generate two clocks, C1 and C2. The logic: 0.3%, with 10-20% power savings.
– SPM (Static Power Management): saves power dissipation in the steady mode. When the system (or subsystem) remains idle for a significant period of time, the entire chip (or subsystem) is shut down.
– Identify power-hungry modules and look for opportunities to reduce power.
– If f is increased, one has to increase the transistor size or Vdd.
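The difference between the two policies can be illustrated with a toy cycle-level model. This is a sketch under stated assumptions: the per-cycle energy values, the idle `timeout`, and all function names are hypothetical, and wake-up latency and wake-up energy are ignored.

```python
def energy_no_management(trace, e_active, e_idle_clocked):
    """Unit is always clocked; idle cycles still cost `e_idle_clocked`."""
    return sum(e_active if busy else e_idle_clocked for busy in trace)


def energy_dpm(trace, e_active, e_gated):
    """DPM-style: the unit's clock is stopped in every idle cycle."""
    return sum(e_active if busy else e_gated for busy in trace)


def energy_spm(trace, e_active, e_idle_clocked, e_off, timeout):
    """SPM-style: the unit is shut down once it has been idle
    for more than `timeout` consecutive cycles."""
    total, idle_run = 0.0, 0
    for busy in trace:
        if busy:
            idle_run = 0
            total += e_active
        else:
            idle_run += 1
            total += e_off if idle_run > timeout else e_idle_clocked
    return total


if __name__ == "__main__":
    trace = [1, 0, 0, 0, 0, 1]  # 1 = busy cycle, 0 = idle cycle
    print(energy_no_management(trace, 1.0, 0.5))
    print(energy_dpm(trace, 1.0, 0.25))
    print(energy_spm(trace, 1.0, 0.5, 0.0, timeout=2))
```

In this model DPM acts on every idle cycle, while SPM only pays off for idle periods longer than the timeout, which matches the "significant period of time" condition on the slide.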
VADA Lab.SungKyunKwan Univ. 14 Power Management
– Use the right supply voltage and the right frequency for each part of the system.
– If the system has to wait for some input to occur, only a small circuit needs to wait; it wakes up the main circuit when the input occurs.
– Another technique is to reduce the base frequency for tasks that can be executed slowly.
– PowerPC 603 is a 2-issue processor (2 instructions read at a time) with 5 parallel execution units. 4 modes:
  – Full-on mode for full speed
  – Doze mode, in which the execution units are not running
  – Nap mode, which also stops the bus clocking
  – Sleep mode, which stops the clock generator, with or without the PLL (20-100 mW)
– Superpipelined MIPS R4200: 5-stage pipeline; MIPS R4400: 8 stages, 2 execution units, f/2 in reduced mode.
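The four modes listed above can be modeled as a small table of what each mode stops. This is a sketch, not the processor's actual control logic: the block names are hypothetical labels, and it assumes the modes are cumulative (each deeper mode also stops everything the shallower ones stop), which is how the slide's wording reads.

```python
from enum import Enum


class Mode(Enum):
    FULL_ON = 0
    DOZE = 1
    NAP = 2
    SLEEP = 3


# Blocks stopped in each mode (assumed cumulative; names are illustrative).
STOPPED = {
    Mode.FULL_ON: frozenset(),
    Mode.DOZE: frozenset({"execution_units"}),
    Mode.NAP: frozenset({"execution_units", "bus_clock"}),
    Mode.SLEEP: frozenset({"execution_units", "bus_clock", "clock_generator"}),
}


def deepest_mode(required):
    """Return the deepest mode that keeps every block in `required` running."""
    best = Mode.FULL_ON
    for mode in Mode:  # iterates in definition order, shallow to deep
        if not (STOPPED[mode] & set(required)):
            best = mode
    return best


if __name__ == "__main__":
    print(deepest_mode({"bus_clock"}))        # bus must keep running
    print(deepest_mode(set()))                # nothing needs to run
    print(deepest_mode({"execution_units"}))  # still computing
```

A power manager built on such a table would pick the deepest mode compatible with the blocks that must stay alive, for example keeping the bus clock running while the execution units wait.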
VADA Lab.SungKyunKwan Univ. 15 TI
– Two DSPs, TMS320C541 and TMS320C542, reduce power, chip count, and system cost for wireless communication applications.
– C54X DSPs, 2.7 V, 5 V, Low-Power Enhanced Architecture DSP (LEAD) family: with three different power-down modes, these devices are well suited for wireless communication products such as digital cellular phones, personal digital assistants, and wireless modems; low power for voice coding and decoding.
The TMS320LC548 features:
– 15-ns (66 MIPS) or 20-ns (50 MIPS) instruction cycle times
– 3.0- and 3.3-V operation
– 32K 16-bit words of RAM and 2K 16-bit words of boot ROM on-chip
– Integrated Viterbi accelerator that reduces the Viterbi butterfly update to four instruction cycles for GSM channel decoding
– Powerful single-cycle instructions (dual operand, parallel instructions, conditional instructions)
– Low-power standby modes
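For readers unfamiliar with the term, a Viterbi butterfly update is an add-compare-select (ACS) step that computes two new path metrics from two old ones. The following is a generic sketch of that operation and does not represent TI's accelerator or instruction sequence. It assumes symmetric branch metrics (+bm and -bm) and a convention in which the smaller metric survives; the function names are hypothetical.

```python
def acs(candidate_even, candidate_odd):
    """Compare-select: return (surviving metric, decision bit).
    Decision bit 0 means the even predecessor won (ties go to even)."""
    if candidate_even <= candidate_odd:
        return candidate_even, 0
    return candidate_odd, 1


def butterfly(pm_even, pm_odd, bm):
    """One ACS butterfly.

    pm_even, pm_odd: path metrics of the two predecessor states.
    bm: branch metric; the four branches use +bm or -bm (symmetric case).
    Returns the (metric, decision) pairs for the two successor states.
    """
    low = acs(pm_even + bm, pm_odd - bm)   # first successor state
    high = acs(pm_even - bm, pm_odd + bm)  # second successor state
    return low, high


if __name__ == "__main__":
    print(butterfly(10, 4, 3))
```

Each butterfly needs two additions, two subtractions, two comparisons, and two selections, which is the work a hardware accelerator compresses into a few instruction cycles.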
VADA Lab.SungKyunKwan Univ. 16 Low-power embedded system design
– Low-power embedded applications: PDAs, mobile phones, etc.
– Power-efficient processor cores (ARM)
– Cache/memory organization for low power
– Power management on embedded system chips
– Comparative analysis of the power drawn by notebook subsystems (CPU, hard disk, display, and standby)
VADA Lab.SungKyunKwan Univ. 17 High level optimization for low power
– Use of parallel and/or pipelined structures
– Choice of data representations
– Exploitation of signal correlations
– Synchronization of signals for glitching minimization
– Accurate analysis of the shared resources
– At the algorithmic level, applying arithmetic and logic transformations to the block diagram
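The effect of the data representation on switching activity can be shown by counting bit toggles on a bus. The sketch below compares two's-complement and sign-magnitude encodings for a small signal that oscillates around zero. The sample sequence, the 8-bit width, and the function names are illustrative assumptions; `sign_magnitude` assumes the magnitude fits in `bits - 1` bits.

```python
def twos_complement(value, bits):
    """Encode a signed integer as a `bits`-wide two's-complement word."""
    return value & ((1 << bits) - 1)


def sign_magnitude(value, bits):
    """Encode a signed integer as sign bit plus magnitude.
    Assumes abs(value) < 2 ** (bits - 1)."""
    sign = (1 << (bits - 1)) if value < 0 else 0
    return sign | abs(value)


def toggles(samples, encode, bits=8):
    """Total number of bit flips between consecutive encoded samples."""
    words = [encode(s, bits) for s in samples]
    return sum(bin(a ^ b).count("1") for a, b in zip(words, words[1:]))


if __name__ == "__main__":
    samples = [1, -1, 1, -1]  # hypothetical small signal around zero
    print("two's complement:", toggles(samples, twos_complement))
    print("sign-magnitude:  ", toggles(samples, sign_magnitude))
```

For this sequence every sign change flips seven of the eight bits in two's complement but only the sign bit in sign-magnitude, so the choice of representation directly changes the switched capacitance.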
VADA Lab.SungKyunKwan Univ. 18 VLSI Signal Processing Design Methodology
– Pipelining, parallel processing, retiming, folding, unfolding, look-ahead, relaxed look-ahead, and approximate filtering
– Bit-serial, bit-parallel, and digit-serial architectures; carry-save architecture
– Redundant and residue number systems
– Viterbi decoder, motion compensation, 2D filtering, and data transmission systems
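Of the transformations listed, look-ahead is easy to demonstrate on a first-order recursion. Substituting y[n-1] into y[n] = a*y[n-1] + x[n] gives y[n] = a^2*y[n-2] + a*x[n-1] + x[n], which places two delays in the feedback loop so the loop can be pipelined. The sketch below checks that both forms produce the same output; the function names, the coefficient, and the input sequence are illustrative assumptions, and zero initial conditions are assumed.

```python
def iir_direct(x, a):
    """Original recursion: y[n] = a * y[n-1] + x[n], with y[-1] = 0."""
    y, prev = [], 0.0
    for xn in x:
        prev = a * prev + xn
        y.append(prev)
    return y


def iir_lookahead(x, a):
    """One-step look-ahead form:
    y[n] = a^2 * y[n-2] + a * x[n-1] + x[n], zero initial conditions."""
    y = []
    for n, xn in enumerate(x):
        y_nm2 = y[n - 2] if n >= 2 else 0.0
        x_nm1 = x[n - 1] if n >= 1 else 0.0
        y.append(a * a * y_nm2 + a * x_nm1 + xn)
    return y


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    print(iir_direct(x, 0.5))
    print(iir_lookahead(x, 0.5))
```

The look-ahead form costs one extra multiply-add per sample; in a low-power flow the timing slack it creates is typically traded for a lower supply voltage.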