Low power Design Strategies Daniele Folegnani. Talk outline Why Low Power is Important Power Consumption in CMOS Circuits New Trends for Future Microprocessors.


1 Low power Design Strategies Daniele Folegnani

2 Talk outline
- Why Low Power is Important
- Power Consumption in CMOS Circuits
- New Trends for Future Microprocessors
- Low Power Strategies
- Power Consumption Evaluation of a Superscalar Processor
- An Architectural Technique to Reduce the Power Consumption of the Issue Logic
- Conclusions

3 Why Low Power is Important
- High performance microprocessors
  - PowerPC704 consumes 85 W
  - Alpha 21364 consumes 100 W
- Problems involved: thermal runaway, gate dielectric, junction fatigue, electromigration diffusion, electrical parameter shift, silicon interconnection fatigue, package-related failure
- THE FUNCTIONALITY AND THE CLOCK SPEED CAN BE LIMITED

4 Thermal and Power dissipation costs

5 Low performance processors
- High demand for portable devices ( mobile phones, laptops, smart cards, videogames, etc ) >>> 95% of production !!!
- Extensive use of multimedia features
- Problems involved: >>> Battery life !!!
- Battery energy capacity will not grow drastically in the near future, for technology and safety reasons ( today's batteries hold the same energy as a grenade !!! )
- A key selling point is hours of use and hours of standby
- Techniques are needed that improve energy efficiency without penalizing performance

6 Power Consumption in CMOS Circuits
- Static: theoretically 0; in practice, leakage and subthreshold currents exist in transistors
- Dynamic
  - Transients ( the linear zone )
  - Capacitance switching: THE MOST IMPORTANT FACTOR
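As a reminder of why capacitance switching dominates, the standard first-order expression for CMOS switching power is given below. The symbols are the conventional ones and do not appear on the slide: \alpha is the activity factor, C_L the switched capacitance, V_{dd} the supply voltage and f the clock frequency.

```latex
P_{\text{switching}} = \alpha \, C_L \, V_{dd}^{2} \, f
```

The quadratic dependence on V_{dd} is what makes voltage scaling the most effective single lever on dynamic power.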

7 New Trends for Future Microprocessors

8 Moore's Law: the number of transistors doubles every 18 months
- Power is proportional to DIE AREA and FREQUENCY
- Within the same technology, a new architecture takes 2-3X the die area
- Changing technology implies 2X frequency
- SCALING TECHNOLOGY...
  - Voltage decreases ( 0.7 scaling factor )
  - Die area decreases ( 0.5 scaling factor )
  - Capacitance per unit area increases by 43% !!!

9 This implies that power density increases by 40% every generation !!!
- Temperature is a function of power density and determines the type of cooling system needed
- VARIABLES
  - PEAK POWER ( worst case ): today's packages can sustain a power dissipation over 100 W for up to 100 msec >>> cheaper package if peaks are reduced
  - ENERGY SPENT ( for a workload ): more closely correlated to battery life
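The 40% figure can be checked from the scaling factors quoted on slide 8. The sketch below assumes that switching power per unit area goes as (capacitance per unit area) x V^2 x f; the function and variable names are illustrative only.

```python
# Per-generation scaling factors quoted on the slides.
VOLTAGE_SCALE = 0.7        # supply voltage
CAP_PER_AREA_SCALE = 1.43  # capacitance per unit area (+43%)
FREQUENCY_SCALE = 2.0      # clock frequency


def power_density_scale(cap_per_area, voltage, frequency):
    """Relative change in switching power per unit area.

    Assumes power density ~ (C / area) * V^2 * f.
    """
    return cap_per_area * voltage ** 2 * frequency


growth = power_density_scale(CAP_PER_AREA_SCALE, VOLTAGE_SCALE, FREQUENCY_SCALE)
print(f"power density changes by a factor of {growth:.2f} per generation")
```

With these inputs the factor comes out at roughly 1.4, which matches the 40% increase stated on the slide.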

10 Low Power Strategies
- OS level: PARTITIONING, POWER DOWN
- Software level: REGULARITY, LOCALITY, CONCURRENCY ( compiler technology for low power, instruction scheduling )
- Architecture level: PIPELINING, REDUNDANCY, DATA ENCODING ( ISA, architectural design, memory hierarchy, HW extensions, etc )
- Circuit/logic level: LOGIC STYLES, TRANSISTOR SIZING, ENERGY RECOVERY ( logic families, conditional clocking, adiabatic circuits, asynchronous design )
- Technology level: threshold reduction, multi-threshold devices, etc

11 Power Consumption Estimation

12 Architectural estimation has a relatively high error rate ( no view of the total area, circuit types, technology, block activity, etc )
- IMPORTANT DESIGN DECISIONS MUST BE MADE AT THE ARCHITECTURAL LEVEL
- Accurate power evaluation is only done in late design phases
- Good feedback is needed between all the design phases
  - Correlation between power estimates from low level to high level
- TRY TO IMPROVE ACCURACY AT HIGH LEVEL
  - Critical-path-based power consumption analysis ( CIRCUIT TYPES, TECHNOLOGY, ACTIVITY FACTOR )
  - Thermal-image-based correlation analysis ( HOTTEST SPOTS LOCATION, COOLEST SPOTS LOCATION, TEMPERATURE DIFFERENCES, TEMPERATURE DISTRIBUTION )

13 Architectural Power Evaluation [ G.Cai, Intel ]
- Architectural design partition
- Power consumption evaluation at block level
  - Power density of blocks ( SPICE simulation, statistical input set, technology and circuit types definition )
  - Activity of blocks and sub-blocks ( running benchmarks )
  - Area ( feedback from VLSI design, circuits and technology defined )
- TRY TO DEFINE SCALING FACTORS THAT ALLOW THE ARCHITECTURAL POWER SIMULATOR TO BE REMAPPED WHEN TECHNOLOGY, AREA AND CIRCUIT TYPES CHANGE
- TRY TO REDUCE THE ESTIMATION ERROR AT HIGH LEVEL

14 POW OUT ORDER
- Technology assumed: CMOS 0.18 micron
- 5 types of circuit logic ( static, dynamic, SRAM, clock distribution, PLA )
- 32 architectural blocks, each with an associated area
- Blocks built with custom design
- Two types of power density ( active and inactive power density )
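A block-level estimate of this kind can be sketched as follows. The slides only say that each block has an area, an activity and two power densities; the weighting used here (activity-weighted mix of active and inactive density, times area) is an assumption, and the block names and numbers are invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str
    area_mm2: float          # from VLSI feedback
    active_density: float    # W/mm^2 while the block is in use
    inactive_density: float  # W/mm^2 while the block is idle
    activity: float          # fraction of cycles the block is in use (0..1)


def block_power(block):
    """Average power of one block (assumed activity-weighted model)."""
    density = (block.activity * block.active_density
               + (1.0 - block.activity) * block.inactive_density)
    return block.area_mm2 * density


def total_power(blocks):
    """Sum of the average power of all architectural blocks."""
    return sum(block_power(b) for b in blocks)


# Hypothetical values, not taken from the presentation.
blocks = [
    Block("issue_queue", area_mm2=10.0, active_density=0.5,
          inactive_density=0.05, activity=0.8),
    Block("fp_alu", area_mm2=4.0, active_density=0.3,
          inactive_density=0.03, activity=0.25),
]
for b in blocks:
    print(f"{b.name}: {block_power(b):.3f} W")
print(f"total: {total_power(blocks):.3f} W")
```

In such a model the activity values would come from running benchmarks on the architectural simulator, while the densities and areas would be refreshed whenever technology or circuit types change.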

15 Power Consumption Evaluation of a Superscalar Processor
Architectural parameters:
- Fetch, issue and commit width: 4 instructions
- Instruction queue size: 128 entries
- I-Cache: 128 Kbytes, direct mapped, 32-byte line, 1 cycle hit, 3 cycle miss
- D-Cache: 128 Kbytes, 4-way set associative, 32-byte line, 1 cycle hit, 3 cycle miss
- UL2-Cache: 1024 Kbytes, 4-way set associative, 64-byte line, 3 cycle hit
- Combined predictor of 1K entries, with a Gshare of 1K 2-bit counters and 8-bit global history, and a bimodal predictor of 2K entries with 2-bit counters
- 4 int ALU, 4 fp ALU, 1 int mul/div, 1 fp mul/div
- Out-of-order issue, oldest-ready-first selection policy
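To make the cache parameters concrete, the sketch below derives the number of sets and index bits for each cache using the usual relation sets = size / (line size x associativity). The dictionary keys and helper name are illustrative, not part of the original simulator configuration.

```python
KB = 1024


def cache_sets(size_bytes, line_bytes, ways):
    """Number of sets in a set-associative cache (ways=1 means direct mapped)."""
    return size_bytes // (line_bytes * ways)


# (size, line size, associativity) as listed on the slide.
caches = {
    "I-Cache": (128 * KB, 32, 1),
    "D-Cache": (128 * KB, 32, 4),
    "UL2-Cache": (1024 * KB, 64, 4),
}

for name, (size, line, ways) in caches.items():
    sets = cache_sets(size, line, ways)
    index_bits = sets.bit_length() - 1  # valid because sets is a power of two
    print(f"{name}: {sets} sets, {index_bits} index bits")
```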

17 An Architectural Technique to Reduce the Power Consumption of the Issue Logic
- IQ + ROB are responsible for about 53% of power consumption
- The cache hierarchy is not the most important power consumption factor in the superscalar paradigm
- Power consumption is almost independent of the instruction mix
- TRENDS IN SUPERSCALAR
  - Increasing issue width ( IW )
  - The size of the instruction window grows more than linearly with IW
  - The area of the IQ grows more than linearly with the number of entries
  - The IQ power contribution may grow in the future

18 Every cycle, the wakeup logic broadcasts the result tags over the result buses to all the entries, and each entry compares them with its own tags to find a match
- THE ISSUE ENGINE SPENDS A LARGE AMOUNT OF POWER EVERY CYCLE ONLY TO CHECK WHETHER SOME INSTRUCTIONS ARE AVAILABLE FOR EXECUTION
- Considering:
  - In periods of execution with high parallelism, just a subpart of the IQ may satisfy the IW
  - In periods of execution with poor parallelism, some parts of the IQ may not provide any useful instruction ready to execute
- The issue engine is very power inefficient
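A rough count of tag comparisons shows why wakeup is costly. The sketch below assumes two source-operand tags per entry and four result tags broadcast per cycle (matching the 4-wide machine); neither number is stated on this slide, so treat both as assumptions.

```python
def wakeup_comparisons(entries, result_tags, operands_per_entry=2):
    """Tag comparisons performed in one cycle when every entry is woken up.

    operands_per_entry=2 is an assumption, not a figure from the slides.
    """
    return entries * result_tags * operands_per_entry


full_queue = wakeup_comparisons(entries=128, result_tags=4)
half_queue = wakeup_comparisons(entries=64, result_tags=4)
print(f"128 entries: {full_queue} comparisons per cycle")
print(f" 64 entries: {half_queue} comparisons per cycle")
```

Under these assumptions the comparison count scales linearly with the number of entries that take part in wakeup, which is the quantity a disabled zone in the queue would reduce.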

23 Dynamically Resizing the Instruction Queue
- We propose a run-time mechanism that adapts the size of the IQ based on its contribution to IPC
- We avoid the wake-up function in the parts that are temporarily disabled
- Resize decisions are commit based
- IQ implemented as a circular FIFO with head and tail pointers, no collapsing

24 What we do is...
- Partition the queue into 16 parts of 8 entries
- Define a new pointer for the queue, called the limit pointer
- At start time it has the same value as the head pointer, and it is updated together with the head pointer
- When a resize decision is made, an offset ( one portion ) is added to or subtracted from it
- The zone between the head and the limit pointer is the disabled zone ( no wake-up )
- If the tail grows beyond the limit, we still allow correct wake-up, and we stop insertion until the limit reaches the tail
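The pointer scheme above can be sketched as a small model. This is a simplification under stated assumptions rather than the authors' implementation: entries leave only from the head (the real queue issues out of order and does not collapse), at least one portion always stays enabled, and the class and method names are invented for illustration.

```python
PORTIONS = 16
PORTION_SIZE = 8
QUEUE_SIZE = PORTIONS * PORTION_SIZE  # 128 entries


class ResizableIssueQueue:
    """Circular FIFO with head, tail and limit pointers (simplified model)."""

    def __init__(self):
        self.head = 0      # position of the oldest entry
        self.count = 0     # entries between head and tail
        self.disabled = 0  # number of portions currently switched off

    @property
    def tail(self):
        return (self.head + self.count) % QUEUE_SIZE

    @property
    def limit(self):
        # Equal to the head when nothing is disabled; each disabled
        # portion moves it one portion further away from the head.
        return (self.head - self.disabled * PORTION_SIZE) % QUEUE_SIZE

    @property
    def enabled_entries(self):
        return QUEUE_SIZE - self.disabled * PORTION_SIZE

    def can_insert(self):
        # Insertion stalls while the tail is at or beyond the limit.
        return self.count < self.enabled_entries

    def insert(self):
        if not self.can_insert():
            return False
        self.count += 1
        return True

    def retire_oldest(self):
        if self.count == 0:
            return False
        self.head = (self.head + 1) % QUEUE_SIZE  # the limit moves with the head
        self.count -= 1
        return True

    def shrink(self):
        # Assumption: never disable the last remaining portion.
        if self.disabled < PORTIONS - 1:
            self.disabled += 1

    def grow(self):
        if self.disabled > 0:
            self.disabled -= 1


q = ResizableIssueQueue()
for _ in range(20):
    q.insert()
for _ in range(14):
    q.shrink()
print(q.enabled_entries, q.can_insert())  # 16 False
```

In the usage lines the queue holds 20 entries when it is cut down to 16 enabled entries, so the tail is already past the limit: the existing entries stay valid, but insertion is refused until enough old entries have retired.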

25 Heuristic to reduce size
- Collect statistics about the instructions committed from the youngest portion of the queue every time quantum ( 1000 cycles )
- We propose to insert a bit in each ROB entry that is set at dispatch time if the physical position of the instruction in the IQ is in the current youngest portion
- The resize decision is threshold-based >>> 0.025 of IPC in the current portion
- No limit to cut
Heuristic to increase size
- Blind >>> grow by one portion every 5 time quanta, and let the cut heuristic decide whether the decision was correct or not ( period of high parallelism or not )
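The two heuristics can be combined into one per-quantum decision, sketched below. The slides do not say how the two rules interact, so the ordering here (grow on every fifth quantum, otherwise test the shrink threshold) is an assumption, as are the function name and the reading of the threshold as "IPC contributed by the youngest portion below 0.025".

```python
QUANTUM_CYCLES = 1000  # length of one time quantum
IPC_THRESHOLD = 0.025  # minimum IPC contribution of the youngest portion
GROW_PERIOD = 5        # grow blindly every 5 quanta


def resize_decision(quantum_index, committed_from_youngest):
    """Return +1 to grow by one portion, -1 to shrink by one, 0 to keep.

    quantum_index counts completed quanta starting at 1.
    committed_from_youngest is the number of committed instructions whose
    ROB bit marked them as dispatched into the youngest portion.
    """
    if quantum_index % GROW_PERIOD == 0:
        return 1  # blind growth; later quanta may cut it back
    portion_ipc = committed_from_youngest / QUANTUM_CYCLES
    if portion_ipc < IPC_THRESHOLD:
        return -1
    return 0


for quantum, committed in [(1, 10), (2, 40), (5, 0)]:
    print(quantum, committed, resize_decision(quantum, committed))
```

If a blind growth step lands in a period of low parallelism, the newly enabled portion contributes little to IPC and the threshold test removes it again in a following quantum.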

26 Results

28 Conclusions
- Power consumption is a new constraint in the design of computer systems, alongside cost and performance
- The problem must be attacked from different levels of abstraction
- Power decisions must be made at early steps of the design
- There is a need for power estimation models and tools, especially at the architectural level

29 Q&A ?
