Competitive Power Power Overview Understanding Evaluation Methodology


1 Competitive Power
Agenda: Power Overview; Understanding Evaluation Methodology; Benefits of Low Power (Performance, Density, Package Options, Reliability); Conclusion.
This presentation from 1997 compares Xilinx power consumption with Altera's, using Altera's own data sheet specifications for the analysis.

2 Power Consumption Overview
Higher speed and density require more power, which leads to a higher junction temperature.
Junction temperature is limited to 125 °C for plastic packages and 150 °C for ceramic packages.
Power directly limits: system performance; design density; package options; device reliability.
[Figure: die temperature (100 °C, 125 °C, 150 °C) versus performance and density.]
Because of physical device and package limitations, a designer needs to pay attention to the maximum power a design dissipates. As power goes up, so does the die junction temperature. Most manufacturers specify 125 °C as the absolute maximum for plastic packages and 150 °C for ceramic. This limit creates a cliff beyond which the designer should not travel. The overall power consumption of the device affects several areas of operation: system performance, device density, package options, and device reliability. All of these areas are reviewed in further detail in the following foils.

3 Power History As devices get larger & faster, power goes UP
First-generation FPGAs had: lower performance; lower power requirements; no package power concerns.
High-density FPGAs have: much higher performance; higher power requirements; package power limit concerns.
[Figure: real-world design power consumption versus performance (MHz) for low-density and high-density devices, with the package power limit PMAX marked.]
In the past, designers did not need to worry about package power limits or how those limits might affect their overall design strategy. Device density and speed were low enough that even a full device running at full speed stayed well within the package power limit, so this was one design consideration the designer could count on being safe. Modern devices pack so much more density and speed into the same packages that the designer must now evaluate the power issues of the package. The FPGAs Xilinx offers are designed for lower power requirements, as the other foils show.

4 Estimating Power Is Complex
Power consumption is completely design dependent and is affected by:
System performance (switching frequency)
Design density (number of interconnects)
Design activity (% of interconnects switching)
Logic block and interconnect structure
Supply voltage
No single benchmark can emulate all design conditions realistically; benchmarks must be used as a guideline only.
Power consumption within a device depends on the design implemented as well as on the characteristics of the device itself. It depends on the clock speed of the design, the number of routing lines used, and the percentage of those lines that toggle on each clock cycle. Because there is no way to simulate all design configurations and conditions, we are forced to use benchmarks to get a rough estimate of how the device performs. The benchmark the industry uses is a device filled with 16-bit counters with no loads. This 16-bit counter benchmark tends to be at the low end of the design range, so we also plot 8-bit counter benchmarks, which tend to represent the high end of the design range.

5 Device Power Comparison
To compare device power between different architectures, the counter methodology is used:
Fill the device with 16-bit counters with no loads
Measure active ICC versus frequency
Solve the equation for “K”
Both Xilinx devices and Altera's 10K50 were measured to verify the methodology.
The resulting Xilinx “K” factors are directly comparable with the Altera “K” factors: ratio of “K” factors = ratio of current!
The Altera current methodology is used to derive the Xilinx K factors, which can then be compared directly with the K factors of the Altera devices. The Xilinx devices and an Altera 10K50 were all measured using the Altera methodology. The 10K50 was measured to confirm that we were applying the methodology the same way Altera had: the K factor derived from our 10K50 measurements was nearly identical to the K specified in the Altera data sheet.

6 Altera’s Methodology for Estimating Active Power
Active ICC = K × fMAX × N × %TOG
ICC = active ICC in µA
K = scale factor in µA/(MHz × LE)
fMAX = maximum operating frequency in MHz
N = number of logic cells (LEs or 1/2 CLBs) used in the design
%TOG = percent of logic cells toggling
The “K” factor is a direct indicator of device architecture power efficiency. The equation gives a rough estimate of active ICC only.
The equation presented here is the one Altera gives in its data sheets. It gives a rough estimate of the active current expected, based on the device “K” factor and the customer's ability to estimate the percentage of lines toggling. Xilinx does not recommend this methodology for actual device power estimation, because it accurately reflects the current only for the exact condition in which it was measured (a device full of 16-bit counters). Comparing devices is nevertheless easy: look at the ratio of the device K factors, remembering that the lower the K factor, the lower the current.
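As a minimal sketch, the equation on this foil can be written as a small Python function. The function name and the input values below are hypothetical and chosen only to make the arithmetic easy to follow; they are not data sheet figures.

```python
def active_icc_ua(k_ua_per_mhz_le, f_max_mhz, n_logic_cells, toggle_fraction):
    """Rough active ICC in microamps: K x fMAX x N x %TOG.

    toggle_fraction is %TOG expressed as a fraction (12.5% -> 0.125).
    """
    return k_ua_per_mhz_le * f_max_mhz * n_logic_cells * toggle_fraction


# Hypothetical inputs for illustration only (not taken from any data sheet).
example_icc_ua = active_icc_ua(
    k_ua_per_mhz_le=100.0,
    f_max_mhz=50.0,
    n_logic_cells=1000,
    toggle_fraction=0.125,
)
print(example_icc_ua / 1000.0, "mA")
```

Because every term is a simple multiplier, two devices evaluated at the same frequency, cell count, and toggle rate differ in estimated current only by the ratio of their K factors.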

7 Altera 10K “K” Factor (Current) Nearly Twice Xilinx 4KEX
“K” factor comparison: low “K” is low current.
The “K” factor is directly related to device current and depends on architecture, process, and the user's circuit design.
At high densities a low “K” factor is mandatory!
Ratio of “K” factors = ratio of current!
This summary table compares the Altera “K” factors with those of several Xilinx devices. In every case the Xilinx K factor for comparable devices is lower than Altera's, most of the time by a factor greater than 2:1.
*Source: Altera K factors from Flex8K and Flex10K data sheets.

8 Xilinx vs. Altera FLEX Interconnect Technology
[Figure: Xilinx FPGA architecture and Altera FLEX architecture, each showing logic blocks 1 to 4 joined by nets A, B, and C; “segmented” interconnect lines on the Xilinx side, “non-segmented” interconnect lines on the Altera side.]
Xilinx: variable-length interconnect lines; 1 segmented line required to connect 4 logic cells; lower capacitance on short lines means lower power.
Altera: fixed-length interconnect lines; 3 single signal lines required to connect 4 logic cells; higher capacitance on each net means higher power.
Why do we know that we will require less current than the Altera devices? There is a fundamental difference between the two architectures: we use a segmented architecture, while Altera uses a non-segmented architecture, sometimes called “Long Line”. The segmented architecture allows the logic to be interconnected with relatively short interconnects on average, while the Long Line architecture always uses a whole long line to interconnect the logic. The shorter the interconnect, the lower the capacitance; the lower the capacitance being switched, the lower the current. It's that simple!

9 Understanding 8/16-Bit Counter Benchmarks
Benchmark fills the device with 8-bit or 16-bit counters.
Predictable number of lines toggling: 12.5% for a 16-bit counter; 25% for an 8-bit counter.
Many real designs: 12% to 25%.
Package power limit is determined by the maximum junction temperature and the package thermal resistance (θJA).
[Figure: real-world design power consumption versus performance (MHz), bounded by the 8-bit counter and 16-bit counter curves, with the package power limit PMAX marked.]
The Altera current methodology uses 16-bit counters, and the Altera literature indicates that 12.5% toggling (from a 16-bit counter) is normal in most designs. Xilinx, however, finds that this is on the low side of the actual range of currents seen in real application designs. We therefore also show the upper portion of the range, using the 25% toggling current (from an 8-bit counter), to give a more realistic current range. The Altera current equation is plotted for ICC with the number of LEs and the toggle percentage held constant while fMAX is varied.

10 Xilinx 4KXL >3X Faster than Altera 10K
[Figure: typical power consumption by frequency. Current (mA, 1000 to 2000) versus performance (20 MHz to 100 MHz) for the 10K100 and the XC4062XL, with the package limit marked. Package = 503/475 PGA.]
Presented here is a comparison of the big Xilinx and Altera devices. As you can see, the Altera 10K100 is of limited use without a fan, whereas the Xilinx XC4062XL has no problem working in a small, low-cost plastic package.

11 Xilinx 4KEX 2X Faster than Altera 10K
[Figure: typical power consumption by frequency. Current (mA, 500 to 1000) versus performance (20 MHz to 60 MHz) for the 10K50 and the XC4036EX, with the package limit marked. Package = 240 HQFP.]
This foil gives a feel for how maximum performance is limited as the device current runs up to the package power limit. As you would expect from the K multipliers presented in the earlier foil, the Altera 10K50 tops out at about half the performance that the 4KEX reaches.

12 Xilinx 4KXL 1.5X Faster than Altera 10KV
[Figure: typical power consumption by frequency. Current (mA, 500 to 1000) versus performance (25 MHz to 100 MHz) for the XC4036XL and the 10K50V, with the package limit marked. Package = 208 HQFP.]
This foil gives a feel for how maximum performance is limited as the device current runs up to the package power limit. As you would expect from the K multipliers presented in the earlier foil, the Altera 10K50V tops out at about two thirds of the performance that the 4KXL reaches.

13 Xilinx 4KE 1.5X Faster than Altera 10K
[Figure: typical power consumption by frequency. Current (mA, 500 to 1000) versus performance (20 MHz to 60 MHz) for the XC4025E and the 10K40, with the package limit marked. Package = 240 HQFP. Full 8-bit and 16-bit counter benchmarks used for the range.]
This foil gives a feel for how maximum performance is limited by the package power limit for a 4KE versus a 10K. As you would expect from the K multipliers presented in the previous foils, the 4KE requires only 69% (72/104) of the current that the Altera 10K requires, which gives the 4KE a 44% higher performance limit.
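The 69% and 44% figures follow directly from the K factor ratio. The sketch below rearranges the Altera equation for the frequency at which a device reaches a given current limit; the function name, the current limit, the cell count, and the toggle rate are hypothetical placeholders, and only the K values 72 and 104 come from this foil.

```python
def max_frequency_mhz(i_limit_ua, k_ua_per_mhz_le, n_logic_cells, toggle_fraction):
    """Frequency at which K x f x N x %TOG reaches the current limit."""
    return i_limit_ua / (k_ua_per_mhz_le * n_logic_cells * toggle_fraction)


K_4KE = 72.0    # K factor quoted on the foil for the 4KE
K_10K = 104.0   # K factor quoted on the foil for the Altera 10K

# Same design and same frequency: current scales with K.
current_ratio = K_4KE / K_10K            # about 0.69

# Same design and same current limit: frequency limit scales with 1/K.
frequency_gain = K_10K / K_4KE - 1.0     # about 0.44

# Hypothetical limit, cell count, and toggle rate, used only to show
# that the ratio does not depend on them.
f_4ke = max_frequency_mhz(1_000_000.0, K_4KE, 2000, 0.125)
f_10k = max_frequency_mhz(1_000_000.0, K_10K, 2000, 0.125)
print(round(current_ratio, 2), round(frequency_gain, 2), round(f_4ke / f_10k, 2))
```

The comparison assumes both devices are held to the same package current limit and run the same benchmark, which is the condition under which the foil's graph is drawn.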

14 Design Density Comparison
Another way of looking at the solution is to evaluate how much logic can be placed in a device while achieving a specific performance goal.
The density vs. frequency plots provide a quick way to estimate the capacity of a device/package combination.
The next graph presents the device/package limit in a slightly different light. We again solve the Altera equation for ICC, but this time we hold fMAX constant and vary the number of LEs used.

15 Xilinx 4KEX 2X Logic Capacity Of Altera 10K
[Figure: maximum density at 50 MHz. Current (mA, 500 to 1000) versus density (1000 to 3000 LCs) for the 10K50 and the XC4036EX, with the package limit marked. Max LCs for device: 10K50 = 2880, XC4036EX = 3078. Package = 240 HQFP.]
This foil gives a different point of view. I have selected a performance level (50 MHz) and plotted the maximum density each device can achieve while operating at that level. What you can see in this foil is that, for comparable-power quad packages, the Altera solution can achieve only one half the number of logic cells of the Xilinx solution! Even when the Altera device is put in a PGA, it can achieve only two thirds the number of LCs that the Xilinx solution offers.

16 Xilinx Lower Power Offers More Package Options
FLEX 10K100 is available only in PGA 503!
Power dissipation is unacceptable for other packages
Altera is forced to specify the package with a fan and heat sink attached
Who uses PGAs in production? They are expensive and large
XC4062XL is available in many package configurations: HQ240, BG432, BG560, PG475
Altera's power problems are bad enough that the 10K100 is available only in a PGA package and, to be useful, must have a fan on it! How many large customers are willing to go to production with this configuration? Xilinx, on the other hand, offers many package options for the XC4062XL with its much lower power requirements.

17 Xilinx Lower Power = Higher Reliability
FIT rates
FIT: failures in 10^9 device hours
The lower the FIT rate, the better
As junction temperature increases, the FIT rate increases dramatically
Device reliability decreases rapidly as junction temperature rises above 100 °C
FIT rate acceleration is based on an activation energy of 0.9 eV; base FIT rates are from Xilinx and Altera Quality Reports.
Power dissipation greatly affects every device's reliability. This chart provides a table showing how the device FIT rate changes with junction temperature. At 70 °C the table gives the quoted device FIT rate (10 for Xilinx, 50 for Altera), and as the junction temperature increases the FIT rate goes up by a little more than 2:1 for every 10 °C. This 2:1 increase is a fair rule of thumb to use when estimating changes in FIT rate. Keep in mind that the FIT rate defines how many failures we may have in 10^9 device hours. The lower the rate, the better!
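The foil states only that the acceleration is based on an activation energy of 0.9 eV. The sketch below assumes the usual Arrhenius temperature-acceleration model, which the foil does not name explicitly; the function names are hypothetical, and the base rates (10 and 50 FIT at 70 °C) are the ones quoted in the notes.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K


def acceleration_factor(t_ref_c, t_c, ea_ev=0.9):
    """Arrhenius acceleration factor between two junction temperatures (deg C).

    Assumes the Arrhenius model; the foil only states Ea = 0.9 eV.
    """
    t_ref_k = t_ref_c + 273.15
    t_k = t_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_ref_k - 1.0 / t_k))


def fit_at(base_fit, t_ref_c, t_c, ea_ev=0.9):
    """Scale a base FIT rate quoted at t_ref_c to junction temperature t_c."""
    return base_fit * acceleration_factor(t_ref_c, t_c, ea_ev)


# Per-10-degree factor near 70 deg C, for comparison with the 2:1 rule of thumb.
print(round(acceleration_factor(70.0, 80.0), 2))
# Base rates from the notes, scaled to higher junction temperatures.
print(round(fit_at(10.0, 70.0, 80.0)), round(fit_at(50.0, 70.0, 120.0)))
```

Under these assumptions the factor per 10 °C near 70 °C comes out a little above 2, in line with the rule of thumb in the notes. The table on the original foil may have used slightly different constants or rounding, so small differences from its entries are to be expected.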

18 Xilinx Lower Power Offers 360X Better Reliability
A design which does not stress any of the device limits:
Altera 10K50 in 240 RQFP and Xilinx 4036EX in 240 HQFP, run under the following conditions:
bit counters in device at 30 MHz system speed
5.0 V supply, 50 °C ambient temperature
The Altera device runs at a 125 °C junction temperature, the Xilinx device at 70 °C
All values are within the device maximum limits!
Reliability evaluation: the Xilinx solution gives 360x better reliability (3600/10); the Xilinx 4036XL will give 1200x better reliability.
I have presented an example design here to help illustrate the power advantage that Xilinx offers over Altera. We put exactly the same design in a 10K50 and a 4036EX and calculate the junction temperature for each under the same external conditions. The results are the 10K50 running at 120 °C and the Xilinx 4036EX running at 80 °C. When we look up the FIT rates at these temperatures, you will notice a HUGE difference: 2398 to 23! Do you think you might have a customer who should be worried about 100 times higher failure rates with the Altera devices? This is even with the Altera device using the much more expensive PGA package; the design will not work in a plastic package! Remember that the component engineer at many companies will be very interested in this information!

19 Conclusion
Xilinx consistently provides superior performance limits while using low-cost plastic packages.
Xilinx XC4000E and EX families offer the best power/speed/density tradeoffs in the industry.
Xilinx XC4000XL delivers industry-leading performance at very high densities.
Using Altera's own methodology: Xilinx XC4000EX devices dissipate 1/2 the power of Altera Flex 10K parts; 4KXL devices draw less than 1/3 the power of Flex 10K parts.

20 Appendix

21 How do you get 12% toggle rate for a 16-Bit Counter?
16-bit counter example
[Figure: 16-bit counter with outputs Q0, Q1, Q2, ... Q13, Q14, Q15.]
Q0 toggles every clock period; it contributes 1/16
Q1 toggles every 2nd clock period; it contributes 0.5/16
Q2 toggles every 4th clock period; it contributes 0.25/16
Q3 toggles every 8th clock period; it contributes 0.125/16
Q4 ... Q15 continue the same pattern
Sum all contributions to get the total.
Note: for an 8-bit counter, each contribution is x/8.
This foil is included to explain where the 12.5% and 25% numbers come from when using counters. Each bit's toggle rate is averaged over the size of the counter, so the average percentage is simply the sum of the individual bit contributions.
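The summation described on this foil can be checked with a few lines of Python. This is a minimal sketch; the function name is hypothetical.

```python
def average_toggle_rate(n_bits):
    """Average fraction of bits toggling per clock in an n-bit binary counter.

    Bit i toggles once every 2**i clocks, so it contributes (1 / 2**i) / n_bits.
    """
    return sum(0.5 ** i for i in range(n_bits)) / n_bits


for bits in (16, 8):
    print(bits, "bits:", round(100.0 * average_toggle_rate(bits), 2), "%")
```

The per-bit toggle rates sum to just under 2, so the average is just under 2/n: about 12.5% for 16 bits and about 25% for 8 bits.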

22 Determining Max Package Current
Find the package θJA from the data sheet
Determine the maximum ambient temperature (TA in °C)
Determine the maximum junction temperature (TJ in °C)
Calculate the maximum package current: IMaxPkg = [(TJ - TA) / θJA] / VCC
Because the presentation uses graphs plotted with current on the Y axis, the package power limit must be determined and converted to a maximum current. The equation is fairly simple: determine the θJA of the package you want to use, then plug in the maximum junction and ambient temperatures and the device supply voltage, and you get the current.
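A minimal sketch of this calculation follows. The function name and the example values (in particular the thermal resistance of 15 °C/W) are hypothetical and chosen for round numbers; real values must come from the package data sheet.

```python
def max_package_current_a(tj_max_c, ta_max_c, theta_ja_c_per_w, vcc_v):
    """IMaxPkg = [(TJ - TA) / thetaJA] / VCC, returned in amps."""
    p_max_w = (tj_max_c - ta_max_c) / theta_ja_c_per_w  # package power limit in watts
    return p_max_w / vcc_v


# Hypothetical example: 125 C junction limit, 50 C ambient, 15 C/W, 5.0 V supply.
i_max_a = max_package_current_a(125.0, 50.0, 15.0, 5.0)
print(i_max_a * 1000.0, "mA")
```

The intermediate value is the package power limit in watts; dividing by the supply voltage converts it to the current limit drawn as the horizontal line on the comparison graphs.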

23 Understanding the Comparison Graphs
[Figure: annotated example graph, typical power consumption by frequency, with current (mA) on the Y axis and performance on the X axis. Package = 240 HQFP.]
Package limit: the maximum current the package can handle without exceeding the maximum junction temperature.
Shaded range between the 8-bit limit and the 16-bit limit: the range of currents expected in most real designs, with the upper limit defined by 8-bit counters and the lower limit by 16-bit counters.
Notes about the graph conditions appear below the performance axis.
Each of the comparisons is presented in graph form. Each graph displays the data in a similar fashion, as highlighted here.

