# Tunable Sensors for Process-Aware Voltage Scaling

Tunable Sensors for Process-Aware Voltage Scaling
Tuck-Boon Chan‡ and Andrew B. Kahng†‡. CSE† and ECE‡ Departments, UCSD.

Good morning everyone. The title of my presentation is Tunable Sensors for Process-Aware Voltage Scaling. This work was done by me and my advisor, Prof. Andrew Kahng, at UCSD.

Outline
- Intro: Adaptive Voltage Scaling (AVS)
- Overview of Proposed Method
- Voltage Scaling Properties
- Designing the Circuit
- Results

In this presentation, I will start with an introduction to adaptive voltage scaling (AVS). Then, I will describe our process-aware voltage scaling method and some observations on voltage scaling properties. Finally, I will present our circuit and some simulation results.

Adaptive Voltage Scaling
Figure: maximum frequency vs. voltage, with curves for the worst-case scenario (e.g., due to process variation) and a typical chip, the margin between them at Vnominal, and the annotation "reduce voltage → meet performance with less power".

- Circuits are designed to guardband for performance variation
- There is margin for typical chips
- Adaptive voltage scaling (AVS) adjusts voltage to reduce power

This figure shows the frequency vs. voltage tradeoff curve of a chip. The chip is designed to operate at a target frequency with a nominal operating voltage under the worst-case performance variation scenario. However, a typical chip is likely to operate at a higher frequency at the same nominal voltage. This means that there is a margin in terms of frequency. In other words, the chip is overdesigned. In this case, we can use adaptive voltage scaling to reduce the voltage so that the chip meets the performance target with less power.
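As a rough illustration of why a lower voltage saves power, the sketch below uses the standard first-order dynamic power relation P = a·C·V²·f. The function name and the example voltages are illustrative assumptions, not values from the talk.

```python
def dynamic_power_ratio(v_scaled, v_nominal):
    """Ratio of dynamic power at v_scaled to dynamic power at v_nominal.

    Assumes P_dyn = a * C * V^2 * f with activity, capacitance, and
    frequency held constant, so only the V^2 term changes.
    """
    return (v_scaled / v_nominal) ** 2


# Hypothetical example: a typical chip meets the target frequency at 0.9 V
# instead of the 1.0 V nominal supply.
ratio = dynamic_power_ratio(0.9, 1.0)
print(f"dynamic power saving: {(1 - ratio) * 100:.0f}%")
```

Because the dependence is quadratic, even a small voltage margin translates into a noticeable dynamic power reduction.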

Taxonomy of AVS Techniques
Slide diagram: AVS classes and their approaches, arranged along a power axis.

- Open-loop AVS (frequency and Vdd LUT; post-silicon characterization)
  - AVS: pre-characterized LUT [Martin02]
  - Process-aware AVS: post-silicon characterization [Tschanz03]
- Closed-loop AVS (process- and temperature-aware AVS; generic monitor, design-dependent replica, in-situ monitor)
  - Generic on-chip monitor [Burd00]
  - Design-dependent monitor [Elgebaly07, Drake08, Chan12]
  - In-situ performance monitor: measures actual critical paths [Hartman06, Fick10]
- Error-tolerance AVS (error detection system)
  - Error detection and correction system: Vdd scaling until an error occurs [Das06, Tschanz10]

There are many existing AVS techniques, and they can be broadly classified into three categories: open-loop AVS, closed-loop AVS, and error-tolerance AVS. Open-loop AVS typically adjusts the voltage based on a pre-characterized lookup table. To achieve more power reduction, closed-loop AVS uses on-chip monitors to measure the performance variation of the chip and adjusts the voltage accordingly. For further power reduction, error-tolerance AVS reduces the voltage aggressively until an error occurs. In this work, we focus on closed-loop AVS because it offers better power reduction than open-loop AVS and is easier to design than error-tolerance AVS.

Motivation for Closed-Loop AVS
Figure: [Hartman06]

- Closed-loop AVS saves up to 62% dynamic power

For example, this figure shows the power consumption of a chip with open-loop or closed-loop AVS. The measurement results show that closed-loop AVS can save 62% more power than an open-loop AVS with two voltage levels.

Classes of Closed-Loop AVS
- Generic monitor: does not capture design-specific performance variation
- Design-dependent replica and in-situ monitor: critical path may be difficult to identify (IP from 3rd party); calibrating monitors at multiple modes/voltages requires long test time
- This work: tunable monitor for closed-loop AVS
  - Can be applied as a generic monitor
  - Or tuned to capture design-specific performance

A closed-loop AVS can be implemented with different monitors. For example, we can use a generic monitor, such as an inverter-based ring oscillator, to measure the performance variation of a chip. But this monitor does not capture the design-specific performance variation. To capture it, we may use design-dependent replica circuits or in-situ monitors. However, these monitors usually require information on critical paths, which may be difficult to identify. Also, calibrating monitors at multiple modes and voltages requires long test time. In this work, we propose a method to design a closed-loop AVS monitor that can be used as a generic monitor or tuned for a specific design when silicon samples are available.

Voltage Scaling Key Concepts
Figure: maximum frequency vs. voltage, with labels "Process distance", "k", "Scaling rate", and "SS".

- Process distance: process-induced frequency shift relative to the target frequency
- Scaling rate: frequency shift (Δf) per unit voltage difference (ΔV)
- Vmin: minimum Vdd to meet the target frequency, calculated from process distance and scaling rate

Before describing our monitoring method, I would like to define a few voltage scaling properties. First, the process distance of a chip at a process condition k is defined as the process-induced frequency shift relative to the target frequency of the chip. Second, the scaling rate of the chip is defined as the frequency shift per unit voltage difference. As we can see, by using the process distance and scaling rate, we can calculate Vmin, which is the minimum operating voltage that meets the target frequency.
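To make the relationship between the three quantities concrete, here is a minimal sketch that assumes frequency varies linearly with voltage near the nominal point. The linearization, the function name, and the example numbers are assumptions for illustration; the talk only states that Vmin is calculated from process distance and scaling rate.

```python
def vmin_from_scaling(f_at_vnom, f_target, scaling_rate, v_nominal=1.0):
    """First-order Vmin estimate.

    process_distance: frequency shift at the nominal voltage relative to
        the target (positive means the chip is faster than required).
    scaling_rate: frequency change per volt (same frequency unit per V).
    Assumes a linear frequency-voltage relation around v_nominal.
    """
    process_distance = f_at_vnom - f_target
    return v_nominal - process_distance / scaling_rate


# Hypothetical chip: 800 MHz at 1.0 V, 700 MHz target, 1000 MHz/V scaling rate.
vmin = vmin_from_scaling(f_at_vnom=800.0, f_target=700.0, scaling_rate=1000.0)
print(f"Vmin = {vmin:.3f} V")
```

A larger process distance or a larger scaling rate both change how far the supply can be lowered, which is why the two properties together determine Vmin.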

Monitor Design Concept
- Use Vmin of ring oscillator (RO) as a reference
- Design ROs with worst-case voltage scaling properties → an arbitrary circuit will meet target frequency at Vmin_ro
- Max. Vmin of ROs > Max. Vmin of paths

Figure: frequency vs. voltage curves of ROs and critical paths at process corner A and process corner B.

In this work, we propose to use the Vmin of a ring oscillator as a reference for voltage scaling. Our goal is to design the ring oscillators with the worst-case voltage scaling properties, such that any arbitrary circuit can meet the target frequency at the Vmin of the ring oscillator. For example, this figure shows that at process corner A, the Vmin of an RO is larger than the maximum Vmin of the critical paths. However, at a different process corner, the Vmin of the RO is smaller than the Vmin of the critical paths. To make sure that the Vmin of the ROs is always larger, we can add another RO that has a larger Vmin at process corner B, and then scale the operating voltage of the chip to the maximum Vmin of the ROs.
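The two-corner example can be sketched as follows. All Vmin values and the names `RO1` and `RO2` are hypothetical; the point is only that the chip voltage is set to the maximum Vmin over the ROs, which must exceed the maximum Vmin over the critical paths at every corner.

```python
def chip_voltage(ro_vmins):
    """Operating voltage chosen as the maximum Vmin over all ROs."""
    return max(ro_vmins.values())


def covers_paths(ro_vmins, path_vmins):
    """True if the RO-based voltage exceeds the Vmin of every critical path."""
    return chip_voltage(ro_vmins) > max(path_vmins)


# Hypothetical Vmin values in volts. RO1 alone is sufficient at corner A
# but not at corner B; RO2 covers corner B.
corners = {
    "A": {"ro": {"RO1": 0.92, "RO2": 0.88}, "paths": [0.90, 0.87]},
    "B": {"ro": {"RO1": 0.85, "RO2": 0.91}, "paths": [0.89, 0.86]},
}

for name, data in corners.items():
    ok = covers_paths(data["ro"], data["paths"])
    print(f"corner {name}: Vdd = {chip_voltage(data['ro']):.2f} V, safe = {ok}")
```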

Proposed Method: Tunable Monitor
- Scenario 1: without circuit information
  - Configure RO for worst-case Vmin
  - Guardband for arbitrary circuits
  - Store config.
- Scenario 2: with chips at process corners
  - Extract Fmax and Vmin of chips
  - Tune voltage scaling properties of ROs so that Vmin_ro > Vmin_chip
  - Recover margin with one calibration
- Our focus is on voltage scaling property → analyze worst-case voltage scaling

Our tunable RO can be used in different scenarios. In the first scenario, where we do not have any circuit information, we can configure the RO for worst-case Vmin and guardband for arbitrary circuits. The configuration is then stored in the chip. In the second scenario, where we have chips sampled at different process corners, we can extract the Fmax and Vmin of the chips. Then, based on the extracted Vmin, we can tune the voltage scaling properties of the ROs so that the Vmin of the ROs is larger than the Vmin of the chips. This method can recover the voltage margin of scenario one with a single calibration. In this work, we focus on voltage scaling properties. This is a key difference from previous methods because it allows us to analyze the worst-case voltage scaling.

Problems
- Goal: Vmin_ro > Vmin_path
- Questions:
  - Given a process technology, what is the range of the Vmin that is defined by process distance and scaling rate for arbitrary critical paths?
  - What circuit techniques can "tune" Vmin?

Figure: frequency vs. voltage curves of arbitrary critical paths (Path A, Path B, Path C), with the annotation "Vmin = ?". Also, Vmin changes at different process corners.

As a short summary, our goal is to design the ROs such that the Vmin of the ROs is larger than the Vmin of the critical paths. To achieve this goal, we need to answer two questions. First, given a process technology, what is the range of the Vmin that is defined by the process distance and scaling rate? For example, we can see that path A defines a Vmin, but the Vmin can be different depending on the critical path. The Vmin can also be different at a different process corner. Second, what kind of circuit technique can be used to tune the Vmin of an RO?

Vmin Analytical Derivation
- Equation (1): Vmin, with its terms labeled as process distance and scaling rate
- Equation (2): fpath = inverse of average delays of NMOS & PMOS
- Equation (3): calculate delays with Elmore delay model and effective currents of transistors

To understand the voltage scaling properties, we derive the analytical expression for the Vmin of an inverter chain. As mentioned earlier, Vmin is a function of process distance and scaling rate. To calculate the process distance, we need to model the frequency shift at a process condition k. To model the scaling rate, we need to model the frequency shift per unit voltage change. In this derivation, we model the frequency as the inverse of the average delays of NMOS and PMOS. Then, we calculate the delays with the Elmore delay model and the effective currents of the transistors at different process conditions and voltages.
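The following sketch shows how such a model can be evaluated numerically. It is not the derivation from the talk: it assumes an alpha-power-law effective current, a lumped load capacitance in place of a full Elmore model, and a bisection search for Vmin. All parameter values, dictionary keys, and function names are hypothetical.

```python
def stage_delay(vdd, vth, drive, c_load=1.0, alpha=1.3):
    """Delay ~ C * Vdd / I_eff, with I_eff = drive * (Vdd - Vth)**alpha (assumed)."""
    return c_load * vdd / (drive * (vdd - vth) ** alpha)


def path_freq(vdd, nmos, pmos):
    """Frequency modeled as the inverse of the average NMOS and PMOS delay."""
    d_n = stage_delay(vdd, nmos["vth"], nmos["drive"])
    d_p = stage_delay(vdd, pmos["vth"], pmos["drive"])
    return 1.0 / (0.5 * (d_n + d_p))


def find_vmin(f_target, nmos, pmos, lo=0.6, hi=1.2, tol=1e-6):
    """Bisection for the lowest Vdd in [lo, hi] that meets f_target.

    Relies on frequency increasing monotonically with Vdd, which holds
    for this model when Vdd > Vth and alpha >= 1.
    """
    if path_freq(hi, nmos, pmos) < f_target:
        raise ValueError("target frequency is not reachable at or below hi")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if path_freq(mid, nmos, pmos) >= f_target:
            hi = mid
        else:
            lo = mid
    return hi


# Hypothetical devices: the target is set at a slow corner and 1.0 V,
# then Vmin is evaluated for a faster (lower Vth, higher drive) condition.
slow = {"vth": 0.45, "drive": 1.0}
typical = {"vth": 0.40, "drive": 1.1}
f_target = path_freq(1.0, slow, slow)
print(f"Vmin at the faster condition: {find_vmin(f_target, typical, typical):.3f} V")
```

Sweeping `drive`, `c_load`, or the NMOS/PMOS balance in a model of this kind is one way to explore how sensitive Vmin is to each parameter.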

Vmin Sensitivity
Figure: Vmin sweep results, bounded by the curves "Vmin for PMOS only" and "Vmin for NMOS only".

- Vmin is not very sensitive to fanout, interconnect load, etc.
- Empirically, bounds on Vmin are determined by NMOS and PMOS

With the analytical Vmin model, we sweep the fanout of the inverter, the drive strength of NMOS and PMOS, the beta ratio, and the interconnect length. Our study shows that Vmin is not very sensitive to these parameters, even when they are perturbed from 0.2X to 4X of their nominal values. Further, the empirical results show that the bounds on Vmin are determined by the artificial scenarios where only the delay of PMOS or only the delay of NMOS is considered. This suggests that Vmin is related to the device characteristics.

Effects of Fanout and Series Resistance
- Fanout has little effect on Vmin
- High series resistance reduces Vmin → but needs long wires

To validate the analytical model, we simulate ring oscillators with different configurations using SPICE. In the first experiment, the simulation results show that Vmin is not sensitive to fanout across different process corners. In the second experiment, we can see that Vmin decreases when the series resistance increases. However, this is unlikely to happen in a typical circuit because large resistance due to long wires is usually avoided with buffer insertion.

Effects of Cell Type
- Cell type affects Vmin
- Maximum Vmin at different corners is determined by different cell types
- Stacking causes cell delay biased to PMOS or NMOS → changes device characteristics and Vmin

In the third experiment, we simulate ring oscillators with different cell types. From the results, we can see that cell type affects Vmin. Further, we observe that the maximum Vmin at different corners is determined by different cell types. For example, the maximum Vmin at the fast-fast corner is determined by NOR gates, but the maximum Vmin at the slow-fast corner is determined by NAND gates. We can also see that Vmin is affected by stacking. This is because stacking biases the cell delay toward PMOS or NMOS, and at the same time the stacking effect changes the device characteristics.

Effects of Cell Strength
- Cell layout changes device characteristics and Vmin
- Vmin does not increase from X1 to X3, but increases from X0 to X1
- X1 to X3 → {1,2,3} fingers, same device characteristics
- X0 to X1 → both 1 finger but different diffusion area

In the fourth experiment, we study the effect of cell strength. The simulation results show that Vmin does not change when the cell size is increased from X1 to X3. However, Vmin increases when the cell size is increased from X0 to X1. To understand this, we studied the layout of the cells and found that the X1 to X3 cells are similar except that the transistors have different numbers of fingers. Thus, the device characteristics are similar. The X0 and X1 cells both have a single finger, but their diffusion areas are different. This study suggests that the cell layout can change the device characteristics and affect Vmin.

Design of RO with Tunable Vmin
- Identified two circuit knobs to tune Vmin: series resistance and cell types (INV, NAND, NOR)
- Proposed circuit: ROs with different cell types (worst-case Vmin is determined by different cells at different process corners)
- Tune Vmin → a configurable series resistance at each stage

Figure labels: "1 bit", "Control pins", "High resistance", "Low resistance".

Based on the analysis, we have identified two circuit knobs to tune Vmin: series resistance and cell type. To construct a tunable monitor, we use ROs with different cell types because the maximum Vmin is determined by different cells at different process corners. To tune the Vmin, we insert a configurable series-resistance passgate at each stage of the ROs. For instance, when all series passgates are set to low resistance, the Vmin of the RO is large, and vice versa.

Tunability
Figure: Vmin vs. percentage of high-resistance passgates (INVX3).

- Vmin decreases linearly with % high-resistance passgates
- ROs with different gate types have similar trend

To validate the voltage scaling properties of our tunable RO, we simulate the RO with SPICE and tune the configuration of the series passgates. The simulation results show that Vmin decreases linearly as the percentage of high-resistance passgates increases. We observed similar trends for different gate types.

Experiment Methodology
- Goal: validate PVS ROs in simulation
  - Check Vmin of ROs vs. Vmin of paths with arbitrary circuits and process variation
- Experiment setup:
  - 65nm industrial technology
  - Implement 3 testcases (arbitrary circuits)
  - Implement 3 tunable ROs (INV, NAND, NOR)

| Testcase | Power (mW) | Area (mm2) | Freq. target |
|---|---|---|---|
| FPU | 4.1 | 0.015 | 710 |
| TLU | 438.0 | 0.098 | 507 |
| MUL | 19.8 | 0.050 | 1042 |

In this work, we set up an experiment to validate the process-aware voltage scaling ROs in simulation. Specifically, we want to check the Vmin of the ROs against the Vmin of critical paths with arbitrary circuits and process variation. In this experiment, we use a 65nm industrial technology to implement 3 testcases and 3 tunable ROs with different gate types. The testcases are submodules of an OpenSparc processor. To test our methodology, we implement the submodules with different target frequencies.

Process Variation Setup
- Simulate critical paths and ROs with SPICE
- 200 Monte Carlo samples (global variation)
- 4 variation sources, Gaussian distributions
- Difference between slow and fast corners defines +/- 3 sigma values of variation sources

| Variation source | Mean +/- 3 sigma |
|---|---|
| NMOS Vth | 30mV |
| PMOS Vth | |
| Channel length | 5nm |
| Gate oxide thickness | 0.06nm |

To emulate process variation, we simulate the critical paths extracted from the testcases and the ROs with SPICE. In this experiment, we use 200 Monte Carlo samples. We assume there are 4 variation sources and that the variations follow Gaussian distributions. We also assume that the difference between the slow and fast corners defines the plus-minus three sigma values of the variation sources.
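A minimal sketch of drawing the global-variation samples is shown below, assuming independent zero-mean Gaussian offsets with sigma equal to one third of the listed 3-sigma value. The function name, dictionary keys, and seed are hypothetical, and the PMOS Vth value is a placeholder assumption because it does not appear in the transcript. Feeding each sample to SPICE is outside this sketch.

```python
import random


def sample_global_variation(three_sigma, n_samples=200, seed=0):
    """Draw independent Gaussian offsets for each variation source.

    three_sigma maps a source name to its 3-sigma value; each offset is
    drawn from N(0, (three_sigma / 3)^2). One dict is returned per sample.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        samples.append(
            {name: rng.gauss(0.0, ts / 3.0) for name, ts in three_sigma.items()}
        )
    return samples


three_sigma = {
    "nmos_vth_mV": 30.0,
    "pmos_vth_mV": 30.0,  # ASSUMPTION: placeholder, value not given in the transcript
    "channel_length_nm": 5.0,
    "gate_oxide_nm": 0.06,
}

samples = sample_global_variation(three_sigma)
print(len(samples), "samples; first sample:", samples[0])
```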

Vmin Extraction and Comparison
- Define ftarget of chip and ROs at "slow-slow" process corner, nominal voltage = 1.0V
- Vmin_chip = max. Vmin of critical paths of a testcase
- Vmin_est = max. Vmin of 3 ROs
- For each testcase, calculate Vmin_est - Vmin_chip of every Monte Carlo sample
- A chip is safe when Vmin_est - Vmin_chip > 0

To extract the Vmin, we define the frequency target of the chip and the ROs at the slow-slow process corner with a nominal voltage of one volt. Then, we calculate the Vmin of a chip as the maximum Vmin of its critical paths. Similarly, we calculate the Vmin estimate as the maximum Vmin of the three ROs. For each testcase, we calculate the difference between the Vmin estimate and the Vmin of the chip for every Monte Carlo sample. A chip is considered safe when the Vmin estimate minus the Vmin of the chip is larger than zero.
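The comparison step can be sketched as below, assuming the per-sample Vmin values of the ROs and critical paths have already been extracted from simulation. The function names and the example numbers are hypothetical.

```python
def vmin_margins(ro_vmins_per_sample, path_vmins_per_sample):
    """Return Vmin_est - Vmin_chip for every Monte Carlo sample.

    Vmin_est is the maximum Vmin over the ROs and Vmin_chip is the
    maximum Vmin over the critical paths of the testcase.
    """
    margins = []
    for ro_vmins, path_vmins in zip(ro_vmins_per_sample, path_vmins_per_sample):
        vmin_est = max(ro_vmins)
        vmin_chip = max(path_vmins)
        margins.append(vmin_est - vmin_chip)
    return margins


def all_safe(margins):
    """A sample is safe only when its margin is strictly positive."""
    return all(m > 0 for m in margins)


# Two hypothetical samples, three ROs (INV, NAND, NOR) and two paths each.
ro = [[0.93, 0.95, 0.94], [0.88, 0.87, 0.90]]
paths = [[0.92, 0.91], [0.89, 0.86]]
margins = vmin_margins(ro, paths)
print([f"{m * 1000:.0f} mV" for m in margins], "safe:", all_safe(margins))
```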

Scenario 1: Guardband for Arbitrary Circuit
Figure panels: FPU testcase, TLU testcase, MUL testcase.

- Vmin_est - Vmin_chip > 0 under process variation
- Similar results for different testcases
- Small difference between normal and tunable ROs → due to series passgates

In the first experiment, we can see that for the FPU testcase the value of the Vmin estimate minus the Vmin of the chip is larger than zero across different process conditions. The results for the TLU and MUL testcases are similar. For comparison, we also run the same experiment with normal ROs without series passgates. As we can see, the Vmin distribution of the tunable ROs is slightly different from that of the normal ROs. This is because even the low series resistance of the passgates affects the Vmin distribution.

Scenario 2: Tune ROs for Margin Reduction
- Extract Vmin_chip at different process corners
- Configure % high-resistance passgates
  - min.: $\sum_{k} \left( V_{min\_est}(k) - V_{min\_chip}(k) \right)$
  - s.t.: $V_{min\_est}(k) > V_{min\_chip}(k)$ for all process conditions $k$
- Ensures Vmin_est guided by ROs is always safe

In the second experiment, we assume that chip samples are available so that we can extract the Vmin of the chips at different process corners. Then, using the extracted information, we configure the percentage of high-resistance passgates to reduce the voltage margin. The objective is to minimize the sum of the differences between the Vmin estimate and the Vmin of the chip. A constraint is added to make sure that the Vmin of the ROs is always larger than the Vmin of the chip for all process conditions. In this optimization framework, we change the value of the Vmin estimate by perturbing the configuration of each RO.
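One simple way to explore this optimization is an exhaustive grid search, sketched below. The talk does not say which solver was used, so the grid search is a stand-in. The sketch also assumes that each RO's Vmin falls linearly with the percentage of high-resistance passgates, and every number, slope, and name in it is hypothetical.

```python
from itertools import product


def vmin_ro(base_vmin, slope, pct_high):
    """Assumed linear model: Vmin drops by `slope` volts per percent."""
    return base_vmin - slope * pct_high


def tune_ros(ros, vmin_chip, levels=range(0, 101, 10)):
    """Grid search over the % of high-resistance passgates of each RO.

    ros: RO name -> {corner: (base_vmin, slope)}
    vmin_chip: corner -> extracted Vmin of the chip
    Minimizes sum_k (Vmin_est(k) - Vmin_chip(k)) subject to
    Vmin_est(k) > Vmin_chip(k) at every corner k.
    """
    names = list(ros)
    best_cfg, best_cost = None, float("inf")
    for cfg in product(levels, repeat=len(names)):
        cost, feasible = 0.0, True
        for corner, v_chip in vmin_chip.items():
            v_est = max(vmin_ro(*ros[n][corner], pct) for n, pct in zip(names, cfg))
            if v_est <= v_chip:  # constraint violated: some chips would fail
                feasible = False
                break
            cost += v_est - v_chip
        if feasible and cost < best_cost:
            best_cfg, best_cost = dict(zip(names, cfg)), cost
    return best_cfg, best_cost


# Hypothetical per-corner (base Vmin in V, slope in V per percent) for each RO.
ros = {
    "INV": {"SS": (1.000, 0.0004), "FF": (0.800, 0.0004)},
    "NAND": {"SS": (0.995, 0.0004), "FF": (0.805, 0.0004)},
    "NOR": {"SS": (0.990, 0.0004), "FF": (0.815, 0.0004)},
}
vmin_chip = {"SS": 0.985, "FF": 0.800}

cfg, cost = tune_ros(ros, vmin_chip)
print("config (% high-resistance):", cfg, "total margin (V):", round(cost, 4))
```

With three ROs and eleven levels each, the search covers 1331 configurations, which is small enough that a more sophisticated optimizer is not needed for a sketch like this.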

Experiment Result on Tunability
- Default config.: low-resistance passgates; guardband for worst case; Vmin_est > Vmin_chip; 13mV margin
- Optimized config.: increase % high-resistance passgates; Vmin_est ≈ Vmin_chip
- Aggressive config. → Vmin_est < Vmin_chip → some chips will fail

This figure shows that with the default configuration, the ROs have only low-resistance passgates. Since this configuration is designed to guardband for the worst case, there is a 13 mV margin. In this experiment, we show that by changing the percentage of high-resistance passgates, we can reduce the Vmin estimate so that it is close to the Vmin of the chip. If we push for a more aggressive configuration, the Vmin estimate becomes smaller than the Vmin of the chip, and some of the chips will fail.

Experiment Result on Tunability
- Benefits of tunability
  - Recover voltage margin
  - Compensate for difference between SPICE model vs. silicon
  - Recover margin when chip performance variation is reduced due to improvements in chip manufacturing

This experiment shows that by controlling the percentage of high-resistance passgates, we can tune the voltage scaling properties of the ROs and recover voltage margin. The tunability is also very useful for compensating for differences between the SPICE model and silicon. Moreover, we can adjust the configuration to reduce the voltage margin, especially when process variation improves.

Summary
- Monitor design based on voltage scaling properties
  - Estimate the worst-case voltage scaling property across different process corners
  - Does not require information about critical paths
  - Can be used as an IP for arbitrary circuits
  - Recover margin if fmax of sample silicon is available
- Future work
  - Proof-of-concept silicon
  - Account for performance variation due to layout context

In summary, we have proposed a monitor design based on voltage scaling properties. The monitor can estimate the worst-case voltage across different process corners for arbitrary circuits. Therefore, it does not require information about critical paths, and it can be used as an IP for arbitrary circuits. We also show that the tunable RO can recover voltage margin if the Fmax of sample silicon is available. Our future work includes proof-of-concept silicon and further study to account for performance variation due to layout context.

tbchan@ucsd.edu, abk@cs.ucsd.edu
Thank you!

Backup Slides

Effects of Pass Gates
- Pass gate is equivalent to large resistance; Vmin decreases
- Vmin decreases with fewer parallel pass gates

Effects of Cell Type and Strength
- Key observations:
  - Vmin is affected by cell types → use NAND, NOR type ROs
  - Cell strength changes Vmin → use cells with large Vmin

In another experiment, we extract the Vmin of ROs built from different standard cells. We observe that Vmin is affected by both cell type and cell strength. This is because different cell topologies and layouts affect the device characteristics of the cells, as well as their voltage scaling properties. Thus, Vmin is also affected.
