VLSI Delay Tuning by Wire-Shield Spacing Tapering with Clock-Tree Application Eyal Sarfati Binyamin Frankel Prof. Yitzhak Birk Prof. Shmuel Wimer.

1 VLSI Delay Tuning by Wire-Shield Spacing Tapering with Clock-Tree Application
Eyal Sarfati Binyamin Frankel Prof. Yitzhak Birk Prof. Shmuel Wimer

2 VLSI Delay Tuning

3 Delay Requirements
In synchronous systems, the clock maintains synchronization of all elements across the chip die.
Setup constraint: $T_{period} + \min T_{ClockPath}^{Capture} - T_{setup} \ge \max T_{ClockPath}^{Launcher} + T_{cp \to Q} + T_{Logic}$
Hold constraint: $\max T_{ClockPath}^{Capture} + T_{hold} \le \min T_{ClockPath}^{Launcher} + T_{cp \to Q} + T_{Logic}$
[Figure: launcher FF (D, Q) and capture FF joined by a data path; the clock root and point of divergence feed both flip-flops. Labels: $T_{cp \to Q}$, $T_{Logic}$, $T_{Setup}$ / $T_{Hold}$, $T_{ClockPath}^{Launcher}$, $T_{ClockPath}^{Capture}$]
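As a rough illustration of the two constraints, the sketch below computes setup and hold slack for one launcher/capture pair. It assumes the usual reading in which the capture edge arrives one period plus the capture clock-path delay after the clock root edge. The function names, the split of the logic delay into a maximum and a minimum, and all numeric values are hypothetical and are not taken from the presentation.

```python
def setup_slack(t_period, t_capture_min, t_setup, t_launch_max, t_cq, t_logic_max):
    """Positive result means the setup constraint is met (all values in the same time unit)."""
    required = t_period + t_capture_min - t_setup   # latest allowed data arrival
    arrival = t_launch_max + t_cq + t_logic_max     # latest actual data arrival
    return required - arrival


def hold_slack(t_capture_max, t_hold, t_launch_min, t_cq, t_logic_min):
    """Positive result means the hold constraint is met."""
    arrival = t_launch_min + t_cq + t_logic_min     # earliest actual data arrival
    required = t_capture_max + t_hold               # earliest allowed data change
    return arrival - required


# Hypothetical numbers in picoseconds.
print(setup_slack(1000, 100, 40, 105, 50, 700))  # 205
print(hold_slack(110, 20, 95, 50, 30))           # 45
```

Adding delay on the capture clock path raises setup slack and lowers hold slack by the same amount, which is why a fine-grained, low-variability delay element matters for useful skew.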

4 VLSI Wire Shielding

5 Cross-Coupling Interference (Cross-Talk)
Signal wire cross-coupling may result in unintentional "noise".
[Figure: aggressor and victim interconnections. Labels: Vin(t), RD, CL, Vout(t)]

6 Interconnect Shield
A technique to prevent interference due to cross-coupling capacitance between adjacent signal wires.
[Figure: two interconnections separated by a VGND shield. Labels: Vin(t), RD, CL, Vout(t)]

7 Layout Snapshot Example
[Figure: layout snapshot. Labels: VSS, Aggressor, Signal]

8 Tuning Delay With Shielded Wires

9 Delay Tuning by Wire Shielding
Main idea: use shielding for tunable wire delay.
Small wire-shield spacing → large delay
Large wire-shield spacing → small delay

10 Wire Delay Modeling

11 Delay Tuning Range, Crude Approximation
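$\text{Tuning Range} = \frac{\delta_{max} - \delta_{min}}{\delta_{nominal}}$
Consider a wire already shielded with spacing $s = 2 \times s_{min}$:
$\text{Shield Tuning Range} = \frac{\delta_{shield}^{1} - \delta_{shield}^{3}}{\delta_{wire} + \delta_{shield}^{2}} = \frac{\frac{2}{3} c_{ll} L \left(R_D + \frac{r_s L}{2w}\right)}{R_D \left(c_s w L + C_L\right) + \frac{r_s L}{w}\left(\frac{c_s w}{2} L + C_L\right) + c_{ll} L \left(R_D + \frac{r_s L}{2w}\right)} \approx \frac{4}{3(2w+1)}$
The approximation assumes that the wire is long, $c_s w L \gg C_L$, and that the capacitance parameters are similar, $c_{ll} \approx c_s$.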

12 Delay Tuning Range, Simulation Results
For line width $w = k \cdot w_{min}$, $1 \le k \le 3$:
Signal Line Width / Tuning Range / Crude Approximation
w = ×1: 44%
w = ×2: 27%
w = ×3: 19%
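As a quick arithmetic check, the crude approximation $\frac{4}{3(2w+1)}$ can be evaluated for the three line widths. The sketch assumes that $w$ is expressed in multiples of $w_{min}$ (so $w = k$); the function name is illustrative only.

```python
def crude_tuning_range(k):
    """Crude shield tuning range 4 / (3 * (2w + 1)), with w = k in units of w_min."""
    return 4.0 / (3.0 * (2.0 * k + 1.0))


for k in (1, 2, 3):
    print(f"w = x{k}: {100.0 * crude_tuning_range(k):.1f}%")
```

Under that assumption the values are 4/9, 4/15 and 4/21, which round to the 44%, 27% and 19% listed for the three widths.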

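13 Delay Variation Test
[Figure: variability test structures (a) and (b) with shield spacings $s = 1$ and $s = 3$. Labels: $\delta_1$, $\delta_3$, $\delta'$, $\delta''$]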

14 Delay Tuning By Shielding Wires
Achieving both a respectable tuning range and low variability lays the groundwork for intentional delay insertion and tuning by shielding with space tapering.
[Figure: (a) interconnection with an inserted delay buffer; (b) interconnection with a VGND shield. Labels: Vin(t), RD, CL, Vout(t)]

15 Shield Delay Tuning

16 Optimal Tapered Shielding For Delay Tuning
Problem formulation: minimize the area between the signal wire and the shield (the wasted routing resources), given by $A = \sum_{i=1}^{n} l_i s_i$, subject to a required delay $\delta_{shield}$:
$\delta_{shield} \approx 2 R_D c_{ll} \sum_{i=1}^{n} \frac{l_i}{s_i} + \frac{r_s c_{ll}}{w} \sum_{i=1}^{n} l_i \sum_{j=i}^{n} \frac{l_j}{s_j}$
Theorem [FW]: In order to obtain a desired delay while consuming minimum area resources, the wire shielding must have a monotonically increasing spacing.
[FW] B. Frankel and S. Wimer, "Optimal VLSI Delay Tuning by Wire Shielding," Journal of Optimization Theory and Applications, vol. 170, no. 3, pp , 2016.
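The two quantities in the formulation can be evaluated directly for a piecewise-constant shield. The sketch below is one reading of the slide, assuming the area sums the products $l_i s_i$ and the delay sums the ratios $l_i / s_i$ (shield capacitance inversely proportional to spacing), with segment 1 nearest the driver. Function and argument names are hypothetical, and the numbers are unitless placeholders.

```python
def shield_area(lengths, spacings):
    """Area between wire and shield: sum of l_i * s_i."""
    return sum(l * s for l, s in zip(lengths, spacings))


def shield_delay(lengths, spacings, r_d, c_ll, r_s, w):
    """Approximate shield delay for segments ordered from the driver outward."""
    cap_terms = [l / s for l, s in zip(lengths, spacings)]  # l_j / s_j
    driver_term = 2.0 * r_d * c_ll * sum(cap_terms)
    wire_term = 0.0
    for i, l_i in enumerate(lengths):
        # Segment i sees the shield capacitance of itself and everything downstream.
        wire_term += l_i * sum(cap_terms[i:])
    return driver_term + (r_s * c_ll / w) * wire_term


lengths = [1.0, 1.0]
spacings = [1.0, 2.0]  # monotonically increasing spacing
print(shield_area(lengths, spacings))
print(shield_delay(lengths, spacings, r_d=1.0, c_ll=1.0, r_s=1.0, w=1.0))
```

A search over permitted spacings could call these two functions to look for the smallest area that still meets a required delay.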

17 Delay Tuning by Shield Space Tapering
Analytical continuous shield spacing solution: a closed-form square-root profile $s(x)$ along the wire, expressed in terms of $R_D$, $c_{ll}$, $w$, $r_s$, $L$ and $\delta_{shield}$ [FW].
A feasible solution is approximated by a stepwise function whose breakpoints $x_i$ satisfy $s(x_i) = \frac{s_{i-1} + s_i}{2}$.
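The stepwise rule can be read as: switch from permitted spacing $s_{i-1}$ to $s_i$ at the point $x_i$ where the continuous profile crosses their midpoint. The sketch below finds those breakpoints by bisection for any continuous, non-decreasing profile. The square-root `profile` used here is a made-up stand-in and is not the closed-form solution from [FW]; the wire length and permitted spacings are also hypothetical.

```python
import math


def breakpoints(s, length, levels, tol=1e-9):
    """Return x_i with s(x_i) = (levels[i-1] + levels[i]) / 2.

    s must be continuous and non-decreasing on [0, length];
    levels are the permitted spacings in ascending order.
    """
    points = []
    for lower, upper in zip(levels, levels[1:]):
        target = 0.5 * (lower + upper)
        if target <= s(0.0):
            points.append(0.0)
        elif target >= s(length):
            points.append(length)
        else:
            a, b = 0.0, length
            while b - a > tol:  # plain bisection on the monotone profile
                mid = 0.5 * (a + b)
                if s(mid) < target:
                    a = mid
                else:
                    b = mid
            points.append(0.5 * (a + b))
    return points


def profile(x):
    # Hypothetical profile in units of s_min: rises from 1 to 3 over 200 um.
    return math.sqrt(1.0 + 8.0 * x / 200.0)


xs = breakpoints(profile, 200.0, [1.0, 2.0, 3.0])
print(xs)
```

With this stand-in profile the wire would use spacing 1 up to the first breakpoint, spacing 2 between the two breakpoints, and spacing 3 beyond the second.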

18 Optimal Tapered Shielding For Delay Tuning – Stepwise Approximation
Feasible stepwise approximation $s_d(x)$ of the optimal square-root shielding $s(x)$.
[Figure: delay and area accuracy of the stepwise approximation $s_d(x)$ relative to $s(x)$, as a function of $n$ (the number of permissible shield spacings). Curves: $s_d(x)$ area, $s_d(x)$ delay]

19 Shield Delay Clock Tree Application

20 Experimental Results On Real 28nm Designs

21 Experimental Results On Real 28nm Designs
Tradeoff between buffer count and wire length.
Saved the variation of 12 cascaded buffers in the CPU case, equal to ~13 ps of delay variation (in the typical corner).
Design: ARM® CPU; Memory Ctrl.
# FFs: 70K; 40K
# FFs with useful skew: 3892; 1215
CTS wire length [mm], buffer insertion: 534; 305
CTS wire length [mm], shield delay: 837 (×1.56); 480 (×1.57)
# delay buffers, buffer insertion: 8785; 4986
# delay buffers in path, buffer insertion: 12; 7

22 Extraction of Shield Delay From Silicon

23 Silicon Measurements
Real silicon behavior may differ from the RC layout extraction and SPICE simulations. However, because internal nodes are not observable, direct silicon measurement is impossible, so an indirect measurement approach is used instead. Even so, the indirect measurement captures the sum of the shield delay and the buffer delay.

24 Ring Oscillator
A CMOS ring oscillator is commonly used to evaluate gate delay times. Common designs contain an odd number of inverting gates forming a closed loop. The circuit runs from a DC supply voltage, with an enable signal that switches it on and off. The oscillation frequency $f$ is given by $f = \frac{1}{2 N \tau_d}$, where $N$ is the number of gates and $\tau_d$ is the individual gate delay.
[Figure: ring oscillator with output Ring_clk]
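The frequency relation can be turned around to recover the gate delay from a measured frequency. The following is a small sketch of both directions; the function names and the example of 5 gates with a 10 ps delay are hypothetical.

```python
def ring_frequency(n_gates, tau_d):
    """Oscillation frequency f = 1 / (2 * N * tau_d), with tau_d in seconds."""
    return 1.0 / (2.0 * n_gates * tau_d)


def gate_delay(n_gates, freq):
    """Individual gate delay recovered from a measured frequency in hertz."""
    return 1.0 / (2.0 * n_gates * freq)


f = ring_frequency(5, 10e-12)  # 5 inverting gates, 10 ps each
print(f)                       # about 1e10 Hz
print(gate_delay(5, f))        # about 1e-11 s
```

The factor of 2 reflects that one full period needs the signal to travel around the loop twice, once for each output transition.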

25 Balanced Inverting Mux
The inputs are buffered to provide identical load.
I[0] to Z[0] path: buffer → low NAND → low NOR → high NOR → high NAND → high NAND
I[3] to Z[0] path: buffer → high NAND → high NOR → low NOR → low NAND → high NAND

26 Variation In Different Corners
The similarity of the input-to-output delays allows cell-based delays to be subtracted.
Table columns: corner (P device, N device), Vdd [V], temp [°C], wire RC, then average [ps] and max var ±[%] for rise→fall and for fall→rise.
typical 0.8 +85 63.7 0.9 70.6 2.2 1.0 55.5 1.6 61.1 2.3 slow 0.72 −40 worst 79.2 1.4 90.2 +125 79.7 1.2 88.6 2.0 fast 1.05 best 45.7 1.8 50.9 53.6 57.4

27 Shielded Interconnect Ring Oscillator
Shield-wire spacings: 1, 2, 3, 5 × $s_{min}$
Wire width: 1 × $w_{min}$
Mux-to-mux distance: 200 μm
Mux configurable selectors: $S_0$ to $S_7$
A total of 256 configurations.

28 Testing Circuit
Method: count ring oscillator cycles within a time window.
25 MHz reference clock.
Time window of 320 ns – 200 ns (panic mode).

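29 Delay Extraction
$\delta_k^j$: x-to-x rise + fall delay from mux $j$ through input $k$ of the next mux in the chain.
$\Delta_n = \delta_k^0 + \delta_k^1 + \delta_k^2 + \delta_k^3$, $0 \le n \le 255$, where $k$ is the input selected at each mux in configuration $n$.
From the test circuit: $\Delta_n = \frac{320\,\mathrm{ns}}{count}$
$\mathbf{H} \boldsymbol{\delta} = \boldsymbol{\Delta}$, with $\mathbf{H}$ of size 256×16, $\boldsymbol{\delta} = (\delta_0^0, \delta_1^0, \ldots, \delta_3^3)^T$ of size 16×1, and $\boldsymbol{\Delta} = (\Delta_0, \Delta_1, \ldots, \Delta_{255})^T$ of size 256×1.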

30 Delay Extraction
Due to x-to-x configuration variance, an exact solution is impossible to obtain. Instead, least-squares estimation is used: $\hat{\boldsymbol{\delta}} = (\mathbf{H}^T \mathbf{H})^{-1} \mathbf{H}^T \boldsymbol{\Delta}$

31 Post Silicon Delay Estimation Flow
Test 256 silicon ring configurations.
Count oscillations for each configuration.
Derive 256 ring delays (estimators).
Training set (80% of the delays): estimate the 16 x-to-x segment delays by linear regression.
Calculate the ring delays for the remaining 20% of the configurations.
Test set (20% of the delays): compare the calculated and measured delays.
Nearly equal? If yes, the estimation is valid; if not, the estimation is invalid.
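The flow above can be sketched on synthetic data. Everything numeric here is invented for illustration: the "true" segment delays, the noise level and the random 80/20 split are assumptions, and the one-hot layout of $\mathbf{H}$ (one selected input per mux, four muxes with four inputs each) is one plausible encoding rather than the authors' stated one. Under that encoding $\mathbf{H}$ has rank 13, so the sketch uses NumPy's `np.linalg.lstsq`, which returns the minimum-norm least-squares solution, instead of forming $(\mathbf{H}^T \mathbf{H})^{-1}$ explicitly; the predicted ring delays are not affected by that choice.

```python
import itertools

import numpy as np

N_MUX, N_IN = 4, 4  # 4 muxes x 4 selectable inputs -> 4**4 = 256 configurations


def design_matrix():
    """One row per configuration; a 1 marks the selected input of each mux."""
    rows = []
    for selection in itertools.product(range(N_IN), repeat=N_MUX):
        row = np.zeros(N_MUX * N_IN)
        for j, k in enumerate(selection):
            row[j * N_IN + k] = 1.0  # column order: input k varies fastest within mux j
        rows.append(row)
    return np.array(rows)


rng = np.random.default_rng(0)
H = design_matrix()

# Synthetic stand-ins for silicon: segment delays in ps plus measurement noise.
true_delta = rng.uniform(100.0, 140.0, size=N_MUX * N_IN)
ring = H @ true_delta + rng.normal(0.0, 0.05, size=H.shape[0])

# 80% training set, 20% test set.
order = rng.permutation(H.shape[0])
train, test = order[:205], order[205:]

# Least-squares estimate of the 16 segment delays from the training set.
delta_hat, *_ = np.linalg.lstsq(H[train], ring[train], rcond=None)

# Predict the held-out ring delays and compare with the "measured" ones.
predicted = H[test] @ delta_hat
rel_err = np.abs(predicted - ring[test]) / ring[test]
print(f"max error: {100.0 * rel_err.max():.3f}%")
```

Because only sums of one delay per mux are ever observed, the individual entries of `delta_hat` are determined only up to offsets that cancel around the ring, while the ring-delay predictions used for validation remain well defined.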

32 Post Silicon Delay Estimation, Results
Post-silicon error validation on the 20% test configurations.
Typ 0.8 V and Typ 0.9 V, at 25°C, 50°C, 85°C, 105°C:
Err Max [%]: 0.08%, 0.07%, 0.09%, 0.14%, 0.06%, 0.11%
Err Min [%]: 0.00%
Err Avg [%]: 0.02%, 0.04%
Err StdEv [%]: 0.01%, 0.03%
Typ 1.0 V, at 25°C, 50°C, 85°C, 105°C:
Err Max [%]: 0.17%, 0.15%, 0.13%
Err Min [%]: 0.00%
Err Avg [%]: 0.04%, 0.02%, 0.03%
Err StdEv [%]:

33 Post Silicon Shield Delay Tuning Range
Silicon-measured delay tuning range for a 200 μm signal wire:
$\text{Delay Tuning Range} = \frac{\delta_{avg}^{s1} - \delta_{avg}^{s3}}{\delta_{avg}^{s2}}$
Supply voltages: 0.8 V, 0.9 V, 1.0 V
25°C: 23%, 25%, 26%
50°C: 24%
85°C:
105°C: 27%
Delay tuning range ≈ 25%

34 Post Silicon On Chip Shield Delay Variability
2% shield delay variation for the s=x1 and s=x2 cases. 5% shield delay variation for the s=x3 case.

35 Discussion and Conclusions

36 Are Shield Delays Always Better ?
Goal: minimize the maximum delay-difference variation.
The problem: data and clock paths are in a constant race.
If the variability of the paths is correlated (common effect), use the same technology for delay tuning.
If the variability of the paths is uncorrelated (local effect), use the lower-variability delay technology (shield delays).
[Figure: launcher FF (D, Q) and capture FF joined by a data path. Label: $T_{Setup}$ / $T_{Hold}$]

37 Path Delay Variability: Our Messages
Fab: share the variability correlation values. This will help designers and tools choose the right delay technology.
CAD: take the variability correlation into account. Where variation is uncorrelated, use the lower-variability technology. Use shield delay for clock-tree ECOs: high-resolution delay tuning together with low variability and small layout changes.

38 Conclusions
Presented a novel approach that uses wire-shield spacing tapering as a replacement for buffers in delay tuning.
Demonstrated this usage in a clock-tree synthesis application.
Measured the delay tuning range of shielding on silicon.
Measured the absolute delay variability of shielding on silicon.
Verified a post-silicon methodology, based on training and test sets, that enables accurate delay extraction.

39 Thank you!

