Design Flows and Tools Peter A. Beerel University of Southern California USC Asynchronous CAD/VLSI Group (async.usc.edu)


1 Design Flows and Tools Peter A. Beerel University of Southern California USC Asynchronous CAD/VLSI Group (async.usc.edu)

2 Part II - Agenda
  Design Flows
    - Design via decomposition
    - Modeling design using SystemVerilog
  Design Automation – The Proteus-A flow
    - Legacy RTL
    - Added SystemVerilog CSP front-end
    - Asynchronous optimizations
  Final Flow Considerations
    - Analog Verification
    - Design for Test and Debug

3 Design via Process Decomposition
  - Collection of processes linked by channels
  - Channels pass messages with guaranteed delivery
  - Processes synchronize
  - Processes can be decomposed into smaller processes

4 Modeling Asynchronous Design via SystemVerilogCSP (SVC)
  - SystemVerilog interface abstracts channel wires as well as communication protocol
  - Send/Receive are blocking tasks (flow control)
  - Abstract communication: Sender, SVC Interface, Receiver

    module Sender (interface R);
      parameter WIDTH = 8;
      logic [WIDTH-1:0] data;
      always begin
        //produce data
        R.Send(data);
      end
    endmodule

    module Receiver (interface L);
      parameter WIDTH = 8;
      logic [WIDTH-1:0] data;
      always begin
        L.Receive(data);
        //consume data
      end
    endmodule
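A minimal sketch of how the Sender and Receiver above might be wired together and simulated. The `SimpleChannel` interface, its req/ack handshake, the 10-time-unit production delay, and the `count` check are all assumptions made for illustration; the real SVC package defines its own channel interface and protocols, which the slide does not show.

```systemverilog
// Hypothetical stand-in for an SVC-style channel (not the real SVC package).
interface SimpleChannel #(parameter WIDTH = 8);
  logic req = 0;
  logic ack = 0;
  logic [WIDTH-1:0] data;

  // Blocking send: returns only after the receiver has taken the value.
  task automatic Send(input logic [WIDTH-1:0] d);
    data = d;
    req  = 1;
    wait (ack == 1);
    req  = 0;
    wait (ack == 0);
  endtask

  // Blocking receive: returns only after a sender has offered a value.
  task automatic Receive(output logic [WIDTH-1:0] d);
    wait (req == 1);
    d   = data;
    ack = 1;
    wait (req == 0);
    ack = 0;
  endtask
endinterface

module Sender (interface R);
  parameter WIDTH = 8;
  logic [WIDTH-1:0] data = 0;
  always begin
    #10;               // produce data (delay value is arbitrary)
    data = data + 1;
    R.Send(data);
  end
endmodule

module Receiver (interface L);
  parameter WIDTH = 8;
  logic [WIDTH-1:0] data;
  int count = 0;       // number of tokens consumed so far
  always begin
    L.Receive(data);
    count++;           // consume data
    $display("t=%0t received %0d", $time, data);
  end
endmodule

module top;
  SimpleChannel #(.WIDTH(8)) ch ();
  Sender   #(.WIDTH(8)) s (.R(ch));
  Receiver #(.WIDTH(8)) r (.L(ch));

  initial begin
    #35;               // the sender offers tokens at t=10, 20 and 30
    if (r.count != 3) $fatal(1, "expected 3 tokens, got %0d", r.count);
    $finish;
  end
endmodule
```

The delay in the Sender matters in this sketch: because the handshake completes in zero simulation time, two processes that only call Send and Receive would loop forever without advancing time.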

5 SVC - Waveform view
  [Waveform figure; annotated phases:]
  - Receiver pending on Receive
  - Sender performs Send, communication happens
  - No one is sending or receiving
  - Sender pending on Send
  - Receiver performs Receive, communication happens

    //Sender (DataGen)
    always begin
      #Delay;
      R.Send(data);
    end

    //Receiver
    always begin
      L.Receive(data);
      #FL;
      R.Send(data);
      #BL;
    end

6 Part II - Agenda
  Design Flows
    - Design via decomposition
    - Modeling design using SystemVerilog
  Design Automation – The Proteus-A flow
    - Legacy RTL
    - Added SystemVerilog CSP front-end
    - Asynchronous optimizations
  Final Flow Considerations
    - Analog Verification
    - Design for Test and Debug

7 The Proteus-A Flow – Legacy RTL
  [Flow diagram labels: Synth RTL, Constraints, Sync Library, Synthesis, Image Netlist, ClockFree, Design Goals, Proteus/Sync Library, Async Netlist, Physical Design, Final Layout, Netlist, Clock Gating, Clock Tree Synthesis]
  Key Features
    - Re-uses synchronous EDA tools
    - Seamless integration into existing flows
    - Back-end design style agnostic
    - Up to 2X higher performance
  Tool Status
    - Commercialized version in production for 2+ years
    - Uses proprietary QDI library
    - Academic version (Proteus-A) enhanced significantly at USC
  Recent Advances
    - Power optimization algorithms

8 Flow Demo – Legacy RTL
  [Flow diagram labels: Legacy RTL Specification, Synthesis, Synth. RTL, Synthesized Image Netlist, Clockfree, Asynchronous Gate-level Netlist, Physical Design, Final Layout]

9 Amber23 – Proteus-A Case Study
  - Download from
  - ARM-compatible 32-bit RISC processor
  - 3 stages: FETCH, DECODE and EXECUTE
  [Block diagram labels: Cache, Bus interface, Decode, State machine, Register bank, Barrel shifter, ALU, Multiplexer, instruction, control, Read data, Address, write data]
  Zhang, USC Summer Research, 2012

10 Amber23 – Performance Comparison
  - Download from
  - ARM-compatible 32-bit RISC processor
  - 3 stages: FETCH, DECODE and EXECUTE
  [Block diagram labels: Cache, Bus interface, Decode, State machine, Register bank, Barrel shifter, ALU, Multiplexer, instruction, control, Read data, Address, write data]
  Zhang, USC Summer Research, 2012

11 The Proteus-A Flow – SVC
  [Flow diagram labels: SystemVerilog, SVC2RTL, Synth. RTL, Constraints, Sync Library, RTL Synthesis, Image Netlist, ClockFree, Design Goals, Proteus/Sync Library, Async Netlist, Verilog Netlist, Physical Design, Final Layout, Netlist, Clock Gating, Clock Tree Synthesis]
  Key New Features
    - Supports SystemVerilog CSP front-end
    - Enables user-defined conditional communication
    - Saves power at architectural level
  Tool Status
    - Proprietary version starting from CAST developed at Fulcrum
    - SystemVerilog version subsequently developed at USC
    - Used in current research at USC and Technion and in a 40+ person async class

12 Key to Low-Power - Conditional Communication
  - Conditional communication reduces token flow, saving power
  - Traditionally: manually introduced via user-created decomposition
  - Recent research: automatically introduced via Operand Isolation
  [Figure labels: DEMUX, A,B, op, Add/Sub, Mult, MUX, +, +, D, S, R]
  Saifhashemi, PATMOS 2012
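The DEMUX/MUX arrangement named on this slide could be described in the SVC style roughly as follows. This is only a sketch: the module names `CondDemux` and `CondMux`, the port names, and the encoding of `op` (1 selects Mult, 0 selects Add/Sub) are hypothetical, and the ports are assumed to be channels offering the blocking Send/Receive tasks shown on slide 4, with the op channels carrying a single bit.

```systemverilog
// Hypothetical conditional DEMUX: forwards each operand token to exactly
// one functional unit, so the unselected unit sees no token at all.
module CondDemux (interface Op, interface In,
                  interface ToAddSub, interface ToMult);
  parameter WIDTH = 8;
  logic             op;    // assumed encoding: 1 = Mult, 0 = Add/Sub
  logic [WIDTH-1:0] data;
  always begin
    Op.Receive(op);
    In.Receive(data);
    if (op) ToMult.Send(data);    // conditional send
    else    ToAddSub.Send(data);  // conditional send
  end
endmodule

// Hypothetical conditional MUX: receives a result only from the unit that
// was selected, then forwards it unconditionally.
module CondMux (interface Op, interface FromAddSub,
                interface FromMult, interface Out);
  parameter WIDTH = 8;
  logic             op;    // must carry the same value the DEMUX consumed
  logic [WIDTH-1:0] data;
  always begin
    Op.Receive(op);
    if (op) FromMult.Receive(data);    // conditional receive
    else    FromAddSub.Receive(data);  // conditional receive
    Out.Send(data);
  end
endmodule
```

In this sketch the control token has to reach both modules, so a design using them would also need to copy `op` onto two channels.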

13 SVC2RTL – Enables User-Defined Conditional Communication
  [Figure labels: Not received, Dummy value, 0, 1, Not sent]

14 Part II - Agenda
  Design Flows
    - Design via decomposition
    - Modeling design using SystemVerilog
  Design Automation – The Proteus-A flow
    - Legacy RTL
    - Added SystemVerilog CSP front-end
    - Asynchronous optimizations
  Final Flow Considerations
    - Analog Verification
    - Design for Test and Debug

15 Power Optimization Overview
  - Conditioning: automatically add conditional communication
  - Reconditioning: optimize the existing conditionality

16 Power Saving - The Opportunity
  [Figure labels: +, Unnecessary calculation]

17 Our Solution - Adding Isolation Cells
  - All inputs/outputs are unconditional
  - Operand isolation
      - AND-based isolation cells
      - Generated by synchronous RTL synthesizer
      - Does not prevent switching in asynchronous circuits
  - Isolation cells are not effective in asynchronous circuits
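For reference, AND-based operand isolation of the kind a synchronous RTL flow applies might look like the sketch below. The module name `IsolatedMul`, the port names, and the choice of a multiplier as the isolated unit are assumptions for illustration, not taken from the slides.

```systemverilog
// Hypothetical AND-based operand isolation around a multiplier.
module IsolatedMul #(parameter WIDTH = 32) (
  input  logic [WIDTH-1:0]   a,
  input  logic [WIDTH-1:0]   b,
  input  logic               sel_mul,  // 1 when the product is actually needed
  output logic [2*WIDTH-1:0] p
);
  logic [WIDTH-1:0] a_iso, b_iso;

  // Isolation cells: force both operands to zero when the result is unused,
  // so the multiplier inputs stop toggling in a clocked design.
  assign a_iso = a & {WIDTH{sel_mul}};
  assign b_iso = b & {WIDTH{sel_mul}};

  assign p = a_iso * b_iso;
endmodule
```

In a clocked design the forced zeros keep the multiplier quiet. The slide's point is that this gating alone does not prevent switching once the netlist is converted to an asynchronous one, which is why the flow turns to conditional communication instead.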

18 Our Solution - Conditioning & No Activity

19 Power Optimization Results
  - Case study: 32-bit ALU, placed and routed
  - Back-annotated switching activity using a VCD file
  - Results:
      - Isolating ADD and SUB is detrimental for r ADD and r SUB > % power reduction when only isolating MUL (r f =0.25)
      - Area cost of isolating MUL is about 4%, with no performance penalty
  Saifhashemi, PATMOS 2012

20 Power Savings – The Opportunity
  - Conditional communication is explicit and only at primary IO
  [Figure label: Unnecessary activity]

21 The Reconditioning Problem
  Definition (The Reconditioning Problem): Rearrange the locations of RECEIVE and SEND cells to minimize power consumption while preserving functional behavior.

22 Power Results
  - RECON1: Dual-mode arithmetic unit
  - RECON2: Conditional multiplier
  - ALU-OI: ALU after operand isolation
  Saifhashemi, PhD Thesis, 2012

23 Mode Based Conditional Slack Matching
  [Figure labels: DEMUX, A,B, op, MUX, S, R, S, R, Add/Sub, Mult]
  Conditional Slack Matching
    - Advantage: conditional behavior yields fewer stalls, so fewer pipeline buffers are needed
    - Previously ignored: conservatively modeled as unconditional
  Najibii, 2012

24 Conditional Slack Matching - Results
  Najibii, % fewer buffers on average

25 Design Flow Demo
  [Flow diagram labels: SystemVerilog, SVC2RTL, Synth. RTL, Constraints, Synthesis, Image Netlist, ClockFree, Design Goals, Proteus/Sync Library, Async Netlist, Physical Design, Final Layout]

26 Agenda
  Design Flows
    - Design via decomposition
    - Modeling design using SystemVerilog
  Design Automation – The Proteus-A flow
    - Legacy RTL
    - Added SystemVerilog CSP front-end
    - Asynchronous optimizations
  Final Flow Considerations
    - Analog Verification
    - Design for Test and Debug

27 Final Flow Considerations
  Static Timing Analysis
    - Verifying timing constraints and performance is a must
    - Trick traditional tools into working with asynchronous circuits
  Analog Verification
    - Domino logic used in QDI flows is sensitive to charge sharing
    - Asynchronous channels cannot tolerate cross-talk glitches
    - Special SPICE-based tools developed
  Asynchronous Scan
    - Asynchronous scan is a must but doable
  Design for Silicon Debug
    - Chip deadlock is still difficult to debug

28 Conclusions
  The Asynchronous Design Flow/CAD Landscape
    - Synchronous design rigidity continues to hamper quality design
    - Asynchronous design offers solutions but has many design flow challenges
  Design Flow Requirements
    - Design flows must easily integrate into synchronous designs
    - Circuit quality must compete very well to warrant switching design styles
  Our approach
    - Proteus provides a good design framework for automation of both legacy RTL and SystemVerilog CSP
    - Final considerations of analog and timing verification, scan, and debug should not be overlooked

29 Acknowledgements
