Presentation is loading. Please wait.

Presentation is loading. Please wait.

Design Flows and Tools Peter A. Beerel

Similar presentations


Presentation on theme: "Design Flows and Tools Peter A. Beerel"— Presentation transcript:

1 Design Flows and Tools Peter A. Beerel
University of Southern California USC Asynchronous CAD/VLSI Group (async.usc.edu)

2 Part II - Agenda Design Flows Design via decomposition
Modeling design using System Verilog Design Automation – The Proteus-A flow Legacy RTL Added System Verilog CSP front-end Asynchronous optimizations Final Flow Considerations Analog Verification Design for Test and Debug

3 Design via Process Decomposition
Collection of Processes linked by Channels Channels pass messages with guaranteed delivery Processes synchronize Processes can be decomposed into smaller processes

4 Modeling Asynchronous Design via SystemVerilogCSP (SVC)
SystemVerilog interface abstracts channel wires as well as communication protocol Send/Receive Blocking tasks (Flow control) Sender Receiver SVC Interface Abstract communication module Sender (interface R); parameter WIDTH = 8; logic [WIDTH-1:0] data; always begin //produce data R.Send(data); end endmodule module Receiver (interface L); L.Receive(data); //consume data

5 SVC - Waveform view //Sender (DataGen) always begin #Delay;
R.Send(data); End //Receiver L.Receive(data); #FL; #BL; end Receiver performs Receive, Communication happens Receiver pending on Receive Sender performs Send, Communication happens No one is Sending or Receiving Sender pending on Send

6 Part II - Agenda Design Flows Design via decomposition
Modeling design using System Verilog Design Automation – The Proteus-A flow Legacy RTL Added System Verilog CSP front-end Asynchronous optimizations Final Flow Considerations Analog Verification Design for Test and Debug

7 The Proteus-A Flow – Legacy RTL
Key Features Re-uses synchronous EDA tools Seamless integration into existing flows Back-end design style agnostic Up to 2X higher performance Tool Status Commercialized version in production for 2+ years Uses proprietary QDI library Academic version (Proteus-A) enhanced significantly at USC Recent Advances Power optimization algorithms Synth RTL Design Goals Synthesis Image Netlist Netlist Constraints Proteus/ Sync Library ClockFree Constraints Sync Library Clock Gating Clock Tree Synthesis Netlist Async Netlist Netlist Constraints Physical Design Final Layout 7 7

8 Legacy RTL Specification
Flow Demo – Legacy RTL Physical Design Synthesis Clockfree Synth. RTL Synthesized Image Netlist Asynchronous Gate-level Netlist Final Layout Legacy RTL Specification

9 Amber23 – Proteus-A Case Study
Download from ARM-compatible 32-bit RISC processor 3 stages : FETCH, DECODE and EXECUTE Decode State machine Register bank Barrel shifter ALU Multiplexer instruction control Cache Bus interface Read data Address, write data Cache Bus interface Zhang, USC Summer Research, 2012

10 Amber23 – Performance Comparison
Download from ARM-compatible 32-bit RISC processor 3 stages : FETCH, DECODE and EXECUTE Decode State machine Register bank Barrel shifter ALU Multiplexer instruction control Cache Bus interface Read data Address, write data Cache Bus interface Zhang, USC Summer Research, 2012

11 The Proteus-A Flow – SVCRTL
System- Verilog Image Netlist Key New Features Supports System Verilog CSP front-end Enables user-defined conditional communication Saves power at architectural level Tool Status Proprietary version starting from CAST developed at Fulcrum System Verilog version subsequently developed at USC Used in current research at USC and Technion and 40+ person async class Design Goals Verilog Netlist SVC2RTL Synth. RTL Constraints Synthesis Constraints Proteus/ Sync Library ClockFree Constraints Sync Library Clock Gating Clock Tree Synthesis Netlist Async Netlist Netlist Constraints Physical Design Final Layout 11 11

12 Key to Low-Power - Conditional Communication
Conditional communication reduces token flow, saving power Traditionally - manually introduced via user-created decomposition Recent research - automatically introduced via Operand Isolation DEMUX A,B op Add/Sub Mult MUX D S R + + Saifhashemi, PATMOS 2012

13 SVC2RTL – Enables User-Defined Conditional Communication
Dummy value Not sent 1 Not received 1

14 Part II - Agenda Design Flows Design via decomposition
Modeling design using System Verilog Design Automation – The Proteus-A flow Legacy RTL Added System Verilog CSP front-end Asynchronous optimizations Final Flow Considerations Analog Verification Design for Test and Debug

15 Power Optimization Overview
Conditioning Automatically add conditional communication Reconditioning Optimize the existing conditionality Formal problem definition

16 Power Saving - The Opportunity
Unnecessary calculation +

17 Our Solution - Adding Isolation Cells
All inputs/outputs are unconditional Operand Isolation And-based isolation cells Generated by synchronous RTL synthesizer Does not prevent switching in asynchronous circuits Isolation cells are not effective in asynchronous circuits

18 Our Solution - Conditioning
& + + No Activity

19 Power Optimization Results
Case study: 32-bit ALU placed and routed Back annotated switching activity using a VCD file Results: Isolating ADD and SUB are detrimental for rADD and rSUB > 0.2 53% power reduction when only isolating MUL (rf=0.25) Area cost of isolating MUL is about 4% and no performance penalty Highlight rows Saifhashemi, Patmos 2012

20 Power Savings – The Opportunity
Unnecessary activity 1 1 Unnecessary activity Conditional communication is explicit and only at primary IO

21 The Reconditioning Problem
Definition (The Reconditioning Problem): Rearrange location of RECEIVE and SEND cells to minimize Power consumption while preserving functional behavior.

22 Power Results Saifhashemi, PhD Thesis, 2012 RECON1:
Dual-mode arithmetic unit RECON2: Conditional multiplier ALU-OI ALU after operand isolation Saifhashemi, PhD Thesis, 2012

23 Mode Based Conditional Slack Matching
DEMUX A,B op MUX Add/Sub S R S R Mult Conditional Slack Matching Advantage – Conditional behavior yields less stalls and thus not as many pipeline buffers needed Previously ignored – conservatively modeled as unconditional Najibii,2012

24 Conditional Slack Matching - Results
33% less buffers on average Najibii,2012

25 Design Flow Demo Synthesis ClockFree Physical Design System- Verilog
Design Goals SVC2RTL Synth. RTL Constraints Synthesis Image Netlist Constraints Proteus/ Sync Library ClockFree Abstract Sends and Receive Async Netlist Constraints Physical Design Final Layout

26 Agenda Design Flows Design via decomposition
Modeling design using System Verilog Design Automation – The Proteus-A flow Legacy RTL Added System Verilog CSP front-end Asynchronous optimizations Final Flow Considerations Analog Verification Design for Test and Debug

27 Final Flow Considerations
Static Timing Analysis Verify timing constraints and performance is a must Trick traditional tools into working with asynchronous circuits Analog Verification Domino logic used in QDI flows sensitive to charge sharing Asynchronous channels cannot tolerate cross-talk glitches Special spiced-based tools developed Asynchronous Scan Asynchronous scan is a must but doable Design for Silicon Debug Chip deadlock is still difficult to debug

28 Conclusions The Asynchronous Design Flow/CAD Landscape
Synchronous design rigidity continues to hamper quality design Asynchronous design offers solutions but has many design flow challenges Design Flow Requirements Design flows must easily integrate into synchronous designs Circuit quality must compete very well to warrant switching design styles Our approach Proteus provides a good design framework for automation of both legacy RTL and SystemVerilog CSP Final considerations of analog and timing verification, scan, and debug should not be over looked

29 Acknowledgements Prts used for high performance

30


Download ppt "Design Flows and Tools Peter A. Beerel"

Similar presentations


Ads by Google