Hardware Architecture Design

1 Hardware Architecture Design
Media IC and System Lab VLSI Crash Course 2019

2 Outline
SystemVerilog introduction
Architecture design
Protocols

3 SystemVerilog

4 History
Improved version of Verilog
  Verilog: 1995, 2001 (most popular), 2005
  SystemVerilog: 2005, 2009, 2012
What's new?
  Some handy features for simplifying RTL coding
  Many features for verification
EDA tool support?
  Good support from commercial tools

5 Features
logic data type
$clog2
Multi-dimension signals
Simpler for loop
Improved always block

6 The New Logic Data Type (☆☆☆☆☆)
In Verilog:
  wire is for continuous assignment (assign)
  reg is for procedural assignment (within an always block)
  A flip-flop must be declared with the data type "reg"
Reason for the change: the data type does not correspond to the real circuit. A reg does not necessarily become a register; that depends on how the variable is used.
So SystemVerilog introduces a new type, "logic", to replace both of them.

7 The New Logic Data Type (☆☆☆☆☆)
Since logic replaces both wire and reg, you can change this:

    module MyModule(
        input a,
        output reg b,
        input c,
        output wire d,

into this:

    module MyModule(
        input logic a,
        output logic b,
        input logic c,
        output logic d,

8 The New Built-In Functions (☆☆☆☆☆)
$clog2(x) returns ceil(log2(x)).
One day you write:

    parameter MAX_NUM = 6;
    parameter BIT_NEED = 3; // 6 requires 3 bits
    logic [BIT_NEED-1:0] counter;

The next day:

    parameter MAX_NUM = 100; // Advisor says that...
    parameter BIT_NEED = 3;  // You forgot to update it

The SystemVerilog version:

    parameter BIT_NEED = $clog2(MAX_NUM);
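As a quick sanity check of the widths above, here is a minimal Python sketch of what $clog2 computes. The function name `clog2` is only an illustrative stand-in for the SystemVerilog system function, not part of the slides.

```python
def clog2(x: int) -> int:
    """Ceiling of log2(x), mirroring SystemVerilog's $clog2 for x >= 0."""
    if x < 0:
        raise ValueError("x must be non-negative")
    if x <= 1:
        return 0  # $clog2(0) and $clog2(1) are both 0
    # (x - 1).bit_length() is the number of bits needed to count 0 .. x-1
    return (x - 1).bit_length()


if __name__ == "__main__":
    for max_num in (6, 100):
        print(f"MAX_NUM={max_num} -> BIT_NEED={clog2(max_num)}")
```

Note that $clog2(MAX_NUM) bits are enough to represent 0 through MAX_NUM-1. If the counter must also hold the value MAX_NUM itself and MAX_NUM is a power of two, $clog2(MAX_NUM+1) is the safer width.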

9 Multi-Dimension Improvements (☆☆☆☆☆)
    module AddFourNumber(
        input [31:0] a [0:3],
        input [31:0] b [0:3],
        output [31:0] c [0:3]
    );
        assign c[0] = a[0]+b[0];
        assign c[1] = a[1]+b[1];
        assign c[2] = a[2]+b[2];
        assign c[3] = a[3]+b[3];
    endmodule

10 Multi-Dimension Improvements 2 (☆☆☆☆☆)
Verilog ports are 1-D, so the four 32-bit values must be packed into one 128-bit vector:

    module AddFourNumber(
        input [127:0] a,
        input [127:0] b,
        output [127:0] c
    );
        assign c[127-:32] = a[127-:32]+b[127-:32];
        assign c[ 95-:32] = a[ 95-:32]+b[ 95-:32];
        assign c[ 63-:32] = a[ 63-:32]+b[ 63-:32];
        assign c[ 31-:32] = a[ 31-:32]+b[ 31-:32];
    endmodule
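The indexed part-select `vec[msb -: 32]` picks the 32 bits from `msb` down to `msb-31`. The Python sketch below models the packed 128-bit adder with plain integers. The helper names and the choice to place element 0 in bits [127:96] are assumptions made for illustration; the slides do not state the packing order.

```python
WIDTH = 32
MASK = (1 << WIDTH) - 1


def part_select(vec: int, msb: int, width: int = WIDTH) -> int:
    """Model of the Verilog indexed part-select vec[msb -: width]."""
    lsb = msb - width + 1
    return (vec >> lsb) & ((1 << width) - 1)


def pack(words):
    """Pack four words so that words[0] lands in bits [127:96] (assumed order)."""
    vec = 0
    for w in words:
        vec = (vec << WIDTH) | (w & MASK)
    return vec


def add_four_packed(a: int, b: int) -> int:
    """Add the four 32-bit lanes of two packed 128-bit values."""
    c = 0
    for msb in (127, 95, 63, 31):
        # Each lane wraps around at 32 bits, like a 32-bit assign in RTL
        lane_sum = (part_select(a, msb) + part_select(b, msb)) & MASK
        c |= lane_sum << (msb - WIDTH + 1)
    return c


if __name__ == "__main__":
    a = pack([1, 2, 3, 0xFFFFFFFF])
    b = pack([10, 20, 30, 1])
    print(hex(add_four_packed(a, b)))
```

The lanes stay independent: the overflow of the lowest lane does not carry into the lane above it, which matches four separate 32-bit adders.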

11 Improved Always Blocks (☆☆☆)
Replace always @(*) by always_comb
Replace sequential always blocks by always_ff

12 Simpler For-Loop (☆☆☆☆)
No need to declare the loop index outside the loop. The Verilog version is:

    integer i;
    for (i=0; i<10; i=i+1)

The SystemVerilog version is:

    for (int i=0; i<10; i++)

13 Assignment vs always block
assign                                       | always block                                              | SystemVerilog
LHS should be wire                           | LHS should be reg                                         | Everything is logic!
RHS can be wire or reg                       | RHS can be wire or reg                                    |
begin & end are not allowed                  | begin & end are used for multiple statements              |
Always running                               | Triggered by sensitivity lists                            | But you don't need to write them
Combinational only                           | Could be sequential or combinational                      | always_comb for combinational, always_ff for sequential (flip-flop); EDA tools can do some checks for you
Only 1-line conditional statement is allowed | 1-line, if-else and case conditional statements are allowed |

14 Architecture Design

15 Pipeline and Parallel
Pipeline: different function units working in parallel
Parallel: duplicated function units working in parallel

16 Pipeline
Advantages
  Reduce the critical path
  Increase the working frequency and sample rate
  Increase the throughput
Drawbacks
  Increase the latency (in cycles)
  Increase the number of registers

17 How to Do Pipelining
Put pipelining registers across any feed-forward cutset of the graph
Cutset
  A cutset is a set of edges of a graph such that if these edges are removed from the graph, the graph becomes disjoint
Feed-forward cutset
  The data move in the forward direction on all the edges of the cutset

18 Example

19 Notes for Pipeline
Pipelining is a very simple design technique that maintains the input/output data configuration and sampling frequency
  Tclk = Tsample
Supported in many EDA tools
Effective pipelining
  Put pipelining registers on the critical path
  Balance the pipeline stages
    10 → (2+8): critical path = 8
    10 → (5+5): critical path = 5
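The balancing numbers above can be reproduced with a small Python sketch. It assumes a purely serial chain of combinational blocks and a single pipeline register; the function names are hypothetical and the model ignores register setup and clock-to-output delays.

```python
def critical_path(stage_delays):
    """The clock period is bounded by the slowest stage between registers."""
    return max(stage_delays)


def best_single_cut(delays):
    """Try every position for one pipeline register in a serial chain.

    Returns (cut_index, critical_path) for the most balanced split, where
    cut_index is the number of blocks placed before the register.
    """
    best = None
    for cut in range(1, len(delays)):
        cp = critical_path([sum(delays[:cut]), sum(delays[cut:])])
        if best is None or cp < best[1]:
            best = (cut, cp)
    return best


if __name__ == "__main__":
    chain = [1] * 10  # ten unit-delay blocks in series, total delay 10
    print(critical_path([2, 8]))   # unbalanced split
    print(critical_path([5, 5]))   # balanced split
    print(best_single_cut(chain))
```

The same total delay of 10 gives a critical path of 8 or 5 depending only on where the register is placed, which is why balancing matters.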

20 Parallel
Single-input single-output (SISO) system
  $y(n) = a x(n) + b x(n-1) + c x(n-2)$
Multiple-input multiple-output (MIMO) system
  $y(3k) = a x(3k) + b x(3k-1) + c x(3k-2)$
  $y(3k+1) = a x(3k+1) + b x(3k) + c x(3k-1)$
  $y(3k+2) = a x(3k+2) + b x(3k+1) + c x(3k)$
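To see that the 3-parallel (MIMO) form computes exactly the same outputs as the SISO filter, here is a minimal Python sketch. It assumes x(n) = 0 for n < 0 and an input length that is a multiple of 3; the function names are illustrative only.

```python
def x_at(x, n):
    """Input sample x(n), with x(n) = 0 for n < 0 (assumed initial condition)."""
    return x[n] if n >= 0 else 0


def fir_siso(x, a, b, c):
    """One output per iteration: y(n) = a*x(n) + b*x(n-1) + c*x(n-2)."""
    return [a * x_at(x, n) + b * x_at(x, n - 1) + c * x_at(x, n - 2)
            for n in range(len(x))]


def fir_parallel3(x, a, b, c):
    """Three outputs per iteration k, using the three MIMO equations."""
    if len(x) % 3 != 0:
        raise ValueError("len(x) must be a multiple of 3")
    y = []
    for k in range(len(x) // 3):
        y.append(a * x_at(x, 3 * k) + b * x_at(x, 3 * k - 1) + c * x_at(x, 3 * k - 2))
        y.append(a * x_at(x, 3 * k + 1) + b * x_at(x, 3 * k) + c * x_at(x, 3 * k - 1))
        y.append(a * x_at(x, 3 * k + 2) + b * x_at(x, 3 * k + 1) + c * x_at(x, 3 * k))
    return y


if __name__ == "__main__":
    samples = list(range(1, 10))
    print(fir_siso(samples, 2, 3, 5) == fir_parallel3(samples, 2, 3, 5))
```

Each loop iteration of `fir_parallel3` corresponds to one clock cycle of the parallel hardware, which handles three samples per cycle.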

21 Parallel system1 Whole system

22 Parallel system2

23 Notes for Parallel
The input/output data access scheme should be carefully designed; it can sometimes be very costly
Tclk > Tsample, fclk < fsample
Large hardware cost
Can be combined with pipeline processing

24 Retiming
A transformation technique used to change the locations of delay elements in a circuit without affecting the input/output characteristics
  Reducing the clock period
  Reducing the number of registers
  Reducing the power consumption

25 Reducing the Clock Period

26 Reducing the Number of Registers

27 Reducing the Power Consumption
Placing registers at the inputs of nodes with large capacitances can reduce the switching activities at these nodes

28 Unfolding
Unfolding is a transformation technique that can be applied to a DSP program to create a new program describing more than one iteration of the original program
  To reveal hidden concurrency so that the program can be scheduled to a smaller iteration period
  To design parallel architectures

29 Example
DSP algorithm: $y(n) = a x(n-9) + x(n)$
Replace n with 2k and 2k+1:
  $y(2k) = a x(2k-9) + x(2k) = a x(2(k-5)+1) + x(2k)$
  $y(2k+1) = a x(2k-8) + x(2k+1) = a x(2(k-4)) + x(2k+1)$
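The 2-unfolded equations can be checked against the original algorithm with a short Python sketch. It assumes x(n) = 0 for n < 0 and an even input length; the even and odd sample streams and the function names are introduced here only for illustration.

```python
def x_at(x, n):
    """Sample n of a stream, with 0 for n < 0 (assumed initial condition)."""
    return x[n] if n >= 0 else 0


def original(x, a):
    """y(n) = a*x(n-9) + x(n), one output per iteration."""
    return [a * x_at(x, n - 9) + x_at(x, n) for n in range(len(x))]


def unfolded2(x, a):
    """Two outputs per iteration k, following the unfolded equations."""
    if len(x) % 2 != 0:
        raise ValueError("len(x) must be even")
    even = x[0::2]  # even[k] = x(2k)
    odd = x[1::2]   # odd[k]  = x(2k+1)
    y = []
    for k in range(len(x) // 2):
        # y(2k) = a*x(2(k-5)+1) + x(2k): odd stream delayed by 5 iterations
        y.append(a * x_at(odd, k - 5) + even[k])
        # y(2k+1) = a*x(2(k-4)) + x(2k+1): even stream delayed by 4 iterations
        y.append(a * x_at(even, k - 4) + odd[k])
    return y


if __name__ == "__main__":
    samples = list(range(1, 25))
    print(original(samples, 3) == unfolded2(samples, 3))
```

The single 9-sample delay of the original turns into one 5-iteration delay and one 4-iteration delay in the unfolded program, so the total number of delay elements stays at 9.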

30 Example1

31 Example2

32 Example3

33 Folding
The folding transformation is used to systematically determine the control circuits in DSP architectures where multiple algorithm operations are time-multiplexed onto a single functional unit

34 Protocols

35 What is Hardware Design?
Hardware design: design the dataflow of the hardware first!
The same AXI example is simplified to the image below.
Concrete dataflow first; exact, low-level signals and protocol later.

36 Importance of Protocol in Hardware Design
Design as dataflow, implement as protocol.
Benefits:
  Reuse verification.
  Plug-and-play.
  Uniform code.
  Widely used and easy to understand.
Protocol must be simple:
  Handshake (2-wire)
  Streaming (1-wire)

37 The Simplest Streaming Protocol
A valid bit indicates whether the data bus holds valid data.

38 Code for Streaming Protocol
Simple to understand, easy to use

    input  logic i_valid, i_data;
    output logic o_valid, o_data;

    always_ff @(posedge clk or negedge rst) begin
        if (!rst) o_valid <= 0;
        else      o_valid <= i_valid;
    end

    // Clock-gating coding style: o_data is updated only when i_valid is high
    always_ff @(posedge clk or negedge rst) begin
        if (!rst)         o_data <= 0;
        else if (i_valid) o_data <= i_data;
    end
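The behavior of this streaming stage can be sketched as a cycle-by-cycle Python model. This is only an illustrative model, not RTL: reset is represented by the initial values, and the class and function names are hypothetical.

```python
class StreamStage:
    """One register stage of the valid-only (1-wire) streaming protocol."""

    def __init__(self):
        # State after reset: both registers are cleared
        self.o_valid = 0
        self.o_data = 0

    def clock(self, i_valid, i_data):
        """Apply one rising clock edge with the given inputs."""
        self.o_valid = i_valid
        if i_valid:
            # The data register is enabled only when the input is valid
            self.o_data = i_data


def run(stage, inputs):
    """Drive (valid, data) pairs and collect the data seen with o_valid = 1."""
    received = []
    for i_valid, i_data in inputs:
        stage.clock(i_valid, i_data)
        if stage.o_valid:
            received.append(stage.o_data)
    return received


if __name__ == "__main__":
    print(run(StreamStage(), [(1, 10), (0, 99), (1, 20), (0, 0)]))
```

When `i_valid` is low, `o_data` simply holds its previous value, so the data register does not toggle on idle cycles.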

39 Easy to Cascade Modules
You can easily add a new stage to add new functionality.
  Input → Module A → Module B → Output
  Input → Module A → Module C → Module B → Output

40 Easy to Cascade Modules
You can also easily broadcast signals.
(Diagram: Input, Module A, Module B, Output, Module D, Output)

41 But How About Merging?
Data might arrive in different cycles in a streaming interface.
(Diagram: Input, Module A, Module B, Output, Input, Module E)

42 The Improved Handshake Protocol
A valid bit indicates whether the data bus holds valid data.
A ready bit indicates whether the receiver can accept it.
(Waveform annotations: "ack is 0, wait 1 more cycle"; "Done in 1 cycle")

43 Code for Handshake Protocol
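    input  logic i_valid, o_ready, i_data;
    output logic o_valid, i_ready, o_data;

    // Core logic 1: when this stage can accept new data
    assign i_ready = o_ready || !o_valid;

    // Core logic 2: when this stage holds valid data in the next cycle
    always_ff @(posedge clk or negedge rst) begin
        if (!rst) o_valid <= 0;
        else      o_valid <= i_valid || (o_valid && !o_ready);
    end

    // Clock-gating coding style: o_data is updated only on an accepted transfer
    always_ff @(posedge clk or negedge rst) begin
        if (!rst)                    o_data <= 0;
        else if (i_valid && i_ready) o_data <= i_data;
    end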

44 Code for Handshake Protocol
    assign i_ready = o_ready || !o_valid;

  If the next stage is ready → ready to accept
  Or, you are empty → ready to accept

    o_valid <= i_valid || (o_valid && !o_ready);

  Have input data → has data at the next cycle
  Or, have data but can't pass it to the next stage
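The two core equations can be exercised with a small cycle-based Python model that checks data is neither lost nor duplicated when the receiver stalls. This is a sketch under assumptions: the source always offers the next pending item, reset is represented by the initial values, and the class and function names are hypothetical.

```python
class HandshakeStage:
    """One register stage of the valid/ready (2-wire) handshake protocol."""

    def __init__(self):
        self.o_valid = 0
        self.o_data = 0

    def i_ready(self, o_ready):
        # assign i_ready = o_ready || !o_valid;
        return int(o_ready or not self.o_valid)

    def clock(self, i_valid, i_data, o_ready):
        """Apply one rising clock edge using the values seen before the edge."""
        i_ready = self.i_ready(o_ready)
        # o_valid <= i_valid || (o_valid && !o_ready);
        next_valid = int(i_valid or (self.o_valid and not o_ready))
        if i_valid and i_ready:
            self.o_data = i_data
        self.o_valid = next_valid


def simulate(items, ready_pattern):
    """Send items through one stage; ready_pattern gives o_ready per cycle."""
    stage = HandshakeStage()
    pending = list(items)
    received = []
    for o_ready in ready_pattern:
        i_valid = int(bool(pending))
        i_data = pending[0] if pending else 0
        # The sink takes the output when both valid and ready are high
        if stage.o_valid and o_ready:
            received.append(stage.o_data)
        accepted = i_valid and stage.i_ready(o_ready)
        stage.clock(i_valid, i_data, o_ready)
        if accepted:
            pending.pop(0)  # the source advances only on an accepted transfer
    return received


if __name__ == "__main__":
    print(simulate([1, 2, 3], [1, 0, 0, 1, 1, 0, 1, 1]))
```

While `o_ready` is low and the stage is full, `i_ready` drops, so the source keeps presenting the same item until the stall ends.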

45 Handshake can Handle Datapath Merging
Wait until both inputs are ready; then you are ready.
(Diagram: Input, Module A, Module B, Output, Input, Module E)
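One common way to realize "wait until both are ready" is a combinational join of the two handshake inputs. The Python sketch below is an assumption about how such a merge could be wired, not the logic shown on the slide; the function name and the signal names `valid_a` and `valid_e` are hypothetical.

```python
def join(valid_a, valid_e, o_ready):
    """Combinational join of two handshake inputs into one output.

    Returns (o_valid, ready_a, ready_e). The output is valid only when both
    inputs are valid, and both inputs are acknowledged in the same cycle,
    only when the downstream stage is also ready.
    """
    o_valid = int(valid_a and valid_e)
    fire = int(o_valid and o_ready)
    return o_valid, fire, fire


if __name__ == "__main__":
    for va, ve, rdy in [(1, 0, 1), (1, 1, 0), (1, 1, 1)]:
        print((va, ve, rdy), "->", join(va, ve, rdy))
```

Because neither input is acknowledged until the other has arrived, the earlier input is held by its sender and the two data items stay paired.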

46 Brief Summary
Streaming (1-wire) protocol.
  Very simple to use.
  But be sure the receiver can always accept the data, since it cannot stall the sender.
Handshake (2-wire) protocol.
  Can stop the data input.
  Very commonly used!
Both make easy-to-understand hardware pipelines.
Both are widely used in industry.

