Case Study VLSI 系統設計與高階合成. + + +           + : delay : multiplier: adder … + + + + + … + … … FIR Filter tap=4 IIR Case - Filter (1/8)

Slides:



Advertisements
Similar presentations
VERILOG: Synthesis - Combinational Logic Combination logic function can be expressed as: logic_output(t) = f(logic_inputs(t)) Rules Avoid technology dependent.
Advertisements

//HDL Example 8-2 // //RTL description of design example (Fig.8-9) module Example_RTL (S,CLK,Clr,E,F,A);
Counters Discussion D8.3.
Verilog in transistor level using Microwind
CPSC 321 Computer Architecture Andreas Klappenecker
CDA 3100 Recitation Week 11.
The Verilog Hardware Description Language
Supplement on Verilog adder examples
Synchronous Sequential Logic
Combinational Logic.
Multiplication and Division
Verilog Modules for Common Digital Functions
Table 7.1 Verilog Operators.
ECE Synthesis & Verification - Lecture 2 1 ECE 667 Spring 2011 ECE 667 Spring 2011 Synthesis and Verification of Digital Circuits High-Level (Architectural)
COE 405 Design and Synthesis of DataPath Controllers Dr. Aiman H. El-Maleh Computer Engineering Department King Fahd University of Petroleum & Minerals.
CSE 201 Computer Logic Design * * * * * * * Verilog Modeling
//HDL Example 5-1 // //Description of D latch (See Fig.5-6) module D_latch (Q,D,control); output Q; input.
Verilog. 2 Behavioral Description initial:  is executed once at the beginning. always:  is repeated until the end of simulation.
1 Brief Introduction to Verilog Weiping Shi. 2 What is Verilog? It is a hardware description language Originally designed to model and verify a design.
//HDL Example 6-1 // //Behavioral description of //Universal shift register // Fig. 6-7 and Table 6-3 module shftreg.
Latches and Flip-Flops Discussion D8.1 Section 13-9.
FSM examples.
ELEN 468 Lecture 151 ELEN 468 Advanced Logic Design Lecture 15 Synthesis of Language Construct I.
Pulse-Width Modulated DAC
OUTLINE Introduction Basics of the Verilog Language Gate-level modeling Data-flow modeling Behavioral modeling Task and function.
Ring Counter Discussion 11.3 Example 32.
1 Lecture 5: A DataPath and Control Unit. 2 Finite State Machine and Datapath Design a simple datapath and a control unit data path contains: compare.
Arbitrary Waveform Discussion 12.2 Example 34. Recall Divide-by-8 Counter Use q2, q1, q0 as inputs to a combinational circuit to produce an arbitrary.
Counters Discussion 12.1 Example 33. Counters 3-Bit, Divide-by-8 Counter 3-Bit Behavioral Counter in Verilog Modulo-5 Counter An N-Bit Counter.
2-to-1 Multiplexer: if Statement Discussion D7.1 Example 4.
CS 61C Discussion 10 (1) Jaein Jeong Fall input MUX °Out = in0 * select’ + in1 * select in0in1selectout
A/D Converter Datapaths Discussion D8.4. Analog-to-Digital Converters Converts analog signals to digital signals –8-bit: 0 – 255 –10-bit: 0 – 1023 –12-bit:
Registers and Shift Registers Discussion D8.2. D Flip-Flop X 0 Q 0 ~Q 0 D CLK Q ~Q D gets latched to Q on the rising edge of the clock. Positive.
Generic Multiplexers: Parameters Discussion D7.5 Example 8.
Lessons from last lab: 1.Many had the “# skipping” problem 2.Most assumed there was something wrong with their code 3.How does one check their code? 4.SIMULATE!!
D Flip-Flops in Verilog Discussion 10.3 Example 27.
Quad 2-to-1 Multiplexer Discussion D7.4 Example 7.
Engineering 100 Section 250 Combinational Logic -- Examples 9/13/2010.
RTL Coding tips Lecture 7,8 Prepared by: Engr. Qazi Zia, Assistant Professor EED, COMSATS Attock.
Lecture 9: Structure for Discrete-Time System XILIANG LUO 2014/11 1.
Week Four Design & Simulation Example slides. Agenda Review the tiny example (Minako “logic”)from last week – look at the detailed static timing report.
數位系統實驗 Experiment on Digital System Lab06: Verilog HDL and FPGA (2) 負責助教:葉俊顯 stanley.
1 Workshop Topics - Outline Workshop 1 - Introduction Workshop 2 - module instantiation Workshop 3 - Lexical conventions Workshop 4 - Value Logic System.
1 Workshop Topics - Outline Workshop 1 - Introduction Workshop 2 - module instantiation Workshop 3 - Lexical conventions Workshop 4 - Value Logic System.
ECE/CS 352 Digital System Fundamentals© 2001 C. Kime 1 ECE/CS 352 Digital Systems Fundamentals Spring 2001 Chapters 3 and 4: Verilog – Part 2 Charles R.
Area: VLSI Signal Processing.
Traffic Lights Discussion D8.3a. Recall Divide-by-8 Counter Use Q2, Q1, Q0 as inputs to a combinational circuit to produce an arbitrary waveform. s0 0.
Final Project. System Overview Description of Inputs reset: When LOW, a power on reset is performed. mode: When LOW, NORMal mode selected When HIGH,
1 Arithmetic, ALUs Lecture 9 Digital Design and Computer Architecture Harris & Harris Morgan Kaufmann / Elsevier, 2007.
Digital Electronics.
SYEN 3330 Digital SystemsJung H. Kim 1 SYEN 3330 Digital Systems Chapter 7 – Part 2.
OUTLINE Introduction Basics of the Verilog Language Gate-level modeling Data-flow modeling Behavioral modeling Task and function.
Single Cycle Processor Design using Verilog
SYEN 3330 Digital SystemsJung H. Kim Chapter SYEN 3330 Digital Systems Chapters 4 – Part4: Verilog – Part 2.
1 Verilog: Function, Task Verilog: Functions A function call is an operand in an expression. It is called from within the expression and returns a value.
1 Modeling Synchronous Logic Circuits Debdeep Mukhopadhyay Associate Professor Dept of Computer Science and Engineering NYU Shanghai and IIT Kharagpur.
1 Lecture 3: Modeling Sequential Logic in Verilog HDL.
Figure Implementation of an FSM in a CPLD..
Introduction to FPGAs Getting Started with Xilinx.
Lecture 10 Flip-Flops/Latches
Supplement on Verilog FF circuit examples
Supplement on Verilog for Algorithm State Machine Chart
Reg and Wire:.
Supplement on Verilog Sequential circuit examples: FSM
Pulse-Width Modulation (PWM)
FSM MODELING MOORE FSM MELAY FSM. Introduction to DIGITAL CIRCUITS MODELING & VERIFICATION using VERILOG [Part-2]
Verilog.
VLSI Programming 2IMN35 Lab 1 Questionnaire
Supplement on Verilog Sequential circuit examples: FSM
The Verilog Hardware Description Language
Presentation transcript:

Case Study VLSI 系統設計與高階合成

+ + +           + : delay : multiplier: adder … … + … … FIR Filter tap=4 IIR Case - Filter (1/8)

module fir1(a, b, c, y); input [7:0] a, b, c; output [11:0] y; assign y = a*2+b*4+c*6; endmodule module fir2(a, b, c, y); input [7:0] a, b, c; output [11:0] y; reg [11:0] y; or b or c) y = a*2+b*4+c*6; endmodule tap=3 h 0 =2; h 1 =4, h 2 =6; Case - Filter (2/8) X X X + + 2 a 4 b 6 c y Output: 20, 32, 44, 56, 68, 80, 92, 104, …. delay delay

module fir3(a, b, c, y, ck); input [7:0] a, b, c; input ck; output [11:0] y; reg [11:0] y; ck) y = a*2+b*4+c*6; endmodule X X X + + 2 a 4 b 6 c y reg ck Case - Filter (3/8) Three inputs (a, b, c) must be entered concurrently (more pins, higher cost). delay delay Output: 20, 32, 44, 56, 68, 80, 92, 104, …. Stable output The stable output is generated at the positive edge of clock.

fir2 Case - Filter (4/8) fir2 fir3 Unstable output Stable output

module register(in, out, ck); input [11:0] in; input ck; output [11:0] out; reg [11:0] out; ck) out=in; endmodule `include "register.v" module ffir1(in, out, ck); input [11:0] in; input ck; output [11:0] out; wire [11:0] x4, x3, x2, x1; register r1(in, x3, ck); register r2(x3, x2, ck); register r3(x2, x1, ck); register r4(x4, out, ck) assign x4=x1*6+x2*4+x3*2; endmodule T(n+1)  T(n)  ffir1 Case - Filter (5/8) One input (in) is entered every clock cycle (more suitable for memory access and pins’ cost) ** * in + + out x(3) x(2) x(1) 24 6 module ffir1_a(in, out, ck); input ck; input [11:0] in; output [11:0] out; reg [11:0] out; reg [11:0] x4, x3, x2, x1; ck) beginx3<=in;x2<=x3;x1<=x2;out<=(x3*2+x2*4)+x1*6;endendmodule

Case - Filter (6/8) ffir1 ** * in + + x(3) x(2) x(1) 246 out Output: 20, 32, 44, 56, 68, 80, 92, 104, …. Correct results start here (4th clock), why ? T(4)  T(3)  T(2)  2 1 X T(1)  1 X X T(0)  X X X To work well, every input must be ready before the positive edge of every clock

`include "register.v" module ffir2(in, out, ck); input [11:0] in; input ck; output [11:0] out; wire [11:0] x4, x3, x2, x1; wire [11:0] t3, t2, t1; wire [11:0] y3, y2, y1; register r1(in, x3, ck); register r2(x3, x2, ck); register r3(x2, x1, ck); assign t3=x3*2; assign t2=x2*4; assign t1=x1*6; register r4(t3, y3, ck); register r5(t2, y2, ck); register r6(t1, y1, ck); assign x4=y1+y2+y3; register r7(x4, out, ck); endmodule ffir2 Case - Filter (7/8) ** * in + + out x(3)x(2) x(1) 24 6 module ffir2_a(in, out, ck); input ck; input [11:0] in; output [11:0] out; reg [11:0] out; reg [11:0] x3, x2, x1; reg [11:0] y3, y2, y1; ck) beginx3<=in;x2<=x3;x1<=x2;y3<=x3*2;y2<=x2*4;y1<=x1*6;out<=(y3+y2)+y1;endendmodule Datapath Pipelining y1 y2 y3

ffir2 Case - Filter (8/8) ** * in + + out x(3)x(2) x(1) 24 6 T(4)  T(3)  T(2)  2 1 X T(1)  1 X X Output: 20, 32, 44, 56, 68, 80, 92, 104, …. Correct results start here (5th clock) ?? Delay for is about 7.3 ns Delay for register assign is about 6.1 ns Delay for + is about 2.6 ns * Total delay=2.6*2+6.1=11.3 ns Total delay= =13.4 ns (+ little wire delay) Critical path=13.4 ns => clock rate less than 1/(13.4*10 -9 ) ~= 74.6 MHz 1/(13.4*10 -9 ) ~= 74.6 MHz

Systolic Array (FIR) + : adder  : multiplier R : register h i : coefficient R R hihi xixi x i+1 y i+1 yiyi  + Processing Element PE 0 PE 1 PE 2 PE 3 Case - Systolic Array (1/5)

Case - Systolic Array (2/5)

Case - Systolic Array (3/5)

R R  + in_x out_x in_y out_y coeff mult_out module pe(clk, reset, coeff, in_x, in_y, out_x, out_y); parameter size = 8; input clk, reset; input [size-1:0] in_x, coeff; input [size+size-1:0] in_y; output [size-1:0] out_x; output [size+size-1:0] out_y; wire [size+size-1:0] mult_out, add_out; reg_8 r1(clk, reset, in_x, out_x); reg_16 r2(clk, reset, add_out, out_y); assign mult_out = in_x * coeff; assign add_out = mult_out + in_y; endmodule Design_1 module pe(clk, reset, coeff, in_x, in_y, out_x, out_y); parameter size = 8; input clk, reset; input [size-1:0] in_x, coeff; input [size+size-1:0] in_y; output [size-1:0] out_x; output [size+size-1:0] out_y; reg [size+size-1:0] out_y; reg [size-1:0] out_x; clk) begin if(reset) begin out_x = 0; out_y = 0; end else begin out_y = in_y + (in_x * coeff); out_x = in_x; end endmodule Case - Systolic Array (4/5)

PE 0 PE 1 PE 2 PE 3 //***** main **************************** module systolic(clk, reset, input_x, output_y); parameter size = 8; input clk, reset; input [size-1:0] input_x; output [size+size-1:0] output_y; wire [size-1:0] pe0_x, pe1_x, pe2_x, pe3_x; wire [size+size-1:0] pe1_y, pe2_y, pe3_y; wire [size-1:0] h0 = 8'h01; wire [size-1:0] h1 = 8'h01; wire [size-1:0] h2 = 8'h01; wire [size-1:0] h3 = 8'h01; wire [size+size-1:0] pe4_y = 16'h0000; pe pe_0(clk, reset, h0, input_x, pe1_y, pe0_x, output_y); pe pe_1(clk, reset, h1, pe0_x, pe2_y, pe1_x, pe1_y); pe pe_2(clk, reset, h2, pe1_x, pe3_y, pe2_x, pe2_y); pe pe_3(clk, reset, h3, pe2_x, pe4_y, pe3_x, pe3_y); endmodule //***** register_8bits ******* module reg_8(clk,reset,in,out); parameter size_in = 8; inputclk,reset; input[size_in-1:0] in; output[size_in-1:0] out; reg[size_in-1:0] out; clk ) begin if(reset) out=0; else out=in; end endmodule //***** register_16bits ******* module reg_16(clk,reset,in,out); parameter size_in = 16; inputclk,reset; input[size_in-1:0] in; output[size_in-1:0] out; reg[size_in-1:0] out; clk ) begin if(reset) out=0; else out=in; end endmodule Case - Systolic Array (5/5)

R : Register  : ADD  : MPY Matrix Multiplication c=c+(a*b) Case-Matrix Multiplication (1/2) `include "reg_8.v" `include "reg_16.v" module PE(clk,reset,in_a,in_b,out_a,out_b,out_c); parameter data_size=8; inputreset,clk; input[data_size-1:0] in_a,in_b; output[data_size-1:0] out_a,out_b; output[2*data_size-1:0] out_c; wire[2*data_size-1:0] out_c,ADD_out,out_MPY; wire [data_size-1:0] out_a,out_b; assign out_MPY=in_a*in_b; assign ADD_out=out_MPY+out_c; reg_16 reg_16_0(clk,reset,ADD_out,out_c); reg_8 reg_delay_8_0(clk,reset,in_a,out_a); reg_8 reg_delay_8_1(clk,reset,in_b,out_b); endmodule

module PE_H(reset,clk,in_a,in_b,out_a,out_b,out_c); parameter data_size=8; inputreset,clk; input[data_size-1:0] in_a,in_b; output[2*data_size:0] out_c; output[data_size-1:0] out_a,out_b; reg[2*data_size:0] out_c; reg [data_size-1:0] out_a,out_b; clk) begin if(reset) begin out_a=0; out_b=0; out_c=0; end else begin out_c=out_c+in_a*in_b; out_a=in_a; out_b=in_b; end endmodule Case-Matrix Multiplication (2/2) The rest of circuit can be designed easily….