Introduction to Bluespec: A new methodology for designing Hardware

Arvind
Computer Science & Artificial Intelligence Lab, Massachusetts Institute of Technology
February 8, 2010
http://csg.csail.mit.edu/6.375

What is needed to make hardware design easier
- Extreme IP (“Intellectual Property”) reuse
  - Multiple instantiations of a block for different performance and application requirements
  - Packaging of IP so that the blocks can be assembled easily to build a large system (black box model)
- Ability to do modular refinement
- Whole system simulation to enable concurrent hardware-software development

IP Reuse sounds wonderful until you try it ...
Example: Commercially available FIFO IP block
[Figure: FIFO block with ports data_in, push_req_n, pop_req_n, clk, rstn, data_out, full, empty]

Bluespec promotes composition through guarded interfaces
[Figure: theModuleA and theModuleB call the methods of a FIFO instance, theFifo. Each method (enq, deq, first) has rdy and enab signals; enq is guarded by "not full", deq and first by "not empty".]
Method calls shown in the figure:
    theFifo.enq(value1);
    theFifo.deq();
    value2 = theFifo.first();
    theFifo.enq(value3);
    value4 = theFifo.first();

Bluespec: A new way of expressing behavior using Guarded Atomic Actions
- Formalizes composition
  - Modules with guarded interfaces
  - Compiler manages connectivity (muxing and associated control)
- Powerful static elaboration facility
  - Permits parameterization of designs at all levels
- Transaction level modeling
  - Allows C and Verilog codes to be encapsulated in Bluespec modules

Bluespec: State and Rules organized into modules
[Figure: modules, each wrapped by an interface]
- All state (e.g., Registers, FIFOs, RAMs, ...) is explicit.
- Behavior is expressed in terms of atomic actions on the state:
      Rule: guard → action
- Rules can manipulate state in other modules only via their interfaces.

GCD: A simple example to explain hardware generation from Bluespec

Programming with rules: A simple example
Euclid’s algorithm for computing the Greatest Common Divisor (GCD), starting from the pair:
    15 6
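
The slide shows only the starting pair. As a minimal sketch, the Python model below replays Euclid's algorithm using the same swap and subtract conditions as the mkGCD rules in this deck. The function name gcd_trace is hypothetical, and the model assumes both inputs are positive.

```python
def gcd_trace(x, y):
    """Euclid's algorithm by swap/subtract; records every (x, y) pair.

    Assumes x > 0 and y >= 0 (with x == 0 the subtract step never makes progress).
    """
    trace = [(x, y)]
    while y != 0:
        if x > y:
            x, y = y, x      # swap: fires when x > y and y != 0
        else:
            y = y - x        # subtract: fires when x <= y and y != 0
        trace.append((x, y))
    return x, trace


result, trace = gcd_trace(15, 6)
for pair in trace:
    print(pair)
print("gcd =", result)
```

The result is read from x once y reaches 0, which mirrors the guard on the result method.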

GCD in BSV
    module mkGCD (I_GCD);
       Reg#(Int#(32)) x <- mkRegU;
       Reg#(Int#(32)) y <- mkReg(0);

       rule swap ((x > y) && (y != 0));
          x <= y; y <= x;
       endrule

       rule subtract ((x <= y) && (y != 0));
          y <= y - x;
       endrule

       method Action start(Int#(32) a, Int#(32) b) if (y == 0);
          x <= a; y <= b;
       endmethod

       method Int#(32) result() if (y == 0);
          return x;
       endmethod
    endmodule
Assume a != 0.
[Figure: registers x and y with the swap and sub logic]

GCD Hardware Module
    interface I_GCD;
       method Action start (Int#(32) a, Int#(32) b);
       method Int#(32) result();
    endinterface
[Figure: module GCD with the start method (two Int#(32) inputs, enab, rdy) and the result method (Int#(32) output, rdy). The implicit condition of both methods is y == 0.]

GCD: Another implementation
Combine the swap and subtract rules:
    module mkGCD (I_GCD);
       Reg#(Int#(32)) x <- mkRegU;
       Reg#(Int#(32)) y <- mkReg(0);

       rule swapANDsub ((x > y) && (y != 0));
          x <= y; y <= x - y;
       endrule

       rule subtract ((x <= y) && (y != 0));
          y <= y - x;
       endrule

       method Action start(Int#(32) a, Int#(32) b) if (y == 0);
          x <= a; y <= b;
       endmethod

       method Int#(32) result() if (y == 0);
          return x;
       endmethod
    endmodule
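
To see why combining the rules can help, the following Python sketch counts rule firings for both rule sets on the same input. It is a software model only, not generated hardware: the helper name run and the lists original and combined are hypothetical, and one firing stands in for one clock cycle.

```python
def run(rules, x, y):
    """Fire one enabled rule per step until none is enabled.

    rules is a list of (guard, action) pairs over the state (x, y).
    Returns (final x, number of rule firings).
    """
    steps = 0
    while True:
        for guard, action in rules:
            if guard(x, y):
                x, y = action(x, y)
                steps += 1
                break
        else:
            return x, steps   # no guard holds: y == 0, x is the result


original = [
    (lambda x, y: x > y and y != 0, lambda x, y: (y, x)),        # swap
    (lambda x, y: x <= y and y != 0, lambda x, y: (x, y - x)),   # subtract
]
combined = [
    (lambda x, y: x > y and y != 0, lambda x, y: (y, x - y)),    # swapANDsub
    (lambda x, y: x <= y and y != 0, lambda x, y: (x, y - x)),   # subtract
]

print(run(original, 15, 6))   # (result, firings) for swap + subtract
print(run(combined, 15, 6))   # (result, firings) for swapANDsub + subtract
```

For the pair (15, 6) the combined rule set needs fewer firings because every swap also performs a subtraction.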

Bluespec Tool flow
[Figure: tool flow diagram with the elements Bluespec SystemVerilog source, Bluespec Compiler, Verilog 95 RTL, Verilog sim, VCD output, Debussy Visualization, RTL synthesis, gates, C, Bluesim (Cycle Accurate), Physical Place & Route, Tapeout, FPGA, Power estimation tool]
Works in conjunction with existing tool flows

Generated Verilog RTL: GCD
    module mkGCD(CLK, RST_N, start_a, start_b, EN_start, RDY_start,
                 result, RDY_result);
      input  CLK;
      input  RST_N;

      // action method start
      input  [31 : 0] start_a;
      input  [31 : 0] start_b;
      input  EN_start;
      output RDY_start;

      // value method result
      output [31 : 0] result;
      output RDY_result;

      // register x and y
      reg  [31 : 0] x;
      wire [31 : 0] x$D_IN;
      wire x$EN;
      reg  [31 : 0] y;
      wire [31 : 0] y$D_IN;
      wire y$EN;
      ...
      // rule RL_subtract
      assign WILL_FIRE_RL_subtract = x_SLE_y___d3 && !y_EQ_0___d10 ;

      // rule RL_swap
      assign WILL_FIRE_RL_swap = !x_SLE_y___d3 && !y_EQ_0___d10 ;

Generated Hardware
    rule swap ((x>y)&&(y!=0));
       x <= y; y <= x;
    endrule

    rule subtract ((x<=y)&&(y!=0));
       y <= y - x;
    endrule
[Figure: datapath with registers x and y (enables x_en, y_en), a > comparator, a !(=0) test, a subtractor (sub), the signals swap? and subtract?, and the module ports start, result, rdy]
x_en = ?
y_en = ?

Generated Hardware Module
[Figure: the same datapath (registers x and y, > comparator, !(=0) test, sub, swap?, subtract?) wrapped as a module with ports start, en, rdy, result]
x_en = swap?
y_en = swap? OR subtract?
rdy =

GCD: A Simple Test Bench
    module mkTest ();
       Reg#(Int#(32)) state <- mkReg(0);
       I_GCD gcd <- mkGCD();

       rule go (state == 0);
          gcd.start (423, 142);
          state <= 1;
       endrule

       rule finish (state == 1);
          $display ("GCD of 423 & 142 =%d", gcd.result());
          state <= 2;
       endrule
    endmodule

GCD: Test Bench
Feeds all pairs (c1,c2), 1 <= c1 <= 7, 1 <= c2 <= 63, to GCD
    module mkTest ();
       Reg#(Int#(32)) state <- mkReg(0);
       Reg#(Int#(4))  c1 <- mkReg(1);
       Reg#(Int#(7))  c2 <- mkReg(1);
       I_GCD gcd <- mkGCD();

       rule req (state==0);
          gcd.start(signExtend(c1), signExtend(c2));
          state <= 1;
       endrule

       rule resp (state==1);
          $display ("GCD of %d & %d =%d", c1, c2, gcd.result());
          if (c1==7) begin c1 <= 1; c2 <= c2+1; end
          else c1 <= c1+1;
          if (c1==7 && c2==63) state <= 2;
          else state <= 0;
       endrule
    endmodule

GCD: Synthesis results
                        Clock period   Area
    Original (16 bits)  1.6 ns         4240 µm²
    Unrolled (16 bits)  1.65 ns        5944 µm²
Unrolled takes 31% fewer cycles on the testbench

Need for a rule scheduler

Example 1
    rule ra (z > 10);
       x <= x + 1;
    endrule

    rule rb (z > 20);
       y <= y + 1;
    endrule
[Figure: z feeds the >10 and >20 comparators, which drive x_en and y_en; x and y each feed a +1 incrementer]
Can these rules be enabled together? Can they be executed concurrently?

Example 2
    rule ra (z > 10);
       x <= y + 1;
    endrule

    rule rb (z > 20);
       y <= x + 1;
    endrule
[Figure: z feeds the >10 and >20 comparators, which drive x_en and y_en; the +1 incrementers cross-connect x and y]
Can these rules be enabled together? Can they be executed concurrently?

Example 3
    rule ra (z > 10);
       x <= y + z;
    endrule

    rule rb (z > 20);
       y <= y + z;
    endrule
Can these rules be enabled together? Can they be executed concurrently?
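
One way to explore the three examples is to compare a parallel execution, where both rules read the old state, against the two one-rule-at-a-time orders. The Python sketch below does this for a single sample state, so it illustrates the question rather than proving anything in general. The helper names parallel, sequential and matching_orders are hypothetical, and the guards are left out because the chosen z = 30 satisfies both.

```python
def parallel(state, rules):
    """All rules read the old state; their writes are applied together."""
    updates = {}
    for rule in rules:
        updates.update(rule(state))
    return {**state, **updates}


def sequential(state, rules):
    """Rules fire one at a time; each sees the previous rule's writes."""
    for rule in rules:
        state = {**state, **rule(state)}
    return state


def matching_orders(state, ra, rb):
    """Sequential orders whose final state equals the parallel final state."""
    target = parallel(state, [ra, rb])
    orders = {"ra<rb": [ra, rb], "rb<ra": [rb, ra]}
    return [name for name, order in orders.items()
            if sequential(state, order) == target]


s = {"x": 1, "y": 5, "z": 30}   # sample state; z > 20, so both guards hold

ex1 = (lambda s: {"x": s["x"] + 1}, lambda s: {"y": s["y"] + 1})
ex2 = (lambda s: {"x": s["y"] + 1}, lambda s: {"y": s["x"] + 1})
ex3 = (lambda s: {"x": s["y"] + s["z"]}, lambda s: {"y": s["y"] + s["z"]})

for name, (ra, rb) in [("Example 1", ex1), ("Example 2", ex2), ("Example 3", ex3)]:
    print(name, matching_orders(s, ra, rb))
```

For this state, Example 1 matches both orders, Example 2 matches neither, and Example 3 matches only the order in which ra appears to fire before rb.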

GAA Execution model
Repeatedly:
- Select a rule to execute
- Compute the state updates
- Make the state updates
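
The three steps above can be sketched as a small Python loop. This is an illustrative model under stated assumptions, not how the Bluespec compiler or simulator is implemented: the function run_gaa, the dictionary-based rule format, and the random choice among enabled rules are all hypothetical.

```python
import random


def run_gaa(state, rules, max_steps=1000, seed=0):
    """One-rule-at-a-time execution: stops when no rule is enabled."""
    rng = random.Random(seed)
    for _ in range(max_steps):
        enabled = [r for r in rules if r["guard"](state)]
        if not enabled:
            break
        rule = rng.choice(enabled)         # select a rule to execute
        updates = rule["action"](state)    # compute the state updates
        state = {**state, **updates}       # make the state updates
    return state


# Two independent, bounded counters (hypothetical example rules).
rules = [
    {"guard": lambda s: s["x"] < 3, "action": lambda s: {"x": s["x"] + 1}},
    {"guard": lambda s: s["y"] < 2, "action": lambda s: {"y": s["y"] + 1}},
]

final = run_gaa({"x": 0, "y": 0}, rules)
print(final)
```

The order in which the two rules fire depends on the seed, but the final state does not, because neither rule reads or writes the other's register.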

Rule: As a State Transformer
A rule may be decomposed into two parts p(s) and d(s) such that
    s_next = if p(s) then d(s) else s
- p(s) is the condition (predicate) of the rule, a.k.a. the “CAN_FIRE” signal of the rule. p is a conjunction of explicit and implicit conditions.
- d(s) is the “state transformation” function, i.e., it computes the next-state values from the current state values.
Notes: Abstractly, we can think of a rule as having two parts, a pi function (p) and a delta function (d). The pi function tells us whether the rule can be applied to a term s. If pi evaluates to true, then the delta function tells us what the new term is. If pi is false, the rule cannot change s.

Compiling a Rule
    rule r (f.first() > 0);
       x <= x + 1;
       f.deq();
    endrule
[Figure: the current state (f, x) feeds two blocks of combinational logic, p and d; p drives the enable and d drives the next state values (f, x)]
- p = enabling condition, computed from the current state (rdy signals, read methods)
- d = action signals & values (enable signals, action parameters)
Notes: In a circuit, pi maps to combinational logic that looks at the current state and generates a boolean enable signal for this rule. The delta function is another block of combinational logic that computes the next state value from the current state value. Actually, delta has to compute the control signals to set the state element to the new value.
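
As a minimal sketch of this decomposition, the Python functions below play the roles of p and d for rule r. The state layout (a dict holding a deque f and an integer x) and the function name step are assumptions made for illustration; the implicit condition of f.first() and f.deq() is modelled as "f is not empty".

```python
from collections import deque


def p(state):
    """Enabling condition: implicit guard (f not empty) AND explicit guard."""
    f = state["f"]
    return len(f) > 0 and f[0] > 0


def d(state):
    """Next-state values computed from the current state, without mutating it."""
    f = deque(state["f"])
    f.popleft()                            # models f.deq()
    return {"x": state["x"] + 1, "f": f}   # models x <= x + 1


def step(state):
    """s_next = if p(s) then d(s) else s"""
    return d(state) if p(state) else state


s0 = {"x": 0, "f": deque([5, -1])}
s1 = step(s0)   # the rule fires: head of f is 5 > 0
s2 = step(s1)   # the rule does not fire: head of f is -1
print(s1, s2)
```

Keeping d free of side effects mirrors the hardware picture, where the next-state values are computed combinationally and only latched when the enable is asserted.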

Combining State Updates: strawman
[Figure: the p's from the rules that update R (p1 ... pn) are ORed to form the latch enable of R; the d's from the rules that update R (d1,R ... dn,R) are combined through an OR to form the next state value of R]
Notes: After mapping all the rules, we have to combine their logic somehow. For a particular state element like the PC register, the latch enable is the OR of the enable signals from all the rules that update PC. The actual next state value of PC has to be selected through a multiplexer. Notice that this circuit works only if at most one of these pi signals is asserted at a time.

Combining State Updates
[Figure: the p's from all the rules (p1 ... pn) feed a scheduler (a priority encoder) that produces f1 ... fn; the f's are ORed to form the latch enable of R and select among the d's from the rules that update R (d1,R ... dn,R) to form the next state value]
Scheduler ensures that at most one fi is true
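
The priority-encoder scheduler and the update logic for a single register R can be modelled in a few lines of Python. This is a behavioural sketch with hypothetical names (priority_encode, next_value); it assumes, for simplicity, that every rule in the list updates R and that earlier rules in the list have higher priority.

```python
def priority_encode(p):
    """Turn the p's into f's with at most one True: the first asserted p wins."""
    f, granted = [], False
    for can_fire in p:
        f.append(can_fire and not granted)
        granted = granted or can_fire
    return f


def next_value(r, f, d):
    """Latch enable is the OR of the f's; the mux selects the matching d."""
    enable = any(f)
    selected = [d_i for f_i, d_i in zip(f, d) if f_i]
    return selected[0] if enable else r


p = [False, True, True]     # rules 1 and 2 can fire
d = [10, 20, 30]            # value each rule would write to R
f = priority_encode(p)
print(f, next_value(7, f, d))
```

Because at most one f is true, the selection in next_value is unambiguous, which is exactly the property the strawman circuit lacked.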

A compiler can determine if two rules can be executed in parallel without violating the one-rule-at-a-time semantics
James Hoe, Ph.D., 2000

Scheduling and control logic
[Figure: Modules (Current state) feed the Rules, each with a cond (p1 ... pn) and an action (d1 ... dn). The p's are the “CAN_FIRE” signals into the Scheduler, which produces the “WILL_FIRE” signals f1 ... fn. The f's and d's go through Muxing into Modules (Next state).]
Compiler synthesizes a scheduler such that at any given time f’s for only non-conflicting rules are true
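
The idea that only non-conflicting rules get their f asserted together can be sketched with a greedy model in Python. How the Bluespec compiler actually builds its scheduler is not described on this slide, so the function schedule, the fixed priority order, and the explicit conflict set below are all assumptions made for illustration.

```python
def schedule(can_fire, conflicts):
    """Greedy scheduler: rules are considered in priority (list) order.

    can_fire:  list of booleans, one per rule (the CAN_FIRE signals)
    conflicts: set of frozenset({i, j}) pairs that must not fire together
    returns:   list of booleans, one per rule (the WILL_FIRE signals)
    """
    will_fire = [False] * len(can_fire)
    for i, ok in enumerate(can_fire):
        blocked = any(will_fire[j] and frozenset({i, j}) in conflicts
                      for j in range(i))
        if ok and not blocked:
            will_fire[i] = True
    return will_fire


# Hypothetical conflict information: rules 0 and 1 conflict, rule 2 is independent.
conflicts = {frozenset({0, 1})}

print(schedule([True, True, True], conflicts))
print(schedule([False, True, True], conflicts))
```

Unlike a pure priority encoder, this model lets several rules fire in the same step as long as no conflicting pair is selected.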

The plan
- Express combinational circuits in Bluespec
- Express synchronous pipelines
  - single-rule systems; no scheduling issues
- Multiple rule systems and concurrency issues
  - Eliminating dead cycles
- Asynchronous pipelines and processors
Each idea will be illustrated via examples.
No discussion of Bluespec syntax in the lectures; you are supposed to learn that by yourself and in tutorials.