September 3, 2009L02-1http://csg.csail.mit.edu/korea Introduction to Bluespec: A new methodology for designing Hardware Arvind Computer Science & Artificial.

Slides:



Advertisements
Similar presentations
BSV execution model and concurrent rule scheduling Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology February.
Advertisements

Elastic Pipelines and Basics of Multi-rule Systems Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology February.
An EHR based methodology for Concurrency management Arvind (with Asif Khan) Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology.
Constructive Computer Architecture: Multirule systems and Concurrent Execution of Rules Arvind Computer Science & Artificial Intelligence Lab. Massachusetts.
March 2007http://csg.csail.mit.edu/arvindSemantics-1 Scheduling Primitives for Bluespec Arvind Computer Science & Artificial Intelligence Lab Massachusetts.
Overview Logistics Last lecture Today HW5 due today
Stmt FSM Richard S. Uhler Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology (based on a lecture prepared by Arvind)
On “Control Adaptivity”: how to sell the idea of Atomic Transactions to the HW design community Rishiyur Nikhil Bluespec, Inc. WG 2.8 meeting, June 16,
Asynchronous Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology October 13, 2009http://csg.csail.mit.edu/koreaL12-1.
March, 2007http://csg.csail.mit.edu/arvindIPlookup-1 IP Lookup Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology.
September 24, L08-1 IP Lookup: Some subtle concurrency issues Arvind Computer Science & Artificial Intelligence Lab.
IP Lookup: Some subtle concurrency issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology February 22, 2011L06-1.
IP Lookup: Some subtle concurrency issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology March 4, 2013
December 10, 2009 L29-1 The Semantics of Bluespec Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute.
Computer Architecture: A Constructive Approach Sequential Circuits Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
February 22, 2005http://csg.csail.mit.edu/6.884/L07-1 Bluespec-1: Design Affects Everything Arvind Computer Science & Artificial Intelligence Lab Massachusetts.
December 12, 2006http://csg.csail.mit.edu/6.827/L24-1 Scheduling Primitives for Bluespec Arvind Computer Science & Artificial Intelligence Lab Massachusetts.
Pipelining combinational circuits Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology February 20, 2013http://csg.csail.mit.edu/6.375L05-1.
Lecture 9. MIPS Processor Design – Instruction Fetch Prof. Taeweon Suh Computer Science Education Korea University 2010 R&E Computer System Education &
February 14, 2007L04-1http://csg.csail.mit.edu/6.375/ Bluespec-1: Design methods to facilitate rapid growth of SoCs Arvind Computer Science & Artificial.
Introduction to Bluespec: A new methodology for designing Hardware Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
Multiple Clock Domains (MCD) Continued … Arvind with Nirav Dave Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology November.
March, 2007Intro-1http://csg.csail.mit.edu/arvind Design methods to facilitate rapid growth of SoCs Arvind Computer Science & Artificial Intelligence Lab.
March 6, 2006http://csg.csail.mit.edu/6.375/L10-1 Bluespec-4: Rule Scheduling and Synthesis Arvind Computer Science & Artificial Intelligence Lab Massachusetts.
Constructive Computer Architecture: Guards Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology September 24, 2014.
September 22, 2009http://csg.csail.mit.edu/koreaL07-1 Asynchronous Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab.
Constructive Computer Architecture Sequential Circuits Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology
Elastic Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology February 28, 2011L08-1http://csg.csail.mit.edu/6.375.
Constructive Computer Architecture Sequential Circuits Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology September.
Introduction to Bluespec: A new methodology for designing Hardware Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
Constructive Computer Architecture Sequential Circuits - 2 Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
February 20, 2009http://csg.csail.mit.edu/6.375L08-1 Asynchronous Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts.
Modular Refinement Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology March 8,
October 22, 2009http://csg.csail.mit.edu/korea Modular Refinement Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology.
Introduction to Bluespec: A new methodology for designing Hardware Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
Multiple Clock Domains (MCD) Arvind with Nirav Dave Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology.
Computer Architecture: A Constructive Approach Bluespec execution model and concurrent rule scheduling Teacher: Yoav Etsion Taken (with permission) from.
October 20, 2009L14-1http://csg.csail.mit.edu/korea Concurrency and Modularity Issues in Processor pipelines Arvind Computer Science & Artificial Intelligence.
February 28, 2005http://csg.csail.mit.edu/6.884/L09-1 Bluespec-3: Modules & Interfaces Arvind Computer Science & Artificial Intelligence Lab Massachusetts.
Introduction to Bluespec: A new methodology for designing Hardware
Introduction to Bluespec: A new methodology for designing Hardware
Folded “Combinational” circuits
Scheduling Constraints on Interface methods
Sequential Circuits - 2 Constructive Computer Architecture Arvind
Sequential Circuits Constructive Computer Architecture Arvind
Sequential Circuits: Constructive Computer Architecture
Introduction to Bluespec: A new methodology for designing Hardware
Stmt FSM Arvind (with the help of Nirav Dave)
Multirule Systems and Concurrent Execution of Rules
Bluespec-1: Design Affects Everything
Constructive Computer Architecture: Guards
Sequential Circuits Constructive Computer Architecture Arvind
Modular Refinement Arvind
Modular Refinement Arvind
Bluespec-7: Scheduling & Rule Composition
Modules with Guarded Interfaces
Sequential Circuits - 2 Constructive Computer Architecture Arvind
Introduction to Bluespec: A new methodology for designing Hardware
Multirule systems and Concurrent Execution of Rules
Stmt FSM Arvind (with the help of Nirav Dave)
Elastic Pipelines and Basics of Multi-rule Systems
Constructive Computer Architecture: Guards
GCD: A simple example to introduce Bluespec
Elastic Pipelines and Basics of Multi-rule Systems
Design methods to facilitate rapid growth of SoCs
Multirule systems and Concurrent Execution of Rules
Introduction to Bluespec: A new methodology for designing Hardware
Bluespec-5: Scheduling & Rule Composition
Modular Refinement Arvind
CS295: Modern Systems Bluespec Introduction
Presentation transcript:

September 3, 2009L02-1http://csg.csail.mit.edu/korea Introduction to Bluespec: A new methodology for designing Hardware Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology September 3, 2009

L02-2http://csg.csail.mit.edu/korea What is needed to make hardware design easier Extreme IP reuse Multiple instantiations of a block for different performance and application requirements Packaging of IP so that the blocks can be assembled easily to build a large system (black box model) Ability to do modular refinement Whole system simulation to enable concurrent hardware-software development “Intellectual Property”

September 3, 2009 L02-3http://csg.csail.mit.edu/korea data_in push_req_n pop_req_n clk rstn data_out full empty IP Reuse sounds wonderful until you try it... Example: Commercially available FIFO IP block These constraints are spread over many pages of the documentation... No machine verification of such informal constraints is feasible Bluespec can change all this

September 3, 2009 L02-4http://csg.csail.mit.edu/korea Bluespec promotes composition through guarded interfaces not full not empty theModuleA theModuleB theFifo.enq(value1); theFifo.deq(); value2 = theFifo.first(); theFifo.enq(value3); theFifo.deq(); value4 = theFifo.first(); n n rdy enab rdy enab rdy enq deq first FIFO theFifo Enqueue arbitration control Dequeue arbitration control Self-documenting interfaces; Automatic generation of logic to eliminate conflicts in use.

September 3, 2009 L02-5http://csg.csail.mit.edu/korea Bluespec: A new way of expressing behavior using Guarded Atomic Actions Formalizes composition Modules with guarded interfaces Compiler manages connectivity (muxing and associated control) Powerful static elaboration facility Permits parameterization of designs at all levels Transaction level modeling Allows C and Verilog codes to be encapsulated in Bluespec modules  Smaller, simpler, clearer, more correct code  not just simulation, synthesis as well Bluespec

September 3, 2009 L02-6http://csg.csail.mit.edu/korea Bluespec: State and Rules organized into modules All state (e.g., Registers, FIFOs, RAMs,...) is explicit. Behavior is expressed in terms of atomic actions on the state: Rule: guard  action Rules can manipulate state in other modules only via their interfaces. interface module

September 3, 2009L02-7http://csg.csail.mit.edu/korea GCD: A simple example to explain hardware generation from Bluespec

September 3, 2009 L02-8http://csg.csail.mit.edu/korea Programming with rules: A simple example Euclid’s algorithm for computing the Greatest Common Divisor (GCD): subtract 36 subtract 63 swap 33 subtract 03 subtract answer:

September 3, 2009 L02-9http://csg.csail.mit.edu/korea module mkGCD (I_GCD); Reg#( Int#(32) ) x <- mkRegU; Reg#( Int#(32) ) y <- mkReg(0); rule swap ((x > y) && (y != 0)); x <= y; y <= x; endrule rule subtract ((x <= y) && (y != 0)); y <= y – x; endrule method Action start(Int#(32) a, Int#(32) b) if (y==0); x <= a; y <= b; endmethod method Int#(32) result() if (y==0); return x; endmethod endmodule Internal behavior GCD in BSV External interface State Assume a/=0 x y swapsub If (a==0) then 0 else b

September 3, 2009 L02-10http://csg.csail.mit.edu/korea rdy enab Int#(32) rdy start result GCD module Int#(32) y == 0 implicit conditions interface I_GCD; method Action start (Int#(32) a, Int#(32) b); method Int#(32) result(); endinterface GCD Hardware Module t #(type t) t t tt t In a GCD call t could be Int#(32), UInt#(16), Int#(13),... The module can easily be made polymorphic Many different implementations can provide the same interface: module mkGCD (I_GCD)

September 3, 2009 L02-11http://csg.csail.mit.edu/korea module mkGCD (I_GCD); Reg#(Int#(32)) x <- mkRegU; Reg#(Int#(32)) y <- mkReg(0); rule swapANDsub ((x > y) && (y != 0)); x <= y; y <= x - y; endrule rule subtract ((x<=y) && (y!=0)); y <= y – x; endrule method Action start(Int#(32) a, Int#(32) b) if (y==0); x <= a; y <= b; endmethod method Int#(32) result() if (y==0); return x; endmethod endmodule GCD: Another implementation Combine swap and subtract rule Does it compute faster ? Does it take more resources ?

September 3, 2009 L02-12http://csg.csail.mit.edu/korea Bluespec Tool flow Bluespec SystemVerilog source Verilog 95 RTL Verilog sim VCD output Debussy Visualization Bluespec Compiler RTL synthesis gates C Bluesim Cycle Accurate FPGA Power estimatio n tool Works in conjunction with exiting tool flows

September 3, 2009 L02-13http://csg.csail.mit.edu/korea Generated Verilog RTL: GCD module mkGCD(CLK,RST_N,start_a,start_b,EN_start,RDY_start, result,RDY_result); input CLK; input RST_N; // action method start input [31 : 0] start_a; input [31 : 0] start_b; input EN_start; output RDY_start; // value method result output [31 : 0] result; output RDY_result; // register x and y reg [31 : 0] x; wire [31 : 0] x$D_IN; wire x$EN; reg [31 : 0] y; wire [31 : 0] y$D_IN; wire y$EN;... // rule RL_subtract assign WILL_FIRE_RL_subtract = x_SLE_y___d3 && !y_EQ_0___d10 ; // rule RL_swap assign WILL_FIRE_RL_swap = !x_SLE_y___d3 && !y_EQ_0___d10 ;...

September 3, 2009 L02-14http://csg.csail.mit.edu/korea Generated Hardware next state values predicates x_en y_en x_en = y_en = x y > !(=0) swap?subtract? sub x y  en rdy x rdy start result swap? swap? OR subtract?

September 3, 2009 L02-15http://csg.csail.mit.edu/korea Generated Hardware Module x_en y_en x_en = swap? y_en = swap? OR subtract? x y > !(=0) swap?subtract? sub x y  en rdy x rdy start result rdy = start_en OR start_en (y==0) OR start_en

September 3, 2009 L02-16http://csg.csail.mit.edu/korea GCD: A Simple Test Bench module mkTest (); Reg#(Int#(32)) state <- mkReg(0); I_GCD gcd <- mkGCD(); rule go (state == 0); gcd.start ( 423, 142 ); state <= 1; endrule rule finish (state == 1); $display (“GCD of 423 & 142 =%d”,gcd.result()); state <= 2; endrule endmodule Why do we need the state variable? Is there any timing issue in displaying the result? No. Because the finish rule cannot execute until gcd.result is ready

September 3, 2009 L02-17http://csg.csail.mit.edu/korea GCD: Test Bench module mkTest (); Reg#(Int#(32)) state <- mkReg(0); Reg#(Int#(4)) c1 <- mkReg(1); Reg#(Int#(7)) c2 <- mkReg(1); I_GCD gcd <- mkGCD(); rule req (state==0); gcd.start(signExtend(c1), signExtend(c2)); state <= 1; endrule rule resp (state==1); $display (“GCD of %d & %d =%d”, c1, c2, gcd.result()); if (c1==7) begin c1 <= 1; c2 <= c2+1; end else c1 <= c1+1; if (c1==7 && c2==63) state <= 2 else state <= 0; endrule endmodule Feeds all pairs (c1,c2) 1 < c1 < 7 1 < c2 < 63 to GCD

September 3, 2009 L02-18http://csg.csail.mit.edu/korea GCD: Synthesis results Original (16 bits) Clock Period: 1.6 ns Area: 4240 m 2 Unrolled (16 bits) Clock Period: 1.65ns Area: 5944 m 2 Unrolled takes 31% fewer cycles on the testbench

September 3, 2009L02-19http://csg.csail.mit.edu/korea Rule scheduling and the synthesis of a scheduler

September 3, 2009 L02-20http://csg.csail.mit.edu/korea GAA Execution model Repeatedly: Select a rule to execute Compute the state updates Make the state updates Highly non- deterministic Implementation concern: Schedule multiple rules concurrently without violating one-rule-at-a-time semantics User annotations can help in rule selection

September 3, 2009 L02-21http://csg.csail.mit.edu/korea Rule: As a State Transformer A rule may be decomposed into two parts (s) and (s) such that s next = if (s) then (s) else s (s) is the condition (predicate) of the rule, a.k.a. the “CAN_FIRE” signal of the rule. is a conjunction of explicit and implicit conditions (s) is the “state transformation” function, i.e., computes the next-state values from the current state values

September 3, 2009 L02-22http://csg.csail.mit.edu/korea Compiling a Rule f x current state next state values   enable f x rule r (f.first() > 0) ; x <= x + 1 ; f.deq (); endrule  = enabling condition  = action signals & values rdy signals read methods enable signals action parameters

September 3, 2009 L02-23http://csg.csail.mit.edu/korea Combining State Updates: strawman next state value latch enable R OR 11 nn  1,R  n,R OR ’s from the rules that update R ’s from the rules that update R What if more than one rule is enabled?

September 3, 2009 L02-24http://csg.csail.mit.edu/korea Combining State Updates next state value latch enable R Scheduler: Priority Encoder OR 11 nn 11 nn  1,R  n,R OR ’s from the rules that update R Scheduler ensures that at most one  i is true ’s from all the rules one-rule-at- a-time scheduler is conservative

September 3, 2009L02-25http://csg.csail.mit.edu/korea A compiler can determine if two rules can be executed in parallel without violating the one-rule- at-a-time semantics James Hoe, Ph.D., 2000

September 3, 2009 L02-26http://csg.csail.mit.edu/korea Scheduling and control logic Modules (Current state) Rules   Scheduler 11 nn 11 nn Muxing 11 nn nn nn Modules (Next state) cond action “CAN_FIRE”“WILL_FIRE” Compiler synthesizes a scheduler such that at any given time ’s for only non-conflicting rules are true

September 3, 2009 L02-27http://csg.csail.mit.edu/korea The plan We will first revisit Simple MIPS architectures to understand them at the register transfer level (RTL) We will express these designs in Bluespec such that the whole design will consist of one rule (synchronous bluespec) no scheduling issues After that we will get into designs involving multiple rules We will not discuss Bluespec syntax in the lectures; you are suppose to learn that by yourself and in tutorials