Multirule systems and Concurrent Execution of Rules

Slides:



Advertisements
Similar presentations
BSV execution model and concurrent rule scheduling Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology February.
Advertisements

Elastic Pipelines and Basics of Multi-rule Systems Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology February.
An EHR based methodology for Concurrency management Arvind (with Asif Khan) Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology.
Constructive Computer Architecture: Multirule systems and Concurrent Execution of Rules Arvind Computer Science & Artificial Intelligence Lab. Massachusetts.
March 2007http://csg.csail.mit.edu/arvindSemantics-1 Scheduling Primitives for Bluespec Arvind Computer Science & Artificial Intelligence Lab Massachusetts.
Folding and Pipelining complex combinational circuits Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology February.
Maria-Cristina Marinescu Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology A Synthesis Algorithm for Modular Design of.
Asynchronous Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology October 13, 2009http://csg.csail.mit.edu/koreaL12-1.
March, 2007http://csg.csail.mit.edu/arvindIPlookup-1 IP Lookup Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology.
September 24, L08-1 IP Lookup: Some subtle concurrency issues Arvind Computer Science & Artificial Intelligence Lab.
December 10, 2009 L29-1 The Semantics of Bluespec Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute.
December 12, 2006http://csg.csail.mit.edu/6.827/L24-1 Scheduling Primitives for Bluespec Arvind Computer Science & Artificial Intelligence Lab Massachusetts.
Pipelining combinational circuits Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology February 20, 2013http://csg.csail.mit.edu/6.375L05-1.
Constructive Computer Architecture: Bluespec execution model and concurrency semantics Arvind Computer Science & Artificial Intelligence Lab. Massachusetts.
September 3, 2009L02-1http://csg.csail.mit.edu/korea Introduction to Bluespec: A new methodology for designing Hardware Arvind Computer Science & Artificial.
Introduction to Bluespec: A new methodology for designing Hardware Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
March 6, 2006http://csg.csail.mit.edu/6.375/L10-1 Bluespec-4: Rule Scheduling and Synthesis Arvind Computer Science & Artificial Intelligence Lab Massachusetts.
Constructive Computer Architecture: Guards Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology September 24, 2014.
September 22, 2009http://csg.csail.mit.edu/koreaL07-1 Asynchronous Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab.
Constructive Computer Architecture Sequential Circuits Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology
Introduction to Bluespec: A new methodology for designing Hardware Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
Constructive Computer Architecture Sequential Circuits - 2 Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
February 20, 2009http://csg.csail.mit.edu/6.375L08-1 Asynchronous Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts.
Introduction to Bluespec: A new methodology for designing Hardware Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
Computer Architecture: A Constructive Approach Pipelining combinational circuits Teacher: Yoav Etsion Taken (with permission) from Arvind et al.*, Massachusetts.
Computer Architecture: A Constructive Approach Bluespec execution model and concurrent rule scheduling Teacher: Yoav Etsion Taken (with permission) from.
October 20, 2009L14-1http://csg.csail.mit.edu/korea Concurrency and Modularity Issues in Processor pipelines Arvind Computer Science & Artificial Intelligence.
EHRs: Designing modules with concurrent methods
Overview Logistics Last lecture Today HW5 due today
Introduction to Bluespec: A new methodology for designing Hardware
Introduction to Bluespec: A new methodology for designing Hardware
Concurrency properties of BSV methods and rules
Bluespec-6: Modeling Processors
Folded “Combinational” circuits
Scheduling Constraints on Interface methods
Sequential Circuits Constructive Computer Architecture Arvind
Sequential Circuits: Constructive Computer Architecture
Introduction to Bluespec: A new methodology for designing Hardware
Performance Specifications
Combinational Circuits in Bluespec
Pipelining combinational circuits
Multirule Systems and Concurrent Execution of Rules
Constructive Computer Architecture: Guards
Sequential Circuits Constructive Computer Architecture Arvind
Pipelining combinational circuits
Modular Refinement Arvind
Modular Refinement Arvind
EHR: Ephemeral History Register
Constructive Computer Architecture: Well formed BSV programs
Bluespec-7: Scheduling & Rule Composition
Modeling Processors: Concurrency Issues
Modules with Guarded Interfaces
Pipelining combinational circuits
Sequential Circuits - 2 Constructive Computer Architecture Arvind
Introduction to Bluespec: A new methodology for designing Hardware
Multirule systems and Concurrent Execution of Rules
Bluespec-4: Rule Scheduling and Synthesis
Computer Science & Artificial Intelligence Lab.
Elastic Pipelines and Basics of Multi-rule Systems
Constructive Computer Architecture: Guards
GCD: A simple example to introduce Bluespec
Elastic Pipelines and Basics of Multi-rule Systems
Bluespec-7: Scheduling & Rule Composition
Control Hazards Constructive Computer Architecture: Arvind
Introduction to Bluespec: A new methodology for designing Hardware
Bluespec-7: Semantics of Bluespec
Bluespec-5: Scheduling & Rule Composition
Modular Refinement Arvind
Pipelining combinational circuits
Constructive Computer Architecture: Well formed BSV programs
Presentation transcript:

Multirule systems and Concurrent Execution of Rules Constructive Computer Architecture: Multirule systems and Concurrent Execution of Rules Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology September 28, 2015 http://csg.csail.mit.edu/6.175

Rewriting Elastic pipeline as a multirule system f0 f1 f2 x inQ fifo1 fifo2 outQ rule stage1; if(inQ.notEmpty && fifo1.notFull) begin fifo1.enq(f0(inQ.first)); inQ.deq; end endrule rule stage2; if(fifo1.notEmpty && fifo2.notFull) begin fifo2.enq(f1(fifo1.first)); fifo1.deq; end endrule rule stage3; if(fifo2.notEmpty && outQ.notFull) begin outQ.enq(f2(fifo2.first)); fifo2.deq; end endrule How does such a system function? September 28, 2015 http://csg.csail.mit.edu/6.175

Bluespec Execution Model Repeatedly: Select a rule to execute Compute the state updates Make the state updates Highly non-deterministic; User annotations can be used in rule selection One-rule-at-a-time-semantics: Any legal behavior of a Bluespec program can be explained by observing the state updates obtained by applying only one rule at a time However, for performance we need to execute multiple rules concurrently if possible September 28, 2015 http://csg.csail.mit.edu/6.175

Multi-rule versus single rule elastic pipeline x fifo1 inQ f1 f2 f3 fifo2 outQ rule elasticPipeline; if(inQ.notEmpty && fifo1.notFull) begin fifo1.enq(f1(inQ.first)); inQ.deq; end if(fifo1.notEmpty && fifo2.notFull) begin fifo2.enq(f2(fifo1.first)); fifo1.deq; end if(fifo2.notEmpty && outQ.notFull) begin outQ.enq(f3(fifo2.first)); fifo2.deq; end endrule rule stage1; if(inQ.notEmpty && fifo1.notFull) begin fifo1.enq(f1(inQ.first)); inQ.deq; end endrule rule stage2; if(fifo1.notEmpty && fifo2.notFull) begin fifo2.enq(f2(fifo1.first)); fifo1.deq; end endrule rule stage3; if(fifo2.notEmpty && outQ.notFull) begin outQ.enq(f3(fifo2.first)); fifo2.deq; end endrule How are these two systems the same (or different)? September 28, 2015 http://csg.csail.mit.edu/6.175

Elastic pipeline Do these systems see the same state changes? The single rule system – fills up the pipeline and then processes a message at every pipeline stage for every rule firing – no more than one slot in any fifo would be filled unless the OutQ blocks The multirule system has many more possible states. It can mimic the behavior of one-rule system but one can also execute rules in different orders, e.g., stage1; stage1; stage2; stage1; stage3; stage2; stage3; … (assuming stage fifos have more than one slot) When can some or all the rules in a multirule system execute concurrently? September 28, 2015 http://csg.csail.mit.edu/6.175

Evaluating or applying a rule The state of the system s is defined as the value of all its registers An expression is evaluated by computing its value on the current state An action defines the next value of some of the state elements based on the current value of the state A rule is evaluated by evaluating the corresponding action and simultaneously updating all the affected state elements x y z ... rule    x’ y’ z’ ... Given action a and state S, let a(S) represent the state after the application of action a September 28, 2015 http://csg.csail.mit.edu/6.175

One-rule-at-a-time semantics Given a program with a set of rules {rule ri ai} and an initial state S0 , S is a legal state if and only if there exists a sequence of rules rj1,…., rjn such that S= ajn(…(aj1(S0))…) September 28, 2015 http://csg.csail.mit.edu/6.175

Concurrent execution of two rules Concurrent execution of two rules, rule r1 a1 and rule r2 a2, means executing a rule whose body looks like (a1; a2), that is a rule which is a parallel composition of the actions of the two rules However, we want to preserve one-rule-at-a-time semantics of Bluespec; (a1; a2) does not always preserve that! September 28, 2015 http://csg.csail.mit.edu/6.175

Concurrent scheduling of rules rule r1 a1 and rule r2 a2 can be scheduled concurrently, preserving one-rule-at-a-time semantics, if and only if Either S. (a1; a2)(S) = a2(a1(S)) or S. (a1; a2)(S) = a1(a2(S)) rule r1 a1 to rule rn an can be scheduled concurrently, preserving one-rule-at-a-time semantics, if and only if there exists a permutation (p1,…,pn) of (1,…,n) such that S. (a1;…;an)(S) = apn(…(ap1(S)) September 28, 2015 http://csg.csail.mit.edu/6.175

A compiler can determine if two rules can be executed in parallel without violating the one-rule-at-a-time semantics James Hoe, Ph.D., 2000 Construct a conflict matrix (CM) for rules September 28, 2015 http://csg.csail.mit.edu/6.175

Extending CM to rules CM between two rules is computed exactly the same way as CM for the methods of a module Given rule r1 a1 and rule r2 a2 such that mcalls(a1)={g11,g12...g1n} mcalls(a2)={g21,g22...g2m} Compute Conflict(x,y) = if x and y are methods of the same module then CM[x,y] else CF CM[r1,r2] = conflict(g11,g21)  conflict(g11,g22) ...  conflict(g12,g21)  conflict(g12,g22) ... …  conflict(g1n,g21)  conflict(g12,g22) ... Conflict relation is not transitive r1 < r2, r2 < r3 does not imply r1 < r3 September 28, 2015 http://csg.csail.mit.edu/6.175

Using CMs for concurrent scheduling of rules Two rules that are conflict free can be scheduled together without violating the one-rule-at-a-time semantics. In general, use the following theorem Theorem: Given a set of rules {rule ri ai}, if there exists a permutation {p1, p2 … pn} of {1..n} such that  i < j. CM(api, apj) is CF or < then  S. (a1;…;an)(S) = apn(…(ap1(S)). Thus, rules r1, r2 … rn can be scheduled concurrently with the effect  i, j. rpi < rpj September 28, 2015 http://csg.csail.mit.edu/6.175

Example 1: Compiler Analysis rule ra; if (z>10) x <= x+1; endrule rule rb; if (z>20) y <= y+2; mcalls(ra) = {z.r, x.w, x.r} mcalls(rb) = {z.r, y.w, y.r} CM(ra, rb) = conflict(z.r, z.r)  conflict(z.r, y.w) conflict(z.r, y.r)  conflict(x.w, z.r) conflict(x.w, y.w)  conflict(x.w, y.r) conflict(x.r, z.r)  conflict(x.r, y.w) Conflict(x.r, y.r) = CF  CF  CF  CF … = CF Rules ra and rb can be scheduled together without violating the one-rule-at-a-time-semantics. We say rules ra and rb are CF September 28, 2015 http://csg.csail.mit.edu/6.175

Example 2: Compiler Analysis mcalls(ra) = {z.r, x.w, y.r} mcalls(rb) = {z.r, y.w, x.r} CM(ra, rb) = conflict(z.r, z.r)  conflict(z.r, y.w) conflict(z.r, x.r)  conflict(x.w, z.r) conflict(x.w, y.w)  conflict(x.w, x.r) conflict(y.r, z.r)  conflict(y.r, y.w) Conflict(y.r, x.r) = CF  CF CF  CF CF  > CF  <  CF = C rule ra; if (z>10) x <= y+1; endrule rule rb; if (z>20) y <= x+2; Rules ra and rb cannot be scheduled together without violating the one-rule-at-a-time-semantics. Rules ra and rb are C September 28, 2015 http://csg.csail.mit.edu/6.175

Example 3: Compiler Analysis mcalls(ra) = {z.r, x.w, y.r} mcalls(rb) = {z.r, y.w, y.r} CM(ra, rb) = conflict(z.r, z.r)  conflict(z.r, y.w) conflict(z.r, y.r)  conflict(x.w, z.r) conflict(x.w, y.w)  conflict(x.w, y.r) conflict(y.r, z.r)  conflict(y.r, y.w) Conflict(y.r, y.r) = CF  CF CF  CF CF  <  CF = < rule ra; if (z>10) x <= y+1; endrule rule rb; if (z>20) y <= y+2; Rules ra and rb can be scheduled together without violating the one-rule-at-a-time-semantics. Rule ra < rb September 28, 2015 http://csg.csail.mit.edu/6.175

Multi-rule versus single rule elastic pipeline x fifo1 inQ f1 f2 f3 fifo2 outQ rule elasticPipeline; if(inQ.notEmpty && fifo1.notFull) begin fifo1.enq(f1(inQ.first)); inQ.deq; end if(fifo1.notEmpty && fifo2.notFull) begin fifo2.enq(f2(fifo1.first)); fifo1.deq; end if(fifo2.notEmpty && outQ.notFull) begin outQ.enq(f3(fifo2.first)); fifo2.deq; end endrule rule stage1; if(inQ.notEmpty && fifo1.notFull) begin fifo1.enq(f1(inQ.first)); inQ.deq; end endrule rule stage2; if(fifo1.notEmpty && fifo2.notFull) begin fifo2.enq(f2(fifo1.first)); fifo1.deq; end endrule rule stage3; if(fifo2.notEmpty && outQ.notFull) begin outQ.enq(f3(fifo2.first)); fifo2.deq; end endrule If we do concurrent scheduling in the multirule system then the multi-rule system behaves like the single rule system September 28, 2015 http://csg.csail.mit.edu/6.175

Concurrency when the FIFOs do not permit concurrent enq and deq x fifo1 inQ f1 f2 f3 fifo2 outQ not empty not empty & not full not empty & not full not full At best alternate stages in the pipeline will be able to fire concurrently September 28, 2015 http://csg.csail.mit.edu/6.175

Practical scheduling concerns Rules often have a top level predicate or guard: rule r1 if (p1); a1 It does make sense to schedule such a rule for execution unless it’s predicate is true We can evaluate the guards of many* rules in parallel every (clock) cycle and then select for parallel execution only among those rules whose guards are true. Of course the selected rules must preserve one-rule-at-a-time semantics. *Not all guards can be evaluated in parallel because of EHRs and method parameters September 28, 2015 http://csg.csail.mit.edu/6.175

Scheduling and control logic Modules (Current state) “CAN_FIRE” “WILL_FIRE” Modules (Next state) Rules cf1 Scheduler wf1 a1 p1 cfn wfn ns1 Muxing an pn After mapping all the rules, we have to combine their logic some how. For a particular state elemetn like the PC register, the latch enable is the or the enable signals from all the rules that updates PC. The actual next state value of PC has to be selected through a multiplexer. Notice, this circuit only works if only one of these pi signal is asserted at a time cond action nsn Compiler synthesizes a scheduler such that at any given time will-fire for only non-conflicting rules are true September 28, 2015 http://csg.csail.mit.edu/6.175

some insight into Concurrent rule execution Rules Ri Rj Rk rule steps Rj HW Rk clocks Ri There are more intermediate states in the rule semantics (a state after each rule step) In the HW, states change only at clock edges September 28, 2015 http://csg.csail.mit.edu/6.175

Parallel execution reorders reads and writes Rules rule steps reads writes reads writes reads writes reads writes reads writes reads writes reads writes clocks HW In the rule semantics, each rule sees (reads) the effects (writes) of previous rules In the HW, rules only see the effects from previous clocks, and only affect subsequent clocks September 28, 2015 http://csg.csail.mit.edu/6.175

Correctness Rules Ri Rj Rk rule steps Rj HW Rk clocks Ri The compiler will schedule rules concurrently only if the net state change is equivalent to sequential rule execution (which is what our theorem ensures) September 28, 2015 http://csg.csail.mit.edu/6.175