CSE 291-a Interconnection Networks Lecture 15: Router (cont’d) March 5, 2007 Prof. Chung-Kuan Cheng CSE Dept, UC San Diego Winter 2007 Transcribed by Ling.

1 CSE 291-a Interconnection Networks Lecture 15: Router (cont’d) March 5, 2007 Prof. Chung-Kuan Cheng CSE Dept, UC San Diego Winter 2007 Transcribed by Ling Zhang

2 Topics
- Router (cont’d)
  - Output states
- Router pipelines and stalls
- Router datapath components
  - Input buffer
  - Switches
  - Output buffer

3 Router
- Output virtual channel state fields:
  - G: Global state
    - I: idle
    - A: active
    - C: waiting for credit
  - I: Input VC
    - The input port and virtual channel that are forwarding flits to this output virtual channel.
  - C: Credit count
    - The number of free buffers available to hold flits from this virtual channel at the downstream node.
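The three state fields above can be pictured as a small record. The following is a minimal sketch, not the lecture's implementation: the class name, the method names, and the exact transition rules (for example, moving to "waiting for credit" when the count reaches zero) are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class GlobalState(Enum):
    IDLE = "I"
    ACTIVE = "A"
    WAITING_FOR_CREDIT = "C"


@dataclass
class OutputVCState:
    """State of one output virtual channel: fields G, I and C from the slide."""
    global_state: GlobalState = GlobalState.IDLE      # G
    input_vc: Optional[Tuple[int, int]] = None        # I: (input port, input VC)
    credit_count: int = 0                             # C: free downstream buffers

    def allocate(self, input_port: int, input_vc: int) -> None:
        # Hypothetical helper: bind an input VC to this output VC.
        if self.global_state is not GlobalState.IDLE:
            raise RuntimeError("output VC is already in use")
        self.input_vc = (input_port, input_vc)
        self._refresh()

    def send_flit(self) -> None:
        # Assumed rule: forwarding a flit consumes one downstream buffer.
        if self.global_state is not GlobalState.ACTIVE:
            raise RuntimeError("no credit or no owner for this output VC")
        self.credit_count -= 1
        self._refresh()

    def receive_credit(self) -> None:
        # A returning credit means one downstream buffer was freed.
        self.credit_count += 1
        if self.input_vc is not None:
            self._refresh()

    def _refresh(self) -> None:
        # An owned VC is active only while it holds at least one credit.
        self.global_state = (GlobalState.ACTIVE if self.credit_count > 0
                             else GlobalState.WAITING_FOR_CREDIT)
```

With two credits, an allocated channel can forward two flits, then waits until a credit returns.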

4 Router Pipeline
- Each head flit must proceed through:
  - RC: routing computation
  - VA: virtual channel allocation
  - SA: switch allocation
  - ST: switch traversal
- Body and tail flits need only SA and ST, because they inherit the route and virtual channel allocated to the head flit.

5 An example of router pipeline
(HF: head flit, B1 and B2: body flits, TF: tail flit)

cycle  1   2   3   4   5   6   7
HF     RC  VA  SA  ST
B1                 SA  ST
B2                     SA  ST
TF                         SA  ST
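The stall-free timing in this example can be generated mechanically. The sketch below assumes one stage per cycle and one flit entering switch allocation per cycle; the function name and the dictionary layout are illustrative choices, not part of the lecture.

```python
HEAD_STAGES = ["RC", "VA", "SA", "ST"]
BODY_STAGES = ["SA", "ST"]


def pipeline_schedule(num_flits):
    """Return {flit index: {cycle: stage}} for a stall-free packet.

    Flit 0 is the head flit; the last flit is the tail flit.
    """
    schedule = {}
    for i in range(num_flits):
        if i == 0:
            stages, start = HEAD_STAGES, 1
        else:
            # The head does SA in cycle 3; each later flit does SA one
            # cycle after the flit in front of it.
            stages, start = BODY_STAGES, 3 + i
        schedule[i] = {start + n: stage for n, stage in enumerate(stages)}
    return schedule


if __name__ == "__main__":
    for flit, stages in pipeline_schedule(4).items():
        print(flit, stages)
```

For a four-flit packet this places the tail flit's ST in cycle 7, matching the example.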

6 Possible stalls in router pipeline
- Packet stalls
  - VC busy: the head flit of one packet arrives before the tail flit of the previous packet has completed switch allocation.
  - Route: routing is not completed.
  - VA: virtual channel allocation is not successful.
- Flit stalls
  - Switch busy: switch allocation is attempted but unsuccessful.
  - Buffer empty: no flit is available because the input buffer is empty.
  - Credit: no credit is available.

7 VA busy stall example

cycle   1   2   3   4   5   6   7
HF(A)   RC  x   x   VA  SA  ST
TF(B)           SA  ST
B1(A)                       SA  ST

Virtual channel 0 is busy. Packet B holds it.

8 Switch busy stall example

cycle  1   2   3   4   5   6   7   8
HF     RC  VA  SA  ST
B1                 SA  ST
B2                     x   SA  ST
B3                         x   SA  ST

B2 fails to allocate the switch in cycle 5.

9 Buffer empty stall example

cycle  1   2   3   4   5   6   7   8
HF     RC  VA  SA  ST
B1                 SA  ST
B2                     x   SA  ST
B3                             SA  ST

B2 arrives late and introduces a one-cycle stall.
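The switch busy and buffer empty examples differ only in why a flit misses its slot. A minimal sketch that reproduces both, assuming flits of one packet enter SA in order and at most one per cycle; `sa_cycles`, `ready`, and `denied` are hypothetical names introduced here.

```python
def sa_cycles(ready, denied=frozenset()):
    """Return the cycle in which each flit wins switch allocation.

    ready[i]  -- first cycle in which flit i is buffered and could bid for SA
    denied    -- cycles in which this input's switch request fails
    """
    result = []
    cycle = 0
    for first_ready in ready:
        # A flit cannot pass the flit ahead of it, and cannot bid
        # before it has arrived in the input buffer.
        cycle = max(cycle + 1, first_ready)
        # Switch busy: retry in the following cycle.
        while cycle in denied:
            cycle += 1
        result.append(cycle)
    return result


if __name__ == "__main__":
    # Switch busy: B2 is ready in cycle 5 but loses allocation that cycle.
    print(sa_cycles([3, 4, 5, 6], denied={5}))
    # Buffer empty: B2 only becomes ready in cycle 6.
    print(sa_cycles([3, 4, 6, 7]))
```

In both cases the flit order HF, B1, B2, B3 gets SA in cycles 3, 4, 6, 7: the lost cycle is the same, only its cause differs.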

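10 Credit stall example
(C of HF, C of B1: credit returned for that flit; CT: credit transmit; CU: credit update)

cycle    1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18
credit   4  3  2  1  0  0  0  0  0  0  0  1  1  1  1  0  0  0
HF       SA ST W1 W2 RC VA SA ST
B1          SA ST W1 W2       SA ST
B2             SA ST W1 W2       SA ST
B3                SA ST W1 W2       SA ST
C of HF                       CT W1 W2 CU
B4                   x  x  x  x  x  x  x  SA ST W1 W2 SA ST
C of B1                          CT W1 W2 CU
B5                      x  x  x  x  x  x  x  SA ST W1 W2 SA ST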

11 Credit stall example
- W1 and W2 are the two cycles of time of flight between the two routers.
- A buffer is allocated to the head flit when it is in the upstream SA stage in cycle 1. This buffer cannot be reassigned to another flit until the head flit leaves the downstream SA stage, freeing the buffer, and a credit reflecting the free buffer propagates back to the credit update stage in cycle 11. Body flit 4 uses this credit to enter the SA stage in cycle 12.
- The credit round-trip delay is t_crt = t_f + t_c + 2 T_w + 1, where
  - t_f: flit pipeline delay, which is 4 cycles.
  - t_c: credit pipeline delay, which is 2 cycles.
  - T_w: one-way wire delay, which is 2 cycles.
- The total delay is 4 + 2 + 2(2) + 1 = 11 cycles.
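The round-trip formula is easy to check numerically. The sketch below evaluates it for the slide's values; the utilization bound (at most one flit per buffer per round trip) is an added assumption that follows from the example, where 4 buffers allow 4 flits every 11 cycles. Function names are illustrative.

```python
def credit_round_trip(t_f, t_c, t_w):
    """t_crt = t_f + t_c + 2*T_w + 1, all values in cycles."""
    return t_f + t_c + 2 * t_w + 1


def max_channel_utilization(num_buffers, t_crt):
    """Assumed bound: each buffer can be reused once per credit round trip."""
    return min(1.0, num_buffers / t_crt)


if __name__ == "__main__":
    t_crt = credit_round_trip(t_f=4, t_c=2, t_w=2)
    print(t_crt)                              # 11
    print(max_channel_utilization(4, t_crt))  # 4/11 of full rate
```

Under this assumption, avoiding credit stalls in the example would take at least 11 flit buffers per virtual channel.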

12 Usage of output virtual channel
(cycles 1 to 17; stages are listed in order for each row)

HF(A)    RC VA SA ST W1 W2 RC VA SA ST
TF(A)    SA ST W1 W2 SA ST
C of TF  CT W1 W2 CU
HF(B)    RC x x x x x x x x x x x x VA SA ST    (conservative approach)
C of HF  CT W1 W2 CU
HF(B)    RC VA SA ST W1 W2 x RC VA SA ST        (approach for fewer stalls)

13 Usage of output virtual channel
- The conservative approach is to wait until the downstream flit buffer for the virtual channel is completely empty, as indicated by the arrival of the credit from the tail flit. This avoids creating a dependency between the current packet and a packet occupying the downstream buffer.
- If the dependency is acceptable, the virtual channel can be reallocated as soon as the tail flit of the previous packet completes the SA stage.
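The two reallocation rules can be written as a single predicate. This is a minimal sketch; the enum, the function name, and the two boolean inputs are hypothetical and only restate the conditions described above.

```python
from enum import Enum


class ReallocPolicy(Enum):
    CONSERVATIVE = "wait for the tail flit's credit"
    AGGRESSIVE = "reuse once the tail flit completes SA"


def output_vc_reusable(policy, tail_completed_sa, tail_credit_received):
    """Decide whether an output VC may be given to a new packet."""
    if policy is ReallocPolicy.CONSERVATIVE:
        # The downstream buffer is known to be empty only when the
        # credit for the tail flit has come back.
        return tail_credit_received
    # Accepts a dependency on the packet still occupying the downstream buffer.
    return tail_completed_sa
```

The conservative rule trades extra idle cycles on the virtual channel for the absence of that dependency.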

14 Router Datapath Components
- Input buffer
  - Central memory
    - Good buffer utilization, but long latency and low bandwidth
  - Separate buffers
    - Less efficient buffer utilization, but good latency and bandwidth
  - Separate buffer for each channel
  - Multi-port memory

15 Router Datapath Components
- Switch
  - Input speedup by splitting the input

16 Switch
- If k inputs are split into sk inputs, the throughput of the switch is: (formula not preserved in this transcript)
- If a switch has both input and output speedup, the throughput can be larger than one: (formula not preserved in this transcript)
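The slide's formulas did not survive, so the following is only a sketch of one common estimate, not necessarily the expression shown in the lecture. It assumes uniform random traffic, that each of the sk switch inputs independently requests one of the k outputs at random, and that every requested output grants exactly one request. An output is then idle with probability ((k-1)/k)^(sk), which gives the throughput below. The function name is illustrative.

```python
def switch_throughput(k, s=1):
    """Estimated fraction of busy outputs for a k-output switch with input speedup s.

    Assumes each of the s*k switch inputs requests a uniformly random output
    and each requested output serves one request per cycle.
    """
    if k < 1 or s < 1:
        raise ValueError("k and s must be at least 1")
    prob_output_idle = ((k - 1) / k) ** (s * k)
    return 1.0 - prob_output_idle


if __name__ == "__main__":
    for speedup in (1, 2, 3):
        print(speedup, round(switch_throughput(4, speedup), 3))
```

Under these assumptions the estimate rises with s but stays below one, which is consistent with the slide's remark that exceeding one requires output speedup as well.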

17 Router Datapath Components
- Output buffers
  - A FIFO buffer 2 to 4 flits long is often sufficient to match the speed between the switch and the channel.

