Michael Wilson Block Design Review: ONL Header Format.

1 Michael Wilson mlw2@arl.wustl.edu http://www.arl.wustl.edu/projects/techX Block Design Review: ONL Header Format

2 2 - Michael Wilson - 10/6/2015 Revision History
4/10/07 (MLW):
» Released
4/11/07 (MLW):
» Updates from feedback at 10 April meeting

3 Header Format Inputs/Outputs
[Block diagram of the router pipeline; slide taken from ONL_NProuter.ppt. Blocks shown: Rx (2 ME), Mux (1 ME), Parse, Lookup, Copy (3 MEs), QM (1 ME), HdrFmt (1 ME), Tx (1 ME), Stats (1 ME), FreeList Mgr (1 ME), Plugin1 through Plugin5, xScale, TCAM with Assoc. Data ZBT-SRAM, and SRAM. Connections shown: NN rings, scratch rings, and SRAM rings. Size labels in the figure: 32KW, 64KW, 32KW.]

4 Header Format Inputs/Outputs
[Same block diagram as the previous slide, taken from ONL_NProuter.ppt, annotated with the HdrFmt input and output formats.]
Input (from QM), one 32-bit word:
  V (1b), Rsv (3b), Port (4b), Buffer Handle (24b)
Output (to Tx), five 32-bit words:
  V (1b), Rsv (3b), Port (4b), Buffer Handle (24b)
  Ethernet DA[47-16] (32b)
  Ethernet DA[15-0] (16b), Ethernet SA[47-32] (16b)
  Ethernet SA[31-0] (32b)
  Ethernet Type (16b), Reserved (16b)
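The input and output formats above can be illustrated with a small packing sketch in C. The field widths come from the slide; the bit positions (V in bit 31, Rsv in bits 30-28, Port in bits 27-24, Buffer Handle in bits 23-0), the word order of the output, and all function names are assumptions made for illustration, not taken from the ONL source.

```c
#include <stdint.h>

/* Assumed bit positions (only the widths are given on the slide):
 * V = bit 31, Rsv = bits 30-28, Port = bits 27-24, Buffer Handle = bits 23-0. */
uint32_t pack_handle_word(unsigned valid, unsigned port, uint32_t handle)
{
    return ((uint32_t)(valid & 0x1u) << 31)
         | ((uint32_t)(port & 0xFu) << 24)
         | (handle & 0x00FFFFFFu);
}

unsigned handle_valid(uint32_t w)  { return (w >> 31) & 0x1u; }
unsigned handle_port(uint32_t w)   { return (w >> 24) & 0xFu; }
uint32_t handle_buffer(uint32_t w) { return w & 0x00FFFFFFu; }

/* Fill the five output words. MAC addresses are passed as 48-bit values
 * held in the low bits of a uint64_t. Word order is an assumption. */
void pack_output(uint32_t out[5], uint32_t handle_word,
                 uint64_t da, uint64_t sa, uint16_t ethertype)
{
    out[0] = handle_word;                               /* V, Rsv, Port, handle */
    out[1] = (uint32_t)(da >> 16);                      /* DA[47-16]            */
    out[2] = ((uint32_t)(da & 0xFFFFu) << 16)           /* DA[15-0]             */
           | (uint32_t)((sa >> 32) & 0xFFFFu);          /* SA[47-32]            */
    out[3] = (uint32_t)(sa & 0xFFFFFFFFu);              /* SA[31-0]             */
    out[4] = (uint32_t)ethertype << 16;                 /* EtherType, Reserved  */
}
```

Passing the MAC addresses as integers keeps the shifts explicit; a real implementation on the microengines would more likely move whole registers.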

5 Contents
Overview
Latency Analysis
Code Locations
(Planned) Test Procedures
(Planned) Implementation Status

6 Overview
Initialization
1. Initialize local table of Source MAC addresses for output ports
Processing (Main Loop)
1. Receive handle from QM
2. Copy to output registers:
   » Buffer Handle (from NN ring if not chained, from buffer descriptor Buffer_Next otherwise)
   » Destination MAC, EtherType (from buffer descriptor)
   » Source MAC address (from local memory, indexed by port)
3. If chained, free the header buffer
4. Update Stats (index from buffer descriptor)
5. Forward packet to TX
6. Update TX Counters
Header Format will be written in C, not microcode
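The main-loop steps above can be sketched as a single pure function in C. This is a minimal illustration, not the planned hdrfmt.c: every type and name is hypothetical, "chained" is assumed to mean a non-NULL Buffer_Next, and wrapping an out-of-range port into the table is an arbitrary choice, since the correct behavior for that case is still undecided.

```c
#include <stdint.h>

#define NUM_PORTS 16u /* assumption: one entry per value of the 4-bit Port field */

/* Hypothetical view of the descriptor fields HdrFmt reads. */
struct desc_view {
    uint32_t buffer_next;  /* assumed: 0 (NULL) when the packet is not chained */
    uint64_t dest_mac;     /* 48-bit destination MAC */
    uint16_t ethertype;
    uint16_t stats_index;
};

/* Hypothetical record handed to Tx. */
struct tx_record {
    uint32_t handle;
    uint64_t dest_mac;
    uint64_t src_mac;
    uint16_t ethertype;
};

/* What the caller still has to do after formatting (steps 3 to 5). */
struct hdrfmt_result {
    int      forward;        /* 1 if *out should be sent to Tx */
    int      free_header;    /* 1 if the header buffer must be freed */
    uint32_t header_handle;  /* handle to free when free_header is set */
    uint16_t stats_index;    /* index for the stats update */
};

struct hdrfmt_result hdrfmt_step(int valid, unsigned port, uint32_t handle,
                                 const struct desc_view *d,
                                 const uint64_t src_mac[NUM_PORTS],
                                 struct tx_record *out)
{
    struct hdrfmt_result r = {0, 0, 0, 0};
    if (!valid)
        return r;                        /* invalid handle: nothing to forward */

    int chained = (d->buffer_next != 0); /* assumed chaining test */

    /* Step 2: copy to the output record. */
    out->handle    = chained ? d->buffer_next : handle;
    out->dest_mac  = d->dest_mac;
    out->ethertype = d->ethertype;
    out->src_mac   = src_mac[port % NUM_PORTS]; /* wrap is an assumption */

    /* Steps 3 to 5 are reported to the caller rather than performed here. */
    r.forward       = 1;
    r.free_header   = chained;
    r.header_handle = chained ? handle : 0;
    r.stats_index   = d->stats_index;
    return r;
}
```

Returning the side effects as a result struct keeps the sketch testable on a host machine; on the microengine these would be ring writes issued directly.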

7 Latency Analysis
[Flowchart from dl_source to dl_sink with two decisions, "Is Valid?" and "Is Chained", and the steps Read Buffer Descriptor, Read Source MAC, Write Buffer_Next := NULL, Write Stats, and Write Handle to Freelist Mgr. Latency labels in the figure: 150 cycles, 60 cycles, 3 cycles, and negligible.]
Critical Path Latency: 360 Cycles

8 Performance
What is our performance target?
» To hit 5 Gb rate:
  Minimum Ethernet frame: 76B
    - 64B frame + 12B InterFrame Spacing
  5 Gb/sec * 1B/8b * packet/76B = 8.22 Mpkt/sec
» IXP ME processing:
  1.4 GHz clock rate
  1.4 Gcycle/sec * 1 sec/8.22 Mpkt = 170.3 cycles per packet
  compute budget: (MEs*170)
    - 1 ME: 170 cycles
    - 2 ME: 340 cycles
    - 3 ME: 510 cycles
    - 4 ME: 680 cycles
  latency budget: (threads*170)
    - 1 ME: 8 threads: 1360 cycles
    - 2 ME: 16 threads: 2720 cycles
    - 3 ME: 24 threads: 4080 cycles
    - 4 ME: 32 threads: 5440 cycles
(slide taken from ONL_NProuter.ppt)
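The budget arithmetic above can be reproduced with a few helper functions. This is a sketch with hypothetical function names; the inputs (5 Gb/sec, 76B per minimum frame, 1.4 GHz, 8 threads per ME) are the figures on the slide. Computed without intermediate rounding, the per-packet figure comes out near 170.2 cycles, which the slide rounds to 170 for the budget tables.

```c
#include <stdint.h>

/* Packets per second at a given line rate for a fixed on-wire frame size. */
double packets_per_sec(double line_rate_bps, unsigned frame_bytes)
{
    return line_rate_bps / 8.0 / (double)frame_bytes;
}

/* Cycles available per packet on one ME at the given clock rate. */
double cycles_per_packet(double clock_hz, double pkts_per_sec)
{
    return clock_hz / pkts_per_sec;
}

/* Compute budget: cycles of actual instruction execution across all MEs. */
unsigned compute_budget(unsigned mes, unsigned cycles_per_pkt)
{
    return mes * cycles_per_pkt;
}

/* Latency budget: each thread may hold a packet for this many cycles. */
unsigned latency_budget(unsigned mes, unsigned threads_per_me,
                        unsigned cycles_per_pkt)
{
    return mes * threads_per_me * cycles_per_pkt;
}
```

For example, `latency_budget(1, 8, 170)` gives the 1360-cycle figure for a single ME, the budget against which the 360-cycle critical path of Header Format is compared.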

9 dl_sink Semantics
One of my optimizations requires a change to dl_sink semantics. In pseudo-code:
    signal_t sig1, sig2, sig3;
    send_stats(stats, sig1);      // 60 cycles
    free_block(hdr_buf, sig2);    // 60 cycles
    dl_sink(data_buf, sig3);      // 60 cycles
    wait(sig1, sig2, sig3);       // the three operations overlap: about 60 cycles total, not 180
As of the 10 April meeting, this optimization is no longer necessary for Header Format.
» Header Format has enough slack to skip exotic optimizations
» Header Format can start all of the scratch ring writes, then call dl_sink, and do the wait after dl_sink. PLC does not have this option, but this doesn’t impact Header Format.
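The saving from issuing the three operations before waiting can be modeled on a host machine. The sketch below is only a timing model under the assumption that the operations are fully independent and overlap completely; the function names are hypothetical and nothing here calls the real dl_sink.

```c
#include <stddef.h>

/* Waiting after each operation: the latencies add up. */
unsigned serialized_latency(const unsigned *cycles, size_t n)
{
    unsigned total = 0;
    for (size_t i = 0; i < n; i++)
        total += cycles[i];
    return total;
}

/* Issuing everything first and waiting once: the longest operation
 * dominates (assumes the operations overlap completely). */
unsigned overlapped_latency(const unsigned *cycles, size_t n)
{
    unsigned longest = 0;
    for (size_t i = 0; i < n; i++)
        if (cycles[i] > longest)
            longest = cycles[i];
    return longest;
}
```

With three 60-cycle operations, the serialized model gives 180 cycles and the overlapped model gives 60, which is the point of the "60+60+60=60" comment.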

10 File locations (in …/ONL_Router/)
Code
» src/hdrFormat/ONL/hdrfmt.c
Includes
» src/dispatch_loop/ONL/dl_source.[h,c]
  dl_source() and dl_sink() functions

11 Test and Validation
All validation tests will be done with 8 threads
» Header Format has no loops and only two conditionals. All code paths will be tested once:
  - Invalid handle (Valid bit not set)
  - Unchained packet
  - Chained packet
» Need to decide correct behavior in the face of erroneous input (port out of range)
» Test back-pressure from TX through HdrFmt to QM
» HdrFormat will be tested at high speeds to ensure I/O contention is not an issue

12 Implementation Status
Still in pseudo-code
» Working on a C-equivalent of the HdrFmt Stub as a framework for my implementation
Bugs
» Doesn’t compile, as there is no source yet.
Untested
» Everything
Optimizations not taken (but available if needed later)
» The Buffer_Next field of the buffer descriptor can be read and written back-to-back because the memory controller guarantees in-order execution. Thus, we don’t need to read, check to see if we need to write, and then write. We can issue both at once and worry afterward. This won’t work with multi-buffer payload, but neither will the rest of Header Format.

13 Extra Slides

14 ONL Buffer Descriptor
[Figure: buffer descriptor of eight 32-bit longwords, LW0 through LW7, with each field marked by the block that writes it.]
Fields shown:
  Buffer_Next (32b)
  Buffer_Size (16b)
  Offset (16b)
  Packet_Size (16b)
  Free_list 0000 (4b)
  Reserved (4b)
  Ref_Cnt (8b)
  MAC DAddr_47_32 (16b)
  Stats Index (16b)
  MAC DAddr_31_00 (32b)
  EtherType (16b)
  Reserved (16b)
  Reserved (32b)
  Packet_Next (32b)
Legend:
  Written by Freelist Mgr
  Written by Rx
  Written by Copy
  Written by QM
  Written by Rx and Plugins
  Ref_Cnt (8b): written by Rx, added to by Copy, decremented by Freelist Mgr
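The descriptor fields above add up to 256 bits, that is, eight longwords. A C struct along the following lines is one way to represent it. The field names and widths come from the slide, but the assignment of fields to LW0 through LW7, the order of fields within a longword, and the merging of the two 4-bit fields into one byte are assumptions, since the figure layout does not survive in this transcript.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical layout of the ONL buffer descriptor (8 x 32 bits).
 * Widths are from the slide; the placement of fields is assumed. */
struct onl_buf_desc {
    uint32_t buffer_next;      /* LW0 (assumed): next buffer in chain      */
    uint16_t buffer_size;      /* LW1 (assumed)                            */
    uint16_t offset;           /* LW1 (assumed)                            */
    uint16_t packet_size;      /* LW2 (assumed)                            */
    uint8_t  free_list_rsv;    /* LW2 (assumed): Free_list 4b + Reserved 4b */
    uint8_t  ref_cnt;          /* LW2 (assumed)                            */
    uint16_t mac_daddr_47_32;  /* LW3 (assumed)                            */
    uint16_t stats_index;      /* LW3 (assumed)                            */
    uint32_t mac_daddr_31_00;  /* LW4 (assumed)                            */
    uint16_t ethertype;        /* LW5 (assumed)                            */
    uint16_t reserved_lw5;     /* LW5 (assumed)                            */
    uint32_t reserved_lw6;     /* LW6 (assumed)                            */
    uint32_t packet_next;      /* LW7 (assumed): next packet in queue      */
};

/* The descriptor must occupy exactly eight longwords. */
_Static_assert(sizeof(struct onl_buf_desc) == 32,
               "buffer descriptor must be 8 longwords");

/* Reassemble the 48-bit destination MAC from its two descriptor fields. */
uint64_t desc_dest_mac(const struct onl_buf_desc *d)
{
    return ((uint64_t)d->mac_daddr_47_32 << 32) | d->mac_daddr_31_00;
}
```

Every member sits on its natural alignment, so no padding is inserted and the size check holds on common ABIs; byte order within each longword on the actual hardware is not modeled here.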

