Michael Wilson Block Design Review: ONL Header Format.


2 - Michael Wilson - 10/6/2015
Revision History
4/10/07 (MLW):
» Released
4/11/07 (MLW):
» Updates from feedback at 10 April meeting

Header Format Inputs/Outputs
[Block diagram of the ONL NP Router: Rx (2 ME), Mux (1 ME), Parse, Lookup, Copy (3 MEs), QM (1 ME), HdrFmt (1 ME), Tx (1 ME), Stats (1 ME), FreeList Mgr (1 ME), Plugin1 to Plugin5, xScale, TCAM, Assoc. Data ZBT-SRAM, and SRAM, connected by NN rings, SRAM rings, and scratch rings (32KW, 64KW).]
Slide taken from ONL_NProuter.ppt

Header Format Inputs/Outputs
[Same ONL NP Router block diagram, annotated with the data formats at the Header Format input and output.]
Input (one 32-bit word):
» Buffer Handle (24b), Rsv (3b), Port (4b), V (1b)
Output:
» Buffer Handle (24b), Rsv (3b), Port (4b), V (1b)
» Ethernet DA[47-16] (32b)
» Ethernet DA[15-0] (16b), Ethernet SA[47-32] (16b)
» Ethernet SA[31-0] (32b)
» Ethernet Type (16b), Reserved (16b)
Slide taken from ONL_NProuter.ppt
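The four Ethernet words of the output format can be sketched in C as follows. This is only an illustration, not the Header Format code: the type `eth_words_t` and the function `pack_eth_words` are hypothetical names, and the sketch assumes that the first-listed field of each word occupies the upper bits.

```c
#include <stdint.h>

/* Hypothetical container for the four Ethernet words that follow the
 * handle word in the Header Format output. */
typedef struct {
    uint32_t w[4];
} eth_words_t;

/* Pack two 48-bit MAC addresses (held in the low 48 bits of a uint64_t)
 * and the EtherType into four 32-bit words.
 * Assumption: the first-listed field of each word sits in the upper bits. */
eth_words_t pack_eth_words(uint64_t da, uint64_t sa, uint16_t ether_type)
{
    eth_words_t out;
    out.w[0] = (uint32_t)((da >> 16) & 0xFFFFFFFFu);   /* Ethernet DA[47-16] */
    out.w[1] = ((uint32_t)(da & 0xFFFFu) << 16)        /* Ethernet DA[15-0]  */
             | (uint32_t)((sa >> 32) & 0xFFFFu);       /* Ethernet SA[47-32] */
    out.w[2] = (uint32_t)(sa & 0xFFFFFFFFu);           /* Ethernet SA[31-0]  */
    out.w[3] = (uint32_t)ether_type << 16;             /* Ethernet Type, Reserved = 0 */
    return out;
}
```

With this assumed layout the four words, read in order, reproduce the byte order of an Ethernet header (DA, SA, EtherType) followed by 16 reserved bits.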

Contents
Overview
Latency Analysis
Code Locations (Planned)
Test Procedures (Planned)
Implementation Status

Overview
Initialization
1. Initialize local table of source MAC addresses for output ports
Processing (Main Loop)
1. Receive handle from QM
2. Copy to output registers:
» Buffer Handle (from NN ring if not chained, from buffer descriptor Buffer_Next otherwise)
» Destination MAC, EtherType (from buffer descriptor)
» Source MAC address (from local memory, indexed by port)
3. If chained, free the header buffer
4. Update Stats (index from buffer descriptor)
5. Forward packet to TX
6. Update TX counters
Header Format will be written in C, not microcode.
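The steps above can be sketched in portable C. This is a simplified model, not the planned hdrfmt.c: every type and function name here (`buf_desc_t`, `tx_msg_t`, `hdrfmt_counters_t`, `hdrfmt_init`, `hdrfmt_step`) is hypothetical, the source MAC values are made up, and the ring, Stats, and Freelist Mgr operations are replaced by plain counters. Masking the port to 4 bits is also an assumption, since the slides leave the handling of an out-of-range port undecided.

```c
#include <stdint.h>

#define NUM_PORTS 16u            /* the Port field is 4 bits wide */

/* Hypothetical, simplified view of the descriptor fields Header Format reads. */
typedef struct {
    uint32_t buffer_next;        /* non-zero means the packet is chained */
    uint64_t dest_mac;           /* 48-bit destination MAC */
    uint16_t ether_type;
    uint16_t stats_index;
} buf_desc_t;

/* Hypothetical record handed to TX. */
typedef struct {
    uint32_t buffer_handle;
    uint8_t  port;
    uint64_t dest_mac;
    uint64_t src_mac;
    uint16_t ether_type;
} tx_msg_t;

/* Counters standing in for the Freelist Mgr, Stats, and TX counter writes. */
typedef struct {
    unsigned freed_headers;
    unsigned stats_updates;
    unsigned tx_packets;
} hdrfmt_counters_t;

uint64_t src_mac_table[NUM_PORTS];

/* Initialization: fill the local source MAC table (made-up addresses). */
void hdrfmt_init(void)
{
    for (uint32_t p = 0; p < NUM_PORTS; p++)
        src_mac_table[p] = 0x020000000000ULL | p;
}

/* One main-loop iteration. Returns 1 if a packet was forwarded,
 * 0 if the handle was invalid and nothing was done. */
int hdrfmt_step(int valid, uint8_t port, uint32_t handle,
                const buf_desc_t *desc, tx_msg_t *out, hdrfmt_counters_t *ctr)
{
    if (!valid)
        return 0;

    int chained = (desc->buffer_next != 0);

    /* Step 2: copy to output registers. */
    out->buffer_handle = chained ? desc->buffer_next : handle;
    out->port          = (uint8_t)(port & 0xFu);
    out->dest_mac      = desc->dest_mac;
    out->ether_type    = desc->ether_type;
    out->src_mac       = src_mac_table[port & 0xFu];

    if (chained)
        ctr->freed_headers++;    /* step 3: free the header buffer */
    ctr->stats_updates++;        /* step 4: update Stats */
    ctr->tx_packets++;           /* steps 5 and 6: forward to TX, count it */
    return 1;
}
```

Calling `hdrfmt_init()` once and then `hdrfmt_step()` per handle mirrors the Initialization and Main Loop split on the slide; a chained descriptor makes the step return the Buffer_Next handle and bump the freed-header counter.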

Latency Analysis
[Flowchart with these steps: dl_source, Is Valid? (Yes/No), Read Buffer Descriptor, Read Source MAC, Is Chained? (Yes/No), Write Buffer_Next := NULL, Write Stats, Write Handle to Freelist Mgr, dl_sink. Cycle labels on the slide: 150 cycles, 3 cycles, negligible cycles, 150 cycles, 60 cycles, 150 cycles, 60 cycles.]
Critical Path Latency: 360 cycles

Performance
What is our performance target?
» To hit 5 Gb/sec rate:
  Ø Minimum Ethernet frame: 76B (64B frame + 12B InterFrame Spacing)
  Ø 5 Gb/sec * 1B/8b * packet/76B = 8.22 Mpkt/sec
» IXP ME processing:
  Ø 1.4 GHz clock rate
  Ø 1.4 Gcycle/sec * 1 sec/8.22 Mpkt = 170 cycles per packet
  Ø Compute budget: (MEs * 170)
    1 ME: 170 cycles
    2 ME: 340 cycles
    3 ME: 510 cycles
    4 ME: 680 cycles
  Ø Latency budget: (threads * 170)
    1 ME: 8 threads: 1360 cycles
    2 ME: 16 threads: 2720 cycles
    3 ME: 24 threads: 4080 cycles
    4 ME: 32 threads: 5440 cycles
Slide taken from ONL_NProuter.ppt
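The budget arithmetic can be checked with a few lines of C. The function names below are hypothetical, and the division is integer division, so the 170.24 cycles per packet that the slide's numbers give is truncated to 170, matching the figure used in the budgets.

```c
#include <stdint.h>

/* Cycles available per minimum-size packet on one ME (integer part).
 * packets/sec   = line_rate_bps / (8 * wire_bytes)
 * cycles/packet = clock_hz / packets/sec */
uint64_t cycles_per_packet(uint64_t clock_hz, uint64_t line_rate_bps,
                           uint64_t wire_bytes)
{
    return (clock_hz * 8u * wire_bytes) / line_rate_bps;
}

/* Compute budget: MEs * cycles per packet. */
uint64_t compute_budget(uint64_t per_packet, unsigned mes)
{
    return per_packet * mes;
}

/* Latency budget: total threads * cycles per packet. */
uint64_t latency_budget(uint64_t per_packet, unsigned mes,
                        unsigned threads_per_me)
{
    return per_packet * mes * threads_per_me;
}
```

For example, `cycles_per_packet(1400000000, 5000000000, 76)` gives 170, and `latency_budget(170, 1, 8)` gives the 1360 cycles listed for one ME with 8 threads.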

dl_sink Semantics
One of my optimizations requires a change to dl_sink semantics. In pseudo-code:
    signal_t sig1, sig2, sig3;
    send_stats(stats, sig1);     // 60 cycles
    free_block(hdr_buf, sig2);   // 60 cycles
    dl_sink(data_buf, sig3);     // 60 cycles
    wait(sig1, sig2, sig3);      // = 60 cycles in total, because the three operations overlap
As of the 10 April meeting, this optimization is no longer necessary for Header Format.
» Header Format has enough slack to skip exotic optimizations
» Header Format can start all of the scratch ring writes, then call dl_sink, and do the wait after dl_sink. PLC does not have this option, but this doesn’t impact Header Format.
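The saving from waiting on all three signals at once, instead of after each operation, can be illustrated with a small model. This is only a sketch under simplifying assumptions: it ignores instruction issue cost and memory contention, and the function names are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

/* Latency when each operation is waited on before the next one starts. */
uint32_t serial_latency(const uint32_t *lat, size_t n)
{
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += lat[i];
    return total;
}

/* Latency when all operations are issued first and waited on together:
 * the wait ends when the slowest operation completes. */
uint32_t overlapped_latency(const uint32_t *lat, size_t n)
{
    uint32_t longest = 0;
    for (size_t i = 0; i < n; i++)
        if (lat[i] > longest)
            longest = lat[i];
    return longest;
}
```

For the three 60-cycle operations in the pseudo-code, the serial model gives 180 cycles and the overlapped model gives 60 cycles.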

File locations (in …/ONL_Router/)
Code
» src/hdrFormat/ONL/hdrfmt.c
Includes
» src/dispatch_loop/ONL/dl_source.[h,c]
  Ø dl_source() and dl_sink() functions

Test and Validation
All validation tests will be done with 8 threads
» Header Format has no loops and only two conditionals. All code paths will be tested once:
  Ø Invalid handle (Valid bit not set)
  Ø Unchained packet
  Ø Chained packet
» Need to decide correct behavior in the face of erroneous input (port out of range)
» Test back-pressure from TX through HdrFmt to QM
» HdrFormat will be tested at high speeds to ensure I/O contention is not an issue

Implementation Status
Still in pseudo-code
» Working on a C equivalent of the HdrFmt stub as a framework for my implementation
Bugs
» Doesn’t compile, as there is no source yet.
Untested
» Everything
Optimizations not taken (but available if needed later)
» The Buffer_Next field of the buffer descriptor can be read and written back-to-back because the memory controller guarantees in-order execution. Thus, we don’t need to read, check to see if we need to write, and then write. We can issue both at once and worry afterward. This won’t work with multi-buffer payload, but neither will the rest of Header Format.
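The back-to-back read and write of Buffer_Next can be illustrated with a toy model of an in-order memory controller. This is a sketch in plain C, not IXP code: `mem_cmd_t`, `run_in_order`, and `read_and_clear_next` are hypothetical names, and the "controller" is just a loop that executes queued commands in issue order.

```c
#include <stdint.h>
#include <stddef.h>

typedef enum { CMD_READ, CMD_WRITE } cmd_kind_t;

/* One queued memory command (hypothetical model). */
typedef struct {
    cmd_kind_t kind;
    uint32_t  *addr;     /* target word */
    uint32_t   value;    /* value to store (writes only) */
    uint32_t  *result;   /* where to deliver the data (reads only) */
} mem_cmd_t;

/* Toy in-order controller: commands complete in the order they were issued. */
void run_in_order(const mem_cmd_t *cmds, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (cmds[i].kind == CMD_READ)
            *cmds[i].result = *cmds[i].addr;
        else
            *cmds[i].addr = cmds[i].value;
    }
}

/* Issue the read and the clearing write together. Because execution is
 * in order, the read still returns the old Buffer_Next value. */
uint32_t read_and_clear_next(uint32_t *buffer_next)
{
    uint32_t old = 0;
    mem_cmd_t cmds[2] = {
        { CMD_READ,  buffer_next, 0u, &old },
        { CMD_WRITE, buffer_next, 0u, NULL },
    };
    run_in_order(cmds, 2);
    return old;
}
```

The caller then inspects the returned value afterward to decide whether the packet was chained; for an unchained packet the write simply stores NULL over NULL.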

Extra Slides

ONL Buffer Descriptor
Fields (LW0 to LW7):
» Buffer_Next (32b)
» Buffer_Size (16b)
» Offset (16b)
» Packet_Size (16b)
» Free_list 0000 (4b)
» Reserved (4b)
» Ref_Cnt (8b)
» MAC DAddr_47_32 (16b)
» Stats Index (16b)
» MAC DAddr_31_00 (32b)
» EtherType (16b)
» Reserved (16b)
» Reserved (32b)
» Packet_Next (32b)
Legend:
» Written by Freelist Mgr
» Written by Rx
» Written by Copy
» Written by QM
» Written by Rx and Plugins
» Ref_Cnt (8b): Written by Rx, Added to by Copy, Decremented by Freelist Mgr