# Fast Buffer Memory with Deterministic Packet Departures

Mayank Kabra, Siddhartha Saha, Bill Lin
University of California, San Diego

## Presentation transcript


IEEE Hot Interconnects XIV, August 23-25, 2006

Slide 2. Packet Buffer in Routers

[Figure: linecards around a router core (scheduler and packet buffers), with In and Out sides.]

- With 40-byte packets at 40 Gb/s, a linecard has 8 ns per packet, and in that time the buffer must both write one packet and read one.
- Routers need to store packets to deal with congestion.
- Bandwidth × RTT = 40 Gb/s × 250 ms = 10 Gb of buffering.
- That is too big to store in SRAM, so DRAM is needed.
- Problem: DRAM access time is ~40 ns, so there is roughly a 10x speed difference.
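
The arithmetic on this slide can be checked with a short sketch. The constants come from the slide text; the helper names are illustrative only, and the "10x" figure is reproduced under the assumption that each 8 ns slot must fit one write and one read (4 ns per access).

```python
# Back-of-the-envelope numbers; constants are taken from the slide text.
LINE_RATE_BPS = 40e9     # 40 Gb/s line rate
PACKET_BYTES = 40        # packet size used on the slide
RTT_SECONDS = 0.250      # 250 ms round-trip time
DRAM_ACCESS_NS = 40.0    # approximate DRAM access time


def slot_time_ns(packet_bytes: int, line_rate_bps: float) -> float:
    """Time one packet occupies the line, in nanoseconds."""
    return packet_bytes * 8 / line_rate_bps * 1e9


def buffer_bits(line_rate_bps: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product in bits."""
    return line_rate_bps * rtt_seconds


slot_ns = slot_time_ns(PACKET_BYTES, LINE_RATE_BPS)   # about 8 ns
per_access_ns = slot_ns / 2                           # one write + one read per slot (assumption)
speed_gap = DRAM_ACCESS_NS / per_access_ns            # about 10x
buf_bits = buffer_bits(LINE_RATE_BPS, RTT_SECONDS)    # 1e10 bits = 10 Gb

print(f"slot: {slot_ns:.1f} ns, gap: {speed_gap:.0f}x, buffer: {buf_bits / 1e9:.0f} Gb")
```

The bandwidth-delay product evaluates to 10 Gb, which is consistent with the figure of about 3 × 10^7 forty-byte packets quoted later in the talk.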

Slide 3. Parallel and Interleaved DRAM Banks

- Assume the speed difference is 3x.

[Figure: packets (P) moving between an SRAM and parallel DRAM banks.]

Slide 4. Problems with Parallelism

- The access pattern can create problems.
- If we try to access 3, 6, 9 and 11 one after another, it is possible to issue interleaved read requests and read those packets out at line speed.

[Figure: packets 1 to 14 distributed across the DRAM banks.]

Slide 5. Problems with Parallelism

- But accessing 2 and 3, or 10 and 11, in succession is problematic.
- This is an example of a bank conflict.

[Figure: packets 1 to 14 distributed across the DRAM banks.]
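
A bank conflict can be detected mechanically. The sketch below is only an illustration: the bank numbers in the example are hypothetical (the bank layout of the slide's figure is not recoverable from the transcript), and it assumes a bank stays busy for `b` consecutive time slots once an access starts.

```python
from typing import Dict, List, Optional


def first_bank_conflict(banks: List[int], b: int) -> Optional[int]:
    """Return the slot index of the first access that hits a busy bank.

    banks[i] is the bank accessed in time slot i. A bank is assumed to stay
    busy for b slots after an access starts. Returns None if there is no conflict.
    """
    last_start: Dict[int, int] = {}
    for slot, bank in enumerate(banks):
        if bank in last_start and slot - last_start[bank] < b:
            return slot
        last_start[bank] = slot
    return None


# Perfectly interleaved accesses over three banks with b = 3: no conflict.
print(first_bank_conflict([0, 1, 2, 0, 1, 2], b=3))   # None
# Two back-to-back accesses to the same bank: conflict at slot 2.
print(first_bank_conflict([0, 1, 1, 2], b=3))         # 2
```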

Slide 6. Use the Packet Departure Time

- In a wide class of routers (e.g. crossbar routers), packet departures are determined by the scheduler on the fly.
- Packet buffers that cater to these routers exist, but they are complex.
- There are other high-performance routers, such as Switch-Memory-Switch and Load-Balanced Routers, for which the packet departure time can be calculated when the packet is inserted into the buffer.

Solution idea: use the known departure times of the packets to schedule them to different DRAM banks such that there won't be any conflicts.

Slide 7. Packet Buffer Abstraction

- Fixed-size packets; time is slotted (example: 40 Gb/s, 40-byte packets => 8 ns per slot).
- The buffer may contain an arbitrarily large number of logical queues, but with deterministic access.
- Single-write, single-read, time-deterministic packet buffer model.

Slide 8. Packet Buffer Architecture

- Interleaved memory architecture with K slower DRAM banks.
- A single memory read or write operation takes b time slots to complete.
- b consecutive time slots form a frame.
- A time slot t belongs to frame $\lceil t/b \rceil$.
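
The slot-to-frame mapping can be written as a one-line helper. This sketch reads the bracket in the frame formula as a ceiling with time slots numbered from 1, which is the reading that matches the examples later in the talk (with b = 3, slots 1 to 3 fall in frame 1 and slots 58 to 60 fall in frame 20). The function name is illustrative.

```python
def frame_of(t: int, b: int) -> int:
    """Frame index of time slot t (slots and frames numbered from 1): ceil(t / b)."""
    if t < 1 or b < 1:
        raise ValueError("t and b must be positive")
    return (t + b - 1) // b   # integer ceiling division


print([frame_of(t, 3) for t in (1, 2, 3, 4)])   # [1, 1, 1, 2]
print([frame_of(t, 3) for t in (58, 59, 60)])   # [20, 20, 20]
```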

Slide 9. Packet Buffer Operation

[Figure: arriving packets are aggregated into groups of b packets, written to DRAM banks 1, 2, ..., K-1, K, and de-aggregated into departing packets; an SRAM bypass buffer sits between arrivals and departures.]

Slide 10. Packet Arrival [Frame 1]

- Assume b = 3.
- Packets P1, P2 and P3 arrive in time slots 1, 2 and 3 respectively.
- They are aggregated before being written to the DRAM.

[Figure: packets P1, P2 and P3 and DRAM banks 1 to 5.]

Slide 11. Packet Arrival [Frame 2]

- Packets P1, P2 and P3 are written to DRAM banks 1, 2 and 3 during frame 2.
- New packets P4, P5 and P6 arrive and are stored in the buffer.

[Figure: packets P4, P5 and P6 in the buffer; P1, P2 and P3 in the DRAM banks 1 to 5.]

Slide 12. Packet Departure [Frame 19]

- Packets P58, P59 and P60 are scheduled to depart at time slots 58, 59 and 60 respectively (frame 20).
- They are read from the DRAM banks one frame before their departure frame, i.e. in frame 19.

[Figure: packets P58, P59 and P60 in DRAM banks 1 to 5.]

Slide 13. Packet Departure [Frame 20]

- Packets P58, P59 and P60 are read from the buffer and are output from the switch at time slots 58, 59 and 60 respectively.

[Figure: packets P58, P59 and P60 leaving the buffer; DRAM banks 1 to 5.]

Slide 14. SRAM Bypass Buffer

- The operational model dictates that the minimum round-trip latency to write a packet to one of the DRAM banks and read it back is 4 frames.
- Thus, a packet with a departure time less than 4b-1 time slots away cannot be stored in DRAM.
- A small amount of SRAM (size 4b) is used as a bypass buffer.
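
The bypass rule amounts to a single comparison. The sketch below is one reading of the slide, not the authors' implementation: it assumes "time slots away" means the difference between the departure slot and the arrival slot, and the function name is hypothetical.

```python
def needs_bypass(arrival_slot: int, departure_slot: int, b: int) -> bool:
    """True if the packet departs too soon for a DRAM write/read round trip.

    Assumption: the distance is measured as departure_slot - arrival_slot,
    and packets closer than 4b - 1 slots stay in the SRAM bypass buffer.
    """
    return departure_slot - arrival_slot < 4 * b - 1


# With b = 3 the threshold is 4b - 1 = 11 slots.
print(needs_bypass(arrival_slot=1, departure_slot=11, b=3))   # True: 10 slots away
print(needs_bypass(arrival_slot=1, departure_slot=12, b=3))   # False: 11 slots away
```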

Slide 15. Number of DRAM Banks

- Arrival write conflicts: in any current frame f, at most b packets will be written to the DRAM banks (including the current packet). Hence, for each packet, there are at most b-1 "arrival write conflicts".

[Figure: packets (P) and DRAM banks.]

Slide 16. Number of DRAM Banks

- Arrival read conflicts: in any current frame f, at most b packets will be read from the DRAM banks. Those b banks are busy in the current frame and are unavailable. Hence, for each packet, there are at most b "arrival read conflicts".

[Figure: packets (P) and DRAM banks.]

Slide 17. Number of DRAM Banks

- Departure read conflicts: any packet written in the current frame f will eventually need to be read in a future frame d for departure. In that future frame d, there are at most b-1 other departing packets. Hence, for each packet, there are at most b-1 "departure read conflicts".

[Figure: packets (P) and DRAM banks.]

Slide 18. How Many DRAM Banks?

- Total conflicts:
  - Arrival write: (b-1)
  - Arrival read: b
  - Departure read: (b-1)
- Hence, a total of (3b-2) conflicts.
- If the number of banks is more than (3b-2), that is, at least (3b-1), there will always be a free bank for every packet.

[Figure: packets (P) and DRAM banks.]
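
The counting argument can be written out as a small sketch. The function names are illustrative; the three terms are exactly the conflict bounds listed on the slides.

```python
def max_conflicts(b: int) -> int:
    """Worst-case number of banks ruled out for one arriving packet."""
    arrival_write = b - 1    # other packets written in the same frame
    arrival_read = b         # banks being read in the current frame
    departure_read = b - 1   # other packets departing in the same future frame
    return arrival_write + arrival_read + departure_read


def banks_needed(b: int) -> int:
    """One bank more than the worst-case conflict count always leaves a free bank."""
    return max_conflicts(b) + 1


print(max_conflicts(3), banks_needed(3))   # 7 8
```

For b = 3 this gives 7 possible conflicts and therefore 8 banks, i.e. 3b - 1.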

Slide 19. DRAM Bank Selection

- To find a compatible memory, maintain a two-dimensional read-transaction bitmap R.
- Each row corresponds to a frame slot.
- Each column corresponds to a DRAM bank (hence 3b - 1 columns).
- R(f, m) denotes whether the m-th DRAM bank holds an already stored packet that must be read in the f-th frame slot.

Slide 20. DRAM Bank Selection

- Write-reservation bitmap W of size (3b - 1).
- W(m) denotes that, in the current frame, the m-th memory bank has been assigned an arriving packet.

Slide 21. DRAM Bank Selection Logic

Slide 22. DRAM Bank Selection

- Approach: a greedy solution that avoids the three types of conflicts.
- To check whether a memory bank m is compatible for a packet p arriving in frame f and departing in frame d:
  - Check NOT(W(m) | R(f, m) | R(d, m)).
  - Instead of checking one memory bank at a time, all of them can be checked at once: V = NOT(W | R(f) | R(d)), where R(f) and R(d) are row vectors. From V, get the index of the first compatible memory.
- If n is the bank selected for p, then set W(n) = 1 and R(d, n) = 1.
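
The greedy selection can be sketched in software with each bitmap row held as a Python integer. This is a minimal illustration of the logic on the slide, not the authors' hardware design. The class name, the `start_frame` housekeeping step (clearing W once per frame) and the use of a dictionary for the rows of R are assumptions made for the example; the indices f and d are used exactly as on the slide.

```python
from typing import Dict, Optional


class BankSelector:
    """Greedy DRAM bank selection with bitmaps stored as Python ints."""

    def __init__(self, b: int) -> None:
        self.num_banks = 3 * b - 1                  # one more than the 3b - 2 worst-case conflicts
        self.all_banks = (1 << self.num_banks) - 1  # mask with one bit per bank
        self.R: Dict[int, int] = {}                 # R[frame]: banks that must be read in that frame
        self.W = 0                                  # banks reserved for writes in the current frame

    def start_frame(self) -> None:
        """Assumed housekeeping: write reservations last for one frame only."""
        self.W = 0

    def select(self, f: int, d: int) -> Optional[int]:
        """Pick a bank for a packet arriving in frame f and departing in frame d."""
        busy = self.W | self.R.get(f, 0) | self.R.get(d, 0)
        V = self.all_banks & ~busy                  # V = NOT(W | R(f) | R(d))
        if V == 0:
            return None                             # no compatible bank
        n = (V & -V).bit_length() - 1               # index of the first compatible bank
        self.W |= 1 << n                            # W(n) = 1
        self.R[d] = self.R.get(d, 0) | (1 << n)     # R(d, n) = 1
        return n


sel = BankSelector(b=3)
sel.start_frame()
first = sel.select(f=1, d=20)
second = sel.select(f=1, d=20)
print(first, second)   # 0 1
```

With at most b arrivals per frame and at most b departures per frame, `select` should never return `None`, because at most 3b - 2 of the 3b - 1 bits can be set in `busy`.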

Slide 23. Size of the Bitmap

- The size of the packet buffer is T packets, i.e. T is the farthest departure time slot relative to the current time slot.
- Farthest departure frame: $\lceil T/b \rceil$.
- Each row in the bitmap is (3b - 1) bits, so the size of the bitmap is $(3b - 1) \cdot \lceil T/b \rceil$ bits.
- Assuming an RTT of 250 ms and a line rate of 40 Gb/s, the packet buffer corresponds to a memory requirement of $T = 3 \times 10^7$ packets, which makes the bitmap size close to 11 MB.
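
The bitmap size can be estimated with a short calculation. The sketch assumes one row of 3b - 1 bits per frame and ceil(T / b) frames up to the farthest departure; the value b = 10 is an assumption taken from the roughly 10x speed difference mentioned at the start of the talk, since this slide does not state b.

```python
def bitmap_bits(T: int, b: int) -> int:
    """Bits in the read-transaction bitmap: ceil(T / b) rows of 3b - 1 bits each."""
    rows = (T + b - 1) // b   # number of frames up to the farthest departure
    cols = 3 * b - 1          # one column per DRAM bank
    return rows * cols


T = 30_000_000                # 3 x 10^7 packets, from the slide
b = 10                        # assumed DRAM-to-line speed ratio
size_mb = bitmap_bits(T, b) / 8 / 1e6
print(f"{size_mb:.3f} MB")    # 10.875 MB
```

Under these assumptions the result is 10.875 MB, in line with the "close to 11 MB" quoted on the slide.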

Slide 24. Additional Details

- Location of a packet in the DRAM:
  - Once a bank has been selected, we need a way to assign the actual memory location at which to write, and later read, the packet.
  - Determine the memory location from the departure frame, using circular indexing to map a frame to a packet location in the memory.
- How to reorder/de-aggregate the packets?
  - Store the timestamp in the DRAM with the packet.
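
One way to picture the circular indexing is the sketch below. It is an assumption-laden illustration rather than the scheme from the talk: it relies on each bank holding at most one packet per departure frame (a single bit R(d, m) in the bitmap), so the departure frame modulo the number of frames in the buffering horizon identifies a unique cell within a bank. The names `horizon_frames` and `packet_bytes` are hypothetical parameters.

```python
def packet_slot(departure_frame: int, horizon_frames: int) -> int:
    """Circular index of the cell, within a bank, for a given departure frame."""
    return departure_frame % horizon_frames


def packet_address(departure_frame: int, horizon_frames: int, packet_bytes: int = 40) -> int:
    """Byte offset inside the selected bank, assuming fixed-size cells."""
    return packet_slot(departure_frame, horizon_frames) * packet_bytes


print(packet_address(20, horizon_frames=16))   # 160: frame 20 maps to cell 4
print(packet_address(16, horizon_frames=16))   # 0: the index wraps around
```

The same address is computed at write time and at read time, so no per-packet pointer needs to be stored as long as `horizon_frames` covers the farthest departure frame.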

Slide 25. Conclusion

- Developed a simple packet buffer architecture for the case where packet departure times are known, e.g. Switch-Memory-Switch and Load-Balanced Routers.
- Can support an arbitrarily large number of logical queues.
- The number of DRAM banks and the size of the SRAM bypass buffer depend only on the physical parameters.

Thank You
