SPP Version 1 Router NAT John DeHart.


1 SPP Version 1 Router NAT John DeHart

2 Notes from 6/10/08 meeting (and after)
Handling of TCP SYN pkts while in various states:
- SYN_WAIT state: HIT: Forward SYN pkt
- ESTABLISHED state: HIT: Forward SYN pkt
- FIN_WAIT state: HIT: XScale recycles connection and restarts, forward SYN pkt
TCP State on XScale uses full 5-tuple.
Do we need to identify the slice initiating a connection on Egress so we can put some limits on the number of NAT connections used by a slice?
FastPath needs to know which lookup hits are for NAT and which are for preconfigured connections. We will use the xlated port field in the lookup result to indicate this:
- 0 in the port result means no NAT translation required; just use the port field from the lookup key.
- !0 in the port result means this is a NAT translation, even if the value in the lookup key is the same as the value in the lookup result.
Watch out for anywhere on Egress that we might say, “Send it to the CP”:
- There is no path to the CP from Egress.
- Egress sends to the RTM.
- The CP hangs off the Hub.
Other protocols: what if we get a packet that is not TCP, UDP or ICMP (GRE, IP/IP, IGMP, etc.)?
- Ingress: We could send it to the CP.
- Egress: We actually have no path to send anything to the CP, so we would drop.
With ONL there was an issue with the XScale accessing pkt buffers beyond 128K pkts:
- The XScale can only access the low 256MB of DRAM.
- There is probably a way to work around this but we never found it for ONL.
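The xlated-port convention above can be sketched in a few lines. This is only an illustration in Python, not the microengine code; the function name `effective_port` and its return shape are assumptions made for the example.

```python
from typing import Tuple


def effective_port(key_port: int, result_port: int) -> Tuple[int, bool]:
    """Return (port to write into the packet, whether this hit is a NAT hit).

    Hypothetical helper: a result port of 0 means "preconfigured, no NAT";
    any non-zero result port means NAT, even if it equals the key port.
    """
    if result_port == 0:
        return key_port, False   # keep the port from the lookup key
    return result_port, True     # NAT translation, possibly to the same value
```

Note that a NAT entry can never translate to port 0 under this convention, since 0 is reserved as the "no translation" marker.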

3 SPP V1 NAT Overview
We want to support existing PlanetLab applications as is.
- Users should not have to make code changes to get their applications to run on our GPEs.
Multiple GPEs compete for TCP/UDP port space and ICMP ID space on physical interfaces.
- NAT needed for port translation
- NAT NOT needed for IP address translation
It would be good if we could:
- Avoid packet dropping while awaiting NAT resolution.
- Maintain packet order.
NAT translation to be done for:
- TCP and UDP
  - Src Port for outgoing pkts (LC Egress)
  - Dst Port for incoming pkts (LC Ingress)
- ICMP
  - Lookup needs to include an ICMP type field to differentiate between Echo Request and Echo Reply
  - ID for outgoing Echo Request (LC Egress)
  - ID for incoming Echo Reply (LC Ingress)
  - Other ICMP messages?
No Application Level Gateways needed for our system.
- Examples of things that need them in “normal” networks: FTP, SNMP, DNS

4 SPP V1 NAT
Two big questions:
- How and when to add filters for flows requiring NAT?
- How and when to remove filters for flows requiring NAT?
We want to add and remove Ingress and Egress filters together.

5 Ingress Traffic
Destined for NPE
- Preconfigured entries in Lookup table
- Should be no need for NAT
Destined for GPE
- Slice on GPE registers that it is going to listen on a particular IP Addr, Protocol, Port. This will cause a preconfigured entry in the Lookup table.
- Result of Egress traffic from GPE: traffic going through Egress initiated by a GPE causes XScale/Control to install filter(s) in Ingress. More details in the Egress discussion.
Destined for CP
- Preconfigured
- Do we want a preconfigured “default” for ICMP Echo Request?
- What about ICMP errors (Destination Unreachable)?
No default destination for Ingress Lookup misses.
- Some are sent to the XScale.
- Some are dropped.
- Some will be handled by the CP eventually: SPP V2?

6 Egress Traffic
From NPE
- Preconfigured entries in Lookup table
- Should be no need for NAT
From GPE
- Preconfigured entries in Lookup table: slices can request preconfigured ports.
- Slice on GPE initiates a new flow. Examples:
  - Slice on GPE opens TCP connection to another node.
  - Slice on GPE pings another node.
  - Slice on GPE initiates a UDP flow with a bind.
  - Slice on GPE initiates a UDP flow without a bind.
- When the first packet of one of these types of flows arrives at the LC Egress we may not have a filter entry that matches it.
- Anything that does not have a match gets sent to the XScale for resolution.
- This may cause drops and re-ordering in V1 but we’ll live with this for now and try to deal with it in V2.
- In V2 we will look at the possibility of adding:
  - Support in GPE kernel and/or libc to “catch” calls to bind, send, etc. so we can configure entries in the LC for NAT.
  - Support in LC data path for queuing packets awaiting NAT resolution.
  - Other solutions…
From CP

7 TCAM and Aging
TCAM inserts Aging cycles in its processing to test entries to see if they have been accessed in the defined timeout interval.
- Timeout intervals are defined on a per database basis.
- Individual entries in a database can NOT have different timeout intervals.
Age Enable Array: 1 bit per filter entry in a database to enable/disable aging for that entry
Age Fifo Report Enable Array: 1 bit per filter entry in a database to enable/disable age reporting for that entry
Age Fifo Select Array: 1 bit per filter entry in a database to select which one of two Age Fifos to report to (Age Fifo 0, Age Fifo 1)
Age Activity Array: 1 bit per filter entry in a database to indicate if it has been active since its last Aging check
- Even if a filter has been timed out and reported, the age activity bit will be set if the filter is matched by a lookup.
- If/When we read the Age Activity bits
Control Bits in DB Conf1
- AR Only: Report AND Invalidate or just Report
- AE Update: Update Age Enable Array or not
- AFR: How to initialize the Age Fifo Report Enable array when an entry is written
- AFS: Which FIFO to report to
- Activity Enable: Enable/Disable Age Activity Array updates

8 TCAM and Aging
TCAM can do one of two things when Aging:
- Report and Invalidate a database entry that has been inactive for a period of time
  - Report: Put the index of the filter that timed out in one of two Age Fifos. The XScale can read these Fifos.
  - Invalidate: The filter is removed from the database.
- Report but do not Invalidate a database entry that has been inactive for a period of time
  - On a per database basis we can control whether Aging is disabled or not for a filter entry that has timed out.
  - Age Reporting is always disabled for a filter that is reported.
  - We will NOT get a second Age Report for a filter until we re-enable Reporting and Aging (if necessary).
Each entry in a database can have Aging enabled or disabled independent of other entries.
Each entry in a database can have Age Reporting enabled or disabled independent of other entries.
Setting the AE_Update bit in the DB CONF1 Register to 0 causes the Age Enable Array to be unmodified when an entry times out.
If the AE_Update bit is set to 1 then the Age Enable Array bit for an entry that is timed out is cleared.
- Using AE_Update set to 0 will mean that we will have to enable Aging when we add a filter entry.
- Using AE_Update set to 1 would mean that the TCAM would automatically enable Aging when we added a filter entry.
It is always the case that when an entry times out the corresponding bit in the Age Fifo Report Enable Array is cleared, so it will not report another timeout on that entry until we re-enable it.
Age Activity Bits
- Set when a filter is matched
- True whether filter has been reported as timed out or not
- When Age Activity Bits are read we can control whether they are cleared or not
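To make the per-entry bit behavior concrete, here is a small software model of the "Report but do not Invalidate" case. It is a sketch under stated assumptions: the class `AgingEntry` and function `on_timeout` are hypothetical names, and the model only encodes the rules listed on this slide (the report-enable bit is always cleared on a report; the age-enable bit is cleared only when AE_Update is 1). It does not model the actual TCAM registers.

```python
from dataclasses import dataclass


@dataclass
class AgingEntry:
    """Hypothetical per-filter view of the aging bit arrays."""
    age_enable: bool = True      # Age Enable Array bit
    report_enable: bool = True   # Age Fifo Report Enable Array bit
    activity: bool = False       # Age Activity Array bit


def on_timeout(entry: AgingEntry, ae_update: int) -> bool:
    """Apply the timeout rules; return True if the entry is reported."""
    if not (entry.age_enable and entry.report_enable):
        return False                 # nothing is reported for this entry
    entry.report_enable = False      # always cleared when an entry is reported
    if ae_update == 1:
        entry.age_enable = False     # cleared only when AE_Update is 1
    return True


def on_lookup_match(entry: AgingEntry) -> None:
    """A lookup hit sets the activity bit, reported as timed out or not."""
    entry.activity = True
```

The second call to `on_timeout` on the same entry returns False until software re-enables reporting, which is the "no second Age Report" behavior the slide describes.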

9 NAT for Different Protocols
TCP: three types of timeouts involved:
- TCP_SYN_TIMEOUT: Between SYN and SYN-ACK
- TCP_IDLE_TIMEOUT: Idle time between SYN-ACK and FIN (established connection, data transfer state)
- TCP_FIN_TIMEOUT: After receiving second FIN, to allow time for the final ACK
UDP: one type of timeout:
- UDP_IDLE_TIMEOUT: BOTH directions have been idle
ICMP:
- ICMP_IDLE_TIMEOUT: BOTH directions have been idle

10 TCP on Egress
What to do with TCP pkts depending on Hit/Miss and what TCP Control bits are set:
MISS/SYN:
- FastPath sends to XScale
- XScale installs Egress and Ingress filters with a timeout of TCP_SYN_TIMEOUT
- XScale performs NAT translation and forwards SYN packet?
MISS/!SYN:
- FastPath drops pkt
- Does XScale need to know? We might need to send a RST pkt to sender.
HIT/RST:
- FastPath sends to XScale AND (performs NAT translation and forwards pkt)
- Does XScale need to access actual pkt in DRAM? If so there might be a race condition here with the FastPath removing it after transmitting it.
- XScale removes filters
HIT/FIN:
- If XScale has received FIN in both directions then it changes the timeout of Egress and Ingress filters to TCP_FIN_TIMEOUT
HIT/!(RST or FIN):
- FastPath performs NAT translation and forwards pkt
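The Egress TCP cases can be read as a small decision function. The sketch below is illustrative Python, not FastPath code; `egress_tcp_action` is a hypothetical name, the action strings are made up for the example, and it assumes HIT/FIN is also copied to the XScale (as the later "FastPath to XScale Data" slide lists TCP/HIT/FIN). The flag masks are the standard TCP header bit values.

```python
# Standard TCP control-bit masks (low bits of the TCP flags byte).
FIN, SYN, RST = 0x01, 0x02, 0x04


def egress_tcp_action(hit: bool, tcp_flags: int) -> set:
    """Return the set of actions for an Egress TCP packet."""
    if not hit:
        # MISS/SYN goes to the XScale for filter setup; any other miss is dropped.
        return {"to_xscale"} if tcp_flags & SYN else {"drop"}
    if tcp_flags & (RST | FIN):
        # HIT with RST or FIN: forward and also tell the XScale.
        return {"forward", "to_xscale"}
    # Any other HIT (including a retransmitted SYN) is translated and forwarded.
    return {"forward"}
```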

11 TCP on Ingress
HIT/SYN-ACK:
- FastPath sends to XScale AND (performs NAT translation and forwards pkt)
- XScale changes timeout of Egress and Ingress filters to TCP_IDLE_TIMEOUT
HIT/SYN (!ACK, Simultaneous Open?):
- With NAT we should NOT have a Simultaneous Open event. We cannot be sure that the Ingress SYN is actually for the same 5-tuple.
- FastPath drops pkt (Does XScale need to know?)
HIT/(RST or FIN):
- HIT/RST: Does XScale need to access actual pkt in DRAM? If so there might be a race condition here with the FastPath removing it after transmitting it. XScale removes filters.
- HIT/FIN: If XScale has received FIN in both directions then it changes the timeout of Egress and Ingress filters to TCP_FIN_TIMEOUT.
HIT/!(SYN,RST,FIN):
- FastPath performs NAT translation and forwards pkt
MISS:
- FastPath drops pkt (Does XScale need to know?)

12 TCP
TCP Timeouts
- TCP_SYN_TIMEOUT: ~4 Minutes
- TCP_IDLE_TIMEOUT: ~24 Hours
- TCP_FIN_TIMEOUT: ~4 Minutes
SYNs and FINs time out in software in the XScale; no TCAM aging support used.
24 Hour timeout of TCP connections in software in the XScale. No TCAM aging support used.
- This is a hard timeout NOT dependent on activity.
- If you want a longer TCP connection, allocate and configure one.
- Possibility of coordinating a DB query with FlowStats to prolong connections that have been active during the last N hours of that 24-hour period
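One way to picture the software timeouts is a per-connection state object on the XScale whose deadline is reset only on state changes, never on data packets (the "hard timeout" above). This is a sketch, not the XScale daemon: the class `NatTcpConn`, its method names, and the state strings are hypothetical, and the timeout constants use the approximate values from the slide expressed in seconds.

```python
TCP_SYN_TIMEOUT = 4 * 60            # ~4 minutes
TCP_IDLE_TIMEOUT = 24 * 60 * 60     # ~24 hours, hard limit
TCP_FIN_TIMEOUT = 4 * 60            # ~4 minutes


class NatTcpConn:
    """Hypothetical XScale-side record for one NAT'd TCP connection."""

    def __init__(self, now: float):
        # Created when an Egress MISS/SYN installs the filter pair.
        self.state = "SYN_WAIT"
        self.deadline = now + TCP_SYN_TIMEOUT
        self.fin_seen = set()

    def on_ingress_syn_ack(self, now: float) -> None:
        if self.state == "SYN_WAIT":
            self.state = "ESTABLISHED"
            self.deadline = now + TCP_IDLE_TIMEOUT

    def on_fin(self, direction: str, now: float) -> None:
        # direction is "egress" or "ingress"; act once both have been seen.
        if self.state == "CLOSED":
            return
        self.fin_seen.add(direction)
        if self.fin_seen == {"egress", "ingress"}:
            self.state = "FIN_WAIT"
            self.deadline = now + TCP_FIN_TIMEOUT

    def on_rst(self) -> None:
        self.state = "CLOSED"        # filters are removed immediately

    def expired(self, now: float) -> bool:
        return self.state != "CLOSED" and now >= self.deadline
```

Because no method touches `deadline` on ordinary data traffic, an established connection expires 24 hours after the SYN-ACK regardless of activity.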

13 UDP
We need to be sure that both the Ingress and Egress filters have timed out:
- Use TCAM aging to give us a 5-minute timeout of UDP filters.
- When we get the first timeout for a pair we just record it.
- When we get the second timeout for a pair we check the Age Activity bit for the other side:
  - If it has had no activity we close the connections.
  - If it has had activity we re-enable aging and age reporting for the first connection (second connection remains in our timed out state), then wait for the first connection to time out again and repeat this algorithm.
- Caveat: we have to write enable bits for 32 sequential filters at once, so we might have to re-enable other filters also, but that might just mean we get timeouts for filters we already had timeouts for. We can just ignore these.
- We still need to verify that this is the way the activity bits work. Verified
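The paired-timeout algorithm can be sketched as a tiny state holder per Ingress/Egress filter pair. This is an illustrative model only: `UdpNatPair`, `on_age_report`, and the returned action strings are hypothetical, and the caller is assumed to have read the other filter's Age Activity bit from the TCAM before calling.

```python
class UdpNatPair:
    """Hypothetical tracker for one Ingress/Egress UDP NAT filter pair."""

    SIDES = ("ingress", "egress")

    def __init__(self):
        self.timed_out = set()   # sides currently recorded as timed out
        self.closed = False

    def on_age_report(self, side: str, other_side_active: bool) -> str:
        """Handle an Age Fifo report for one side of the pair.

        other_side_active is the Age Activity bit of the other filter.
        """
        if self.closed or side in self.timed_out:
            return "ignore"              # duplicate report (32-filter re-enable caveat)
        self.timed_out.add(side)
        if len(self.timed_out) == 1:
            return "record"              # first timeout: just remember it
        if not other_side_active:
            self.closed = True
            return "close"               # both idle: remove both filters
        other = self.SIDES[1 - self.SIDES.index(side)]
        self.timed_out.discard(other)    # other side was active: re-arm it
        return "rearm:" + other
```

A "rearm" result means re-enabling aging and age reporting for the earlier filter while the side that just reported stays in the timed-out state, so the next report repeats the same check in the other direction.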

14 UDP
Egress:
- MISS: FastPath sends to XScale. XScale installs Egress (and Ingress?) filter with a timeout of UDP_IDLE_TIMEOUT.
- HIT: FastPath does NAT translation if needed and forwards packet.
Ingress:
- MISS: FastPath drops pkt (Does XScale need to know?)
- HIT: FastPath does NAT translation if needed and forwards packet.
UDP Timeouts:
- UDP_IDLE_TIMEOUT: 5 Minutes

15 ICMP
Egress:
- MISS:
  - Echo Request:
    - FastPath sends to XScale
    - XScale sets up Ingress and Egress filters
    - Timeout on filters should be ICMP_IDLE_TIMEOUT
    - XScale re-injects pkt with translation into fast path for translation and forwarding.
  - Error: (Should be handled by the fast path)
    - KE detects that it is an ICMP Error pkt
    - Extracts internal IP Header fields for lookup key
    - KE sends lookup key and a flag telling Lookup that this is an ICMP Error lookup
    - Lookup performs lookup using key provided by KE
      - HIT: Send on to HF with flag indicating it is an ICMP Error pkt
      - MISS: Drop. There should be no MISS as there should be a completely general filter to catch everything and send to CP, but just in case Lookup should drop a MISS.
  - Otherwise: FastPath drops pkt
- HIT: FastPath forwards.
Ingress:
- FastPath forwards pkt
ICMP Timeouts:
- ICMP_IDLE_TIMEOUT: same as UDP_IDLE_TIMEOUT (4 or 5 minutes?)

16 FastPath to XScale Data
Reasons for Egress FP to send to XScale:
- TCP/MISS/SYN
- TCP/HIT/RST
- TCP/HIT/FIN
- UDP/MISS
- ICMP/MISS/Echo_Req
- ICMP/MISS/Error (Type = 3,4,5,11,12)
Reasons for Ingress FP to send to XScale:
- TCP/HIT/SYN-ACK

17 FastPath to XScale Data
Egress:
- Flags (8b): H (Hit) 1b, Rsvd 1b, TCP Flags 6b (URG, ACK, PSH, RST, SYN, FIN; 1b each)
- Fields: Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), SrcMAC (8b), IP_SAddr (32b), IP Proto (8b), TCP/UDP SPort or ICMP ID (16b), ICMP Type (8b), IP_DAddr (32b), TCP/UDP DPort (16b), TCAM Hit Index (32b), IP Hdr 1st Word (32b), IP Hdr Top 16 bits of 2nd Word (16b)
Ingress:
- Flags (8b): H (Hit) 1b, Rsvd 3b, TCP Flags 6b (URG, ACK, PSH, RST, SYN, FIN; 1b each)
- Fields: Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), Reserved (8b), Rsv, Intf (4b), IP DAddr (32b), Protocol (8b), TCP/UDP DPort or ICMP ID (16b), ICMP Type (8b), IP_SAddr (32b), TCP/UDP SPort (16b), TCAM Hit Index (32b), IP Hdr 1st Word (32b), IP Hdr Top 16 bits of 2nd Word (16b)
TCP State on XScale uses full 5-tuple.
TCP state updates include TCAM Hit Index.
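As an illustration of the Egress Flags byte (H 1b, Rsvd 1b, TCP Flags 6b), the sketch below packs and unpacks it. The bit positions are an assumption: H in the top bit, the reserved bit next, and the six TCP control bits in the low six bits in the same order as the TCP header (URG down to FIN). The function names are hypothetical.

```python
HIT_BIT = 0x80          # assumed position of H
TCP_FLAG_MASK = 0x3F    # URG, ACK, PSH, RST, SYN, FIN


def pack_egress_flags(hit: bool, tcp_flags_byte: int) -> int:
    """Build the Flags byte from the hit status and the TCP header flags byte."""
    return (HIT_BIT if hit else 0) | (tcp_flags_byte & TCP_FLAG_MASK)


def unpack_egress_flags(flags: int) -> dict:
    """Split the Flags byte back into named booleans."""
    return {
        "hit": bool(flags & HIT_BIT),
        "urg": bool(flags & 0x20),
        "ack": bool(flags & 0x10),
        "psh": bool(flags & 0x08),
        "rst": bool(flags & 0x04),
        "syn": bool(flags & 0x02),
        "fin": bool(flags & 0x01),
    }
```

Masking with `TCP_FLAG_MASK` drops any bits above URG in the TCP header byte, so only the six control bits named on the slide are carried to the XScale.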

18 XScale (and Lookup) to FastPath Data
Egress:
- Flags (8b): Reserved 3b, N (NAT) 1b, H (Hit), I (ICMP), U (UDP), T (TCP)
- Fields: Flags (8b), Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), Reserved (8b), VLAN (12b), PerSchedQID (15b), Sch 3b, QM 2b, Translated SPort (16b), Stats Index (16b), IP DAddr (32b), IP Hdr 1st Word (32b), IP Hdr Top 16 bits of 2nd Word (16b), Reserved (16b)
Ingress:
- Flags (8b): Reserved 3b, N (NAT) 1b, H (Hit), I (ICMP), U (UDP), T (TCP)
- Fields: Flags (8b), Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), Reserved (8b), VLAN (12b), PerSchedQID (15b), Sch 3b, QM 2b, Translated DPort/ID (16b), Stats Index (16b), IP Hdr 1st Word (32b), IP Hdr Top 16 bits of 2nd Word (16b), Reserved (16b)
Flags:
- H: HIT - Lookup was a valid hit.
- N: NAT - NAT translation is required
- I: ICMP - ICMP pkt
- U: UDP - UDP pkt
- T: TCP - TCP pkt
At most one of I/U/T should be set at any time.
If N is 0, then I/U/T will be ignored.
- HF does not need to do any protocol specific operations for packets that do not require NAT translation.
No need to send any H=0 pkts to HF.
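The flag rules (at most one of I/U/T set, and I/U/T ignored when N is 0) can be checked with a short helper. This is a sketch only: the bit positions below follow the order in which the flags are listed on this slide and are an assumption, and `nat_protocol` is a hypothetical name.

```python
# Assumed layout, most significant bit first: Reserved(3) N H I U T
N_BIT, H_BIT, I_BIT, U_BIT, T_BIT = 0x10, 0x08, 0x04, 0x02, 0x01


def nat_protocol(flags: int):
    """Return 'icmp', 'udp', 'tcp', or None when no protocol-specific work is needed."""
    if not flags & N_BIT:
        return None                       # N == 0: I/U/T are ignored
    names = [name for bit, name in
             ((I_BIT, "icmp"), (U_BIT, "udp"), (T_BIT, "tcp"))
             if flags & bit]
    if len(names) > 1:
        raise ValueError("at most one of I/U/T may be set")
    return names[0] if names else None
```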

19 ME Block Design
This next set of slides still needs some work.
We still need to define:
- DONE: Re-Injection path for XScale to put pkts back into FastPath
- DONE: Complete definition of FastPath to/from XScale data
- DONE: How do the TCP Control bits get from Pkt to Lookup to XScale?

20 Proposed Change: SPP V1 LC Ingress
[Block diagram: LC Ingress pipeline (RTM, MSF, RBUF, Rx1, Rx2, Key Extract, Lookup with TCAM, Hdr Format, QM0-QM3, Port Splitter, 1x10G Tx1, 1x10G Tx2, TBUF, Switch, Stats (1 ME), SRAM1-SRAM3), with the XScale attached through NAT Scratch Rings. Blocks are linked by NN and SCR rings.]
Key Extract to Lookup:
- Flags (8b): Rsv 1b, IE (ICMP ERR) 1b, TCP Flags 6b
- Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), Rsv (4b), Intf (4b)
- Lookup Key: IP DAddr (32b), Protocol (8b), TCP/UDP DPort or ICMP ID (16b), ICMP Type (8b)
- IP SAddr (32b), IP Hdr 1st Word (32b), IP Hdr Top 16 bits of 2nd Word (16b), TCP/UDP SPort (16b)
Lookup to Hdr Format:
- Flags (8b): Rsvd 2b, IE (ICMP ERR) 1b, T (TCP) 1b, U (UDP) 1b, I (ICMP) 1b, N (NAT) 1b, H (Hit) 1b
- Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), Reserved (8b)
- Lookup Result: VLAN (12b), QM 2b, Sch 3b, PerSchedQID (15b), Translated DPort/ID (16b), Stats Index (16b)
- IP Hdr 1st Word (32b), IP Hdr Top 16 bits of 2nd Word (16b), Original DPort/ID (16b)

21 Proposed Change: SPP V1 LC Ingress
[Block diagram: LC Ingress pipeline (RTM, MSF, RBUF, Rx1, Rx2, Key Extract, Lookup with TCAM, Hdr Format, QM0-QM3, Port Splitter, 1x10G Tx1, 1x10G Tx2, TBUF, Switch, Stats (1 ME), SRAM1-SRAM3), showing the records exchanged with the XScale over the NAT Scratch Rings.]
Lookup to XScale:
- Flags (8b): H (Hit) 1b, Rsv, TCP Flags 6b (URG, ACK, PSH, RST, SYN, FIN)
- Reserved (8b), Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), Rsv (4b), Intf (4b)
- IP DAddr (32b), Protocol (8b), TCP/UDP DPort or ICMP ID (16b), ICMP Type (8b), IP_SAddr (32b), TCP/UDP SPort (16b)
- IP Hdr 1st Word (32b), IP Hdr Top 16 bits of 2nd Word (16b), TCAM Hit Index (32b)
XScale to Hdr Format:
- Flags (8b): Rsvd 2b, IE (ICMP ERR) 1b, T 1b, U 1b, I 1b, N 1b, H 1b
- Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), Reserved (8b)
- VLAN (12b), PerSchedQID (15b), Sch 3b, QM 2b, Translated DPort/ID (16b), Stats Index (16b)
- IP Hdr 1st Word (32b), IP Hdr Top 16 bits of 2nd Word (16b), Original DPort/ID (16b)

22 Proposed Change: SPP V1 LC Egress
[Block diagram: LC Egress pipeline (Switch, MSF, RBUF, Rx1, Rx2, Key Extract, Lookup with TCAM, Hdr Format, QM0-QM3, Port Splitter, 5x1G Tx1 (P0-P4), 5x1G Tx2 (P5-P9), TBUF, RTM, Flow Stats1, Flow Stats2, Stats (1 ME), SRAM1-SRAM3, SRAM Freelist, Archive Records), with the XScale attached through NAT Scratch Rings and a NAT Pkt return ring.]
Key Extract to Lookup:
- Flags (8b): Rsv 1b, IE (ICMP ERR) 1b, TCP Flags 6b
- Reserved (8b), Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), SrcMAC (8b)
- IP_SAddr (32b), IP Proto (8b), TCP/UDP SPort or ICMP ID (16b), ICMP Type (8b)
- IP DAddr (32b), IP Hdr 1st Word (32b), IP Hdr Top 16 bits of 2nd Word (16b), TCP/UDP DPort (16b)
Lookup to Hdr Format:
- Flags (8b): Rsvd 2b, IE (ICMP ERR) 1b, T (TCP) 1b, U (UDP) 1b, I (ICMP) 1b, N (NAT) 1b, H (Hit) 1b
- Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), Reserved (8b)
- Lookup Result: VLAN (12b), PerSchedQID (15b), Sch 3b, QM 2b, Translated SPort (16b), Stats Index (16b)
- IP DAddr (32b), IP Hdr 1st Word (32b), IP Hdr Top 16 bits of 2nd Word (16b), Original SPort/ID (16b), Reserved (16b)

23 Proposed Change: SPP V1 LC Egress
[Block diagram: LC Egress pipeline (Switch, MSF, RBUF, Rx1, Rx2, Key Extract, Lookup with TCAM, Hdr Format, QM0-QM3, Port Splitter, 5x1G Tx1 (P0-P4), 5x1G Tx2 (P5-P9), TBUF, RTM, Flow Stats1, Flow Stats2, Stats (1 ME), SRAM1-SRAM3, SRAM Freelist, Archive Records), showing the records exchanged with the XScale over the NAT Scratch Rings and the NAT Pkt return ring.]
Lookup to XScale:
- Flags (8b): H (Hit) 1b, Rsv, TCP Flags 6b (URG, ACK, PSH, RST, SYN, FIN)
- Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), SrcMAC (8b)
- IP_SAddr (32b), IP Proto (8b), TCP/UDP SPort or ICMP ID (16b), ICMP Type (8b), IP_DAddr (32b), TCP/UDP DPort (16b)
- IP Hdr 1st Word (32b), IP Hdr Top 16 bits of 2nd Word (16b), TCAM Hit Index (32b)
XScale to Hdr Format:
- Flags (8b): Rsvd 2b, IE (ICMP ERR) 1b, T 1b, U 1b, I 1b, N 1b, H 1b
- Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), Reserved (8b)
- VLAN (12b), PerSchedQID (15b), Sch 3b, QM 2b, Translated SPort (16b), Stats Index (16b)
- IP DAddr (32b), IP Hdr 1st Word (32b), IP Hdr Top 16 bits of 2nd Word (16b), Original SPort/ID (16b)

24 FastPath Code Changes
dl_system.h
- Add Scratch ring for XScale to HF return path (we already have a scratch ring TO XScale)
- Change number of packets to 128K
Key Extract
- Read more in first DRAM read so we get the TCP Flags
- If pkt is not IP, drop it
- If pkt is IP, test for TCP, UDP or ICMP
  - If pkt is not TCP, UDP or ICMP, drop pkt
  - If pkt is ICMP/IP and is of type ERROR (3,4,5,11,12), then perform a second DRAM read to get the internal IP Header. Use the internal IP header for lookup and set bit indicating ICMP Error.
- Pass more data to Lookup:
  - TCP Flags
  - Rest of IP 5-tuple so XScale can have it if needed for NAT
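The Key Extract tests listed above amount to a short classification step. The sketch below shows it in Python on a raw IPv4 packet; it is not the microengine code, and `classify` and its result strings ("drop", "lookup", "lookup_inner") are hypothetical names chosen for the example.

```python
PROTO_ICMP, PROTO_TCP, PROTO_UDP = 1, 6, 17
ICMP_ERROR_TYPES = {3, 4, 5, 11, 12}


def classify(ip_pkt: bytes) -> str:
    """Decide what Key Extract should do with a packet.

    'lookup'       : build the key from this packet's own header
    'lookup_inner' : ICMP error, build the key from the internal IP header
    'drop'         : not IPv4, or not TCP/UDP/ICMP
    """
    if len(ip_pkt) < 20 or ip_pkt[0] >> 4 != 4:
        return "drop"                       # not an IPv4 packet
    ihl = (ip_pkt[0] & 0x0F) * 4            # IP header length in bytes
    proto = ip_pkt[9]
    if proto in (PROTO_TCP, PROTO_UDP):
        return "lookup"
    if proto == PROTO_ICMP:
        if len(ip_pkt) <= ihl:
            return "drop"                   # truncated: no ICMP type byte
        icmp_type = ip_pkt[ihl]
        return "lookup_inner" if icmp_type in ICMP_ERROR_TYPES else "lookup"
    return "drop"                           # GRE, IP/IP, IGMP, etc.
```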

25 FastPath Code Changes
Lookup
- Change how we test for DONE, similar to recent change for ONL.
- Get more input data from KE
- If ICMP Error bit is set and lookup is a MISS, drop pkt. In this case, lookup is done on the internal IP header fields found by KE in the ICMP payload.
- Feed NAT related info to XScale
Ingress
- ICMP: Nothing sent to XScale. FastPath handles Errors; Misses are dropped; Hits are forwarded.
- UDP: MISS: drop. HIT: forwarded by FastPath.
- TCP:
  - MISS: Drop
  - HIT and SYN-ACK, RST or FIN: Forward and send to XScale
  - HIT and SYN (!ACK): Drop
  - HIT and other: Fwd
Egress
- ICMP: FastPath handles Errors. MISS and Echo Request: send to XScale. MISS and other: drop. HIT: forwarded by FastPath.
- UDP: MISS: sent to XScale. HIT: forwarded by FastPath.
- TCP:
  - MISS and SYN: Send to XScale
  - MISS and other: Drop
  - HIT and RST or FIN: Forward and send to XScale
  - HIT and other: Forward

26 FastPath Code Changes
HF
- Remove accesses to Scratch ring to XScale
- Add Scratch ring from XScale
- Process requests from Lookup AND from XScale

27 SPP V1 NAT

28 UDP
[Figure: timeline of UDP packet arrivals (*) against successive 5-minute aging intervals, with a final interval of less than 5 minutes.]

29 SPP V1 NAT Notes
LC Ingress Lookup Key (72b):
- Interface (8b)
- IP DAddr (32b)
- Protocol (8b): TCP, UDP, ICMP, etc.
- DPort/Identifier (16b): DPort for TCP and UDP; Identifier for ICMP Echo Request/Reply
- Type (8b): Primarily for use with ICMP to distinguish between ICMP Echo Request and Reply. For TCP and UDP should be a Don’t Care.
LC Ingress Lookup Result (72b):
- VLAN (12b)
- Stats Index (16b)
- MAC Addr (8b)
- QID (20b):
  - QM_ID (2b)
  - Scheduler (3b)
  - QID (15b) (value given to SRAM controller by QM)
- Translated DPort/Identifier (16b)
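The 72-bit Ingress key adds up as 8 + 32 + 8 + 16 + 8 bits. A minimal sketch of packing it into one integer follows; the field order (most significant first, in the order listed above) is an assumption, as is the function name `pack_ingress_key`. A real TCAM entry would also carry a mask so that the Type byte can be a Don't Care for TCP and UDP.

```python
import ipaddress


def pack_ingress_key(intf: int, daddr: str, proto: int,
                     dport_or_id: int, icmp_type: int = 0) -> int:
    """Pack the LC Ingress lookup key fields into a 72-bit integer."""
    key = intf & 0xFF                                        # Interface (8b)
    key = (key << 32) | int(ipaddress.IPv4Address(daddr))    # IP DAddr (32b)
    key = (key << 8) | (proto & 0xFF)                        # Protocol (8b)
    key = (key << 16) | (dport_or_id & 0xFFFF)               # DPort/Identifier (16b)
    key = (key << 8) | (icmp_type & 0xFF)                    # Type (8b)
    return key


# Mask with the Type byte as Don't Care, for TCP and UDP entries.
TYPE_DONT_CARE_MASK = ((1 << 72) - 1) & ~0xFF
```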

30 SPP V1 NAT Notes
LC Egress Lookup Key (64b):
- IP SAddr (32b)
- Protocol (8b): TCP, UDP, ICMP, etc.
- SPort/Identifier (16b): SPort for TCP and UDP; Identifier for ICMP Echo Request/Reply
- Type (8b): Primarily for use with ICMP to distinguish between ICMP Echo Request and Reply. For TCP and UDP should be a Don’t Care.
LC Egress Lookup Result (64b):
- VLAN (12b)
- Stats Index (16b)
- QID (20b):
  - QM_ID (2b)
  - Scheduler (3b)
  - QID (15b) (value given to SRAM controller by QM)
- Translated SPort/Identifier (16b)
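The 64-bit Egress result splits as 12 + 16 + 20 + 16 bits, with the 20-bit QID itself made of QM_ID (2b), Scheduler (3b), and the per-scheduler QID (15b). The sketch below packs and unpacks that layout; the bit ordering (fields most significant first, in the order listed) is an assumption and the function names are hypothetical.

```python
def pack_egress_result(vlan: int, stats_index: int, qm_id: int,
                       sched: int, qid: int, xlated_sport: int) -> int:
    """Pack the LC Egress lookup result into a 64-bit integer."""
    r = vlan & 0xFFF                        # VLAN (12b)
    r = (r << 16) | (stats_index & 0xFFFF)  # Stats Index (16b)
    r = (r << 2) | (qm_id & 0x3)            # QM_ID (2b)
    r = (r << 3) | (sched & 0x7)            # Scheduler (3b)
    r = (r << 15) | (qid & 0x7FFF)          # QID (15b)
    r = (r << 16) | (xlated_sport & 0xFFFF) # Translated SPort/Identifier (16b)
    return r


def unpack_egress_result(r: int) -> dict:
    """Inverse of pack_egress_result."""
    return {
        "xlated_sport": r & 0xFFFF,
        "qid": (r >> 16) & 0x7FFF,
        "sched": (r >> 31) & 0x7,
        "qm_id": (r >> 34) & 0x3,
        "stats_index": (r >> 36) & 0xFFFF,
        "vlan": (r >> 52) & 0xFFF,
    }
```

A translated port of 0 in this result would mean "no NAT" under the convention noted from the 6/10/08 meeting.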

31 SPP V1 NAT Notes
ICMP Messages
- Echo Request
- Echo Reply
- Errors
  - Ingress: Contains the IP Hdr of original packet. Presumably the original packet was sent by a GPE and hence should have an entry in the Egress Lookup table.
  - Egress: Being sent out by GPE, NPE or CP. Treat it like an Echo Request? Translation of embedded IP hdr ports?

32 ICMP - RFC 792
ICMP Message: IP Hdr (20B) | ICMP Hdr (4B+) | Data (Variable)
ICMP Hdr: Type | Code | Checksum | Optional Data
Purposes of ICMP (Protocol == 1)
- Error reporting from routers or destination host to source host.
  - ICMP data includes header and first 64 bits of data from the IP packet that caused the error
  - Only fragment 0 of fragmented messages generates ICMP error messages
- Control messages between routers/hosts.
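Because an ICMP error quotes the offending packet's IP header plus the first 8 bytes of its payload, the TCP or UDP ports of the original flow can be recovered from it. The sketch below shows that extraction in Python; `embedded_flow` is a hypothetical helper, and it assumes the 8-byte ICMP error header layout of RFC 792 (type, code, checksum, 4 further bytes) followed by the quoted datagram.

```python
import struct


def embedded_flow(icmp_msg: bytes):
    """Return (proto, saddr, daddr, sport, dport) of the quoted packet, or None.

    icmp_msg starts at the ICMP header of an error message.
    Addresses are returned as 4-byte strings.
    """
    if len(icmp_msg) < 8 + 20:
        return None
    inner = icmp_msg[8:]                     # quoted original IP datagram
    if inner[0] >> 4 != 4:
        return None
    ihl = (inner[0] & 0x0F) * 4
    if ihl < 20 or len(inner) < ihl + 8:
        return None                          # need header plus 64 bits of data
    proto = inner[9]
    if proto not in (6, 17):                 # only TCP and UDP carry ports here
        return None
    saddr, daddr = inner[12:16], inner[16:20]
    sport, dport = struct.unpack("!HH", inner[ihl:ihl + 4])
    return proto, saddr, daddr, sport, dport
```

On Ingress, the quoted source port is the translated one, so the recovered tuple is what would be matched against NAT state for the original Egress flow.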

33 ICMP Echo
Request Type = 8, Reply Type = 0
ICMP Message: Type = 0/8 | Code = 0 | Checksum | Identifier | Sequence Number | Optional Data
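Translating the Identifier of an Echo Request or Reply also requires fixing the ICMP checksum, which covers the whole ICMP message. The sketch below rewrites the ID and recomputes the checksum from scratch; the names `inet_checksum` and `translate_echo_id` are hypothetical. A fast-path implementation would more likely update the checksum incrementally, so treat this as an illustration of the field layout only.

```python
import struct


def inet_checksum(data: bytes) -> int:
    """16-bit one's complement Internet checksum over data."""
    if len(data) % 2:
        data += b"\x00"                              # pad to a 16-bit boundary
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)     # fold carries back in
    return ~total & 0xFFFF


def translate_echo_id(icmp_msg: bytes, new_id: int) -> bytes:
    """Return a copy of an ICMP Echo Request/Reply with a new Identifier."""
    mtype, code, _, _, seq = struct.unpack("!BBHHH", icmp_msg[:8])
    if mtype not in (0, 8):
        raise ValueError("not an Echo Request or Echo Reply")
    payload = icmp_msg[8:]
    # Checksum is computed with the checksum field set to zero.
    zeroed = struct.pack("!BBHHH", mtype, code, 0, new_id, seq)
    csum = inet_checksum(zeroed + payload)
    return struct.pack("!BBHHH", mtype, code, csum, new_id, seq) + payload
```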

34 ICMP Message Types
Type  Code  Message
0     -     Echo Reply
3     -     Destination Unreachable (Error)
      0     Network Unreachable
      1     Host Unreachable
      2     Protocol Unreachable
      3     Port Unreachable
      4     Fragmentation needed and DF set
      5     Source route failed
      6     Destination network unknown
      7     Destination host unknown
      8     Source host isolated
      9     Communication with destination network administratively prohibited
      10    Communication with destination host administratively prohibited
      11    Network unreachable for type of service
      12    Host unreachable for type of service
4     -     Source Quench: report congestion to original host
5     -     Redirect: request host use different route
      0     Redirect for network (obsolete)
      1     Redirect for host
      2     Redirect for type-of-service and network
      3     Redirect for type-of-service and host
8     -     Echo Request
9     -     Router Advertisement
10    -     Router Solicitation
11    -     Time Exceeded for a Datagram
      0     Time-to-live equals 0 during transit (traceroute)
      1     Time-to-live equals 0 during reassembly (timeout occurred while waiting for fragments)
12    -     Parameter Problem: any other error condition (incorrect option)
      0     IP Header bad
      1     Required option missing
13    -     Timestamp Request
14    -     Timestamp Reply
15    -     Information Request (obsolete)
16    -     Information Reply (obsolete)
17    -     Address Mask Request
18    -     Address Mask Reply
From Comer, “Internetworking with TCP/IP”, volume 1, 4th edition, 2000.

35 ORIGINAL: SPP V1 LC Ingress
[Block diagram: original LC Ingress pipeline (RTM, MSF, RBUF, Rx1, Rx2, Key Extract, Lookup with TCAM, Hdr Format, QM0-QM3, Port Splitter, 1x10G Tx1, 1x10G Tx2, TBUF, Switch, Stats (1 ME), SRAM1-SRAM3), with the XScale attached through a NAT Miss Scratch Ring.]
Flags (8b): Reserved 5b, N (NAT) 1b, H (Hit), I (ICMP)
Key Extract to Lookup:
- Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), Intf (4b), Rsv
- Lookup Key: IP DAddr (32b), Protocol, UDP DPort (16b), Type, Reserved (8b)
- IP Hdr 1st Word (32b), IP Hdr 2nd Word (32b)
Lookup to Hdr Format:
- Flags (8b), Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), Reserved (8b)
- Lookup Result: VLAN (12b), PerSchedQID (15b), Sch 3b, QM 2b, Translated DPort/ID (16b), Stats Index (16b)
- IP Hdr 1st Word (32b), IP Hdr 2nd Word (32b)

36 ORIGINAL: SPP V1 LC Ingress
[Block diagram: original LC Ingress pipeline (RTM, MSF, RBUF, Rx1, Rx2, Key Extract, Lookup with TCAM, Hdr Format, QM0-QM3, Port Splitter, 1x10G Tx1, 1x10G Tx2, TBUF, Switch, Stats (1 ME), SRAM1-SRAM3), showing the "NAT MISS!" path from Lookup to the XScale over the NAT Miss Scratch Ring.]
Lookup to XScale (NAT miss):
- Flags (8b), Reserved (8b), Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), Reserved (8b)
To Hdr Format:
- Flags (8b), Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), Reserved (8b)
- VLAN (12b), PerSchedQID (15b), Sch 3b, QM 2b, Translated DPort/ID (16b), Stats Index (16b)
- IP Hdr 1st Word (32b), IP Hdr 2nd Word (32b)

37 ORIGINAL: SPP V1 LC Egress
[Block diagram: original LC Egress pipeline (Switch, MSF, RBUF, Rx1, Rx2, Key Extract, Lookup with TCAM, Hdr Format, QM0-QM3, Port Splitter, 5x1G Tx1 (P0-P4), 5x1G Tx2 (P5-P9), TBUF, RTM, Flow Stats1, Flow Stats2, Stats (1 ME), SRAM1-SRAM3, SRAM Freelist, Archive Records), with the XScale attached through a NAT Miss Scratch Ring and a NAT Pkt return ring.]
Flags (8b): Reserved 5b, N (NAT) 1b, H (Hit), I (ICMP)
Key Extract to Lookup:
- Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), SrcMAC (8b)
- IP_SAddr (32b), IP Proto, UDP SPort (16b), Type (8b), Reserved
- IP DAddr (32b), IP Hdr 1st Word (32b), IP Hdr Top 16 bits of 2nd Word (16b), SliceID (12b), Rsv (4b)
Lookup to Hdr Format:
- Flags (8b), Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), Reserved (8b)
- Lookup Result: VLAN (12b), PerSchedQID (15b), Sch 3b, QM 2b, Translated SPort (16b), Stats Index (16b)
- IP DAddr (32b), IP Hdr 1st Word (32b), IP Hdr Top 16 bits of 2nd Word (16b), SliceID (12b), Rsv (4b)

38 ORIGINAL: SPP V1 LC Egress
[Block diagram: original LC Egress pipeline (Switch, MSF, RBUF, Rx1, Rx2, Key Extract, Lookup with TCAM, Hdr Format, QM0-QM3, Port Splitter, 5x1G Tx1 (P0-P4), 5x1G Tx2 (P5-P9), TBUF, RTM, Flow Stats1, Flow Stats2, Stats (1 ME), SRAM1-SRAM3, SRAM Freelist, Archive Records), showing the "NAT MISS!" path from Lookup to the XScale over the NAT Miss Scratch Ring and the NAT Pkt return ring.]
Lookup to XScale (NAT miss):
- Flags (8b), Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), Reserved (8b)
To Hdr Format:
- Flags (8b), Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), Reserved (8b)
- VLAN (12b), PerSchedQID (15b), Sch 3b, QM 2b, Translated SPort (16b), Stats Index (16b)
- IP DAddr (32b), IP Hdr 1st Word (32b), IP Hdr Top 16 bits of 2nd Word (16b), Rsv (4b), SliceID (12b)

39 Proposed Change: SPP V1 LC Ingress
[Block diagram: LC Ingress pipeline (RTM, MSF, RBUF, Rx1, Rx2, Key Extract, Lookup with TCAM, Hdr Format, QM0-QM3, Port Splitter, 1x10G Tx1, 1x10G Tx2, TBUF, Switch, Stats (1 ME), SRAM1-SRAM3), with the XScale attached through NAT Scratch Rings. Blocks are linked by NN and SCR rings.]
Key Extract to Lookup:
- Flags (8b): Rsv 1b, IE (ICMP ERR) 1b, TCP Flags 6b
- Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), Rsv (4b), Intf (4b)
- Lookup Key: IP DAddr (32b), Protocol (8b), TCP/UDP DPort or ICMP ID (16b), ICMP Type (8b)
- IP SAddr (32b), IP Hdr 1st Word (32b), IP Hdr Top 16 bits of 2nd Word (16b), TCP/UDP SPort (16b)
- Word0 Surrounding DPort (32b), Word1 Surrounding DPort (32b), Word0 Surrounding Cksum (32b), Word1 Surrounding Cksum (32b)
Lookup to Hdr Format:
- Flags (8b): Rsvd 2b, IE (ICMP ERR) 1b, T (TCP) 1b, U (UDP) 1b, I (ICMP) 1b, N (NAT) 1b, H (Hit) 1b
- Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), Reserved (8b)
- Lookup Result: VLAN (12b), QM 2b, Sch 3b, PerSchedQID (15b), Translated DPort/ID (16b), Stats Index (16b)
- IP Hdr 1st Word (32b), IP Hdr Top 16 bits of 2nd Word (16b), Original DPort/ID (16b)
- Word0 Surrounding DPort (32b), Word1 Surrounding DPort (32b), Word0 Surrounding Cksum (32b), Word1 Surrounding Cksum (32b)
KE reads this data anyway, and passing it on will save HF a DRAM Read.

40 Proposed Change: SPP V1 LC Ingress
[Block diagram: LC Ingress pipeline (RTM, MSF, RBUF, Rx1, Rx2, Key Extract, Lookup with TCAM, Hdr Format, QM0-QM3, Port Splitter, 1x10G Tx1, 1x10G Tx2, TBUF, Switch, Stats (1 ME), SRAM1-SRAM3), showing the records exchanged with the XScale over the NAT Scratch Rings.]
Lookup to XScale:
- Flags (8b): H (Hit) 1b, Rsv, TCP Flags 6b (URG, ACK, PSH, RST, SYN, FIN)
- Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), Reserved (8b), Rsv, Intf (4b)
- IP DAddr (32b), Protocol (8b), TCP/UDP DPort or ICMP ID (16b), ICMP Type (8b), IP_SAddr (32b), TCP/UDP SPort (16b)
- TCAM Hit Index (32b), IP Hdr 1st Word (32b), IP Hdr Top 16 bits of 2nd Word (16b)
- Word0 Surrounding DPort (32b), Word1 Surrounding DPort (32b), Word0 Surrounding Cksum (32b), Word1 Surrounding Cksum (32b)
XScale to Hdr Format:
- Flags (8b): Rsvd 2b, IE (ICMP ERR) 1b, T 1b, U 1b, I 1b, N 1b, H 1b
- Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), Reserved (8b)
- VLAN (12b), PerSchedQID (15b), Sch 3b, QM 2b, Translated DPort/ID (16b), Stats Index (16b)
- IP Hdr 1st Word (32b), IP Hdr Top 16 bits of 2nd Word (16b), Original DPort/ID (16b)
- Word0 Surrounding DPort (32b), Word1 Surrounding DPort (32b), Word0 Surrounding Cksum (32b), Word1 Surrounding Cksum (32b)

41 Proposed Change: SPP V1 LC Egress
[Block diagram: LC Egress pipeline (Switch, MSF, RBUF, Rx1, Rx2, Key Extract, Lookup with TCAM, Hdr Format, QM0-QM3, Port Splitter, 5x1G Tx1 (P0-P4), 5x1G Tx2 (P5-P9), TBUF, RTM, Flow Stats1, Flow Stats2, Stats (1 ME), SRAM1-SRAM3, SRAM Freelist, Archive Records), with the XScale attached through NAT Scratch Rings and a NAT Pkt return ring.]
Key Extract to Lookup:
- Flags (8b): Rsv 1b, IE (ICMP ERR) 1b, TCP Flags 6b
- Reserved (8b), Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), SrcMAC (8b)
- IP_SAddr (32b), IP Proto (8b), TCP/UDP SPort or ICMP ID (16b), ICMP Type (8b)
- IP DAddr (32b), IP Hdr 1st Word (32b), IP Hdr Top 16 bits of 2nd Word (16b), TCP/UDP DPort (16b)
- Word0 Surrounding DPort (32b), Word1 Surrounding DPort (32b), Word0 Surrounding Cksum (32b), Word1 Surrounding Cksum (32b)
Lookup to Hdr Format:
- Flags (8b): Rsvd 2b, IE (ICMP ERR) 1b, T (TCP) 1b, U (UDP) 1b, I (ICMP) 1b, N (NAT) 1b, H (Hit) 1b
- Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), Reserved (8b)
- Lookup Result: VLAN (12b), PerSchedQID (15b), Sch 3b, QM 2b, Translated SPort (16b), Stats Index (16b)
- IP DAddr (32b), IP Hdr 1st Word (32b), IP Hdr Top 16 bits of 2nd Word (16b), Original SPort/ID (16b), Reserved (16b)
- Word0 Surrounding DPort (32b), Word1 Surrounding DPort (32b), Word0 Surrounding Cksum (32b), Word1 Surrounding Cksum (32b)

42 Proposed Change: SPP V1 LC Egress
[Block diagram: LC Egress pipeline (Switch, MSF, RBUF, Rx1, Rx2, Key Extract, Lookup with TCAM, Hdr Format, QM0-QM3, Port Splitter, 5x1G Tx1 (P0-P4), 5x1G Tx2 (P5-P9), TBUF, RTM, Flow Stats1, Flow Stats2, Stats (1 ME), SRAM1-SRAM3, SRAM Freelist, Archive Records), showing the records exchanged with the XScale over the NAT Scratch Rings and the NAT Pkt return ring.]
Lookup to XScale:
- Flags (8b): H (Hit) 1b, Rsv, TCP Flags 6b (URG, ACK, PSH, RST, SYN, FIN)
- Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), SrcMAC (8b)
- IP_SAddr (32b), IP Proto (8b), TCP/UDP SPort or ICMP ID (16b), ICMP Type (8b), IP_DAddr (32b), TCP/UDP DPort (16b)
- TCAM Hit Index (32b), IP Hdr 1st Word (32b), IP Hdr Top 16 bits of 2nd Word (16b)
- Word0 Surrounding DPort (32b), Word1 Surrounding DPort (32b), Word0 Surrounding Cksum (32b), Word1 Surrounding Cksum (32b)
XScale to Hdr Format:
- Flags (8b): Rsvd 2b, IE (ICMP ERR) 1b, T 1b, U 1b, I 1b, N 1b, H 1b
- Buf Handle (24b), IP Pkt Length (16b), Eth Hdr Len (8b), Reserved (8b)
- VLAN (12b), PerSchedQID (15b), Sch 3b, QM 2b, Translated SPort (16b), Stats Index (16b)
- IP DAddr (32b), IP Hdr 1st Word (32b), IP Hdr Top 16 bits of 2nd Word (16b), Original SPort/ID (16b)
- Word0 Surrounding DPort (32b), Word1 Surrounding DPort (32b), Word0 Surrounding Cksum (32b), Word1 Surrounding Cksum (32b)

