
1 IPv6-Oriented 4*OC768 Packet Classification with Deriving-Merging Partition and Field-Variable Encoding Scheme
X. Zhang, B. Liu, W. Li, Y. Xi, D. Bermingham, X. Wang
Presenter: Mr. Xin Zhang, undergraduate at Tsinghua University, Beijing, P. R. China (z-x02@mails.tsinghua.edu.cn)
Presentation at IEEE INFOCOM’06, Apr. 26, 2006, Barcelona, Spain

2 Outline
I. Brief Background
II. Related Works & Motivation
III. Detailed Ideas & Solutions
IV. Performance Analysis
V. Conclusion

3 I. Brief Background
A Typical IPv6 5-tuple Rule
Source IP (SIP) and Destination IP (DIP): prefix format, 128+128 bits
Source Port (SP) and Destination Port (DP): range format, 16+16 bits
Protocol (Prot.): exact value, 8 bits
TCAM and Range Encoding
[Figure labels: TCAM; Partition Y; Partition B; Partition G; Coded Ranges; Encoding Table; Search Key; Codes]

4 II. Related Works & Motivation
Parallel distributed schemes fail to provide excellent load balancing (K. Zheng, INFOCOM’04, ’05).
Encoding speed becomes the bottleneck for even higher throughput (J.V. L-P2C, D. Pao-PIC, C. Hao-DRES).
When scaled to 4*OC-768 or IPv6, previous schemes incur prohibitive cost.
Our goals?

5 II. Related Works & Motivation
Goals
[Diagram labels: Robustness; Low Storage; IPv6 Oriented; 4*OC-768; long prefix; unknown features; Maximum Parallelism; High Utilization; Load Balancing; 4*OC-768; Minimum Parallelism]

6 II. Related Works & Motivation
Overview of Solutions
I. IPv6 5-tuple rule encoding
II. 3-plane parallel encoding
III. 3-level load balancing
Adaptive algorithms
[Diagram labels: IPv6 Long Prefix; Save storage; Encoding Performance; High speed, low cost; Speed & Reliability; Fast, robust, economical; Unknown Features; Practical]

7 III. Detailed Ideas & Solutions
4.1 5-tuple IPv6 Rule Encoding
For SIP & DIP
IPv6 addresses: Multicast, Unicast, Anycast
Can be directly removed
Encode the first 16 bits to 8 bits
For Protocol Field
Prot. 8 bits -> Prot. 4 bits
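As a rough illustration of the address step, the sketch below replaces the first 16 bits of a 128-bit IPv6 address with an 8-bit code, which yields a 120-bit field. The table `PREFIX16_CODES`, its contents and the function name `encode_ipv6` are hypothetical; the slides do not say how the codes are assigned, so this is only one possible table-lookup reading.

```python
import ipaddress

# Hypothetical code table: leading 16-bit value -> 8-bit code.
# The slides only state that the first 16 bits are encoded to 8 bits;
# the actual assignment used by the authors is not given.
PREFIX16_CODES = {0x2001: 0x01, 0x2002: 0x02, 0xFE80: 0x03}


def encode_ipv6(addr: str) -> int:
    """Return a 120-bit integer: 8-bit code followed by the low 112 address bits."""
    value = int(ipaddress.IPv6Address(addr))
    top16 = value >> 112                 # first 16 bits of the address
    low112 = value & ((1 << 112) - 1)    # remaining 112 bits, kept unchanged
    code = PREFIX16_CODES[top16]         # KeyError if the prefix has no code
    return (code << 112) | low112
```

With such a table, each address field shrinks from 128 to 120 bits, so a rule saves 16 bits of TCAM width across SIP and DIP.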

8 III. Detailed Ideas & Solutions
4.1 5-tuple IPv6 Rule Encoding: For SP & DP
Well-known (LO), ephemeral user (HI), arbitrary range (AR) and exact match (EM) ports have different properties.
Pre-defined Layers
Layer #1: [1-1023] (LO), [1024-65535] (HI)
Layer #2: EM
Layer #3: AR
...
Layer #n: AR
Pre-defined bits
Purpose: to avoid running out of bits at a certain layer, which would increase update complexity.
SP (12): 1/6/2 -> 1/6/2/2/1
DP (24): 1/9/6/2 -> 1/9/6/2/2/2/2
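The layer idea can be sketched as follows, reading the slash-separated numbers as per-layer bit widths (an assumption; the slide does not spell this out). `layer1_class` applies the Layer #1 split from the slide, and `pack_layers` is a hypothetical helper that concatenates one code per layer into a single port code.

```python
# Bit layouts copied from the slide, read here as per-layer widths in bits.
SP_LAYOUT = (1, 6, 2, 2, 1)        # SP (12)
DP_LAYOUT = (1, 9, 6, 2, 2, 2, 2)  # DP (24)


def layer1_class(port: int):
    """Classify a port for Layer #1: 'LO' for 1-1023, 'HI' for 1024-65535."""
    if 1 <= port <= 1023:
        return "LO"
    if 1024 <= port <= 65535:
        return "HI"
    return None  # port 0 or out of range: not covered by the slide


def pack_layers(codes, layout):
    """Concatenate per-layer codes into one integer, first layer in the top bits."""
    if len(codes) != len(layout):
        raise ValueError("one code per layer expected")
    key = 0
    for code, width in zip(codes, layout):
        if not 0 <= code < (1 << width):
            raise ValueError("code does not fit its layer")
        key = (key << width) | code
    return key
```

Because every layer has its width reserved in advance, adding a range to one layer only changes that layer's bits and leaves the other layers' codes untouched.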

9 III. Detailed Ideas & Solutions
4.2 3-Plane Parallel Encoding
Principles of Encoding
[Diagram labels: High Speed; Low Costs; Field-Variable Processes; moderate parallelism; On-chip RAM]
Encoded widths (bits): SIP + DIP + SP + DP + Prot. = 120 + 120 + 12 + 24 + 4
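For reference, the field widths quoted in the slides can be totalled with a few lines of Python. The raw widths come from the 5-tuple description (128+128, 16+16, 8 bits) and the encoded widths from the line above; the dictionary names are illustrative only.

```python
# Field widths in bits, as quoted in the slides.
RAW_BITS = {"SIP": 128, "DIP": 128, "SP": 16, "DP": 16, "Prot": 8}
ENCODED_BITS = {"SIP": 120, "DIP": 120, "SP": 12, "DP": 24, "Prot": 4}

raw_total = sum(RAW_BITS.values())          # 296
encoded_total = sum(ENCODED_BITS.values())  # 280

# Per-field change; a positive value means the encoded field is wider.
delta = {name: ENCODED_BITS[name] - RAW_BITS[name] for name in RAW_BITS}
```

The encoded rule is 280 bits against 296 bits raw; the DP field grows from 16 to 24 bits while the other fields shrink.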

10 III. Detailed Ideas & Solutions
4.2 3-Plane Parallel Encoding
[Diagram labels: 5-tuple rule; Plane #1: S&DIP, SP, DP, Prot.; Plane #2: SP or DP, HI/LO, EM, AR]
Plane #1 and #2 still fail to match 4*OC768.

11 III. Detailed Ideas & Solutions
4.2 3-Plane Parallel Encoding
[Diagram labels: 5-tuple rule; Plane #1: S&DIP, SP, DP, Prot.; Plane #2: S&DIP, SP-EM, SP-AR, DP-EM, DP-AR]
Plane #3? Inefficient and costly!
Note that different fields have different processing speeds.
Plane #3: Field-Variable Parallelism
[Diagram labels: S&DIP, SP-EM, SP-AR, DP-EM, SP-AR, DP-AR]

12 III. Detailed Ideas & Solutions
4.3 3-Level Load Balancing
Problem Statement of Load Balancing
[Diagram labels: packets; Complete Policy Table; Distributed Storage; + 2 Parallel TCAMs]
Balanced Traffic, Balanced Storage, Low Redundancy

13 III. Detailed Ideas & Solutions
4.3 3-Level Load Balancing
Deriving-Merging Policy Table Partition
Preliminary Partition Bits (PPB)
Remove “bad” bits through some heuristic standards
Deriving-Merging Adjustment (DMA)
[Diagram labels: Policy Table; Sub tables; Size Threshold]
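To make the notion of partition bits concrete, here is a generic sketch of splitting a ternary rule table on a chosen set of bit positions. It is not the authors' PPB or DMA heuristic, which the slides do not detail; the function name and the string representation of rules are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product


def partition_rules(rules, bit_positions):
    """Group ternary rules by their values at the chosen bit positions.

    rules: list of equal-length strings over '0', '1' and '*'.
    A rule with '*' at a chosen position is copied into every sub table
    it could match, which is the source of redundancy.
    """
    tables = defaultdict(list)
    for rule in rules:
        choices = [("0", "1") if rule[p] == "*" else (rule[p],)
                   for p in bit_positions]
        for combo in product(*choices):
            tables["".join(combo)].append(rule)
    return dict(tables)


rules = ["00*1", "01*0", "1***", "*1*1"]
tables = partition_rules(rules, (0, 1))
stored = sum(len(t) for t in tables.values())
redundancy = stored / len(rules)  # stored entries relative to original rules
```

A bit position where many rules carry a wildcard inflates the sub tables, which is one plausible sense in which a candidate bit can be "bad".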

14 III. Detailed Ideas & Solutions
4.3 3-Level Load Balancing
Policy Table Partition
Distribution among TCAMs
Redundancy Based Dynamic Balancing
[Diagram labels: Sub tables; No. of rules; TCAM#1; TCAM#2; Packet; "#1 busier than #2?" Yes / No]
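The "#1 busier than #2?" decision can be sketched as a dispatcher that sends each packet to the least loaded TCAM holding a copy of its sub table. Using queue length as the measure of "busy", the data structures and the name `pick_tcam` are all assumptions; the slide shows only the decision itself.

```python
def pick_tcam(sub_table_id, placement, queue_len):
    """Return the TCAM that should serve a packet for the given sub table.

    placement: dict sub_table_id -> list of TCAM ids holding that sub table
               (more than one id means the sub table is stored redundantly).
    queue_len: dict TCAM id -> number of packets currently waiting.
    Ties are broken in favour of the lowest TCAM id.
    """
    candidates = placement[sub_table_id]
    return min(candidates, key=lambda tcam: (queue_len[tcam], tcam))


# Sub table 0 is stored in both TCAMs, sub table 1 only in TCAM 2.
placement = {0: [1, 2], 1: [2]}
choice = pick_tcam(0, placement, {1: 3, 2: 1})
```

Only redundantly stored sub tables give the dispatcher a choice, which is why some extra TCAM storage buys balancing headroom.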

15 IV. Performance Analysis
Implementation Results for 3-Plane Encoding
                                   SIP&DIP   SP-AR   SP-EM   DP-AR   DP-EM   Protocol
Original speed (RAM cycles)        2         6       1       6       2       1
Parallel num. of encoding units    2         6       1       6       2       1
Parallel speed (RAM cycles)        1         1       1       1       1       1
RAM costs (blocks)                 8*M4K     56*M512 (SP)    128*M512 (DP)   2*M512
Storage req. (Kbit RAM)            32        28 (SP)         64 (DP)         1
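The storage row can be cross-checked against the block counts. The sketch below assumes nominal data capacities of 4096 bits per M4K block and 512 bits per M512 block (parity bits ignored), and assumes the four block counts belong to SIP&DIP, SP, DP and Protocol in that order; both assumptions are mine rather than stated on the slide.

```python
# Assumed nominal data capacity per RAM block, in bits (parity ignored).
BLOCK_BITS = {"M4K": 4096, "M512": 512}

# Block counts from the "RAM costs" row; the field attribution is assumed.
RAM_COSTS = {
    "SIP&DIP": (8, "M4K"),
    "SP": (56, "M512"),
    "DP": (128, "M512"),
    "Protocol": (2, "M512"),
}


def storage_kbit(costs):
    """Convert (block count, block type) pairs into Kbit per field."""
    return {field: count * BLOCK_BITS[block] // 1024
            for field, (count, block) in costs.items()}


per_field = storage_kbit(RAM_COSTS)    # 32, 28, 64 and 1 Kbit
total_kbit = sum(per_field.values())   # 125
```

Under these assumptions the total is 125 Kbit, the on-chip RAM figure quoted elsewhere in the slides.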

16 IV. Performance Analysis
Experimental Results for Policy Table Partition
Fig. 1 Sizes of sub tables after PPB
Fig. 2 Sizes of sub tables after DMA
For IPv4, the original 5-tuple rule is 104 bits long. The final number of candidate bits is reduced dramatically to 15. The redundancy, maximum group size and average group size are all smaller than those in similar research.

17 IV. Performance Analysis
Experimental Results for Policy Table Partition
Rule Set #4 (264 rules)
PID  Case1 (rules)  Case2 (rules)  Case3 (rules)  PID  Case1 (rules)  Case2 (rules)  Case3 (rules)
0595723852023
1601294602
2521361054031
3441721154068
4551231268052
53812135202
6 37321468032
737165361561074
Sum: Case1: 852 rules; Case2: 264 rules; Case3: 510 rules

18 IV. Performance Analysis
Experimental Results for Distribution Algorithms
TCAM  Sub Group ID (Excluding GR)  Num. of Rules  Traffic Load Ratio
#11351030618.13%
#2421239717.9%
#37131147419.32%
#461514046419.36%

19 IV. Performance Analysis
Throughput: Classifies 266 Mpps with the TCAM working at 133 MSPS (double-data-rate I/O) and the RAM at 266 MHz.
Storage: Employs 4 TCAM chips with 1.78 times the original table size and 125 Kbit of on-chip RAM, compared to 8 TCAM chips in C. Hao-DRES and K. Zheng-INFOCOM’05.
Worst-Case Loss Probability & Processing Delay: The loss probability is close to zero when the buffer depth is >= 5, and the delay is only 12 Tc (TCAM cycles), compared to 54 Tc in K. Zheng-INFOCOM’05.
Update: The TCAM can be easily updated with “CoPTUA”; with pre-defined bits, inserting a new range does not disturb existing codes, in contrast to P2C.

20 V. Conclusion
We achieved ultra-high throughput matching the 4*OC768 line rate with TCAM storage 1.7 times the size of the policy table.
We proposed the Deriving-Merging Partition and 3-level balancing, resulting in guaranteed worst-case performance.
We designed the 3-plane IPv6 rule encoding scheme, matching the 4*OC768 line rate with 125 Kbit of on-chip RAM.
We proposed a set of adaptive algorithms to deal with different IPv6 policy table characteristics.
