
1 DRES:Dynamic Range Encoding Scheme for TCAM Coprocessors Authors: Hao Che, Zhijun Wang, Kai Zheng and Bin Liu Publisher: IEEE Transactions on Computers,




1 1 DRES:Dynamic Range Encoding Scheme for TCAM Coprocessors Authors: Hao Che, Zhijun Wang, Kai Zheng and Bin Liu Publisher: IEEE Transactions on Computers, 2008 Presenter: Chen – Yu Lin Date: July, 01, 2008

2 2 Outline Introduction and goal of DRES Rule implementation in TCAM Range encoding Encoded range update process Performance evaluation

3 3 Introduction and Goal of DRES (1/2) DRES is proposed to significantly improve the TCAM storage efficiency for range matching. A rule that involves multiple range fields causes a multiplicative expansion of the rule when it is expressed in TCAM. Our statistical analysis of real-world rule databases shows that the TCAM storage efficiency can be as low as 16% due to the existence of a significant number of rules with port ranges. Rule encoding: –Use a bit to represent a range in a field. Hence, each rule can be translated to a sequence of encoded bits.

4 4 Introduction and Goal of DRES (2/2) Search key encoding: –A search key based on the information extracted from the header is preprocessed to generate an encoded search key. Range selection: –Select the ranges to be encoded so as to maximize the TCAM storage efficiency. Database update: –Minimize its impact on the rule matching process. The P2C rule encoding schemes are the most effective schemes, as they can encode N nonoverlapping ranges using only log2(N+1) bits.

5 5 Rule implementation in TCAM (1/4) TCAM Coprocessor: –It works as a look aside processor for packet classification on behalf of a network processing unit (NPU) or network processor. When a packet is to be classified, an NPU generates a search key based on the information extracted from the packet header and passes it to the TCAM coprocessor for classification.

6 6 Rule implementation in TCAM (2/4) Noncompact ranges: –Ranges that cannot be exactly implemented using one rule entry in a TCAM. –Ex: { >1023 } needs six rule entries to be expressed: 1024 ~ 2047, 2048 ~ 4095, 4096 ~ 8191, 8192 ~ 16383, 16384 ~ 32767, 32768 ~ 65535. Compact ranges: –Ranges that can be exactly implemented in one rule entry in a TCAM. –Ex: { <1024 } is 000000**********.
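The expansion of a noncompact range into TCAM entries can be sketched as a standard range-to-prefix split. This is a minimal illustration, not code from the paper; the function name range_to_prefixes and the 16-bit field width are assumptions made for the example.

```python
def range_to_prefixes(lo, hi, width=16):
    """Split the inclusive range [lo, hi] into a minimal list of ternary prefixes.

    Hypothetical helper for illustration; width=16 assumes a port field.
    """
    prefixes = []
    while lo <= hi:
        # Largest power-of-two block aligned at lo (the whole space if lo == 0).
        size = lo & -lo if lo else 1 << width
        # Shrink the block until it no longer runs past hi.
        while size > hi - lo + 1:
            size >>= 1
        wild = size.bit_length() - 1  # number of trailing wildcard bits
        fixed = format(lo >> wild, "0{}b".format(width - wild)) if wild < width else ""
        prefixes.append(fixed + "*" * wild)
        lo += size
    return prefixes


print(range_to_prefixes(0, 1023))            # the compact range { <1024 }
print(len(range_to_prefixes(1024, 65535)))   # the noncompact range { >1023 }
```

The first call yields the single entry 000000********** and the second yields six entries, one per block listed on the slide.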

7 7 Rule implementation in TCAM (3/4) [Figure: rules R1 to R7]

8 8 Rule implementation in TCAM (4/4) [Figure: rule entries L1 and L2; labels: 256 - 511, 512, 64 bit, 24 bit]

9 9 Range encoding (1/12) The details in this section: –How rules and search key are encoded. –How ranges are selected for encoding. Subsections in this section: –Structure of encoded rule and encoded search key –TCAM-based search key encoding process –Dynamic range selection algorithm –Code vector and index vector encoding algorithm

10 10 Range encoding (2/12) Structures of encoded rule / search key(1/2) –Instead of replacing a rule field altogether by a sequence of code bits, we design a hybrid encoding approach for DRES. –The hybrid encoding approach retains all of the fields in a rule and appends a sequence of code bits, called the code vector. –Due to the slotted TCAM structure, there will usually be some free bits left in each rule entry. (24 free bits in our example)

11 11 Range encoding (3/12) Structures of encoded rule / search key(2/2) –Rule with no encoded range: the code vector is wildcarded and the rule itself remains unchanged. –Rule with any encoded range: that field is wildcarded and the corresponding code vector is encoded based on the encoding rules.

12 12 Range encoding (4/12) TCAM-based search key encoding process(1/3) –Assume that m_k ranges from the k-th rule field (for k = 1,2,…,K) in a rule database are selected for encoding. –Then, K search key fields must be matched against the corresponding K range tables to generate an index vector and, hence, an encoded search key. –We propose using the TCAM coprocessor itself for sequential search key encoding. –Note that each range in a range table may have to be represented by multiple TCAM entries, and the corresponding intermediate index vector must be duplicated for every entry belonging to the same range.

13 13 Range encoding (5/12) TCAM-based search key encoding process(2/3) K+1 tables are allocated in TCAM.

14 14 Range encoding (6/12) TCAM-based search key encoding process(3/3) –In summary, a rule table lookup with range encoding requires K range table lookups for search key encoding, plus one encoded rule table lookup. –We quantify the performance impact of using TCAM for sequential search key encoding. The TCAM runs at 133 MHz, that is, 133 million lookups / second. Wire-speed forwarding at a 10 Gbps line rate means up to 31.3 million packets / second. Each packet is therefore allowed 133/31.3 ≈ 4.25 TCAM lookups. –If both the source and destination port fields have ranges to be encoded, that is K = 2, each PF table matching requires 4 TCAM lookups.
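The lookup budget above is simple arithmetic and can be checked with a short sketch. The two rates come from the slide; the function names lookup_budget and fits_budget are hypothetical and only serve the illustration.

```python
TCAM_LOOKUPS_PER_SEC = 133e6  # 133 MHz TCAM, as stated on the slide
PACKETS_PER_SEC = 31.3e6      # worst-case packet rate at 10 Gbps, as stated on the slide


def lookup_budget(tcam_rate, packet_rate):
    """Number of TCAM lookups available per packet at wire speed."""
    return tcam_rate / packet_rate


def fits_budget(lookups_per_packet, budget):
    """True if a classification needing this many lookups keeps up with the line rate."""
    return lookups_per_packet <= budget


budget = lookup_budget(TCAM_LOOKUPS_PER_SEC, PACKETS_PER_SEC)
print(round(budget, 2))        # lookups allowed per packet
print(fits_budget(4, budget))  # the 4 lookups quoted for K = 2
```

Under these numbers, the 4 lookups quoted for K = 2 stay within the per-packet budget, while a fifth lookup would not.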

15 15 Range encoding (7/12) Dynamic range selection algorithm(1/3) –We use the bitmap scheme to encode ranges, that is, each unique range is mapped to a unique bit. –The figure shows the selection of m ranges for encoding out of n ranges. Figure legend: # subranges needed to exactly implement the range; # rule entries needed to implement all of the rules that contain the range; encoding gain: # rule entries that can be eliminated if the range is encoded.

16 16 Range encoding (8/12) Dynamic range selection algorithm(2/3) –(1) The values of E and G are calculated. The range with the maximum G is selected as the first range for encoding. Suppose that R1 is selected. –(2) E and G for all of the ranges, except for R1, are updated. Then, the range with the maximum G is chosen to be the second encoded range. –This step is repeated until m ranges have been selected. The computational complexity of this algorithm is O(nm).
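The greedy selection loop can be sketched as follows. This is a simplified model, not the paper's implementation: it assumes each rule is a tuple of range names, that an unencoded range costs its subrange count and an encoded range costs one entry, and that the gain is recomputed from scratch each round (which is costlier than the incremental O(nm) update described on the slide). All names (select_ranges, expansion, and the sample ranges A, B, C) are hypothetical.

```python
from math import prod


def rule_entries(rule, expansion, encoded):
    """TCAM entries for one rule: product of the subrange counts of its unencoded ranges."""
    return prod(1 if r in encoded else expansion[r] for r in rule)


def total_entries(rules, expansion, encoded):
    """TCAM entries needed for the whole rule table."""
    return sum(rule_entries(rule, expansion, encoded) for rule in rules)


def select_ranges(rules, expansion, m):
    """Greedily pick up to m ranges, each time taking the one with the largest gain."""
    encoded = []
    for _ in range(m):
        base = total_entries(rules, expansion, set(encoded))
        gains = {
            r: base - total_entries(rules, expansion, set(encoded) | {r})
            for r in expansion
            if r not in encoded
        }
        if not gains:
            break
        best = max(gains, key=gains.get)
        if gains[best] <= 0:
            break  # nothing left worth encoding
        encoded.append(best)
    return encoded


# Hypothetical data: subrange counts per range, and the ranges each rule uses.
expansion = {"A": 6, "B": 2, "C": 4}
rules = [("A", "B"), ("A",), ("C",), ("A", "C")]
print(select_ranges(rules, expansion, 2))
```

With this sample data the unencoded table needs 46 entries; the loop picks A first (gain 35) and then C (gain 6), leaving 5 entries.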

17 17 Range encoding (9/12) Dynamic range selection algorithm(3/3) –There are a total of n = 7 ranges in the destination and source port fields: R1 = {256 - 512}, R2 = {768 - 2047}, R3 = {6000 - 6064}, R4 = {>1023}, R5 = {512 - 1536}, R6 = {>1023}, R7 = {256 - 512}. [Figure: selection with m = 3, destination field]

18 18 Range encoding (10/12) Code vector and index vector encoding algorithm(1/3) –In this paper, the bitmap (BM) range encoding algorithms are fully leveraged for code vector and index vector encoding. –The most efficient BM algorithm is the P2C algorithm, which allows N ranges to be encoded using only log2(N+1) bits in the best case. –In BM, each bit in a code vector is assigned to a specific encoded range, which can come from any field in a rule. Suppose that the code vector has 8 bits and the i-th bit is assigned to Ri. –The code vector for R1 is then 1*******.

19 19 Range encoding (11/12) Code vector and index vector encoding algorithm(2/3) –Ranges from different fields must be encoded using different range tables. –As with the code vector, the i-th bit in the index vector is assigned to range Ri. –The encoding rules used to generate the index vectors are as follows: 1. For Ri, the i-th bit in the index vector must be set to 1. 2. If Ri is a subrange of Rj, its index vector must have its j-th bit set to 1. 3. For n overlapping ranges R_r1, R_r2, …, R_rn, the overlap R_r1,r2,…,rn needs to be expressed as a separate range if it is a new range other than any existing encoded range. 4. All other bits in the index vector must be set to 0. 5. The weight, or match priority, of a range is equal to the number of 1s in the corresponding index vector.
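The five encoding rules can be read as a single containment test: a range (or a derived overlap range) gets a 1 in every bit whose encoded range contains it. The sketch below follows that reading. It assumes inclusive (lo, hi) port ranges, an 8-bit vector with bit i counted from the left, and that R2, R4 and R5 from the earlier example belong to the destination port field; the function names are hypothetical.

```python
def index_vector(rng, encoded, width=8):
    """Index vector for rng: bit j is 1 when encoded range Rj contains rng.

    encoded maps a 1-based bit position j to the (lo, hi) bounds of Rj.
    A range contains itself, which covers rule 1; containment covers rule 2.
    """
    lo, hi = rng
    bits = ["0"] * width  # rule 4: every other bit stays 0
    for j, (lo_j, hi_j) in encoded.items():
        if lo_j <= lo and hi <= hi_j:
            bits[j - 1] = "1"
    return "".join(bits)


def weight(vector):
    """Rule 5: match priority is the number of 1s in the index vector."""
    return vector.count("1")


# Assumed destination-field ranges: R2 = 768-2047, R4 = >1023, R5 = 512-1536.
dest = {2: (768, 2047), 4: (1024, 65535), 5: (512, 1536)}

# Rule 3: the overlap of R2, R4 and R5 is the new range 1024-1536.
overlap = (1024, 1536)
print(index_vector(overlap, dest), weight(index_vector(overlap, dest)))
print(index_vector(dest[4], dest))
```

In a range table the entries would be ordered by decreasing weight, so the most specific overlap is matched before the ranges that contain it.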

20 20 Range encoding (12/12) Code vector and index vector encoding algorithm(3/3) Assume that the NPU generates a search key sk = {1.2.3.4, 5.6.7.8, 1025, 1028, 17}. The index vectors of the source port and destination port fields are {00000100} and {01011000}, respectively. The final index vector is their bitwise OR, {01011100}.
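The example above can be reproduced with a short sketch that ORs the per-field index vectors and then matches the result against a ternary code vector. The helper names combine and matches are hypothetical, and the code vector ***1*1** is an assumed example of a rule that uses R4 and R6.

```python
def combine(*vectors):
    """Bitwise OR of equal-length bit strings, one per encoded field."""
    length = len(vectors[0])
    return "".join(
        "1" if any(v[i] == "1" for v in vectors) else "0" for i in range(length)
    )


def matches(code_vector, index_vector):
    """Ternary match: '*' in the code vector matches either bit value."""
    return all(c == "*" or c == b for c, b in zip(code_vector, index_vector))


final = combine("00000100", "01011000")
print(final)
print(matches("***1*1**", final))  # assumed rule using R4 and R6
print(matches("1*******", final))  # rule using R1, as in the earlier slide
```

The key matches the assumed rule on R4 and R6 but not the rule on R1, since bit 1 of the final index vector is 0.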

21 21 Encoded range update process(1/9) The details in this section: –We propose a lock-free encoded range update algorithm. –It allows the encoded range update and the search key / PF table lookup processes to occur simultaneously without impacting the lookup performance. –The basic idea is to maintain consistent and error-free rule and range tables throughout the update process, thus eliminating the need for locking the tables. Subsections of this section: –Encoding a newly selected range –Releasing encoded ranges –Encoded range update delay

22 22 Encoded range update process(2/9) Updating a TCAM database without locking may generate two possible types of incorrect TCAM lookups. –Erroneous: If a TCAM rule gets a match while the rule or its corresponding action is partially updated. –Inconsistent: When a match takes place in the middle of a database update process and there is no guarantee of table consistency until the process finishes. In general, each TCAM slot has a valid bit field associated with it. The key to avoiding erroneous lookup is to avoid directly overwriting rule fields and/or the corresponding action when that rule entry is active.

23 23 Encoded range update process(3/9) Any write operation for a rule/action over an existing rule/action must be decomposed into a write process consisting of 3 operations: –Inactivate the rule. –Write the rule/action. –Activate the rule. Any operation that moves a rule-action pair to a new TCAM / associated memory location must be decomposed into a move process consisting of: –Using a write process to write the pair to the new location. –Inactivating the rule at the old location.
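The write and move processes can be sketched with a toy table in which every slot carries a valid bit. This is a software model for illustration only; the class TcamTable, its methods, and the string-based ternary rules are assumptions, not an interface from the paper or from any TCAM vendor.

```python
class TcamTable:
    """Toy TCAM: slots searched in order, each guarded by a valid bit."""

    def __init__(self, size):
        self.slots = [{"valid": False, "rule": None, "action": None} for _ in range(size)]

    def write(self, idx, rule, action):
        """Write process: inactivate, write the rule/action, then activate."""
        slot = self.slots[idx]
        slot["valid"] = False  # 1. a lookup can no longer hit a half-written entry
        slot["rule"], slot["action"] = rule, action  # 2. write rule and action
        slot["valid"] = True   # 3. the entry becomes visible again

    def move(self, src, dst):
        """Move process: write the pair to the new slot, then inactivate the old one."""
        old = self.slots[src]
        self.write(dst, old["rule"], old["action"])
        old["valid"] = False

    def lookup(self, key):
        """Return the action of the first valid slot whose ternary rule matches key."""
        for slot in self.slots:
            if slot["valid"] and all(r in ("*", k) for r, k in zip(slot["rule"], key)):
                return slot["action"]
        return None


table = TcamTable(4)
table.write(0, "1***", "permit")
table.move(0, 2)
print(table.lookup("1010"), table.slots[0]["valid"])
```

Because the copy at the new location is activated before the old one is inactivated, a lookup issued at any point during the move still finds a complete entry.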

24 24 Encoded range update process(4/9) Encoding a newly selected range(1/2) –Before a newly selected range is encoded, that range is exactly implemented in every rule of the original encoded rule table in the TCAM in which it appears. –In our algorithm, the range table is updated first, followed by the rule table update. –Note that only the range table associated with the field to which the newly selected range belongs needs to be updated.

25 25 Encoded range update process(5/9) Encoding a newly selected range(2/2) –There are 2 steps in the range table update: 1. Consistently move the ranges and their index vectors from top (bottom) to bottom (top) while leaving the entries for the newly selected range and its corresponding subranges empty. 2. Write the newly selected range, the associated subranges, and their index vectors to the preallocated locations in decreasing priority order.

26 26 Encoded range update process(6/9) [Figure: example of updating a rule. Write L2 to a new location, then delete the rule entries at its old locations.]

27 27 Encoded range update process(7/9) Releasing encoded ranges(1/2) –If no free bit is left in the index and code vectors, the encoded range with the least encoding gain is unencoded to release a free bit. –To unencode a range, the corresponding field in every rule with this encoded range needs to be exactly implemented, which increases the number of rule entries in the table. –However, this increase in the number of rule entries must be less than the reduction in rule entries gained by encoding the newly selected range. –To release an encoded range, the rule table is updated first, followed by the range table update.

28 28 Encoded range update process(8/9) Releasing encoded ranges(2/2) –For the rule table update: Changes the encoded range into an exactly implemented range in all of the rule entries having this encoded range. –For the range table update: Both the encoded range and the derived subranges need to be deleted.

29 29 Encoded range update process(9/9) Encoded range update delay(1/1) –We only consider the rule table update delay for the encoded range update. –Assume there are N_er rule entries in the rule table. –All of the rule entries in the table are moved once for adding a newly encoded range and once for releasing an encoded range. Hence, the number of rule entry writes and deletes is 2N_er for each encoded range update. Assume that each write or delete costs 100 ns. For a table with 100000 rule entries, the update delay is then 2 × 100000 × 100 ns = 0.02 seconds.
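The delay estimate is a one-line calculation; the sketch below reproduces it so that other table sizes or operation costs can be tried. The function name update_delay_seconds is hypothetical.

```python
def update_delay_seconds(n_er, op_cost_ns=100):
    """Delay of one encoded range update: 2 * N_er writes/deletes at op_cost_ns each."""
    return 2 * n_er * op_cost_ns * 1e-9


print(update_delay_seconds(100000))  # the 100000-entry table from the slide
```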

30 30 Performance evaluation (1/5) The performance of DRES is evaluated and compared with Liu’s algorithm, called the CE algorithm, based on four real-world five-tuple PF databases.

31 31 Performance evaluation (2/5) [Table: frequency of ranges in the source port field, in the destination port field, and in both port fields; # subranges needed to exactly implement each range]

32 32 Performance evaluation (3/5) Each rule entry has 24 free bits, which is much larger than 7, the maximum number of unique ranges found in the four databases. Hence, no extra slot is needed for range encoding. For the four databases, the sizes of the range tables for the source (destination) port are 12(12), 12(12), 29(20), 22(10) (in slots). In practice, due to possible encoded range updates, a range table must be configured to be much larger than the maximum size of 29. –Assume that 60 slots are allocated for each range table.

33 33 Performance evaluation (4/5) After encoding ranges in both source and destination fields in DRES, each rule takes one TCAM rule entry.

34 34 Performance evaluation (5/5) If only the source (destination) port range is encoded, the number of TCAM entries for the encoded rules of 4 databases are as follows: –389(389) –243(243) –516(762) –2124(1595) In summary, DRES can significantly improve the overall TCAM storage efficiency for range matching.

