Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 HEXA : Compact Data Structures for Faster Packet Processing Department of Computer Science and Information Engineering National Cheng Kung University,

Similar presentations


Presentation on theme: "1 HEXA : Compact Data Structures for Faster Packet Processing Department of Computer Science and Information Engineering National Cheng Kung University,"— Presentation transcript:

1 1 HEXA : Compact Data Structures for Faster Packet Processing Department of Computer Science and Information Engineering National Cheng Kung University, Taiwan R.O.C. Authors: Sailesh Kumar, Jonathan Turner, Patrick Crowley,and Michael Mitzenmacher Publisher: ICNP 2007 Present: Yu-Tso Chen Date: JUNE, 10, 2008

2 2 Outline 1. Introduction 2. Proposed Approach 3. Architecture Description 4. Performance Evaluation

3 3 Introduction HEXA (History-based Encoding, eXecution and addressing) Directed graphs are commonly used to implement various packet processing algorithms Used a fixed constant numbers of bits per node for structured graphs such as tries

4 4 Motivating Example 1. - 2. 0 3. 1 4. 00 5. 01 6. 11 7. 010 8. 011 9. 0100 1. 0, 2, 3 2. 0, 4, 5 3. 1, NULL, 6 4. 1, NULL, NULL 5. 0,7, 8 6. 1, NULL, NULL 7. 0, 9, NULL 8. 1, NULL, NULL 9. 1, NULL, NULL Standard representation HEXA identifier by the input stream

5 5 Motivating Example HEXA requires a hash function Store only 3 bits worth of information for each node of trie First bit is valid prefix Second and third bits are if node has a left and right child

6 6 Motivating Example Require a minimal perfect hash function Maintaining the one- to-one mapping 1. f(-) =4 2. f(0) =7 3. f(1) =9 4. f(00) = 2 5. f(01) = 8 6. f(11) = 1 7. f(010) = 5 8. f(011) = 3 9. f(0100) = 6 012345678 0,00 0,101,1,011,0,111, P5P1P4P2P3 Fast path Next hop 1. h(00 -) = 0 2. h(00 0) = 1 3. h(01 1) = 4 4. h(11 00) = 7 5. h(10 01) = 2 6. h(00 11) = 8 7. h(11 010) = 6 8. h(11 011) = 5 9. h(01 0100) = 3

7 7 Example Input stream 011 Start at index f(-) = 4 next input is 0 has left child f(0) = 7 f(0) next input is 1 has right child f(01) = 8 f(01) next input is 1 has right child f(011) = 3 match and no child stop search. Then to read the next hop is P4

8 8 Child’s discriminator give parent 012345678 0,00 0,111,1,011,0,011, P5P1P4P2P3 Fast path Next hop 1. h(00 -) = 0 2. h(00 0) = 1 3. h(01 1) = 4 4. h(11 00) = 7 5. h(10 01) = 2 6. h(00 11) = 8 7. h(11 010) = 6 8. h(11 011) = 5 9. h(01 0100) = 3

9 9 Example (to be cont.) Input stream 011 Start at index h(00 -) = 0 next input is 0 has child h(00 0) = 1 h(00 0) next input is 1 has child h(10 01) = 2 h(01) next input is 1 has child h(011) = 5 match and no child stop search. Then to read the next hop is P4

10 10 Devising One-to-one Mapping Simplify the problem (perfect hash function) by HEXA identifier of a node can be modified without changing its meaning and keeping it unique. Node identifier to contain few additional (say c) bits We call these c-bits the node’s discriminator Hence up to 2 c memory locations, from which we have to pick just one Use multiple-choice hashing and cuckoo hashing

11 11 Devising One-to-one Mapping Instead of storing a bit for each left and right child We store the discriminator if the child exists. All-0 c-bit word to represent NULL Only 2 c -1 memory locations This problem can be viewed as a bipartite graph matching problem. G=(V 1 +V 2, E) Left set – Original directed graph Right set – Locations memory 2 c edges connected to random right vertices

12 12 Memory mapping graph

13 13 Updating a Perfect Matching Deletions are easy (Check by Fig.1) Simply remove the relevant node from the hash table (and update pointers to that node) Insert a node but its hash locations are already taken Find a augmenting path in G Augmenting path can be found via a BFS [1] Remapping other nodes to other locations By changing their discriminator bits

14 14 Bounded HEXA (BHEXA) Cyclic graph – Aho-Corasick 1. no,2, 1,7 2. no, 2, 3, 7 3. no, 2, 4, 6 4. no,5, 1,7 5. match, 2, 3, 7 6. match, 8, 1, 7 7. no,8, 1,7 8. no, 2, 9, 7 9. match, 2, 4, 6 1. - 2. -, a 3. -, b, ab 4. -, b, bb, abb 5. -, a, ba, bba, abba 6. -, c, bc, abc 7. -, c 8. -, a, ca 9. -, b, ab, cab

15 15 Motivating Example Root node is “-” All incoming edges into node 2 are labeled with “a” Thus its identifier can either be – or a The identifier of node 7 can be – or c Both paths 1- a ->2- b ->3 and 9- b ->4- a ->5- b ->3 lead to the node 3 And two symbols in these paths are identical Its identifier can either be – or b or ab We must ensure the ones we choose are unique.

16 16 Memory Mapping If a node has k choices then up to log 2 k additional bits to indicate the length of its identifier Node 5 has 5 choices ; hence 3-bits may be needed Only c + log 2 k bits worth of information is required to be stored

17 17 Memory mapping graph 1. h(-) = 0 2. h(a) = 1 3. h(ab) = 5 4. h(bb) = 6 5. h(bba) = 9 6. h(bc) = 8 7. h(c) = 3 8. h(ca) = 4 9. h(b) = 2

18 18 Memory Mapping We only store the length of bHEXA identifiers in the memory Standard implementation (13-bits per node) bHEXA uses about half memory (7-bits per node) 1. h(-) = 0 2. h(a) = 1 3. h(ab) = 5 4. h(bb) = 6 5. h(bba) = 9 6. h(bc) = 8 7. h(c) = 3 8. h(ca) = 4 9. h(b) = 2

19 19 Example Input stream abba Start at h(-) = 0 next input is a h(a) = 1 Next input is b h(ab) = 5 then to h(bb) = 6 Then can find h(bba) = 9 match

20 20 Experimental Evaluation

21 21 Experimental Evaluation Multi-bit tries

22 22 Incremental Updates PDF of the number of memory operations required to perform a single trie update. Left trie size = 100,000 nodes, Right trie size = 10,000 nodes.

23 23 Result on Strings Implemented with Aho-Corasick Spill fraction – number of automaton nodes that could not be mapped to a memory location Summarize bHEXA achieve between 2-5 fold reductions in the memory

24 24 Result on Strings Plotting spill fraction: a) Aho-Coroasick automaton for random strings sets, b) Aho-Coroasick automaton for real world string sets, and c) random and real world strings with bit-split version of Aho-Corasick. Cisco622 有很多 state 有 相同的 bHEXA identifiers 4 state-machines, each handling two bits of the 8-bit input character

25 25 Tanks for your attention

26 26 Devising One-to-one Mapping If we require m=n, c is loglogn + O(1) to ensure perfect matching exists with high probability Fig.2 we assume that the hash function is simply the numerical value of the identifier modulo 9

27 27 Memory Mapping m=10 h=(sum from i=1 to k S i x i )mod10 - = 0, a = 1, b = 2,c = 3


Download ppt "1 HEXA : Compact Data Structures for Faster Packet Processing Department of Computer Science and Information Engineering National Cheng Kung University,"

Similar presentations


Ads by Google