HEXA: Compact Data Structures for Faster Packet Processing

Sailesh Kumar, Jonathan Turner, Patrick Crowley, Michael Mitzenmacher

HEXA
HEXA (History-based Encoding, eXecution and Addressing)
- A novel representation for:
  - IP lookup tries (directed acyclic graphs)
  - Simple finite automata such as Aho-Corasick string matchers
- Space efficient
- Challenges the assumption that graph structures must store log2(n)-bit pointers to identify successor nodes
- Requires only 2-bit pointers instead of 20-bit pointers (for 1 million nodes)
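The pointer-size comparison above is simple arithmetic. A minimal sketch, assuming a conventional pointer must be able to name any one of n nodes; the function name `pointer_bits` is made up for illustration and is not from the slides:

```python
def pointer_bits(num_nodes: int) -> int:
    """Bits a conventional pointer needs to name any one of num_nodes nodes."""
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 2, using integers only.
    return max(1, (num_nodes - 1).bit_length())

n = 1_000_000
conventional = pointer_bits(n)  # roughly log2(n) bits per pointer
hexa = 2                        # 2-bit field per child, as stated on the slide
print(conventional, hexa)
```

For one million nodes this gives the 20-bit figure quoted on the slide, against which the 2-bit HEXA field is compared.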

Tries - Traditional Implementation
[Figure: binary trie with nodes numbered 1-9; nodes 3, 4, 6, 8 and 9 hold prefixes P1, P2, P3, P4 and P5]

Five IP prefixes:
  P1  1*
  P2  00*
  P3  11*
  P4  011*
  P5  0100*

Node memory (prefix flag, left child, right child):
  Addr  Data
  1     0, 2, 3
  2     0, 4, 5
  3     1, NULL, 6
  4     1, NULL, NULL
  5     0, 7, 8
  6
  7     0, 9, NULL
  8
  9

There are nine nodes, so we need 4-bit node identifiers.
Each trie node requires 9 bits in memory:
 - a flag indicating whether the node is a prefix
 - a 4-bit left child pointer
 - a 4-bit right child pointer
Total memory = 9 x 9 bits
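The memory figure above can be reproduced with a short sketch. It assumes a plain binary trie built from the five prefixes and one reserved pointer value for NULL; `build_trie` and the variable names are illustrative, not taken from the slides:

```python
PREFIXES = {"1": "P1", "00": "P2", "11": "P3", "011": "P4", "0100": "P5"}

def build_trie(prefixes):
    """Return {path_from_root: prefix_name_or_None} for every trie node."""
    nodes = {"": None}  # the root has the empty path
    for bits, name in prefixes.items():
        # Create every intermediate node on the way down to the prefix.
        for i in range(1, len(bits) + 1):
            nodes.setdefault(bits[:i], None)
        nodes[bits] = name
    return nodes

nodes = build_trie(PREFIXES)
# Pointer values 1..n name the nodes and 0 is assumed to mean NULL.
ptr_bits = len(nodes).bit_length()
node_bits = 1 + 2 * ptr_bits          # prefix flag + left pointer + right pointer
total_bits = len(nodes) * node_bits
print(len(nodes), ptr_bits, node_bits, total_bits)
```

With nine nodes this yields 4-bit pointers, 9 bits per node and 81 bits in total, matching the slide's 9 x 9 bits.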

HEXA based Implementation
[Figure: the example trie for the five IP prefixes (P1 1*, P2 00*, P3 11*, P4 011*, P5 0100*), nodes numbered 1-9]

Define the HEXA identifier of a node as the path that leads to it from the root.

Properties of HEXA identifiers:
 - Unique for every node
 - Implicit (need not be stored)
 - Can replace node pointers

HEXA identifiers of the nodes:
  1. -
  2. 0
  3. 1
  4. 00
  5. 01
  6. 11
  7. 010
  8. 011
  9. 0100

Hash (HEXA identifier) = memory address
If we have a minimal perfect hash function f (a function that maps each element to a unique location), then we can store the trie as shown below.

  f(-) = 4     f(0) = 7     f(1) = 9
  f(00) = 2    f(01) = 8    f(11) = 1
  f(010) = 5   f(011) = 3   f(0100) = 6

  Addr  Node mem  Prefix
  1     1,0,0     P3
  2               P2
  3               P4
  4     0,1,1
  5     0,1,0
  6               P5
  7     0,1,1
  8     0,1,1
  9     1,0,1     P1

We use only 3 bits per node in the fast path:
 - valid prefix flag
 - left child flag
 - right child flag

[Animation: a lookup for an IP address begins at the root node, f(-) = 4, and ends at the prefix we were looking for]

HEXA identifiers are unique for every node, implicit (need not be stored), and can act as memory addresses.
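The lookup just described can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the dictionary `F` stands in for the perfect hash function f with the values from the slide, the empty string stands for the root identifier written "-", and the three flags per address are derived from the trie rather than copied from the table. The names `NODE_MEM`, `PREFIX_MEM` and `lookup` are hypothetical.

```python
PREFIXES = {"1": "P1", "00": "P2", "11": "P3", "011": "P4", "0100": "P5"}

# f from the slide; "" stands for the root identifier written "-" there.
F = {"": 4, "0": 7, "1": 9, "00": 2, "01": 8, "11": 1,
     "010": 5, "011": 3, "0100": 6}

# Fast-path memory: address -> (valid prefix, has left child, has right child).
NODE_MEM = {
    addr: (int(path in PREFIXES), int(path + "0" in F), int(path + "1" in F))
    for path, addr in F.items()
}
# Prefix memory, consulted only when the valid flag is set.
PREFIX_MEM = {F[path]: name for path, name in PREFIXES.items()}

def lookup(addr_bits: str):
    """Longest-prefix match; the path walked so far is the HEXA identifier."""
    path, best = "", None
    while True:
        addr = F[path]                      # hash the history, not a stored pointer
        valid, has_left, has_right = NODE_MEM[addr]
        if valid:
            best = PREFIX_MEM[addr]
        if len(path) == len(addr_bits):
            return best
        bit = addr_bits[len(path)]
        if not (has_right if bit == "1" else has_left):
            return best                     # no child in that direction
        path += bit
```

For example, `lookup("0110")` walks the identifiers -, 0, 01, 011 and stops at the node holding P4. No pointer is ever read: the next address is always computed from the bits consumed so far.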

Devising One-to-one Mapping
- Finding a minimal perfect hash function is difficult
- A one-to-one mapping is essential for HEXA to work
- Use discriminator bits
  - Attach c bits, which we can modify, to every HEXA identifier
  - Thus a node has 2^c choices of identifier
  - We now need to store these c bits for every child instead of a single flag
- With multiple choices of HEXA identifier for a node, the problem reduces to bipartite graph matching
  - We need to find a perfect matching in the graph to map nodes to unique memory locations

Devising One-to-one Mapping
Use 2-bit discriminators: each node gets four choices of HEXA identifier, formed by attaching 00, 01, 10 or 11 to its input label.

[Figure: bipartite graph linking each node's four identifier choices to the memory locations they hash to, with a perfect matching highlighted]

  Node  Input label  Four choices of HEXA identifier
  1     -            00 -, 01 -, 10 -, 11 -
  2     0            00 0, 01 0, 10 0, 11 0
  3     1            00 1, 01 1, 10 1, 11 1
  4     00           00 00, 01 00, 10 00, 11 00
  5     01           00 01, 01 01, 10 01, 11 01
  6     11           00 11, 01 11, 10 11, 11 11
  7     010
  8     011
  9     0100

Choices of memory locations, for example:
  Node 1: h(00) = 0, h(01) = 4, h(10) = 1, h(11) = 5
  Node 2: h(000) = 1, h(010) = 5, h(100) = 2, h(110) = 6

Pick appropriate discriminators so that the selected edges form a perfect matching.
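Choosing the discriminators is a standard bipartite matching problem, and one way to sketch it is with augmenting paths (Kuhn's algorithm). This is an illustrative technique choice; the slides do not say which matching algorithm the authors use. In the `CHOICES` table below, the candidate locations for "-" and "0" are the hash values shown on the slide, while every other row is made up for the example. The name `find_discriminators` is hypothetical.

```python
def find_discriminators(choices):
    """choices: node -> list of candidate slots, indexed by discriminator value.

    Returns node -> (discriminator, slot) for a perfect matching, or None.
    """
    owner = {}   # slot -> node currently assigned to it
    chosen = {}  # node -> index of the discriminator it uses

    def try_place(node, seen):
        for d, slot in enumerate(choices[node]):
            if slot in seen:
                continue
            seen.add(slot)
            # Take a free slot, or evict the occupant if it can move elsewhere.
            if slot not in owner or try_place(owner[slot], seen):
                owner[slot] = node
                chosen[node] = d
                return True
        return False

    for node in choices:
        if not try_place(node, set()):
            return None
    return {node: (d, choices[node][d]) for node, d in chosen.items()}

CHOICES = {
    "-":    [0, 4, 1, 5],   # from the slide
    "0":    [1, 5, 2, 6],   # from the slide
    "1":    [0, 4, 2, 6],   # hypothetical values from here on
    "00":   [0, 1, 3, 7],
    "01":   [4, 5, 8, 3],
    "11":   [0, 5, 6, 2],
    "010":  [1, 4, 7, 3],
    "011":  [0, 2, 5, 8],
    "0100": [4, 1, 6, 7],
}
matching = find_discriminators(CHOICES)
```

Each node ends up with one discriminator and one memory location that no other node shares; the discriminator is what the parent would store in place of a child pointer.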

Store each child's discriminator instead of a single flag for the left and right children.

  Addr  Node mem  Prefix
  1     1,xx,xx   P3
  2               P2
  3               P4
  4     0,xx,xx
  5
  6               P5
  7
  8
  9               P1

Here we use only 5 bits per node in the fast path:
 - valid prefix flag
 - left discriminator
 - right discriminator

Results
- 3 choices are sufficient to find a perfect matching (with 10% memory over-provisioning)
- Thus 2-bit discriminators suffice (the value 00 is reserved for "no child")
- Significant reduction: 2 bits versus log2(n) bits per child pointer
- 32 Eatherton tries, each contains k prefixes.

Incremental Updates
- IP table updates are very frequent
- When a node is removed and another is added, we must ensure that only a few memory operations are needed
- In the new bipartite graph, a new perfect matching can be found quickly
  - O(n 2^c) time in the worst case, typically constant time
- The new matching is only slightly different from the previous matching
  - Typically around 10 different edges; experimental worst case: 18
  - Thus at most 18 memory operations were needed for an update
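Why an update touches only a few locations can be illustrated with a single augmenting-path search from the newly added node. The sketch below uses a breadth-first search so that the shortest chain of relocations is found; this is an assumption made for illustration, not a description of the authors' update procedure. It counts only relocated nodes and ignores the matching discriminator writes in their parents. The name `add_node` and the example tables are hypothetical.

```python
from collections import deque

def add_node(owner, choices, new_node):
    """Insert new_node via a shortest augmenting path.

    owner maps slot -> node and is updated in place. Returns the list of
    slots that were rewritten, or None if no augmenting path exists.
    """
    where = {node: slot for slot, node in owner.items()}
    wants = {}                      # slot -> node that would move into it
    queue = deque([new_node])
    while queue:
        node = queue.popleft()
        for slot in choices[node]:
            if slot in wants:
                continue
            wants[slot] = node
            if slot in owner:
                queue.append(owner[slot])   # occupant may have to move on
                continue
            written = []
            while True:                     # free slot: shift nodes back along the path
                mover = wants[slot]
                owner[slot] = mover
                written.append(slot)
                if mover == new_node:
                    return written
                slot = where[mover]         # mover's old slot is now available

    return None

# Hypothetical example: three nodes already placed, one new node arrives.
owner = {0: "-", 1: "0", 2: "1"}
choices = {"-": [0, 3], "0": [1, 2], "1": [2, 3], "10": [0, 1]}
written = add_node(owner, choices, "10")
```

Here both candidate locations of the new node are occupied, so one existing node is moved to a free location and the new node takes its place: two writes in total. Removing a node needs no search at all, since deleting its entry from `owner` simply frees the location.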

HEXA for Pattern Matching
- HEXA can be used to compress the Aho-Corasick string matching automaton
  - Directed graph
- In the future, HEXA may become useful for general finite automata
  - Reg-ex acceleration

Thank you and Questions???
