1 Scalable high-throughput SRAM-based architecture for IP-lookup using FPGA. Authors: Hoang Le; Weirong Jiang; V. K. Prasanna. Publisher: FPL (Field Programmable Logic and Applications). Presenter: Yu-Ping Chiang. Date: 2008/12/03
2 Outline: Binary-tree-based IP Lookup (Mapping, Searching); Architecture (Cache based); Performance (Throughput, Comparison)
3 Binary-tree-based IP Lookup: Based on the Binary Search Tree property. Each node has a value. Left subtree nodes contain only smaller values. Right subtree nodes contain only greater values. An element can be found in (1+logN) operations. Pre-compute: Pad prefixes to 32 bits with 1s (the padded bits). Sort by the concatenation of prefix and padded bits.
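The pre-compute step above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `pad_prefix`, the toy 8-bit width, the sample prefixes, and the choice to break ties between equal padded values by prefix length are all assumptions.

```python
def pad_prefix(bits, width=32):
    """Pad a prefix given as a bit string with trailing 1s.

    Returns (padded_value, prefix_length). The name and the return
    shape are illustrative assumptions, not taken from the paper.
    """
    if len(bits) > width or any(b not in "01" for b in bits):
        raise ValueError("prefix must be a bit string no longer than width")
    padded = bits + "1" * (width - len(bits))
    return int(padded, 2), len(bits)


# Toy 8-bit example (assumed width; the slides use 32 bits).
prefixes = ["101", "1011", "1010", "0"]

# Sort by padded value; ties fall back to prefix length (an assumption).
table = sorted(pad_prefix(p, width=8) for p in prefixes)

for value, length in table:
    print(f"{value:08b} /{length}")
```

Note that `101*` and `1011*` pad to the same 8-bit value, so the prefix length has to be stored alongside the padded value to tell them apart.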
4 Binary-tree-based IP Lookup: Build the Binary Search Tree. Every level except the last is full; the last level is left-aligned => complete tree
5 Binary-tree-based IP Lookup
7 Recursively find the root
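One way to find the root recursively is sketched below, assuming the input is the sorted list of entries and the target is a complete, left-aligned tree stored level by level. The helper names `left_subtree_size` and `build_complete_bst` are hypothetical; the slides do not give the exact procedure.

```python
def left_subtree_size(n):
    """Number of nodes in the left subtree of a complete tree with n nodes."""
    if n <= 1:
        return 0
    h = n.bit_length() - 1      # index of the last level, floor(log2(n))
    full = (1 << h) - 1         # nodes in the full levels above the last one
    last = n - full             # nodes on the (left-aligned) last level
    half = 1 << (h - 1)         # last-level slots that belong to the left subtree
    return (half - 1) + min(last, half)


def build_complete_bst(sorted_keys):
    """Place sorted keys into a level-order array forming a complete BST."""
    heap = [None] * len(sorted_keys)

    def place(lo, hi, pos):
        n = hi - lo
        if n <= 0:
            return
        root = lo + left_subtree_size(n)   # root of this sorted sub-range
        heap[pos] = sorted_keys[root]
        place(lo, root, 2 * pos + 1)       # left child in level order
        place(root + 1, hi, 2 * pos + 2)   # right child in level order

    place(0, len(sorted_keys), 0)
    return heap


print(build_complete_bst([1, 2, 3, 4, 5, 6]))
```

The level-order layout is chosen here because it lines up with storing one tree level per pipeline stage.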
8 Binary-tree-based IP Lookup Step 1: x = = /011 (Prefix length)
9 Binary-tree-based IP Lookup Step 2: x = = /
10 Binary-tree-based IP Lookup
11 Binary-tree-based IP Lookup Search /011 Step 1: IP= ≦> Not MATCH!!
12 Binary-tree-based IP Lookup Search /101 Step 2: IP= >≦ Not MATCH!!
13 Binary-tree-based IP Lookup Search /101 Step 3: IP= ≦ Match!! Continue search for longer matching.
14 Binary-tree-based IP Lookup Search /011 Step 4: IP= ≦ Match!! Not MATCH!!
15 Binary-tree-based IP Lookup Search: Match!! Property: ex: 1011* and 101*→ and (=) 1010* and 101*→ and (<) Left branch
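The search steps above can be sketched in software as follows. This is a hedged illustration, not the authors' implementation: it assumes a toy 8-bit address width, a hand-built level-order tree, and the traversal rule "go left when IP ≦ node value, otherwise go right, remembering the longest match seen". It has only been checked against the small table below; it is not a proof that this rule finds every covering prefix in an arbitrary nested prefix set.

```python
WIDTH = 8  # toy address width (assumption); the slides use 32-bit addresses


def pad(bits):
    """Return (padded_value, prefix_length) for a prefix bit string."""
    return int(bits + "1" * (WIDTH - len(bits)), 2), len(bits)


def matches(ip, padded, length):
    """True when the top `length` bits of ip equal those of the prefix."""
    shift = WIDTH - length
    return (ip >> shift) == (padded >> shift)


def lookup(heap, ip):
    """Walk the level-order tree, keeping the longest matching prefix."""
    best = None
    pos = 0
    while pos < len(heap):
        padded, length = heap[pos]
        if matches(ip, padded, length) and (best is None or length > best[1]):
            best = (padded, length)
        # Left when ip <= node value (longer matches sit there), else right.
        pos = 2 * pos + 1 if ip <= padded else 2 * pos + 2
    return best


# Hand-built complete BST over 0*, 1010*, 101*, 11* (illustrative data).
heap = [pad("101"), pad("1010"), pad("11"), pad("0")]

print(lookup(heap, 0b10100001))  # matches 101* first, then the longer 1010*
print(lookup(heap, 0b10110000))  # matches only 101*
```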
17 Architecture: Pipelining. The memory of each stage holds the nodes of one Binary Search Tree level. Dual read/write ports. Content of each entry: padded prefix, prefix length. Data forwarded to the next stage: IP address, memory address, previously longest matched prefix information.
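A software model of the per-stage organization might look like the sketch below. It is an assumption-laden illustration rather than the hardware design: the `Packet` fields mirror the forwarded data listed above, while the names `split_levels`, `stage`, and `run`, the toy 8-bit width, and the sample tree are hypothetical. It also pushes one lookup through the stages sequentially, whereas the hardware keeps several lookups in flight at once.

```python
from dataclasses import dataclass
from typing import Optional

WIDTH = 8  # toy address width (assumption); the slides use 32 bits


@dataclass
class Packet:
    """Data forwarded from one stage to the next."""
    ip: int
    addr: Optional[int]                 # address in the next stage's memory
    best_len: int = -1                  # longest matched prefix length so far
    best_prefix: Optional[int] = None   # padded value of that prefix


def split_levels(heap):
    """Split a level-order tree into one memory (list) per tree level."""
    levels, start, size = [], 0, 1
    while start < len(heap):
        levels.append(heap[start:start + size])
        start += size
        size *= 2
    return levels


def stage(memory, pkt):
    """One pipeline stage: read one entry, compare, compute next address."""
    if pkt.addr is None or pkt.addr >= len(memory):
        pkt.addr = None                 # path ended; later stages pass through
        return pkt
    padded, length = memory[pkt.addr]
    shift = WIDTH - length
    if (pkt.ip >> shift) == (padded >> shift) and length > pkt.best_len:
        pkt.best_len, pkt.best_prefix = length, padded
    # Children of entry a on one level sit at 2a and 2a+1 on the next level.
    pkt.addr = 2 * pkt.addr + (0 if pkt.ip <= padded else 1)
    return pkt


def run(levels, ip):
    pkt = Packet(ip=ip, addr=0)
    for memory in levels:
        pkt = stage(memory, pkt)
    return pkt


# Illustrative tree: (padded value, prefix length) for 101*, 1010*, 11*, 0*.
levels = split_levels([(191, 3), (175, 4), (255, 2), (127, 1)])
result = run(levels, 0b10100001)
print(result.best_prefix, result.best_len)
```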
19 Architecture: Cache based. Caches the most recently searched packets. Update when: a route update relates to a cached entry; a cache miss occurs.
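The two update conditions can be illustrated with a small software cache. This is only a sketch under stated assumptions: the class name `LookupCache`, the least-recently-used eviction policy, invalidation (rather than in-place refresh) on a route update, and the stand-in `slow_lookup` function are not taken from the slides.

```python
from collections import OrderedDict


class LookupCache:
    """Cache of recently searched destination addresses (illustrative)."""

    def __init__(self, capacity, slow_lookup):
        self.capacity = capacity
        self.slow_lookup = slow_lookup      # full lookup used on a miss
        self.entries = OrderedDict()        # ip -> lookup result

    def lookup(self, ip):
        if ip in self.entries:
            self.entries.move_to_end(ip)    # mark as most recently used
            return self.entries[ip]
        result = self.slow_lookup(ip)       # cache miss: do the full lookup
        self.entries[ip] = result
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used
        return result

    def route_update(self, prefix_value, prefix_len, width=32):
        """Drop cached addresses covered by an updated prefix."""
        shift = width - prefix_len
        stale = [ip for ip in self.entries
                 if (ip >> shift) == (prefix_value >> shift)]
        for ip in stale:
            del self.entries[ip]


calls = []


def slow_lookup(ip):
    """Stand-in for the pipeline lookup; records each call."""
    calls.append(ip)
    return "port-A" if ip >> 31 else "port-B"


cache = LookupCache(capacity=2, slow_lookup=slow_lookup)
cache.lookup(0x0A000001)
cache.lookup(0x0A000001)   # second call is served from the cache
print(len(calls))
```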
21 Performance: Throughput. Without caching: 324 MLPS (100 Gbps) at 162 MHz, assuming a minimum packet size of 40 bytes. With 1% of routing entries cached: 4 packets processed per clock => 4*324 = 1.3 GLPS, 416 Gbps
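The throughput figures above can be checked with a few lines of arithmetic. The factor of two lookups per clock is an assumption (one per memory port) chosen because it reproduces 324 MLPS from 162 MHz; the variable names are illustrative.

```python
CLOCK_MHZ = 162
LOOKUPS_PER_CLOCK = 2          # assumption: one lookup per port of the dual-port memory
MIN_PACKET_BITS = 40 * 8       # 40-byte minimum packet

mlps = CLOCK_MHZ * LOOKUPS_PER_CLOCK                 # million lookups per second
gbps = mlps * 1e6 * MIN_PACKET_BITS / 1e9            # line rate at minimum packet size

cached_mlps = 4 * mlps                               # 4*324 from the slide
cached_gbps = cached_mlps * 1e6 * MIN_PACKET_BITS / 1e9

print(mlps, round(gbps, 2))
print(cached_mlps, round(cached_gbps, 2))
```

The exact products are 103.68 Gbps and 414.72 Gbps; the slide's 100 Gbps and 416 Gbps follow from rounding, the latter from 1.3 GLPS times 320 bits.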
22 Performance: Comparison
Architecture                 # slices         BRAM   # prefixes   Throughput
Ring architecture            1405 (2.3%)      530    80K          125 MLPS
State of art on FPGA         14274 (22.7%)    254    80K          263 MLPS
Non-cache-based              2009 (3.2%)      539    228K         324 MLPS
Cache-based                  7982 (12.7%)     539    228K         1.3 GLPS
Non-cache-based with SRAM    1813 (2.9%)      311    2M           324 MLPS
Cache-based with SRAM        7713 (12.3%)     311    2M           1.3 GLPS