1 An Efficient, Hardware-based Multi-Hash Scheme for High Speed IP Lookup Hot Interconnects 2008 Socrates Demetriades, Michel Hanna, Sangyeun Cho and Rami.

Presentation transcript:

1 An Efficient, Hardware-based Multi-Hash Scheme for High Speed IP Lookup Hot Interconnects 2008 Socrates Demetriades, Michel Hanna, Sangyeun Cho and Rami Melhem.

2 Background: IP Lookup in Core Router
[Figure: the IP address of an incoming packet is looked up in a forwarding table (IP address, Next Hop, Outgoing Link); longest prefix matching selects the outgoing link, here Port 2.]

3 Motivation
Increasing Internet traffic
- High-speed links: optical technology -> link rates ~100 Gbps
- High-speed routers: TCAM-based forwarding engines
- Larger forwarding tables: TCAMs FAIL to scale

4 IP Lookup Schemes
1. TCAM-based schemes [idt, netlogic, micron, CoolCAM]
   1. Fast and constant lookup time
   2. High cost and power consumption
2. Trie-based schemes [Eatherton04, Devroye03, …]
   1. Multi-cycle lookup latencies and low worst-case throughput
   2. Performance and scalability are fundamentally tied to the IP address length
3. Hash-based schemes [Srinivasan98, Hasan06, Kaxiras05, …]
   1. Key-length-independent latencies
   2. Easy to implement in hardware
   3. Hashing collisions -> space inefficiency
   4. Hash keys (prefixes) include “don’t care” bits, which complicate hashing

5 Overview
Problem: Hash-based schemes can be power and cost efficient but are still space inefficient or slow.
Goal: A hardware-based forwarding engine that has:
1. Constant and high-speed lookup throughput.
2. Space efficiency.
3. Good scaling with growing forwarding tables.
4. Low cost and power consumption.
Proposal: A h/w-based multi-hash architecture that sustains high throughput (1 packet lookup per memory cycle) while remaining space and power efficient.

6 Outline
- Introduction
- High Speed and Space Efficient Implementation
- Selecting hashing bits / Dealing with wildcard bits
- Experimental Evaluation
- Summary

7 h/w Hash-based IP Lookup
[Figure: the key (IP address) enters a hash index generator that selects one of 2^R rows in a C-way associative memory array; the C keys fetched from that row (key 1 … key c) go to the matching processors (match 1 … match c), whose results feed the LPM logic.]
Much more power-efficient scheme compared with TCAM. High throughput.
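The datapath on this slide (index generation, one row fetch, parallel matching, LPM selection) can be illustrated with a small software model. This is only a sketch under stated assumptions: keys are 8 bits wide as in the slide examples, the index generator simply takes the top R bits, and the values of `R` and `C`, the `Entry` type, and all function names are hypothetical rather than taken from the presentation.

```python
from typing import List, NamedTuple, Optional, Tuple

KEY_BITS = 8  # toy key width, matching the 8-bit slide examples
R = 2         # assumed number of index bits -> 2**R rows
C = 4         # assumed number of entries per row


class Entry(NamedTuple):
    value: int     # prefix bits, left-aligned in KEY_BITS, wildcards stored as 0
    length: int    # number of significant leading bits
    next_hop: str


def hash_index(key: int) -> int:
    """Hypothetical index generator: the top R bits of the key."""
    return key >> (KEY_BITS - R)


def matches(entry: Entry, key: int) -> bool:
    """Model of one matching processor: compare only the significant bits."""
    shift = KEY_BITS - entry.length
    return (key >> shift) == (entry.value >> shift)


def build_table(prefixes: List[Tuple[str, str]]) -> List[List[Entry]]:
    """Place prefixes written like '1010****' into their rows."""
    rows: List[List[Entry]] = [[] for _ in range(2 ** R)]
    for bits, next_hop in prefixes:
        length = len(bits.rstrip("*"))
        assert length >= R, "sketch assumes no wildcard among the index bits"
        value = int(bits.replace("*", "0"), 2)
        row = rows[hash_index(value)]
        assert len(row) < C, "row overflow is not handled in this sketch"
        row.append(Entry(value, length, next_hop))
    return rows


def lookup(rows: List[List[Entry]], key: int) -> Optional[str]:
    row = rows[hash_index(key)]                  # one memory access fetches C keys
    hits = [e for e in row if matches(e, key)]   # C comparisons, parallel in hardware
    if not hits:
        return None
    return max(hits, key=lambda e: e.length).next_hop  # LPM logic


table = build_table([("1010****", "A"), ("101011**", "B"), ("1111****", "C")])
print(lookup(table, 0b10101101))  # both 1010**** and 101011** match; the longer wins
```

In hardware the C comparisons happen in the same cycle, which is why a single row fetch per lookup is enough; the list comprehension only models that behaviour sequentially.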

8 Hash-based IP Lookup example
[Figure: the datapath of the previous slide with 8-bit keys; the fetched row holds prefixes such as 1010**** and 1111****, and the matching entry 1111**** supplies the next hop.]

9 Hash-based IP Lookup - LPM
[Figure: the same 8-bit example with several matching entries in the fetched row, among them 1010****; the LPM (Longest Prefix Match) logic selects the longest matching prefix to supply the next hop.]

10 Hash index generation
Simple XOR-folding hash function.
[Figure: a bit-select mechanism picks N selected bits from the IP prefix or incoming IP address; the N bits are divided into F bits and R = N - F bits, which are skewed and XORed to produce the R-bit hash index.]
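A possible software reading of this XOR-folding step is sketched below. The slide names the ingredients (bit-select, N selected bits, F, R = N - F, skew, XOR) but not their exact wiring, so the details here are assumptions: the low R selected bits are XORed with the remaining F bits, and "skew" is modelled as a left rotation of those F bits inside an R-bit word. The function names and the chosen bit positions are hypothetical.

```python
def bit_select(addr: int, positions, width: int = 32) -> int:
    """Gather the bits at `positions` (0 = most significant bit) into one integer."""
    out = 0
    for p in positions:
        out = (out << 1) | ((addr >> (width - 1 - p)) & 1)
    return out


def xor_fold(selected: int, n: int, r: int, skew: int = 0) -> int:
    """Fold n selected bits into an r-bit hash index.

    The low r bits are XORed with the remaining f = n - r high bits,
    which are first rotated left by `skew` positions within r bits.
    """
    f = n - r
    assert r > 0 and 0 <= f <= r
    mask = (1 << r) - 1
    low = selected & mask
    high = selected >> r                      # the f leftover bits
    skew %= r
    high = ((high << skew) | (high >> (r - skew))) & mask  # rotate within r bits
    return low ^ high


addr = 0xC0A80100                             # 192.168.1.0
selected = bit_select(addr, range(8, 24))     # N = 16 bits: the two middle bytes
print(hex(selected), xor_fold(selected, 16, 10), xor_fold(selected, 16, 10, skew=2))
```

Varying `skew` is one cheap way to derive several different index functions from the same selected bits, which is what a multi-hash design needs; whether the presented hardware does exactly this is not stated on the slide.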

11 Inserting / Hashing IP prefixes
[Figure: bucket load versus bucket index for a single hash table, contrasting used memory space with total available memory space.]
Space utilization = 30%. Balanced is better.
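The "balanced is better" point can be made concrete with a few lines of arithmetic. The sketch below assumes that every bucket is provisioned as wide as the most loaded one, so the total memory is the number of buckets times the maximum load; the slide does not spell out this sizing rule, and the bucket loads are made-up numbers.

```python
def space_utilization(bucket_loads):
    """Used entries divided by provisioned entries.

    Assumption: every bucket is as wide as the most loaded bucket.
    """
    capacity = len(bucket_loads) * max(bucket_loads)
    return sum(bucket_loads) / capacity


skewed = [1, 0, 2, 10, 0, 3, 1, 1]     # one hot bucket dictates the row width
balanced = [2, 3, 2, 2, 3, 2, 2, 2]    # same 18 keys, spread evenly
print(space_utilization(skewed), space_utilization(balanced))
```

Both layouts store the same 18 keys, but the skewed one needs rows of width 10 while the balanced one needs only width 3.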

12 How to improve the utilization of the hash table
- Powerful hash functions -> complexity -> delay on the critical lookup path.
- Adaptive perfect or semi-perfect hash functions -> rehashing of the whole routing table is needed periodically, a very time-consuming process.
- Using multiple hash functions (MHT) -> increased space efficiency.
Our proposal: a multi-hashing scheme (MHT) in which items are allowed to migrate during the insertion operation.

13 IP prefix insertion (multi-hashing)
[Figure: three hash functions h1, h2, h3 each index their own table; used entries are marked.]

14 Hashing IP prefixes: multi-hashing
[Figure: bucket load versus bucket index.]
Space utilization: single hashing (single hash table) 30%; multi-hashing with 3 hash tables 50%.

15 Migrations are allowed during the insertion operation
Insertion time?
[Figure: tables indexed by h1, h2, h3.]

16 Hashing prefixes: MHT + migrations
Space utilization:
(a) Single hashing (single hash table): 30%
(b) Multi-hashing with 3 hash tables: 50%
(c) Multi-hashing with 3 hash tables + migrations: 70%

17 Crisis: Handling unresolved collisions
[Figure: tables indexed by h1, h2, h3, backed by a victim TCAM for prefixes that cannot be placed.]
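The insertion flow of the last few slides (try h1, h2, h3; migrate a resident item if all candidate buckets are full; fall back to the victim store) can be sketched as follows. The slides do not give the migration policy, so this model assumes a single level of migration, similar in spirit to cuckoo hashing. The per-table hash function, the table sizes, and the class itself are hypothetical, and a plain list stands in for the victim TCAM.

```python
class MultiHashTable:
    """Toy multi-hash table with one-level migration and a victim store."""

    def __init__(self, num_tables=3, rows=4, ways=2):
        self.d, self.rows, self.ways = num_tables, rows, ways
        self.tables = [[[] for _ in range(rows)] for _ in range(num_tables)]
        self.victims = []  # stands in for the victim TCAM

    def _index(self, t, key):
        # Hypothetical per-table hash: a different odd multiplier per table.
        return (key * (2 * t + 3) + t) % self.rows

    def _bucket(self, t, key):
        return self.tables[t][self._index(t, key)]

    def _place(self, key, skip=None):
        """Put key into the first candidate bucket with a free entry."""
        for t in range(self.d):
            if t == skip:
                continue
            bucket = self._bucket(t, key)
            if len(bucket) < self.ways:
                bucket.append(key)
                return True
        return False

    def insert(self, key):
        if self._place(key):
            return "placed"
        # All candidate buckets are full: try to move one resident elsewhere.
        for t in range(self.d):
            bucket = self._bucket(t, key)
            for resident in list(bucket):
                if self._place(resident, skip=t):
                    bucket.remove(resident)
                    bucket.append(key)
                    return "migrated"
        self.victims.append(key)  # unresolved collision
        return "victim"

    def contains(self, key):
        return key in self.victims or any(
            key in self._bucket(t, key) for t in range(self.d)
        )


mht = MultiHashTable()
outcomes = [mht.insert(k) for k in range(30)]
print({o: outcomes.count(o) for o in set(outcomes)})
```

A lookup still touches exactly one bucket per table, so migrations add cost only at insertion time, which is the question the "Insertion time?" slide raises.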

18 Outline
- Introduction
- High Speed and Space Efficient Implementation
- Selecting Hashing Bits / Dealing with wildcard bits
- Experimental Evaluation
- Summary

19 Selecting hashing bits from prefixes
[Figure: example prefixes of length 8, 16 and 24 bits, followed by 24, 16 and 8 wildcard bits respectively.]
- No prefix has length < 8 bits.
- Rightmost bits have higher entropy and are more suitable for hashing.
- Routing tables become larger when wildcard bits participate in hashing.

20 Supporting wildcard bits in hashing
Current technique: convert each prefix of length x into a set of new prefixes of length L = x + k, so that the wildcard bits are eliminated up to length L. Then hash the whole expanded set of prefixes. [Srinivasan et al.]
-> Each prefix expands into 2^k prefixes, which enlarges the table.
[Figure: a prefix of length 16 bits (16 wildcard bits) expanded into a set of longer prefixes.]
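The expansion step described here is easy to show directly: a prefix of length x becomes 2^k prefixes of length L = x + k. The sketch below represents prefixes as bit strings; the function name is hypothetical and the example prefixes are invented.

```python
from itertools import product


def expand_prefix(bits: str, target_len: int):
    """Expand a prefix (string of '0'/'1') into all prefixes of length target_len."""
    k = target_len - len(bits)
    if k < 0:
        raise ValueError("prefix is already longer than the target length")
    return [bits + "".join(tail) for tail in product("01", repeat=k)]


print(expand_prefix("1010", 6))            # k = 2 -> 4 prefixes
print(len(expand_prefix("1" * 16, 20)))    # k = 4 -> 16 prefixes
```

The table growth is exponential in k, which is the cost that motivates a more selective treatment of wildcard bits.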

21 Control Wildcard Resolution (CWR)
[Figure: a prefix with wildcard bits and the bit fields used as (index); one selection yields 16 keys to be inserted, another yields 4 keys to be inserted.]
CWR: select bits from any carefully predefined positions.
-> Allows a sensitivity analysis that can find optimal configuration points for maximum space efficiency.
-> Faster insertion time per prefix.
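One plausible reading of the "16 keys" versus "4 keys" contrast is that only the wildcard bits falling inside the selected index positions have to be resolved, so the choice of positions controls how many copies of a prefix are inserted. The sketch below illustrates that reading only; it is an assumption about CWR, not the authors' definition, and the 16-bit prefix, the positions, and the function name are invented.

```python
from itertools import product


def keys_to_insert(prefix: str, index_positions):
    """Index bit strings for a prefix over '0', '1', '*' (position 0 = leftmost bit).

    Wildcards inside index_positions are enumerated; all others are ignored.
    """
    positions = list(index_positions)
    wild = [p for p in positions if prefix[p] == "*"]
    keys = []
    for combo in product("01", repeat=len(wild)):
        fill = dict(zip(wild, combo))
        keys.append("".join(fill.get(p, prefix[p]) for p in positions))
    return keys


prefix = "10110110********"                       # toy 16-bit address, prefix length 8
print(len(keys_to_insert(prefix, range(4, 12))))  # 4 wildcard index bits -> 16 keys
print(len(keys_to_insert(prefix, range(2, 10))))  # 2 wildcard index bits -> 4 keys
```

Shifting the index window two bits to the left cuts the number of inserted keys by a factor of four, at the price of hashing on bits that may be less evenly distributed.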

22 Outline
- Introduction
- High Speed and Space Efficient Implementation
- Selecting hashing bits / Dealing with wildcard bits
- Experimental Evaluation
- Summary

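23 Lookup Architecture
[Figure: a bit-select mechanism takes R+F bits of the incoming packet's IP address (32 bits) as the selected bits for index generation and produces an R-bit hash index; the T+F bits form the tag (TAG) to match against the fetched entries; the match results feed the LPM logic.]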
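The bit budget on this slide suggests why the tag is T + F bits wide: R + F bits are folded into an R-bit index, so F of them must be kept in the tag for the stored entry to remain unambiguous. The sketch below checks that reasoning on a toy layout. The contiguous field order (T, then F, then R from the most significant bit) and the plain XOR fold without skew are assumptions, as are the function names and field widths.

```python
ADDR_BITS = 32


def split_address(addr: int, r: int, f: int):
    """Return (index, tag) for an assumed layout [T bits | F bits | R bits]."""
    assert 0 < f <= r and r + f <= ADDR_BITS
    r_bits = addr & ((1 << r) - 1)
    f_bits = (addr >> r) & ((1 << f) - 1)
    t_bits = addr >> (r + f)
    index = r_bits ^ f_bits           # R + F selected bits folded into R bits
    tag = (t_bits << f) | f_bits      # T + F bits stored with the entry
    return index, tag


def rebuild_address(index: int, tag: int, r: int, f: int) -> int:
    """Invert split_address: the F bits kept in the tag undo the XOR fold."""
    f_bits = tag & ((1 << f) - 1)
    t_bits = tag >> f
    r_bits = index ^ f_bits
    return (t_bits << (r + f)) | (f_bits << r) | r_bits


addr = 0xC0A80164                      # 192.168.1.100
index, tag = split_address(addr, r=10, f=6)
print(index, hex(tag), hex(rebuild_address(index, tag, 10, 6)))
```

Because the pair (index, tag) reconstructs the full address, two different addresses can never collide on both the row and the tag, so a tag match is an exact match.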

24 Sensitivity Analysis: different bit-select configurations
1. Advantage over the standard MHT scheme.
2. Very small deviation of the points around the trend line -> a practical guarantee that the unresolved collisions will not be far from an estimated value.

25 Comparison for h/w based schemes
                  | TCAM                                | IPStash                                            | New scheme
Description       | h/w CAM based                       | h/w hash-based                                     | h/w hash-based
Throughput        | 1                                   | 1/3                                                | 1
Space efficiency  | Best                                | Very good (state of the art for hash-based)        | Good
Power consumption | CAM => high consumption per lookup  | 2.2 mem accesses per lookup + many row comparators | 1 mem access per lookup + few row comparators

26 Space Efficiency - Comparison Load Factor = Routing table size / Available space capacity

27 Power Consumption Even with load factor = x more power efficient than TCAM - 2x compared with IPStash.

28 Victim TCAM space requirements The percentage of the ‘unresolved collisions’ is an accurate estimator of the victim space that is required for the corresponding load factor.

29 Summary
IP lookup using TCAMs is expensive. Current hash-based approaches are promising but are either space inefficient or limited by low lookup throughput.
The proposed h/w-based multi-hash lookup scheme has:
1. High-speed lookup throughput: requires 1 memory access time per packet lookup.
2. Space efficiency: effective load factor of 70% with < 5% victim TCAM.
3. Low power consumption and cost: 8x less power than dynamic TCAMs, best among hash-based schemes, simple and easy hardware implementation.
4. Scalability to future routing table sizes and the IPv6 transition: all methods and techniques used scale well.

30 Questions