
Presentation transcript:

Bio
Michel Hanna
- M.S. in E.E., Cairo University, Egypt
- B.S. in E.E., Cairo University at Fayoum, Egypt
- Currently a Ph.D. student in the Computer Engineering Program, University of Pittsburgh

CHAP: Enabling Efficient Hardware-based Multiple Hash Schemes for IP Lookup
Michel Hanna, Socrates Demetriades, Sangyeun Cho, Rami Melhem

The Problem
[Figure: users (including you) and servers, such as a YouTube server, connected through routers across the Internet.]

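Inside a Router
[Figure: input ports, forwarding table and forwarding decision, switching fabric, output ports and links. Example destinations: YouTube, Hotmail, CA, Aachen.]
Forwarding table (Prefix -> Port N#), as far as the transcript preserves it: 0* -> 0, 1* -> 1, 100* -> 6, 101* -> 2, 111* -> 3. The entries involving 1000*, 110* and ports 3 and 5 are incomplete in the transcript.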

Current Solution
The TCAM (Ternary Content Addressable Memory):
- the most widely used solution in practice, as it provides the answer in a single memory access
However:
- very high power consumption
- very low bit density, and not scalable
- runs at lower speed than RAMs
[Figure: the forwarding table from the previous slide.]

Hash-based solution
[Figure: a hash function h(.) maps Prefix i of the forwarding table to one row of a hash table with N = n# rows and L = bucket size.]
How to handle the OVERFLOW?!

Use Linear Probing…
[Figure: hash function h(.) maps Prefix i to a row; on overflow the following rows are probed one after another.]
- we might scan the entire table…
- this means that we cannot bound the memory access time…
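The concern on this slide can be made concrete with a minimal sketch of a bucketed hash table that uses linear probing. The function names, the `max_probes` parameter and the use of Python's built-in `hash` are assumptions for illustration; they are not taken from the slides.

```python
def build_linear_probing(keys, n_rows, bucket_size, max_probes, h=hash):
    """Place keys into n_rows buckets of bucket_size cells each.

    A key whose home row is full tries the next rows, up to max_probes
    extra rows. Keys that still do not fit are returned as overflow.
    """
    table = [[] for _ in range(n_rows)]
    overflow = []
    for key in keys:
        home = h(key) % n_rows
        for step in range(max_probes + 1):
            row = (home + step) % n_rows
            if len(table[row]) < bucket_size:
                table[row].append(key)
                break
        else:
            # No probed row had a free cell.
            overflow.append(key)
    return table, overflow


def rows_read(table, key, max_probes, h=hash):
    """Return how many rows a successful search reads, or None on a miss."""
    n_rows = len(table)
    home = h(key) % n_rows
    for step in range(max_probes + 1):
        if key in table[(home + step) % n_rows]:
            return step + 1
    return None


def overflow_percent(keys, overflow):
    """Overflow as the percentage of keys that did not fit in the table."""
    return 100.0 * len(overflow) / len(keys)
```

With `max_probes = n_rows - 1` nothing overflows as long as the table has room, but a single search may then read every row, which is the unbounded access time the slide warns about. Capping `max_probes` bounds the search and turns the remainder into overflow.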

Content-based HAsh Probing (CHAP)
[Figure: hash function h(.) maps Prefix i to a row; two probing pointers, with indices 0 and 1 and each selected by h(.), point to further rows.]
Note that the probing pointers are set differently for each IP lookup table, based on its content.
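A rough sketch of the idea, under stated assumptions: one hash function and one probing pointer per row, with pointers assigned by a simple best-fit rule over the per-row load. The names `build_chap` and `rows_accessed` are hypothetical, and the best-fit details are an illustration, not the authors' setup algorithm.

```python
def build_chap(keys, n_rows, bucket_size, h=hash):
    """Build a table where overloaded rows get a content-based pointer."""
    homes = [h(key) % n_rows for key in keys]
    load = [homes.count(row) for row in range(n_rows)]
    free = [max(bucket_size - c, 0) for c in load]
    excess = [max(c - bucket_size, 0) for c in load]

    # Assumed best-fit rule: serve the most overloaded rows first and point
    # each to the row whose free space fits its excess most tightly.
    pointer = [None] * n_rows
    for row in sorted(range(n_rows), key=lambda r: -excess[r]):
        if excess[row] == 0:
            break
        candidates = [r for r in range(n_rows) if free[r] > 0]
        if not candidates:
            break
        fitting = [r for r in candidates if free[r] >= excess[row]]
        if fitting:
            target = min(fitting, key=lambda r: free[r])
        else:
            target = max(candidates, key=lambda r: free[r])
        pointer[row] = target
        free[target] = max(free[target] - excess[row], 0)

    # First pass: fill home rows. Second pass: spill through the pointers.
    table = [[] for _ in range(n_rows)]
    spilled = []
    for key, home in zip(keys, homes):
        if len(table[home]) < bucket_size:
            table[home].append(key)
        else:
            spilled.append((key, home))
    overflow = []
    for key, home in spilled:
        target = pointer[home]
        if target is not None and len(table[target]) < bucket_size:
            table[target].append(key)
        else:
            overflow.append(key)
    return table, pointer, overflow


def rows_accessed(table, pointer, key, h=hash):
    """A search reads the home row, then at most the row it points to."""
    home = h(key) % len(table)
    if key in table[home]:
        return 1
    target = pointer[home]
    if target is not None and key in table[target]:
        return 2
    return None
```

Unlike linear probing, the second row is chosen from the table's content rather than by position, so a search in this sketch never reads more than two rows.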

Evaluation
[Table: N# of Tables and Ave. Size (K) for each rrc source; the values are not preserved in the transcript.]
- we used simulation to validate our scheme on real-life IP lookup tables
- 14 tables from different systems were used, and all gave the same results
- quality is measured in terms of "overflow": the percentage of prefixes that did not fit in the hash table…

CHAP vs. Linear Probing
still some overflow?
Legend
- L: bucket or row width
- N: number of rows
- m: number of probing pointers for CHAP, and number of probing steps for linear probing
- C_i: configuration number 'i'

Use Multiple Hashing…
[Figure: multiple hash functions h_0(.) and h_1(.) map Prefix i to two candidate buckets.]
- the prefix might go to one of the two buckets…
- still have some overflow left…
We combine CHAP with Multiple Hashing
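A minimal sketch of multiple hashing follows. The slide only says the prefix may go to one of the candidate buckets; placing it in the least-loaded candidate is an assumption made here, as are the function and parameter names.

```python
def build_multiple_hashing(keys, n_rows, bucket_size, hash_functions):
    """Insert each key into the least-loaded of its candidate rows."""
    table = [[] for _ in range(n_rows)]
    overflow = []
    for key in keys:
        rows = [h(key) % n_rows for h in hash_functions]
        # Assumed policy: choose the candidate row with the fewest keys.
        best = min(rows, key=lambda r: len(table[r]))
        if len(table[best]) < bucket_size:
            table[best].append(key)
        else:
            overflow.append(key)
    return table, overflow
```

A search has to read every candidate row, so the number of hash functions bounds the access time, but rows can still fill up and leave some overflow.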

CHAP(H, m)
- H: number of hash functions
- m: number of probing pointers
[Figure: multiple hash functions h_0(.) and h_1(.) map Prefix i to rows; probing pointers 0 and 1, selected by h_0(.) and h_1(.) respectively, point to further rows.]

CHAP(H,H) vs. Multiple Hashing (MH): Overflow
We experiment with m = H.

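IP Address Search in Set-Associative Arch.
[Figure: the index generator (IG) selects one row of the RAM; the matching processors (MP) perform parallel matching and a priority encoder produces the result. One cell holds Prefix, Len and Port; the RAM has N rows.]
- All keys in one row are matched in parallel
- Consumes "1/N" of TCAM power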
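What the matching processors and the priority encoder compute for one row can be sketched in software. The cell layout `(prefix, length, port)` with left-aligned prefix bits, and the name `match_row`, are assumptions for illustration; the hardware matches all cells at once, whereas this sketch loops over them.

```python
def match_row(row, address, width=32):
    """Return the port of the longest matching prefix in one row, or None.

    row: list of (prefix, length, port) cells; prefix is a width-bit
    integer whose top `length` bits are significant.
    """
    best = None
    for prefix, length, port in row:
        shift = width - length
        # A cell matches when the top `length` bits of the address agree.
        if (address >> shift) == (prefix >> shift):
            # Priority encoding: keep the longest matching prefix.
            if best is None or length > best[0]:
                best = (length, port)
    return None if best is None else best[1]
```

For example, with 8-bit addresses and a row holding 1* -> 1, 100* -> 6 and 101* -> 2, the address 10010000 matches both 1* and 100*, and the longer prefix wins, giving port 6.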

CHAP vs. MH: ASST (Average Successful Search Time)

Tradeoff between Overflow and ASST
[Figure: ASST plotted against Overflow for Curve #1 and Curve #2, with the "better scheme" marked.]

Tradeoff between Overflow and ASST

Conclusion and Future Work
CHAP is effective in:
- reducing the overflow by 72% on average compared to other probing schemes
- keeping the average memory access time low (2.5 accesses max)
Future work:
- apply it to other network applications: packet filtering, packet inspection, VPN packet forwarding…
- study the general case "CHAP(H,m)" with "H ≠ m"; it may be useful for other applications (speech recognition…)

Questions? THANK YOU

Backup slides

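The CA-RAM Architecture
[Figure: the key enters the index generator (IG), which selects one row of the RAM (N = n# rows, L = bucket width); the matching processors (MP) perform parallel matching on that row and feed a priority encoder. One cell holds Prefix, Len and Port.]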

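IP Address Search in CA-RAM
[Figure: the index generator (IG) selects the row where the prefix is stored; the matching processors (MP) perform parallel matching on that row of the RAM and the priority encoder produces the result. One cell holds Prefix, Len and Port.]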

CHAP Setup Algorithm
- the goal is to map the lookup table into a hash table with 2^R = N rows
  - R = n# of bits used to index the hash table
- first, sort the prefixes from long to short
- then collect statistics about the lookup table:
  - calculate the n# of prefixes to be assigned to each row
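The statistics step can be sketched as follows. Prefixes are assumed to be bit strings, the row index is assumed to be the first R bits of the prefix, and prefixes shorter than R bits are set aside; these are choices made for the sketch, not the authors' hash function.

```python
from collections import Counter


def setup_statistics(prefixes, r_bits):
    """Sort prefixes from long to short and count prefixes per row.

    Returns (hashable, short, per_row), where per_row[i] is the number of
    prefixes whose first r_bits select row i out of 2**r_bits rows.
    """
    ordered = sorted(prefixes, key=len, reverse=True)
    # Prefixes shorter than r_bits cannot supply a full row index.
    short = [p for p in ordered if len(p) < r_bits]
    hashable = [p for p in ordered if len(p) >= r_bits]
    counts = Counter(int(p[:r_bits], 2) for p in hashable)
    per_row = [counts.get(row, 0) for row in range(2 ** r_bits)]
    return hashable, short, per_row
```

Rows whose count exceeds the bucket width are the ones that would need a probing pointer or a second hash function.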

CHAP Setup Algorithm

- when Algorithm 1 exits, "table_overflow" contains the n# of prefixes that could not fit
  - if this is not acceptable, the algorithm is repeated with more hash functions
  - a separate TCAM is used to store the short prefixes and the overflow
- the probing pointers array is activated by running the best-fit algorithm

Search in CHAP
- the order in which the probing pointers are accessed during a search has to be the same order used when inserting the prefixes
  - this constraint has to be satisfied to guarantee longest prefix matching (LPM)
- the order is maintained by dedicating one probing pointer per hash function

The Incremental Updates
- we need to define where to store the new prefix (k_n) according to its length, so as to achieve LPM
  - if the prefix already exists, then the existing entry is updated
- based on the length of k_n relative to the lengths of both k_l and k_s, we try to insert k_n into one of the 2×H rows generated by the hash functions and the probing pointers

The Incremental Updates

- the subroutines terminate successfully if we are able to insert k_n
- otherwise, we should either insert k_n into the auxiliary TCAM, or try a backtracking scheme like "Cuckoo hashing":
  - replace an existing prefix (k_y) in the hash table with k_n
  - try to reinsert k_y into the hash table recursively
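The backtracking option can be sketched as a bounded eviction loop. This is a generic cuckoo-style insert and not the authors' update subroutine: it ignores the probing pointers and the LPM ordering constraint, uses a loop in place of recursion, and the `max_kicks` bound and victim choice are assumptions.

```python
def cuckoo_insert(table, key, hash_functions, bucket_size, max_kicks=8):
    """Insert key, evicting and reinserting existing keys when rows are full.

    Returns None on success, or the key left without a place, which would
    then go to the auxiliary TCAM.
    """
    n_rows = len(table)
    current = key
    for kick in range(max_kicks + 1):
        rows = [h(current) % n_rows for h in hash_functions]
        for row in rows:
            if len(table[row]) < bucket_size:
                table[row].append(current)
                return None
        # All candidate rows are full: evict the oldest key of one of them
        # and take its place, then try to reinsert the evicted key.
        victim_row = rows[kick % len(rows)]
        victim = table[victim_row].pop(0)
        table[victim_row].append(current)
        current = victim
    return current
```

Bounding the number of evictions keeps the update time predictable; whatever key is still unplaced at the end is handed to the auxiliary TCAM.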

Formal problem definition
- Internet routers require wire-speed packet forwarding, while the sizes of the IP lookup tables are increasing
- near future: "Terabit" link rates will be available at affordable prices
- we need a scalable solution that fits both our current and our future needs

Hash-based solution
- hardware realization of a hash table!
  - it directly addresses all the severe shortcomings of the TCAM, as it uses RAM:
    - high bit density and very scalable
    - low power consumption
  - however:
    - it is hard to handle the overflow
    - there is no bound on the memory access time

CHAP vs. Multiple Hashing - Overflow
Overflow comparison between CHAP(H,H) and MH(H) for H = 1 to 4, at the same loading factor (RAM table aspect ratio).

CHAP vs. Multiple Hashing - ASST
ASST comparison between CHAP(H,H) and MH(H) for H = 1 to 4, at the same loading factor (RAM table aspect ratio).

Content-based Hash Probing
- notice that some rows incur overflow while others still have space
- we can keep some bits at the end of each row that work as pointers to rows that have empty space

Search in CHAP
- the underlying architecture reads a full row of the table into a buffer in one clock cycle
  - it uses parallel matching processors to determine the match, if any, in that bucket
- we measure the quality of the search by the Average Successful Search Time (ASST)
  - the average n# of rows accessed for a successful search