Exploiting Graphics Processors for High-performance IP Lookup in Software Routers Jin Zhao, Xinya Zhang, Xin Wang, Yangdong Deng, Xiaoming Fu IEEE INFOCOM.


Exploiting Graphics Processors for High-performance IP Lookup in Software Routers Jin Zhao, Xinya Zhang, Xin Wang, Yangdong Deng, Xiaoming Fu IEEE INFOCOM 2011

Contents
Introduction
GALE functionalities
–Lookup
–Update
Evaluation
Conclusion

Introduction
A core functionality of a router is
–To determine the next hop port
[Figure: Router, Routing Table, IP address, Next Hop]

Introduction
Two challenges
–Lookup : large number of queries per unit time
   Link speed
   Routing table size
–Update
   Addition / Deletion of mapping entries
   Modification of next hop information in existing entries

Introduction
Existing solutions
–Hardware-based
   Specialized hardware like TCAM
–Software-based
   Optimization for longest-prefix matching
   Modification or extension of data structure
Software router using GPUs
–"PacketShader"
–Assumption : routing tables are static

GPU-Accelerated Lookup Engine (GALE)
–Leverages the CUDA programming model for parallel lookups of the IP routing table
–O(1) time complexity for IP lookup : use of a direct table in GPU memory
–Route update operations also use parallelism

GPU-Accelerated Lookup Engine Architecture

GPU-Accelerated Lookup Engine Architecture (Cont'd)
–Two tables
   Traditional trie-based routing table
   "Direct Table"
–Division of roles and control between the tables
   Fast lookup in the direct table, which is controlled by GPU and CUDA
   Update-related tasks on the trie are handled by CPU code

Lookup in GALE
IP lookup operation
–Far more frequent than update operation
   Good target for the parallelism of graphics processing
–Use of 'direct table'
   Next hop information for all the possible IP prefixes in a single table
   Direct translation of IP addr. to memory address
   O(1) memory access / computational complexity
–# of entries
   2^32 = 4G possible IP prefix entries
   99% of IP prefixes are no longer than 24 bits
–Then space for 2^24 = 16M prefix entries is required
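The sizing argument above can be checked with a few lines of arithmetic. This is a minimal sketch only: the one-byte-per-entry next-hop encoding (`BYTES_PER_ENTRY`) is an assumption made for illustration and is not stated in the slides.

```python
# Entry counts for a direct table indexed by full 32-bit addresses
# versus one indexed by the leftmost 24 bits only.
FULL_ENTRIES = 2 ** 32      # 4,294,967,296 entries (4G)
DTABLE_ENTRIES = 2 ** 24    # 16,777,216 entries (16M)

# Assumption (not from the slides): each entry is a 1-byte next-hop id.
BYTES_PER_ENTRY = 1

full_mib = FULL_ENTRIES * BYTES_PER_ENTRY // 2 ** 20
dtable_mib = DTABLE_ENTRIES * BYTES_PER_ENTRY // 2 ** 20

print(full_mib, "MiB for a 32-bit index")
print(dtable_mib, "MiB for a 24-bit index")
```

Under that assumption the 24-bit table needs 16 MiB, which is small compared with the 1.2GB of GPU global memory listed in the experimental setup, whereas a full 32-bit table would need 4096 MiB.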

Lookup in GALE
IP lookup operation (cont'd)
–Direct table 'dtable' is stored in GPU memory
–IP prefix will be translated into a single integer
   ipaddr = a * 2^24 + b * 2^16 + c * 2^8 + d for address 'a.b.c.d'
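The translation of a dotted address into a single integer can be sketched as follows, assuming the usual weighting in which `a` is the most significant octet. The function names `ip_to_int` and `dtable_index` are hypothetical helpers introduced here, not names from the slides.

```python
def ip_to_int(addr: str) -> int:
    """Convert dotted-quad 'a.b.c.d' to a*2**24 + b*2**16 + c*2**8 + d."""
    parts = [int(p) for p in addr.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not a dotted-quad IPv4 address: {addr!r}")
    a, b, c, d = parts
    return (a << 24) | (b << 16) | (c << 8) | d


def dtable_index(ipaddr: int) -> int:
    """Index into a 2**24-entry direct table: the leftmost 24 bits."""
    return ipaddr >> 8


example = ip_to_int("10.0.0.1")
print(example, dtable_index(example))
```

Shifting and OR-ing is equivalent to the multiply-and-add form because each octet occupies its own 8-bit field.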

Lookup in GALE
Example
–Sending packet to /24
–ipaddr = 147 * 2^24 + 46 * 2^16 + 216 * 2^8 + 101 = 2,469,320,805
–index = leftmost 24 bits of ipaddr = 9,645,784
–dtable[] covers index 0 to index 16,777,215
–Looking for dtable[9,645,784]: interface to next hop is eth2
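The example lookup can be reproduced on the CPU as a rough sketch. The integer 2,469,320,805, the index 9,645,784 and the result eth2 come from the slide; the one-byte table encoding, the `NEXT_HOPS` mapping and the function names are assumptions for illustration. On a GPU each address in a batch would typically be handled by its own thread; here a plain loop stands in for that.

```python
DTABLE_ENTRIES = 1 << 24

# Assumption: one byte per entry, 0 meaning "no route".
dtable = bytearray(DTABLE_ENTRIES)

# Hypothetical mapping from stored ids to interface names.
NEXT_HOPS = {1: "eth1", 2: "eth2", 4: "eth4"}

ipaddr = 2_469_320_805       # integer form of the destination address
index = ipaddr >> 8          # leftmost 24 bits
dtable[index] = 2            # the slide's entry at this index is eth2


def lookup(addr: int):
    """One memory access: no trie traversal, no prefix comparison."""
    return NEXT_HOPS.get(dtable[addr >> 8])


def lookup_batch(addrs):
    """Each lookup is independent, so a batch parallelises trivially."""
    return [lookup(a) for a in addrs]


print(index, lookup(ipaddr))
```

Any address sharing the same leftmost 24 bits maps to the same entry, which is why the table needs no per-address state.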

Update in GALE
Routing table update operation
–Insertion of new routing table entry
–Modification of next hop for existing entry
–Removal of existing entry
Trie-specific operations involve trie traversal
–Many algorithms exist
'Direct table' operations
–No allocation/de-allocation of memory space
–Single prefix may be represented as a set of multiple 24-bit prefixes in the direct table
   /8 → /24 ~ /24
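How a shorter prefix expands into a contiguous run of 24-bit entries can be sketched as below. The helper name `prefix_to_range` and the 10.0.0.0/8 example are chosen here for illustration and do not appear in the slides.

```python
def prefix_to_range(prefix_int: int, length: int):
    """Return (start, end) direct-table indices covered by a prefix.

    prefix_int is the 32-bit integer form of the prefix address and
    length is the prefix length, which must not exceed 24 bits here.
    """
    if not 0 <= length <= 24:
        raise ValueError("this sketch only handles prefixes up to /24")
    span = 1 << (24 - length)                 # number of /24 entries covered
    start = (prefix_int >> 8) & ~(span - 1)   # clear the bits below the prefix
    return start, start + span - 1


# Illustrative example: 10.0.0.0/8 covers 2**16 consecutive /24 entries.
start, end = prefix_to_range(10 << 24, 8)
print(start, end, end - start + 1)
```

Because the covered entries are contiguous, an update becomes a range write with no memory allocation, and each entry in the range can be written independently.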

Update in GALE
Direct table operations : Insertion / Modification
–The two are the same operation
   To write the new next-hop information to the entries of the corresponding IP prefix(es)
   ltable stores the prefix length for each index
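One plausible way to use ltable during insertion is sketched below. The overwrite rule (write an entry only if the prefix that currently owns it is not longer than the new one) is an assumption about why prefix lengths are stored; the slides do not spell it out. The function name `insert` and the sample prefixes are likewise illustrative.

```python
ENTRIES = 1 << 24
dtable = bytearray(ENTRIES)   # next-hop id per index (assumed 1 byte each)
ltable = bytearray(ENTRIES)   # length of the prefix that wrote each index


def insert(prefix_int: int, length: int, next_hop: int) -> int:
    """Insert or modify a prefix of at most 24 bits; return entries written."""
    span = 1 << (24 - length)
    start = (prefix_int >> 8) & ~(span - 1)
    written = 0
    for i in range(start, start + span):   # iterations are independent
        if ltable[i] <= length:            # assumed rule: keep longer prefixes
            dtable[i] = next_hop
            ltable[i] = length
            written += 1
    return written


P16 = (10 << 24) | (1 << 16)               # 10.1.0.0/16, illustrative
first = insert(P16, 16, 2)                 # insertion of a /16
second = insert(10 << 24, 8, 1)            # a later /8 must not override it
third = insert(P16, 16, 3)                 # modification: same code path
print(first, second, third)
```

Insertion and modification share one code path, as the slide states, and since no iteration depends on another the loop body maps naturally onto one GPU thread per entry.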

Update in GALE
Direct table operations : Deletion
–Next-hop information is replaced with the next-hop information of the prefix of the parent node
   The parent node is obtained during the deletion process in the trie structure by the CPU
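Deletion can be sketched in the same style. The parent's next hop and prefix length are passed in as arguments because, per the slide, the CPU finds the parent in the trie. Restoring only entries whose stored length equals the deleted prefix's length is an assumption made here so that more specific prefixes survive; the names `fill` and `delete` are hypothetical.

```python
ENTRIES = 1 << 24
dtable = bytearray(ENTRIES)   # next-hop id per index (assumed 1 byte each)
ltable = bytearray(ENTRIES)   # length of the prefix that owns each index


def index_range(prefix_int: int, length: int) -> range:
    span = 1 << (24 - length)
    start = (prefix_int >> 8) & ~(span - 1)
    return range(start, start + span)


def fill(prefix_int: int, length: int, hop: int) -> None:
    """Set-up helper: write a prefix without overriding longer ones."""
    for i in index_range(prefix_int, length):
        if ltable[i] <= length:
            dtable[i], ltable[i] = hop, length


def delete(prefix_int: int, length: int, parent_hop: int, parent_len: int) -> int:
    """Hand the deleted prefix's entries back to its parent prefix."""
    restored = 0
    for i in index_range(prefix_int, length):
        if ltable[i] == length:            # assumed rule: skip longer prefixes
            dtable[i], ltable[i] = parent_hop, parent_len
            restored += 1
    return restored


P8 = 10 << 24                  # 10.0.0.0/8, illustrative
P16 = P8 | (1 << 16)           # 10.1.0.0/16
P24 = P16 | (5 << 8)           # 10.1.5.0/24
fill(P8, 8, 1)
fill(P16, 16, 2)
fill(P24, 24, 3)
restored = delete(P16, 16, parent_hop=1, parent_len=8)
print(restored)
```

In this run 255 of the 256 entries under the /16 revert to the /8's next hop, while the single entry owned by the /24 is left untouched.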

Update in GALE
Example
–Update next hop for /16 as eth1
–dtable[] range: start = 9,645,784, end = 9,645,784 + 2^8 – 1 = 9,646,039 (for 8 bits)
–Update entries from start (index 9,645,784) to end (index 9,646,039) so that their next hop is eth1

Experiments
Setup
–Routing dataset from FUNET, RIS
–Implementation on desktop PC ($1,122)
   2.66GHz i5 750 (quad-core)
   4GB DDR3 SDRAM
   NVIDIA GeForce 470 ($428)
   –1.2GB of global memory
   –448 stream processors (GPU cores)

Experiments
Performance
–Insert / Modify / Delete
   Due to fewer dependencies in updating the direct table

Experiments
Performance
–Lookup performance comparison to trie

Conclusion
GALE exploits massive parallelism to speed up routing table lookups
The direct table provides O(1) lookup complexity for the routing table
The various routing table updates can also be done with the GPU's parallelism