Offloading Linux LAG devices Via Open vSwitch and TC


John Hurley, Open vSwitch 2018 Fall Conference

Offloading Flows With TC Flower

Diagram labels: ovs-vswitchd; VM 1; VM 2; TC Flower; OVS Datapath; NIC Driver (NFP); nfp_p0; nfp_p1; nfp_v0.0

    ovs-dpctl dump-flows
    in_port(2),eth_type(0x0800),ipv4(proto=6,frag=no), packets:98, bytes:14316, used:5.351s, actions:3

    tc -s filter show dev nfp_v0.0 ingress
    filter protocol ip pref 1 flower chain 0
    filter protocol ip pref 1 flower chain 0 handle 0x1
      eth_type ipv4
      ip_proto tcp
      ip_flags nofrag
      in_hw
            action order 1: mirred (Egress Redirect to device nfp_p0) stolen
            index 1 ref 1 bind 1 installed 12 sec used 11 sec
            Action statistics:
            Sent 14316 bytes 98 pkt (dropped 0, overlimits 0 requeues 0)
            backlog 0b 0p requeues 0
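The `in_hw` flag and the action statistics in the `tc -s filter show` output are what indicate that the flow was offloaded and is being counted. As a minimal sketch, the following Python reads that text and pulls out those fields. The helper name `summarize_filter` is hypothetical, and the line breaks in the sample are an assumed layout of the output shown on the slide.

```python
import re

# Sample text taken from the slide; the line layout is an assumption.
SAMPLE = """\
filter protocol ip pref 1 flower chain 0 handle 0x1
  eth_type ipv4
  ip_proto tcp
  ip_flags nofrag
  in_hw
        action order 1: mirred (Egress Redirect to device nfp_p0) stolen
        index 1 ref 1 bind 1 installed 12 sec used 11 sec
        Action statistics:
        Sent 14316 bytes 98 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0
"""


def summarize_filter(text):
    """Hypothetical helper: extract offload state and counters from tc text."""
    tokens = text.split()
    if "not_in_hw" in tokens:
        in_hw = False
    elif "in_hw" in tokens:
        in_hw = True
    else:
        in_hw = None  # the flag was not reported at all
    sent = re.search(r"Sent (\d+) bytes (\d+) pkt", text)
    redirect = re.search(r"Egress Redirect to device (\S+)\)", text)
    return {
        "in_hw": in_hw,
        "bytes": int(sent.group(1)) if sent else None,
        "packets": int(sent.group(2)) if sent else None,
        "redirect_dev": redirect.group(1) if redirect else None,
    }


summary = summarize_filter(SAMPLE)
print(summary)
```

The byte and packet counts extracted here match the `packets:98, bytes:14316` reported by `ovs-dpctl dump-flows` for the same flow.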

Offload Performance on SmartNIC

- VXLAN encapsulated traffic
- Open vSwitch rules offloaded to SmartNIC via TC
- Traffic sent in physical port
- Forwarded to VM and bounced back on different port

LAG Devices and Representors

- Link Aggregation (LAG): combine multiple ports to act as a single port
  - Load balance to increase bandwidth
  - Active/backup failover
- Open vSwitch bonds
  - Combination of OvS kernel, OvS bond, LACP, high throughput - link flapping
- Linux LAG devices
  - Linux bond, Team
  - Deployed in OpenStack environments
- What if a Linux LAG upper device has ‘offloadable’ lower devices and is added to an OVS bridge?

    ip link show
    35: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    37: nfp_p0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP
    38: nfp_p1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP

Egress Offload

Diagram labels: Kernel; TC; Bond; Team; NFP; NETDEV_CHANGELOWERSTATE; NETDEV_CHANGEUPPER

- NETDEV_CHANGELOWERSTATE: Record active/backup port states
- NETDEV_CHANGEUPPER: Track new LAG upper and lower devs
- Packet hash on NIC gets egress port

SmartNIC:
    Group 1: nfp_p0 nfp_p1
    Group 2: …...
    match: nfp_v0.0, ipv4, action: Group 1

    tc -s filter show dev nfp_v0.0 ingress
    filter protocol ip pref 49152 flower chain 0 handle 0x1
      eth_type ipv4
      in_hw
            action order 1: mirred (Egress Redirect to device bond0) stolen
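The egress scheme on this slide (a group of member ports, kept current by the two netdev notifications, with a packet hash choosing the egress port) can be sketched in Python. This is an illustration only: the `LagGroup` class and its method names are hypothetical, CRC32 stands in for whatever hash the NIC uses, and treating a newly linked port as active is an assumption.

```python
import zlib


class LagGroup:
    """Hypothetical model of one LAG group entry held on the SmartNIC."""

    def __init__(self, ports):
        # Port name -> True if the port may currently transmit.
        self.active = {port: True for port in ports}

    def on_change_lower_state(self, port, tx_enabled):
        # Plays the role of NETDEV_CHANGELOWERSTATE: record port state.
        self.active[port] = tx_enabled

    def on_change_upper(self, port, linking):
        # Plays the role of NETDEV_CHANGEUPPER: track group membership.
        if linking:
            self.active.setdefault(port, True)  # assumed active on join
        else:
            self.active.pop(port, None)

    def egress_port(self, flow_key):
        # Hash the flow key over the ports that are currently usable.
        candidates = sorted(p for p, up in self.active.items() if up)
        if not candidates:
            return None
        # CRC32 is a stand-in; the real NIC hash is not described here.
        return candidates[zlib.crc32(flow_key) % len(candidates)]


group = LagGroup(["nfp_p0", "nfp_p1"])
flow = b"10.0.0.1:1234->10.0.0.2:80/tcp"
print(group.egress_port(flow))

# After a failover notification, every flow lands on the remaining port.
group.on_change_lower_state("nfp_p0", False)
print(group.egress_port(flow))
```

Hashing on a per-flow key keeps packets of one flow on one port, which is why the selection is deterministic for a given key and member set.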

Ingress Offload

- SmartNIC is not aware of LAG devices (in-kernel representation)
- If ‘bond0’ contains offloadable ports there is a need to:
  - Distribute filters to all lower devices
  - Combine stats from all lower device offloads into the LAG upper device/flower rule
- No offload callback in the LAG drivers
- Difficult for SmartNIC driver to track changes
- Are there any TC features we can make use of to solve this?

    tc -s filter show dev bond0 ingress
    filter protocol ip pref 49152 flower chain 0 handle 0x1
      eth_type ipv4
      not_in_hw
            action order 1: mirred (Egress Redirect to device nfp_v0.0) stolen

TC Shared Blocks

- Introduced in Kernel 4.16
- Each ingress qdisc has its own set of chains and filters called ‘blocks’
- Shared blocks allow multiple qdiscs/netdevs to use the same chains/filters
- Prevent duplicating rules which may not scale on, say, TCAM device offload
- Each qdisc/netdev on block reports same filters and stats

Diagram labels: Block X; nfp_p0 ingress qdisc; nfp_p1 ingress qdisc

    tc qdisc add dev nfp_p0 ingress_block 22 ingress
    tc qdisc add dev nfp_p1 ingress_block 22 ingress
    tc filter add block 22 protocol ip parent ffff: flower ip_proto tcp skip_sw action drop

TC Shared Block as LAG Representations

- Grouping LAG lower devices in shared blocks along with their upper device
- LAG netdev hierarchy not influenced outside the TC layer
- All lower devices receive same filters applied to master
- Effective distribution of offloaded filters - all offload ports get callback
- Stats correctly handled by default

Diagram labels: Block X; bond0 upper netdev; nfp_p0 lower netdev; nfp_p1 lower netdev
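To make the two properties above concrete (one filter on the block reaches every offload port, and the block reports a single combined counter), here is a toy Python model. It is not the kernel data structure: the `SharedBlock` class, its methods, and the per-port hit counts of 60 and 38 are all invented for illustration.

```python
class SharedBlock:
    """Toy model of a TC shared block; not the kernel implementation."""

    def __init__(self, block_id):
        self.block_id = block_id
        self.hw_ports = []   # offloadable lower netdevs bound to the block
        self.counters = {}   # filter handle -> {port: packet count}

    def bind_port(self, port):
        self.hw_ports.append(port)

    def add_filter(self, handle):
        # One filter added to the block is handed to every bound port.
        self.counters[handle] = {port: 0 for port in self.hw_ports}
        return list(self.hw_ports)

    def record_hw_hits(self, handle, port, packets):
        # Each port counts its own hardware hits for the filter.
        self.counters[handle][port] += packets

    def stats(self, handle):
        # The block reports one combined counter per filter.
        return sum(self.counters[handle].values())


block = SharedBlock(22)
for port in ("nfp_p0", "nfp_p1"):
    block.bind_port(port)

offloaded_to = block.add_filter(0x1)
block.record_hw_hits(0x1, "nfp_p0", 60)   # made-up per-port counts
block.record_hw_hits(0x1, "nfp_p1", 38)
print(offloaded_to, block.stats(0x1))
```

Because the upper device shares the same block, a query against it sees the summed counter without the LAG driver needing an offload callback of its own.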

TC Shared Block Offload and Re-offload

- TC shared blocks call offload hook for each netdev per filter
- Cannot (in 4.16 - 4.18) add new qdiscs/netdevs to block if it has offloaded rules
- Filter deletion offload hooks are only triggered on block deletion
- Removing a netdev from a shared block may still leave offloaded rules
- LAG devices by their nature require flexible addition/removal of netdevs
- [PATCH 0/7] net: sched: support replay of filter offload when binding to block (kernel 4.19)
  - Add the ability to replay offloaded filters when a new callback is registered
  - Replay ‘delete’ filter messages for each netdev on block withdrawal
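The replay behaviour described by the patch series can be sketched as follows: a callback that registers late is sent the filters already on the block, and a callback that is withdrawn is sent a delete for each of them. This Python model is a simplification under assumed names (`ReplayBlock`, `bind`, `unbind`) and does not mirror the kernel API.

```python
class ReplayBlock:
    """Toy model of filter replay on a shared block; names are hypothetical."""

    def __init__(self):
        self.filters = []     # filter handles currently on the block
        self.callbacks = {}   # netdev name -> callback(op, handle)

    def add_filter(self, handle):
        self.filters.append(handle)
        for callback in self.callbacks.values():
            callback("add", handle)

    def bind(self, dev, callback):
        # Replay existing filters so a late joiner ends up fully offloaded.
        self.callbacks[dev] = callback
        for handle in self.filters:
            callback("add", handle)

    def unbind(self, dev):
        # Replay deletes so no offloaded rule is left on the departing netdev.
        callback = self.callbacks.pop(dev)
        for handle in self.filters:
            callback("del", handle)


log = {"nfp_p0": [], "nfp_p1": []}
block = ReplayBlock()
block.bind("nfp_p0", lambda op, h: log["nfp_p0"].append((op, h)))
block.add_filter(0x1)

# nfp_p1 joins the LAG after the rule was offloaded, then leaves again.
block.bind("nfp_p1", lambda op, h: log["nfp_p1"].append((op, h)))
block.unbind("nfp_p1")
print(log)
```

With this in place, adding or removing a LAG member does not require the block to be emptied first, which is the flexibility the slide asks for.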

Working with Open vSwitch

- Open vSwitch 2.10: [Patch 0/6] offload Linux LAG devices to the TC datapath
  - Add shared block ID support to OVS-TC API
  - Track Linux kernel netdevs and record LAG info
  - If a LAG upper dev is added to the OVS bridge, assign its qdisc a unique block ID
  - If a lower dev’s related upper dev is on the OVS bridge then associate it with the upper device’s block

Diagram labels: User Space: ovs-vswitchd; Kernel: TC Datapath, Block X, Upper Dev Qdisc, Lower Dev 1 Qdisc, Lower Dev 2 Qdisc, NFP Driver; SmartNIC: NFP SmartNIC, Lower Device 1 filters/stats, Lower Device 2 filters/stats
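The two block-assignment rules above can be sketched in Python. This is not Open vSwitch code: the `LagTracker` class, its method names, and the sequential block IDs starting at 1 are assumptions made for illustration.

```python
import itertools


class LagTracker:
    """Hypothetical sketch of the block ID bookkeeping described above."""

    def __init__(self):
        self._next_id = itertools.count(1)  # assumed ID allocation scheme
        self.upper_of = {}   # lower dev -> its LAG upper dev
        self.block_of = {}   # dev -> shared block ID

    def set_upper(self, lower, upper):
        # Record LAG info; join the upper's block if it is already bridged.
        self.upper_of[lower] = upper
        if upper in self.block_of:
            self.block_of[lower] = self.block_of[upper]

    def add_upper_to_bridge(self, upper):
        # A LAG upper dev on the bridge gets a unique block ID, and any
        # lower devs already known for it are associated with that block.
        block_id = next(self._next_id)
        self.block_of[upper] = block_id
        for lower, owner in self.upper_of.items():
            if owner == upper:
                self.block_of[lower] = block_id
        return block_id


tracker = LagTracker()
tracker.set_upper("nfp_p0", "bond0")     # enslaved before bridging
tracker.add_upper_to_bridge("bond0")
tracker.set_upper("nfp_p1", "bond0")     # enslaved after bridging
print(tracker.block_of)
```

Both orderings (lower device known before or after the upper device joins the bridge) end with the lower device on the upper device's block.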

Thank You