An open source user space fast path TCP/IP stack and more…


1 An open source user space fast path TCP/IP stack and more…

2 Enter OpenFastPath!
A TCP/IP stack that lives in user space:
- is optimized for scalability, throughput and latency
- uses OpenDataPlane (ODP) to access network hardware
- works with the Data Plane Development Kit (DPDK)
- runs on ARM, x86, MIPS and PPC hardware
- runs natively, in a guest or in the host platform

A modular protocol library for termination and forwarding use cases:
- provides a framework extensible with new protocols
- augments existing, or implements missing, HW acceleration

The OpenFastPath project is a true open source project:
- uses well-known open source components
- open for all to participate, with no lock-in to HW or SW
- Nokia, ARM and Enea are key contributors

3 Features implemented
Fast path protocol processing:
- Layer 2: Ethernet, VLAN
- Layer 3: IPv4 and IPv6 forwarding and routing, ARP/NDP, IPv4 fragmentation and reassembly, VRF for IPv4, IGMP and multicast, GRE and VXLAN tunneling
- Layer 4: UDP, TCP and ICMP protocols
- IP and ICMP implementations pass Ixia conformance tests
- IP and UDP implementations have been optimized for performance, giving linear scalability
- TCP implementation optimization is in progress

Management and integration:
- Command line interface: packet dumping and other debugging; statistics, ARP, route and interface printing; configuration of routes and interfaces with VRF support
- Integration with the Linux IP stack through a TAP interface; routes and MACs are kept in sync with Linux
- Integrated with the NGINX web server
- Binary compatibility with Linux applications: no recompilation is needed to use OFP

4 OpenFastPath System View
[Block diagram: the user application (termination or forwarding) sits on top of the OFP library, which exposes the init, ingress, egress, socket and hook APIs and contains the route tables, interface management and PKTIO modules; ODP/DPDK provides packet I/O to the NICs, and the Linux host OS handles the slow path via netlink and a TAP interface.]

Here is a block diagram view of the OFP system. The PKTIO module handles communication to and from the Linux slow path as well as packet egress. The user/default dispatcher block implements the dispatcher functionality that reads packets through the ODP APIs, for example:

pkt_cnt = odp_pktio_recv(pktio, pkt_tbl, OFP_PKT_BURST_SIZE);
or
buf = odp_schedule(&queue, ODP_SCHED_WAIT);

The dispatcher has been placed outside OFP to give the user control over which API to use to get packets from ODP. Depending on the underlying ODP implementation and hardware, scheduler, burst or polling mode can be selected. OFP works together with the Linux IP stack.
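Below is a minimal sketch of a burst-mode dispatcher loop of the kind described above, assuming the application polls a pktio directly and then hands each packet to OFP. The two ODP calls are the ones quoted on the slide; ofp_packet_input(), ofp_eth_vlan_processing and the header name are assumptions and may differ between OFP/ODP versions.

/* Burst-mode dispatcher sketch: poll packets from ODP and feed them to OFP.
 * ofp_packet_input() and ofp_eth_vlan_processing are assumed OFP entry
 * points; check the headers of your OFP release. */
#include <odp_api.h>
#include "ofp.h"                 /* assumed OFP header name */

#define OFP_PKT_BURST_SIZE 16    /* burst size used on the slide */

static void dispatcher_loop(odp_pktio_t pktio)
{
        odp_packet_t pkt_tbl[OFP_PKT_BURST_SIZE];
        int pkt_cnt, i;

        for (;;) {
                /* Burst/poll mode: read up to a burst of packets directly
                 * from the packet I/O interface. */
                pkt_cnt = odp_pktio_recv(pktio, pkt_tbl, OFP_PKT_BURST_SIZE);
                if (pkt_cnt <= 0)
                        continue;

                for (i = 0; i < pkt_cnt; i++)
                        /* Hand each packet to OFP protocol processing,
                         * starting at Ethernet/VLAN parsing. */
                        ofp_packet_input(pkt_tbl[i], ODP_QUEUE_INVALID,
                                         ofp_eth_vlan_processing);
        }
}

In scheduler mode the loop would instead call odp_schedule(), as sketched under the next slide.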

5 OpenFastPath Multicore System View
[Block diagram: N independent dispatcher threads, each in a single thread context with its own ingress API and socket callback/hook APIs, run user termination or forwarding code on top of one OFP SMP multicore library instance; core 0 runs Linux (slow path, netlink, route tables, TAP, user configuration code) while cores 1..N run fast path processing over ODP/DPDK.]

Now let's look at a multicore OFP system. One core (#0) is required for Linux system calls, mainly for the CLI, for route copying, and for communication with the Linux kernel through the TUN/TAP interface. An additional Linux core might be needed for the slow path if there is a lot of slow path traffic. The other cores are allocated by ODP for fast path processing. The user configuration code is a management thread that runs on the Linux core; it is started by ODP and shares the same memory as the fast path cores.

OFP is a multithreaded multicore library, so a single OFP instance runs across all data plane cores. There are, however, separate independent dispatcher threads, which allows a different dispatcher on each core. On the cores allocated to fast path processing, ODP starts only one thread, in which the dispatcher, OFP and the user application code all run; this is the case when using the hook or callback APIs. If the non-blocking legacy socket APIs are used, application and stack can likewise share the same thread. An NGINX worker process, for example, works over OFP by scheduling packets, processing them, and then consuming them through non-blocking APIs such as select() and read().
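To complement the burst-mode loop shown under the previous slide, here is a hedged sketch of the per-core worker in scheduler mode: each fast path core runs one thread in which the dispatcher pulls events from the ODP scheduler and feeds them to OFP. As before, ofp_packet_input() and ofp_eth_vlan_processing are assumed OFP entry points.

/* Scheduler-mode worker sketch: one such thread runs per fast path core. */
#include <odp_api.h>
#include "ofp.h"                 /* assumed OFP header name */

static int worker_thread(void *arg)
{
        odp_queue_t in_queue;
        odp_event_t ev;

        (void)arg;

        for (;;) {
                /* Block until the ODP scheduler hands this core an event. */
                ev = odp_schedule(&in_queue, ODP_SCHED_WAIT);

                if (odp_event_type(ev) != ODP_EVENT_PACKET) {
                        odp_event_free(ev);
                        continue;
                }

                /* OFP processing, and any application hooks or socket
                 * callbacks, run right here in the same thread context. */
                ofp_packet_input(odp_packet_from_event(ev), in_queue,
                                 ofp_eth_vlan_processing);
        }
        return 0;
}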

6 Ingress Packet Processing
[Flow diagram: packets enter through the ingress API into Ethernet/VLAN processing; the L3 stage handles IPv4/v6 reassembly, GRE and VXLAN decapsulation (with loopback from VXLAN), ICMP, NDP, IGMP, routing, ARP and MAC table updates, and the IPv4/v6 local and forward hook APIs; the transport (L4) classifier feeds UDP and TCP input, which deliver to the application through the BSD socket API or the callback API. Packets pre-classified by hardware (IP, UDP, TCP, ...) can skip stages; unknown traffic falls back to the slow path.]

This is a view of the OFP ingress packet flow, showing the different modules involved in packet processing. The dark red boxes represent OFP application APIs or BSD socket APIs. OFP can leverage hardware classification: pre-classified packets simply bypass the stages that have already been done in hardware, enabling higher throughput. Packets with unsupported protocols or protocol extensions are sent to the Linux slow path. Notice the relative simplicity compared to the complexity of the Linux TCP/IP stack. A hedged sketch of a packet hook at the local-delivery stage follows below.
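The hook APIs in the diagram let the application see packets at specific points in the ingress flow. The following is only a sketch of what such a hook might look like; the hook signature, registration field and constant names (pkt_hook[], OFP_HOOK_LOCAL, OFP_PKT_CONTINUE, OFP_PKT_DROP) are assumptions inferred from the slide, not a verified OFP API reference.

/* Hook sketch: inspect IPv4 packets destined to the local stack and decide
 * whether normal OFP processing should continue. All OFP names here are
 * assumed; only the ODP accessor is a documented ODP call. */
#include <odp_api.h>
#include "ofp.h"                 /* assumed OFP header name */

static enum ofp_return_code my_local_hook(odp_packet_t pkt, void *arg)
{
        uint32_t l3_len = 0;
        void *l3 = odp_packet_l3_ptr(pkt, &l3_len);   /* in-place L3 view */

        (void)arg;

        if (l3 == NULL || l3_len < 20)   /* 20 bytes = minimal IPv4 header */
                return OFP_PKT_DROP;     /* assumed: OFP discards the packet */

        return OFP_PKT_CONTINUE;         /* assumed: continue ingress flow */
}

/* Registration is typically done at init time, e.g. (assumed field names):
 *     params.pkt_hook[OFP_HOOK_LOCAL] = my_local_hook;
 */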

7 Egress Packet Processing
[Flow diagram: the BSD socket or egress API feeds UDP and TCP output, then IPv4/IPv6 output with ICMP error generation, IPv4 fragmentation, GRE tunneling and VXLAN encapsulation, and finally Ethernet/VLAN framing towards the NIC.]

The egress packet flow is even leaner, to maximize throughput. It also supports the BSD socket APIs or the OFP packet APIs.

8 Optimized OpenFastPath socket APIs
New zero-copy APIs optimized for single-thread run-to-completion environments:
- UDP send: optimized send function taking a packet container (packet + metadata)
- UDP receive: a callback function can be registered to read on a socket; it receives a packet container and a socket handle
- TCP accept event: a callback function can be registered for the TCP accept event; it receives a socket handle
- TCP receive: a callback function can be registered to read on a socket; it receives a packet container and a socket handle

Standard BSD socket interface:
- for compatibility with legacy Linux applications

Traditional BSD socket communication typically involves a copy operation from the IP stack to the application. This has a major performance impact, and to address it OFP implements new zero-copy APIs optimized for the run-to-completion environment that ODP provides. A callback API allows the application to register callbacks for UDP and TCP receive as well as for the TCP accept event; the callback function receives an ODP packet container and a socket handle directly, without a copy operation. UDP send has also been optimized as a zero-copy API. Legacy applications can still use the standard BSD sockets with good performance, although not as good as through the callback APIs. A hedged sketch of both styles follows below.
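The sketch below illustrates the two styles side by side. The BSD-style calls (ofp_socket(), ofp_bind() and the OFP_* constants) mirror the familiar socket API with an ofp_ prefix; the callback registration call and callback signature shown here (ofp_udp_set_recv_callback(), the shape of udp_rcv_cb()) are hypothetical stand-ins for whichever zero-copy registration mechanism your OFP release provides.

#include <odp_api.h>
#include "ofp.h"                 /* assumed OFP header name */

/* Hypothetical zero-copy receive callback: gets the ODP packet container
 * and the socket handle directly, with no copy into application buffers. */
static void udp_rcv_cb(odp_packet_t pkt, int sockfd)
{
        uint32_t len = odp_packet_len(pkt);   /* inspect payload in place */

        (void)sockfd;
        (void)len;

        /* Packet ownership/lifetime rules are version specific; here the
         * callback simply consumes and frees the packet. */
        odp_packet_free(pkt);
}

static int setup_udp_socket(void)
{
        struct ofp_sockaddr_in addr = { 0 };
        int fd;

        /* Standard BSD-style usage through the ofp_ compatibility API. */
        fd = ofp_socket(OFP_AF_INET, OFP_SOCK_DGRAM, OFP_IPPROTO_UDP);

        addr.sin_family = OFP_AF_INET;
        addr.sin_port = odp_cpu_to_be_16(5001);
        addr.sin_addr.s_addr = OFP_INADDR_ANY;
        ofp_bind(fd, (struct ofp_sockaddr *)&addr, sizeof(addr));

        /* Hypothetical registration of the zero-copy callback; consult the
         * OFP headers for the actual registration API in your release. */
        ofp_udp_set_recv_callback(fd, udp_rcv_cb);

        return fd;
}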

9 OFP Performance on ARM
The benchmarking target is to measure external-to-host throughput, latency, jitter and packet loss values of the DUT.
- The test application is run starting from core number 4, with a variable number (1 to 4) of RX/TX queues whose interrupt handling is mapped to the corresponding cores.
- Using a properly configured Ixia test environment, the UDP baseline testing scenarios are run with the defined network frame sizes (64, 128, 256, 512, 1024, 1248 and 1518 bytes, with corresponding MTU) to assess and profile baseline network impact.
- Traffic is sent incrementally to different IP addresses and ports.
- Optionally, scheduling latency is monitored by running cyclictest while traffic is flowing.
- Optionally, CPU load is monitored by running top while traffic is flowing.
- UDP echo test: the receiver uses one IP address and as many UDP ports as cores; the sender uses as many sender IP addresses as ports/cores, with UDP ports in the range 2048-3072.
- IP forward test: the receiver uses an IP address range of 4048 and 1000 UDP ports continuously; the sender always uses the same IP address and UDP port.

10 OFP Performance on x86
Intel Xeon E v3 processor (turbo disabled), two NICs with a modified netmap ixgbe driver (12 RX/TX queue pairs), totaling 4x10 Gbps ports.

11 NGINX – OFP TCP/IP integration
[Side-by-side diagram: with the standard TCP/IP stack, NGINX worker processes reach the NIC RX/TX queues through the kernel TCP/IP stack, paying for context switches and kernel locks; with OFP, each NGINX worker process uses the OFP BSD socket API on the OFP TCP/IP stack library, which drives the NIC directly through the ODP/DPDK packet I/O API.]

The OFP path avoids context switches, avoids locks and streamlines the packet path, giving better scalability, throughput and latency.

12 Binary compatibility with Linux applications
[Side-by-side diagram: a standard, unmodified application binary normally uses the Linux kernel TCP/IP stack; started instead as "./ofp_netwrap.sh ./binary", the same binary gets two libraries LD_PRELOADed: libofp_netwrap_proc starts and configures the OFP TCP/IP stack (libofp on top of the ODP/DPDK library), and libofp_netwrap_crt overloads the socket API with the OFP socket API, so packet flows go directly between the application and the NIC hardware.]
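The mechanism is standard LD_PRELOAD symbol interposition. Below is a conceptual sketch of how a wrapper like libofp_netwrap_crt can redirect one BSD call to OFP without recompiling the application; the real netwrap library covers many more calls and translates constants between the libc and OFP definitions. ofp_socket() and OFP_AF_INET are assumed OFP names.

/* Conceptual LD_PRELOAD wrapper: route AF_INET sockets to OFP, fall back
 * to the real libc implementation for everything else (e.g. AF_UNIX). */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <sys/socket.h>
#include "ofp.h"                 /* assumed OFP header name */

int socket(int domain, int type, int protocol)
{
        static int (*libc_socket)(int, int, int);

        if (libc_socket == NULL)
                libc_socket = (int (*)(int, int, int))
                        dlsym(RTLD_NEXT, "socket");

        if (domain == AF_INET)
                /* A real wrapper would also translate the type/protocol
                 * constants to their OFP_* equivalents. */
                return ofp_socket(OFP_AF_INET, type, protocol);

        return libc_socket(domain, type, protocol);
}

The launcher script preloads this kind of wrapper together with the library that brings up and configures the OFP stack, so the stack is ready before the application's socket calls arrive.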

13 What's next? - Get involved!
- Download the source code
- Check out www.openfastpath.org for more information about the project
- Subscribe to the mailing list
- Ping us on our freenode chat: #OpenFastPath

14 For additional information, please visit www.openfastpath.org

