Presentation is loading. Please wait.

Presentation is loading. Please wait.

Pluginizing QUIC Quentin De Coninck, François Michel, Maxime Piraux, Florentin Rochet, Thomas Given-Wilson, Axel Legay, Olivier Pereira, Olivier Bonaventure.

Similar presentations


Presentation on theme: "Pluginizing QUIC Quentin De Coninck, François Michel, Maxime Piraux, Florentin Rochet, Thomas Given-Wilson, Axel Legay, Olivier Pereira, Olivier Bonaventure."— Presentation transcript:

1 Pluginizing QUIC Quentin De Coninck, François Michel, Maxime Piraux, Florentin Rochet, Thomas Given-Wilson, Axel Legay, Olivier Pereira, Olivier Bonaventure SIGCOMM 2019 UCLouvain, Belgium Welcome to this presentation about Pluginizing QUIC Made at UCLouvain Before talking about pluginizing, let’s explain what is QUIC

2 QUIC Currently under standardization at IETF
Providing TCP/TLS1.3 service atop UDP (Nearly) everything is encrypted Reliable, in-order, multistream data transfer Frame-based protocol Large companies are deploying both client and server sides Limited interoperability today IP TCP TLS HTTP/2 IP UDP QUIC HTTP/3 TLS H F F F New protocol being standardized at IETF TCP/TLS1.3 service, with reliable, in-order transfer like TCP and multistream support like SCTP for the transfer of several objects for HTTP use cases Frame-based protocol, a packet contains several smaller sub-packets being the core messages of the protocol It’s already there, with large deployments of Google, FB, Cloudflare and so on, and most of these companies implement both client and server sides. Used between their services directly, so there is some limited interoperability. One of the reasons of this rapid development is the application requirement evoluation.

3 Application Requirements Evolve
Simple reliable file transfer Mobility, (live) video streaming Latency critical, partially unreliable We started from simple reliable file transfers We have now mobile devices, with possibly changing network conditions, streaming (possibly live) videos We also have the Voice over IP and video conference applications, where latency is key and partial unreliability techniques apply, which might change core elements of the protocol. To all these application needs, the transport protocols should provide adapted services. This is typically done through extensions. Transport protocols need to adapt!

4 Negotiating Extensions: The Human Protocol
Require support from both sides: delayed deployment How could I learn Chinese? Hello, I speak [French, English, Dutch] Hi, I understand [Chinese, English] Ok, let’s speak English! To understand how transport protocols negotiate extensions, consider this fictive human protocol. Each of the participants advertises their supported languages, trying to find their common ones. Here, they both agree on the English for their discussion. Now, if one participant is willing to learn the other’s language, how can it do? Currently, it needs to update its own code to learn it. The issue is that typically, extensions require support from both sides, and this delays considerably the deployment of new extensions.

5 Adding Monitoring to an Implementation
Clients seem to have poor performance Client App A A QUIC lib B High-performance, native server implementation Client side based on open-source 3rd-party library But not merged by B A opens Pull Request to extend the library Get monitoring information for some connections A Activate monitoring with specific filters No, thanks. To better illustrate the code update issue, consider the following situation. The company A offers some service where the server is maintained by the company and wants to deploy it over mobile devices. For this, it creates an application and relies on the well-deployed open-source QUIC library maintained by the company B. At some point, A suspects that the performance of the client is low, and wants to collect some monitoring information from the QUIC stack to give them back to the server. As the project is open source, it opens a Pull Request to the library to activate the monitoring features, and also adds some filtering techniques, e.g., to get info on 0.5% of the connections. However, this pull request might not be merged by B because it might not need it and might not be willing to support additional unwanted features that might turn its stack into a gazworks. How could we then easily deploy new features into transport protocol implementations? B

6 What If We Could Have... A base, simple implementation enabling
Personalized Transport Functionalities Per-Connection Deployment Actually, what if we had a base, simple implementation providing support for new extensions on a per-connection basis? This is our proposal in our paper, Pluginized QUIC

7 Agenda Motivations Pluginized QUIC Design
Evaluating PQUIC with Use Cases Conclusion and Future Works Now that we just introduced the motivations behind PQUIC, we will first describe its design. We will then present a few use cases that will be used as our base for the evaluation of PQUIC Finally, we will conclude and present possible future works

8 Isolation between connections
Pluginized QUIC Revisit the structure of protocol implementations Transport protocol = set of basic functions (protocol operations) Plugin: new set of protocol functions (modified or added) Unreliable Message Add new functions Change algorithms RTO Computation Transport ... RTO Computation Header Preparation Add to send buffer Dynamic per-connection customization ... C1 C2 RTO Computation Header Preparation Add to send buffer Unreliable Message Isolation between connections Different algorithms ... The idea of Pluginized QUIC is to revisit how protocol implementations are structured In particular, the transport protocol is now viewed as a set of basic functions which can now be easily mapped to implementation methods For instance, in the base implementation, we find operations for RTO computations, for the preparation of header or to handle new data from the application. In this canvas, we can insert plugins, which are a set of modified or added protocol functions In PQUIC, a plugin can change the algorithms, for instance the RTO computation, or even add new functions such as the support of unreliable messages. Our PQUIC provides dynamic per-connection customization allowing different algorithms on different connections, thanks to the isolation between them.

9 Exchanging Plugins: 1st Connection
Let’s monitor the client state. Exchanging Plugins: 1st Connection Let’s request monitoring Initial: Client Hello - “Hey, I support multipath” Initial: Server Hello - “I want to inject monitoring” Encrypted - PLUGIN_REQUEST(monitoring) Bytecode Encrypted - PLUGIN(monitoring) ... PQUIC also enables the exchange of extensions over its connections. To illustrate this, reconsider the case where the server wants to monitor the internal client state. During the connection handshake, there is a negotiation about the supported extensions, similarly to the human protocol. If the client does not support the extension, PQUIC enables exchanging it over the connection as follow using dedicated frames. The exchanged plugin correspond to the bytecode that will be plugged to the implementation.

10 Exchanging Plugins: Next Connections
Let’s use monitoring Initial: Client Hello - “Hey, I support multipath and monitoring” Initial: Server Hello - “Let’s use monitoring” Added by the monitoring plugin Encrypted - STREAM, STAT(info about RTT, reordering,...) Encrypted - STREAM ... If both hosts support now the extension, the plugin can be applied to the connection. Here the STAT frame reports back to the server internal information about the QUIC client. This was added by the monitoring plugin, i.e., it was not in the core implementation.

11 Revisiting Protocol Implementation Design
What do we need? Running plugins in a safe environment Identifying protocol operations Providing an API to the protocol plugins Now that we have a mechanism to exchange plugins, what is needed at the PQUIC implementations to run the bytecode? Actually, we require three things: …

12 Running Plugins Plugin = set of bytecodes Rely on a virtual machine
1 bytecode → 1 function Hardware/OS independent Kept under control Rely on a virtual machine Isolate bytecode from host implementation Virtual Machine Actually, plugins are sets of bytecode, where one bytecote is dedicated to one particular function. It is hardware and OS independent, which is needed due to the heterogeneity in the available devices, such as desktops, smartphones or even smartwatches. And the bytecode is kept under control by the host implementation. All of this is possible by the usage of a virtual machine, which isolates the bytecode from the host implementation.

13 eBPF Virtual Machine Lightweight, integrated in Linux kernel since 2014 RISC instruction set (~100) ALU, memory and branch purposes Bytecode recompiled to native architecture Dedicated, isolated stack memory But no persistence Rely on a user-space implementation With relaxed verifier With persistent heap memory x86_64 For the choice of the virtual machine, we choose eBPF as it is a lightweight virtual machine which is now integrated in the mainstream Linux kernel. It supports a small RISC instruction set which mostly consists in arithmetic, memory and branch instructions. The bytecide can recompiled to the host native architecture, and it has its own dedicated and isolated stack memory, but without persistence. In this work, we rely on a user-space implementation of eBPF where the verifier was relaxed and where we provide persistent heap memory (which we will discuss later)

14 Identifying Protocol Operations
Flags Connection ID STREAM 4 “Some text” Packet Number ACK 4 Cleartext Encrypted Each operation has its own built-in implementation Process incoming packet Parse packet header Process ACK frame Process STREAM frame Compute Retransmit Timer Now, to exemplify how protocol operations appear in PQUIC, consider the following situation where an host receive a packet. In QUIC and thus PQUIC, a packet is composed of a small cleartext header and an encrypted payload containing the frames. So first, the host needs to process the incoming packet, with potentially decryption. Then it needs to parse the packet header, and all of the frames contained in the packet. While receiving the ACK frame, the host can estimate the latency of the network and update its retransmit timer. All these operations are base operations of the protocol, which thus map into implementation functions provided by the core of PQUIC.

15 Modifying/Adding Protocol Operations
Cleartext Encrypted Flags Connection ID STREAM 4 “Some text” Packet Number ACK 4 STAT (RTT info,...) Cleartext Encrypted Flags Connection ID STREAM 4 “Some text” Packet Number ACK 4 Each operation has its own built-in implementation Process incoming packet Parse packet header Process ACK frame Process STREAM frame Process STAT frame VM VM Compute Retransmit Timer Compute Retransmit Timer What if we want to recompute the retransmit timer? Just attach a VM at that place. How to handle the reception of STAT frames used to exchange monitoring information? Just create a new protocol operation to process the STAT frame and attach a VM.

16 Changing the Retransmission Timer
int process_ack_frame(args) { // Some processing ret = compute_retransmission_timer(args); return val; } int compute_retransmission_timer(args) { rto = srtt + 4 * rttvar; return rto; } REPLACE VM rto = srtt + 3 * min_rtt; return rto; Exclusive full-access to connection state And now you want to modify the retransmission timer computation. HOw can we insert the code here? Actually, the implementation of an operation is placed in an hook called REPLACE, which refer by default to the built-in implementation. If we want to change this code, we can simply attach a VM to this hook that will replace the behavior of the protocol operation. The code inserted at that hook has full read/write access to the connection state.

17 Inserting Monitoring Plugin
int process_ack_frame(args) { // Some processing ret = compute_retransmission_timer(args); return val; } int compute_retransmission_timer(args) { rto = srtt + 4 * rttvar; return rto; } PRE POST VM VM Read-only access to connection state VM Monitoring use case, see how code works, so just look at arguments and outputs of the protocol operations. PRE and POST hooks for that purpose, where VMs can be attached at that points. Those are read-only hooks to the connection state, but this enables multiple VMs, possibly for different purposes, to observe the operations.

18 Linking Plugins to the Core Implementation
Exposing connection fields Offering persistent heap memory Retrieving data in plugin memory Calling other protocol operations Get specific plugin data VM Plugin Memory heap stack Call VM’ stack heap Call get() / set() PQUIC core Connection State built_in() Now that bytecode can be attached to the protocol implementation, it remains to bridge the VMs to the PQUIC core. Indeed, VMs are run in complete isolation from the implementation they are attached to. To interact with the connection, PQUIC provides a set of getters and setters enabling VMs to operate on the connections while keeping control over it. Then, if you consider the monitoring use case where you want to aggregate measurements, such as the mean reordering, the VMs require some persistent memory, which is provided by PQUIC through a heap memory. It also enables a plugin to retrieve some specific memory location across runs. Finally, our API enables a VM to call protocol operations, which can be implemented either by the PQUIC core or provided by another VM, possibly from other plugins.

19 Communication between Plugin Bytecodes
parse_frame[ACK] built_in() write_frame[STAT] replace post compute_retransmission_timer heap VM stack VM’ Plugin A Memory bbr_congestion_control replace VM” stack heap Plugin B Memory Monitoring case, inserted observer and write STAT frames, how to communicate? Use shared heap memory. Another plugin for another use case, memory is isolated.

20 Agenda Motivations Pluginized QUIC Design
Evaluating PQUIC with Use Cases Conclusion and Future Works

21 Evaluation Overview Implementation based on picoquic
Evaluation Overview Implementation based on picoquic IETF QUIC in C language Evaluation in our lab Use NetEm and HTB to configure links Experimental design 139 points Consider median over 9 runs {d, bw} {d, bw} Metric Minimum Maximum One-way delay (ms) 2.5 25 Bandwidth (Mbps) 5 50 We implemented PQUIC in the picoquic implementation of IETF QUIC maintained by Christian Huitema and evaluated it in our lab, where we vary the link characteristics shown in red here. Our evaluation relies on experimental design, which aims at covering as much as possible all the parameter space with a limited number of experiments, here 139. For all our experiments, we consider the median run over 9.

22 Forward Erasure Correction
Exploring Use Cases Monitoring A QUIC VPN Multipath Forward Erasure Correction Plugin Lines of C Code Number of bytecodes Monitoring 500 14 QUIC VPN 11 Multipath 2600 32 Forward Erasure Correction 2500 51

23 Exploring Use Cases Monitoring A QUIC VPN
Multipath Forward Erasure Correction Read paper for other use cases Plugin Lines of C Code Number of bytecodes Monitoring 500 14 QUIC VPN 11 Multipath 2600 32 Forward Erasure Correction 2500 51

24 1st Plugin: QUIC VPN Alternative to DTLS, IPsec QUIC IP TCP DATA QUIC IP TCP DATA IP TCP DATA IP TCP DATA V QUIC Connection → But QUIC provides reliable, in-order data delivery QUIC VPN Application Datagram plugin Unreliable data delivery Application API Protocol operation send_datagram() Datagram Plugin Our first plugin explore the ability to change the reliability of PQUIC through a QUIC VPN. This can be seen as an alternative to DTLS and Ipsec, but immune to network interference thanks to QUIC. VPN to flow IP packets, which are then encapsulated into QUIC packets inside the VPN tunnel. The issue with QUIC is that it provides reliable and in-order delivery, while a VPN could simply rely on unreliable traffic delivery. To solve this, we implemented a datagram plugin providing unreliable data delivery service, and made this service available to applications through a new protoocl operation

25 Processing time + MTU difference
QUIC VPN in Action Compare download time for a single file transfer using TCPCubic over 1 path Overhead of ≊7% Processing time + MTU difference Faster in VPN Plain TCP Faster To evaluate the VPN, we download a single file using TCP cubic and compare the ratio between the completion times inside the VPN over plain TCP. Having values below 1 means that using the VPN provides lower completion times, while values greater than 1 shows that plain TCP is faster. Theoritically, our VPN introduces a overhead due to encapsulation of around 3%. Actually, we observe with large files an overhead of about 7%, which is also related to the processing time and the MTU difference inside and outside the tunnel.

26 2nd Plugin: Multipath QUIC
Implement the IETF Multipath draft Link paths to specific networks Connection not bound to 4-tuple Exchange IP addresses Create paths over pair of IP addresses Full-mesh Schedule packets over paths Round-robin fashion Core functionalities Multipath Plugin Modular Algorithms We now consider another use case which is the multihoming, i.e., the access to multiple networks and their simultaneous usage. This is why we explore the solution to provide Multipath in PQUIC. We implemented the IETF mulitpath draft within our plugins, and the multipath extensions requires several functionalities. Some are core ones, such as linking paths to specific networks or exchanging IP addresses. Other are modular algorithms which are multipath-specific, such as the path manager and the packet scheduler. Potentially, different versions of the mulitpat plugins with different algorithms might be deployed to address various application needs.

27 Multipath QUIC in Action
GET request over a dedicated stream Consider 2-path network with symmetric characteristics Aggregate both paths! Equivalent to MPTCP! To assess that multipath plugin works, we perform a download of a file over a dedicated stream and compute the speedup ratio, i.e., the download completion time over a single path over the multipath version over two symmetric network paths. We observe that with large files, we achieve aggreagation of both paths, which corresponds to the current state of the art with Multipath TCP.

28 Combining Plugins into a Multipath QUIC VPN
The VPN and Multipath plugins provides orthogonal features Operate on different protocol operations Each plugin is isolated from other ones Possible to apply them on a same connection! V QUIC Connection Finally, we presented two orthogonal but complementary plugins, as they operate of different protocol operations. As each plugin is isolated from other one, it is possible to combine both of them out-of-the-box on a single connection to have a VPN using multiple network paths.

29 Multipath QUIC VPN in Action
Compare download time for a single file transfer using TCPCubic 2 symmetric network paths VPN enables TCP to use both paths! And for this, we reuse the experiment made for the QUIC VPN and compare it with the Multipath QUIC one. We can see that the completion time is nearly divided by two, which means that the multipath ability of the VPN enables TCP to use both paths.

30 Agenda Motivations Pluginized QUIC Design
Evaluating PQUIC with Use Cases Conclusion and Future Works

31 Conclusion and Future Works
PQUIC: Dynamically extends protocol implementations Common pluginizable client, specialized server implementation Possible through secure plugin exchange Four different use-cases fully implemented with plugins Monitoring, VPN, Multipath, Forward Erasure Correction Future works Approach applicable to many other protocols How can we have independent, interoperable PQUIC implementations? Specification of the VM, protocol operations, API,... To conclude, we introduced PQUIC, a dynamic way to extend protocol implementations. We envision the situation where there is a common pluginizable client, with specialized server implementations, where extensions could be communicated through the secure plugin exchange. We implemented four very different use-cases, fully implemented with plugins, which are monitoring, VPN, multipath and forward erasure correction. We believe this shows the genericity of our approach. As future works, we believe that this approach is applicable to many other protocols, and encourage you to try it on your favourite one. And the big challenge it introduces is how can we have several independent, but still interoperable PQUIC implementations. This requires some specification about the VM, the protocol operations, and the proposed API. Our PQUIC implementation and all artefacts can be found on pquic.org. How do we have interoperable, independent PQUIC implementations? (au lieu de la partie specify PP) Say “Try it with your favourite protocol”

32 Thanks for your attention!

33 Extending/Customizing TCP Is Hard
TCP Option Negotiation SYN [SACK, WSCALE, TSTAMP] Extension deployment challenges Specification process (IETF) Implementation puzzle Client waits for server support Server waits for client support Middlebox interferences SYN/ACK [SACK, WSCALE] Coarse protocol tuning Global parameters (sysctl,...) Very specific socket options

34 Processing New STAT frames
Type of the frame = Parameter to a generic protocol operation Frame Fields (ex. used for RTT estimation) STAT int process_frame(args) { for each frame { ret = process_frame_param(frame.type, args); // Some processing } return val; int process_stat_frame(args) { } PRE POST REPLACE VM // Code processing STAT return ret;

35 Trusting Plugins ... Developer Check own bindings Publish
Plugin Repository ... Verifier A Verifier B Verifier C Pull and build Merkel Tree Lookup for proof Ask plugin and proof from B Client Server Give plugin and B authentication path

36 Compare with stored verifier Root Provided Authentication Path
Proof of Consistency Compare with stored verifier Root Root = H(h0 || h1) 1 h1 h0 = H(h00 || h01) ... 1 h00 h01 = H(h010 || h011) ... 1 h011 h010 = H(‘multipath’||multipath bytecode) ... Provided Authentication Path H(‘multipath’) = ‘010’ Recomputed

37 Plugin Overhead 10 Gbps Plugin Mean Goodput
Intel Xeon E v3 10 Gbps Plugin Mean Goodput PQUIC, no plugin 1104 Mbps Monitoring 1037 Mbps Multipath 1 path 757 Mbps Monitoring + Multipath 1 path 714 Mbps -7% goodput +8% CPU instructions JITed eBPF ~2x slower than native code Get/set interface ~5x slower than direct access

38 Verifying the Termination of Plugins with T2
Lines of C Code # Bytecodes Proven terminating Monitoring 500 14 13 VPN 11 8 Multipath 2600 32 29 FEC 2500 51 37


Download ppt "Pluginizing QUIC Quentin De Coninck, François Michel, Maxime Piraux, Florentin Rochet, Thomas Given-Wilson, Axel Legay, Olivier Pereira, Olivier Bonaventure."

Similar presentations


Ads by Google