What's next?
Nick McKeown
High Performance Networking Group, Stanford University
The Big Picture
We've all seen the list… How about a global network that is:
- Robust against failure of infrastructure and end-points
- Secure against attack
- Available when you need it
- Fast
- Predictable in the service it does (and doesn't) deliver
- Evolvable as new technologies are invented
- Economically viable

OK, so this is all motherhood-and-apple-pie. Q1: Why do we need a clean slate? Q2: What can we do about it at Stanford?
Why a Clean Slate?
- Business-as-usual won't get us there
- Research-as-usual won't get us there
- This doesn't mean we have to throw out the good parts of the current Internet
Stanford Clean Slate Program
How would we design the Internet if, knowing what we know today, we started over with a clean slate?
- Small, medium and large projects that, if successful, will significantly impact the Internet in years to come
- Not a single design: a collection of high-risk projects with a common theme
- 12-15 professors and research groups from EE, CS and MS&E

Let's get started!
Two Clean Slate Examples at Stanford
- VLB: a clean-slate architecture for backbone networks that are robust and predictable
- SANE: a clean-slate architecture for secure enterprise networks
Backbone Networks: Emerging Structure
- 10-50 regional nodes interconnected by long-haul optical links
- Increasingly rich topology, for robustness and load-balancing
- Typical utilization < 25%, because of:
  - Uncertainty in the traffic matrix the network is designed for
  - Headroom for future growth
  - Headroom to carry traffic when links and routers fail
  - The need to minimize congestion and delay variation
- Efficiency is sacrificed for robustness and low queueing delay
Traffic Matrices
- The traffic matrix is hard to predict
- The aggregate rate r_i into and out of each regional node i needs to be predicted anyway
How flexible are networks today?
What fraction of allowable traffic matrices can they support?
- Abilene: 25% over-provisioning: 0.025%; 50% over-provisioning: 0.66%
- Verio: 25% over-provisioning: (missing); 50% over-provisioning: 1.15%
- AT&T: 25% over-provisioning: (missing); 50% over-provisioning: 0.15%
- Sprint: 25% over-provisioning: (missing); 50% over-provisioning: 0.06%

Note: the Verio, AT&T and Sprint topologies are from RocketFuel.
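To make the question concrete, here is a toy Monte Carlo sketch, not the study's method: the topology (a small ring), rates, routing, and capacity are all my own assumptions. It samples traffic matrices whose row and column sums respect each node's access rate (the hose model), routes them on fixed shortest paths, and counts how many fit within the provisioned link capacity.

```python
import random

# Toy illustration: estimate the fraction of hose-constrained traffic
# matrices a fixed-capacity network can carry. 4-node ring, fixed
# shortest-path routing, every directed link provisioned with `cap`.

N = 4
r = 1.0      # per-node ingress/egress limit (hose model)
cap = 0.75   # provisioned capacity per directed link (assumed)

def ring_path(s, d):
    """Shortest path on the ring as a list of directed links (clockwise on ties)."""
    cw, ccw = (d - s) % N, (s - d) % N
    step = 1 if cw <= ccw else -1
    path, u = [], s
    while u != d:
        v = (u + step) % N
        path.append((u, v))
        u = v
    return path

def supported(T):
    """True if routing every demand of T keeps all link loads within cap."""
    load = {}
    for s in range(N):
        for d in range(N):
            if s == d:
                continue
            for link in ring_path(s, d):
                load[link] = load.get(link, 0.0) + T[s][d]
    return all(l <= cap + 1e-9 for l in load.values())

def random_hose_matrix():
    """Random matrix scaled so the largest row/column sum equals r."""
    T = [[0.0 if i == j else random.random() for j in range(N)] for i in range(N)]
    m = max(max(sum(row) for row in T),
            max(sum(T[i][j] for i in range(N)) for j in range(N)))
    return [[x * r / m for x in row] for row in T]

random.seed(0)
trials = 10000
hits = sum(supported(random_hose_matrix()) for _ in range(trials))
print(f"fraction supported: {hits / trials:.3f}")
```

The tiny fractions on the slide correspond to the same experiment run over real ISP topologies, where the space of allowable matrices is far larger.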
Desired Characteristics
- Robust: recovers quickly, and continues to operate under failure
- Flexible: supports a broad class of applications, new customers and new traffic patterns
- Predictable: we can predict how it will perform, with and without failures
- Efficient: robustness does not come at excessive cost
Approach
- Assume we know, or can estimate, the traffic entering and leaving each regional network; this requires only local knowledge of users and market estimates
- Use Valiant Load Balancing (VLB) over the whole network, which enables support of all traffic matrices
Valiant Load-Balancing
We propose Valiant load-balancing to solve these problems. For now, consider a network of N identical nodes; the scheme can be extended to the general case where the nodes have different capacities, and Murali's talk will focus on that case. The N nodes are connected by a full mesh. The full mesh can be made of physical links, but more likely it is made of logical links.

Here is how Valiant load-balancing works. Suppose node N has a packet destined to node 2. It picks one of the N nodes at random, say node 3, and sends the packet to it. Node 3 then looks up the packet and delivers it to its final destination. Another packet arrives at node N destined to node 4; node N again picks a random node, in this case node 2, which then delivers the packet to node 4. The intermediate node can be picked at random, round-robin, or in some other way, as long as each node spreads its flows evenly over the N two-hop paths (a flow here being defined by its source and destination nodes).

Why use two-hop forwarding, when it increases propagation delay and can cause mis-sequencing? Because, as first proved by Valiant and later extended by C.-S. Chang, this scheme can guarantee throughput for any traffic matrix. What link capacity does that require? Two-hop routing can be viewed as two-stage routing. In the first stage, a node spreads its incoming traffic evenly over the N nodes, so the traffic rate on each link due to the first stage is at most r/N. In the second stage, similarly, the destination node collects traffic evenly from the N nodes, so the traffic rate on each link due to the second stage is again at most r/N. The required link capacity is therefore 2r/N.

Notice that we constrained only the node capacities, so any traffic matrix whose row and column sums are bounded by r can be served. Now imagine we do not load-balance but still want to support all traffic matrices that satisfy the node capacity constraints. The traffic matrix might send all of node N's traffic to node 2, requiring capacity r on that link; it might equally send it all to node 4, requiring capacity r there too. Since we do not know the traffic matrix in advance, every link needs capacity r. Compared to that, load balancing gives a gain of N/2: if N is one hundred, a 50-fold saving.

Capacity is provisioned over the existing robust mesh of physical circuits.
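The 2r/N bound above can be checked numerically. A minimal sketch, where the node count and the adversarial traffic matrix are my own choices, not from the talk:

```python
# Valiant load balancing on a full mesh of N nodes: every flow is split
# evenly over the N two-hop paths s -> k -> d, so each directed link
# carries at most 2r/N, whatever the traffic matrix.

N = 8
r = 1.0  # each node sends at most r and receives at most r in total

def vlb_link_loads(T):
    """Per-link load when demand T[s][d] is spread 1/N over two-hop paths."""
    load = {}
    for s in range(N):
        for d in range(N):
            if s == d or T[s][d] == 0.0:
                continue
            share = T[s][d] / N
            for k in range(N):
                if k != s:  # first hop s -> k (k == s means stay put)
                    load[(s, k)] = load.get((s, k), 0.0) + share
                if k != d:  # second hop k -> d (k == d means already there)
                    load[(k, d)] = load.get((k, d), 0.0) + share
    return load

# Adversarial matrix: node 0 sends its entire rate r to node 1.
T = [[0.0] * N for _ in range(N)]
T[0][1] = r

loads = vlb_link_loads(T)
print(max(loads.values()))  # at most 2r/N even for this worst case
```

Without load balancing, the link (0, 1) would need capacity r to carry this matrix; with VLB no link sees more than 2r/N.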
A Predictable Backbone Network
Performance:
- 100% throughput for any valid traffic matrix
- Only the aggregate traffic of each node needs to be known
- Under low load, there is no need to spread traffic

Robustness:
- Upon failure, traffic is spread over the working paths only
- Small cost to recover from k failures: provision each link with 2*r_i*r_j / (r*(N-k))
- Simple routing algorithm

Efficiency:
- VLB is the lowest-cost method that supports all traffic matrices
- Similar cost to conventional designs, while supporting significantly more traffic matrices

The Valiant load-balancing architecture allows us to build a predictable backbone network. Since VLB guarantees 100% throughput for any traffic matrix, when designing a network we only need to know the aggregate traffic rate of each node, not the full traffic matrix. The aggregate rate is much easier to estimate; for example, we can assume it is proportional to population. Routing packets is simple in VLB because every path has at most two hops. Normally a flow is spread over N paths; when there are failures, a node simply stops spreading to the failed paths and spreads only over the working ones, which gives fast failure recovery.
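Reading the provisioning rule on the slide in the homogeneous case (all nodes with the same access rate r), the per-link capacity needed to survive k failures reduces to 2r/(N-k). A small sketch of that arithmetic; the function name and example numbers are mine:

```python
# Hedged sketch of the slide's provisioning rule, homogeneous case:
# in a full-mesh VLB network of N nodes with access rate r each,
# provisioning every directed link with 2r/(N-k) preserves 100% throughput
# under up to k failures. k = 0 recovers the basic 2r/N requirement.

def vlb_link_capacity(r, N, k=0):
    """Per-link capacity needed to tolerate k failures (homogeneous nodes)."""
    if k >= N - 1:
        raise ValueError("cannot tolerate that many failures")
    return 2.0 * r / (N - k)

r, N = 1.0, 10
for k in (0, 1, 2):
    # Capacity grows only mildly with the number of tolerated failures.
    print(f"k={k}: per-link capacity {vlb_link_capacity(r, N, k):.4f}")
```

The point of the formula is that fault tolerance is cheap: going from k=0 to k=2 in this example raises per-link capacity from 2r/10 to 2r/8, a 25% increase to survive any two failures.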
How expensive would VLB be?
Cost is normalized to VLB routing; the cost of switching is taken to equal the cost of transmission over 370 miles.
- Abilene: 25% over-provisioning: 0.026%, cost 0.87; 50% over-provisioning: 0.66%, cost 1.04
- Verio: 25% over-provisioning: (missing), cost 0.99; 50% over-provisioning: 1.08%, cost 1.19
- AT&T: 25% over-provisioning: (missing), cost 0.94; 50% over-provisioning: 0.14%, cost 1.12
- Sprint: 25% over-provisioning: (missing), cost 0.86; 50% over-provisioning: 0.04%, cost 1.04

Rui Zhang-Shen will talk about VLB on February 27th.
SANE: A Clean Slate Architecture for Secure Enterprise Networks
Problem:
- Enterprise networks must be secure
- Today they rely on a mess of distributed firewalls, NATs and VLANs, with complicated and fragile rules

SANE:
- Uses simple, natural high-level security policies, e.g. "Allow the sales group to access the http server"
- Hides topology information and services from users unless they have specific permission
- Requires only one trusted entity: a single (logically) centralized Domain Controller
- Communications are "default-off": capabilities are explicitly granted by the Domain Controller and enforced by the network
- Capabilities are encrypted source routes

Research groups: Boneh, Rosenblum, Mazieres, McKeown

Martin Casado will talk about SANE on February 13th.
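The capability idea can be sketched in a few lines. This is a toy illustration, not the SANE wire format: the names, layout, and use of HMACs in place of real per-layer encryption are all my own assumptions. It shows the essential property that only the Domain Controller, which shares a key with every switch, can mint a source route that switches will forward.

```python
import hashlib
import hmac
import json

# Toy sketch of capabilities as controller-minted source routes.
# Each switch shares a secret key with the Domain Controller; each hop of
# the route is bound to its switch with an HMAC over (src, dst, next hop).

SWITCH_KEYS = {name: f"key-{name}".encode() for name in ("sw1", "sw2", "sw3")}

def mint_capability(route, src, dst):
    """Domain Controller: produce one authenticated layer per switch."""
    cap = []
    for i, sw in enumerate(route):
        nxt = route[i + 1] if i + 1 < len(route) else dst
        layer = {"src": src, "dst": dst, "next": nxt}
        msg = json.dumps(layer, sort_keys=True).encode()
        tag = hmac.new(SWITCH_KEYS[sw], msg, hashlib.sha256).hexdigest()
        cap.append({"switch": sw, "layer": layer, "tag": tag})
    return cap

def forward(sw, cap_entry):
    """A switch forwards only if its layer verifies; default-off otherwise."""
    msg = json.dumps(cap_entry["layer"], sort_keys=True).encode()
    good = hmac.compare_digest(
        hmac.new(SWITCH_KEYS[sw], msg, hashlib.sha256).hexdigest(),
        cap_entry["tag"])
    return cap_entry["layer"]["next"] if good else None

cap = mint_capability(["sw1", "sw2"], src="sales-host", dst="http-server")
print(forward("sw1", cap[0]))  # prints "sw2"
print(forward("sw2", cap[1]))  # prints "http-server"
```

A host without a capability, or one presenting a layer minted for a different switch, gets nothing forwarded, which is the "default-off" behavior the slide describes. Real SANE encrypts each layer so that the route itself is hidden from the sender.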
What you can do
- Invent
- Start a study group
- Come talk to me, your advisor, someone else's advisor, or an advisor you'd like as your own