SDN Scalability Issues


Last Class: Measuring with SDN
- What are measurement tasks?
- What are sketches?
- What are the minimal building blocks for implementing arbitrary sketches?
- How do we trade off between accuracy and space?
- How do we allocate memory across a set of switches to support a given accuracy?

Today’s Class
- What are the bottlenecks within the SDN ecosystem?
[Diagram labels: Hub, MacTracker, SDN Controller 2 (FloodLight), S1, S2, S4]

Bottleneck 1: Control Channel
- If packets go to the controller, they use a TCP connection.
- If packets go to the switch CPU, they use the PCI bus.
- The switch NIC processes packets at 250GB.
[Diagram labels: Hub, MacTracker, SDN Controller 2 (FloodLight); controller link 13Mbs; Switch CPU 35Mbs; TCAM 250GB]

Bottleneck 2: TCAM Memory
- The TCAM only stores N flow table entries, which limits the number of flow entries.
- If packets go to the controller, they use a TCP connection.
- If packets go to the switch CPU, they use the PCI bus.
- The switch NIC processes packets at 250GB.
[Diagram labels: Hub, MacTracker, SDN Controller 2 (FloodLight); controller link 13Mbs; Switch CPU 35Mbs; TCAM 250GB]

Bottleneck 3: Controller Server
- The controller runs on a Mac: only so much CPU and RAM, which limits the apps.
- If packets go to the controller, they use a TCP connection.
- If packets go to the switch CPU, they use the PCI bus.
- The switch NIC processes packets at 250GB.
[Diagram labels: Hub, MacTracker, SDN Controller 2 (FloodLight); controller link 13Mbs; Switch CPU 35Mbs; TCAM 250GB]

Today’s Class
- What are the bottlenecks within the SDN ecosystem?
  - Control channel
  - Controller server (scalability)
  - Switch TCAM (number of entries)
[Diagram labels: Hub, MacTracker, SDN Controller 2 (FloodLight), S1, S2, S4]

How to Get Around TCAM Limitations
- Use the controller
- Use a hierarchy of switches
- Place servers/applications/VMs wisely

How to Get Around TCAM Limitations
- Use the controller
  - Doesn’t scale: remember, the controller has limits
  - Too slow: it takes over 10 ms to get information to the controller
- Use a hierarchy of switches
  - DIFANE
- Place servers/applications/VMs wisely
  - VM bin packing

DIFANE
Creates a hierarchy of switches:
- Authoritative switches
  - Lots of memory
  - Collectively store all the rules
- Local switches
  - Small amount of memory
  - Store a few rules
  - For unknown rules, route traffic to an authoritative switch

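Packet Redirection and Rule Caching
- First packet: the ingress switch redirects it to the authority switch, which forwards it to the egress switch.
- Feedback: the authority switch caches rules at the ingress switch.
- Following packets: hit the cached rules and are forwarded.
- A slightly longer path in the data plane is faster than going through the control plane.
[Diagram labels: Ingress Switch, Authority Switch, Egress Switch; arrows Redirect, Forward, Feedback: Cache rules]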
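The redirect-then-cache behavior on this slide can be sketched in a few lines of Python. This is a toy model, not DIFANE's implementation: the class names, the `redirects` counter, and the exact-match cache keyed by destination are all assumptions made for illustration (real cached rules would be wildcard rules in TCAM).

```python
from dataclasses import dataclass, field


@dataclass
class AuthoritySwitch:
    """Hypothetical authority switch holding the full rules for its region."""
    rules: dict  # destination -> action

    def handle_redirect(self, dst, ingress):
        # Look up the action and push the rule back to the ingress switch.
        action = self.rules.get(dst, "drop")
        ingress.cache[dst] = action  # "Feedback: Cache rules"
        return action


@dataclass
class IngressSwitch:
    """Hypothetical ingress switch with a small rule cache."""
    authority: AuthoritySwitch
    cache: dict = field(default_factory=dict)
    redirects: int = 0

    def forward(self, dst):
        if dst in self.cache:
            # Following packets hit the cached rule.
            return self.cache[dst]
        # First packet: stay in the data plane, detour via the authority switch.
        self.redirects += 1
        return self.authority.handle_redirect(dst, self)


authority = AuthoritySwitch(rules={"bruce": "forward to egress switch"})
ingress = IngressSwitch(authority=authority)
actions = [ingress.forward("bruce") for _ in range(3)]
```

In this sketch only the first of the three packets is redirected; the other two are served from the ingress switch's cache, and the controller is never involved.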

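Packet Redirection and Rule Caching
- First packet: redirected by the ingress switch to the authority switch, then forwarded to the egress switch.
- Feedback: the authority switch caches rules at the ingress switch.
- Following packets: hit the cached rules and are forwarded.
[Diagram labels: Ingress Switch, Authority Switch, Egress Switch; rule entries "To: bruce", "To: Theo", "Everything else"; arrows Redirect, Forward, Feedback: Cache rules]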

Three Sets of Rules in TCAM
Columns: Priority | Field 1 | Field 2 | Action | Timeout

Cache rules (in ingress switches; reactively installed by authority switches)
  210 | 00** | 111* | Forward to Switch B | 10 sec
  209 | 1110 | 11** | Drop | …
Authority rules (in authority switches; proactively installed by controller)
  110 | 001* (only one field value survives) | Forward, trigger cache manager | Infinity
  109 | 0001 | 0*** | Drop, … | …
Partition rules (in every switch; proactively installed by controller)
  15 | 000* (only one field value survives) | Redirect to auth. switch | …
  14 | …
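To make the priority ordering concrete, here is a minimal sketch of a ternary (wildcard) lookup in which the highest-priority matching rule wins, so cache rules shadow authority rules, which in turn shadow partition rules. The function names and the tuple layout are assumptions; the first three rows reuse values from the slide (with the action of rule 109 shortened to "drop"), while the priority-15 catch-all row is hypothetical because the slide's partition rows are incomplete.

```python
def matches(pattern, bits):
    """Ternary match: '*' in the pattern matches either bit value."""
    return len(pattern) == len(bits) and all(
        p in ("*", b) for p, b in zip(pattern, bits)
    )


def lookup(table, field1, field2):
    """Return (priority, action) of the highest-priority matching rule."""
    for priority, p1, p2, action in sorted(table, key=lambda r: r[0], reverse=True):
        if matches(p1, field1) and matches(p2, field2):
            return priority, action
    return None


table = [
    (210, "00**", "111*", "forward to switch B"),            # cache rule
    (209, "1110", "11**", "drop"),                            # cache rule
    (109, "0001", "0***", "drop"),                            # authority rule
    (15, "****", "****", "redirect to authority switch"),    # hypothetical partition rule
]

hit_cache = lookup(table, "0010", "1110")
hit_authority = lookup(table, "0001", "0101")
hit_partition = lookup(table, "1111", "0000")
```

A packet that matches no cache or authority rule falls through to the low-priority partition rule and is redirected, which is how the three sets coexist in one TCAM.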

Stage 1: The controller proactively generates the rules and distributes them to authority switches.

Partition and Distribute the Flow Rules
- The controller partitions the flow space among the authority switches and distributes the partition information.
[Diagram labels: Flow space (accept and reject regions), Controller, Authority Switches A, B, and C, Ingress Switch, Egress Switch]
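One way to picture the partition step is as a set of coarse wildcard rules, each mapping a region of the flow space to an authority switch. The sketch below uses a simple prefix split with round-robin assignment as a stand-in; that choice, the 4-bit field width, and the function names are assumptions for illustration, not the partitioning algorithm DIFANE actually uses.

```python
def partition_rules(authority_switches, width=4):
    """Split a width-bit field into prefix regions and assign them round-robin."""
    n = len(authority_switches)
    prefix_bits = (n - 1).bit_length()  # bits needed to tell n regions apart
    rules = []
    for i in range(2 ** prefix_bits):
        prefix = format(i, f"0{prefix_bits}b") if prefix_bits else ""
        pattern = prefix + "*" * (width - prefix_bits)
        rules.append((pattern, authority_switches[i % n]))
    return rules


def authority_for(rules, bits):
    """Find the authority switch responsible for a given header value."""
    for pattern, switch in rules:
        if all(p in ("*", b) for p, b in zip(pattern, bits)):
            return switch
    raise LookupError(f"no partition covers {bits}")


rules = partition_rules(["A", "B", "C"])
```

With three authority switches this yields four regions (`00**`, `01**`, `10**`, `11**`), so every switch needs only four partition rules to know where to redirect any unmatched packet.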

Stage 2: The authority switches always keep packets in the data plane and reactively cache rules.

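Packet Redirection and Rule Caching
- First packet: the ingress switch redirects it to the authority switch, which forwards it to the egress switch.
- Feedback: the authority switch caches rules at the ingress switch.
- Following packets: hit the cached rules and are forwarded.
- A slightly longer path in the data plane is faster than going through the control plane.
[Diagram labels: Ingress Switch, Authority Switch, Egress Switch; arrows Redirect, Forward, Feedback: Cache rules]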

Assumptions
- Authoritative switches have more TCAM than regular switches.
- You know all the rules you want to insert into the switches beforehand.
  - So your SDN app should look like Assignment 3.
  - If your SDN app is like Assignment 2 (Hub), all first packets will still need to go to the controller.

Interesting Questions
- How quickly can the authoritative switches install a cache rule into the other switches?
- How many cache rules can the authoritative switches generate per second?
- Should the authoritative switch have any special hardware/software?

How to Get Around TCAM Limitations
- Use the controller
  - Doesn’t scale: remember, the controller has limits
  - Too slow: it takes over 10 ms to get information to the controller
- Use a hierarchy of switches
  - DIFANE
- Place servers/applications/VMs wisely
  - VM bin packing

Distributed Applications
- Applications have set communication patterns, e.g., 3-tier applications.
- Insight: traffic flows between certain servers.
- If those servers are placed together, their rules are only inserted in one switch.

Insight
- VMs A, B, and C talk only to each other.
- VM C talks to everyone.
- If you place them together, you can limit TCAM usage.

Bin-Packing of VMs
[Diagram labels: VMA, VMB; the value 2 appears once]

Random Placement of VMs
[Diagram labels: VMA, VMB; the value 2 appears five times]

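Random Placement vs. Bin-Packing
[Diagram: side-by-side comparison of the two placements; labels VMA and VMB in each panel; the value 2 appears six times in total]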
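The effect of placement on TCAM usage can be estimated with a small counting sketch. It assumes a simplified cost model that is not stated in the slides: each communicating VM pair needs one rule in every switch that hosts at least one of its endpoints. The switch names `s1` to `s3` and the function name are hypothetical.

```python
from itertools import combinations


def tcam_entries(placement, pairs):
    """Count rules per switch: one rule per communicating pair per switch touched."""
    usage = {}
    for a, b in pairs:
        # A set avoids double-counting when both VMs sit on the same switch.
        for switch in {placement[a], placement[b]}:
            usage[switch] = usage.get(switch, 0) + 1
    return usage


# VMs A, B, and C talk only to each other: three communicating pairs.
pairs = list(combinations(["A", "B", "C"], 2))

packed = {"A": "s1", "B": "s1", "C": "s1"}   # bin-packed onto one switch
spread = {"A": "s1", "B": "s2", "C": "s3"}   # spread across three switches

packed_usage = tcam_entries(packed, pairs)
spread_usage = tcam_entries(spread, pairs)
packed_total = sum(packed_usage.values())
spread_total = sum(spread_usage.values())
```

Under this model the packed placement uses three entries on a single switch, while the spread placement uses two entries on each of three switches, doubling the total.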

Limitations
- Some applications don’t have nice communication patterns.
- How do you learn these patterns?
- Some applications are too large to fit in one rack: they are too spread out.