Scalable and Crash-Tolerant Load Balancing based on Switch Migration


Scalable and Crash-Tolerant Load Balancing based on Switch Migration for Multiple OpenFlow Controllers
Chu LIANG*, Ryota KAWASHIMA*, Hiroshi MATSUO*
*Nagoya Institute of Technology, Japan

Research Background
The spread of Software-Defined Networking: easier management and faster innovation, built on a centralized controller and programmable network devices.
The complexity of the controller is increasing: load balancing, QoS control, security, and so on.
The size of networks continues to increase, with a growing number of OpenFlow-enabled devices.
A centralized controller is a potential bottleneck.
A distributed control plane has been proposed.

Distributed OpenFlow Controllers
OpenFlow: one of the most representative protocols for SDN.
Packet-in: a packet that does not match any forwarding rule is forwarded to the controller.
Physically distributed controllers achieve better scalability.
[Figure: an OpenFlow controller cluster (OFC 1, OFC 2) receiving Packet-in messages from OpenFlow switches in Group A and Group B.]

Problems in Distributed Controllers
Load imbalance results in suboptimal performance.
Causes: topology changes; varying user traffic; elastic resources, VM migration, and varying utilization.
The switch-controller mapping is configured statically.
Dynamic load balancing among the controllers is required.
[Figure: an OpenFlow controller cluster (OFC 1, OFC 2) with one highly loaded and one lightly loaded controller, statically mapped to OpenFlow switches in Group A and Group B.]

Problems in OpenFlow Controllers
Multiple controllers in OpenFlow take one of three roles: Master, Slave, or Equal.
Each controller only has one role.
The coordination of "role changing" is not provided.
[Figure: controllers OFC1 and OFC2 exchanging Master-Request and Role-Reply messages with switches S1, S2 and S3, each controller being master or slave.]

Related Work
Scalable OpenFlow Controller Redundancy Tackling Local and Global Recoveries. Keisuke Kuroki, Nobutaka Matsumoto, Michiaki Hayashi. The Fifth International Conference on Advances in Future Internet, 2013.
Proposed a crash-tolerant method based on the multiple-controller feature of OpenFlow 1.3.
Does not support load balancing among the controllers.
The Role-Management server can be a single point of failure.
Towards an Elastic Distributed SDN Controller. Advait Dixit, Fang Hao, Sarit Mukherjee, T.V. Lakshman, Ramana Kompella. ACM SIGCOMM HotSDN, 2013.
Proposed a switch migration protocol driven by controller load.
Is complex and does not support crash tolerance for the master controller.

Proposed Method
Dynamically shift the load across multiple controllers: a different controller can be set as master for each individual switch, by means of switch migration.
Support crash tolerance for controllers: a distributed architecture, automatic failover in the event of a failure, and JGroups-based communication.
Simplify switch management by grouping: each controller only manages switches in its own group, and switch migration is performed within a group.

Proposed Architecture
[Figure: a global controller cluster with a global DB, local controller clusters with local DBs, and OpenFlow switches in Group A, Group B and Group C.]

Global controller cluster
Based on the global JGroups channel.
Shares global data (tenant information, user data, etc.) through the global DB.
Provides a global view of the network to the upper controller.
Can be considered a logically centralized controller plane.

Local controller cluster
Reduces network delay and communication traffic.
Synchronizes network status (switch-controller mapping, links, ports) through the local DB.
Performs controller load scheduling.
Coordinates switch migration by setting the master/slave role for switches.
Dynamically shifts the load across the multiple controllers.

Implementation
Controller structure (OpenDaylight based).
A: Load Monitoring Module. Collects and calculates controller load.
B: Load Scheduling Module. Selects the master controller.
C: Switch Migration Module. Performs switch migration.
[Figure: controller stack with applications, a distributed key-value store (Infinispan), event notification (JGroups), OpenDaylight APIs, modules A, B and C, Link Discovery, Switch Manager, Host Manager, OpenDaylight Core, and the OpenFlow driver.]

A. Load Calculation
Coordinator: collects and computes load information.
Controller load: a switch metric and a server metric.
Switch metric: the number of active switches and the packet request rate.
Server metric: usage of CPU, memory, and network bandwidth.
[Figure: a coordinator with OFC1, OFC2, OFC3 and OFC4 in a local controller cluster.]
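The slide names the inputs to the controller load but not how they are combined. As a minimal sketch, assuming a simple weighted average of normalized metrics (the weights, the normalization limits `max_switches` and `max_rate`, and all names below are hypothetical and not taken from the slides), a coordinator could compute one load value per controller like this:

```python
from dataclasses import dataclass


@dataclass
class ControllerStats:
    """Raw measurements reported by one controller (field names are illustrative)."""
    name: str
    active_switches: int
    packet_in_rate: float  # packet requests per second
    cpu: float             # utilization in [0, 1]
    memory: float          # utilization in [0, 1]
    bandwidth: float       # utilization in [0, 1]


def controller_load(stats, max_switches=8, max_rate=10000.0, switch_weight=0.5):
    """Combine the switch metric and the server metric into one value in [0, 1].

    Assumption: both metrics are plain averages and are mixed with a fixed
    weight; the slides do not specify the actual formula.
    """
    switch_metric = 0.5 * min(stats.active_switches / max_switches, 1.0) \
        + 0.5 * min(stats.packet_in_rate / max_rate, 1.0)
    server_metric = (stats.cpu + stats.memory + stats.bandwidth) / 3.0
    return switch_weight * switch_metric + (1.0 - switch_weight) * server_metric


ofc1 = ControllerStats("OFC1", active_switches=6, packet_in_rate=6000,
                       cpu=0.8, memory=0.5, bandwidth=0.2)
ofc2 = ControllerStats("OFC2", active_switches=2, packet_in_rate=1000,
                       cpu=0.2, memory=0.2, bandwidth=0.2)
loads = {s.name: controller_load(s) for s in (ofc1, ofc2)}
```

Normalizing every input to the range 0 to 1 keeps the switch and server metrics comparable, so a single weight is enough to tune their relative influence.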

B. Load Scheduling
When to migrate, and which controller should be elected as master: the lightest-load controller.
Which switches should be selected to migrate: switch selection based on dynamic round-trip time feedback.
Perform switch migration.
[Figure: OFC1, OFC2 and OFC3, with controller failover and the addition of new switches as events that require a new master to be chosen.]
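The scheduling decision above can be sketched as follows. This is only an illustration under stated assumptions: the load-difference threshold that triggers a migration, and the rule of picking the switch with the largest measured round-trip time on the heaviest controller, are hypothetical stand-ins for details the slides do not give.

```python
def plan_migration(loads, switches_by_controller, rtt_ms, threshold=0.2):
    """Return (switch, source, target), or None when no migration is needed.

    loads: controller name -> load value
    switches_by_controller: controller name -> list of switches it masters
    rtt_ms: switch name -> last measured controller-switch round-trip time
    threshold: hypothetical minimum load gap that justifies a migration
    """
    heaviest = max(loads, key=loads.get)
    lightest = min(loads, key=loads.get)  # elected as the new master
    if loads[heaviest] - loads[lightest] < threshold:
        return None
    candidates = switches_by_controller.get(heaviest, [])
    if not candidates:
        return None
    # Assumption: the switch with the worst RTT feedback is migrated first.
    switch = max(candidates, key=lambda s: rtt_ms[s])
    return switch, heaviest, lightest


plan = plan_migration(
    loads={"OFC1": 0.8, "OFC2": 0.3, "OFC3": 0.5},
    switches_by_controller={"OFC1": ["s1", "s2"], "OFC2": ["s3"], "OFC3": ["s4"]},
    rtt_ms={"s1": 1.0, "s2": 3.5, "s3": 0.8, "s4": 0.9},
)
balanced = plan_migration(
    loads={"OFC1": 0.45, "OFC2": 0.40},
    switches_by_controller={"OFC1": ["s1"], "OFC2": ["s2"]},
    rtt_ms={"s1": 1.0, "s2": 1.0},
)
```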


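C. Switch Migration
Participants: Controller A (initial heaviest load), OpenFlow switch T, Controller B (initial lightest load).
Before migration: A is master for T, B is slave for T.
Message sequence: Switch T migration Request; Role_request to Master; Role_reply for Master; Switch T migration Reply.
After migration: A is slave for T, B is master for T.
Failover: Role_request to Master, then Role_reply to Master; the interval between them is the failover time, after which the requesting controller is master for T.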
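The handshake on this slide can be simulated in a few lines. The sketch below is an assumption-laden model, not the authors' implementation: the `Switch` class, the message strings, and the direction of the migration request (from the heaviest-load controller A to the lightest-load controller B) are illustrative. The one behaviour taken from the OpenFlow specification is that a switch demotes its current master to slave when another controller successfully requests the master role.

```python
class Switch:
    """Toy model of a switch's per-controller role table."""

    def __init__(self, name, roles):
        self.name = name
        self.roles = dict(roles)  # controller name -> "MASTER" or "SLAVE"

    def role_request(self, controller, role):
        """Apply a role request and return a role reply tuple."""
        if role == "MASTER":
            # Only one master at a time: demote the previous one.
            for other, current in self.roles.items():
                if current == "MASTER" and other != controller:
                    self.roles[other] = "SLAVE"
        self.roles[controller] = role
        return ("ROLE_REPLY", controller, role)


def migrate(switch, source, target):
    """Move mastership of `switch` from `source` to `target`; return a message log."""
    log = [f"{source} -> {target}: migration request for {switch.name}"]
    log.append(f"{target} -> {switch.name}: role request (MASTER)")
    reply = switch.role_request(target, "MASTER")
    log.append(f"{switch.name} -> {target}: role reply ({reply[2]})")
    log.append(f"{target} -> {source}: migration reply for {switch.name}")
    return log


switch_t = Switch("T", {"A": "MASTER", "B": "SLAVE"})
messages = migrate(switch_t, source="A", target="B")
```

After the call, `switch_t.roles` maps A to slave and B to master, matching the end state shown on the slide.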

Preliminary evaluation (1/2)
The switch migration process takes about 2 ms.
[Figure: the switch migration sequence between Controller A (initial heaviest load), OpenFlow switch T and Controller B (initial lightest load), annotated with 2 ms.]

Preliminary evaluation (2/2)
The controller failover process takes about 20 ms on average and is mostly affected by the failure detection provided by JGroups.
[Figure: the failover sequence between Controller A, OpenFlow switch T and Controller B: failure detection, Role_request to Master, Role_reply to Master; failover time 20 ms.]
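To make the two phases of failover concrete (failure detection, then a role request from a surviving controller), here is a small sketch. It does not use the JGroups API; the heartbeat-timeout detector, the choice of the lightest-load survivor as the new master, and all names are assumptions made for illustration.

```python
def detect_failed(last_heartbeat, now, timeout):
    """Return controllers whose last heartbeat is older than `timeout` seconds."""
    return [c for c, seen in last_heartbeat.items() if now - seen > timeout]


def failover(mapping, loads, failed):
    """Reassign switches mastered by failed controllers to the lightest survivor.

    mapping: switch name -> current master controller
    loads: controller name -> load value
    failed: iterable of failed controller names
    """
    failed = set(failed)
    survivors = {c: load for c, load in loads.items() if c not in failed}
    if not survivors:
        raise RuntimeError("no surviving controller available")
    new_mapping = dict(mapping)
    for switch, master in mapping.items():
        if master in failed:
            # The survivor would send Role_request (Master) to this switch.
            new_mapping[switch] = min(survivors, key=survivors.get)
    return new_mapping


failed = detect_failed({"A": 0.0, "B": 9.5}, now=10.0, timeout=3.0)
new_mapping = failover({"s1": "A", "s2": "B"}, {"A": 0.9, "B": 0.2}, failed)
```

In this model the detection timeout dominates the total failover time, which is consistent with the slide's remark that failover is mostly affected by failure detection.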

Evaluation environment
[Figure: Host 1 (OFC A) and Host 2 (OFC B); Host 3; VM1 (Iperf client) and VM2 (Iperf server); eight switches SW 1 to SW 8, each with an attached host (Mininet network); Host 4 (traffic generator).]

Evaluation
Three kinds of workloads (switch load): Workload A 1000 pps, 2000 pps, 4000 pps; Workload B 6000 pps; Workload C 8000 pps.
Machine specifications (columns: Controller Node, Traffic Generator, Evaluation Node)
OS: Ubuntu-server 12.04; CentOS 6.5 64bit
CPU: Core i5 (4 cores); Core i7 (1 core)
Memory: 16 GB; 8 GB
Network: 100 Mbps Ethernet
OpenFlow switch: -; Open vSwitch 1.10.0

Results: Throughput
static: existing method (static switch-controller mapping).
proposal: dynamic switch migration.
[Figure: throughput (Mbit/sec) over run time (sec) for Workloads A, B and C, with the load difference marked. Switch assignments shown: OFC A 6 switches and OFC B 2 switches; OFC A 5 switches and OFC B 3 switches; OFC A 7 switches and OFC B 1 switch.]

Results: Response Time
Response time: VM to OFC B (Ping).
Packet loss.
[Figure: cumulative distribution function (CDF) of response time (msec) for Workloads A, B and C, each with the static mapping and with the proposal.]

Conclusion & Future Work
Proposed a scalable and crash-tolerant load balancing method based on switch migration for multiple OpenFlow controllers.
It enables the controllers to coordinate their actions and dynamically shift the load across multiple controllers.
It improves the throughput and response time of the control plane.
Future Work
Optimize the load scheduling module.
Implement a topology-aware switch migration algorithm to improve scalability in real large-scale networks.
Evaluate the performance with various applications and topologies under more practical traffic.