1 ONOS Open Network Operating System An Experimental Open-Source Distributed SDN OS Pankaj Berde, Umesh Krishnaswamy, Jonathan Hart, Masayoshi Kobayashi, Pavlin Radoslavov, Pingping Lin, Sean Corcoran, Tim Lindberg, Rachel Sverdlov, Suibin Zhang, William Snow, Guru Parulkar

2 Agenda
- Overview of ONRC (Open Networking Research Center)
- ONOS
  - Architecture
  - Scale-out and high availability
  - Network graph as north-bound abstraction
  - DEMO
  - Consistency models
  - Development and test environment
  - Performance
  - Next steps

3 Leadership
- Nick McKeown: KP, Mayfield, Sequoia Professor, Stanford
- Larry Peterson: Bob Kahn Professor, Princeton; Chief Architect, ON.LAB
- Scott Shenker: Professor, UC Berkeley; Chief Scientist, ICSI
- Credentials: National Academy of Engineering members, ACM SIGCOMM Award winners, Fellows of IEEE and ACM, entrepreneurs with impact on the practice of networking/cloud

4 Stanford/Berkeley SDN Activities With Partners
- Platform development: OpenFlow spec (v0.8.9, v1.0, v1.1); reference switch (NetFPGA, software); network OS (NOX, SNAC, Beacon); virtualization (FlowVisor, FlowVisor in Java); tools (test suite, oftrace, Mininet, measurement tools); GENI software suite (Expedient / Opt-in Manager / FOAM)
- Deployment: Stanford production network (~45 switches/APs, ~25 users in the McKeown group, CIS/EE building); US R&E community (GENI: 8 universities + Internet2 + NLR); many other campuses; over 68 countries (Europe, Japan, China, Korea, Brazil, etc.)
- Demos: SDN concept (SIGCOMM08, Best Demo); VM migration (GEC3, Best Demo); Trans-Pacific VM migration (SIGCOMM09); Baby GENI (GEC6); nation-wide GENI (GEC9); "The OpenFlow Show" (IT World) at Interop

5 Scaling of SDN Innovation
- Standardize OpenFlow and promote SDN; ~100 members from all parts of the industry
- Bring the best SDN content and facilitate high-quality dialogue; 3 successive sold-out events with participation of the ecosystem
- Build a strong intellectual foundation; bring open-source SDN tools/platforms to the community
- SDN Academy: bring the best SDN training to companies to accelerate SDN development and adoption

6 ONRC Organizational Structure
- Research: Stanford (Nick McKeown, Guru Parulkar, Sachin Katti); Berkeley (Scott Shenker, Sylvia Ratnasamy); PhD students/postdocs
- Open Network Lab: Exec Director Guru Parulkar; VP Eng Bill Snow; Chief Architect Larry Peterson; engineers/tech leads (includes the PlanetLab team)
- Output: tools/platforms for the SDN community; OpenCloud demonstration of XaaS and SDN

7 Mission
Bring innovation and openness to the internet and cloud infrastructure with open-source tools and platforms

8 Tools & Platforms
- Network hypervisor: FlowVisor, OpenVirteX
- Forwarding/emulation: Mininet (Cluster Edition)
- Network OS: ONOS
- Apps: SDN-IP peering
- Test & debug: TestON with debugging support, NetSight
- Layered behind open interfaces, alongside 3rd-party components

9 Open Network OS (ONOS)
- Architecture
- Scale-out and high availability
- Network graph as north-bound abstraction
- DEMO
- Consistency models
- Development and test environment
- Performance
- Next steps

10 ONOS: Executive Summary
Status:
- Distributed Network OS: network graph northbound abstraction, horizontally scalable, highly available, built from open-source components
- Version: Flow API, shortest-path computation, sample application
- Build & QA: Jenkins, sanity tests, perf/scale tests, CHO
- Deployment in progress at REANNZ (SDN-IP peering)
Next:
- Exploring performance & reactive computation frameworks
- Expand the graph abstraction to more types of network state
- Control functions: intra-domain & inter-domain routing
- Example use cases: traffic engineering, dynamic virtual networks on demand, …

11 ONOS – Architecture Overview

12 Open Network OS Focus (started in Summer 2012)
- Global network view
- Scale-out design
- Fault tolerance
- (Diagram: a Network OS hosting routing, TE, and mobility apps, controlling packet-forwarding elements and programmable base stations via OpenFlow)

13 Prior Work
- ONIX: distributed control platform for large-scale networks; focus on reliability, scalability, and generality; scale-out NOS focused on network virtualization in data centers; state distribution primitives, global network view, ONIX API
- Other work: Helios (NEC), Midonet (Midokura), HyperFlow, Maestro, Kandoo; the NOX, POX, Beacon, Floodlight, and Trema controllers
- The community needs an open-source distributed SDN OS

14 ONOS High-Level Architecture
- Network graph: Titan Graph DB on a Cassandra in-memory DHT (eventually consistent)
- Distributed registry: Zookeeper (strongly consistent)
- OpenFlow controller: Floodlight drivers
- (Diagram: three ONOS instances share the network graph and registry; hosts attach to switches managed over OpenFlow)

15 Scale-out & HA

16 ONOS Scale-Out
- Distributed Network OS instances (1, 2, 3) share a global network view above the data plane
- Each instance is responsible for maintaining a part of the network graph
- Control capacity can grow with network size or application need

17 ONOS Control Plane Failover
- The distributed registry records, per switch, a master instance and candidate instances (e.g. master of switch A = ONOS 1; candidates = ONOS 2, ONOS 3)
- When instance 1 fails, the registry entry for switch A transitions to master = NONE (candidates = ONOS 2, ONOS 3)
- A new master is then elected: master of switch A = ONOS 2 (candidate = ONOS 3)
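The failover sequence above can be sketched as a tiny state machine over a registry entry. This is a hypothetical illustration, not the real ONOS registry API; instance names follow the slide.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of one switch's registry entry: a master instance
// plus an ordered list of candidate instances.
class MastershipRegistry {
    private String master;                       // e.g. "ONOS 1", null = NONE
    private final Deque<String> candidates = new ArrayDeque<>();

    MastershipRegistry(String master, String... backups) {
        this.master = master;
        for (String b : backups) candidates.addLast(b);
    }

    String master() { return master == null ? "NONE" : master; }

    // Called when the current master instance fails: mastership drops to
    // NONE, then the first candidate is elected, mirroring the slide's
    // ONOS 1 -> NONE -> ONOS 2 transition for switch A.
    void masterFailed() {
        master = null;                           // master of switch A = NONE
        master = candidates.pollFirst();         // next candidate elected
    }
}
```

For switch A with master ONOS 1 and candidates ONOS 2 and ONOS 3, a failure of ONOS 1 leaves ONOS 2 as master and ONOS 3 as the remaining candidate.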

18 Network Graph

19 ONOS Network Graph Abstraction
- The network graph is exposed through Titan Graph DB, backed by a Cassandra in-memory DHT
- (Diagram: switch vertices A, B, C with ids 1-3, connected by labeled edges with ids 101-106)

20 Network Graph
- Network state is naturally represented as a graph
- The graph has basic network objects: switches, ports, links, devices, and hosts (a switch has ports; a port links to another port; a host/device sits on a port)
- An application writes to this graph and programs the data plane
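A minimal sketch of these graph objects, assuming illustrative class and field names (the real ONOS model is stored in Titan on Cassandra, not in plain Java objects):

```java
import java.util.ArrayList;
import java.util.List;

// "switch on port": a switch owns a set of ports.
class Switch {
    final String dpid;
    final List<Port> ports = new ArrayList<>();
    Switch(String dpid) { this.dpid = dpid; }
    Port addPort(int n) { Port p = new Port(dpid, n); ports.add(p); return p; }
}

// "port link port": a port records the ports it is linked to.
class Port {
    final String switchDpid;
    final int number;
    final List<Port> links = new ArrayList<>();
    Port(String dpid, int n) { switchDpid = dpid; number = n; }
}

// "host on port": a host/device attaches to a port.
class Host {
    final String mac;
    final Port attachedTo;
    Host(String mac, Port p) { this.mac = mac; attachedTo = p; }
}
```

An application traverses these objects (switch to port to linked port) instead of querying switches directly.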

21 Example: Path Computation App on the Network Graph
- The application computes a path by traversing links from source to destination
- The application writes a flow path with one flow entry (switch, in-port, out-port) per switch along the path
- The path computation app therefore does not need to worry about topology maintenance
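The traversal described above can be sketched as a breadth-first search over a switch-level topology that emits one flow entry per hop. The topology, port numbers, and method names here are invented for illustration; they are not ONOS APIs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

class PathComputation {
    // adjacency: switch -> (neighbor switch -> {local out-port, neighbor in-port})
    static final Map<String, Map<String, int[]>> topo = new HashMap<>();

    static void link(String a, int pa, String b, int pb) {
        topo.computeIfAbsent(a, k -> new LinkedHashMap<>()).put(b, new int[]{pa, pb});
        topo.computeIfAbsent(b, k -> new LinkedHashMap<>()).put(a, new int[]{pb, pa});
    }

    // BFS shortest path by hop count, returned as a list of switch dpids.
    static List<String> shortestPath(String src, String dst) {
        Map<String, String> prev = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>(List.of(src));
        prev.put(src, src);
        while (!queue.isEmpty()) {
            String cur = queue.poll();
            if (cur.equals(dst)) break;
            for (String next : topo.getOrDefault(cur, Map.of()).keySet())
                if (prev.putIfAbsent(next, cur) == null) queue.add(next);
        }
        if (!prev.containsKey(dst)) return List.of();
        LinkedList<String> path = new LinkedList<>();
        for (String s = dst; !s.equals(src); s = prev.get(s)) path.addFirst(s);
        path.addFirst(src);
        return path;
    }

    // One "switch in-port out-port" flow entry per hop along the path.
    static List<String> flowEntries(List<String> path, int inEdgePort, int outEdgePort) {
        List<String> entries = new ArrayList<>();
        int in = inEdgePort;
        for (int i = 0; i < path.size() - 1; i++) {
            int[] ports = topo.get(path.get(i)).get(path.get(i + 1));
            entries.add(path.get(i) + " in:" + in + " out:" + ports[0]);
            in = ports[1];
        }
        entries.add(path.get(path.size() - 1) + " in:" + in + " out:" + outEdgePort);
        return entries;
    }
}
```

In ONOS the application would write these flow entries back into the network graph rather than returning them, leaving topology maintenance to the system.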

22 Example: A Simpler Abstraction on the Network Graph?
- A logical crossbar exposes only edge ports as virtual network objects, backed by the real network objects (physical switches, ports, links, hosts)
- An app or service on top of ONOS maintains the mapping from the simpler virtual view to the complex physical one
- This makes applications even simpler and enables new abstractions

23 Network Graph Representation
- A vertex is represented as a Cassandra row; an edge as a Cassandra column (column name: label id + direction, forming the primary key; value: edge id, vertex id, signature properties, other properties)
- Row indices allow fast vertex-centric queries
- Example sizes: a switch vertex carries 3 properties, a flow-path vertex 10, a flow-entry vertex 11

24 Network Graph and Switches
- The Switch Manager (SM) learns switches over OpenFlow and writes them into the network graph

25 Network Graph and Link Discovery
- The Link Discovery (LD) module discovers links using LLDP and writes them into the network graph

26 Devices and the Network Graph
- The Device Manager (DM) learns hosts/devices from PACKET_IN events and writes them into the network graph

27 Path Computation with the Network Graph
- The Path Computation (PC) module computes flow paths (Flow 1 … Flow 8) over the network graph and writes the corresponding flow entries

28 Network Graph and the Flow Manager
- The Flow Manager reads flow paths and flow entries (Flow 1 … Flow 8) from the network graph and programs switches with OpenFlow FLOW_MOD messages

29 ONOS High-Level Architecture (recap)
- Network graph: Titan Graph DB on a Cassandra in-memory DHT (eventually consistent)
- Distributed registry: Zookeeper (strongly consistent)
- OpenFlow controller: Floodlight drivers

30 DEMO

31 Consistency Deep Dive

32 Consistency Definition
- Strong consistency: after an update to the network state by one instance, all subsequent reads by any instance return the last updated value
- Strong consistency adds complexity and latency to distributed data management
- Eventual consistency is a slight relaxation, allowing readers to lag behind for a short period of time

33 Strong Consistency Using the Registry
- Timeline: initially every instance reads master of switch A = NONE from the registry
- A master is elected for switch A; after the delay of locking & consensus, instances 1, 2, and 3 all read master of switch A = ONOS 1
- Because the registry is strongly consistent, no instance reads a stale master once the election commits

34 Why Strong Consistency is Needed for Master Election
- Weaker consistency could mean that a master election on instance 1 is not yet visible on other instances
- That can lead to multiple masters for a switch
- Multiple masters would break our semantics of control isolation
- Strong locking semantics are therefore needed for master election
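The locking semantic can be illustrated with an atomic compare-and-set standing in for the Zookeeper-backed registry: only one instance can ever win mastership of a switch, even under a race. This is a hedged sketch, not ONOS code.

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch of an atomic master election for one switch. compareAndSet acts as
// the strongly consistent lock: the transition NONE -> instance happens at
// most once, so two instances can never both believe they are master.
class MasterElection {
    private final AtomicReference<String> master = new AtomicReference<>(null);

    // Returns true only for the single instance that wins the election.
    boolean tryBecomeMaster(String instance) {
        return master.compareAndSet(null, instance);
    }

    String master() { return master.get(); }
}
```

Under eventual consistency there is no such atomic transition, which is exactly how two instances could briefly both claim mastership.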

35 Eventual Consistency in the Network Graph
- Timeline: initially all instances read switch A state = INACTIVE from the DHT
- Switch A connects to ONOS; instance 1 writes state = ACTIVE while instances 2 and 3 still read INACTIVE
- After the delay of eventual consensus, all instances read switch A state = ACTIVE
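The timeline above can be simulated with a toy replicated store in which a write is visible locally at once but reaches the other replicas only when propagation runs. This is illustrative only, not the real Cassandra DHT.

```java
import java.util.HashMap;
import java.util.Map;

// Toy eventually consistent store: one map per ONOS instance.
class EventuallyConsistentStore {
    private final Map<String, String>[] replicas;

    @SuppressWarnings("unchecked")
    EventuallyConsistentStore(int n) {
        replicas = new HashMap[n];
        for (int i = 0; i < n; i++) replicas[i] = new HashMap<>();
    }

    // A write lands on the writer's replica immediately...
    void put(int instance, String key, String value) {
        replicas[instance].put(key, value);
    }

    // ...and reaches the other replicas only when propagation runs,
    // modeling the "delay of eventual consensus" on the slide.
    void propagate(int fromInstance) {
        for (Map<String, String> r : replicas) r.putAll(replicas[fromInstance]);
    }

    String get(int instance, String key) {
        return replicas[instance].getOrDefault(key, "UNKNOWN");
    }
}
```

Between the put and the propagate, instance 1 reads ACTIVE while instances 2 and 3 still see the old state, which is the window the next slide prices out.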

36 Cost of Eventual Consistency
- A short delay means switch A's state is not yet ACTIVE on some ONOS instances in the previous example
- Applications on one instance will compute flows through switch A while other instances will not use switch A for path computation
- Eventual consistency becomes more visible during control-plane network congestion

37 Why is Eventual Consistency Good Enough for Network State?
- Physical network state changes asynchronously
- Strong consistency across the data and control planes is too hard
- Control apps know how to deal with eventual consistency
- In the current distributed control plane, each router makes its own decisions based on old information from other parts of the network, and it works fine
- Strong consistency is more likely to lead to inaccuracy of network state, since network congestion is real

38 Consistency Learnings
- One consistency model does not fit all
- The consequences of delays need to be well understood
- More research is needed on handling various kinds of state under different consistency models

39 Development & test environment

40 ONOS Development & Test Cycle
- Source code on GitHub
- Agile: 3-4 week sprints
- Mostly Java, plus many utility scripts
- CI: Maven, Jenkins, JUnit, coverage, TestON
- Vagrant-based development VM
- Daily 4-hour Continuous Hours of Operation (CHO) tests as part of the build
- Several CHO cycles simulate rapid churn in the network and failures of ONOS instances

41 ONOS Development Environment
A single installation script creates a cluster of VirtualBox VMs

42 Test Lab Topology

43 ON.LAB ONOS Test Implementation
- The ON.LAB team has implemented the following automated tests:
  - ONOS unit tests (70% coverage)
  - ONOS system tests for functionality, scale, performance, and resiliency (85% coverage)
  - White-box network-graph performance measurements
- All tests are executed nightly in a Jenkins continuous-integration environment

44 Performance

45 Key Performance Metrics in a Network OS
- Network scale (# switches, # ports) -> delay and throughput for:
  - Link failure, switch failure, switch-port failure
  - Packet-in (requests to set up reactive flows)
  - Reading and searching the network graph
  - Network-graph traversals
  - Setup of proactive flows
- Application scale (# operations, # applications):
  - Number of network events propagated to applications (delay & throughput)
  - Number of operations on the network graph (delay & throughput)
  - Parallelism/threading for applications (parallelism on the network graph)
  - Parallel path-computation performance

46 Performance: Hard Problems
- Off-the-shelf open source does not perform
- Ultra-low-latency requirements are unique
- Distributed/parallel programming techniques must be applied to scale control applications
- Reactive control applications need an event-driven framework that scales

47 ONOS: Summary
Status:
- Distributed Network OS: network graph northbound abstraction, horizontally scalable, highly available, built from open-source components
- Version: Flow API, shortest-path computation, sample application
- Build & QA: Jenkins, sanity tests, perf/scale tests, CHO
- Deployment in progress at REANNZ (SDN-IP peering)
Next:
- Exploring performance & reactive computation frameworks
- Expand the graph abstraction to more types of network state
- Control functions: intra-domain & inter-domain routing
- Example use cases: traffic engineering, dynamic virtual networks on demand, …
