
CSci8211: SDN Controller Design: ONOS. NOS Case Study: ONOS (Open Network OS) by ON.LAB


1 CSci8211: SDN Controller Design: ONOS
NOS Case Study: ONOS (Open Network OS) by ON.LAB
• Prototype 1: focus on implementing a global network view
  - goals: scale-out & fault tolerance, but not performance
  - built using (generic) open-source platforms
• Prototype 2: focus on improving performance, esp. event latency
  - use RAMCloud for the data store, add a cache layer, & a “customized” data model

2 SDN Controller Design Questions
Some Key Questions & Issues:
• How to obtain global (network-wide) information?
• How to perform distributed state management?
  - time scales of state change dynamics? consistency issues?
• What are the configurations? Abstractions & APIs?
• How to implement such a Network OS?
  - And will it really work? E.g., response time & other performance issues?
• How to program control apps? E.g., an SDN programming language?
• Will it scale?
  - Not only in terms of network size, but also # flows, control apps, etc.?
• What about reliability & security issues?
• … (e.g., inter-operability, evolvability)
Are there some fundamental design principles we can adopt & apply?

3 SDN Controller Design
How to design a Network Operating System?
• What features or “abstractions” should be provided by this “Network Operating System”?
• In particular, what should be the “global network view” & “programmatic interfaces” provided to control apps?
  - or what “low-level” details should be handled by the Network OS?
• And what is the granularity of control allowed to “apps”?
Analogies (& possible differences?):
• computer OS and (high-level) programming models
  - computer architecture: instruction sets, CPU, memory, disks, I/O devices, ...
  - (high-level) programming language constructs: statements, data types, functions, …
  - OS: (virtual) memory, processes, I/O and drivers, system calls, …
• (distributed) file systems (or databases or data stores)
  - files, directories & permissions, transactions, relations & schemas; vs. disks, …

4 NOS Requirements for WANs

5 Prior Work
• Distributed control platform for large-scale networks: ONIX
  - closed source; datacenter + virtualization focus
  - ONOS design influenced by ONIX
• Single instance: NOX, POX, Beacon, Floodlight, Trema controllers
• Helios, Midonet, Hyperflow, Maestro, Kandoo, …
• Community needs an open source distributed network OS

6 ONOS: Open Network OS
[Figure: control apps (Routing, TE, Mobility) sit on the Global Network View provided by ONOS, which is labeled with Scale-out Design, Fault Tolerance, and Global Network View; OpenFlow connects ONOS to the packet forwarding elements and programmable base stations]

7 ONOS Scale-Out
[Figure: Distributed Network OS with Instance 1, Instance 2, and Instance 3 sharing the Network Graph (global network view) above the data plane]
• An instance is responsible for maintaining a part of the network graph
• Control capacity can grow with network size or application need

8 ONOS Control Plane Failover
[Figure: Distributed Network OS (Instance 1, Instance 2, Instance 3) with a Distributed Registry, controlling switches A to F and a host. The registry entry for switch A changes across the animation:]
1. Master Switch A = ONOS 1; Candidates = ONOS 2, ONOS 3
2. Master Switch A = NONE; Candidates = ONOS 2, ONOS 3
3. Master Switch A = ONOS 2; Candidates = ONOS 3
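The registry states on this slide (master ONOS 1, then NONE, then ONOS 2) can be mimicked with a small in-memory sketch. This is only an illustration: `Registry`, `release` and `elect` are hypothetical names, the "first candidate wins" rule is an assumption, and a real deployment would rely on the strongly consistent registry rather than a Python dict.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SwitchEntry:
    master: Optional[str]
    candidates: List[str] = field(default_factory=list)


class Registry:
    """In-memory stand-in for the distributed registry (hypothetical API)."""

    def __init__(self) -> None:
        self.switches: Dict[str, SwitchEntry] = {}

    def register(self, switch: str, instances: List[str]) -> None:
        # Assumption: the first instance listed becomes master, the rest wait.
        self.switches[switch] = SwitchEntry(instances[0], list(instances[1:]))

    def release(self, instance: str) -> None:
        # A failed instance loses mastership and candidacy everywhere.
        for entry in self.switches.values():
            if entry.master == instance:
                entry.master = None
            if instance in entry.candidates:
                entry.candidates.remove(instance)

    def elect(self, switch: str) -> Optional[str]:
        # Promote the first remaining candidate if the switch has no master.
        entry = self.switches[switch]
        if entry.master is None and entry.candidates:
            entry.master = entry.candidates.pop(0)
        return entry.master


registry = Registry()
registry.register("A", ["ONOS 1", "ONOS 2", "ONOS 3"])
registry.release("ONOS 1")        # Master Switch A = NONE
new_master = registry.elect("A")  # Master Switch A = ONOS 2
```

Splitting `release` from `elect` keeps the intermediate "NONE" state visible, as in the slide animation.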

9 High Level Architecture: Prototype I
[Figure: layered architecture across Instance 1, Instance 2, and Instance 3; labels from top to bottom:]
• Applications (control applications)
• Global Network View (Distributed Network State): Network Graph
  - Graph Database (Titan), accessed through the Blueprints API: transactional consistency
  - Distributed Key-Value Store (Cassandra): eventual consistency
• Scale-out / Coordination: Distributed Registry (Zookeeper), strongly consistent
• OpenFlow Manager (Floodlight) + Floodlight Drivers, one per instance
• Host

10 Network Graph

11 ONOS Network Graph Abstraction
[Figure: Network Graph on top of the Titan Graph DB, on top of the Cassandra in-memory DHT; example vertices with Id 1, Id 2, Id 3 (named A, C, B) connected by labeled edges with Ids 101 to 106]

12 Network Graph
[Figure: switch, port, device, and host objects connected by "on" and "link" edges]
• Network state is naturally represented as a graph
• Graph has basic network objects like switch, port, device and links
• Application writes to this graph & programs the data plane

13 Example: Path Computation App on Network Graph
[Figure: the network graph (switch, port, device, host; "on" and "link" edges) extended with a Flow path vertex and Flow entry vertices; "flow" edges link the flow path to its flow entries, and "switch", "inport", and "outport" edges link each flow entry to the graph]
• Application computes path by traversing the links from source to destination
• Application writes each flow entry for the path
• Thus the path computation app does not need to worry about topology maintenance
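A minimal sketch of this idea, assuming the graph is exposed as a plain dictionary of links rather than through the real graph database: the app walks links from source to destination with breadth-first search and emits one flow entry (switch, inport, outport) per hop. `LINKS` and `compute_flow_entries` are hypothetical names used only for illustration.

```python
from collections import deque

# Hypothetical topology: links[(switch, out_port)] = (neighbor, neighbor_in_port)
LINKS = {
    ("s1", 2): ("s2", 1),
    ("s2", 1): ("s1", 2),
    ("s2", 2): ("s3", 1),
    ("s3", 1): ("s2", 2),
}


def compute_flow_entries(links, src, dst):
    """src and dst are (switch, port) attachment points of the two hosts.

    Returns one flow entry per switch on a shortest path, or None if the
    destination switch is unreachable.
    """
    src_sw, src_port = src
    dst_sw, dst_port = dst
    # prev[sw] = (previous switch, its out port, in port on sw)
    prev = {src_sw: None}
    queue = deque([src_sw])
    while queue:
        sw = queue.popleft()
        if sw == dst_sw:
            break
        for (a, out_port), (b, in_port) in links.items():
            if a == sw and b not in prev:
                prev[b] = (a, out_port, in_port)
                queue.append(b)
    if dst_sw not in prev:
        return None
    # Walk back from the destination, writing a flow entry for each hop.
    entries = []
    sw, out_port = dst_sw, dst_port
    while prev[sw] is not None:
        a, a_out, in_port = prev[sw]
        entries.append({"switch": sw, "inport": in_port, "outport": out_port})
        sw, out_port = a, a_out
    entries.append({"switch": src_sw, "inport": src_port, "outport": out_port})
    entries.reverse()
    return entries


path = compute_flow_entries(LINKS, ("s1", 1), ("s3", 2))
```

The app only reads links and writes flow entries; keeping `LINKS` up to date is someone else's job, which is the point made on the slide.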

14 Network Graph Representation
[Figure: Flow path and Flow entry vertices (vertex with 10 properties, vertex with 11 properties) joined by a "flow" edge; Switch vertex with 3 properties]
• Vertex represented as Cassandra row
• Edge represented as Cassandra column
  - column/value layout: label id + direction, primary key, edge id, vertex id, signature properties, other properties
• Row indices for fast vertex-centric queries
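The row/column idea can be illustrated with nested dictionaries: one row per vertex, with properties and incident edges stored as columns of that row. This is a loose sketch under that assumption, not Titan's actual storage encoding; the column tuples and the helper names `add_vertex`, `add_edge`, and `neighbors` are invented for the example.

```python
def add_vertex(rows, vertex_id, **properties):
    # One row per vertex; each property is a column in that row.
    rows[vertex_id] = {("prop", name): value for name, value in properties.items()}


def add_edge(rows, edge_id, label, src, dst, **properties):
    # An edge is stored as a column in the rows of both endpoints,
    # keyed by label and direction so one row scan finds it.
    rows[src][("edge", label, "out", edge_id)] = {"vertex": dst, **properties}
    rows[dst][("edge", label, "in", edge_id)] = {"vertex": src, **properties}


def neighbors(rows, vertex_id, label, direction):
    # Vertex-centric query: only the row of vertex_id is read.
    return sorted(
        value["vertex"]
        for column, value in rows[vertex_id].items()
        if column[0] == "edge" and column[1:3] == (label, direction)
    )


rows = {}
add_vertex(rows, 1, type="switch", dpid="A")
add_vertex(rows, 2, type="port", number=1)
add_edge(rows, 101, "on", 2, 1)  # the port (vertex 2) is on the switch (vertex 1)
ports_on_switch = neighbors(rows, 1, "on", "in")
```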

15 Network Graph and Switches
[Figure: Switch Manager instances talk OpenFlow (OF) to the switches and populate "Network Graph: Switches"]

16 Network Graph and Link Discovery
[Figure: Link Discovery modules, next to the Switch Managers (SM), use LLDP and populate "Network Graph: Links"]

17 Devices and Network Graph
[Figure: Device Manager, next to the Switch Manager (SM) and Link Discovery (LD), uses PKTIN messages from hosts and populates "Network Graph: Devices"]

18 Path Computation with Network Graph
[Figure: Path Computation, next to SM, LD, and the Device Manager (DM), writes Flow 1 to Flow 8 and their flow entries into "Network Graph: Flow Paths"]

19 Network Graph and Flow Manager
[Figure: Flow Manager, next to SM, LD, DM, and Path Computation (PC), takes Flow 1 to Flow 8 and their flow entries from "Network Graph: Flows" and sends Flowmod messages to the switches]

20 Example: A simpler abstraction on network graph?
[Figure: virtual network objects (a Logical Crossbar with Edge Ports) mapped by "physical" edges onto real network objects (switch, port, device, host; "on" and "link" edges)]
• App or service on top of ONOS
• Maintains mapping from simpler to complex
• Thus makes applications even simpler and enables new abstractions
More on Network Graph later
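One way to picture "maintains mapping from simpler to complex" is a thin class that translates logical crossbar ports into physical (switch, port) pairs and then delegates to whatever installs paths. Everything here is hypothetical: `LogicalCrossbar`, `to_physical`, `connect`, and the `install_path` callback are illustrative names, not an ONOS API.

```python
class LogicalCrossbar:
    """One big virtual switch whose edge ports map to physical (switch, port) pairs."""

    def __init__(self, edge_ports):
        # edge_ports: logical port id -> (physical switch, physical port)
        self._edge_ports = dict(edge_ports)

    def to_physical(self, logical_port):
        return self._edge_ports[logical_port]

    def connect(self, in_port, out_port, install_path):
        # Translate the crossbar-level request and hand it to the lower layer.
        return install_path(self.to_physical(in_port), self.to_physical(out_port))


requests = []


def record_path(src, dst):
    # Stand-in for a real path installer; it only records the request.
    requests.append((src, dst))
    return len(requests)


crossbar = LogicalCrossbar({1: ("s1", 1), 2: ("s3", 2)})
crossbar.connect(1, 2, record_path)
```

An app written against the crossbar never sees switches s1 or s3; only the mapping layer does.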

21 Evaluation of Prototype I
• Evaluation Setting
  - ONOS cluster controls 100s of virtual switches, programming end-to-end flows using the network view
  - dynamically adding switches & ONOS instances to the cluster
  - failovers in response to ONOS instance shut-downs
  - rerouting in response to link failures
• Scalability & High Availability (HA)
  - ONOS scales with more controller instances
  - though it needs a low-latency distributed data store
• Consistency and Integrity
  - Titan maintains the graph’s structural integrity on top of Cassandra’s eventually consistent data store
  - can benefit from some degree of sequencing for deterministic state transitions

22 Evaluation of Prototype I …
• Low Performance and Visibility
  - latency for event handling much worse than expected
  - e.g., reacting to a link failure could take up to 30 seconds
  - diagnosing performance problems is hard due to the large, complex code base of generic open-source components
• Data Model Issues & Excessive Data Store Operations
  - Titan => model all data objects (e.g., ports, flow entries) as vertices
  - requires indexing vertices by type: index maintenance is a bottleneck
  - maintaining & storing references between many small objects leads to dozens of graph database update operations
  - mapping Titan to Cassandra => excessive data store operations
  - shared table & index: unnecessary contention among “independent” ops
• Polling for detecting changes in network state
  - high CPU load, increasing delay to events & info. exchange

23 ONOS Prototype II
• Lessons learned from Prototype 1
  - more efficient data models to reduce # of data store operations
  - fast notification & messaging across ONOS instances
  - also, simplified network view APIs
• Prototype 2: focus on improving performance, esp. event latency
• RAMCloud for data store: Blueprints APIs directly on top
  - RAMCloud: low-latency, distributed key-value store w/ 15-30 us read/write
• Optimized data model
  - table for each type of network object (switches, links, flows, …)
  - minimize # of references between elements: most updates require 1 r/w
• (In-memory) topology cache at each ONOS instance
  - eventually consistent: updates may be received at different times & in different orders
  - atomicity under schema integrity: no r/w by apps in the midst of an update
• Event notifications using Hazelcast, a pub-sub system
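The "table for each type" and "most updates require 1 r/w" points can be sketched with a toy key-value store that counts operations. `CountingStore` and `add_switch` are hypothetical; storing a switch together with its ports as a single value is an assumption made to show how an update can collapse into one write.

```python
class CountingStore:
    """Toy key-value store with one table per object type and op counters."""

    def __init__(self):
        self.tables = {}
        self.reads = 0
        self.writes = 0

    def write(self, table, key, value):
        self.tables.setdefault(table, {})[key] = value
        self.writes += 1

    def read(self, table, key):
        self.reads += 1
        return self.tables.get(table, {}).get(key)


def add_switch(store, dpid, ports):
    # Assumption: the switch and its ports are serialized as one value,
    # so no cross-object references need separate updates.
    store.write("switch", dpid, {"dpid": dpid, "ports": list(ports)})


store = CountingStore()
add_switch(store, "00:01", ports=[1, 2, 3, 4])
ops = (store.reads, store.writes)
```

Compare this with a generic graph model, where each port vertex and each switch-to-port edge would be a separate store operation.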

24 High Level Architecture: Prototype II
[Figure: layered architecture across Instance 1, Instance 2, and Instance 3; labels from top to bottom:]
• Applications (control applications)
• Network View API
• Global Network View (Distributed Network State): Network Graph, ONOS Graph Abstraction
  - Distributed Key-Value Store (RAMCloud): eventual consistency
• Event Notification (Hazelcast)
• Scale-out / Coordination: Distributed Registry (Zookeeper), strongly consistent
• OpenFlow Manager (Floodlight) + Floodlight Drivers, one per instance
• Host

25 ONOS Graph Representation and Topology Cache
• Network Topology State and Graph Representation
  - ONOS keeps track of info about the infrastructure, making it available to control apps
  - both protocol-agnostic & protocol-specific network elements & state representations; can be translated from one to the other
  - model objects (protocol-agnostic) and providers (protocol-specific)
  - model object dependencies: device is a first-class entity
• A table for each type of object; a distributed store
  - switch, host, port, link, edgelink, path, flow entry, …
• Topology Cache
  - each instance keeps an in-memory cache of the entire network topology (i.e., not only the part of the store under its mastership)
  - apply updates atomically to maintain integrity
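A minimal sketch of "apply updates atomically", assuming a single-process cache guarded by a lock: a batch of changes is staged on a copy and swapped in only if every change applies, so readers never see a half-applied update. `TopologyCache`, `apply_update`, and `snapshot` are hypothetical names, and the copy-then-swap strategy is one possible design, not necessarily what ONOS does.

```python
import threading


class TopologyCache:
    """In-memory topology cache; each batch of changes is applied all-or-nothing."""

    def __init__(self):
        self._lock = threading.Lock()
        self._topology = {"switch": {}, "port": {}, "link": {}}

    def apply_update(self, changes):
        """changes: list of (op, table, key, value) with op in {"put", "delete"}."""
        with self._lock:
            # Stage the batch on a copy so a failure leaves the cache untouched.
            staged = {table: dict(objs) for table, objs in self._topology.items()}
            for op, table, key, value in changes:
                if op == "put":
                    staged.setdefault(table, {})[key] = value
                elif op == "delete":
                    staged.setdefault(table, {}).pop(key, None)
                else:
                    raise ValueError(f"unknown op: {op}")
            self._topology = staged  # swap in the fully applied batch

    def snapshot(self):
        # Readers take the same lock, so they never observe a partial batch.
        with self._lock:
            return {table: dict(objs) for table, objs in self._topology.items()}


cache = TopologyCache()
cache.apply_update([
    ("put", "switch", "s1", {"dpid": "s1"}),
    ("put", "port", ("s1", 1), {"state": "up"}),
])
view = cache.snapshot()
```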

26 ONOS Global Network View
[Figure: Applications observe and program the Network State]
• Topology (Switch, Port, Link, …)
• Network Events (Link down, Packet In, …)
• Flow state (Flow table, connectivity paths, ...)
• Network view objects: Switch, Port, Link, Host, Intent, FlowPath, FlowEntry

27 ONOS Topology Cache
[Figure: the topology cache, annotated “Tell me about your slice?”]

28 Evaluation of Prototype II
• Basic Network State Changes
• Evaluation Setting: 3-node ONOS cluster (w/ 10 Gb/s Ethernet connectivity) & a WAN of 81 OF switches w/ avg. 4 ports
  - RAMCloud with generic graph data model: 1 read + 8 writes to add a switch
  - RAMCloud with the new data model: 1 write to add a switch & port
[Table: results reported w/ Kryo and w/ Google Protocol Buffers; values not recoverable from the transcript]

29 Evaluation of Prototype II …
• Handling Network Events
• Evaluation Setting: 6-node ONOS cluster; Mininet network of 206 software switches, 416 links
  - 16,000 flows; one interface fails: reroute 1,000 flows, 6 -> 7 hops

30 Evaluation of Prototype II …
• Path Installation
• Evaluation Setting: 6-node ONOS cluster; Mininet network of 206 software switches, 416 links
  - 15,000 flows preinstalled; add 1,000 6-hop flows
• Latency (table 3) and Throughput (derived from table 3)
  - median throughput: 18,832 paths/sec

31 ONOS Demo on Internet2
• 6-node ONOS cluster, Mininet topology, 1,000 affected flows, 6-hop path
• Reaction Time: 45.2 ms (median), 75.8 ms (99th percentile)
• Total Time to Reroute: 71.2 ms (median), 116 ms (99th percentile)

32 ONOS Summary
• Control isolation (sharding)
  - Divide network into parts and control them exclusively
  - Load balancing -> we can do more
• Distributed data store
  - Scales with controller nodes with HA -> though we need a low-latency distributed data store
• Dynamic controller assignment to parts of network
  - Dynamically assign which part of network is controlled by which controller instance -> we can do better with sophisticated algorithms
• Graph abstraction of network state
  - Easy to visualize and correlate with topology
  - Enables several standard graph algorithms

