Verifying & Testing SDN Data & Control Planes: Header Space Analysis.

SDN Stack
State layers hold a representation of the network's configuration.
Code layers implement the logic that maintains the mapping between two state layers.
State layers: Policy, Logical View, Physical View, Device State, Hardware
Code layers: App, Network Hypervisor, Network OS, Firmware, HW

Troubleshooting Workflow

Tools for Finding the Code Layer
A: Actual Behavior ~ Policy? Automatic Test Packet Generation, Network Debugger
B: Device State ~ Policy? Anteater, Header Space Analysis, VeriFlow
C: Physical View ~ Device State? OFRewind
D: Device State ~ Hardware? SOFT
E: Logical View ~ Physical View? Correspondence Checking

Tools for Localizing within a Code Layer
Within the controller (control plane): OFRewind, Retrospective Causal Inference, Synoptic, NICE, VeriCon, …
Within the switch firmware (data plane): Network Debugger, Automatic Test Packet Generation, SOFT, …
– Systematic troubleshooting
– From design (policy specification & verification) through configuration analysis, runtime dynamic analysis, troubleshooting, …
– Verification vs. testing vs. debugging vs. troubleshooting, …
– Summarize tools and reveal missing tools

Some Existing Approaches to SDN Verification & Testing
Specification: CTL, NoD, Klee Assert
Policy Language, Semantics: NetKAT, NetCore
Verification (e.g. reachability)
– Data Plane: Anteater, VeriFlow, HSA, Atomic Predicates, First-order + Transitive Closure, local checks
– Control Plane: VeriCon, BatFish
Synthesis (e.g. forwarding): One big switch, VLAN, NetCore, NetKAT, FlowLog
Testing: NICE, ATPG

SDN: Verification & Testing
Data Plane Verification:
– Given data plane FIBs f, check that every packet p satisfies the designed network spec/property
– If not, generate one counterexample or all counterexamples
– Or provide optimizations or help adding functionality
Data Plane Testing:
– Detour via Symbolic Testing
Control Plane Verification:
– Given configuration c, for every packet & network environment, verify that the control plane generates correct data plane states meeting the desired network specs/properties
Control Plane Testing:
Synthesis:
– Design a control plane CP that, for every packet p and network state s, satisfies Φ (policy specs, e.g., reachability, no loops, waypoint traversal) by construction
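To make the data plane verification task concrete, here is a minimal sketch that checks loop freedom by brute force over a 3-bit header space. The topology, the rule encoding in `fibs`, and the helper names `matches`, `trace`, and `verify` are all hypothetical and invented for this illustration. Tools such as HSA or VeriFlow reason about sets of headers symbolically instead of enumerating packets one by one.

```python
from itertools import product

# Hypothetical FIBs: switch -> ordered list of (match, next_hop); first match wins.
# A match is a string over {'0', '1', 'x'}, where 'x' is a wildcard bit.
# next_hop is another switch name, "DELIVER", or "DROP".
fibs = {
    "s1": [("1xx", "s2"), ("0xx", "DROP")],
    "s2": [("11x", "s3"), ("10x", "s1")],  # 10x bounces back to s1: a loop
    "s3": [("xxx", "DELIVER")],
}

def matches(pattern, header):
    """True if every pattern bit is a wildcard or equals the header bit."""
    return all(p in ("x", h) for p, h in zip(pattern, header))

def trace(fibs, start, header):
    """Follow one concrete header through the FIBs; return (outcome, path)."""
    path, node = [], start
    while node not in ("DELIVER", "DROP"):
        if node in path:                      # revisiting a switch means a loop
            return "LOOP", path + [node]
        path.append(node)
        node = next((hop for pat, hop in fibs[node] if matches(pat, header)), "DROP")
    return node, path

def verify(fibs, start, width):
    """Enumerate every header and collect all counterexamples to 'no loops'."""
    results = {}
    for bits in product("01", repeat=width):
        header = "".join(bits)
        results[header] = trace(fibs, start, header)
    loops = sorted(h for h, (outcome, _) in results.items() if outcome == "LOOP")
    return results, loops

results, loops = verify(fibs, "s1", 3)
print("looping headers:", loops)  # ['100', '101']
```

Returning every looping header corresponds to the "all counterexamples" option on the slide; stopping at the first one would give the "one counterexample" option.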

Verification: Values and Obstacles
 | Hardware | Software | Networks
 | Chips | Devices (PC, phone) | Service
Bugs are: | Burned into silicon | Exploitable, workarounds | Latent, exposed
Dealing with bugs: | Costly recalls | Online updates | Live site incidents
Obstacles to eradication: | Design complexity | Code churn, legacy, false positives | Topology, configuration churn
Value proposition: | Cut time to market | Safety/OS critical systems, quality of code base | Meet SLA, utilize bandwidth, enable richer policies

VeriCon: Towards Verifying Controller Programs in SDNs
Goal: guaranteeing network invariants to ensure correctness, safety, etc.
The network should always satisfy some invariants.
It is difficult to write an SDN application that always guarantees such invariants.

Limitations of Existing Approaches
1. Establish existence, but not absence, of bugs
– NICE (finite-state model checking): unexplored topologies may cause bugs to be missed
– HSA (check network snapshots): snapshots may not capture situations in which bugs exist
2. Runtime overhead
– VeriFlow & NetPlumber (check in real time): bugs only identified when the app is actually running

VeriCon
Verifies network-wide invariants for any event sequence and all admissible topologies
Input: SDN application in Core SDN + topology constraints & invariants in first-order logic
Verify conditions using the Z3 theorem prover
Output: guarantee that the invariants are satisfied, OR a concrete counterexample

Example: Stateful Firewall
Always forward from trusted to untrusted hosts
Only forward from untrusted to trusted hosts if a trusted host previously sent a packet to the untrusted host
[Figure: a switch between the trusted hosts and the untrusted hosts]

Core SDN (CSDN) Language
Define and initialize relations
– Topology: link(S, O, H), link(S1, I1, I2, S2)
– Forwarding: S.ft(Src → Dst, I → O), S.sent(Src → Dst, I → O)
Write event handlers: pktIn(S, Pkt, I)
– Update relation
– Install rule (insert into ft)
– Forward packet (insert into sent)
– If-then-else

CSDN: Built-In Relations Describing Network States

First-Order Formulas & Invariants
Three types of invariants:
– topo: defines the admissible topology
– safety: holds initially & is preserved after event executions
– trans: holds after an event execution
Example: no black hole

Example: Learning Switch Controller Code
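The controller code shown on this slide did not survive in the transcript. As a stand-in, the following sketch shows how a learning-switch packet-in handler is commonly written. It is not the CSDN program from the slide: the class name, the `connected` and `flow_table` dictionaries, and the `pkt_in` signature are assumptions made for this example.

```python
class LearningSwitchController:
    """Hypothetical learning-switch controller (not the slide's CSDN code)."""

    def __init__(self):
        self.connected = {}   # (switch, host) -> port where the host was last seen
        self.flow_table = {}  # (switch, src, dst) -> output port of an installed rule

    def pkt_in(self, switch, src, dst, in_port, all_ports):
        """Handle a packet-in event; return the ports the packet is sent to."""
        # Learn which port leads to the source host.
        self.connected[(switch, src)] = in_port
        out = self.connected.get((switch, dst))
        if out is None:
            # Destination unknown: flood on every port except the ingress port.
            return [p for p in all_ports if p != in_port]
        # Destination known: install a rule and forward on the learned port.
        self.flow_table[(switch, src, dst)] = out
        return [out]


ctrl = LearningSwitchController()
print(ctrl.pkt_in("s", "A", "B", 1, [1, 2, 3]))  # [2, 3]  (B unknown, flood)
print(ctrl.pkt_in("s", "B", "A", 2, [1, 2, 3]))  # [1]     (A was learned on port 1)
print(ctrl.pkt_in("s", "A", "B", 1, [1, 2, 3]))  # [2]     (B was learned on port 2)
```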

Learning Switch Controller Code: Some Invariants

Stateful Firewall in CSDN
rel tr(SW, HO) = {}
pktIn(s, pkt, prt(1)) →
    s.forward(pkt, prt(1), prt(2))
    tr.insert(s, pkt.dst)
    s.install(pkt.src → pkt.dst, prt(1), prt(2))
pktIn(s, pkt, prt(2)) →
    if tr(s, pkt.src) then
        s.forward(pkt, prt(2), prt(1))
        s.install(pkt.src → pkt.dst, prt(2), prt(1))
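For readers unfamiliar with CSDN, the two handlers above can be mirrored in Python roughly as follows. This is only a sketch for experimentation: the `Switch` and `FirewallController` classes and the tuple layout of `sent` and `ft` are assumptions of this example, and the model ignores the fact that installed rules let the switch forward later packets without a packet-in.

```python
TRUSTED_PORT, UNTRUSTED_PORT = 1, 2  # prt(1) and prt(2) in the CSDN program


class Switch:
    """Hypothetical switch state: the sent and ft relations as sets of tuples."""

    def __init__(self):
        self.sent = set()  # (src, dst, in_port, out_port): forwarded packets
        self.ft = set()    # (src, dst, in_port, out_port): installed rules


class FirewallController:
    def __init__(self):
        self.tr = set()    # rel tr(SW, HO): untrusted hosts contacted by a trusted host

    def pkt_in(self, s, src, dst, in_port):
        """Return True if the packet is forwarded, False if it is dropped."""
        if in_port == TRUSTED_PORT:
            s.sent.add((src, dst, TRUSTED_PORT, UNTRUSTED_PORT))  # s.forward
            self.tr.add((s, dst))                                 # tr.insert
            s.ft.add((src, dst, TRUSTED_PORT, UNTRUSTED_PORT))    # s.install
            return True
        if in_port == UNTRUSTED_PORT and (s, src) in self.tr:
            s.sent.add((src, dst, UNTRUSTED_PORT, TRUSTED_PORT))
            s.ft.add((src, dst, UNTRUSTED_PORT, TRUSTED_PORT))
            return True
        return False


fw, sw = FirewallController(), Switch()
print(fw.pkt_in(sw, "U", "T", UNTRUSTED_PORT))  # False: no trusted host has contacted U
print(fw.pkt_in(sw, "T", "U", TRUSTED_PORT))    # True
print(fw.pkt_in(sw, "U", "T", UNTRUSTED_PORT))  # True: U is now recorded in tr
```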

Invariants
Topology: define admissible topologies
Safety: define the required consistency of network-wide states
Transition: define the effect of executing event handlers
Slide callouts: "assumed to hold initially"; "checked initially & after each event"

Stateful Firewall Invariants
Topology: at least one switch with two ports, prt(1) & prt(2); a packet P is forwarded from an untrusted host U to a trusted host T
Safety (I1): for every packet sent from an untrusted host U to a trusted host T, there exists a packet sent to U from a trusted host T'

Counterexample
I1 is not inductive: not all executions starting from an arbitrary state satisfy the invariant
[Figure: counterexample state with switch SW:0 (ports prt(0) to prt(3)), host HO:0, and a flow table with columns Src, Dst, In, Out]

Additional Firewall Invariants
Flow table entries only contain forwarding rules from trusted hosts
Controller relation tr records the correct hosts
I1 ∧ I2 ∧ I3 is inductive
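The difference between an invariant that merely holds and one that is inductive can be checked by brute force on a tiny abstraction of the firewall. The sketch below assumes one trusted host T, one untrusted host U, and four boolean state bits; the encodings of I1, I2, and I3 are this example's reading of the slide text, not the first-order formulas VeriCon hands to Z3.

```python
from collections import namedtuple
from itertools import product

# Hypothetical abstract state of one switch:
#   sent_tu: a packet T -> U was forwarded    sent_ut: a packet U -> T was forwarded
#   ft_ut:   a rule forwarding U -> T exists  tr_u:    the controller's tr contains U
State = namedtuple("State", "sent_tu sent_ut ft_ut tr_u")

def ev_trusted_sends(s):
    # Packet-in on prt(1): forward T -> U and record U in tr.
    return s._replace(sent_tu=True, tr_u=True)

def ev_untrusted_sends(s):
    if s.ft_ut:   # a rule is already installed, so the switch forwards directly
        return s._replace(sent_ut=True)
    if s.tr_u:    # the controller allows the packet and installs a rule
        return s._replace(sent_ut=True, ft_ut=True)
    return s      # the packet is dropped

EVENTS = [ev_trusted_sends, ev_untrusted_sends]

def i1(s): return (not s.sent_ut) or s.sent_tu  # U -> T only after T -> U
def i2(s): return (not s.ft_ut) or s.sent_tu    # a U -> T rule only after T -> U
def i3(s): return (not s.tr_u) or s.sent_tu     # tr holds U only after T -> U
def all3(s): return i1(s) and i2(s) and i3(s)

def counterexamples(inv):
    """States that satisfy inv but have a successor under some event that does not."""
    bad = []
    for bits in product([False, True], repeat=4):
        s = State(*bits)
        for ev in EVENTS:
            if inv(s) and not inv(ev(s)):
                bad.append((s, ev.__name__))
    return bad

print(len(counterexamples(i1)) > 0)  # True: I1 alone is not inductive
print(counterexamples(all3))         # []: the conjunction is inductive
```

One state reported for I1 alone has a U → T rule in the flow table although no packet was ever sent, which resembles the situation in the counterexample slide.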

Non-buggy Verification Examples
Columns: Program | LOCs | Topo Inv. | Safety + Trans Inv. | Time (sec)
Programs: Firewall; Stateless Firewall; Firewall + Host Migration; Learning Switch; Learning Switch + Auth; Resonance (simplified); Stratos (simplified)

Buggy Verification Examples
Benchmark | Counterex. (Host + Sw)
Auth: Rules for unauth host not removed | 3 + 2
Firewall: Forgot part of consistency inv | 5 + 3
Firewall: No check if host is trusted | 6 + 4
Firewall: No inv defining trusted host | 6 + 4
Learning: Packets not forwarded | 1 + 1
Resonance: No inv for host to have one state |
StatelessFW: Rule allowing all port 2 traffic | 4 + 2

CSDN: Abstract Syntax

From Formulas to Theorems & Models
Includes a separate utility for inferring inductive invariants using iterated weakest preconditions (Dijkstra's weakest-precondition calculus)
Automatic theorem proving (verification) by Z3
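The idea of iterated weakest preconditions can be illustrated on a finite state space, where predicates are simply sets of states. The sketch below is not VeriCon's utility: the 4-bit firewall abstraction, the event functions, and the names `wp` and `infer_inductive` are assumptions of this example. It repeatedly intersects a candidate invariant with its weakest precondition under every event until nothing changes.

```python
from itertools import product

# Hypothetical abstract state: (sent_tu, sent_ut, ft_ut, tr_u), all booleans.
STATES = set(product([False, True], repeat=4))

def trusted_sends(s):
    sent_tu, sent_ut, ft_ut, tr_u = s
    return (True, sent_ut, ft_ut, True)       # forward T -> U, record U in tr

def untrusted_sends(s):
    sent_tu, sent_ut, ft_ut, tr_u = s
    if ft_ut:                                 # rule already installed
        return (sent_tu, True, ft_ut, tr_u)
    if tr_u:                                  # controller allows and installs a rule
        return (sent_tu, True, True, tr_u)
    return s                                  # dropped

EVENTS = [trusted_sends, untrusted_sends]

def wp(event, post):
    """Weakest precondition: the states from which event lands inside post."""
    return {s for s in STATES if event(s) in post}

def infer_inductive(candidate):
    """Strengthen candidate until every event preserves it (a fixpoint)."""
    inv = set(candidate)
    while True:
        stronger = inv.intersection(*(wp(ev, inv) for ev in EVENTS))
        if stronger == inv:
            return inv
        inv = stronger

# Candidate I1: a packet U -> T was sent only if a packet T -> U was sent.
i1 = {s for s in STATES if (not s[1]) or s[0]}
inductive = infer_inductive(i1)
print(len(i1), len(inductive))  # 12 9
```

In this toy model the fixpoint is reached after one strengthening step and keeps exactly the states in which `sent_tu` holds or all three other bits are false.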

Further Work Needed
Assume events are executed atomically
– Enforceable using barriers, with a performance hit
– Consider out-of-order rule installs
Rule timeouts
– App handles timeout events to update its ft relation and check invariants
– Need to reason about event ordering

Summary of VeriCon
Verifies network-wide invariants for any event sequence and all admissible topologies
Guarantees invariants are satisfied, or provides a concrete counterexample
Application with 93 LOC and 13 invariants is verified in 0.21s

NDB: Debugging SDNs
Bugs can be anywhere in the SDN stack
– Hardware, control plane logic, race conditions
Switch state might change rapidly
Bugs might show up rarely
How can we exploit the SDN architecture to systematically track down the root cause of bugs?

Bug Story: Incomplete Handover
[Figure: hosts A and B, Switch X, WiFi AP Y, WiFi AP Z]

ndb: Network Debugger
Goal
– Capture and reconstruct the sequence of events leading to the errant behavior
Allow users to define a Network Breakpoint
– A (header, switch) filter to identify the errant behavior
Produce a Packet Backtrace
– Path taken by the packet
– State of the flow table at each switch
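A packet backtrace can be pictured as the per-hop records of one packet put back into path order. The sketch below assumes a `Postcard` record and a `links` map from (switch, output port) to the neighbouring (switch, input port); both are invented for this example, and the flow and version numbers for switch Y are made up.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Postcard:
    """Hypothetical per-hop record for one packet."""
    switch: str
    inport: str
    outports: tuple
    matched_flow: int
    table_version: int

def backtrace(postcards, links):
    """Order one packet's postcards hop by hop using the topology.

    links maps (switch, outport) -> (next_switch, next_inport).
    """
    by_ingress = {(p.switch, p.inport): p for p in postcards}
    # Ingress points that some other postcard forwards into.
    downstream = {links[(p.switch, o)] for p in postcards
                  for o in p.outports if (p.switch, o) in links}
    starts = [p for p in postcards if (p.switch, p.inport) not in downstream]
    path, cur = [], (starts[0] if starts else None)
    while cur is not None and cur not in path:
        path.append(cur)
        nxt = None
        for o in cur.outports:
            nxt = by_ingress.get(links.get((cur.switch, o)))
            if nxt is not None:
                break
        cur = nxt
    return path

links = {("X", "p1"): ("Y", "p1"), ("Y", "p3"): ("Z", "p0")}
cards = [  # arrival order at the collector is arbitrary
    Postcard("Y", "p1", ("p3",), matched_flow=7, table_version=1),
    Postcard("X", "p0", ("p1",), matched_flow=23, table_version=3),
]
for hop in backtrace(cards, links):
    print(hop.switch, hop.inport, hop.outports, hop.matched_flow, hop.table_version)
```

The backtrace ends at Y because no postcard from Z exists, which is the kind of evidence the incomplete-handover story relies on.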

Debugging software programs
Function A(): i = …; j = …; u = B(i, j)
Function B(x, y): k = …; v = C(x, k)
Function C(x, y): … w = abort()
Breakpoint: "line 25, w = abort()"
Backtrace:
    File "A", line 10, Function A()
    File "B", line 43, Function B()
    File "C", line 21, Function C()

Debugging Networks
Breakpoint: "ICMP packets A->B, arriving at X, but not Z"
Backtrace:
    Switch X: { inport: p0, outports: [p1] mods: [...] matched flow: 23 [...] matched table version: 3 }
    Switch Y: { inport: p1, outports: [p3] mods: }
[Figure: hosts A and B, Switch X, WiFi AP Y, WiFi AP Z]

Using ndb to Debug Common Issues
Reachability
– Symptom: A is not able to talk to B
– Breakpoint: "Packet A->B, not reaching B"
Isolation
– Symptom: A is talking to B, but it shouldn't
– Breakpoint: "Packet A->B, reaching B"
Race conditions
– Symptom: Flow entries not reaching on time
– Breakpoint: "Packet-in at switch S, port P"
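The reachability and isolation breakpoints above can be thought of as predicates over a packet header and the nodes the packet visited. The sketch below is one hypothetical encoding; `network_breakpoint`, the header dictionary, and the path representation are assumptions of this example, not ndb's actual breakpoint syntax.

```python
def network_breakpoint(src, dst, reaches=(), misses=()):
    """Build a predicate over (header, path).

    path is the list of nodes the packet was observed at, including the
    destination host if it was delivered.
    """
    def hit(header, path):
        return (header["src"] == src and header["dst"] == dst
                and all(node in path for node in reaches)
                and all(node not in path for node in misses))
    return hit

# "Packet A->B, not reaching B"
reachability = network_breakpoint("A", "B", misses=("B",))
# "Packet A->B, reaching B"
isolation = network_breakpoint("A", "B", reaches=("B",))
# "ICMP packets A->B, arriving at X, but not Z" (protocol match omitted here)
handover = network_breakpoint("A", "B", reaches=("X",), misses=("Z",))

header = {"src": "A", "dst": "B"}
print(reachability(header, ["X", "Y"]))    # True: the packet never reached B
print(isolation(header, ["X", "Y"]))       # False
print(handover(header, ["X", "Y"]))        # True: seen at X, never at Z
print(handover(header, ["X", "Z", "B"]))   # False
```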

How Does ndb Work?
[Figure: Control Plane, Flow Table State Recorder, switches with Match/ACT flow tables, Postcard Collector]
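The interplay between the flow table state recorder and the postcard collector can be sketched as follows. This is a simplified model under stated assumptions: the classes, the exact-match-on-destination rules, and the per-switch version counter are invented for this example, and a real deployment generates postcards through extra flow actions in the switch rather than in Python.

```python
import copy

class StateRecorder:
    """Hypothetical recorder: logs the flow table after every flow-mod."""
    def __init__(self):
        self.versions = {}  # (switch, version) -> snapshot of the flow table

    def record(self, switch, version, table):
        self.versions[(switch, version)] = copy.deepcopy(table)

class Switch:
    def __init__(self, name, recorder, collector):
        self.name, self.recorder, self.collector = name, recorder, collector
        self.table, self.version = [], 0  # table: list of (match_dst, out_port)

    def flow_mod(self, match_dst, out_port):
        """Add a rule; every change bumps the version and is recorded."""
        self.table.append((match_dst, out_port))
        self.version += 1
        self.recorder.record(self.name, self.version, self.table)

    def forward(self, packet, in_port):
        """Forward on the first matching rule and emit a postcard."""
        for idx, (match_dst, out_port) in enumerate(self.table):
            if match_dst == packet["dst"]:
                self.collector.append({
                    "switch": self.name, "inport": in_port, "outport": out_port,
                    "matched_flow": idx, "version": self.version,
                    "dst": packet["dst"],
                })
                return out_port
        return None  # no match: dropped in this simplified model

recorder, collector = StateRecorder(), []
x = Switch("X", recorder, collector)
x.flow_mod("B", "p1")
x.forward({"dst": "B"}, "p0")
x.flow_mod("C", "p2")  # the table changes after the packet has passed

card = collector[0]
# The version tag recovers the table as it was when the packet was forwarded.
print(recorder.versions[(card["switch"], card["version"])])  # [('B', 'p1')]
```

Tagging each postcard with a table version is what lets a backtrace show the flow table state at forwarding time even when switch state changes rapidly.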

Who Benefits
Network developers
– Programmers debugging control programs
Network operators
– Find policy errors
– Send error report to switch vendor
– Send error report to control program vendor

Performance and Scalability
Control channel
– Negligible overhead
– No postcards
– Extra flow-mods
Postcards in the datapath
– Single collector server for the entire Stanford backbone
– Selective postcard generation to reduce overhead
– Parallelize postcard collection

Summary
ndb: Network Breakpoint + Packet Backtrace
Systematically track down root cause of bugs
Practical and deployable today