Leveraging SDN Layering to Systematically Troubleshoot Networks

Presentation transcript:

Leveraging SDN Layering to Systematically Troubleshoot Networks
Brandon Heller ★, Colin Scott, Nick McKeown ⌘, Scott Shenker, Andreas Wundsam §, Hongyi Zeng ⌘, Sam Whitlock, Vimalkumar Jeyakumar ⌘, Nikhil Handigol ★, James McCauley, Kyriakos Zarifis ∞, Peyman Kazemian ★
HotSDN 2013, Hong Kong
⌘ Stanford, Berkeley, ∞ USC, ICSI, ★ SDN Academy, § Big Switch Networks

Admin + Network
Admin: skills + tools + knowledge (Protocols, Configuration, Topology)
Policy:
– connect hosts A + B
– quarantine virus-infected hosts
– route guest traffic to an HTTP proxy
– prioritize SSH
1: Configure (Ethane, overlays, consistency primitives, network programming languages, …)
2: Troubleshoot (This Talk)
3: Fix Stuff!
#1 request from network admins: Automatic Troubleshooting. Source: “Automatic Test Packet Generation”, CoNEXT ‘12, Zeng et al.

How to automate troubleshooting?
Network ~? Policy
Policy:
– isolate groups A + B
– route guest traffic to an HTTP proxy
– block a list of virus-infected hosts
Two requirements. Challenging in traditional networks.
(1) Know the intended policy: impractical to infer
– confusing: different config format for each protocol
– distributed: configuration spread among all nodes
– hard: must understand all protocols & their interactions
(2) Check behavior against policy: difficult to check
– confusing: don’t know lowest-level forwarding behavior
– distributed: hard to get a meaningful snapshot

Control-Plane Layering in SDN
State Layers (top to bottom): Policy, Logical View, Physical View, Device State, Hardware
Code Layers (top to bottom): App, Network Hypervisor, Network OS, Firmware, HW

Systematically Troubleshooting an SDN
State Layers (top to bottom): Policy, Logical View, Physical View, Device State, Hardware
Code Layers (top to bottom): App, Network Hypervisor, Network OS, Firmware, HW
Observation: Each state layer fully specifies network behavior.
Insight: Bugs manifest as mistranslations between layers.
Systematic Approach:
(1) Binary search to isolate to a code layer.
(2) Leverage state to isolate within the code layer.
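Step (1) of the approach can be sketched as an ordinary binary search over the state layers. This is only an illustration under stated assumptions: `equivalent` is a hypothetical predicate standing in for whatever checker compares two state layers, `make_oracle` is a toy stand-in for a real network, and the search assumes exactly one code layer mistranslates, which the talk itself does not guarantee.

```python
from typing import Callable

# Layer names are taken from the slide; index i of CODE_LAYERS is assumed to
# translate STATE_LAYERS[i] into STATE_LAYERS[i + 1].
STATE_LAYERS = ["Policy", "Logical View", "Physical View", "Device State", "Hardware"]
CODE_LAYERS = ["App", "Network Hypervisor", "Network OS", "Firmware"]


def localize(equivalent: Callable[[str, str], bool]) -> str:
    """Binary search for the code layer whose output first diverges.

    Assumes a symptom was observed (Policy and Hardware disagree) and that a
    single code layer is responsible.
    """
    lo, hi = 0, len(STATE_LAYERS) - 1  # invariant: layer lo is good, layer hi is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if equivalent(STATE_LAYERS[lo], STATE_LAYERS[mid]):
            lo = mid  # translation is still faithful down to mid
        else:
            hi = mid  # divergence happened at or above mid
    return CODE_LAYERS[lo]


def make_oracle(faulty: str) -> Callable[[str, str], bool]:
    """Hypothetical checker: every state layer below `faulty` is wrong."""
    bad_from = CODE_LAYERS.index(faulty) + 1

    def equivalent(a: str, b: str) -> bool:
        ia, ib = STATE_LAYERS.index(a), STATE_LAYERS.index(b)
        return (ia >= bad_from) == (ib >= bad_from)

    return equivalent


print(localize(make_oracle("Firmware")))
```

With five state layers the search needs at most two or three checks, which matters when each check is an expensive tool run.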

Phase 1: Localizing to a code layer
Symptom: Hosts unable to communicate
State layers, from [Operator Intent] to [Actual Behavior]: Policy, Logical View, Physical View, Device State, Hardware
Code layers: Apps, NetHyperV, NetOS, Firmware
Checks between layers (~?), each answered Yes or No. Tools: SOFT [CoNEXT ‘12], Anteater [SIGCOMM ‘11]
Cause: Firmware Bug

Phase 1: Localizing to a code layer
Symptom: Tenant Isolation Breach
State layers, from [Operator Intent] to [Actual Behavior]: Policy, Logical View, Physical View, Device State, Hardware
Code layers: Apps, NetHyperV, NetOS, Firmware
Checks between layers (~?), each answered Yes or No. Tools: HSA [NSDI ’12], OFRewind [ATC ‘11], Correspondence Checking
Cause: NetHypervisor Bug

How to automate troubleshooting?
Network ~? Policy
Policy:
– isolate groups A + B
– route guest traffic to an HTTP proxy
– block a list of virus-infected hosts
Two requirements. Possible in Software-Defined Networks.
(1) Know the intended policy: directly provided
– confusing: different config format for each protocol
– distributed: configuration spread among all nodes
– hard: must understand all protocols & their interactions
(2) Check behavior against policy: directly accessible
– confusing: don’t know lowest-level forwarding behavior
– distributed: hard to get a meaningful snapshot
Other slide labels: app, fewer nodes
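To make requirement (2) concrete, here is a toy check of the "isolate groups A + B" policy against forwarding state. It is a sketch, not how HSA or Anteater work: the forwarding state is assumed to be a simple next-hop adjacency map with no header matching, and the host and switch names are invented for the example.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple


def reachable(forwarding: Dict[str, Iterable[str]], src: str) -> Set[str]:
    """All nodes reachable from src by following next-hop edges (BFS)."""
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        for nxt in forwarding.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def isolation_violations(
    forwarding: Dict[str, Iterable[str]], group_a: Set[str], group_b: Set[str]
) -> List[Tuple[str, str]]:
    """(src, dst) pairs where a host in one group can reach the other group."""
    violations = []
    for src_group, dst_group in ((group_a, group_b), (group_b, group_a)):
        for src in sorted(src_group):
            hits = reachable(forwarding, src) & dst_group
            violations.extend((src, dst) for dst in sorted(hits))
    return violations


# Hypothetical forwarding state: a1 -> s1 -> {a2, s2}, s2 -> b1.
forwarding = {"a1": ["s1"], "s1": ["a2", "s2"], "s2": ["b1"], "b1": ["s2"]}
print(isolation_violations(forwarding, {"a1", "a2"}, {"b1"}))
```

Real checkers reason about header spaces rather than plain graph edges, but the shape of the question is the same: take the policy, take a snapshot of state, and report where they disagree.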

Takeaways
Control plane layering enables systematic troubleshooting.
Thinking about troubleshooting in terms of layers shows us where tools fit in:
– Reveals missing tools
– Highlights choices between tools, with tradeoffs
Plenty of opportunities left. Operationalize!

Leverage the layers in SDN.
Questions?

How is this different from general distributed systems debugging?
Simple answer: it’s not! SDN is an excellent opportunity to draw upon ideas from other distributed systems.
Subtlety: networks are solving a much more constrained problem than general distributed systems.

Limitations
– Correctness only, not performance
– Side effects not reflected in state
– No guarantee of finding single code layer
– No guarantee of individual layer correctness
– No guarantee of future correctness
– Layer visibility may be imperfect

Plenty of Opportunities Remain
Automatic Troubleshooting → Actionable Bug Reports
– Filtering the signal from the noise
– Creating consistent views of state
Improving Invariant Checkers
– Scale
– Flexible Policy Input
Hybrid Traditional + SDN Debugging

Plenty of Opportunities Remain
Automatic Troubleshooting → Actionable Bug Reports
– Filtering the signal from the noise
– Creating consistent views of state
Packet History: Path + Headers + Forwarding State
[HotSDN 2012: Where is the Debugger for My Software-Defined Network?]
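The "Path + Headers + Forwarding State" triple can be pictured as a small record per hop. The sketch below is an assumed data layout for illustration only; the class names, field names, and rule identifiers are hypothetical and are not taken from the cited debugger.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Hop:
    switch: str                  # where the packet was seen (builds the path)
    headers: Dict[str, str]      # header values observed at this hop
    matched_rule: Optional[str]  # forwarding state used; None if nothing matched


@dataclass
class PacketHistory:
    hops: List[Hop] = field(default_factory=list)

    def path(self) -> List[str]:
        return [hop.switch for hop in self.hops]

    def header_changes(self, name: str) -> List[str]:
        """Switches where header `name` differs from the previous hop."""
        changed = []
        for prev, cur in zip(self.hops, self.hops[1:]):
            if prev.headers.get(name) != cur.headers.get(name):
                changed.append(cur.switch)
        return changed

    def first_unmatched(self) -> Optional[str]:
        """First switch where no forwarding entry matched, if any."""
        for hop in self.hops:
            if hop.matched_rule is None:
                return hop.switch
        return None


history = PacketHistory([
    Hop("s1", {"vlan": "10"}, "rule-7"),
    Hop("s2", {"vlan": "20"}, "rule-3"),
    Hop("s3", {"vlan": "20"}, None),
])
print(history.path(), history.first_unmatched())
```

Keeping the matched rule next to the headers at every hop is what turns a raw trace into a bug report: it answers both "where did the packet go" and "which piece of state sent it there".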

Plenty of Opportunities Remain
Automatic Troubleshooting → Actionable Bug Reports
– Filtering the signal from the noise
Minimal Causal Sequence
[Berkeley Tech Report: How Did We Get Into This Mess? Isolating Fault-Inducing Inputs to SDN Control Software]
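One common way to shrink a failing input trace toward a minimal causal sequence is delta-debugging-style reduction: repeatedly drop chunks of events and keep the smaller trace whenever the bug still reproduces. The sketch below is a simplified greedy variant written for illustration; it is not claimed to be the tech report's algorithm, it does not guarantee a truly minimal result, and the event names and `triggers_bug` predicate are hypothetical.

```python
from typing import Callable, List


def minimize(events: List[str], triggers_bug: Callable[[List[str]], bool]) -> List[str]:
    """Greedily remove chunks of events while the bug still reproduces."""
    assert triggers_bug(events), "the full trace must reproduce the bug"
    chunk = len(events) // 2
    while chunk >= 1:
        i = 0
        while i < len(events):
            candidate = events[:i] + events[i + chunk:]
            if triggers_bug(candidate):
                events = candidate  # chunk was irrelevant: keep it removed
            else:
                i += chunk          # chunk is needed: move past it
        chunk //= 2
    return events


# Hypothetical bug: it fires only if a link goes down before a failover.
def triggers_bug(trace: List[str]) -> bool:
    return (
        "link_down" in trace
        and "failover" in trace
        and trace.index("link_down") < trace.index("failover")
    )


trace = ["link_up", "flow_add", "link_down", "pkt_in", "failover", "pkt_in2"]
print(minimize(trace, triggers_bug))
```

In practice each call to `triggers_bug` means replaying the trace against the control software, so the number of replays, not the bookkeeping, dominates the cost.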

Isn’t this unnecessary with consistency primitives/languages/etc.?
No.
– Catch/rule out bugs outside the framework
– Catch instances where the framework pushes config that breaks the policy

What’s novel about this work? Simple answer: nothing!

Control-Plane Layering in SDN
State Layers (top to bottom): Policy, Logical View, Physical View, Device State, Hardware
Code Layers (top to bottom): App, Network Hypervisor, Network OS, Firmware, HW
Example Errors:
– Configuration Parsing Error
– Tenant isolation breach (policy mistranslation)
– Failover logic error, synchronization bug
– Register misconfiguration, Router memory corruption
Other labels: [Unintended Config], [External Connectivity Error]