Michael Schapira School of Computer Science and Engineering Hebrew University of Jerusalem Some Open Questions on the Borderline of Distributed Computing.


Michael Schapira School of Computer Science and Engineering Hebrew University of Jerusalem Some Open Questions on the Borderline of Distributed Computing and Networking

This Talk: 1. New questions in Internet protocol design 2. Self-stabilizing Internet protocols 3. Incentive-compatible network protocols … illustrated via Internet routing examples

The Internet. Tremendous success – from research experiment to global infrastructure. Enables innovation in applications – Web, P2P, VoIP, social networks, virtual worlds. But the Internet infrastructure has been fairly stagnant for decades…

Why Can’t We Innovate? “Closed” equipment – software bundled with hardware – vendor-specific interfaces Slow protocol standardization Few people can innovate – equipment vendors write the code – long delays to introduce new features

Traditional Computer Networks data plane: packet streaming Handle packets in “real time”: forward, filter, buffer, mark, rate-limit, measure, …

Traditional Computer Networks. control plane: distributed algorithms. Slower time scale: track topology changes, compute routes, install forwarding rules, …

Software Defined Networking (SDN): a New Paradigm API to the data plane (e.g., OpenFlow) Controller: logically-centralized control, smart, slow, implemented in software, … Switch: dumb, fast, implemented in hardware

Software Defined Networking (SDN): a New Paradigm. Controller application running on a network OS. Events from switches: topology changes, traffic statistics, arriving packets, … Commands to switches: (un)install rules, query statistics, …

So… Change is finally on the horizon But many challenges remain… – Realizing SDN (e.g., distribute the controller?) – What are the “right” protocols (for routing, traffic engineering, etc.)? Distributed computing theory can play an important role here

Distributed Controller? Several controllers (each a network OS running a controller application): for scalability and reliability, partition and replicate state. Elect a leader? Distribute the computation? How to ensure consistency (across controllers / switches)? Where to place the controller(s)?

Rethinking (Routing) Protocols Routing is a control plane operation – slow (milliseconds – seconds) Packet forwarding is a data plane operation – fast (microseconds) Today’s (intradomain) routing – establishes connectivity – optimizes routes (= shortest paths) failure ⇒ re-convergence ⇒ dropped packets!

Pushing Connectivity (Only!) to the Data Plane … while retaining scalability – implemented in hardware – low overhead (end-to-end backup paths too costly…) – static forwarding tables (no changes in packet rates) – no change to packet header. When a packet to a node d arrives at node i, i’s outgoing link is a function only of the incoming link and the set of “live” outgoing edges: $f_{id} : E_i \times P(E_i) \to E_i$, where $E_i$ is the set of edges at node i and $P(E_i)$ is its power set.

Resilient Forwarding. A “forwarding pattern” $\{f_{id}\}_i$ is t-resilient if, for any set of at most t edge failures, the existence of a path between a node i and the destination d implies loop-free forwarding from i to d. Perfect resilience ≡ t → ∞.
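
The definition above can be checked by brute force on small graphs. The following is a minimal sketch, not the construction from the cited papers: the function names, the two example patterns (`bounce`, `ignore_incoming`) and the 4-node ring are all hypothetical, and "loop-free" is read here as "the packet reaches d without ever repeating a (node, incoming link) state".

```python
from itertools import combinations

def connected(edges, src, dst):
    """True if dst is reachable from src over the undirected edges."""
    seen, stack = {src}, [src]
    while stack:
        u = stack.pop()
        if u == dst:
            return True
        for e in edges:
            if u in e:
                (v,) = e - {u}
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
    return False

def delivers(f, edges, src, dst):
    """Forward a packet from src using pattern f; False if it enters a loop."""
    node, incoming, seen = src, None, set()
    while node != dst:
        if (node, incoming) in seen:   # f is deterministic, so a repeat is a loop
            return False
        seen.add((node, incoming))
        live = frozenset(e for e in edges if node in e)
        out = f(node, incoming, live)  # f sees only incoming link + live edges
        (node,) = out - {node}
        incoming = out
    return True

def is_t_resilient(f, nodes, edges, dst, t):
    """Try every failure set of at most t edges and every connected source."""
    for k in range(t + 1):
        for failed in combinations(edges, k):
            live = edges - set(failed)
            for src in nodes:
                if (src != dst and connected(live, src, dst)
                        and not delivers(f, live, src, dst)):
                    return False
    return True

def bounce(node, incoming, live):
    """Hypothetical pattern: keep going on another live edge, else turn back."""
    others = sorted((e for e in live if e != incoming), key=sorted)
    return others[0] if others else incoming

def ignore_incoming(node, incoming, live):
    """Hypothetical pattern that ignores the incoming link entirely."""
    return min(live, key=sorted)

# Assumed example topology: a ring 0-1-2-3-0 with destination d = 0.
nodes = range(4)
edges = {frozenset(p) for p in [(0, 1), (1, 2), (2, 3), (3, 0)]}
```

On this ring, `bounce` survives every failure set, while `ignore_incoming` already loops between nodes 1 and 2 once edge (0, 1) fails. A ring is a special case; the impossibility result on the next slide concerns general graphs.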

Theoretical Perspective. Thm [Feigenbaum-Godfrey-Panda-S-Shenker-Singla]: A 1-resilient forwarding pattern always exists. Thm [Feigenbaum-Godfrey-Panda-S-Shenker-Singla]: Perfect resilience is not achievable. Big gap! – does a 2-resilient forwarding pattern always exist? – specific families of graphs? – relax restrictions (randomness, dynamic forwarding tables, …)?

Practical Perspective. A perfectly-resilient mechanism for achieving connectivity in the data plane – [“Data Driven Connectivity”, Liu-Panda-Singla-Godfrey-S-Shenker, NSDI 2013] – utilizes existing mechanisms – small (few bits) changes to forwarding tables at packet rate

Directions for Future Research. How to distribute the controller? Data-plane/control-plane perspective on other networking tasks (e.g., traffic engineering). Connectivity in the data plane.

(Self-)Stabilizing Internet Routing

Border Gateway Protocol. The Border Gateway Protocol (BGP) establishes routes between the (over 42,000) networks that make up the Internet. [Figure: Google, Verizon, Comcast, AT&T]

BGP ≠ Shortest-Path Routing! [Figure: Google, Verizon, Comcast, AT&T, each with its own routing policy] “I want to avoid routes through Comcast if possible.” “I won’t carry traffic between AT&T and Verizon.” “I want a cheap route.” “I want short routes.”

Illustration: BGP Dynamics. [Figure: nodes 1 and 2, destination d; one node prefers routes through 2, the other prefers routes through 1] Messages: “2, I’m available”; “1, my route is 2d”; “1, I’m available”. A stable state is reached.

Illustration: BGP Oscillation. [Figure: nodes 1 and 2, destination d; one node prefers routes through 2, the other prefers routes through 1] Messages: “1, 2, I’m the destination”; “1, my route is 2d”; “2, my route is 1d”. BGP might oscillate indefinitely between 1d, 2d and 12d, 21d. Conjecture [Griffin-Wilfong, SIGCOMM 99]: 2+ stable states → BGP can oscillate
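
The oscillation can be reproduced with a toy simulation. This is a sketch under simplifying assumptions, not a BGP implementation: routes are plain strings such as "12d", each node is assumed to prefer the route through the other node over its direct route, and an activated node simply reads the other node's current route instead of exchanging update messages. The names `best_route`, `step` and `run` are illustrative.

```python
def best_route(node, other_route):
    """Best response of `node` to the route currently used by the other node."""
    if node not in other_route:      # the other's route does not pass through me
        return node + other_route    # e.g. "1" + "2d" -> "12d" (preferred)
    return node + "d"                # otherwise fall back to the direct route

def step(routes, active):
    """Activated nodes update simultaneously, each reading the old routes."""
    new = dict(routes)
    for node in active:
        other = "2" if node == "1" else "1"
        new[node] = best_route(node, routes[other])
    return new

def run(routes, schedule):
    """Return the sequence of (route of 1, route of 2) along the schedule."""
    trace = [(routes["1"], routes["2"])]
    for active in schedule:
        routes = step(routes, active)
        trace.append((routes["1"], routes["2"]))
    return trace

start = {"1": "1d", "2": "2d"}
together = run(start, [{"1", "2"}] * 4)   # both nodes react in every step
in_turn = run(start, [{"1"}, {"2"}] * 2)  # nodes react one at a time
```

With both nodes activated together the trace alternates between (1d, 2d) and (12d, 21d) forever; when the nodes take turns it settles in (12d, 2d), one of the two stable states (the other is (1d, 21d)).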

Why are Oscillations Bad? Make the network unpredictable and hard to debug. Might lead to flooding of the network with BGP update messages. Deteriorate performance! – almost 50% of VoIP disruptions are due to BGP route fluctuations

Internet Protocols, Markets, and Beyond. Often, in computational and economic environments: 1. the prescribed behavior for each “node” (human, machine) is simple and natural 2. nodes’ interaction is not synchronized. How can we reason about such environments? – Internet protocols (BGP routing, TCP congestion control) – large-scale markets – social networks – …

Dynamics: Game Theory vs. Distributed Computing. Game theory: – establishes convergence to equilibrium for “natural dynamics” (best-/better-response, fictitious play, no-regret, …) – … but typically assumes synchronization. Distributed computing theory: – analyzes system behavior in asynchronous environments – … but no general notions of natural behavior.

Simple Model. n nodes 1,…,n. Node i has action space $A_i$ – $A = A_1 \times \dots \times A_n$ – $A_{-i} = A_1 \times \dots \times A_{i-1} \times A_{i+1} \times \dots \times A_n$. Node i has reaction function $f_i : A_{-i} \to A_i$ – $f = (f_1, \dots, f_n)$ – $f_i$ can capture node i’s “best-responses”

Simple Model (Cont.). Infinite sequence of discrete time steps t = 1, 2, … A schedule $\sigma : \{1, 2, \ldots\} \to 2^{[n]}$ maps each time step to the subset of nodes “activated” at that time step – a fair schedule activates each node infinitely often. An initial action-profile and schedule naturally induce a dynamics.

Simple Model (Cont.). Defn: An action-profile $a^* = (a_1, \dots, a_n)$ is a stable state if $f_i(a^*) = a_i$ for all i. – that is, $a^*$ is a fixed point of f – abusing notation… Defn: A system is convergent if for every choice of initial action-profile and fair schedule the induced dynamics converge to a stable state.
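
The model translates almost directly into code. The sketch below is illustrative only: the helper names are made up, nodes are indexed from 0, a schedule is given as a finite list of activated sets, and each reaction function receives the whole profile but is expected to read only the other nodes' entries (the same abuse of notation as in the definition). The two-node "copy the other node" system is an assumed example, not one taken from the talk.

```python
from itertools import product

def stable_states(f, action_spaces):
    """All profiles a with f_i(a) == a_i for every node i (fixed points of f)."""
    return [a for a in product(*action_spaces)
            if all(f[i](a) == a[i] for i in range(len(f)))]

def dynamics(f, a0, schedule):
    """Yield the profile after each time step.

    Activated nodes react to the profile of the previous step;
    all other nodes keep their current action.
    """
    a = tuple(a0)
    for active in schedule:
        a = tuple(f[i](a) if i in active else a[i] for i in range(len(a)))
        yield a

# Assumed example: two nodes with actions {0, 1}; each copies the other.
copy_other = [lambda a: a[1], lambda a: a[0]]
spaces = [(0, 1), (0, 1)]

fixed = stable_states(copy_other, spaces)
synchronous = list(dynamics(copy_other, (0, 1), [{0, 1}] * 3))
round_robin = list(dynamics(copy_other, (0, 1), [{0}, {1}]))
```

This system has two stable states, (0, 0) and (1, 1). Activating both nodes at every step from (0, 1) swaps the two actions forever, whereas a round-robin schedule reaches (1, 1) after one step, so the outcome depends on the schedule and not only on the reaction functions.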

Towards a Characterization of Convergent Systems. Thm [Jaggard-S-Wright]: If there exist multiple stable states, then the system is not convergent. – valency argument! – no failures, just dumb nodes! So, a unique stable state is a necessary condition for guaranteed convergence. Can be generalized to bounded-recall, non-stationary reaction functions.

Application: Internet Routing. BGP establishes routes between the smaller networks that make up the Internet. Question [Griffin-Shepherd-Wilfong, 2001]: Do multiple stable routing configurations imply the possibility of persistent route oscillations? Answer [Sami-S-Zohar, 2009]: Yes! [Figure: AT&T, Qwest, Comcast, Sprint]

Other Applications Our “two people in a corridor” example… Models of congestion control on the Internet Load balancing Diffusion of technologies in social networks Asynchronous circuits …

Meanwhile, back in the corridor…

Strengthening the Result: Convergence vs. Synchronism. Defn: An r-fair schedule activates each node at least once in every r consecutive time steps. Defn: A system is r-convergent if for all choices of initial action-profile and r-fair schedule the induced dynamics converge to a stable state. – convergent ⇒ r-convergent – not r-convergent ⇒ not convergent. Thm [Erdmann-S]: If there exist multiple stable states, then the system is not (n-1)-convergent. – tight! – much more delicate valency argument
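
As a small illustration of the r-fairness definition, the sketch below checks a finite prefix of a schedule. The function name and the representation of a schedule as a list of sets of node indices 0,…,n-1 are assumptions made for this example; a real schedule is infinite, so a check like this can only refute r-fairness or confirm it for the given prefix.

```python
def is_r_fair(schedule, n, r):
    """True if every window of r consecutive steps activates all n nodes."""
    nodes = set(range(n))
    return all(set().union(*schedule[t:t + r]) >= nodes
               for t in range(len(schedule) - r + 1))

synchronous = [{0, 1}] * 6    # every node is activated at every step
round_robin = [{0}, {1}] * 3  # nodes are activated one at a time
```

For n = 2 the bound n-1 equals 1, and a 1-fair schedule must activate both nodes at every step. In that case the theorem says that a two-node system with multiple stable states fails to converge, from some initial profile, even under the fully synchronous schedule.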

Complexity of Convergent Systems. Thm [Jaggard-S-Wright]: Determining if a system with n nodes is convergent requires exponential communication (in n). Thm [Engelberg-Fabrikant-S-Wajc]: Determining if a succinctly described system is convergent is PSPACE-complete. Both results extend also to “stochastic convergence”.

Directions for Future Research. Other protocols! Identify specific classes of (stochastically) convergent games and measure convergence rate (e.g., in terms of asynchronous rounds). Characterize guaranteed convergence, and design algorithms for determining such convergence, for: – other game dynamics (e.g., fictitious play, no-regret dynamics) – other notions of equilibrium (e.g., mixed Nash, correlated) – other notions of asynchrony

Incentive-Compatible Network Protocols

TCP Congestion Control is NOT Incentive Compatible. AIMD = Additive Increase Multiplicative Decrease. [Figure: queue, router, link]
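
A rough way to see why AIMD is not incentive compatible is to let one sender back off less than prescribed. The sketch below is a deliberately crude model, not TCP: all senders share a single link, every sender observes congestion in the same round, and the average window size is used as a stand-in for throughput. The function name, the capacity, and the decrease factors 0.5 and 0.9 are assumptions chosen for illustration.

```python
def simulate_aimd(betas, capacity=100.0, rounds=1000, alpha=1.0):
    """Return each sender's average window on a shared link.

    Sender i adds `alpha` to its window in every uncongested round and
    multiplies its window by betas[i] whenever the link is overloaded.
    """
    windows = [1.0] * len(betas)
    totals = [0.0] * len(betas)
    for _ in range(rounds):
        for i, w in enumerate(windows):
            totals[i] += w
        if sum(windows) > capacity:   # congestion: multiplicative decrease
            windows = [w * b for w, b in zip(windows, betas)]
        else:                         # no congestion: additive increase
            windows = [w + alpha for w in windows]
    return [total / rounds for total in totals]

compliant = simulate_aimd([0.5, 0.5])  # both senders halve on congestion
deviating = simulate_aimd([0.5, 0.9])  # sender 1 backs off far less
```

Two compliant senders end up with identical averages, while the sender that reduces its window by only 10% keeps a larger window than the compliant one after the first congestion event. Under these assumptions, deviating from the prescribed decrease pays off.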

What About BGP? BGP was designed to guarantee connectivity between largely trusted and obedient parties. In today’s commercial Internet ASes are owned by self-interested, often competing, entities – might not follow the “prescribed behaviour” Simple examples show that BGP is, in fact, not incentive compatible – a node can obtain a better route by “lying”

How Can We Fix This? Economic Mechanism Design: “the reverse-engineering approach to game theory”. Goal: Incentivize players to follow the prescribed behaviour – if others run the protocol so should I! – without money! Thm [Levin-S-Zohar]: Secure variants of BGP are incentive compatible.

Conclusion. An exciting time to be in networking. Internet protocols motivate new research directions. Distributed computing theory has much to contribute.

Thank You