The DHCP Failover Protocol: A Formal Perspective. Rui Fan (MIT), Ralph Droms (Cisco Systems), Nancy Griffeth (CUNY), Nancy Lynch (MIT).

Fault Tolerant DHCP
- The Dynamic Host Configuration Protocol (DHCP) is a widely deployed protocol for assigning IP addresses and other client parameters.
- DHCP is also important in wireless and mobile settings.
- Current implementations use a single DHCP server and are not fault tolerant.
- The main challenge in using multiple servers is maintaining a consistent view of assigned addresses across servers, to avoid double allocation.
- Standard database techniques are too slow.
- The DHCP Failover Protocol (DKS+'03) is a 2-server DHCP algorithm that retains the client interface and performance of DHCP.

Our Contributions
- We present an algorithm based on DKS+'03, generalized to an arbitrary number of servers.
- We rigorously specify the algorithm and its behavior using TIOA.
  - Helps end users understand and use DHCP.
- We decompose the DHCPF problem into independent subproblems.
  - Subproblems can be solved separately, and their solutions composed to solve DHCPF.
  - Helps to understand and prove the correctness of the algorithm.
  - Helps to analyze the effects of network parameters on algorithm performance, and to optimize the algorithm.
- Demonstrates that a formal, theoretical approach can provide correct, simple and efficient solutions to complex, real-world problems.

Timed I/O Automaton
- Formal modeling framework for describing distributed systems.
  - Rigorous and structured.
  - Composition, simulation, and other proof and design techniques.
- A Timed I/O Automaton (TIOA) [KLSV'05] consists of
  - States and start states.
  - Discrete actions.
  - State transitions (state, action, state).
  - Continuous actions (trajectories): a mapping from [0, t] to states.
- Scheduling of actions is nondeterministic.
- An execution is an alternating sequence of trajectories and discrete actions.
- Example: a mobile robot.
  - The state is its position.
  - Discrete actions are changes in destination.
  - Trajectories are movement towards the destination.
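The mobile-robot example can be sketched in Python as follows. This is only an illustration under stated assumptions: the position is one-dimensional and the robot moves at a constant speed, neither of which the slide specifies, and the names `Robot`, `set_destination` and `advance` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Robot:
    """Toy timed automaton: the state is the position plus the current destination."""
    position: float = 0.0
    destination: float = 0.0
    speed: float = 1.0  # assumed constant speed, not given in the slides

    def set_destination(self, dest: float) -> None:
        """Discrete action: change the destination instantaneously."""
        self.destination = dest

    def advance(self, dt: float) -> None:
        """Trajectory: let dt time units pass while moving towards the destination."""
        gap = self.destination - self.position
        step = min(abs(gap), self.speed * dt)  # never overshoot the destination
        self.position += step if gap >= 0 else -step


# An execution alternates discrete actions and trajectories.
robot = Robot()
robot.set_destination(5.0)  # discrete action
robot.advance(2.0)          # trajectory of duration 2
robot.set_destination(0.0)  # discrete action
robot.advance(10.0)         # trajectory: moves back and stops at the destination
```

The split between `set_destination` and `advance` mirrors the split between discrete transitions and trajectories in the definition above.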

System Assumptions
- Ideally, we want DHCPF to satisfy the following.
  - Safety property: no IP address is double allocated.
  - Liveness property: all client commands are executed quickly.
- These properties depend on correct behavior of the network and environment.
- Clock assumption: clients and servers have clocks with bounded skew.
  - Let ε be a constant. Then |clock_i(t) - t| ≤ ε for every client or server i and every time t.
- Both safety and liveness depend on the clock assumption.

System Assumptions
- Stability: let γ be a parameter. A time interval [t, t'] is γ-stable if
  - Some server is alive throughout [t - γ, t'].
  - No server fails or recovers during [t - γ, t'].
- Timeliness: a time interval [t, t'] is δ-timely if any message sent during [t, t' - δ] is delivered within time δ.
- The liveness property depends on having sufficiently long stable and timely time intervals.
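As a rough illustration of the stability condition, the sketch below checks it against a recorded history of failures and recoveries. The function name `is_stable`, the event encoding and the parameter name `gamma` are assumptions made for this example; the slides define stability only as a property of executions.

```python
def is_stable(t_start, t_end, gamma, initially_alive, events):
    """Check the stability condition for the interval [t_start, t_end].

    events is a list of (time, server, kind) tuples, kind being "fail" or
    "recover"; initially_alive is the set of servers alive before any event.
    """
    window_start = t_start - gamma
    alive = set(initially_alive)
    for time, server, kind in sorted(events):
        if time < window_start:
            # Replay history to learn who is alive when the window opens.
            if kind == "fail":
                alive.discard(server)
            else:
                alive.add(server)
        elif time <= t_end:
            return False  # a failure or recovery inside the window
    # No event inside the window, so whoever is alive at its start stays alive.
    return bool(alive)


history = [(2.0, "s2", "fail")]
print(is_stable(10.0, 20.0, 5.0, {"s1", "s2"}, history))  # window is [5, 20]
print(is_stable(5.0, 20.0, 5.0, {"s1", "s2"}, history))   # window is [0, 20]
```

The first call reports a stable interval because the only failure happens before the window opens; the second does not, because the failure falls inside the window.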

System Assumptions
- A failure detector Φ tells servers which other servers are alive.
  - Modeled by recv_{Φ,j}(⟨dead, j'⟩) and recv_{Φ,j}(⟨alive, j'⟩) actions, where j, j' are servers.
  - Can be implemented by heartbeats, network admin, etc.
- Let λ be a parameter. Φ is λ-perfect if it satisfies
  - Accuracy: if recv_{Φ,*}(⟨dead, j'⟩) occurs at time t, then j' is dead sometime in [t - λ, t]. Likewise for recv_{Φ,*}(⟨alive, j'⟩).
  - Timeliness: every j gets a recv_{Φ,j}(⟨dead, j'⟩) or recv_{Φ,j}(⟨alive, j'⟩) message every λ seconds, for every j'.
- Failure detectors are used in many distributed algorithms, and are sometimes provably necessary.
- Safety depends on the failure detector Φ.

A Formal Spec of DHCPF
- DHCP client interface and message exchange sequence. σ is an interaction identifier.
  - Figure (client and server timelines): the client broadcasts with bcast and the server replies with send. The exchange starts with bcast(discover, σ) and send(offer, σ, α), continues with a request answered by an ack, and later a renew answered by an ack.
- A client is correct if it executes this message sequence.
- Client i owns an IP address α at time t if send_{*,i}(ack, *, α, τ) occurs before t, and τ ≥ t - ε.
  - Takes into account the clock skew of the client.
  - If i doesn't own α at t, then i is definitely not using α at t.
  - Assumes correct clients.
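The ownership definition can be made concrete with a small predicate. This is a sketch under assumptions: the comparison symbol is lost in the transcript and is read here as "lease time at least t minus the clock-skew bound", and the `Ack` record, its field names and `owns` are hypothetical names chosen for the example.

```python
from typing import Iterable, NamedTuple


class Ack(NamedTuple):
    """One ack message sent by some server to a client (hypothetical record)."""
    client: str
    address: str
    lease_end: float  # lease time carried by the ack
    sent_at: float    # real time at which the ack was sent


def owns(acks: Iterable[Ack], client: str, address: str, t: float, eps: float) -> bool:
    """True if client may still be using address at real time t.

    Assumed reading: some ack for (client, address) was sent before t and
    its lease time is at least t - eps, where eps bounds the clock skew.
    """
    return any(
        a.client == client
        and a.address == address
        and a.sent_at < t
        and a.lease_end >= t - eps
        for a in acks
    )


acks = [Ack("c1", "10.0.0.5", lease_end=100.0, sent_at=10.0)]
print(owns(acks, "c1", "10.0.0.5", t=100.5, eps=1.0))  # lease may still be in use
print(owns(acks, "c1", "10.0.0.5", t=102.0, eps=1.0))  # lease has surely expired
```

The skew margin is what makes the definition conservative: a client whose clock runs slow may keep using an address slightly past the nominal end of its lease.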

A Formal Spec of DHCPF
- Assume a λ-perfect failure detector and a bound ε on clock skew.
- Safety: for all IP addresses α and at all times t, at most one client owns α at t.
- Request liveness: suppose time t is (4λ + 4ε)-stable and δ-timely, and client i does bcast(discover, σ) at time t. Assume client i is correct and does not fail during [t, t + 4δ]. Then
  - By time t + δ, every live server receives i's message.
  - By time t + 2δ, either send(offer, σ, α) occurs for some α, or for every α, either
    - α was offered to some client but not requested, or
    - there is a lease for α which has not expired.
  - If send(offer, σ, α) occurs, then send(ack, σ, *, *) occurs by time t + 4δ.

A Formal Spec of DHCPF
- Renew liveness: suppose time t is (4λ + 4ε)-stable and δ-timely, and client i has a lease for α for time ≥ t + δ + ε. Then if i broadcasts a renew for α at t, i receives an ack for α by time t + 2δ.

DHCPF Algorithm Overview
- We break the DHCPF problem into two independent subproblems, Lease and Elect.
- Elect: for any IP address α, elect a leader server for α.
  - Only the leader can lease α to clients.
  - There is at most one leader for α at any time.
  - The leader can change as servers fail and recover.
- Lease: the leader gives out leases for α.
  - Ensure clients can always request or renew leases for α.
  - Ensure no double allocation even if the leader changes.
- Lease and Elect run continuously, in parallel.
- The DHCPF algorithm is the formal composition Elect ‖ Lease.

The Elect Algorithm
- For any IP address α, Elect ensures
  - Safety: there is at most one leader server for α at any time.
  - Liveness: if the execution is currently “nice”, then a leader exists.
- Code shown is for server j. State variables:
  - clock: the current clock value at j.
  - live: set of servers j thinks are alive.
  - my-addrs: set of IP addresses j thinks it is leader for.
  - lead-time[α]: time when j became leader for α.
  - rec-time: time when j last recovered.

The Elect Algorithm
- The basic idea is that the min live server should be leader for all α's.
  - Actually, a different min_α can be used for each α, for load balancing.
- If j hears j' is alive
  - Add j' to live.
  - For each α, if j is no longer min_α for α, give up leadership of α.
- If j hears j' is dead
  - Remove j' from live.
  - For each α, if j became min_α for α, and enough time has passed since the last recovery, become leader for α.
  - The time to wait depends on the quality of the failure detector and the clock skew.
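The two handlers described above can be sketched in Python. This is a simplified illustration, not the TIOA code from the slides: it uses one ordering (the smallest server id) for every address, takes the recovery delay as an opaque `wait` parameter because the slides only say that it depends on the failure detector and the clock skew, and only tries to take leadership when a dead notification arrives. The class and method names are hypothetical.

```python
class ElectServer:
    """Leader election state for one server j (hypothetical sketch)."""

    def __init__(self, server_id, addresses, wait, rec_time=0.0):
        self.id = server_id
        self.addresses = set(addresses)  # all addresses in the pool
        self.wait = wait                 # assumed recovery delay before leading
        self.rec_time = rec_time         # time when this server last recovered
        self.live = {server_id}          # servers believed alive
        self.my_addrs = set()            # addresses this server leads
        self.lead_time = {}              # address -> time leadership began

    def is_min(self, addr):
        # Same ordering for every address; a per-address ordering would
        # spread leadership across servers for load balancing.
        return min(self.live) == self.id

    def on_alive(self, other):
        """Failure detector reports that other is alive."""
        self.live.add(other)
        for addr in list(self.my_addrs):
            if not self.is_min(addr):
                self.my_addrs.discard(addr)  # give up leadership
                del self.lead_time[addr]

    def on_dead(self, other, clock):
        """Failure detector reports that other is dead."""
        if other != self.id:
            self.live.discard(other)
        if clock - self.rec_time < self.wait:
            return  # recovered too recently to take over
        for addr in self.addresses - self.my_addrs:
            if self.is_min(addr):
                self.my_addrs.add(addr)
                self.lead_time[addr] = clock


s2 = ElectServer(2, {"a", "b"}, wait=5.0)
s2.on_dead(1, clock=6.0)  # s2 is now the min live server and takes over
s2.on_alive(1)            # server 1 is back, so s2 steps down
```

Stepping down immediately on an alive report while delaying takeover after a recovery is what keeps two servers from leading the same address at once, at the cost of short periods with no leader.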

Elect Properties
- Assume Φ is λ-perfect, and clock skew is at most ε.
- Theorem (Safety): at any time, for any address α, there is at most one server j with α ∈ my-addrs_j.
  - Proof (figure: timelines for servers s1 and s2, each dead or alive, with marked times before t): suppose s1 and s2 are both leaders for α at time t. s1 is alive from an earlier point on, so s2 sees s1 and won't become leader.
- Theorem (Liveness): if the current state is (4λ + 4ε)-stable, then for every address α, we have α ∈ my-addrs_m for m = min_α L, where L is the set of current live servers.

The Lease Algorithm
- To avoid double allocation, the leader should tell the other servers its leases, in case it fails.
- Waiting for acks from the other servers is too slow.
- The leader first gives the client a temporary Maximum Client Lead Time (MCLT) lease.
  - The client gets a shorter lease than it asked for.
- While the client is using the MCLT lease, the leader negotiates an acknowledged lease with the other servers.
- When the client renews, it gets the lease it asked for last time.
- Example (figure: client, leader s1 and server s2), with MCLT = 3:
  - Client sends req(10), s1 replies ok(4); s1 sends lease(10) to s2, s2 replies ack(10).
  - Client sends renew(15), s1 replies ok(10); s1 sends lease(15) to s2, s2 replies ack(15).
  - Client sends renew(20), s1 replies ok(15); s1 sends lease(20) to s2.
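The example with MCLT = 3 can be replayed with a short Python sketch. It is one possible reading of the slide, not the authors' code: the leader grants the later of its acknowledged lease and "now + MCLT", capped at what the client asked for, and negotiates the requested value with the other servers in the background. The class `Leader`, its methods and the request times 1, 3 and 9 are assumptions; only the lease values come from the slide.

```python
MCLT = 3  # Maximum Client Lead Time used in the slide's example


class Leader:
    """Lease state of the leader for a single address (hypothetical sketch)."""

    def __init__(self):
        self.acklease = 0  # largest lease end acknowledged by the other servers
        self.pending = 0   # lease end currently being negotiated

    def handle(self, now, want):
        """Answer a req or renew asking for a lease until time want."""
        granted = min(want, max(self.acklease, now + MCLT))
        self.pending = max(self.pending, want)  # sent to peers as lease(want)
        return granted

    def on_peer_acks(self):
        """All other servers acknowledged the pending lease."""
        self.acklease = max(self.acklease, self.pending)


s1 = Leader()
first = s1.handle(now=1, want=10)   # req(10): only the MCLT lease is safe
s1.on_peer_acks()                   # lease(10) acknowledged
second = s1.handle(now=3, want=15)  # renew(15): gets the lease asked for last time
s1.on_peer_acks()                   # lease(15) acknowledged
third = s1.handle(now=9, want=20)   # renew(20)
print(first, second, third)
```

Under these assumed request times the three grants are 4, 10 and 15, matching ok(4), ok(10) and ok(15) in the figure. The client never waits for the server-to-server round trip; it just receives the full lease one renewal late.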

The Lease Algorithm
- When a new leader takes over, it waits MCLT time, and also until its max acknowledged lease expires.
  - This upper-bounds the maximum potential lease that the previous leader might have given out.
- The leader only gives out a new lease for α when all potential leases have expired.
- This is the main idea of DKS+'03.
- Example (figure: servers s1 and s2): a client's req(10) gets ok(4) from s1, and s1's lease(10) gets ack(10) from s2; a later req(8) is answered nok.
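The takeover rule can be written as a small calculation. The sketch below is an assumption-laden illustration: it bounds the previous leader's outstanding leases by the later of the new leader's recorded potential lease and "takeover time + MCLT", plus a clock-skew margin of twice the skew bound, which is how the invariant on the safety slide appears to read. The names `takeover_bound`, `may_lease` and `eps` are hypothetical.

```python
def takeover_bound(clock, potlease, mclt, eps):
    """Latest time a lease from the previous leader could still be valid.

    clock is the new leader's clock at takeover, potlease the largest lease
    it has acknowledged, and eps the assumed clock-skew bound.
    """
    return max(potlease, clock + mclt + 2 * eps)


def may_lease(clock, bound):
    """The new leader hands out the address only strictly after the bound."""
    return clock > bound


# s2 acknowledged lease(10) before s1 failed, then takes over at time 5.
bound = takeover_bound(clock=5, potlease=10, mclt=3, eps=0.5)
print(bound)                   # the acknowledged lease dominates here
print(may_lease(6, bound))     # a request at time 6 must be refused
print(may_lease(10.5, bound))  # after the bound, leasing is safe again
```

The strict comparison in `may_lease` is a conservative choice made for this sketch; the slides do not say whether the boundary instant itself is allowed.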

The Lease Algorithm
- State variables at server j:
  - potlease[α]: maximum potential lease given out for α.
  - reserved: set of addresses offered but not requested.
  - acklease[α]: the lease value that j will give for α.
  - σ: an interaction identifier.
  - write-acks[σ]: set of servers acknowledging interaction instance σ.
- Annotations on the code shown:
  - MCLT lease.
  - Negotiate acknowledged lease.
  - Give the ack'ed lease.
  - Every server increased potlease, so j can increase acklease.
  - Wait for max of MCLT and potlease.
  - Check α is available.

Safety of Elect ‖ Lease
- Theorem: Elect ‖ Lease satisfies the safety property of the DHCPF specification.
- Proof: a sequence of invariants, proved by induction on the execution.
- Prove that servers have a good estimate of the max lease given out for α.
  - Lemma: for all j, j', if j ∈ write-acks[σ]_j', then potlease[…]_j ≥ […].
  - Lemma: for all j, j', max(potlease[α]_j, clock_j + MCLT + 2ε) ≥ acklease[α]_j'.
    - Key invariant of [DKS+'03].
    - Only consider actions π which increase acklease[α]_j'.

Safety of Elect ‖ Lease
- Lemma: let ℓ be the leader for α. Then potlease[α]_ℓ ≥ acklease[α]_j, for all j.
  - If the inductive step π doesn't change the leader, we show this using the fact that there's at most one leader for α.
  - If the leader changes, then π sets potlease[α]_ℓ to max(potlease[α]_j, clock_j + MCLT + 2ε).
- Since the leader always knows the max lease for α, it avoids double allocation during request or renew.

Liveness of Elect ‖ Lease
- Hard to state: need to identify all situations which prevent progress.
- Easy to prove! When nothing bad happens, something good happens.
- Theorem: Elect ‖ Lease satisfies the request and renew liveness properties of the DHCPF specification.
- Proof (request liveness): suppose client i broadcasts discover at time t.
  - By time t + δ, every live server gets i's message.
  - Since t is (4λ + 4ε)-stable and δ-timely, every α has a leader.
  - Server j doesn't offer i any address only if, for every α that j owns, α has been reserved by another client or the lease for α hasn't expired.
  - If i is offered some α's, then no other client is offered those α's, so within 2δ time, i gets an ack for α.
- The renew liveness proof is similar.

Conclusions
- Formally specified and implemented a fault-tolerant DHCP algorithm using TIOA.
- A simple algorithm based on decomposition into independent subproblems.
- Is our decomposition “good”?
  - Does DHCPF need a perfect failure detector?
  - Is the dependence on clock skew and message delay the best possible?
- Is “goodness” merely a “human” and case-by-case concept, or a more universal one?
  - Perhaps not totally far-fetched? Church-Turing formalized computation, Cook-Levin formalized completeness…

Thank you!