Sliding window protocol The sender continues the send action without receiving the acknowledgements of at most w messages (w > 0), w is called the window.

Slides:



Advertisements
Similar presentations
Impossibility of Distributed Consensus with One Faulty Process
Advertisements

DISTRIBUTED SYSTEMS II FAULT-TOLERANT BROADCAST Prof Philippas Tsigas Distributed Computing and Systems Research Group.
PROTOCOL VERIFICATION & PROTOCOL VALIDATION. Protocol Verification Communication Protocols should be checked for correctness, robustness and performance,
6.852: Distributed Algorithms Spring, 2008 Class 7.
Distributed Computing 8. Impossibility of consensus Shmuel Zaks ©
Failure detector The story goes back to the FLP’85 impossibility result about consensus in presence of crash failures. If crash can be detected, then consensus.
Announcements. Midterm Open book, open note, closed neighbor No other external sources No portable electronic devices other than medically necessary medical.
Computer Science 425 Distributed Systems CS 425 / ECE 428 Consensus
Consensus Hao Li.
Byzantine Generals Problem: Solution using signed messages.
Consensus Algorithms Willem Visser RW334. Why do we need consensus? Distributed Databases – Need to know others committed/aborted a transaction to avoid.
1 TCP CSE May TCP Services Flow control Connection establishment and termination Congestion control 2.
Faults and fault-tolerance One of the selling points of a distributed system is that the system will continue to perform even if some components / processes.
Distributed Systems 2006 Group Communication I * *With material adapted from Ken Birman.
Distributed systems Module 2 -Distributed algorithms Teaching unit 1 – Basic techniques Ernesto Damiani University of Bozen Lesson 3 – Distributed Systems.
EEC-484/584 Computer Networks Lecture 12 Wenbing Zhao (Part of the slides are based on Drs. Kurose & Ross ’ s slides for their Computer.
Asynchronous Consensus (Some Slides borrowed from ppt on Web.(by Ken Birman) )
Transport Layer 3-1 Transport Layer r To learn about transport layer protocols in the Internet: m TCP: connection-oriented protocol m Reliability protocol.
Impossibility of Distributed Consensus with One Faulty Process Michael J. Fischer Nancy A. Lynch Michael S. Paterson Presented by: Oren D. Rubin.
Distributed systems Module 2 -Distributed algorithms Teaching unit 1 – Basic techniques Ernesto Damiani University of Bozen Lesson 4 – Consensus and reliable.
Department of Electronic Engineering City University of Hong Kong EE3900 Computer Networks Transport Protocols Slide 1 Transport Protocols.
Distributed Algorithms: Agreement Protocols. Problems of Agreement l A set of processes need to agree on a value (decision), after one or more processes.
CS 425 / ECE 428 Distributed Systems Fall 2014 Indranil Gupta (Indy) Lecture 19: Paxos All slides © IG.
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Consensus Steve Ko Computer Sciences and Engineering University at Buffalo.
1 Transport Layer Computer Networks. 2 Where are we?
Paxos Made Simple Jinghe Zhang. Introduction Lock is the easiest way to manage concurrency Mutex and semaphore. Read and write locks. In distributed system:
Distributed Consensus Reaching agreement is a fundamental problem in distributed computing. Some examples are Leader election / Mutual Exclusion Commit.
Distributed Consensus Reaching agreement is a fundamental problem in distributed computing. Some examples are Leader election / Mutual Exclusion Commit.
CS3502: Data and Computer Networks DATA LINK LAYER - 2 WB version.
Faults and fault-tolerance
Lecture 8-1 Computer Science 425 Distributed Systems CS 425 / CSE 424 / ECE 428 Fall 2010 Indranil Gupta (Indy) September 16, 2010 Lecture 8 The Consensus.
Go-Back-N ARQ  packets transmitted continuously (when available) without waiting for ACK, up to N outstanding, unACK’ed packets  a logically different.
Why do we need models? There are many dimensions of variability in distributed systems. Examples: interprocess communication mechanisms, failure classes,
Distributed Algorithms – 2g1513 Lecture 9 – by Ali Ghodsi Fault-Tolerance in Distributed Systems.
FALL 2005CSI 4118 – UNIVERSITY OF OTTAWA1 Part 2.5 Internetworking Chapter 25 (Transport Protocols, UDP and TCP, Protocol Port Numbers)
Consensus and Its Impossibility in Asynchronous Systems.
Review for Exam 2. Topics included Deadlock detection Resource and communication deadlock Graph algorithms: Routing, spanning tree, MST, leader election.
Ch11 Distributed Agreement. Outline Distributed Agreement Adversaries Byzantine Agreement Impossibility of Consensus Randomized Distributed Agreement.
Transport Control Protocol (TCP) Features of TCP, packet loss and retransmission, adaptive retransmission, flow control, three way handshake, congestion.
CS4231 Parallel and Distributed Algorithms AY 2006/2007 Semester 2 Lecture 8 Instructor: Haifeng YU.
Chapter 24 Transport Control Protocol (TCP) Layer 4 protocol Responsible for reliable end-to-end transmission Provides illusion of reliable network to.
CS603 Fault Tolerance - Communication April 17, 2002.
Sliding window protocol The sender continues the send action without receiving the acknowledgements of at most w messages (w > 0), w is called the window.
Hwajung Lee. Reaching agreement is a fundamental problem in distributed computing. Some examples are Leader election / Mutual Exclusion Commit or Abort.
Chap 15. Agreement. Problem Processes need to agree on a single bit No link failures A process can fail by crashing (no malicious behavior) Messages take.
Hwajung Lee.  Models are simple abstractions that help understand the variability -- abstractions that preserve the essential features, but hide the.
Hwajung Lee. One of the selling points of a distributed system is that the system will continue to perform even if some components / processes fail.
SysRép / 2.5A. SchiperEté The consensus problem.
Revisiting failure detectors Some of you asked questions about implementing consensus using S - how does it differ from reaching consensus using P. Here.
ITEC452 Distributed Computing Lecture 2 – Part 2 Models in Distributed Systems Hwajung Lee.
Failure detection The design of fault-tolerant systems will be easier if failures can be detected. Depends on the 1. System model, and 2. The type of failures.
Alternating Bit Protocol S R ABP is a link layer protocol. Works on FIFO channels only. Guarantees reliable message delivery with a 1-bit sequence number.
Fault tolerance and related issues in distributed computing Shmuel Zaks GSSI - Feb
CSE 486/586 CSE 486/586 Distributed Systems Leader Election Steve Ko Computer Sciences and Engineering University at Buffalo.
1 The utopia protocol  Unrealistic assumptions: –processing time ignored –infinite buffer space available –simplex: data transmitted in one direction.
Transmission Control Protocol (TCP) TCP Flow Control and Congestion Control CS 60008: Internet Architecture and Protocols Department of CSE, IIT Kharagpur.
Classifying fault-tolerance Masking tolerance. Application runs as it is. The failure does not have a visible impact. All properties (both liveness & safety)
CSE 486/586 CSE 486/586 Distributed Systems Consensus Steve Ko Computer Sciences and Engineering University at Buffalo.
DATA LINK CONTROL. DATA LINK LAYER RESPONSIBILTIES  FRAMING  ERROR CONTROL  FLOW CONTROL.
Ch 3. Transport Layer Myungchul Kim
Group Communication A group is a collection of users sharing some common interest.Group-based activities are steadily increasing. There are many types.
Chapter 3: The Data Link Layer –to achieve reliable, efficient communication between two physically connected machines. –Design issues: services interface.
Reliable multicast Tolerates process crashes. The additional requirements are: Only correct processes will receive multicasts from all correct processes.
1 Chapter 24 Internetworking Part 4 (Transport Protocols, UDP and TCP, Protocol Port Numbers)
Faults and fault-tolerance
Chapter 6: Transport Layer (Part I)
Alternating Bit Protocol
Distributed Consensus
Faults and fault-tolerance
Distributed Consensus
Presentation transcript:

Sliding window protocol The sender continues the send action without receiving the acknowledgements of at most w messages (w > 0), w is called the window size.

Sliding window protocol {program for process S} define next, last, w : integer ; initially next = 0, last = -1, w > 0 do last+1 ≤ next ≤ last + w  send (m[next], next); next := next + 1 ÿ (ack, j) is received  if j > last  last := j  j ≤ last  skip fi ÿ timeout (R,S)  next := last+1 {retransmission begins} od {program for process R} define j : integer ; initially j = 0; do (m[next], next) is received  if j = next  accept the message; send (ack, j); j:= j+1  j ≠ next  send (ack, j-1) fi ; od

Why does it work? Show that Every message is accepted exactly once M[k] is accepted before m[k+1] Required unbounded sequence number. This is bad. Can we avoid it?

Why unbounded sequence no? (m[k],k) ( m’, k) ( m’ ’,k) Retransmitted version of m New message using the same seq number k

Theorem If the communication channels are non-FIFO, and the message propagation delays are arbitrarily large, then using bounded sequence numbers it is impossible to design a window protocol that can withstand the (1) loss, (2) duplication, and (3) reordering of messages.

Alternating Bit Protocol SR Link layer protocol. Works on FIFO channels only. Guarantees reliable message delivery with a 1-bit seq. One message at a time, so window size = 1. Study how this works. m[0], 0 ack, 0

Alternating Bit Protocol program ABP; {program for process S} define sent, b : 0 or 1 ; next : integer ; initially next = 0, sent = 1, b = 0, and channels are empty ; do sent ≠b  send (m[next], b); next := next +1; sent := b  (ack, j) is received  if j = b  b := 1-b  j ≠ b  skip fi timeout (R,S)  send (m[next-1], b) od {program for process R} define j : 0 or 1; {initially j = 0}; do(m[next], b) is received  if j = b  accept the message; send (ack, j); j:= 1 - j  j ≠ b  send (ack, 1-j) fi od S R

How TCP works Supports end-to-end logical connection between any two computers on the Internet. Basic idea is the same as those of sliding window protocols When is it safe to re-use a sequence number seq? If it is unique. With a high probability, a random 32 or 64-bit number is unique. Also, current seq numbers are flushed out of the system after a time = 2  where  is the round trip delay.

How TCP works

Three way handshake. Why is the knowledge of roundtrip delay important? Sequence numbers are unique with high probability. What if the window is small/large? What if the timeout period is small/large? Adaptive retransmission: receiver can throttle sender and control the window size to save its buffer space.

Distributed Consensus Reaching agreement. Fundamental problem in distributed computing. Many examples Leader election / Mutual Exclusion Commit or Abort in distributed transactions Reaching agreement about which process has failed

Distributed Consensus No failure - trivial Consensus in presence of failures can be surprisingly complex.

Problem Specification u3 u2 u1 u0 v v v v Here, v equals the input value at some input line. All outputs must be identical. input output p0 p1 p2 p3

Problem Specification Termination. Every non-faulty process must eventually decide. Agreement. The final decision of every non-faulty process must be identical. Validity. If every non-faulty process begins with the same initial value v, then their final decision must be v.

Asynchronous Consensus Seven members of a busy household decided to hire a cook, since they do not have time to prepare their own food. Each member of the household separately interviewed every applicant for the cook’s position. Depending on how it went, each member formed his or her independent opinion "yes" (means “hire”) or "no" (means “don't hire”). These members will now have to communicate with one another to reach a uniform final decision about whether the applicant will be hired. The process will be repeated with the next applicant, until someone is hired.

Asynchronous Consensus Consider various modes of communication. If there is no failure, then collect all inputs and apply a decision function on the bag of inputs. Easy!

Asynchronous Consensus In a purely asynchronous distributed system consensus problem is impossible to solve if even a single process crashes Famous result due to Fischer, Lynch, Patterson (commonly known as FLP 85)