© 2013 A. Haeberlen, Z. Ives NETS 212: Scalable and Cloud Computing 1 University of Pennsylvania Internet basics; Faults and failures September 10, 2013.


2 © 2013 A. Haeberlen, Z. Ives Announcements HW0 is due at 10:00pm today If you haven't received your svn account info yet, please see me after class HW1 will be available tonight MS1 due 9/17; MS2 due 9/26 [MS2 is still beta!] AWS credit codes will be mailed out later today Please start early! Why not start tomorrow? Don't wait until the last moment -- you will need some time for debugging! Getting an AWS account may take some time (days) -- please sign up for the account soon! Anyone still on the waiting list? Please come see me after class (one last time) 2 University of Pennsylvania

3 © 2013 A. Haeberlen, Z. Ives Plan for today Parallel programming and its challenges Parallelization and scalability, Amdahl's law Synchronization, consistency Mutual exclusion, locking, issues related to locking Architectures: SMP, NUMA, Shared-nothing All about the Internet in 30 minutes Structure; packet switching; some important protocols Latency, packet loss, bottlenecks, and why they matter Distributed programming and its challenges Network partitions and the CAP theorem Faults, failures, and what we can do about them 3 University of Pennsylvania NEXT

4 © 2013 A. Haeberlen, Z. Ives The life and times of a web request What happens when I open the web page 'www.google.com' in my browser? First approximation: My computer contacts another computer in California and requests the web page from there 4 University of Pennsylvania "Please give me the web page" Server in California

5 HTTP and HTML
There are standardized protocols:
  Hypertext Transfer Protocol (HTTP): Describes how web pages are requested
  Hypertext Markup Language (HTML): The language the actual web page is written in
How does the request make it to California?
[Figure: the client sends "GET / HTTP/1.1" ("Please give me the web page") to the server in California, which answers "HTTP/1.1 200 OK" followed by the Google page.]

6 The Internet
The Internet consists of tens of thousands of interconnected networks
Routers and switches forward the data from one network link to the next
Request and response travel along a path through these networks (usually, but not always, the 'shortest' path)
[Figure: a path of individual network links from the client to the server in California; networks shown: UPenn, Cogent, AT&T, Level 3, Google, each containing routers and switches.]

7 Packet switching
Communication consists of packets
  Each packet traverses the path independently
  No dedicated connection like in the telephone network
  Packets are relatively small (typically up to 1,500 bytes)
Why is this a good idea?
[Figure: packets traveling between the client and the server in California across the UPenn, Cogent, AT&T, Level 3, and Google networks.]

8 IP addresses
How do routers know where to send a packet?
Each machine is assigned an IP address
  Machines in the same network are given similar addresses, usually from an IP range (Example: Penn's IP range is 158.130.0.0/16)
Each packet has a source and a destination address
Each router has a forwarding table that maps ranges to links over which packets in that range should be sent
[Figure: packet layout from bit 0 to bit 31: a leading 4 (indicates this is an IPv4 packet), source IP, destination IP, and data. Example addresses 173.194.34.104 and 158.130.53.72 on a path across the UPenn, Cogent, AT&T, Level 3, and Google networks.]
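
The forwarding-table idea on this slide can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions: the table entries and link names are hypothetical (only 158.130.0.0/16 comes from the slide), and the lookup uses longest-prefix matching, which is how IP routers commonly resolve overlapping ranges.

```javascript
// Convert dotted-quad IPv4 text to an unsigned 32-bit integer.
function ipToInt(ip) {
  return ip.split('.').reduce(
    (acc, octet) => ((acc << 8) | parseInt(octet, 10)) >>> 0, 0);
}

// Does `ip` fall inside the CIDR range, e.g. "158.130.0.0/16"?
function inRange(ip, cidr) {
  const [base, lenText] = cidr.split('/');
  const len = parseInt(lenText, 10);
  // A shift by 32 is a no-op in JavaScript, so /0 is handled separately.
  const mask = len === 0 ? 0 : (0xFFFFFFFF << (32 - len)) >>> 0;
  return ((ipToInt(ip) & mask) >>> 0) === ((ipToInt(base) & mask) >>> 0);
}

// Hypothetical forwarding table. Only 158.130.0.0/16 (Penn's range) is
// taken from the slide; the other range and all link names are made up.
const forwardingTable = [
  { range: '0.0.0.0/0',      link: 'default-uplink' },
  { range: '158.130.0.0/16', link: 'link-to-upenn' },
  { range: '173.194.0.0/16', link: 'link-to-google' },
];

// Pick the matching entry with the longest prefix (the most specific range).
function lookup(destination) {
  let best = null;
  for (const entry of forwardingTable) {
    const len = parseInt(entry.range.split('/')[1], 10);
    if (inRange(destination, entry.range) && (best === null || len > best.len)) {
      best = { link: entry.link, len: len };
    }
  }
  return best ? best.link : null;
}

console.log(lookup('158.130.53.72'));
console.log(lookup('173.194.34.104'));
```

The default entry (0.0.0.0/0) matches every address, so a destination that falls in no specific range is still forwarded somewhere.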

9 IP routing
Networks exchange routing information
  If a connection or router fails, this information is updated
Result: Global reachability. Any machine on the Internet can (in principle) communicate with any other machine.
[Figure: networks labeled A through N; the announcement "I know how to get to A" is passed from network to network.]

10 © 2013 A. Haeberlen, Z. Ives Plan for today Parallel programming and its challenges Parallelization and scalability, Amdahl's law Synchronization, consistency Mutual exclusion, locking, issues related to locking Architectures: SMP, NUMA, Shared-nothing All about the Internet in 30 minutes Structure; packet switching; some important protocols Latency, packet loss, bottlenecks, and why they matter Distributed programming and its challenges Network partitions and the CAP theorem Faults, failures, and what we can do about them 10 University of Pennsylvania NEXT

11 © 2013 A. Haeberlen, Z. Ives Path properties: Bottleneck capacity How fast can we send data on our path? Limited by the bottleneck capacity What else does the available capacity depend on? Which links are usually the bottleneck links? 11 University of Pennsylvania Server Client Bottleneck

12 Path properties: Propagation delay
Speed of light: 299 792 458 m/s
Latency matters!

[ahae@ds01 ~]$ traceroute www.mpi-sws.org
traceroute to www.mpi-sws.org (139.19.1.156), 30 hops max, 60 byte packets
 1 SUBNET-46-ROUTER.seas.UPENN.EDU (158.130.46.1) 1.744 ms 2.134 ms 2.487 ms
 2 158.130.21.34 (158.130.21.34) 5.327 ms 5.395 ms 5.649 ms
 3 isc-uplink-2.seas.upenn.edu (158.130.128.2) 5.671 ms 5.825 ms 6.175 ms
 4 external3-core1.dccs.UPENN.EDU (128.91.9.2) 6.007 ms 6.283 ms 6.362 ms
 5 external-core2.dccs.upenn.edu (128.91.10.1) 6.830 ms 6.990 ms 7.080 ms
 6 local.upenn.magpi.net (216.27.100.73) 7.250 ms 3.429 ms 3.533 ms
 7 remote.internet2.magpi.net (216.27.100.54) 4.487 ms 3.002 ms 2.925 ms
 8 198.32.11.51 (198.32.11.51) 90.557 ms 90.806 ms 91.028 ms
 9 so-6-2-0.rt1.fra.de.geant2.net (62.40.112.57) 97.403 ms 97.473 ms 97.766 ms
10 dfn-gw.rt1.fra.de.geant2.net (62.40.124.34) 98.834 ms 98.890 ms 99.043 ms
11 xr-fzk1-te2-3.x-win.dfn.de (188.1.145.50) 100.627 ms 101.034 ms 101.387 ms
12 xr-kai1-te1-1.x-win.dfn.de (188.1.145.102) 103.985 ms 104.383 ms 104.528 ms
13 xr-saa1-te1-1.x-win.dfn.de (188.1.145.97) 103.636 ms 103.903 ms 104.139 ms
14 kr-0unisb.x-win.dfn.de (188.1.234.38) 103.983 ms 103.746 ms 103.853 ms
15 mpi2rz-hsrp2.net.uni-saarland.de (134.96.6.28) 104.469 ms 104.355 ms 104.491 ms
[ahae@ds01 ~]$

Distance: ~6,270 km (one way). The times in the traceroute output are round-trip times.
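
To see how much of the measured ~104 ms is unavoidable, the lower bound on the round-trip time can be worked out from the distance and the speed of light. The sketch below uses the two numbers from the slide; the 2/3 factor for light in fiber is a common rule of thumb and an assumption here, not a figure from the slides.

```javascript
const SPEED_OF_LIGHT_M_PER_S = 299792458; // from the slide
const FIBER_FACTOR = 2 / 3; // assumption: light in fiber travels at roughly 2/3 c

// Lower bound on the round-trip time in milliseconds for a given one-way
// distance, ignoring queueing, routing detours, and processing delays.
function minRttMs(distanceKm, speedFactor) {
  const oneWaySeconds =
    (distanceKm * 1000) / (SPEED_OF_LIGHT_M_PER_S * speedFactor);
  return 2 * oneWaySeconds * 1000;
}

const vacuumRtt = minRttMs(6270, 1);            // about 42 ms
const fiberRtt = minRttMs(6270, FIBER_FACTOR);  // about 63 ms
console.log(vacuumRtt.toFixed(1) + ' ms in vacuum, ' +
            fiberRtt.toFixed(1) + ' ms in fiber');
```

Under these assumptions, more than half of the measured round-trip time is pure propagation delay, which no amount of faster hardware can remove.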

13 © 2013 A. Haeberlen, Z. Ives RTT versus geographical distance 13 University of Pennsylvania Source: http://www.caida.org/projects/ark/statistics/otp-ro/rtt_vs_distance.html Theoretical best (given speed of light in fiber)

14 Path properties: Queueing delay, loss
What if we send packets too quickly?
  Router stores the packets in a queue until it can send them
  Consequence: End-to-end delay increases
  Where does this matter?
What if the router runs out of queue space?
  Packets are dropped and lost
  Other reasons why packets might be dropped?

15 TCP
Transmission Control Protocol (TCP) provides the abstraction of a reliable stream of bytes
  Ensures packets are delivered to the application in the correct order
  Retransmits lost packets
  Tracks available capacity and prevents packets from being sent too fast (congestion control)
  Prevents sender from overwhelming the receiver (flow control)
[Figure: the sender's TCP layer passes data packets 1, 2, 3, 4 to IP; the receiver's TCP layer returns acknowledgments (ACK 1, ACK 2).]

16 TCP congestion control
How fast should the sender send?
Problem: Available capacity not known (and can vary)
Solution: Congestion control
  Maintain a congestion window (cwnd): the maximum number of packets in flight
  Slow start: Exponential increase until a threshold (ssthresh)
    Increase cwnd by one packet for each incoming ACK
  Congestion avoidance: Additive increase, multiplicative decrease (AIMD)
[Figure: congestion window (cwnd) over time: exponential growth in the "slow start" phase (actually fast!) up to ssthresh, then additive growth; a packet loss cuts the window by 50%.]
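
The shape of the cwnd curve can be reproduced with a small simulation. This is a simplified sketch, not real TCP. It advances one round trip at a time, and it assumes that on a loss both ssthresh and cwnd are set to half the current window, matching the "-50%" in the figure. Actual TCP variants differ in the details. The function name and parameters are hypothetical.

```javascript
// Simulate the congestion window (in packets), one round trip per step.
// lossRounds is a Set of round indices in which a packet loss is detected.
function simulateCwnd(rounds, initialSsthresh, lossRounds) {
  let cwnd = 1;
  let ssthresh = initialSsthresh;
  const history = [];
  for (let round = 0; round < rounds; round++) {
    history.push(cwnd);
    if (lossRounds.has(round)) {
      // Multiplicative decrease: halve the window.
      ssthresh = Math.max(Math.floor(cwnd / 2), 1);
      cwnd = ssthresh;
    } else if (cwnd < ssthresh) {
      // Slow start: +1 packet per ACK doubles cwnd every round trip.
      cwnd = Math.min(cwnd * 2, ssthresh);
    } else {
      // Congestion avoidance: additive increase of one packet per round trip.
      cwnd += 1;
    }
  }
  return history;
}

const trace = simulateCwnd(10, 8, new Set([6]));
console.log(trace.join(' '));
```

With an initial ssthresh of 8 and a loss in round 6, the window doubles up to 8, climbs by one per round to 11, drops to 5, and then resumes the additive climb.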

17 Another reason why latency matters
The higher the RTT, the slower the process
[Figure: sender/receiver timelines showing the segments 1-1460, 1461-2920, and 2921-4380 and the acknowledgments ACK 1460, ACK 2920, and ACK 4380.]

18 Recap: The Internet in 30 minutes
What is the Internet?
  Tens of thousands of interconnected networks
  Technology: Packet switching (not like telephone network!)
How does the network matter to applications?
  Propagation delay: Good to be physically close to customer
  Bottlenecks: Transfer speed is limited
  Queueing delays, loss, reordering: Delay can vary
  Network can partition: Problem for consistency/availability
Some of these can be taken care of by TCP

19 © 2013 A. Haeberlen, Z. Ives A quick look at HW1 19 University of Pennsylvania

20 © 2013 A. Haeberlen, Z. Ives A quick look at HW1 Task: Build a cloud-based "Image search" Goal: Get experience with JavaScript, jQuery, Node.js, SimpleDB; start working with large data sets (DBpedia) We've provided most of the code (and detailed instructions), but you have to fill in some missing pieces 20 University of Pennsylvania

21 How does this work?
[Figure: components shown: your VM; a browser holding the web page (home.ejs) and the script (app.js), e.g. function foo() { $("#id").html("x"); }, linked by DOM accesses; the server (HW1.js), e.g. require('http'); http.createServer(...); Amazon SimpleDB; and the Internet connecting them.]

22 What is JavaScript?
A widely-used programming language
  Started out at Netscape in 1995
  Widely used on the web; supported by every major browser
  Also used in many other places: PDFs, certain games, ...
  ... and now even on the server side (Node.js)!
What is it like?
  Dynamic typing, duck typing
  Object-based, but associative arrays instead of 'classes'
  Prototypes instead of class-based inheritance
  Supports run-time evaluation via eval()
  First-class functions

23 Running JavaScript in the browser
Web pages can contain JavaScript code
  Example: Pop up a dialog box when user clicks a button
The code can receive user inputs (e.g., clicks) and produce outputs, e.g., by changing the web page in which it runs
  This is done via the DOM (Document Object Model)
Not just a toy language: Entire applications are being written in it (think Google Apps!)

Event handler (uses the DOM to change text on the page; the element initially shows "Red"):

    function update() {
      document.getElementById("color").innerHTML = "Green";
    }

24 What is jQuery?
A lightweight JavaScript library
  Makes many common functions, such as DOM manipulation or AJAX, much easier to implement (typically a single line)
  Examples: $("#id").html(), $("#id").click(), $.getJSON(), ...
  Widely used (Google, Microsoft, IBM, Netflix, ...)

test.html (as rendered): "This is some bold text in a paragraph." with two buttons, "Show Text" and "Show HTML"

app.js:

    $(document).ready(function(){
      $("#btn1").click(function(){
        alert("Text: " + $("#test").text());
      });
      $("#btn2").click(function(){
        alert("HTML: " + $("#test").html());
      });
    });

25 © 2013 A. Haeberlen, Z. Ives Bootstrap A toolbox for creating web sites HTML/CSS-based design templates for typography, forms, buttons, navigation, and other interface elements Can do popups, navbars, progress bars,... 25 University of Pennsylvania

26 What is Node.js?
A platform for JavaScript-based network apps
  Based on Google's JavaScript engine from Chrome
  Comes with a built-in HTTP server library
  Lots of libraries and tools available; even has its own package manager (npm)
Event-driven programming model
  There is a single "thread", which must never block
  If your program needs to wait for something (e.g., a response from some server you contacted), it must provide a callback function

27 "Hello World" with Node.js
Uses built-in HTTP library to create a server
The server will listen on port 8080
createServer() is given a callback function that is called whenever someone requests a web page
  Callback writes the required HTTP header followed by "Hello World"
To view the result, open http://localhost:8080/ in a browser

    var http = require('http');
    // Callback function: called once for each incoming request
    http.createServer(function (request, response) {
      response.writeHead(200, {'Content-Type': 'text/plain'});
      response.end('Hello World\n');
    }).listen(8080);
    console.log('Server running at http://localhost:8080/');

Request: GET / HTTP/1.1
Response: HTTP/1.1 200 OK, Content-Type: text/plain, body "Hello World"

28 What is JSON?
A standard format for data interchange
  "JavaScript Object Notation"; MIME type application/json
  Basically legal JavaScript code; can be parsed with eval()
  Often used in AJAX-style applications
  Data types: Numbers, strings, booleans, arrays, "objects"
    Array: ordered sequence of values; can be different types
    "Object": unordered collection of key-value pairs

    {
      "firstName": "John",
      "lastName": "Smith",
      "age": 25,
      "address": {
        "streetAddress": "21 2nd Street",
        "city": "New York",
        "state": "NY",
        "postalCode": 10021
      },
      "phoneNumber": [
        { "type": "home", "number": "212 555-1234" },
        { "type": "fax", "number": "646 555-4567" }
      ]
    }
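
As a small illustration of working with JSON in JavaScript, the sketch below uses the built-in JSON.parse() and JSON.stringify() functions instead of eval(). JSON.parse() accepts only data, not arbitrary code, so it is generally the safer choice for text that arrives over the network. The sample text is a shortened version of the record on the slide.

```javascript
// Parse JSON text into a JavaScript value.
const text = '{"firstName": "John", "age": 25, ' +
             '"phoneNumber": [{"type": "home", "number": "212 555-1234"}]}';
const person = JSON.parse(text);
console.log(person.firstName, person.phoneNumber[0].number);

// Serialize a JavaScript object into JSON text.
const serialized = JSON.stringify({ num_results: 5, foo: 123 });
console.log(serialized);

// Strict JSON requires double-quoted keys; a JavaScript object literal
// with bare keys is legal code but is rejected by JSON.parse().
let rejected = false;
try {
  JSON.parse('{ num_results: 5 }');
} catch (e) {
  rejected = e instanceof SyntaxError;
}
console.log('bare keys rejected:', rejected);
```

The last case shows that the overlap with JavaScript is one-directional: valid JSON text reads as a JavaScript literal, but not every JavaScript object literal is valid JSON.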

29 Calling the server
Client code (in your browser):

    $.getJSON('/search/' + $("#inputfield").val(), function(data) {
      $("#status").html(data.num_results + " result(s)");
    });

Server code (Node.js):

    var express = require('express');
    var app = express();
    ...
    app.get('/search/:word', function(req, res) {
      var n = findResults(req.params.word);
      res.send(JSON.stringify({num_results: n, foo: 123}));
    });

The browser sends GET /search/clouds HTTP/1.1 and the server answers with { num_results: 5, foo: 123 }. The status shown on the web page (in your browser) changes from "Waiting..." to "5 result(s)".

30 © 2013 A. Haeberlen, Z. Ives Fear not! You don't have to learn this all at once For HW1, we've already written most of the code for you All you need to do for milestone #1 is implement a small part of the application But it'll be useful to see how the pieces fit together The TAs are trying to organize a lab session on Friday; stay tuned for announcements on Piazza 30 University of Pennsylvania

31 © 2013 A. Haeberlen, Z. Ives Plan for today Parallel programming and its challenges Parallelization and scalability, Amdahl's law Synchronization, consistency Mutual exclusion, locking, issues related to locking Architectures: SMP, NUMA, Shared-nothing All about the Internet in 30 minutes Structure; packet switching; some important protocols Latency, packet loss, bottlenecks, and why they matter Distributed programming and its challenges Faults, failures, and what we can do about them Network partitions, CAP theorem, relaxed consistency 31 University of Pennsylvania NEXT

32 © 2013 A. Haeberlen, Z. Ives Complications in wide-area networks Communication is slower, less reliable Latencies are higher, more variable Bottleneck capacity is lower Packet loss, reordering, queueing delays Faults are more common Broken or malfunctioning nodes Network partitions 32 University of Pennsylvania

33 Faults and failures
Terminology:
  Fault: Some component is not working correctly
  Failure: System as a whole is not working correctly
[Figure: three scenarios labeled "Correct", "Fault (masked)", and "Faults causing failure", each showing "Set X:=5" followed by "What is X?" with replies of X=5 or X=3.]

34 Faults in distributed systems
What could possibly go wrong?
  Node loses power
  Hard disk fails
  Administrator accidentally erases data
  Administrator configures node incorrectly
  Software bug is triggered
  Network is overloaded, drops lots of packets
  Hacker breaks into some of the nodes
  Disgruntled employee manipulates node
  Fire breaks out in data center where node resides
  Police confiscate node because of illegal activity
  ...

35 © 2013 A. Haeberlen, Z. Ives Common misconceptions about faults "Faults are rare exceptions" NO! At scale, faults are occurring all the time Stopping the system while handling the fault is NOT an option - system needs to continue despite the fault "Faulty machines always stop/crash" NO! There are many types of faults with different effects If your system is designed to handle only crash faults and another type of fault occurs, things can become very bad 35 University of Pennsylvania

36 Types of faults
Crash faults
  Node simply stops
  Examples: OS crash, power loss
Rational behavior
  Owner manipulates node to increase profit
  Example: Traffic attraction attack (see next slide)
Byzantine faults
  Arbitrary: faulty node could do anything (stop, tamper with data, tell lies, attack other nodes, send spam, spy on user, ...)
  Examples: Node compromised by a hacker, data corruption, hardware defect, ...

37 Rational fault example
System + control are distributed
Alice's provider can choose between several routes to the same destination
[Figure: Alice says "I need connectivity!"; the question "Who knows how to get to YouTube?" is answered by one network with "I have a good route" and by another with "I have an okay route"; dollar signs mark the networks.]
Traffic attraction: see Goldberg et al., "Rationality and Traffic Attraction: Incentives for honestly announcing paths in BGP", SIGCOMM 2010

38 Rational fault example
Networks have an incentive to make their routes appear better than they are
[Figure: one network announces to Alice's provider "I have a GREAT route to YouTube"; another thinks "I wish my route had been chosen".]

39 © 2013 A. Haeberlen, Z. Ives Some examples of Byzantine faults 39 University of Pennsylvania http://consumerist.com/2007/08/lax-meltdown-caused-by-a-single-network-interface-card.html

40 © 2013 A. Haeberlen, Z. Ives Some examples of Byzantine faults 40 University of Pennsylvania http://status.aws.amazon.com/s3-20080720.html

41 © 2013 A. Haeberlen, Z. Ives Some examples of Byzantine faults 41 University of Pennsylvania http://www.wired.com/epicenter/2009/01/magnolia-suffer/

42 © 2013 A. Haeberlen, Z. Ives Some examples of Byzantine faults 42 University of Pennsylvania http://groups.google.com/group/google-appengine/msg/ba95ded980c8c179

43 Some examples of Byzantine faults
Disgruntled UBS PaineWebber Employee Charged with Allegedly Unleashing "Logic Bomb" on Company Computers

NEWARK - A disgruntled computer systems administrator for UBS PaineWebber was charged today with using a "logic bomb" to cause more than $3 million in damage to the company's computer network, and with securities fraud for his failed plan to drive down the company's stock with activation of the logic bomb, U.S. Attorney Christopher J. Christie announced.

Roger Duronio, 60, of Bogota, N.J., was charged today in a two-count Indictment returned by a federal grand jury, according to Assistant U.S. Attorney William Devaney. The Indictment alleges that Duronio, who worked at PaineWebber's offices in Weehawken, N.J., planted the logic bomb in some 1,000 of PaineWebber's approximately 1,500 networked computers in branch offices around the country.

Duronio, who repeatedly expressed dissatisfaction with his salary and bonuses at PaineWebber, resigned from the company on Feb. 22, 2002. The logic bomb Duronio allegedly planted was activated on March 4, 2002. In anticipation that the stock price of UBS PaineWebber's parent company, UBS, A.G., would decline in response to damage caused by the logic bomb, Duronio also purchased more than $21,000 of "put option" contracts for UBS, A.G.'s stock, according to the charging document. A put option is a type of security that increases in value when the stock price drops. Market conditions at the time suggest there was no such impact on the UBS, A.G. stock price. [...]

The Indictment alleges that, from about November 2001 to February, Duronio constructed the logic bomb computer program. On March 4, as planned, Duronio's program activated and began deleting files on over 1,000 of UBS PaineWebber's computers. It cost PaineWebber more than $3 million to assess and repair the damage, according to the Indictment.

As one of the company's computer systems administrators, Duronio had responsibility for, and access to, the entire UBS PaineWebber computer network, according to the Indictment. He also had access to the network from his home computer via secure Internet access. [...]

Source: http://www.justice.gov/criminal/cybercrime/duronioIndict.htm

44 Correlated faults
A single problem can cause many faults
  Example: Overloaded machine crashes, which increases the load on other machines (domino effect)
  Example: Bug is triggered in a program that is used on lots of machines
  Example: Hacker manages to break into many computers due to a shared vulnerability
  Example: Machines may be connected to the same power grid, cooled by the same A/C, managed by the same admin, ...
Why is this problematic?

45 © 2013 A. Haeberlen, Z. Ives Recap: Faults and failures Faults happen all the time Hardware malfunction, software bug, manipulation, hacker break-ins, misconfiguration,... NOT a rare occurrence at scale - must design system to handle them All faults are NOT independent crash faults Faults can be correlated Rational and Byzantine faults are real Three common fault models: Crash fault model: Faulty machines simply stop Rational model: Machines manipulated by selfish owners Byzantine fault model: Faulty machines could do anything 45 University of Pennsylvania

46 © 2013 A. Haeberlen, Z. Ives So what can we do? 46 University of Pennsylvania

47 What can we do?
Prevention and avoidance
  Example: Prevent crashes with software verification
  Example: Provide incentives for participation
Detection
  Example: Cross-check a network's route announcements with other information to see whether it is lying, and hold it accountable if it is (e.g., sue for breach of contract)
Masking
  Example: Store replicas of the data on multiple nodes; if data is lost or corrupted on one of them, we still have the other copies
Mitigation

48 Masking faults with replication
Alice can store her data on both servers
Bob can get the data from either server
A single crash fault on a server does not lead to a failure
  Availability is maintained
What about other types of faults, or multiple faults?
[Figure: Alice and Bob, each connected to Server A and Server B.]

49 Problem: Maintaining consistency
What if multiple clients are accessing the same set of replicas?
  Requests may be ordered differently by different replicas
  Result: Inconsistency! (remember race conditions?)
For what types of requests can this happen?
What do we need to do to maintain consistency?
[Figure: Alice sends X:=5 and Bob sends X:=7 to both Server A and Server B.]

50 Types of consistency
Strong consistency
  After an update completes, any subsequent access will return the updated value
Weak consistency
  Updated value not guaranteed to be returned immediately, only after some conditions are met (inconsistency window)
Eventual consistency
  A specific type of weak consistency
  If no new updates are made to the object, eventually all accesses will return the last updated value

51 Example: Storage system
Scenario: Replicated storage
  We have N nodes that can store data
  Data contains a monotonically increasing timestamp
To write a value:
  Pick W replicas and write the value to each, using a fresh timestamp (say, the current wallclock time)
To read a value:
  Pick R replicas and read the value from each
  Return the value with the highest timestamp
  If any replicas had a lower timestamp, send them the newer value
[Figure: four replicas holding X=3 (v1), X=5 (v2), X=2 (v4), and X=5 (v2).]
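
The read and write procedures on this slide can be sketched as follows. This is a minimal in-memory illustration, not the design of any particular storage system. The function names are hypothetical, the caller chooses the read and write sets explicitly, and timestamps are passed in as plain numbers rather than taken from the wall clock so that the example is deterministic.

```javascript
// Each replica holds one value plus the timestamp of the write that produced it.
function makeReplicas(n) {
  return Array.from({ length: n }, () => ({ value: null, timestamp: 0 }));
}

// Write to the replicas at the given indices (the "write set").
// Assumption: a replica ignores a write that is older than what it already has.
function write(replicas, writeSet, value, timestamp) {
  for (const i of writeSet) {
    if (timestamp > replicas[i].timestamp) {
      replicas[i] = { value: value, timestamp: timestamp };
    }
  }
}

// Read from the replicas at the given indices (the "read set"), return the
// value with the highest timestamp, and send that value to any replica in
// the read set that had a lower timestamp.
function read(replicas, readSet) {
  let newest = replicas[readSet[0]];
  for (const i of readSet) {
    if (replicas[i].timestamp > newest.timestamp) newest = replicas[i];
  }
  for (const i of readSet) {
    if (replicas[i].timestamp < newest.timestamp) {
      replicas[i] = { value: newest.value, timestamp: newest.timestamp };
    }
  }
  return newest.value;
}

const replicas = makeReplicas(3);          // N = 3
write(replicas, [0, 1], 3, 1);             // X=3 at timestamp 1, W = 2
write(replicas, [1, 2], 5, 2);             // X=5 at timestamp 2, W = 2
const result = read(replicas, [0, 1]);     // R = 2, overlaps the last write at replica 1
console.log(result, replicas[0]);
```

Replica 0 missed the second write, but the read set overlaps the write set at replica 1, so the read returns 5 and brings replica 0 up to date as a side effect.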

52 How to set N, R, and W
For strong consistency?
  What happens otherwise? Will the data ever become consistent again?
To avoid conflicting writes?
To make reads fast? Writes?
To minimize the risk of data loss?
Let's do some examples!
  N=2, W=2, R=1
  N=2, W=1, R=1
[Figure: replicas grouped into a read set and a write set.]
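
One way to reason about the two examples is the usual quorum-overlap rule: if R + W > N, every read set shares at least one replica with every write set, and if 2W > N, any two write sets share a replica. The helper below is a hypothetical sketch that only evaluates these inequalities. It is one possible reading of the questions on the slide, not an answer key.

```javascript
// Evaluate the quorum-overlap conditions for a given configuration.
function analyzeQuorum(n, w, r) {
  return {
    // Every read set intersects every write set, so a read sees the latest write.
    readSeesLatestWrite: r + w > n,
    // Any two write sets intersect, so conflicting writes meet at some replica.
    writeSetsOverlap: 2 * w > n,
    // A completed write is stored on W replicas, so W - 1 of them can be lost.
    replicaLossesTolerated: w - 1,
  };
}

const strict = analyzeQuorum(2, 2, 1);   // N=2, W=2, R=1
const relaxed = analyzeQuorum(2, 1, 1);  // N=2, W=1, R=1
console.log(strict, relaxed);
```

Under this rule the first configuration gives reads that always see the latest write, at the price of writes that need both replicas. The second makes reads and writes equally cheap but allows stale reads.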

53 Consensus
Replicas need to agree on a single order in which to execute client requests
  How can we do this?
  Does the specific order matter?
Problem: What if some replicas are faulty?
  Crash fault: Replica does not respond; no progress (bad)
  Byzantine fault: Replica might tell lies, corrupt order (worse)
Solution: Consensus protocol
  Paxos (for crash faults), PBFT (for Byzantine faults)
  Works as long as no more than a certain fraction of the replicas are faulty (PBFT: one third)

54 © 2013 A. Haeberlen, Z. Ives How do consensus protocols work? Idea: Correct replicas 'outvote' faulty ones Clients send requests to each of the replicas Replicas coordinate and each return a result Client chooses one of the results, e.g., the one that is returned by the largest number of replicas If a small fraction of the replicas returns the wrong result, or no result at all, they are 'outvoted' by the other replicas 54 University of Pennsylvania
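
The client-side "outvoting" step can be sketched as below. This covers only the final selection of a result, not the coordination among replicas that a real consensus protocol performs. The function name and the `needed` threshold are hypothetical. The example uses f + 1 matching replies with f = 1, modeled on how PBFT clients accept a result.

```javascript
// Return the result reported by the largest number of replicas, or null
// unless at least `needed` replicas agree on it. A replica that returned
// no result at all is represented as undefined.
function chooseResult(replies, needed) {
  const counts = new Map();
  for (const reply of replies) {
    if (reply === undefined) continue;
    const key = JSON.stringify(reply);   // compare results by their JSON text
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  let bestKey = null;
  let bestCount = 0;
  for (const [key, count] of counts) {
    if (count > bestCount) {
      bestKey = key;
      bestCount = count;
    }
  }
  return bestCount >= needed ? JSON.parse(bestKey) : null;
}

// Four replicas, at most one faulty: two matching replies are required.
const accepted = chooseResult([5, 5, 3, undefined], 2);    // one wrong, one silent
const undecided = chooseResult([5, 3, 7, undefined], 2);   // no two replies agree
console.log(accepted, undecided);
```

In the first call the wrong reply and the missing reply are outvoted. In the second call no result has enough support, so the client cannot safely accept any of them.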

55 © 2013 A. Haeberlen, Z. Ives Plan for today Parallel programming and its challenges Parallelization and scalability, Amdahl's law Synchronization, consistency Mutual exclusion, locking, issues related to locking Architectures: SMP, NUMA, Shared-nothing All about the Internet in 30 minutes Structure; packet switching; some important protocols Latency, packet loss, bottlenecks, and why they matter Distributed programming and its challenges Faults, failures, and what we can do about them Network partitions, CAP theorem, relaxed consistency 55 University of Pennsylvania NEXT

56 Network partitions
Network can partition
  Hardware fault, router misconfigured, undersea cable cut, ...
  Result: Global connectivity is lost
What does this mean for the properties of our system?
[Figure: Alice, Bob, Server A, and Server B, with the question "What if this link breaks?"]

57 The CAP theorem
What we want from a web system:
  Consistency: All clients share the same view of the data, even in the presence of concurrent updates
  Availability: All clients can access at least one replica of the data, even when faults occur
  Partition-tolerance: Consistency and availability hold even when the network partitions
Can we get all three?
CAP theorem: We can get at most two out of the three
  Which ones should we choose for a given system?
  Conjecture by Brewer; proven by Gilbert and Lynch

58 © 2013 A. Haeberlen, Z. Ives Common CAP choices Example #1: Consistency & Partition tolerance Many replicas + consensus protocol Do not accept new write requests during partitions Certain functions may become unavailable Example #2: Availability & Partition tolerance Many replicas + relaxed consistency Continue accepting write requests Clients may see inconsistent state during partitions 58 University of Pennsylvania

59 Relaxed consistency: ACID vs. BASE
Classical database systems: ACID semantics
  Atomicity
  Consistency
  Isolation
  Durability
Modern Internet systems: BASE semantics
  Basically Available
  Soft state
  Eventually consistent

60 Eventual consistency
Idea: Optimistically allow updates
  Don't coordinate with ALL replicas before returning response
  But ensure that updates reach all replicas eventually
  What do we do if conflicting updates were made to different replicas?
Good: Decouples replicas. Better performance, availability under partitions
(Potentially) bad: Clients can see inconsistent state
[Figure: Alice sending an update to Server A, which is replicated to Server B.]
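
The slide leaves open how conflicting updates are resolved. One simple policy, used here purely as an illustrative assumption, is "last writer wins": each update carries a timestamp, and when replicas exchange state, the entry with the higher timestamp survives. The sketch below models two replicas as in-memory maps. The function names and timestamps are hypothetical.

```javascript
// Each replica is a Map from key to { value, timestamp }.
// An update is applied only if it is newer than what the replica holds.
function applyUpdate(replica, key, value, timestamp) {
  const current = replica.get(key);
  if (!current || timestamp > current.timestamp) {
    replica.set(key, { value: value, timestamp: timestamp });
  }
}

// Background synchronization: exchange all entries in both directions.
function sync(a, b) {
  for (const [key, entry] of a) applyUpdate(b, key, entry.value, entry.timestamp);
  for (const [key, entry] of b) applyUpdate(a, key, entry.value, entry.timestamp);
}

const serverA = new Map();
const serverB = new Map();
applyUpdate(serverA, 'X', 5, 1);   // one client writes X:=5 at Server A
applyUpdate(serverB, 'X', 7, 2);   // another writes X:=7 at Server B

// Until the replicas synchronize, clients can see inconsistent state.
const beforeSync = [serverA.get('X').value, serverB.get('X').value];
sync(serverA, serverB);
const afterSync = [serverA.get('X').value, serverB.get('X').value];
console.log(beforeSync, afterSync);
```

Both updates return immediately without coordination, which is the performance benefit. The cost is that the write X:=5 is silently discarded once the replicas converge, which is why some applications need a more careful merge policy.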

61 © 2013 A. Haeberlen, Z. Ives Recap: Consistency and partitions Use replication to mask limited # of faults Can achieve strong consistency by having replicas agree on a common request ordering Even non-crash faults can be handled, as long as there are not too many of them (typical limit: 1/3) Partition tolerance, availability, consistency? Can't have all three (CAP theorem) For some services, need to drop one (usually availability) If service works with weaker consistency guarantees, such as eventual consistency, can get a compromise (BASE) Example: Shopping cart 61 University of Pennsylvania

62 © 2013 A. Haeberlen, Z. Ives Plan for today Parallel programming and its challenges Parallelization and scalability, Amdahl's law Synchronization, consistency Mutual exclusion, locking, issues related to locking Architectures: SMP, NUMA, Shared-nothing All about the Internet in 30 minutes Structure; packet switching; some important protocols Latency, packet loss, bottlenecks, and why they matter Distributed programming and its challenges Faults, failures, and what we can do about them Network partitions, CAP theorem, relaxed consistency 62 University of Pennsylvania

63 © 2013 A. Haeberlen, Z. Ives Stay tuned Next time you will learn about: Cloud basics; Amazon AWS 63 University of Pennsylvania

