© 2013 A. Haeberlen, Z. Ives Internet Basics; Faults & Failures; Cloud Platforms NETS 212: Scalable & Cloud Computing Fall 2014 Z. Ives University of Pennsylvania.


1 © 2013 A. Haeberlen, Z. Ives Internet Basics; Faults & Failures; Cloud Platforms NETS 212: Scalable & Cloud Computing Fall 2014 Z. Ives University of Pennsylvania 1

2 © 2013 A. Haeberlen, Z. Ives Reminders Homework 1 Milestone 1 due Thursday @ 10PM If you’re still having issues with svn, please come to TA office hours! Homework 1 Milestone 2 due next Thursday @ 10PM University of Pennsylvania 2

3 © 2013 A. Haeberlen, Z. Ives Below HTTP: Routing University of Pennsylvania3

4 The Internet The Internet consists of tens of thousands of interconnected networks. Routers and switches forward the data from one network link to the next. Request and response travel along a path through these networks (usually, but not always, the 'shortest' path). [Figure: path from a client to a server in California across networks such as UPenn, Cogent, AT&T, Level 3, and Google; legend: router, switch, networks, individual network link, path.]

5 © 2013 A. Haeberlen, Z. Ives Packet switching Communication consists of packets Each packet traverses the path independently No dedicated connection like in the telephone network Packets are relatively small (typically up to 1,500 bytes) Why is this a good idea? University of Pennsylvania 5 Google UPenn Cogent AT&T Level 3 Server in California Client
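
As a rough illustration of the size limit mentioned above, the following Python sketch splits a message into chunks of at most 1,500 bytes. The function name `packetize` is hypothetical, and the sketch ignores packet headers, which in a real network count against the size limit.

```python
def packetize(data: bytes, max_payload: int = 1500) -> list:
    """Split data into consecutive chunks of at most max_payload bytes."""
    return [data[i:i + max_payload] for i in range(0, len(data), max_payload)]


message = b"x" * 4000                      # made-up 4,000-byte message
packets = packetize(message)
print([len(p) for p in packets])           # sizes of the individual packets
print(b"".join(packets) == message)        # reassembly restores the message
```

Because each chunk is self-contained, the chunks could in principle travel independently, which is the property the slide attributes to packets.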

6 IP addresses How do routers know where to send a packet? Each machine is assigned an IP address. Machines in the same network are given similar addresses, usually from an IP range (Example: Penn's IP range is 158.130.0.0/16). Each packet has a source and a destination address. Each router has a forwarding table that maps ranges to links over which packets in that range should be sent. [Figure: networks UPenn, Cogent, AT&T, Level 3, and Google with example addresses 173.194.34.104 and 158.130.53.72; IPv4 packet layout from bit 0 to bit 31 with a version field of 4 (indicates this is an IPv4 packet), source IP, destination IP, and data.]
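
The forwarding-table lookup described above can be sketched in Python with the standard `ipaddress` module. Only Penn's range 158.130.0.0/16 and the two example addresses come from the slide; the other prefixes and all link names are invented for illustration. The sketch also assumes the usual rule that the most specific matching range wins (longest-prefix match), which the slide does not spell out.

```python
import ipaddress

# Hypothetical forwarding table: address range -> outgoing link.
FORWARDING_TABLE = {
    ipaddress.ip_network("158.130.0.0/16"): "link-to-UPenn",
    ipaddress.ip_network("173.194.0.0/16"): "link-to-Google",   # assumed range
    ipaddress.ip_network("0.0.0.0/0"): "default-upstream",      # catch-all
}


def next_link(destination: str) -> str:
    """Return the link for the most specific range containing destination."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in FORWARDING_TABLE if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return FORWARDING_TABLE[best]


for ip in ("158.130.53.72", "173.194.34.104", "8.8.8.8"):
    print(ip, "->", next_link(ip))
```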

7 IP routing Networks exchange routing information. If a connection or router fails, this information is updated. Result: Global reachability. Any machine on the Internet can (in principle) communicate with any other machine. [Figure: graph of networks labeled A through N; networks announce "I know how to get to A" to their neighbors.]

8 © 2013 A. Haeberlen, Z. Ives Path properties: Bottleneck capacity How fast can we send data on our path? Limited by the bottleneck capacity What else does the available capacity depend on? Which links are usually the bottleneck links? University of Pennsylvania 8 Server Client Bottleneck

9 Path properties: Propagation delay
Speed of light: 299 792 458 m/s. Latency matters!
[ahae@ds01 ~]$ traceroute www.mpi-sws.org
traceroute to www.mpi-sws.org (139.19.1.156), 30 hops max, 60 byte packets
 1  SUBNET-46-ROUTER.seas.UPENN.EDU (158.130.46.1)  1.744 ms  2.134 ms  2.487 ms
 2  158.130.21.34 (158.130.21.34)  5.327 ms  5.395 ms  5.649 ms
 3  isc-uplink-2.seas.upenn.edu (158.130.128.2)  5.671 ms  5.825 ms  6.175 ms
 4  external3-core1.dccs.UPENN.EDU (128.91.9.2)  6.007 ms  6.283 ms  6.362 ms
 5  external-core2.dccs.upenn.edu (128.91.10.1)  6.830 ms  6.990 ms  7.080 ms
 6  local.upenn.magpi.net (216.27.100.73)  7.250 ms  3.429 ms  3.533 ms
 7  remote.internet2.magpi.net (216.27.100.54)  4.487 ms  3.002 ms  2.925 ms
 8  198.32.11.51 (198.32.11.51)  90.557 ms  90.806 ms  91.028 ms
 9  so-6-2-0.rt1.fra.de.geant2.net (62.40.112.57)  97.403 ms  97.473 ms  97.766 ms
10  dfn-gw.rt1.fra.de.geant2.net (62.40.124.34)  98.834 ms  98.890 ms  99.043 ms
11  xr-fzk1-te2-3.x-win.dfn.de (188.1.145.50)  100.627 ms  101.034 ms  101.387 ms
12  xr-kai1-te1-1.x-win.dfn.de (188.1.145.102)  103.985 ms  104.383 ms  104.528 ms
13  xr-saa1-te1-1.x-win.dfn.de (188.1.145.97)  103.636 ms  103.903 ms  104.139 ms
14  kr-0unisb.x-win.dfn.de (188.1.234.38)  103.983 ms  103.746 ms  103.853 ms
15  mpi2rz-hsrp2.net.uni-saarland.de (134.96.6.28)  104.469 ms  104.355 ms  104.491 ms
[ahae@ds01 ~]$
Annotations: the distance is ~6,270 km (one way); the three times reported per hop are round-trip times.
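
To see why propagation delay puts a floor under latency, the short Python calculation below computes the smallest possible round-trip time for the ~6,270 km one-way distance, using the speed of light in a vacuum from the slide. It is only a lower bound: signals in fiber travel more slowly, and real paths are longer than the straight-line distance, which is consistent with the ~104 ms measured in the trace.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458   # from the slide
ONE_WAY_DISTANCE_M = 6_270_000         # ~6,270 km, from the slide


def min_round_trip_ms(distance_m: float,
                      speed_m_per_s: float = SPEED_OF_LIGHT_M_PER_S) -> float:
    """Lower bound on round-trip time: there and back at the given speed."""
    return 2 * distance_m / speed_m_per_s * 1000


rtt_ms = min_round_trip_ms(ONE_WAY_DISTANCE_M)
print(f"minimum round-trip time: {rtt_ms:.1f} ms")
```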

10 © 2013 A. Haeberlen, Z. Ives Path properties: Queueing delay What if we send packets too quickly? Router stores the packets in a queue until it can send them Consequence : End-to-end delay increases Where does this matter? What if the router runs out of queue space? Packets are dropped and lost Other reasons why packets might be dropped? University of Pennsylvania 10
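
The queueing behavior above can be sketched as a tiny discrete-time simulation in Python. Everything here is a simplifying assumption rather than a model of a real router: time advances in ticks, the router sends a fixed number of packets per tick, and packets that arrive when the queue is full are dropped.

```python
def simulate_router(arrivals, sends_per_tick, queue_capacity):
    """Return (packets dropped, queue length after each tick)."""
    queued = 0
    dropped = 0
    backlog = []
    for arriving in arrivals:
        accepted = min(arriving, queue_capacity - queued)  # room left in queue
        dropped += arriving - accepted                     # the rest is lost
        queued += accepted
        queued -= min(queued, sends_per_tick)              # forward some packets
        backlog.append(queued)
    return dropped, backlog


# Made-up workload: a burst of 3 packets per tick for three ticks, then silence.
dropped, backlog = simulate_router([3, 3, 3, 0, 0], sends_per_tick=1,
                                   queue_capacity=4)
print("dropped:", dropped)
print("queue length per tick:", backlog)
```

A longer queue at the end of a tick means a later-arriving packet waits longer, which is the end-to-end delay increase the slide refers to.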

11 TCP Transmission Control Protocol (TCP) provides the abstraction of a reliable stream of bytes. Ensures packets are delivered to the application in correct order. Retransmits lost packets. Tracks available capacity and prevents packets from being sent too fast (congestion control). Prevents sender from overwhelming the receiver (flow control). [Figure: sender and receiver, with TCP running on top of IP; numbered data packets travel from sender to receiver, and acknowledgments (ACK 1, ACK 2) travel back.]

12 TCP congestion control How fast should the sender send? Problem: Available capacity is not known (and can vary). Solution: Congestion control. Maintain a congestion window (cwnd) of the maximum number of packets in flight. Slow start: Exponential increase until a threshold (ssthresh); increase cwnd by one packet for each incoming ACK. Congestion avoidance: Additive increase, multiplicative decrease (AIMD). [Figure: congestion window (cwnd) over time; the "slow start" phase (actually fast!) climbs to ssthresh, and each packet loss reduces cwnd by 50%.]
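
The shape of the congestion window over time can be reproduced with a small Python sketch. It is a deliberately simplified model, not real TCP: it works in whole round trips (one extra packet per ACK amounts to doubling per round trip), the rounds in which a loss occurs are supplied by hand, and on a loss the window is simply halved. The function name and parameters are hypothetical.

```python
def simulate_cwnd(rounds, ssthresh, loss_rounds):
    """Return the congestion window (in packets) at the start of each round."""
    cwnd = 1.0
    history = []
    for r in range(rounds):
        history.append(cwnd)
        if r in loss_rounds:
            ssthresh = max(cwnd / 2, 1.0)   # multiplicative decrease (-50%)
            cwnd = ssthresh
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)  # slow start: exponential growth
        else:
            cwnd += 1.0                     # congestion avoidance: additive increase
    return history


# Assumed scenario: ssthresh of 8 packets and a single loss in round 6.
print(simulate_cwnd(rounds=10, ssthresh=8, loss_rounds={6}))
```

The printed sequence rises quickly to the threshold, grows by one packet per round after that, and drops to half after the loss, matching the sawtooth pattern in the figure.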

13 Recap: The Internet in 30 minutes What is the Internet? Tens of thousands of interconnected networks. Technology: Packet switching (not like the telephone network!). How does the network matter to applications? Propagation delay → Good to be physically close to customer. Bottlenecks → Transfer speed is limited. Queueing delays, loss, reordering → Delay can vary. Network can partition → Problem for consistency/availability. Some of these can be taken care of by TCP.

14 © 2013 A. Haeberlen, Z. Ives What Can Go Wrong? University of Pennsylvania14

15 © 2013 A. Haeberlen, Z. Ives Complications in wide-area networks Communication is slower, less reliable Latencies are higher, more variable Bottleneck capacity is lower Packet loss, reordering, queueing delays Faults are more common Broken or malfunctioning nodes Network partitions University of Pennsylvania 15

16 Faults and failures Terminology: Fault: Some component is not working correctly. Failure: System as a whole is not working correctly. [Figure: a client sets X:=5 and a later client asks "What is X?" in three scenarios: correct (the answer is X=5), fault (masked) (one component holds X=3 but the answer is still X=5), and faults causing failure (the answer is X=3).]

17 © 2013 A. Haeberlen, Z. Ives Faults in distributed systems What could possibly go wrong? Node loses power Hard disk fails Administrator accidentally erases data Administrator configures node incorrectly Software bug triggers Network overloaded, drops lots of packets Hacker breaks into some of the nodes Disgruntled employee manipulates node Fire breaks out in data center where node resides Police confiscates node because of illegal activity... University of Pennsylvania 17

18 © 2013 A. Haeberlen, Z. Ives Common misconceptions about faults "Faults are rare exceptions" NO! At scale, faults are occurring all the time Stopping the system while handling the fault is NOT an option - system needs to continue despite the fault "Faulty machines always stop/crash" NO! There are many types of faults with different effects If your system is designed to handle only crash faults and another type of fault occurs, things can become very bad University of Pennsylvania 18

19 © 2013 A. Haeberlen, Z. Ives Types of faults Crash faults Node simply stops Examples: OS crash, power loss Rational behavior Owner manipulates node to increase profit Example: Lying about performance to get a sale Byzantine faults Arbitrary - faulty node could do anything (stop, tamper with data, tell lies, attack other nodes, send spam, spy on user...) Example: Node compromised by a hacker, data corruption, hardware defect... University of Pennsylvania 19

20 © 2013 A. Haeberlen, Z. Ives Example Byzantine fault University of Pennsylvania 20 http://status.aws.amazon.com/s3-20080720.html

21 Correlated faults A single problem can cause many faults. Overloaded machine crashes, increases load on other machines → domino effect. Bug is triggered in a program that is used on lots of machines. Hacker manages to break into many computers due to a shared vulnerability. Machines may be connected to the same power grid, cooled by the same A/C, managed by the same admin...

22 © 2013 A. Haeberlen, Z. Ives Recap: Faults and failures Faults happen all the time Hardware malfunction, software bug, manipulation, hacker break-ins, misconfiguration,... NOT a rare occurrence at scale - must design system to handle them All faults are NOT independent crash faults Faults can be correlated Rational and Byzantine faults are real Three common fault models: Crash fault model: Faulty machines simply stop Rational model: Machines manipulated by selfish owners Byzantine fault model: Faulty machines could do anything University of Pennsylvania 22

23 © 2013 A. Haeberlen, Z. Ives So what can we do? University of Pennsylvania23

24 © 2013 A. Haeberlen, Z. Ives What can we do? Prevention and avoidance Example: Prevent crashes with software verification Example: Provide incentives for participation Detection Example: Cross-check network's route announcements with other information to see whether it is lying, and hold it accountable if it is (e.g., sue for breach of contract) Masking Example: Store replicas of the data on multiple nodes; if data is lost or corrupted on one of them, we still have the other copies Mitigation University of Pennsylvania 24

25 Masking faults with replication Alice can store her data on both servers. Bob can get the data from either server. A single crash fault on a server does not lead to a failure. Availability is maintained. What about other types of faults, or multiple faults? [Figure: Alice and Bob, each connected to Server A and Server B.]

26 © 2013 A. Haeberlen, Z. Ives Problem: Maintaining consistency What if multiple clients are accessing the same set of replicas? Requests may be ordered differently by different replicas Result: Inconsistency! (remember race conditions?) For what types of requests can this happen? What do we need to do to maintain consistency? University of Pennsylvania 26 Server A Server B Alice Bob X:=5 X:=7 X:=5 X:=7
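
The inconsistency in the figure can be reproduced in a few lines of Python. This is only a toy model in which each server is a dictionary and the helper `apply_in_order` is a hypothetical name; the point is that the same two writes, applied in different orders, leave the replicas with different values.

```python
def apply_in_order(writes):
    """Apply (key, value) writes one after another and return the final state."""
    state = {}
    for key, value in writes:
        state[key] = value
    return state


alice_write = ("X", 5)
bob_write = ("X", 7)

server_a = apply_in_order([alice_write, bob_write])  # sees Alice first
server_b = apply_in_order([bob_write, alice_write])  # sees Bob first

print("Server A:", server_a)
print("Server B:", server_b)
print("consistent:", server_a == server_b)
```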

27 © 2013 A. Haeberlen, Z. Ives Types of consistency Strong consistency After an update completes, any subsequent access will return the updated value Weak consistency Updated value not guaranteed to be returned immediately, only after some conditions are met (inconsistency window) Eventual consistency A specific type of weak consistency If no new updates are made to the object, eventually all accesses will return the last updated value University of Pennsylvania 27

28 © 2013 A. Haeberlen, Z. Ives Example: Storage system Scenario: Replicated storage We have N nodes that can store data Data contains a monotonically increasing timestamp To write a value: Pick W replicas and write the value to each, using a fresh timestamp (say, the current wallclock time) To read a value: Pick R replicas and read the value from each Return the value with the highest timestamp If any replicas had a lower timestamp, send them the newer value University of Pennsylvania 28 X=3 v1 X=5 v2 X=2 v4 X=5 v2 Replica
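
The read and write procedure above can be sketched in Python as follows. Several details are assumptions made for illustration: the class and method names are invented, a simple counter stands in for the wallclock timestamp, and replicas are chosen at random. The example uses N=5, W=3, R=3; since R + W > N, every read set overlaps every write set, so a read always sees the newest value.

```python
import itertools
import random


class Replica:
    def __init__(self):
        self.value = None
        self.timestamp = 0


class ReplicatedStore:
    def __init__(self, n, w, r, seed=0):
        self.replicas = [Replica() for _ in range(n)]
        self.w, self.r = w, r
        self.clock = itertools.count(1)   # stand-in for a fresh wallclock time
        self.rng = random.Random(seed)

    def write(self, value):
        ts = next(self.clock)
        for rep in self.rng.sample(self.replicas, self.w):   # pick W replicas
            rep.value, rep.timestamp = value, ts

    def read(self):
        chosen = self.rng.sample(self.replicas, self.r)      # pick R replicas
        newest = max(chosen, key=lambda rep: rep.timestamp)
        for rep in chosen:                                   # repair stale replicas
            if rep.timestamp < newest.timestamp:
                rep.value, rep.timestamp = newest.value, newest.timestamp
        return newest.value


store = ReplicatedStore(n=5, w=3, r=3)
store.write("X=3")
store.write("X=5")
print(store.read())
```

With smaller R or W (for example R=1, W=1), a read may pick only replicas that missed the latest write and return a stale value, which is one way to end up with weaker consistency.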

29 © 2013 A. Haeberlen, Z. Ives Consensus Replicas need to agree on a single order in which to execute client requests How can we do this? Does the specific order matter? Problem: What if some replicas are faulty? Crash fault: Replica does not respond; no progress (bad) Byzantine fault: Replica might tell lies, corrupt order (worse) Solution: Consensus protocol Paxos (for crash faults), PBFT (for Byzantine faults) Works as long as no more than a certain fraction of the replicas are faulty (PBFT: one third) University of Pennsylvania 29

30 © 2013 A. Haeberlen, Z. Ives How do consensus protocols work? Idea: Correct replicas 'outvote' faulty ones Clients send requests to each of the replicas Replicas coordinate and each return a result Client chooses one of the results, e.g., the one that is returned by the largest number of replicas If a small fraction of the replicas returns the wrong result, or no result at all, they are 'outvoted' by the other replicas University of Pennsylvania 30
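
The client-side "outvoting" step can be illustrated with a short Python sketch. It covers only the final choice among replies and none of the coordination between replicas that protocols such as Paxos or PBFT perform. The function name is hypothetical, and `None` is used here to represent a replica that returned no result.

```python
from collections import Counter


def choose_result(replies):
    """Return (value, votes) for the value returned by the most replicas."""
    answered = [reply for reply in replies if reply is not None]
    if not answered:
        raise ValueError("no replica answered")
    value, votes = Counter(answered).most_common(1)[0]
    return value, votes


# Made-up replies: three correct replicas, one wrong result, one silent replica.
value, votes = choose_result([5, 5, 3, None, 5])
print(f"chose {value} with {votes} of 5 replicas agreeing")
```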

31 © 2013 A. Haeberlen, Z. Ives What If the Network Breaks? University of Pennsylvania31

32 Network partitions Network can partition: Hardware fault, router misconfigured, undersea cable cut,... Result: Global connectivity is lost. What does this mean for the properties of our system? [Figure: Alice, Bob, Server A, and Server B, with one link marked "What if this link breaks?"]

33 © 2013 A. Haeberlen, Z. Ives The CAP theorem What we want from a web system: Consistency: All clients share the same view of the data, even in the presence of concurrent updates Availability: All clients can access at least one replica of the data, even when faults occur Partition-tolerance: Consistency and availability hold even when the network partitions Can we get all three? CAP theorem: We can get at most two out of the three Which ones should we choose for a given system? Conjecture by Brewer; proven by Gilbert and Lynch University of Pennsylvania 33

34 © 2013 A. Haeberlen, Z. Ives Common CAP choices Example #1: Consistency & Partition tolerance Many replicas + consensus protocol Do not accept new write requests during partitions Certain functions may become unavailable Example #2: Availability & Partition tolerance Many replicas + relaxed consistency Continue accepting write requests Clients may see inconsistent state during partitions University of Pennsylvania 34

35 Relaxed consistency: ACID vs. BASE Classical database systems: ACID semantics (Atomicity, Consistency, Isolation, Durability). Modern Internet systems: BASE semantics (Basically Available, Soft state, Eventually consistent).

36 © 2013 A. Haeberlen, Z. Ives Recap: Consistency and partitions Use replication to mask limited # of faults Can achieve strong consistency by having replicas agree on a common request ordering Even non-crash faults can be handled, as long as there are not too many of them (typical limit: 1/3) Partition tolerance, availability, consistency? Can't have all three (CAP theorem) For some services, need to drop one (usually availability) If service works with weaker consistency guarantees, such as eventual consistency, can get a compromise (BASE) Example: Shopping cart University of Pennsylvania 36

37 © 2013 A. Haeberlen, Z. Ives Cloud Computing University of Pennsylvania37

38 © 2013 A. Haeberlen, Z. Ives History: The early days Cloud computing: A new term for a concept that has been around since the 1960s Who invented it? No agreement. Some candidates: John McCarthy (Stanford professor and inventor of Lisp; proposed the 'service bureau' model in 1961) J.C.R. Licklider (contributed key ideas to ARPANET; published a memo on the "Intergalactic Computer Network" in 1963) Douglas Parkhill (published a book on "The Challenge of the Computer Utility" in 1966) University of Pennsylvania 38

39 History: Becoming a cloud provider
Early 2000s: Phenomenal growth of web services. Many large Internet companies deploy huge data centers and develop scalable software infrastructure to run them. Due to economies of scale, these companies were now able to run computation very cheaply. What else can we do with this?
Technology     | Cost in medium DC (~1,000 servers) | Cost in large DC (~50,000 servers) | Ratio
Network        | $95 per Mbit/sec/month             | $13 per Mbit/sec/month             | 7.1
Storage        | $2.20 per GByte/month              | $0.40 per GByte/month              | 5.7
Administration | ~140 servers/admin                 | >1,000 servers/admin               | 7.1
Source: James Hamilton's Keynote, LADIS 2008

40 History: Incentives Idea: Use your existing data center to provide cloud services. Why is this a good idea? Make a lot of money: Price advantage of 3x-7x → Can offer services much cheaper than a medium-size company and still make a profit. Leverage existing investment: New revenue stream at low incremental cost (example: many Amazon AWS technologies were initially developed for Amazon's internal operations). Defend a franchise: Example: Microsoft enterprise + development apps → Microsoft Azure.

41 History: Incentives (continued) Attack an incumbent: A company with the requisite datacenter may want to establish a 'beachhead' before an '800-pound gorilla' emerges. Leverage existing customer relationships: IT service organizations like IBM Global Services have extensive customer relationships; provide an anxiety-free migration path to existing customers. Become a platform: Example: Facebook's initiative to enable plug-in applications is a great fit for cloud computing.

42 © 2013 A. Haeberlen, Z. Ives History: The pioneers Jul 2002: Amazon Web Services launched Third-party sites can search and display products from Amazon's web site, add items to Amazon shopping carts Available through XML and SOAP Mar 2006: Amazon S3 launched Innovative 'pay-per-use' pricing model, which is now the standard in cloud computing Cheaper than many small/medium storage solutions: $0.15/GB/month of storage, $0.20/GB/month for traffic Amazon no longer a pure retailer, entering technology space Aug 2006: EC2 launched Core computing infrastructure becomes available University of Pennsylvania 42

43 © 2013 A. Haeberlen, Z. Ives History: Wide-spread adoption Apr 2008: Google App Engine launched Same building blocks Google uses for its own applications: Bigtable and GFS for storage, automatic scaling and load balancing,... Nov 2009: Windows Azure Beta launched Becomes generally available in 21 countries in Feb 2010 Microsoft’s online services are gradually transitioning to Azure Dec 2013: Google Compute Engine launched Provides lower level support vs. App Engine, gives full set of services Dramatically lower prices, quickly matched by AWS and Azure University of Pennsylvania 43

44 © 2013 A. Haeberlen, Z. Ives One Set of Cloud Services: Amazon Web Services University of Pennsylvania44

45 Why Amazon AWS and not [insert your favorite cloud here]? Amazon is only one of several cloud providers. Others include Microsoft Azure, Google Compute Engine / App Engine,... There is no common standard (yet). Initially, MS and Google supported PaaS (.NET and Java, resp.). Gradually each has grown to support both IaaS and PaaS. AWS is PaaS/IaaS with a broad menu of choices. So we had to pick one specific provider. Amazon AWS is going to be used for the rest of this class. Amazon's only involvement is providing free AWS cycles/storage. Everything we do on AWS has an equivalent on Azure and GCE/GAE.

46 © 2013 A. Haeberlen, Z. Ives What is Amazon AWS? Amazon Web Services (AWS) provides a number of different services, including: Amazon Elastic Compute Cloud (EC2) Virtual machines for running custom software Amazon Simple Storage Service (S3) Simple key-value store, accessible as a web service Amazon DynamoDB Distributed “NoSQL” database, one of several in AWS Amazon Elastic MapReduce Scalable MapReduce computation Amazon Mechanical Turk (MTurk) A 'marketplace for work' Amazon CloudFront Content delivery network... University of Pennsylvania 46 Used for the projects

47 Setting up an AWS account Sign up for an account on aws.amazon.com. You need to choose a username and a password. These are for the management interface only. Your programs will use other credentials (RSA keypairs, access keys,...) to interact with AWS.

48 AWS credentials Why so many different types of credentials? [Figure: credential types (sign-in credentials, X.509 certificates, EC2 key pairs, access keys) matched to where they are used (AWS web site and management console; command-line tools; SOAP APIs; REST APIs; connecting to an instance, e.g., via ssh).]

49 © 2013 A. Haeberlen, Z. Ives The AWS management console Used to control many AWS services: For example, start/stop EC2 instances, create S3 buckets... University of Pennsylvania 49

50 REST and SOAP How do your programs access AWS? Via the REST or SOAP protocols. Example: Launch an EC2 instance, store a value in S3,... Simple Object Access Protocol (SOAP): Not as simple as the name suggests. XML-based, extensible, general, standardized, but also somewhat heavyweight and verbose. Increasingly deprecated (e.g., for SimpleDB and EC2). Representational State Transfer (REST): Much simpler to develop than SOAP. Web-specific; lack of standards.

51 Example: REST
Sample request (invoked method, parameters, and credentials):
https://sdb.amazonaws.com/?Action=PutAttributes
&DomainName=MyDomain
&ItemName=Item123
&Attribute.1.Name=Color&Attribute.1.Value=Blue
&Attribute.2.Name=Size&Attribute.2.Value=Med
&Attribute.3.Name=Price&Attribute.3.Value=0014.99
&AWSAccessKeyId=
&Version=2009-04-15
&Signature=[valid signature]
&SignatureVersion=2
&SignatureMethod=HmacSHA256
&Timestamp=2010-01-25T15%3A01%3A28-07%3A00
Sample response (response elements; XML markup not preserved): status Success, request ID f6820318-9658-4a9d-89f8-b067c90904fc, box usage 0.0000219907
Source: http://awsdocs.s3.amazonaws.com/SDB/latest/sdb-dg.pdf

52 Example: SOAP
Sample request (envelope opening tag; the markup of the inner elements is not preserved):
<SOAP-ENV:Envelope xmlns:SOAP-ENV='http://schemas.xmlsoap.org/soap/envelope/' xmlns:SOAP-ENC='http://schemas.xmlsoap.org/soap/encoding/' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns:xsd='http://www.w3.org/2001/XMLSchema'>
Request values: a1 2 a2 4 domain1 eID001 2009-04-15
Sample response values: 4c68e051-fe45-43b2-992a-a24017ffe7ab 0.0000219907
Source: http://awsdocs.s3.amazonaws.com/SDB/latest/sdb-dg.pdf

53 © 2013 A. Haeberlen, Z. Ives Plan for today A brief history of cloud computing Introduce one specific commercial cloud Amazon Web Services (AWS) Elastic Compute Cloud (EC2) Elastic Block Storage (EBS) Other services: Mechanical Turk, CloudFront,... Next time: S3 and SimpleDB University of Pennsylvania 53 NEXT

54 What is Amazon EC2? Infrastructure-as-a-Service (IaaS). You can rent various types of virtual machines by the hour. In your VMs, you can run your own (Linux/Windows) programs. Examples: Web server, search engine, movie renderer,... [Figure: pricing table excerpt from http://aws.amazon.com/ec2/#pricing (9/11/2013); highlighted instance types: 1.7 GB memory, 1 virtual core (1 CU each), 160 GB storage, 'moderate' I/O; and 68.4 GB memory, 8 virtual cores (3.25 CU each), 1690 GB storage, 'high' I/O.]

55 © 2013 A. Haeberlen, Z. Ives Demo Logging into AWS Management Console Launching an instance Contacting the instance via ssh Terminating an instance Have a look at the AWS Getting Started guide: http://www.cis.upenn.edu/~nets212/handouts/aws-getting-started.pdf University of Pennsylvania 55

56 Oh no - where has my data gone? EC2 instances do not have persistent storage. Data survives stops & reboots, but not termination. If you store data on the virtual hard disk of your instance and the instance fails or you terminate it, your data WILL be lost! So where should I put persistent data? Elastic Block Store (EBS) - in a few slides. Ideally, use an AMI with an EBS root (Amazon's default AMI has this property).

57 © 2013 A. Haeberlen, Z. Ives Amazon Machine Images When I launch an instance, what software will be installed on it? Software is taken from an Amazon Machine Image (AMI) Selected when you launch an instance Essentially a file system that contains the operating system, applications, and potentially other data Lives in S3 How do I get an AMI? Amazon provides several generic ones, e.g., Amazon Linux, Fedora Core, Windows Server,... You can make your own You can even run your own custom kernel (with some restrictions) University of Pennsylvania 57

58 © 2013 A. Haeberlen, Z. Ives Security Groups Basically, a set of firewall rules Can be applied to groups of EC2 instances Each rule specifies a protocol, port numbers, etc... Only traffic matching one of the rules is allowed through Sometimes need to explicitly open ports University of Pennsylvania 58 Instance Evil attacker Legitimate user (you or your customers)
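
The "only traffic matching one of the rules is allowed through" behavior can be sketched in Python. This is a simplified model, not the AWS API: the `Rule` type and `is_allowed` function are hypothetical, and a rule here carries only a protocol and a port range, whereas real security group rules have additional fields such as the permitted source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    protocol: str
    port_from: int
    port_to: int


def is_allowed(rules, protocol, port):
    """Traffic passes only if at least one rule matches; the default is deny."""
    return any(rule.protocol == protocol and rule.port_from <= port <= rule.port_to
               for rule in rules)


# Example group that opens ssh (22) and HTTP (80) only.
web_group = [Rule("tcp", 22, 22), Rule("tcp", 80, 80)]

print(is_allowed(web_group, "tcp", 22))    # ssh is explicitly opened
print(is_allowed(web_group, "tcp", 3306))  # no rule for this port
```

This also shows why ports sometimes need to be opened explicitly: a service listening on a port without a matching rule is unreachable from outside.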

59 Regions and Availability Zones Where exactly does my instance run? No easy way to find out - Amazon does not say. Instances can be assigned to regions. Currently 9 available: US East (Northern Virginia), US West (Northern California), US West (Oregon), EU (Ireland), Asia/Pacific (Singapore), Asia/Pacific (Sydney), Asia/Pacific (Tokyo), South America (Sao Paulo), AWS GovCloud. Important, e.g., for reducing latency to customers. Instances can be assigned to availability zones. Purpose: Avoid correlated faults. Several availability zones within each region.

60 © 2013 A. Haeberlen, Z. Ives Network pricing AWS does charge for network traffic Price depends on source and destination of traffic Free within EC2 and other AWS svcs in same region (e.g., S3) Remember: ISPs are typically charged for upstream traffic University of Pennsylvania 60 http://aws.amazon.com/ec2/#pricing (9/11/2013)

61 © 2013 A. Haeberlen, Z. Ives Instance types So far: On-demand instances Also available: Reserved instances One-time reservation fee to purchase for 1 or 3 years Usage still billed by the hour, but at a considerable discount Also available: Spot instances Spot market: Can bid for available capacity Instance continues until terminated or price rises above bid University of Pennsylvania 61 Source: http://aws.amazon.com/ ec2/reserved-instances/
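
The spot-instance rule above (the instance continues until it is terminated or the price rises above the bid) can be illustrated with a small Python sketch. The hourly prices are made up, the function name is hypothetical, and the model ignores details of how the real spot market sets prices and bills partial hours.

```python
def hours_until_terminated(bid, hourly_prices):
    """Count the hours an instance runs before the spot price exceeds the bid."""
    for hour, price in enumerate(hourly_prices):
        if price > bid:
            return hour          # price rose above the bid: instance is stopped
    return len(hourly_prices)    # the bid held for the whole period


# Hypothetical spot prices in dollars per hour.
prices = [0.05, 0.07, 0.09, 0.12, 0.06]
print(hours_until_terminated(bid=0.10, hourly_prices=prices))
print(hours_until_terminated(bid=0.20, hourly_prices=prices))
```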

62 © 2013 A. Haeberlen, Z. Ives Service Level Agreement University of Pennsylvania 62 http://aws.amazon.com/ec2-sla/ (9/11/2013; excerpt) 4.38h downtime per year allowed
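
The 4.38 hours of allowed downtime per year can be translated into an availability percentage with a short Python calculation. The conversion assumes a 365-day year; the resulting percentage is derived from the downtime figure on the slide rather than quoted from the SLA text, which is not reproduced in this transcript.

```python
HOURS_PER_YEAR = 365 * 24   # 8,760 hours, ignoring leap years


def availability_percent(downtime_hours_per_year):
    """Share of the year the service is up, as a percentage."""
    return (1 - downtime_hours_per_year / HOURS_PER_YEAR) * 100


def allowed_downtime_hours(availability_pct):
    """Inverse: yearly downtime budget for a given availability target."""
    return (1 - availability_pct / 100) * HOURS_PER_YEAR


print(f"{availability_percent(4.38):.2f}% availability")
print(f"{allowed_downtime_hours(99.95):.2f} hours of downtime per year")
```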

63 © 2013 A. Haeberlen, Z. Ives Recap: EC2 What EC2 is: IaaS service - you can rent virtual machines Various types: Very small to very powerful How to use EC2: Ephemeral state - local data is lost when instance terminates AMIs - used to initialize an instance (OS, applications,...) Security groups - "firewalls" for your instances Regions and availability zones On-demand/reserved/spot instances Service level agreement (SLA) University of Pennsylvania 63


65 © 2013 A. Haeberlen, Z. Ives What is Elastic Block Store (EBS)? Persistent storage Unlike the local instance store, data stored in EBS is not lost when an instance fails or is terminated Should I use the instance store or EBS? Typically, instance store is used for temporary data University of Pennsylvania 65 Instance EBS storage

66 © 2013 A. Haeberlen, Z. Ives Volumes EBS storage is allocated in volumes A volume is a 'virtual disk' (size: 1GB - 1TB) Basically, a raw block device Can be attached to an instance (but only one at a time) A single instance can access multiple volumes Placed in specific availability zones Why is this useful? Be sure to place it near instances (otherwise can't attach) Replicated across multiple servers Data is not lost if a single server fails Amazon: Annual failure rate is 0.1-0.5% for a 20GB volume University of Pennsylvania 66

67 © 2013 A. Haeberlen, Z. Ives EC2 instances with EBS roots EC2 instances can have an EBS volume as their root device ("EBS boot") Result: Instance data persists independently from the lifetime of the instance You can stop and restart the instance, similar to suspending and resuming a laptop You won't be charged for the instance while it is stopped (only for EBS) You can enable termination protection for the instance Blocks attempts to terminate the instance (e.g., by accident) until termination protection is disabled again Alternative: Use instance store as the root You can still store temporary data on it, but it will disappear when you terminate the instance You can still create and mount EBS volumes explicitly University of Pennsylvania 67

68 Snapshots You can create a snapshot of a volume. Copy of the data in the volume at the time the snapshot was made. Only the first snapshot makes a full copy; subsequent snapshots are incremental. What are snapshots good for? Sharing data with others: the DBpedia snapshot ID is "snap-882a8ae3"; access control list (specific account numbers) or public access. Instantiate new volumes. Point-in-time backups. [Figure: snapshots taken along a time axis.]

69 © 2013 A. Haeberlen, Z. Ives Pricing You pay for... Storage space: $0.10 per allocated GB per month I/O requests: $0.10 per million I/O requests S3 operations (GET/PUT) Charge is only for actual storage used Empty space does not count University of Pennsylvania 69
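
As a worked example of the two volume charges listed above, the Python sketch below estimates a monthly bill from the allocated size and the number of I/O requests. The prices are the ones on the slide; the workload numbers and the function name are invented, and snapshot storage is left out.

```python
PRICE_PER_GB_MONTH = 0.10       # dollars per allocated GB per month (from the slide)
PRICE_PER_MILLION_IO = 0.10     # dollars per million I/O requests (from the slide)


def ebs_monthly_cost(allocated_gb, io_requests):
    """Estimated monthly volume cost in dollars: storage plus I/O."""
    storage = allocated_gb * PRICE_PER_GB_MONTH
    io = io_requests / 1_000_000 * PRICE_PER_MILLION_IO
    return storage + io


# Hypothetical workload: a 100 GB volume serving 50 million I/O requests.
print(f"${ebs_monthly_cost(100, 50_000_000):.2f} per month")
```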

70 © 2013 A. Haeberlen, Z. Ives Creating an EBS volume University of Pennsylvania 70 Needs to be in same availability zone as your instance! DBpedia snapshot ID Create volume

71 Mounting an EBS volume
Step 1: Attach the volume.
mkse212@vm:~$ ec2-attach-volume -d /dev/sda2 -i i-9bd6eef1 vol-cca68ea5
ATTACHMENT vol-cca68ea5 i-9bd6eef1 /dev/sda2 attaching
mkse212@vm:~$
Step 2: Mount the volume in the instance.
mkse212@vm:~$ ssh ec2-user@ec2-50-17-64-130.compute-1.amazonaws.com
       __|  __|_  )  Amazon Linux AMI
       _|  (     /     Beta
      ___|\___|___|
See /usr/share/doc/system-release-2011.02 for latest release notes. :-)
[ec2-user@ip-10-196-82-65 ~]$ sudo mount /dev/sda2 /mnt/
[ec2-user@ip-10-196-82-65 ~]$ ls /mnt/
dbpedia_3.5.1.owl dbpedia_3.5.1.owl.bz2 en other_languages
[ec2-user@ip-10-196-82-65 ~]$

72 Detaching an EBS volume
Step 1: Unmount the volume in the instance.
[ec2-user@ip-10-196-82-65 ~]$ sudo umount /mnt/
[ec2-user@ip-10-196-82-65 ~]$ exit
mkse212@vm:~$
Step 2: Detach the volume.
mkse212@vm:~$ ec2-detach-volume vol-cca68ea5
ATTACHMENT vol-cca68ea5 i-9bd6eef1 /dev/sda2 detaching
mkse212@vm:~$

73 © 2013 A. Haeberlen, Z. Ives Recap: Elastic Block Store (EBS) What EBS is: Basically a virtual hard disk; can be attached to EC2 instances Persistent - state survives termination of EC2 instance How to use EBS: Allocate volume - empty or initialized with a snapshot Attach it to EC2 instance and mount it there Can create snapshots for data sharing, backup University of Pennsylvania 73


75 AWS Import/Export
Import/export large amounts of data to/from S3 buckets via a physical storage device. Mail an actual hard disk to Amazon (power adapter, cables!). Signature file for authentication. Discussion: Is this the Right Way to be shipping data, or should we rather be using a network?
Time to transfer 10TB [AF10]:
Method            | Time
Internet (20Mbps) | 45 days
FedEx             | 1 day
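
The network transfer time in the table can be checked with a short Python calculation. It assumes decimal units (10 TB = 10 × 10^12 bytes, 20 Mbps = 20 × 10^6 bits per second) and a link that is fully used the whole time; under those assumptions the result comes out slightly above the 45 days quoted.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(size_bytes, rate_bits_per_s):
    """Days needed to push size_bytes through a link of the given bit rate."""
    return size_bytes * 8 / rate_bits_per_s / SECONDS_PER_DAY


days = transfer_days(size_bytes=10e12, rate_bits_per_s=20e6)
print(f"{days:.1f} days")
```

Compared with roughly one day for a courier, this is why shipping disks can beat the network for very large one-off transfers.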

76 © 2013 A. Haeberlen, Z. Ives Mechanical Turk (MTurk) A crowdsourcing marketplace Requesters post small jobs (HIT - Human Intelligence Task), offer small rewards ($0.01-$0.10) University of Pennsylvania 76 https://www.mturk.com/mturk/ (9/23/2010 1:58am)

77 © 2013 A. Haeberlen, Z. Ives CloudFront Content distribution network Caches S3 content at edge locations for low-latency delivery Some similarities to other CDNs like Akamai, Limelight,... University of Pennsylvania 77


79 © 2013 A. Haeberlen, Z. Ives Stay tuned Next time you will learn about: Cloud storage University of Pennsylvania 79

