Presentation is loading. Please wait.

Presentation is loading. Please wait.

Cloud Computing Part #1 1. 2

Similar presentations


Presentation on theme: "Cloud Computing Part #1 1. 2"— Presentation transcript:

1 Cloud Computing Part #1 1

2 2 http://www.digitaltrends.com/wp-content/uploads/2011/09/Cloud-Computing.jpg

3 Computing history (1)  Abacus 2700–2300 BC 3 http://upload.wikimedia.org/wikipedia/commons/e/ea/Boulier1.JPG http://retrocalculators.com/abacus_files/Wooden_Abacus_Russian_Wood_Schoty.jpg

4 Computing history (2)  Babbage computer 1834 - Charles Babbage 4 http://members.peak.org/~jeremy/superlative/pix/babbageMachine.jpg

5 Computing history (3)  Z1 computer Konrad Zuse, 1936 22-bit floating point Z2, Z3, … Z5 Plankalkul (ALGOL) 5 http://www.yorku.ca/lbianchi/sts3700b/z1-vb2.jpg

6 Computing history (4)  Bell 1 1940 9000 relays, 90 m 2, 10 t  Mark 1 1944 Equations  ENIAC 1946 18000 lamps, 90 × 15 m 2, 30t, 150 kW 100 kHz, + for 0.2 ms, * for 2.8 ms 6 http://mathsci.ucd.ie/~plynch/eniac/ENIAC.jpg

7 Computing history (5)  Philco-2000 1955 56000 transistors, 1200 diodes, (450 lamps) + for 1,7 microseconds, * for 40,3  CDC 6600 1960 169000 transistors 100 MHz 7 http://upload.wikimedia.org/wikipedia/commons/thumb/c/c4/CDC_6600.jc.jpg/800px-CDC_6600.jc.jpg

8 Computing history (6)  System-360 1964, First integral DOS, OS/360  Intel 8008 1972 8 bit  Intel 8088  PC XT -> PC AT (80286) 8 http://www.wired.com/images/article/full/2008/04/ibm_360_500px.jpg

9 Performance progress (1)  2010: 2.57 petaflops  2005: 280.6 teraflops  2000: 4.94 teraflops  1995: 170 gigaflops  15,100 times faster  1,650 times faster  19 times faster  The baseline 9 http://royal.pingdom.com/2010/12/02/incredible-growth-supercomputing-performance-1995-2010/

10 Performance progress (2)  In 2010, we measure the performance of the fastest supercomputers in petaflops (quadrillions of operations per second). In 1995, we used gigaflops (billions of operations per second). We are now using the scale a million times larger than we did 15 years ago. 10

11 Tasks and computers  Need for performance Amount of the data Resolution / quality / complexity  Growing demand More online users More applications running 11

12 Scaling thing (1)  Personal computer Simple, personal computing tasks 12 http://a57.foxnews.com/global.fncstatic.com/static/managed/img/Health/2009/July/660/371/COMPUTER-GIRL_640.jpg?ve=1

13 Scaling thing (2)  Network Common tasks, resources 13 http://www.lucartech.com/images/Services_network.jpg

14 Scaling thing (3)  Cluster Processing power, large IO 14 http://www.biomedcentral.com/content/figures/1471-2105-11-217-1-l.jpg http://upload.wikimedia.org/wikipedia/commons/thumb/c/c5/MEGWARE.CLIC.jpg/300px-MEGWARE.CLIC.jpg

15 Scaling thing (4)  Cloud The topic we will speak about… 15 http://www.bluesci.org/wordpress/wp-content/uploads/2011/09/Sevensheaven_illustration-Cloud_Computing.jpg

16 Cloud computing (1) 16 http://en.wikipedia.org/wiki/File:Cloud_computing.svg

17 Cloud computing (2)  Grid computing  SOA  Client-server distributed application that distinguishes between service providers (servers) and service requesters (clients)  Peer-to-peer distributed architecture without the need for central coordination 17

18 5 essential characteristics  On-demand self-service  Broad network access  Resource pooling  Rapid elasticity  Measured service 18

19 Service models  Infrastructure (IaaS)  Platform (PaaS)  Software (SaaS)  Network (NaaS)  Database (DBaaS) 19 http://upload.wikimedia.org/wikipedia/commons/3/3c/Cloud_computing_layers.png

20 Deployment models  Public cloud  Community cloud  Hybrid cloud  Private cloud 20 http://upload.wikimedia.org/wikipedia/commons/8/87/Cloud_computing_types.svg

21 Comparison for SaaS CriteriaPublic cloud Private cloud Initial costTypically zeroTypically high Running costPredictableUnpredictable CustomizationImpossiblePossible Privacy No (Host has access to the data) Yes Single sign-onImpossiblePossible Scaling up Easy while within defined limits Laborious but no limits 21

22 Virtualization (1)  VM technology allows multiple virtual machines to run on a single physical machine 22

23 Virtualization (2) Advantages of virtual machines: Run operating systems where the physical hardware is unavailable; Easier to create new machines, backup machines, etc.; Software testing using “clean” installs of operating systems and software; Emulate more machines than are physically available; Timeshare lightly loaded systems on one host, Debug problems (suspend and resume the problem machine); Easy migration of virtual machines (shutdown needed or not); Run legacy systems! 23

24 Advantages of Cloud Computing (1)  Lower computer costs  Improved performance  Reduced software costs  Instant software updates  Improved document format compatibility 24

25 Advantages of Cloud Computing (2)  Unlimited storage capacity  Increased data reliability  Universal document access  Latest version availability  Easier group collaboration  Device independence 25

26 Disadvantages of Cloud Computing (1)  Requires a constant Internet connection  Does not work well with low-speed connections  Features might be limited 26

27 Disadvantages of Cloud Computing (2)  Can be slow  Stored data might not be secure  Stored data can be lost  Compatibility for clouds/DB/etc. 27

28 28 http://www.treloarphysio.com/blog/wp-content/uploads/2012/02/relax-relaxing-8925208-1024-768.jpg

29 What is Cloud Computing? 1. Web-scale problems 2. Large data centers 3. Different models of computing 4. Highly-interactive Web applications 29

30 1. Web-Scale Problems  Characteristics: Definitely data-intensive May also be processing intensive  Examples: Crawling, indexing, searching, mining the Web “Post-genomics” life sciences research Other scientific data (physics, astronomers, etc.) Sensor networks Web 2.0 applications … 30

31 How much data?  Wayback Machine has 2 PB + 20 TB/month (2006)  Google processes 20 PB a day (2008)  “all words ever spoken by human beings” ~ 5 EB  NOAA has ~1 PB climate data (2007)  CERN’s LHC will generate 15 PB a year (2008) 640K ought to be enough for anybody. 31

32 Maximilien Brice, © CERN 32

33 Maximilien Brice, © CERN 33

34 What to do with more data?  Answering factoid questions Pattern matching on the Web Works amazingly well  Learning relations Start with seed instances Search for patterns on the Web Using patterns to find more instances Who shot Abraham Lincoln?  X shot Abraham Lincoln Birthday-of(Mozart, 1756) Birthday-of(Einstein, 1879) Wolfgang Amadeus Mozart (1756 - 1791) Einstein was born in 1879 PERSON (DATE – PERSON was born in DATE (Brill et al., TREC 2001; Lin, ACM TOIS 2007) (Agichtein and Gravano, DL 2000; Ravichandran and Hovy, ACL 2002; … ) 34

35 2. Large Data Centers  Web-scale problems? Throw more machines at it!  Clear trend: centralization of computing resources in large data centers Necessary ingredients: fiber, juice, and space  Important Issues: Redundancy Efficiency Utilization Management 35

36 Source: Harper’s (Feb, 2008) 36

37 Maximilien Brice, © CERN 37

38 Key Technology: Virtualization Hardware Operating System App Traditional Stack Hardware OS App Hypervisor OS Virtualized Stack 38

39 3. Different Computing Models  Utility computing Why buy machines when you can rent cycles? Examples: Amazon’s EC2, GoGrid, AppNexus  Platform as a Service (PaaS) Give me nice API and take care of the implementation Example: Google App Engine, Heroku  Software as a Service (SaaS) Just run it for me! Example: Gmail “Why do it yourself if you can pay someone to do it for you?” 39

40 4. Web Applications  What is the nature of software applications? From the desktop to the browser SaaS = Web-based applications Examples: Google Maps, Facebook  How do we deliver highly-interactive Web-based applications? AJAX (asynchronous JavaScript and XML) For better, or for worse… 40

41 What is the course about?  MapReduce: the “back-end” of cloud computing Batch-oriented processing of large datasets  Ajax: the “front-end” of cloud computing Highly-interactive Web-based applications  Computing “in the clouds” Amazon’s EC2/S3 as an example of utility computing 41

42 Amazon Web Services  Elastic Compute Cloud (EC2) Rent computing resources by the hour Basic unit of accounting = instance-hour Additional costs for bandwidth  Simple Storage Service (S3) Persistent storage Charge by the GB/month Additional costs for bandwidth 42

43 Simple Storage Service  Pay for what you use: $0.20 per GByte of data transferred, $0.15 per GByte-Month for storage used, Second Life Update: ○ 1TBytes, 40,000 downloads in 24 hours - $200 43

44 Some cloud providers 44

45 Cloud Computing Zen  Don’t get frustrated (take a deep breath)… This is bleeding edge technology Those W$*#T@F! moments  Be patient… This is the second first time I’ve taught this course  Be flexible… There will be unanticipated issues along the way  Be constructive… Tell me how I can make everyone’s experience better 45

46 Source: Wikipedia 46

47 Web-Scale Problems?  Don’t hold your breath: Biocomputing Nanocomputing Quantum computing …  It all boils down to… Divide-and-conquer Throwing more hardware at the problem Simple to understand… a lifetime to master… 47

48 Divide and Conquer “Work” w1w1 w2w2 w3w3 r1r1 r2r2 r3r3 “Result” “worker” Partition Combine 48

49 Different Workers  Different threads in the same core  Different cores in the same CPU  Different CPUs in a multi-processor system  Different machines in a distributed system 49

50 Choices, Choices, Choices  Commodity vs. “exotic” hardware  Number of machines vs. processor vs. cores  Bandwidth of memory vs. disk vs. network  Different programming models 50

51 Flynn’s Taxonomy Instructions Single (SI)Multiple (MI) Data Multiple (MD) SISD Single-threaded process MISD Pipeline architecture SIMD Vector Processing MIMD Multi-threaded Programming Single (SD) 51

52 SISD DDDDDDD Processor Instructions 52

53 SIMD D0D0 Processor Instructions D0D0 D0D0 D0D0 D0D0 D0D0 D1D1 D2D2 D3D3 D4D4 … DnDn D1D1 D2D2 D3D3 D4D4 … DnDn D1D1 D2D2 D3D3 D4D4 … DnDn D1D1 D2D2 D3D3 D4D4 … DnDn D1D1 D2D2 D3D3 D4D4 … DnDn D1D1 D2D2 D3D3 D4D4 … DnDn D1D1 D2D2 D3D3 D4D4 … DnDn D0D0 53

54 MIMD DDDDDDD Processor Instructions DDDDDDD Processor Instructions 54

55 Memory Typology: Shared Memory Processor 55

56 Memory Typology: Distributed MemoryProcessorMemoryProcessor MemoryProcessorMemoryProcessor Network 56

57 Memory Typology: Hybrid Memory Processor Network Processor Memory Processor Memory Processor Memory Processor 57

58 Parallelization Problems  How do we assign work units to workers?  What if we have more work units than workers?  What if workers need to share partial results?  How do we aggregate partial results?  How do we know all the workers have finished?  What if workers die? What is the common theme of all of these problems? 58

59 General Theme?  Parallelization problems arise from: Communication between workers Access to shared resources (e.g., data)  Thus, we need a synchronization system!  This is tricky: Finding bugs is hard Solving bugs is even harder 59

60 Managing Multiple Workers  Difficult because (Often) don’t know the order in which workers run (Often) don’t know where the workers are running (Often) don’t know when workers interrupt each other  Thus, we need: Semaphores (lock, unlock) Conditional variables (wait, notify, broadcast) Barriers  Still, lots of problems: Deadlock, livelock, race conditions,...  Moral of the story: be careful! Even trickier if the workers are on different machines 60

61 Patterns for Parallelism  Parallel computing has been around for decades  Here are some “design patterns” … 61

62 Master/Slaves slaves master 62

63 Producer/Consumer Flow CP P P C C CP P P C C 63

64 Work Queues C P P P C C shared queue WWWWW 64

65 Rubber Meets Road  From patterns to implementation: pthreads, OpenMP for multi-threaded programming MPI for clustering computing …  The reality: Lots of one-off solutions, custom code Write you own dedicated library, then program with it Burden on the programmer to explicitly manage everything  MapReduce to the rescue! 65

66 Map/Reduce (1)  Document  Query 66 http://ayende.com/blog/4435/map-reduce-a-visual-explanation

67 Map/Reduce (2)  Query result 67 http://ayende.com/blog/4435/map-reduce-a-visual-explanation

68 Map/Reduce (3)  Reduce 68 http://ayende.com/blog/4435/map-reduce-a-visual-explanation

69 Map/Reduce (4)  Reduce… 69 http://ayende.com/blog/4435/map-reduce-a-visual-explanation

70 Map/Reduce (5)  Reduce… 70 http://ayende.com/blog/4435/map-reduce-a-visual-explanation

71 Map/Reduce performance  Sorting 2 10 100 bytes records (~1TB) 71 http://static.usenix.org/event/osdi04/tech/full_papers/dean/dean.pdf

72 Security in a cloud  Traditional threats to a software  Functional threats of cloud components  Attacks on a client  Virtualization threats  Threat of cloud complexity  Attacks on hypervisor  Threats of VM migration  Attacks on management systems  Privacy, personal data 72

73 Traditional threats to a software The traditional treads are related to the vulnerabilities of network protocols, operating systems, modular components and other similar weaknesses. This is a classic security threat, to solve that, it is sufficient to use anti-virus software, firewall and other components discussed later. It is important that these tools are adapted to the cloud environment to run effectively in virtualization. 73

74 Functional threats of cloud components  This type of attack is associated with multiple layers of the "clouds", the main principle ofv security – the general level of security is the security of the weakest element. 74 Cloud elementMeans of security Proxy serverProtection against DoS-attacks Web serverMonitoring the integrity of the web pages Application serverShielding of the applications Data storage layerProtection against SQL injections Data storage systemsAccess control and backups

75 Attacks on a client These types of attacks have worked out in a web environment, but they are just as relevant in cloud environments, as users connect to the cloud through a web browser. Attacks include such types as Cross Site Scripting (XSS), DoS attacks, interception of web sessions, stealing passwords, "the man in the middle” and others. 75

76 Virtualization threats Since the platform for the cloud elements, usually is a virtual environment, the attack on virtualization threatens the entire cloud as a whole. This type of attack is unique to cloud computing. 76

77 Threat of cloud complexity Monitoring the events in the "cloud" and management of them is also a security issue. How do we ensure that all resources are counted and that there is no rogue virtual machine that perform third- party processes and do not interfere in mutual configuration of the layers and elements of the "cloud"? 77

78 Attacks on hypervisor In fact, a key element in the virtual system is a hypervisor which provides separation of physical computer resources among virtual machines. Interfering the work of the hypervisor or its breach may allow one virtual machine to access resources of other – network traffic, stored data. This can also lead to virtual machine displacement from the server. 78

79 Threats of VM migration Note that the virtual machine itself is a file that can be executed on different nodes of the "cloud". The system of virtual machine management includes mechanisms for the transfer (migration) of virtual machines. Nevertheless, it is possible to steal virtual machine file and run it out of the cloud. It is impossible to steal the physical server from the data centre, but you can steal files of virtual machines across the network without physical access to servers. 79

80 Attacks on management systems A large number of virtual machines that are used in the "clouds", especially in public clouds require a management system that can reliably control the creation, transfer and utilization of virtual machines. The interference in the management system can lead to ghost virtual machines, blocking some of the machines and the substitution of elements or layers in the cloud to the rogue. 80

81 Privacy, personal data When it comes to the privacy of data, there are a lot of problems with the legislation – such as the processing of personal data and its protection. Choosing a cloud computing as a solution for business systems, it is important to take into account the confidentiality of the data that will be stored in a "cloud". To store secret and top secret data in the "cloud" environments is not absolutely safe – that's why government agencies are still not switched to “clouds” 81

82 Thank you! 82


Download ppt "Cloud Computing Part #1 1. 2"

Similar presentations


Ads by Google