Data Centers and Cloud Computing

Presentation transcript:

1 Data Centers and Cloud Computing

3 Data Centers

4 What is a Data Center?
- “A facility used to house computer systems and associated components.” (Wikipedia)
- “It is the brain of a company and the place where the most critical processes are run.” (SAP)
- “It is a factory that transforms and stores bits.” (Albert Greenberg, Microsoft)
- A collection of physical compute, storage, and network resources.

5 The Data Center in a Nutshell
Example: a wholesale data center in Virginia
- tens to hundreds of thousands of servers
- highly connected: fiber-optic cables
- cooling, in generous proportions
- diesel generators

6 Data Center Challenges
Large scale
- tens to hundreds of thousands of servers
- high bandwidth and low latency are critical
Availability and high performance
- 99.99X% availability
- redundancy of all critical components
Security and performance isolation
- controlled access to infrastructure
- secure the data
Complexity
- plethora of components
- plethora of software and hardware failures
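
To make the availability target concrete, the sketch below converts an availability percentage into the downtime it permits per year. The function name and the choice of a 365-day year are assumptions made for illustration; they do not come from the slides.

```python
# Minutes in a non-leap year (assumption: 365 days).
MINUTES_PER_YEAR = 365 * 24 * 60


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for a given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be a percentage between 0 and 100")
    unavailable_fraction = 1 - availability_percent / 100
    return MINUTES_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% availability allows about {minutes:.2f} minutes of downtime per year")
```

At 99.99% the budget is under an hour per year, which is why every critical component needs redundancy.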

7 Anatomy of a Data Center
Typical physical network topology is a tree.
[Figure: tree topology inside the Data Center, with labels Internet, Core Switch, Aggregation Switches, Top of Rack (TOR) Switch, Rack with Servers]

8 Anatomy of a Data Center
Common traffic patterns
North-south
- common with web sites
East-west
- common with back-end of web sites/web services
- common in Big Data
Many-to-one
- causes TCP incast
[Figure: tree topology inside the Data Center, with labels Internet, Core Switch, Aggregation Switches, Top of Rack (TOR) Switch, Rack with Servers]

9 Anatomy of a Data Center
What are the challenges with this design?
Links higher in the topology are oversubscribed (1)
- cannot handle all servers sending at maximum rate
- design tradeoff to scale
Single point of failure
- redundancy increases costs
(1) Oversubscription ratio: capacity of links below a switch relative to capacity of links above
[Figure: tree topology inside the Data Center, with labels Internet, Core Switch, Aggregation Switches, Top of Rack (TOR) Switch, Rack with Servers]
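
The oversubscription ratio defined on this slide can be computed directly. The following is a minimal sketch; the function name and the example port counts and link speeds are hypothetical values chosen for illustration, not figures from the presentation.

```python
def oversubscription_ratio(down_links: int, down_gbps: float,
                           up_links: int, up_gbps: float) -> float:
    """Capacity of links below a switch divided by capacity of links above it."""
    if up_links <= 0 or up_gbps <= 0:
        raise ValueError("the switch needs at least one uplink with positive capacity")
    capacity_below = down_links * down_gbps
    capacity_above = up_links * up_gbps
    return capacity_below / capacity_above


if __name__ == "__main__":
    # Hypothetical top-of-rack switch: 40 servers at 10 Gbps, 4 uplinks at 40 Gbps.
    ratio = oversubscription_ratio(down_links=40, down_gbps=10, up_links=4, up_gbps=40)
    print(f"oversubscription ratio = {ratio}:1")
```

A ratio above 1 means the uplinks cannot carry the traffic if every server below sends at its maximum rate; a ratio of exactly 1 corresponds to full bisection bandwidth at that switch.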

10 Anatomy of a Data Center
Can we achieve full bisection bandwidth (an oversubscription ratio of 1:1)?
Partially, because ...
- it requires enterprise-level switches
- even they become saturated at large scale

11 Can we do better?

12 Emerging Data Center Topology – Fat Trees
- use cheap, identical commodity switches
Goals:
- provide redundancy
- provide full bisection bandwidth: the bottleneck is the network interface, not the links in the network
- help address large volumes of data

13 Emerging Data Center Topology – Fat Trees
Main idea: interconnect racks using a fat-tree topology
E.g., given identical K-port switches, where K = 4:
[Figure: fat tree with Pod 0, Pod 1, Pod 2, Pod 3 and the Core, Aggregation, and Edge layers]
Challenges
- more switches than the tree topology
- different routing approach
- does not solve TCP incast
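
As a rough illustration of why a fat tree needs more switches, the sketch below sizes a fat tree built from identical K-port switches. It assumes the standard k-ary fat-tree construction (K pods, K/2 edge and K/2 aggregation switches per pod, (K/2)^2 core switches, K/2 hosts per edge switch); the slides do not spell out these formulas, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FatTreeSize:
    pods: int
    edge_switches: int
    aggregation_switches: int
    core_switches: int
    hosts: int

    @property
    def total_switches(self) -> int:
        return self.edge_switches + self.aggregation_switches + self.core_switches


def fat_tree_size(k: int) -> FatTreeSize:
    """Size a fat tree built from identical k-port switches (assumed k-ary construction)."""
    if k < 2 or k % 2 != 0:
        raise ValueError("k must be an even number of ports, at least 2")
    half = k // 2
    return FatTreeSize(
        pods=k,
        edge_switches=k * half,         # k/2 edge switches in each of k pods
        aggregation_switches=k * half,  # k/2 aggregation switches in each of k pods
        core_switches=half * half,      # (k/2)^2 core switches
        hosts=k * half * half,          # k/2 hosts per edge switch
    )


if __name__ == "__main__":
    for ports in (4, 48):
        size = fat_tree_size(ports)
        print(ports, size, "total switches:", size.total_switches)
```

For K = 4 this gives the four pods shown on the slide. Under the same assumption, 48-port switches would connect 27,648 hosts using 2,880 switches.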

15 How do we make use of Data Centers?

16 Cloud Computing

17 What is Cloud Computing?
Delivery of on-demand shared computing resources: everything from applications and services to compute, storage, and network resources.

18 What is Cloud Computing?
The hardware/software of the data center implements the Cloud.
[Figure: Data Center, Cloud]

19 Cloud Computing – Key Characteristics
Virtualized resources
- physical resources divided into pieces
- each customer gets an isolated piece
Rapid elasticity
- capabilities can be elastically provisioned/released
Pay-per-use
- only pay for the resources you use
On-demand
- customers can request/release resources whenever they want
Resilient
- multiple pools of resources that are unlikely to fail simultaneously
Shared
- multiple customers share the same physical resources

20 Cloud Computing
- The hardware/software of the data center implements the Cloud.
- The Cloud Computing Stack organizes the hardware/software into various layers.
- The various types of Clouds all have a Cloud Computing Stack backed by a data center.
[Figure: Data Center, Cloud Computing Stack, Cloud]

21 Cloud Computing – Categories
IaaS
- customers lease virtual machines, virtual storage, virtual networks
- customers must manage the operating system, file system, etc.
Cloud Computing Stack (Data Center): Physical Plant/Building; Networking; Firewalls/Security; Servers and Storage; Virtualization; Operating Systems; Development Tools and Database Management; Hosted Applications; Suites of Services

22 Cloud Computing – Categories
PaaS
- customers lease resources to run applications written in a specific language or framework, such as Python, Java, or MapReduce
- cloud provider manages the operating system, file system, and network
Cloud Computing Stack (Data Center): Physical Plant/Building; Networking; Firewalls/Security; Servers and Storage; Virtualization; Operating Systems; Development Tools and Database Management; Hosted Applications; Suites of Services

23 Cloud Computing – Categories
SaaS
- customers lease machines that run specific software
- it is what most people mean when they say the “Cloud”
Cloud Computing Stack (Data Center): Physical Plant/Building; Networking; Firewalls/Security; Servers and Storage; Virtualization; Operating Systems; Development Tools and Database Management; Hosted Applications; Suites of Services

24 Cloud Computing – Deployment Models
Private Cloud
- only available to users (e.g. departments) within a company or organization
Public Cloud
- anyone can request and use the cloud
Hybrid Cloud
- a composition of public and private cloud resources
- bounded by standardized or proprietary technology

25 Applications Suited for the Cloud
- Web sites, or web services
- Big Data

26 Web services
[Figure: HTTPS GET/POST, Web Server, Business logic, Database]

27 Big Data
Motivation:
- handle massive amounts of data
- leverage parallelization
- separate programming abstractions from the runtime execution model
- must be fast and easy to use

28 How can we process big data, leveraging parallelization?
Want speed, accuracy, and minimum network traffic.
Document 1: Cloud Computing can scale.
Document 2: Cloud Computing runs on Data Centers at scale.
Document 3: Data Centers can scale.
Requirement: for every word in the documents, print the IDs of the documents where it appears. For example:
Term        Document
Cloud       1, 2
Computing   1, 2
can         1, 3
scale       1, 2, 3
runs        2
Data        2, 3
Centers     2, 3
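
The requirement on this slide amounts to building an inverted index. Below is a minimal single-machine sketch using the three example documents. The function names are hypothetical, and the stop-word set is an assumption: "on" and "at" are skipped only because the slide's table omits them.

```python
import string

DOCUMENTS = {
    1: "Cloud Computing can scale.",
    2: "Cloud Computing runs on Data Centers at scale.",
    3: "Data Centers can scale.",
}

# Assumption: the slide's table omits these two words, so we skip them too.
STOP_WORDS = {"on", "at"}


def tokenize(text: str) -> list[str]:
    """Split on whitespace, strip surrounding punctuation, drop stop words."""
    words = (raw.strip(string.punctuation) for raw in text.split())
    return [w for w in words if w and w.lower() not in STOP_WORDS]


def build_index(documents: dict[int, str]) -> dict[str, list[int]]:
    """Map each term to the sorted list of document IDs that contain it."""
    index: dict[str, list[int]] = {}
    for doc_id in sorted(documents):
        for word in tokenize(documents[doc_id]):
            doc_ids = index.setdefault(word, [])
            if doc_id not in doc_ids:
                doc_ids.append(doc_id)
    return index


if __name__ == "__main__":
    for term, doc_ids in build_index(DOCUMENTS).items():
        print(term, ", ".join(str(d) for d in doc_ids))
```

This version processes the documents one after another on a single machine, so it does not yet exploit the parallelization the slide asks for.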

29 ... we can use MapReduce
Programming paradigm: divide the analysis into two parts
MAP task:
- given a subset of the data
- extract relevant data and obtain partial results (process)
REDUCE task:
- receive partial results from each MAP task
- combine into final result
Example:
Doc1: Cloud Computing can scale.
Doc2: Cloud Computing runs on Data Centers at scale.
MAP output for Doc1: Cloud 1; Computing 1; can 1; scale 1
MAP output for Doc2: Cloud 2; Computing 2; runs 2; Data 2; Centers 2; scale 2
REDUCE output: Cloud 1, 2; Computing 1, 2; can 1; scale 1, 2; runs 2; Data 2; Centers 2
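
The two-phase split can be sketched in a few lines. This is an in-process illustration of the paradigm, not the API of any real MapReduce framework. The names `map_task`, `shuffle`, `reduce_task`, and `run_job` are hypothetical, and the stop-word set is an assumption made so the output matches the slide, which omits "on" and "at".

```python
import string
from collections import defaultdict

# Assumption: "on" and "at" are skipped because the slide's output omits them.
STOP_WORDS = {"on", "at"}


def map_task(doc_id: int, text: str) -> list[tuple[str, int]]:
    """MAP: emit a (word, doc_id) pair for every relevant word in one document."""
    pairs = []
    for raw in text.split():
        word = raw.strip(string.punctuation)
        if word and word.lower() not in STOP_WORDS:
            pairs.append((word, doc_id))
    return pairs


def shuffle(pairs: list[tuple[str, int]]) -> dict[str, list[int]]:
    """Group the partial results by word before they reach the reducers."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for word, doc_id in pairs:
        grouped[word].append(doc_id)
    return grouped


def reduce_task(word: str, doc_ids: list[int]) -> tuple[str, list[int]]:
    """REDUCE: combine the partial results for one word into the final entry."""
    return word, sorted(set(doc_ids))


def run_job(documents: dict[int, str]) -> dict[str, list[int]]:
    partial: list[tuple[str, int]] = []
    # Each map_task call is independent, so a real system could run them in parallel.
    for doc_id, text in documents.items():
        partial.extend(map_task(doc_id, text))
    return dict(reduce_task(word, ids) for word, ids in shuffle(partial).items())


if __name__ == "__main__":
    docs = {
        1: "Cloud Computing can scale.",
        2: "Cloud Computing runs on Data Centers at scale.",
    }
    print(run_job(docs))
```

Because each MAP call only sees its own document, the work can be spread across many servers. Only the grouped (word, document ID) pairs need to cross the network to the REDUCE step.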

30 Let’s put the pieces together! How does Internet search work?
[Figure: search pipeline running in the Cloud on a Data Center, with labels World Wide Web, Crawler, MAP 1, MAP 2, MAP 3, REDUCE, Google Index (Database), Ranking algos, Web Server (HTTPS GET/POST), and results Document 2, Document 1]
Crawled documents: Cloud Computing can scale. Cloud Computing runs on Data Centers at scale. Data Centers can scale.
Google Index: Cloud 1, 2; Computing 1, 2; can 1, 3; scale 1, 2, 3; runs 2; Data 2, 3; Centers 2, 3

31 Lots of challenges in Cloud Computing
Large-scale networks
- hundreds of thousands of servers
Shared infrastructure
- customers competing for bandwidth
Security
- virtual machines/storage/network must be isolated among customers
Fixing problems – many!
- customer application
- customer operating system
- physical/virtual network interface, switch
- top of rack/aggregation/core switch

