1 The University of Adelaide, School of Computer Science
Computer Architecture: A Quantitative Approach, Fifth Edition
Chapter 6: Warehouse-Scale Computers to Exploit Request-Level and Data-Level Parallelism
The University of Adelaide, School of Computer Science, 14 May 2018
Copyright © 2012, Elsevier Inc. All rights reserved.

2 The University of Adelaide, School of Computer Science
Introduction
Warehouse-scale computer (WSC):
- Provides Internet services: search, social networking, online maps, video sharing, online shopping, email, cloud computing, etc.
- Differences from HPC "clusters":
  - Clusters have higher-performance processors and networks
  - Clusters emphasize thread-level parallelism; WSCs emphasize request-level parallelism
- Differences from datacenters:
  - Datacenters consolidate different machines and software into one location
  - Datacenters emphasize virtual machines and hardware heterogeneity in order to serve varied customers

3 The University of Adelaide, School of Computer Science
Introduction (cont.)
Important design factors for a WSC:
- Cost-performance: small savings add up in huge systems
- Energy efficiency: affects power distribution and cooling; measured as work per joule
- Dependability via redundancy
- Network I/O
- Interactive and batch-processing workloads
- Ample computational parallelism is not important: most jobs are totally independent ("request-level parallelism")
- Operational costs count: power consumption is a primary, not secondary, constraint when designing the system
- Scale and its opportunities and problems: can afford to build customized systems, since WSCs require volume purchases

4 The University of Adelaide, School of Computer Science
Introduction (cont.)
Characteristics that differ from conventional server architecture:
- Ample parallelism
  - Server applications need parallelism to utilize a parallel architecture; a WSC serves billions of Web requests in data-parallel fashion
  - Interactive Internet services, also known as software as a service (SaaS), serve millions of independent users with little read-write dependence; this is also referred to as request-level parallelism
- Operational costs count
  - A server is designed for peak performance; as long as power stays within the cooling capacity, operating cost is often ignored
  - A WSC has a longer lifetime, and energy, power, and cooling represent about 30% of its cost
- Scale and its opportunities and problems
  - An extreme high-end server is very expensive, partly due to custom design and the few units built
  - A WSC comprises thousands of simple servers, which cuts cost, but raises difficult availability and dependability issues

5 Prgrm’g Models and Workloads
Programming Models and Workloads for WSCs
Batch processing framework: MapReduce
- Map: applies a programmer-supplied function to each logical input record
  - Runs on thousands of computers
  - Produces a new set of key-value pairs as intermediate values
- Reduce: collapses the intermediate values using another programmer-supplied function
- See Fig. 6.2 (next slide) for MapReduce usage at Google

6 Prgrm’g Models and Workloads
Programming Models and Workloads for WSCs (cont.)
Example: count the number of occurrences of every word.

  map(String key, String value):
    // key: document name
    // value: document contents
    for each word w in value:
      EmitIntermediate(w, "1");   // emit an occurrence count of 1 per word

  reduce(String key, Iterator values):
    // key: a word
    // values: a list of counts
    int result = 0;
    for each v in values:
      result += ParseInt(v);      // get integer from key-value pair
    Emit(AsString(result));
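The pseudocode above can be sketched as runnable Python. This is a minimal single-machine illustration: the function names (`map_phase`, `reduce_phase`) and the in-memory shuffle are illustrative stand-ins for what the real framework distributes across thousands of nodes.

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit an intermediate (word, 1) pair for every word in every document.
    for name, contents in documents.items():
        for word in contents.split():
            yield word, 1

def reduce_phase(pairs):
    # Shuffle: group intermediate values by key (the word).
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    # Reduce: collapse each word's list of counts into a total.
    return {word: sum(counts) for word, counts in grouped.items()}

docs = {"doc1": "the quick brown fox", "doc2": "the lazy dog the end"}
counts = reduce_phase(map_phase(docs))
print(counts["the"])  # 3
```

The two phases mirror the slide exactly: the map loop corresponds to EmitIntermediate, and the dictionary comprehension plays the role of Emit over each key's value list.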

7 Prgrm’g Models and Workloads
Programming Models and Workloads for WSCs (cont.)
- The MapReduce runtime environment schedules map and reduce tasks to WSC nodes
- Can be thought of as a generalization of SIMD:
  - Map can be partitioned and replicated
  - Reduce is a reduction function over the Map outputs
- Availability:
  - Use replicas of data across different servers
  - Use relaxed consistency: no need for all replicas to always agree

8 The University of Adelaide, School of Computer Science
WSC Utilization
- WSC hardware and software must cope with variability in load
- Use data replication to overcome failures
- Internal file systems supply the data
  - Use relaxed consistency and allow inconsistency, unlike a database
- Workload demands vary: Google search traffic varies by a factor of 2 depending on the time of day
  - Unlike HPC, which tries to maximize utilization

9 Copyright © 2011, Elsevier Inc. All rights Reserved.
Figure 6.3 Average CPU utilization of more than 5000 servers during a 6-month period at Google. Servers are rarely completely idle or fully utilized, instead operating most of the time at between 10% and 50% of their maximum utilization. The third column from the right in Figure 6.4 calculates percentages plus or minus 5% to come up with the weightings; thus, 1.2% for the 90% row means that 1.2% of servers were between 85% and 95% utilized.

10 Copyright © 2011, Elsevier Inc. All rights Reserved.
Figure 6.5 Hierarchy of switches in a WSC with an array of racks; each rack has 48 1U servers. (Based on Figure 1.2 of Barroso and Hölzle [2009].)

11 Computer Architecture of WSC
Computer Architecture of WSC
- WSCs often use a hierarchy of networks for interconnection (cost reduction)
- Each 19" rack holds 48 1U servers connected to a rack switch
- Rack switches are uplinked to a switch higher in the hierarchy
  - The uplink has 48/n times lower bandwidth, where n = number of uplink ports to the array switch: "oversubscription"
- The goal is to maximize locality of communication relative to the rack (i.e., minimize inter-rack communication)
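The oversubscription arithmetic from the bullets above can be checked in a few lines. The 48 servers per rack is the slide's figure; the uplink port count of 8 and the helper function are assumed example values.

```python
def oversubscription(servers_per_rack=48, uplink_ports=8):
    # Each server has one link of a given speed into the rack switch, but
    # only `uplink_ports` links of that speed go up to the array switch,
    # so per-server bandwidth out of the rack is 48/n times lower.
    return servers_per_rack / uplink_ports

print(oversubscription(48, 8))  # 6.0: inter-rack bandwidth is 6x scarcer
```

This is why software that keeps communication within a rack (the last bullet's locality goal) runs at full bandwidth, while rack-crossing traffic contends for the shared uplinks.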

12 The University of Adelaide, School of Computer Science
Storage
Storage options:
- Use disks inside the servers, or
- Network-attached storage over InfiniBand
WSCs generally rely on local disks; the Google File System (GFS) uses local disks and maintains at least three replicas

13 The University of Adelaide, School of Computer Science
Array Switch
- A switch that connects an array of racks, designed for up to 50,000 servers
- An array switch should have 10x the bisection bandwidth of a rack switch
- The cost of an n-port switch grows as n²
- Such switches often utilize content-addressable memory chips and FPGAs
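The n² cost growth is why WSCs use a hierarchy of small switches rather than one huge flat switch. A toy comparison, where the quadratic cost model is the slide's but the port counts and unit cost are made-up illustration values:

```python
def switch_cost(ports, unit_cost=1.0):
    # Toy model from the slide: switch cost grows quadratically in port count.
    return unit_cost * ports ** 2

# One flat 512-port switch vs. a two-level hierarchy of 64-port switches:
# 8 leaf switches plus 1 spine switch = 9 small switches.
flat = switch_cost(512)
hierarchy = 9 * switch_cost(64)
print(flat / hierarchy)  # ~7.1: the flat switch is far costlier in this model
```

The hierarchy wins on cost but pays for it with the oversubscribed uplinks described two slides earlier.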

14 The University of Adelaide, School of Computer Science
WSC Memory Hierarchy
Servers can access DRAM and disks on other servers using a NUMA-style interface

15 Copyright © 2011, Elsevier Inc. All rights Reserved.
Figure 6.7 Graph of latency, bandwidth, and capacity of the memory hierarchy of a WSC, for the data in Figure 6.6 [Barroso and Hölzle 2009]. (The plotted curves include DRAM bandwidth and DRAM latency.)

16 Copyright © 2011, Elsevier Inc. All rights Reserved.
Figure 6.8 The Layer 3 network used to link arrays together and to the Internet [Greenberg et al. 2009]. Some WSCs use a separate border router to connect the Internet to the datacenter Layer 3 switches. (This network interconnects 50,000 servers. See the examples about memory latency and data transfer in the text.)

17 Infrastructure and Costs of WSC
Physical Infrastructure and Costs of WSC
Location of a WSC:
- Proximity to Internet backbones, electricity cost, property tax rates, low risk from earthquakes, floods, and hurricanes
Power distribution: 5 steps, 4 voltage changes

18 Infrastructure and Costs of WSC
Infrastructure and Costs of WSC (cont.)
Cooling:
- Air conditioning is used to cool the server room to 64°F to 71°F
  - Keep the temperature higher (closer to 71°F) to save energy
- Cooling towers can also be used
  - The minimum achievable temperature is the "wet-bulb temperature"

19 Infrastructure and Costs of WSC
Infrastructure and Costs of WSC (cont.)
- The cooling system also uses water (evaporation and spills), e.g., 70,000 to 200,000 gallons per day for an 8 MW facility
- Power cost breakdown:
  - Chillers: % of the power used by the IT equipment
  - Air conditioning: % of the IT power, mostly due to fans
- How many servers can a WSC support?
  - Each server's "nameplate power rating" gives its maximum power consumption
  - To get actual consumption, measure power under real workloads
  - Oversubscribe the cumulative server power by 40%, but monitor power closely
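The sizing rule in the last bullets can be sketched as a small calculation. The 40% oversubscription factor is the slide's; the 8 MW IT budget and 200 W measured per-server draw are assumed example values, and the function name is illustrative.

```python
def max_servers(facility_it_watts, measured_server_watts, oversubscribe=1.40):
    # Servers rarely draw peak power simultaneously, so provision as if the
    # facility could supply 1.4x its IT power budget, then monitor total
    # draw closely and throttle work if it approaches the real limit.
    return int(facility_it_watts * oversubscribe / measured_server_watts)

print(max_servers(8_000_000, 200))  # 56000 servers on an 8 MW IT budget
```

Without oversubscription the same budget would host only 40,000 servers, so the 40% rule buys substantial capacity at the cost of careful power monitoring.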

20 Measuring Efficiency of a WSC
Measuring Efficiency of a WSC
- Power utilization effectiveness (PUE) = total facility power / IT equipment power
  - The median PUE in a 2006 study was 1.69
- Performance:
  - Latency is an important metric because it is seen by users
  - A Bing study showed that users use search less as response time increases
  - Service level objectives (SLOs) / service level agreements (SLAs), e.g., 99% of requests must complete in under 100 ms
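PUE is a simple ratio, sketched below with the median value from the 2006 study; the wattage figures are assumed examples chosen to reproduce that median.

```python
def pue(total_facility_watts, it_equipment_watts):
    # PUE >= 1.0, and lower is better: 1.0 would mean every watt entering
    # the facility reaches the IT equipment (no cooling/distribution loss).
    return total_facility_watts / it_equipment_watts

# A facility drawing 1.69 MW in total to power 1 MW of IT equipment
# matches the 2006 median PUE of 1.69.
print(pue(1_690_000, 1_000_000))  # 1.69
```

Note that PUE measures only infrastructure overhead; a facility can have an excellent PUE while running inefficient servers, which is why latency and work-per-joule metrics are tracked separately.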

21 Copyright © 2011, Elsevier Inc. All rights Reserved.
Figure 6.11 Power utilization efficiency of 19 datacenters in 2006 [Greenberg et al. 2006]. The power for air conditioning (AC) and other uses (such as power distribution) is normalized to the power for the IT equipment in calculating the PUE. Thus, power for IT equipment must be 1.0, and AC varies from about 0.30 to 1.40 times the power of the IT equipment. Power for "other" varies from about 0.05 to 0.60 of the IT equipment.

22 Power in IT Equipment (2007)
Power in IT Equipment (2007)
- 33% of power for processors
- 30% for DRAM
- 10% for disks
- 5% for networking
- 22% for other components (inside the server)

23 The University of Adelaide, School of Computer Science
Cost of a WSC (Fig. 6.13)
- Capital expenditures (CAPEX): the cost to build a WSC
- Operational expenditures (OPEX): the cost to operate a WSC

24 Copyright © 2011, Elsevier Inc. All rights Reserved.
Figure 6.18 The best SPECpower results as of July 2010 versus the ideal energy proportional behavior. The system was the HP ProLiant SL2x170z G6, which uses a cluster of four dual-socket Intel Xeon L5640s with each socket having six cores running at 2.27 GHz. The system had 64 GB of DRAM and a tiny 60 GB SSD for secondary storage. (The fact that main memory is larger than disk capacity suggests that this system was tailored to this benchmark.) The software used was IBM Java Virtual Machine version 9 and Windows Server 2008, Enterprise Edition.

25 Copyright © 2011, Elsevier Inc. All rights Reserved.
Figure 6.19 Google customizes a standard 1AAA container: 40 x 8 x 9.5 feet (12.2 x 2.4 x 2.9 meters). The servers are stacked up to 20 high in racks that form two long rows of 29 racks each, with one row on each side of the container. The cool aisle goes down the middle of the container, with the hot air return being on the outside. The hanging rack structure makes it easier to repair the cooling system without removing the servers. To allow people inside the container to repair components, it contains safety systems for fire detection and mist-based suppression, emergency egress and lighting, and emergency power shut-off. Containers also have many sensors: temperature, airflow pressure, air leak detection, and motion-sensing lighting. A video tour of the datacenter can be found online. Microsoft, Yahoo!, and many others are now building modular datacenters based upon these ideas, but they have stopped using ISO standard containers since the size is inconvenient.

26 Copyright © 2011, Elsevier Inc. All rights Reserved.
Figure 6.20 Airflow within the container shown in Figure 6.19. This cross-section diagram shows two racks on each side of the container. Cold air blows into the aisle in the middle of the container and is then sucked into the servers. Warm air returns at the edges of the container. This design isolates cold and warm airflows.

27 Copyright © 2011, Elsevier Inc. All rights Reserved.
Figure 6.21 Server for Google WSC. The power supply is on the left and the two disks are on the top. The two fans below the left disk cover the two sockets of the AMD Barcelona microprocessor, each with two cores, running at 2.2 GHz. The eight DIMMs in the lower right each hold 1 GB, giving a total of 8 GB. There is no extra sheet metal, as the servers are plugged into the battery and a separate plenum is in the rack for each server to help control the airflow. In part because of the height of the batteries, 20 servers fit in a rack.

28 Copyright © 2011, Elsevier Inc. All rights Reserved.
Figure 6.22 Power usage effectiveness (PUE) of 10 Google WSCs over time. Google A is the WSC described in this section; it is the highest line in Q3 '07 and Q2 '10. Facebook recently announced a new datacenter that should deliver an impressive PUE of 1.07. The Prineville, Oregon facility has no air conditioning and no chilled water. It relies strictly on outside air, which is brought in one side of the building, filtered, cooled via misters, pumped across the IT equipment, and then sent out of the building by exhaust fans. In addition, the servers use a custom power supply that allows the power distribution system to skip one of the voltage conversion steps in Figure 6.9.

29 Copyright © 2011, Elsevier Inc. All rights Reserved.
Figure 6.24 Query-response time curve (the annotated line in the figure marks a slower server).

30 The University of Adelaide, School of Computer Science
Cloud Computing
WSCs offer economies of scale that cannot be achieved with a conventional datacenter:
- 5.7x reduction in storage costs
- 7.1x reduction in administrative costs
- 7.3x reduction in networking costs
This has given rise to cloud services such as Amazon Web Services ("utility computing"), based on using open source virtual machine and operating system software.

