Presentation on theme: "Data Center Scale Computing" — Presentation transcript:
1 Data Center Scale Computing
"If computers of the kind I have advocated become the computers of the future, then computing may someday be organized as a public utility just as the telephone system is a public utility. The computer utility could become the basis of a new and important industry."
- John McCarthy, MIT centennial celebration (1961)
Presentation by: Ken Bakke, Samantha Orogvany, John Greene
2 Outline
- Introduction
- Data center system components
- Design and storage considerations
- Data center power supply
- Data center cooling
- Data center failures and fault tolerance
- Data center repairs
- Current challenges (current research, trends, etc.)
- Conclusion
3 Data Center vs. Warehouse-Scale Computer
Data center:
- Provides colocated equipment
- Consolidates heterogeneous computers
- Serves a wide variety of customers
- Binaries typically run on a small number of computers
- Resources are partitioned and separately managed
- Facility and computing resources are designed separately
- Shares security, environmental, and maintenance resources
Warehouse-scale computer (WSC):
- Designed to run massive internet applications
- Individual applications run on thousands of computers
- Homogeneous hardware and system software
- Central management of a common resource pool
- The design of the facility and the computer hardware is integrated
Data centers typically host a wide variety of small- to medium-sized applications, each on dedicated hardware; these applications are decoupled and protected from each other. WSCs belong to a single organization, use homogeneous hardware, system software, and a common management infrastructure, and are focused on cost efficiency.
4 Need for Warehouse-Scale Computers
- Renewed focus on client-side consumption of web resources
- Constantly increasing numbers of web users
- Constantly expanding amounts of information
- Desire for rapid response for end users
- Focus on cost reduction when delivering massive applications
- Increased interest in Infrastructure as a Service (IaaS)
5 Performance and Availability Techniques
- Replication
- Reed-Solomon codes
- Sharding
- Load balancing
- Health checking
- Application-specific compression
- Eventual consistency
- Centralized control
- Canaries
- Redundant execution and tail tolerance (see the sketch below)
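The last technique is easy to sketch: send the same request to more than one replica and keep whichever answer arrives first, so a single straggler cannot drag out the response. A minimal Python sketch, where `query_replica` is a hypothetical stand-in for a real RPC:

```python
# Minimal sketch of redundant execution ("hedged requests") for tail
# tolerance: issue the same request to two replicas, keep the winner.
import concurrent.futures

def query_replica(replica: str, request: str) -> str:
    # Stand-in for an RPC to one replica; real code would do network I/O.
    return f"{replica} answered {request}"

def hedged_request(replicas: list, request: str) -> str:
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        futures = [pool.submit(query_replica, r, request) for r in replicas]
        # Return the first completed response; stragglers are ignored.
        done, _ = concurrent.futures.wait(
            futures, return_when=concurrent.futures.FIRST_COMPLETED)
        return next(iter(done)).result()

print(hedged_request(["replica-a", "replica-b"], "GET /index"))
```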
6 Major System Components
- Typical server: 4 CPUs, each with 8 dual-threaded cores, yielding 32 cores (64 hardware threads)
- Typical rack: 40 servers and a 1 or 10 Gbps Ethernet switch
- Cluster: a cluster switch connecting multiple racks
- A cluster may contain tens of thousands of processing threads (see the arithmetic below)
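To see where "tens of thousands of threads" comes from, here is the arithmetic for the configuration above; the number of racks per cluster is an assumed example value, not from the slides:

```python
# Back-of-the-envelope thread count for the configuration above.
cpus_per_server = 4
cores_per_cpu = 8
threads_per_core = 2            # dual-threaded cores
servers_per_rack = 40
racks_per_cluster = 20          # assumption for illustration

threads_per_server = cpus_per_server * cores_per_cpu * threads_per_core  # 64
threads_per_cluster = threads_per_server * servers_per_rack * racks_per_cluster
print(threads_per_cluster)      # 51,200 hardware threads in one cluster
```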
7 Low-End Server vs. SMP
- Communication latency is roughly 1,000 times lower within an SMP
- The SMP advantage matters less for applications too large for a single server
- Figure: performance advantage of a cluster built with large SMP server nodes (128-core SMP) over a cluster with the same number of processor cores built with low-end server nodes (four-core SMP), for clusters of varying size
8 Brawny vs. Wimpy
Advantages of wimpy computers:
- Large multicore CPUs carry a price premium of 2-5x versus multiple smaller CPUs
- Memory- and I/O-bound applications do not take advantage of faster CPUs
- Slower CPUs are more power efficient
Disadvantages of wimpy computers:
- Increasing parallelism is programmatically difficult
- Programming costs increase
- Networking requirements increase
- Smaller tasks make load balancing more difficult
- Amdahl's law limits the achievable speedup (see the sketch below)
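The last point is worth quantifying. Amdahl's law says that a workload with serial fraction s run on n cores speeds up by at most 1 / (s + (1 - s) / n), which caps the benefit of replacing one brawny core with many wimpy ones:

```python
# Amdahl's law: speedup on n cores given a serial fraction s.
def amdahl_speedup(serial_fraction: float, n_cores: int) -> float:
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_cores)

# With just 5% serial work, 64 slow cores give at most ~15.4x, not 64x.
for n in (4, 16, 64, 256):
    print(n, round(amdahl_speedup(0.05, n), 1))
```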
9 Design Considerations
- Software design and improvements can be made to align with architectural choices
- Resource requirements and utilization can be balanced among all applications
- Spare CPU cycles can be used for compute-intensive applications
- Spare storage can be used for archival purposes
- Fungible resources are more efficient: workloads can be distributed to fully utilize servers
- Focus on cost-effectiveness: smart programmers may be able to restructure algorithms to match a more inexpensive design
10 Storage Considerations
Private data:
- Local DRAM, SSD, or disk
Shared-state data:
- High throughput for thousands of users
- Robust performance, tolerant of errors
Unstructured storage (e.g., Google's GFS):
- A master plus thousands of "chunk" servers
- Utilizes every system with a disk drive
- Cross-machine replication
Structured storage:
- Bigtable provides a (row key, column key, timestamp) mapping to a byte array (see the sketch below)
- Trade-offs favor high performance and massive availability
- The eventual consistency model leaves applications managing consistency issues
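As a rough illustration of the Bigtable data model mentioned above (a toy, not Google's implementation), a structured store is essentially a sparse map from (row key, column key, timestamp) to an uninterpreted byte array:

```python
# Toy illustration of the Bigtable-style data model: each write creates
# a new timestamped version of a cell; reads return the latest version.
import time
from typing import Dict, Optional, Tuple

class ToyTable:
    def __init__(self) -> None:
        self.cells: Dict[Tuple[str, str, float], bytes] = {}

    def put(self, row: str, column: str, value: bytes) -> None:
        # Each write is a new version keyed by its timestamp.
        self.cells[(row, column, time.time())] = value

    def get_latest(self, row: str, column: str) -> Optional[bytes]:
        versions = [(ts, v) for (r, c, ts), v in self.cells.items()
                    if r == row and c == column]
        return max(versions)[1] if versions else None

t = ToyTable()
t.put("com.example/index", "contents:html", b"<html>...</html>")
print(t.get_latest("com.example/index", "contents:html"))
```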
12 WSC Network Architecture
Leaf bandwidth:
- Bandwidth between servers in a common rack
- Typically managed with a commodity switch
- Easily increased by increasing the number or speed of ports
Bisection bandwidth:
- Bandwidth between the two halves of a cluster
- Matching leaf bandwidth requires as many uplinks to the fabric as links within a rack
- Since distances are longer, optical interfaces are required
13 Three-Stage Topology
- Required to maintain the same throughput as a single switch
14 Network Design
- Oversubscription ratios of 4-10 are common, to limit network cost per server (see the example below)
- Offloading to special-purpose networks
- Centralized management
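A quick sketch of how an oversubscription ratio falls out of port counts: it is the ratio of intra-rack ("leaf") bandwidth to uplink bandwidth into the cluster fabric. The link speeds and uplink count below are assumed example values:

```python
# Rack-switch oversubscription = leaf bandwidth / uplink bandwidth.
servers_per_rack = 40
server_link_gbps = 10        # one 10 Gbps link per server
uplinks = 4                  # assumed uplink count
uplink_gbps = 25             # assumed uplink speed

leaf_bandwidth = servers_per_rack * server_link_gbps   # 400 Gbps
uplink_bandwidth = uplinks * uplink_gbps               # 100 Gbps
print(leaf_bandwidth / uplink_bandwidth)               # 4.0 -> "4:1 oversubscribed"
```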
15 Service-Level Response Times
- Consider servers whose 99th-, 99.9th-, or 99.99th-percentile latency exceeds 1 s, versus the number of servers each request must touch (see the sketch below)
- Selective replication is one mitigating strategy
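The reason tail percentiles dominate at scale: if one user request fans out to n leaf servers and each is slow with probability p, the whole response is slow with probability 1 - (1 - p)^n. A small illustration:

```python
# Probability that a fan-out request is slowed by at least one straggler.
def prob_slow(p: float, n: int) -> float:
    return 1.0 - (1.0 - p) ** n

# With 100-way fan-out, even a 1-in-100 slow server hits most requests.
for p in (0.01, 0.001, 0.0001):      # 99th, 99.9th, 99.99th percentiles
    print(p, round(prob_slow(p, 100), 3))   # 0.634, 0.095, 0.01
```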
17 Power Supply Distribution
Uninterruptible power systems (UPS):
- A transfer switch chooses the active power input, either utility power or a generator
- After a utility power failure, the transfer switch detects the generator and, after a few seconds, switches over to it
- The UPS contains energy storage to bridge the gap between the utility failure and the moment the generators can carry the full load
- Conditions the incoming power feed, removing spikes and sags from the AC supply
18 Example of Power Distribution Units
Traditional PDU:
- Takes in power output from the UPS
- Regulates power with transformers and distributes it to servers
- Typically handles loads in the kW range
- Provides redundancy by switching between two power sources
19 Examples of Power Distribution
Facebook's power distribution system:
- Designed to increase power efficiency by reducing energy loss to about 15%
- Eliminates the UPS and PDU, adding an on-board 12 V battery for each cabinet
20 Cooling Needs
Air flow considerations:
- Fresh-air cooling ("opening the windows")
- Closed-loop systems
- Underfloor systems: servers sit on raised concrete tile floors
21 Cooling Systems: 2-Loop Design
- Loop 1: hot-air/cool-air circuit (red/blue arrows in the figure)
- Loop 2: liquid supply to the computer room air conditioning (CRAC) units and heat discharge
22 Example of Cooling System Design: 3-Loop System
- The chiller sends cooled water to the CRACs
- Heated water is sent from the building to the chiller for heat dispersal
- A condenser water loop flows into the cooling tower
25 Estimated Carbon Costs for Power
- Based on local utility power generated from oil, natural gas, coal, or renewable sources, including hydroelectricity, solar, wind, and biofuels
26 Power Efficiency
Sources of efficiency loss:
- Overhead of cooling systems, such as chillers
- Air movement
- IT equipment
- Power distribution units
Improvements to efficiency:
- Handle air flow more carefully: keep the cooling path short and separate the servers' hot exhaust from the cool supply air
- Consider raising cooling temperatures
- Employ "free cooling" by locating the data center in a cooler climate
- Select more efficient power systems
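These losses are commonly summarized by power usage effectiveness (PUE): total facility power divided by the power delivered to IT equipment. PUE is not named on the slide, and the component figures below are assumed example values, not measurements:

```python
# PUE = total facility power / IT equipment power (illustrative numbers).
it_power_kw = 1000.0
cooling_kw = 400.0        # chillers, CRACs, air movement (assumed)
distribution_kw = 100.0   # UPS and PDU losses (assumed)

pue = (it_power_kw + cooling_kw + distribution_kw) / it_power_kw
print(pue)   # 1.5 -> half a watt of overhead per watt of computing
```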
27 Data Center Failures
- Reliability is a trade-off among the costs of failures, of repairing them, and of preventing them
Fault tolerance:
- Traditional servers require a high degree of reliability and redundancy, preventing failures as much as possible
- For warehouse-scale computers this is not practical
- Example: a cluster of 10,000 servers will average about one server failure per day (see the arithmetic below)
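The arithmetic behind that example, assuming each server independently fails on average once every 10,000 days (about 27 years), which is an assumed illustrative figure:

```python
# Aggregate failure rate: even very reliable servers fail daily at scale.
servers = 10_000
mean_days_between_failures_per_server = 10_000   # ~27 years, assumed

failures_per_day = servers / mean_days_between_failures_per_server
print(failures_per_day)   # 1.0 -> about one server failure every day
```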
28 Data Center Failures: Fault Severity Categories
- Corrupted: data is lost, corrupted, or cannot be regenerated
- Unreachable: the service is down
- Degraded: the service is available but limited
- Masked: faults occur, but fault tolerance hides them from users
29 Data Center Fault Causes
- Software errors
- Faulty configs
- Human error
- Networking faults
- Faulty hardware
It is easier to tolerate known hardware issues than software bugs or human error.
Repairs:
- It is not critical to repair individual servers quickly
- In practice, repairs are scheduled as a daily sweep
- Individual failures mostly do not affect overall data center health
- The system is designed to tolerate faults
31 A Relatively New Class of Computers
- Facebook founded in 2004
- Google's modular data center in 2005
- Microsoft's Online Services Division in 2005
- Amazon Web Services in 2006
- Netflix added streaming in 2007
32 Balanced System
The nature of the workload at this scale is:
- Large volume
- Large variety
- Distributed
This means no servers (or parts of servers) get to slack while others do the work.
- Keep servers busy to amortize cost
- Need high performance from all components!
33 Imbalanced Parts
- Latency lags bandwidth
- Figure from John L. Hennessy and David A. Patterson, Computer Architecture: A Quantitative Approach, 5th ed.
34 Imbalanced Parts
- CPUs have been the historical focus
- Figures from the MSDN blog article "Background and Engineering the Windows 7 Improvements"
35 Focus Needs to Shift
The push toward SaaS will highlight these disparities. It requires concentrating research on:
- Improving non-CPU components
- Improving responsiveness
- Improving the end-to-end experience
36 Why Does Latency Matter?
- Responsiveness is dictated by latency
- Productivity is affected by responsiveness
- Figure from John L. Hennessy and David A. Patterson, Computer Architecture: A Quantitative Approach, 5th ed.
37 Real Estate Considerations
- Land
- Power
- Cooling
- Taxes
- Population
- Disasters
Image from Facebook's announcement of its new Iowa data center in April 2013 (http://newsroom.fb.com/News/606/A-New-Data-Center-for-Iowa)
39 Economical Efficiency
- The data center itself is a non-trivial cost (not including land)
- Servers are the bigger cost
- More servers are desirable
- Busy servers are desirable
40 Improving Efficiency
- Better components: energy proportional (less use == less energy; see the sketch below)
Power-saving modes:
- Transparent (e.g., clock gating)
- Active (e.g., CPU throttling)
- Inactive (e.g., idle drives stop spinning)
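Energy proportionality can be made concrete with a simple linear power model: a server that idles at half of its peak power wastes most of its energy at the low utilizations typical of WSCs. The idle and peak wattages below are assumed example values, not measurements:

```python
# Linear power model: power = idle + (peak - idle) * utilization.
def server_power_watts(utilization: float, idle_w: float = 100.0,
                       peak_w: float = 200.0) -> float:
    return idle_w + (peak_w - idle_w) * utilization

# "Efficiency" here: useful work per watt, relative to running at peak.
for u in (0.1, 0.3, 0.5, 1.0):
    watts = server_power_watts(u)
    print(u, watts, round(u * 200.0 / watts, 2))   # 0.18, 0.46, 0.67, 1.0
```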
41 Changing Workloads
Workloads are more agile in nature:
- SaaS enables shorter release cycles
- Office 365 updates several times per year
- Some Google services update weekly
- Even major software gets rewritten: the Google search engine has been rewritten from scratch 4 times
- Internet services are still young; usage can be unpredictable
42 YouTube
- Started in 2005
- Fifth most popular site within its first year
44 Adapting
Strike a balance between the need to deploy and longevity:
- Need it fast and good
- Design to make software easy to create
- Easier to find programmers
- Redesign when warranted: Google Search's rewrites removed inefficiencies
- Contrast with Intel's backwards compatibility spanning decades
45 Future Trends
Continued emphasis on:
- Parallelism
- Networking, both within and to/from data centers
- Reliability via redundancy
- Optimizing efficiency (energy proportionality)
- Environmental impact and energy costs
Amdahl's law will remain a major factor. An increased focus on end-to-end systems is needed. Computing as a utility?
46 “Anyone can build a fast CPU. The trick is to build a fast system.” -Seymour Cray
47 “Anyone can build a fast CPU. The trick is to build a fast system.” -Seymour Cray