CMPT 880: Large-scale Multimedia Systems and Cloud Computing


1 CMPT 880: Large-scale Multimedia Systems and Cloud Computing
Cost of Data Centers. Today, I'll present my topic, "Cost of Data Centers." Queenie Wong

2 Introduction
GigaOM tech news: "Facebook put over $1 billion into a new datacenter in Iowa..." "Google spent $400 million to expand its datacenter, bringing its total spending in the area to $1.5 billion..." Does a datacenter really cost over a billion dollars? How do we calculate and model the cost of building and operating a datacenter? What is the total cost of ownership (TCO) of a datacenter? How can the cost be reduced effectively? Have you read news like this about the cost of datacenters before? After reading it, I had some questions in mind; you can find the answers in this presentation.

3 Modeling Costs
Simplified model: capital expense (Capex) of datacenter and servers; operational expense (Opex) of datacenter and servers. Total Cost of Ownership (TCO) = datacenter depreciation & Opex + server depreciation & Opex. Costs of software and administrators are omitted from the calculation; the focus is on running the physical infrastructure, and those costs vary greatly. First, I'll talk about modeling cost. This is a simplified model. Capital expense is money paid upfront and depreciated over the asset's expected lifetime. Operational expense is the monthly cost of actually running the business, such as electricity. The major costs of owning a datacenter are the depreciation and operational expenses of the datacenter and its servers. Software and administrator costs are omitted because we focus on the physical infrastructure and because they vary greatly with the situation.

4 Capital Costs
Datacenter construction costs; server costs; infrastructure costs (facilities dedicated to consistent power delivery); networking (switches, routers, load balancers, etc.). Capital costs include the datacenter construction cost, which depends mainly on the design of the datacenter; server costs; infrastructure costs, meaning facilities dedicated to delivering consistent power; and networking equipment such as switches, routers, and load balancers. A load balancer distributes incoming requests across servers to optimize resource utilization. I will go into the details of each capital cost in the next few slides.

5 Datacenters
Datacenter construction costs depend on design, size, location, reliability, and redundancy. They depreciate over 10-15 years, with interest on top. Most large DCs cost $12-15/W to build; the very small or very large ones cost more. Approximately 80% goes toward power and cooling, the remaining 20% toward the general building and site construction. Construction cost is amortized over 10-15 years, the expected lifetime, with interest added on top. Datacenter cost is conventionally expressed in dollars per watt of usable critical power, meaning the power consumed by the IT equipment that actually processes data. Very small datacenters cost more because some fixed costs cannot be amortized, and very large ones require special infrastructure.

6 Datacenter Example
Cost $15/W, amortized over 12 years: $1.25/W per year, or about $0.10/W per month. Financing at 8% adds about $0.06/W, for a total of $0.16/W per month. For example, a datacenter that costs $15 per watt to build, amortized over 12 years, comes to $1.25 per watt per year, or roughly 10 cents per watt per month; financing at 8% adds about 6 cents, so the total capital cost of the datacenter is 16 cents per watt per month. The sketch below makes the arithmetic explicit.
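To make the arithmetic concrete, here is a minimal sketch, assuming the interest charge follows a standard level annuity (loan) payment; the book's spreadsheet may split depreciation and interest slightly differently.

```python
def monthly_cost_per_watt(capex_per_watt, lifetime_years, annual_rate):
    """Amortized monthly cost per watt, split into straight-line
    depreciation plus interest from a level annuity (loan) payment."""
    n = lifetime_years * 12                 # number of monthly payments
    r = annual_rate / 12                    # monthly interest rate
    payment = capex_per_watt * r / (1 - (1 + r) ** -n)   # annuity formula
    depreciation = capex_per_watt / n
    return depreciation, payment - depreciation, payment

# Datacenter: $15/W over 12 years at 8%
dep, interest, total = monthly_cost_per_watt(15.0, 12, 0.08)
print(f"{dep:.3f} {interest:.3f} {total:.3f}")   # 0.104 0.058 0.162
```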

7 Servers
Server costs depreciate over 3-4 years, a shorter lifetime than the datacenter, again with interest on top. Server costs are also characterized per watt. Example: a $4,000 server with peak power consumption of 500 W costs $8/W. Depreciated over 4 years, that is about $0.17/W per month; financing at 8% adds about $0.03/W per month, for a total of $0.20/W per month, comparable to the datacenter's capital cost of $0.16/W per month. The same sketch reproduces these figures, as shown below.
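Reusing the `monthly_cost_per_watt` sketch from the previous slide:

```python
# Server: $4,000 / 500 W = $8/W over 4 years at 8%
dep, interest, total = monthly_cost_per_watt(4000 / 500, 4, 0.08)
print(f"{dep:.3f} {interest:.3f} {total:.3f}")   # 0.167 0.029 0.195 -> ~$0.20/W/month
```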

8 Infrastructure Cost
Facilities dedicated to consistent power delivery and to evacuating heat: generators, transformers, and UPS systems. Infrastructure cost covers the facilities dedicated to delivering power and cooling, including generators, transformers, and UPS (uninterruptible power supply) systems. Power is transferred along this infrastructure and finally consumed by the IT equipment. One thing worth noting: most power conversion losses, around 6%, happen in the UPS.

9 Networking Equipment
Links and transit: inter-datacenter links between geographically distributed datacenters; traffic to Internet service providers; regional facilities to reach wide-area network interconnection sites. Equipment: switches, routers, load balancers. Networking costs cover link and transit expenses, for example the inter-datacenter links between geographically distributed datacenters, traffic to Internet service providers, and regional facilities that reach wide-area interconnection sites. These are the costs of communication across sites and with the outside world. The traffic along these links is significant and requires equipment such as switches, routers, and load balancers to forward and distribute the data.

10 Operational Costs
Datacenter: geographic location factors (climate, taxes, salary levels), design and age, security. Server: hardware maintenance, power. Datacenter operational costs are affected by geographic location factors, by the design and age of the datacenter, and by the security level. Server operational costs cover hardware repairs and maintenance, and power, the primary cost, which dominates operational expense.

11 Power
A 2007 US Environmental Protection Agency (EPA) report predicted that datacenter power consumption could increase to 3% of total US electricity consumption by 2011. In 2010, datacenters in the US actually consumed between 1.7% and 2.2% of total US electricity, much lower than the EPA's prediction [Koomey, Analytics Press]. Google's datacenters consumed less than 1% of the electricity used by datacenters worldwide. You can see the significance of datacenter power consumption. The good news is that, thanks to increased energy efficiency, actual 2010 consumption came in well under the prediction. Surprisingly, Google's datacenters account for less than 1% of the electricity used by datacenters around the world, which shows the importance of energy efficiency and how it affects the cost of electricity. Even though actual consumption is much lower than predicted, the electricity cost of a datacenter is still very significant.

12 Case Study A: High-end Servers
Here is a case study of datacenter cost: a multi-megawatt datacenter with Tier 3 classification, fully populated with high-end servers. The server-related cost is 69% (excluding server power), and the datacenter-related cost is 24%. The total cost is $0.996/W per month. The data come directly from the spreadsheet downloadable from the link provided in the book (see the references).

13 Case Study B: Low-end Servers
Lower-cost, higher-power servers. With low-end servers, the server-related cost drops significantly, from 69% of total cost in case A to 29%, while the datacenter-related cost rises to 49%. The total cost is $0.483/W per month, lower than case A. In fact, server costs are trending down while electricity and construction costs are trending up; over the long term, facility costs, and especially power consumption, will become a larger fraction of TCO.

14 Real-World Datacenter
Real-world costs are even higher than modeled. The model assumes the datacenter is 100% utilized at 50% CPU utilization. In reality, empty space is reserved for future growth, and servers are provisioned for their maximum power consumption rather than the average they actually draw, to avoid overheating or tripping a breaker. These reserves amount to 20-50%, so a DC with 10 MW of critical power will often consume just 4-6 MW. Power oversubscription can be applied to address this, reducing stranded power and increasing the overall utilization of the datacenter.

15 Case Study C: Partially Filled Datacenter
In this case, a datacenter at 50% occupancy of case B, the datacenter-related costs completely dominate: they are 66% of total cost, while the server-related cost falls to 19% from 29% in case B. The total cost is $0.72/W per month, and the 3-yr TCO is $12,968. In sum, the 3-yr TCO of case B is $8,702, which is 19% lower than case A at $10,757 and 33% lower than case C at $12,968. The utilization of a datacenter is the key to lowering its real cost.

16 Energy Efficiency
Datacenter facilities and servers typically run at only about 30% utilization, which is remarkably low. Power Usage Effectiveness (PUE) is the ratio of total facility power to the power delivered to the IT equipment. State-of-the-art DC facilities have a PUE of about 1.7; inefficient facilities have PUEs of 2.0 to 3.0. Google recently reported a fleet-wide PUE of 1.12, with its best site below 1.06 by its own calculations. Compared with such new datacenters, conventional ones have much lower energy efficiency. A tiny illustration of the metric follows.
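PUE divides total facility power by IT power; a one-line illustration (the kW figures here are made up):

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT power.
    1.0 is the ideal; 2.0 means each watt of computing costs another
    watt of overhead (cooling, power conversion losses, lighting)."""
    return total_facility_kw / it_equipment_kw

print(pue(1700.0, 1000.0))   # 1.7, the state-of-the-art figure above
```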

17 Resilience
Conventionally, resilience is built at the hardware level to mask failures, using UPS systems and generators, and large amounts are spent on this redundancy. The proposal is to build resilience at the system level instead and eliminate the expensive infrastructure (generators, UPS). The failure unit then becomes an entire datacenter: the workload of the failed DC is redistributed across sites. Removing this hardware from each datacenter can save a huge amount of money.

18 Agility
Any server can be dynamically assigned to any service anywhere in the datacenter: server pools grow and shrink dynamically while maintaining a high level of security and performance isolation between services, and rapid virtual machine migration is supported. Conventional datacenter design works against agility, however, because of fragmentation of resources (machines can only use resources within the same subnet) and poor server-to-server connectivity (due to the hierarchical network architecture).

19 Design Objectives for Agility
Location-independent addressing: decouple a server's location from its address, so any server can become part of any server pool and there is no more fragmentation of resources. Uniform bandwidth and latency: a service can be distributed arbitrarily in the DC (that is, not just locally), with no bandwidth choke points, achieving high performance regardless of location.

20 Design Objectives for Agility
Security and performance isolation: any server can be part of any service, but services are sufficiently isolated from one another to maintain a high level of security, so that one service has no impact on another, e.g. from denial-of-service attacks or configuration errors.

21 Geo-Distribution
Goal: maximize performance, with high speed and low latency, because performance directly affects revenue. Google reported a 20% revenue loss caused by a 500 ms delay in displaying search results. Similarly, Amazon reported a 1% sales decrease caused by an additional 100 ms of delay. These are strong motivations for building geographically distributed DCs to reduce delays and increase revenue.

22 Placement
Optimal placement and sizing of datacenters. Diverse locations reduce latency between the DC and its clients and help with redundancy, since it is unlikely that all areas lose power at the same time. Size is determined by local demand, physical constraints, and network cost, so as to maximize the benefits.

23 Geo-Distributing: Resilience at the System Level
Allow an entire DC to fail, eliminating expensive infrastructure costs such as UPS systems and generators. Turning geo-diversity into geo-redundancy requires applications to be distributed across sites, frameworks to support them, and a balance between communication cost and service performance. To recap: geo-diversity reduces latency and increases performance; geo-redundancy replicates data across sites so that a failed datacenter can be masked at the system level.

24 Cost Saving Approaches
There are several cost saving approaches: architectural redesigns (of networks, devices, or infrastructure); maximizing datacenter utilization, e.g. with energy-aware load balancing algorithms; minimizing electricity cost, e.g. with energy cost-aware routing schemes; improvements to DC power; virtualization; new cooling technologies; and multi-core servers.

25 Internet-Scale Systems
Now I will talk about an optimization approach, proposed by Qureshi and co-authors, that reduces electricity cost for Internet-scale systems. First, let me characterize such systems: large distributed systems with request routing and replication built in; able to serve millions of users concurrently; composed of tens or even hundreds of sites; fault tolerant, masking the failure of hardware or even a whole datacenter; able to map clients to servers dynamically and to replicate data at multiple sites if necessary.

26 Energy Elasticity
Assumption: elastic clusters, where the energy consumed by a cluster depends on the load placed on it. Ideally a cluster would consume no power in the absence of load; in reality it consumes about 60% of peak. This assumption is essential: without adequate elasticity the approach does not work. With it, savings can be achieved by routing power demand away from high-priced areas, turning off under-utilized components there, and activating only the minimum number of servers needed. The key idea is to turn the system's energy elasticity into energy savings; a simple model of this is sketched below.
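One common way to write down this assumption is a linear power model. A sketch: the 60% idle figure comes from the slide, while the linear shape is an assumption.

```python
def cluster_power_watts(load, peak_watts, idle_fraction=0.60):
    """Assumed linear energy-elasticity model: an idle cluster still
    draws ~60% of peak, and consumption grows linearly with load."""
    assert 0.0 <= load <= 1.0, "load is a utilization fraction"
    return peak_watts * (idle_fraction + (1.0 - idle_fraction) * load)

print(cluster_power_watts(0.0, 1000))   # 600.0 W with no work at all
print(cluster_power_watts(1.0, 1000))   # 1000.0 W at full load
```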

27 Energy Cost-aware Routing
System requirements: full replication; clusters with energy elasticity; electricity prices with temporal and geographic disparity. The energy cost-aware routing policy maps client requests to clusters so that the total electricity cost of the system is minimized under certain constraints. It is applicable to both large and small systems.

28 Price Variation
Geographic: US electricity markets differ regionally, with different generation sources (coal, natural gas, nuclear power, etc.) and different taxes. Temporal: in real-time markets, prices are calculated every 5 minutes and are volatile. So prices vary over both location and time.

29 Constraints
Latency: high service performance requires low client latency, e.g. mapping a client's request to a cluster within a maximum radial geographic distance. Bandwidth: usage varies over time and across the network, and additional cost applies when a limit is exceeded.

30 Simulation
Data: hourly electricity prices (Jan 2006 - Mar 2009); an Akamai workload data set from public clusters in 18 US cities; no sufficient network distance information, so only coarse measurements. Routing schemes: Akamai's original allocation, and the price-conscious optimizer. Akamai has two types of clusters, private and public: specific customers are served by private clusters and the rest by public ones, and this experiment focuses on the public clusters only.

31 Price-conscious Optimizer
Map a client to the cluster with the lowest price within some predefined maximum radial distance; consider another cluster if the selected cluster is nearing its capacity; map a client to the closest cluster when no cluster falls within the maximum radial distance, considering any other nearby clusters. The policy is controlled by two parameters: a price-differential threshold (the minimum price difference that justifies rerouting) and a distance threshold (the maximum radial geographic distance). A sketch follows.
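A hypothetical sketch of this policy; the `Cluster` fields and default thresholds are illustrative, not Qureshi et al.'s actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Cluster:
    name: str
    price: float         # current electricity price at the cluster's location
    distance_km: float   # distance from the client
    near_capacity: bool  # approaching its bandwidth cap?

def route(clusters, max_radius_km=1500.0, price_threshold=5.0):
    """Pick the cheapest cluster within the radius, but only reroute
    away from the closest cluster if the saving covers the threshold."""
    nearby = [c for c in clusters if c.distance_km <= max_radius_km]
    if not nearby:
        # no cluster within the radius: fall back to the closest overall
        return min(clusters, key=lambda c: c.distance_km)
    closest = min(nearby, key=lambda c: c.distance_km)
    for c in sorted(nearby, key=lambda c: c.price):   # cheapest first
        if c.near_capacity:
            continue          # skip clusters nearing their bandwidth cap
        if c is closest or closest.price - c.price >= price_threshold:
            return c
    return closest
```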

32 Simulation Results
Reduced energy cost: by at least 2% without any increase in bandwidth costs or significant reduction in performance; by 30% with relaxed bandwidth constraints; by around 13% with strict bandwidth constraints (the bandwidth constraint was set at the 95th percentile of capacity). Without bandwidth constraints, a dynamic solution (no distance constraint) beat a static solution (placing all servers in the cheapest market), with 45% versus 35% savings, perhaps because of routing costs. The simulation results are very appealing.

33 Cons
Only applicable to locations with temporal and spatial electricity price variations; increased routing energy; added delay that reduces client performance; possibly increased bandwidth cost; and high implementation complexity.

34 VL2
VL2 is a practical network architecture, developed by Microsoft Research, that supports agility. It has three main features. First, uniform high capacity between servers, i.e. high server-to-server connectivity, the opposite of the poor connectivity of conventional datacenters mentioned earlier: traffic flow should be limited only by the network-interface cards, not by the architecture of the network. Second, performance isolation between services: the traffic of one service should not be affected by the traffic of any other, just as if each service were connected by its own physical switch, hence "Virtual Layer 2".

35 VL2
Last, Ethernet layer-2 semantics: flat addressing allows services to be placed anywhere, and load balancing spreads traffic uniformly across the DC, just as if the servers were on a LAN, where any IP address can be connected to any port of an Ethernet switch. A server can be configured with whatever IP address its service expects, removing physical boundaries.

36 VL2 Addressing Scheme
To achieve agility, VL2 separates server names from locations using two address families: topologically significant Locator Addresses (LAs) and flat Application Addresses (AAs). All switches and routers carry LAs and run a link-state protocol to spread topology information across the network. Each server's AA is associated with an LA, and a directory system stores the AA-to-LA mapping; the mapping is created when servers are provisioned to a service and assigned whatever AA the service expects. With these two address families, link-state routing is maintained while the problem of physical boundaries is eliminated. A toy lookup is sketched below.
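A toy sketch of the directory lookup; the addresses are hypothetical, and the real VL2 directory system is a replicated, performance-tuned service rather than a dictionary.

```python
# AA -> LA mapping, created when a server is provisioned to a service.
directory = {
    "10.0.0.7": "172.16.3.1",   # application address -> locator address
    "10.0.0.8": "172.16.9.1",
}

def resolve(application_address: str) -> str:
    """A shim on the sending server queries the directory and tunnels
    the packet to the returned locator address (the destination's
    top-of-rack switch); AAs never need to match physical topology."""
    return directory[application_address]

print(resolve("10.0.0.7"))   # 172.16.3.1
```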

37 FORTE
FORTE: Flow Optimization based framework for request-Routing and Traffic Engineering. The idea is similar to cost-aware routing, but the goal is to reduce carbon emissions. A DC's carbon emissions depend on the electricity fuel mix of its region. FORTE dynamically controls the user traffic directed to each DC by weighting each request's effect on three metrics: access latency, carbon footprint, and electricity cost. A toy scoring sketch follows.
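In spirit, the per-request weighting might look like the sketch below. The datacenter names, metric values, and weights are all made up, and the actual framework solves a flow-optimization problem over all requests rather than scoring them one at a time.

```python
# Toy per-datacenter metrics: latency, carbon intensity, electricity price.
DATACENTERS = {
    "dc_coal":  {"latency_ms": 40, "g_co2_per_kwh": 900, "usd_per_mwh": 40},
    "dc_hydro": {"latency_ms": 70, "g_co2_per_kwh": 20,  "usd_per_mwh": 55},
}

def pick_datacenter(alpha, beta, gamma):
    """Score each DC on the three metrics; operators tune the weights
    to trade performance against carbon footprint and cost."""
    def score(m):
        return (alpha * m["latency_ms"]
                + beta * m["g_co2_per_kwh"]
                + gamma * m["usd_per_mwh"])
    return min(DATACENTERS, key=lambda name: score(DATACENTERS[name]))

print(pick_datacenter(1.0, 0.0, 0.0))   # latency only -> dc_coal
print(pick_datacenter(1.0, 0.1, 0.0))   # carbon weighted in -> dc_hydro
```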

38 FORTE
FORTE allows operators to balance performance against cost and carbon footprint by applying linear programming to the user-assignment problem, and then determines whether data replication or migration to the selected DC is needed. Results: it reduces carbon emissions by 10% without increasing mean latency or the electricity bill.

39 TIVC
TIVC: Time-Interleaved Virtual Clusters. Problems addressed: the current resource reservation model provisions only CPU and memory, ignoring networking needs, so services may compete for bandwidth and performance suffers; and most cloud applications have time-varying bandwidth requirements. TIVC is a new virtual network abstraction that specifies the time-varying network requirements of cloud applications, increasing the utilization of both network resources and VMs, and with them the overall utilization of the datacenter, ultimately lowering its cost. An illustration follows.
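The abstraction can be pictured as a time-indexed bandwidth profile; the sketch below is a hypothetical illustration, not the paper's actual syntax.

```python
from dataclasses import dataclass

@dataclass
class BandwidthWindow:
    start_s: int   # offset from job start, in seconds
    end_s: int
    mbps: float    # bandwidth reserved during this window

# TIVC-style reservation for a MapReduce-like job: modest bandwidth
# most of the time, with a burst reserved only around the shuffle.
reservation = [
    BandwidthWindow(0,   600,  100.0),   # map phase
    BandwidthWindow(600, 900,  800.0),   # shuffle burst
    BandwidthWindow(900, 1800, 100.0),   # reduce phase
]
```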

40 TIVC
Compared to virtual clusters (VC), TIVC reduces completion time significantly. The graph shows the time to complete all 5,000 jobs under VC and TIVC: TIVC reduces completion time by 41%, 20%, 23%, and 34% for sort, join, aggregation, and mixed jobs respectively.

41 Energy Storage Devices
Different types of energy storage devices (ESDs): lead-acid batteries (commonly used in DCs), with fair energy efficiency but low power cost; ultra-capacitors (UC), with very long cycle life and high energy efficiency; Compressed Air Energy Storage (CAES), where compressing air consumes energy and the pressurized air is later used to generate electricity, with very high power cost; and flywheels (gaining acceptance in DCs), whose rotating momentum can serve high power needs. They offer different trade-offs between power cost, energy cost, lifetime, and energy efficiency. Hybrid combinations may be more effective, and different ESDs can be placed at different levels of the power hierarchy according to their advantages.

42 Lyapunov Optimization
An online control algorithm that minimizes the time-averaged cost. It uses the existing UPS to store electricity when prices are low and draw it down when prices are high. It does not suffer from the "curse of dimensionality" the way dynamic programming does, requires no knowledge of the system statistics, and is easy to implement because it reuses existing devices. A toy version of the underlying idea appears below.
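A greatly simplified illustration of the store-low/draw-high idea. This is a naive fixed-threshold heuristic, not the actual Lyapunov control policy, which adapts online without fixed price thresholds.

```python
def ups_action(price_usd_mwh, stored_kwh, capacity_kwh,
               low=30.0, high=60.0):
    """Charge the UPS batteries when grid power is cheap and serve
    part of the load from them when it is expensive."""
    if price_usd_mwh <= low and stored_kwh < capacity_kwh:
        return "charge"
    if price_usd_mwh >= high and stored_kwh > 0:
        return "discharge"
    return "idle"

print(ups_action(25.0, stored_kwh=100, capacity_kwh=500))   # charge
print(ups_action(80.0, stored_kwh=100, capacity_kwh=500))   # discharge
```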

43 Summary
We have discussed several approaches: maximizing datacenter utilization and minimizing electricity cost; architectural redesign of the datacenter, network, and servers (e.g. VL2 and TIVC); geo-redundancy to mask the failure of a datacenter instead of building redundancy at the hardware level; and optimization of resources. Optimization algorithms such as cost-aware routing and FORTE that steer client requests are among the most cost-effective solutions, but they require routing and replication support, and the trade-offs are hard to balance: an approach that works in one situation may not work in another. Reducing cost ultimately requires coordination across servers, OS, VMs, applications, and facilities, with every part of the system working together. Trends: high demand for low-end servers to lower hardware cost, given the low utilization of datacenters; electricity costs coming to dominate TCO; power and energy efficiency making a big difference. Again, the key to lowering the actual cost of a datacenter is to increase its overall utilization.

44 Review

45 TCO Comparisons

                     Case A     Case B     Case C
DC amortization      $0.104     $0.104     $0.208
DC interest          $0.093     $0.093     $0.186
DC opex              $0.040     $0.040     $0.080
Server amortization  $0.556     $0.111     $0.111
Server interest      $0.109     $0.022     $0.022
Server opex          $0.028     $0.006     $0.006
Server power         $0.033     $0.054     $0.054
PUE overhead         $0.033     $0.053     $0.053
Total ($/W/month)    $0.996     $0.483     $0.720
3-yr TCO             $10,757    $8,702     $12,968

First, a review of the TCO case studies. The interest rate, datacenter cost, and PUE are the same in all three cases (the model's PUE of 2.0 makes the PUE overhead roughly equal to the server power cost). The differences: case A uses high-end servers costing $6,000 each, with electricity at 6 cents per kilowatt-hour; case B uses mid-range servers, cheaper and faster at $2,000 but with higher power consumption, and electricity at 10 cents per kilowatt-hour; case C is case B at only 50% occupancy. Since datacenter cost is measured in dollars per watt of critical power, a half-filled datacenter doubles the datacenter-related cost per watt. Case B has the lowest 3-yr TCO and case C the highest, because it is half filled. We can see how the utilization of a datacenter affects its total cost.

46 TCO Breakdown

                     Case A     Case B     Case C
DC amortization      10.46%     21.55%     28.92%
DC interest           9.32%     19.21%     25.77%
DC opex               4.02%      8.27%     11.10%
Datacenter subtotal     24%        49%        66%
Server amortization  55.78%     22.98%     15.42%
Server interest      10.92%      4.50%      3.02%
Server opex           2.79%      1.15%      0.77%
Server subtotal         69%        29%        19%
Server power          3.36%     11.17%      7.50%
PUE overhead          3.36%     11.17%      7.50%
Power subtotal           7%        22%        15%
Total                  100%       100%       100%

Now a closer look at how hardware cost and occupancy shift the breakdown. In case A the server-related cost dominates datacenter ownership cost. With mid-range servers in case B, the server-related share drops from 69% to 29%; consequently the datacenter-related share rises from 24% to 49%, and the electricity share rises to 22% because mid-range servers consume more power. With the partially filled datacenter of case C, the datacenter-related share completely dominates, resulting in a higher overall ownership cost.

47 Geo-Redundancy
In the case of a datacenter failure, requests can be directed to a different datacenter. Requirements: data replication across sites, and special software and frameworks to support it. Pros: eliminates the cost of infrastructure redundancy and increases system reliability. Cons: expensive inter-datacenter communication. The trade-off is reliability versus communication costs: instead of hardware redundancy in each datacenter, data is replicated across datacenters.

48 Energy Elasticity
Assumption: elastic clusters, where the energy consumed by a cluster depends on the load placed on it. Ideally a cluster consumes no power in the absence of load; in reality it consumes about 60% of peak. Savings come from routing power demand away from high-priced areas and turning off under-utilized components. As mentioned earlier, electricity cost varies over location and time: if we direct requests away from high-priced areas, the servers there become idle and can even be turned off, while only the minimum number of servers needed is activated in low-priced areas. The main idea is to turn the system's energy elasticity into energy savings.

49 Cost-aware Routing: Case 1
Map a client to the cluster with the lowest price within some predefined maximum radial distance; consider another cluster if the selected one is approaching its capacity. (Diagram: client A; distance threshold = 1500 km; price threshold = $5; clusters and electricity prices C1: 50, C2: 40, C3: 35, C4: 43, with C3 outside the radius.) Why a distance threshold? It bounds the maximum delay. Considering another cluster when the selected one approaches its bandwidth capacity keeps the bandwidth cost within its allowance. The system reroutes a request only if the price difference between the closest cluster and the cheapest cluster within the radius exceeds the $5 threshold; this balances the cost of rerouting, so the threshold should be set high enough to cover rerouting and other overhead. In this example the system chooses C2 if it is not approaching its bandwidth limit, otherwise C4. It never considers C3, despite its price of 35, because C3 lies outside the maximum distance. The earlier sketch reproduces this, as shown below.
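Running the `route()` sketch from slide 31 on this example (the distances are hypothetical, chosen to match the diagram):

```python
clusters = [
    Cluster("C1", price=50.0, distance_km=800.0,  near_capacity=False),
    Cluster("C2", price=40.0, distance_km=1200.0, near_capacity=False),
    Cluster("C3", price=35.0, distance_km=1700.0, near_capacity=False),
    Cluster("C4", price=43.0, distance_km=1400.0, near_capacity=False),
]
print(route(clusters).name)   # C2: cheapest within 1500 km, saves $10 >= $5 vs C1
# If C2 were near its bandwidth cap, C4 would win (a $7 saving, still >= $5).
# C3 is never considered: despite the lowest price, it lies outside the radius.
```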

50 Cost-aware Routing: Case 2
Map a client to the closest cluster when no cluster falls within the maximum radial distance, and consider other clusters nearby (within 50 km of it). (Diagram: client B; distance threshold = 1500 km; price threshold = $5; clusters C3: 35 and C4: 43, both beyond the radius.) In this example the system chooses C3; if C3 is nearing its capacity, it considers C4 only if C4 is within 50 km.

51 Simulation Results
Reduced system energy cost: by at least 2% without any increase in bandwidth costs or significant reduction in performance; for a fully elastic system, by 30% with relaxed bandwidth constraints and by around 13% with strict ones. These are reductions for the entire system's energy cost. Now, focusing on the datacenter, assume it can achieve a 30% energy cost reduction, whether through this approach or others. Call this case D: case B with a 30% energy cost reduction.

52 TCO Comparisons

                     Case A     Case B     Case C     Case D
DC amortization      $0.104     $0.104     $0.208     $0.104
DC interest          $0.093     $0.093     $0.186     $0.093
DC opex              $0.040     $0.040     $0.080     $0.040
Server amortization  $0.556     $0.111     $0.111     $0.111
Server interest      $0.109     $0.022     $0.022     $0.022
Server opex          $0.028     $0.006     $0.006     $0.006
Server power         $0.033     $0.054     $0.054     $0.038
PUE overhead         $0.033     $0.053     $0.053     $0.037
Total ($/W/month)    $0.996     $0.483     $0.720     $0.451
3-yr TCO             $10,757    $8,702     $12,968    $8,118

This demonstrates how energy cost affects TCO: with energy cost reduced by 30% in case D, TCO falls by about 6.7% compared with case B, and the 3-yr TCO drops to $8,118.

53 95th Percentile Metering
A billing method for bandwidth: sample the traffic every 5 minutes, then sort and rank the samples collected over the billing cycle. Typically, bandwidth is charged at the 95th percentile of usage; this is one of the most common billing methods in datacenters. The network provider samples the datacenter's inbound and outbound traffic every 5 minutes, measuring its Mbit/s over the billing cycle; around 8,000 samples are collected, sorted by bit rate, and ranked. The data in the graph do not correspond to a real datacenter; they just illustrate the method, with each bin representing one sample.

54 95th Percentile Metering
Bandwidth is measured at the 95th percentile of usage, allowing occasional bursts beyond the committed base rate as long as they stay within the top 5% of samples. Cost is determined by the value of the 95th-percentile sample, not by the area under the curve. In this graph the samples are sorted and ranked, and the measured bandwidth is 6 Mbps. For example, if the allowance is 8 Mbps, a reading of 10 Mbps will not trigger a higher bill; you are safe as long as the 95th percentile stays at or below the allowance. The key point is that cost depends on the 95th-percentile sample rather than on total volume, which means more bandwidth can be used for data processing at no extra cost. A small sketch of the computation follows.
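A minimal sketch of the billing computation, assuming the common convention of discarding the top 5% of five-minute samples and billing at the highest remaining rate:

```python
import math
import random

def billable_mbps(samples_mbps):
    """Sort the 5-minute samples and bill at the 95th-percentile value;
    the top 5% of samples (occasional bursts) are free."""
    ranked = sorted(samples_mbps)
    idx = math.ceil(0.95 * len(ranked)) - 1   # 95th-percentile sample
    return ranked[idx]

# ~8,640 samples in a 30-day month; a few 10 Mbps bursts do not raise
# the bill as long as they stay within the top 5% of samples.
random.seed(0)
samples = [random.uniform(2, 6) for _ in range(8640)]
samples[:300] = [10.0] * 300          # 300 burst samples (~3.5% of the month)
print(billable_mbps(samples))          # just under 6 Mbps, not 10
```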

55 Smoothing Resource Consumption
Set prices that vary with resource availability; differentiate demands by urgency; shift workload to change inefficient usage patterns. Since we are charged on the 95th percentile of usage rather than actual throughput, the suggested usage pattern is the one in red. It is a bit extreme, but the point is this: you are billed at the 95th percentile anyway, so you can pack the bins more evenly to use all the bandwidth you pay for, or even shift workload to lower the measured bandwidth, though the latter is not recommended because it affects the utilization of other resources, such as servers. You can set prices according to resource availability to encourage more usage at night, and differentiate demands by urgency so that workloads can be prioritized and shifted into an efficient usage pattern.

56 Cost Optimization
There are many approaches to cost optimization for a datacenter. Hardware: eliminate hardware redundancy; use mid-range servers instead of high-end ones. Resource management: minimize electricity cost with optimization algorithms; prioritize workloads; shift workload to create an efficient usage pattern and make full use of bandwidth.

57 References
Barroso, Luiz André, and Urs Hölzle. "The datacenter as a computer: An introduction to the design of warehouse-scale machines." Synthesis Lectures on Computer Architecture 4.1 (2009).
Barroso, Luiz André, and Urs Hölzle. "TCO calculations for case studies in Chapter 6." <pub?key=phRJ4tNx2bFOHgYskgpoXAA&output=xls>
Greenberg, Albert, et al. "The cost of a cloud: research problems in data center networks." ACM SIGCOMM Computer Communication Review 39.1 (2008).
Qureshi, Asfandyar, et al. "Cutting the electric bill for internet-scale systems." ACM SIGCOMM Computer Communication Review 39.4 (2009).

58 References
Koomey, Jonathan. Growth in Data Center Electricity Use 2005 to 2010. Oakland, CA: Analytics Press. August 1.
Greenberg, Albert, et al. "VL2: a scalable and flexible data center network." ACM SIGCOMM Computer Communication Review 39.4 (2009).
Gao, Peter Xiang, et al. "It's not easy being green." ACM SIGCOMM Computer Communication Review 42.4 (2012).
Xie, Di, et al. "The only constant is change: incorporating time-varying network reservations in data centers." ACM SIGCOMM Computer Communication Review 42.4 (2012).

59 References
Wang, Di, et al. "Energy storage in datacenters: what, where, and how much?" ACM SIGMETRICS Performance Evaluation Review 40.1 (2012).
Urgaonkar, Rahul, et al. "Optimal power cost management using stored energy in data centers." Proceedings of the ACM SIGMETRICS Joint International Conference on Measurement and Modeling of Computer Systems. ACM, 2011.
Semaphore Corporation. "95th percentile bandwidth metering explained and analyzed." Web. April.
Higginbotham, Stacey. "Data center rivals Facebook and Google pump $700M in new construction into Iowa." GigaOM, 23 April 2013. Web. 23 May.

