SpotADAPT: Spot-Aware (re-)Deployment of Analytical Processing Tasks on Amazon EC2 by Dalia Kaulakiene, Aalborg University (Denmark) Christian Thomsen,


1 SpotADAPT: Spot-Aware (re-)Deployment of Analytical Processing Tasks on Amazon EC2
by
- Dalia Kaulakiene, Aalborg University (Denmark)
- Christian Thomsen, Aalborg University (Denmark)
- Torben Bach Pedersen, Aalborg University (Denmark)
- Ugur Çetintemel, Brown University (USA)
- Tim Kraska, Brown University (USA)

2 Amazon Web Services EC2 cloud
Pricing model        Contract                      Price per hour (*)
Reserved instances   1-year or 3-year contract     $0.0581
On-demand            No contract                   $0.128
Spot instances       No contract, can be revoked   $0.0365 (**)
(*) c3.large instance type (Linux) in the ap-northeast-1 region
(**) Average price over 1 week, Mar 23 - Mar 30, 2015

3 Amazon Spot market
The user bids for a machine:
- "I need an 8-vCPU machine in region A"
- "The maximum I will pay is $0.5 per hour"
If the spot price < $0.5:
- The user gets an instance
If (and when) the spot price > $0.5:
- AWS takes back the instance
[Figure: plot with series labeled ap-northeast-1c and ap-northeast-1a]
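The bidding rule on this slide can be sketched in a few lines. This is a minimal illustration, not AWS's actual mechanism: the function name `first_revocation` and the price trace are hypothetical, and the case where the spot price equals the bid (which the slide does not cover) is assumed to keep the instance.

```python
from typing import Optional, Sequence


def first_revocation(prices: Sequence[float], bid: float) -> Optional[int]:
    """Return the index of the first observed price above the bid.

    Returns None if the price never exceeds the bid, i.e. the instance
    is kept for the whole trace. Assumption: price == bid keeps the
    instance, since the slide only covers the strict cases.
    """
    for i, price in enumerate(prices):
        if price > bid:
            return i
    return None


# Hypothetical spot prices in USD per hour; not real AWS data.
trace = [0.04, 0.05, 0.30, 0.62, 0.10]
print(first_revocation(trace, bid=0.5))
```

With a bid of $0.5 the instance in this made-up trace is held until the fourth observation, where the price rises to $0.62.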

4 Problems
The user needs to execute an analytical workload on spot instances:
- Hadoop job
- Data in Amazon S3
Problem 1. Execution time is unknown
- AWS organizes instances into 9 families with 4-5 instance types each:
  - General purpose: T2, M4, M3
  - Compute optimized: C4, C3
  - Memory optimized: R3
  - GPU: G2
  - Storage optimized: I2, D2
Problem 2. Execution cost is unknown (and varies)
- 7 regions, 2-4 availability zones in each
- Spot prices change in real time

5 SpotADAPT
- Estimates execution time on AWS instances
- Estimates execution price in AWS regions
- Proposes deployment with optimization goals:
  - Fastest execution within budget, or
  - Cheapest execution within time constraints
- Monitors execution
- Proposes re-deployment if:
  - The instance is taken away by AWS
  - A cheaper or faster deployment is available

6 Execution time estimation
1. Dataset size increase effect
- Increasing (sampled) input size
- Executing on the same machine
- More micro-runs do not improve accuracy!
- SpotADAPT needs only a few micro-runs to estimate the execution time for the large dataset
[Figure: AWS instance family, from slowest machine (2 vCPUs) through more powerful machine (4 vCPUs), …(.. vCPUs), to most powerful machine (32 vCPUs); plot for Wordcount]
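One way to picture the dataset size effect is to fit a curve through a few micro-run timings on a single machine and extrapolate to the full input. The sketch below assumes the relationship is linear, which the slides do not state; the function names and the timings are hypothetical.

```python
from typing import Sequence, Tuple


def fit_line(sizes_gb: Sequence[float], times_s: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of time = slope * size + intercept."""
    n = len(sizes_gb)
    mean_x = sum(sizes_gb) / n
    mean_y = sum(times_s) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes_gb)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes_gb, times_s))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_time(size_gb: float, slope: float, intercept: float) -> float:
    """Extrapolate the fitted line to a larger input size."""
    return slope * size_gb + intercept


# Hypothetical micro-runs on the same machine: (sample size in GB, seconds).
slope, intercept = fit_line([1.0, 2.0, 4.0], [70.0, 130.0, 250.0])
print(round(predict_time(50.0, slope, intercept)))
```

Three micro-runs are enough to determine the line here, which matches the slide's point that a few micro-runs suffice; whether a linear model fits a given workload is a separate question.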

7 Execution time estimation (cont.)
2. Scale-up effect
- Increasing machine power (# vCPUs)
- Using the same dataset
- SpotADAPT takes 1 micro-dataset, executes the workload on a few instance types in the family, and estimates the time for the large dataset on all instances
3. Combine
- Estimate execution time on all machines using the large dataset
[Figure: plot for Wordcount]

8 SpotADAPT flow

9 SpotADAPT. Step 1
- Hadoop job
- Data: bucket in AWS S3
- Optimization goals:
  - Cheapest execution within time boundaries, or
  - Fastest execution within budget boundaries

10 SpotADAPT. Step 2
Setup: prepare data for micro-runs
- For data size effect estimation
- For scale-up effect estimation
Execute micro-runs for each AWS instance family:
- On the base instance type, for the data size effect
- On other instance types using one micro-dataset, for the scale-up effect

11 SpotADAPT. Step 2 (cont.)
Execution time estimation:
- Data size effect: execution time (slowest instance, large dataset)
- Scale-up effect: scale factor (time on slower instance / time on 2x more powerful instance)
- Combining both:
  - Execution time (slowest instance, large dataset)
  - Execution time (2x instance, large dataset)
  - …
  - Execution time (most powerful instance, large dataset)
[Figure: AWS instance family, from slowest machine (2 vCPUs) through more powerful machine (4 vCPUs), …(.. vCPUs), to most powerful machine (32 vCPUs)]
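The combination step can be read as dividing the slowest instance's estimated time by successive scale factors. The sketch below follows the slide's definition of the scale factor (time on the slower instance divided by time on the instance with twice the power); the function name and the numbers are hypothetical, and it assumes a factor is available for every step up the family, whereas the slides say only a few instance types are actually measured.

```python
from typing import List, Sequence


def estimate_family_times(base_time_s: float, scale_factors: Sequence[float]) -> List[float]:
    """Estimate large-dataset times for every instance in a family.

    base_time_s: estimated time on the slowest instance (data size effect).
    scale_factors[i]: time on instance i / time on instance i+1, where
    instance i+1 has twice the vCPUs (scale-up effect).
    """
    times = [base_time_s]
    for factor in scale_factors:
        # A factor of 2.0 means doubling the vCPUs halves the time.
        times.append(times[-1] / factor)
    return times


# Hypothetical values: 1 hour on the slowest machine, three 2x steps.
for seconds in estimate_family_times(3600.0, [2.0, 1.8, 1.5]):
    print(f"{seconds:.0f} s")
```

Factors below 2.0 model the diminishing returns that are common when a job does not scale perfectly with the number of vCPUs.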

12 SpotADAPT. Step 2 (cont.)
Execution price estimation:
- For each instance family
  - For each instance type in the family
    - For each region
      - For each availability zone
        - For on-demand
        - For spot (assuming start time = current time)
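The nested enumeration above amounts to pricing every (instance type, region, zone, pricing model) combination. The following sketch assumes the execution price is simply the estimated duration multiplied by the current hourly price, ignoring billing granularity and setup cost; the `Candidate` class, the function name, and the 4-hour estimate are hypothetical. The two hourly prices are the c3.large figures from slide 2, with the weekly spot average standing in for the current spot price.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Key: (instance type, region, availability zone, pricing model).
PriceKey = Tuple[str, str, str, str]


@dataclass(frozen=True)
class Candidate:
    instance_type: str
    region: str
    zone: str
    pricing: str  # "on-demand" or "spot"
    hours: float
    cost_usd: float


def estimate_costs(est_hours: Dict[str, float],
                   hourly_prices: Dict[PriceKey, float]) -> List[Candidate]:
    """Price every combination for which a time estimate exists."""
    candidates = []
    for (itype, region, zone, pricing), price in hourly_prices.items():
        if itype not in est_hours:
            continue  # no time estimate for this instance type
        hours = est_hours[itype]
        # Assumption: cost = duration * current hourly price.
        candidates.append(Candidate(itype, region, zone, pricing, hours, hours * price))
    return candidates


est_hours = {"c3.large": 4.0}  # hypothetical time estimate
prices = {
    ("c3.large", "ap-northeast-1", "ap-northeast-1a", "on-demand"): 0.128,
    ("c3.large", "ap-northeast-1", "ap-northeast-1a", "spot"): 0.0365,
}
for c in estimate_costs(est_hours, prices):
    print(c.pricing, round(c.cost_usd, 3))
```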

13 SpotADAPT. Step 3
Initial deployment
Choose the best combination of:
- AWS region, zone
- Instance type
- Pricing model
For fastest execution:
1. Choose the fastest instance
2. Find a deployment whose execution cost is within the budget
3. If nothing is found, choose the second fastest instance and repeat
For cheapest execution:
1. Choose the cheapest deployment
2. If the execution time exceeds the deadline, choose the second best deployment and repeat
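Both selection procedures above are ordered scans over the candidate deployments. A minimal sketch, assuming each candidate already carries an estimated duration and cost; the `Deployment` class, the function names, and all numbers are hypothetical, and ties are broken in favor of the cheaper or faster option, which the slide does not specify.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Deployment:
    name: str
    hours: float
    cost_usd: float


def fastest_within_budget(options: List[Deployment], budget_usd: float) -> Optional[Deployment]:
    """Scan from fastest to slowest; return the first one within budget."""
    for d in sorted(options, key=lambda d: (d.hours, d.cost_usd)):
        if d.cost_usd <= budget_usd:
            return d
    return None  # no deployment fits the budget


def cheapest_within_deadline(options: List[Deployment], deadline_hours: float) -> Optional[Deployment]:
    """Scan from cheapest to most expensive; return the first one meeting the deadline."""
    for d in sorted(options, key=lambda d: (d.cost_usd, d.hours)):
        if d.hours <= deadline_hours:
            return d
    return None  # no deployment meets the deadline


# Hypothetical candidates; names, times, and costs are made up.
options = [
    Deployment("large-instance/spot", 1.0, 0.60),
    Deployment("medium-instance/spot", 3.0, 0.30),
    Deployment("small-instance/spot", 9.0, 0.20),
    Deployment("small-instance/on-demand", 9.0, 1.15),
]
print(fastest_within_budget(options, budget_usd=0.5))
print(cheapest_within_deadline(options, deadline_hours=6.0))
```

Returning `None` corresponds to the case where no deployment satisfies the constraint, for example a budget that is too small for any candidate.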

14 SpotADAPT. Step 3 (cont.)
Adaptive (re-)deployment:
- When the instance is taken back by Amazon (out-of-bid re-deployment)
- When prices in the current region increase
- When prices in another region decrease
- Aligned with optimization goals
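For the cheapest-execution goal, the re-deployment triggers listed above could be reduced to a single comparison of remaining costs. This is only a sketch of one plausible decision rule, not the rule SpotADAPT uses: the function name, its parameters, and the idea of charging a fixed switching cost are all assumptions.

```python
def should_redeploy(revoked: bool,
                    remaining_cost_current: float,
                    remaining_cost_alternative: float,
                    switch_cost: float = 0.0) -> bool:
    """Decide whether to move the job to another deployment.

    revoked: True if AWS has taken the instance back (out-of-bid).
    remaining_cost_current: estimated cost to finish where the job runs now.
    remaining_cost_alternative: estimated cost to finish on the best alternative.
    switch_cost: hypothetical overhead of moving (e.g. setup on the new instance).
    """
    if revoked:
        return True  # the job cannot continue on the current instance
    # Covers both triggers: current prices rising or other prices falling.
    return remaining_cost_alternative + switch_cost < remaining_cost_current


print(should_redeploy(False, remaining_cost_current=1.0,
                      remaining_cost_alternative=0.6, switch_cost=0.2))
```

Including a switching cost keeps the rule from moving the job for marginal savings that the move itself would consume.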

15 Simulation
Workloads:
- Wordcount
- Selfjoin
Spot price traces:
- Jan 8, 2015 – April 8, 2015
- 9 AWS regions, 21 availability zones in total
Strategies:
- Fastest execution: SpotADAPT, Oracle time, Fast compute, Fast mem
- Cheapest execution: SpotADAPT, Oracle time, Oracle time+price, Cheap vCPU
[Figure labels grouping the strategies: default, oracle]

16 Results (Fastest execution)
[Figure: results for Wordcount and Selfjoin with Budget <= $0.1 and Budget <= $0.5; annotations: "Default strategies FAIL", "SpotADAPT == Oracle"]

17 Results (Cheapest execution)
[Figure: panels labeled Wordcount (Deadline 9.5h), Selfjoin (Deadline 6h), and Deadline 1h; annotations: "SpotADAPT == Oracle", "Cheap vCPU fails 60% of times", "SpotADAPT is 0.3% more expensive"]

18 Results – Adaptive re-deployment
[Figure: labels "Initial deployment" and "Re-deployment"]

19 Summary
SpotADAPT:
- Estimates execution time on AWS instances
  - Only a few micro-runs on some instances in the family
- Estimates execution price in AWS regions
  - Using the most recent price is as good as knowing all future prices
- Proposes deployment with optimization goals:
  - Fastest execution within budget, or
  - Cheapest execution within time constraints
- Monitors execution
- Proposes re-deployment if:
  - The instance is taken away by AWS
  - A cheaper or faster deployment is available

20 Thank you!
daliak@cs.aau.dk

21 Future work
- More diverse workloads
- Larger input datasets

22 Backup slides
Setup time:
- For 50GB Wordcount, for 80GB Selfjoin
- Setup time is ~15% of execution time on the slowest machine
Setup price:
- Setup price is ~50% of execution price on the on-demand market

