Wes Lloyd, Shrideep Pallickara, Olaf David, James Lyon, Mazdak Arabi, Ken Rojas November 6, 2012 Colorado State University, Fort Collins, Colorado USA.

1 Wes Lloyd, Shrideep Pallickara, Olaf David, James Lyon, Mazdak Arabi, Ken Rojas. November 6, 2012. Colorado State University, Fort Collins, Colorado USA. UCC 2012: 5th IEEE/ACM International Conference on Utility and Cloud Computing

2 Nov 6, 2012 IEEE/ACM UCC 2012 Performance Modeling to Support Multi-Tier Application Deployment to Infrastructure-as-a-Service Clouds
Outline
- Background
- Research Problem
- Research Questions
- Experimental Setup
- Experimental Results
- Conclusions

4 Traditional Application Deployment
Single Server:
- App Server: Apache Tomcat
- Business Services
- Data: Spatial DB, rDBMS, DODB / NoSQL
- Logging, Tracking DB
- Object Store

5 Diagram labels: Object Store, Geospatial DB, rDBMS, NoSQL DB, File Server, Services, Distributed Cache, Logging Server, Apache Tomcat

6 Application Component Deployment
Diagram labels: Application Components, Component Deployment, Application "Stack", Virtual Machine (VM) Images, PERFORMANCE
- Application components: App Server, rDBMS r/o, rDBMS write, File Server, Log Server, Load Balancer, Dist. cache
- Image 1: App Server, File Server, Log Server, rDBMS write
- Image 2: ...
- Image n: rDBMS r/o, Load Balancer, Dist. cache

7 Application Deployments
n = # components; k = # components per set
- Permutations: P(n, k) = n! / (n - k)!
- Combinations: C(n, k) = n! / (k! (n - k)!)
But neither describes partitions of a set!

8 Bell's Number
Number of ways a set of n elements can be partitioned into non-empty subsets.
Model component deployment: n = # components, k = # configs; each VM in the application "stack" hosts 1..n components.
Components: M = Model, D = Database, F = File Server, L = Log Server, ...
VM deployments: config 1: MD F L; config 2: M D F L; ...; config n: ML F D
# of configurations:
n | k
4 | 15
5 | 52
6 | 203
7 | 877
8 | 4,140
9 | 21,147
n | ...
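
The configuration counts on this slide are Bell numbers, so they can be checked with a few lines of code. The following is a minimal sketch using the Bell triangle; the function name `bell_numbers` is chosen here for illustration and does not come from the presentation.

```python
def bell_numbers(n_max):
    """Return [B(0), B(1), ..., B(n_max)] computed with the Bell triangle."""
    bells = [1]   # B(0) = 1: the empty set has exactly one partition
    row = [1]     # first row of the Bell triangle
    for _ in range(n_max):
        # Each new row starts with the last entry of the previous row.
        new_row = [row[-1]]
        # Every further entry is the entry to its left plus the entry above-left.
        for value in row:
            new_row.append(new_row[-1] + value)
        bells.append(new_row[0])
        row = new_row
    return bells


if __name__ == "__main__":
    bells = bell_numbers(9)
    for n in range(4, 10):
        print(f"n = {n}: {bells[n]} possible component deployments")
```

The count grows quickly: four components give 15 deployments, while nine components already give 21,147, which is why exhaustively testing every deployment stops being practical.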

10 Problem Statement
How should application components be deployed to VMs?
- Provide high throughput (requests/sec)
- With low resource costs (# of VMs)
To guide VM image composition
Avoid resource contention from interfering components

11 Diagram labels: VMs, Physical Machine (PM), Resources, PERFORMANCE, Resource Contention, Resource Surplus

12 Resource Utilization Statistics (PM and VM)
CPU
- CPU time
- CPU time in user mode
- CPU time in kernel mode
- CPU idle time
- # of context switches
- CPU time waiting for I/O
- CPU time serving soft interrupts
- Load average (# proc / 60 secs)
Disk
- Disk sector reads
- Disk sector reads completed
- Merged adjacent disk reads
- Time spent reading from disk
- Disk sector writes
- Disk sector writes completed
- Merged adjacent disk writes
- Time spent writing to disk
Network
- Network bytes sent
- Network bytes received

13 Can Resource Utilization Statistics Model Application Performance?

15 Research Questions
RQ1) Which resource utilization statistics are the best predictors?
RQ2) How should resource utilization data be treated for use in models?
RQ3) Which modeling techniques are best for predicting application performance and ranking performance of service compositions?

17 RUSLE2 Model
- "Revised Universal Soil Loss Equation"
- Combines empirical and process-based science
- Prediction of rill and interrill soil erosion resulting from rainfall and runoff
- USDA-NRCS agency standard model
- Used by 3,000+ field offices
- Helps inventory erosion rates
- Sediment delivery estimation
- Conservation planning tool

18 RUSLE2 Web Service
- Multi-tier client/server application
- RESTful, JAX-RS/Java using JSON objects
- Surrogate for common architectures
Components:
- App Server: Apache Tomcat, OMS3, RUSLE2
- Geospatial rDBMS: PostgreSQL, PostGIS, 1.7+ million shapes
- File Server: nginx, 57k XML files, 305 MB
- Logging: Codebeamer

19 Eucalyptus 2.0 Private Cloud
- (9) Sun X6270 blade servers
- Dual Intel Xeon 4-core 2.8 GHz CPUs
- 24 GB RAM, 146 GB 15k rpm HDDs
- CentOS 5.6 x86_64 (host OS)
- Ubuntu 9.10 x86_64 (guest OS)
- Eucalyptus 2.0
- Amazon EC2 API support
- 8 Nodes (NC), 1 Cloud Controller (CLC, CC, SC)
- Managed mode networking with private VLANs
- Xen hypervisor v3.4.3, paravirtualization

20 RUSLE2 Components
Virtual Machine | Description
M Model | Apache Tomcat 6.0.20, Wine 1.0.1, RUSLE2 Model, Object Modeling System (OMS 3.0)
D Database | PostgreSQL 8.4 and PostGIS 1.4.0-2. Soil data: 1.7 million shapes, 167 million points. Management data: 98 shapes, 489k points. Climate data: 31k shapes, 3 million points. 4.6 GB for the state of TN.
F File Server | nginx http server 0.7.62. 57,185 XML files consisting of 305 MB.
L Logger | Codebeamer 5.5 running 32-bit Apache Tomcat 6.0. Custom REST/JSON logging service as wrapper.

21 (15) Tested Component Deployments
- Each VM deployed to separate physical machine
- All components installed on composite image
- Script enabled/disabled components to achieve configs
Service compositions (letters written together share a VM): SC1 M D F L; SC2 M D F L; SC3 M DF L; SC4 M DFL; SC5 MD F L; SC6 MD FL; SC7 L MDF; SC8 MDF L; SC9 MD LF; SC10 M FD L; SC11 M FDL; SC12 M LD F; SC13 M LDF; SC14 M D L F; SC15 M L F D
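
The 15 tested deployments correspond to the 15 ways of partitioning the four components M, D, F, and L across VMs. As a rough sketch (the helper `set_partitions` is a hypothetical name, and the enumeration order below is not the SC1 to SC15 numbering used in the slides), the partitions can be listed like this:

```python
def set_partitions(items):
    """Yield every partition of `items` into non-empty groups (one group per VM)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Place `first` into each existing group in turn...
        for i, group in enumerate(partition):
            yield partition[:i] + [[first] + group] + partition[i + 1:]
        # ...or give it a group (VM) of its own.
        yield [[first]] + partition


components = ["M", "D", "F", "L"]
deployments = list(set_partitions(components))

if __name__ == "__main__":
    for deployment in deployments:
        # Letters joined together share a VM; spaces separate VMs.
        print(" ".join("".join(group) for group in deployment))
    print(len(deployments), "deployments")
```

Each printed line uses the same convention as the slide: components written together are hosted on one VM, and each space-separated group is a separate VM.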

22 RUSLE2 Application Variants
D-bound (join w/ a nested query):
- Model 21%
- Database 77%
- File I/O 0.75%
- Overhead 1%
- Logging 0.1%
M-bound (standard model):
- Model 73%
- Database 1%
- File I/O 18%
- Overhead 8%
- Logging 1%

23 Figure: Resource Utilization Variance for Component Deployments. Magnitude of variance for deployments SC1 to SC15; boxes represent absolute deviation from mean. Panels: CPU time, disk sector reads, disk sector writes, net bytes rcv'd, net bytes sent.

24 Tested Resource Utilization Variables
CPU
- CPU time
- CPU time in user mode (cpu usr)
- CPU time in kernel mode (cpu krn)
- CPU idle time (cpu_idle)
- # of context switches (contextsw)
- CPU time waiting for I/O (cpu_io_wait)
- CPU time serving soft interrupts (cpu_sint_time)
- Load average (loadavg) (# proc / 60 secs)
Disk
- Disk sector reads (dsr)
- Disk sector reads completed (dsreads)
- Merged adjacent disk reads (drm)
- Time spent reading from disk (readtime)
- Disk sector writes (dsw)
- Disk sector writes completed (dswrites)
- Merged adjacent disk writes (dwm)
- Time spent writing to disk (writetime)
Network
- Network bytes sent (nbr)
- Network bytes received (nbs)

25 Experimental Data Collection
- (15) RUSLE2 deployments (e.g. SC1 M D F L, SC5 MD F L, SC8 MDF L, SC11 M FDL, SC14 M D L F)
- 20x ensembles of 100 random runs (JSON object)
- Resource utilization data captured by script
- 1st run → training dataset
- 2nd run → test dataset

27 RQ1: Which are the best predictors? VM Variables
Chart panels: CPU, Disk I/O, Network I/O

28 RQ1: Which are the best predictors? PM Variables
Chart panels: CPU, Network I/O

29 RQ2: How should VM resource utilization data be used by performance models?
- Combination: RU_data = RU_M + RU_D + RU_F + RU_L
- Used individually: RU_data = {RU_M; RU_D; RU_F; RU_L}
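
To make the two treatments concrete, here is a small sketch of one possible reading: "combination" sums each statistic across the four VMs, while "used individually" keeps every VM's statistics as separate predictors. The sample numbers, the choice of three statistics, and the function names `combine` and `individual` are all illustrative assumptions, not values or code from the study.

```python
# Hypothetical resource utilization samples for one ensemble of model runs.
# Each vector holds (CPU time, disk sector reads, network bytes sent) for one VM.
ru = {
    "M": [5200, 120, 40000],
    "D": [1800, 950, 62000],
    "F": [300, 400, 91000],
    "L": [150, 60, 8000],
}


def combine(ru_by_vm):
    """RU_data = RU_M + RU_D + RU_F + RU_L: sum each statistic across VMs."""
    return [sum(column) for column in zip(*ru_by_vm.values())]


def individual(ru_by_vm, order=("M", "D", "F", "L")):
    """RU_data = {RU_M; RU_D; RU_F; RU_L}: keep per-VM statistics as separate predictors."""
    return [value for vm in order for value in ru_by_vm[vm]]


ru_combined = combine(ru)        # 3 predictors, one per statistic
ru_individual = individual(ru)   # 12 predictors, one per statistic per VM

if __name__ == "__main__":
    print("combined:  ", ru_combined)
    print("individual:", ru_individual)
```

The combined form gives a model fewer predictors to fit, while the individual form preserves which VM consumed which resource.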

30 RQ2: How should VM resource utilization data be used by performance models?
Chart panels: D-bound separate, D-bound combined, M-bound separate, M-bound combined
- Treating VM data separately for D-bound was better!
- RU_M or RU_MDFL for M-bound was better!
- Note the larger RMSE for D-bound RU_MDFL!

31 RQ3: Which modeling techniques were best?
- Multiple Linear Regression (MLR)
- Stepwise Multiple Linear Regression (MLR-step)
- Multivariate Adaptive Regression Splines (MARS)
- Artificial Neural Networks (ANNs)
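
As an illustration of the simplest technique in this list, the sketch below fits a multiple linear regression by solving the normal equations in plain Python. It is not the tooling used in the study; the predictor names (`cpu_usr`, `dsr`), the synthetic data, and the function names `fit_mlr` and `predict` are assumptions made for the example.

```python
def fit_mlr(X, y):
    """Ordinary least squares: solve (X^T X) b = X^T y for the coefficients b."""
    rows = [[1.0] + list(r) for r in X]   # prepend a column of ones for the intercept
    p = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(A[k][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for k in range(p):
            if k != col:
                factor = A[k][col] / A[col][col]
                A[k] = [a - factor * b for a, b in zip(A[k], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def predict(coeffs, x):
    """Apply the fitted model: intercept plus weighted sum of predictors."""
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], x))


# Synthetic training data: two hypothetical predictors (cpu_usr, dsr) and a
# performance value generated exactly as 2 + 3 * cpu_usr - 1 * dsr.
X_train = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8)]
y_train = [2 + 3 * a - 1 * b for a, b in X_train]
coeffs = fit_mlr(X_train, y_train)

if __name__ == "__main__":
    print("intercept and slopes:", [round(c, 6) for c in coeffs])
    print("prediction for (6, 4):", round(predict(coeffs, (6, 4)), 6))
```

Because the synthetic data is noise-free, the fit recovers the generating coefficients; real resource utilization data would leave a residual error, which is what the RMSE figures in the results measure.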

32 RQ3: Which modeling techniques were best?
Chart panels: Multiple Linear Regression, Stepwise MLR, Multivariate Adaptive Regression Splines, Artificial Neural Network
- RU_MDFL data used to compare models.
- Had high RMSE test error for D-Bound (32% avg)
- Model performance did not vary much
Best vs. Worst | D-Bound | M-Bound
RMSE train | .11% | .08%
RMSE test | .89% | .08%
rank err | .40 | .66
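
The two evaluation measures on this slide can be sketched as follows. RMSE is the standard root-mean-square error. The slide does not define "rank err", so the version below, the mean absolute difference between predicted and observed rank positions, is only an assumed interpretation, and the sample values are invented for illustration.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between predicted and observed values."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def ranks(values):
    """Rank positions (1 = smallest value); ties are broken by input order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def mean_rank_error(predicted, observed):
    """Assumed definition: average absolute difference in rank position."""
    pairs = zip(ranks(predicted), ranks(observed))
    return sum(abs(rp - ro) for rp, ro in pairs) / len(predicted)


# Hypothetical performance values for four service compositions.
observed = [10.0, 12.0, 11.0, 15.0]
predicted = [10.5, 11.0, 11.5, 14.0]

if __name__ == "__main__":
    print("RMSE:", round(rmse(predicted, observed), 4))
    print("mean rank error:", mean_rank_error(predicted, observed))
```

A model can have a noticeable RMSE and still rank deployments correctly, which is why the two measures are reported separately.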

34 Conclusions
RQ1) CPU statistics were the best predictors.
RQ2) The best treatment of resource utilization statistics was model specific.
- RU_MDFL best for M-Bound RUSLE2 (more I/O)
- Individual VM stats (e.g. RU_M) best for D-Bound RUSLE2 (more CPU)
RQ3) ANN and MARS provided lower RMS error. All models adequately predicted performance and ranks.

37 Approaches & Gaps: Gaps in Related Work
Existing approaches do not consider:
- VM image composition
- Complementary component placements
- Interference among components
- Minimization of resources (# VMs)
- Load balancing of physical resources
Performance models ignore:
- Disk I/O
- Network I/O
- VM and component location

38 Problems & Challenges: Infrastructure Management
- Scale Services
- Tune Application Parameters
- Tune Virtualization Parameters
Diagram labels: Service Requests, Load Balancer, Application Servers, distributed cache, noSQL data stores, rDBMS

39 Problems & Challenges: Provisioning Variation
- Request(s) to launch VMs
- Ambiguous mapping of VMs to physical hosts
- VMs reserve PM memory blocks
- VMs share PM CPU / disk / network
- PERFORMANCE

40 Application Profiling Variables: Predictive Power

41 Application Deployment Challenges
VM image composition
- Service isolation vs. scalability
- Resource contention among components
Provisioning variation
- Across physical hardware

42 Resource Utilization Statistics
VMs
- Reserve PM memory
- Share CPU, disk, and network I/O resources
VM application performance
- Reflects quality of load balancing of shared resources
- Resource contention → performance degradation
- Resource surplus → good performance, higher costs

43 Resource Utilization Variables (P = physical machine, V = virtual machine)
P/V | Statistic | Description
P/V | CPU time | CPU time in ms
P/V | cpu usr | CPU time in user mode in ms
P/V | cpu krn | CPU time in kernel mode in ms
P/V | cpu_idle | CPU idle time in ms
P/V | contextsw | Number of context switches
P/V | cpu_io_wait | CPU time waiting for I/O to complete
P/V | cpu_sint_time | CPU time servicing soft interrupts
V | dsr | Disk sector reads (1 sector = 512 bytes)
V | dsreads | Number of completed disk reads
V | drm | Number of adjacent disk reads merged
V | readtime | Time in ms spent reading from disk
V | dsw | Disk sector writes (1 sector = 512 bytes)
V | dswrites | Number of completed disk writes
V | dwm | Number of adjacent disk writes merged
V | writetime | Time in ms spent writing to disk
P/V | nbr | Network bytes sent
P/V | nbs | Network bytes received
P/V | loadavg | Avg # of running processes in last 60 sec

44 Experimental Data
Script captured resource utilization stats
- Virtual machines
- Physical machines
Training data: first complete run
- 20 different ensembles of 100 model runs
- 15 component configurations
- 30,000 model runs
Test data: second complete run
- 30,000 model runs
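
The dataset sizes follow directly from the experimental design, as the short calculation below shows. The constant names are chosen here for illustration.

```python
ENSEMBLES = 20              # different ensembles per configuration
RUNS_PER_ENSEMBLE = 100     # model runs in each ensemble
CONFIGURATIONS = 15         # tested component deployments (SC1 to SC15)

# Model runs in one complete pass over all configurations.
runs_per_dataset = ENSEMBLES * RUNS_PER_ENSEMBLE * CONFIGURATIONS

# Ensembles executed in one complete pass.
ensembles_per_dataset = ENSEMBLES * CONFIGURATIONS

# Training (first pass) plus test (second pass).
total_runs = 2 * runs_per_dataset

if __name__ == "__main__":
    print("model runs per dataset:", runs_per_dataset)
    print("ensembles per dataset:", ensembles_per_dataset)
    print("model runs overall:", total_runs)
```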

