Self-Scaling Benchmarks
Peter Chen and David Patterson, "A New Approach to I/O Performance Evaluation – Self-Scaling I/O Benchmarks, Predicted I/O Performance," SIGMETRICS 1993.
© 2003, Carla Ellis

Workloads (diagram, © 2003, Carla Ellis): experimental environments span prototype / real system, execution-driven simulation, trace-driven simulation, and stochastic simulation. Workloads divide into "real" (live workload, benchmark applications) and made-up (micro-benchmark programs, synthetic benchmark programs, traces, synthetic traces, distributions & other statistics), linked by monitor, analysis, and generator steps over data sets. A "You are here" marker shows where self-scaling benchmarks sit in this space.

© 2003, Carla Ellis
Goals
A benchmark that automatically scales across current and future systems
– It dynamically adjusts to the system under test
Predicted performance based on self-scaling evaluation results
– Estimate performance for unmeasured workloads
– Basis for comparing different systems

© 2003, Carla Ellis
Characteristics of an Ideal I/O Benchmark
A benchmark should:
1. Help in understanding why – isolate the reasons for poor performance
2. Be I/O limited
3. Scale gracefully
4. Allow fair comparisons among machines
5. Be relevant to a wide range of applications
6. Be tightly specified, reproducible, and explicitly state its assumptions
Current benchmarks fail these criteria.

© 2003, Carla Ellis
Overview of Approach
Step 1: Scaling – the benchmark automatically explores the workload space to find a relevant workload.
– Because the chosen workload depends on the system under test, the ability to compare systems on raw benchmark results is lost.
Step 2: The predicted-performance scheme restores that capability.
– The accuracy of the predictions must be validated.

© 2003, Carla Ellis
Workload Parameters
uniqueBytes – total size of data accessed
sizeMean – average size of an I/O request
– Individual request sizes are drawn from a normal distribution
readFrac – fraction of requests that are reads; the fraction of writes is 1 – readFrac
seqFrac – fraction of requests that are sequential accesses
– With multiple processes, each process has its own thread of accesses
processNum – concurrency (number of processes issuing requests)
The workload is a user-level program driven by these parameter settings.
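To make the parameter definitions concrete, here is a minimal Python sketch (not the authors' code) of a workload generator driven by these five parameters; the file handling, the standard deviation of the request-size distribution, and the fixed request count are assumptions for illustration.

```python
# Illustrative synthetic I/O workload driven by the five parameters
# (uniqueBytes, sizeMean, readFrac, seqFrac, processNum). Sketch only.
import os
import random
from multiprocessing import Process

def worker(path, unique_bytes, size_mean, read_frac, seq_frac, requests=1000):
    fd = os.open(path, os.O_RDWR | os.O_CREAT)
    offset = 0
    for _ in range(requests):
        # Request sizes drawn from a normal distribution around sizeMean
        # (standard deviation chosen arbitrarily for this sketch).
        size = max(512, int(random.gauss(size_mean, size_mean / 4)))
        # seqFrac of requests continue sequentially; the rest seek to a
        # random offset within the uniqueBytes of data.
        if random.random() >= seq_frac:
            offset = random.randrange(0, max(1, unique_bytes - size))
        os.lseek(fd, offset, os.SEEK_SET)
        if random.random() < read_frac:
            os.read(fd, size)
        else:
            os.write(fd, b"\0" * size)
        offset += size
        if offset + size > unique_bytes:
            offset = 0  # wrap around to stay within uniqueBytes
    os.close(fd)

def run_workload(path, unique_bytes, size_mean, read_frac, seq_frac, process_num):
    # processNum: each process issues its own thread of accesses.
    procs = [Process(target=worker,
                     args=(path, unique_bytes, size_mean, read_frac, seq_frac))
             for _ in range(process_num)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```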

Representativeness Does such a synthetic workload have the “right” set of parameters to capture a real application (characterized by its values for that set of parameters)?

Benchmarking Results
A set of performance graphs, one for each parameter, obtained by varying that parameter while holding all other parameters fixed at their focal-point values.
– Focal points: the 75% performance point
– Found by an iterative search process (sketched below)
More of the workload space is explored, but dependencies among parameters are not captured.
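A sketch, under stated assumptions, of the kind of iterative search that could locate a focal value; the bisection strategy, the monotonicity assumption, and the hypothetical measure_throughput callable are not taken from the paper.

```python
def find_focal_value(measure_throughput, lo, hi, target_frac=0.75, iters=10):
    """Bisection search for a parameter value whose throughput is roughly
    target_frac of the best throughput seen at the low end of [lo, hi].
    Assumes throughput falls monotonically as the parameter grows (e.g.
    uniqueBytes beyond the cache size). measure_throughput(value) is a
    hypothetical callable that runs the workload and returns MB/s."""
    target = target_frac * measure_throughput(lo)
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if measure_throughput(mid) > target:
            lo = mid  # still above the target: move right
        else:
            hi = mid  # below the target: move left
    return (lo + hi) / 2.0
```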

(Graph) focal point = (21 MB, 10 KB, 0, 1, 0)

© 2003, Carla Ellis
Families of Graphs
General applicability – representative across the range of each parameter (the rationale for the 75% point)
Multiple performance regions – especially evident for uniqueBytes because of storage-hierarchy effects
– Focal points on a region border are unstable
– Prefer mid-range focal points within a region

(Graph annotations) The uniqueBytes graphs show two performance regions, cache and disk. Larger requests perform better; reads are better than writes; sequential access helps in one region and has little effect in the other.
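As an illustration of keeping focal points off a region border, here is a small heuristic sketch (not the paper's method) that splits a uniqueBytes sweep at its largest throughput drop and picks mid-region values.

```python
def mid_region_focal(sweep):
    """Given a uniqueBytes sweep as a list of (uniqueBytes, throughput) pairs
    sorted by uniqueBytes, split it at the largest throughput drop (taken as
    the cache/disk border) and return a mid-region value for each region,
    away from the unstable border. Illustrative heuristic only."""
    drops = [(sweep[i][1] - sweep[i + 1][1], i) for i in range(len(sweep) - 1)]
    _, knee = max(drops)  # index just before the largest drop

    def middle(region):
        return region[len(region) // 2][0]

    cache_region = sweep[:knee + 1]
    disk_region = sweep[knee + 1:]
    return middle(cache_region), middle(disk_region)
```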

© 2003, Carla Ellis
Predicted Performance
Problem: the workload the benchmark chooses differs between two systems, so their results cannot be compared directly.
Solution: estimate performance for unmeasured workloads, so a common set of benchmark workloads can be used for comparisons.

© 2003, Carla Ellis
How to Predict
Assume the shape of the performance curve for one parameter is independent of the values of the other parameters.
Use the self-scaling benchmark to measure performance with all but one parameter fixed at its focal point.

Solid lines show measured performance: sizeMean fixed at its focal value S_f in the left graph, processNum fixed at its focal value P_f in the right graph.
Predict the throughput curve for sizeMean = S_1 by assuming the ratio
Throughput(processNum, S_1) / Throughput(processNum, S_f)
is the same for every processNum; its value is known at processNum = P_f from the right-hand graph.
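A minimal sketch of this constant-ratio prediction; the function name and the dictionary layout of the measured curve are assumptions for illustration.

```python
def predict_curve(measured_focal, tp_at_pf_new, tp_at_pf_focal):
    """Predict Throughput(processNum, S1) for every processNum.
    measured_focal : dict processNum -> Throughput(processNum, Sf), the curve
                     measured with sizeMean at its focal point (left graph)
    tp_at_pf_new   : Throughput(Pf, S1), read from the right-hand graph
    tp_at_pf_focal : Throughput(Pf, Sf), the same graph at the focal sizeMean
    Assumes Throughput(p, S1) / Throughput(p, Sf) is the same for every p,
    i.e. the shape of the curve does not depend on sizeMean."""
    ratio = tp_at_pf_new / tp_at_pf_focal
    return {p: tp * ratio for p, tp in measured_focal.items()}
```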

Accuracy of Predictions
For a SPARCstation with 1 disk, predictions were checked against measurements at random points in the parameter space.
Prediction error is correlated with uniqueBytes.
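A sketch of how such an accuracy check might be scripted, sampling random workload points and reporting relative error; the parameter ranges, the error metric, and the predict/measure callables are assumptions, not the paper's procedure.

```python
import random

def prediction_errors(predict, measure, param_ranges, samples=20):
    """Sample random workload points from param_ranges (dict of
    name -> (lo, hi)) and return the relative error of the prediction,
    |predicted - measured| / measured, at each sampled point.
    predict and measure are hypothetical callables taking a parameter dict."""
    errors = []
    for _ in range(samples):
        point = {name: random.uniform(lo, hi)
                 for name, (lo, hi) in param_ranges.items()}
        predicted, measured = predict(point), measure(point)
        errors.append(abs(predicted - measured) / measured)
    return errors
```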

Comparisons

For Discussion Next Thursday (because of snow)
Survey the types of workloads – especially the standard benchmarks – used in your proceedings (10 papers). is a great resource.
© 2003, Carla Ellis

Continued discussion of reinterpreting an experimental paper in terms of the strong-inference model.