Introduction Background Knowledge Workload Modeling Related Work Conclusion & Future Work Modeling Many-Task Computing Workloads on a Petaflop Blue Gene.

Slides:



Advertisements
Similar presentations
Doc.: IEEE /1216r1 Submission November 2009 BroadcomSlide 1 Internet Traffic Modeling Date: Authors: NameAffiliationsAddressPhone .
Advertisements

1 Continuous random variables Continuous random variable Let X be such a random variable Takes on values in the real space  (-infinity; +infinity)  (lower.
Doc.: IEEE /1387r0 Submission Nov Yan Zhang, et. Al.Slide 1 HEW channel modeling for system level simulation Date: Authors:
A Dynamic World, what can Grids do for Multi-Core computing? Daniel Goodman, Anne Trefethen and Douglas Creager
Bag-of-Tasks Scheduling under Budget Constraints Ana-Maria Oprescu, Thilo Kielman Presented by Bryan Rosander.
Network and Service Assurance Laboratory Analysis of self-similar Traffic Using Multiplexer & Demultiplexer Loaded with Heterogeneous ON/OFF Sources Huai.
Computer Science Generating Streaming Access Workload for Performance Evaluation Shudong Jin 3nd Year Ph.D. Student (Advisor: Azer Bestavros)
Silberschatz, Galvin and Gagne  2002 Modified for CSCI 399, Royden, Operating System Concepts Operating Systems Lecture 19 Scheduling IV.
All Hands Meeting, 2006 Title: Grid Workflow Scheduling in WOSE (Workflow Optimisation Services for e- Science Applications) Authors: Yash Patel, Andrew.
LHC Collimation Working Group – 19 December 2011 Modeling and Simulation of Beam Losses during Collimator Alignment (Preliminary Work) G. Valentino With.
ANALYZING STORAGE SYSTEM WORKLOADS Paul G. Sikalinda, Pieter S. Kritzinger {psikalin, DNA Research Group Computer Science Department.
GlobeTraff A traffic workload generator for the performance evaluation of ICN architectures K.V. Katsaros, G. Xylomenos, G.C. Polyzos A.U.E.B. (presented.
CMPT 855Module Network Traffic Self-Similarity Carey Williamson Department of Computer Science University of Saskatchewan.
On the Self-Similar Nature of Ethernet Traffic - Leland, et. Al Presented by Sumitra Ganesh.
The Organic Grid: Self- Organizing Computation on a Peer-to-Peer Network Presented by : Xuan Lin.
Report on Intrusion Detection and Data Fusion By Ganesh Godavari.
October 14, 2002MASCOTS Workload Characterization in Web Caching Hierarchies Guangwei Bai Carey Williamson Department of Computer Science University.
OS Fall ’ 02 Performance Evaluation Operating Systems Fall 2002.
Network Traffic Measurement and Modeling CSCI 780, Fall 2005.
A Nonstationary Poisson View of Internet Traffic T. Karagiannis, M. Molle, M. Faloutsos University of California, Riverside A. Broido University of California,
Performance Evaluation
Lecture Slides Elementary Statistics Twelfth Edition
OS Fall ’ 02 Performance Evaluation Operating Systems Fall 2002.
July 13, “How are Real Grids Used?” The Analysis of Four Grid Traces and Its Implications IEEE Grid 2006 Alexandru Iosup, Catalin Dumitrescu, and.
Self-Similar through High-Variability: Statistical Analysis of Ethernet LAN Traffic at the Source Level Walter Willinger, Murad S. Taqqu, Robert Sherman,
1 10. Joint Moments and Joint Characteristic Functions Following section 6, in this section we shall introduce various parameters to compactly represent.
Bridge the gap between HPC and HTC Applications structured as DAGs Data dependencies will be files that are written to and read from a file system Loosely.
The Effects of Systemic Packets Loss on Aggregate TCP Flows Thomas J. Hacker May 8, 2002 Internet 2 Member Meeting.
Self-Similarity of Network Traffic Presented by Wei Lu Supervised by Niclas Meier 05/
1 Chapters 9 Self-SimilarTraffic. Chapter 9 – Self-Similar Traffic 2 Introduction- Motivation Validity of the queuing models we have studied depends on.
HeteroPar 2013 Optimization of a Cloud Resource Management Problem from a Consumer Perspective Rafaelli de C. Coutinho, Lucia M. A. Drummond and Yuri Frota.
1 Reasons for parallelization Can we make GA faster? One of the most promising choices is to use parallel implementations. The reasons for parallelization.
Self-Organizing Agents for Grid Load Balancing Junwei Cao Fifth IEEE/ACM International Workshop on Grid Computing (GRID'04)
Research on cloud computing application in the peer-to-peer based video-on-demand systems Speaker : 吳靖緯 MA0G rd International Workshop.
DataSys Laboratory Dr. Ioan Raicu Juan Carlos Hernández Munuera, MS 2011 Hui Jin, Tonglin Li Paper submission: –Ke Wang, Ioan Raicu. “SimMatrix: Exploring.
EVENT MANAGEMENT IN MULTIVARIATE STREAMING SENSOR DATA National and Kapodistrian University of Athens.
Chapter 4 – Modeling Basic Operations and Inputs  Structural modeling: what we’ve done so far ◦ Logical aspects – entities, resources, paths, etc. 
Emalayan Vairavanathan
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Review and Preview This chapter combines the methods of descriptive statistics presented in.
Traffic Modeling.
Extreme-scale computing systems – High performance computing systems Current No. 1 supercomputer Tianhe-2 at petaflops Pushing toward exa-scale computing.
1 Statistical Distribution Fitting Dr. Jason Merrick.
Ch5. Probability Densities II Dr. Deshi Ye
Gennaro Cordasco - How Much Independent Should Individual Contacts be to Form a Small-World? - 19/12/2006 How Much Independent Should Individual Contacts.
ICOM 6115: Computer Systems Performance Measurement and Evaluation August 11, 2006.
The Dirichlet Labeling Process for Functional Data Analysis XuanLong Nguyen & Alan E. Gelfand Duke University Machine Learning Group Presented by Lu Ren.
1 Self Similar Traffic. 2 Self Similarity The idea is that something looks the same when viewed from different degrees of “magnification” or different.
Ó 1998 Menascé & Almeida. All Rights Reserved.1 Part V Workload Characterization for the Web.
Chapter 3 System Performance and Models Introduction A system is the part of the real world under study. Composed of a set of entities interacting.
Social-Aware Stateless Forwarding in Pocket Switched Networks Soo-Jin SHIN
OPERATING SYSTEMS CS 3530 Summer 2014 Systems and Models Chapter 03.
MODELING THE SELF-SIMILAR BEHAVIOR OF PACKETIZED MPEG-4 VIDEO USING WAVELET-BASED METHODS Dogu Arifler and Brian L. Evans The University of Texas at Austin.
Scheduling MPI Workflow Applications on Computing Grids Juemin Zhang, Waleed Meleis, and David Kaeli Electrical and Computer Engineering Department, Northeastern.
ECE-7000: Nonlinear Dynamical Systems 2. Linear tools and general considerations 2.1 Stationarity and sampling - In principle, the more a scientific measurement.
User Mobility Modeling and Characterization of Mobility Patterns Mahmood M. Zonoozi and Prem Dassanayake IEEE Journal on Selected Areas in Communications.
CHARACTERIZING CLOUD COMPUTING HARDWARE RELIABILITY Authors: Kashi Venkatesh Vishwanath ; Nachiappan Nagappan Presented By: Vibhuti Dhiman.
1 Internet Traffic Measurement and Modeling Carey Williamson Department of Computer Science University of Calgary.
Simulation. Types of simulation Discrete-event simulation – Used for modeling of a system as it evolves over time by a representation in which the state.
1
Spark on Entropy : A Reliable & Efficient Scheduler for Low-latency Parallel Jobs in Heterogeneous Cloud Huankai Chen PhD Student at University of Kent.
OPERATING SYSTEMS CS 3502 Fall 2017
Chapter 5a: CPU Scheduling
Grid Computing.
Chapter 6: CPU Scheduling
Evaluation of Load Balancing Algorithms and Internet Traffic Modeling for Performance Analysis By Arthur L. Blais.
CPU Scheduling G.Anuradha
COMP60611 Fundamentals of Parallel and Distributed Systems
Network Traffic Modeling
Lecture 2 Part 3 CPU Scheduling
CPSC 641: Network Traffic Self-Similarity
Presentation transcript:

Introduction Background Knowledge Workload Modeling Related Work Conclusion & Future Work Modeling Many-Task Computing Workloads on a Petaflop Blue Gene / P Supercomputer2

Introduction Background Knowledge Workload Modeling Related Work Conclusion & Future Work Modeling Many-Task Computing Workloads on a Petaflop Blue Gene / P Supercomputer3

Workload analysis is important to evaluate the performance of large- scale distributed computer systems and software (e.g. job scheduling systems). Although real workloads (traces) are usually collected and they reflect reality, they have not yet become widely available. Workload models, which generate synthetic workloads, have a number of advantages over traces (easy to get, representative enough, and can evaluate performance at large scales). Simplified assumptions: fixed-interval, bulk, or Poisson job arrivals Main properties: pseudo-periodicity, long range dependence (LRD), and the “bag-of-tasks” behavior with strong temporal locality. Modeling Many-Task Computing Workloads on a Petaflop Blue Gene / P Supercomputer4

Bridge the gap between HPC and HTC Comes from the data driven model: move tasks to data Loosely coupled apps with HPC orientations Many activities coupled by distributed file system ops Many resources over short time periods Modeling Many-Task Computing Workloads on a Petaflop Blue Gene / P Supercomputer5

Falkon –Fast and Lightweight Task Execution Framework – cts/Falkon/index.htmlhttp://datasys.cs.iit.edu/proje cts/Falkon/index.html Swift –Parallel Programming System – wift/index.phphttp:// wift/index.php Modeling Many-Task Computing Workloads on a Petaflop Blue Gene / P Supercomputer6

Introduction Background Knowledge Workload Modeling Related Work Conclusion & Future Work Modeling Many-Task Computing Workloads on a Petaflop Blue Gene / P Supercomputer7

Bag-of-Task (BOT): applications that can be structured as a set of independent computational tasks are called bag-of-tasks. BOTs had been proved to be an effective approach for workload modeling. Self Similarity: refers to situation where a phenomenon has the same general characteristics at different scales. Long-range dependency (LRD): a phenomenon that relates to the rate of decay of statistical dependence, with the implication that this decays more slowly than an exponential decay, typically a power-like decay. Some self-similar processes may exhibit long-range dependence, but not all processes having long-range dependence are self-similar. Joint Distribution: describes the relationship between two or more variables, specifies the probability of different combinations of values. Autocorrelation: the cross-correlation of a signal with itself. Informally, it is the similarity between observations as a function of the time separation between them. Correlation coefficient (0, 1): measures the correlation between two variables, is defined as the covariance divided by the standard deviations. Modeling Many-Task Computing Workloads on a Petaflop Blue Gene / P Supercomputer8

Introduction Background Knowledge Workload Modeling Related Work Conclusion & Future Work Modeling Many-Task Computing Workloads on a Petaflop Blue Gene / P Supercomputer9

Another workload log is from running MTC applications on the 128-node Linux Cluster of ANL and U Chicago (36M tasks with 825 BOTs named as ANLUC workload). BOT partition rule: all jobs in the same BOT have exactly the same values with respect to the following attributes: user name, job name, and requested number of processors. General to Cloud: network communication overhead is dwarfed by the task length. Modeling Many-Task Computing Workloads on a Petaflop Blue Gene / P Supercomputer10 Real traces obtained from running MTC applications for 17 month period (173M tasks). Filtered out the logs to isolate workload on a 40K-node IBM Blue Gene/P supercomputer (34.8M tasks with the minimum runtime of 0 seconds, medium of 30 seconds, maximum runtime of seconds, average runtime of seconds, and standard deviation of ; partitioned into 1395 BOTs; named as BGP workload).

BOT size concerns how many tasks in the BOT. Runtime is the execution time of entire BOT, which is time period between the arrival of its first task, and the end of execution of its last task. CPU time means the total execution of all tasks in a BOT. It is calculated as summation of the execution time of each individual task in the BOT. Average execution time is calculated by CPU time divided by BOT size. In-Bag Inter-arrival is calculated as the time between the arrival of its first task, and the arrival of its last task. Dividing the In-Bag Inter- arrival by runtime, we get the arrival intensity during the execution. Throughput is calculated as BOT size divided by the runtime. Modeling Many-Task Computing Workloads on a Petaflop Blue Gene / P Supercomputer11

Modeling Many-Task Computing Workloads on a Petaflop Blue Gene / P Supercomputer12

Five random variable distributions, namely Generalized Pareto (GP), Weibull(wbl), Lognormal (logn), Gamma (gamma), and Exponential (expn), to fit the traces. These five distributions have been extensively used in workload analysis. Conclusion: GP distribution is the best fit; maybe not general enough, but similar method could be used. Modeling Many-Task Computing Workloads on a Petaflop Blue Gene / P Supercomputer13 BGP workload left; ANLUC workload right

Do burst exist, and for how long? Do the bursts have any periodicity pattern? Represent the BOT arrivals as a rate process, which characterizes LRD and the periodicity properties of the workload. These properties can be observed via the autocorrelation function of the BOT arrivals. Conclusion: BGP workload has strong monthly periodic pattern in this workload, and decays after 300 days; ANLUC workload has burst, and staged periodic patterns (lag 60 to lag 100, and lag 140 to lag 172). Modeling Many-Task Computing Workloads on a Petaflop Blue Gene / P Supercomputer14 BGP workload left; ANLUC workload right

BOT Runtime with respect to BOT Size Throughput Trend BOT Execution Time with respect to BOT Size BOT Execution Time with respect to BOT In-Bag Inter-Arrival Modeling Many-Task Computing Workloads on a Petaflop Blue Gene / P Supercomputer15

Modeling Many-Task Computing Workloads on a Petaflop Blue Gene / P Supercomputer16 BGP workload left; ANLUC workload right Conclusion: with the increase of the BOT size, the BOT runtime is also growing. What’s more, BGP workload has a lower growing speed compared to ANLUC workload, which is because the ANLUC workload has smaller BOT size.

Throughput has a relationship with respect to the BOT size, and the number of busy workers. Modeling Many-Task Computing Workloads on a Petaflop Blue Gene / P Supercomputer17 BGP workload left; ANLUC workload right Conclusion: as the BOT size increases, the throughput is also increasing, even though bigger BOT size increases the runtime a little bit; with the growing of number of workers, the throughput also increases because more workers are executing tasks.

Modeling Many-Task Computing Workloads on a Petaflop Blue Gene / P Supercomputer18 BGP workload left; ANLUC workload right Conclusion: the BOT CPU time is increasing as the BOT size grows; the average execution per task (CPU time / BOT size) always remains same at the same level with different BOT sizes. This means that the increasing speed of the BOT CPU time is as fast as the increasing speed of the BOT size.

In order to model the arrival pattern of the tasks inside a BOT, it is necessary to discuss the In- Bag inter-arrival pattern of each task in the BOT; We assume that the tasks in a BOT are arriving with a Uniform distribution. The arrival frequency can be calculated as dividing the BOT size by the In-Bag inter-arrival time. Modeling Many-Task Computing Workloads on a Petaflop Blue Gene / P Supercomputer19 BGP workload left; ANLUC workload right Conclusion: the BOT CPU time is increasing as the BOT In-Bag inter-arrival time increases. In addition, there exists a lower bound (thick blue lines) of BOT CPU. This is due to the scheduling capability of the framework (e.g. Falkon).

Introduction Background Knowledge Workload Modeling Related Work Conclusion & Future Work Modeling Many-Task Computing Workloads on a Petaflop Blue Gene / P Supercomputer20

Dumitrescu et al. covered the practical aspects of running workloads in large-scale grids. H. Li et al conducted a comprehensive statistical analysis of a variety of workload traces, which includes the workload patterns and the corresponding models with software. S.B. Lowen et al. described the job traffic as a stochastic fractal-based point process, based on which, H. Li modeled the job arrivals by modeling inter-arrival time processes with Markov modulated Poisson process (MMPP). Mallat et al. introduced a particular analysis-by-synthesis method called matching pursuit to model the Pseudo-Periodicity of the workloads. A. Iosup et al. found that the Weibull distribution is the best fit for modeling the inter- arrival times of BOT for workloads in the Grid environment. Modeling Many-Task Computing Workloads on a Petaflop Blue Gene / P Supercomputer21 All the research focuses on the traditional HPC and HTC applications under the Cluster and Grid environments; and in the meantime, we take some of their methods, such as statistics analysis of the job inter-arrival time, and apply them to model workflow-generated MTC workloads in supercomputers and clusters, and generalize to cloud.

Introduction Background Knowledge Workload Modeling Related Work Conclusion & Future Work Modeling Many-Task Computing Workloads on a Petaflop Blue Gene / P Supercomputer22

Extracted two groups of BOTs from the traces obtained from running MTC applications on a Blue Gene/P supercomputer and the ANLUC Cluster. Applied five random variable distributions to model the cumulative distribution function of the BOT workloads, and found that GP distribution is the most suitable one. Defined some BOT attributes, such as BOT size, runtime, CPU times, and inter-arrival time of BOT and examine the correlations among them. More realistic synthetic workloads could be created that can be used to study a variety of scheduling algorithms. Continue to collect more MTC workloads to be able to analyze more BOTs, and try to establish some representative models for different MTC workloads. Dependent tasks generated by the workflow system will also be investigated, and we plan to analyze the BOT behavior of these workloads in a Cloud environment. Modeling Many-Task Computing Workloads on a Petaflop Blue Gene / P Supercomputer23

More information: – – Contact: Questions? Modeling Many-Task Computing Workloads on a Petaflop Blue Gene / P Supercomputer24