
System Utilization Benchmark on the Cray T3E and IBM SP
Adrian Wong, Leonid Oliker, William Kramer, Teresa Kaltz, Therese Enright and David Bailey
National Energy Research Scientific Computing Center, Lawrence Berkeley National Laboratory

Scientific Supercomputer Workload
–Long-running batch jobs (hours)
–Typically 64 nodes per job
–Often a long list of queued jobs
–Job turnaround may be days

Motivations
–The ability to fully utilize a large computer is almost as important as the speed of the computer.
–Large capability mainframes rarely have idle cycles; need to maximize users' productivity.
–Need a way to measure potential day-to-day utilization.
–No metric to gauge configuration changes other than anecdote.
–Increased complexity of scheduling with parallel platforms.
Effective System Performance (ESP): a test to assess system capabilities and configuration effects on utilization.

Parallel Job Scheduling
–Optimization problem in packing with space (processor) and time constraints
–Dynamic situation
–Tradeoffs in turnaround, utilization & fairness

Scheduling Strategies
(Diagram: job queue, hole in the schedule, order of submission.)
–Best-Fit-First: scan the queue for the best fit; risks starvation of large jobs.
–First-Come-First-Serve: wait for the right-size hole; may idle the system; respects submission order.
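The two strategies on this slide can be sketched as queue-selection rules. This is an illustrative sketch, not the actual T3E or SP scheduler code, and the job names and sizes below are hypothetical:

```python
# Illustrative sketch of the two selection rules on the slide. A job is
# a (job_id, processors_requested) pair; free_procs is the current hole.

def first_come_first_serve(queue, free_procs):
    """Launch only the head of the queue; if it does not fit, wait.
    This may idle the system, but it respects submission order."""
    if queue and queue[0][1] <= free_procs:
        return queue[0]
    return None

def best_fit_first(queue, free_procs):
    """Scan the whole queue for the job that best fills the current
    hole; large jobs can starve because smaller ones keep fitting."""
    fitting = [job for job in queue if job[1] <= free_procs]
    if not fitting:
        return None
    return max(fitting, key=lambda job: job[1])

queue = [("A", 256), ("B", 64), ("C", 128)]
print(first_come_first_serve(queue, 128))  # None: head needs 256, system idles
print(best_fit_first(queue, 128))          # ('C', 128): best fit for the hole
```

The example makes the tradeoff concrete: with 128 processors free, FCFS idles the machine waiting for job A, while BFF fills the hole with job C but leaves A waiting indefinitely if small jobs keep arriving.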

Key OS System Capabilities
–Swapping / gang-scheduling
–Job migration / compaction
–Priority preemption
–Backfill
–Disjoint partitions
–Checkpoint / restart
–Dynamically adjustable queue structures

ESP Design Goals & Attributes
–Transferable metric(s) / valid comparisons
–Reproducible
–Easily interpreted results
–Portable
–Platform size and speed independent
–Captures the essence of a real workload
–Compact and easily distributed
–Easy to run (< 12 hours)
–Automated / no human intervention
–Focus on utilization / factor out CPU speed
–Test responsiveness & adaptability of the scheduler

ESP Design
–Start with a throughput test
–Profile of jobs determined by historical accounting data
–Find applications with appropriate size and time
–Use two full-configuration jobs to encapsulate a change of operational mode (e.g. interactive to batch)
–Submit jobs in three blocks in pseudo-random order
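The last step above, splitting the jobmix into three pseudo-randomly ordered blocks, can be sketched as follows. This is a hypothetical illustration of the idea, not the ESP harness itself; the job names, block count, and seed are assumptions, and the real mix is derived from NERSC accounting data:

```python
# Sketch: split a jobmix into three submission blocks in pseudo-random
# order. A fixed seed keeps the order reproducible across test runs,
# which is one of the ESP design goals.
import random

def esp_submission_blocks(jobs, n_blocks=3, seed=1234):
    rng = random.Random(seed)             # fixed seed -> reproducible order
    shuffled = jobs[:]
    rng.shuffle(shuffled)
    size = -(-len(shuffled) // n_blocks)  # ceiling division
    return [shuffled[i:i + size] for i in range(0, len(shuffled), size)]

jobs = [f"job{i}" for i in range(12)]
blocks = esp_submission_blocks(jobs)      # three blocks of four jobs each
```

Seeding the shuffle is the key design choice: the order looks random to the scheduler, yet every site running the test submits the identical sequence, so results remain comparable.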

ESP Test Schematic (figure): a timeline of < 12 hours showing blocks of regular jobs interleaved with full config #1, full config #2, an optional shutdown/reboot, and a final block of regular jobs (>10%); the vanilla (throughput) variant is shown for comparison.

Individual Applications in Jobmix

Jobmix Application Elapsed Times (figure): elapsed times on the T3E and SP, ordered by increasing partition size.

Platforms Tested
Cray T3E:
–512 processors
–450 MHz Alpha EV56
–Microkernel MPP OS
–NQS & Global Resource Manager
–Oversubscription possible
–BFF strategy with dynamic queue configurations
IBM SP:
–512 processors
–200 MHz Power3
–Semi-autonomous monolithic OSes
–LoadLeveler batch queues
–FCFS with backfill (backfill disabled in the first attempt)

T3E Chronology, with swap (figure)
–Insufficient work near the end of the run: the tail-end dilemma
–Starvation of large jobs
–Normalized = Elapsed / Theoretical Minimum
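The normalization used in these chronologies can be made concrete. The theoretical minimum elapsed time is the total work (processors times runtime, summed over all jobs) divided by the machine size, i.e. the wall-clock time of a perfectly packed schedule with no idle cycles. The job sizes, runtimes, and measured elapsed time below are hypothetical:

```python
# Sketch of the ESP normalization: Normalized = Elapsed / Theoretical Min.
# The theoretical minimum assumes perfect packing with zero idle cycles.

def theoretical_minimum(jobs, total_procs):
    """jobs: list of (processors, runtime_seconds) pairs."""
    work = sum(p * t for p, t in jobs)   # total processor-seconds of work
    return work / total_procs

jobs = [(64, 3600), (128, 1800), (512, 600)]   # hypothetical jobmix
tmin = theoretical_minimum(jobs, total_procs=512)  # 1500 s of perfect packing
elapsed = 1800.0                       # hypothetical measured wall-clock time
normalized = elapsed / tmin            # 1.2: the run took 20% longer than ideal
utilization = tmin / elapsed           # ~0.83; 1.0 would be a perfect schedule
```

A normalized value of 1.0 is unattainable in practice; the gap above 1.0 is exactly the idle and overhead time the ESP test is designed to expose.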

T3E Chronology, without swap (figure)
–Slight decrease in utilization without the swap capability
–Higher overall efficiency: significant overhead with swap

SP Chronology (figure)
–Waiting for the machine to idle

Queue Wait Times (normalized)
(Figure: jobs sorted by partition size & submit time, for T3E swap, T3E no-swap, and SP.)
–BFF: larger jobs have longer waits
–FCFS: less dependence on size
–Swap permits more jobs running simultaneously, hence shorter wait times
–Idling twice causes three distinct regimes of wait times

Restoring Backfill on the SP
–Recognized that backfill is the standard mode for LoadLeveler
–Problems reconciling backfill with the ESP stipulations
–However… interesting data from the invalid test shot

Backfill Effect I (Chronology)
(Figures: SP FCFS vs. SP FCFS with backfill.)
–Highly efficient, but violates the test
–Need to backfill selectively

Backfill Effect II (Queue Wait Times)
(Figures: SP FCFS vs. SP FCFS with backfill.)

Backfill and a Flaw in the ESP Test
(Figure annotations: FC job submitted; all jobs finish except one; guaranteed FC runtime.)
–Backfill is working as expected, but one long-running job negates the effect of the reservation time; finer-granularity jobs are needed.
Stipulations for full-configuration (FC) jobs:
1. Run immediately, possibly terminating running jobs prematurely: T3E
2. Run after current jobs finish: SP with backfill
3. No further jobs launched until the FC job finishes: SP
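The interaction described above can be sketched as a reservation rule. This is a simplified illustration, not LoadLeveler's actual algorithm: it tracks only remaining runtimes, ignores per-job processor counts when backfilling, and the times are hypothetical:

```python
# Sketch: a full-configuration (FC) job needs every processor, so its
# reservation starts when the longest-running job ends. Backfill may
# launch a smaller job only if it finishes before that reservation.

def fc_reservation(running):
    """running: remaining runtimes (seconds) of currently running jobs.
    The FC job cannot start until the last of them ends."""
    return max(running) if running else 0.0

def can_backfill(job_runtime, running):
    """A candidate may backfill only if it ends before the FC start."""
    return job_runtime <= fc_reservation(running)

running = [300, 600, 14400]            # one long-running job dominates
print(fc_reservation(running))         # 14400: the FC job waits four hours
print(can_backfill(1800, running))     # True: plenty of room to backfill
```

This shows the flaw on the slide: one long job pushes the FC reservation far into the future, and backfill then fills the gap so efficiently that the guaranteed-runtime stipulation loses its bite, which is why finer-granularity jobs are needed.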

Further Design Issues
–How to end the test?
–Possible to use backfill (globally or selectively)?
–Can we formulate a turnaround metric?
–Scalability in size and speed
–Finer granularity of jobs cf. the overall test
–Perhaps an additional vanilla throughput test is needed to evaluate scheduler performance in isolation

Conclusions & Observations
–SP: can achieve very high utilization with backfill and no topology constraints
–SP: lack of adaptability with a dynamic workload; runs in ASAP mode
–T3E: swapping with high overhead degrades utilization
–T3E: can adapt to dynamic workload requirements

Ongoing and Future Work
–Scheduled test runs on a 512-way Origin 2K & Compaq SC
–Vanilla throughput runs on the T3E and SP
–Redesign for the next version of ESP
–Distribute ESP to other interested sites