B4 Application Environment Load Balancing Job and Queue Management Tim Smith CERN/IT.

2001/05/25, Tim Smith: LCCWS in FNAL

Application Environment (I)

Q: How to ensure compatibility between execution and development environments?

- Accessing 'static' system and application tools
  - Remote: shared file system
  - Local:
    - Tools for client synchronisation
    - 'Pre-compiler' to hide target differences
- Defining environment variables
  - Group accounts: useful, but bad for security / auditing
  - Framework for environment definition
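
The "framework for environment definition" item can be illustrated with a minimal sketch. It assumes a layered scheme in which site-wide defaults are overridden by per-group settings and then by per-job settings; the variable names, paths and the `build_env` helper are hypothetical and do not come from the slides.

```python
import os

# Hypothetical layered definitions; names and paths are illustrative only.
SITE_ENV = {"APP_ROOT": "/shared/apps", "TOOLS_VERSION": "pro"}
GROUP_ENV = {
    "physics": {"TOOLS_VERSION": "new", "GROUP_DATA": "/shared/physics"},
}


def build_env(group, overrides=None, base=None):
    """Merge site, group and per-job settings; later layers win."""
    # Start from the caller's environment unless an explicit base is given.
    env = dict(os.environ if base is None else base)
    env.update(SITE_ENV)                  # site-wide defaults
    env.update(GROUP_ENV.get(group, {}))  # group-specific settings
    env.update(overrides or {})           # per-job overrides
    return env
```

Defining the environment per user and per group in one place, rather than through shared group accounts, keeps each job attributable to an individual, which is the auditing concern the slide raises.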

Application Environment (II)

- Accessing user files and application libraries
  - Remote:
    - Shared file system
    - Put / get mechanisms
  - Local:
    - Ship libraries with the job
    - Tool for syncing to clients
- Static vs dynamically linked binaries
  - Dynamic has little benefit on a 2 processor batch
  - Security issues with picking up random libraries on scavenger nodes that are not managed centrally
  - Some 3rd-party libraries only available dynamically

Load Balancing

- Interactive: DNS (vs scripts)
  - Round robin vs metric based
    - Since can't predict future state
  - Sophistication level: load, # sessions, …
  - Dealing with load anomalies: true interactive functions
- Batch
  - Over-subscribing CPUs
    - Internal blocking vs context switching
  - Queue abuse
  - Master configuration
    - Queue length, scheduling complexity
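
The contrast between round-robin and metric-based selection of interactive nodes can be sketched as follows. This is an illustration only: the host names, the measurements and the weights in `metric_score` are assumptions, and it does not reproduce any particular DNS load-balancing implementation.

```python
from itertools import cycle


def round_robin(hosts):
    """Return an iterator that rotates through hosts, ignoring their state."""
    return cycle(hosts)


def metric_score(metrics, w_load=1.0, w_sessions=0.1):
    """Combine load average and session count into one score (lower is better).

    The weights are illustrative assumptions, not recommended values.
    """
    return w_load * metrics["load"] + w_sessions * metrics["sessions"]


def pick_least_loaded(cluster, n=1):
    """Return the n hosts with the lowest score from the latest measurements."""
    ranked = sorted(cluster, key=lambda host: metric_score(cluster[host]))
    return ranked[:n]


# Hypothetical snapshot of an interactive cluster.
cluster = {
    "node01": {"load": 0.4, "sessions": 12},
    "node02": {"load": 2.5, "sessions": 3},
    "node03": {"load": 0.9, "sessions": 1},
}
```

Round robin needs no measurements and spreads new sessions evenly. The metric-based choice reacts to the state at the time of measurement, but as the slide notes it cannot predict future state, so a node that looked idle a moment ago may already be busy when the session arrives.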

Job and Queue Management

- Delegated responsibilities
- Tuning
  - Host affinities
  - Priorities
- Regulation
  - Difficulty of configuring and interpreting fair-shares
- Job dispatching
  - Forecasting
  - Explaining pending status
  - Diagnosing and reacting to 'bad hosts'
- Scalability issues due to job volume
  - Accounting and dispatching
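
To make the fair-share item concrete, here is a minimal sketch of one possible scheme: each group's priority is its target fraction of the farm minus the fraction it has recently consumed. The formula, group names and numbers are assumptions chosen for illustration; real batch systems use their own, usually more elaborate, definitions.

```python
def fair_share_priority(shares, usage):
    """Return group -> priority; a higher value is dispatched first.

    shares: group -> configured share (any positive units)
    usage:  group -> recent CPU time consumed
    Priority is the target fraction minus the consumed fraction.
    """
    total_shares = sum(shares.values())
    # Avoid division by zero when nothing has run yet.
    total_usage = sum(usage.get(g, 0.0) for g in shares) or 1.0
    priority = {}
    for group, share in shares.items():
        target = share / total_shares
        actual = usage.get(group, 0.0) / total_usage
        priority[group] = target - actual
    return priority


def dispatch_order(pending, shares, usage):
    """Sort pending (job_id, group) pairs by group priority.

    The sort is stable, so jobs of one group keep their submission order.
    """
    priority = fair_share_priority(shares, usage)
    return sorted(pending, key=lambda job: -priority[job[1]])


# Hypothetical configuration and recent usage.
shares = {"a": 50, "b": 30, "c": 20}
usage = {"a": 600, "b": 300, "c": 100}
pending = [("j1", "a"), ("j2", "b"), ("j3", "c"), ("j4", "a")]
```

Even in this small form, the priority of a job depends on what every other group has consumed, which hints at why fair-shares are hard to interpret and why a pending job's status is hard to explain to its owner.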