Grid Computing: An Overview and Tutorial Kenny Daily BIT Presentation 22/09/2016.

Slides:



Advertisements
Similar presentations
Southgreen HPC system Concepts Cluster : compute farm i.e. a collection of compute servers that can be shared and accessed through a single “portal”
Advertisements

Software Tools Using PBS. Software tools Portland compilers pgf77 pgf90 pghpf pgcc pgCC Portland debugger GNU compilers g77 gcc Intel ifort icc.
DCC/FCUP Grid Computing 1 Resource Management Systems.
6/2/20071 Grid Computing Sun Grid Engine (SGE) Manoj Katwal.
Sun Grid Engine Grid Computing Assignment – Fall 2005 James Ruff Senior Department of Mathematics and Computer Science Western Carolina University.
Chapter 6: An Introduction to System Software and Virtual Machines
1 Chapter Overview Introduction to Windows XP Professional Printing Setting Up Network Printers Connecting to Network Printers Configuring Network Printers.
Understanding the Basics of Computational Informatics Summer School, Hungary, Szeged Methos L. Müller.
Apache Airavata GSOC Knowledge and Expertise Computational Resources Scientific Instruments Algorithms and Models Archived Data and Metadata Advanced.
Thomas Finnern Evaluation of a new Grid Engine Monitoring and Reporting Setup.
Gilbert Thomas Grid Computing & Sun Grid Engine “Basic Concepts”
Sun Grid Engine. Grids Grids are collections of resources made available to customers. Compute grids make cycles available to customers from an access.
Bigben Pittsburgh Supercomputing Center J. Ray Scott
March 3rd, 2006 Chen Peng, Lilly System Biology1 Cluster and SGE.
INVITATION TO COMPUTER SCIENCE, JAVA VERSION, THIRD EDITION Chapter 6: An Introduction to System Software and Virtual Machines.
Parallel Programming on the SGI Origin2000 With thanks to Igor Zacharov / Benoit Marchand, SGI Taub Computer Center Technion Moshe Goldberg,
Grid Computing at Yahoo! Sameer Paranjpye Mahadev Konar Yahoo!
1 High-Performance Grid Computing and Research Networking Presented by David Villegas Instructor: S. Masoud Sadjadi
CPSC 171 Introduction to Computer Science System Software and Virtual Machines.
Software Tools Using PBS. Software tools Portland compilers pgf77 pgf90 pghpf pgcc pgCC Portland debugger GNU compilers g77 gcc Intel ifort icc.
Cluster Computing Applications for Bioinformatics Thurs., Sept. 20, 2007 process management shell scripting Sun Grid Engine running parallel programs.
SPI NIGHTLIES Alex Hodgkins. SPI nightlies  Build and test various software projects each night  Provide a nightlies summary page that displays all.
Portable Batch System – Definition and 3 Primary Roles Definition: PBS is a distributed workload management system. It handles the management and monitoring.
An operating system (OS) is a collection of system programs that together control the operation of a computer system.
7.1 Operating Systems. 7.2 A computer is a system composed of two major components: hardware and software. Computer hardware is the physical equipment.
“Candidates were not advantaged by defining every type of operating system provided as examples in the explanatory notes of the standard. Candidates who.
Chapter 1: Introduction What is an Operating System? Mainframe Systems Desktop Systems Multiprocessor Systems Distributed Systems Clustered System Real.
1 High-Performance Grid Computing and Research Networking Presented by Javier Delgodo Slides prepared by David Villegas Instructor: S. Masoud Sadjadi
System SOFTWARE.
SQL Database Management
Welcome to Indiana University Clusters
Chapter 1: Introduction
Chapter 1: Introduction
Applied Operating System Concepts
Welcome to Indiana University Clusters
Using Paraguin to Create Parallel Programs
GWE Core Grid Wizard Enterprise (
Chapter 1: Introduction
Chapter 1: Introduction
Technology Literacy Hardware.
BIMSB Bioinformatics Coordination
Chapter 1: Introduction
NGS computation services: APIs and Parallel Jobs
7 Operating system Foundations of Computer Science ã Cengage Learning.
Chapter 1: Introduction
Chapter 1: Introduction
OPERATING SYSTEM OVERVIEW
Chapter 1: Introduction
Compiling and Job Submission
Chapter 2: System Structures
Operating Systems.
Information Technology Ms. Abeer Helwa
Chapter 1: Introduction
Language Processors Application Domain – ideas concerning the behavior of a software. Execution Domain – Ideas implemented in Computer System. Semantic.
Gonçalo Borges Jornadas LIP – 21/22 Dezembro Peniche
Chapter 1: Introduction
Sun Grid Engine.
Chapter 1: Introduction
Operating System Introduction.
Introduction to High Performance Computing Using Sapelo2 at GACRC
Chapter 1: Introduction
The Main Features of Operating Systems
LO2 – Understand Computer Software
Software - Operating Systems
High-Performance Grid Computing and Research Networking
Chapter 1: Introduction
Quick Tutorial on MPICH for NIC-Cluster
Working in The IITJ HPC System
Chapter 1: Introduction
Presentation transcript:

Grid Computing: An Overview and Tutorial Kenny Daily BIT Presentation 22/09/2016

Outline ● Motivational Story – Electric Fish ● Grid Computing Overview ● N1 Sun Grid Engine Software ● Use of UCI's cluster ● My Research (why I use the grid) ● Misc.

Electric Fish? Image from

Grid Computing Basics A grid is a collection of computing resources that perform tasks, and appears to users as a large system that provides a single point of access to powerful distributed resources. Users treat the grid as a single computational resource (One place for everything!)

Why Use a Grid? Take advantage of available computational resources Don't wear out your own computer Automation of repetitive tasks Easier to manage!

Grid Computing Basics TeraGrid has 102 teraflops of computing capability and more than 15 petabytes (quadrillions of bytes) of storage Image from

A couple of definitions Jobs – Users' requests for resources to run a program. These resources can be memory, computing speed, amount of time needed, specific computers, architectures, etc. Queues - provide services for jobs (the computers to run the jobs)

N1 Sun Grid Engine Software (SGE) A resource management software that: ● Accepts jobs submitted by users and ● Schedules them based on multiple requirements (time, hardware, etc). It's responsible for the best use of resources based on policies of usage, productivity, etc. Provides performance monitoring: Up to the minute access to resource consumption and system load information

When you submit a job The SGE does all of the following: 1. Accepts jobs (users' requests for computer resources) 2. Puts jobs in a holding area until the jobs can be run. 3. Sends jobs from the holding area to an execution device. 4. Manages running jobs. 5. Logs the record of job execution when the jobs are finished.

Types of Jobs Batch – do the same thing to a bunch of files (also called array jobs) Interactive – takes input during execution (may be from user or another program) Parallel – separate parts of a program that don't depend on each other, combine results later (support for multiple interfaces for parallel computing such as MPI)

As A User Prerequisites: Access to the system – talk to advisor or sponsor, send to Basic familiarity with a UNIX environment (navigation, running commands, text editing) More advanced features: Programming in C, Perl, Java, Python, etc.

As A User SSH to a host in the cluster: ssh -Y Some setup is required, but helpdesk can get you started on that.

Commands To Know Manage queues (qhost, qconf, qmod) Submit and delete jobs (qsub, qdel, qresub) Check job status (qstat) Suspend or enable queues and jobs (qhold, qrls) GUI to do most of this: qmon Each of these commands have hundreds of options, but the man pages are great!

About The Environment: Queues Display a list of available queues: qconf -sql all.q bubs_cluster.q casp_cluster.q chem_cluster.q dna_cluster.q..... Display properties of a queue: qconf -sq queuename Display list of resources you can request for a job: qconf -sc (all of the options available to set)

About The Environment: Hosts Find hosts to which you can submit jobs: qconf -ss List all the available hosts: qhost Information about a particular host: qconf -se hostname

About the Environment: Jobs What's running now: qstat About a particular user's jobs: qstat -u username About a particular job: qstat -j jobid Only show running jobs: qstat -s r job-ID prior name user state submit/start at queue slots ja-task-ID parallel-m lzhang1 r 01/03/ :02: parallel-m lzhang1 r 01/03/ :02: parallel-m lzhang1 r 01/03/ :02: parallel-m lzhang1 r 01/03/ :02: parallel-m lzhang1 r 01/03/ :02:44 1 1

Submitting a simple job qsub scriptname It's really that easy!!! Let's see some examples.

Examples 1. Simple Jobs 2. Array Jobs 3. Batch Jobs

Why I Use The Grid I'm studying how different measures of similarity can be used to search for chemical molecules For example, I have a new chemical structure that I found. What else is like it? First, what is a “good” measure of similarity? Looks the same, same properties, same overall shape, etc...

Why I Use The Grid I have at least 6 different similarity measures with 3 different ways of computing them, and at least 12 different data sets to test them on! I need to be able to take the pairwise similarity of each dataset to a random dataset of 50K molecules and then analyze the results. Each experiment will take hours of processing time 6 x 3 x 12 x 12 = 2592 Hours, or ~100 days!

Why I Use The Grid Batch processing lets me distribute those experiments over a bunch of computers and run them all at the same time! Let's look at the code that handles this (warning, kind of ugly!!!) Not the best way to do this – interfaces exist in Java, Perl, C, and Python to do this within the code

Thank you! Questions, Comments, Suggestions?

References ● TeraGrid - ● Sun Grid Engine Documentation ● SGE man pages