Introduction to research computing using the High Performance Computing facilities and Condor
Ian C. Smith, Advanced Research Computing, University of Liverpool
Overview: introduction to research computing; High Performance Computing (HPC); HPC facilities at Liverpool; High Throughput Computing using Condor; the University of Liverpool Condor Service; some research computing examples using Condor; next steps.
What’s special about research computing? Researchers often need to tackle problems which are far too demanding for a typical PC or laptop: programs may take too long to run, or require too much memory, or too much storage (disk space), or all of these! Special computer systems and programming methods can help overcome these barriers.
Speeding things up. The key to reducing run times is parallelism: splitting large problems into smaller tasks which can be tackled at the same time (i.e. “in parallel” or “concurrently”). There are two main types of parallelism: data parallelism and functional parallelism (pipelining). Tasks may be independent or inter-dependent, and inter-dependence eventually limits the speed-up which can be achieved. Fortunately, many problems in medical/bio/life science exhibit data parallelism with tasks that can be performed independently, and this can lead to very significant speed-ups!
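The limit that inter-dependent tasks place on the achievable speed-up can be made precise with Amdahl's law (not named on the slide, but it is the standard formulation):

```latex
% Amdahl's law: speed-up on N processors when a fraction p of the
% work can be parallelised and the remaining (1 - p) must run serially.
S(N) = \frac{1}{(1 - p) + \dfrac{p}{N}}
% As N grows, S(N) is bounded above by 1/(1 - p): even if 95% of the
% work parallelises (p = 0.95), no number of processors yields more
% than a 20x speed-up. Fully independent tasks (p = 1) scale as S = N.
```

This is why fully independent (data-parallel) tasks, common in medical/bio/life science, are so attractive: they sit at the p = 1 end of the formula.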
Some sources of parallelism: analysing patient data from clinical trials; repeating calculations with different random numbers (e.g. bootstrapping and Monte Carlo methods); dividing sequence data by chromosome; splitting chromosome sequences into smaller parts; partitioning large BLAST (or other sequence) databases and/or query files.
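As a concrete illustration of the last point, a query file can be split with a few lines of shell. This is only a sketch: the file name query.fasta, the part count N=4, and the tiny generated records are invented for the demonstration, not taken from the slides.

```shell
#!/bin/bash
# Sketch: split a FASTA query file into N parts for parallel BLAST jobs.
N=4
# Build a tiny example query file (8 records, 16 lines) to demonstrate on.
printf '>seq%s\nACGTACGT\n' 1 2 3 4 5 6 7 8 > query.fasta
# Start a new output part at every header line ('>'), assigning whole
# records round-robin across N files so each part gets a similar share.
awk -v n="$N" '/^>/ { f = "part_" (++i % n) ".fasta" } { print > f }' query.fasta
wc -l part_*.fasta
```

Splitting on header lines keeps each sequence record intact, which matters: cutting a file at an arbitrary byte offset would corrupt records.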
High Performance Computing (HPC) uses powerful special-purpose systems called HPC clusters. These contain large numbers of processors acting in parallel, and each processor may contain multiple processing elements (cores) which can also work in parallel. Clusters provide lots of memory and large amounts of fast (parallel) disk storage, making them ideal for data-intensive applications. Almost all clusters run the UNIX operating system. They typically run parallel programs containing inter-dependent tasks (e.g. finite element analysis codes) but are also suitable for biostatistics and bioinformatics applications.
HPC cluster hardware (architecture): the compute nodes and the head node are linked by a high-speed network switch to a parallel filestore; the head node also has a standard (ethernet) network connection to the outside world.
Typical cluster use (1): login to the head node from outside and upload the input data files to the parallel filestore.
Typical cluster use (2): submit jobs (programs) from the head node.
Typical cluster use (3): the compute nodes read the input data from the parallel filestore.
Typical cluster use (4): the compute nodes process the data in parallel, with task synchronisation over the network (only if needed!).
Typical cluster use (5): the compute nodes write the output data (results) back to the parallel filestore.
Typical cluster use (6): login to the head node again and download the output data files (results).
Parallel BLAST example

login as: ian
password:
Last login: Tue Feb 24 14:45: from uxa.liv.ac.uk
~]$ cd /users/ian/chris/perl          #change folder
perl]$ ls -lh farisraw_*.fasta        #list files
-rw-r--r-- 1 ian ph 1.2G Feb 23 12:25 farisraw_1.fasta
-rw-r--r-- 1 ian ph 1.4G Feb 23 12:25 farisraw_2.fasta
-rw-r--r-- 1 ian ph 1.2G Feb 23 12:26 farisraw_3.fasta
-rw-r--r-- 1 ian ph 1.3G Feb 23 12:26 farisraw_4.fasta
-rw-r--r-- 1 ian ph 1.2G Feb 23 12:27 farisraw_5.fasta
-rw-r--r-- 1 ian ph 1.3G Feb 23 12:27 farisraw_6.fasta
-rw-r--r-- 1 ian ph 1.2G Feb 23 12:28 farisraw_7.fasta
-rw-r--r-- 1 ian ph 1.2G Feb 23 12:24 farisraw_8.fasta    <- partial query files
-rw-r--r-- 1 ian ph 9.6G Feb 23 11:18 farisraw_complete.fasta    <- original query file
perl]$
Parallel BLAST example

perl]$ cat blast.sub                  #show job file contents
#!/bin/bash
#$ -cwd -V                            <- job options
#$ -o stdout
#$ -e stderr
#$ -pe smp 8                          <- use all 8 cores on each compute node (in parallel)
blastn -query farisraw_${SGE_TASK_ID}.fasta \                   <- BLAST query file
  -db /users/ian/chris/faris/dbs/RM_allfarisgenes_linuxbldb \   <- BLAST database
  -out output${SGE_TASK_ID}.txt \                               <- output file
  -word_size 11 -evalue culling_limit 1 -max_target_seqs 5 \
  -num_threads 8 \
  -outfmt "6 qseqid sseqid qlen length qstart qend sstart send mismatch \
  gaps qseq sseq pident evalue"
(${SGE_TASK_ID} takes on the values 1..8 when the jobs are submitted)
perl]$ qsub -t 1-8 blast.sub          #submit jobs
Your job-array :1 ("blast.sub") has been submitted
perl]$
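To see what the -t 1-8 option does, the loop below imitates the scheduler outside the cluster. This is a sketch: there is no real scheduler here, so the script supplies SGE_TASK_ID itself and only prints the commands it would run; under qsub, SGE sets the variable to one value per array task.

```shell
#!/bin/bash
# Sketch: how an SGE-style task ID selects one input file per array task.
# Under "qsub -t 1-8" the scheduler sets SGE_TASK_ID to 1..8, one per task,
# and each task runs the same script against a different query partition.
for SGE_TASK_ID in 1 2 3 4 5 6 7 8; do
  echo "blastn -query farisraw_${SGE_TASK_ID}.fasta -out output${SGE_TASK_ID}.txt"
done > planned_commands.txt
cat planned_commands.txt
```

Because each task reads and writes files named after its own ID, the eight tasks never touch the same file and need no synchronisation.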
Parallel BLAST example

perl]$ qstat                          #show job status
job-ID prior name user state submit/start at queue slots ja-task-ID
blast.sub ian r 02/26/ :32:           (x8 rows, one per array task)
("r" in the state column indicates the job is running; the queue column gives the name of the compute node the job is running on)
perl]$ qstat                          #some time later: only three tasks still running
blast.sub ian r 02/26/ :32:           (x3 rows)
perl]$ qstat                          #no output: all jobs have finished
perl]$
Parallel BLAST example

perl]$ ls -lh output*.txt             #list output files
-rw-r--r-- 1 ian ph 45M Feb 26 14:38 output1.txt
-rw-r--r-- 1 ian ph 22M Feb 26 14:38 output2.txt
-rw-r--r-- 1 ian ph 59M Feb 26 14:38 output3.txt
-rw-r--r-- 1 ian ph 20M Feb 26 14:38 output4.txt
-rw-r--r-- 1 ian ph 28M Feb 26 14:38 output5.txt
-rw-r--r-- 1 ian ph 13M Feb 26 14:38 output6.txt
-rw-r--r-- 1 ian ph 30M Feb 26 14:38 output7.txt
-rw-r--r-- 1 ian ph 30M Feb 26 14:38 output8.txt    <- partial output files
perl]$ cat output*.txt > output_complete.txt        #combine partial results files
perl]$ ls -lh output_complete.txt
-rw-r--r-- 1 ian ph 499M Feb 26 14:44 output_complete.txt    <- combined results
perl]$
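The concatenation step can be verified with a quick line count, sketched below on stand-in files (the real partial outputs are multi-megabyte BLAST tables; the contents here are invented for the demonstration).

```shell
#!/bin/bash
# Sketch: combine partial result files and check that nothing was lost.
# Create eight small stand-in partial outputs.
for i in 1 2 3 4 5 6 7 8; do
  printf 'hit_from_part_%s\n' "$i" > "output$i.txt"
done
# Use a numbered pattern so the glob cannot accidentally pick up the
# combined file itself on a re-run.
cat output[1-8].txt > output_complete.txt
# Line count of the combined file should equal the sum over the parts.
parts=$(cat output[1-8].txt | wc -l)
total=$(wc -l < output_complete.txt)
echo "parts=$parts total=$total"
```

Tabular (-outfmt 6) BLAST output is one hit per line with no headers, which is exactly why simple concatenation is a valid way to merge the partial results.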
Some HPC clusters available at Liverpool
bioinf1: bought by the Institute of Translational Medicine for use in biomedical research about 5 years ago; 9 compute nodes, each with 8 cores and 32 GB of memory (one node has 128 GB of memory); 76 TB of main (parallel) storage.
chadwick: the main CSD HPC cluster for research use; 118 nodes, each with 16 cores and 64 GB of memory (one node has 2 TB of memory); a total of 135 TB of main (parallel) storage; fast (40 GB/s) internal network.
High Throughput Computing (HTC) using Condor. No dedicated hardware: ordinary classroom PCs run jobs when they would otherwise be idle (usually evenings and weekends). Jobs may be interrupted by users logging into Condor PCs, so it works best for short-running jobs (10-20 minutes ideally). It is only suitable for applications which use independent tasks (HPC is needed for inter-dependent tasks). There is no shared storage, so all data files must be transferred to/from the Condor PCs. Limited memory and disk space are available, since Condor uses only commodity PCs. However… Condor is well suited to many statistical and data-intensive applications!
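For readers curious what a Condor job looks like on the submit side, here is a minimal sketch of a submit description for 8 independent tasks. The file names (analyse.sh, data_*.csv, sweep.sub) are invented for the illustration; the keywords (universe, executable, transfer_input_files, queue, ...) are standard HTCondor ones.

```shell
#!/bin/bash
# Sketch: generate a minimal HTCondor submit description for 8 independent
# tasks. $(Process) takes the values 0..7, one per queued job, so each job
# receives its own input file -- Condor's equivalent of SGE_TASK_ID.
cat > sweep.sub <<'EOF'
universe                = vanilla
executable              = analyse.sh
arguments               = $(Process)
transfer_input_files    = data_$(Process).csv
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
output                  = out.$(Process)
error                   = err.$(Process)
log                     = sweep.log
queue 8
EOF
# On the submit host this would be started with:  condor_submit sweep.sub
cat sweep.sub
```

Note the explicit file-transfer settings: because the pool has no shared storage, every input and output file must be named so Condor can ship it to and from the execute host.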
A “typical” Condor pool: users login to the Condor server and upload their input data; the server sends jobs out to the execute hosts (desktop PCs); the execute hosts return their results to the server; users then download the results.
University of Liverpool Condor Pool contains around 750 classroom PCs running the CSD Managed Windows 7 Service. Each PC can support a maximum of 4 jobs concurrently, giving a theoretical capacity of 3,000 parallel jobs. Typical spec: 3.3 GHz Intel i3 dual-core processor, 8 GB memory, 128 GB disk space. Tools are available to help in running large numbers of R and MATLAB jobs (other software may work, but not commercial packages such as SAS and Stata). A powerful UNIX server provides a single submission point for Condor jobs. The service can also be accessed from a Windows PC/laptop using Desktop Condor (even from off-campus).
Desktop Condor (1)-(3): screenshots of the Desktop Condor interface.
Personalised Medicine example. The project is an example of a Genome-Wide Association Study (GWAS) which aims to identify genetic predictors of response to anti-epileptic drugs by finding regions of the human genome that differ between individuals (referred to as Single Nucleotide Polymorphisms, or SNPs). 800 patients were genotyped at SNPs along the entire genome, and the association between SNPs and outcomes (e.g. time to withdrawal of drug due to adverse effects) was tested statistically. This is a large data-parallel problem using R, ideal for Condor: the datasets were divided into small partitions so that individual jobs run for minutes. A batch of 26 chromosomes (2,600 jobs) required ~5 hours wallclock time on Condor but ~5 weeks on a single PC.
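The partition-into-small-files step can be sketched in shell: each chromosome's rows go to their own file, one per Condor job. The two-column layout (chromosome, SNP id) and the example values are invented for the illustration, not taken from the project's real data.

```shell
#!/bin/bash
# Sketch: partition a SNP table by its chromosome column so that each
# Condor job receives one small, self-contained input file.
# Build a tiny stand-in table: chromosome <TAB> SNP id.
printf 'chr1\trs101\nchr2\trs202\nchr1\trs103\nchr3\trs301\n' > snps.tsv
# Route every row to a file named after its chromosome (column 1).
awk -F'\t' '{ print > ("snps_" $1 ".tsv") }' snps.tsv
wc -l snps_chr1.tsv snps_chr2.tsv snps_chr3.tsv
```

Partitioning by a key column (rather than round-robin) keeps all of a chromosome's SNPs in one job, so each statistical test sees a complete region.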
Radiotherapy example. A large 3rd-party application code simulates photon beam radiotherapy treatment using Monte Carlo methods. Running the simulation on 56 cores of a high performance computing cluster made no progress after 5 weeks. The problem was instead divided into 250, and then eventually more, Condor jobs. The simulation required ~ days of CPU time (equivalent to ~3.5 years on a dual-core PC), yet the Condor run completed in less than one week, with an average run time of ~70 min per job.
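The structure of such a run, many independent simulations distinguished only by their random seed, can be sketched as below. Estimating pi stands in for the real radiotherapy code, and the background jobs stand in for Condor's execute hosts; everything here is illustrative.

```shell
#!/bin/bash
# Sketch: data-parallel Monte Carlo -- one independent job per random seed.
# mc_pi estimates pi by sampling n points in the unit square and counting
# how many fall inside the quarter circle.
mc_pi() {
  awk -v seed="$1" -v n="$2" 'BEGIN {
    srand(seed)
    for (i = 0; i < n; i++) { x = rand(); y = rand(); if (x*x + y*y < 1) h++ }
    printf "%.4f\n", 4 * h / n
  }'
}
for seed in 1 2 3 4; do          # each iteration could be a separate Condor job
  mc_pi "$seed" 20000 > "est_$seed.txt" &
done
wait                             # combine results only once every job has finished
cat est_*.txt
```

Because each replica depends only on its own seed, the jobs need no communication at all, which is exactly the independent-task shape that Condor handles well.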
Summary Parallelism can help speed up the solution of many research computing problems by dividing large problems into many smaller ones which can be tackled at the same time High Performance Computing clusters Typically used for small numbers of long running jobs Ideal for applications requiring lots of memory and disk storage space Almost all systems are UNIX-based Condor High Throughput Computing Service Typically used for large/very large numbers of short running jobs Limited memory and storage available on Condor PCs Support available for applications using R (and MATLAB) No UNIX knowledge needed with Desktop Condor
Next steps
Condor Service information:
Information on bioinf1 and HPC clusters:
Information on the Advanced Research Computing (ARC) facilities:
To contact the ARC team:
To request an account on Condor or chadwick use:
For an account on bioinf1 – just ask me!