
1 Ian C. Smith* Introduction to research computing using the High Performance Computing facilities and Condor *Advanced Research Computing University of Liverpool

2 Overview • Introduction to research computing • High Performance Computing (HPC) • HPC facilities at Liverpool • High Throughput Computing using Condor • University of Liverpool Condor Service • Some research computing examples using Condor • Next steps

3 What’s special about research computing ? • Often researchers need to tackle problems which are far too demanding for a typical PC or laptop computer • Programs may take too long to run or … • require too much memory or … • too much storage (disk space) or … • all of these ! • Special computer systems and programming methods can help overcome these barriers

4 Speeding things up • Key to reducing run times is parallelism - splitting large problems into smaller tasks which can be tackled at the same time (i.e. “in parallel” or “concurrently”) • Two main types of parallelism: • data parallelism • functional parallelism (pipelining) • Tasks may be independent or inter-dependent (this eventually limits the speed up which can be achieved) • Fortunately many problems in medical/bio/life science exhibit data parallelism and tasks can be performed independently • This can lead to very significant speed ups !
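
A minimal sketch of data parallelism in shell terms (illustrative only, not from the slides: the analyse program and the chunk file names are invented):

  # run the same analysis on four data chunks at the same time (data parallelism)
  for chunk in chunk1.csv chunk2.csv chunk3.csv chunk4.csv
  do
      ./analyse "$chunk" > "result_${chunk%.csv}.txt" &   # '&' starts each task in the background
  done
  wait   # pause here until all four background tasks have finished

Because the chunks are processed independently, the elapsed time is roughly that of one chunk rather than the sum of all four.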

5 Some sources of parallelism • Analysing patient data from clinical trials • Repeating calculations with different random numbers e.g. bootstrapping and Monte Carlo methods • Dividing sequence data by chromosome • Splitting chromosome sequences into smaller parts • Partitioning large BLAST (or other sequence) databases and/or query files
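
As an illustration of the last point, a large FASTA query file can be partitioned without breaking any sequence record. This is only a sketch of how the partial query files seen later might have been produced (the slides do not show the actual splitting step):

  # distribute the records of a large FASTA query file across 8 smaller files, round-robin
  awk -v parts=8 '/^>/ { n++ } { print > ("farisraw_" ((n-1) % parts + 1) ".fasta") }' farisraw_complete.fasta

Each record starts with a ">" header line, so the counter n changes only at record boundaries and every sequence ends up intact in exactly one output file.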

6 High Performance Computing (HPC) • Uses powerful special purpose systems called HPC clusters • Contain large numbers of processors acting in parallel • Each processor may contain multiple processing elements (cores) which can also work in parallel • Provide lots of memory and large amounts of fast (parallel) disk storage – ideal for data-intensive applications • Almost all clusters run the UNIX operating system • Typically run parallel programs containing inter-dependent tasks (e.g. finite element analysis codes) but also suitable for biostatistics and bioinformatics applications

7 HPC cluster hardware (architecture). Diagram: compute nodes and a head node are linked by a high-speed internal network through a network switch to a parallel filestore; the head node also has a standard (Ethernet) network connection to the outside world.

8 Typical cluster use (1). Diagram: the user logs in to the head node from outside and uploads data; the input data files are stored on the parallel filestore.

9 Typical cluster use (2). Diagram: after logging in from outside, the user submits jobs (programs) from the head node to the compute nodes.

10 Typical cluster use (3). Diagram: the compute nodes read the input data from the parallel filestore via the network switch.

11 Typical cluster use (4). Diagram: the compute nodes process the data in parallel, synchronising tasks with each other over the network only if needed.

12 Typical cluster use (5). Diagram: the compute nodes write their output data (results) back to the parallel filestore via the network switch.

13 Typical cluster use (6). Diagram: the user logs in to the head node and downloads the results (output data files) from the parallel filestore.

14 Parallel BLAST example login as: ian ian@bioinf1's password: Last login: Tue Feb 24 14:45:31 2015 from uxa.liv.ac.uk

15 Parallel BLAST example login as: ian ian@bioinf1's password: Last login: Tue Feb 24 14:45:31 2015 from uxa.liv.ac.uk [ian@bioinf1 ~]$ cd /users/ian/chris/perl #change folder [ian@bioinf1 perl]$

16 Parallel BLAST example login as: ian ian@bioinf1's password: Last login: Tue Feb 24 14:45:31 2015 from uxa.liv.ac.uk [ian@bioinf1 ~]$ cd /users/ian/chris/perl [ian@bioinf1 perl]$ [ian@bioinf1 perl]$ ls -lh farisraw_*.fasta #list files

17 Parallel BLAST example login as: ian ian@bioinf1's password: Last login: Tue Feb 24 14:45:31 2015 from uxa.liv.ac.uk [ian@bioinf1 ~]$ cd /users/ian/chris/perl [ian@bioinf1 perl]$ [ian@bioinf1 perl]$ ls -lh farisraw_*.fasta -rw-r--r-- 1 ian ph 1.2G Feb 23 12:25 farisraw_1.fasta -rw-r--r-- 1 ian ph 1.4G Feb 23 12:25 farisraw_2.fasta -rw-r--r-- 1 ian ph 1.2G Feb 23 12:26 farisraw_3.fasta -rw-r--r-- 1 ian ph 1.3G Feb 23 12:26 farisraw_4.fasta -rw-r--r-- 1 ian ph 1.2G Feb 23 12:27 farisraw_5.fasta -rw-r--r-- 1 ian ph 1.3G Feb 23 12:27 farisraw_6.fasta -rw-r--r-- 1 ian ph 1.2G Feb 23 12:28 farisraw_7.fasta -rw-r--r-- 1 ian ph 1.2G Feb 23 12:24 farisraw_8.fasta -rw-r--r-- 1 ian ph 9.6G Feb 23 11:18 farisraw_complete.fasta [ian@bioinf1 perl]$ original query file

18 Parallel BLAST example login as: ian ian@bioinf1's password: Last login: Tue Feb 24 14:45:31 2015 from uxa.liv.ac.uk [ian@bioinf1 ~]$ cd /users/ian/chris/perl [ian@bioinf1 perl]$ [ian@bioinf1 perl]$ ls -lh farisraw_*.fasta -rw-r--r-- 1 ian ph 1.2G Feb 23 12:25 farisraw_1.fasta -rw-r--r-- 1 ian ph 1.4G Feb 23 12:25 farisraw_2.fasta -rw-r--r-- 1 ian ph 1.2G Feb 23 12:26 farisraw_3.fasta -rw-r--r-- 1 ian ph 1.3G Feb 23 12:26 farisraw_4.fasta -rw-r--r-- 1 ian ph 1.2G Feb 23 12:27 farisraw_5.fasta -rw-r--r-- 1 ian ph 1.3G Feb 23 12:27 farisraw_6.fasta -rw-r--r-- 1 ian ph 1.2G Feb 23 12:28 farisraw_7.fasta -rw-r--r-- 1 ian ph 1.2G Feb 23 12:24 farisraw_8.fasta -rw-r--r-- 1 ian ph 9.6G Feb 23 11:18 farisraw_complete.fasta [ian@bioinf1 perl]$ partial query files

19 Parallel BLAST example [ian@bioinf1 perl]$ cat blast.sub #show file contents job file

20 Parallel BLAST example [ian@bioinf1 perl]$ cat blast.sub #!/bin/bash #$ -cwd -V #$ -o stdout #$ -e stderr #$ -pe smp 8 blastn -query farisraw_${SGE_TASK_ID}.fasta \ -db /users/ian/chris/faris/dbs/RM_allfarisgenes_linuxbldb \ -out output${SGE_TASK_ID}.txt \ -word_size 11 -evalue .00001 -culling_limit 1 -max_target_seqs 5 \ -num_threads 8 \ -outfmt "6 qseqid sseqid qlen length qstart qend sstart send mismatch \ gaps qseq sseq pident evalue" [ian@bioinf1 perl]$ job options
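
For readability, here is the same blast.sub script laid out one line per statement (the #$ lines are Grid Engine job options; the paths and settings are exactly those shown on the slide):

  #!/bin/bash
  #$ -cwd -V
  #$ -o stdout
  #$ -e stderr
  #$ -pe smp 8
  blastn -query farisraw_${SGE_TASK_ID}.fasta \
         -db /users/ian/chris/faris/dbs/RM_allfarisgenes_linuxbldb \
         -out output${SGE_TASK_ID}.txt \
         -word_size 11 -evalue .00001 -culling_limit 1 -max_target_seqs 5 \
         -num_threads 8 \
         -outfmt "6 qseqid sseqid qlen length qstart qend sstart send mismatch \
  gaps qseq sseq pident evalue"

Each task of the array job picks up a different partial query file through ${SGE_TASK_ID} and runs blastn on 8 cores of its compute node.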

21 Parallel BLAST example [ian@bioinf1 perl]$ cat blast.sub #!/bin/bash #$ -cwd -V #$ -o stdout #$ -e stderr #$ -pe smp 8 blastn -query farisraw_${SGE_TASK_ID}.fasta \ -db /users/ian/chris/faris/dbs/RM_allfarisgenes_linuxbldb \ -out output${SGE_TASK_ID}.txt \ -word_size 11 -evalue .00001 -culling_limit 1 -max_target_seqs 5 \ -num_threads 8 \ -outfmt "6 qseqid sseqid qlen length qstart qend sstart send mismatch \ gaps qseq sseq pident evalue" [ian@bioinf1 perl]$ BLAST query file

22 Parallel BLAST example [ian@bioinf1 perl]$ cat blast.sub #!/bin/bash #$ -cwd -V #$ -o stdout #$ -e stderr #$ -pe smp 8 blastn -query farisraw_${SGE_TASK_ID}.fasta \ -db /users/ian/chris/faris/dbs/RM_allfarisgenes_linuxbldb \ -out output${SGE_TASK_ID}.txt \ -word_size 11 -evalue .00001 -culling_limit 1 -max_target_seqs 5 \ -num_threads 8 \ -outfmt "6 qseqid sseqid qlen length qstart qend sstart send mismatch \ gaps qseq sseq pident evalue" [ian@bioinf1 perl]$ BLAST database

23 Parallel BLAST example [ian@bioinf1 perl]$ cat blast.sub #!/bin/bash #$ -cwd -V #$ -o stdout #$ -e stderr #$ -pe smp 8 blastn -query farisraw_${SGE_TASK_ID}.fasta \ -db /users/ian/chris/faris/dbs/RM_allfarisgenes_linuxbldb \ -out output${SGE_TASK_ID}.txt \ -word_size 11 -evalue .00001 -culling_limit 1 -max_target_seqs 5 \ -num_threads 8 \ -outfmt "6 qseqid sseqid qlen length qstart qend sstart send mismatch \ gaps qseq sseq pident evalue" [ian@bioinf1 perl]$ output file

24 Parallel BLAST example [ian@bioinf1 perl]$ cat blast.sub #!/bin/bash #$ -cwd -V #$ -o stdout #$ -e stderr #$ -pe smp 8 blastn -query farisraw_${SGE_TASK_ID}.fasta \ -db /users/ian/chris/faris/dbs/RM_allfarisgenes_linuxbldb \ -out output${SGE_TASK_ID}.txt \ -word_size 11 -evalue .00001 -culling_limit 1 -max_target_seqs 5 \ -num_threads 8 \ -outfmt "6 qseqid sseqid qlen length qstart qend sstart send mismatch \ gaps qseq sseq pident evalue" [ian@bioinf1 perl]$ takes on values [1..8] when jobs are submitted

25 Parallel BLAST example [ian@bioinf1 perl]$ cat blast.sub #!/bin/bash #$ -cwd -V #$ -o stdout #$ -e stderr #$ -pe smp 8 blastn -query farisraw_${SGE_TASK_ID}.fasta \ -db /users/ian/chris/faris/dbs/RM_allfarisgenes_linuxbldb \ -out output${SGE_TASK_ID}.txt \ -word_size 11 -evalue .00001 -culling_limit 1 -max_target_seqs 5 \ -num_threads 8 \ -outfmt "6 qseqid sseqid qlen length qstart qend sstart send mismatch \ gaps qseq sseq pident evalue" [ian@bioinf1 perl]$ use all 8 cores on each compute node (in parallel)

26 Parallel BLAST example [ian@bioinf1 perl]$ cat blast.sub #!/bin/bash #$ -cwd -V #$ -o stdout #$ -e stderr #$ -pe smp 8 blastn -query farisraw_${SGE_TASK_ID}.fasta \ -db /users/ian/chris/faris/dbs/RM_allfarisgenes_linuxbldb \ -out output${SGE_TASK_ID}.txt \ -word_size 11 -evalue .00001 -culling_limit 1 -max_target_seqs 5 \ -num_threads 8 \ -outfmt "6 qseqid sseqid qlen length qstart qend sstart send mismatch \ gaps qseq sseq pident evalue" [ian@bioinf1 perl]$ qsub -t 1-8 blast.sub #submit jobs

27 Parallel BLAST example [ian@bioinf1 perl]$ cat blast.sub #!/bin/bash #$ -cwd -V #$ -o stdout #$ -e stderr #$ -pe smp 8 blastn -query farisraw_${SGE_TASK_ID}.fasta \ -db /users/ian/chris/faris/dbs/RM_allfarisgenes_linuxbldb \ -out output${SGE_TASK_ID}.txt \ -word_size 11 -evalue .00001 -culling_limit 1 -max_target_seqs 5 \ -num_threads 8 \ -outfmt "6 qseqid sseqid qlen length qstart qend sstart send mismatch \ gaps qseq sseq pident evalue" [ian@bioinf1 perl]$ qsub -t 1-8 blast.sub Your job-array 20164.1-8:1 ("blast.sub") has been submitted [ian@bioinf1 perl]$

28 Parallel BLAST example [ian@bioinf1 perl]$ qstat #show job status

29 Parallel BLAST example [ian@bioinf1 perl]$ qstat job-ID prior name user state submit/start at queue slots ja-task-ID ----------------------------------------------------------------------------------------------- 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp00.liv.ac.uk 8 1 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp07.liv.ac.uk 8 2 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp01.liv.ac.uk 8 3 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp04.liv.ac.uk 8 4 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp03.liv.ac.uk 8 5 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp02.liv.ac.uk 8 6 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp05.liv.ac.uk 8 7 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp06.liv.ac.uk 8 8 [ian@bioinf1 perl]$

30 Parallel BLAST example [ian@bioinf1 perl]$ qstat job-ID prior name user state submit/start at queue slots ja-task-ID ----------------------------------------------------------------------------------------------- 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp00.liv.ac.uk 8 1 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp07.liv.ac.uk 8 2 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp01.liv.ac.uk 8 3 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp04.liv.ac.uk 8 4 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp03.liv.ac.uk 8 5 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp02.liv.ac.uk 8 6 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp05.liv.ac.uk 8 7 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp06.liv.ac.uk 8 8 [ian@bioinf1 perl]$ indicates job is running

31 Parallel BLAST example [ian@bioinf1 perl]$ qstat job-ID prior name user state submit/start at queue slots ja-task-ID ----------------------------------------------------------------------------------------------- 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp00.liv.ac.uk 8 1 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp07.liv.ac.uk 8 2 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp01.liv.ac.uk 8 3 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp04.liv.ac.uk 8 4 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp03.liv.ac.uk 8 5 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp02.liv.ac.uk 8 6 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp05.liv.ac.uk 8 7 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp06.liv.ac.uk 8 8 [ian@bioinf1 perl]$ name of compute node job is running on

32 Parallel BLAST example [ian@bioinf1 perl]$ qstat job-ID prior name user state submit/start at queue slots ja-task-ID ----------------------------------------------------------------------------------------------- 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp00.liv.ac.uk 8 1 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp07.liv.ac.uk 8 2 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp01.liv.ac.uk 8 3 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp04.liv.ac.uk 8 4 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp03.liv.ac.uk 8 5 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp02.liv.ac.uk 8 6 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp05.liv.ac.uk 8 7 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp06.liv.ac.uk 8 8 [ian@bioinf1 perl]$ qstat

33 Parallel BLAST example [ian@bioinf1 perl]$ qstat job-ID prior name user state submit/start at queue slots ja-task-ID ----------------------------------------------------------------------------------------------- 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp00.liv.ac.uk 8 1 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp07.liv.ac.uk 8 2 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp01.liv.ac.uk 8 3 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp04.liv.ac.uk 8 4 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp03.liv.ac.uk 8 5 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp02.liv.ac.uk 8 6 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp05.liv.ac.uk 8 7 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp06.liv.ac.uk 8 8 [ian@bioinf1 perl]$ qstat job-ID prior name user state submit/start at queue slots ja-task-ID ----------------------------------------------------------------------------------------------- 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp00.liv.ac.uk 8 1 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp07.liv.ac.uk 8 2 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp02.liv.ac.uk 8 6 [ian@bioinf1 perl]$ qstat

34 Parallel BLAST example [ian@bioinf1 perl]$ qstat job-ID prior name user state submit/start at queue slots ja-task-ID ----------------------------------------------------------------------------------------------- 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp00.liv.ac.uk 8 1 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp07.liv.ac.uk 8 2 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp01.liv.ac.uk 8 3 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp04.liv.ac.uk 8 4 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp03.liv.ac.uk 8 5 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp02.liv.ac.uk 8 6 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp05.liv.ac.uk 8 7 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp06.liv.ac.uk 8 8 [ian@bioinf1 perl]$ qstat job-ID prior name user state submit/start at queue slots ja-task-ID ----------------------------------------------------------------------------------------------- 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp00.liv.ac.uk 8 1 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp07.liv.ac.uk 8 2 20157 0.55500 blast.sub ian r 02/26/2015 14:32:49 bs@comp02.liv.ac.uk 8 6 [ian@bioinf1 perl]$ qstat [ian@bioinf1 perl]$

35 Parallel BLAST example [ian@bioinf1 perl]$ ls -lh output*.txt #list output files

36 Parallel BLAST example [ian@bioinf1 perl]$ ls -lh output* -rw-r--r-- 1 ian ph 45M Feb 26 14:38 output1.txt -rw-r--r-- 1 ian ph 22M Feb 26 14:38 output2.txt -rw-r--r-- 1 ian ph 59M Feb 26 14:38 output3.txt -rw-r--r-- 1 ian ph 20M Feb 26 14:38 output4.txt -rw-r--r-- 1 ian ph 28M Feb 26 14:38 output5.txt -rw-r--r-- 1 ian ph 13M Feb 26 14:38 output6.txt -rw-r--r-- 1 ian ph 30M Feb 26 14:38 output7.txt -rw-r--r-- 1 ian ph 30M Feb 26 14:38 output8.txt [ian@bioinf1 perl]$ partial output files

37 Parallel BLAST example [ian@bioinf1 perl]$ ls -lh output*.txt -rw-r--r-- 1 ian ph 45M Feb 26 14:38 output1.txt -rw-r--r-- 1 ian ph 22M Feb 26 14:38 output2.txt -rw-r--r-- 1 ian ph 59M Feb 26 14:38 output3.txt -rw-r--r-- 1 ian ph 20M Feb 26 14:38 output4.txt -rw-r--r-- 1 ian ph 28M Feb 26 14:38 output5.txt -rw-r--r-- 1 ian ph 13M Feb 26 14:38 output6.txt -rw-r--r-- 1 ian ph 30M Feb 26 14:38 output7.txt -rw-r--r-- 1 ian ph 30M Feb 26 14:38 output8.txt [ian@bioinf1 perl]$ cat output*.txt > output_complete.txt [ian@bioinf1 perl]$ ls -lh output_complete.txt combine partial results files

38 Parallel BLAST example [ian@bioinf1 perl]$ ls -lh output*.txt -rw-r--r-- 1 ian ph 45M Feb 26 14:38 output1.txt -rw-r--r-- 1 ian ph 22M Feb 26 14:38 output2.txt -rw-r--r-- 1 ian ph 59M Feb 26 14:38 output3.txt -rw-r--r-- 1 ian ph 20M Feb 26 14:38 output4.txt -rw-r--r-- 1 ian ph 28M Feb 26 14:38 output5.txt -rw-r--r-- 1 ian ph 13M Feb 26 14:38 output6.txt -rw-r--r-- 1 ian ph 30M Feb 26 14:38 output7.txt -rw-r--r-- 1 ian ph 30M Feb 26 14:38 output8.txt [ian@bioinf1 perl]$ cat output*.txt > output_complete.txt [ian@bioinf1 perl]$ ls -lh output_complete.txt -rw-r--r-- 1 ian ph 499M Feb 26 14:44 output_complete.txt [ian@bioinf1 perl]$ combined results
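
Not shown on the slides, but a quick way to check that the concatenation captured everything is to compare line counts:

  wc -l output?.txt            # line counts of the eight partial results files
  wc -l output_complete.txt    # should equal the sum of the eight counts above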

39 Some HPC clusters available at Liverpool • bioinf1 • System bought by the Institute of Translational Medicine for use in biomedical research about 5 years ago • 9 compute nodes each with 8 cores and 32 GB of memory (one node has 128 GB of memory) • 76 TB of main (parallel) storage • chadwick • Main CSD HPC cluster for research use • 118 nodes each with 16 cores and 64 GB memory (one node has 2 TB of memory) • Total of 135 TB of main (parallel) storage • Fast (40 Gb/s) internal network

40 High Throughput Computing (HTC) using Condor • No dedicated hardware - uses ordinary classroom PCs to run jobs when they would otherwise be idle (usually evenings and weekends) • Jobs may be interrupted by users logging into Condor PCs – works best for short running jobs (10-20 minutes ideally) • Only suitable for applications which use independent tasks (use HPC for inter-dependent tasks) • No shared storage – all data files must be transferred to/from the Condor PCs • Limited memory and disk space available since Condor uses only commodity PCs • However… Condor is well suited to many statistical and data-intensive applications !
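
To make the job description concrete, here is a minimal, generic HTCondor submit file sketch. It is not the Liverpool service's own tooling, and the executable, file names and resource request are invented; it simply shows how independent tasks and explicit file transfer are expressed:

  # submit.sub - generic HTCondor submit description (illustrative sketch only)
  universe                = vanilla
  executable              = analyse.bat              # hypothetical program run on each Condor PC
  arguments               = part_$(Process).csv      # each job works on its own data partition
  transfer_input_files    = part_$(Process).csv      # no shared storage: input is copied to the PC
  should_transfer_files   = YES
  when_to_transfer_output = ON_EXIT                  # files the job creates are copied back at the end
  output                  = stdout_$(Process).txt    # captured standard output of each job
  error                   = stderr_$(Process).txt    # captured standard error of each job
  log                     = jobs.log                 # Condor's record of what happened to each job
  request_memory          = 1GB                      # keep requests modest - these are commodity PCs
  queue 100                                          # create 100 independent jobs, $(Process) = 0..99

Submitting this file with condor_submit would queue 100 jobs, each of which runs on an idle classroom PC and must carry all of its input with it.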

41 A “typical” Condor pool. Diagram: a desktop PC, the Condor server and the execute hosts; the user logs in to the Condor server from their desktop PC and uploads the input data.

42 A “typical” Condor pool. Diagram: the Condor server distributes jobs to the execute hosts.

43 A “typical” Condor pool. Diagram: the execute hosts return their results to the Condor server.

44 A “typical” Condor pool. Diagram: the user downloads the results from the Condor server to their desktop PC.

45 University of Liverpool Condor Pool • Contains around 750 classroom PCs running the CSD Managed Windows 7 Service • Each PC can support a maximum of 4 jobs concurrently giving a theoretical capacity of 3000 parallel jobs • Typical spec: 3.3 GHz Intel i3 dual-core processor, 8 GB memory, 128 GB disk space • Tools are available to help in running large numbers of R and MATLAB jobs (other software may work but not commercial packages such as SAS and Stata) • Single job submission point for Condor jobs provided by a powerful UNIX server • Service can also be accessed from a Windows PC/laptop using Desktop Condor (even from off-campus)
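
On the UNIX submission server, the standard HTCondor commands look like this (a generic sketch; jobs on the Liverpool service are normally prepared with the R and MATLAB helper tools mentioned above, and myjobs.sub is a made-up file name):

  condor_submit myjobs.sub   # queue the jobs described in the submit file
  condor_q                   # show your jobs and whether they are idle, running or held
  condor_status -total       # summarise how many execute slots the pool currently offers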

46 Desktop Condor (1)

47 Desktop Condor (2)

48 Desktop Condor (3)

49 Personalised Medicine example • project is an example of a Genome-Wide Association Study • aims to identify genetic predictors of response to anti-epileptic drugs • tries to identify single-base positions in the human genome that differ between individuals (referred to as Single Nucleotide Polymorphisms or SNPs) • 800 patients genotyped at 500 000 SNPs along the entire genome • Statistically test the association between SNPs and outcomes (e.g. time to withdrawal of drug due to adverse effects) • large data-parallel problem using R – ideal for Condor • divide datasets into small partitions so that individual jobs run for 15-30 minutes • batch of 26 chromosomes (2 600 jobs) required ~ 5 hours wallclock time on Condor but ~ 5 weeks on a single PC

50 Radiotherapy example • large 3rd-party application code which simulates photon beam radiotherapy treatment using Monte Carlo methods • tried running the simulation on 56 cores of a high performance computing cluster but it had made no progress after 5 weeks • divided the problem into 250, then 5 000 and eventually 50 000 Condor jobs • required ~ 2 600 days of CPU time (equivalent to ~ 3.5 years on a dual-core PC) • the Condor simulation completed in less than one week • average run time per job was ~ 70 min

51 Summary • Parallelism can help speed up the solution of many research computing problems by dividing large problems into many smaller ones which can be tackled at the same time • High Performance Computing clusters • Typically used for small numbers of long running jobs • Ideal for applications requiring lots of memory and disk storage space • Almost all systems are UNIX-based • Condor High Throughput Computing Service • Typically used for large/very large numbers of short running jobs • Limited memory and storage available on Condor PCs • Support available for applications using R (and MATLAB) • No UNIX knowledge needed with Desktop Condor

52 Next steps • Condor Service information: http://condor.liv.ac.uk • Information on bioinf1 and HPC clusters: http://clusterinfo.liv.ac.uk • Information on the Advanced Research Computing (ARC) facilities: http://www.liv.ac.uk/csd/advanced-research-computing • To contact the ARC team email: arc-support@liverpool.ac.uk • To request an account on Condor or chadwick use: http://www.liv.ac.uk/media/livacuk/computingservices/help/eScienceform.pdf • For an account on bioinf1 – just ask me ! (i.c.smith@liverpool.ac.uk)

