How to get started on cees Mandy SEP Style

Resources

cees-clusters
- SEP-reserved disk: 20 TB
- SEP-reserved nodes: 35 (currently 25)
- Default max nodes: 149 (8 cores per node)
- Compute node hardware: 2.26 GHz dual-processor quad-core Nehalem

cees-rcf
- SEP-reserved disk: 30 TB
- SEP-reserved nodes: 21 (16 cores per node)
- Default max nodes: 137 (16 cores per node)
- Compute node hardware: Sandy Bridge

Home and working directories

/home/username
- 10 GB quota
- Backed up daily
- Mounted read-only on compute nodes

/data/sep/username
- Everyone has write access to 20 TB in /data/cees
- Not backed up
- SEP partition in /data/sep (20 TB on cees-clusters, 30 TB on cees-rcf)

Options
1) Run your code in /home but use absolute paths to write output to /data
2) Run your code in /data but back up your code in /home

Tip: it is a lot faster to write to /tmp within each node first and then copy back to /data.
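A minimal bash sketch of that tip; the program and file names are placeholders, not from the slides:

# write to node-local /tmp first, then copy the result back to /data at the end
SCRATCH=/tmp/$USER.$$                               # per-job scratch directory on the compute node
mkdir -p $SCRATCH
./mymodeling par=params.P out=$SCRATCH/result.H     # hypothetical SEP-style program
cp $SCRATCH/result.H /data/sep/$USER/
rm -rf $SCRATCH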

Where is SEPlib?

# my own environment variables (csh/tcsh)
setenv SEP /usr/local/SEP
setenv SEPINC /usr/local/SEP/include
setenv SEPBIN /usr/local/SEP/bin
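Those setenv lines are csh/tcsh syntax. If your login shell is bash instead (an assumption about your account, not something the slides state), the equivalent would be:

export SEP=/usr/local/SEP
export SEPINC=$SEP/include
export SEPBIN=$SEP/bin
export PATH=$SEPBIN:$PATH     # optional: put the SEPlib binaries on your PATH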

How to submit a job
Your submission script specifies:
- The number of nodes and cores you need
- The max run time of your job before it is killed (must be < 2 hours for the default queue)
- The stdout and stderr logs
- The queue, either default or sep
- The job name
- The command for your job
Submit the script with qsub.
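A minimal submission-script sketch, assuming the cluster runs a PBS/Torque-style scheduler (the slides only show qsub, so the exact directives are an assumption); node counts, file names, and the program name are placeholders:

#!/bin/bash
# number of nodes and cores per node you need
#PBS -l nodes=4:ppn=8
# max run time before the job is killed (< 2 hours on the default queue)
#PBS -l walltime=01:30:00
# stdout and stderr logs
#PBS -o myjob.out
#PBS -e myjob.err
# queue: default or sep
#PBS -q default
# job name
#PBS -N myjob

cd $PBS_O_WORKDIR              # start in the directory the job was submitted from
./mycode par=params.P          # the command for your job (placeholder program)

Submit it with qsub myjob.pbs; qsub prints the job id it assigns.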

Do not run big jobs on the head node
- Talk to Dennis when moving large datasets
- You can use cees-rcf-tools to test jobs as well

Check jobs
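The original slide showed a terminal screenshot; with the PBS-style scheduler assumed above, the usual commands are:

qstat                  # list all jobs in all queues
qstat -u $USER         # only your own jobs
qstat -f 12345         # full details for job id 12345 (example id)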

Cancel jobs
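Likewise for cancelling (the job ids below are just examples):

qdel 12345             # cancel job 12345
qdel 12345 12346       # several job ids can be given at once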

Typical computation structure: phases that need 40 nodes (e.g. a pre-stack forward or adjoint operation) alternate with phases that need 1 node (e.g. stacking, step sizes, updating). One job or many jobs?

Reserved-queue jobs can run forever; default-queue jobs must finish in 2 hours. Waiting…

The same alternation of 40-node and 1-node phases (pre-stack forward or adjoint vs. stacking, step sizes, updating): submitted as one big job, it holds every node for the whole run. "I am taking over every single node. muahahaha"

Bob's advice
- Break your jobs into 2-hour blocks and use the default queue
- Only store intermediate results on the clusters
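One way to chain 2-hour blocks on the default queue (a sketch assuming Torque-style job dependencies are enabled, not necessarily how Bob's own scripts work) is to make each block start only after the previous one finishes successfully:

# qsub prints the id of the job it just queued, so it can feed the next block's dependency
JOBID=$(qsub block1.pbs)
JOBID=$(qsub -W depend=afterok:$JOBID block2.pbs)
JOBID=$(qsub -W depend=afterok:$JOBID block3.pbs)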

Scripting is useful for job management
On cees-clusters, see /data/sep/mandyman/Tutorial:
1. Embarrassingly parallel job submission
2. Timer to check jobs
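The actual scripts live in /data/sep/mandyman/Tutorial; the bash sketch below only illustrates the two ideas with placeholder names (run_shot.pbs and the shot variable are hypothetical):

# 1. Embarrassingly parallel job submission: one qsub per independent piece of work
for shot in $(seq 1 40); do
    qsub -N shot$shot -v SHOT=$shot run_shot.pbs   # -v passes SHOT into the job's environment
done

# 2. Timer to check jobs: poll qstat until the shot jobs have drained from the queue
while [ "$(qstat -u $USER | grep -c ' shot')" -gt 0 ]; do
    sleep 60
done
echo "all shot jobs finished"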

Sharing resources (we are here now)

Sharing resources