
Rhea Analysis & Post-processing Cluster Robert D. French NCCS User Assistance

2 Rhea Quick Overview
200 Dell PowerEdge C6220 nodes
– 196 compute / 4 login
– RHEL 6.4
– 2 x 8-core Intel Xeon, 2.0 GHz (hyperthreading is enabled, so "top" shows 32 CPUs; see the quick check after this list)
– 64 GB of RAM
– New 56 Gb/s IB fabric
Mounts Atlas
– Does not mount Widow
Replaces Lens
No preemptive queue
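Because hyperthreading doubles the apparent CPU count, the quick check below confirms what "top" reports on a node. It is only a sketch using standard Linux tools (nproc, lscpu); nothing here is Rhea-specific.

    # Count logical CPUs; with hyperthreading on two 8-core Xeons this should report 32
    nproc
    # Break the count down into sockets, cores, and threads per core
    lscpu | grep -E '^(CPU\(s\)|Thread\(s\) per core|Core\(s\) per socket|Socket\(s\))'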

3 Allocation & Billing
Rhea is prioritized as an extra resource for INCITE and ALCC users through the end of the year.
– DD projects may request access
1 node-hour is charged per node per hour
– Ex: 10 nodes for 2 hours = 20 node-hours (worked example below)
Each project will be awarded 1,000 hours per month
– Separate from Titan / Eos usage
– Request more if you run low
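As a worked example of the charging rule (node-hours = nodes x wall-clock hours), here is a throwaway shell calculation; the variable names are ours and not part of any OLCF tool.

    # Node-hours charged = nodes requested x hours of wall time used
    nodes=10
    hours=2
    echo "$((nodes * hours)) node-hours"    # prints: 20 node-hours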

4 Rhea Queue Policy

Job Size         Job Length      Job Limits                       Restricted by
1 – 16 nodes     0 – 12 hours    3 eligible / unlimited running   User
                 12 – 36 hours   2 active                         System
                 36 – 96 hours   1 active                         System
17 – 32 nodes    0 – 12 hours    2 active                         System
                 12 – 36 hours   1 active                         System
33 – 128 nodes   0 – 3 hours     1 active                         System

This policy should keep large jobs from swamping the system, while small runs should complete quickly.
Request a Reservation for more nodes / longer wall-times. (A sample batch script sketch follows.)
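As a rough sketch, a batch script that stays inside the smallest bin above (1-16 nodes, under 12 hours) might look like the following. It assumes Rhea accepts standard PBS directives like other OLCF clusters; the project ID, executable name, and MPI launcher are placeholders, so check the Rhea documentation for the exact launch command.

    #!/bin/bash
    #PBS -A PRJ123              # project to charge (placeholder project ID)
    #PBS -N rhea_analysis       # job name
    #PBS -l nodes=4             # node count, not core count
    #PBS -l walltime=2:00:00    # 4 nodes x 2 hours = 8 node-hours
    #PBS -j oe                  # merge stdout and stderr

    cd $MEMBERWORK/prj123       # Lustre scratch area (described later in these slides)
    mpirun -n 64 ./my_analysis  # 4 nodes x 16 physical cores; launcher name is an assumption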

5 Software Stack
Most Lens software will already be installed. Here are some highlights:
– Visualization: ParaView, VisIt, VMD
– Compilers: GCC, Intel, and PGI
– Scientific languages: MATLAB, Octave, R, SciPy
– Data management: Globus, BBCP, NetCDF, HDF5, ADIOS
– Debugging: DDT, Vampir, Valgrind
The full list of installed software is available on our website (a module sketch follows this list). If you can't find what you need, just ask!
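Installed software on OLCF machines is normally exposed through environment modules; below is a minimal sketch of finding and loading a package. The specific module names are assumptions and may differ on Rhea.

    module avail                # list everything installed
    module avail netcdf         # search for a specific package
    module load gcc             # load a compiler from the list above
    module list                 # confirm what is currently loaded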

6 Transitioning to Rhea
Now: Titan and Lens mount Widow.

7 Transitioning to Rhea
Soon (mid-to-late November): Titan will mount both Atlas and Widow. Move data to Atlas and take advantage of Rhea.

8 Transitioning to Rhea
Near future: Lens will be decommissioned, and Rhea will be the center's viz & analysis cluster.

9 Questions?

Spider II Directory Layout Changes Chris Fuson

11 OLCF Center-wide File Systems
Spider
– Center-wide scratch space
– Temporary; not backed up
– Available from compute nodes
– Fast access to job-related temporary files and for staging large files to and from archival storage
– Contains multiple Lustre file systems

12 Spider I vs. Spider II
Spider I (Widow [1-3]) – current center-wide scratch, to be decommissioned early January 2014:
– 240 GB/s bandwidth
– 10 PB capacity
– 3 MDS, 192 OSS, 1,344 OSTs
Spider II (Atlas [1-2]) – available on additional OLCF systems soon:
– 1 TB/s bandwidth
– 30 PB capacity
– 2 MDS, 288 OSS, 2,016 OSTs

13 Spider II Change Overview
Before using Spider II, please note the following:
1. New directory structure
– Organized by project
– Each project is given a directory on one of the atlas file systems
– WORKDIR is now within the project areas
  » You may have multiple WORKDIRs
  » Requires changes to existing scripts
2. Quota increases
– The increased file system size allows for increased quotas
3. All areas purged
– To help ensure space is available for all projects

14 Spider II Directory Structure – Member Work
[Directory tree: each ProjectID contains Member Work, Project Work, and World Work areas]
Purpose: batch job I/O
Path: $MEMBERWORK/<projid>
Quota: 10 TB
Purge: 14 days
Permissions:
– Users may change permissions to share within the project (see the sketch below)
– No automatic permission changes
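Because no permission changes are made automatically in Member Work, sharing a results directory with other project members could look like the sketch below. The prj123 project ID and the assumption that the project has a matching UNIX group are placeholders; substitute your own values.

    cd $MEMBERWORK/prj123                 # prj123 stands in for your project ID
    chgrp -R prj123 shared_results        # project UNIX group (assumed to match the project ID)
    chmod -R g+rX shared_results          # group read; execute on directories and already-executable files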

15 Spider II Directory Structure – Project Work
Purpose: data sharing within the project
Path: $PROJWORK/<projid>
Quota: 100 TB (a usage check follows)
Purge: 90 days
Permissions:
– Read, write, execute access for project members
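To watch usage against the 100 TB Project Work quota (or the 10 TB Member Work quota on the previous slide), Lustre's quota tool and plain du both work. The /lustre/atlas1 mount point and the prj123 project ID below are guesses, not confirmed paths.

    lfs quota -u $USER /lustre/atlas1     # per-user usage on one of the atlas file systems
    du -sh $PROJWORK/prj123               # total size of this project's Project Work area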

16 Spider II Directory Structure – World Work
Purpose: data sharing with users who are not members of the project
Path: $WORLDWORK/<projid>
Quota: 10 TB
Purge: 14 days
Permissions:
– Read, execute for world
– Read, write, execute for project members

17 Spider II Directory Structure
– New directory structure, organized by project

18 Before Using Atlas
Modify scripts to point to the new directory structure:
– /tmp/work/$USER ($WORKDIR)  →  $MEMBERWORK/<projid>
– /tmp/proj/<projid>  →  $PROJWORK/<projid>
Migrate data: you will need to transfer needed data onto Spider II (atlas); a sketch follows.
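Here is a hedged sketch of both steps: rewriting an old Widow path inside a batch script, then copying data over to atlas. The script name, dataset name, and prj123 project ID are placeholders, and rsync is only one option; the data-management tools listed on the software slide (Globus, BBCP) may be better for very large transfers.

    # 1. Point scripts at the new directory structure
    sed -i 's|/tmp/work/$USER|$MEMBERWORK/prj123|g' my_job_script.pbs

    # 2. Migrate needed data onto Spider II (atlas)
    rsync -a --progress /tmp/work/$USER/my_dataset/ $MEMBERWORK/prj123/my_dataset/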

19 Questions? More information: olcf.ornl.gov

20 Other Items
– Dec 17th: Titan to return to 100%
– 2013 User Survey – available on olcf.ornl.gov

21 Thanks for your time.