Operational and Application Experiences with the InfiniBand Environment
Sharon Brunett, Caltech
May 1, 2007

2 Outline
- Production environment using InfiniBand
  - Hardware configuration
  - Software stack
  - Usage model
- InfiniBand particulars
  - Sample application
  - Benchmarks
  - Issues
- A less challenging future
  - A collection of hoped-for improvements

3 Opteron/InfiniBand Cluster Configuration (diagram)
- Head/login node (shc.cacr.caltech.edu): quad-CPU, dual-core AMD Opteron, 2.2 GHz, 16 GB memory
- 86 compute nodes: dual-CPU, dual-core AMD Opteron, 2.2 GHz, 8 GB memory, 256 GB scratch
- 38 compute nodes: dual-CPU, dual-core AMD Opteron, 2.4 GHz, 8 GB memory, 256 GB scratch
- Interconnects: Voltaire InfiniBand switch; Extreme Networks Black Diamond 8810 copper GigE
- Storage: ~25 TB /pvfs/data-store02 and ~25 TB /pvfs/data-store03; ~24 TB (RAID6) /nfs/data-store01 on a dual-CPU, dual-core Opteron NFS server with 16 GB memory

4 Compute Resource Utilization Summary
- Even balance between active projects
- 76% utilization for 2007, up from 64.9% in 2006
- Mix of development and production jobs
  - Typically ranging in size from 4 to 32 nodes, 2 to 24 hours
- Approximately 100 user accounts, 5 partner projects

5 Production Environment
- Software stack impacting the Golden Image
  - SLES9 (security patched) kernel version
  - Mellanox InfiniBand drivers v3.5.5 (no sources available to us)
  - Parallel Virtual File System (PVFS) v2
  - OpenMPI (2.1.X)
  - Torque
  - Maui
- Software stack: user tools
  - Plotting and data visualization: Tecplot
  - Debugger: TotalView
  - Numerical computing environment/language: MATLAB
  - Portable Extensible Toolkit for Scientific Computation: PETSc
  - Hierarchical Data Format (HDF) v4, v5
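(Not from the slides.) As an illustration of the usage model this stack supports, a typical job is an MPI program compiled against the installed OpenMPI and launched through Torque/Maui. A minimal sanity check of that path might look like the sketch below; the file name and the idea of running it as a first test are illustrative assumptions.

```c
/* ib_hello.c - hypothetical sanity check, not part of the slides.
 * Compiled with the cluster's OpenMPI (e.g. mpicc ib_hello.c -o ib_hello)
 * and launched through Torque/Maui, it confirms that every allocated node
 * can join an MPI job over the InfiniBand fabric. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, namelen;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(name, &namelen);

    printf("rank %d of %d on %s\n", rank, size, name);

    MPI_Finalize();
    return 0;
}
```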

6 SCS Grains Simulation
- Highly resolved simulations of shear compression polycrystal specimen tests
- Production run stats
  - LLNL's alc: 12 hours on 118 CPUs, 900K steps, 4.4 GB of dumps

7 Sample Application MPI Profile
- As problem size grows, the MPI impact lessens due to better load balancing
- MPI_Waitall and MPI_Allreduce are the major time consumers
- Run smaller benchmarks for tuning suggestions (a minimal timing sketch follows)
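The slides do not show the application's instrumentation, but the kind of per-call accounting behind an MPI profile like this can be sketched with plain MPI_Wtime timers. The ring exchange, message size, and iteration count below are illustrative assumptions, not the SCS Grains code:

```c
/* Sketch of hand instrumentation that attributes time to MPI_Waitall and
 * MPI_Allreduce phases.  The communication pattern is a stand-in, not the
 * real application's. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int n = 1 << 16;                       /* assumed message size */
    double *sendbuf = malloc(n * sizeof(double));
    double *recvbuf = malloc(n * sizeof(double));
    for (int i = 0; i < n; i++) sendbuf[i] = rank + i;

    int left  = (rank - 1 + size) % size;
    int right = (rank + 1) % size;
    MPI_Request req[2];
    double t_wait = 0.0, t_allred = 0.0, t0;

    for (int step = 0; step < 100; step++) {
        /* nonblocking ring exchange, completed by MPI_Waitall */
        MPI_Irecv(recvbuf, n, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &req[0]);
        MPI_Isend(sendbuf, n, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &req[1]);

        t0 = MPI_Wtime();
        MPI_Waitall(2, req, MPI_STATUSES_IGNORE);
        t_wait += MPI_Wtime() - t0;

        /* global reduction, timed separately */
        double local = sendbuf[0], global;
        t0 = MPI_Wtime();
        MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
        t_allred += MPI_Wtime() - t0;
    }

    if (rank == 0)
        printf("MPI_Waitall: %.3f s   MPI_Allreduce: %.3f s\n", t_wait, t_allred);

    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}
```

Comparing the two accumulated totals against overall wall time shows how much of a run is spent in MPI_Waitall versus MPI_Allreduce, which is the kind of observation that motivates the PMB runs on the next slides.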

8 PMB PingPong

9 PMB PingPong
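Slides 8 and 9 are PMB (Pallas MPI Benchmarks) PingPong charts, which are not reproduced in this transcript. For reference, a minimal ping-pong loop in the same spirit, with one arbitrary message size and iteration count rather than PMB's size sweep, looks like this:

```c
/* Minimal ping-pong microbenchmark sketch (not the PMB source).
 * Run with exactly two ranks, ideally on different nodes. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int iters = 1000;
    const int bytes = 1 << 20;                   /* 1 MiB message; vary to sweep sizes */
    char *buf = malloc(bytes);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t = (MPI_Wtime() - t0) / (2.0 * iters);   /* one-way time */

    if (rank == 0)
        printf("%d bytes: %.2f us, %.1f MB/s\n", bytes, t * 1e6, bytes / t / 1e6);

    free(buf);
    MPI_Finalize();
    return 0;
}
```

It reports one-way latency and the implied bandwidth for the chosen message size, the two quantities the PMB PingPong charts plot.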

10 PMB MPI_AllReduce
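Slide 10's PMB MPI_Allreduce chart is likewise omitted here. A minimal stand-in for the Allreduce test, again with an assumed message size and iteration count, times the collective across all ranks and reports the slowest rank:

```c
/* Sketch of an MPI_Allreduce timing loop in the spirit of the PMB Allreduce
 * test (not the PMB source). */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int iters = 1000;
    const int count = 4096;                      /* doubles per reduction */
    double *in  = malloc(count * sizeof(double));
    double *out = malloc(count * sizeof(double));
    for (int i = 0; i < count; i++) in[i] = rank + i;

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++)
        MPI_Allreduce(in, out, count, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    double local = (MPI_Wtime() - t0) / iters, worst;

    /* report the slowest rank, analogous to PMB's t_max column */
    MPI_Reduce(&local, &worst, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("%d ranks, %zu bytes: %.2f us per MPI_Allreduce\n",
               size, count * sizeof(double), worst * 1e6);

    free(in);
    free(out);
    MPI_Finalize();
    return 0;
}
```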

11 Tuning Tests Revealed InfiniBand Issues
- The Port Management (PM) facility gives the sysadmin/user the ability to analyze and maintain the InfiniBand environment
  - Particular ports had high PortRcvErrors, indicative of a bad link
  - Moving cables and swapping in a new IB blade isolated the problem further
- Congestion reduced by a configurable threshold limit (HOQLife, the head-of-queue lifetime)

12 Problem IB Blade Identified, New Challenges Arise
- Servicing the InfiniBand switch, as currently installed, is no picnic
  - Note how working parts need to be dismantled to access the parts needing service
  - Cable tracing and stress need attention
- Line boards can take multiple re-seatings before they're "snug"
- As Mark says, hardware should be treated like a delicate flower

13 Lessons Learned
- Sections of the code with MPI collective calls are sensitive to message lengths and process counts
- Run indicative benchmarks as part of the production run setup process
- Use Voltaire's PM utility to routinely monitor the fabric for problems
  - Functionality and performance
- Buy dinner for Trent and Ira
  - Test out linkcheck and ibcheckfabric on our little cluster

14 Making our Lives Easier
- Mellanox drivers -> OpenIB?
- Locally built golden image gives flexibility but has drawbacks
- Automatic probing of PM counter report files to compare against "known good" states (see the sketch after this slide)
  - Report suspect components
- Use standard/factory benchmarks to verify the InfiniBand cluster is working at the customer site as well as when the integrated system shipped!
  - Increasingly important as the cluster expands
- Incorporate low-level PM facilities into support-level tools for better integrated monitoring
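(Not from the slides.) The "automatic probing of PM counter report files" item could start as simply as the hypothetical checker sketched below. It assumes the counters have been dumped to a plain "CounterName value" text file; both that format and the exact counter names are assumptions for illustration, not Voltaire's actual report layout.

```c
/* pm_check.c - hypothetical PM counter report checker.
 * Reads "CounterName value" lines and flags any error counter that is
 * nonzero, i.e. has moved away from the known-good (all-zero) state. */
#include <stdio.h>
#include <string.h>

/* counters that should stay at zero on a healthy fabric (illustrative list) */
static const char *error_counters[] = {
    "SymbolErrors", "LinkDowned", "PortRcvErrors", "PortXmitDiscards", NULL
};

static int is_error_counter(const char *name)
{
    for (int i = 0; error_counters[i]; i++)
        if (strcmp(name, error_counters[i]) == 0)
            return 1;
    return 0;
}

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s counter_report.txt\n", argv[0]);
        return 1;
    }
    FILE *f = fopen(argv[1], "r");
    if (!f) { perror(argv[1]); return 1; }

    char name[128];
    long value;
    int suspects = 0;
    while (fscanf(f, "%127s %ld", name, &value) == 2) {
        if (is_error_counter(name) && value > 0) {
            printf("SUSPECT: %s = %ld\n", name, value);
            suspects++;
        }
    }
    fclose(f);

    printf("%d suspect counters found\n", suspects);
    return suspects > 0;   /* nonzero exit status flags a problem */
}
```

Run periodically against fresh PM dumps, a checker like this is one possible form of the "report suspect components" step, feeding the better-integrated monitoring the slide asks for.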