UPPMAX and UPPNEX: Enabling high performance bioinformatics Ola Spjuth, UPPMAX

High-performance bioinformatics
Trivial/embarrassingly parallel
– A mass of independent tasks (or a problem divided into independent pieces), run in parallel
– Example: analyzing many sequences independently
Non-trivial parallelism
– A single task spread across many processors (data partitioning)
– Example: molecular dynamics
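The embarrassingly parallel pattern above can be sketched in plain shell: because the tasks are independent, they can simply be launched concurrently and collected with `wait`. The `analyze` function here is a toy stand-in (it just reports the length of each "sequence"); a real workflow would call a per-sequence tool such as an aligner.

```shell
#!/bin/sh
# Embarrassingly parallel pattern: one independent task per input,
# no communication between tasks, so all can run at once.

# Stand-in for a real per-sequence analysis tool: prints the input
# and its length.
analyze() {
    printf '%s %d\n' "$1" "${#1}"
}

# Launch each task in the background, then wait for all to finish.
for seq in ACGT ACGTACGT ACGTACGTACGT; do
    analyze "$seq" &
done
wait
```

Note that output order is nondeterministic, since the tasks run concurrently; on a real cluster the same idea is usually expressed as a job array rather than background processes.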

Resources for high-performance computing (HPC)
Supercomputers
– "a computer at the frontline of current processing capacity, particularly speed of calculation"
Clusters
– Many processors in close physical proximity
Grid computing
– Distributed systems spanning sites (e.g. joined clusters)

UPPMAX
Uppsala University's resource for high-performance computing (HPC) and related know-how
– Computational clusters: 6,000 cores
– Storage: 1.4 PB of parallel storage

A project at UPPMAX
– 13,152 MSEK from KAW/SNIC
– ~1 M CPU hours/month on a shared cluster (kalkyl)
– ~1 PB of cluster-attached parallel storage (bubo)
– Long-term storage on SweStore (>1 PB)
– SMP machine: 64 cores, 2 TB RAM (halvan)

The cluster kalkyl
348 nodes with 8 cores each
– 324 nodes with 24 GB RAM
– 16 nodes with 48 GB RAM
– 16 nodes with 72 GB RAM
– Total: 2,784 cores
SLURM queuing system
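On a SLURM-managed cluster like kalkyl, work is submitted as batch scripts whose `#SBATCH` directives tell the scheduler what to allocate. A minimal sketch follows; the project ID, partition name, module, and program are placeholders, not actual UPPMAX values.

```shell
#!/bin/bash -l
# Sketch of a SLURM batch script for a kalkyl-style cluster.
# Account, partition, module, and program names below are
# hypothetical examples.
#SBATCH -A b2010000        # project/account charged for the core hours
#SBATCH -p node            # request a full node (8 cores on kalkyl)
#SBATCH -N 1               # one node
#SBATCH -t 04:00:00        # wall-clock time limit
#SBATCH -J seq_analysis    # job name shown in the queue

module load bioinfo-tools  # load site-provided software (placeholder)
srun my_analysis input.fastq
```

The script is submitted with `sbatch script.sh`; SLURM queues it until the requested resources become free, which is how thousands of users can share a fixed pool of cores.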

UPPNEX data flow

Knowledge Base / Community website

UPPNEX Application Experts
– Assist with NGS analysis
– Available via mailing list or by direct contact

Project growth

UPPNEX storage usage

Used CPU core hours / month
(Note: one-week maintenance stop for the move to a new computer hall)

A typical day at UPPMAX

UPPNEX software used

Conclusions: Community needs (storage)
– Access to high-availability storage
– Access to long-term storage
– A sustainable file infrastructure

Conclusions: UPPNEX main challenges
– Supporting new types of HPC users and usage
– Keeping up with the flood of bioinformatics software
– Managing data growth (previously the focus was on computation only)