The Cambridge Research Computing Service

Presentation transcript:

The Cambridge Research Computing Service: a brief overview. Dr Paul Calleja, Director of Research Computing, University of Cambridge

From EDSAC to OpenStack
Cambridge has had an in-house research computing development and provision service in place for the last 77 years: EDSAC (1949), Darwin (1996) and now CSD3.

Strategy – what are we trying to achieve?
Deliver cutting-edge research capability using state-of-the-art technologies within the broad area of data-centric high-performance computing, driving research discovery and innovation.
Strategic outcomes:
- World-class, innovative, data-centric research computing provision
- A diversity of high-value, user-driven services
- Drive research discovery and innovation within Cambridge and the national science communities that we serve
- Deliver economic impact within the UK economy

Strategy – how do we get there?
Continue in-house technology innovation, currently focused on:
- Convergence of HPC and OpenStack technologies
- Next-generation tiered storage, with a strong focus on parallel file systems and NVMe
- Large-scale genomics analysis software
- Hospital clinical informatics platforms
- Data analytics and machine learning platforms
- Data visualisation platforms
Continue to build best-in-class, in-house capability in:
- System design, integration and solution support
- User support
- Scientific support
- RSE (6 FTE)

Delivery focus Driving Discovery, Innovation & Impact

Team structure 28 FTE across 6 groups

Cambridge research computing capability
Highly resilient HPC data centre: 100 cabinets, 30 kW water-cooled racks, 1,000 kW IT load.
People: a 28 FTE technical team, with skill focus in:
- HPC system integration
- Large-scale storage
- OpenStack development and deployment
- Scientific support
Systems: 3.7 PF (2,500 x86 and GPU servers), a 280-node Hadoop system, and 23 PB of storage (Intel Lustre plus tape). Value of equipment in service ~£20M.

Research computing usage and outputs
- 1,600 active users from 387 research groups across 42 University departments, plus national HPC users
- Usage has grown at 28% CAGR for the last 9 years (illustrated in the sketch below), and growth is expected to increase further with OpenStack usage models
- Research computing services support a current active grant portfolio of £120M – which represents 8% of the University's annual grant income
- Underpinning 2,000 publications over the last 9 years; current output is ~300 per year
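As a rough illustration of what a 28% CAGR means in practice, the short Python sketch below compounds that rate over the nine-year period quoted above; the baseline of 100 usage units is an arbitrary assumption for the example, not a figure from the slide.

```python
# Illustrative only: the cumulative effect of a 28% CAGR over 9 years.
# The baseline of 100 "usage units" is a hypothetical starting point.

cagr = 0.28        # compound annual growth rate quoted on the slide
years = 9          # growth period quoted on the slide
baseline = 100.0   # assumed starting usage level (arbitrary units)

growth_factor = (1 + cagr) ** years
print(f"Growth factor over {years} years: {growth_factor:.1f}x")       # ~9.2x
print(f"Usage after {years} years: {baseline * growth_factor:.0f} units")
```

In other words, sustaining 28% year-on-year growth for nine years implies roughly a nine-fold increase in usage over the period.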

CSD3 Platform (diagram)

Peta-4 & Wilkes: leading UK academic systems
Peta-4 – largest open-access academic x86 system in the UK: 1,152 32-core Skylake nodes (~36,000 cores), 2.0 PF; the fastest academic supercomputer in the UK. KNL: 341 nodes, 0.5 PF. We can run this as a single heterogeneous system yielding 2.4 PF.
Wilkes-2 – largest open-access academic GPU system in the UK: 360 P100 GPUs, 1.2 PF (20% over design performance).

Solid state I/O accelerator

What does it look like?
24 Dell EMC PowerEdge R740xd servers, each with 24 Intel SSD P4600 drives. 0.5 PB of total available space; ~500 GB/s read, 350 GB/s write; number 2 in the IO-500. Integration with SLURM and a flexible storage orchestrator allows it to be reconfigured to provide maximum performance.

Cambridge HPC services
- Central HPC and data analytics service: pay-per-use access to large central HPC and storage systems (x86, KNL, GPU)
- Research computing cloud (new in 2018): infrastructure as a service, clinical cloud VM service, scientific OpenStack cloud for IRIS (see the sketch below)
- Secure data storage and archive service (new in 2019): NHS IGT, ISO 27001
- Data Analytics Service (beta): Hadoop / Spark
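For the OpenStack-based research computing cloud listed above, self-service provisioning of a VM might look something like the sketch below, using the standard openstacksdk client. The cloud entry, image, flavor, network and key-pair names are hypothetical placeholders, not the actual Cambridge service configuration.

```python
# Minimal sketch of launching a VM on an OpenStack cloud with openstacksdk.
# All resource names below (cloud entry, image, flavor, network, key pair)
# are hypothetical placeholders, not the real Cambridge cloud configuration.
import openstack

# Credentials and endpoint are read from the named entry in clouds.yaml.
conn = openstack.connect(cloud="research-cloud")

server = conn.create_server(
    name="analysis-vm-01",
    image="ubuntu-22.04",    # assumed image name
    flavor="m1.large",       # assumed flavor name
    network="project-net",   # assumed tenant network
    key_name="my-keypair",   # assumed pre-registered SSH key pair
    wait=True,               # block until the server reaches ACTIVE
)
print(server.name, server.status)
```

In a typical OpenStack deployment, bare-metal and Slurm-as-a-service offerings can be exposed through the same APIs via dedicated flavors and images, so a similar call pattern would cover those services as well.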

Cambridge HPC services
Bio-lab:
- Develop, deploy and support the Open-CB next-generation genomics analysis platform
- Deploy and support biocomputing scientific gateways
- Deploy and support a wide range of medical imaging, microscopy and structure-determination platforms
Scientific computing support:
- A team of scientific programming experts who provide in-depth application development support to users
- Very flexible support model: able to pool fractional FTE funds from grants into part-time FTE support on a long-term basis

Cambridge HPC services
HPC and Big Data innovation lab:
- Holds a large range of test/dev HPC and data analytics hardware, with dedicated lab engineering resource
- Open to third-party use; used to drive HPC and Big Data R&D for RCS and our customers
- Strong industrial supply-chain collaboration and strong user-driven inputs
- Outputs: proofs of concept, case studies and white papers
- Drives innovation in research computing solution development and usage for both the University and the wider community
System design, procurement and managed hosting service for group-owned resources.

IRIS at Cambridge
36 32-core Skylake nodes with 384 GB RAM, dual low-latency 25 GbE and OPA; 1 PB Lustre. Provisioned as core hours – 2,270,592 per quarter (see the check below). Offered as bare metal, OpenStack VMs and OpenStack Slurm as a service. Available in the next few weeks for onboarding IRIS users.
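As a back-of-the-envelope check on the quarterly allocation, the sketch below relates the 2,270,592 core-hour figure to the 36 x 32-core nodes; the 90% availability factor is an inference that makes the arithmetic come out exactly, not something stated on the slide.

```python
# Back-of-the-envelope check on the IRIS quarterly core-hour figure.
# The 90% availability factor is an assumption inferred to make the
# arithmetic match the quoted allocation; it is not stated on the slide.

nodes = 36
cores_per_node = 32
total_cores = nodes * cores_per_node          # 1,152 cores

hours_per_quarter = 365 / 4 * 24              # 2,190 wall-clock hours
availability = 0.90                           # assumed availability factor

allocation = total_cores * hours_per_quarter * availability
print(f"{total_cores} cores x {hours_per_quarter:.0f} h x {availability:.0%}"
      f" = {allocation:,.0f} core hours")     # 2,270,592, matching the slide
```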

The EPSRC Tier 2

CSD3 EPSRC Tier 2 ecosystem
- Oxford: £3M, 22 8-way DGX, 492 TF, #426 in the Top500, 1 PB PFS
- EPCC: £2.4M, 8,960-core Broadwell, FDR, 300 TF, 0.5 PB PFS
- CSD3 (Cambridge): £5M EPSRC, £1.2M DiRAC, £2.8M Cambridge; 24,000-core Skylake, 360 P100, 341 KNL, 5 PB PFS, 10 PB tape, 1 PB SSD, 80-node Hadoop – total 3.1 PF (SKX + KNL = 1.7 PF, #75 in the Top500; GPU system 1.2 PF, #100 in the Top500) – OPA and EDR islands at 2:1
- UCL: £4M, 17,000-core Broadwell, 523 TF, #395 in the Top500, OPA islands 3:1, 1 PB PFS
- Loughborough: £3.2M, 14,000-core Broadwell, 499 TF, #395 in the Top500, OPA islands 3:1
- Bristol: £3M, 10,000-core Arm ThunderX2, Cray, ~300 TF, Aries interconnect

DiRAC national HPC service
Cambridge is a long-standing DiRAC delivery partner: ~500 Skylake nodes plus 13% of our KNL and GPU systems, and 3 PB of Lustre. Co-development partner on the DAC (Data Accelerator) and OpenStack.