Presentation on theme: "CASC Spring Meeting 2012 Craig A. Stewart"— Presentation transcript:
1 Penguin Computing / IU Partnership
HPC “cluster as a service” and Cloud Services
CASC Spring Meeting 2012
Craig A. Stewart, Executive Director, Pervasive Technology Institute; Associate Dean, Research Technologies
Matthew Jacobs, SVP Corporate Development
Penguincomputing.com
3 What is POD Services
On-demand HPC system
Compute, storage, low latency fabrics, GPU, non-virtualized
Robust software infrastructure
Full automation
User and administration space controls
Secure and seamless job migration
Extensible framework
Complete billing infrastructure
Services
Custom product design
Site and workflow integration
Managed services
Application support
HPC support expertise
Skilled HPC administrators
Leverage 13 yrs serving HPC market
Internet (150Mb, burstable to 1Gb)
4 Penguin HPC Cloud Services
Penguin Computing on Demand
  True HPC in the cloud on a pay-as-you-go basis
  Overflow, Targeted workload, Targeted user set
Post-Purchase Collocation
  Collocation services provided by Penguin
  Cost Reduction, Budget reallocation
Public-Private On-Demand Partnerships
  Penguin-owned and operated PODs, hosted at academic or government facilities
  Revenue sharing, Augment local resources, Self-sustaining growth
POD Hybrid
  On-premise cluster designed to mean usage + POD
  Save on initial capital outlay while sustaining high service level to users
OEM HPC Cloud
  POD distribution to internal or external customers
  Augment local resource, expertise, fund growth
HPC SaaS Platform
  Hosting platform for SaaS providers
  On-demand delivery platform for ISVs
Turnkey Managed Services
  Remote managed services for Penguin and non-Penguin clusters
  Augment local expertise, reduce costs
5 Scyld HPC Cloud Management System
Created by POD Developers and Administrators
Create and Manage User and Group Hierarchies
Simultaneously Manage Multiple Collocated Clusters
Create Customer Facing Web Portals
Use Web Services to Integrate with Back-End Systems
Deploy HTML5 Based Cluster Management Tools
Securely Migrate User Workloads
Efficiently Schedule and Manage Cluster Resources
Create and Deploy Virtual Headnodes for User-Specific Clusters
6 12 Million Commercial Jobs and Counting…
Current data centers: Salt Lake City, Indiana University, Mountain View
1,500 cores (AMD and Intel)
240 TB on-demand storage
Replaced in-house image analysis cluster with POD and co-located storage
Provides cloud analysis services on POD for world-wide bioinformatics customers
Replaced Amazon AWS cloud usage with PODTools workflow migration system
Nihon ESI provides crash analysis to Honda R&D during Japan’s brown-outs
7 The POD Advantage
Persistent, customized user environment
High-speed Intel and AMD compute nodes (physical)
Fast access to local storage (data guaranteed to be local)
Highly secure (https, shared key authentication, IP matching, VPN)
Billed by the fractional core hour
HPC expertise included (Penguin’s core business for many years)
Cluster software stack included
Troubleshooting included in support
Collocated storage options available
Highly dependable and dynamically scalable
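To make "billed by the fractional core hour" concrete, here is a minimal sketch of how such a charge could be computed. It assumes usage is metered as cores multiplied by wall-clock time, with no rounding up to whole hours. The function names and the rate in the usage line are hypothetical placeholders, not Penguin's actual billing code or price list.

```python
def core_hours(cores: int, wall_seconds: float) -> float:
    """Fractional core-hours consumed by a job (assumed: cores x wall time)."""
    return cores * wall_seconds / 3600.0


def job_cost(cores: int, wall_seconds: float, rate_per_core_hour: float) -> float:
    """Charge for a job at a given per-core-hour rate (rate is a placeholder)."""
    return core_hours(cores, wall_seconds) * rate_per_core_hour


if __name__ == "__main__":
    # One 48-core node for 15 minutes, at a made-up rate of 0.10 per core-hour.
    used = core_hours(48, 15 * 60)
    print(f"{used:.2f} core-hours, cost {job_cost(48, 15 * 60, 0.10):.2f}")
```

Under whole-hour billing the same 15-minute job would be charged for 48 core-hours instead of 12, which is the difference the slide is pointing at.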
8 Clouds look serene enough - But is ignorance bliss?
In the cloud, do you know:
Where your data are?
What laws prevail over the physical location of your data?
What license you really agreed to?
What is the security (electronic / physical) around your data?
And how exactly do you get to that cloud, or get things out of it?
How secure your provider is financially? (The fact that something seems unimaginable, like cloud provider such-and-such going out of business abruptly, does not mean it is impossible!)
9 Penguin Computing & IU partner for “Cluster as a Service”
Just what it says: Cluster as a Service
Cluster physically located on IU’s campus, in IU’s Data Center
Available to anyone at a .edu or FFRDC (Federally Funded Research and Development Center)
To use it:
Go to podiu.penguincomputing.com
Fill out registration form
Verify via your
Get out your credit card
Go computing
This builds on Penguin’s experience - Penguin currently hosts Life Technologies' BioScope and LifeScope in the cloud (http://lifescopecloud.com)
10 We know where the data are … and they are secure
11 An example of NET+ Services / Campus Bridging
“We are seeing the early emergence of a meta-university — a transcendent, accessible, empowering, dynamic, communally constructed framework of open materials and platforms on which much of higher education worldwide can be constructed or enhanced.” Charles Vest, president emeritus of MIT, 2006
NET+ Goal: achieve economy of scale and retain reasonable measure of control
See: Brad Wheeler and Shelton Waggener, “Above-Campus Services: Shaping the Promise of Cloud Computing for Higher Education,” EDUCAUSE Review, vol. 44, no. 6 (November/December 2009).
Campus Bridging goal – make it all feel like it’s just a peripheral to your laptop (see pti.iu.edu/campusbridging)
12 IU POD – Innovation Through Partnership
True On-Demand HPC for Internet2
Creative Public/Private model to address HPC shortfall
Turning lost EC2 dollars into central IT expansion
Tiered channel strategy expansion to EDU sector
Program and discipline-specific enhancements under way
Objective third party resource for collaboration
EDU, Federal and Commercial
13 POD IU (Rockhopper) specifications
Server Information
Architecture: Penguin Computing Altus 1804
TFLOPS: 4.4
Clock Speed: 2.1GHz
Nodes: 11 compute; 2 login; 4 management; 3 servers
CPUs: 4 x 2.1GHz 12-core AMD Opteron 6172 processors per compute node
Memory Type: Distributed and Shared
Total Memory: 1408 GB
Memory per Node: 128GB 1333MHz DDR3 ECC
Local Scratch Storage: 6TB locally attached SATA2
Cluster Scratch: 100TB Lustre
Further Details
OS: CentOS 5
Network: QDR (40Gb/s) InfiniBand, 1Gb/s Ethernet
Job Management Software: SGE
Job Scheduling Software:
Job Scheduling policy: Fair Share
Access: key-based ssh login to headnodes; remote job control via Penguin's PODShell
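As a sanity check on the Rockhopper figures, the short sketch below recomputes the aggregate numbers from the per-node ones. Node count, sockets, cores, clock speed, and memory per node come from the slide. The value of 4 double-precision floating-point operations per core per cycle is an assumption about the Opteron 6172 that the slide does not state, and the function names are illustrative only.

```python
# Figures taken from the Rockhopper specification slide.
COMPUTE_NODES = 11
SOCKETS_PER_NODE = 4
CORES_PER_SOCKET = 12
CLOCK_GHZ = 2.1
MEMORY_PER_NODE_GB = 128

# Assumption (not on the slide): 4 double-precision FLOPs per core per cycle.
FLOPS_PER_CYCLE = 4


def total_cores() -> int:
    """Compute cores across all compute nodes."""
    return COMPUTE_NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET


def peak_tflops() -> float:
    """Theoretical peak: cores x clock (GHz) x FLOPs/cycle, converted to TFLOPS."""
    return total_cores() * CLOCK_GHZ * FLOPS_PER_CYCLE / 1000.0


def total_memory_gb() -> int:
    """Aggregate memory across compute nodes."""
    return COMPUTE_NODES * MEMORY_PER_NODE_GB


if __name__ == "__main__":
    print(f"cores: {total_cores()}")
    print(f"peak TFLOPS: {peak_tflops():.1f}")
    print(f"total memory (GB): {total_memory_gb()}")
```

Under that assumption, the 528 compute cores give a theoretical peak that rounds to the 4.4 TFLOPS on the slide, and 11 nodes at 128 GB each give the listed 1408 GB. Login, management, and server nodes are not counted.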
14 Available applications at POD IU (Rockhopper)
Package name: Summary
COAMPS: Coupled ocean / atmosphere mesoscale prediction system
Desmond: Desmond is a software package developed at D. E. Shaw Research to perform high-speed molecular dynamics simulations of biological systems on conventional commodity clusters.
GAMESS: GAMESS is a program for ab initio molecular quantum chemistry.
Galaxy: Galaxy is an open, web-based platform for data intensive biomedical research.
GROMACS: GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles.
HMMER: HMMER is used for searching sequence databases for homologs of protein sequences, and for making protein sequence alignments.
Intel: compilers and libraries
LAMMPS: LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator.
MM5: The PSU/NCAR mesoscale model (known as MM5) is a limited-area, nonhydrostatic, terrain-following sigma-coordinate model designed to simulate or predict mesoscale atmospheric circulation. The model is supported by several pre- and post-processing programs, which are referred to collectively as the MM5 modeling system.
mpiBLAST: mpiBLAST is a freely available, open-source, parallel implementation of NCBI BLAST.
NAMD: NAMD is a parallel molecular dynamics code for large biomolecular systems.
15 Available applications at POD IU (Rockhopper)
Package name: Summary
NCBI-Blast: The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches.
OpenAtom: OpenAtom is a highly scalable and portable parallel application for molecular dynamics simulations at the quantum level. It implements the Car-Parrinello ab-initio Molecular Dynamics (CPAIMD) method.
OpenFoam: The OpenFOAM® (Open Field Operation and Manipulation) CFD Toolbox is a free, open source CFD software package produced by OpenCFD Ltd. It has a large user base across most areas of engineering and science, from both commercial and academic organisations. OpenFOAM has an extensive range of features to solve anything from complex fluid flows involving chemical reactions, turbulence and heat transfer, to solid dynamics and electromagnetics.
OpenMPI: InfiniBand-based Message Passing Interface - 2 (MPI-2) implementation
POP: POP is an ocean circulation model derived from earlier models of Bryan, Cox, Semtner and Chervin in which depth is used as the vertical coordinate. The model solves the three-dimensional primitive equations for fluid motions on the sphere under hydrostatic and Boussinesq approximations.
Portland Group: compilers
R: R is a language and environment for statistical computing and graphics.
WRF: The Weather Research and Forecasting (WRF) Model is a next-generation mesoscale numerical weather prediction system designed to serve both operational forecasting and atmospheric research needs. It features multiple dynamical cores, a 3-dimensional variational (3DVAR) data assimilation system, and a software architecture allowing for computational parallelism and system extensibility.