Published by Eric Lang. Modified over 6 years ago.
1
The Palmolive Effect: A model for implementing OpenPOWER hardware into a heterogeneous infrastructure
3
Jobs Run on Mixed Architecture While Users Get Coffee
Creating a mixed architecture that incorporates the best new hardware without the users knowing allows for deeper processing of data. Users inherently know nothing about architectures, or even what they are currently running on in terms of desktop computers. Why would we as administrators force them to manage workloads across machines of which they have no formal knowledge? It is our responsibility to lower the activation energy for all users to take advantage of resources without having to gain specific knowledge.
4
What is the CGRB at Oregon State
Why Do Computational Methods and Techniques Need a Heterogeneous Environment
How to Manage User Settings Across Multiple Architectures
Schedulers Help Users Run Jobs Across Different Architectures
Running Jobs within a Heterogeneous Architecture
Examples of building tools on OpenPOWER based systems
5
Core Facilities & Computational Science
Building Infrastructure for Researchers
Tools for Data Mining & Data Processing
Building New Algorithms / New Tools
Creating Deliverables for Publications
Ways to Reduce Cost and Increase Scope
6
Building Infrastructure
CGRB Computational Infrastructure
Started out as a single machine in 2000 (4 processors, 4 GB RAM).
A small cluster was added in 2002, allowing users to process more than a limited number of jobs at a time (20 machines, 40 processors, 512 GB RAM).
The CGRB grew this over time to always provide a limited set of resources to all users.
Groups were encouraged to add to the cluster by including processing hardware in grants to support research work.
Researchers pay to have hardware managed by the core facility and included in our infrastructure, with access provided to their research group.
CGRB created a storage server where users can add storage to support research, under a cost model that can be included in grants.
CGRB created pathways allowing general users to rent processing, rent storage, web and time from our core facility.
7
CGRB Infrastructure
GENOME Cloud
~20,000 Jobs / Day
4100 Processors
3.5+ PB Redundant Storage
10 machines with greater than 1 TB of memory
6x POWER8 Systems
Increase access to resources
Decrease analysis time
Increase Network Speed
8
Why Do Computational Methods and Techniques Need a Heterogeneous Environment
9
Non-Heterogeneous Infrastructure
Limits users and researchers to one set of tools.
No ability to negotiate price, since you’re stuck in a single architecture.
Many x86 based machines get easily overloaded.
Limits Input/Output pathways.
A single processor vendor puts the users in line with only that vendor’s forward pathway.
10
Heterogeneous Infrastructure
Provides users with a cafeteria of tools coming from any architecture.
Can negotiate price, since users can use any architecture that can run the tools.
Allows users to find pathways around limits on Input/Output pathways.
Multiple processor vendors put the users into the forward pathway that best fits their needs.
11
How to Manage User Settings Across Multiple Architectures
12
Use Environment Settings
Use shell and environment settings to help manage user information across multiple architectures.
Commands like “uname” provide information about the platform and other important configuration details.
Use different global settings files that users can source to provide architecture-specific paths and settings.
Keep separate directories for each architecture’s binaries so users can easily work with tools.
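As a rough illustration of the last two points, a global settings file that users source could put the matching per-architecture directory on their PATH. This is only a sketch in bash: the TOOLS_ROOT location, the <root>/<arch>/bin layout, and the add_arch_path helper are assumptions made for the example, not details taken from the CGRB setup.

```bash
# Hypothetical site-wide settings file, e.g. sourced from a user's ~/.bashrc.
# TOOLS_ROOT and the <root>/<arch>/bin layout are assumptions for illustration.
TOOLS_ROOT="${TOOLS_ROOT:-/local/cluster}"

# Prepend the architecture-specific bin directory to PATH, only once.
add_arch_path() {
    local arch_bin="${TOOLS_ROOT}/$1/bin"
    case ":${PATH}:" in
        *":${arch_bin}:"*) ;;                  # already on PATH, nothing to do
        *) PATH="${arch_bin}:${PATH}" ;;
    esac
    export PATH
}

# `uname -m` prints the machine hardware name of the current host.
add_arch_path "$(uname -m)"
```

Because the same file is sourced on every node, a user types the same tool name everywhere and the shell finds whichever build matches the machine they landed on.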
13
Using “uname” to set ARCH
CSH BASH
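The original slides showed this code as images, which did not survive in the transcript. A minimal bash sketch of the idea might look like the following; the normalize_arch helper and the set of labels it returns are hypothetical choices for this example, not the presenter's actual code.

```bash
#!/usr/bin/env bash
# Map the raw machine name from `uname -m` to a short architecture label.
# The helper name and the label set are illustrative assumptions.
normalize_arch() {
    case "$1" in
        x86_64|amd64)  echo "x86_64" ;;
        ppc64le)       echo "ppc64le" ;;
        ppc64)         echo "ppc64" ;;
        aarch64|arm64) echo "aarch64" ;;
        *)             echo "unknown" ;;
    esac
}

# Set ARCH once per login so later settings and scripts can rely on it.
ARCH="$(normalize_arch "$(uname -m)")"
export ARCH
echo "ARCH=${ARCH}"
```

Normalizing the value in one place keeps the rest of the settings files free of per-platform special cases.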
14
Using “uname” to set ARCH (csh)
15
Using “uname” to set ARCH (csh)
16
Schedulers Help Users Run Jobs Across Different Architectures
17
Scheduler will Help Users
Schedulers allow users and services to submit jobs to the infrastructure.
Legacy schedulers are more aware of a mixed architecture.
Using the environment variables set for a specific architecture, users can freely submit jobs knowing the system will find the correct binary for each system.
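One way to picture "the system will find the correct binary" is a job script that resolves the tool path on whichever node the scheduler selects. The sketch below is written under assumptions: TOOLS_ROOT, the per-architecture directory layout, the resolve_tool helper, and the tool name mytool are all hypothetical.

```bash
#!/usr/bin/env bash
# Sketch of a job script body that runs the right build on whatever node
# the scheduler picks. TOOLS_ROOT and the layout below are hypothetical.
TOOLS_ROOT="${TOOLS_ROOT:-/local/cluster}"

# Print the path to a tool's build for an architecture (default: this host).
# Returns non-zero if no executable build exists for that architecture.
resolve_tool() {
    local tool="$1"
    local arch="${2:-$(uname -m)}"
    local candidate="${TOOLS_ROOT}/${arch}/bin/${tool}"
    if [ -x "${candidate}" ]; then
        echo "${candidate}"
    else
        echo "no ${tool} build for ${arch} under ${TOOLS_ROOT}" >&2
        return 1
    fi
}

# In a real job script the next line would launch the tool, for example:
#   "$(resolve_tool mytool)" input.dat
```

A script like this would be submitted with the scheduler's usual command (for Grid Engine, `qsub`), and the user never needs to know which architecture ran it.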
18
Grid Engine / SGE / OGE / Son of GE / Univa
Grid Engine began as CODINE (Computing in Distributed Networked Environments) and GRD (Global Resource Director) at Gridware, which was sold to Sun Microsystems; Sun improved the software, and it became known as Sun Grid Engine (SGE).
It is a grid computing cluster software system (otherwise known as a batch-queuing system).
There have been open source versions and multiple commercial versions of this technology, initially from Sun, later from Oracle and then from Univa Corporation.
19
Interacting with Machines and Running Jobs
20
Benefits of a Heterogeneous Environment
21
GZIP CAPI Card
Working with IBM and OpenPOWER, the CGRB was able to test a new GZIP CAPI card.
Massive increase in speed when compressing and decompressing standard gzip files.
Reduces load on CPU resources, allowing them to be used for real processing.
Jobs that took over 60 hours are now taking less than 1 hour.
Blog Post: A Better Way to Compress Big Data
22
CGRB Uses Development Environment
New Sequence Alignment Tool using GPU Hardware
With CAPI and NVLink built into the OpenPOWER hardware, we wanted to see if we could do real work on the hardware by means of genomic sequence alignment.
CASSA - CUDA Accelerated Scalable Sequence Aligner
CGRB has developed a new GPU based HTS alignment tool using the new IBM and NVIDIA hardware.
CASSA can run the Bowtie2 or BWA seed methods.
CASSA runs everything on the GPU, so it can run on Windows as well as POWER8 and x86 based Linux distributions.
23
CASSA Performance
24
CGRB IBM Collaboration Summary
Using IBM POWER8, the CGRB was able to increase processing throughput 2-10x just by compiling software.
CGRB was able to work with IBM to continue porting software, which was beneficial to both our users and IBM.
CGRB used the new development environment to create a new tool called CASSA to reduce the time spent running HTS sequence alignment.
CASSA will be provided to the entire research community as soon as possible.
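The slides do not say which compilers or flags were used for these builds. Purely as an assumption-laden sketch, a build wrapper could choose GCC tuning flags per architecture as below; note that GCC on POWER selects the target CPU with -mcpu rather than -march. The arch_cflags helper and the specific flag choices are hypothetical.

```bash
# Pick GCC optimization flags for an architecture (illustrative choices only).
arch_cflags() {
    case "$1" in
        ppc64le|ppc64) echo "-O3 -mcpu=power8 -mtune=power8" ;;  # POWER uses -mcpu
        x86_64)        echo "-O3 -march=native" ;;
        *)             echo "-O2" ;;                              # conservative default
    esac
}

# Export flags for the host doing the build so make/configure can pick them up.
CFLAGS="$(arch_cflags "$(uname -m)")"
export CFLAGS
```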
25
Summary
Computational Science is used in a gamut of research and clinical areas.
Many times, new technologies and hardware change the way computational science can be done.
New technologies can generate orders of magnitude greater precision, change the scope of work or reduce bias.
New technologies many times require changes in tools and hardware.
Running a heterogeneous infrastructure will provide your group with the greatest flexibility to take on the future of computing.
Using different architectures in the same infrastructure is easy...
26
Acknowledgements
>CGRB Staff: Ryan Kitchen, Shawn O’neil, Ian Munoz, Brent Kronmiller, Matthew Peterson
>Nimbix Cloud Services: Leo Reiter, Tom McNeill
>IBM: Charles J. Foretich, Keith Brown, Stan Gowen, Terry Leatherland, Denise Ruffner, Indrajit Poddar, Hal Porter, Linton Ward
>Software Tool: CentOS Linux, Open Source Software (GNU), Son of Grid Engine (SGE), NVIDIA CUDA
>NVIDIA: Jon Saposhnik, Robert Crovella