© 2012 Pittsburgh Supercomputing Center UV@PSC Big Memory = New Science Jim Kasdorf Director of Special Projects HPC User Forum Höchstleistungsrechenzentrum Universität Stuttgart July 9, 2012
DISCLAIMER: Nothing in this presentation represents an official view or position of any part of the U.S. government, of any private corporation, of the Pittsburgh Supercomputing Center, or of Carnegie Mellon University. All errors are my own. I would have prepared a shorter presentation but I did not have the time (apologies to Blaise Pascal).
Outline: Introduction to PSC – Data Supercell* – Anton@PSC* – UV@PSC Technology – UV@PSC Science (*For reference; perfunctory presentation)
The Pittsburgh Supercomputing Center: Established in 1986 as a joint effort of Carnegie Mellon University and the University of Pittsburgh together with the Westinghouse Electric Company. Major funding provided by the U.S. NSF; other support from NIH, DARPA, DOE, and the Commonwealth of Pennsylvania. Partner in NSF XSEDE, the Extreme Science and Engineering Discovery Environment (cyberinfrastructure program).
Pittsburgh Supercomputing Center National Mission: Provide integrated, flexible environment for solving large-scale computational problems. Advance science and technology through collaborative and internal research. Educate researchers about benefits of supercomputing and train them in High Performance Computing (HPC). Support local and national education efforts with innovative outreach programs. Improve competitiveness of U.S. industry through use of computational science.
New Systems Enable New Science: X-MP 1986, Y-MP 1989, CM-2 1990, CM-5 1992, C-90 1992, T3D 1993, T3E 1996, HP SC45: TCS 2001, HP GS1280 2003, XT3 2005, SGI Altix® 4700 2008, Anton 2010, SGI Altix® UV 1000 2010
Current Computational Environment – Two Unique Computers. Anton: D.E. Shaw Research MD machine, the only example outside of DESRES. Blacklight: SGI Altix® UV 1000, the world's largest ccNUMA shared memory.
Robust Support Systems (“The Nest”) Data Communications: LAN, MAN, WAN –Three Rivers Optical Exchange –Fiber-based, DWDM core infrastructure –Over 40 Gbps of WAN connectivity Shared Lustre Storage Long-Term Storage (“Archiver”) Utility systems: login nodes, etc.
PSC Data Supercell* A disk-based mass-store system designed to satisfy the needs of data-intensive applications: –Safely storing very large amounts of data –Superior read and write performance –Enhanced functionality & flexibility –More cost effective than a tape system –Support for collaborative work An outgrowth of –Previous generations of PSC-wide file systems –Initial project funding by the National Archives * ©, Patent Pending
Data Storage and Access Changing demands driven by growth of data-intensive applications: –Capacity: growth of system sizes and dataset numbers & sizes –Performance: both higher aggregate bandwidth and lower latency –Accessibility: local and remote access –Heterogeneity: incorporate diverse systems Changing technologies: –Large market for commodity disk drives → rapid development and lower cost (compared to tape) –Powerful, open-source disk file-system software (cf. expensive, proprietary hierarchical storage systems)
Goals Performance improvements to support new, data-intensive research modes –Lower latency (150 s for tape → 15 ms for disk) –High bandwidth at low cost (DSC is 24x the DMF/tape system) –Capacity: scalable and readily expandable Functionality –Expandability –Reliability: redundancy designed in –Data Security: superior to a single tape copy –User Interfaces: transparent to users (cf. DMF/FAR) –Support for collaborative data projects –Manageability Cost –No more expensive than current tape-based systems –Maintenance: simple self-maintenance (no costly maintenance contracts) –Power: allow increase as long as small compared to other costs –Space: better than the old silos, no worse than the best, newer robotics
Building Blocks Hardware: commodity components –SATA disks (non-Enterprise, 3 TB, 5400 rpm) –SuperMicro disk shelves –SAS interconnect fabric –Multiple commodity servers –10GigE interconnect fabric (IB when drivers available and tested) –Redundant UPS support for all components –Support for remote operation (including power cycling) Software –FreeBSD OS –Open-source ZFS –PSC-developed SLASH2 software –PSC-developed monitoring and management support
Status / Plans Status –In production –Data copied out of DMF/tape system –Disk failure rate (including infant mortality) < vendor estimate Plans –System hardening –Adding system interfaces –Automating operations –Performance optimization –Disk scrubbing (ZFS) –Block-level deduplication (ZFS) –Reduced power consumption (disks to standby) –Write-read-verify –…
Anton
Anton Massively parallel supercomputer designed and built by D.E. Shaw Research (DESRES); a special-purpose system for Molecular Dynamics (MD) simulations of proteins and other biological macromolecules.
Anton Runs MD simulations fully in hardware. Compared to the previous state of the art, Anton provides a speedup of ~100-fold, rendering millisecond timescales attainable. Uses custom-designed ASICs and novel simulation algorithms.
Anton@PSC NIH-funded National Resource for Biomedical Supercomputing (NRBSC) at PSC. DESRES made an Anton system available without cost for non-commercial research use by not-for-profit institutions. Allocation by a review committee convened by the U.S. National Research Council. The Anton at NRBSC is the only one available outside of DESRES.
UV@PSC: Blacklight The World’s Largest Hardware-Coherent Shared Memory Computer for Data Analytics and Data-Intensive Simulation
Blacklight
Why Shared Memory? Enable memory-intensive computation. Enable data exploration: statistics, machine learning, visualization, graph analytics, … Increase user productivity: algorithm expression, interactivity, rapid prototyping, ISV apps, high-productivity languages, … Change the way we look at data: boost scientific output, broaden participation. Production use and also an experiment.
PSC’s Blacklight: SGI Altix® UV 1000 Two × 16 TB of cache-coherent shared memory –Hardware coherency unit: 1 cache line (64 B) –16 TB exploits the processor’s full 44-bit physical address space –Ideal for fine-grained shared-memory applications, e.g. graph algorithms, sparse matrices 32 TB addressable with PGAS languages (e.g. SGI UPC) –Low latency, high injection rate supports one-sided messaging –Also ideal for fine-grained shared-memory applications NUMAlink® 5 interconnect –Fat-tree topology spanning the full UV system; low latency, high bisection bandwidth –Hardware acceleration for PGAS, MPI, gather/scatter, remote atomic memory operations, etc. Intel Nehalem-EX processors: 4096 cores (2048 cores per SSI) –Eight cores per socket, two hardware threads per core, four flops/clock, 2.27 GHz, 24 MB L3, Turbo Boost, QPI –Four memory channels per socket → strong memory bandwidth –x86 instruction set with SSE 4.2 → excellent portability and ease of use SUSE Linux operating system –Supports OpenMP, p-threads, MPI, PGAS models → high programmer productivity –Supports a huge number of ISV applications → high end-user productivity
Programming Models & Languages UV supports an extremely broad range of programming models and languages for science, engineering, and computer science –Parallelism: Coherent shared memory: OpenMP, POSIX threads (“p-threads”), OpenMPI, qthreads; Distributed shared memory: UPC, Co-Array Fortran*; Distributed memory: MPI, Charm++; Full Linux OS can support arbitrary domain-specific languages –Languages: C, C++, Java, UPC, Fortran, Co-Array Fortran*, R, R-MPI, Python, Perl, … → Rapidly express algorithms that defy distributed-memory implementation. → To existing codes, offer 16 TB of hardware-enabled shared memory and high concurrency. * pending F2008-compliant compilers
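The appeal of the coherent shared-memory model above is that every worker sees one address space, so parallel updates to a single structure need only synchronization, not message passing. A minimal illustrative sketch (not PSC or SGI code) using Python threads, which share memory much as OpenMP threads do, on a hypothetical histogram task:

```python
# Illustrative sketch of shared-memory parallelism: several threads fill one
# shared histogram. Names and the task itself are hypothetical examples.
import threading

def parallel_histogram(values, n_bins, n_threads=4):
    """Bin values in [0, 1) into n_bins using threads sharing one array."""
    hist = [0] * n_bins          # shared state, visible to every thread
    lock = threading.Lock()      # serializes the merge step
    chunk = (len(values) + n_threads - 1) // n_threads

    def worker(lo, hi):
        local = [0] * n_bins     # accumulate privately, merge once
        for v in values[lo:hi]:
            local[min(int(v * n_bins), n_bins - 1)] += 1
        with lock:               # one-shot update of the shared structure
            for i, c in enumerate(local):
                hist[i] += c

    threads = [threading.Thread(target=worker,
                                args=(i * chunk, min((i + 1) * chunk, len(values))))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return hist
```

The private-accumulate-then-merge pattern is the same one an OpenMP `reduction` clause would express on Blacklight; the point is that no data ever has to be partitioned across node memories.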
Use Cases Interactive analysis of large datasets –Foster totally new ways of exploring large datasets. Interactive queries and deeper analyses limited only by the community’s imagination. –Example: Fit the whole ClueWeb09 corpus into RAM; exploring graph database technology (Neo4j) to aid in inferencing. Familiar, high-productivity programming languages –MATLAB, R, Java, etc. –Leverage tools that scientists, engineers, and computer scientists already know and use. Lower the barrier to using HPC. ISV applications –ADINA, MolPro, MATLAB, Gaussian, VASP,... –Access vast memory from even a modest number of cores; extend users’ workflows from their desktop to HPC, allowing higher fidelity while applying the same validated approaches. Again, lower the barrier to using HPC.
Use Cases Algorithm expression –Rapid development of algorithms for large-scale data analysis. –Rapid prototyping and “one-off” analyses. –Implement algorithms and analyses, e.g. graph-theoretical, for which distributed-memory implementations have been elusive or impractical. –Example: Hy Trac (CMU; with Renyue Cen, Princeton) began several 4.5 TB, 1024-core N-body runs of 30 billion particles to study the evolution of dark matter halos during the epoch when the first stars, galaxies, and quasars formed. The abundance and growth of the halos will be compared with the abundance and star formation rates of early galaxies to understand the connection between mass and light in the early universe. He uses OpenMP (shared memory) for agile development of new algorithms and is planning runs with 70 billion particles.
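As a toy illustration of the computation such N-body runs perform (this is not Trac's code, and it is serial rather than OpenMP; every name here is a hypothetical example), a direct O(n²) gravitational update looks like:

```python
# Minimal direct-summation N-body step: each particle feels the gravity of
# every other particle. Illustrative only; real cosmological codes use tree
# or particle-mesh methods and parallel loops over particles.
import math

def nbody_step(pos, vel, mass, dt, G=1.0):
    """Advance positions/velocities (lists of [x, y, z]) by one Euler step."""
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):                     # the loop OpenMP would parallelize
        for j in range(n):
            if i == j:
                continue
            dx = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = sum(d * d for d in dx) + 1e-12   # softening avoids r = 0
            inv_r3 = G * mass[j] / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += inv_r3 * dx[k]
    new_vel = [[vel[i][k] + dt * acc[i][k] for k in range(3)] for i in range(n)]
    new_pos = [[pos[i][k] + dt * new_vel[i][k] for k in range(3)] for i in range(n)]
    return new_pos, new_vel
```

The outer particle loop is independent across `i`, which is why shared memory plus OpenMP makes this style of algorithm quick to prototype: every thread can read the whole particle array without any data decomposition.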
Big Wins: Genome Assembly “Next-Generation Sequencers” –Fast sequencing runs –Significantly decreased cost –Produce “short reads” <200 bp The Challenge: assembling millions to billions of reads
Little Skate An NIH “model organism” to fill crucial gaps in human biomedical knowledge; one of eleven non-mammals. No reference genome: de novo assembly. 3.4 billion bases; billions of 100-base reads. ABySS software builds the relationship graph in memory. Unsuccessful for nearly a year on a distributed-memory system; draft genome in weeks on Blacklight.
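The in-memory “relationship graph” that assemblers like ABySS build is a de Bruijn graph of k-mers: overlapping substrings of the reads become nodes and edges, and contigs are paths through the graph. A minimal sketch of the idea (illustrative only, not the ABySS implementation, which distributes and compresses this structure):

```python
# Toy de Bruijn graph assembly: nodes are (k-1)-mers, edges are k-mers.
# Real assemblers handle errors, reverse complements, and billions of reads.
from collections import defaultdict

def debruijn_graph(reads, k):
    """Map each (k-1)-mer prefix to the set of (k-1)-mer suffixes following it."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph

def walk_contig(graph, start):
    """Greedily extend start while each node has exactly one successor."""
    contig, node = start, start
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph.get(node)
        contig += nxt[-1]        # each step adds one resolved base
        node = nxt
        if len(contig) > 10_000:  # guard against cycles in this toy walk
            break
    return contig
```

At genome scale this graph holds billions of k-mers, which is why the whole structure fitting in one shared memory, as on Blacklight, succeeds where partitioning it across a distributed-memory cluster failed.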
Colonial Tuco-Tuco
Genetic Underpinnings of Complex Phenotypes in Non-Model Mammals Effects of the transition from solitary to social living in the colonial tuco-tuco, endangered from reduced diversity. Illumina® sequencer: billions of bp/day; assembly requires ~700 GB. Unsuccessful on a distributed-memory system.
Identification of Lignocellulosic Enzymes through Sequencing and Assembly of a Metagenome Soil Enrichment Using 3.5 TB of memory, Mostafa Elshahed, Rolf Prade, and Brian Couger (Oklahoma State Univ.) achieved the largest metagenomic assembly to date. –1.5 billion paired reads of 100 basepairs each, approximately 300 gigabases in total –VelvetH / VelvetG software Their study used a soil sample from a sugarcane field in Brazil, enriched in a bioreactor, to identify previously unknown enzymes that may be effective in breaking down non-feedstock lignocellulosic plants such as switchgrass and wheat straw. Such enzymes could produce biofuel with a much higher ratio of energy per quantity of input crop. Metagenomics allows the study of microbial communities like those present in this stream receiving acid drainage from surface coal mining.
Genetic Basis for Congenital Heart Disease Screening >100,000 mouse fetuses for heart defects. Sequence the defective mouse genome and compare to a healthy mouse. Life Technologies SOLiD™ sequencer: 700M reads. CLC Bio software on Blacklight: whole-mouse assembly in 8 hours.
Many More Assemblies Cold Spring Harbor Laboratory –Four wheat species: ALLPATHS-LG –Triticum aestivum: 16 Gbp; 5 TB RAM; largest assembly ever De novo transcriptome assemblies –USDA Forest Service: Douglas fir –University of Virginia: Trinity / ALLPATHS-LG –University of Miami: functional and comparative genomics
Big Wins: Chemistry and Materials Science Re-parameterize MD force fields for RNA –MP2/aug-cc-pVTZ calculations on mononucleotides with ~30 atoms –“The same calculations on Kraken take an order of magnitude longer, and they irritate the sysadmins because they swamp the disk resources.” Molpro Hartree-Fock and closed-shell CCSD(T) simulations to calculate the energy of a trimer –Use a large amount of memory because of the large basis set
Big Wins: Visualization MassiveBlack: Largest Cosmological Simulation of Its Kind – Black Hole and Galaxy Formation. Kraken simulation: 100K cores, 65.5 billion particles. Moved 4 TB to Blacklight to hold a complete snapshot in memory and color-map properties interactively.
Role of Electron Physics in Development of Turbulent Magnetic Reconnection in Collisionless Plasmas Homa Karimabadi (University of California, San Diego) et al. have characterized, with much greater realism than was previously possible, how turbulence within sheets of electrons generates helical magnetic structures called “flux ropes” — which physicists believe play a large role in magnetic reconnection. Used Blacklight (working with PSC scientist Joel Welling) to visualize the simulations, one run of which can generate over 200 terabytes. Blacklight’s shared-memory architecture was critical, says Karimabadi, for the researchers being able to solve the physics of flux rope formation. Their findings are important for NASA’s upcoming Magnetosphere Multiscale Mission to observe and measure magnetic reconnection. This visualization, produced on Blacklight, shows magnetic-field lines (intensity coded by color, blue through red, negative to positive) and associated tornado-like streamlines (white) of a large flux rope formed due to tearing instability in thin electron layers.
Many New Areas Economics Natural Language Processing Mining Large Graphs Malware Triage and Software Security Machine Learning Behavioral and Neural Sciences –Understanding Neural Pathways Computer and Computation Research
Summary Blacklight’s hardware-enabled cache-coherent shared memory is enabling new data-intensive and memory-intensive analytics and simulations. In particular, Blacklight is: –Enabling new kinds of analyses on large data –Bringing new communities into HPC –Increasing the productivity of both “traditional HPC” and new users PSC is actively working with the research community to bring these powerful analysis capabilities to diverse fields of research. Demand for Blacklight is very high: 22M hours requested at the June XRAC meeting for only 7M hours available. As an experiment in architecture, Blacklight is a clear success.
Acknowledgements Markus Dittrich, NRBSC Group Leader Michael Levine, Scientific Director Nick Nystrom, Director of Strategic Applications J. R. Scott, Director of Systems and Operations