SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO
Data-Intensive Solutions at SDSC
Michael L. Norman, Interim Director, SDSC; Distinguished Professor of Physics
26th IEEE Symposium on Massive Storage Systems and Technologies (MSST2010), May 3-5, 2010

Michael L. Norman, Principal Investigator; Interim Director, SDSC
Allan Snavely, Co-Principal Investigator; Project Scientist

What is Gordon?
- A "data-intensive" supercomputer based on SSD flash memory and virtual shared-memory software
- Emphasizes memory capacity and IOPS over FLOPS
- A system designed to accelerate access to the massive databases being generated in all fields of science, engineering, medicine, and social science
- The NSF's most recent Track 2 award to the San Diego Supercomputer Center (SDSC)
- Coming summer 2011

Why Gordon?
- Growth of digital data is exponential: a "data tsunami"
- Driven by advances in digital detectors, networking, and storage technologies
- Making sense of it all is the new imperative:
  - data analysis workflows
  - data mining
  - visual analytics
  - multiple-database queries
  - data-driven applications

The Memory Hierarchy of a Typical HPC Cluster
[Diagram: shared memory programming and message passing programming regions, separated from disk I/O by a latency gap]

The Memory Hierarchy of Gordon
[Diagram: shared memory programming and disk I/O, with flash closing the latency gap shown on the previous slide]

Gordon Architecture: "Supernode"
- 32 Appro Extreme-X compute nodes
  - Dual-processor Intel Sandy Bridge
  - 240 GFLOPS and 64 GB RAM per node
- 2 Appro Extreme-X I/O nodes
  - Intel SSD drives, 4 TB each
  - 560,000 IOPS
- ScaleMP vSMP virtual shared memory
  - 2 TB RAM aggregate
  - 8 TB SSD aggregate
[Diagram: 240 GF / 64 GB RAM compute nodes and a 4 TB SSD I/O node joined by vSMP memory virtualization]
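The per-supernode aggregates quoted above follow directly from the per-node numbers. A minimal arithmetic sketch, using only figures from this slide:

```python
# Supernode aggregates derived from the per-node figures quoted on this slide.
COMPUTE_NODES = 32            # Appro Extreme-X compute nodes per supernode
IO_NODES = 2                  # Appro Extreme-X I/O nodes per supernode
RAM_PER_COMPUTE_GB = 64       # DRAM per compute node
GFLOPS_PER_COMPUTE = 240      # peak GFLOPS per dual-socket compute node
SSD_PER_IO_TB = 4             # flash per I/O node

ram_total_tb = COMPUTE_NODES * RAM_PER_COMPUTE_GB / 1024    # -> 2 TB RAM aggregate
ssd_total_tb = IO_NODES * SSD_PER_IO_TB                     # -> 8 TB SSD aggregate
flops_total_tf = COMPUTE_NODES * GFLOPS_PER_COMPUTE / 1000  # -> ~7.7 TFLOPS per supernode

print(f"RAM {ram_total_tb:.0f} TB, SSD {ssd_total_tb} TB, peak {flops_total_tf:.2f} TFLOPS")
```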

Gordon Architecture: Full Machine
- 32 supernodes = 1,024 compute nodes
- Dual-rail QDR InfiniBand network
- 3D torus (4x4x4)
- 4 PB rotating-disk parallel file system with >100 GB/s bandwidth
[Diagram: supernodes and disk arranged on the torus]
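One property of the 4x4x4 torus worth noting: because every dimension wraps around, no two positions are more than two hops apart in any dimension, so the diameter is six hops. A small illustrative calculation (generic torus arithmetic, not Gordon's routing code):

```python
# Illustrative only: shortest hop count between two positions of a 4x4x4 torus
# (each dimension wraps around). Generic torus arithmetic, not Gordon's actual
# InfiniBand routing.
DIM = 4

def torus_hops(a, b, dim=DIM):
    """Minimum hops between 3D coordinates a and b with wraparound links."""
    return sum(min(abs(x - y), dim - abs(x - y)) for x, y in zip(a, b))

print(torus_hops((0, 0, 0), (2, 2, 2)))  # -> 6, the worst case (torus diameter)
print(torus_hops((0, 0, 0), (3, 3, 3)))  # -> 3, wraparound shortens the path
```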

Gordon Aggregate Capabilities
Speed: 245 TFLOPS
Memory (RAM): 64 TB
Memory (SSD): 256 TB
Memory (RAM+SSD): 320 TB
Ratio (memory/speed): 1.31 bytes/FLOP
I/O rate to SSDs: 35 million IOPS
Network bandwidth: 16 GB/s bidirectional
Network latency: 1 µs
Disk storage: 4 PB
Disk I/O bandwidth: >100 GB/s
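The two derived numbers in this table, the 1.31 bytes/FLOP ratio and the 35 million aggregate IOPS, are consistent with the figures quoted elsewhere in the deck. A quick sanity check, using only values already shown:

```python
# Sanity check of the aggregate table using only numbers quoted in the deck.
TFLOPS = 245
RAM_TB, SSD_TB = 64, 256

# TB per TFLOPS is numerically the same as bytes per FLOP.
print(f"{(RAM_TB + SSD_TB) / TFLOPS:.2f} bytes/FLOP")    # -> ~1.31

SUPERNODES = 32
IO_NODES_PER_SUPERNODE = 2
IOPS_PER_IO_NODE = 560_000                                # from the supernode slide
total_iops = SUPERNODES * IO_NODES_PER_SUPERNODE * IOPS_PER_IO_NODE
print(f"{total_iops / 1e6:.1f} million IOPS")             # -> ~35.8 million, ~35M as quoted
```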

Dash: A Working Prototype of Gordon

Dash Architecture (Nehalem)
- 4 supernodes = 64 physical nodes
- One core can access 1.75 TB of "memory" via vSMP
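The point of vSMP is that the aggregated memory of a supernode looks like one ordinary address space, so shared-memory code can hold datasets far larger than any single node's DRAM without explicit out-of-core logic. A hypothetical sketch of what that looks like to the programmer (the array size and access pattern are invented for illustration; this is not Dash-specific code):

```python
# Hypothetical illustration: under vSMP a process can simply allocate more memory
# than a single physical node holds; the virtualization layer places and migrates
# the pages across the supernode behind the scenes.
import numpy as np

# ~100 GB array: larger than the 64 GB of DRAM in one Gordon compute node, but
# well within the ~1.75 TB a single core can address on a Dash supernode.
N = 12_500_000_000                      # 12.5e9 float64 values ~= 100 GB
data = np.zeros(N, dtype=np.float64)    # ordinary allocation, no out-of-core code

data[::1_000_000] = 1.0                 # scattered touches; remote pages are fetched transparently
print(data.sum())
```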

NIH Biological Networks Pathway Analysis
- Interaction networks or graphs occur in many disciplines, e.g., epidemiology, phylogenetics, and systems biology
- Performance is limited by the latency of a database query and the aggregate amount of fast storage available
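When a pathway traversal issues many small, dependent queries, throughput is roughly the inverse of the per-access latency, which is why moving the database from spinning disk to flash changes performance by orders of magnitude. A rough illustrative model (the latencies are typical device figures, not Dash measurements):

```python
# Rough model: dependent point queries run one after another, so
# queries/second ~= 1 / (storage access latency). Latencies below are
# typical device figures, not measurements from Dash.
latencies_s = {
    "SATA disk (random seek)": 10e-3,    # ~10 ms
    "flash SSD (random read)": 100e-6,   # ~0.1 ms
    "DRAM":                    100e-9,   # ~100 ns
}

for device, latency in latencies_s.items():
    print(f"{device:24s} ~{1 / latency:12,.0f} dependent queries/sec")
```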

Biological Networks Query Timing
[Chart: query timing results]

Palomar Transient Factory (PTF)
- Nightly wide-field surveys using the Palomar Schmidt telescope
- Image data sent to LBL for archive/analysis
- 100 new transients every minute
- Large, random queries across multiple databases for IDs

PTF-DB Transient Search
- Random queries requesting very small chunks of data about the candidate observations
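This access pattern, many independent point lookups that each return a few rows, is exactly the case where access latency rather than bandwidth dominates. A hypothetical sketch of the query shape (table and column names are invented for the example and are not the PTF schema):

```python
# Hypothetical sketch of a PTF-style workload: many independent point queries,
# each returning a tiny slice of a large candidates table. The schema below is
# invented for illustration; it is not the real PTF database.
import sqlite3

conn = sqlite3.connect(":memory:")   # stand-in for the multi-terabyte survey database
cur = conn.cursor()
cur.execute("CREATE TABLE candidates (candidate_id INTEGER PRIMARY KEY, ra REAL, dec REAL, mag REAL)")
cur.executemany(
    "INSERT INTO candidates VALUES (?, ?, ?, ?)",
    [(101, 150.1, 2.2, 19.4), (40532, 210.7, -5.1, 20.1), (998877, 88.3, 33.0, 18.7)],
)

# Each lookup touches only a handful of rows, so its cost is dominated by storage
# access latency rather than bandwidth -- the regime where flash helps most.
for cid in (101, 40532, 998877):
    cur.execute("SELECT ra, dec, mag FROM candidates WHERE candidate_id = ?", (cid,))
    print(cid, cur.fetchone())

conn.close()
```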

Dash Wins the Storage Challenge at SC09*
*beating a team led by Alex Szalay and Gordon Bell

Conclusions
- Gordon's architecture is customized for data-intensive applications but built entirely from commodity parts
- It is basically a Linux cluster with:
  - large RAM memory per core
  - a large amount of flash SSD
  - virtual shared-memory software, giving 10 TB of storage accessible from a single core and shared-memory parallelism for higher performance
- The Dash prototype accelerates real applications by 2-100x relative to disk, depending on memory access patterns; random I/O is accelerated the most
- The Dash prototype beats all commercial offerings in MB/s/$, IOPS/$, IOPS/GB, and IOPS/Watt

Cost:Performance Comparison (Feb 2010)
- Systems compared: HDD (SATA), DASH SSD IO, DASH Super, Fusion-io SLC, Sun F5100
- Metrics: GB, $/GB, MB/s/$, IOPS/$, IOPS/GB, IOPS cost ($)
- Cost of DASH includes storage, network, and processors
[Table values not preserved in the transcript]

Power Metrics
[Chart: power metrics]