Trilinos I/O Support (TRIOS)
Proposed I/O Software for FY11
Ron Oldfield and Gregory Sjaardema, Sandia National Laboratories
November 2010
Approved for General Release: SAND P

Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL.

Trios Overview and Goals

Parallel I/O support for required I/O libraries
– Exodus, Nemesis, IOSS
– TPLs: netCDF, HDF5
– POC: Greg Sjaardema

Co-design vehicle for I/O research
– Provides a “brand name” and common distribution point for rapid deployment and testing
– Planned software products for Trios:
  – Scalable I/O libraries for sparse and dense matrices (from the NGC project)
  – In-transit data services (from CSSE/SIO research)
    – Network Scalable Service Interface (Nessie)
    – netCDF caching/staging service
    – Fragment detection and tracking for CTH
  – HPC database services (from CSRF: HPC System Support for Informatics)
    – SQL services
    – Rapid ingest capabilities
– POC: Ron Oldfield

November Trios Software Review

Impact of Efficient I/O Libraries
Scaling Challenges for Trilinos-Based Document Clustering

Strong scaling exposes weaknesses
– The original methods for loading data were not designed for production use.

Improvements to enable large scale
– Sparse-matrix reads: keep track of the mapping; parallel I/O
– Dense-matrix reads: convert to a binary format; parallel I/O; data ordering
– Memory-efficient algorithms: a multi-pass dense-matrix multiply for cosine similarity allows calculation over the full dataset. The previous version could not cluster 400K documents (one matrix had to be resident in memory on each process).

(Chart: early performance results, Bible dataset, NSSI only.)
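The multi-pass dense-matrix multiply mentioned above can be sketched in miniature. This is an illustrative Python sketch of the blocking idea, not the Trilinos implementation: the pairwise cosine-similarity matrix is built by visiting row blocks in multiple passes, so only small blocks need to be resident at once; all names and the block size are hypothetical.

```python
import math

def cosine_similarity_blocked(rows, block_size=2):
    """Build the full pairwise cosine-similarity matrix by sweeping the
    rows in block-sized passes instead of forming one giant product.
    In a real out-of-core version each block would be read from disk."""
    n = len(rows)
    norms = [math.sqrt(sum(x * x for x in r)) for r in rows]
    sim = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, block_size):        # pass over row blocks
        for j0 in range(0, n, block_size):    # pass over column blocks
            for i in range(i0, min(i0 + block_size, n)):
                for j in range(j0, min(j0 + block_size, n)):
                    dot = sum(a * b for a, b in zip(rows[i], rows[j]))
                    sim[i][j] = dot / (norms[i] * norms[j])
    return sim

sim = cosine_similarity_blocked([[1, 0], [0, 1], [1, 1]])
```

Each (i0, j0) block is independent, which is what makes the multi-pass form amenable to parallel I/O and to limiting per-process memory.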

Multilingual Document Clustering
Performance on JaguarPF

With the I/O improvements, the application is compute bound even at large scale.

In-Transit Data Services
Data Processing Between Simulation and Storage

Motivation
– Improve “effective” I/O performance by injecting data services between the application and storage
– Leverage available compute/service nodes

Network Scalable Service Interface (Nessie)
– Developed for the Lightweight File System project
– Framework for HPC client/server development
– Designed for scalable data movement
– Asynchronous RPC-like API

Examples
– Preprocessing for seismic imaging (ca. 1995)
– netCDF staging service
– CTH particle detection/tracking
– SQL proxy for HPC/database integration
– Sparse-matrix visualization, real-time network analysis

(Diagram: a client application on compute nodes sends raw data to an I/O service on compute/service nodes, which caches, aggregates, and processes it before it reaches the Lustre file system and a visualization client.)
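The asynchronous RPC-like pattern described above can be sketched as follows. This is a hypothetical Python illustration of the overlap idea, not the actual Nessie API: the client posts a request to a service, continues its own computation, and blocks only when the result is needed.

```python
import threading

class AsyncRequest:
    """Post a request to a service function and let the caller overlap
    computation with the service's work (names are illustrative)."""
    def __init__(self, service_fn, payload):
        self._result = None
        self._done = threading.Event()
        def run():
            self._result = service_fn(payload)   # runs "on the service"
            self._done.set()
        threading.Thread(target=run, daemon=True).start()

    def wait(self):
        """Synchronize: block until the service has produced the result."""
        self._done.wait()
        return self._result

def staging_service(data):
    # Stand-in for a remote data service: aggregate and return.
    return sum(data)

req = AsyncRequest(staging_service, [1, 2, 3, 4])  # post the request
local = 10 * 10                                    # overlapped computation
total = req.wait() + local                         # wait only when needed
```

The point of the asynchronous API is exactly this separation of "post" and "wait": the time the service spends moving or processing data is hidden behind the client's own computation.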

In-Transit Data Services
NetCDF Staging Service

Motivation
– Synchronous I/O libraries require the application to wait until data is on the storage device
– Most application I/O is “bursty”
– There is not enough cache on compute nodes for typical asynchronous I/O approaches
– NetCDF is the basis of important I/O libraries at Sandia (no application code changes required)

NetCDF caching service
– The service aggregates/caches data and pushes it to storage
– Asynchronous I/O allows overlap of I/O and computation

(Diagram: a client application on compute nodes sends netCDF requests to a netCDF service on compute nodes, which caches/aggregates them and writes the processed data to the Lustre file system.)
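The aggregation behavior of the staging service can be sketched with a toy write-behind buffer. This is an illustrative Python model, with hypothetical names, of the caching idea only: small bursty writes return immediately into a staging buffer, and storage sees one aggregated operation.

```python
class StagingBuffer:
    """Absorb small writes and push them to storage in one aggregated
    operation (a list stands in for the file system here)."""
    def __init__(self, storage):
        self.storage = storage
        self.pending = []

    def write(self, record):
        # Fast path: the caller returns without touching storage.
        self.pending.append(record)

    def flush(self):
        # One aggregated transfer to storage for the whole burst.
        if self.pending:
            self.storage.append(list(self.pending))
            self.pending.clear()

storage = []
buf = StagingBuffer(storage)
for step in range(3):          # bursty per-timestep output
    buf.write({"step": step})
buf.flush()
```

Three application-side writes produce a single storage-side operation; in the real service the flush would additionally happen asynchronously, overlapping the next computation phase.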

In-Transit Data Services
CTH Fragment Detection

Motivation
– Fragment detection requires data from every time step (I/O intensive)
– The detection process takes 30% of the time-step calculation (scaling issues)
– Integrating detection software with CTH is intrusive for the developer

CTH fragment detection service
– Extra compute nodes provide in-line processing (overlap fragment detection with the time-step calculation)
– Only fragments are output to storage (reduces I/O)
– Non-intrusive: looks like normal I/O (spyplot interface); can be configured out of band

Status
– Developing client/server stubs
– Developing a ParaView proxy service

(Diagram: CTH on compute nodes sends raw data through the spyplot interface to a fragment-detection service on compute nodes, which detects fragments and writes only fragment data to the Lustre file system and a visualization client.)

The fragment detection service provides on-the-fly data analysis with no modifications to CTH.
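The data reduction the service performs can be illustrated with a toy detector. This is not the CTH algorithm — just a hedged one-dimensional sketch in which “fragments” are runs of cells above a density threshold, showing how the service shrinks raw per-timestep output down to fragment summaries before anything hits storage.

```python
def detect_fragments(cells, threshold=0.5):
    """Return (start, end) index pairs for runs of cells whose value
    exceeds the threshold; only these summaries would go to storage."""
    fragments, current = [], []
    for i, v in enumerate(cells):
        if v > threshold:
            current.append(i)
        elif current:
            fragments.append((current[0], current[-1]))
            current = []
    if current:
        fragments.append((current[0], current[-1]))
    return fragments

raw = [0.0, 0.9, 0.8, 0.1, 0.7, 0.0]   # raw field for one time step
frags = detect_fragments(raw)           # only this reduced data is written
```

Six raw values reduce to two small tuples; at scale this is the difference between writing full time-step dumps and writing only fragment data.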

Database Service
Provides Fast Access to a Remote Database from an HPC Application

Database service features
– Provides a “bridge” between parallel applications and an external data warehouse appliance (DWA)
– Runs on Cray XT/XMT network nodes
– Applications communicate with the database service using Nessie (over Portals or InfiniBand)
– Service-level access to Netezza through standard interfaces (e.g., ODBC, nzload)

(Diagram: a client application on XT/XMT compute nodes communicates over Portals with a database service on a service node, which uses ODBC/nzload to reach the Netezza appliance and its storage arrays. © Netezza Corporation)

Motivating Application: NISAC/N-ABLE
Modeling Economic Security

Model the economic impact of disruptions in infrastructure
– Changes in U.S. border-security technologies
– Terrorist acts on commodity futures markets
– Transportation disruptions on regional agriculture and food supply chains
– Optimized military supply chains
– Electric power and rail transportation disruptions on chemical supply chains

Compute and data challenges
– Models the economy to the level of the individual firm
– Models transactions from tens of millions of companies
– Simulation data is ingested into a database for analysis
– Database ingest is the bottleneck (10x the time needed to simulate the data)
– Time to solution is critical: answers are wanted in hours

Performance-Based Motivation for the Database Service

– ODBC is the only available interface for remote access.
– It is the interface and protocols, not the network, that are the bottleneck.
– A service can use nzload from the Netezza host.

(Chart: database ingest performance, bytes written vs. time in seconds.)

Summary

Trios is a natural vehicle for I/O co-design.

Rapid deployment of efficient, production-quality I/O libraries
– Exodus, Nemesis, IOSS (from Seacas)
– Sparse/dense matrix I/O (from NGC)

Research vehicle for in-transit data services
– The goal is to make efficient use of the platform to reduce the burden on the file system
– Already demonstrated value for seismic imaging (Salvo)
– Enables exploration of new functionality for Trilinos codes on HPC systems:
  – netCDF staging to manage bursts of I/O
  – In-transit fragment detection/tracking to reduce storage-system requirements
  – Integration with data warehouse appliances
  – Interactive observation/control of HPC applications

Trios Software Products
Planned Timeline for Integration and Release

Seacas
– Exodus, Nemesis, IOSS (in Trilinos release 10.6)
– Plans to move out of the Trios subdirectory into a new package named “Seacas”

Sparse/dense matrix I/O libraries
– Installed and tested: November 30, 2011

Research software
– Network Scalable Service Interface (Nessie)
  – Installed and tested: January 1, 2011
  – Open question: how to test services using the Trilinos testing framework
– netCDF staging service
  – Installed with basic tests: February 1, 2011
  – Link with Exodus and evaluate performance: March 2011
– CTH tracking service
  – In development; a working demo is needed for the ASC Level III milestone: Summer 2011
  – Type of release depends on copyright resolution; currently for internal release only