Trilinos I/O Support (TRIOS)


1 Trilinos I/O Support (TRIOS)
Proposed I/O Software for FY11
Approved for General Release, SAND P
November 2010
Ron Oldfield and Gregory Sjaardema
Sandia National Laboratories

2 Trios Overview and Goals
Parallel I/O Support for required I/O libraries
- Exodus, Nemesis, IOSS
- TPLs: netCDF, HDF5
- POC: Greg Sjaardema
Co-Design Vehicle for I/O Research
- Provides “brand name” and common distribution point for rapid deployment and testing
- Planned Software Products for Trios
  - Scalable I/O Libraries for sparse and dense matrices (from NGC project)
  - In-Transit Data Services (from CSSE/SIO Research)
    - Network Scalable Service Interface (Nessie)
    - netCDF caching/staging service
    - Fragment detection and tracking for CTH
  - HPC Database services (from CSRF: HPC System Support for Informatics)
    - SQL Services
    - Rapid Ingest Capabilities
- POC: Ron Oldfield
Trios Software Review November 2010

3 Impact of Efficient I/O Libraries Scaling Challenges for Trilinos-Based Document Clustering
Strong scaling exposes weaknesses
- Original methods for loading were not designed for production use.
Improvements to enable large scale
- Sparse Matrix Reads
  - Keep track of mapping
  - Parallel I/O
- Dense Matrix Reads
  - Convert to binary format
  - Data ordering
- Memory efficient algorithms
  - Multi-pass dense-matrix multiply for cosine-similarity allows calculation of full dataset
  - Previous version could not cluster 400K docs (one matrix had to be resident in memory on each process)
Figure: Early Performance Results, Bible Dataset (NSSI only)
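
The multi-pass idea on this slide can be illustrated with a minimal sketch. The slide does not describe the actual algorithm, so the following is only an assumption of how a block-wise cosine-similarity computation might look: the matrix is processed in row blocks so that only two blocks are loaded at a time. The names `load_block`, `blocks`, and `cosine_similarity_multipass` are hypothetical, and the result matrix is kept fully in memory here purely for simplicity.

```python
import math

def normalize_rows(rows):
    """Scale each row to unit length; all-zero rows are left unchanged."""
    out = []
    for row in rows:
        norm = math.sqrt(sum(x * x for x in row))
        out.append([x / norm for x in row] if norm > 0 else list(row))
    return out

def blocks(n_rows, block_size):
    """Yield (start, stop) index pairs that cover range(n_rows)."""
    for start in range(0, n_rows, block_size):
        yield start, min(start + block_size, n_rows)

def cosine_similarity_multipass(load_block, n_rows, block_size):
    """Compute the n_rows x n_rows cosine-similarity matrix block by block.

    load_block(start, stop) is a hypothetical callback returning rows
    [start, stop) of the document matrix, e.g. read from a binary file.
    Only two input blocks are resident at any time.
    """
    sim = [[0.0] * n_rows for _ in range(n_rows)]
    for a0, a1 in blocks(n_rows, block_size):
        a = normalize_rows(load_block(a0, a1))
        for b0, b1 in blocks(n_rows, block_size):
            b = normalize_rows(load_block(b0, b1))  # one pass over the data per outer block
            for i, ra in enumerate(a):
                for j, rb in enumerate(b):
                    sim[a0 + i][b0 + j] = sum(x * y for x, y in zip(ra, rb))
    return sim

# Toy data standing in for a document-term matrix.
docs = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
sim = cosine_similarity_multipass(lambda s, e: docs[s:e], len(docs), block_size=2)
```

The trade-off in this sketch is extra reads (each block is loaded once per outer pass) in exchange for a bounded memory footprint per process.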

4 Multilingual Document Clustering Performance on JaguarPF
With I/O improvements, the application is compute bound even at large scale.

5 In-Transit Data Services Data Processing Between Simulation and Storage
Motivation
- Improve “effective” I/O performance by injecting data services between app and storage
- Leverage available compute/service nodes
Network Scalable Service Interface (Nessie)
- Developed for the Lightweight FS Project
- Framework for HPC client/server development
- Designed for scalable data movement
- Asynchronous RPC-like API
Examples
- Preprocessing for seismic imaging (ca. 1995)
- netCDF staging service
- CTH particle detection/tracking
- SQL Proxy for HPC/Database Integration
- Sparse-matrix viz, real-time network analysis
Diagram labels: Client Application (compute nodes); Raw Data; I/O Service (compute/service nodes), Cache/aggregate/process; Processed Data; Lustre File System; Visualization Client
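
To make the "asynchronous RPC-like API" concrete, here is a minimal sketch of the general pattern: the client submits a request, gets a handle back immediately, keeps computing, and blocks only when it needs the result. This is not the Nessie API. `ToyService`, `call_async`, and `aggregate` are hypothetical names, and a local thread pool stands in for a remote service.

```python
from concurrent.futures import ThreadPoolExecutor

class ToyService:
    """Hypothetical in-process stand-in for a remote data service."""

    def __init__(self, workers=2):
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def call_async(self, func, *args):
        """Submit a request and return at once with a handle (a Future)."""
        return self._pool.submit(func, *args)

    def shutdown(self):
        """Wait for outstanding requests and release the worker threads."""
        self._pool.shutdown(wait=True)

def aggregate(chunks):
    """Example service-side operation: reduce raw data before it is stored."""
    return sum(chunks)

service = ToyService()
request = service.call_async(aggregate, [1, 2, 3, 4])  # does not block
local_work = sum(i * i for i in range(10))              # client keeps computing
result = request.result()                               # block only when needed
service.shutdown()
```

The point of the pattern is that the wait is deferred: the time spent in `local_work` overlaps with the service handling the request.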

6 In-Transit Data Services NetCDF Staging Service
Motivation
- Synchronous I/O libraries require app to wait until data is on storage device
- Most of application I/O is “bursty”
- Not enough cache on compute nodes for typical async I/O approaches
- NetCDF is basis of important I/O libs at Sandia (no code changes required)
NetCDF Caching Service
- Service aggregates/caches data and pushes data to storage
- Async I/O allows overlap of I/O and computation
Diagram labels: Client Application (compute nodes); NetCDF requests; NetCDF Service, Cache/aggregate; Processed Data; Lustre File System
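
The caching behavior described on this slide can be sketched as a write-behind buffer: the client hands data to the service and returns immediately, while a background worker aggregates small writes into larger ones before pushing them to storage. This is an illustrative sketch only, not the netCDF staging service itself. `StagingService`, `flush_bytes`, and the list used as "storage" are assumptions made for the example.

```python
import queue
import threading

class StagingService:
    """Hypothetical write-behind cache: buffers writes and flushes them in bulk."""

    def __init__(self, storage, flush_bytes=8):
        self._storage = storage          # a list standing in for the file system
        self._flush_bytes = flush_bytes  # aggregate at least this much per flush
        self._requests = queue.Queue()
        self._worker = threading.Thread(target=self._run)
        self._worker.start()

    def write(self, data):
        """Hand data to the service and return without waiting for storage."""
        self._requests.put(data)

    def close(self):
        """Signal end of input and wait until everything is on storage."""
        self._requests.put(None)
        self._worker.join()

    def _run(self):
        buffer = bytearray()
        while True:
            item = self._requests.get()
            if item is None:              # sentinel: no more writes
                break
            buffer += item
            if len(buffer) >= self._flush_bytes:
                self._storage.append(bytes(buffer))  # one large write
                buffer.clear()
        if buffer:
            self._storage.append(bytes(buffer))      # flush the remainder

storage = []
staging = StagingService(storage, flush_bytes=8)
for chunk in (b"abc", b"defg", b"hi", b"jkl"):
    staging.write(chunk)   # bursty small writes return immediately
staging.close()
```

In this toy run, four small writes reach "storage" as two larger ones, which is the aggregation effect the slide attributes to the service.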

7 In-Transit Data Services CTH Fragment Detection
Motivation
- Fragment detection requires data from every time step (I/O intensive)
- Detection process takes 30% of time-step calculation (scaling issues)
- Integrating detection software with CTH is intrusive on developer
CTH fragment detection service
- Extra compute nodes provide in-line processing (overlap fragment detection with time step calculation)
- Only output fragments to storage (reduce I/O)
- Non-intrusive
  - Looks like normal I/O (spyplot interface)
  - Can be configured out-of-band
Status
- Developing client/server stubs
- Developing Paraview proxy service
Diagram labels: CTH (compute nodes); spyplot; Raw Data; Fragment-Detection Service (compute nodes), Detect Fragments; Fragment Data; Lustre File System; Visualization Client
Fragment detection service provides on-the-fly data analysis with no modifications to CTH.

8 Database Service Provides Fast Access to Remote Database from HPC Application
Database Service Features
- Provides “bridge” between parallel apps and external data warehouse appliance (DWA)
- Runs on Cray XT/XMT network nodes
- Applications communicate with DB service using Nessie (over Portals or IB)
- Service-level access to Netezza through standard interface (e.g., ODBC, nzload)
Diagram labels: Client Application on XT/XMT (compute nodes); Portals; Database Service (service node); ODBC/NZLoad; Storage Arrays; © Netezza Corporation

9 Motivating Application: NISAC/N-ABLE Modeling Economic Security
Model economic impact of disruptions in infrastructure
- Changes in U.S. Border Security technologies
- Terrorist acts on commodity futures markets
- Transportation disruptions on regional agriculture and food supply chains
- Optimized military supply chains
- Electric power and rail transportation disruptions on chemical supply chains
Compute and data challenges
- Models economy to the level of the individual firm
- Model transactions from 10s of millions of companies
- Simulation data ingested into DB for analysis
- DB ingest is bottleneck (10x time to simulate data)
- Time to solution is critical… want answers in hours

10 Performance-Based Motivation for Database Service
Figure: Database Ingest Performance (axes: Bytes Written, Time (sec))
- ODBC is the only available interface for remote access.
- The interface and protocols, not the network, are the bottleneck.
- The service can use NZLoad from the Netezza host.

11 Summary Trios is a perfect vehicle for I/O co-design
Rapid deployment for efficient production-quality I/O libraries
- Exodus, Nemesis, IOSS (from Seacas)
- Sparse/Dense Matrix I/O (from NGC)
Research vehicle for in-transit data services
- Goal is to make efficient use of platform to reduce burden on file system
- Already demonstrated value for Seismic (Salvo)
Enables exploration of new functionality for Trilinos codes on HPC systems
- netCDF staging to manage bursts of I/O
- In-transit fragment detection/tracking: reduce storage system requirements
- Integration with Data Warehouse Appliances
- Interactive observation/control of HPC application

12 Trios Software Products Planned Timeline for Integration and Release
Seacas
- Exodus, Nemesis, IOSS (in Trilinos release 10.6)
- Plans to move out of Trios subdirectory to a new package named “Seacas”
Sparse/Dense matrix I/O libraries
- Installed and tested: November 30, 2011
Research software
- Network-Scalable Service Interface (Nessie)
  - Installed and tested: Jan 1, 2011
  - I’m not sure how to test services using the Trilinos testing framework
- netCDF staging service
  - Installed with basic tests: Feb 1, 2011
  - Link with Exodus and evaluate performance: March 2011
- CTH tracking service
  - In development, need working demo for ASC level III milestone: Summer 2011
  - Type of release depends on copyright resolution: currently for internal release only

