

NATIONAL PARTNERSHIP FOR ADVANCED COMPUTATIONAL INFRASTRUCTURE
Fast Adaptive Storage and Retrieval
Scott B. Baden
Department of Computer Science and Engineering
University of California, San Diego

Motivation
- Some applications are able to distinguish interesting features from background data using on-line analysis

Features

Animation

Fast Adaptive Storage and Retrieval
- If the volume fraction of interesting data is small, we can significantly reduce storage, memory, and network bandwidth requirements by storing only what is "needed"
- We call a scheme that realizes this capability Fast Adaptive Storage and Retrieval (FASTR)
- This is a new paradigm for scientific users, since they are reluctant to part with their data
- We use resources only to the extent that we require them: remote knowledge discovery and data browsing

The KeLP Project
- C++ run-time libraries for parallel application and library development
- Hide low-level details without sacrificing performance
- Irregular block-structured data
- Express communication at a high level using intuitive geometric set operations
- Also applies to data-intensive applications
- KeLP I/O: out of core (Bradley Broom, Rice)
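To illustrate what "geometric set operations" can mean for communication, here is a minimal Python sketch. It is not KeLP's C++ API: the `Box` class, `grow`, `intersect`, and `ghost_overlap` are hypothetical names chosen for this example. The idea shown is that the data one block needs from a neighbor can be written as the intersection of the block's grown (ghost) region with the neighbor's region.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned integer index box with inclusive corners (hypothetical type)."""
    lo: Tuple[int, ...]
    hi: Tuple[int, ...]

    def is_empty(self) -> bool:
        # A box is empty if any lower bound exceeds its upper bound.
        return any(l > h for l, h in zip(self.lo, self.hi))

    def size(self) -> int:
        # Number of index points contained in the box.
        if self.is_empty():
            return 0
        n = 1
        for l, h in zip(self.lo, self.hi):
            n *= h - l + 1
        return n

    def intersect(self, other: "Box") -> "Box":
        # Set intersection of two boxes; the result may be empty.
        lo = tuple(max(a, b) for a, b in zip(self.lo, other.lo))
        hi = tuple(min(a, b) for a, b in zip(self.hi, other.hi))
        return Box(lo, hi)

    def grow(self, width: int) -> "Box":
        # Extend the box by `width` cells in every direction (ghost layer).
        return Box(tuple(l - width for l in self.lo),
                   tuple(h + width for h in self.hi))


def ghost_overlap(mine: Box, neighbor: Box, width: int = 1) -> Box:
    """Cells owned by `neighbor` that lie in the ghost layer around `mine`."""
    return mine.grow(width).intersect(neighbor)


if __name__ == "__main__":
    left = Box((0, 0), (3, 3))
    right = Box((4, 0), (7, 3))
    strip = ghost_overlap(left, right)
    print(strip, strip.size())
```

Because the overlap is computed from the geometry alone, the same expression works for irregularly placed blocks; an empty result simply means the two blocks do not need to communicate.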

Data-intensive application of KeLP
- KDistuf: turbulent flow with Direct Numerical Simulation
- Collaboration involving K. Nomura (UCSD MAE), W. Kerney and D. Shalit (UCSD CSE), G. Balls (UCSDSC), P. Diamessis (USC)
- Content-based data compression
- Borrow structured adaptive mesh refinement grid techniques to
  - capture features at full resolution
  - discard remaining background data
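The "capture features, discard background" idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's implementation: the feature test (a simple magnitude threshold), the fixed tile size, the 2D field, and the function names `extract_feature_blocks`, `reconstruct`, and `stored_fraction` are all hypothetical. The slides say only that features are identified by on-line analysis and stored using structured adaptive mesh refinement grid techniques.

```python
def extract_feature_blocks(field, threshold, block=4):
    """Keep only the tiles that contain at least one value above `threshold`.

    `field` is a 2D list of floats. Returns {(i0, j0): tile}, where (i0, j0)
    is the tile's upper-left index and `tile` holds the full-resolution data.
    """
    rows, cols = len(field), len(field[0])
    kept = {}
    for i0 in range(0, rows, block):
        for j0 in range(0, cols, block):
            tile = [row[j0:j0 + block] for row in field[i0:i0 + block]]
            if any(abs(v) > threshold for r in tile for v in r):
                kept[(i0, j0)] = tile
    return kept


def reconstruct(kept, rows, cols, background=0.0):
    """Rebuild a full grid, filling discarded regions with `background`."""
    out = [[background] * cols for _ in range(rows)]
    for (i0, j0), tile in kept.items():
        for di, r in enumerate(tile):
            for dj, v in enumerate(r):
                out[i0 + di][j0 + dj] = v
    return out


def stored_fraction(kept, rows, cols):
    """Fraction of the original grid points that were actually stored."""
    stored = sum(len(r) for tile in kept.values() for r in tile)
    return stored / (rows * cols)


if __name__ == "__main__":
    grid = [[0.0] * 8 for _ in range(8)]
    grid[1][2] = 5.0  # a single "feature" point
    blocks = extract_feature_blocks(grid, threshold=1.0)
    print(sorted(blocks), stored_fraction(blocks, 8, 8))
```

Feature data survives at full resolution inside the kept tiles, while everything else is replaced by a single background value on reconstruction, so the storage cost scales with the volume fraction of interesting data.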

More about the application
- Turbulent mixing in stably stratified flow under the influence of background shear
- Solve the incompressible Navier-Stokes equations
- Follow the time evolution of regions of overturned dense fluid, which are the main agents of stirring and mixing
- Smyth, Moum, and Caldwell, "The efficiency of mixing in turbulent patches: inferences from direct simulations and microstructure observations," J. Phys. Ocean., in press, 2001.

Information discovery
- Oceanographic observations are incomplete: restricted to one-dimensional observations
- Discovery: time evolution, energy dissipation, and lifetime of overturn regions, which have irregular shapes
- Bill Smyth, Dept. Oceanic & Atmospheric Sciences, Oregon State University

Fast Adaptive Storage and Retrieval
- Compression depends on the data, currently on
- Best case ~20:1 compression (10 GB → 500 MB), worst case ~2.8:1
- Lempel-Ziv (gzip) gives us only 10%
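The figures on this slide can be put on a common scale with a little arithmetic. The Python sketch below assumes decimal units (1 GB = 10^9 bytes) and assumes that gzip's "10%" means a 10% reduction in size; the slide does not state either, and the function names are made up for this example.

```python
GB = 10**9  # decimal gigabyte (assumption)
MB = 10**6  # decimal megabyte (assumption)


def compression_ratio(original_bytes, compressed_bytes):
    """Ratio of original size to compressed size, e.g. 20.0 means 20:1."""
    return original_bytes / compressed_bytes


def ratio_from_saving(fraction_saved):
    """Convert a fractional size reduction (0.10 = 10% smaller) to a ratio."""
    return 1.0 / (1.0 - fraction_saved)


def compressed_size(original_bytes, ratio):
    """Size after compressing `original_bytes` at the given ratio."""
    return original_bytes / ratio


if __name__ == "__main__":
    best = compression_ratio(10 * GB, 500 * MB)   # best case from the slide
    gzip_ratio = ratio_from_saving(0.10)          # gzip, read as a 10% saving
    worst_bytes = compressed_size(10 * GB, 2.8)   # worst case applied to 10 GB
    print(best, round(gzip_ratio, 2), round(worst_bytes / GB, 2))
```

Under these assumptions, even the worst-case 2.8:1 figure is well ahead of the roughly 1.1:1 that a 10% gzip saving corresponds to.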

Further savings: another application
- Use volume tracking [Silver, Rutgers] to follow individual features
- FASTR permits us to extract only the data we need out of the many features present
- Computational volume: 2M points
- Average feature size: 1K points
- Maximum feature size: 20K points
- Saves an additional two orders of magnitude in communication bandwidth requirements
- Perform local analysis on a workstation
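The "two orders of magnitude" claim follows from the point counts on the slide: shipping one feature of at most 20K points instead of the full 2M-point volume is a 100-fold reduction. The Python sketch below checks that arithmetic and shows, on a toy labeled grid, what extracting a single tracked feature might look like. The label-map representation and the function names are assumptions for illustration, not the volume-tracking or FASTR interface.

```python
import math


def points_of_feature(labels, feature_id):
    """Return the (row, col) indices whose label equals `feature_id`.

    `labels` is a 2D list of integer feature ids, with 0 meaning background
    (a hypothetical representation of the tracker's output).
    """
    return [(i, j)
            for i, row in enumerate(labels)
            for j, lab in enumerate(row)
            if lab == feature_id]


def reduction_factor(total_points, feature_points):
    """How many times less data is moved when sending only one feature."""
    return total_points / feature_points


if __name__ == "__main__":
    toy = [[0, 1, 1],
           [0, 0, 2],
           [2, 2, 2]]
    print(points_of_feature(toy, 1))

    worst = reduction_factor(2_000_000, 20_000)   # largest feature
    typical = reduction_factor(2_000_000, 1_000)  # average feature
    print(worst, math.log10(worst), typical)
```

The maximum feature size gives the conservative figure; for an average 1K-point feature the same calculation gives a reduction of about three orders of magnitude.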

Future plans
- Develop remote analysis capability
- Integrate with DTF data handling infrastructure
- Larger scale simulations on Blue Horizon and on clusters:
  - Study vortex pairs in a stratified turbulent environment
  - Improved understanding of aircraft wake vortices
  - Practical importance for air traffic control

Remote analysis capability
- Perform analysis on data sets stored remotely, e.g. Data Cutter
- We can perform some data analysis on a local workstation
- For highly intensive data analysis, we can use higher end resources, but again we access only the data we need

Publications and people
- FASTR is based on a research prototype called MOLD, which is the M.S. thesis research of UCSD CSE student William Kerney: "MOLD: A System for Breaking Down Large Visualization and Post-Processing Problems." Expected March
- Peter Diamessis, then a PhD student with Keiko Nomura (UCSD MAE Dept), used MOLD to carry out an exploration of overturns
- P. Diamessis, "An Investigation of Vortical Structures and Density Overturns in Stably Stratified Homogeneous Turbulence by Means of Direct Numerical Simulation," PhD thesis, 2001
- P. Diamessis et al., "Automated Tracking of Turbulent Structures in Direct Numerical Simulation," PARA 2002, Helsinki, Finland. To appear.

Software availability
- FASTR: contact us
- KeLP
  - Hardened version of KeLP, AKA KeLP1.4
  - NPACI Blue Horizon, Sun HPC, Cray T3E, Linux clusters
  - Workstations: Solaris, Linux, etc.
  - Dual-tier variant, KeLP2.1: hierarchical KeLP for SMP clusters and SMP-based machines (e.g. BH)