Efficient Visualization and Analysis of Very Large Climate Data Hank Childs, Lawrence Berkeley National Laboratory December 8, 2011 Lawrence Livermore.


1 Efficient Visualization and Analysis of Very Large Climate Data Hank Childs, Lawrence Berkeley National Laboratory December 8, 2011 Lawrence Livermore National Laboratory

2 Problem Statement
Climate data is getting increasingly large (massive data!), both spatially and temporally.
Climate scientists want to process this data in lots of ways:
–Analyze a simulation recently run on their supercomputer
–Download remote data to a nearby supercomputer for analysis
–Download remote data to a desktop computer for analysis
–Send analysis routines to the remote machine and have the results returned

3 VisIt is an open source, richly featured, turnkey application for large data.
Used by:
–Visualization experts
–Simulation code developers
–Simulation code consumers
Popular:
–R&D 100 award in 2005
–Used on many of the Top500
–Well over 100K downloads
Developed by:
–NNSA, SciDAC, NEAMS, NSF, and more
[Figure: 217-pin reactor cooling simulation, 1 billion grid points per time slice, run on ¼ of the Argonne BG/P. Image credit: Paul Fischer, ANL]

4 VisIt employs a parallelized client-server architecture.
Client-server observations:
–Good for remote visualization
–Leverages available resources
–Scales well
–No need to move data
[Diagram: a remote machine (parallel vis resources, user data) connected to localhost (Linux, Windows, or Mac; graphics hardware)]

5 VisIt recently demonstrated good performance at unprecedented scale.
● Weak scaling study: ~62.5M cells/core
#cores      Problem size   Model      Machine
8K          0.5T           IBM P5     Purple
16K         1T             Sun        Ranger
16K         1T             X86_64     Juno
32K         2T             Cray XT5   JaguarPF
64K         4T             BG/P       Dawn
16K, 32K    1T, 2T         Cray XT4   Franklin
[Figure: two trillion cell data set, rendered in VisIt by David Pugmire on the ORNL Jaguar machine]
VisIt’s data processing techniques are more than scalability at massive concurrency; we are leveraging a suite of techniques developed over the last decade by VACET, the NNSA, and others.
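
The ~62.5M cells/core figure can be checked against the core counts and problem sizes listed for each machine. The following is a minimal sketch of that arithmetic, assuming K means 1,000 cores and T means 10^12 cells (the slide does not say which convention it uses); the helper name `cells_per_core` is hypothetical.

```python
# Check of the weak scaling ratio quoted on the slide.
# Assumptions (not stated on the slide): K = 1,000 cores and T = 10**12 cells.


def cells_per_core(total_cells: float, cores: int) -> float:
    """Average number of cells each core handles (hypothetical helper)."""
    return total_cells / cores


# (machine, cores, total cells) as listed for the study.
RUNS = [
    ("Purple", 8_000, 0.5e12),
    ("Ranger", 16_000, 1e12),
    ("Juno", 16_000, 1e12),
    ("JaguarPF", 32_000, 2e12),
    ("Dawn", 64_000, 4e12),
    ("Franklin", 16_000, 1e12),
    ("Franklin", 32_000, 2e12),
]

if __name__ == "__main__":
    for machine, cores, cells in RUNS:
        print(f"{machine}: {cells_per_core(cells, cores) / 1e6:.1f}M cells/core")
```

Under these assumptions every run works out to the same per-core load, which is what defines a weak scaling study: the problem size grows in step with the core count.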

6 VisIt is used to look at simulated and experimental data from many areas.
–Fusion (Sanderson, UUtah)
–Particle accelerators (Ruebel, LBNL)
–Astrophysics (Childs)
–Nuclear reactors (Childs)

7 Problem Statement
Climate data is getting increasingly massive, both spatially and temporally.
Climate scientists want to process this data in lots of ways:
–Analyze a simulation recently run on their supercomputer
–Download remote data to a nearby supercomputer for analysis
–Download remote data to a desktop computer for analysis
–Send analysis routines to the remote machine and have the results returned
Mission accomplished?

8 General-purpose tools vs application-specific tools
General-purpose tools:
Are developed by large teams, leading to:
–Robustness
–Efficient algorithms
–Rich set of features
But:
–They aren’t streamlined for a given application area (lots of button clicks)
–They don’t have application-specific methods
Application-specific tools:
Are made specifically to solve your problem:
–Streamlined user interface
–Application-specific analysis
But they have smaller user and developer communities, which means:
–Less robustness
–Less efficient algorithms
–A smaller set of features
So which do users want?

9 Amazing developments over the last decade…
Very useful packages are now available that make quick tool development possible:
–Python, R, VTK, Qt
But this doesn’t solve the large data issue.
–… but tools are now available that do that as well: VisIt & ParaView
Great idea: put all these products together into one tool (VisTrails).
Users get:
–The robustness, richness, and efficiency of large development efforts
–Streamlined user interface & climate-specific analysis (via CDAT)
This tool is UV-CDAT.

10 UV-CDAT
UV-CDAT = Ultra-scale Visualization Climate Data Analysis Tools
Goal: a robust tool, capable of doing powerful analysis on large climate data sets.
Collaboration between two DOE BER teams
One tool that incorporates many packages…
–… including “VisIt”
UV-CDAT Developers: Dean Williams (PI), Jim Ahrens, Andy Bauer, Dave Bader, Berk Geveci, Timo Bremer, Claudio Silva, David Partyka, Charles Doutriaux, Robert Drach, Emanuele Marques, Feiyi Wang, Jerry Potter, Galen Shipman, John Patchett, Phil Jones, Thomas Maxwell, and many more

11 Integrated UV-CDAT GUI: Project, Plot, and Variables View

12 Big data visualization tools use data parallelism to process data.
[Diagram: a parallel simulation code writes pieces of data (P0 through P9) to disk. In the parallelized visualization data flow network, Processors 0, 1, and 2 each run Read, Process, and Render on their share of the pieces.]
This is a good approach for high resolution meshes … but climate data is often different.
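
The data-parallel pattern on this slide can be sketched in a few lines. This is an illustration only, not VisIt's implementation: the contiguous-block assignment is an assumption (the slide does not say how pieces are mapped to processors), and `assign_pieces`, `run_processor`, and the three stage functions are hypothetical stand-ins.

```python
from typing import Callable, List


def assign_pieces(num_pieces: int, num_procs: int) -> List[List[int]]:
    """Split piece indices 0..num_pieces-1 into contiguous blocks, one per processor.

    Assumption: contiguous blocks, with any remainder going to the lowest ranks.
    """
    if num_procs <= 0:
        raise ValueError("num_procs must be positive")
    base, extra = divmod(num_pieces, num_procs)
    blocks: List[List[int]] = []
    start = 0
    for rank in range(num_procs):
        count = base + (1 if rank < extra else 0)
        blocks.append(list(range(start, start + count)))
        start += count
    return blocks


def run_processor(
    pieces: List[int],
    read: Callable,
    process: Callable,
    render: Callable,
) -> list:
    """Run the Read -> Process -> Render chain on each piece owned by one processor."""
    return [render(process(read(piece))) for piece in pieces]


if __name__ == "__main__":
    # Stand-in stages; a real tool would read files, run filters, and draw geometry.
    read = lambda piece: [float(piece)] * 4
    process = lambda cells: [2.0 * c for c in cells]
    render = lambda cells: f"image of {len(cells)} cells"

    for rank, pieces in enumerate(assign_pieces(10, 3)):
        print(rank, pieces, run_processor(pieces, read, process, render))
```

Each processor touches only its own pieces, so the work scales with the number of pieces. That is also why the approach suits meshes with many pieces and helps less when a single time slice is small.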

13 Parallelizing Processing Over Time Slices: Improving Performance For Climate Data
Objective
VisIt’s parallel processing techniques were designed for single time slices of very high resolution meshes. We must adapt this approach for the lower resolution and high temporal frequency characteristic of climate data.
Progress & Results
We have modified VisIt’s underlying infrastructure to have a “parallelize over time” processing mode. We implemented a simple algorithm (“maximum value over time”) as a proof of concept.
Impact
Climate scientists with access to parallel resources for their data will be able to process their data significantly faster through parallelization. This software investment will enable other algorithms developed by the project to also be accelerated through parallelization.
[Diagram: VisIt parallelizing over a high resolution spatial mesh (P0, P1, P2 each own part of the mesh) compared with VisIt parallelizing temporally over a low resolution mesh (P0, P1, P2 each own some of the time slices T=0 through T=8)]
Scaling on 2130 time slices of NetCDF climate data (source: Wehner):
Concurrency   Average time   Speedup
1             480.0s         -
2             223.0s         2.1X
4             120.0s         4.0X
8             62.0s          7.8X
16            29.0s          16.5X
32            15.5s          31.0X
64            7.8s           61.5X
128           4.2s           114X
This is just one of the ways in which climate data is unique; there are many more.
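
The “maximum value over time” proof of concept lends itself to a short sketch of the parallelize-over-time idea. This is not VisIt’s code: it assumes a round-robin assignment of time slices to workers, represents each slice as a flat list of cell values, and uses Python threads purely as stand-ins for parallel processes. The names `local_max` and `max_over_time` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List

Slice = List[float]  # one low-resolution time slice, flattened to a list of cell values


def local_max(slices: List[Slice]) -> Slice:
    """Element-wise maximum over the time slices owned by one worker."""
    result = list(slices[0])
    for s in slices[1:]:
        result = [max(a, b) for a, b in zip(result, s)]
    return result


def max_over_time(slices: List[Slice], workers: int) -> Slice:
    """Maximum value over time for every cell, parallelized over time slices."""
    if not slices or workers <= 0:
        raise ValueError("need at least one time slice and one worker")
    # Assumption: round-robin assignment of time slices to workers.
    groups = [slices[w::workers] for w in range(workers)]
    groups = [g for g in groups if g]  # drop workers that received no slices
    # Threads stand in for parallel processes here; they only illustrate the pattern.
    with ThreadPoolExecutor(max_workers=len(groups)) as pool:
        partials = list(pool.map(local_max, groups))
    # Final reduction: combine the per-worker maxima into one result.
    return local_max(partials)


if __name__ == "__main__":
    data = [[1.0, 5.0, 2.0], [4.0, 0.0, 3.0], [2.0, 2.0, 9.0], [0.0, 6.0, 1.0]]
    print(max_over_time(data, workers=3))
```

Because each worker needs only its own time slices until the final reduction, the work divides cleanly as long as there are many more slices than workers.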

14 Success story: spatial extreme value analysis
We were able to parallelize the spatial extreme value analysis:
–Parallelization was both spatial and temporal
–Parallelization was done with VisIt infrastructure, but analysis was done with R (via VisIt-R linkage)
Ran on CCSM3.0, 100 year, daily precipitation data
Near perfect scaling on 36,500 time slices
Credit: Dave Pugmire (ORNL), Chris Paciorek (UC Berkeley)

15 Summary
Visualization and analysis of massive data is possible
–Other communities have built tools for this purpose
–These tools are effective for: local or remote data; modest to massive parallelization (& serial too!); interactive or batch processing
Our effort focuses on deploying VisIt and R as part of UV-CDAT.
–VisIt is excellent with large data and has been adapted to work with climate data.
–We are also collaborating on cutting edge analysis techniques with climate scientists

16 Example climate visualizations with VisIt

