Download presentation
Presentation is loading. Please wait.
Published byYanti Sudirman Modified over 6 years ago
1
PyStormTracker: A Parallel Object-Oriented Cyclone Tracker in Python
Albert Yau School of Marine and Atmospheric Sciences Stony Brook University Kevin Paul and John Dennis Application Scalability and Performance (ASAP) Group Computational and Information Systems Laboratory National Center for Atmospheric Research 11/17/2018
2
Outline Scientific Background Motivation and Objectives Implementation
Sample Output Benchmarks Summary and Future Work 11/17/2018
3
What are cyclones? A figure from a nineteenth century geography text showing storm frequency distribution (Hinman, 1888), adapted from Chang et al. (2002) 11/17/2018
4
Scientific Background
Cyclones are synoptic scale (~1000km) moving low pressure systems Extra-tropical cyclones are responsible for much of the day-to-day weather variability and extreme weather events at mid-latitudes Evolution of an extra-tropical cyclone as in the Norwegian Cyclone Model. (Source: weather.gov) 11/17/2018
5
Motivation Tracking extra-tropical cyclones is challenging (Neu et al., 2013): Cyclones vary greatly in size and shape, move with different velocities, split and merge over the lifetime Most current cyclone trackers are written in FORTRAN, the codes are not readily available Hard to modify, optimize, or compare existing cyclone trackers 11/17/2018
6
Objective 1 Implement a user and developer friendly cyclone tracker in Python using an object-oriented approach Flexible and extensible by inheriting or replacing the classes and methods Use existing scientific computing libraries such as NumPy and SciPy The source codes will be publicly available at GitHub 11/17/2018
7
Motivation Emergence of high resolution and multi-ensemble model data
CESM Large Ensemble project: 40 ensemble members from for CMIP5 historical and RCP8.5 scenarios Global Ensemble Forecast System (GEFS): up to 16 days of global forecasts from 21 ensemble members GEFS ensemble members Source: schumacher.atmos.colostate.edu 11/17/2018
8
Objective 2 To be able to process data from different datasets
GRIB: most commonly used in reanalysis and weather models: NCEP/NCAR Reanalysis and GEFS netCDF: a more recent standard commonly used in climate models: CMIP5 archive of more than 30 climate models PyNIO is chosen for file I/O Utilize parallel computing technique to accelerate processing of large amount of data MPI4Py as an interface to the MPI message passing library 11/17/2018
9
Implementation: Task Parallelism
grid grid centers tracks tracks (meta data only) Task 1 grid centers tracks Task 2 grid centers tracks (tree reduction) Task 3 grid centers tracks Task 4 grid centers tracks … Task n grid centers tracks 11/17/2018
10
Implementation: Classes
stormtracker.py RectGrid class: Rectilinear grid derived from Grid class Does not read data until actually required Split into smaller objects by time Tracks class: Input a list of Center Match centers to existing Tracks using nearest neighborhood method If matched, append to exisiting track; else append center as a new track Merge two Tracks objects stormtracker.py: The main controller for generating Grid, Center and Track objects Manage parallel processes from detector import RectGrid, Center from linker import Tracks If __name__ == “__main__”: grid = RectGrid(filepath=“…”) centers = grid.detect() tracks = Tracks() tracks.append_center(centers) Center class: Detect centers from Grid data using scipy.ndimage.filters Calculate great circle distances between centers detector.py linker.py class Grid(object): class Center(object): def __init__(self, time, lat, lon, var): def abs_dist(self, center): class RectGrid(Grid): def __init__(self, filepath, trange): def get_time(self): def get_var(self, trange): def split(self, num): class Tracks(object): def __init__(self): def match_center(self, centers): def append_center(self, centers): def merge_tracks(self, tracks): The code repository will be available at: 11/17/2018
11
Sample Output CESM Large Ensemble #31:
6-hourly PSL 1-degree resolution 192*288 grid Year 1990, 769 tracks Tracks deeper than 975hPa, moved 1000km and longer than 2 days 11/17/2018
12
Benchmark: CESM 1 year – 1460 time steps
Number of MPI Tasks Time Detector Linker Total 1 569.55 2 517.25 297.34 814.58 4 261.58 160.38 421.96 8 136.39 88.17 224.56 16 77.21 48.20 125.40 32 46.99 30.73 77.72 64 30.44 21.43 51.86 128 20.80 17.59 38.39 256 17.67 15.86 33.53 512 15.77 15.17 30.93 1024 15.63 15.20 30.83 11/17/2018
13
Benchmark: CESM 15 years – 22360 time steps
Number of MPI Tasks Time Detector Linker Total 16 749.45 32 537.75 475.40 64 272.01 337.67 609.69 128 143.80 264.06 407.86 256 80.64 226.78 307.42 512 47.35 209.60 256.95 1024 30.71 200.80 231.50 2048 23.16 197.29 220.45 4096 19.33 196.02 215.36 11/17/2018
14
Summary and Future Work
The cyclone tracker is implemented using an object-oriented approach, which is flexible and easily extended: For example: implement a CubedSphereGrid class; track tropical cyclones 11/17/2018
15
Summary and Future Work
We achieved reasonably good speedup in the detector, while the bottleneck is in the linker: Reduce the communication cost by preselect tracks before merging Instead of serializing (pickling) the objects, use buffered send/receive in MPI 11/17/2018
16
Summary and Future Work
Verify the quality of tracks: Compare the tracks with other well established cyclone trackers NCEP/NCAR Reanalysis: Count of cyclone tracks deeper than 975hPa, moved at least 1000km and existed longer than two days in the north Pacific winter, compared with Hodges (1999) tracker 11/17/2018
17
Thank you! Questions? Email: albert.yau@stonybrook.edu
The code repository will be available at: 11/17/2018
Similar presentations
© 2024 SlidePlayer.com Inc.
All rights reserved.