Real-time Storm Surge Modeling in a Grid Environment Howard M. Lander


1 Real-time Storm Surge Modeling in a Grid Environment Howard M. Lander howard@renci.org

2 SCOOP: A Distributed Laboratory
Funded by NOAA & ONR
2005/2006 SCOOP Implementation Team: Bedford Institute of Oceanography; Virginia Institute of Marine Science; University of Alabama, Huntsville; Texas A&M Research Foundation; Renaissance Computing Institute; University of North Carolina; University of Florida; Louisiana State University; Gulf of Maine Ocean Observing System; MCNC; Southeastern Universities Research Association
External Resources: e.g. SURAgrid regional grid infrastructure, www.sura.org/suragrid
Credits: SCOOP Team

3 Acknowledgements
Funding – “SURA Coastal Ocean Observing and Prediction (SCOOP) Program”, Office of Naval Research, Award N00014-04-1-0721; National Oceanic and Atmospheric Administration’s NOAA Ocean Service, Award NA04NOS4730254.
SCOOP Partners – Philip Bogden (SURA and GoMOOS); Will Perrie, Bash Toulany (BIO); Charlton Purvis, Eric Bridger (GoMOOS); Greg Stone, Gabrielle Allen, Jon MacLaren, Bret Estrada, Chirag Dekate (LSU, Center for Computation and Technology); Gerald Creager, Larry Flournoy, Wei Zhao, Donna Cote and Matt Howard (TAMU); Sara Graves, Helen Conover, Ken Keiser, Matt Smith, and Marilyn Drewry (UAH); Peter Sheng, Justin Davis, Renato Figueiredo, and Vladimir Paramygin (UFL); Harry Wang, Jian Shen and David Forrest (VIMS); Hans Graber, Neil Williams and Geoff Samuels (UMiami); Mary Fran Yafchak, Kate Barzee, Don Riley, Don Wright and Joanne Bintz (SURA); Rick Luettich (UNC-CH); Brian Blanton (SAIC); Dan Reed, Alan Blatecky, Lavanya Ramakrishnan, Gopi Kandaswamy, Ken Galluppi (RENCI); Steve Thorpe (MCNC)
SCOOP and SURAGrid resource partners and system administrators – Steven Johnson (TAMU), Renato J. Figueiredo (UFL), Michael McEniry (UAH), Ian Chang-Yen (ULL), and Brad Viviano (RENCI), for providing valuable system administrator support

4 Outline Motivation Demo Scenario Grid Technologies –Grid Architecture –Resource Selection Portlet Tour

5 Motivation: Disaster Response An example close to home: North Carolina –most disasters are weather driven: floods, winds, and ice Inadequate information –based on national and regional information High resolution model forecasting for local events –improves planning and preparation –shortens response and recovery time Credits: Ken Galluppi

6 Integrated Response System Hurricane Season 2005 –26 named storms, 14 hurricanes, 3 with major impact –billions of dollars in economic losses SURA Coastal Ocean Observing and Prediction (SCOOP) Program –provide early and accurate forecasts, dissemination of information –to be able to interact in real-time i.e. evaluate and adapt –provide infrastructure to solve interdisciplinary problems Today …

7 ADCIRC: Storm Surge Modeling Advanced Circulation Model (ADCIRC) –Finite Element Hydrodynamic Model for Coastal Oceans, Inlets, Rivers and Floodplains Scenarios –Daily operational 24/7/365 forecasts –Real-time ensemble model prediction –Retrospective analysis Assembling meteorological and other data sets for input –Multiple sources: U. Florida, NCEP, TAMU Hot-starting the model –NCEP 6 hour operational cycle –previous data is used to jumpstart the model run
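The hot-start idea on the slide above can be sketched as follows: a run is aligned to the 6-hour operational cycle, and its starting state is taken from an earlier run. This is only an illustrative sketch. The choice of "one cycle earlier" as the hot-start source and the file naming scheme are assumptions, not details taken from the talk.

```python
from datetime import datetime, timedelta, timezone

CYCLE_HOURS = 6  # length of the operational cycle mentioned on the slide


def latest_cycle(now: datetime) -> datetime:
    """Return the start of the most recent 6-hour cycle (00, 06, 12, 18 UTC)."""
    now = now.astimezone(timezone.utc)
    cycle_hour = (now.hour // CYCLE_HOURS) * CYCLE_HOURS
    return now.replace(hour=cycle_hour, minute=0, second=0, microsecond=0)


def hotstart_source_cycle(run_cycle: datetime) -> datetime:
    """Assumption: the hot-start state comes from the run one cycle earlier."""
    return run_cycle - timedelta(hours=CYCLE_HOURS)


def hotstart_name(cycle: datetime) -> str:
    """Hypothetical file naming scheme, for illustration only."""
    return f"hotstart_{cycle:%Y%m%d%H}.dat"


# Example: wind data arrives at 14:30 UTC on 29 Aug 2005.
now = datetime(2005, 8, 29, 14, 30, tzinfo=timezone.utc)
run = latest_cycle(now)              # the 12 UTC cycle
prev = hotstart_source_cycle(run)    # the 06 UTC cycle
print(run.isoformat(), hotstart_name(prev))
```

If the file for the previous cycle is missing, the slides suggest the alternative is to run the model to generate one; that branch is left out of this sketch.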

8 Demo Scenario Multiple model runs –An ensemble of 11 input files for a single time period. –Plan is to go to 46 members for this year: We need help! –Each member of the ensemble represents a distinct forecast track for the storm. –Multiple model runs for each ensemble member. Data from Hurricane Katrina August 2005 –Generated on demand at the University of Florida for demo. –Ordinarily generated in response to storm activity in the Atlantic basin. Portal tracks activity and status in the demo –Status of compute resources. –Status of input and output data. –Status of model runs.

9 Grid Architecture [diagram]
Components: Portal, Application Coordinator, Package Preparation, Resource Selection, Resource Monitoring, LDM, WS-Messenger Broker, MySQL, Visualization Wall, Site A, Site B, Site C
Data labels: NAM, UF-WANA, NAH
Step markers: 1.F, 1.H, 2 to 7, 8.F, 8.H, 9 to 11
Captions: [Resource Status]; [Wind data arrives - forecast]; [Output files are pushed out]; [What is the best resource?]; [Query site status]; [Prepare the package for the resource]; [Get initialization files from archive or run model to generate hotstart file]; [User initiates a model run]; [Move the package, initiate the run]; [Job finished, Move output files back]; [Output files]; [Model Status]

10 Technology Exposition Grid technologies (Globus) –standard job submission: Gatekeeper: used to dispatch and monitor jobs. –file transfer: GridFTP: used to move prepared package to resource and to retrieve results from resource. –queue status: Information Services/MDS: used as an input to the resource selection algorithm and displayed in a portlet. –credential repository: MyProxy: required for job submission. Domain products –Local Data Manager (LDM): event-driven data transport system, used to receive input files and trigger model runs as well as to insert results. –OpenDAP: format-independent network data access protocol.

11 Technology Exposition (2) Portal Technologies –NSF NMI Open Grid Computing Environment (OGCE): used to host the portlets. Eventing –LEAD WS-Messenger: enables data communication among pieces of the system. Example: the application coordinator sends status information through WS-Messenger. Web Services –Used to send job and resource status information from a MySQL database to the portlets. Also used to track flows of data files in the system. MySQL –Open source relational database used to store job and resource status information for display and analysis.
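The eventing pattern described above (the application coordinator publishes status, other pieces of the system record and display it) can be sketched with a tiny in-process publish/subscribe broker. This is not the WS-Messenger API. The `Broker` class, the topic name, and the message fields are all hypothetical, and a plain dictionary stands in for the MySQL status table.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class Broker:
    """Minimal in-process stand-in for a message broker (hypothetical)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Deliver the message to every handler registered for the topic.
        for handler in self._subscribers[topic]:
            handler(message)


# A dictionary stands in for the status table that the portlets would read.
job_table: Dict[str, str] = {}


def record_status(message: dict) -> None:
    """Keep only the latest status for each job."""
    job_table[message["job_id"]] = message["status"]


broker = Broker()
broker.subscribe("job.status", record_status)

# The application coordinator would publish updates like these as a run progresses.
broker.publish("job.status", {"job_id": "ens01", "status": "RUNNING"})
broker.publish("job.status", {"job_id": "ens01", "status": "DONE"})
print(job_table)
```

Decoupling the sender from the receivers this way means a new consumer, such as a notification service, can be added without changing the coordinator.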

12 Application Coordinator Data Management –real-time data movement: LDM, GridFTP –previously generated files: SCOOP Catalog [UAH] and archive [TAMU, LSU] Application Preparation –conversion of data formats –self extracting archive containing binary –identify and retrieve or generate appropriate hotstart files Extensible –model parameters, template scripts and environment

13 Resource Selection [diagram]
Components: Application Coordinator, Resource Selection, MySQL, MyProxy
Per-site services (Site A … Site Z): Globus Gatekeeper, Globus GridFTP, Globus MDS, Network Weather Service
Queries: a) Query queue status (free CPUs, length of queue) b) Query bandwidth c) Query current jobs
Actions: Obtain credential; Submit Job; Move self-extracting file; Job status; Move output files
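The three queries in the diagram (queue status, bandwidth, current jobs) could feed a ranking step along the following lines. The talk does not give the actual selection algorithm, so the scoring formula, the `SiteStatus` fields, and the sample numbers here are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class SiteStatus:
    name: str
    free_cpus: int         # from the queue status query
    queue_length: int      # jobs waiting in the site's queue
    bandwidth_mbps: float  # from the bandwidth query
    current_jobs: int      # jobs already placed at this site


def score(site: SiteStatus, cpus_needed: int) -> Optional[float]:
    """Return a score for the site, or None if it cannot run the job now."""
    if site.free_cpus < cpus_needed:
        return None
    # Assumed heuristic: prefer fast links, penalize waiting and running jobs.
    return site.bandwidth_mbps / (1 + site.queue_length + site.current_jobs)


def select_site(sites: Iterable[SiteStatus], cpus_needed: int) -> Optional[SiteStatus]:
    """Pick the eligible site with the highest score, if any."""
    best: Optional[SiteStatus] = None
    best_score: Optional[float] = None
    for site in sites:
        s = score(site, cpus_needed)
        if s is not None and (best_score is None or s > best_score):
            best, best_score = site, s
    return best


sites = [
    SiteStatus("Site A", free_cpus=16, queue_length=4, bandwidth_mbps=100.0, current_jobs=1),
    SiteStatus("Site B", free_cpus=4, queue_length=0, bandwidth_mbps=200.0, current_jobs=0),
    SiteStatus("Site Z", free_cpus=32, queue_length=0, bandwidth_mbps=50.0, current_jobs=0),
]
choice = select_site(sites, cpus_needed=8)
print(choice.name if choice else "no site available")
```

In this example Site B is skipped because it lacks free CPUs, and Site Z outranks Site A because its queue is empty even though its link is slower.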

14 Fault Tolerance and Recovery Verify correct operation of basic Grid services Implemented two phase fault recovery –Retry the failed step –Move back one step (e.g. may need to run on different resource) Proactive Monitoring and notification –Using WS-Messenger and Broker
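The two-phase recovery above can be sketched as a small driver loop: first retry the failed step, and if that does not help, move back one step and redo it (for example, to select a different resource). The function name, the retry and backtrack limits, and the demo steps are hypothetical. The sketch shows the control flow only, not the project's implementation.

```python
from typing import Callable, List


def run_with_recovery(steps: List[Callable[[], None]],
                      retries: int = 1,
                      max_backtracks: int = 3) -> None:
    """Run steps in order with two-phase recovery on failure."""
    i = 0
    backtracks = 0
    while i < len(steps):
        # Phase 1: run the step, retrying it up to `retries` more times.
        for _attempt in range(retries + 1):
            try:
                steps[i]()
                break
            except Exception:
                continue
        else:
            # Phase 2: all attempts failed, so move back one step and redo it.
            if i == 0 or backtracks >= max_backtracks:
                raise RuntimeError(f"step {i} failed and could not be recovered")
            backtracks += 1
            i -= 1
            continue
        i += 1


# Demo: submission fails at Site A, so the driver goes back and reselects.
log: List[str] = []
candidates = ["Site A", "Site B"]
chosen = {}


def select_resource() -> None:
    chosen["site"] = candidates.pop(0)
    log.append(f"select {chosen['site']}")


def submit_job() -> None:
    if chosen["site"] == "Site A":
        log.append("submit failed")
        raise RuntimeError("simulated submission failure")
    log.append("submit ok")


run_with_recovery([select_resource, submit_job], retries=1)
print(log)
```

The backtrack limit keeps the loop from cycling forever when every resource fails; a real system would likely also raise a notification at that point.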

15 Experiences from 2005 & 2006 Murphy’s Law –"If anything can go wrong, it will" –debugging is hard Resource selection –bandwidth, resource –performance, reliability –fault tolerance –failure recovery Model specifics –verification of model results
Left: ADCIRC max water level for 72 hr forecast starting 29 Aug 2005, driven by the "usual, always-available" ETA winds.
Right: ADCIRC max water level over ALL of UFL ensemble wind fields for 72 hr forecast starting 29 Aug 2005, driven by "UFL always-available" ETA winds.
Images credit: Brian O. Blanton, SAIC

16 Conclusions and Future Work Foundation for a highly reliable distributed Grid environment for critical applications Upgrade path to OGCE2 and Globus 4.0 –Early work has been done to port to OGCE2 –Use Globus 4.0 MDS triggering Application to other environments –North Carolina Forecasting System –Package standard web services for resource selection and fault tolerant application coordination More sophisticated resource selection –Use historical data and data from concurrent runs to make selections.

17 Portal Tour https://portal.scoop.sura.org/gridsphere End of talk!

18 More Information SCOOP –http://scoop.sura.org RENCI Projects –NCFS http://www.renci.org/projects/indexdr.php –SCOOP http://www.renci.org/projects/scoop.php http://www.scoop.unc.edu SURAGrid –https://gridportal.sura.org/

19 Design Principles Scalable real-time system –multiple large scale simulations in parallel –based on Grid technologies and standards Modular, Extensible –apply in context of other domains Adaptable –criticality of the application –variability in grid environments Framework –real-time discovery of available resources –managing the model run on an ad-hoc set of resources –continuous monitoring and adaptation: active monitoring, fault tolerance, failure recovery

21 Portal: Monitoring

22 Resource Pool Management Resources –Local: RENCI, MCNC –SURAGrid: TAMU, ULL, etc –SCOOP Partners: UAH, UFL Software –Globus Services – GridFTP, GRAM, MDS –NWS Configuration –Resources Expansion using property files –Automated test suite to check periodically
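Expanding the resource pool through property files might look like the sketch below, where adding a site means adding a few `key = value` lines instead of changing code. The key layout (`site.field`) and the host names are placeholders invented for this example. The slides do not show the actual file format.

```python
from typing import Dict


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple 'key = value' lines, skipping blanks and '#' comments."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def load_resources(text: str) -> Dict[str, Dict[str, str]]:
    """Group 'site.field' keys into one dictionary per site."""
    resources: Dict[str, Dict[str, str]] = {}
    for key, value in parse_properties(text).items():
        site, _, field = key.partition(".")
        resources.setdefault(site, {})[field] = value
    return resources


# Hypothetical entries; the host names are placeholders.
EXAMPLE = """
# one block per site
sitea.gatekeeper = gatekeeper.sitea.example.org
sitea.gridftp = gridftp.sitea.example.org
siteb.gatekeeper = gatekeeper.siteb.example.org
"""

resources = load_resources(EXAMPLE)
for site, fields in sorted(resources.items()):
    print(site, fields)
```

A periodic test suite, as listed on the slide, could iterate over the same dictionary and probe each configured service.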

23 Portal: Hindcast Mode Select Run Dates And Model Details

