Published by Egbert Griffin. Modified over 9 years ago.
1
SCEC/CME Project - How Earthquake Simulations Drive Middleware Requirements
Philip Maechling, SCEC IT Architect
24 June 2005
2
GRIDS Center Community Workshop 2005 June 26, 2006
Southern California Earthquake Center
Consortium of 15 core institutions and 39 other participating organizations, founded as an NSF STC in 1991
Co-funded by NSF and USGS under the National Earthquake Hazards Reduction Program (NEHRP)
Mission:
– Gather data on earthquakes in Southern California
– Integrate information into a comprehensive, physics-based understanding of earthquake phenomena
– Communicate understanding to end-users and the general public to increase earthquake awareness and reduce earthquake risk
Core Institutions
– University of Southern California (lead)
– California Institute of Technology
– Columbia University
– Harvard University
– Massachusetts Institute of Technology
– San Diego State University
– Stanford University
– U.S. Geological Survey (3 offices)
– University of California, Los Angeles
– University of California, San Diego
– University of California, Santa Barbara
– University of Nevada, Reno
Participating Institutions
– 39 national and international universities and research organizations
http://www.scec.org
3
Recent Earthquakes In California
4
Observed Areas of Strong Ground Motion
5
Simulations Supplement Observed Data
6
SCEC/CME Project
Goal: To develop a cyberinfrastructure that can support system-level earthquake science, the SCEC Community Modeling Environment (CME)
Support: 5-yr project funded by the NSF/ITR program under the CISE and Geoscience Directorates
Start date: Oct 1, 2001
[Diagram: SCEC/ITR Project, connecting NSF (CISE, GEO), SCEC Institutions, IRIS, USGS, ISI, and SDSC across Information Science and Earth Science]
www.scec.org/cme
7
SCEC/CME Scientific Workflow Construction
A major SCEC/CME objective is the ability to construct and run complex scientific workflows for seismic hazard analysis (SHA)
[Workflow diagram: Pathway 1 example]
Diagram elements:
– Define Scenario Earthquake
– Gridded Region Definition
– IMR Definition
– ERF Definition
– Calculate Hazard Curves
– 9000 Hazard Curve files (9000 x 0.5 MB = 4.5 GB)
– Probability of Exceedance and IMR Definition
– Extract IMR Value
– Lat/Long/Amp (xyz file) with 3000 datapoints (100 KB)
– GMT Map Configuration Parameters
– Plot Hazard Map
8
SCEC/CME Scientific Workflow System
9
SCEC/CME SRB-based Digital Library
[Diagram: SCEC Community Library. Select Scenario (Fault Model, Source Model); Select Receiver (Lat/Lon); Output Time History Seismograms]
SRB-based Digital Library
– More than 100 Terabytes of tape archive
– 4 Terabytes of on-line disk
– 5 Terabytes of disk cache for derivations
10
Integrated Workflow Architecture
[Architecture diagram]
Diagram elements:
– Component Library (execution requirements, I/O data descriptions)
– Domain Ontology
– Workflow Template Editor (CAT)
– Workflow Template (WT)
– Workflow Library
– Conceptual Data Query Engine (DataFinder)
– Metadata Catalog
– Data Selection
– Workflow Instance (WI)
– Workflow Mapping (Pegasus)
– Grid information services
– Executable Workflow
– Grid
– Engineer Tools
Queries shown: query for components; query for WT; query for data given metadata
Components: L. Hearn @ UBC, K. Olsen @ SDSU, J. Zechar @ USC, D. Okaya @ USC (Teamwork: Geo + CS)
11
SCEC/CME HPC Allocations
SCEC/CME researchers need, and have access to, significant High Performance Computing capabilities
TeraGrid Allocations (April 2005 – March 2006)
– TG-MCA03S012 (Olsen) 1,020,000 SUs
– TG-BCS050002S (Okaya) 145,000 SUs
USC HPCC Allocations
– CME Group Allocations (Maechling) 100,000 SUs
– Investigator Allocations (Li, Jordan) 300,000 SUs
SCEC Cluster
– Dedicated 16-processor Pentium 4 cluster (102 GFlops)
12
SCEC/CME TeraGrid Support
TeraGrid Strategic Application Collaboration (SAC) greatly improved our AWM run-time on TeraGrid
Advanced TeraGrid Support (ATS) for TeraShake 2 and CyberShake simulations
SDSC Visualization Services support for SCEC simulations
13
Three Types of Simulations
SCEC/CME supports widely varying types of earthquake simulations
Each simulation type creates its own set of middleware requirements
We will describe three examples and comment on their middleware implications and on computational system requirements:
– Probabilistic Seismic Hazard Maps
– 3D Waveform Propagation Simulations
– 3D Waveform-based Intensity Measure Relationship
14
Probabilistic Seismic Hazard Maps
(1) Earthquake-Rupture Forecast (ERF): Probability of all possible fault-rupture events (M≥~5) for region & time span
(2) Intensity-Measure Relationship (IMR): Gives Prob(IMT≥IML) for a given site and fault-rupture event
– Attenuation Relationships (traditional) (no physics)
– Full-Waveform Modeling (developmental) (more physics)
15
Example Hazard Curve
Site: USC
ERF: Frankel-02
IMR: Field
IMT: Peak Velocity
Time Period: 50 Years
16
Probabilistic Hazard Map Calculations
17
Characteristics of PSHA Simulations
10k independent hazard curve calculations for each map calculation.
– High throughput, not high performance, computing problem.
10k resulting files per map
– Metadata saved for each file
Short run times on each calculation
– Overhead of starting up a job is expensive.
Would like to offer map calculations as a service to SCEC users (who may not have an allocation)
18
Middleware Implications
High throughput scheduling
– Well suited to a Condor pool
Bundling of short run-time jobs will reduce job startup overhead.
Bundling of jobs is useful for cluster execution.
Metadata tracking with an RDBMS-based catalog system (e.g. Metadata Catalog System (MCS) and Replica Location Service (RLS))
– Databases present installation and operational problems at every site where we request them
Software support for interpreted languages on computational clusters
– Implemented in an interpreted programming language
On-demand execution by non-allocated users
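The bundling point can be illustrated with a minimal sketch. It assumes 10,000 independent hazard curve tasks per map (the figure quoted for PSHA maps), a hypothetical bundle size of 100, and a hypothetical 30-second startup cost per submitted job. None of these tuning values come from the SCEC workflow itself, and the function names are illustrative only.

```python
def bundle(tasks, bundle_size):
    """Group short tasks into bundles so each submitted job runs many tasks."""
    return [tasks[i:i + bundle_size] for i in range(0, len(tasks), bundle_size)]


def total_startup_seconds(num_jobs, startup_seconds):
    """Startup overhead is paid once per submitted job, not once per task."""
    return num_jobs * startup_seconds


site_ids = list(range(10_000))   # one hazard curve calculation per site
STARTUP_SECONDS = 30             # hypothetical per-job startup cost
BUNDLE_SIZE = 100                # hypothetical bundle size

bundles = bundle(site_ids, BUNDLE_SIZE)
unbundled_overhead = total_startup_seconds(len(site_ids), STARTUP_SECONDS)
bundled_overhead = total_startup_seconds(len(bundles), STARTUP_SECONDS)

print(len(bundles), unbundled_overhead, bundled_overhead)
```

Under these assumed numbers the startup overhead drops by the bundle size, while every task still runs exactly once.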
19
3D Wave Propagation Simulations
20
Characteristics of 3D Wave Propagation Simulations
More physically realistic than existing PSHA but more computationally expensive.
High Performance Computing, cluster-based codes
4D data calculations (time varying volumetric data)
Output large volumetric data sets
Physics limited by resolution of grid. Higher ground motion frequencies require a denser grid.
Doubling the grid density increases storage by a factor of 8.
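The factor of 8 follows from refining a three-dimensional grid: doubling the number of points along each axis multiplies the point count by 2 x 2 x 2. A minimal sketch follows. The fourth-dimension case is an added assumption, namely that an explicit solver must also shrink its time step in proportion to the grid spacing.

```python
def growth_factor(density_multiplier, dims=3):
    """Growth in grid points when every axis is refined by the same multiplier."""
    return density_multiplier ** dims


storage_factor = growth_factor(2)            # 3 spatial dimensions
# Assumption: the time step shrinks with the grid spacing, so the number of
# time steps doubles too and total work grows along a fourth dimension.
work_factor = growth_factor(2, dims=4)

print(storage_factor, work_factor)
```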
21
Example: TeraShake Simulation
Magnitude 7.7 earthquake on southern San Andreas
Mesh of ~2 billion cubes, dx=200 m
0.011 sec time step, 20,000 time steps: 3 minute simulation
Kinematic source (from Denali) from Cajon Creek to Bombay Beach
– 60 sec source duration
– 18,886 point sources, each 6,800 time steps in duration
240 processors at San Diego Supercomputer Center DataStar
~20,000 CPU hours, approximately 5 days wall clock
~50 TB of output
During execution, "on-the-fly" graphics (…attempt aborted!)
Metadata capture and storage in the SCEC digital library
22
Domain Decomposition For TeraShake Simulations
23
Simulations Supplement Observed Data
24
Peak Velocity
[Figure panels: NW-SE Rupture; SE-NW Rupture]
25
[Figure labels: peak velocities at selected sites, two panels labeled SE-NW and NW-SE]
Montebello: 337 cm/s; Downtown: 52 cm/s; Long Beach: 48 cm/s; San Diego: 8 cm/s; Palm Springs: 36 cm/s
Montebello: 8 cm/s; Downtown: 4 cm/s; Long Beach: 9 cm/s; San Diego: 6 cm/s; Palm Springs: 23 cm/s
26
Break-down of output
Full volume velocities every 10th time step    43.2 TB
Full surface velocities every time step         1.1 TB
Checkpoints (restarts) every 1,000 steps        3.0 TB
Input files, etc                                0.1 TB
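The breakdown can be cross-checked against the TeraShake example figures (20,000 time steps, ~2 billion mesh cubes, ~50 TB of output). The sketch below is a consistency check only. It assumes decimal units (1 TB = 1000 GB), and it assumes each volume snapshot stores three velocity components as 4-byte floats, which the slides do not state.

```python
GB = 10**9  # decimal gigabyte; an assumption about the units on the slide

# Output breakdown from the slide, expressed in whole gigabytes.
breakdown_gb = {
    "full volume velocities, every 10th step": 43_200,
    "full surface velocities, every step": 1_100,
    "checkpoints, every 1,000 steps": 3_000,
    "input files, etc": 100,
}
total_gb = sum(breakdown_gb.values())

time_steps = 20_000
volume_snapshots = time_steps // 10      # volume saved every 10th step
checkpoints = time_steps // 1_000        # checkpoint every 1,000 steps

bytes_per_snapshot = 43_200 * GB // volume_snapshots
gb_per_checkpoint = 3_000 // checkpoints

# Assumption: 3 velocity components per mesh point, 4 bytes each.
implied_mesh_points = bytes_per_snapshot // (3 * 4)

print(total_gb, volume_snapshots, gb_per_checkpoint, implied_mesh_points)
```

Under these assumptions the itemized total is 47.4 TB, in line with the quoted ~50 TB, and the implied mesh size of 1.8 billion points is in line with the quoted ~2 billion cubes.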
27
Middleware Implications for 3D Wave Propagation Simulations
Multi-day high performance runs
– Checkpoint/restart support needed
Schedule reservations on clusters
– Reservations and special queues are often arranged.
Large file and data movement
– Terabyte transfers require highly reliable, long-running data transfers
Ability to stop and restart
– Can we move a restart from one system to another?
Draining of temporary storage during runs
– Storage required for the full output often exceeds the capacity of scratch, so output files must be moved during the simulation
28
Middleware Implications for 3D Wave Propagation Simulations
On-the-fly visualization for rapid validation of results
– Verify before full simulation is completed
Standard protocols for data transfers, and metadata registration into SRB-based storage
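The checkpoint and restart requirement for multi-day runs can be illustrated with a toy time-stepping loop. This is a sketch only: the state is a small JSON file rather than terabytes of wavefield data, the per-step update is a stand-in for a real solver, and the function and file names are hypothetical.

```python
import json
import os
import tempfile


def run(total_steps, checkpoint_every, path, stop_after=None):
    """Advance a toy simulation, resuming from a checkpoint file if present."""
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)          # restart from the last checkpoint
    else:
        state = {"step": 0, "value": 0}   # fresh start
    while state["step"] < total_steps:
        if stop_after is not None and state["step"] >= stop_after:
            return None                   # simulated interruption
        state["step"] += 1
        state["value"] += state["step"]   # stand-in for one solver time step
        if state["step"] % checkpoint_every == 0:
            tmp = path + ".tmp"
            with open(tmp, "w") as f:
                json.dump(state, f)
            os.replace(tmp, path)         # rename so a crash never leaves a partial file
    return state["value"]


with tempfile.TemporaryDirectory() as workdir:
    ckpt = os.path.join(workdir, "state.json")
    interrupted = run(25, 10, ckpt, stop_after=23)  # stops after the step-20 checkpoint
    resumed = run(25, 10, ckpt)                     # picks up at step 20

print(interrupted, resumed)
```

Because the checkpoint is an ordinary file, copying it to another machine before the second call is the toy equivalent of moving a restart from one system to another.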
29
Waveform-based Intensity Measure Relationship (CyberShake)
(1) Earthquake-Rupture Forecast (ERF): Probability of all possible fault-rupture events (M≥~5) for region & time span
(2) Intensity-Measure Relationship (IMR): Gives Prob(IMT≥IML) for a given site and fault-rupture event
– Attenuation Relationships (traditional) (no physics)
– Full-Waveform Modeling (developmental) (more physics)
30
Intensity-Measure Relationship
[Diagram: IMR component with inputs IMT, IML(s); Site(s); Rupture]
List of Supported IMTs
List of Site-Related Ind. Params
Various IMR types (subclasses):
– Attenuation Relationships: Gaussian dist. is assumed; mean and std. from various parameters
– Simulation IMRs: exceed. prob. computed using a suite of synthetic seismograms
– Vector IMRs: compute joint prob. of exceeding multiple IMTs (Bazzurro & Cornell, 2002)
– Multi-Site IMRs: compute joint prob. of exceeding IML(s) at multiple sites (e.g., Wesson & Perkins, 2002)
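How an ERF and an attenuation-style IMR combine into a hazard curve can be sketched as follows. This is not the OpenSHA implementation. It assumes the Gaussian distribution applies to the natural log of the intensity measure, that ruptures are independent, and that each rupture is described by a hypothetical tuple of (probability of occurrence in the time span, log mean, log standard deviation). The rupture values below are invented for illustration.

```python
import math


def exceedance_prob(ln_iml, ln_mean, ln_std):
    """Prob(IMT >= IML) for one rupture, assuming ln(IMT) is Gaussian."""
    z = (ln_iml - ln_mean) / ln_std
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def hazard_curve(ruptures, imls):
    """Probability that at least one rupture exceeds each IML.

    Assumes independent ruptures, each given as
    (probability of occurrence, ln mean, ln std).
    """
    curve = []
    for iml in imls:
        p_no_exceedance = 1.0
        for rup_prob, ln_mean, ln_std in ruptures:
            p_no_exceedance *= 1.0 - rup_prob * exceedance_prob(
                math.log(iml), ln_mean, ln_std
            )
        curve.append(1.0 - p_no_exceedance)
    return curve


# Hypothetical ruptures, invented for illustration only.
ruptures = [
    (0.02, math.log(0.10), 0.6),
    (0.005, math.log(0.40), 0.5),
]
imls = [0.05, 0.1, 0.2, 0.4, 0.8]
curve = hazard_curve(ruptures, imls)

for iml, p in zip(imls, curve):
    print(f"IML {iml:.2f}: probability of exceedance {p:.5f}")
```

A hazard map repeats this calculation at every site in the gridded region, which is why each map turns into thousands of independent, short-running jobs.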
31
CyberShake Simulations Push Macro and Micro Computing
CyberShake requires large forward wave propagation simulations and volumetric data storage
CyberShake requires 100k seismogram synthesis computations using multi-Terabyte volumetric data sets.
During synthesis processing, this data needs to be disk-based.
100k data files, and their metadata, need to be managed
High throughput requirements are driving implementation toward a TeraGrid-wide computing approach.
High throughput requirements are driving integration of non-TeraGrid grids with TeraGrid
32
Example CyberShake Region (200km x 200km)
USC: 34.05, -118.24
minLat=31.889, minLon=-120.60, maxLat=36.1858, maxLon=-115.70
33
CyberShake Strain Green Tensor AWM
Large (TeraShake scale) forward calculations for each site.
– SHA typically ignores ruptures > 200km from the site, so this is used as the cutoff distance.
– 20km buffer distance is used around edges of volume to reduce edge effects
– 65km depth to support frequencies of interest
– Volume is 440km x 440km x 65km at 200m spacing: 1.573 billion mesh points
Simulation time 240 seconds
– Volumetric data saved for 2 horizontal simulations
Estimated storage per site is 7 TB (4.5 TB data, 2.5 TB checkpoint files)
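The volume dimensions and mesh count follow directly from the cutoff, buffer, depth, and spacing listed on this slide. A short check, using only those figures (the variable names are illustrative):

```python
cutoff_km = 200    # ruptures farther than this from the site are ignored
buffer_km = 20     # padding around the edges to reduce edge effects
depth_km = 65
spacing_m = 200

# The site sits at the center, so each horizontal side spans cutoff + buffer twice.
side_km = 2 * (cutoff_km + buffer_km)

nx = ny = side_km * 1000 // spacing_m
nz = depth_km * 1000 // spacing_m
mesh_points = nx * ny * nz

print(side_km, nx, ny, nz, mesh_points)
```

This reproduces the 440 km side length and the 1.573 billion mesh points quoted on the slide.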
34
Ruptures in ERF within 200km of USC
43227 ruptures in Frankel02 ERF with M 5.0 or larger within 200km of USC
35
CyberShake Computational Elements
36
CyberShake Seismogram Synthesis
Requires calculation of 100,000+ seismograms for each site.
Estimated rupture variations scale by magnitude:
– Mw 5.0 x 1    =  20,450
– Mw 6.0 x 10   = 216,990
– Mw 7.0 x 100  = 106,900
– Mw 8.0 x 1000 =   9,000
– Total         = 353,340 ruptures x 2 components
Current estimated number of seismogram files per site is 43,000 (due to combining components and variations into a single file per rupture).
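The totals on this slide can be reproduced with a few lines of arithmetic. The sketch also divides each variation count by its multiplier to recover the underlying rupture count, which can be compared with the 43227 ruptures reported for the Frankel02 ERF within 200km of USC. The dictionary layout is illustrative only.

```python
# Magnitude bin -> (variations per rupture, total rupture variations), from the slide.
variations_by_magnitude = {
    5.0: (1, 20_450),
    6.0: (10, 216_990),
    7.0: (100, 106_900),
    8.0: (1000, 9_000),
}

total_variations = sum(count for _, count in variations_by_magnitude.values())
implied_ruptures = sum(count // mult for mult, count in variations_by_magnitude.values())
seismograms = total_variations * 2   # two components per rupture variation

print(total_variations, implied_ruptures, seismograms)
```

The implied rupture count matches the ERF figure, which is consistent with the estimate of roughly 43,000 seismogram files per site when components and variations are combined into one file per rupture.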
37
CyberShake Seismogram Synthesis
The seismogram synthesis stage requires disk-based storage of large volumetric data sets, so a tape-based archive of volumetric data sets does not work.
To distribute seismogram synthesis across TeraGrid, we need to either duplicate terabytes of data or have global visibility of disk systems
38
Example Hazard Curve
Site: USC
ERF: Frankel-02
IMR: Field
IMT: Peak Velocity
Time Period: 50 Years
39
Workflows Run Using Grid VDS Workflow Tools
40
Example Hazard Map
Region (50km x 50km at 2km grid spacing = 625 sites)
OpenSHA SA 1.0 Frankel 2002 ERF and Sadigh with 10% POE in 50 years.
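The site count follows from the region size and grid spacing. The sketch below assumes one site per 2km cell, placed at the cell center and expressed as kilometer offsets from a corner of the region. The slide gives only the count, so the placement is an illustrative choice.

```python
region_km = 50
spacing_km = 2

sites_per_side = region_km // spacing_km

# Assumption: one site at the center of each grid cell, as (x_km, y_km) offsets.
sites = [
    (i * spacing_km + spacing_km / 2, j * spacing_km + spacing_km / 2)
    for i in range(sites_per_side)
    for j in range(sites_per_side)
]

print(sites_per_side, len(sites), sites[0], sites[-1])
```

Each of these sites corresponds to one independent hazard curve calculation in the map workflow.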
41
Summary of SCEC Experiences
As soon as we develop a computational capability, the geophysicists develop applications that push the technology.
– Compute technology, data management technology, and resource sharing technology are all applied.
In many ways, IT capabilities required for geophysical problems exceed what is currently possible and limit the state of knowledge in geophysics and public safety.
– For example, higher frequency simulations are of significant interest, but exceed the computational and storage capabilities currently available.
42
Major Middleware Related Issues for SCEC/CME
Security and Allocation Management
– The lack of a widely accepted CA makes adding organizations to the SCEC grid problematic.
– Ability to run under group allocations for "on demand" requests. (Community Allocation?)
43
Major Middleware Related Issues for SCEC/CME
Software Installation and Maintenance
– The middleware software stack, even on supercomputer systems, should include support for micro jobs, such as Java.
– Database management support for database-oriented tools such as Metadata Catalogs is important (backup, recovery, cleanup, performance, modifications)
– Guidelines for tools in the middleware software stack should describe when local installations are required and when remote installations are acceptable for tools such as RLS and MCS
44
Major Middleware Related Issues for SCEC/CME
Supercomputing and Storage
– Globally (TeraGrid-wide) visible disk storage
– Well supported, reliable file transfers with monitoring and restart of jobs with problems are essential.
– Interoperability between grid tools and data management tools such as SRB must include data, metadata, and metadata search.
45
Major Middleware Related Issues for SCEC/CME
Scheduling Issues
– Support for reservation-based scheduling
– Partial run and restart capability
– Failure detection and alerting
46
Major Middleware Related Issues for SCEC/CME
Usability and Monitoring
– Monitoring tools that include the status of available storage resources.
– On-the-fly visualizations for run-time validation of results
– Interfaces to workflow systems are complex, developer-oriented interfaces. Easier-to-use interfaces are needed.