ARSC & IARC Report
Juanxiong He (1,2), Greg Newby (1)
(1) Arctic Region Supercomputing Center; (2) International Arctic Research Center
DOD/RACM Workshop, December 5-6, 2008
Sketch
- Analyze the CCSM4 and CPL7 code: http://www.oc.nps.edu/NAME/he_cpl7andwrf.ppt
- Integrate WRFV3 into CCSM4:
  - compilation successful
  - initialization of all components successful
  - mapping initialization successful
  - the run dies at the domain grid check
Integration Strategy
- Follow the framework of sequential CCSM4 and CPL7, since they are excellent.
- Change the physical schemes and dynamical framework of WRFV3 as little as possible.
- Use the macros CCSMCOUPLED and SEQ_MCT as switches that control whether WRFV3 runs alone or coupled with CCSM4.
- Add a module named atm_comp_mct to WRFV3 to cooperate with CCSM4.
Outline
- atm_comp_mct
- PBL issue
- Radiation
- Restart
- Parallel
- Time mechanism
- Compilation
- Case run
atm_comp_mct.F
- Public interface: atm_init_mct, atm_run_mct, atm_final_mct
- Private interface:
  - atm_SetgsMap_mct, atm_domain_mct: grid distribution across the processors
  - atm_import_mct, atm_export_mct: import and export of state variables and fluxes
  - atm_read_srfrest_mct, atm_write_srfrest_mct: reading and writing the restart file
- Variables are imported before the integration begins and exported after it finishes. As Tony suggested, this should be tested in the future.
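The call order described on this slide (import before the integration, export after it) can be illustrated with a minimal Python sketch. The class ToyAtmosphere, its toy state variable and the field name heat_flux are hypothetical and exist only for illustration; they are not part of atm_comp_mct.F.

```python
class ToyAtmosphere:
    """Hypothetical stand-in for a coupled atmosphere component."""

    def __init__(self):
        self.calls = []     # records the order in which phases run
        self.surface = {}   # fields received from the coupler
        self.state = 0.0    # toy prognostic variable

    def atm_init(self):
        self.calls.append("init")

    def _import(self, from_coupler):
        # Fields arrive from the coupler before the integration starts.
        self.calls.append("import")
        self.surface = dict(from_coupler)

    def _integrate(self, dt):
        # Toy "model step": accumulate the imported flux over dt.
        self.calls.append("integrate")
        self.state += dt * self.surface.get("heat_flux", 0.0)

    def _export(self):
        # Fields go back to the coupler after the integration finishes.
        self.calls.append("export")
        return {"state": self.state}

    def atm_run(self, from_coupler, dt):
        self._import(from_coupler)
        self._integrate(dt)
        return self._export()

    def atm_final(self):
        self.calls.append("final")


atm = ToyAtmosphere()
atm.atm_init()
to_coupler = atm.atm_run({"heat_flux": 2.0}, dt=0.5)
atm.atm_final()
print(atm.calls)
print(to_coupler)
```

Keeping import and export inside the run phase means the driver only ever sees the three public entry points, which matches the public/private split listed on the slide.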
Planetary boundary layer issue
- All WRF PBL schemes need (1) the momentum, heat and moisture similarity functions at the lowest layer; (2) the surface roughness length; (3) the surface temperature and humidity; (4) the bulk Richardson number. YSU also needs (5) the wind at 10 m.
- The coupler provides sensible heat, evaporation and wind stress, so the variables above must be derived from them (under the constraint of flux conservation).
- Surface roughness length is a very important variable for air-sea interaction, especially at high wind speeds.
YSU (MRF)
- A typical, popular and efficient K-profile method (Louis, 1979; Troen and Mahrt, 1986; Vogelezang and Holtslag, 1996; Noh et al., 2003)
- Counter-gradient transport in the mixed layer and penetration of the entrainment flux at the inversion layer
- PBL height determined from the bulk Richardson number
- Roughness length, Monin-Obukhov length, bulk Richardson number
- Primary equation
Computation I
- Friction velocity
- Temperature scale
- Virtual temperature scale
- Humidity scale
- Roughness length: Charnock equation (Beljaars, 1994)
- Monin-Obukhov length
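The quantities listed on this slide can be derived from the three coupler fluxes roughly as follows. This is a minimal sketch using standard textbook definitions, which may differ from the exact formulas used in atm_import_mct. The constants (CP, NU, CHARNOCK), the sign convention (fluxes positive upward), the form of the roughness length (a smooth-flow term plus a Charnock term) and all function names are assumptions made for illustration.

```python
import math

G = 9.81          # gravity, m/s^2
KARMAN = 0.4      # von Karman constant
CP = 1004.0       # specific heat of dry air, J/kg/K (assumed)
NU = 1.5e-5       # kinematic viscosity of air, m^2/s (assumed)
CHARNOCK = 0.018  # Charnock coefficient (assumed)


def surface_scales(tau, shf, evap, rho, theta, q):
    """Turbulent scales from coupler fluxes (fluxes positive upward, tau > 0)."""
    ustar = math.sqrt(tau / rho)        # friction velocity, m/s
    tstar = -shf / (rho * CP * ustar)   # temperature scale, K
    qstar = -evap / (rho * ustar)       # humidity scale, kg/kg
    # Virtual temperature scale built from the temperature and humidity scales.
    tvstar = tstar * (1.0 + 0.61 * q) + 0.61 * theta * qstar
    return ustar, tstar, qstar, tvstar


def roughness_length(ustar):
    """Smooth-flow term plus Charnock term, in metres."""
    return 0.11 * NU / ustar + CHARNOCK * ustar ** 2 / G


def obukhov_length(ustar, tvstar, theta_v):
    """Monin-Obukhov length in metres; infinite for neutral conditions."""
    if tvstar == 0.0:
        return math.inf
    return ustar ** 2 * theta_v / (KARMAN * G * tvstar)


# Illustrative inputs only: wind stress (N/m^2), sensible heat (W/m^2),
# evaporation (kg/m^2/s), air density, potential temperature, humidity.
ustar, tstar, qstar, tvstar = surface_scales(
    tau=0.1, shf=20.0, evap=4.0e-5, rho=1.2, theta=290.0, q=0.008)
z0 = roughness_length(ustar)
L = obukhov_length(ustar, tvstar, theta_v=290.0 * (1.0 + 0.61 * 0.008))
print(ustar, z0, L)
```

With upward heat and moisture fluxes the scales are negative and the Monin-Obukhov length comes out negative, which is the usual signature of unstable conditions.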
Computation II
- Unstable condition
- Stable condition
- Bulk Richardson number (Louis, 1979)
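A common form of the bulk Richardson number, and the stability split it implies, is sketched below. The formula is the usual surface-layer definition rather than one confirmed by the slide; the minimum wind speed floor min_wind and the function names are assumptions for illustration.

```python
G = 9.81  # gravity, m/s^2


def bulk_richardson(z, theta_v_air, theta_v_sfc, wind_speed, min_wind=0.1):
    """Bulk Richardson number between the surface and height z (metres)."""
    # Floor the wind speed so calm conditions do not divide by zero (assumed).
    speed = max(wind_speed, min_wind)
    return G * z * (theta_v_air - theta_v_sfc) / (theta_v_air * speed ** 2)


def regime(rb):
    """Classify the surface layer from the sign of the bulk Richardson number."""
    if rb < 0.0:
        return "unstable"
    if rb > 0.0:
        return "stable"
    return "neutral"


# Warm surface under cooler air: buoyancy produces turbulence.
rb_warm_sfc = bulk_richardson(z=30.0, theta_v_air=290.0, theta_v_sfc=292.0,
                              wind_speed=5.0)
# Cold surface under warmer air: buoyancy suppresses turbulence.
rb_cold_sfc = bulk_richardson(z=30.0, theta_v_air=290.0, theta_v_sfc=288.0,
                              wind_speed=5.0)
print(rb_warm_sfc, regime(rb_warm_sfc))
print(rb_cold_sfc, regime(rb_cold_sfc))
```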
Computation III
- Pressure point
- Halo region: exchange u, v before every computation step (halo_EM_couple_in_A.inc)
List of changed files
- All the PBL-related computation is done in the subroutine atm_import_mct.
- Change gz1oz0, br, psim and psih from i1 type to state type in Registry/Registry.EM.
- Add the following entry to Registry.EM:
  halo HALO_EM_COUPLE_IN_A main 4:u_2,v_2
- Change solve_em.F, module_first_rk_step_part1.F and module_first_rk_step_part2.F, only to delete the dummy arguments related to gz1oz0, br, psim and psih.
Atmosphere radiation issue
- Use the CAM package (ra_lw_physics=3, ra_sw_physics=3).
- Add asdir, aldir, asdif and aldif as subroutine input variables (if they are not provided, set all of them equal to albedo).
- Release swndr, swvdr, swndf and swvdf as subroutine output variables.
- Unlock the CAM-related variables in Registry.EM.
- Add asdir, aldir, asdif, aldif, swndr, swvdr, swndf and swvdf to Registry.EM as state variables.
- Change module_first_rk_step_part1.F, module_radiation_driver.F and module_ra_cam.F to add the corresponding dummy arguments.
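The fallback rule for the four albedo inputs (use the coupler's values when present, otherwise the single albedo) can be sketched as below. The names asdir, aldir, asdif and aldif come from the slide; the function resolve_albedos and the numerical values are hypothetical and only illustrate the rule.

```python
def resolve_albedos(albedo, asdir=None, aldir=None, asdif=None, aldif=None):
    """Return the four band albedos, falling back to the single albedo."""
    supplied = {"asdir": asdir, "aldir": aldir, "asdif": asdif, "aldif": aldif}
    # Any band the caller did not supply takes the broadband value.
    return {name: albedo if value is None else value
            for name, value in supplied.items()}


# Stand-alone run: no band albedos available, so all four equal albedo.
uncoupled = resolve_albedos(0.2)
# Coupled run: the coupler supplies each band separately.
coupled = resolve_albedos(0.2, asdir=0.06, aldir=0.07, asdif=0.06, aldif=0.07)
print(uncoupled)
print(coupled)
```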
Restart issue
- Based on the restart mechanism in WRFV3.
- Add two new files to WRFV3: wrf_restart_couplein.F and wrf_restart_coupleout.F.
- Add code to and revise module_io_domain.F, output_wrf.F, input_wrf.F and module_io_wrf.F.
- Enhance the Registry framework:
  - Revise the Registry tools: gen_wrf_io.c, reg_parse.c, data_.h and registry.h.
  - Add a new attribute 'c' for reading and writing the restart coupling file in the Registry framework.
  - Add several new entries of rconfig type to the Registry, such as io_form_restart_couple, rst_couple_inname and rst_couple_outname.
Parallel issue
- Most MPI activity in WRF takes place in the "local_communicator" group.
- The subroutine split_communicator in module_dm initializes the communicator first.
- Block the MPI initialization and hand the MPI communicator to WRF in subroutine split_communicator:
  wrf_set_dm_communicator( mpi_communicator_atm)
- Pass the communicator from CCSM4 to WRF in atm_init_mct:
  call seq_cdata_setptrs(cdata_a, ID=ATMID, mpicom=mpicom_atm, &
       gsMap=gsMap_atm, dom=dom_a, infodata=infodata)
  mpi_communicator_atm=mpicom_atm
List of changed files
- Revise and add new source code in WRFV3/frame/init_module.F, WRFV3/frame/module_io_quilt.F and WRFV3/external/RSL_LITE/module_dm.F.
- Add a new file, module_atm_communicator.F, to WRFV3/frame:
  module module_atm_communicator
    integer, public :: mpi_communicator_atm
  end module module_atm_communicator
- Change WRFV3/frame/Makefile.
Timer issue
- CCSM4 uses two sets of clocks: an outer one for the driver and an inner one for each component.
- Both the WRFV3 internal timer and the CCSM4 driver timer are based on ESMF, and most of their modules, subroutines, functions and variables are the same.
- There are many minor differences between the WRF and ESMF timers: some entities have the same name but different functions, others have different names but the same function.
- Do not change the framework of the two timers; just rename some entities to avoid name and use-association conflicts, and resolve the differences.
Resolve the differences
- To avoid name conflicts, rename some CCSM4-related functions and structures in seq_ccsm_drv.F90 and atm_comp_mct.F:
  use ESMF_MOD, CCSM_Clock=>ESMF_Clock
  use ESMF_MOD, CCSM_time_initialize=>ESMF_initialize
  use ESMF_MOD, CCSM_time_finalize=>ESMF_finalize
- To avoid use-association conflicts between functions and subroutines, rename all the ESMF modules in WRFV3 to the *_WRF form. For example, ESMF.mod -> ESMF_WRF.mod.
List of changed files
- All the files under the directory external/esmf_time_f90
- external/io_esmf/module_symbol_utils.F90, share/dfi.F
- seq_ccsm_drv.F90
Other very minor changes to ESMF in CCSM4
- Comment out the line related to timeintchecknormalize in ESMF_TimeIntervalget.
- Seq_timeclockmgr_clockInit does not seem to handle seq_timemgr_alarm_datestop and seq_timemgr_alarm_history correctly in the real-time condition: it gives Interval ymd < 0, and then seq_timemgr_clockPrint crashes.
- Change date=date-off to date=date+off in the functions get_curr_calday and get_curr_date in the lnd component. It seems to be a typo: for a real-time run, it gives the beginning Julian day as 366.97……, and the run then stops.
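The arithmetic behind a starting calendar day of 366.97 can be reproduced with a short sketch. It assumes a Gregorian calendar, a run starting at 2001-01-01 00:00 and an offset of 1800 s; the offset value is an assumption (the slides do not say which offset is passed), and calendar_day is a hypothetical helper, not the CLM routine.

```python
from datetime import datetime, timedelta


def calendar_day(t):
    """Calendar day of the year, where 1.0 means 1 January 00:00."""
    start_of_year = datetime(t.year, 1, 1)
    return 1.0 + (t - start_of_year).total_seconds() / 86400.0


start = datetime(2001, 1, 1)   # start of the run
off = timedelta(seconds=1800)  # assumed offset

added = calendar_day(start + off)       # stays in 2001
subtracted = calendar_day(start - off)  # falls back into 31 December 2000
print(added)
print(subtracted)
```

Subtracting the offset moves the date to 23:30 on 31 December 2000, a leap year, so the calendar day becomes 366 plus a fraction of about 0.979; adding it keeps the date just after the start of 1 January.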
ATM outside clock structure
(seq_timemgr_clockPrint) Clock = atm 2
(seq_timemgr_clockPrint) Start Time = 20010101 00000
(seq_timemgr_clockPrint) Curr Time = 20010101 00000
(seq_timemgr_clockPrint) Ref Time = 20010101 00000
(seq_timemgr_clockPrint) Stop Time = 20010106 00000
(seq_timemgr_clockPrint) Step number = 0
(seq_timemgr_clockPrint) Dtime = 1800
(seq_timemgr_clockPrint) Alarm = 1 seq_timemgr_alarm_run
(seq_timemgr_clockPrint) Alarm = 2 seq_timemgr_alarm_stop
(seq_timemgr_clockPrint) Alarm = 3 seq_timemgr_alarm_datestop
(seq_timemgr_clockPrint) Prev Time = 00000000 00000
(seq_timemgr_clockPrint) Next Time = 20010106 00000
(seq_timemgr_clockPrint) Intervl yms = 9999 0 -1795851392
(seq_timemgr_clockPrint) Alarm = 4 seq_timemgr_alarm_restart
(seq_timemgr_clockPrint) Alarm = 5 seq_timemgr_alarm_history
(seq_timemgr_clockPrint) Prev Time = 00001130 00000
(seq_timemgr_clockPrint) Next Time = 99991201 00000
(seq_timemgr_clockPrint) Intervl yms = 9999 0 -1795851392
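One arithmetic observation about the negative seconds field in this printout, offered as a hedged aside rather than a confirmed diagnosis: the value -1795851392 is exactly what results when minus 9999 years, counted as 365-day years in seconds, is wrapped into a signed 32-bit integer. The 365-day year and the helper wrap_int32 are assumptions for illustration.

```python
def wrap_int32(value):
    """Wrap an integer into the signed 32-bit range, as two's complement does."""
    return (value + 2 ** 31) % 2 ** 32 - 2 ** 31


# 9999 years of 365 days each, expressed in seconds (assumed calendar).
seconds_in_9999_years = 9999 * 365 * 86400
wrapped = wrap_int32(-seconds_in_9999_years)
print(seconds_in_9999_years)
print(wrapped)
```

The match suggests, but does not prove, that a 9999-year interval is being held in a 32-bit seconds field somewhere along the way.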
Compile issue
- Thread=1 in CCSM4 at present; no OpenMP.
- Use no optimization and no OpenMP in WRFV3; otherwise the compilation and linking of seq_ccsm_drv may crash.
- Change configure.wrf, the top-directory Makefile, share/Makefile, main/Makefile, frame/Makefile and other files that set up the compilation environment variables in WRFV3.
- Keep the name Buildexe/cam.buildexe.csh, but replace its content entirely.
cam.buildexe.csh
#! /bin/csh -f
……
# for atm_mct_comp and WRF compiling
./compile em_real
# prepare library file for seq_ccsm_drv compiling
cd $CODEROOT/WRFV3/external/esmf_time_f90
ar ru $CODEROOT/WRFV3/main/libwrflib.a *.o
……
# copy file to $LIBROOT
……
cp -p main/libwrflib.a $LIBROOT/libatm.a
# prepare namelist, parameter tables and initial dataset for WRF run
…
cp $CODEROOT/WRFV3/run/ETAMPNEW_DATA .
cp $CODEROOT/WRFV3/run/namelist.input .
cp $CODEROOT/WRFV3/run/wrfbdy_d01 .
cp $CODEROOT/WRFV3/run/wrfinput_d01 .
A dirty case
- Follows the CCSM4 f4x5_g3x5 case.
- WRFV3: 69x48x27, global, timestep=900s, 20010101:00-00-00 to 20010106:00-00-00.
- Compilation is successful, all components initialize successfully, and most of the coupling subroutines work.
- The program dies when running seq_domain_check_mct, which checks whether the atmosphere and land grids are the same. The reason seems to be that the atmosphere and land grids are not the same in the WRF case.
- New remapping files must be generated with SCRIP. CCSM4 now supports a different grid for each component.
- Code directory: /wrkdir/jhe/ccsm4
- Case compilation, run and output directory: /wrkdir/jhe/case2
A snapshot of ccsm.log.081203-164400
………
plat= 48 12 nextsw_cday = 1.000000000000000
--------------------- atm_init_mct finish ------------------
(seq_mct_drv) : Initialize lnd component
8 pes participating in computation for CLM
………
Solver options (barotropic solver)
……..
Preconditioner choice: diagonal
……
eps= -438.8614196777344 data2= data1= n=
Annotations: atm_init_mct finishes; lnd_init_mct finishes; ocn begins integration; seq_domain_check_mct
A snapshot of lnd.log.081203-164400
All fields on history tape 1 are grid averaged
Number of time samples on history tape 1 is 1
Output precision on history tape 1 = 2
hist_htapes_build Successfully initialized clm2 history files
------------------------------------------------------------
Successfully initialized the land model
begin initial run at:
nstep= 0 year= 2001 month= 1 day= 1 seconds= 0
************************************************************
Attempting to read monthly vegetation data.....
nstep = 0 month = 1 day = 1
……….
Successfully read monthly vegetation data for
……….
A snapshot of ocn.log.081203-164400
Constants used in this run:
……….
stefan_boltzmann = 5.670000000000000E-08 W/m^2/K^4
latent_heat_vapor = 2.501000000000000E+06 J/kg
latent_heat_fusion = 3.337000000000000E+09 erg/g
ocn_ref_salinity = 3.470000000000000E+01 psu
sea_ice_salinity = 4.000000000000000E+00 psu
T0_Kelvin = 2.731500000000000E+02 K
pi = 3.141592653589793E+00
End of initialization
========================================================================
ocn_init_mct: iday0 = 2 start_day= 1.
A snapshot of ice.log.081203-164400
read_global 15 0 -18764.24097909558 32433.52928192394
ice mask for dynamics
read_global 15 0 0.000000000000000 1.000000000000000
Finished writing case2.cice.i.0000-01-01-00000.nc
(ice_init_mct) idate from sync clock = 20010101
(ice_init_mct) tod from sync clock = 0
(ice_init_mct) resetting idate to match sync clock
istep1: 0 idate: 20010101 sec: 0
Next step
- Merge with VIC and move toward RACM.
- Generate new remapping files for atmosphere, ocean and land, and test them.
- Test the restart issue.