The standard job xcpsa now bit-compares across PE configurations and restarts for a 3-day run with daily dumps. This has been tested on HPCx, quest and Hector with both the PGI and PathScale compilers.
For bit-comparison the following changes are required:
For the quest cluster:
The old version of ctile_09_new_param contained a bug which meant that polar values for the heat flux through sea ice (3201) were overwritten with inconsistent values. 3201 is also a coupling field. The overwriting was caused by calls in BL_CTL to POLAR_UV for diagnostics which weren’t included in the job. The solution was to put if statements around these calls.
Standard robustness tests involve varying the following:
Set the restart frequency to 1 day and the run length to 3 days. Then run two tests:
Compare dumps using cumf.
Set up a 1-day run with a 1-day restart dump. Then run the job on several arrangements of processors. It’s best to vary the EW number, the NS number and the overall number of processors, e.g.:
Compare dumps using cumf.
If fields don’t compare, set the model to write extra dumps at various points in the code to narrow down where the output first diverges. Then work out which array points differ and step through the code, writing out values at those points.
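Once a diverging field has been written out from two runs, the differing array points can be located by comparing bit patterns rather than values. The Python sketch below assumes the field has already been read into nested lists of floats (rows by columns); the helper names are hypothetical and it is not a description of how cumf works.

```python
import struct


def bits(x):
    """Return the 64-bit IEEE 754 pattern of a float as an integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def differing_points(field_a, field_b):
    """Return (row, col, a, b) for each point whose bit patterns differ."""
    if len(field_a) != len(field_b):
        raise ValueError("fields have different numbers of rows")
    diffs = []
    for j, (row_a, row_b) in enumerate(zip(field_a, field_b)):
        if len(row_a) != len(row_b):
            raise ValueError(f"row {j} has different lengths")
        for i, (a, b) in enumerate(zip(row_a, row_b)):
            # Compare bits, so 0.0 and -0.0 count as different.
            if bits(a) != bits(b):
                diffs.append((j, i, a, b))
    return diffs
```

Comparing bit patterns instead of using `==` catches differences that an ordinary floating-point comparison would hide, such as a change in the sign of zero. The returned indices give the points at which to write out values when stepping through the code.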
Further testing
Some calculations are not done every timestep, so it can sometimes be useful to repeat these tests over longer periods of time.