FAMOUS

Bit comparison

The standard job xcpsa now bit-compares across PE configurations and restarts for a 3-day run with daily dumps. This has been tested on HPCx, quest and HECToR with both the PGI and PathScale compilers.


Fixes

For bit-comparison the following changes are required:

For the quest cluster:

The old version of ctile_09_new_param contained a bug which meant that polar values for the heat flux through sea ice, 3201, were overwritten with inconsistent values. 3201 is also a coupling field. The overwriting was caused by calls in BL_CTL to POLAR_UV for diagnostics which weren’t included in the job. The solution was to put some IF statements around these calls.


Tests

Standard robustness tests involve varying the following:

  1. restart frequency
  2. number of processors

Restart Test

Set the restart frequency to 1 day and the run length to 3 days. Then run two tests:

Compare dumps using cumf.
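
Before running cumf, a quick byte-level check can show whether two dump files are identical at all. The sketch below is only a pre-check and is not part of the original procedure: the function names and the file paths in the usage comment are hypothetical, and it assumes the dumps carry no run-specific metadata. If the digests differ, that does not by itself show that any field differs, so cumf remains the comparison to trust.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large dumps need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def identical(path_a, path_b):
    """True if the two files have the same contents byte for byte."""
    return file_digest(path_a) == file_digest(path_b)


# Hypothetical usage (paths are placeholders, not real dump names):
#   identical("run_a/dump_day3", "run_b/dump_day3")
```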

Processor Test

Set up a 1 day run with a 1 day restart dump. Then run the job on several arrangements of processors. It’s best to vary the EW number, NS number and overall number of processors, eg:

Compare dumps using cumf.
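
The processor arrangements for this test can be listed with a short script. This is a minimal sketch: the function name and the processor counts (4 and 8) are illustrative only, and it assumes every EW x NS factorisation is a valid decomposition for the job, which may not hold for a given resolution.

```python
def decompositions(total):
    """Return every (ew, ns) pair whose product equals total."""
    return [(ew, total // ew) for ew in range(1, total + 1) if total % ew == 0]


if __name__ == "__main__":
    # Illustrative processor counts; pick ones that suit the machine.
    for total in (4, 8):
        for ew, ns in decompositions(total):
            print(f"total={total} EW={ew} NS={ns}")
```

Running the same job over a list like this varies the EW number, the NS number and the overall count, rather than only one of them.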

Debugging

If fields don’t compare, set the model to write extra dumps at various points in the code to narrow down where the output first diverges. Then work out which array points differ and step through the code, writing out values.
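
Once values from two runs have been written out, finding the array points that differ can be sketched as below. The function names and sample values are hypothetical, and the sketch assumes 64-bit reals. It compares bit patterns rather than using ==, so that 0.0 and -0.0 count as different, which is what a bit comparison requires.

```python
import struct


def bits(value):
    """Return the 64-bit pattern of a float as an unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]


def differing_points(a, b):
    """Return (index, a_value, b_value) for each point whose bits differ."""
    if len(a) != len(b):
        raise ValueError("arrays must have the same length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if bits(x) != bits(y)]


if __name__ == "__main__":
    # Made-up sample values standing in for output from two runs.
    run_a = [1.0, 0.1 + 0.2, 0.0]
    run_b = [1.0, 0.3, -0.0]
    for i, x, y in differing_points(run_a, run_b):
        # float.hex() shows the exact value, so last-bit differences are visible.
        print(i, x.hex(), y.hex())
```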

Further testing

Some calculations are not done every timestep, so it can be useful to repeat these tests over longer periods of time.

Page last modified on August 11, 2008, at 01:51 PM by annette