FAMOUS

This page is dedicated to running FAMOUS using reconfigurations, although it might help in more general cases. Please refer to the general and introductory guides and notes if you haven’t done so.

Much of the information on this page is based on the answers Robin Smith and Annette Osprey gave to my questions. Many thanks to both of them.

This page is meant to be as foolproof as possible and so is inevitably lengthy. Don’t tell me to tidy it up.



Contents

0. set environmental constants
1. Reconfigure Atmosphere
2. Reconfigure Ocean
3. Compile
4. Run
5. Submit a continuation run


0. set environmental constants

You do not have to make these settings yourself, but you need to know that I made them in order to understand the rest of this page.



1. Reconfigure Atmosphere

1.1. UMUI: [submodel indep]-[compil+modif]-[Compile Options]

1.2. UMUI: [submodel indep]-[compil+modif]-[mod for reconfig]

1.3. UMUI: [submodel indep]-[Job submission, resources…]

1.4. UMUI: [submodel indep]-[Gen config control]

1.5. UMUI: [Atmos]-[Ancil+input]-[start dump]

1.6. UMUI: [Ocean]-[input files]-[start dump]

1.7. UMUI: press <save> and then <process>


1.8. Manually post-process on puma if necessary (see section [2] of NoteFamousQuest)

  jobid=xxxxx	# don't forget to set jobid first!
  echo $jobid	# check jobid!

  /home/famous/bin/he_namelist_new_phase5 $jobid
  /home/famous/bin/vfdrift_pp.sh $jobid
  /home/famous/bin/quest_queue.sh $jobid
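
Because all three scripts take $jobid as their only argument, it may be worth guarding against an unset or mistyped value before running them. The following is only a sketch: the function name require_jobid is hypothetical, and the check assumes that a job ID is exactly five lowercase letters or digits (as suggested by "jobid=xxxxx" above).

```shell
# Hypothetical helper: succeed only if $jobid looks like a job ID.
# ASSUMPTION: a job ID is exactly five lowercase letters or digits.
require_jobid() {
  case "${jobid:-}" in
    [a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9])
      return 0 ;;
    *)
      echo "jobid is unset or malformed: '${jobid:-}'" >&2
      return 1 ;;
  esac
}

# Usage (not executed here):
#   require_jobid && /home/famous/bin/he_namelist_new_phase5 "$jobid"
```

If your job IDs follow a different pattern, adjust the case pattern accordingly.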

1.9. Submit

  UMUI: press [SUBMIT], OR
  puma: ~jeff/bin/umsubmit -h quest-hpc.bris.ac.uk -u $USER -r scp $jobid , OR
  puma: clustersubmit -c n -s y -r quest $jobid

1.10. Save the original astart file


1.T. troubleshooting in atmospheric reconfiguration

(1) In my initial attempts, a $jobid.astart file was not created. The .leave file looked normal in terms of its size, but looking inside it revealed some problems. The “Completion code” was 134, and I suppose anything other than 0 indicates an error. Just above that there are lines like:

Abort
/exports/gpfsbig/um/PUM64/um/vn4.5/scripts/qsprelim: Error in dump reconfiguration - see OUTPUT

So clearly there was an error. Near the end of the file, I found what was wrong:

ERROR : Reconfiguration CONTROL
No of land points in output Land-sea mask = 770
No of land points specified in namelist RECON = 836
Please reprocess the job with the correct number of land points in UMUI panel
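
Since the file size alone does not reveal a failed reconfiguration, the completion code can be pulled out of the .leave file directly. This is a minimal sketch: the function names are hypothetical, and it assumes the line has the form "Completion code : 134" as in the excerpts on this page, and that 0 means success (which is my supposition above, not something I have confirmed).

```shell
# Print the last "Completion code" value found in a .leave file.
# ASSUMPTION: the line looks like "Completion code : 134".
completion_code() {
  sed -n 's/^[[:space:]]*Completion code[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" |
    tail -n 1
}

# Succeed only if a completion code was found and it is 0.
# ASSUMPTION: 0 means success, anything else means failure.
leave_ok() {
  [ "$(completion_code "$1")" = "0" ]
}

# Usage (not executed here):
#   leave_ok some_file.leave || echo "reconfiguration failed, read the .leave file"
```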

(2) With (1) fixed, a larger .astart file is created (2064384 B; I later found that this is not quite the right size, which is 2080768 B). It looks ‘mostly’ alright, but the .leave file still doesn’t seem quite happy:

Abort
/exports/gpfsbig/um/PUM64/um/vn4.5/scripts/qsprelim: Error in dump reconfiguration - see OUTPUT
Completion code : 134
…………<skipped>…………
No Sea Ice Temperature in input dump
Sea Ice Temperature being initialised.
Processing Field 119 Stash Code= 408 : SEA-ICE SURFACE TEMP AFTER TIMESTEP
Warning - non-constant polar row for field 119
Problem with reading T* field.
Error detected in subroutine CONTROL
…………<unprintable characters skipped>…………
while doing I/O on unit 21

This problem is at least circumvented (it may or may not be properly fixed) by the following procedure:

original:
              IF (ICODE.GT.0) THEN
                WRITE (6,*) ' Problem with reading T* field.'
                CALL ABORT_IO ('CONTROL',CMESSAGE,ICODE,NFTOUT)
              ENDIF
changed:
              IF (ICODE.GT.0) THEN
                if (ICODE.eq.1501) then !!!
                  write(6,*) ' Polar rows not constant in T*.' !!!
     &                     //' This is probably not a problem.' !!!
                else !!!
                  WRITE (6,*) ' Problem with reading T* field.'
                  CALL ABORT_IO ('CONTROL',CMESSAGE,ICODE,NFTOUT)
                end if !!!
              ENDIF
quest1:/exports/gpfsbig/home/$USER/DUMP2HOLD/um/$jobid/code/exec_build/qxrecon_dump_dir% cp qxrecon_dump qxrecon_dump.org
quest1:/exports/gpfsbig/home/$USER/DUMP2HOLD/um/$jobid/code/exec_build/qxrecon_dump_dir% make
quest1:/exports/gpfsbig/home/$USER/DUMP2HOLD/um/$jobid/code/exec_build/qxrecon_dump_dir% ls -ltr
…………………………………………
-rwxr-xr-x 1 $USER users 1380231 Feb 8 17:08 qxrecon_dump.org
-rwxr-xr-x 1 $USER users 1380423 Feb 8 17:15 qxrecon_dump
quest1:/exports/gpfsbig/home/$USER/DUMP2HOLD/um/$jobid/code/exec_build/qxrecon_dump_dir% cp qxrecon_dump /exports/gpfsbig/home/$USER/DUMP2HOLD/um/$jobid/dataw/
quest1:/exports/gpfsbig/home/$USER/DUMP2HOLD/um/$jobid/code/exec_build/qxrecon_dump_dir% cd /exports/gpfsbig/home/$USER/DUMP2HOLD/um/$jobid/dataw
quest1:/exports/gpfsbig/home/$USER/DUMP2HOLD/um/$jobid/dataw% mv $jobid.recon $jobid.recon.old
quest1:/exports/gpfsbig/home/$USER/DUMP2HOLD/um/$jobid/dataw% mv qxrecon_dump $jobid.recon
quest1:/exports/gpfsbig/home/$USER/DUMP2HOLD/um/$jobid/dataw% ls -l
…………………………………………
-rwxr-xr-x 1 $USER users 1380423 Feb 8 17:15 $jobid.recon
-rwxr-xr-x 1 $USER users 1380231 Feb 7 17:36 $jobid.recon.old

This seems to fix the problem: a trouble-free .astart file is created, and I have confirmed that a job using this .astart file runs successfully.
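
A quick byte-count comparison can catch the "not quite the right size" case from (2) before the dump is used. The sketch below uses a hypothetical helper name, has_size. The value 2080768 is the size I found for my configuration, so treat it as an assumption that may not hold for yours.

```shell
# Succeed if the file exists and has exactly the expected number of bytes.
has_size() {
  [ -f "$1" ] || return 1
  # wc -c may pad its output with spaces on some systems, so strip them.
  actual=$(wc -c < "$1" | tr -d '[:space:]')
  [ "$actual" -eq "$2" ]
}

# Usage (not executed here); 2080768 is the size quoted in (2):
#   has_size "$jobid.astart" 2080768 || echo "unexpected .astart size"
```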

(3) I was successful in the previous job, but in the new job the .astart file is not created. This seems to be a different problem from (1) or (2). The .leave file says:

/exports/gpfsbig/um/PUM64/um/vn4.5/scripts/qsprelim[779]: /exports/gpfsbig/work/bristol/$USER/um/$jobid/dataw/$jobid.recon: not found
/exports/gpfsbig/um/PUM64/um/vn4.5/scripts/qsprelim: Error in dump reconfiguration - see OUTPUT
*****************************************************************
Ending script : qsprelim
Completion code : 127

This happened because $jobid.recon was not created, possibly because you copied the previous job to make the current one and did not undo the setting made in (2). Now that I have added this to the procedure (1.2), you are less likely to experience this problem. Anyway, here is how to tackle it:

[If you had a different kind of trouble in atmospheric reconfiguration and resolved it, please add the information about it here (or anywhere else) and share it with other users.]


2. Reconfigure Ocean

If you started from the atmospheric reconfiguration, 2.1 to 2.3 should already be set correctly. In that case start from 2.4.

2.1. UMUI: [submodel indep]-[compil+modif]-[Compile Options]

2.2. UMUI: [submodel indep]-[Job submission, resources…]

2.3. UMUI: [submodel indep]-[Gen config control]

2.4. UMUI: [Atmos]-[Ancil+input]-[start dump]

2.5. UMUI: [Ocean]-[input files]-[start dump]

2.6. UMUI: press <save> and then <process>

2.7. manually post process on puma if necessary (see 1.8)

2.8. Submit

  UMUI: press [SUBMIT], OR
  puma: ~jeff/bin/umsubmit -h quest-hpc.bris.ac.uk -u $USER -r scp $jobid , OR
  puma: clustersubmit -c n -s y -r quest $jobid

2.9. Save the original ostart file

  echo $jobid	# check jobid!
  cp $jobid.ostart $jobid.ostart.ini
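
If you repeat the reconfiguration, a plain cp would overwrite the saved original with the newer file. A slightly more defensive variant is sketched below. The function name save_original is hypothetical, and the .ini suffix simply follows the command above.

```shell
# Copy FILE to FILE.ini, but never overwrite an existing backup.
save_original() {
  src=$1
  backup=$src.ini
  if [ ! -f "$src" ]; then
    echo "no such file: $src" >&2
    return 1
  fi
  if [ -e "$backup" ]; then
    echo "keeping existing backup: $backup" >&2
    return 0
  fi
  cp -p "$src" "$backup"   # -p preserves the timestamp of the original
}

# Usage (not executed here):
#   save_original "$jobid.ostart"
```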

2.T. troubleshooting in ocean reconfiguration

(1) $jobid.ostart is only about 2.8 MB and the completion code in the .leave file is 134.
The completion code shows that the reconfiguration was unsuccessful. Near the bottom of the file I found a line indicating what was wrong:

  *ERROR* Stash code  103 not found on input file

If you look at STASH (UMUI:[Ocean]-[STASH]-[STASH. Specification…]) you will see what 103 is for (it is ‘OCN EXTRASER 1: CONVEN TCO2’).

This error occurred because ocean chemistry was turned on in the model but the corresponding fields were not in the ocean start dump file. If it is OK to turn off ocean chemistry, do so in UMUI:[Ocean]-[Scientific Parameters]-[Carbon Cycle]. Also disable extra tracers in UMUI:[Ocean]-[Tracers]-[User Defined Tracers]. Then go back to 2.6 and try again. I got a .ostart of 7274496 B, which is much smaller than with ocean chemistry turned on (~15 MB), but this turned out to be correct.

If you do need the chemistry, you’ll need some fields compatible with the rest of the restart.
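
To see at a glance which fields the reconfiguration could not find, the stash codes can be extracted from the .leave file. This sketch uses a hypothetical function name and assumes that every missing field produces a line of the same form as the one quoted above. I have only ever seen the one for code 103.

```shell
# List the distinct stash codes reported as missing, one per line, sorted.
# ASSUMPTION: each missing field is reported as
#   "*ERROR* Stash code  NNN not found on input file"
missing_stash_codes() {
  sed -n 's/.*\*ERROR\* Stash code[[:space:]]*\([0-9][0-9]*\) not found on input file.*/\1/p' "$1" |
    sort -n -u
}

# Usage (not executed here):
#   missing_stash_codes some_file.leave
```

Each code can then be looked up in the STASH panel as described above.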

[If you had a different kind of trouble in ocean reconfiguration and resolved it, please add the information about it here (or anywhere else) and share it with other users.]


3. Compile

3.1. UMUI: [submodel indep]-[compil+modif]-[Compile Options]

3.2. UMUI: [submodel indep]-[Job submission, resources…]

3.3. UMUI: [submodel indep]-[Gen config control]

3.4. UMUI: [Atmos]-[Ancil+input]-[start dump]

3.5. UMUI: [Ocean]-[input files]-[start dump]

3.6. UMUI: press <save> and then <process>

3.7. manually post process on puma if necessary (see 1.8)

3.8. Submit

  UMUI: press [SUBMIT], OR
  puma: ~jeff/bin/umsubmit -h quest-hpc.bris.ac.uk -u $USER -r scp $jobid , OR
  puma: clustersubmit -c n -s y -r quest $jobid

4. Run

4.1. UMUI: [submodel indep]-[compil+modif]-[Compile Options]

4.2. UMUI: [submodel indep]-[Job submission, resources…]

4.3. (same as 3.3) UMUI: [submodel indep]-[Gen config control]

4.4. (same as 3.4) UMUI: [Atmos]-[Ancil+input]-[start dump]

4.5. (same as 3.5) UMUI: [Ocean]-[input files]-[start dump]

4.6. UMUI: press <save> and then <process>

4.7. manually post process on puma if necessary (see 1.8)

4.8. Submit

  UMUI: press [SUBMIT], OR
  puma: ~jeff/bin/umsubmit -h quest-hpc.bris.ac.uk -u $USER -r scp $jobid , OR
  puma: clustersubmit -c n -s y -r quest $jobid

qsub: invalid option -- x
qsub: invalid option -- s
usage: qsub [-a date_time] [-A account_string] [-b secs]
[-c c[=<INTERVAL>] ] [-C directive_prefix] [-d path] [-D path]
[-e path] [-h] [-I] [-j oe] [-k {oe}] [-l resource_list] [-m {abe}]
[-M user_list] [-N jobname] [-o path] [-p priority] [-q queue] [-r y|n]
[-S path] [-u user_list] [-X] [-W otherattributes=value...] [-v variable_list]
[-V ] [-z] [script]

If UMUI [SUBMIT] button is clicked or umsubmit is used

If clustersubmit is used

4.9. Run

4.10. check the status of the run on quest

  qstat		# check all jobs running on quest
  qstat -u $USER	# check your jobs only

4.T. troubleshooting in running the model

(1) The run stopped after a few tens of seconds. Near (not quite at) the end, the output file (~/umui_runs/[jobid]-012345678/[jobid]000.o1234) said “LSEGF NOT LARGE ENOUGH.”

Robin says: This comes from the subroutine that sorts out some basic stuff for the fourier filtering of the high latitude lines. It looks like the relevant loop goes through the map line by line, looking for separate areas. The maximum number of areas allowed for this procedure is set in the UMUI - it was 6, but the map has one line that needs 7.

(2) The run stopped after tens of years of simulation time.
As always, look in the output file ([jobid]000.o1234 or [jobid]***.leave). I got the following message:

  Model aborted with error code -    1 Routine and message:-
                          P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED.

This is a common error and I have had it many times. It seems the simulated climate went unstable and the simulation died.

Robin says: Often this means that something in the boundary conditions has pushed the climate past the model’s capability - either too hot, or too cold somewhere. Carefully check the model output fields and look for things that are out of place. Sometimes, however, this seems to happen in FAMOUS with perfectly normal climates - we’ve never worked out why.
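
Both failure messages quoted in this section can be searched for directly, which saves scrolling through a long output file. The sketch below recognises only those two messages. The function name and the keywords it prints are hypothetical, and an empty result does not mean that the run was healthy.

```shell
# Print one keyword per known failure message found in a model output file.
# Only the two messages quoted on this page are recognised.
known_failures() {
  grep -q 'LSEGF NOT LARGE ENOUGH' "$1" && echo 'LSEGF'
  grep -q 'NEGATIVE PRESSURE VALUE CREATED' "$1" && echo 'NEGATIVE_PRESSURE'
  return 0
}

# Usage (not executed here):
#   known_failures some_output_file
```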

In previous versions of FAMOUS, this problem could be overcome simply by resubmitting the job. However, this version is bit-reproducing, so simply resubmitting the job will result in exactly the same error.

Robin continues: (If there is nothing wrong in the climate,) make a small perturbation to the climate and restart, and it’ll run fine. The easiest way to do this is to reconfigure the atmosphere dump (this adjusts some of the coastal tiling fields), although sometimes I change the date on the previous year’s ocean dump (again, by reconfiguring) and use that to restart.

So basically what you can do is use the latest atmospheric and ocean dumps as initial dumps, reconfigure, and run again.

[If you had a different kind of trouble in running the model and resolved it, please add the information about it here (or anywhere else) and share it with other users.]


5. Submit a continuation run

I have heard of a few ways to submit a continuation run.

Common for all methods

Do 5.1 and 5.2 if necessary.

5.1. UMUI: [submodel indep]-[start date + run length options]

5.2. repeat 4.1~4.5.

Then do one of (1)~(4).

(1) Use clustersubmit

  puma: clustersubmit -c y -s y -r quest $jobid

Here the -c flag specifies whether this is a continuation run (y) or not (n), -s specifies whether to submit (y) or only copy the files over (n), and -r specifies the target machine (e.g. quest, ormen, etc.).
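
Since the only difference between a new run and a continuation run is the value of -c, a small wrapper can make the choice explicit. This is only a sketch: clustersubmit_cmd is a hypothetical name, it uses only the three flags described above, and it prints the command instead of running it so that you can check it first.

```shell
# Print (do not run) the clustersubmit command for a new or continuation run.
# Only the -c, -s and -r flags described on this page are used.
clustersubmit_cmd() {
  kind=$1               # "new" or "continue"
  id=$2                 # job ID
  machine=${3:-quest}   # target machine, quest by default
  case "$kind" in
    new)      cflag=n ;;
    continue) cflag=y ;;
    *)
      echo "usage: clustersubmit_cmd new|continue JOBID [MACHINE]" >&2
      return 2 ;;
  esac
  echo "clustersubmit -c $cflag -s y -r $machine $id"
}

# Usage (not executed here):
#   clustersubmit_cmd continue "$jobid"
```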

(2) Modify qsubmit.quest#

5.3. modify qsubmit.quest#

5.4. submit to run

(3) Do the same thing but on puma

(4) Add a post processing script

By doing this UMUI will automatically do the same thing as (3).

There is also some information about resubmission on the NCAS-CMS website.

Page last modified on May 01, 2008, at 04:48 PM by Masaru Yoshioka