r4112 adds scripts to run Apsim on the UQ cluster
UQ uses the PBS Pro batch execution environment, which provides a job queue for your long-running batch simulations. It has a larger number of users than we are used to, and turnaround time can vary widely depending on the number of researchers in the queue. Users need to get their own login from RCC. Access to the login nodes is via ssh or a web-based remote desktop. The login nodes are very slow (intentionally) - don't attempt to run apsim on them.
Rather than compiling apsim for the cluster's native OS (CentOS), we use Singularity (http://singularity.lbl.gov/) to run apsim in a container, avoiding the need to separately compile apsim and its support libraries. Running the container over NFS is more efficient than copying and unpacking the installation. Apsim execution times are the same whether it is compiled natively or run in a container. Versioning is simple - one version per container.
Running jobs follows the old paradigm:
- with the GUI's "Run on Cluster", or the BundleApsim.exe program, create a zipfile and transfer it to a login node
- unpack it in a working folder and check the headers in Apsim.pbs. Execution time and/or memory requirements might need adjustment: both are monitored, and the job will be terminated if it exceeds its limits. Running this script will submit the job array to the system.
- your outputs will appear in a set of .tar.gz files. Standard error will be logged in a file named like "Apsim.e1234" (where 1234 is the id of the job) and standard output in "Apsim.o1234".
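As a rough illustration of the header check in the second step, the sketch below renders the kind of PBS Pro directives that usually carry the time and memory limits. The function name pbs_header and all resource values are assumptions made up for this example; the Apsim.pbs shipped in your zipfile may lay out its headers differently, so treat this as a guide to what to look for rather than a copy of the real script.

```python
def pbs_header(walltime="01:00:00", mem="4GB", ncpus=1, array_size=10,
               job_name="Apsim"):
    """Render PBS Pro directive lines.

    All defaults are hypothetical illustration values, not the contents
    of the real Apsim.pbs.
    """
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",                        # job name, also used in log file names
        f"#PBS -l select=1:ncpus={ncpus}:mem={mem}",  # per-subjob CPU and memory request
        f"#PBS -l walltime={walltime}",               # maximum run time, HH:MM:SS
    ]
    if array_size > 1:
        # -J requests a job array with one subjob per index in the range
        lines.append(f"#PBS -J 1-{array_size}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example: a longer, more memory-hungry run split into 20 subjobs
    print(pbs_header(walltime="02:30:00", mem="8GB", array_size=20))
```

If a job is killed for exceeding its limits, the walltime and mem values are the ones to raise before resubmitting.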
r4112 has scripts to a) build apsim in a docker environment and b) create singularity containers with a specific apsim version. There is a world-readable image in /home/uqpdevo1/Apsim.latest.sapp, maintained by firstname.lastname@example.org
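For orientation, here is a minimal sketch of how a command line against that image could be assembled using Singularity's exec subcommand. The helper name singularity_argv is invented for this example, and "Apsim.exe" and "Example.apsim" are placeholders: the actual entry point inside the container is whatever the r4112 scripts set up, and the generated Apsim.pbs normally takes care of this call for you.

```python
import shlex

# World-readable image named in the text above
IMAGE = "/home/uqpdevo1/Apsim.latest.sapp"


def singularity_argv(command, image=IMAGE):
    """Build the argv for `singularity exec IMAGE COMMAND...` without running it."""
    if not command:
        raise ValueError("command must contain at least one argument")
    return ["singularity", "exec", image, *command]


if __name__ == "__main__":
    # "Apsim.exe" and "Example.apsim" are hypothetical placeholders
    argv = singularity_argv(["Apsim.exe", "Example.apsim"])
    print(shlex.join(argv))
```

Because each container holds exactly one apsim version, pointing the image argument at a different container file is all it takes to switch versions.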