Checkpoint file is not written

GROMACS version: 2022.5
GROMACS modification: No
Hello,
I am using an Umbrella Sampling/Simulated Tempering strategy. Since I started using gmx-2022.5 I have observed that the checkpoint file is no longer written. The log file prints:

Expanded ensemble with the legacy simulator does not always checkpoint correctly, so checkpointing is disabled. You will not be able to do a checkpoint restart of this simulation. If you use the modular simulator (e.g. by choosing md-vv integrator) then checkpointing is enabled. See GROMACS issue #4629 ("GROMACS 2022.3 has issues with checkpointing expanded ensemble simulations") on the GROMACS GitLab for details.

I went to the issue tracker, but in the end I did not understand what the best option is in my case: go back to a previous GROMACS version where the checkpoint was still written (though with some bugs whose repercussions for my case I do not know), or add some settings to the MDP file that mitigate the issue so that the checkpoint is written. I am already using the md-vv integrator. I also do not understand what the modular simulator and the legacy simulator referenced in the printed note are.

This should not happen; the modular simulator should be chosen automatically.

You can force the use of the modular simulator by setting this environment variable when running mdrun: GMX_USE_MODULAR_SIMULATOR=ON
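
A minimal sketch of two common ways to pass the variable to mdrun (the file names here are only placeholders): export it in the shell that launches mdrun, or set it just for the one mdrun invocation.

export GMX_USE_MODULAR_SIMULATOR=ON
gmx mdrun -deffnm production

# or equivalently, only for this single command:
GMX_USE_MODULAR_SIMULATOR=ON gmx mdrun -deffnm production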

Hi @hess. Thanks for your reply! I set GMX_USE_MODULAR_SIMULATOR=ON in my job script and then called mdrun as: gmx mdrun -nt 12 -cpi -stepout 5000 -v -deffnm production -px production_pullx -pf production_pullf >& production.lis. But I am still not getting the checkpoint file, which should be production.cpt. Here are some fragments of my MDP file:

define                                   = -DPOSRES -DPOSRES_FC_BB=0.0 -DPOSRES_FC_SC=0.0 -DPOSRES_FC_LIPID=0.0 -DDIHRES -DDIHRES_FC=0.0 -DPOSRES_LIG=0.0
integrator                               = md-vv
dt                                       = 0.004
tinit                                    = 0
nsteps                                   = 37500000
nstcomm                                  = 100
nstxout                                  = 0
nstvout                                  = 0
nstfout                                  = 0
nstcalcenergy                            = 100
nstenergy                                = 5000
nstlog                                   = 18750
nstxout_compressed                       = 18750
(...)
; Simulated Tempering
free_energy                              = expanded
init_lambda_state                        = 0
nstdhdl                                  = 50
temperature_lambdas                      = 0.0 0.06666666666666667 0.13333333333333333 0.2 0.26666666666666666 0.3333333333333333 0.4 0.4666666666666667 0.5333333333333333 0.6 0.6666666666666666 0.7333333333333333 0.8 0.8666666666666667 0.9333333333333333 1.0
init_lambda_weights                      = 0.0 3881.13794 7695.36963 11455.08984 15156.50684 18796.63477 22372.41992 25899.44141 29367.31055 32777.19531 36129.94531 39438.44922 42699.17188 45909.59375 49072.89062 52181.53125
simulated_tempering                      = yes
simulated_tempering_scaling              = linear
sim_temp_low                             = 303.15
sim_temp_high                            = 333.15
nstexpanded                              = 100
lmc_stats                                = wang-landau
lmc_move                                 = metropolis
lmc_weights_equil                        = no
wl_scale                                 = 0.999999
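
For reference, assuming linear simulated_tempering_scaling interpolates the temperature between sim_temp_low and sim_temp_high via temperature_lambdas, a lambda value λ corresponds to T = 303.15 K + λ · (333.15 K − 303.15 K); e.g. the second state, λ ≈ 0.0667, runs at about 305.15 K.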

Then I suppose you are doing something wrong in setting the environment variable. Unfortunately mdrun doesn’t print which simulator it is using.
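
Since mdrun doesn't report the simulator choice, one indirect check (just a sketch, assuming the default file names that follow from -deffnm production) is to look for the "checkpointing is disabled" note in the log and to see whether the checkpoint file appears once the run has been going for a while:

grep -i "checkpoint" production.log    # the disable note should no longer appear
ls -l production.cpt                   # should exist after the first checkpoint is written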

Do you need to export the env var?
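
For context, a plain assignment is local to the shell; a child process such as gmx only sees the variable after it has been exported. A minimal sketch:

GMX_USE_MODULAR_SIMULATOR=ON                              # not exported: child processes do not see it
bash -c 'echo "${GMX_USE_MODULAR_SIMULATOR:-not set}"'    # prints "not set"

export GMX_USE_MODULAR_SIMULATOR=ON                       # exported: inherited by child processes
bash -c 'echo "${GMX_USE_MODULAR_SIMULATOR:-not set}"'    # prints "ON"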

This is my script:

#!/bin/bash
#SBATCH --job-name=15_production
#SBATCH --output=myjob.out
#SBATCH --error=myjob.err
#SBATCH --nice=0
#SBATCH --nodes=1
#SBATCH --gpus=1
#SBATCH --cpus-per-task=12
#SBATCH --partition=deflt
#SBATCH --exclude=fang41,fang49
#SBATCH --time=2-00:00:00


# This block is echoing some SLURM variables
echo "Job execution start: $(date)"
echo "JobID = $SLURM_JOBID"
echo "Host = $SLURM_JOB_NODELIST"
echo "Jobname = $SLURM_JOB_NAME"
echo "Subcwd = $SLURM_SUBMIT_DIR"
echo "SLURM_TASKS_PER_NODE = $SLURM_TASKS_PER_NODE"
echo "SLURM_CPUS_PER_TASK = $SLURM_CPUS_PER_TASK"
echo "SLURM_CPUS_ON_NODE = $SLURM_CPUS_ON_NODE"

source /data/shared/spack-0.19.1/shared.bash
module load gromacs/2022.5

cd $(pwd)

export GMX_USE_MODULAR_SIMULATOR=ON
echo "GMX_USE_MODULAR_SIMULATOR = $GMX_USE_MODULAR_SIMULATOR"
gmx mdrun -nt 12 -cpi -stepout 5000 -v -deffnm production -px production_pullx -pf production_pullf  >& production.lis

The line

echo "GMX_USE_MODULAR_SIMULATOR = $GMX_USE_MODULAR_SIMULATOR"

is wrong. There must not be white space before and after the "=" sign. In your line, you try to execute a command GMX_USE_MODULAR_SIMULATOR with the two arguments "=" and "$GMX_USE_MODULAR_SIMULATOR":

A=5      # correct: assigns 5 to A
A = 5    # wrong: bash parses "A" as a command with arguments "=" and "5"
-bash: A: command not found
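
Applied to the variable in question, the same rule looks like this (just a sketch restating the point above):

GMX_USE_MODULAR_SIMULATOR=ON      # correct: assigns the value in the current shell
GMX_USE_MODULAR_SIMULATOR = ON    # wrong: bash looks for a command named GMX_USE_MODULAR_SIMULATOR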