Gmx_mpi, PME and AWH

GROMACS version: 2021.4
GROMACS modification: No

I tried to run an AWH simulation with gmx_mpi through Slurm, and it failed with this error:

Program: gmx mdrun, version 2021.5
Source file: src/gromacs/mdrunutility/multisim.cpp (line 67)
Function: std::unique_ptr<gmx_multisim_t> buildMultiSimulation(MPI_Comm, gmx::ArrayRef<const std::__cxx11::basic_string<char> >)

Feature not implemented:
Multi-simulations are only supported when GROMACS has been configured with a
proper external MPI library.

For more information and tips for troubleshooting, please check the GROMACS
website at

This was solved by Yuxuan Zhuang with:

module load gromacs/2021.4 gromacs=gmx_mpi

Then I struggled with the -pme gpu setting. The submission script was:

# Submit to the tcb partition
#SBATCH -p tcb

# The name of the job in the queue
#SBATCH -J awh_k

# Wall-clock time requested for this job
#SBATCH -t 24:00:00

# Number of nodes and number of MPI processes per node
#SBATCH -N 4 --ntasks-per-node=4

# Request GPU node(s) and two GPUs per node
#SBATCH -C gpu --gres=gpu:2

# Output file names for stderr and stdout
#SBATCH -e "./job-%j.err"
#SBATCH -o "./slurm-%j.out"

# The actual script starts here

module unload gromacs
module unload openmpi

module load gromacs/2021.4 gromacs=gmx_mpi
module load openmpi/4.0

mpirun -np 16 gmx_mpi mdrun -deffnm awh -cpi awh.cpt -multidir awh_1 awh_2 awh_3 awh_4 -nsteps -1 -px awh_pullx -pf awh_pullf -update gpu -v -pme gpu

The error message was:

Program: gmx mdrun, version 2021.4
Source file: src/gromacs/ewald/pme.cpp (line 894)
Function: gmx_pme_t* gmx_pme_init(const t_commrec, const NumPmeDomains&, const t_inputrec, gmx_bool, gmx_bool, gmx_bool, real, real, int, PmeRunMode, PmeGpu, const DeviceContext, const DeviceStream, const PmeGpuProgram, const gmx::MDLogger&)
MPI rank: 6 (out of 16)

Feature not implemented:
PME GPU does not support PME decomposition.

For more information and tips for troubleshooting, please check the GROMACS
website at

The diagnosis of the error is:

Quote (Paul Bauer): "For the error, you can't have multiple PME ranks when offloading PME work to the GPU."

Solution, provided by Magnus Lundborg:

Use "mpirun -np 4" if you're running 4 AWH walkers.

To expand on my answer concerning the PME GPU error:

To make sure PME can run on the GPU, use -npme 1 -pme gpu
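Putting the advice together, here is a sketch of two ways the failing mpirun line could be adjusted (assuming the same four walker directories; exact rank counts depend on your node allocation). With -multidir, the ranks are split evenly over the simulations, so either give each walker a single rank, or keep multiple ranks per walker and dedicate exactly one of them to PME:

```shell
# Option 1: one MPI rank per walker; each rank handles both PP and PME on its GPU
mpirun -np 4 gmx_mpi mdrun -deffnm awh -cpi awh.cpt \
    -multidir awh_1 awh_2 awh_3 awh_4 \
    -nsteps -1 -px awh_pullx -pf awh_pullf \
    -update gpu -pme gpu -v

# Option 2: keep 16 ranks (4 per walker), but force a single PME rank per
# simulation so PME is never decomposed across ranks
mpirun -np 16 gmx_mpi mdrun -deffnm awh -cpi awh.cpt \
    -multidir awh_1 awh_2 awh_3 awh_4 \
    -nsteps -1 -px awh_pullx -pf awh_pullf \
    -update gpu -pme gpu -npme 1 -v
```

Either way, no simulation ends up with more than one PME rank, which is what the "PME GPU does not support PME decomposition" error is complaining about.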

I think this should be filed as an issue on GitLab, as this is indeed confusing. If a user specifies -pme gpu without specifying the number of PME ranks, there should not be an error; instead, -npme 1 should be selected automatically.

The issue has now been raised and copied to GitLab for future users: