Gromacs run time flags for MPI and GPU

GROMACS version: 2021.3
GROMACS modification: No

Hi All. I am seeking a set of run time flags for Gromacs 2021.3 on a Linux workstation. This machine has a 32 core CPU (64 threads) and a (currently low spec) GPU. I want to achieve optimum performance by running the non-bonded and/or electrostatics calculations on the GPU and the other parts of the calculation on the CPU. I do not want to run the CPU too ‘hot’, so I wish to stick to (say) 16 cores (32 threads).

These flags do seem to use the GPU, but CPU usage is above 90%:

MPFLAGS=> -ntmpi 8
GPUFLAGS=> -nb gpu -pme cpu

How can I tune these flags for a bit less CPU while keeping the GPU busy?
There are lots of parameters that all interact, so any simple explanation is helpful.


Have you tried trusting your standard GPU install to decide the split itself? What do you get?

Good idea. I assumed it must be complicated, so I tried to figure out all the flags myself. I have just tried running with no MPI or GPU related flags. By default it seems to use 64 MPI threads and hence 64 cores, making the CPU very hot. The CPU will overheat and the calculations will fail if I leave the job running for many hours in this state. As far as I can tell there is no GPU usage unless the flags are specified.

I have just tried this run time command, which seems to do what I want.

gmx mdrun -deffnm nvt -ntomp 2 -ntmpi 16 -nb gpu

This keeps the number of CPU threads down to 32 (about 50% total CPU usage).
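
For anyone following along, the arithmetic behind that figure: mdrun starts -ntmpi thread-MPI ranks, each with -ntomp OpenMP threads, so the total thread count is the product of the two. A minimal shell sketch of the check (the helper name total_threads is made up for illustration, and nproc is assumed to be available, as it is on most Linux systems):

```shell
#!/bin/sh
# Hypothetical helper: total CPU threads mdrun starts for a given number of
# thread-MPI ranks (-ntmpi) and OpenMP threads per rank (-ntomp).
total_threads() {
    echo $(( $1 * $2 ))
}

ntmpi=16
ntomp=2
# Hardware threads visible to the OS; falls back to "unknown" without nproc.
hw=$(nproc 2>/dev/null || echo unknown)

echo "mdrun threads: $(total_threads "$ntmpi" "$ntomp") of $hw hardware threads"
# Print (not run) the matching command line from the post above.
echo "gmx mdrun -deffnm nvt -ntmpi $ntmpi -ntomp $ntomp -nb gpu"
```

Any other pair with the same product (for example 8 ranks with 4 threads each) stays within the same 32-thread budget, although performance may differ.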


What is your hardware and type of simulation? If you have a single GPU, try using no domain decomposition (i.e. -ntmpi 1).


Thanks for the suggestion. The hardware is 1 CPU (32 cores, 64 threads) and 1 GPU (1200 CUDA cores). I have now tried a number of run time parameters.

The simulation is just MD of a protein in a water box.

gmx mdrun -s npt.tpr -deffnm npt -ntomp 32 -ntmpi 1 -pin on -nb gpu
=> 75 ns/day

gmx mdrun -s npt.tpr -deffnm npt -ntomp 4 -ntmpi 8 -pin on -nb gpu
=> 84 ns/day

Both use around 50% of total CPU; I don’t want to run the CPU too hot for the cooler.
I realise I should run a standard benchmark case rather than a custom one, so that the times in ns/day can be compared with other hardware setups.
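
In case it helps anyone repeating this comparison, here is a rough sketch of how the rank/thread layouts could be swept with short runs. It assumes the input is npt.tpr as above; the helper names ns_per_day and bench, the output names bench_*, the list of layouts and the 20000-step run length are arbitrary choices of mine, not a recommended protocol. It reads the "Performance:" line that mdrun writes at the end of its log.

```shell
#!/bin/sh
# Extract ns/day from a GROMACS log file: the "Performance:" line lists
# ns/day first and hour/ns second.
ns_per_day() {
    awk '/^ *Performance:/ { print $2 }' "$1"
}

# Hypothetical helper: run one short benchmark and print its ns/day.
# Arguments: tpr file, thread-MPI ranks, OpenMP threads per rank.
bench() {
    tpr=$1; ntmpi=$2; ntomp=$3
    out="bench_${ntmpi}x${ntomp}"
    # -resethway resets the timers halfway so start-up cost is excluded;
    # -noconfout skips writing the final configuration.
    gmx mdrun -s "$tpr" -deffnm "$out" -ntmpi "$ntmpi" -ntomp "$ntomp" \
        -pin on -nb gpu -nsteps 20000 -resethway -noconfout >/dev/null 2>&1
    echo "$ntmpi x $ntomp: $(ns_per_day "$out.log") ns/day"
}

# Sweep layouts that all use 32 threads, only if gmx and the input exist.
if command -v gmx >/dev/null 2>&1 && [ -f npt.tpr ]; then
    for layout in "1 32" "4 8" "8 4" "16 2"; do
        set -- $layout
        bench npt.tpr "$1" "$2"
    done
fi
```

Keeping the product of ranks and threads fixed at 32 means every run stays within the same CPU budget, so the ns/day values are directly comparable on this machine.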

Well, I guess at least you have a heat source which is not dependent on gas supply…