How to run replica exchange with NVIDIA MPS/MIG

GROMACS version: 2023
GROMACS modification: No

I just came across the link below while looking for ways to increase the throughput of a REMD run in GROMACS on a GPU-based HPC server:

According to the linked article (co-authored by @pszilard), using NVIDIA MPS (Multi-Process Service) or MIG (Multi-Instance GPU) can boost the aggregate performance of concurrent GPU-based MD simulations by a factor of ~1.3-1.8 for small to medium-sized regular systems.
The showcase in the article consists of a batch of concurrent but independent simulations. Can MPS/MIG also be used to boost the aggregate throughput of a REMD run in GROMACS? If so, how can this be achieved, and how should the example script in the article be modified for this purpose?

Yes, this is possible, and straightforward when using pure-MPS (MIG is trickier and typically doesn’t offer much additional benefit).
You should:

  • Enable MPS with “nvidia-cuda-mps-control -d” on each compute node you are using.
  • To run N members per GPU, adjust your script to launch N times as many MPI tasks per node; GROMACS will then automatically oversubscribe the GPUs.
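
As a rough illustration of the two steps above, here is a minimal sketch of the rank arithmetic plus the MPS start-up. The node and GPU counts, the variable names, and the DRY_RUN switch are assumptions made for this example; they are not taken from the article's script. With DRY_RUN=1 (the default here) the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch only: all values and names below are illustrative assumptions.
GPUS_PER_NODE=2      # GPUs available on each compute node
MEMBERS_PER_GPU=2    # N: number of ensemble members sharing one GPU
NODES=3              # number of compute nodes in the job
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# N times more MPI tasks per node than GPUs per node.
RANKS_PER_NODE=$(( GPUS_PER_NODE * MEMBERS_PER_GPU ))
TOTAL_RANKS=$(( RANKS_PER_NODE * NODES ))

# Start the MPS control daemon. This has to happen once on EVERY node,
# so in a real job it would be wrapped in the scheduler's per-node launcher.
run nvidia-cuda-mps-control -d

echo "MPI tasks per node: ${RANKS_PER_NODE}; total MPI tasks: ${TOTAL_RANKS}"
```

When the job finishes, the daemon can be stopped on each node with `echo quit | nvidia-cuda-mps-control`.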

Thanks for the reply. So if I want to run, say, 12 replicas over 3 nodes with 2 GPUs each, i.e. 6 GPUs in total, I should enable MPS on all three nodes and submit something along these lines:
mpirun -np 12 gmx_mpi mdrun -multi 12 -pme gpu -nb gpu -update gpu

Am I right?

Correct. However, I suggest first considering what you want to accomplish: minimum time to solution, maximum efficiency (e.g. maximum aggregate throughput per node), or something in between. Pick your setup based on that.

Note that -multi has been deprecated; use -multidir instead.
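
For the 12-replica case discussed above, a launch line using -multidir might be assembled roughly as follows. This is a sketch under stated assumptions: the directory names rep_00 ... rep_11 are hypothetical (each would need to contain its own run input file), and the exchange interval passed to -replex is an arbitrary placeholder. The -update gpu flag from the question is left out here because whether GPU update can be combined with replica exchange should be checked against the mdrun documentation for your GROMACS version. The script only prints the command; it does not run it.

```shell
#!/bin/sh
# Sketch only: directory names and the -replex interval are assumptions.
NREP=12              # number of replicas (one MPI rank each)
REPLEX_INTERVAL=1000 # placeholder: steps between exchange attempts

# Build the list of replica directories: rep_00 rep_01 ... rep_11
DIRS=""
i=0
while [ "$i" -lt "$NREP" ]; do
    DIRS="${DIRS} $(printf 'rep_%02d' "$i")"
    i=$(( i + 1 ))
done

# Assemble the launch line; -multidir takes one directory per replica.
CMD="mpirun -np ${NREP} gmx_mpi mdrun -multidir${DIRS} -replex ${REPLEX_INTERVAL} -nb gpu -pme gpu"

echo "$CMD"
```

With MPS enabled on all three nodes and 4 MPI tasks placed on each node, every one of the 6 GPUs would be shared by 2 replicas.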