How to run replica exchange with NVIDIA MPS/MIG

roozi · September 7, 2023, 1:21am

GROMACS version: 2023
GROMACS modification: No

Hi,
I just came across the link below while looking for ways to increase the throughput of a REMD run in GROMACS on a GPU-based HPC server:

https://developer.nvidia.com/blog/maximizing-gromacs-throughput-with-multiple-simulations-per-gpu-using-mps-and-mig/

According to the article linked above (co-authored by @pszilard), using NVIDIA MPS (Multi-Process Service) or MIG (Multi-Instance GPU) can boost the aggregate performance of concurrent GPU-based MD simulations by ~1.3-1.8 times for small to medium-sized regular systems.
The showcase in the article consists of a batch of concurrent but independently run simulations. Can the MPS/MIG capabilities also be used to boost the aggregate throughput of a REMD run in GROMACS? If so, how can this be achieved, and how should the example script in the article be modified to this end?
Best regards,
Roozi

alang · September 8, 2023, 1:13pm

Yes, this is possible, and it is straightforward when using pure MPS (MIG is trickier and typically doesn’t offer much additional benefit).
You should:

1. Enable MPS with “nvidia-cuda-mps-control -d” on each compute node you are using.
2. To run N members per GPU, adjust your script to launch N times more MPI tasks per node; GROMACS will automatically oversubscribe the GPUs.
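As a rough sketch of the two steps above, the snippet below works out how many MPI tasks to request when packing N ensemble members onto each GPU. The node count, GPUs per node, and N are illustrative assumptions, not values from this thread's cluster, and the script only prints the steps (a dry run) rather than starting MPS or launching anything.

```shell
# Hypothetical layout; adjust every value to your own cluster.
NODES=3              # assumption: number of compute nodes
GPUS_PER_NODE=2      # assumption: GPUs available on each node
MEMBERS_PER_GPU=2    # "N" in the steps above: ensemble members sharing one GPU

# One MPI task per member, so N times more tasks per node than GPUs.
RANKS_PER_NODE=$((GPUS_PER_NODE * MEMBERS_PER_GPU))
TOTAL_RANKS=$((RANKS_PER_NODE * NODES))

# Dry run: print what would be done instead of doing it.
echo "on each node, start the MPS daemon: nvidia-cuda-mps-control -d"
echo "MPI tasks per node: $RANKS_PER_NODE"
echo "MPI tasks in total: $TOTAL_RANKS"
echo "when finished, on each node: echo quit | nvidia-cuda-mps-control"
```

How the per-node task count is passed on (for example through the batch scheduler or an MPI launcher option) depends on the site setup and is deliberately left out here.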

roozi · September 9, 2023, 1:54am

Thanks for the reply. So if I want to run, say, 12 replicas over 3 nodes with 2 GPUs each (i.e. 6 GPUs in total), I should enable MPS on all three nodes and submit something along these lines:
mpirun -np 12 gmx_mpi mdrun -multi 12 -pme gpu -nb gpu -update gpu

Am I right?

pszilard · September 11, 2023, 9:44am

Correct. However, I suggest first considering what you want to accomplish: minimum time to solution, maximum efficiency (e.g. maximum aggregate throughput per node), or something in between. Pick your setup based on that.

Note that -multi has been deprecated; use -multidir instead.

Cheers,
Szilárd
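
For illustration, a -multidir version of the command discussed above might be assembled as follows. This is only a sketch: the directory names rep_00 ... rep_11 and the exchange interval passed to -replex are assumptions, each directory is assumed to hold its own topol.tpr, and the script prints the command instead of running it.

```shell
# Assumption: one directory per replica, named rep_00 ... rep_11,
# each containing that replica's topol.tpr.
NREP=12
DIRS=""
i=0
while [ "$i" -lt "$NREP" ]; do
    DIRS="$DIRS $(printf 'rep_%02d' "$i")"
    i=$((i + 1))
done

# Assumption: exchange attempts every 1000 steps (-replex 1000).
# -update gpu is left out here; check whether your GROMACS version
# supports GPU update together with replica exchange before adding it.
CMD="mpirun -np $NREP gmx_mpi mdrun -multidir$DIRS -replex 1000 -nb gpu -pme gpu"

# Dry run: print the command rather than executing it.
echo "$CMD"
```

With MPS enabled on every node and 12 tasks spread over 6 GPUs, this would place two replicas on each GPU.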
