According to the article linked above (co-authored by @pszilard), using NVIDIA MPS (Multi-Process Service) or MIG (Multi-Instance GPU) can boost the aggregate performance of GPU-based concurrent MD simulations by roughly 1.3-1.8 times for small to medium-sized regular systems.
The showcase in the article consists of a batch of concurrent but independently run simulations. Can the MPS/MIG capabilities also be used to boost the aggregate throughput of a REMD run in GROMACS? If so, how can this be achieved, and how should the example script in the article be modified to this end?
B/R
Roozi
Thanks for the reply. So if I want to run, say, 12 replicas over 3 nodes, each with 2 GPUs (i.e. 6 GPUs in total), I should activate MPS on all three nodes and submit something along the lines of:
mpirun -np 12 gmx_mpi mdrun -multi 12 -pme gpu -nb gpu -update gpu
Correct. However, I suggest first considering what you want to accomplish: minimum time to solution, maximum efficiency (e.g. maximum aggregate throughput per node), or somewhere in between. Pick your setup based on that.
Note that -multi has been deprecated; use -multidir instead, which takes one directory per replica.
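As a rough sketch of how the command above might look with -multidir, assuming the layout from the question (12 replicas, 3 nodes, 2 GPUs per node): the directory names rep00..rep11 and the exchange interval of 1000 steps are placeholders, not values from the article, and the offload flags are copied unchanged from the command above. The script only prints the command. MPS still has to be started on each node beforehand (e.g. with nvidia-cuda-mps-control -d), and how ranks get mapped to GPUs depends on your MPI launcher and scheduler.

```shell
#!/usr/bin/env bash
# Sketch only: layout taken from the question (12 replicas, 3 nodes, 2 GPUs per node).
NREPLICAS=12
NNODES=3
GPUS_PER_NODE=2

# With an even distribution, each node hosts 4 ranks and each GPU is shared by 2 ranks.
RANKS_PER_NODE=$(( NREPLICAS / NNODES ))
RANKS_PER_GPU=$(( NREPLICAS / (NNODES * GPUS_PER_NODE) ))

# Hypothetical directory names, one per replica; each is assumed to hold its own topol.tpr.
DIRS=$(printf 'rep%02d ' $(seq 0 $(( NREPLICAS - 1 ))))

# -replex 1000 is an assumed exchange interval (in steps); adjust to your protocol.
# The command is printed, not executed, so it can be checked before submitting.
CMD="mpirun -np ${NREPLICAS} gmx_mpi mdrun -multidir ${DIRS}-replex 1000 -nb gpu -pme gpu -update gpu"

echo "ranks per node: ${RANKS_PER_NODE}, ranks per GPU: ${RANKS_PER_GPU}"
echo "${CMD}"
```

Whether 2 ranks per GPU is the right sharing level depends on the goal mentioned above (time to solution versus aggregate throughput), so it is worth benchmarking a few layouts.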