GROMACS version: 2022.2
GROMACS modification: Yes/No
openmpi : 4.1.5
I have an issue running a simulation with CUDA-aware OpenMPI.
I have a cluster with 2 GPUs per node, in EXCLUSIVE_PROCESS mode.
I configured the slots to be equal to the number of GPUs on each node (2).
Typically, I tried to submit to multiple hosts but got the following error:
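For reference, the hostfile has this shape (hostnames taken from the log below; slot counts match the 2 GPUs per node):

```
g-node010 slots=2
g-node013 slots=2
```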
/site/eclub/app/x86_64/tools/openmpi/4.1.3-cuda/bin/mpirun --hostfile hostfile -x GMX_ENABLE_DIRECT_GPU_COMM -x PATH -x LD_LIBRARY_PATH -np 4 gmx_mpi mdrun -ntomp 12 -nb gpu -bonded gpu -s md2.tpr -g test.log 2>&1
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.
Local host: g-node013
Local device: mlx4_0
--------------------------------------------------------------------------
:-) GROMACS - gmx mdrun, 2022 (-:
Executable: /site/eclub/app/x86_64/discovery/gromacs/2022_gpu/bin/gmx_mpi
Data prefix: /site/eclub/app/x86_64/discovery/gromacs/2022_gpu
Working dir: /site/eclub/work/users/appadmin/sample/soft/gromacs/local_gpu
Command line:
gmx_mpi mdrun -ntomp 12 -nb gpu -bonded gpu -s md2.tpr -g test.log
Back Off! I just backed up test.log to ./#test.log.44#
Reading file md2.tpr, VERSION 5.1.3 (single precision)
Note: file tpx version 103, software tpx version 127
GMX_ENABLE_DIRECT_GPU_COMM environment variable detected, enabling direct GPU communication using GPU-aware MPI.
Changing nstlist from 10 to 100, rlist from 1 to 1.16
On host g-node010 2 GPUs selected for this run.
Mapping of GPU IDs to the 2 GPU tasks in the 2 ranks on this node:
PP:0,PP:1
PP tasks will do (non-perturbed) short-ranged and most bonded interactions on the GPU
PP task will update and constrain coordinates on the CPU
GPU direct communication will be used between MPI ranks.
To simplify, I then submitted on a single node, still using MPI (I know it's not optimal and I should use thread-MPI - which I did and it worked perfectly - but this is for testing purposes).
/site/eclub/app/x86_64/tools/openmpi/4.1.3-cuda/bin/mpirun --hostfile hostfile -x GMX_ENABLE_DIRECT_GPU_COMM -x PATH -x LD_LIBRARY_PATH -np 2 gmx_mpi mdrun -ntomp 12 -nb gpu -bonded gpu -s md2.tpr -g test.log 2>&1
Same error.
Then I switched my GPUs to DEFAULT mode on the execution node and it worked.
Using nvidia-smi to monitor the node, I saw that 2 MPI processes were being handled by one GPU, which explains my previous error (in EXCLUSIVE_PROCESS mode, a GPU cannot handle more than one process at a time):
Thu Aug 25 17:02:35 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03 Driver Version: 460.32.03 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla K40m On | 00000000:02:00.0 Off | 0 |
| N/A 23C P0 61W / 235W | 72MiB / 11441MiB | 1% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 Tesla K40m On | 00000000:84:00.0 Off | 0 |
| N/A 22C P0 62W / 235W | 140MiB / 11441MiB | 1% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 22568 C gmx_mpi 69MiB |
|    1   N/A  N/A     22568      C   gmx_mpi                            65MiB |
|    1   N/A  N/A     22569      C   gmx_mpi                            69MiB |
+-----------------------------------------------------------------------------+
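For completeness, the compute mode switching I describe above was done with standard nvidia-smi commands (the GPU index is an example; setting the mode requires root):

```shell
# Query the current compute mode of each GPU (no root needed)
nvidia-smi --query-gpu=index,compute_mode --format=csv

# Switch a GPU to default (shared) mode, as in my test
sudo nvidia-smi -i 0 -c DEFAULT

# Switch it back to exclusive-process mode
sudo nvidia-smi -i 0 -c EXCLUSIVE_PROCESS
```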
So my question is: how can I prevent that behaviour while keeping my GPUs in EXCLUSIVE_PROCESS mode (which is mandatory on our cluster for policy reasons) when using OpenMPI?
Also, why did I not observe the same behaviour with the thread-MPI build of GROMACS?
Thanks for your help.