GROMACS version: 2019.2
GROMACS modification: No
Dear all,
I am trying to run two MD simulation tasks on one computer equipped with 2 CPUs and 4 GPUs. The CPU information is as follows:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 56
On-line CPU(s) list: 0-55
Thread(s) per core: 2
Core(s) per socket: 14
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Intel® Xeon® CPU E5-2690 v4 @ 2.60GHz
Stepping: 1
CPU MHz: 3199.929
BogoMIPS: 5206.06
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 35840K
NUMA node0 CPU(s): 0-13,28-41
NUMA node1 CPU(s): 14-27,42-55
I first submit one MD task on two GPUs with the following command:
gmx mdrun -v -deffnm md -nt 24 -gpu_id 0,1
The GPU utilization (GPU-Util) of the two GPUs in use reaches 65% and 60%, respectively, as shown below:
$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.26 Driver Version: 375.26 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Graphics Device Off | 0000:02:00.0 Off | N/A |
| 53% 82C P2 151W / 250W | 197MiB / 12189MiB | 65% Default |
+-------------------------------+----------------------+----------------------+
| 1 Graphics Device Off | 0000:03:00.0 Off | N/A |
| 51% 80C P2 148W / 250W | 193MiB / 12189MiB | 60% Default |
+-------------------------------+----------------------+----------------------+
| 2 Graphics Device Off | 0000:82:00.0 Off | N/A |
| 23% 25C P8 8W / 250W | 2MiB / 12189MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 Graphics Device Off | 0000:83:00.0 Off | N/A |
| 23% 26C P8 9W / 250W | 2MiB / 12189MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
However, I then submit a second MD task to the other two GPUs with the following command:
gmx mdrun -v -deffnm md -nt 24 -gpu_id 2,3
The GPU-Util of the two GPUs running the first MD task then drops to 16% and 20%, respectively, and the two GPUs running the second MD task reach only 10% and 14%, respectively, as shown below:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.48 Driver Version: 367.48 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 1080 Off | 0000:02:00.0 Off | N/A |
| 38% 56C P2 48W / 180W | 157MiB / 8113MiB | 16% Default |
+-------------------------------+----------------------+----------------------+
| 1 GeForce GTX 1080 Off | 0000:03:00.0 Off | N/A |
| 41% 61C P2 50W / 180W | 155MiB / 8113MiB | 20% Default |
+-------------------------------+----------------------+----------------------+
| 2 GeForce GTX 1080 Off | 0000:82:00.0 Off | N/A |
| 40% 60C P2 52W / 180W | 149MiB / 8113MiB | 10% Default |
+-------------------------------+----------------------+----------------------+
| 3 GeForce GTX 1080 Off | 0000:83:00.0 Off | N/A |
| 38% 57C P2 49W / 180W | 149MiB / 8113MiB | 14% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 35470 C gmx 155MiB |
| 1 35470 C gmx 153MiB |
| 2 35729 C gmx 147MiB |
| 3 35729 C gmx 147MiB |
+-----------------------------------------------------------------------------+
I wonder whether it is possible to keep the GPU-Util of all four GPUs high when running two MD tasks at the same time. How should I adjust the mdrun parameters to achieve this?
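One thing that may be relevant here: when two mdrun processes share a node, their threads can end up on the same cores unless each run is pinned to its own set. The sketch below only prints two candidate command lines (a dry run) that use the mdrun options -pin, -pinoffset and -pinstride for this. The run names md1 and md2, the offsets 0 and 28, and the idea that GPUs 0,1 sit next to the first socket and GPUs 2,3 next to the second are assumptions for illustration, not something verified on this machine.

```shell
#!/bin/sh
# Dry-run sketch: build two mdrun command lines and print them instead of running them.
# ASSUMPTIONS (not from the post above): run names md1/md2, pin offsets 0 and 28
# (28 logical cores per socket, counted in mdrun's own core ordering), and the
# GPU-to-socket mapping 0,1 -> socket 0 and 2,3 -> socket 1.

build_mdrun_cmd() {
    # $1 = hypothetical run name, $2 = GPU ids, $3 = first logical core to pin to
    printf 'gmx mdrun -v -deffnm %s -nt 24 -pin on -pinoffset %s -pinstride 1 -gpu_id %s\n' "$1" "$3" "$2"
}

# First run: GPUs 0,1, threads pinned starting at logical core 0.
build_mdrun_cmd md1 0,1 0
# Second run: GPUs 2,3, threads pinned starting at logical core 28.
build_mdrun_cmd md2 2,3 28
```

The printed lines can be inspected and then started by hand, each in its own directory; whether the offsets match the real core layout would need to be checked against the hardware report at the top of each md.log.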
Best regards