GROMACS version: 2019
GROMACS modification: Yes, PLUMED
Hi,
I would like to run HREMD simulations. I planned to test the setup on a single node with 4x RTX 2070 GPUs, an AMD Ryzen Threadripper 2950X 16-core processor, and (if I remember correctly) a 1700 W PSU.
The software stack is Debian, Open MPI, PLUMED 2.6.1, GROMACS 2019.6, NVIDIA driver 418.87.00 and CUDA 10.1. The command line is:
mpirun -np 4 /opt/gromacs-2019.6-plumed261/bin/mdrun_mpi/mdrun -v -plumed plumed.dat -multidir sim[0123] -replex 100 -hrex -dlb no
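(For reference, -multidir expects one directory per replica, each with its own tpr and plumed input; my layout is roughly the following, with the file names only illustrative:)
sim0/  topol.tpr  plumed.dat
sim1/  topol.tpr  plumed.dat
sim2/  topol.tpr  plumed.dat
sim3/  topol.tpr  plumed.dat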
The simulation starts with no problem.
However, the last entry in md.log is:
Replica exchange at step 13800 time 27.60000
On the first run, I got this error message: GPU at 00000000:41:00.0 has fallen off the bus.
In that case nothing else changed in the system, the GPU simply became invisible, and nvidia-smi suggested a reboot.
On the second run, after the simulation started successfully, the computer simply restarted without any message in any log file.
My first suspicion is that the power supply is insufficient.
However, while monitoring GPU usage with nvidia-smi, I never saw more than 50-60% utilization.
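To check the power-supply hypothesis more directly, I was planning to log the power draw of each GPU during the run with a standard nvidia-smi query (one sample per second, written to a file):
nvidia-smi --query-gpu=timestamp,index,power.draw,utilization.gpu,temperature.gpu --format=csv -l 1 > gpu_power.log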
My second thought is that the automatic GPU task assignment may be a problem, or that the CPU utilization may be an associated issue. From one of the log files:
This is simulation 0 out of 4 running as a composite GROMACS
multi-simulation job. Setup for this simulation:
Using 1 MPI process
Using 8 OpenMP threads
4 GPUs selected for this run.
Mapping of GPU IDs to the 8 GPU tasks in the 4 ranks on this node:
  PP:0,PME:0,PP:1,PME:1,PP:2,PME:2,PP:3,PME:3
PP tasks will do (non-perturbed) short-ranged interactions on the GPU
PME tasks will do all aspects on the GPU
Pinning threads with an auto-selected logical core stride of 1
System total charge: 0.000
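To rule out the automatic assignment, I am considering making the mapping explicit instead of relying on the defaults, something along these lines (the -ntomp, -pin and -gputasks values are only my guess for 4 replicas on 16 cores and 4 GPUs, essentially spelling out the mapping shown above with fewer threads per rank, not a tested recipe):
mpirun -np 4 /opt/gromacs-2019.6-plumed261/bin/mdrun_mpi/mdrun -v -plumed plumed.dat -multidir sim[0123] -replex 100 -hrex -dlb no -ntomp 4 -pin on -gputasks 00112233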
For comparison, when I run regular MD using 16 cores and 4 GPUs I do not have any problem.
How should I test this issue? How should I run this HREMD simulation?
Thanks for your help and suggestions,
Tamas