CUDA error with GROMACS and PLUMED

GROMACS version: 2022.3
GROMACS modification: No

Dear all,
while running a metadynamics simulation with GROMACS 2022.3 and PLUMED 2.8 on a workstation with an RTX 3090 Ti (CUDA 11.8) and dual Intel Xeon CPUs (2 x 20 cores), I get the following error:

Program: gmx mdrun, version 2022.3-plumed_2.8.1
Source file: src/gromacs/gpu_utils/devicebuffer.cuh (line 91)
Function: freeDeviceBuffer<float2*>(float2**)::<lambda()>
MPI rank: 4 (out of 8)

Assertion failed:
Condition: stat == cudaSuccess
Source file: src/gromacs/gpu_utils/devicebuffer.cuh (line 91)
Function: freeDeviceBuffer<float2*>(float2**)::<lambda()>
Freeing of the device buffer failed. CUDA error #1 (cudaErrorInvalidValue):
invalid argument.

For more information and tips for troubleshooting, please check the GROMACS
website at the Common Errors page of the GROMACS documentation (https://www.gromacs.org)

MPI rank: 5 (out of 8)

Assertion failed:
Condition: stat == cudaSuccess
Freeing of the device buffer failed. CUDA error #1 (cudaErrorInvalidValue):

The simulation runs without error if I use GROMACS compiled with MPI, but when I tried a non-MPI version of GROMACS to get better performance, I got the error above.
The simulation was started with the following command:

gmx mdrun -deffnm MES_6_complex_metad -plumed plumed.dat -bonded gpu -nb gpu -pme gpu -ntmpi 8 -ntomp 5 -npme 1 -gputasks 00000000

Compilation was done with g++ 11.3 on Ubuntu 22.04.

The non-MPI version of GROMACS works fine without PLUMED.

Thanks in advance
Stefano

Hello,

I’m facing exactly the same problem.
You said that the MPI version runs with PLUMED, right? Could you please post the cmake command you used?

Best,

Hello,
the configuration was done with

cmake … -DGMX_BUILD_OWN_FFTW=ON -DGMX_GPU=CUDA -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda -DCMAKE_INSTALL_PREFIX=/home/stefano/gromacs2022_mpi -DGMX_SIMD=AVX2_256 -DGMX_MPI=on

after adding the path to the MPI executables to PATH (I have tried both OpenMPI and MPICH).
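
For reference, that step looks roughly like this (the /opt/openmpi prefix below is only an example path; replace it with wherever your OpenMPI or MPICH is installed):

# example MPI install prefix; adjust to your own installation
export PATH=/opt/openmpi/bin:$PATH
export LD_LIBRARY_PATH=/opt/openmpi/lib:$LD_LIBRARY_PATH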

Best

Hi, I solved it by downgrading CUDA (to v. 10.2) and the NVIDIA driver (to v. 440).
Then I installed GROMACS 2019.6.
Both PLUMED (v. 2.8.1) and GROMACS were compiled without MPI.

Unfortunately, this was the only way I found.
I hope they fix this problem soon. If you find another solution, please let me know.

Best,

Hi,
thanks for the feedback, I'm glad you solved it. I posted the issue to the PLUMED forum as well; so far I have not received a response.

In the meantime I have turned to GROMACS 2021 patched with the Colvars module, since I need to run metadynamics, which is implemented in that module.
For what my opinion is worth, I have noticed that the Colvars-patched version of GROMACS, besides being smoother to install (no different from installing GROMACS itself), runs with exactly the same performance as "native" GROMACS, while I have always noticed a loss of performance with PLUMED (maybe that is my fault and due to a non-optimal compilation of PLUMED).

All the best

Hello

I've managed to run a PLUMED distance-restraint simulation using non-MPI GROMACS 2022.5 patched with PLUMED 2.9.0_dev on Ubuntu 20.04, but only on one GPU card, by passing a -gpu_id value to mdrun.
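
For example, something like this (the file names and the OpenMP thread count are just placeholders for my setup):

gmx mdrun -deffnm md_plumed -plumed plumed.dat -nb gpu -pme gpu -bonded gpu -ntmpi 1 -ntomp 16 -gpu_id 0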

I think the issue originates from the fact that when you run GROMACS on multiple GPU cards, each card gets assigned one thread-MPI rank containing multiple OpenMP threads. GROMACS uses its native thread-MPI by default, which is probably incompatible with PLUMED's MPI. Possibly that's why GROMACS compiled with MPI works fine with the MPI version of PLUMED.

Regards

Gordan

I can get PLUMED 2.8.2 to run a funnel-metadynamics simulation using non-MPI GROMACS 2022.5 if I use the mdrun option -ntmpi 1. Anything larger than 1 does not work. This also limits GROMACS to using only 1 GPU, even if others are available. It would be nice if PLUMED worked with GROMACS' thread-MPI, because I get ~3x more performance with the non-MPI build on a single node with 1-4 GPUs when running equilibrium simulations.

I got the same error with PLUMED 2.8.2 + GROMACS 2022.5. My solution was to use PLUMED 2.8.2 with GROMACS 2021.7, built with MPI + CUDA, for my replica exchange runs.

This is the GROMACS error, right? What does the PLUMED output file say?

Anyway, I am pretty sure PLUMED was not compatible with GROMACS' thread-MPI (https://groups.google.com/g/plumed-users/c/0B5AAbBB9M0), and I don't think this has changed. If you want to use MPI, then you have to compile something like OpenMPI, link it to PLUMED for the compilation, use that PLUMED to patch GROMACS, and compile that GROMACS pointing at the same MPI install.
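
Roughly, the chain looks like this (the install prefixes, parallel make sizes and version numbers below are only examples, not a tested recipe; adjust them to your setup):

# 1. Build and install MPI (example: OpenMPI under $HOME/openmpi)
cd openmpi-4.1.6
./configure --prefix=$HOME/openmpi
make -j 8 && make install
export PATH=$HOME/openmpi/bin:$PATH
export LD_LIBRARY_PATH=$HOME/openmpi/lib:$LD_LIBRARY_PATH

# 2. Build PLUMED with the MPI compiler wrapper so it links against that same MPI
cd ../plumed-2.8.3
./configure --prefix=$HOME/plumed-mpi CXX=mpic++
make -j 8 && make install
export PATH=$HOME/plumed-mpi/bin:$PATH
export LD_LIBRARY_PATH=$HOME/plumed-mpi/lib:$LD_LIBRARY_PATH
export PLUMED_KERNEL=$HOME/plumed-mpi/lib/libplumedKernel.so

# 3. Patch GROMACS with that PLUMED, then configure GROMACS against the same MPI
cd ../gromacs-2022.5
plumed patch -p -e gromacs-2022.5
mkdir build && cd build
cmake .. -DGMX_MPI=on -DGMX_GPU=CUDA -DGMX_BUILD_OWN_FFTW=ON \
      -DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx \
      -DCMAKE_INSTALL_PREFIX=$HOME/gromacs2022_mpi
make -j 8 && make install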

Sorry to add more confusion. Mine was not the same error as in this post; I was trying to run Hamiltonian replica exchange (with external MPI).
I tried several combinations again on my Ubuntu 22.04 machine (GCC 11.4.0) with CUDA 11.8 and OpenMPI 4.1.6, and they all work fine for Hamiltonian replica exchange; I cannot reproduce my previous error:
plumed 2.8.2, gmx 2021.7
plumed 2.8.2, gmx 2022.5
plumed 2.8.3, gmx 2021.7
plumed 2.8.3, gmx 2022.5