NVIDIA Ada GPUs (5070 Ti and 5080) are not used during simulation in GROMACS 2024.4

GROMACS version: 2024.4
GROMACS modification: No

I have access to two NVIDIA GPUs (5070 Ti and 5080) with driver version 570.124.04 and CUDA version 12.8.

I’m trying to build GROMACS 2024.4 with CUDA GPU support on Debian Linux using the following commands:

tar xfz gromacs-2024.4.tar.gz
cd gromacs-2024.4
mkdir build
cd build
cmake .. -DGMX_BUILD_OWN_FFTW=ON -DREGRESSIONTEST_DOWNLOAD=ON
cmake .. -DGMX_GPU=CUDA -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda
make -j 20
make check
sudo make install
source /usr/local/gromacs/bin/GMXRC

During installation I get no errors, and I also see in the output that GROMACS tests several CUDA architecture numbers (e.g. 80), each reporting success. I also see success for CUDA architecture 90, which I think belongs to Ada, so it seems that the GPU is recognized.

nvcc --version works and gives me this output:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Wed_Jan_15_19:20:09_PST_2025
Cuda compilation tools, release 12.8, V12.8.61
Build cuda_12.8.r12.8/compiler.35404655_0

But when I run a simulation, my GPU isn’t recognized and GROMACS uses the CPU instead. Specifically, I get this:

Command line:
gmx mdrun -v -deffnm nvt

WARNING: An error occurred while sanity checking device #0. An unhandled error from a previous CUDA operation was detected. CUDA error #209 (cudaErrorNoKernelImageForDevice): no kernel image is available for execution on the device.

Reading file nvt.tpr, VERSION 2024.4 (single precision)
Changing nstlist from 10 to 50, rlist from 1.2 to 1.279

Using 32 MPI threads
Using 1 OpenMP thread per tMPI thread

WARNING: This run will generate roughly 2303 Mb of data

starting mdrun 'Mixed system with molecule A and B in water'
2500000 steps, 5000.0 ps.
step 700, will finish Mon Oct 27 15:45:02 2025^Cl 0.83 imb F 6% pme/F 0.47

Received the INT signal, stopping within 200 steps

step 850, will finish Mon Oct 27 15:54:33 2025vol 0.78 imb F 5% pme/F 0.47

Dynamic load balancing report:
DLB was turned on during the run due to measured imbalance.
Average load imbalance: 8.3%.
The balanceable part of the MD step is 86%, load imbalance is computed from this.
Part of the total run time spent waiting due to load imbalance: 7.1%.
Steps where the load balancing was limited by -rdd, -rcon and/or -dds: Y 0 % Z 0 %
Average PME mesh/force load: 0.474
Part of the total run time spent waiting due to PP/PME imbalance: 12.3 %

NOTE: 7.1 % of the available CPU time was lost due to load imbalance
in the domain decomposition.
You can consider manually changing the decomposition (option -dd);
e.g. by using fewer domains along the box dimension in which there is
considerable inhomogeneity in the simulated system.
NOTE: 12.3 % performance was lost because the PME ranks
had less work to do than the PP ranks.
You might want to decrease the number of PME ranks
or decrease the cut-off and the grid spacing.

               Core t (s)   Wall t (s)        (%)
       Time:      195.187        6.100     3199.7
                 (ns/day)    (hour/ns)
Performance:       24.106        0.996

Then I also tried this: cmake .. -DGMX_GPU=CUDA -DCMAKE_CUDA_ARCHITECTURES=90

But again I get the same problem during simulation: the GPU is not recognized and GROMACS uses the CPU instead. Could you help me?

Best regards

I think this is answered here.

Try with GROMACS 2024.6 or one of the 2025.x versions!
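
If you want to confirm the mismatch before rebuilding, here is a rough sketch that compares the GPU's compute capability with the targets the binary was compiled for. As far as I know, the RTX 5070 Ti and 5080 are Blackwell cards that report compute capability 12.0, while architecture 90 is Hopper and Ada is 89, so a successful test for 90 would not by itself cover these GPUs. The helper names (cc_to_sm, list_built_sms, has_sm) are made up for illustration, and I am assuming that `gmx --version` prints a "CUDA compiler flags" line containing `sm_NN` entries and that your driver's `nvidia-smi` supports the `compute_cap` query field. The check only looks for native `sm_NN` code and ignores any embedded PTX that the driver might JIT-compile.

```shell
# Hypothetical helpers (not part of GROMACS or CUDA) for comparing a GPU's
# compute capability with the sm_NN targets listed in a compiler-flags string.

# Convert a compute capability such as "12.0" into an SM number such as "120".
cc_to_sm() {
    printf '%s\n' "$1" | tr -d '.'
}

# Print the unique SM numbers found as sm_NN in the flags string $1.
list_built_sms() {
    printf '%s\n' "$1" | grep -o 'sm_[0-9][0-9]*' | sed 's/^sm_//' | sort -n -u
}

# Succeed if SM number $1 is among the targets found in flags string $2.
has_sm() {
    list_built_sms "$2" | grep -qx "$1"
}

# Assumed usage on the machine with the GPUs (output formats are assumptions):
#   cc=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)
#   flags=$(gmx --version 2>/dev/null | grep -i 'CUDA compiler flags')
#   has_sm "$(cc_to_sm "$cc")" "$flags" || echo "no native code for sm_$(cc_to_sm "$cc")"
```

If the last line prints a message, the build has no native kernels for the card, which would be consistent with the cudaErrorNoKernelImageForDevice warning above.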

Thanks, it works with 2024.6.
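
For anyone landing here later, a small sketch for checking that the `gmx` on your PATH really is the newer build and not a leftover 2024.4 install. The function names are hypothetical, the version comparison relies on GNU `sort -V`, and I am assuming the version appears on a "GROMACS version:" line as in the header of this post.

```shell
# Hypothetical helpers for checking the installed GROMACS version.

# Succeed if dotted version $1 is greater than or equal to $2 (GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Extract the version number from text containing a "GROMACS version:" line.
gromacs_version() {
    printf '%s\n' "$1" |
        sed -n 's/^[[:space:]]*GROMACS version:[[:space:]]*\([0-9][0-9.]*\).*/\1/p' |
        head -n 1
}

# Assumed usage after sourcing GMXRC:
#   ver=$(gromacs_version "$(gmx --version 2>/dev/null)")
#   version_ge "$ver" 2024.6 || echo "gmx is still $ver; check which GMXRC is sourced"
```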