GROMACS version: 2025.3
GROMACS modification: No
I am having an issue running GROMACS with an RTX 5070 Ti and CUDA 13.
GPU support: CUDA
NBNxM GPU setup: super-cluster 2x2x2 / cluster 8 (cluster-pair splitting on)
GPU FFT library: cuFFT
Multi-GPU FFT: none
CUDA compiler: /usr/local/cuda-13.0/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2025 NVIDIA Corporation;Built on Wed_Aug_20_01:58:59_PM_PDT_2025;Cuda compilation tools, release 13.0, V13.0.88;Build cuda_13.0.r13.0/compiler.36424714_0
CUDA compiler flags: -O3 -DNDEBUG
CUDA driver: 13.0
CUDA runtime: 13.0
NVIDIA GPU Status:
GPU 0: NVIDIA GeForce RTX 5070 Ti
CUDA Driver Version:
580.65.06
I compiled GROMACS against the CUDA 13 toolkit in my environment and am using that build.
Ubuntu 24.4 x64
I am running these commands:
Command line:
gmx pdb2gmx -f protein_raw.pdb -o protein.gro -p topol.top -i posre.itp -ff amber99sb-ildn -water tip3p -ignh
Command line:
gmx grompp -f minim.mdp -c solv.gro -p topol.top -o ions.tpr -maxwarn 2
Command line:
gmx genion -s ions.tpr -o solv_ions.gro -p topol.top -pname NA -nname CL -neutral -conc 0.15
Command line:
gmx mdrun -deffnm nvt -ntmpi 1 -ntomp 8 -nb gpu -pme gpu -bonded gpu -update gpu
Reading file nvt.tpr, VERSION 2025.3 (single precision)
Changing nstlist from 10 to 100, rlist from 1 to 1.166
1 GPU selected for this run.
Mapping of GPU IDs to the 2 GPU tasks in the 1 rank on this node:
PP:0,PME:0
PP tasks will do (non-perturbed) short-ranged and most bonded interactions on the GPU
PP task will update and constrain coordinates on the GPU
PME tasks will do all aspects on the GPU
Using 1 MPI thread
Using 8 OpenMP threads
NOTE: The number of threads is not equal to the number of (logical) cpus
and the -pin option is set to auto: will not pin threads to cpus.
This can lead to significant performance degradation.
Consider using -pin on (and -pinoffset in case you run multiple jobs).
starting mdrun ‘Protein in water’
50000 steps, 100.0 ps.
Program: gmx mdrun, version 2025.3
Source file: src/gromacs/fft/gpu_3dfft_cufft.cu (line 59)
Fatal error:
cufftPlanMany R2C plan failure (error code 5)
For more information and tips for troubleshooting, please check the GROMACS
website at Common errors when using GROMACS - GROMACS 2025.4 documentation
================================================================================
[ERROR] Simulation exited with code 1
It fails with this error on the last command (gmx mdrun).
It only succeeds when I disable the GPU for that last step, but then it takes longer than running everything on the CPU, so the GPU is useless in this case.
CPU-only run:
Environment Setup : 1s
GPU Configuration Check : 1s
Ligand/Protein Extraction : 0s
pdb2gmx (Protein Topology) : 0s
ACPYPE (Ligand Topology) : 5s
Topology Integration : 0s
System Preparation (Box/Solvate) : 0s
Ion Placement : 0s
Energy Minimization : 3s
NVT Equilibration : 17s
NPT Equilibration : 17s
Production MD : 3s
Metrics Analysis : 0s
Plotting (XVG to PNG) : 1s
TOTAL RUNTIME : 48s
GPU enabled, safe mode:
Environment Setup : 1s
GPU Configuration Check : 14s
Ligand/Protein Extraction : 0s
pdb2gmx (Protein Topology) : 0s
ACPYPE (Ligand Topology) : 13s
Topology Integration : 0s
System Preparation (Box/Solvate) : 0s
Ion Placement : 0s
Energy Minimization : 11s
NVT Equilibration : 39s
NPT Equilibration : 39s
Production MD : 12s
Metrics Analysis : 0s
Plotting (XVG to PNG) : 1s
TOTAL RUNTIME : 2m 10s
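To make the comparison between the two timing tables easier to read, here is a minimal Python sketch that computes the per-stage slowdown of the GPU safe-mode run relative to the CPU-only run. The numbers are copied from the tables above; the dictionary and function names are only illustrative and are not part of my script.

```python
# Stage timings in seconds, copied from the two tables above.
cpu_only = {
    "Environment Setup": 1, "GPU Configuration Check": 1,
    "Ligand/Protein Extraction": 0, "pdb2gmx (Protein Topology)": 0,
    "ACPYPE (Ligand Topology)": 5, "Topology Integration": 0,
    "System Preparation (Box/Solvate)": 0, "Ion Placement": 0,
    "Energy Minimization": 3, "NVT Equilibration": 17,
    "NPT Equilibration": 17, "Production MD": 3,
    "Metrics Analysis": 0, "Plotting (XVG to PNG)": 1,
}
gpu_safe = {
    "Environment Setup": 1, "GPU Configuration Check": 14,
    "Ligand/Protein Extraction": 0, "pdb2gmx (Protein Topology)": 0,
    "ACPYPE (Ligand Topology)": 13, "Topology Integration": 0,
    "System Preparation (Box/Solvate)": 0, "Ion Placement": 0,
    "Energy Minimization": 11, "NVT Equilibration": 39,
    "NPT Equilibration": 39, "Production MD": 12,
    "Metrics Analysis": 0, "Plotting (XVG to PNG)": 1,
}


def slowdown(cpu, gpu):
    """Return {stage: gpu_seconds / cpu_seconds}, skipping stages that took 0 s on CPU."""
    return {stage: gpu[stage] / cpu[stage] for stage in cpu if cpu[stage] > 0}


if __name__ == "__main__":
    for stage, ratio in slowdown(cpu_only, gpu_safe).items():
        print(f"{stage:<28} {ratio:5.2f}x")
    total_cpu, total_gpu = sum(cpu_only.values()), sum(gpu_safe.values())
    print(f"{'TOTAL':<28} {total_gpu / total_cpu:5.2f}x ({total_cpu}s vs {total_gpu}s)")
```

The stage totals add up to the reported 48 s and 2m 10s (130 s). Note that the ACPYPE stage, which is not an mdrun call, also differs between the two runs (5 s vs 13 s).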
My Python script supports three modes, but every mode except safe mode fails, and safe mode is slower than a plain CPU run, as shown above. I am not sure what the issue is.
Safe mode runs only the non-bonded force calculations on the GPU and explicitly forces the PME (Particle Mesh Ewald) calculations to run on the CPU.
What Safe Mode Does:
- Energy Minimization & MD runs: Uses `-nb gpu -pme cpu` flags
  - `-nb gpu` → Non-bonded interactions (van der Waals, direct-space electrostatics) run on GPU
  - `-pme cpu` → PME electrostatics (reciprocal space) run on CPU
Why it exists:
- Avoids cuFFT library issues that can occur with CUDA 13.x and newer GPU architectures (RTX 40xx/50xx)
- Prevents PTX JIT compilation errors related to PME calculations
- Maximum compatibility across all CUDA versions and GPU models
Performance tradeoff:
- Slower than `compat` or `full` modes
- But guaranteed to work without runtime errors
- Still much faster than pure CPU mode since non-bonded calculations (usually the most expensive part) run on GPU
Comparison with other modes:
- `safe`: `-nb gpu -pme cpu` (both EM and MD)
- `compat` (default): `-nb gpu -pme cpu` (EM), `-nb gpu -pme gpu` (MD)
- `full`: `-nb gpu -bonded gpu -pme gpu -update gpu` (everything on GPU)
The script recommends using compat mode for the Fedora build, which uses safe settings for energy minimization but enables GPU PME for production MD to balance stability and performance.
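For reference, the mode-to-flag mapping described above can be sketched roughly as follows. This is a simplified illustration, not the actual script: `MODE_FLAGS` and `mdrun_command` are hypothetical names, the `-ntmpi 1 -ntomp 8` values are taken from the failing command in the log, and applying the `full` flag string to both EM and MD is an assumption, since the description only says "everything on GPU".

```python
import shlex

# Flag strings copied from the mode comparison above.
# Assumption: "full" uses the same flags for EM and MD; the description does
# not split it by stage, and mdrun may not accept every flag for minimization.
MODE_FLAGS = {
    "safe":   {"em": "-nb gpu -pme cpu", "md": "-nb gpu -pme cpu"},
    "compat": {"em": "-nb gpu -pme cpu", "md": "-nb gpu -pme gpu"},
    "full":   {"em": "-nb gpu -bonded gpu -pme gpu -update gpu",
               "md": "-nb gpu -bonded gpu -pme gpu -update gpu"},
}


def mdrun_command(mode, stage, deffnm, ntmpi=1, ntomp=8):
    """Build the gmx mdrun argument list for a mode ('safe'/'compat'/'full') and stage ('em'/'md')."""
    flags = MODE_FLAGS[mode][stage].split()
    return ["gmx", "mdrun", "-deffnm", deffnm,
            "-ntmpi", str(ntmpi), "-ntomp", str(ntomp), *flags]


if __name__ == "__main__":
    # Print the command each mode would use for the NVT step, without running it.
    for mode in MODE_FLAGS:
        print(f"{mode:<7}", shlex.join(mdrun_command(mode, "md", "nvt")))
```

With this mapping, only `safe` keeps PME on the CPU for the MD stages; `compat` and `full` both pass `-pme gpu`, which is the path that reaches the cuFFT plan creation reported in the error.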
What might be the issue?