Illegal memory access error #700 CUDA

GROMACS version: 2020.2 / 2025.3
GROMACS modification: No

I am modeling the following system: water (SPC/E) + oil (different hydrocarbons) + surfactants + silica nanoparticles (NPs) with the following procedure: energy minimization → short NVT (200 ps) → short NPT (1 ns) → production NVT (40 ns). I am running about 40 simulations varying surfactant and silica NP concentrations.

I am encountering the following error on my PC (GROMACS 2025.3):
“Fatal error: cudaFreeHost failed CUDA error #700 (cudaErrorIllegalAccess): an illegal memory access was encountered”.

When I run the same simulations on an HPC GPU node (GROMACS 2020.2), the following error appears:
“Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffe01437e50)”.

I cannot make sense of these errors: sometimes they occur at the very beginning of the simulation (during the short NVT stage), and sometimes only after 3–5 ns of the production run. I would suspect an instability in the way the system was assembled, but this irregular timing makes me unsure of that explanation.

For some simulations, it helped to move the PME calculation to the CPU, leaving only the nonbonded (NB) interactions on the GPU. But for similar systems that differ only in the concentration of surfactants and nanoparticles, this did not work for some reason, and the same errors still appear.
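For reference, the task assignment described above can be sketched as a small Python helper that builds the `gmx mdrun` command line. The `-nb` and `-pme` options are standard `mdrun` flags; the helper name `mdrun_command` and the run name `nvt_prod` are hypothetical and only serve the illustration. The command is printed here, not executed.

```python
import shlex


def mdrun_command(deffnm, pme_on_gpu=False):
    """Build a gmx mdrun argument list with explicit GPU/CPU task assignment.

    deffnm is the common file-name prefix of the run (hypothetical here).
    Nonbonded interactions always go to the GPU; PME goes to the CPU
    unless pme_on_gpu is True.
    """
    return [
        "gmx", "mdrun",
        "-deffnm", deffnm,
        "-nb", "gpu",                            # short-range nonbonded on the GPU
        "-pme", "gpu" if pme_on_gpu else "cpu",  # long-range PME on the CPU by default
    ]


cmd = mdrun_command("nvt_prod")  # "nvt_prod" is a placeholder run name
print(shlex.join(cmd))
```

Keeping the command as a list makes it easy to pass to `subprocess.run` later, or to flip `pme_on_gpu` when comparing the two setups across many runs.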

Unfortunately, none of the solutions to this problem that I managed to find helped me. I would be very grateful for any advice.

The silica NPs were parametrized with the ClayFF force field; for the other molecules, I used OPLS-AA, so I manually created a file with Lorentz-Berthelot mixing rules for the non-bonded cross-interactions.
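To make the mixing rule concrete, here is a minimal sketch of how Lorentz-Berthelot cross terms can be computed: an arithmetic mean for sigma and a geometric mean for epsilon. The atom type names and parameter values are placeholders, not the ClayFF or OPLS-AA values used in these simulations. The output line assumes a `[ nonbond_params ]` entry given as sigma/epsilon (function type 1), which as far as I understand requires combination rule 2 or 3 in the topology `[ defaults ]`.

```python
import math


def lorentz_berthelot(sigma_i, eps_i, sigma_j, eps_j):
    """Return (sigma_ij, eps_ij) from the Lorentz-Berthelot mixing rules."""
    sigma_ij = 0.5 * (sigma_i + sigma_j)  # Lorentz: arithmetic mean of sigma
    eps_ij = math.sqrt(eps_i * eps_j)     # Berthelot: geometric mean of epsilon
    return sigma_ij, eps_ij


def nonbond_params_line(type_i, type_j, sigma_ij, eps_ij):
    """Format one cross-interaction line (func 1 = Lennard-Jones)."""
    return f"{type_i:<8s} {type_j:<8s} 1  {sigma_ij:.6e}  {eps_ij:.6e}"


# Placeholder atom types: name -> (sigma in nm, epsilon in kJ/mol).
types = {"TYPE_A": (0.30, 0.40), "TYPE_B": (0.40, 0.90)}

sigma_ab, eps_ab = lorentz_berthelot(*types["TYPE_A"], *types["TYPE_B"])
print("[ nonbond_params ]")
print("; i      j      func  sigma (nm)    epsilon (kJ/mol)")
print(nonbond_params_line("TYPE_A", "TYPE_B", sigma_ab, eps_ab))
```

Generating the cross terms from a single table of per-type parameters, instead of typing them by hand, reduces the chance of a transcription error in one pair going unnoticed.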

My first guess would indeed be instability in your system. We try to avoid such crashes as much as possible, but to get good performance we can’t avoid all of them.

So you also got this error when running PME on the CPU?

Dear Berk,

For some simulations (about a third of them), running PME on the CPU helped and they finished successfully without any issues, but for the others the same error occurred.

I have carefully studied my systems again and noticed that some water molecules were trapped inside the silica NP structure after solvation. After removing these trapped molecules, the problem was solved.

So it was most likely system instability caused by these trapped molecules. It is interesting, however, that there were no other warnings or suspicious energies that would have pointed to the problem.
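For anyone hitting the same problem, a check for water trapped inside a nanoparticle can be sketched roughly as below. This is only an illustration under strong assumptions, not the procedure actually used here: the NP is treated as a roughly spherical solid that is whole in the box (not split across periodic boundaries), coordinates are plain (x, y, z) tuples in nm, and the function name `find_trapped_waters` and the `margin` value are hypothetical.

```python
import math


def centroid(points):
    """Geometric center of a list of (x, y, z) tuples."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


def find_trapped_waters(np_atoms, water_oxygens, margin=0.2):
    """Return indices of water oxygens lying well inside the NP.

    The NP radius is taken as the largest distance of any NP atom from the
    NP center; a water counts as trapped if its oxygen is closer to the
    center than (radius - margin). Periodic images are NOT considered.
    """
    center = centroid(np_atoms)
    radius = max(math.dist(center, atom) for atom in np_atoms)
    cutoff = radius - margin
    return [i for i, ow in enumerate(water_oxygens)
            if math.dist(center, ow) < cutoff]


# Toy data: six "NP atoms" on a sphere of radius 1.0 nm around (2, 2, 2).
np_atoms = [(3, 2, 2), (1, 2, 2), (2, 3, 2), (2, 1, 2), (2, 2, 3), (2, 2, 1)]
water_oxygens = [
    (2.0, 2.0, 2.1),  # deep inside the NP
    (2.0, 2.0, 3.5),  # in the bulk
    (2.9, 2.0, 2.0),  # near the surface, within the margin
]
print(find_trapped_waters(np_atoms, water_oxygens))
```

The `margin` keeps water sitting in surface roughness from being flagged; a porous or non-spherical particle would need a different criterion, for example a count of NP atoms within a short distance of each water oxygen.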