Fatal error: Unexpected cudaStreamQuery failure: an illegal memory access was encountered

GROMACS version: 2019.4
GROMACS modification: Yes/No

I ran into this fatal error when continuing a simulation with -cpi -append. After running for about 200 ps, the simulation stopped, and the error recurred when I tried to restart. Could you please help me find the cause of the problem?


The following is the information from the log …:

Restarting from checkpoint, appending to previous log file.

                  :-) GROMACS - gmx mdrun, 2019.4 (-:

Executable: /install/gromacs-2019.4/bin/gmx
Data prefix: /install/gromacs-2019.4
Working dir: /home/work/gpu/hIAPP_POPG_enlarge_dimer_highc/conf5_ff/duplicatey/npt_duplicate1
Process ID: 20605
Command line:
gmx mdrun -s npt_1500-2000ns.tpr -deffnm npt -v -cpi npt.cpt -append -maxh 12

GROMACS version: 2019.4
Precision: single
Memory model: 64 bit
MPI library: thread_mpi
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 64)
GPU support: CUDA
SIMD instructions: AVX2_256
FFT library: fftw-3.3.8-sse2-avx-avx2-avx2_128
RDTSCP usage: enabled
TNG support: enabled
Hwloc support: disabled
Tracing support: disabled
C compiler: /usr/bin/cc GNU 4.8.5
C compiler flags: -mavx2 -mfma -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
C++ compiler: /usr/bin/c++ GNU 4.8.5
C++ compiler flags: -mavx2 -mfma -std=c++11 -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
CUDA compiler: /usr/local/cuda-10.0/bin/nvcc nvcc: NVIDIA ® Cuda compiler driver;Copyright © 2005-2018 NVIDIA Corporation;Built on Sat_Aug_25_21:08:01_CDT_2018;Cuda compilation tools, release 10.0, V10.0.130
CUDA compiler flags:-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=compute_75;-use_fast_math;;; ;-mavx2;-mfma;-std=c++11;-O3;-DNDEBUG;-funroll-all-loops;-fexcess-precision=fast;
CUDA driver: 10.0
CUDA runtime: 10.0

Changing nstlist from 10 to 100, rlist from 1.6 to 1.698

Using 1 MPI thread
Using 12 OpenMP threads

1 GPU selected for this run.
Mapping of GPU IDs to the 2 GPU tasks in the 1 rank on this node:
PP tasks will do (non-perturbed) short-ranged interactions on the GPU
PME tasks will do all aspects on the GPU
Pinning threads with an auto-selected logical core stride of 1
System total charge: 0.000
Will do PME sum in reciprocal space for electrostatic interactions.

U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee and L. G. Pedersen
A smooth particle mesh Ewald method
J. Chem. Phys. 103 (1995) pp. 8577-8592
-------- -------- --- Thank You --- -------- --------

Using a Gaussian width (1/beta) of 0.51226 nm for Ewald
Potential shift: LJ r^-12: 0.000e+00 r^-6: 0.000e+00, Ewald -6.250e-06
Initialized non-bonded Ewald correction tables, spacing: 1.18e-03 size: 1357

Long Range LJ corr.: 3.5365e-04
Generated table with 1349 data points for Ewald.
Tabscale = 500 points/nm
Generated table with 1349 data points for LJ6Switch.
Tabscale = 500 points/nm
Generated table with 1349 data points for LJ12Switch.
Tabscale = 500 points/nm
Generated table with 1349 data points for 1-4 COUL.
Tabscale = 500 points/nm
Generated table with 1349 data points for 1-4 LJ6.
Tabscale = 500 points/nm
Generated table with 1349 data points for 1-4 LJ12.
Tabscale = 500 points/nm

Using GPU 8x8 nonbonded short-range kernels

Using a dual 8x4 pair-list setup updated with dynamic, rolling pruning:
outer list: updated every 100 steps, buffer 0.098 nm, rlist 1.698 nm
inner list: updated every 18 steps, buffer 0.002 nm, rlist 1.602 nm
At tolerance 0.005 kJ/mol/ps per atom, equivalent classical 1x1 list would be:
outer list: updated every 100 steps, buffer 0.259 nm, rlist 1.859 nm
inner list: updated every 18 steps, buffer 0.077 nm, rlist 1.677 nm

Initializing LINear Constraint Solver
B. Hess and H. Bekker and H. J. C. Berendsen and J. G. E. M. Fraaije
LINCS: A Linear Constraint Solver for molecular simulations
J. Comp. Chem. 18 (1997) pp. 1463-1472
-------- -------- --- Thank You --- -------- --------

The number of constraints is 16202

S. Miyamoto and P. A. Kollman
SETTLE: An Analytical Version of the SHAKE and RATTLE Algorithms for Rigid
Water Models
J. Comp. Chem. 13 (1992) pp. 952-962
-------- -------- --- Thank You --- -------- --------

Center of mass motion removal mode is Linear
We have the following groups for center of mass motion removal:
0: System
There are: 59316 Atoms

Started mdrun on rank 0 Wed Aug 5 11:25:36 2020

step 989737700: timed with pme grid 32 60 96, coulomb cutoff 1.600: 1252.9 M-cycles
step 989737900: timed with pme grid 28 52 84, coulomb cutoff 1.763: 1434.2 M-cycles
step 989738100: timed with pme grid 28 52 96, coulomb cutoff 1.720: 1371.4 M-cycles
step 989738300: timed with pme grid 28 56 96, coulomb cutoff 1.600: 1204.0 M-cycles
step 989738500: timed with pme grid 32 60 96, coulomb cutoff 1.600: 1202.8 M-cycles
step 989738700: timed with pme grid 28 56 96, coulomb cutoff 1.600: 1198.1 M-cycles

Program: gmx mdrun, version 2019.4
Source file: src/gromacs/gpu_utils/cudautils.cuh (line 251)

Fatal error:
Unexpected cudaStreamQuery failure: an illegal memory access was encountered

For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
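
For anyone triaging a similar log, here is a minimal Python sketch that pulls the PME tuning measurements and the fatal error message out of an mdrun log excerpt. The excerpt is copied from the log above; the function name `summarize_tuning` and the dictionary keys are hypothetical, chosen only for this illustration. It does not diagnose the error, it only shows where the posted log stops.

```python
import re

# Excerpt copied from the log above: PME tuning lines plus the fatal error.
LOG_EXCERPT = """\
step 989737700: timed with pme grid 32 60 96, coulomb cutoff 1.600: 1252.9 M-cycles
step 989737900: timed with pme grid 28 52 84, coulomb cutoff 1.763: 1434.2 M-cycles
step 989738100: timed with pme grid 28 52 96, coulomb cutoff 1.720: 1371.4 M-cycles
step 989738300: timed with pme grid 28 56 96, coulomb cutoff 1.600: 1204.0 M-cycles
step 989738500: timed with pme grid 32 60 96, coulomb cutoff 1.600: 1202.8 M-cycles
step 989738700: timed with pme grid 28 56 96, coulomb cutoff 1.600: 1198.1 M-cycles

Fatal error:
Unexpected cudaStreamQuery failure: an illegal memory access was encountered
"""

# Matches one "timed with pme grid" line as printed in the log above.
TUNING_RE = re.compile(
    r"step (\d+): timed with pme grid (\d+) (\d+) (\d+), "
    r"coulomb cutoff ([\d.]+): ([\d.]+) M-cycles"
)


def summarize_tuning(log_text):
    """Return (tuning_measurements, fatal_message) found in a log excerpt.

    Hypothetical helper for illustration; not part of GROMACS.
    """
    measurements = []
    for m in TUNING_RE.finditer(log_text):
        measurements.append({
            "step": int(m.group(1)),
            "grid": (int(m.group(2)), int(m.group(3)), int(m.group(4))),
            "cutoff_nm": float(m.group(5)),
            "mcycles": float(m.group(6)),
        })

    # The message is the line that follows "Fatal error:".
    fatal = None
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == "Fatal error:" and i + 1 < len(lines):
            fatal = lines[i + 1].strip()
            break
    return measurements, fatal


measurements, fatal = summarize_tuning(LOG_EXCERPT)
first, last = measurements[0], measurements[-1]
print(f"{len(measurements)} tuning measurements, "
      f"steps {first['step']} to {last['step']}")
print(f"last grid tried: {last['grid']}, cutoff {last['cutoff_nm']} nm")
print(f"fatal error: {fatal}")
```

In the excerpt as posted, the last entries before the error are PME tuning measurements. That only shows where the posted log ends and is not a diagnosis; rerunning with mdrun's `-notunepme` option is one way to check whether the tuning phase is involved at all.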


Are there any updates on this problem? I am facing the same issue in GROMACS version 2022.3.


Did you solve this issue? I am experiencing the same error and am looking for a solution.
Thank you.

I faced the same error. Did anyone find a solution?
