Error with gmx mdrun -deffnm md_0_10 -v -nb gpu

GROMACS version: 2022.3
GROMACS modification: Yes/No
What is the problem with this simulation?


gmx mdrun -deffnm md_0_10 -v -nb gpu
:-) GROMACS - gmx mdrun, 2022.3 (-:

Executable: /usr/local/gromacs/bin/gmx
Data prefix: /usr/local/gromacs
Working dir: /home/bioinfo/gromacs-2022.3/build/protein
Command line:
gmx mdrun -deffnm md_0_10 -v -nb gpu

Back Off! I just backed up md_0_10.log to ./#md_0_10.log.4#
Reading file md_0_10.tpr, VERSION 2020.1-Ubuntu-2020.1-1 (single precision)
Note: file tpx version 119, software tpx version 127
Changing nstlist from 20 to 100, rlist from 1.223 to 1.344

1 GPU selected for this run.
Mapping of GPU IDs to the 2 GPU tasks in the 1 rank on this node:
PP:0,PME:0
PP tasks will do (non-perturbed) short-ranged interactions on the GPU
PP task will update and constrain coordinates on the CPU
PME tasks will do all aspects on the GPU
Using 1 MPI thread
Using 6 OpenMP threads

Back Off! I just backed up md_0_10.xtc to ./#md_0_10.xtc.1#

Back Off! I just backed up md_0_10.edr to ./#md_0_10.edr.1#

WARNING: This run will generate roughly 8520 Mb of data

starting mdrun 'Protein in water'
50000000 steps, 100000.0 ps.
step 600: timed with pme grid 96 96 96, coulomb cutoff 1.200: 4489.2 M-cycles
step 800: timed with pme grid 80 80 80, coulomb cutoff 1.268: 4482.3 M-cycles
step 1000: timed with pme grid 72 72 72, coulomb cutoff 1.409: 4867.3 M-cycles
step 1200: timed with pme grid 64 64 64, coulomb cutoff 1.585: 4928.9 M-cycles
step 1200: the maximum allowed grid scaling limits the PME load balancing to a coulomb cut-off of 1.691
step 1400: timed with pme grid 60 60 60, coulomb cutoff 1.691: 5332.0 M-cycles
step 1600: timed with pme grid 64 64 64, coulomb cutoff 1.585: 5125.6 M-cycles
step 1800: timed with pme grid 72 72 72, coulomb cutoff 1.409: 4688.1 M-cycles
step 2000: timed with pme grid 80 80 80, coulomb cutoff 1.268: 4570.8 M-cycles
step 2200: timed with pme grid 84 84 84, coulomb cutoff 1.208: 4497.2 M-cycles
step 2400: timed with pme grid 96 96 96, coulomb cutoff 1.200: 4458.9 M-cycles
step 2600: timed with pme grid 64 64 64, coulomb cutoff 1.585: 4898.4 M-cycles
step 2800: timed with pme grid 72 72 72, coulomb cutoff 1.409: 4908.0 M-cycles
step 3000: timed with pme grid 80 80 80, coulomb cutoff 1.268: 4886.0 M-cycles
step 3200: timed with pme grid 84 84 84, coulomb cutoff 1.208: 4523.2 M-cycles
step 3400: timed with pme grid 96 96 96, coulomb cutoff 1.200: 4674.6 M-cycles
optimal pme grid 96 96 96, coulomb cutoff 1.200
step 3574800, will finish Wed Sep 21 21:56:17 2022
WARNING: GPU kernel (PME gather) failed to launch. An unhandled error from a previous CUDA operation was detected. CUDA error #999 (cudaErrorUnknown): unknown error.


Program: gmx mdrun, version 2022.3
Source file: src/gromacs/gpu_utils/devicebuffer.cuh (line 197)
Function: copyFromDeviceBuffer(ValueType*, ValueType**, size_t, size_t, const DeviceStream&, GpuApiCallBehavior, CommandEvent*) [with ValueType = gmx::BasicVector; DeviceBuffer = gmx::BasicVector*; size_t = long unsigned int; CommandEvent = void]::<lambda()>

Assertion failed:
Condition: stat == cudaSuccess
Asynchronous D2H copy failed. CUDA error #999 (cudaErrorUnknown): unknown
error.

For more information and tips for troubleshooting, please check the GROMACS website (the Common Errors page of the documentation at https://www.gromacs.org).

Perhaps your GPU is running out of memory during the PME load balancing? What hardware are you running on?
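If you are not sure, one way to check the GPU model and whether its memory fills up during the run (assuming the NVIDIA driver tools are installed) is:

nvidia-smi
nvidia-smi --query-gpu=name,memory.used,memory.total --format=csv -l 5

The second command prints the memory usage every 5 seconds, which you can watch while mdrun is running in another terminal.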

To try to diagnose the concrete error, run the following:
/PATH/TO/CUDA/compute-sanitizer gmx mdrun -deffnm md_0_10 -v -nb gpu -nsteps 10000
What does this report?

You can try passing -notunepme, which will disable the load balancing (and potentially cause a small performance loss), but it might confirm whether the issue is related to PME tuning.
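For example, keeping the same file names as your run:

gmx mdrun -deffnm md_0_10 -v -nb gpu -notunepme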



Can you tell me what the problem is? My working directory and files are not in the compute-sanitizer folder.

compute-sanitizer is part of the CUDA installation, so you need to replace /PATH/TO/CUDA/ with the path to your CUDA toolkit, e.g. /usr/local/cuda/bin/.

I believe @pszilard meant to use the compute-sanitizer program inside the folder of the same name:

$ ls -l /usr/local/cuda/compute-sanitizer/compute-sanitizer 
-rwxr-xr-x. 1 root root 7.2M May 19 23:24 /usr/local/cuda/compute-sanitizer/compute-sanitizer

(Edit: the one in the bin folder works as well, it’s just a script that launches this program).

You should use this program as a launcher for GROMACS, just as you'd sometimes use mpirun for an MPI-based GROMACS build.

Giacomo

Does that mean I should put my working directory and files into bin and then run the simulation? Is that right?

No, I mean that you should use the same command you used to run GROMACS before, but put compute-sanitizer in front of the gmx mdrun command.
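In other words, assuming your CUDA toolkit is installed in /usr/local/cuda (adjust the path if yours differs), the full command would look like:

/usr/local/cuda/bin/compute-sanitizer gmx mdrun -deffnm md_0_10 -v -nb gpu -nsteps 10000

Run it from your usual working directory, the one that contains md_0_10.tpr; nothing has to be copied into the CUDA bin folder.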