Low Performance due to low utilisation of GPU

GROMACS version: 2023
GROMACS modification: No

My desktop has an Intel(R) Core™ i9-10900K CPU @ 3.70GHz and an Nvidia RTX 4090 GPU.

This is the GROMACS version installed on my system:
gmx --version

GROMACS version: 2023
Precision: mixed
Memory model: 64 bit
MPI library: thread_mpi
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 128)
GPU support: CUDA
NB cluster size: 8
SIMD instructions: AVX2_256
CPU FFT library: fftw-3.3.8-sse2-avx-avx2-avx2_128
GPU FFT library: cuFFT
Multi-GPU FFT: none
RDTSCP usage: enabled
TNG support: enabled
Hwloc support: disabled
Tracing support: disabled
C compiler: /usr/bin/cc GNU 11.3.0
C compiler flags: -fexcess-precision=fast -funroll-all-loops -mavx2 -mfma -Wno-missing-field-initializers -O3 -DNDEBUG
C++ compiler: /usr/bin/c++ GNU 11.3.0
C++ compiler flags: -fexcess-precision=fast -funroll-all-loops -mavx2 -mfma -Wno-missing-field-initializers -Wno-cast-function-type-strict -fopenmp
BLAS library:
LAPACK library:
CUDA compiler: /usr/local/cuda-12.1/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2023 NVIDIA Corporation;Built on Tue_Feb__7_19:32:13_PST_2023;Cuda compilation tools, release 12.1, V12.1.66;Build cuda_12.1.r12.1/compiler.32415258_0
CUDA compiler flags:-std=c++17;--generate-code=arch=compute_50,code=sm_50;--generate-code=arch=compute_52,code=sm_52;--generate-code=arch=compute_60,code=sm_60;--generate-code=arch=compute_61,code=sm_61;--generate-code=arch=compute_70,code=sm_70;--generate-code=arch=compute_75,code=sm_75;--generate-code=arch=compute_80,code=sm_80;--generate-code=arch=compute_86,code=sm_86;--generate-code=arch=compute_89,code=sm_89;--generate-code=arch=compute_90,code=sm_90;-Wno-deprecated-gpu-targets;--generate-code=arch=compute_53,code=sm_53;--generate-code=arch=compute_80,code=sm_80;-use_fast_math;-Xptxas;-warn-double-usage;-Xptxas;-Werror;-D_FORCE_INLINES;-fexcess-precision=fast -funroll-all-loops -mavx2 -mfma -Wno-missing-field-initializers -Wno-cast-function-type-strict -fopenmp
CUDA driver: 12.10
CUDA runtime: 12.10

I am running an MD simulation in which I offloaded everything to the GPU:
gmx mdrun -deffnm md -nb gpu -pme gpu -bonded gpu -update gpu

Still, my GPU utilisation is less than 10% while my CPU utilisation is ~100%.

Here are the log file results.

How can I increase my GPU utilisation?

Thanks and Regards

Hi,

As you can see from the log output, the work on the CPU side that leads to the behavior you see is:

  • Pair search is taking 35% of the wall time. It appears that you have nstlist=100, so I am not sure why that is so high.
  • “Rest” time is 37%. This is time that the mdrun timing does not explicitly account for; perhaps you have some special algorithm enabled?
  • “Force” time is 25%, which is likely due to some bonded types needing the CPU kernels.
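
If you want to look at these numbers yourself, something like the following should pull the relevant parts out of the log. Treat it as a rough sketch: the function name show_mdrun_timing is made up for illustration, the default file name md.log assumes the log produced by -deffnm md, and it assumes the timing table begins at the "R E A L   C Y C L E" heading.

```shell
#!/usr/bin/env bash
# Sketch only: show_mdrun_timing is a hypothetical helper, not part of GROMACS.
# It prints every line mentioning nstlist and then the timing section of an
# mdrun log. The log path defaults to md.log (assumed from "-deffnm md").
show_mdrun_timing() {
    local log="${1:-md.log}"
    # nstlist as listed in the input parameters, plus any note that mdrun changed it
    grep -i 'nstlist' "$log"
    # everything from the cycle/time accounting heading to the end of the file
    sed -n '/R E A L  *C Y C L E/,$p' "$log"
}
```

Usage would be `show_mdrun_timing md.log` after the run has finished; the rows for pair search, force and rest in the printed table are the ones discussed above.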

We should first rule out the possibility that something else running on the CPU is interfering.
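
One quick way to check for competing processes is to list the heaviest CPU users while mdrun is running. This is a minimal sketch assuming a Linux system with the procps version of ps; the function name top_cpu_procs is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: top_cpu_procs is a hypothetical helper (Linux, procps "ps").
# It prints a header line followed by the N processes using the most CPU,
# with their PID, CPU percentage, thread count and command name.
top_cpu_procs() {
    local n="${1:-10}"
    # sort by CPU usage, highest first; awk keeps the header plus n rows
    ps -eo pid,pcpu,nlwp,comm --sort=-pcpu | awk -v n="$n" 'NR <= n + 1'
}
```

Calling `top_cpu_procs 5` during the run should show gmx at the top; any other process with a large %CPU value is a candidate for the interference.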

Can you share a complete log file?

Cheers,
Szilárd

I can’t access that file without signing in. Please allow downloads without sign-in, or upload the file here.