GPU detection deactivated

GROMACS version: 2020.2
GROMACS modification: No

Under what circumstances, other than setting the GMX_DISABLE_GPU_DETECTION environment variable, would this happen:

Running on 1 node with total 16 cores, 32 logical cores (GPU detection deactivated)

GROMACS was compiled several times with -DGMX_GPU=ON and with either CUDA or OpenCL, then run on multiple cluster nodes, some with and some without GPUs. The message about “GPU detection deactivated” appeared in all cases. When attempting to force GROMACS to use the GPUs present on a node by setting -nb gpu, GROMACS errors out with:

Fatal error:
Cannot run short-ranged nonbonded interactions on a GPU because no GPU is
detected.

Which is what one would expect, since GPU detection did not happen.

Can GROMACS be compiled in such a way that GPU detection is permanently disabled? Or is this happening at runtime?

Roman,

The message indicates that GPU detection was either disabled with -DGMX_GPU=OFF or failed at the CMake stage. Can you please post the cmake command that you used and its output?

Thanks!
Artem

Hi Artem,

Thanks for getting back to me! The compilation was actually run for me by our cluster admin. He sent me the CMake command and resulting output, which I've copied below. One thing I see right away is that he used -DGMX_CUDA_TARGET_SM=70;72 instead of -DGMX_CUDA_TARGET_SM="70;72", which resulted in the "72" being interpreted by the shell as a separate command (and not being found). As I understand it, SM_70 and SM_72 are both accurate descriptions of the physical architecture (Volta) of the GPUs (Tesla V100) installed on our GPU nodes (https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#gpu-feature-list; https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf). I believe the CMake command would have been interpreted to include only the SM_70 architecture, which by itself should not have caused a failure.

Thanks again for your help!
Roman

The CMake command was:

cmake … -DCMAKE_INSTALL_PREFIX=/modules/apps/gromacs/2020.2G -DGMX_BUILD_OWN_FFTW=ON -DREGRESSIONTEST_DOWNLOAD=ON -DGMX_THREAD_MPI=ON -DGMX_USE_RDTSCP=ON -DGMX_SIMD=AVX_256 -DGMX_GPU=ON -DCUDA_TOOLKIT_ROOT_DIR=/modules/apps/cuda/11.0.1 -DGMX_CUDA_TARGET_SM=70;72

The resulting CMake output was:

– The C compiler identification is GNU 9.2.0
– The CXX compiler identification is GNU 9.2.0
– Check for working C compiler: /modules/apps/gcc/9.2.0/bin/gcc
– Check for working C compiler: /modules/apps/gcc/9.2.0/bin/gcc – works
– Detecting C compiler ABI info
– Detecting C compiler ABI info - done
– Detecting C compile features
– Detecting C compile features - done
– Check for working CXX compiler: /modules/apps/gcc/9.2.0/bin/c++
– Check for working CXX compiler: /modules/apps/gcc/9.2.0/bin/c++ – works
– Detecting CXX compiler ABI info
– Detecting CXX compiler ABI info - done
– Detecting CXX compile features
– Detecting CXX compile features - done
– Looking for NVIDIA GPUs present in the system
– Number of NVIDIA GPUs detected: 2
– Looking for pthread.h
– Looking for pthread.h - found
– Performing Test CMAKE_HAVE_LIBC_PTHREAD
– Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
– Looking for pthread_create in pthreads
– Looking for pthread_create in pthreads - not found
– Looking for pthread_create in pthread
– Looking for pthread_create in pthread - found
– Found Threads: TRUE
– Found CUDA: /modules/apps/cuda/11.0.1 (found suitable version “11.0”, minimum required is “9.0”)
– Found OpenMP_C: -fopenmp (found version “4.5”)
– Found OpenMP_CXX: -fopenmp (found version “4.5”)
– Found OpenMP: TRUE (found version “4.5”)
– Performing Test CFLAGS_EXCESS_PREC
– Performing Test CFLAGS_EXCESS_PREC - Success
– Performing Test CFLAGS_COPT
– Performing Test CFLAGS_COPT - Success
– Performing Test CFLAGS_NOINLINE
– Performing Test CFLAGS_NOINLINE - Success
– Performing Test CXXFLAGS_EXCESS_PREC
– Performing Test CXXFLAGS_EXCESS_PREC - Success
– Performing Test CXXFLAGS_COPT
– Performing Test CXXFLAGS_COPT - Success
– Performing Test CXXFLAGS_NOINLINE
– Performing Test CXXFLAGS_NOINLINE - Success
– Looking for include file unistd.h
– Looking for include file unistd.h - found
– Looking for include file pwd.h
– Looking for include file pwd.h - found
– Looking for include file dirent.h
– Looking for include file dirent.h - found
– Looking for include file time.h
– Looking for include file time.h - found
– Looking for include file sys/time.h
– Looking for include file sys/time.h - found
– Looking for include file io.h
– Looking for include file io.h - not found
– Looking for include file sched.h
– Looking for include file sched.h - found
– Looking for include file xmmintrin.h
– Looking for include file xmmintrin.h - found
– Looking for gettimeofday
– Looking for gettimeofday - found
– Looking for sysconf
– Looking for sysconf - found
– Looking for nice
– Looking for nice - found
– Looking for fsync
– Looking for fsync - found
– Looking for _fileno
– Looking for _fileno - not found
– Looking for fileno
– Looking for fileno - found
– Looking for _commit
– Looking for _commit - not found
– Looking for sigaction
– Looking for sigaction - found
– Performing Test HAVE_BUILTIN_CLZ
– Performing Test HAVE_BUILTIN_CLZ - Success
– Performing Test HAVE_BUILTIN_CLZLL
– Performing Test HAVE_BUILTIN_CLZLL - Success
– Looking for clock_gettime in rt
– Looking for clock_gettime in rt - found
– Looking for feenableexcept in m
– Looking for feenableexcept in m - found
– Looking for fedisableexcept in m
– Looking for fedisableexcept in m - found
– Checking for sched.h GNU affinity API
– Performing Test sched_affinity_compile
– Performing Test sched_affinity_compile - Success
– Looking for include file mm_malloc.h
– Looking for include file mm_malloc.h - found
– Looking for include file malloc.h
– Looking for include file malloc.h - found
– Checking for _mm_malloc()
– Checking for _mm_malloc() - supported
– Looking for posix_memalign
– Looking for posix_memalign - found
– Looking for memalign
– Looking for memalign - not found
– Check if the system is big endian
– Searching 16 bit integer
– Looking for sys/types.h
– Looking for sys/types.h - found
– Looking for stdint.h
– Looking for stdint.h - found
– Looking for stddef.h
– Looking for stddef.h - found
– Check size of unsigned short
– Check size of unsigned short - done
– Using unsigned short
– Check if the system is big endian - little endian
– Looking for HWLOC
– Looking for hwloc – hwloc.h not found
– Looking for hwloc – lib hwloc not found
– Could NOT find HWLOC (missing: HWLOC_LIBRARIES HWLOC_INCLUDE_DIRS) (Required is at least version “1.5”)
– Looking for C++ include pthread.h
– Looking for C++ include pthread.h - found
– Atomic operations found
– Performing Test PTHREAD_SETAFFINITY
– Performing Test PTHREAD_SETAFFINITY - Success
– Adding work-around for issue compiling CUDA code with glibc 2.23 string.h
– Check for working NVCC/C++ compiler combination with nvcc ‘/modules/apps/cuda/11.0.1/bin/nvcc’
– Check for working NVCC/C++ compiler combination - works
– Checking for GCC x86 inline asm
– Checking for GCC x86 inline asm - supported
– Detected build CPU vendor - Intel
– Detected build CPU brand - Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz
– Detected build CPU family - 6
– Detected build CPU model - 85
– Detected build CPU stepping - 4
– Detected build CPU features - aes apic avx avx2 avx512f avx512cd avx512bw avx512vl clfsh cmov cx8 cx16 f16c fma hle htt intel lahf mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp rtm sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
– Enabling RDTSCP support
– Checking for 64-bit off_t
– Checking for 64-bit off_t - present
– Checking for fseeko/ftello
– Checking for fseeko/ftello - present
– Checking for SIGUSR1
– Checking for SIGUSR1 - found
– Checking for pipe support
– Checking for system XDR support
– Checking for system XDR support - present
– Performing Test C_mavx_FLAG_ACCEPTED
– Performing Test C_mavx_FLAG_ACCEPTED - Success
– Performing Test C_mavx_COMPILE_WORKS
– Performing Test C_mavx_COMPILE_WORKS - Success
– Performing Test CXX_mavx_FLAG_ACCEPTED
– Performing Test CXX_mavx_FLAG_ACCEPTED - Success
– Performing Test CXX_mavx_COMPILE_WORKS
– Performing Test CXX_mavx_COMPILE_WORKS - Success
– Enabling 256-bit AVX SIMD instructions using CXX flags: -mavx
– Detecting flags to enable runtime detection of AVX-512 units on newer CPUs
– Performing Test C_xCORE_AVX512_qopt_zmm_usage_high_FLAG_ACCEPTED
– Performing Test C_xCORE_AVX512_qopt_zmm_usage_high_FLAG_ACCEPTED - Failed
– Performing Test C_xCORE_AVX512_FLAG_ACCEPTED
– Performing Test C_xCORE_AVX512_FLAG_ACCEPTED - Failed
– Performing Test C_mavx512f_mfma_FLAG_ACCEPTED
– Performing Test C_mavx512f_mfma_FLAG_ACCEPTED - Success
– Performing Test C_mavx512f_mfma_COMPILE_WORKS
– Performing Test C_mavx512f_mfma_COMPILE_WORKS - Success
– Performing Test CXX_xCORE_AVX512_qopt_zmm_usage_high_FLAG_ACCEPTED
– Performing Test CXX_xCORE_AVX512_qopt_zmm_usage_high_FLAG_ACCEPTED - Failed
– Performing Test CXX_xCORE_AVX512_FLAG_ACCEPTED
– Performing Test CXX_xCORE_AVX512_FLAG_ACCEPTED - Failed
– Performing Test CXX_mavx512f_mfma_FLAG_ACCEPTED
– Performing Test CXX_mavx512f_mfma_FLAG_ACCEPTED - Success
– Performing Test CXX_mavx512f_mfma_COMPILE_WORKS
– Performing Test CXX_mavx512f_mfma_COMPILE_WORKS - Success
– Detecting flags to enable runtime detection of AVX-512 units on newer CPUs - -mavx512f -mfma
– Performing Test _Wno_unused_command_line_argument_FLAG_ACCEPTED
– Performing Test _Wno_unused_command_line_argument_FLAG_ACCEPTED - Success
– Performing Test _callconv___vectorcall
– Performing Test _callconv___vectorcall - Failed
– Performing Test callconv___regcall
– Performing Test callconv___regcall - Failed
– Performing Test callconv
– Performing Test callconv - Success
– The GROMACS-managed build of FFTW 3 will configure with the following optimizations: --enable-sse2;–enable-avx;–enable-avx2
– Using external FFT library - FFTW3 build managed by GROMACS
– Looking for sgemm
– Looking for sgemm - not found
– Could NOT find BLAS (missing: BLAS_LIBRARIES)
– Using GROMACS built-in BLAS.
– LAPACK requires BLAS
– A library with LAPACK API not found. Please specify library location.
– Using GROMACS built-in LAPACK.
– Checking for dlopen
– Performing Test HAVE_DLOPEN
– Performing Test HAVE_DLOPEN - Success
– Checking for dlopen - found
– Using dynamic plugins (e.g VMD-supported file formats)
– Checking for suitable VMD version
– VMD plugins not found. Path to VMD can be set with VMDDIR.
– Using default binary suffix: “”
– Using default library suffix: “”
– Could not convert sample image, ImageMagick convert can not be used. A possible way to fix it can be found here: https://alexvanderbist.com/posts/2018/fixing-imagick-error-unauthorized
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'pygments'
– Could NOT find Sphinx (missing: SPHINX_EXECUTABLE pygments) (Required is at least version “1.6.1”)
– Performing Test HAVE_NO_DEPRECATED_COPY
– Performing Test HAVE_NO_DEPRECATED_COPY - Success
– Performing Test HAS_NO_STRINGOP_TRUNCATION
– Performing Test HAS_NO_STRINGOP_TRUNCATION - Success
– Performing Test HAS_NO_UNUSED_MEMBER_FUNCTION
– Performing Test HAS_NO_UNUSED_MEMBER_FUNCTION - Success
– Performing Test HAS_NO_REDUNDANT_MOVE
– Performing Test HAS_NO_REDUNDANT_MOVE - Success
– Performing Test HAS_NO_UNUSED
– Performing Test HAS_NO_UNUSED - Success
– Performing Test HAS_NO_UNUSED_PARAMETER
– Performing Test HAS_NO_UNUSED_PARAMETER - Success
– Performing Test HAS_NO_MISSING_DECLARATIONS
– Performing Test HAS_NO_MISSING_DECLARATIONS - Success
– Performing Test HAS_NO_NULL_CONVERSIONS
– Performing Test HAS_NO_NULL_CONVERSIONS - Success
– Performing Test HAS_DECL_IN_SOURCE
– Performing Test HAS_DECL_IN_SOURCE - Failed
– Performing Test HAS_NO_CLASS_MEMACCESS
– Performing Test HAS_NO_CLASS_MEMACCESS - Success
– Check if the system is big endian
– Searching 16 bit integer
– Using unsigned short
– Check if the system is big endian - little endian
– Looking for inttypes.h
– Looking for inttypes.h - found
– Could NOT find Doxygen (missing: DOXYGEN_EXECUTABLE)
– Doxygen not found. Documentation targets will not be generated.
Downloading: http://gerrit.gromacs.org/download/regressiontests-2020.2.tar.gz
– [download 100% complete]
– Configuring done
– Generating done
– Build files have been written to: /home/hsaplakoglu_umass_edu/gromacs-2020.2/build-gpu
72: command not found
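The stray "72: command not found" at the very end confirms the quoting problem: the unquoted semicolon terminates the cmake command, and the shell then tries to execute "72" as a separate command. A minimal reproduction, with echo standing in for cmake:

```shell
# Unquoted: the shell splits at ';', so only "70" reaches the command,
# and "72" is run afterwards as a (nonexistent) separate command.
echo -DGMX_CUDA_TARGET_SM=70;72 2>/dev/null || true

# Quoted: the full "70;72" value reaches the command intact.
echo -DGMX_CUDA_TARGET_SM="70;72"
```

With the quotes, CMake receives both SM targets and can build kernels for compute capability 7.0 and 7.2.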

It just occurred to me that additional runtime information GROMACS writes to the log file might be helpful in diagnosing the problem. It’s included below.

In the meantime, I've gone through the entire compilation process myself and confirmed that there are no obvious failures reported during either the cmake or make stages. All tests, including GpuUtilsUnitTests, pass during make check. Nevertheless, GPU detection is still deactivated when I run this version. Here is the pertinent section of a log file generated by the version I compiled myself:

Command line:
gmx mdrun -nsteps 250 -v -noappend -deffnm testsystem -cpi start_from_this.cpt

GROMACS version: 2020.2
Verified release checksum is 3f718d436b1ac2d44ce97164df8a13322fc143498ba44eccfd567e20d8aaea1d
Precision: single
Memory model: 64 bit
MPI library: thread_mpi
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 64)
GPU support: CUDA
SIMD instructions: AVX2_256
FFT library: fftw-3.3.8-sse2-avx-avx2-avx2_128
RDTSCP usage: enabled
TNG support: enabled
Hwloc support: disabled
Tracing support: disabled
C compiler: /modules/apps/gcc/9.2.0/bin/gcc GNU 9.2.0
C compiler flags: -mavx2 -mfma -fexcess-precision=fast -funroll-all-loops -O3 -DNDEBUG
C++ compiler: /modules/apps/gcc/9.2.0/bin/c++ GNU 9.2.0
C++ compiler flags: -mavx2 -mfma -fexcess-precision=fast -funroll-all-loops -fopenmp -O3 -DNDEBUG
CUDA compiler: /modules/apps/cuda/11.0.1/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2020 NVIDIA Corporation;Built on Wed_May__6_19:09:25_PDT_2020;Cuda compilation tools, release 11.0, V11.0.167;Build cuda_11.0_bu.TC445_37.28358933_0
CUDA compiler flags:-std=c++14;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_72,code=sm_72;-use_fast_math;-D_FORCE_INLINES;-mavx2 -mfma -fexcess-precision=fast -funroll-all-loops -fopenmp -O3 -DNDEBUG
CUDA driver: 10.20
CUDA runtime: N/A

Running on 1 node with total 16 cores, 32 logical cores (GPU detection deactivated)
Hardware detected:
CPU info:
Vendor: Intel
Brand: Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz
Family: 6 Model: 85 Stepping: 4
Features: aes apic avx avx2 avx512f avx512cd avx512bw avx512vl clfsh cmov cx8 cx16 f16c fma hle htt intel lahf mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp rtm sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
Number of AVX-512 FMA units: 1 (AVX2 is faster w/o 2 AVX-512 FMA units)
Hardware topology: Basic
Sockets, cores, and logical processors:
Socket 0: [ 0 16] [ 1 17] [ 2 18] [ 3 19] [ 4 20] [ 5 21] [ 6 22] [ 7 23]
Socket 1: [ 8 24] [ 9 25] [ 10 26] [ 11 27] [ 12 28] [ 13 29] [ 14 30] [ 15 31]

One more update: I'm including all the CMake settings pertaining to GPU / CUDA, as viewed with ccmake after running the cmake command. Nothing jumps out at me to suggest GPU support broke while running CMake. I do see (at the end) that the two GPUs on the node where this was compiled were detected, but they weren't identified. Is this a problem?

CUDA_64_BIT_DEVICE_CODE          ON
CUDA_ATTACH_VS_BUILD_RULE_TO_C   ON
CUDA_BUILD_CUBIN                 OFF
CUDA_BUILD_EMULATION             OFF
CUDA_CUDART_LIBRARY              /modules/apps/cuda/11.0.1/lib64/libcudart.so
CUDA_CUDA_LIBRARY                /usr/lib/x86_64-linux-gnu/libcuda.so
CUDA_GENERATED_OUTPUT_DIR
CUDA_HOST_COMPILATION_CPP        ON
CUDA_HOST_COMPILER               /modules/apps/gcc/9.2.0/bin/gcc
CUDA_HOST_COMPILER_OPTIONS       -D_FORCE_INLINES
CUDA_NVCC_EXECUTABLE             /modules/apps/cuda/11.0.1/bin/nvcc
CUDA_NVCC_FLAGS
CUDA_NVCC_FLAGS_DEBUG
CUDA_NVCC_FLAGS_MINSIZEREL
CUDA_NVCC_FLAGS_RELEASE
CUDA_NVCC_FLAGS_RELWITHDEBINFO
CUDA_PROPAGATE_HOST_FLAGS        ON
CUDA_SDK_ROOT_DIR                CUDA_SDK_ROOT_DIR-NOTFOUND
CUDA_SEPARABLE_COMPILATION       OFF
CUDA_TOOLKIT_INCLUDE             /modules/apps/cuda/11.0.1/include
CUDA_USE_STATIC_CUDA_RUNTIME     ON
CUDA_VERBOSE_BUILD               OFF
CUDA_VERSION                     11.0
CUDA_cublas_LIBRARY              /modules/apps/cuda/11.0.1/lib64/libcublas.so
CUDA_cudadevrt_LIBRARY           /modules/apps/cuda/11.0.1/lib64/libcudadevrt.a
CUDA_cudart_static_LIBRARY       /modules/apps/cuda/11.0.1/lib64/libcudart_static.a
CUDA_cufft_LIBRARY               /modules/apps/cuda/11.0.1/lib64/libcufft.so
CUDA_cupti_LIBRARY               /modules/apps/cuda/11.0.1/extras/CUPTI/lib64/libcupti.so
CUDA_curand_LIBRARY              /modules/apps/cuda/11.0.1/lib64/libcurand.so
CUDA_cusolver_LIBRARY            /modules/apps/cuda/11.0.1/lib64/libcusolver.so
CUDA_cusparse_LIBRARY            /modules/apps/cuda/11.0.1/lib64/libcusparse.so
CUDA_nppc_LIBRARY                /modules/apps/cuda/11.0.1/lib64/libnppc.so
CUDA_nppial_LIBRARY              /modules/apps/cuda/11.0.1/lib64/libnppial.so
CUDA_nppicc_LIBRARY              /modules/apps/cuda/11.0.1/lib64/libnppicc.so
CUDA_nppicom_LIBRARY             CUDA_nppicom_LIBRARY-NOTFOUND
CUDA_nppidei_LIBRARY             /modules/apps/cuda/11.0.1/lib64/libnppidei.so
CUDA_nppif_LIBRARY               /modules/apps/cuda/11.0.1/lib64/libnppif.so
CUDA_nppig_LIBRARY               /modules/apps/cuda/11.0.1/lib64/libnppig.so
CUDA_nppim_LIBRARY               /modules/apps/cuda/11.0.1/lib64/libnppim.so
CUDA_nppist_LIBRARY              /modules/apps/cuda/11.0.1/lib64/libnppist.so
CUDA_nppisu_LIBRARY              /modules/apps/cuda/11.0.1/lib64/libnppisu.so
CUDA_nppitc_LIBRARY              /modules/apps/cuda/11.0.1/lib64/libnppitc.so
CUDA_npps_LIBRARY                /modules/apps/cuda/11.0.1/lib64/libnpps.so
CUDA_rt_LIBRARY                  /usr/lib/x86_64-linux-gnu/librt.so

GMX_CLANG_CUDA                   OFF
GMX_COMPILER_WARNINGS            OFF
GMX_COOL_QUOTES                  ON
GMX_CUDA_NB_SINGLE_COMPILATION   OFF
GMX_CUDA_TARGET_SM               70;72

GMX_DETECT_GPU_AVAILABLE         ON
GMX_DETECT_GPU_COUNT             2
GMX_DETECT_GPU_INFO              GPU 0: Unknown;GPU 0: Unknown

Setting CUDA_VISIBLE_DEVICES="" (or to a non-existent device) will also result in something like

Running on 1 node with total 40 cores, 80 logical cores (GPU detection deactivated)

We had a typo in one of our bash startup scripts that resulted in CUDA_VISIBLE_DEVICES getting set to some strange value, which caused this issue.

This may or may not be your issue but I just wanted to mention it since it hasn’t been mentioned before in this thread. Just in case it’ll help someone in the future 😀
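A quick way to rule this out (a hypothetical sketch; the function name is mine) is to check the variable before launching mdrun, since an empty or bogus value hides every GPU from the CUDA runtime:

```shell
# Report whether CUDA_VISIBLE_DEVICES is set; an empty string hides all GPUs.
check_cuda_visible_devices() {
  if [ -z "${CUDA_VISIBLE_DEVICES+x}" ]; then
    echo "CUDA_VISIBLE_DEVICES is unset (all GPUs visible)"
  else
    echo "CUDA_VISIBLE_DEVICES='${CUDA_VISIBLE_DEVICES}'"
  fi
}
check_cuda_visible_devices
```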

I believe I've resolved this issue, and my conclusion is that GROMACS could really help users out by being more explicit about what went wrong here. I know that's not always possible, but in this case "GPU detection deactivated" was actually quite misleading.

Just to see what would happen, I compiled 2018.8 with the same options on the same hardware, and got this when I ran it with the same runtime options (the important parts are the CUDA driver and runtime lines and the NOTE below them):

CUDA compiler: /modules/apps/cuda/11.0.1/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2020 NVIDIA Corporation;Built on Wed_May__6_19:09:25_PDT_2020;Cuda compilation tools, release 11.0, V11.0.167;Build cuda_11.0_bu.TC445_37.28358933_0

CUDA driver: 10.20
CUDA runtime: 32.47

NOTE: Detection of GPUs failed. The API reported:
GROMACS cannot run tasks on a GPU.

This time the CUDA runtime was loaded and GPU detection was attempted, but failed. A Google search for "gromacs detection of GPUs failed" led me to this answer from the old mailing list: https://www.mail-archive.com/gromacs.org_gmx-users@maillist.sys.kth.se/msg36254.html, specifically:

CUDA compiler: /usr/local/cuda-9.2/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2018 NVIDIA Corporation;Built on Wed_Apr_11_23:16:29_CDT_2018;Cuda compilation tools, release 9.2, V9.2.88

CUDA driver: 9.10
CUDA runtime: 32.64

You can not run a program compiled with CUDA 9.2 on a system with a driver
labeled “CUDA 9.1” compatible. Either use CUDA 9.1 or upgrade your NVIDIA
drivers.

After I compiled with CUDA 10.1 instead of 11.0, both 2018.8 and 2020.2 versions were able to detect and use the GPUs.

Since CMake was both supplied with the CUDA version:

– Found CUDA: /modules/apps/cuda/11.0.1 (found suitable version “11.0”, minimum required is “9.0”)

and detected the GPUs:

– Looking for NVIDIA GPUs present in the system
– Number of NVIDIA GPUs detected: 2

it seems like the CUDA driver / CUDA toolkit incompatibility could have been detected at that stage. It certainly became known at run time, but, unlike 2018, the 2020 version appears to give up silently on loading the CUDA runtime and detecting GPUs:

CUDA driver: 10.20
CUDA runtime: N/A

Running on 1 node with total 16 cores, 32 logical cores (GPU detection deactivated)

which seems like the opposite of the desired behavior. Perhaps it falls through a switch statement to a catch-all case that abandons GPU detection? I can't imagine this behavior was intended. In any case, since the 2020 version already changes its loading / detection behavior in light of the mismatch, maybe that change can be turned into a notification?
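As an aside, the odd-looking "10.20" makes sense once you know CUDA reports versions as a single integer (major*1000 + minor*10). GROMACS seems to format that integer as <v/1000>.<v%100>; this decoding is my inference from the log output, not confirmed against the source:

```shell
# Decode CUDA's integer version encoding the way the "CUDA driver: 10.20"
# log line suggests: 10020 is driver 10.2, 11000 is toolkit 11.0.
decode_cuda_version() {
  printf '%d.%d\n' "$(( $1 / 1000 ))" "$(( $1 % 100 ))"
}
decode_cuda_version 10020
decode_cuda_version 11000
```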

I am facing a similar problem with the 2020 versions. I have tried installing with combinations of CUDA 9, 10, and 11 and different GCC versions (5.5, 7.3, and 7.5), but with no luck: every time I get "GPU detection deactivated". Can you provide the specific software versions you used to compile? I was able to use GPUs with my 2019 installations, built with the same parameters as 2020.

I found that a GROMACS 2020 installation built with the same parameters does detect 2 x RTX 2080 Ti on a node with 32 Xeon Gold 6130 cores (64 threads), which is very strange. The only difference is the driver version: 384.98 on the 1080 Ti node versus 410.93 on the 2080 Ti node. I even tried compiling on a Skylake node with SIMD set to AVX_256, but it still reports "GPU detection deactivated" on the Broadwell nodes.

Please find my cmake command below.

My machine has an E5-2667 v4 Xeon (16 cores, 32 threads) with 2 x 1080 Ti cards.

cmake … -DGMX_GPU=on -DGMX_FFT_LIBRARY=fftw3 -DCMAKE_INSTALL_PREFIX="/apps/gromacs/2020.3/avx-256/" -DCMAKE_C_COMPILER="/apps/compilers/gcc/7.5/bin/gcc" -DCMAKE_CXX_COMPILER="/apps/compilers/gcc/7.5/bin/c++" -DBUILD_SHARED_LIBS=OFF -DCMAKE_PREFIX_PATH="/apps/libs/fftw/3.3.8"

Any help or input toward solving the problem would be appreciated.

In my case the issue was that version of CUDA against which I compiled was ahead of what was supported by the CUDA driver installed for the GPUs (I was compiling against CUDA 11.0, while the driver was 10.2). The problem for me with 2020 was that I couldn’t tell what kind of problem I was having because the message about GPU detection was misleading (it hadn’t been deactivated, it had failed). I got a more informative message with 2018, which allowed me to resolve the problem for both 2018 and 2020. I would guess the same is true for you – GPU detection is failing.

It might help if you provide the hardware detection section of a log file produced by mdrun, specifically the "GPU support", "CUDA compiler", "CUDA driver", and "CUDA runtime" lines (assuming you are using CUDA). Those helped me diagnose my problem.
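Something like this pulls the relevant lines out of a log (the log fragment below is made up for illustration; point grep at your real .log file):

```shell
# Hypothetical mdrun log fragment standing in for a real log file.
cat > mdrun_example.log <<'EOF'
GROMACS version:    2020.2
GPU support:        CUDA
CUDA compiler:      /modules/apps/cuda/11.0.1/bin/nvcc
CUDA driver:        10.20
CUDA runtime:       N/A
EOF

# Extract just the GPU-related header lines.
grep -E '^(GPU support|CUDA (compiler|driver|runtime)):' mdrun_example.log
```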

Thanks for looking into this and providing all the details for both 2020 and 2018! I opened an issue on GitLab to add more diagnostic info: