Failed to detect a default CUDA architecture

GROMACS version: 2024.2
GROMACS modification: No

Hello, I am facing the issue reported here: Failing to detect a default CUDA architecture (#4863) · Issues · GROMACS / GROMACS · GitLab

How I compile:

#!/usr/bin/env bash

# enable the GCC toolset (gcc/g++ 12.2.1)
. /opt/rh/gcc-toolset-12/enable

# add OpenMPI 5.0.3 to PATH and LD_LIBRARY_PATH
export PATH=$PATH:/NAS/software/molsim/openmpi/5.0.3/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/NAS/software/molsim/openmpi/5.0.3/lib

# CUDA and cuFFTMp paths from the NVIDIA HPC SDK, plus the install prefix
HPCSDK=/opt/nvidia/hpc_sdk
HPCSDK_LIBDIR=$HPCSDK/Linux_x86_64/23.5/math_libs/12.1
NVSHMEM_HOME=$HPCSDK/Linux_x86_64/23.5/comm_libs/12.1/nvshmem_cufftmp_compat  # defined here but not passed to cmake below
INSTALL=/NAS/software/molsim/gromacs-2024.2/gcc--12.2.1__openmpi--5.0.3__cuda--12.1/default

# configure from a clean build tree
cd $HOME/src/gromacs-2024.2
rm -rf ./build/*
cd build
echo $PWD

cmake .. \
        -DCMAKE_PREFIX_PATH=/NAS/software/molsim \
        -DREGRESSIONTEST_DOWNLOAD=OFF \
        -DGMX_BUILD_OWN_FFTW=off \
        -DGMX_FFT_LIBRARY=fftw3 \
        -DGMX_DOUBLE=off \
        -DGMX_MPI=on \
        -DGMX_SIMD=AVX2_256 \
        -DCMAKE_C_COMPILER=gcc \
        -DCMAKE_CXX_COMPILER=g++ \
        -DGMX_GPU=CUDA \
        -DCUDA_TOOLKIT_ROOT_DIR=$HPCSDK/Linux_x86_64/23.5/cuda/12.1 \
        -DGMX_USE_CUFFTMP=ON \
        -DcuFFTMp_ROOT=$HPCSDK_LIBDIR \
        -DCMAKE_INSTALL_PREFIX=$INSTALL \
        -DBUILD_SHARED_LIBS=on

# build, test, and install steps (currently commented out, since configure fails)
#make -j 24
#make check
#make test
#make install
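
For reference, this is how I would sanity-check the nvcc binary by hand (a quick sketch, not something CMake runs verbatim; /tmp/probe.cu is just a throwaway test file):

NVCC=/opt/nvidia/hpc_sdk/Linux_x86_64/23.5/cuda/12.1/bin/nvcc

# confirm nvcc runs and reports the expected 12.1 toolkit
$NVCC --version

# compile a trivial kernel with -v to see which host compiler nvcc picks up
cat > /tmp/probe.cu <<'EOF'
__global__ void probe() {}
int main() { return 0; }
EOF
$NVCC -v -c /tmp/probe.cu -o /tmp/probe.o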

Output:

/NAS/USERS/e083475/src/gromacs-2024.2/build
-- The C compiler identification is GNU 12.2.1
-- The CXX compiler identification is GNU 12.2.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /opt/rh/gcc-toolset-12/root/usr/bin/gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/rh/gcc-toolset-12/root/usr/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Could NOT find Python3 (missing: Python3_INCLUDE_DIRS Python3_LIBRARIES Development Development.Module Development.Embed) (found suitable version "3.11.5", minimum required is "3.7")
-- Selected GPU FFT library - cuFFT
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- Performing Test CFLAGS_WARN_NO_MISSING_FIELD_INITIALIZERS
-- Performing Test CFLAGS_WARN_NO_MISSING_FIELD_INITIALIZERS - Success
-- Performing Test CFLAGS_EXCESS_PREC
-- Performing Test CFLAGS_EXCESS_PREC - Success
-- Performing Test CFLAGS_COPT
-- Performing Test CFLAGS_COPT - Success
-- Performing Test CFLAGS_NOINLINE
-- Performing Test CFLAGS_NOINLINE - Success
-- Performing Test CXXFLAGS_WARN_NO_MISSING_FIELD_INITIALIZERS
-- Performing Test CXXFLAGS_WARN_NO_MISSING_FIELD_INITIALIZERS - Success
-- Performing Test CXXFLAGS_EXCESS_PREC
-- Performing Test CXXFLAGS_EXCESS_PREC - Success
-- Performing Test CXXFLAGS_COPT
-- Performing Test CXXFLAGS_COPT - Success
-- Performing Test CXXFLAGS_NOINLINE
-- Performing Test CXXFLAGS_NOINLINE - Success
-- Looking for include file unistd.h
-- Looking for include file unistd.h - found
-- Looking for include file pwd.h
-- Looking for include file pwd.h - found
-- Looking for include file dirent.h
-- Looking for include file dirent.h - found
-- Looking for include file time.h
-- Looking for include file time.h - found
-- Looking for include file sys/time.h
-- Looking for include file sys/time.h - found
-- Looking for include file io.h
-- Looking for include file io.h - not found
-- Looking for include file sched.h
-- Looking for include file sched.h - found
-- Looking for include file xmmintrin.h
-- Looking for include file xmmintrin.h - found
-- Looking for gettimeofday
-- Looking for gettimeofday - found
-- Looking for sysconf
-- Looking for sysconf - found
-- Looking for nice
-- Looking for nice - found
-- Looking for fsync
-- Looking for fsync - found
-- Looking for _fileno
-- Looking for _fileno - not found
-- Looking for fileno
-- Looking for fileno - found
-- Looking for _commit
-- Looking for _commit - not found
-- Looking for sigaction
-- Looking for sigaction - found
-- Performing Test HAVE_BUILTIN_CLZ
-- Performing Test HAVE_BUILTIN_CLZ - Success
-- Performing Test HAVE_BUILTIN_CLZLL
-- Performing Test HAVE_BUILTIN_CLZLL - Success
-- Looking for clock_gettime in rt
-- Looking for clock_gettime in rt - found
-- Looking for feenableexcept in m
-- Looking for feenableexcept in m - found
-- Looking for fedisableexcept in m
-- Looking for fedisableexcept in m - found
-- Checking for sched.h GNU affinity API
-- Performing Test sched_affinity_compile
-- Performing Test sched_affinity_compile - Success
-- Looking for include file mm_malloc.h
-- Looking for include file mm_malloc.h - found
-- Looking for include file malloc.h
-- Looking for include file malloc.h - found
-- Checking for _mm_malloc()
-- Checking for _mm_malloc() - supported
-- Looking for posix_memalign
-- Looking for posix_memalign - found
-- Looking for memalign
-- Looking for memalign - not found
-- MPI is not compatible with thread-MPI. Disabling thread-MPI.
-- Found MPI_CXX: /NAS/software/molsim/openmpi/5.0.3/lib/libmpi.so (found version "3.1")
-- Found MPI: TRUE (found version "3.1") found components: CXX
-- GROMACS library will use OpenMPI 5.0.3
-- Using default binary suffix: "_mpi"
-- Using default library suffix: "_mpi"
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Looking for C++ include pthread.h
-- Looking for C++ include pthread.h - found
-- Performing Test TEST_ATOMICS
-- Performing Test TEST_ATOMICS - Success
-- Atomic operations found
-- Performing Test PTHREAD_SETAFFINITY
-- Performing Test PTHREAD_SETAFFINITY - Success
-- Performing Test C_mavx2_mfma_FLAG_ACCEPTED
-- Performing Test C_mavx2_mfma_FLAG_ACCEPTED - Success
-- Performing Test C_mavx2_mfma_COMPILE_WORKS
-- Performing Test C_mavx2_mfma_COMPILE_WORKS - Success
-- Performing Test CXX_mavx2_mfma_FLAG_ACCEPTED
-- Performing Test CXX_mavx2_mfma_FLAG_ACCEPTED - Success
-- Performing Test CXX_mavx2_mfma_COMPILE_WORKS
-- Performing Test CXX_mavx2_mfma_COMPILE_WORKS - Success
-- Enabling 256-bit AVX2 SIMD instructions using CXX flags:  -mavx2 -mfma
-- Detecting flags to enable runtime detection of AVX-512 units on newer CPUs
-- Performing Test C_march_skylake_avx512_FLAG_ACCEPTED
-- Performing Test C_march_skylake_avx512_FLAG_ACCEPTED - Success
-- Performing Test C_march_skylake_avx512_COMPILE_WORKS
-- Performing Test C_march_skylake_avx512_COMPILE_WORKS - Success
-- Performing Test CXX_march_skylake_avx512_FLAG_ACCEPTED
-- Performing Test CXX_march_skylake_avx512_FLAG_ACCEPTED - Success
-- Performing Test CXX_march_skylake_avx512_COMPILE_WORKS
-- Performing Test CXX_march_skylake_avx512_COMPILE_WORKS - Success
-- Detecting flags to enable runtime detection of AVX-512 units on newer CPUs -  -march=skylake-avx512
-- Performing Test _Wno_unused_command_line_argument_FLAG_ACCEPTED
-- Performing Test _Wno_unused_command_line_argument_FLAG_ACCEPTED - Success
-- Performing Test _callconv___vectorcall
-- Performing Test _callconv___vectorcall - Failed
-- Performing Test _callconv___regcall
-- Performing Test _callconv___regcall - Failed
-- Performing Test _callconv_ 
-- Performing Test _callconv_  - Success
-- Found CUDA: /opt/nvidia/hpc_sdk/Linux_x86_64/23.5/cuda/12.1 (found suitable version "12.1", minimum required is "11.0")
-- Adding work-around for issue compiling CUDA code with glibc 2.23 string.h
-- Check for working NVCC/C++ compiler combination with nvcc '/opt/nvidia/hpc_sdk/Linux_x86_64/23.5/cuda/12.1/bin/nvcc'
-- Check for working NVCC/C++ compiler combination - works
-- Checking if nvcc accepts flags --generate-code=arch=compute_35,code=sm_35
-- Checking if nvcc accepts flags --generate-code=arch=compute_35,code=sm_35 - No
-- Checking if nvcc accepts flags --generate-code=arch=compute_37,code=sm_37
-- Checking if nvcc accepts flags --generate-code=arch=compute_37,code=sm_37 - No
-- Checking if nvcc accepts flags --generate-code=arch=compute_50,code=sm_50
-- Checking if nvcc accepts flags --generate-code=arch=compute_50,code=sm_50 - Success
-- Checking if nvcc accepts flags --generate-code=arch=compute_52,code=sm_52
-- Checking if nvcc accepts flags --generate-code=arch=compute_52,code=sm_52 - Success
-- Checking if nvcc accepts flags --generate-code=arch=compute_60,code=sm_60
-- Checking if nvcc accepts flags --generate-code=arch=compute_60,code=sm_60 - Success
-- Checking if nvcc accepts flags --generate-code=arch=compute_61,code=sm_61
-- Checking if nvcc accepts flags --generate-code=arch=compute_61,code=sm_61 - Success
-- Checking if nvcc accepts flags --generate-code=arch=compute_70,code=sm_70
-- Checking if nvcc accepts flags --generate-code=arch=compute_70,code=sm_70 - Success
-- Checking if nvcc accepts flags --generate-code=arch=compute_75,code=sm_75
-- Checking if nvcc accepts flags --generate-code=arch=compute_75,code=sm_75 - Success
-- Checking if nvcc accepts flags --generate-code=arch=compute_80,code=sm_80
-- Checking if nvcc accepts flags --generate-code=arch=compute_80,code=sm_80 - Success
-- Checking if nvcc accepts flags --generate-code=arch=compute_86,code=sm_86
-- Checking if nvcc accepts flags --generate-code=arch=compute_86,code=sm_86 - Success
-- Checking if nvcc accepts flags --generate-code=arch=compute_89,code=sm_89
-- Checking if nvcc accepts flags --generate-code=arch=compute_89,code=sm_89 - Success
-- Checking if nvcc accepts flags --generate-code=arch=compute_90,code=sm_90
-- Checking if nvcc accepts flags --generate-code=arch=compute_90,code=sm_90 - Success
-- Checking if nvcc accepts flags -Wno-deprecated-gpu-targets
-- Checking if nvcc accepts flags -Wno-deprecated-gpu-targets - Success
-- Checking if nvcc accepts flags --generate-code=arch=compute_53,code=sm_53
-- Checking if nvcc accepts flags --generate-code=arch=compute_53,code=sm_53 - Success
-- Checking if nvcc accepts flags --generate-code=arch=compute_80,code=sm_80
-- Checking if nvcc accepts flags --generate-code=arch=compute_80,code=sm_80 - Success
-- Checking if nvcc accepts flags -use_fast_math
-- Checking if nvcc accepts flags -use_fast_math - Success
-- Checking if nvcc accepts flags -Xptxas;-warn-double-usage
-- Checking if nvcc accepts flags -Xptxas;-warn-double-usage - Success
-- Checking if nvcc accepts flags -Xptxas;-Werror
-- Checking if nvcc accepts flags -Xptxas;-Werror - Success
-- The CUDA compiler identification is unknown

CMake Error at /NAS/USERS/e083475/opt/cmake-3.29.3-linux-x86_64/share/cmake-3.29/Modules/CMakeDetermineCUDACompiler.cmake:266 (message):
  Failed to detect a default CUDA architecture.



  Compiler output:

Call Stack (most recent call first):
  cmake/gmxManageCuda.cmake:116 (enable_language)
  CMakeLists.txt:697 (include)

-- Configuring incomplete, errors occurred!
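
If I read this right, the failure happens inside enable_language(CUDA) (called from cmake/gmxManageCuda.cmake), and the "Compiler output:" section of the error is empty, so nvcc apparently produced nothing during CMake's compiler-identification step. To take GROMACS out of the picture, I think the same step can be reproduced with a minimal standalone project (my sketch, assuming the same compilers as above):

mkdir -p /tmp/cuda-detect && cd /tmp/cuda-detect
cat > CMakeLists.txt <<'EOF'
cmake_minimum_required(VERSION 3.18)
project(cuda_detect LANGUAGES CXX CUDA)
EOF
cmake -B build \
        -DCMAKE_CXX_COMPILER=g++ \
        -DCMAKE_CUDA_COMPILER=/opt/nvidia/hpc_sdk/Linux_x86_64/23.5/cuda/12.1/bin/nvcc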

Do you have any idea what might be going wrong here?
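
Would explicitly pinning the CUDA compiler, host compiler, and target architectures be a reasonable workaround? From the CMake documentation I gather that setting CMAKE_CUDA_ARCHITECTURES skips the default-architecture probe, so I would try something like the following on top of my cmake call above (a sketch; 80 is a placeholder for whatever our GPUs actually need), though I am not sure it helps while the compiler identification itself is reported as "unknown":

cmake .. \
        <existing flags as above> \
        -DCMAKE_CUDA_COMPILER=$HPCSDK/Linux_x86_64/23.5/cuda/12.1/bin/nvcc \
        -DCMAKE_CUDA_HOST_COMPILER=g++ \
        -DCMAKE_CUDA_ARCHITECTURES=80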