Issue after running 'make' version 2024

GROMACS version: 2024
GROMACS modification: No
After running successfully:
cmake … -DGMX_BUILD_OWN_FFTW=ON -DREGRESSIONTEST_DOWNLOAD=ON -DCMAKE_INSTALL_PREFIX=/cm/shared/apps/gromacs2024 -DGMX_MPI=on -DGMX_GPU=CUDA GMX_FORCE_GPU_AWARE_MPI=1

then running:
make

I receive the following error. Please let me know what details I can provide about compiler versions and the like; hopefully I am just missing something easy. I am trying to install this on a 10-node Bright Computing cluster. I am currently on a node with four RTX 2080s, NVIDIA driver 530.02, and CUDA 12.1.

[ 11%] Building CXX object src/gromacs/modularsimulator/CMakeFiles/modularsimulator.dir/pullelement.cpp.o
[ 11%] Building CXX object src/gromacs/modularsimulator/CMakeFiles/modularsimulator.dir/referencetemperaturemanager.cpp.o
[ 11%] Building CXX object src/gromacs/modularsimulator/CMakeFiles/modularsimulator.dir/signallers.cpp.o
[ 11%] Building CXX object src/gromacs/modularsimulator/CMakeFiles/modularsimulator.dir/simulatoralgorithm.cpp.o
[ 11%] Building CXX object src/gromacs/modularsimulator/CMakeFiles/modularsimulator.dir/statepropagatordata.cpp.o
[ 11%] Building CXX object src/gromacs/modularsimulator/CMakeFiles/modularsimulator.dir/topologyholder.cpp.o
[ 12%] Building CXX object src/gromacs/modularsimulator/CMakeFiles/modularsimulator.dir/trajectoryelement.cpp.o
[ 12%] Building CXX object src/gromacs/modularsimulator/CMakeFiles/modularsimulator.dir/velocityscalingtemperaturecoupling.cpp.o
[ 12%] Built target modularsimulator
[ 12%] Building CXX object src/gromacs/energyanalysis/CMakeFiles/energyanalysis.dir/energyterm.cpp.o
[ 12%] Built target energyanalysis
[ 12%] Generating baseversion-gen.cpp
[ 12%] Building NVCC (Device) object src/gromacs/CMakeFiles/libgromacs.dir/domdec/libgromacs_generated_gpuhaloexchange_impl_gpu.cpp.o
[ 12%] Building NVCC (Device) object src/gromacs/CMakeFiles/libgromacs.dir/domdec/libgromacs_generated_gpuhaloexchange_impl_gpu.cu.o
[ 12%] Building NVCC (Device) object src/gromacs/CMakeFiles/libgromacs.dir/ewald/libgromacs_generated_pme_coordinate_receiver_gpu_impl_gpu.cpp.o
[ 12%] Building NVCC (Device) object src/gromacs/CMakeFiles/libgromacs.dir/ewald/libgromacs_generated_pme_force_sender_gpu_impl_gpu.cpp.o
[ 12%] Building NVCC (Device) object src/gromacs/CMakeFiles/libgromacs.dir/ewald/libgromacs_generated_pme_force_sender_gpu_impl_gpu.cu.o
[ 12%] Building NVCC (Device) object src/gromacs/CMakeFiles/libgromacs.dir/ewald/libgromacs_generated_pme_gather.cu.o
/cm/shared/apps/cuda11.3/toolkit/11.3.0/include/texture_indirect_functions.h(111): error: Internal Compiler Error (codegen): “unexpected operand in tex/surf handler”

CMake Error at libgromacs_generated_pme_gather.cu.o.Release.cmake:280 (message):
Error generating file
/tmp/gromacs-2024/build/src/gromacs/CMakeFiles/libgromacs.dir/ewald/./libgromacs_generated_pme_gather.cu.o

make[2]: *** [src/gromacs/CMakeFiles/libgromacs.dir/build.make:259: src/gromacs/CMakeFiles/libgromacs.dir/ewald/libgromacs_generated_pme_gather.cu.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:4318: src/gromacs/CMakeFiles/libgromacs.dir/all] Error 2
make: *** [Makefile:166: all] Error 2

Hi!

This error suggests that you are actually using CUDA 11.3 (which is known to be broken with GROMACS), not CUDA 12.1. Make sure you have the correct module loaded in your environment; you might also have to clear the build directory in case the CMake cache still remembers the old paths to CUDA.
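To confirm which toolkit the build will pick up, you can check from the shell. A sketch (the module names below are guesses based on the paths in your log; adjust them to whatever `module avail` shows on your cluster):

```shell
# Check which nvcc is first on PATH and which toolkit version it reports;
# the error message points at /cm/shared/apps/cuda11.3/toolkit/11.3.0
which nvcc
nvcc --version

# If it still points at the 11.3 toolkit, swap the modules
# (hypothetical module names -- adjust to your cluster)
module unload cuda11.3/toolkit
module load cuda12.1/toolkit

# Then start from a clean build directory so the CMake cache
# cannot reuse the old CUDA paths
cd /tmp/gromacs-2024
rm -rf build && mkdir build && cd build
```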

Also, it looks like you are passing GMX_FORCE_GPU_AWARE_MPI=1 to the cmake command. That will not have any effect: it is an environment variable read when GROMACS runs, not a CMake option, so it does nothing at compile time.
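For reference, a sketch of where that variable belongs: drop it from the cmake line and set it in the environment of the mdrun job instead (the rank count and -deffnm name below are just placeholders):

```shell
# At configure time, omit the variable entirely:
cmake .. -DGMX_BUILD_OWN_FFTW=ON -DREGRESSIONTEST_DOWNLOAD=ON \
      -DCMAKE_INSTALL_PREFIX=/cm/shared/apps/gromacs2024 \
      -DGMX_MPI=ON -DGMX_GPU=CUDA

# At run time, export it before launching mdrun so GROMACS
# attempts to use GPU-aware (e.g. CUDA-aware) MPI:
export GMX_FORCE_GPU_AWARE_MPI=1
mpirun -np 4 gmx_mpi mdrun -deffnm topol
```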

After I posted the message, I noticed that it was showing CUDA 11.3 and thought that seemed odd. When I run nvidia-smi, it shows version 12.1.

I am going to work on getting the 12.1 toolkit set up so I can load it as a module, but I am having a tough time finding the source code.

I will keep you updated on my progress though. Thank you for your quick response.

I was able to load the CUDA 12.1 toolkit and compile and install the software. Thank you very much for your help. This was all due to having CUDA 11.3.
