GROMACS SYCL for NVIDIA GPUs

GROMACS version: 2021

I am installing GROMACS with the Clang and Clang++ compilers (for SYCL) on a system with NVIDIA GPUs. During installation, it asks me for g++ as the C++ compiler, and clang++ fails the compiler test. Please let me know if anyone has done this type of installation.
Thank you

Hi!

GROMACS 2021 only supports Intel GPUs with SYCL.

GROMACS 2023 is known to work on NVIDIA cards via SYCL (with both Intel DPC++ and hipSYCL); however, CUDA is strongly recommended for production runs instead.

If you encounter an error with GROMACS 2023, could you please share the commands you are using and the error message?

Thank you. I tried installing GROMACS 2023 and got the following error from the SYCL build:

[ 93%] Linking CXX shared library …/…/lib/libgromacs.so

error: linking module flags 'nvvm-reflect-ftz': IDs have conflicting override values in '/tmp/clang++-d3c14f/libsycl-crt-dfad4e.cubin' and 'llvm-link'

clang++: error: sycl-link command failed with exit code 1 (use -v to see invocation)

make[2]: *** [src/gromacs/CMakeFiles/libgromacs.dir/build.make:13390: lib/libgromacs.so.8.0.0] Error 1

make[1]: *** [CMakeFiles/Makefile2:4164: src/gromacs/CMakeFiles/libgromacs.dir/all] Error 2

make: *** [Makefile:166: all] Error 2

Please share the cmake command you used and your system configuration (OS, CUDA and oneAPI/DPC++ versions, etc.). The error message by itself is not very telling.

Hi! No need to share the CMakeLists.txt, it is pretty standard.

What we need to know are the flags you used when you called cmake when building GROMACS.

cmake … -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DGMX_GPU=SYCL -DGMX_GPU_FFT_LIBRARY=none -DSYCL_CXX_FLAGS_EXTRA=-fsycl-targets=nvptx64-nvidia-cuda

Thank you, now I can reproduce your problem.

Per the Codeplay troubleshooting guide, you should add the -fdenormal-fp-math=ieee compilation flag. For GROMACS, this can be done by setting -DSYCL_CXX_FLAGS_EXTRA='-fsycl-targets=nvptx64-nvidia-cuda;-Xclang;-fdenormal-fp-math=ieee' instead of the shorter version you have.

Note 1: please add -DGMX_GPU_NB_CLUSTER_SIZE=8 to your cmake command; otherwise the GPU code will not work correctly.

Note 2: Depending on how you installed oneAPI and your CMake version, you might get the following warning during CMake execution: “To use GPU acceleration efficiently, mdrun requires OpenMP multi-threading, which is currently not enabled.” In this case, it is recommended also to add -DCMAKE_C_FLAGS=-isystem/opt/intel/oneapi/compiler/latest/linux/compiler/include/ -DCMAKE_CXX_FLAGS=-isystem/opt/intel/oneapi/compiler/latest/linux/compiler/include/ flags to cmake.

Unfortunately, SYCL is a new technology, and automatic detection of the necessary build flags does not always work smoothly yet.
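
Putting the suggestions from this post together, the configure step might look roughly like the sketch below. GMX_SRC is a hypothetical placeholder for wherever your GROMACS sources live, and the oneAPI include path assumes the default install location quoted in Note 2; the two -isystem flags are only needed if you actually see the OpenMP warning. The script only prints the arguments so you can review them before running anything.

```shell
# Hypothetical placeholder: point this at your GROMACS source directory.
GMX_SRC="${GMX_SRC:-/path/to/gromacs-source}"

# Assumed default oneAPI location (the path quoted in Note 2); adjust if yours differs.
ONEAPI_INC="/opt/intel/oneapi/compiler/latest/linux/compiler/include/"

# Collect the CMake options discussed in this thread as positional parameters,
# so the semicolon-separated SYCL flag list stays a single argument.
set -- \
  -DCMAKE_C_COMPILER=clang \
  -DCMAKE_CXX_COMPILER=clang++ \
  -DGMX_GPU=SYCL \
  -DGMX_GPU_FFT_LIBRARY=none \
  -DGMX_GPU_NB_CLUSTER_SIZE=8 \
  "-DSYCL_CXX_FLAGS_EXTRA=-fsycl-targets=nvptx64-nvidia-cuda;-Xclang;-fdenormal-fp-math=ieee" \
  "-DCMAKE_C_FLAGS=-isystem${ONEAPI_INC}" \
  "-DCMAKE_CXX_FLAGS=-isystem${ONEAPI_INC}"

# Print the invocation for review, one argument per line.
printf '%s\n' cmake "$GMX_SRC" "$@"

# To actually configure, run this from an empty build directory:
#   cmake "$GMX_SRC" "$@"
```

Keeping the options in "$@" rather than in one long string avoids the shell splitting the -DSYCL_CXX_FLAGS_EXTRA value at the semicolons.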

P.S.: I edited the topic title to make it more searchable.

Thank you so much.

I was installing it on Ubuntu 22.04 with the same flags and got this error:
-- The CXX compiler identification is IntelLLVM 2023.0.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - failed
-- Check for working CXX compiler: /opt/intel/oneapi/compiler/2023.0.0/linux/bin-llvm/clang++
-- Check for working CXX compiler: /opt/intel/oneapi/compiler/2023.0.0/linux/bin-llvm/clang++ - broken
CMake Error at /usr/share/cmake-3.22/Modules/CMakeTestCXXCompiler.cmake:62 (message):
The C++ compiler

"/opt/intel/oneapi/compiler/2023.0.0/linux/bin-llvm/clang++"

is not able to compile a simple test program.

Does it print any more info? Usually, a full error message should follow. Without it, it is hard to say anything besides “something is wrong”.

If not, you should be able to see the detailed message at the end of CMakeFiles/CMakeError.log in the build directory.
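
A small helper along these lines can pull out the relevant part of the log. The function name show_cmake_log_tail is made up for illustration, and the fallback to CMakeConfigureLog.yaml is there because newer CMake releases (3.26 and later) write that file instead of CMakeError.log.

```shell
# Print the last lines of CMake's configure-time error log for a build directory.
# Usage: show_cmake_log_tail [build_dir]   (defaults to the current directory)
show_cmake_log_tail() {
  build_dir="${1:-.}"
  # Older CMake writes CMakeError.log; CMake 3.26+ writes CMakeConfigureLog.yaml.
  for log in CMakeFiles/CMakeError.log CMakeFiles/CMakeConfigureLog.yaml; do
    if [ -f "$build_dir/$log" ]; then
      echo "== last lines of $build_dir/$log =="
      tail -n 40 "$build_dir/$log"
      return 0
    fi
  done
  echo "no CMake log found under $build_dir/CMakeFiles" >&2
  return 1
}
```

From inside the build directory, calling show_cmake_log_tail with no argument should show the failing compile or link command and the accompanying compiler or linker message.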

Linking CXX executable cmTC_a0393

/usr/bin/cmake -E cmake_link_script CMakeFiles/cmTC_a0393.dir/link.txt --verbose=1

/opt/intel/oneapi/compiler/2023.0.0/linux/bin-llvm/clang++ -isystem/opt/intel/oneapi/compiler/latest/linux/compiler/include/ CMakeFiles/cmTC_a0393.dir/testCXXCompiler.cxx.o -o cmTC_a0393

/usr/bin/ld: cannot find -lstdc++: No such file or directory

clang++: error: linker command failed with exit code 1 (use -v to see invocation)

gmake[1]: *** [CMakeFiles/cmTC_a0393.dir/build.make:100: cmTC_a0393] Error 1

gmake[1]: Leaving directory ‘/gromacs-2023.1/buildtest/CMakeFiles/CMakeTmp’

gmake: *** [Makefile:127: cmTC_a0393/fast] Error 2

Could you try installing libstdc++ with sudo apt install libstdc++-12-dev?

This is a known issue with ROCm Clang, but it might affect Intel LLVM as well.

Thank you

PI CUDA ERROR:
Value: 1
Name: CUDA_ERROR_INVALID_VALUE
Description: invalid argument
Function: cuda_piEnqueueKernelLaunch
Source Location: /root/intel-llvm-mirror/sycl/plugins/cuda/pi_cuda.cpp:3085

Could you provide a bit more context? Software versions, what and how you’re running, full logs, etc?

If the error is reproducible, it would be helpful if you ran GROMACS with the SYCL_PI_TRACE=2 environment variable set and shared the output.

Let me know if this helps.

gmx mdrun -ntmpi 2 -ntomp 1 -s /home/benchmark/gromacs/data/benchPEP.tpr -nsteps 10000 -resethway

---> piEnqueueKernelLaunch(
: 0x7fc34cefa470
: 0x7fc34de6b8e0
: 3
: 0x7fc3a492d308
: 0x7fc3a492d2d8
: 0x7fc3a492d2f0
: 0
pi_event * : 0[ nullptr ]
pi_event * : 0x7fc34dddde28[ 0 … ]

PI CUDA ERROR:
Value: 1
Name: CUDA_ERROR_INVALID_VALUE
Description: invalid argument
Function: cuda_piEnqueueKernelLaunch
Source Location: /root/intel-llvm-mirror/sycl/plugins/cuda/pi_cuda.cpp:3085

Could you please also provide the software versions and full logs (not just the last item)? It might also matter how you compiled GROMACS and what GPU you're running on.

It runs if I match the -ntmpi flag to the number of CPU cores, i.e. if I set it to 16. The system has an NVIDIA V100 GPU and 16 CPU cores. The GROMACS version is 2023.1. Also, could we enable OpenMP support for the SYCL version?

Sure. Using OpenMP with GPU builds is highly recommended. I would not expect it to be relevant to the issue you're observing, but (a) OpenMP is needed for good performance in most cases, and (b) non-OpenMP GPU builds receive very limited testing.

So, you are running a non-OpenMP build of GROMACS 2023.1 with oneAPI ??? and Codeplay plugin version ???, CUDA version ???, and it works fine when you use -ntmpi 16, but with a lower number of ranks it hangs? Could you please attach the full output when GROMACS is run with SYCL_PI_TRACE=2, like below?

SYCL_PI_TRACE=2 gmx mdrun -ntmpi 2 -ntomp 1 -s /home/benchmark/gromacs/data/benchPEP.tpr -nsteps 10000 &> full_output.log

So I can set -DGMX_OPENMP=ON and build GROMACS?