Problems with OpenCL build

GROMACS version: 2023
GROMACS modification: No

Hello all,

I have really been struggling to get my GPU to do computation during an mdrun. My current cmake build is:

cmake … -DGMX_BUILD_OWN_FFTW=ON -DGMX_GPU=OpenCL

and my CPU and GPU are:
Ryzen 7 5700G
Radeon RX6700XT

Currently my problem is two-fold. First, GROMACS lists my 6700 XT as incompatible. I realize that this is the case; however, I have found some scenarios where Navi 22 builds work, so I am interested in seeing if Navi 21 might also work. There is supposedly an environment variable that makes GROMACS ignore the GPU compatibility check, but I couldn’t figure out how to set it. GROMACS does successfully recognize my GPU.
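Is it something like this? (I’m guessing the variable name from the environment-variables section of the manual, so this may be wrong:)

```shell
# Guessed variable name (from the GROMACS environment-variables docs);
# exported so that a subsequent mdrun in the same shell would see it:
export GMX_OCL_DISABLE_COMPATIBILITY_CHECK=1
printenv GMX_OCL_DISABLE_COMPATIBILITY_CHECK   # sanity check: prints 1
# gmx mdrun -deffnm md   # then run as usual in the same shell
```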

Second, even if I try to use the integrated graphics on my CPU, the run freezes here:

1 GPU selected for this run.
Mapping of GPU IDs to the 2 GPU tasks in the 1 rank on this node:
PP:1,PME:1
PP tasks will do (non-perturbed) short-ranged interactions on the GPU
PP task will update and constrain coordinates on the CPU
PME tasks will do all aspects on the GPU
Using 1 MPI thread
Using 8 OpenMP threads

If I check top, GROMACS runs for about 5 seconds, then stops doing any work.

Any help is greatly appreciated.

The OpenCL support is not compatible with any RDNA GPU. Use the SYCL build: https://manual.gromacs.org/documentation/current/install-guide/index.html#sycl-gpu-acceleration-for-amd-gpus

Thanks for the help. I have installed the appropriate ROCm and hipSYCL builds from the link and can invoke the SYCL compiler wrapper, syclcc. After this, I tried to build GROMACS using the commands from the tutorial:

cmake .. -DGMX_GPU=SYCL -DGMX_SYCL_HIPSYCL=ON -DHIPSYCL_TARGETS='hip:gfx1xyz' -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++

then I get the following error:

CMake Error at /usr/share/cmake-3.22/Modules/CMakeTestCXXCompiler.cmake:62 (message):
  The C++ compiler

    "/opt/rocm/llvm/bin/clang++"

  is not able to compile a simple test program.

  It fails with the following output:

    Change Dir: /home/seth/Downloads/gromacs-2023/build/CMakeFiles/CMakeTmp
    
    Run Build Command(s):/usr/bin/gmake -f Makefile cmTC_02a5c/fast && /usr/bin/gmake  -f CMakeFiles/cmTC_02a5c.dir/build.make CMakeFiles/cmTC_02a5c.dir/build
    gmake[1]: Entering directory '/home/seth/Downloads/gromacs-2023/build/CMakeFiles/CMakeTmp'
    Building CXX object CMakeFiles/cmTC_02a5c.dir/testCXXCompiler.cxx.o
    /opt/rocm/llvm/bin/clang++    -MD -MT CMakeFiles/cmTC_02a5c.dir/testCXXCompiler.cxx.o -MF CMakeFiles/cmTC_02a5c.dir/testCXXCompiler.cxx.o.d -o CMakeFiles/cmTC_02a5c.dir/testCXXCompiler.cxx.o -c /home/seth/Downloads/gromacs-2023/build/CMakeFiles/CMakeTmp/testCXXCompiler.cxx
    Linking CXX executable cmTC_02a5c
    /usr/bin/cmake -E cmake_link_script CMakeFiles/cmTC_02a5c.dir/link.txt --verbose=1
    /opt/rocm/llvm/bin/clang++ CMakeFiles/cmTC_02a5c.dir/testCXXCompiler.cxx.o -o cmTC_02a5c 
    ld.lld: error: unable to find library -lstdc++
    clang-15: error: linker command failed with exit code 1 (use -v to see invocation)
    gmake[1]: *** [CMakeFiles/cmTC_02a5c.dir/build.make:100: cmTC_02a5c] Error 1
    gmake[1]: Leaving directory '/home/seth/Downloads/gromacs-2023/build/CMakeFiles/CMakeTmp'
    gmake: *** [Makefile:127: cmTC_02a5c/fast] Error 2
    
 

  CMake will not be able to correctly generate this project.
Call Stack (most recent call first):
  CMakeLists.txt:71 (project)

I also tried using

/usr/bin/clang
/usr/bin/clang++

but that also failed.

I have also tried using a GNU compiler, but I get this error:

CMake Error at cmake/gmxManageSYCL.cmake:77 (message):
  HipSYCL build requires Clang compiler, but GNU is used
Call Stack (most recent call first):
  CMakeLists.txt:656 (include)

I apologize if I am missing something obvious; I am pretty new to ROCm and SYCL. I have compiled GROMACS many times before with CUDA, but have never run into these issues.

Thanks for your help.

Hello!

This is a known issue with Clang: https://manual.gromacs.org/current/user-guide/known-issues.html#cannot-find-a-working-standard-library-error-with-rocm-clang

If you’re using Ubuntu 22.04, then running sudo apt install libstdc++-12-dev could be enough; no extra CMake flags are needed.

Also, for the RX 6700 XT you should use gfx1031, not gfx1xyz :)

You guys are the best! I got it to compile without any errors, but there was one other issue.

When I ran the command:

make check -j 16

I got a ton of mdrun segmentation faults.

  1 - GmxapiExternalInterfaceTests (SEGFAULT)
  2 - GmxapiInternalInterfaceTests (SEGFAULT)
  7 - NbLibSetupTests (SEGFAULT)
  8 - NbLibTprTests (SEGFAULT)
 16 - MdlibUnitTest (SEGFAULT)
 25 - DomDecMpiTests (SEGFAULT)
 26 - EwaldUnitTests (SEGFAULT)
 27 - FFTUnitTests (SEGFAULT)
 28 - GpuUtilsUnitTests (SEGFAULT)
 60 - MdrunOutputTests (SEGFAULT)
 61 - MdrunModulesTests (SEGFAULT)
 62 - MdrunIOTests (SEGFAULT)
 63 - MdrunTestsOneRank (SEGFAULT)
 64 - MdrunTestsTwoRanks (SEGFAULT)
 65 - MdrunSingleRankAlgorithmsTests (SEGFAULT)
 66 - MdrunNonIntegratorTests (SEGFAULT)
 67 - MdrunTpiTests (SEGFAULT)
 68 - MdrunMpiTests (SEGFAULT)
 69 - MdrunMultiSimTests (SEGFAULT)
 70 - MdrunMultiSimReplexTests (SEGFAULT)
 72 - MdrunMpi1RankPmeTests (SEGFAULT)
 73 - MdrunMpi2RankPmeTests (SEGFAULT)
 74 - MdrunFEPTests (SEGFAULT)
 75 - MdrunPullTests (SEGFAULT)
 77 - MdrunVirtualSiteTests (SEGFAULT)
 78 - regressiontests/complex (Failed)
 79 - regressiontests/freeenergy (Failed)
 80 - regressiontests/rotation (Failed)
 81 - regressiontests/essentialdynamics (Failed)

To validate this, I did a run of the lysozyme tutorial (just because it’s fast and easy), and I still got the segmentation fault (core dumped) error even without calling on the GPU. I find this weird because I have never had this happen on the system I am currently using.

Again, thanks for all the help, it is much appreciated!

I can also confirm that the segfault error occurs in older versions of GROMACS with the same build: I built 2022.4 from source, and it yielded the same failed tests.
The following tests FAILED:

  1 - GmxapiExternalInterfaceTests (SEGFAULT)
  2 - GmxapiInternalInterfaceTests (SEGFAULT)
  7 - NbLibSetupTests (SEGFAULT)
  8 - NbLibTprTests (SEGFAULT)
 16 - MdlibUnitTest (SEGFAULT)
 25 - DomDecMpiTests (SEGFAULT)
 26 - EwaldUnitTests (SEGFAULT)
 27 - FFTUnitTests (SEGFAULT)
 28 - GpuUtilsUnitTests (SEGFAULT)
 61 - MdrunOutputTests (SEGFAULT)
 62 - MdrunModulesTests (SEGFAULT)
 63 - MdrunIOTests (SEGFAULT)
 64 - MdrunTestsOneRank (SEGFAULT)
 65 - MdrunTestsTwoRanks (SEGFAULT)
 66 - MdrunSingleRankAlgorithmsTests (SEGFAULT)
 67 - MdrunNonIntegratorTests (SEGFAULT)
 68 - MdrunTpiTests (SEGFAULT)
 69 - MdrunMpiTests (SEGFAULT)
 70 - MdrunMultiSimTests (SEGFAULT)
 71 - MdrunMultiSimReplexTests (SEGFAULT)
 73 - MdrunMpi1RankPmeTests (SEGFAULT)
 74 - MdrunMpi2RankPmeTests (SEGFAULT)
 75 - MdrunFEPTests (SEGFAULT)
 76 - MdrunPullTests (SEGFAULT)
 78 - MdrunVirtualSiteTests (SEGFAULT)
 79 - regressiontests/complex (Failed)
 80 - regressiontests/freeenergy (Failed)
 81 - regressiontests/rotation (Failed)
 82 - regressiontests/essentialdynamics (Failed)

Errors while running CTest
make[3]: *** [CMakeFiles/run-ctest-nophys.dir/build.make:71: CMakeFiles/run-ctest-nophys] Error 8
make[2]: *** [CMakeFiles/Makefile2:3269: CMakeFiles/run-ctest-nophys.dir/all] Error 2
make[1]: *** [CMakeFiles/Makefile2:3302: CMakeFiles/check.dir/rule] Error 2
make: *** [Makefile:641: check] Error 2

For reference, here is the cmake command:

cmake .. -DGMX_GPU=SYCL -DGMX_SYCL_HIPSYCL=ON -DHIPSYCL_TARGETS='hip:gfx1031' -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++ -DGMX_BUILD_OWN_FFTW=ON -DREGRESSIONTEST_DOWNLOAD=ON -DCMAKE_BUILD_TYPE=Debug

Hi!

Thanks for confirming with 2022!

Something is pretty broken, indeed. It looks like every test that tries to use the GPU crashes.

First, can you try running rocm-smi and hipsycl-info to check that basic GPU detection works?

Then, please run any single test, e.g., ./bin/gpu_utils-test from the build directory, and share the full output (attach or upload the file).
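One way to capture that output is sketched below (the parenthesized echo is just a stand-in so the snippet is self-contained; substitute ./bin/gpu_utils-test for it):

```shell
# Capture a command's stdout and stderr to a log file while still
# showing them on screen; the placeholder emits one line on each stream.
( echo "example stdout"; echo "example stderr" >&2 ) 2>&1 | tee test-output.log
```

The resulting test-output.log is what you would attach here.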

Some more info about your setup (OS, ROCm version, how you built hipSYCL) could also be helpful.

Sure, thanks for the help.

Here is the rocm-smi -a output; it does recognize my GPU:

======================= ROCm System Management Interface =======================
========================= Version of System Component ==========================
Driver version: 5.18.13
================================================================================
====================================== ID ======================================
GPU[0]		: GPU ID: 0x73df
GPU[1]		: GPU ID: 0x1638
================================================================================
================================== Unique ID ===================================
GPU[0]		: Unique ID: N/A
GPU[1]		: Unique ID: N/A
================================================================================
==================================== VBIOS =====================================
GPU[0]		: VBIOS version: 113-67HA6SMD1-D01
GPU[1]		: VBIOS version: 13-CEZANNE-019
================================================================================
================================= Temperature ==================================
GPU[0]		: Temperature (Sensor edge) (C): 33.0
GPU[0]		: Temperature (Sensor junction) (C): 33.0
GPU[0]		: Temperature (Sensor memory) (C): 36.0
GPU[1]		: Temperature (Sensor edge) (C): 29.0
================================================================================
========================== Current clock frequencies ===========================
GPU[0]		: dcefclk clock level: 1: (480Mhz)
GPU[0]		: fclk clock level: 1: (840Mhz)
GPU[0]		: mclk clock level: 0: (96Mhz)
GPU[0]		: sclk clock level: 0: (500Mhz)
GPU[0]		: socclk clock level: 1: (533Mhz)
GPU[0]		: pcie clock level: 1 (8.0GT/s x16)
GPU[1]		: fclk clock level: 0: (1600Mhz)
GPU[1]		: mclk clock level: 0: (1600Mhz)
GPU[1]		: sclk clock level: 1: (400Mhz)
GPU[1]		: socclk clock level: 0: (400Mhz)
================================================================================
============================== Current Fan Metric ==============================
GPU[0]		: Unable to detect fan speed for GPU 0
GPU[1]		: Unable to detect fan speed for GPU 1
================================================================================
============================ Show Performance Level ============================
GPU[0]		: Performance Level: auto
GPU[1]		: Performance Level: auto
================================================================================
=============================== OverDrive Level ================================
GPU[0]		: GPU OverDrive value (%): 0
GPU[1]		: GPU OverDrive value (%): 0
================================================================================
=============================== OverDrive Level ================================
GPU[0]		: GPU Memory OverDrive value (%): 0
GPU[1]		: GPU Memory OverDrive value (%): 0
================================================================================
================================== Power Cap ===================================
GPU[0]		: Max Graphics Package Power (W): 186.0
GPU[1]		: Not supported on the given system
GPU[1]		: Max Graphics Package Power Unsupported
================================================================================
============================= Show Power Profiles ==============================
GPU[0]		: 1. Available power profile (#1 of 7): CUSTOM
GPU[0]		: 2. Available power profile (#2 of 7): VIDEO
GPU[0]		: 3. Available power profile (#3 of 7): POWER SAVING
GPU[0]		: 4. Available power profile (#4 of 7): COMPUTE
GPU[0]		: 5. Available power profile (#5 of 7): VR
GPU[0]		: 6. Available power profile (#6 of 7): 3D FULL SCREEN
GPU[0]		: 7. Available power profile (#7 of 7): BOOTUP DEFAULT*
GPU[1]		: 1. Available power profile (#1 of 7): CUSTOM
GPU[1]		: 2. Available power profile (#2 of 7): VIDEO*
GPU[1]		: 3. Available power profile (#4 of 7): COMPUTE
GPU[1]		: 4. Available power profile (#5 of 7): VR
================================================================================
============================== Power Consumption ===============================
GPU[0]		: Average Graphics Package Power (W): 5.0
Not supported on the given system
GPU[1]		: Average Graphics Package Power (W): 0.006
================================================================================
========================= Supported clock frequencies ==========================
GPU[0]		: Supported dcefclk frequencies on GPU0
GPU[0]		: 0: 417Mhz
GPU[0]		: 1: 480Mhz *
GPU[0]		: 2: 1200Mhz
GPU[0]		: 
GPU[0]		: Supported fclk frequencies on GPU0
GPU[0]		: 0: 500Mhz
GPU[0]		: 1: 840Mhz *
GPU[0]		: 2: 1941Mhz
GPU[0]		: 
GPU[0]		: Supported mclk frequencies on GPU0
GPU[0]		: 0: 96Mhz *
GPU[0]		: 1: 456Mhz
GPU[0]		: 2: 675Mhz
GPU[0]		: 3: 1000Mhz
GPU[0]		: 
GPU[0]		: Supported sclk frequencies on GPU0
GPU[0]		: 0: 500Mhz *
GPU[0]		: 1: 2725Mhz
GPU[0]		: 
GPU[0]		: Supported socclk frequencies on GPU0
GPU[0]		: 0: 480Mhz
GPU[0]		: 1: 533Mhz *
GPU[0]		: 2: 1200Mhz
GPU[0]		: 
GPU[0]		: Supported PCIe frequencies on GPU0
GPU[0]		: 0: 2.5GT/s x1
GPU[0]		: 1: 8.0GT/s x16 *
GPU[0]		: 
--------------------------------------------------------------------------------
GPU[1]		: Supported dcefclk frequencies on GPU1
GPU[1]		: 0: 400Mhz *
GPU[1]		: 1: 464Mhz
GPU[1]		: 2: 514Mhz
GPU[1]		: 3: 576Mhz
GPU[1]		: 4: 626Mhz
GPU[1]		: 5: 685Mhz
GPU[1]		: 6: 757Mhz
GPU[1]		: 7: 847Mhz
GPU[1]		: 
GPU[1]		: Supported fclk frequencies on GPU1
GPU[1]		: 0: 1600Mhz *
GPU[1]		: 
GPU[1]		: Supported mclk frequencies on GPU1
GPU[1]		: 0: 1600Mhz *
GPU[1]		: 
GPU[1]		: Supported sclk frequencies on GPU1
GPU[1]		: 0: 200Mhz
GPU[1]		: 1: 400Mhz *
GPU[1]		: 2: 2000Mhz
GPU[1]		: 
GPU[1]		: Supported socclk frequencies on GPU1
GPU[1]		: 0: 400Mhz *
GPU[1]		: 1: 445Mhz
GPU[1]		: 2: 520Mhz
GPU[1]		: 3: 600Mhz
GPU[1]		: 4: 678Mhz
GPU[1]		: 5: 780Mhz
GPU[1]		: 6: 866Mhz
GPU[1]		: 7: 975Mhz
GPU[1]		: 
--------------------------------------------------------------------------------
================================================================================
============================== % time GPU is busy ==============================
GPU[0]		: GPU use (%): 0
GPU[1]		: GPU use (%): 0
================================================================================
============================== Current Memory Use ==============================
GPU[0]		: GPU memory use (%): 0
GPU[0]		: Memory Activity: N/A
GPU[1]		: Not supported on the given system
GPU[1]		: Memory Activity: N/A
================================================================================
================================ Memory Vendor =================================
GPU[0]		: GPU memory vendor: micron
GPU[1]		: GPU memory vendor: unknown
================================================================================
============================= PCIe Replay Counter ==============================
GPU[0]		: PCIe Replay Count: 0
GPU[1]		: PCIe Replay Count: 0
================================================================================
================================ Serial Number =================================
GPU[0]		: Serial Number: N/A
GPU[1]		: Serial Number: N/A
================================================================================
================================ KFD Processes =================================
No KFD PIDs currently running
================================================================================
============================= GPUs Indexed by PID ==============================
No KFD PIDs currently running
================================================================================
================== GPU Memory clock frequencies and voltages ===================
GPU[0]		: Not supported on the given system
GPU[1]		: Requested function is not implemented on this setup
================================================================================
=============================== Current voltage ================================
GPU[0]		: Voltage (mV): 843
GPU[1]		: Voltage (mV): 1425
================================================================================
================================== PCI Bus ID ==================================
GPU[0]		: PCI Bus: 0000:03:00.0
GPU[1]		: PCI Bus: 0000:0E:00.0
================================================================================
============================= Firmware Information =============================
GPU[0]		: ASD firmware version: 	0x21000095
GPU[0]		: CE firmware version: 		37
GPU[0]		: DMCU firmware version: 	0
GPU[0]		: MC firmware version: 		0
GPU[0]		: ME firmware version: 		64
GPU[0]		: MEC firmware version: 	104
GPU[0]		: MEC2 firmware version: 	104
GPU[0]		: PFP firmware version: 	95
GPU[0]		: RLC firmware version: 	74
GPU[0]		: RLC SRLC firmware version: 	0
GPU[0]		: RLC SRLG firmware version: 	0
GPU[0]		: RLC SRLS firmware version: 	0
GPU[0]		: SDMA firmware version: 	80
GPU[0]		: SDMA2 firmware version: 	80
GPU[0]		: SMC firmware version: 	00.65.57.00
GPU[0]		: SOS firmware version: 	0x00220a0c
GPU[0]		: TA RAS firmware version: 	00.00.00.00
GPU[0]		: TA XGMI firmware version: 	00.00.00.00
GPU[0]		: UVD firmware version: 	0x00000000
GPU[0]		: VCE firmware version: 	0x00000000
GPU[0]		: VCN firmware version: 	0x0211a000
GPU[1]		: ASD firmware version: 	0x21000090
GPU[1]		: CE firmware version: 		79
GPU[1]		: DMCU firmware version: 	0
GPU[1]		: MC firmware version: 		0
GPU[1]		: ME firmware version: 		166
GPU[1]		: MEC firmware version: 	464
GPU[1]		: MEC2 firmware version: 	464
GPU[1]		: PFP firmware version: 	194
GPU[1]		: RLC firmware version: 	60
GPU[1]		: RLC SRLC firmware version: 	1
GPU[1]		: RLC SRLG firmware version: 	1
GPU[1]		: RLC SRLS firmware version: 	1
GPU[1]		: SDMA firmware version: 	40
GPU[1]		: SDMA2 firmware version: 	0
GPU[1]		: SMC firmware version: 	00.64.60.00
GPU[1]		: SOS firmware version: 	0x00000000
GPU[1]		: TA RAS firmware version: 	00.00.00.00
GPU[1]		: TA XGMI firmware version: 	00.00.00.00
GPU[1]		: UVD firmware version: 	0x00000000
GPU[1]		: VCE firmware version: 	0x00000000
GPU[1]		: VCN firmware version: 	0x05113000
================================================================================
================================= Product Info =================================
GPU[0]		: Card series: 		Navi 22 [Radeon RX 6700/6700 XT / 6800M]
GPU[0]		: Card model: 		0x6606
GPU[0]		: Card vendor: 		Advanced Micro Devices, Inc. [AMD/ATI]
GPU[0]		: Card SKU: 		67HA6S
GPU[1]		: Card series: 		Cezanne
GPU[1]		: Card model: 		0x1636
GPU[1]		: Card vendor: 		Advanced Micro Devices, Inc. [AMD/ATI]
GPU[1]		: Card SKU: 		CEZANN
================================================================================
================================== Pages Info ==================================
GPU[0]		: Not supported on the given system
============================ Show Valid sclk Range =============================
GPU[0]		: Not supported on the given system
GPU[1]		: Requested function is not implemented on this setup
================================================================================
============================ Show Valid mclk Range =============================
GPU[0]		: Not supported on the given system
GPU[1]		: Requested function is not implemented on this setup
================================================================================
=========================== Show Valid voltage Range ===========================
GPU[0]		: Not supported on the given system
GPU[1]		: Requested function is not implemented on this setup
================================================================================
============================= Voltage Curve Points =============================
GPU[0]		: Not supported on the given system
GPU[1]		: Requested function is not implemented on this setup
================================================================================
=============================== Consumed Energy ================================
GPU[0]		: Energy counter: 0
GPU[0]		: Accumulated Energy (uJ): 0.0
GPU[1]		: Not supported on the given system
================================================================================
============================= End of ROCm SMI Log ==============================

Here is the hipsycl-info output:

=================Backend information===================
Loaded backend 0: HIP
  Found device: AMD Radeon RX 6700 XT
  Found device: 
Loaded backend 1: OpenMP
  Found device: hipSYCL OpenMP host device
Loaded backend 2: CUDA
  (no devices found)

=================Device information===================
***************** Devices for backend HIP *****************
Device 0:
 General device information:
  Name: AMD Radeon RX 6700 XT
  Backend: HIP
  Vendor: AMD
  Arch: gfx1031
  Driver version: 50422802
  Is CPU: 0
  Is GPU: 1
 Default executor information:
  Is in-order queue: 1
  Is out-of-order queue: 0
  Is task graph: 0
 Device support queries:
  images: 0
  error_correction: 0
  host_unified_memory: 0
  little_endian: 1
  global_mem_cache: 1
  global_mem_cache_read_only: 0
  global_mem_cache_read_write: 1
  emulated_local_memory: 0
  sub_group_independent_forward_progress: 1
  usm_device_allocations: 1
  usm_host_allocations: 1
  usm_atomic_host_allocations: 0
  usm_shared_allocations: 1
  usm_atomic_shared_allocations: 0
  usm_system_allocations: 0
  execution_timestamps: 1
 Device properties:
  max_compute_units: 20
  max_global_size0: 18446744073709550592
  max_global_size1: 18446744073709550592
  max_global_size2: 18446744073709550592
  max_group_size: 1024
  max_num_sub_groups: 32
  preferred_vector_width_char: 4
  preferred_vector_width_double: 1
  preferred_vector_width_float: 1
  preferred_vector_width_half: 2
  preferred_vector_width_int: 1
  preferred_vector_width_long: 1
  preferred_vector_width_short: 2
  native_vector_width_char: 4
  native_vector_width_double: 1
  native_vector_width_float: 1
  native_vector_width_half: 2
  native_vector_width_int: 1
  native_vector_width_long: 1
  native_vector_width_short: 2
  max_clock_speed: 2725
  max_malloc_size: 12868124672
  address_bits: 64
  max_read_image_args: 0
  max_write_image_args: 0
  image2d_max_width: 0
  image2d_max_height: 0
  image3d_max_width: 0
  image3d_max_height: 0
  image3d_max_depth: 0
  image_max_buffer_size: 0
  image_max_array_size: 0
  max_samplers: 0
  max_parameter_size: 18446744073709551615
  mem_base_addr_align: 8
  global_mem_cache_line_size: 128
  global_mem_cache_size: 3145728
  global_mem_size: 12868124672
  max_constant_buffer_size: 2147483647
  max_constant_args: 18446744073709551615
  local_mem_size: 65536
  printf_buffer_size: 18446744073709551615
  partition_max_sub_devices: 0
  vendor_id: 1022
  sub_group_sizes: 32 


Device 1:
 General device information:
  Name: 
  Backend: HIP
  Vendor: AMD
  Arch: gfx90c:xnack-
  Driver version: 50422802
  Is CPU: 0
  Is GPU: 1
 Default executor information:
  Is in-order queue: 1
  Is out-of-order queue: 0
  Is task graph: 0
 Device support queries:
  images: 0
  error_correction: 0
  host_unified_memory: 0
  little_endian: 1
  global_mem_cache: 1
  global_mem_cache_read_only: 0
  global_mem_cache_read_write: 1
  emulated_local_memory: 0
  sub_group_independent_forward_progress: 1
  usm_device_allocations: 1
  usm_host_allocations: 1
  usm_atomic_host_allocations: 0
  usm_shared_allocations: 1
  usm_atomic_shared_allocations: 0
  usm_system_allocations: 0
  execution_timestamps: 1
 Device properties:
  max_compute_units: 8
  max_global_size0: 18446744073709550592
  max_global_size1: 18446744073709550592
  max_global_size2: 18446744073709550592
  max_group_size: 1024
  max_num_sub_groups: 16
  preferred_vector_width_char: 4
  preferred_vector_width_double: 1
  preferred_vector_width_float: 1
  preferred_vector_width_half: 2
  preferred_vector_width_int: 1
  preferred_vector_width_long: 1
  preferred_vector_width_short: 2
  native_vector_width_char: 4
  native_vector_width_double: 1
  native_vector_width_float: 1
  native_vector_width_half: 2
  native_vector_width_int: 1
  native_vector_width_long: 1
  native_vector_width_short: 2
  max_clock_speed: 2000
  max_malloc_size: 536870912
  address_bits: 64
  max_read_image_args: 0
  max_write_image_args: 0
  image2d_max_width: 0
  image2d_max_height: 0
  image3d_max_width: 0
  image3d_max_height: 0
  image3d_max_depth: 0
  image_max_buffer_size: 0
  image_max_array_size: 0
  max_samplers: 0
  max_parameter_size: 18446744073709551615
  mem_base_addr_align: 8
  global_mem_cache_line_size: 128
  global_mem_cache_size: 1048576
  global_mem_size: 536870912
  max_constant_buffer_size: 536870912
  max_constant_args: 18446744073709551615
  local_mem_size: 65536
  printf_buffer_size: 18446744073709551615
  partition_max_sub_devices: 0
  vendor_id: 1022
  sub_group_sizes: 64 


***************** Devices for backend OpenMP *****************
Device 0:
 General device information:
  Name: hipSYCL OpenMP host device
  Backend: OpenMP
  Vendor: the hipSYCL project
  Arch: <native-cpu>
  Driver version: 1.2
  Is CPU: 1
  Is GPU: 0
 Default executor information:
  Is in-order queue: 1
  Is out-of-order queue: 0
  Is task graph: 0
 Device support queries:
  images: 0
  error_correction: 0
  host_unified_memory: 1
  little_endian: 1
  global_mem_cache: 1
  global_mem_cache_read_only: 0
  global_mem_cache_read_write: 1
  emulated_local_memory: 1
  sub_group_independent_forward_progress: 0
  usm_device_allocations: 1
  usm_host_allocations: 1
  usm_atomic_host_allocations: 1
  usm_shared_allocations: 1
  usm_atomic_shared_allocations: 1
  usm_system_allocations: 1
  execution_timestamps: 1
 Device properties:
  max_compute_units: 16
  max_global_size0: 18446744073709551615
  max_global_size1: 18446744073709551615
  max_global_size2: 18446744073709551615
  max_group_size: 1024
  max_num_sub_groups: 18446744073709551615
  preferred_vector_width_char: 4
  preferred_vector_width_double: 1
  preferred_vector_width_float: 1
  preferred_vector_width_half: 2
  preferred_vector_width_int: 1
  preferred_vector_width_long: 1
  preferred_vector_width_short: 2
  native_vector_width_char: 4
  native_vector_width_double: 1
  native_vector_width_float: 1
  native_vector_width_half: 2
  native_vector_width_int: 1
  native_vector_width_long: 1
  native_vector_width_short: 2
  max_clock_speed: 0
  max_malloc_size: 18446744073709551615
  address_bits: 64
  max_read_image_args: 0
  max_write_image_args: 0
  image2d_max_width: 0
  image2d_max_height: 0
  image3d_max_width: 0
  image3d_max_height: 0
  image3d_max_depth: 0
  image_max_buffer_size: 0
  image_max_array_size: 0
  max_samplers: 0
  max_parameter_size: 18446744073709551615
  mem_base_addr_align: 8
  global_mem_cache_line_size: 64
  global_mem_cache_size: 1
  global_mem_size: 18446744073709551615
  max_constant_buffer_size: 18446744073709551615
  max_constant_args: 18446744073709551615
  local_mem_size: 18446744073709551615
  printf_buffer_size: 18446744073709551615
  partition_max_sub_devices: 0
  vendor_id: 18446744073709551615
  sub_group_sizes: 1 


***************** Devices for backend CUDA *****************
  (no devices)

OS: Ubuntu 22.04

There are a couple of things worth noting about hipSYCL. First, I tried installing it with the instructions from the link above:

cmake .. -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++ -DLLVM_DIR=/opt/rocm/llvm/lib/cmake/llvm -DROCM_PATH=/opt/rocm -DWITH_ROCM_BACKEND=ON

But when I did this, the hipsycl-info command couldn’t recognize my GPU. So then I tried this:

cmake .. -DROCM_PATH=/opt/rocm -DWITH_ROCM_BACKEND=ON

and then I got the output I pasted above, where it shows my 6700 XT.

One other thing worth noting: I ran the tests individually, and most of them wouldn’t even enter the test body; they would just print Segmentation fault (core dumped). Or they would print:

[==========] Running 4 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 4 tests from PullTest/PullIntegrationTest
Segmentation fault (core dumped)

with a varying number of tests in the test suite. I even ran the tests with sudo, hoping it was a permissions issue.

I’m really stumped here. I think I have tried everything obvious, including scanning the manual for known segmentation-fault errors, but I can’t come up with anything. I hope you can help.

Could you share /etc/hipSYCL/syclcc.json? What I think is happening is that hipSYCL was built with one set of compilers (presumably Clang 14 installed from the Ubuntu repositories), while you’re using ROCm’s compilers to build GROMACS, and there is an incompatibility between them.
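If it’s easier, the relevant field can be pulled out programmatically. The sketch below runs on an inline sample copy so it is self-contained; with the real file, substitute /etc/hipSYCL/syclcc.json for the sample path:

```shell
# Extract the default-clang entry from a syclcc.json (sample copy shown).
cat > /tmp/syclcc-sample.json <<'EOF'
{ "default-clang" : "/opt/rocm/llvm/bin/clang++" }
EOF
python3 -c "import json; print(json.load(open('/tmp/syclcc-sample.json'))['default-clang'])"
```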

As you said, you cannot build hipSYCL with ROCm Clang. The next best thing would be to build GROMACS using whatever Clang hipSYCL uses. I’d expect that, for Ubuntu 22.04, that could be done by setting -DCMAKE_C_COMPILER=clang-14 -DCMAKE_CXX_COMPILER=clang++-14 when configuring GROMACS. But take a look at default-clang in syclcc.json to make sure.
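A sketch of the full configure line under that assumption (clang-14/clang++-14 are my guess for stock Ubuntu 22.04; verify against default-clang in syclcc.json before using it):

```shell
cmake .. -DGMX_GPU=SYCL -DGMX_SYCL_HIPSYCL=ON \
      -DHIPSYCL_TARGETS='hip:gfx1031' \
      -DCMAKE_C_COMPILER=clang-14 \
      -DCMAKE_CXX_COMPILER=clang++-14
```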

Here is the syclcc.json file:

{
  "version-major" : "0",
  "version-minor" : "9",
  "version-patch" : "4",
  "plugin-llvm-version-major" : "15",
  "plugin-with-cpu-acceleration" : "true",
  "default-clang"     : "/opt/rocm/llvm/bin/clang++",
  "default-nvcxx"     : "NVCXX_COMPILER-NOTFOUND",
  "default-platform"  : "cuda",
  "default-cuda-path" : "/usr/local/cuda",
  "default-gpu-arch"  : "",
  "default-cpu-cxx"   : "/opt/rocm/llvm/bin/clang++",
  "default-rocm-path" : "/opt/rocm",
  "default-use-bootstrap-mode" : "false",
  "default-is-dryrun" : "false",
  "default-use-accelerated-cpu" : "true",
  "default-clang-include-path" : "/opt/rocm/llvm/lib/clang/15.0.0/include/..",
  "default-sequential-link-line" : "-L/usr/lib/x86_64-linux-gnu -lboost_context -lboost_fiber -Wl,-rpath=/usr/lib/x86_64-linux-gnu",
  "default-sequential-cxx-flags" : "-I/usr/include -D_ENABLE_EXTENDED_ALIGNED_STORAGE",
  "default-omp-link-line" : "-L/usr/lib/x86_64-linux-gnu -lboost_context -lboost_fiber -Wl,-rpath=/usr/lib/x86_64-linux-gnu -fopenmp",
  "default-omp-cxx-flags" : "-I/usr/include -fopenmp -D_ENABLE_EXTENDED_ALIGNED_STORAGE",
  "default-rocm-link-line" : "-Wl,-rpath=$HIPSYCL_ROCM_PATH/lib -Wl,-rpath=$HIPSYCL_ROCM_PATH/hip/lib -L/opt/rocm/lib -L/opt/rocm/hip/lib -lamdhip64",
  "default-rocm-cxx-flags" : "-isystem $HIPSYCL_PATH/include/hipSYCL/std/hiplike -isystem /opt/rocm/llvm/lib/clang/15.0.0/include/.. -U__FLOAT128__ -U__SIZEOF_FLOAT128__ -I$HIPSYCL_ROCM_PATH/include -I$HIPSYCL_ROCM_PATH/include --rocm-device-lib-path=$HIPSYCL_ROCM_PATH/amdgcn/bitcode --rocm-path=$HIPSYCL_ROCM_PATH -fhip-new-launch-api -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -D__HIP_ROCclr__",
  "default-cuda-link-line" : "-Wl,-rpath=$HIPSYCL_CUDA_LIB_PATH -L$HIPSYCL_CUDA_LIB_PATH -lcudart",
  "default-cuda-cxx-flags" : "-U__FLOAT128__ -U__SIZEOF_FLOAT128__ -isystem $HIPSYCL_PATH/include/hipSYCL/std/hiplike",
  "default-is-explicit-multipass" : "false",
  "default-save-temps" : "false"
}

I also tried the default -DCMAKE_C_COMPILER=/usr/bin/clang, but got a ton of CMake errors:

CMake Error at cmake/gmxManageSYCL.cmake:117 (message):
  hipSYCL compiler not working:

  Change Dir: /home/seth/Downloads/gromacs-2023/build/CMakeTmpHipSyclTest

  

  Run Build Command(s):/usr/bin/gmake -f Makefile && [ 50%] Building CXX
  object CMakeFiles/HipSyclTest.dir/main.cpp.o

  /usr/lib/llvm-14/bin/clang: symbol lookup error:
  /usr/local/bin/../lib/libhipSYCL_clang.so: undefined symbol:
  _ZN5clang20ItaniumMangleContext6createERNS_10ASTContextERNS_17DiagnosticsEngineEb


  clang: error: unable to execute command: No such file or directory

  clang: error: clang frontend command failed due to signal (use -v to see
  invocation)

  Ubuntu clang version 14.0.0-1ubuntu1

  Target: x86_64-pc-linux-gnu

  Thread model: posix

  InstalledDir: /usr/bin

  clang: error: unable to execute command: Executable "clang-offload-bundler"
  doesn't exist!

  clang: note: diagnostic msg: Error generating preprocessed source(s).

  syclcc warning: No optimization flag was given, optimizations are disabled
  by default.  Performance may be degraded.  Compile with e.g.  -O2/-O3 to
  enable optimizations.

  gmake[2]: *** [CMakeFiles/HipSyclTest.dir/build.make:78:
  CMakeFiles/HipSyclTest.dir/main.cpp.o] Error 255

  gmake[1]: *** [CMakeFiles/Makefile2:83: CMakeFiles/HipSyclTest.dir/all]
  Error 2

  gmake: *** [Makefile:91: all] Error 2

  

Call Stack (most recent call first):
  CMakeLists.txt:656 (include)

I don’t know if this matters, but I don’t have a default nvcxx compiler and there is no default-gpu-arch, which I found strange. It is using the clang that I designated in the GROMACS install.

That’s normal. NVCXX is a new NVIDIA compiler (clearly not needed for you), and default-gpu-arch is a convenience feature, not strictly needed.

Thanks for testing. My previous guess was wrong: according to your syclcc.json, ROCm’s compiler is used consistently, so your original choice, -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++, is perfectly correct.

Here are a few more thoughts:

  • Can you try -DHIPSYCL_TARGETS='hip:gfx1031,gfx90c'?
  • To ensure that hipSYCL installation works fine, can you download hello_sycl.cpp · GitHub, and then try compiling and running it with syclcc --hipsycl-targets='hip:gfx1031' -O2 hello_sycl.cpp -o hello_sycl && ./hello_sycl?
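For reference, the linked gist is not reproduced in this thread, but a minimal test of the same shape (a sketch only, the actual gist contents may differ) would be something like:

```cpp
// Sketch of a minimal SYCL test program (not the actual gist contents).
// Build: syclcc --hipsycl-targets='hip:gfx1031' -O2 hello_sycl.cpp -o hello_sycl
// Older hipSYCL versions may need #include <CL/sycl.hpp> instead.
#include <sycl/sycl.hpp>
#include <iostream>

int main() {
    sycl::queue q;  // default device selection
    std::cout << "Running on "
              << q.get_device().get_info<sycl::info::device::name>() << "\n";

    int result = 0;
    {
        sycl::buffer<int, 1> buf{&result, sycl::range<1>{1}};
        q.submit([&](sycl::handler& h) {
            auto acc = buf.get_access<sycl::access::mode::write>(h);
            h.single_task([=] { acc[0] = 42; });  // trivial device kernel
        });
    }  // buffer destructor waits for the kernel and copies back to the host

    std::cout << (result == 42 ? "PASS" : "FAIL") << "\n";
    return result == 42 ? 0 : 1;
}
```

If the runtime and device work, it should print the device name followed by PASS; a crash before that points at the hipSYCL/ROCm installation rather than at GROMACS.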

Ok, I think we have narrowed down the error to the hipSYCL installation. When I run the hello_sycl.cpp test with or without sudo, I get:

Running on AMD Radeon RX 6700 XT
Segmentation fault (core dumped)

I tried to replicate the hipSYCL build by uninstalling and reinstalling, and now I’m getting the following error:

In file included from /home/seth/Downloads/OpenSYCL-0.9.4/src/compiler/HipsyclClangPlugin.cpp:29:
In file included from /home/seth/Downloads/OpenSYCL-0.9.4/src/compiler/../../include/hipSYCL/compiler/FrontendPlugin.hpp:31:
/home/seth/Downloads/OpenSYCL-0.9.4/src/compiler/../../include/hipSYCL/compiler/Frontend.hpp:38:10: fatal error: 'clang/AST/ASTContext.h' file not found
#include "clang/AST/ASTContext.h"

and the command was:

cmake .. -DROCM_PATH=/opt/rocm -DWITH_ROCM_BACKEND=ON -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++

I realize that at this point this doesn’t involve GROMACS and has to be a hipSYCL problem, but hopefully you have an idea why this is happening.

Does adding -DLLVM_DIR=/opt/rocm/llvm/lib/cmake/llvm/ (see our install guide) help? -DROCM_PATH is often not sufficient.

It still results in a segmentation fault when running

syclcc --hipsycl-targets='hip:gfx1031' -O2 hello_sycl.cpp -o hello_sycl && ./hello_sycl

when compiled with the cmake command:

cmake .. -DROCM_PATH=/opt/rocm -DWITH_ROCM_BACKEND=ON -DLLVM_DIR=/opt/rocm/llvm/lib/cmake/llvm/

It does recognize my card as well.

UPDATE: If I add gfx90c to the target list, the test works with the above command. About to try the GROMACS build again.


Ok, so all the tests pass when running make check -j 16, but only if I ran cmake with both gfx1031,gfx90c. That is so strange to me; if I used only gfx90c, the tests failed. I will try running the lysozyme tutorial soon, then write a summary for documentation of the installation commands.
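In the meantime, roughly the combination that worked for me, pulling together the commands from earlier in this thread (paths and gfx targets are for my system; adjust for your own hardware):

```shell
# hipSYCL (Open SYCL 0.9.4) build, pointing it at ROCm's bundled LLVM:
cmake .. -DROCM_PATH=/opt/rocm -DWITH_ROCM_BACKEND=ON \
         -DLLVM_DIR=/opt/rocm/llvm/lib/cmake/llvm/

# GROMACS build: note that BOTH the discrete (gfx1031) and the integrated
# (gfx90c) GPU must be in the target list, otherwise the run segfaults.
cmake .. -DGMX_GPU=SYCL -DGMX_SYCL_HIPSYCL=ON \
         -DHIPSYCL_TARGETS='hip:gfx1031,gfx90c' \
         -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang \
         -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++
```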

I can’t thank you guys enough for this; I definitely would not have figured this out without your help!


Great that it worked.

It looks like ROCm initialization fails when the code is compiled for only one device. Can’t offer any insights here, unfortunately.

Thank you for testing! We’ll add a note to our installation guide / known issues.

Yes, ROCm indeed requires that code is compiled for all GPUs in the system, even if the application is in practice only using some of the GPUs:

This is a limitation that has been in ROCm for a long time, and I don’t know why AMD has not yet fixed it - after all, this behavior is unintuitive and a source of surprising and difficult-to-debug issues as described here.


Interesting. @illuhad, the HIP docs state that the resulting behavior is hipErrorNoBinaryForGpu but here a segfault is reported. Any idea why?

Can the issue be circumvented by masking the devices code was not compiled for with HIP_VISIBLE_DEVICES?

AFAIR, this issue also has caused immediate program termination previously, not just kernel launch failures that can be handled by the application. Presumably the error is not handled gracefully inside HIP. So, I’m not surprised to hear that a segfault is one of the possible consequences.

There used to be an error message along the lines of hipErrorNoBinaryForGpu: Unable to find code object for all current devices, but maybe this is no longer present in current ROCm versions, or only present when not compiling ROCm with -DNDEBUG or something similar.

As the docs say, HIP_VISIBLE_DEVICES should work, and I can confirm this from my experience with this issue.
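As a sketch, masking would look like this; the device index here is an assumption, so confirm the ordering on your system first (e.g. with rocminfo):

```shell
# Assumption: index 0 is the discrete RX 6700 XT (gfx1031) and index 1 is
# the integrated gfx90c GPU. Hide every device except index 0 from HIP, so
# code compiled only for gfx1031 never has to initialize on the iGPU.
export HIP_VISIBLE_DEVICES=0
echo "HIP_VISIBLE_DEVICES=$HIP_VISIBLE_DEVICES"
```

Any HIP-based application (including a SYCL build targeting HIP) launched from that shell will then only see the unmasked device.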