GROMACS version: 2022.3
GROMACS modification: No
I am attempting to run ~150 ns simulations that take ~12 hours on a single-node AMD CPU system, but the simulations sometimes hang after ~8 hours of run time (i.e., the processes keep occupying system resources but the simulation stops progressing). This is the same issue reported here, but my understanding is that the suggested workaround of using Intel MPI is incompatible with my system.
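To make "hang" concrete: the process stays alive, but the output files stop being updated. Below is a minimal sketch of how such a stall could be detected from outside the run. It assumes the run writes to a log file such as md.log (the default mdrun log name) and that the log is normally updated more often than the chosen threshold. The function name is_stalled and the 30-minute threshold are hypothetical choices for illustration, not something GROMACS provides.

```python
import os
import time


def is_stalled(log_path, max_idle_seconds, now=None):
    """Return True if log_path has not been modified for more than
    max_idle_seconds (hypothetical helper, not part of GROMACS)."""
    if now is None:
        now = time.time()
    # Seconds elapsed since the file was last written to.
    idle = now - os.path.getmtime(log_path)
    return idle > max_idle_seconds


# Example usage (assumes md.log exists in the current directory):
#   if is_stalled("md.log", 30 * 60):
#       print("md.log has not been updated for over 30 minutes")
```

The threshold has to be longer than the interval at which the run writes log output (set by nstlog in the .mdp file), otherwise a healthy run would be flagged as stalled.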
For reference, here is my GROMACS and system info:
GROMACS version: 2022.3
Precision: mixed
Memory model: 64 bit
MPI library: thread_mpi
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 128)
GPU support: disabled
SIMD instructions: AVX2_256
CPU FFT library: fftw-3.3.8-sse2-avx-avx2-avx2_128
GPU FFT library: none
RDTSCP usage: enabled
TNG support: enabled
Hwloc support: disabled
Tracing support: disabled
C compiler: /usr/bin/clang Clang 10.0.0
C compiler flags: -mavx2 -mfma -Wno-missing-field-initializers -O3 -DNDEBUG
C++ compiler: /usr/bin/clang++ Clang 10.0.0
C++ compiler flags: -mavx2 -mfma -Wno-missing-field-initializers -Weverything -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-source-uses-openmp -Wno-c++17-extensions -Wno-documentation-unknown-command -Wno-covered-switch-default -Wno-switch-enum -Wno-extra-semi-stmt -Wno-weak-vtables -Wno-shadow -Wno-padded -Wno-reserved-id-macro -Wno-double-promotion -Wno-exit-time-destructors -Wno-global-constructors -Wno-documentation -Wno-format-nonliteral -Wno-used-but-marked-unused -Wno-float-equal -Wno-conditional-uninitialized -Wno-conversion -Wno-disabled-macro-expansion -Wno-unused-macros -fopenmp=libomp -O3 -DNDEBUG
Running on 1 node with total 32 cores, 64 processing units
Hardware detected:
CPU info:
Vendor: AMD
Brand: AMD Ryzen Threadripper 3970X 32-Core Processor
Family: 23 Model: 49 Stepping: 0
Features: aes amd apic avx avx2 clfsh cmov cx8 cx16 f16c fma htt lahf misalignsse mmx msr nonstop_tsc pclmuldq pdpe1gb popcnt pse rdrnd rdtscp sha sse2 sse3 sse4a sse4.1 sse4.2 ssse3
Hardware topology: Basic
Packages, cores, and logical processors:
[indices refer to OS logical processors]
Package 0: [ 0 40] [ 1 9] [ 2 10] [ 3 11] [ 4 12] [ 5 13] [ 6 14] [ 7 15] [ 8 16] [ 17 25] [ 18 26] [ 19 27] [ 20 28] [ 21 29] [ 22 30] [ 23 31] [ 24 32] [ 33 41] [ 34 42] [ 35 43] [ 36 44] [ 37 45] [ 38 46] [ 39 47] [ 48 56] [ 49 57] [ 50 58] [ 51 59] [ 52 60] [ 53 61] [ 54 62] [ 55 63]
CPU limit set by OS: -1 Recommended max number of threads: 64
I compiled GROMACS with clang because I experienced the make check issue that was reported here.
Any advice would be much appreciated - thank you for your help!