GROMACS version: 2021.3
GROMACS modification: No
My GPU is an NVIDIA RTX A4000. The first picture shows OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 64). But in the second picture, when I run the simulation, only 16 OpenMP threads are used. How can I use more OpenMP threads during my simulation?
Thank you very much.
I don’t know that I can help much, since I’m no expert, but this sounds like CPU core parallelism to me. I always understood that the GPU is used through CUDA, not OpenMP. Your computer probably has 16 CPU cores; I don’t think you can use more threads than that.
Greg
Correct.
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 64).
This means that OpenMP multi-threading across multiple CPU cores (potentially paired with a GPU) is possible, and that at most 64 threads can be used per MPI rank.
The log file contains a hardware detection section that reports the number of physical CPU cores and CPU hardware threads (also called logical cores). This determines the number of threads mdrun uses by default.
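As a rough sketch, the relevant lines can be pulled out of the log with grep. The helper name show_hardware and the file name md.log are assumptions for illustration, and the exact wording of the thread-count line may vary between versions.

```shell
# Hypothetical helper: print the hardware summary and thread-count lines
# from an mdrun log file (path given as the first argument).
show_hardware() {
    # "Running on ..." is the hardware detection summary line.
    # The "OpenMP thread" pattern is an assumption about how the log
    # reports the number of threads actually started.
    grep -E 'Running on [0-9]+ node|OpenMP thread' "$1"
}

# Usage (md.log is an assumed file name):
# show_hardware md.log
```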
Thank you very much. I checked my log file.
Running on 1 node with total 8 cores, 16 logical cores, 1 compatible GPU
Hardware detected:
CPU info:
Vendor: Intel
Brand: 11th Gen Intel(R) Core(TM) i7-11700K @ 3.60GHz
Family: 6 Model: 167 Stepping: 1
Features: aes apic avx avx2 avx512f avx512cd avx512bw avx512vl avx512secondFMA clfsh cmov cx8 cx16 f16c fma htt intel lahf mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp sha sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
Number of AVX-512 FMA units: 1 (AVX2 is faster w/o 2 AVX-512 FMA units)
Hardware topology: Basic
Sockets, cores, and logical processors:
Socket 0: [ 0 8] [ 1 9] [ 2 10] [ 3 11] [ 4 12] [ 5 13] [ 6 14] [ 7 15]
GPU info:
Number of GPUs detected: 1
#0: NVIDIA NVIDIA RTX A4000, compute cap.: 8.6, ECC: no, stat: compatible
If I want the simulation to run faster, how should I modify the parameters of mdrun?
The above shows that you have an 8-core, 16-thread CPU, so mdrun correctly started 16 threads to use all CPU resources.
You might want to try the GPU-resident mode using -update gpu, which could give better performance, possibly while moving the bonded or PME work back to the quite fast CPU you have, i.e. -update gpu -bonded cpu or -update gpu -pme cpu.
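To compare these options, something like the following sketch prints the three command lines so they can be run one after another. Nothing is executed by the script itself. The file prefix md, the helper name mdrun_variants, and the explicit -ntmpi 1 -ntomp 16 -pin on settings are assumptions based on the hardware reported above, not values taken from your setup.

```shell
# Print the three mdrun variants suggested above (a dry run, nothing is launched).
# Assumptions: the run files share the prefix passed as $1 (e.g. md.tpr),
# a single thread-MPI rank is used, and 16 OpenMP threads match the
# 16 logical cores reported in the log.
mdrun_variants() {
    prefix=$1
    for offload in "-update gpu" "-update gpu -bonded cpu" "-update gpu -pme cpu"; do
        echo "gmx mdrun -deffnm $prefix -ntmpi 1 -ntomp 16 -pin on $offload"
    done
}

mdrun_variants md
```

Running each printed command for a short test and comparing the Performance (ns/day) line at the end of each log should show which offload combination is fastest on this machine. Not every simulation setup supports -update gpu, so one or more variants may be rejected by mdrun.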