Question about PME WAIT FOR PP and WAIT GPU STATE COPY variables in performance log

gyacu2000 · January 13, 2021, 10:36pm

GROMACS version: 2020.4
GROMACS modification: No

Hello,

I’m running an 82K atom benchmark to configure my hardware for a molecular dynamics simulation. With the current configuration I reach ~80 ns/day, but a large share of the run time is being lost to PME WAIT FOR PP and WAIT GPU STATE COPY (screenshot from logfile attached). I have four K80 GPUs in total: three are designated for PP and one for PME. I am using four tMPI ranks with three OpenMP threads per rank. I have tried adding more threads, but neither PME WAIT FOR PP nor WAIT GPU STATE COPY changes significantly.

I’m wondering if anyone has any suggestions on how to optimize performance from this point? It would be very much appreciated!

Best,
George

newmdfan · June 1, 2022, 8:40am

I would try increasing the number of tMPI ranks to 8.

hess · June 8, 2022, 7:14am

The original post is more than one year old, so I don’t know if answering is still useful.

The issue is that the constraints take a lot of time: constraining happens on the CPU, so the GPUs are left waiting. I assume that -update gpu does not work because there are likely coupled constraints. With constraints on h-bonds only, integration and constraining can be done on the GPU and the performance would be much better.
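
As a rough sketch of what that change could look like (not taken from the original poster's files): the .mdp fragment below assumes the benchmark currently constrains more than the bonds involving hydrogen, for example with all-bonds, and the mdrun line in the comment simply reuses the rank, thread and PME-rank counts given in the question. Whether -update gpu is accepted in a multi-rank run depends on the GROMACS version and build, so treat the command as something to try; the log file should indicate if the update cannot be offloaded.

```
; Sketch only. Assumption: the original setting constrained more than
; bonds to hydrogen (e.g. constraints = all-bonds).
constraints = h-bonds

; The run could then be started with the update step offloaded, e.g.:
;   gmx mdrun -ntmpi 4 -ntomp 3 -nb gpu -pme gpu -npme 1 -update gpu
```

Note that switching the constraint scheme changes the simulated model, so the time step and the force-field recommendations should be checked before using the new setting for production runs.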
