GROMACS version: 2020
GROMACS modification: Yes/No
Dear all,
I am trying to accelerate sampling with the AWH method, but I am seeing a performance loss, mainly from the COM pull force calculation (23.8% of the total cycles in the accounting below):
 Computing:               Num     Num     Call  Wall time   Giga-Cycles
                        Ranks Threads    Count        (s)     total sum       %
-------------------------------------------------------------------------------
 Domain decomp.             7       5       34      0.675        58.952     1.7
 DD comm. load              7       5       34      0.001         0.112     0.0
 Send X to PME              7       5    10001      3.817       333.193     9.4
 Neighbor search            7       5       34      0.494        43.145     1.2
 Launch GPU ops.            7       5    20002      0.750        65.486     1.8
 Comm. coord.               7       5     9967      3.521       307.356     8.7
 Force                      7       5    10001      5.093       444.554    12.6
 Wait + Comm. F             7       5    10001      2.768       241.606     6.8
 PME mesh *                 1       5    10001     18.072       225.363     6.4
 PME wait for PP *                                 17.433       217.396     6.1
 Wait + Recv. PME F         7       5    10001      1.426       124.476     3.5
 Wait PME GPU gather        7       5    10001      2.500       218.248     6.2
 Wait GPU NB nonloc.        7       5    10001      0.048         4.222     0.1
 Wait GPU NB local          7       5    10001      0.037         3.218     0.1
 NB X/F buffer ops.         7       5    39936      2.245       195.950     5.5
 COM pull force             7       5    10001      9.666       843.764    23.8
 AWH                        7       5    10001      0.093         8.085     0.2
 Write traj.                7       5        1      0.158        13.766     0.4
 Update                     7       5    10001      0.984        85.913     2.4
 Constraints                7       5    10001      2.016       175.968     5.0
 Comm. energies             7       5     1001      1.103        96.303     2.7
-------------------------------------------------------------------------------
 Total                                             35.505      3542.055   100.0
-------------------------------------------------------------------------------
(*) Note that with separate PME ranks, the walltime column actually sums to
twice the total reported, but the cycle count total and % are correct.
               Core t (s)   Wall t (s)        (%)
       Time:     1420.104       35.505     3999.8
                 (ns/day)    (hour/ns)
Performance:       48.675        0.493
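
As a quick sanity check on the numbers above, here is a small Python sketch (my own helper, not part of the GROMACS output) that recomputes each row's share from the Giga-Cycles column and ranks the most expensive entries. The values are copied by hand from the log, only the larger rows are included, and the names `giga_cycles`, `share` and `ranked` are just illustrative.

```python
# Giga-cycle totals copied by hand from the cycle accounting in the log above.
# Only the larger rows are listed; this is an illustrative subset.
giga_cycles = {
    "Send X to PME": 333.193,
    "Comm. coord.": 307.356,
    "Force": 444.554,
    "Wait + Comm. F": 241.606,
    "PME mesh": 225.363,
    "COM pull force": 843.764,
    "Constraints": 175.968,
}
total = 3542.055  # "Total" row, Giga-Cycles column


def share(name):
    """Return the percentage of all cycles spent in the given row."""
    return 100.0 * giga_cycles[name] / total


# Rank rows from most to least expensive.
ranked = sorted(giga_cycles, key=giga_cycles.get, reverse=True)
for name in ranked:
    print(f"{name:<16s} {share(name):5.1f} %")
```

With these inputs the COM pull force row comes out on top, at roughly twice the cost of the Force row, which matches the 23.8% reported in the log.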
Any suggestions or advice would be greatly appreciated!
Thank you so much,
Amnah