Simulations with free energy calculations on GPU are several times slower than normal simulations

GROMACS version: 2021
GROMACS modification: No

I am using GROMACS to compute the hydration free energy of one capped residue in a 4x4x4 nm^3 simulation box. On 8 CPU cores alone, the free energy simulation runs at about 60 ns/day, not so different from a normal simulation without free energy calculation (~110 ns/day). On GPU, however, the normal simulation reaches ~1100 ns/day while the free energy simulation reaches only ~160 ns/day, a difference of several times. I also found that GPU usage is only 20% when running simulations with free energy calculations. Is this normal, or is something wrong with my setup?
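To put the quoted numbers side by side, here is a minimal Python sketch that computes the slowdown factors from the four throughput figures in this post. The variable and function names are illustrative only; nothing here comes from GROMACS itself.

```python
# Throughput figures quoted in the post, in ns/day.
cpu_plain, cpu_fe = 110.0, 60.0
gpu_plain, gpu_fe = 1100.0, 160.0


def slowdown(plain, fe):
    """Return how many times slower the free energy run is than the plain run."""
    return plain / fe


cpu_slowdown = slowdown(cpu_plain, cpu_fe)   # free energy penalty on 8 CPU cores
gpu_slowdown = slowdown(gpu_plain, gpu_fe)   # free energy penalty on GPU
plain_gpu_gain = gpu_plain / cpu_plain       # GPU speedup for plain MD
fe_gpu_gain = gpu_fe / cpu_fe                # GPU speedup for the free energy run

print(f"Free energy slowdown on CPU: {cpu_slowdown:.1f}x")
print(f"Free energy slowdown on GPU: {gpu_slowdown:.1f}x")
print(f"GPU speedup, plain MD:       {plain_gpu_gain:.1f}x")
print(f"GPU speedup, free energy:    {fe_gpu_gain:.1f}x")
```

With these figures, the free energy penalty is a bit under 2x on CPU but close to 7x on GPU. Put differently, the GPU speeds up the plain run about tenfold but the free energy run less than threefold.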

The following is the mdp file:
integrator = sd
dt = 0.002
nsteps = 250000

pbc = xyz

nstlist = 100
rlist = 1.2
ns_type = grid

coulombtype = PME
pme_order = 4
rcoulomb = 1.2
fourierspacing = 0.16
ewald_rtol = 1e-5
DispCorr = EnerPres

vdwtype = cut-off
vdw-modifier = potential-switch
rvdw_switch = 1.0
rvdw = 1.1

tc-grps = Protein Non-Protein
tau_t = 0.1 0.1
ref_t = 298 298

; Pressure coupling is on
pcoupl = Parrinello-Rahman
pcoupltype = isotropic
tau_p = 2.0
ref_p = 1.0
compressibility = 4.5e-5

continuation = no
gen_vel = yes
gen_temp = 298
gen_seed = -1

; For GPU version, h-bonds is faster than all-bonds, see Gromacs manual 2021.
constraints = h-bonds
constraint_algorithm = lincs
lincs_iter = 1
lincs_order = 4

nstcomm = 100
comm-mode = Linear
comm-grps = Protein Non-Protein

nstxout = 1000000
nstvout = 1000000
nstfout = 1000000
compressed_x_grps = System
nstxout-compressed = 10000

nstlog = 1000000
nstenergy = 1000000

; Free energy calculation
free_energy = yes
init_lambda_state = 0
delta_lambda = 0
calc_lambda_neighbors = 1 ; only immediate neighboring windows
couple-moltype = Protein_chain_A ; name of moleculetype to decouple
couple-lambda0 = none ;
couple-lambda1 = vdw-q ;
couple-intramol = no
; Vectors of lambda specified here
; Each combination is an index that is retrieved from init_lambda_state for each simulation
; init_lambda_state 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
vdw_lambdas = 0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0.90 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
coul_lambdas = 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0.90 1.00
; Options for the decoupling
sc-alpha = 0.5
sc-coul = no
sc-power = 1
sc-sigma = 0.3
nstdhdl = 100
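As the comments in the mdp file note, each simulation picks its lambda values by index through init_lambda_state. The following Python sketch shows one way to list the 21 states and to produce per-window mdp text from a template. The helper names (`parse_lambdas`, `set_lambda_state`) and the shortened template string are hypothetical, made up for illustration, and are not part of GROMACS.

```python
import re

# Lambda vectors copied from the mdp file above.
VDW = ("0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0.90 1.00 "
       "1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00")
COUL = ("0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 "
        "0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0.90 1.00")


def parse_lambdas(text):
    """Turn a whitespace-separated lambda vector into a list of floats."""
    return [float(x) for x in text.split()]


def set_lambda_state(mdp_text, state):
    """Return mdp_text with the init_lambda_state value replaced by `state`."""
    pattern = re.compile(r"^(\s*init[-_]lambda[-_]state\s*=\s*)\d+", re.MULTILINE)
    new_text, count = pattern.subn(lambda m: m.group(1) + str(state), mdp_text)
    if count != 1:
        raise ValueError("expected exactly one init_lambda_state line")
    return new_text


vdw = parse_lambdas(VDW)
coul = parse_lambdas(COUL)
if len(vdw) != len(coul):
    raise ValueError("lambda vectors must have the same length")

# Shortened stand-in for the full mdp file (assumption: the real template
# would contain every setting shown above).
template = "free_energy = yes\ninit_lambda_state = 0\ndelta_lambda = 0\n"

# One mdp text per lambda window, keyed by state index.
windows = {i: set_lambda_state(template, i) for i in range(len(vdw))}

for i, (v, c) in enumerate(zip(vdw, coul)):
    print(f"state {i:2d}: vdw_lambda = {v:.2f}, coul_lambda = {c:.2f}")
```

The printed schedule makes the two-stage design visible: states 0 to 10 switch on the vdW interactions with the charges off, and states 11 to 20 then switch on the Coulomb interactions.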

Thanks a lot!

This is typical. Free energy calculations on GPU will be much faster in the upcoming 2022 release.

Thanks for the reply, Justin. BTW, I have benefited a lot from your GROMACS tutorial and really appreciate it. I’m looking forward to the new release!

Hi Justin,

Happy new year!

I have the same problem with free energy calculations on GPU using GROMACS (version 2022): I don’t know how to make GROMACS use the GPU, and it runs on the CPU while calculating free energies. Do you have any tutorials on that? Thanks.

Also, your tutorials are wonderful! Thank you so much!

Qian

Enabling GPU support is addressed in the installation instructions.

Hi @jalemkul ,

I am replying to this older thread because I am facing the same issue. In one of your earlier replies, you mentioned that free energy calculations would be much faster in the 2022 release, and I assume this holds for the newest releases as well.
However, I am using GROMACS 2024.1 and still see a similar slowdown on GPUs for free energy calculations compared to vanilla MD simulations of a system of the same size.
Given this, I suspect that I may be giving some input that slows down the simulations.
Is there something I should check? Please help.

My simulation system consists of water molecules and neutral NaCl species. I am increasing the vdW interactions from zero to full LJ (6-12).

Regards,
Chaitanya

There is still a significant slowdown when running alchemical free energy calculations, but the situation is much better than before the 2022 release. We hope to further improve this by enabling offloading of updates and constraints to the GPU when using the sd integrator (recommended for alchemical free energy simulations with a decoupled end state).
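As a rough aid for checking an input file against the points in this reply, here is a minimal Python sketch that parses mdp text and flags the two settings discussed here. The function names (`parse_mdp`, `gpu_notes`) are hypothetical. The notes only restate what this thread says, so whether they apply to a particular GROMACS version is an assumption that should be checked against the mdrun log.

```python
def parse_mdp(text):
    """Parse mdp text into a dict of lowercase keys and string values.

    Everything after ';' is treated as a comment. Keys are normalised so
    that '-' and '_' are interchangeable (assumed to match grompp's
    handling of key names).
    """
    params = {}
    for line in text.splitlines():
        line = line.split(";", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        params[key.strip().lower().replace("-", "_")] = value.strip()
    return params


def gpu_notes(params):
    """Return notes on settings that this thread links to reduced GPU throughput."""
    notes = []
    if params.get("free_energy", "no").lower() == "yes":
        notes.append("free_energy = yes: expect a slowdown relative to plain MD")
    if params.get("integrator", "md").lower() == "sd":
        notes.append("integrator = sd: update and constraints are not "
                     "offloaded to the GPU, per the reply above")
    return notes


example = """
integrator = sd
free-energy = yes ; alchemical run
nstdhdl = 100
"""
for note in gpu_notes(parse_mdp(example)):
    print(note)
```

This reports only on the settings named in the thread. It does not measure anything, so the performance table at the end of the mdrun log remains the place to see where time is actually spent.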