NPT production run, multiple independent simulations, and end-state free energy calculations

GROMACS version: 2022.2
GROMACS modification: No

Hello everyone,

I am trying to run a simulation on a protein/ligand complex to make a prediction of the binding free energy.

Thus, I find myself trying to set up a similar protocol to one proposed in this paper. The authors state that running multiple short independent simulations using a single, 6ns NPT production run as the starting point, allows them to find binding free energies that correlate with their own experimental data. Given that their system is similar to mine, I want to follow their protocol.

However, I have quite a few questions about how to set up this particular simulation.

What parameters should I take into account when defining my .mdp file for the NPT production run? What differences, beyond the duration, are there between a regular, equilibration NPT simulation and a production run, if any?

Regarding the short independent simulations, should I simply run the production MD with the .mdp parameters like this:

continuation = no
gen-vel = yes
gen-seed = -1

If so, what is the purpose of the long NPT production run if I’ll generate new velocities for my system either way?

After the short simulations, I would calculate and average the binding free energy from the short simulations using either MM-GBSA or QM/MM-GBSA as implemented in the gmx_MMPBSA software.

Finally, does this protocol seem like a good way to get the binding free energy of my compounds to my protein or should I stick with a more traditional protocol (one or a few long MD simulations) and be done with it?

Any help is greatly appreciated, and any references or literature you could suggest to me to better understand how to set up these simulations is very welcome.

Thank you very much for your help,

Pato