What are the potential risks of having a too high pulling rate in Gromacs umbrella sampling?

GROMACS version: 2022.4

Dear Gromacs users, is it acceptable for the pulling rate in umbrella sampling to be relatively fast? If it’s too fast, what impact might it have on the results? Is there an appropriate range for the pulling rate?

I am investigating the process of an organic molecule diffusing from the liquid phase to the interface and then being released into the gas phase. Since umbrella sampling is rather time consuming when many molecules need to be screened, I’m wondering whether speeding up the pulling rate might be feasible.

Additionally, I have calculated the work done in a single pull from the solution to the gas phase (without running the individual windows) as W = ΣFΔx and found that the result exceeded 4000 kJ/mol (see the figure below). If I subtract the kinetic energy of the pulled molecule, will the final result correspond to the energy barrier of this process?

Pulling parameters:

pull_coord1_rate        = 0.02
pull_coord1_k           = 1500

nsteps                  = 500000
dt                      = 0.001

I would greatly appreciate any guidance!

In principle, when you are generating starting positions for umbrella sampling, you can use a fairly high pull rate, in most cases. What you need to keep in mind though, is that a high pull rate means that you are pulling far from equilibrium. This may in turn mean that you need to discard more data in the beginning of each of your umbrella sampling windows to get stable and reliable results. Pulling for a few more ns might save you a significant amount of time if you use many umbrella windows.

You can’t get the free energy barrier, at least not reliably, just from the pull force of a single pulling simulation. You should have a look at Jarzynski’s equality and Crooks’ fluctuation theorem. With Jarzynski’s equality you pull multiple times (very many times) in one direction to get the PMF, but the friction is not properly accounted for. With Crooks’ fluctuation theorem you pull in both the forward and reverse direction, which means that you can get a better estimate of the PMF by accounting for the friction (dissipated work). Unless you know what you are doing, I would recommend sticking to umbrella sampling, or possibly consider AWH or metadynamics etc.
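For reference, here is a minimal sketch of the Jarzynski estimator mentioned above, ΔF = -kT ln⟨exp(-W/kT)⟩, written in Python. The function name, the 300 K default, and the example work values are made up for illustration; real work values would come from integrating the pull force over the pulled distance in each of many independent pulls.

```python
import math

R_KJ_PER_MOL_K = 0.0083144626  # gas constant in kJ/(mol K)


def jarzynski_free_energy(work_values, temperature=300.0):
    """Estimate a free energy difference (kJ/mol) from non-equilibrium work values.

    work_values: work from independent pulls in one direction, in kJ/mol.
    temperature: temperature in K (300 K is an assumed default).
    """
    kT = R_KJ_PER_MOL_K * temperature
    exponents = [-w / kT for w in work_values]
    # Log-sum-exp trick: factor out the largest exponent to avoid underflow.
    largest = max(exponents)
    mean_shifted = sum(math.exp(x - largest) for x in exponents) / len(exponents)
    return -kT * (largest + math.log(mean_shifted))


# Hypothetical work values (kJ/mol) from four independent pulls.
example_work = [42.0, 45.5, 39.8, 50.1]
estimate = jarzynski_free_energy(example_work)
mean_work = sum(example_work) / len(example_work)
print(f"Jarzynski estimate: {estimate:.2f} kJ/mol, mean work: {mean_work:.2f} kJ/mol")
```

The exponential average is dominated by the rare pulls with the lowest work, so the estimate always lies at or below the mean work. The faster you pull, the more dissipated work there is and the more pulls you need before the average converges.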

Thanks for your valuable insights MagnusL! They are quite helpful to me.

As you mentioned, “Pulling for a few more ns might save you a significant amount of time if you use many umbrella windows.”

Here I suppose you mean pulling with a fairly high rate but for a bit longer time is more efficient than a slow pulling rate, is that right? Thank you once again for your kind help!

A high rate and a longer time would mean that you are pulling further away. What I mean is that a low rate and a longer time might be more efficient than a high rate and a shorter time, since you might not have to discard as much data at the beginning of each umbrella window.

For example, pulling for 0.5 ns (rate 0.02 nm/ps) and having to discard 2 ns at the beginning of each of, let’s say, 50 umbrella windows is not as efficient as pulling for 25 ns (rate 0.0004 nm/ps) and having to discard only 0.5 ns at the beginning of each umbrella window.

The numbers above are just examples, and the optimal settings are system dependent.
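To put the comparison above into numbers, here is a small Python sketch. It only adds up the pulling time and the discarded equilibration time over all windows; the helper names are made up for illustration, and the production part of each window is assumed to be the same in both cases, so it is left out.

```python
def pulled_distance_nm(rate_nm_per_ps, pull_time_ns):
    """Distance covered by the pull, with the rate in nm/ps and the time in ns."""
    return rate_nm_per_ps * pull_time_ns * 1000.0  # 1 ns = 1000 ps


def overhead_ns(pull_time_ns, discard_ns_per_window, n_windows):
    """Simulation time spent on pulling plus data discarded across all windows."""
    return pull_time_ns + discard_ns_per_window * n_windows


n_windows = 50  # example value from the post above

# Fast pull: 0.5 ns at 0.02 nm/ps, 2 ns discarded per window.
fast_distance = pulled_distance_nm(0.02, 0.5)
fast_overhead = overhead_ns(0.5, 2.0, n_windows)

# Slow pull: 25 ns at 0.0004 nm/ps, 0.5 ns discarded per window.
slow_distance = pulled_distance_nm(0.0004, 25.0)
slow_overhead = overhead_ns(25.0, 0.5, n_windows)

print(f"fast pull: {fast_distance:.1f} nm, overhead {fast_overhead:.1f} ns")
print(f"slow pull: {slow_distance:.1f} nm, overhead {slow_overhead:.1f} ns")
```

Both pulls cover the same 10 nm, but the fast pull costs 100.5 ns of overhead against 50 ns for the slow one. As said above, the discard times are only examples and have to be checked for each system.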

Oh I see, thank you MagnusL, your reply greatly helped me understand umbrella sampling!
I have another question: if I only want to compare the free energy barriers of two different molecules qualitatively (without needing the exact values), is there a faster method I could use?

What you are doing is not umbrella sampling. In umbrella sampling you run many independent simulations with the umbrella at fixed, different locations. What you are doing is steered MD. To obtain a free energy from this you would need to run multiple such simulations and average the results using Jarzynski’s equality. In practice this will not work with your extremely high pull rate.

I would advise using actual umbrella sampling, but this is going to be very time consuming.

Indeed. I updated my answer above (What are the potential risks of having a too high pulling rate in Gromacs umbrella sampling? - #2 by MagnusL) to avoid that potential confusion.

Yes hess, maybe I mixed two questions together; thanks for the correction. I know this is not umbrella sampling. Thanks hess and MagnusL for your kind help.