Umbrella sampling: different harmonic spring constant along reaction coordinate?

GROMACS version: 2020.1
GROMACS modification: No

Greetings everyone, first post in this forum :). First of all, congratulations on the new platform, it is SO nice to see it thriving and full of life.

Short question: Is it acceptable to use different harmonic potential spring constants for different simulations along the reaction coordinate of an umbrella sampling protocol? I would like to use a stronger constant/more simulations in the first half of the coordinate, and weak constant/less simulations in the second half.

According to the gromacs gmx-wham manual section, “The umbrella potential is assumed to be harmonic and the force constants are read from the or .pdo files.”. Will the wham module take into account that different tpr files have different spring constants and correct for that? Is this an acceptable practice?

Long (and probably unnecessary) context

I am trying to estimate the free energy change of the binding between two proteins. I have the PDB structure of the bound conformation (a dimer). I am using umbrella sampling as well as the WHAM module to estimate the PMF along the reaction coordinate (thanks for the tutorial Justin).

The problem is that when I compute the PMF profile, there is a gap in the region of highest energy, which corresponds to regions of the reaction coordinate where the dimer will be reconstituted spontaneously. I guess I can use a higher harmonic constant in the umbrella sampling simulation, but then I would need to make more independent simulations to cover the rest of the reaction coordinate. Since this problem is specific of this region in the reaction coordinate, I was wondering if I could use (and if I should!) a higher spring constants in this region, and leave the rest of the simulations with a lower spring constant.

Thank you for your time

This is done all the time. gmx wham reads the force constants for every window from each .tpr file individually, so the PMF it generates reflects any differences you specified.

1 Like

I’m having the same problem (no matter which frames I pick, there’s a region that’s not sampled, highest energy coordinate I guess?) so I thought of this same solution and wanted to know if it was allowed to do this.
But, is there a limit to how much we can increase the spring constant? or a recommended way to start trying… For example, I’m currently using pull-coord1_k = 1000, could I test 2000? 5000? is there a limit, or a point after which the method might deviate from its expected results? would 10000 be too much?

Thank you


I am not an expert so take what I will say with a big fat grain of salt.

What I learned from trying to do what I describe above is that umbrella sampling is a much trickier issue than it seems. Even for more or less ideal cases, like small ligand / protein systems, the COM distance reaction coordinate might not be enough to get adequate sampling, and you might have to fiddle with your setup to cover the rotational/conformational space.

This issue is made a thousand-fold worse with systems where the ligand is bigger and thus has way more degrees of freedom. In my case, it happened that I was able to get good reaction coordinate coverage “on paper”, but it did not translate to better conformational coverage and thus a better FEP estimation.

Technically you can increase the spring constant as much as you need. My bias is that if you ramp it high enough you can get good coverage of any region you are interested in, on the particular reaction coordinate that you have chosen. But that coverage only reflect the reaction coordinate space, not the actual conformation space in which you are interested! The solution? Find a better reaction coordinate.

Thanks for your reply. So, in your opinion, one could increase this pull-coord1_k to a relatively much higher value to complete the whole reaction coordinate sampling, if I understand correctly. Yes?

Regarding the lack of conformational space sampling, I think I also tumbled upon this problem with another system. But what would a different reaction coordinate be? I’m pulling two proteins appart, I have trouble imagining what other reaction coordinate I might choose.

GROMACS version: 2020.4
GROMACS modification: No

Hello, I’ve come to notice a big(?) difference in the final PMF curves between replicates (triplicates), when for two of them I managed to complete a gap between the sampling windows using always the same spring constant (2000), and for one of them, I had to complete the gap by increasing the spring constant to 4000, since I couldn’t manage to find any starting conformations that worked to fill this gap, they all drifted from the wanted position to other parts of the reaction coordinate already well sampled, leaving the same original gap. So, can I trust the PMF from this umbrella sampling where I’ve used two different spring constants for some of the peaks?

Thanks in advance.

PS: I attach a picture showing my results.

Using multiple force constants is often done. It just depends on the nature of the system and the strength of the interactions. It’s important to look at the histograms to understand if you’re achieving good overlap. That should help justify the force constant and/or window spacing.

Thank you for your answer. After more careful examination, it seems this difference in PMF is not due to having used a different spring constant in some frames for the red profile. It was just coincidence. I am still trying to find out why such a large difference in PMF in that simulation, compared to the others.