Umbrella sampling - what values of force constant and velocity should I use?

I have performed pulling simulations on receptor ligand complex, where ligand is being pulled for around 5 nm for 500 ps. I am showing two plots here: blue corresponds to the Force constant of 250 kJ/mol/nm2 and velocity 0.01 nm/ps. And orange corresponds to k of 500kJ/mol/nm2 and v 0.01 nm/ps.
I know it is preferable to go for the lowest values, but the blue plot has a little plateau, not sure what if it is ok. The simulation looks good though.

Are these the pulling simulations you use to generate starting configurations for umbrella sampling?

Anyhow, from the plot it seems like the ligand does not move during the first 160 ps with the high force constant (the pull reference position is approximately 1.6 nm away from the ligand with a force of 800 kJ/mol/nm, which is in agreement with 160 ps and 0.1 nm/ps) and not until 240 ps with the lower force constant. Does that agree with your trajectories? I would suggest a higher force constant, such as 5,000-10,000 kJ/mol/nm2, i.e. if you want the ligand to follow the pull reference position.

Many thanks for your reply.

Yes, these are pulling simulations to generate starting configurations for Umbrella sampling.

I have added a third plot now with a higher spring constant of 1000 kJ/mol/nm2. It essentially looks same as that with 500, just the ligand is pulled out of the pocket earlier, which is expected.

Could you please explian why spring constant of 5000 to 10000 Kj/mol/nm2 would be better? I thought these values are too high to be used for a small molecule ligand. Then I dont need to go upto 500 ps. Because it is going to converge pretty fast .

It’s not a matter of convergence - you want to generate evenly spaced initial configurations for the following umbrella sampling simulations. You want the ligand close to the pull reference position all the time. If the ligand suddenly snaps out from the binding site, it will be difficult to find evenly spaced ligand positions close to the binding site, which might be where you need the tightest distribution of umbrella sampling windows.

What you would like is a pull force plot that is almost flat at 0 all the way. I would also suggest pulling half as quickly and for twice as long - that may save you a bit of equilibration time in the beginning of each window of the umbrella sampling simulations.

Regarding the force constant, as far as I know, as long as the simulation doesn’t crash it should be ok. You could even pull using a constraint instead of a umbrella - then you don’t need to care about the force constant.

However, when you run the following umbrella sampling simulations, you want the force constant low enough to ensure good overlap between the simulation windows.

Ok. Thank you. This clears a lot of doubt.
I am running it at half the rate ie 0.005 nm/ps and for longer i.e.1000 ps now. Is that what you meant? Will share the plot when done.

The plot with 1000 ps simulation with 0.005 nm/ps and spring constant of 1000 kJ/mol/nm2 looks a little weird. Please let me know if you have any more suggestions.

That plot looks somewhat better to me. But I still think that the pulling is far too weak to get a decent distribution of positions close to the binding configuration. Have a look at the pullx.xvg or the trajectory to make sure that there are no jumps.

Did you try a force constant of 10,000? Even 50,000 might be worth trying. But as I said above, only for generating the starting configurations - not for the umbrella sampling.

Thanks so much for the suggestion.
I tried both 10000(blue) and 50000 (orange) with 0.005 velocity.

I also checked the tarjectories visually, it looked exactly how you wanted it to be. The ligand closer to the reference position throughout.
What are your thoughts now? Is it good to go for the umbrella runs?
Thanks!

Yes, you could probably use the starting positions from either the 10000 or 50000 simulations, but I think the 50000 would be better. It looks like its largest deviation between ligand and pull reference position is ~0.02 nm, compared to 0.1 nm with the 10000 pull force constant.

I would recommend a tight distribution of umbrella windows the first 1 nm (or so), with a force constant in the range of 2500-10000. For the rest you can probably use a force constant of 250-1000.

1 Like

Thank you!

Hi Magnus,
Based on your suggestion I performed Umbrella runs on the 50000 force constant pulling trajectories, keeping tight distribution till 2 nm (trying all the force constants from 500 till 10000). After 2 nm I used 500 force constant.

The issue is, no matter what force constant I try, I am unable to cover the region ~1.5 nm. (I used all the configurations between 1.2 to 2 nm without any gap). have also attached the histogram. Could you please suggest what else can I try?
Also, is it ok to use the same configuration umbrella run with different force constants in the wham analysis?

Thank you,
Tasneem

Do you have any starting configurations at ~1.45-1.65 nm at all? You might have to re-run the non-equilibrium pulling above (i.e., to generate the starting configurations) with even higher force constant (try 250000) and/or write output conformations at a shorter interval. It might not be necessary to re-run the current umbrella simulations, even if you generate a few more starting configurations from a new non-equilbrium pulling simulation. But you should make sure that the pulling paths are similar enough and that you equilibrate the umbrella sampling simulations (discard enough data in the beginning) to ensure proper overlap of the ensembles.

gmx wham uses the force constants from the tpr files, so it should be fine using different force constants for different umbrella windows.

If, as you say above, there is no gap in the starting configurations, you will need significantly higher pull force constant in the 1.45-1.65 nm region. You might need as high as 25000-100000, but then you’ll need a very tight distribution in that region (see above in this answer regarding re-generating starting configurations).

No I didn’t have any conf between 1.45 and 1.65 nm. So as you suggested I reran the pulling with 250k and wrote conf at 0.5 ns. And did longer npt equilibration of the umbrella runs to discard the initial conf.

After playing with force constants varying from 5000 to 10000 for Umbrella runs, I was finally able to cover all the windows.
And this is how the histogram and pmf look like. Please let me know this is good.


There are some overlaps (not sure if that’s ok). Although, the binding energy value matches exactly with the experimental value.

Thank you so much!!

I would say that the overlap at ~1.2, 1.35 and 1.6 nm is insufficient. I think you should try to get a few more configurations in those sampling regions.

Thanks,
I tried sampling those regions but the histogram tends to shift in either the right or left of the bin.
I tried till 10000 k and manage to get a histogram looking like this. I also got rod of some other overlapping peaks.
Please let me know if this looks better than before.
Thank you so much for your guidance.

I still think the overlap could be better. I think you should try to add configurations between the blue and the yellow (~1.2 nm), the yellow and the green (~1.25 nm) and the green and the light red (~1.35 nm). If it’s difficult generating these configurations, and keep them in place, you might need even higher force constants. I don’t think you need to get rid of other overlapping peaks. If you’ve already generated the data, it’s better to keep it.

I sampled these three regions at 50000 k force constant. But I am still struggling to cover 1.2 nm. The other two looks ok. I have attached a histogram (one with all the histograms I have tested so far and the other after getting rid of the overlapping peaks for clarification) for your reference. Please let me know if going beyond 50000 would be worth testing / is recommended.


Also, I noticed some distributions with two peaks, are they ok to include for PMF calculations? Please suggest.
Thank you so much.

Indeed, you still don’t have much overlap at 1.20 nm. If you start an (at least one) umbrella sampling simulation at 1.20 nm and it quickly leaves, then you need a higher force constant. If you don’t have any simulations starting there, you need a higher force constant and/or a shorter interval between output frames in the nonequilibrium simulation generating the starting configurations. If you need to re-run a simulation to generate starting configurations, you only need to pull to 1.25 nm, since you’re only interested in getting at least one configuration in the 1.18-1.21 nm region.

If there are simulations giving sampling distributions with two peaks, it might be worth looking at the trajectories (of those specific simulations) to understand what is happening. You want the peaks to be reasonably bell shaped (following a normal distribution), but not all of them will be perfect (unless your PMF is flat). A few peaks that are split may not be a problem, but might indicate a too low force constant or possibly other configurational changes.

Thank you so much!
Actually, I had a few configurations at/around 1.2 nm interval but none of them peaked around that region when I was doing umbrella runs, I went upto 75000 force constant. So I analysed other umbrella runs to see what’s the usual shift (if any) I am getting and I was finally able to cover the 1.2 nm region using a configuration at 1.270 nm at 5000 force constant. I have attached a histogram for your review. The corresponding PMF plot is also attached.



Thank you!