Changing the position of pulling molecule while doing umbrella sampling in each window

GROMACS version: 2018
GROMACS modification: Yes/No
I’m doing umbrella sampling in the way that Dr. Lemkul did in the gromaces tutorial.

As procedure said, after pulling section I chose configurations with specific distances and did umbrella sampling simulations. The problem is after brief NPT_umbrella, the distance didn’t change but after md_umbrella, the distances changed about 0.2 nm (more or less, it changed!).

For instance:
after pulling:
conf26 => distance: 5.773 nm
after NPT_umbrella => distance: 5.773 nm
after md_umbrella => distance: 6.33 nm

I must say that the space between configurations is 0.1 nm, so 0.2 nm change in distance seems unusual. What should I do? It’s negligible or I have to do somethimg?

I did plenty of simulations to solve this,
I increased NPT_umbrella and md_umbrella simulation time,
I set restrain (for one specific atom near its COM) for my pulling molecules.
In the case I put restrain for whole pulling molecule, it stay in its position but i don’t think that’s right.

Any help will be appreciated.


How big is your box? Once your distances approach half of the box size, you might be running into issues with PBC, and a box of >10 nm would already count as large. Have you checked the evolution of the values with time? Do they jump suddenly?


My box is 7714 and I don’t know it jumped or went through time! How do I understand that?
I don’t know if this is a serious problem or not! Should it be in the exact position along the umbrella sampling simulation?


What do you mean by 7714, what are your box vectors?

US is based on restraints, not constraints, so the actual distance usually fluctuates around the location of the minimum of the harmonic potential, but jumping by almost 0.6 nm is usually quite a lot. There can be many issues, from PBC to an underestimated force constant to errors in group definitions.

However, remember that free energy methods cannot really be used in a black-box manner - you should be at least aware of fundamental issues with molecular simulations. We can help you fix the issue at hand if you post your .mdp pull-code section and/or plot of your .xvg files, but there’s more to getting a successful and reliable free energy estimate than that.


My box vectors are 7 7 14 (as x y z).

I’ve tested US for some configuration and as you said, they fluctuated. Case by case the value of fluctuation is low or high.

; Pull code
pull = yes
pull_ncoords = 1
pull_ngroups = 2
pull_group1_name = group-1
pull_group2_name = group-2
pull_coord1_type = umbrella
pull_coord1_geometry = distance
pull_coord1_dim = N N Y
pull_coord1_groups = 1 2
pull_coord1_start = yes
pull_coord1_rate = 0.0
pull_coord1_k = 1000 ;kJ mol^-1 nm^-2




So here your distance remains close to its initial value (5.77 nm) - nothing wrong about this, that’s exactly the idea of US. I was more worried about the 6.33 nm value you mentioned in the first post; PBC looks ok though, if your coordinate is only calculated in the Z direction, it should be fine up to 14/2 = 7 nm.


But after each simulation I evaluated the distance using gmx distance command and it gave me 6.33 nm for distance between COMs.

Please see the picture below:

Which distance is important in this case? Average distance from gmx distance or the distance from US_pullx.xvg?


The one from .xvg reports how Gromacs’ pull code “saw” the distance during the simulation. So it’s suitable for reconstructing your PMF using WHAM.

Here my guess is that .xvg from the run is reporting only the Z-component of the distance (due to the pull_coord1_dim = N N Y setting), while gmx distance reports the “full” 3D distance. I imagine also contains individual components of the vector; if so, you can check whether that’s the case.


Thanks a lot.

best regards,

Hi Nasrola,
Just use a stiffer spring (a higher pull_coord1_k) for those unhappy windows, It doesn’t necessarily need to have a similar pull_coord1_k for all windows.


Hi Salman,
How much should I increase the value for pull_coord1_k? Is there any rules or it’s just empirical?


Remember that for WHAM, you need overlap in the histograms between each pair of neighboring US windows. So it’s empirical, but two conditions limit the value from above and below: a low constant will allow the system to go towards “preferred” regions yielding unsampled free energy barrier regions, while a high one will restrain the system too much so that no two neighboring distributions will overlap.

Checking the overlap is usually explained in standard US tutorials, such as this: GROMACS Tutorial 5 - Methane-methane PMF from window sampling — BioSFLab homepage

Thanks Milosz, I’ve checked that.
It’s very helpful and my problem has been solved.