Steered MD Simulation error

GROMACS version: 2021.4/gpuvolta
GROMACS modification: No

Hi everyone,

I am following Tutorial 3: Umbrella Sampling, but altering the .mdp files to a method section from a paper (unfortunately the authors are not much help - I’m building my own .mdp files from scratch). After running a steered MD simulation, the pullf.xvg file looks incorrect (below). My system consists of two alpha-helix structures each 35 residues in length and the goal of my simulation is to perform Umbrella Sampling to determine free energy. I appreciate any help, thank you!

Also, during my Steered MD simulation I get these messages:

Screenshot 2026-01-25 at 2.55.25 pm

Here is my md_pull.mdp:

title = Umbrella pulling simulation
define = -DPOSRES
; Run parameters
integrator = md
dt = 0.002
tinit = 0
nsteps = 250000 ; 500 ps
nstcomm = 10
; Output parameters
nstxout = 5000 ; Every 10 ps
nstvout = 5000
nstfout = 500
nstxtcout = 500
nstenergy = 500
constraint_algorithm = lincs
lincs_iter = 2
lincs_order = 4
constraints = h-bonds
continuation = yes
pbc = xyz
DispCorr = EnerPres
; PME electrostatics parameters
coulombtype = PME
fourierspacing = 0.12
fourier_nx = 0
fourier_ny = 0
fourier_nz = 0
pme_order = 4
ewald_rtol = 1e-5
optimize_fft = yes
; Pull code
pull = yes
pull_ncoords = 1 ; Only one reaction coordinate
pull_ngroups = 2 ; Two groups defining one reaction coordinate
pull_group1_name = Chain_A
pull_group2_name = Chain_B
pull_group1_pbcatom = 283
pull_group2_pbcatom = 867
pull_coord1_type = umbrella ; Harmonic potential
pull_coord1_geometry = distance ; Simple distance increase
pull_coord1_dim = N N Y
pull_coord1_groups = 1 2
pull_coord1_start = yes ; Define initial COM distance > 0
pull_coord1_rate = 0.01 ; 0.1 nm per ps = 50 nm per ns
pull_coord1_k = 1000 ; kJ mol^-1 nm^-2
pull_pbc_ref_prev_step_com = yes

When I change pull_coord1_dim = Y Y Y

I get:

You are pulling at an insanely high rate. I don’t expect the system could follow even with a strong spring constant. But with your rather weak spring you are simply extending the spring, leading to the results you show.

Hi Hess!

Thanks for the reply. Actually, the paper I’m following says “A 500 ps COM pull sampling was performed with a consistent force approach (k = 1000 kJ/mol/nm) and a rate of 0.1 nm/ps between the COMs”. Unfortunately, the paper did not include their pullf.xvg graph, so every alteration I test for, goes until creating the pmf.xvg graph. When I change pull_coord1_rate to 0.0001, here are the pullf.xvg and pmf.xvg graphs I get, respectively.

Is trial and error the only way to find a ‘good’ pull_coord1_rate? Also, how come in every simulation I do, the x-axis of my pmf.xvg graph spans 0.72 - 0.77 nm? The paper I am referring to has the following graph:

You need to have some intuition/experience to see what would work. But I’d say that it’s rather obvious that two helices would never move apart at a speed of 0.1 nm/ps in reality. That’s far too fast. Maybe it’s a typo in the paper and it should be 0.1 nm/ns, which sounds reasonable. Then the rather force constant of 1000 kJ/mol/nm^2 (you write kJ/mol/nm) could work, although I would choose a higher value.

Obviously you need to run much longer with a pull rate of 0.1 nm/ns.

A general note is that you will often not get a reasonable free-energy estimate from a single steered MD simulation, as there will be a lot of irreversible work included in the work. You would need run many simulations and used Jarzynski’s equality to get the free energy (but this only works with low variance).

This is 0.01 nm/ps and also “0.1 nm per ps = 50 nm per ns“ is a false statement.

Regardless, the above comments are correct. Also, it may be useful to convert these units to get a common sense feel for these numbers: 0.1 nm/ps = 100 m/s = 360 km/h, which is what a Bugatti does on a sunny day. That aside, it is possible that the paper you are trying to follow is nonsense, because free energy estimates don’t really come from direct pulls.

Considerably slower (1e-5 to 1e-4 nm/ps) pulls can be useful if you wish to monitor the force and/or to get a Mickey Mouse estimate of the force integral over time or some carefully selected distance between the groups, but none of those things yield a reliable estimate of the free energy. To get the latter, I’d recommend following the Justin Lemkul’s tutorial: Umbrella Sampling

Hi Sasha and Hess!

Thank you both for your reply.

Yes I am new to GROMACS and Umbrella Sampling, so I appreciate all the help I get. I ran another steered MD simulation (graph below), this time with a pull_coord1_rate of 0.0001. (0.1 nm/ns) and pull_coord1_k of 2000 kJ/mol/nm^2. I also ran the simulation for much longer (below parameters):

; Run parameters
integrator = md
dt = 0.002
tinit = 0
nsteps = 25000000 ; 50 ns
nstcomm = 10
; Output parameters
nstxout = 500000 ; Every 10 ps
nstvout = 500000
nstfout = 50000
nstxtcout = 50000
nstenergy = 50000

I’ve been following Justin Lemkul’s tutorial on Umbrella Sampling, but modifying the .mdp parameters to align with the paper’s method. Is there a better method for free energy estimates?

I think the group you’re pulling is just stuck in place, that’s all. The pull-code’s reference point moves 5 nm in 50 ns and now you have Hooke’s law (F = kx), where k = 2000 kJ/mol/nm^2 and x = 5 nm. At the end of the simulation, you end up with F = 10000kJ/mol/nm, which means whatever you’re pulling is not moving. I think you should look at your trajectory first.

There are several ways to estimate free energies and the tutorial above is a very good way – even if it feels confusing at the moment. Direct pulling does not estimate free energies.

Hi Sasha,

Sorry for the late reply, I was unable to run simulations for some time. Your explanation about a group being pulled but stuck in place makes sense. I extracted the trajectory frames with the python script provided in the tutorial and here is what the “summary_distances.dat” file looks like. I’m not sure what to make of it or how to fix it:

Here is a full copy of my “md_pull.mdp”:

title = Umbrella pulling simulation
define = -DPOSRES
; Run parameters
integrator = md
dt = 0.002
tinit = 0
nsteps = 25000000 ; 50 ns
nstcomm = 10
; Output parameters
nstxout = 500000 ; Every 10 ps
nstvout = 500000
nstfout = 50000
nstxtcout = 50000
nstenergy = 50000
constraint_algorithm = lincs
lincs_iter = 2
lincs_order = 4
constraints = h-bonds
continuation = yes
pbc = xyz
DispCorr = EnerPres
; PME electrostatics parameters
coulombtype = PME
fourierspacing = 0.12
fourier_nx = 0
fourier_ny = 0
fourier_nz = 0
pme_order = 4
ewald_rtol = 1e-5
optimize_fft = yes
; Pull code
pull = yes
pull_ncoords = 1 ; Only one reaction coordinate
pull_ngroups = 2 ; Two groups defining one reaction coordinate
pull_group1_name = Chain_A
pull_group2_name = Chain_B
pull_group1_pbcatom = 283
pull_group2_pbcatom = 867
pull_coord1_type = umbrella ; Harmonic potential
pull_coord1_geometry = distance ; Simple distance increase
pull_coord1_dim = N N Y
pull_coord1_groups = 1 2
pull_coord1_start = yes ; Define initial COM distance > 0
pull_coord1_rate = 0.0001
pull_coord1_k = 2000 ; kJ mol^-1 nm^-2
pull_pbc_ref_prev_step_com = yes

Again, I appreciate all the help!

You likely need a much larger force constant.

But using distance along one dimension is strange. Then I would use geometry=direction.

And why are use using pull_pbc_ref_prev_step_com=yes?

Hi Hess, appreciate the reply!

The first change to my “md_pull.mdp” file was my force constant to 8000 kJ mol^-1 nm^-2. Second, I realised I had a positional restraint on the whole system, “define = -DPOSRES”. I changed this to “define = -DPOSRES_A” and accordingly changed “topol_Protein_chain_A.itp” to include:
#ifdef POSRES_A
#include “posre_Protein_chain_A.itp”
#endif
I made this change with the logic of restraining one chain and leaving the second chain to freely move in the defined box, thus solving the problem of my chains remaining stationary. However, I received a “Segmentation Fault”.

In addition to these new changes, I used “pull_coord1_geometry = direction”, which prompted me to use “pull_coord1_vec”, which I set to “0.0 0.0 4.5”. I chose 4.5 because I thought it will pull for 4.5 nm in the z axis only.

Also, reading online forums I came to understand “pull_pbc_ref_prev_stom_com=yes” reduces artifacts in pulling simulations. Actually, if I remove this line GROMACS provides the following error message:
”ERROR 1 [file md_pull.mdp]
The maximum distance from the chosen PBC atom (283) of pull group 1 to other atoms in the group is larger than 0.5 times half the box size. Set the pull-pbc-ref-prev-step-com option to yes.”

”ERROR 2 [file md_pull.mdp]
The maximum distance from the chosen PBC atom (867) of pull group 1 to other atoms in the group is larger than 0.5 times half the box size. Set the pull-pbc-ref-prev-step-com option to yes.”

And if I remove the “pull_group1_pbcatom = 283” and “pull_group2_pbcatom = 867” line from the “md_pull.mdp” file, I then recive this error:
”ERROR 1 [file md_pull.mdp]:
When the maximum distance from a pull group reference atom to other atoms in the group is larger than 0.5 times half the box size a centrally placed atom should be chosen as pbcatom. Pull group 1 is larger than that and does not have a specific atom selected as reference atom.”

”ERROR 2 [file md_pull.mdp]:
When the maximum distance from a pull group reference atom to other atoms in the group is larger than 0.5 times half the box size a centrally placed atom should be chosen as pbcatom. Pull group 2 is larger than that and does not have a specific atom selected as reference atom.”

I really appreciate any help with fixing my Umbrella Sampling simulation, my end goal is to determine the binding energy between these chains

How can your pull group be larger than half the box? That doesn’t sound right.