Pull distance issues in a modified version of expanded ensemble

GROMACS version: 2020.2
GROMACS modification: No

Dear GROMACS users,
Recently I’ve been trying a run a modified version of the expanded ensemble on a host-guest binding complex. I defined 40 alchemical states for scaling the van der Waals and electrostatic interactions and applied a restraint between the host molecule and the guest molecule to make sure that the guest did not drift away in the uncoupled or some intermediate state. The difference between my expanded ensemble and the standard expanded ensemble is that I used PLUMED to bias the number of water molecules in the binding cavity in the hope to sample different configurations.

However, my simulation failed due to the following error:

Fatal error:
Distance between pull groups 1 and 2 (1.982856 nm) is larger than 0.49 times
the box size (1.982790).

I’m not entirely sure what was happening there since I didn’t get this error running standard expanded ensemble simulation using the same tpr file. My guess was that the introduction of the biasing on the number of water molecules in the binding cavity somehow make the guest molecule drift away from the host molecule. Therefore, I’ve also tried stronger lambda values of the restraint for the first few states or a larger force constant for the restraint, but the simulations still failed. As you can see from the mdp options that I used (shown below), I’ve already used relatively large values for restraint for the first few intermediate states and used a relatively large force constant. I might try an even larger force constant but I actually don’t think that would solve the problem. I was also wondering if this issue is related to pull-group1-pbcatom = 51 that I used. Since the host molecule is a ring-shape molecule and I was not able to set pull-group1-pbcatom at a virtual atom (like the C.O.M. of the host), so I just picked one atom that did not immediately give error when I launch the simulation. I thought that for a ring molecule, which any atom should be the same, but I’m not entirely sure. I’ve read a lot of discussions about this issue, but none of them seemed to solve my problem. Could anyone please share some experiences or suggestions solving this kind of issue? I would really appreciate that!

Below are the pull code and the free energy options I used, please let me know if more information is required. Thank you!

; Free energy calculation
free_energy = expanded
calc-lambda-neighbors = -1
sc-alpha = 0.5
sc-power = 1
sc-sigma = 0.5
couple-moltype = GST
couple-lambda0 = vdw-q
couple-lambda1 = none
couple-intramol = yes
init-lambda-state = 0
nstdhdl = 1000
dhdl-print-energy = total
; Seed for Monte Carlo in lambda space
lmc-seed = 1000
lmc-gibbsdelta = -1
lmc-forced-nstart = 0
symmetrized-transition-matrix = yes
nst-transition-matrix = 100000
wl-scale = 0.8
wl-ratio = 0.7
init-wl-delta = 2.0
; expanded ensemble variables
nstexpanded = 10
lmc-stats = wang-landau
lmc-move = metropolized-gibbs
lmc-weights-equil = wl-delta
weight-equil-wl-delta = 0.001
wl-oneovert = yes
; Pull code
pull = yes
pull-ngroups = 2
pull-ncoords = 1
pull-group1-name = HST
pull-group2-name = GST
pull-group1-pbcatom = 51
pull-pbc-ref-prev-step-com = yes
pull-coord1-groups = 1 2
pull-coord1-type = umbrella
pull-coord1-geometry = distance
pull-coord1-dim = Y Y Y
pull-coord1-origin = 0.0 0.0 0.0
pull-coord1-vec = 0.0 0.0 0.0
pull-coord1-start = yes
pull-coord1-rate = 0
pull-coord1-k = 0 ; kJ mol^-1 nm^-2
pull-coord1-kB = 5000 ; kJ mol^-1 nm^-2
; lambda-states = 1 2 3 4 5 6 7 8 9 10 11 12
13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
29 30 31 32 33 34 35 36 37 38 39 40
coul-lambdas = 0.0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 0.55
0.6 0.65 0.7 0.75 0.8 0.85 0.9 0.95 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.00 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
vdw_lambdas = 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.2 0.3 0.4 0.45 0.5 0.55 0.6 0.63 0.66 0.69 0.72 0.75 0.78 0.81 0.84 0.88 0.92 1.0
restraint-lambdas = 0.0 0.0005 0.0010 0.0025 0.0050 0.0075 0.01 0.05 0.1 0.2 0.3 0.4
0.5 0.6 0.7 0.8 0.9 0.95 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.00 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
init-lambda-weights = 0.0 0.00 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

Also, here is the PLUMED input file that I used:

center: CENTER ATOMS=1-144 # geometric center of the host molecule
water_group: GROUP ATOMS=207-6656:3 # oxygen atom of the water molecules
n: COORDINATION GROUPA=center GROUPB=water_group R_0=0.6 # radius: 0.6 nm
lambda: EXTRACV NAME=lambda
METAD …
ARG=lambda,n
SIGMA=0.01,0.2 # small SIGMA ensure that the Gaussian approaximate a delta function
HEIGHT=2.0
PACE=10 # should be nstexpanded
GRID_MIN=0,0 # index of alchemical states starts from 0
GRID_MAX=39,50 # we have 40 states in total
GRID_BIN=39,200
TEMP=298
BIASFACTOR=20
LABEL=metad
FILE=HILLS_2D
… METAD
PRINT STRIDE=10 ARG=n,lambda,metad.bias FILE=COLVAR

This is due to periodicity of system. You can not have distance greater than half of box length.

Let’s give an example, for simplicity, let’s consider 1D box case with dimension length x. Say you want to pull B from A

A_i--------B--------A
|<----------x---------->|

Distance between A and its minimum image(A_i) is x. In a periodic system, distance (minimum image distance) between A and B will be either (A_iB) or (AB), whichever is smaller. Apparently, distance between A and B can not be greater than \frac{1}{2}x.

Gromacs can handle such case with pull-coord1-geometry=direction-periodic.You can also avoid situation by decreasing pull length or increasing box size.

1 Like

I have a feeling that there is some bug in handling of the pull distances.
I have written about it here Problem with "Distance between pull groups" and PBC and submitted an issue here https://gitlab.com/gromacs/gromacs/-/issues/3613 but apparently the developers were not able to look into it yet.

Yes, definitely. We are trying to do a REST simulation with plumed using a fixed (rate = 0 ) COM-COM restraint between a ligand and a protein (with the pull module) and the code dies with the same error (the distance of group 1… etc… larger than 0.49). By plotting the COM-COM distance we see that this is not actually the case in none of the replicas. We think that the problem lies in the way the COM of a large group (e.g a protein) is computed in gromacs. COM calculation should use the atomic coordinates of one molecule and not those of inside the MD box that may belong to different PBC images/molecules.

To avoid the mistake (bug) with a protein with max interatomic distance of 5 nm, one has to use a box of at least 10 nm, with PBC images being always 5 nm apart . To avoid PBC entanglements in MD, PBC should be applied ONLY to distances and NEVER to coordinates. Let the molecules drift as a whole wherever they like to go. It is much easier to fix a box where the molecules have drifted to other box replicas (for example by shifting their COM inside the zero-cell) than reconstructing a PBC-broken molecule.