System blowing up in decoupled ligand (in protein complex) simulation for ABFE calculations

GROMACS version: 2019.2
GROMACS modification: No

Hi everybody,

For the past couple of months I’ve been trying to use GROMACS to calculate the Absolute Binding Free Energy (ABFE) of a ligand to a protein-RNA complex via non-equilibrium thermodynamic integration (non-eq TI). I regularly get a segmentation fault (core dumped) when running a short (15 ns) production simulation of my ligand fully decoupled (VdW and Coulombic interactions OFF) from the protein-RNA complex. I thought it was due to a really floppy RNA chain, so I implemented a new equilibration procedure between energy minimization and the production simulation: position restraints first on all atoms in NVT, then on all atoms in NPT, and then on all heavy atoms in NPT. After this, one mdrun production simulation succeeded, but two others failed. One failed after 2.85 ns and the other at 7.95 ns, which leads me to believe that it’s probably not a question of a poorly equilibrated starting structure (although I could be wrong). What’s more frustrating is that there are no errors or warnings in the md.log files in either case; the only error message is in the job output file, where it says:

“step 1429500, will finish Sun Sep 18 00:13:53 2022
/var/spool/slurmd/job1002489/slurm_script: line 19: 383301 Segmentation fault (core dumped) gmx mdrun -v -s topol.tpr -ntomp 48 -pin on”
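For anyone matching the step counter in that output to the failure times quoted above, here is a small sketch of the conversion. It only uses dt and nsteps from the mdp file below; the function name is made up for illustration, and the last step printed by -v is only an approximate marker of where the crash happened.

```python
# Convert an mdrun step counter to simulation time.
DT_PS = 0.002  # dt from the mdp file, in ps


def step_to_ns(step, dt_ps=DT_PS):
    """Return the simulation time in ns reached at a given step."""
    return step * dt_ps / 1000.0


crash_step = 1429500   # last step reported in the job output
total_steps = 7500000  # nsteps from the mdp file

print(f"last reported step: {step_to_ns(crash_step):.3f} ns "
      f"of a {step_to_ns(total_steps):.1f} ns run")
```

This puts the first crash at roughly 2.86 ns into the 15 ns run, consistent with the figure quoted above.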

It’s the same case for the other one that failed, just at a later time. I know that some GROMACS versions are finicky about free-energy mdrun runs, and I stopped using the 2021 version our lab has because of a similar issue (which was initially solved by switching back to 2019). Any advice on this would be super helpful! I feel like I’ve been staring at this same problem for so long, and I’m unsure what else to do for troubleshooting.

My mdp file:
; File ‘mdout.mdp’ was generated
; By user: ()
; On host:
; At date: Thu Sep 15 12:45:25 2022
; Created by:
; :-) GROMACS - gmx grompp, 2019.2 (-:
; Executable: /home//lab/gromacs/2019/2019.2-impi2018-fftw377-gcc550-cuda90/bin/gmx
; Data prefix: /home//lab/gromacs/2019/2019.2-impi2018-fftw377-gcc550-cuda90
; Working dir: ///ABFE/A3_OBM/state_B/03
; Command line:
; gmx grompp -f ABFE_prod.mdp -c npt2.gro -r npt2.gro -o topol.tpr -p

; Preprocessor information: use cpp syntax.
; e.g.: -I/home/joe/doe -I/home/mary/roe
include =
; e.g.: -DPOSRES -DFLEXIBLE (note these variable names are case sensitive)
define =

integrator = sd
; Start time and timestep in ps
tinit = 0
dt = 0.002
nsteps = 7500000
; For exact run continuation or redoing part of a run
init-step = 0
; Part index is updated automatically on checkpointing (keeps files separate)
simulation-part = 1
; mode for center of mass motion removal
comm-mode = Linear
; number of steps for center of mass motion removal
nstcomm = 100
; group(s) for center of mass motion removal
comm-grps =

; Friction coefficient (amu/ps) and random seed
bd-fric = 0
ld-seed = -1

; Force tolerance and initial step-size
emtol = 10
emstep = 0.01
; Max number of iterations in relax-shells
niter = 20
; Step size (ps^2) for minimization of flexible constraints
fcstep = 0
; Frequency of steepest descents steps when doing CG
nstcgsteep = 1000
nbfgscorr = 10

rtpi = 0.05

; Output frequency for coords (x), velocities (v) and forces (f)
nstxout = 25000
nstvout = 25000
nstfout = 0
; Output frequency for energies to log file and energy file
nstlog = 25000
nstcalcenergy = 25000
nstenergy = 25000
; Output frequency and precision for .xtc file
nstxout-compressed = 25000
compressed-x-precision = 25000
; This selects the subset of atoms for the compressed
; trajectory file. You can select multiple groups. By
; default, all atoms will be written.
compressed-x-grps =
; Selection of energy groups
energygrps =

; cut-off scheme (Verlet: particle based cut-offs, group: using charge groups)
cutoff-scheme = Verlet
; nblist update frequency
nstlist = 10
; ns algorithm (simple or grid)
ns-type = grid
; Periodic boundary conditions: xyz, no, xy
pbc = xyz
periodic-molecules = no
; Allowed energy error due to the Verlet buffer in kJ/mol/ps per atom,
; a value of -1 means: use rlist
verlet-buffer-tolerance = 0.005
; nblist cut-off
rlist = 1.0
; long-range cut-off for switched potentials

; Method for doing electrostatics
coulombtype = PME
coulomb-modifier = Potential-shift-Verlet
rcoulomb-switch = 0
rcoulomb = 1.0
; Relative dielectric constant for the medium and the reaction field
epsilon-r = 1
epsilon-rf = 0
; Method for doing Van der Waals
vdw-type = PME
vdw-modifier = Potential-Shift
; cut-off lengths
rvdw-switch = 0
rvdw = 1.0
; Apply long range dispersion corrections for Energy and Pressure
DispCorr = EnerPres
; Extension of the potential lookup tables beyond the cut-off
table-extension = 1
; Separate tables between energy group pairs
energygrp-table =
; Spacing for the PME/PPPM FFT grid
fourierspacing = 0.10
; FFT grid size, when a value is 0 fourierspacing will be used
fourier-nx = 0
fourier-ny = 0
fourier-nz = 0
; EWALD/PME/PPPM parameters
pme-order = 6
ewald-rtol = 1e-6
ewald-rtol-lj = 1e-3
lj-pme-comb-rule = Geometric
ewald_geometry = 3d
epsilon-surface = 0
implicit-solvent = no

; Temperature coupling
tcoupl = No
nsttcouple = -1
nh-chain-length = 10
print-nose-hoover-chain-variables = no
; Groups to couple separately
tc_grps = System
; Time constant (ps) and reference temperature (K)
tau_t = 1.0
ref_t = 300
; pressure coupling
pcoupl = Parrinello-Rahman
pcoupltype = isotropic
nstpcouple = -1
; Time constant (ps), compressibility (1/bar) and reference P (bar)
tau_p = 2
compressibility = 4.5e-05
ref_p = 1.0
; Scaling of reference coordinates, No, All or COM
refcoord-scaling = No

; OPTIONS FOR QMMM calculations
QMMM = no
; Groups treated Quantum Mechanically
QMMM-grps =
; QM method
QMmethod =
; QMMM scheme
QMMMscheme = normal
; QM basisset
QMbasis =
; QM charge
QMcharge =
; QM multiplicity
QMmult =
; Surface Hopping
SH =
; CAS space options
CASorbitals =
CASelectrons =
SAon =
SAoff =
SAsteps =
; Scale factor for MM charges
MMChargeScaleFactor = 1

; Type of annealing for each temperature group (no/single/periodic)
annealing =
; Number of time points to use for specifying annealing in each group
annealing-npoints =
; List of times at the annealing points for each group
annealing-time =
; Temp. at each annealing point, for each group.
annealing-temp =

gen_vel = no
gen-temp = 300
gen-seed = -1

constraints = h-bonds
; Type of constraint algorithm
constraint_algorithm = lincs
; Do not constrain the start configuration
continuation = yes
; Use successive overrelaxation to reduce the number of shake iterations
Shake-SOR = no
; Relative tolerance of shake
shake-tol = 0.0001
; Highest order in the expansion of the constraint coupling matrix
lincs_order = 6
; Number of iterations in the final step of LINCS. 1 is fine for
; normal simulations, but use 2 to conserve energy in NVE runs.
; For energy minimization with constraints it should be 4 to 8.
lincs_iter = 1
; Lincs will write a warning to the stderr if in one step a bond
; rotates over more degrees than
lincs-warnangle = 30
; Convert harmonic bonds to morse potentials
morse = no

; Pairs of energy groups for which all non-bonded interactions are excluded
energygrp-excl =

; Number of walls, type, atom types, densities and box-z scale factor for Ewald
nwall = 0
wall-type = 9-3
wall-r-linpot = -1
wall-atomtype =
wall-density =
wall-ewald-zfac = 3

pull = no

; AWH biasing
awh = no

; Enforced rotation: No or Yes
rotation = no

; Group to display and/or manipulate in interactive MD session
IMD-group =

; NMR refinement stuff
; Distance restraints type: No, Simple or Ensemble
disre = No
; Force weighting of pairs in one distance restraint: Conservative or Equal
disre-weighting = Conservative
; Use sqrt of the time averaged times the instantaneous violation
disre-mixed = no
disre-fc = 1000
disre-tau = 0
; Output frequency for pair distances to energy file
nstdisreout = 100
; Orientation restraints: No or Yes
orire = no
; Orientation restraints force constant and tau for time averaging
orire-fc = 0
orire-tau = 0
orire-fitgrp =
; Output frequency for trace(SD) and S to energy file
nstorireout = 100

; Free energy variables
free-energy = yes
couple-moltype = OBM
couple-lambda0 = vdw-q
couple-lambda1 = none
couple-intramol = no
init-lambda = 1
init-lambda-state = -1
delta-lambda = 0
nstdhdl = 25000
fep-lambdas =
mass-lambdas =
coul-lambdas =
vdw-lambdas =
bonded-lambdas =
restraint-lambdas =
temperature-lambdas =
calc-lambda-neighbors = -1
init-lambda-weights =
dhdl-print-energy = no
sc-alpha = 0
sc-power = 1
sc-r-power = 6
sc-sigma = 0.3
sc-coul = no
separate-dhdl-file = yes
dhdl-derivatives = yes
dh_hist_size = 0
dh_hist_spacing = 0.1

; Non-equilibrium MD stuff
acc-grps =
accelerate =
freezegrps =
freezedim =
cos-acceleration = 0
deform =

; simulated tempering variables
simulated-tempering = no
simulated-tempering-scaling = geometric
sim-temp-low = 300
sim-temp-high = 300

; Ion/water position swapping for computational electrophysiology setups
; Swap positions along direction: no, X, Y, Z
swapcoords = no
adress = no

; User defined thingies
user1-grps =
user2-grps =
userint1 = 0
userint2 = 0
userint3 = 0
userint4 = 0
userreal1 = 0
userreal2 = 0
userreal3 = 0
userreal4 = 0
; Electric fields
; Format for electric-field-x, etc. is: four real variables:
; amplitude (V/nm), frequency omega (1/ps), time for the pulse peak (ps),
; and sigma (ps) width of the pulse. Omega = 0 means static field,
; sigma = 0 means no pulse, leaving the field to be a cosine function.
electric-field-x = 0 0 0 0
electric-field-y = 0 0 0 0
electric-field-z = 0 0 0 0

I don’t know what the issue could be, but this looks like a bug. The only known issue I can find is a memory issue that was fixed in 2019.5, but I don’t know if that is the cause.

What was your issue with 2021? Has that been reported or fixed?

I would suggest to try 2022.3.

Thank you! I will try to get my hands on the latest version. I just submitted the same job with 2020 just to see if that changes anything…

This is the error I get when I try the same exact input files but with 2021.1. It pops up pretty much all the time when I try to use mdrun on this system with 2021:

-------- -------- — Thank You — -------- --------

There are: 53235 Atoms
Center of mass motion removal mode is Linear
We have the following groups for center of mass motion removal:
0: rest

Started mdrun on rank 0 Wed Sep 28 16:36:30 2022

       Step           Time
          0        0.00000

Program: gmx mdrun, version 2021.1
Source file: src/gromacs/gmxlib/nonbonded/nb_free_energy.cpp (line 854)

Fatal error:
There are 4 perturbed non-bonded pair interactions beyond the pair-list cutoff
of 1.001 nm, which is not supported. This can happen because the system is
unstable or because intra-molecular interactions at long distances are
excluded. If the latter is the case, you can try to increase nstlist or rlist
to avoid this.The error is likely triggered by the use of couple-intramol=no
and the maximal distance in the decoupled molecule exceeding rlist.

For more information and tips for troubleshooting, please check the GROMACS
website at Common Errors — GROMACS webpage documentation

This is an actual error for a limitation in GROMACS. Are you decoupling a molecule with couple-intramol=no? If so, how large is your molecule?

Yes, couple-intramol=no for my production simulations. My molecule is fairly large, formula is C20H26O6, molecular weight is 364 g/mol. Is there a reason why this error would come up in 2021 specifically and not 2019/2020?

The old code had no check, so there is no error message, but the results will be incorrect.

For such a large molecule you want to use couple-intramol=yes anyway, which makes the sampling much easier. That avoids the issue.
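If it helps, this is a rough sketch of flipping that option in an existing mdp file from a script rather than by hand. The helper name set_mdp_option is hypothetical (it is not part of GROMACS), it treats '-' and '_' in option names as interchangeable on the assumption that grompp does the same, and it drops any trailing comment on the line it rewrites.

```python
def set_mdp_option(mdp_text, key, value):
    """Return mdp_text with `key` set to `value`, appending it if absent."""
    def norm(name):
        # Treat '-' and '_' as equivalent in option names (assumption).
        return name.strip().replace("_", "-")

    out, found = [], False
    for line in mdp_text.splitlines():
        body = line.split(";", 1)[0]  # ignore everything after a comment marker
        if "=" in body and norm(body.split("=", 1)[0]) == norm(key):
            out.append(f"{key} = {value}")
            found = True
        else:
            out.append(line)
    if not found:
        out.append(f"{key} = {value}")
    return "\n".join(out) + "\n"


# Toy fragment using the option names from the mdp file in this thread.
fragment = "free-energy = yes\ncouple-moltype = OBM\ncouple-intramol = no\n"
print(set_mdp_option(fragment, "couple-intramol", "yes"))
```

The patched text still has to go through gmx grompp again to produce a new topol.tpr.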

Thank you! Is there an explicit cut-off size for ligands where you would want to always use couple-intramol=yes?

Sorry, I actually have a follow-up question. I saw in a previous discussion on this forum that couple-intramol is normally set to no for all absolute free energy calculations (and that’s normally what is done in the ABFE papers I see), but are systems with larger ligands a general exception to this? Thank you again for replying to this post!

For technical reasons, you cannot use couple-intramol=no when decoupled atoms in a molecule are separated by more than rlist. Newer GROMACS versions properly check for this and exit with a fatal error.
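A quick way to see whether a ligand runs into this limit is to measure its largest intramolecular atom-atom distance in a structure file and compare it with rlist. The sketch below is only an illustration under several assumptions: the standard fixed-width .gro layout with three decimals, a ligand that is whole in the box (no periodic imaging is applied), and made-up toy coordinates. The function names are hypothetical. Note also that mdrun may adjust the actual pair-list cutoff; the error above reports 1.001 nm.

```python
import itertools
import math

RLIST_NM = 1.0  # rlist from the mdp file in this thread


def gro_atom_line(resnr, resname, atomname, atomnr, x, y, z):
    """Format one atom line in the standard fixed-width .gro layout."""
    return f"{resnr:5d}{resname:<5s}{atomname:>5s}{atomnr:5d}{x:8.3f}{y:8.3f}{z:8.3f}"


def residue_coords(gro_text, resname):
    """Return (x, y, z) tuples in nm for every atom of residue `resname`."""
    lines = gro_text.splitlines()
    natoms = int(lines[1])  # second line of a .gro file holds the atom count
    coords = []
    for line in lines[2:2 + natoms]:
        if line[5:10].strip() == resname:
            coords.append((float(line[20:28]), float(line[28:36]), float(line[36:44])))
    return coords


def max_pair_distance(coords):
    """Largest distance (nm) between any two atoms, without periodic imaging."""
    return max((math.dist(a, b) for a, b in itertools.combinations(coords, 2)),
               default=0.0)


# Toy system: three "OBM" atoms and one water oxygen (coordinates are invented).
atoms = [
    gro_atom_line(1, "OBM", "C1", 1, 0.0, 0.0, 0.0),
    gro_atom_line(1, "OBM", "C2", 2, 1.2, 0.0, 0.0),
    gro_atom_line(1, "OBM", "O1", 3, 0.3, 0.4, 0.0),
    gro_atom_line(2, "SOL", "OW", 4, 3.0, 3.0, 3.0),
]
gro_text = "\n".join(["toy system", f"{len(atoms):5d}", *atoms,
                      "   4.00000   4.00000   4.00000"])

d_max = max_pair_distance(residue_coords(gro_text, "OBM"))
print(f"max intramolecular distance: {d_max:.3f} nm (rlist = {RLIST_NM} nm)")
if d_max > RLIST_NM:
    print("ligand extent exceeds rlist: couple-intramol=no is not supported here")
```

A single frame only gives a lower bound, since a flexible ligand can extend further during the run, so checking several frames is safer.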

For sampling reasons, you want to use couple-intramol=yes for molecules which significantly change conformation between the bound and the unbound state. This will happen when a molecule can hydrogen bond to itself or with any larger molecule, unless it is completely stiff. With couple-intramol=yes you need an extra computation of the coupling of the molecule to the solvent. There you still have a similar sampling issue, but such simulations are cheaper than the ones in a protein.

Thank you so much for this explanation! Setting couple-intramol=yes makes a lot of sense with regard to rlist. I’ve tried running this system on GROMACS 2021.1 with couple-intramol=yes, but I still get the same core dumped segmentation fault in my job output, with no LINCS warnings or any error messages in the md.log file. Is this still a question of system instability? I tried running 100 ns in triplicate: two of the runs failed at around 4.2 ns and the other failed at 27.8 ns. Nothing in the job outputs or the md.log files tells me what the error is, and there are no other warnings besides:
" /cm/local/apps/slurm/var/spool/job16910579/slurm_script: line 40: 2878353 Segmentation fault (core dumped) $MDRUN -nt $SLURM_NTASKS_PER_NODE -ntmpi 1 -s topol.tpr $CPT_FLAGS "
at different points in the simulation. I think someone in my lab is still working on updating to the newest version of Gromacs, so I’ve been doing stuff with 2021 until then.

I really appreciate all your advice thus far and would be very grateful for any more help on this! I am still struggling to understand why this keeps failing :(