LINCS/h-bonds with dummies

GROMACS version: 2020.2
GROMACS modification: No

Hi,

Running lots of alchemical simulations lately, I often came across runs that reported very high minimum cell sizes (>1.5 nm) and thus would only run on very few CPUs. I was able to trace that back to a trivial issue with LINCS - when only h-bonds are constrained, and dummy atoms are called e.g. DH1, bonds to dummies are not constrained.

Even though there’s nothing wrong with that in principle (a note about the oscillation period is even raised by grompp, which is cool), it wasn’t very obvious that the DD issue was linked to these dummies. I thought it would be nice to either see an extended warning (that the minimal cell size will be affected) or include bonds to atoms starting with DH* (by an apparently popular convention - dummies that are converted into hydrogens) into the list of bonds constrained with constraints=h-bonds.

(Of course, that leaves out the question of heavy atoms being morphed into hydrogens and vice versa, but these seem to be less of an issue.)

Best regards,
Miłosz Wieczór

Pointing out a potential problem - in polarizable models (e.g. Drude), the Drude oscillators start with D, and you definitely do not want those bonds constrained in conjunction with constraints = h-bonds as it completely breaks the model. The “DH” case might be workable, but perhaps not universally.

Great point, I didn’t catch that having no hands-on experience with Drude. Then maybe an extended note/warning would be the way to go - just renaming the dummy in the topology to anything that starts with H fixed the issue, but not knowing where the problem came from was quite frustrating (even if now seems obvious-ish in retrospect).

Dear Miłosz Wieczór,

I want to run the alchemical simulation and need to make the pdb and topology files for the dummy atoms manually. Unfortunately, I couldn’t find any useful tutorial on the web. Since you are experienced in this, it would be great if you can introduce me some sources. (My aim is to calculate free energy for the system using dual topology.)

Thank you in advance.

Best wishes,
Mehrnoosh

Hi,

Generating topologies for alchemical simulations can get anywhere from peanuts to hours or days of work, so you need to be more specific about your system of interest. Let me know what you intend to do, either here or by email (milafternoon@gmail.com).

Best,
Miłosz

Dear all,

I am interested in these last posts. I want to run a relative alchemical simulation (dual topology method) but I didn’t find any useful tutorial on how to implement it. I would be very grateful if you could give me some advice.
I want to compute the ΔG of a protein in water with 1 (or few) different residue (let’s say SER to ARG).
If I understood well the manual, the mdp file should looks like this:

;====================================================
; MDP file - Production simulation
;====================================================
;----------------------------------------------------
; RUN CONTROL
;----------------------------------------------------
integrator = sd
nsteps = 500000
dt = 0.002
comm-mode = Linear
nstcomm = 100
;----------------------------------------------------
; OUTPUT CONTROL
;----------------------------------------------------
nstxout = 50000
nstvout = 0
nstfout = 0
nstlog = 50000
nstenergy = 50000
nstxtcout = 50000
xtc_precision = 5000
xtc-grps = System
;----------------------------------------------------
; BONDS
;----------------------------------------------------
constraints = hbons
constraint-algorithm = LINCS
;----------------------------------------------------
; NEIGHBOR SEARCHING
;----------------------------------------------------
nstlist = 20
ns_type = grid
rlist = 1.4
pbc = xyz
cutoff-scheme = Verlet
;----------------------------------------------------
; ELECTROSTATICS and VDW
;----------------------------------------------------
coulombtype = PME
coulomb-modifier = none
rcoulomb = 1.2
fourierspacing = 0.1
pme-order = 6
ewald-rtol = 1e-6
vdwtype = shift
vdw-modifier = none
rvdw-switch = 1.0
rvdw = 1.2
optimize_fft = yes
;----------------------------------------------------
; TEMPERATURE & PRESSURE COUPL
;----------------------------------------------------
tc_grps = System
tau_t = 1.0
ref_t = 300
pcoupl = Parrinello-Rahman
pcoupltype = isotropic
tau_p = 2
ref_p = 1.0
compressibility = 4.5e-05
;----------------------------------------------------
; LONG-RANGE DISPERSION CORRECTION
;----------------------------------------------------
DispCorr = EnerPres
;----------------------------------------------------
; VELOCITY GENERATION
;----------------------------------------------------
gen_vel = no
;----------------------------------------------------
; Adaptive Resolution Simulation
;----------------------------------------------------
adress = no
;----------------------------------------------------
; FREE ENERGY CALCULATIONS
;----------------------------------------------------
free-energy = yes
couple-moltype = MOL
couple-lambda0 = vdw-q
couple-lambda1 = vdw-q
couple-intramol = yes
separate-dhdl-file = yes
sc-alpha = 0.5
sc-power = 1
sc-sigma = 0.3
init-lambda-state = 0
nstdhdl = 100
calc-lambda-neighbors = -1
vdw_lambdas=0.0000001 0.000001 0.00001 0.0001 0.001 0.01 0.05 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 0.99 0.999 0.9999 0.99999 0.999999 0.9999999 1"
coul_lambdas=0.000000 0.000000 0.00000 0.0000 0.000 0.00 0.10 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.00 1.000 1.0000 1.00000 1.000000 1.0000000 1

My biggest problem is the dual topology. Since I am going to change a SER in a ARG, I need “turn off” the sidechain of one residue while I am turning on the other. There is any easy way to do it? I read the gromacs manual and in a first trial I build a new hybrid residue (S2R) containing both sidechains:

.rtp FILE
[ S2R ]
[ atoms ]
N N -0.41570 1
H H 0.27190 2
CA CX -0.02490 3
HA H1 0.08430 4
CB 2C 0.21170 5
HB1 H1 0.03520 6
HB2 H1 0.03520 7
OG OH -0.65460 8
HG HO 0.42750 9
CG 0c8 0.03900 10
HG1 0hc 0.02850 11
HG2 0hc 0.02850 12
CD 0c8 0.04860 13
HD1 0h1 0.06870 14
HD2 0h1 0.06870 15
NE 0n2 -0.52950 16
HE 0h 0.34560 17
CZ 0ca 0.80760 18
NH1 0n2 -0.86270 19
HH11 0h 0.44780 20
HH12 0h 0.44780 21
NH2 0n2 -0.86270 22
HH21 0h 0.44780 23
HH22 0h 0.44780 24
C C 0.59730 25
O O -0.56790 26
[ bonds ]
N H
N CA
CA HA
CA CB
CA C
CB HB1
CB HB2
CB OG
OG HG
CB CG
CG HG1
CG HG2
CG CD
CD HD1
CD HD2
CD NE
NE HE
NE CZ
CZ NH1
CZ NH2
NH1 HH11
NH1 HH12
NH2 HH21
NH2 HH22
C O
-C N
[ exclusions ]
OG CG
OG CD
OG NE
OG CZ
OG NH1
OG NH2
OG HG1
OG HG2
OG HD1
OG HD2
OG HE
OG HH11
OG HH12
OG HH21
OG HH22
HG CG
HG CD
HG NE
HG CZ
HG NH1
HG NH2
HG HG1
HG HG2
HG HD1
HG HD2
HG HE
HG HH11
HG HH12
HG HH21

HG HH22
[ impropers ]
-C CA N H
CA +N C O
NE NH1 CZ NH2
CD CZ NE HE
CZ HH11 NH1 HH12
CZ HH21 NH2 HH22

TOPOLOGY
; nr type resnr residue atom cgnr charge mass typeB chargeB massB
1497 N 104 S2R N 1497 -0.4157 14.01 N -0.3479 14.01
1498 H 104 S2R H 1498 0.2719 1.008 H 0.2747 1.008
1499 CX 104 S2R CA 1499 -0.0249 12.01 CX -0.2637 12.0
1500 H1 104 S2R HA 1500 0.0843 1.008 H1 0.156 1.008
1501 2C 104 S2R CB 1501 0.2117 12.01 C8 -0.0007 12.0
1502 H1 104 S2R HB1 1502 0.0352 1.008 HC 0.0327 1.008
1503 H1 104 S2R HB2 1503 0.0352 1.008 HC 0.0327 1.008
1504 OH 104 S2R OG 1504 -0.6546 16 oh 0.0 0.0
1505 HO 104 S2R HG 1505 0.4275 1.008 ho 0.0 0.0
1506 c8 104 S2R CG 1506 0.0 12.01 C8 0.0390 12.01
1507 hc 104 S2R HG1 1507 0.0 1.008 HC 0.0285 1.008
1508 hc 104 S2R HG2 1508 0.0 1.008 HC 0.0285 1.008
1509 c8 104 S2R CD 1509 0.0 12.01 C8 0.0486 12.01
1510 h1 104 S2R HD1 1510 0.0 1.008 H1 0.0687 1.008
1511 h1 104 S2R HD2 1511 0.0 1.008 H1 0.0687 1.008
1512 n2 104 S2R NE 1512 0.0 14.01 N2 -0.5295 14.01
1513 h 104 S2R HE 1513 0.0 1.008 H 0.3456 1.008
1514 ca 104 S2R CZ 1514 0.0 12.01 CA 0.8076 12.01
1515 n2 104 S2R NH1 1515 0.0 14.01 N2 -0.8627 14.01
1516 h 104 S2R HH11 1516 0.0 1.008 H 0.4478 1.008
1517 h 104 S2R HH12 1517 0.0 1.008 H 0.4478 1.008
1518 n2 104 S2R NH2 1518 0.0 14.01 N2 -0.8627 14.01
1519 h 104 S2R HH21 1519 0.0 1.008 H 0.4478 1.008
1520 h 104 S2R HH22 1520 0.0 1.008 H 0.4478 1.008
1521 C 104 S2R C 1521 0.5973 12.01 C 0.7341 12.01
1522 O 104 S2R O 1522 -0.5679 16 O -0.5894 16

where I used new atoms (with low case names) that have sigma and epsilon parameters set to 0 and 0 charge. After I have fixed few problems with bond and angles I got a lot of problems with dihedrals, in particular with the following error:

ERROR 1 [file topol.top, line 25249]:
Cannot automatically perturb a torsion with multiple terms to different
form.
Please specify perturbed parameters manually for this torsion in your
topology!

topol.top, line 25249: 1515 1514 1518 1520 9

How could I solve this problem? There are different and easier way to build the dual topology?

I am really thankful for any of your advance.

Best wishes,
Damiano

Hi Damiano,

This is a common problem when a dihedral is perturbed alchemically in Gromacs. If your dihedral has e.g. periodicity 1 in state A but periodicity 3 in state B, you should have one explicitly defined entry with periodicity 1 that goes from whatever to 0, and another explicitly defined entry with periodicity 3 that goes from 0 to whatever.

While I give no guarantee this will work, you can try using Gromologist’s add_ff_params() on your system to add the parameters automatically, as documented in the README. I remember having implemented a solution to this exact problem, but never tested it extensively beyond that use case, so please do some sanity checks before you proceed.

Otherwise, if you do not plan to go beyond plain amino acid mutations, you can just use the PMX webserver to generate your Ser to Arg mutant topology - these should come with all dihedrals explicitly defined in the topology.

Best,
Miłosz

1 Like

Dear Miłosz,

Thank you very much for your response. I will go through your advises.

Dear Miłosz and Gromacs users,

using PMX I managed to run FEP in Gromacs, but I have few questions for which I would like a more expert opinion.

  1. Since I am going to perform FEP, I am using the fep-lambda to set the lambda values. Is this correct or is preferable to set the different lambda separately? (like setting vdw-lambdas and coul-lambdas)

  2. In the literature I have found many possible set of values for lambda, but I haven’t understood how the lambda values affect the convergence of the simulation or how to understand if I need more windows. Do you have any suggestion or some documentation I can read?

  3. Which tool should I use to analyse the results? So far I have been using gbar and uwham obtaining completely different results starting from the same data.

  4. To check the convergence of the method, I repeated the simulation multiple times starting from different configurations and expecting similar results. So far I obtained results quite different from every simulation. There are many parameters that can affect the convergence of the simulation (like NSTDHDL, CALC-LAMBDA-NEIGHBORS, NSTEPS, FEP-LAMBDAS), there is a way to understand which of them I should start modify to improve the chances of convergence?

  5. I read a tutorial for absolute binding free energy where they suggest Hamiltonian Replica Exchange (HREX) to improve the convergence. Can I run HREX in Gromacs2020 or do I need to patch it with Plumed first?

I am really thankful for any of your advance.

Best wishes,
Damiano

Hi Damiano,

  1. I personally mostly work with fep-lambdas instead of splitting into vdw-lambdas and coul-lambdas, since if e.g. both endstates have disappearing atoms, you would need two different schemes for uncoupling. I’ve seen people do that in NAMD but not sure if that’s doable in Gromacs, and anyway, the default soft-core seems to alleviate most problems you could expect here.
  2. Many people use equal spacing, but in fact not much happens in the middle range (0.2-0.8) and it’s smarter to focus on points near lambda=0 and 1 where the slope (dH/dl) is highest. I’m personally using my own tool to optimize lambda-spacing using rounds of short simulations with replica exchange and trying to equalize the exchange probabilities, but I also know others implemented similar protocols.
  3. gmx bar is mostly fine, if you compare that with TI or mBAR the results will be mostly identical. There’s a nice library that uses many methods at once, but it’s in Python 2 and they apparently still haven’t moved all the functionalities to the more robust Py3 version.
  4. NSTDHDL can be as low as needed but, say, 50-100 is pretty much fine. CALC-LAMBDA-NEIGHBORS is reasonably well defined in the gromacs docs. In terms of simulation length, this really depends on how many soft degrees of freedom your system has, and how efficiently/reliably you can sample them. When in trouble, there are ways of combining alchemy with enhanced sampling but I’m only starting to build my experience here. I often see evolution of instantaneous (binned per 1 ns) free energies on the timescale of tens of nanoseconds, so I’m personally skeptical when values from 1-5 ns per bin are reported for complex cases with many soft degrees of freedom (like protein-protein complexes).
  5. Purely lambda-based HREX should work out of the box in Gromacs. You need the patch for replica exchanges based on generic topology modifications (such as REST2), which goes back to the previous point.

Best,
Miłosz