dATP molecule won't equilibrate

GROMACS version: 2021
GROMACS modification: No

Hi all, I am a new GROMACS user. I am trying to simulate dATP bound to a protein. I generated topology files for dATP from the crystal structure using ACPYPE with amber03 forcefield and spce water model, and since I had some issues with the protein/dATP complex, I tried simulating dATP on its own to troubleshoot if my topology is the problem.

After solvating the dATP and adding ions, the energy minimization looks fine (converges in 151 steps, potential -7.1e4, max force 9.8e2) and visually the conformation looks OK. When I do NVT equilibration, however, it blows up quickly and the simulation fails due to LINCS warnings. The only way I can complete equilibration is if I change the time step to 0.02 fs. When I examine the output of this shorter equilibration it seems like the temperature is spiking to a crazy degree (image attached).

How can I diagnose the problem with the dATP topology? Since my mdp files are taken from the protein-ligand tutorial it’s possible that there are settings incompatible with amber03. I include the nvt.mdp file below

title = Protein-ligand complex NVT equilibration
define = -DPOSRES ; position restrain the protein and ligand
; Run parameters
integrator = md ; leap-frog integrator
nsteps = 50000 ; 2 * 50000 = 100 ps
;dt = 0.002 ; 2 fs
dt = 0.00002 ; equilibration only works at a very low timestep
; Output control
nstenergy = 500 ; save energies every 1.0 ps
nstlog = 500 ; update log file every 1.0 ps
; nstxout-compressed = 500 ; save coordinates every 1.0 ps
nstxout-compressed = 10 ; save coordinates every 10 steps
; Bond parameters
continuation = no ; first dynamics run
constraint_algorithm = lincs ; holonomic constraints
constraints = h-bonds ; bonds to H are constrained
lincs_iter = 2 ; accuracy of LINCS
lincs_order = 4 ; also related to accuracy
lincs-warnangle = 30 ; maximum angle that a bond can rotate before LINCS will complain (default=30)

; Neighbor searching and vdW
cutoff-scheme = Verlet
ns_type = grid ; search neighboring grid cells
nstlist = 20 ; largely irrelevant with Verlet
rlist = 1.2
vdwtype = cutoff
vdw-modifier = force-switch
rvdw-switch = 1.0
rvdw = 1.2 ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
rcoulomb = 1.2 ; short-range electrostatic cutoff (in nm)
pme_order = 4 ; cubic interpolation
fourierspacing = 0.16 ; grid spacing for FFT
; Temperature coupling
tcoupl = V-rescale ; modified Berendsen thermostat
tc-grps = System ; Protein_DNA Water_and_ions ; !Protein_&_!DNA ; two coupling groups - more accurate
tau_t = 0.1 ;0.1 ; time constant, in ps
ref_t = 300 ;300 ; reference temperature, one for each group, in K
; Pressure coupling
pcoupl = no ; no pressure coupling in NVT
; Periodic boundary conditions
pbc = xyz ; 3-D PBC
; Dispersion correction is not used for proteins with the C36 additive FF
DispCorr = no
; Velocity generation
gen_vel = yes ; assign velocities from Maxwell distribution
gen_temp = 300 ; temperature for Maxwell distribution
gen_seed = -1 ; generate a random seed

Here is my topology file as well:

topol.top:
; Include forcefield parameters
#include “amber03.ff/forcefield.itp”

; Include ligand topology
#include “DTP_clean_GMX.itp”

; Ligand position restraints
#ifdef POSRES
#include “posre_DTP.itp”
#endif

; Include water topology
#include “amber03.ff/spce.itp”

; #ifdef POSRES_WATER
; Position restraint for each water oxygen
; [ position_restraints ]
; i funct fcx fcy fcz
; 1 1 1000 1000 1000
; #endif

; Include topology for ions
#include “amber03.ff/ions.itp”

[ system ]
; Name
Protein in water

[ molecules ]
DTP_clean 1 ; newly added
SOL 1360
NA 2

Indeed they are. Those inputs are for CHARMM, not AMBER.

I will also note that AMBER03 is an incredibly outdated (and questionable…) force field in general and you should be very careful about using it.

Beyond that, the standard troubleshooting advice (which specifically addresses protein-ligand complexes!) applies: https://manual.gromacs.org/current/user-guide/terminology.html#diagnosing-an-unstable-system

Thank you for the response! One question - I only use amber03 since that’s what I see in similar work in the literature. I have the impression that AMBER is better for modeling nucleic acids. Is this true? Is there a resource for choosing the best force field for a given system?

The literature is your best resource. ff03 was parametrized in a totally different way from the remainder of the ff99 lineage, and isn’t really widely used, aside from some modifications made in intrinsically disordered protein simulations. if you only have dATP, nucleic acid parameters are almost irrelevant here, but beware that the AMBER force fields implemented in GROMACS have the ancient ff94 Cornell parameter set (according to the docs), which is entirely unsuitable for modern simulations.

The single most important choice you make in preparing a simulation is the force field you choose. It must be carefully researched, and existing literature evaluated (primary citations of the parameter set to understand how it was parametrized, what its assumptions are, what defects are known and subsequent assessments of quality). It is insufficient to say “people often use this force field” or “AMBER is better for nucleic acids” because there are many different parameter sets that can be used. Modern AMBER parameter sets are excellent for nucleic acids, but again, that doesn’t sound like your biggest concern here. There are also vastly better AMBER parameter sets for proteins (ff14sb, ff19sb, etc), though they are not officially supported by GROMACS. Ports may exist somewhere, but be sure they have been validated before using them…

To add more to what Justin said, you may want to learn a good deal about tleap (from AmberTools) and build your system there. Once you’re satisfied (complex, solvent, ions etc.) you can use ACPYPE to convert AMBER inpcrd/prmtop files to GMX topologies. However AMBER19SB (with CMAP) is still not ported for ACPYPE yet.