Including parameters of a new molecule in gromacs topology files

Dear gromacs users,

I am asking a question that must have been asked on multiple occasions, but I still do not understand the whole picture clearly. I would like to understand the best way of including the parameters of a new molecule in gromacs topology files.

For example: simulating a protein and a ligand using charmm36 forcefield.

System preparation: I have separately prepared the topologies of the protein in water+ions and of the ligand in water+ions using other software such as the charmm-gui webserver. Next, I insert ligands into the solvated protein box using gmx insert-molecules, replacing some of the water molecules.

Protein solvated topology: Now I have a forcefield.itp file for the protein containing only the forcefield combination rule, the pair-generation rule, the fudgeLJ and fudgeQQ parameters, and the atomtypes, bondtypes, angletypes, dihedraltypes and pairtypes that are needed for the protein, water and ions only. There are separate itp files for water, protein and ions. This forcefield.itp file is not a general file like the one we typically find inside gromacs/share/gromacs/top/charmm36 or any other default forcefield directory of a gromacs installation, which contains all possible charmm36 parameters. The one generated by charmm-gui or a similar resource usually contains only the parameters needed for the molecule of interest.

Ligand topology: For the ligand there is likewise another forcefield.itp file containing the comb-rule, gen-pair, fudge parameters etc., along with the atomtypes, bondtypes and angletypes for the ligand only. Again, this file is specific to the ligand atoms; it is not a generic forcefield file like those that come with a gromacs installation.

Incorporating ligand topologies in the solvated protein topology:
For this, I simply add two additional #include statements: one for the ligand forcefield.itp file (with gen-pair, comb-rule etc. commented out, as these are already present in the solvated protein+water+ions forcefield.itp file) and one for the ligand itp file containing the atoms, bonds and angles of the ligand.

I have sometimes seen that the same atomtypes are defined in the protein forcefield file as well as in the ligand forcefield file. If the parameters are identical, one can use the -maxwarn flag to ignore the resulting warning during gmx grompp.

This is the procedure I am planning to follow, but I would like to know whether it is fine, whether it has any issues, or whether a different procedure should be followed instead.

Any help would be much appreciated and I think this is a topic that is still not very clear for a lot of gromacs users. Thanks in advance.

The [ moleculetype ] section contains a molecule definition, and it has to be included (either directly pasted, or via #include - these are equivalent) in the .top file.

The [ atomtypes ] directive lists all atom types with their LJ parameters, and it should be placed only once, right after [ defaults ]. Other [ ...types ] (dihedrals, bonds, angles) should follow, and they can be repeated (at least dihedrals can be, never tried other ones independently).

Here’s where the complication comes: [ atomtypes ] will be defined in two different .itp files, so you can’t just #include them both as that violates the above principle. And yes, parameters can be duplicated, in which case the one specified later in the file (if I remember correctly) takes precedence. In principle, one should merge the respective sections manually and look out for overlap.
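As a rough illustration of "looking out for overlap", here is a minimal Python sketch that compares the [ atomtypes ] sections of two force field files and reports which type names appear in both, split into those with identical parameters and those that conflict. It is not part of GROMACS or Gromologist. The function names, the sample type XX1 and the sample values are assumptions made up for the example, and it only handles plain text without #include or #ifdef processing.

```python
def _norm(field):
    """Convert numeric fields to float so that 0.00 and 0.000 compare equal."""
    try:
        return float(field)
    except ValueError:
        return field


def parse_atomtypes(text):
    """Return {type name: tuple of remaining fields} for all [ atomtypes ] lines."""
    types = {}
    section = None
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop ';' comments
        if not line or line.startswith("#"):
            continue  # skip blank lines and preprocessor statements
        if line.startswith("["):
            section = line.strip("[] \t").lower()
            continue
        if section == "atomtypes":
            fields = line.split()
            types[fields[0]] = tuple(_norm(f) for f in fields[1:])
    return types


def compare_atomtypes(text_a, text_b):
    """Return (identical, conflicting) lists of atom type names defined in both."""
    a, b = parse_atomtypes(text_a), parse_atomtypes(text_b)
    shared = sorted(set(a) & set(b))
    identical = [name for name in shared if a[name] == b[name]]
    conflicting = [name for name in shared if a[name] != b[name]]
    return identical, conflicting


# Illustrative input only; XX1 is a made-up type name.
PROTEIN_FF = """
[ atomtypes ]
; name at.num mass charge ptype sigma epsilon
CG331  6  12.0110  0.000  A  0.365268  0.326352
XX1    1   1.0080  0.000  A  0.238761  0.100000
HGA3   1   1.0080  0.000  A  0.238761  0.100416
"""

LIGAND_FF = """
[ atomtypes ]
CG331  6  12.0110  0.00   A  0.365268  0.326352
XX1    1   1.0080  0.000  A  0.238761  0.200000
OG311  8  15.9994  0.000  A  0.314487  0.803746
"""

identical, conflicting = compare_atomtypes(PROTEIN_FF, LIGAND_FF)
print("identical duplicates:", identical)
print("conflicting duplicates:", conflicting)
```

Types in the conflicting list are the ones that need a decision before merging. Identical duplicates are harmless but could simply be removed from one of the files so that grompp has nothing to warn about.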

In Gromologist, I included a few functions exactly to deal with this issue. Feel free to test them if that’s helpful in your workflow. There’s also an overwrite parameter to add_parameters_from_file that will decide whether to overwrite a duplicated parameter or ignore it, but perhaps I should add an interactive option to ask it dynamically.

Thank you very much for the detailed explanation. Could you explain a bit more about having two separate definitions of the [ atomtypes ] section? Why would including two separate [ atomtypes ] sections violate anything? I have earlier used two [ defaults ] directives in two itp files, and gromacs gave an error saying that two [ defaults ] directives cannot exist; however, using two [ atomtypes ] directives never produced any error or warning. Is there any documentation that I can follow for this? Thanks in advance…

Chapter 5 of the GROMACS manual explains the topology hierarchy. All force field-level directives have to be declared before molecules can be defined. Having multiple [atomtypes] (or other) directives is not a problem, as long as they all come before the first [moleculetype] directive.
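The ordering rule can be checked mechanically. Below is a small Python sketch, under stated assumptions, that scans topology text and flags any force field-level directive found after the first [ moleculetype ]. The function name and the FF_LEVEL set are illustrative (the set is not an exhaustive list of GROMACS directives), and the text is assumed to have its #include statements already expanded, for example the processed topology that gmx grompp can write with -pp.

```python
import re

# Force field-level directives to look for (illustrative, not exhaustive).
FF_LEVEL = {
    "defaults", "atomtypes", "bondtypes", "pairtypes", "angletypes",
    "dihedraltypes", "constrainttypes", "nonbond_params", "cmaptypes",
}

DIRECTIVE = re.compile(r"^\s*\[\s*(\w+)\s*\]")


def misplaced_directives(text):
    """Return (line number, directive) pairs for force field-level
    directives that appear after the first [ moleculetype ]."""
    seen_molecule = False
    problems = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split(";", 1)[0]  # drop ';' comments
        match = DIRECTIVE.match(line)
        if not match:
            continue
        name = match.group(1).lower()
        if name == "moleculetype":
            seen_molecule = True
        elif seen_molecule and name in FF_LEVEL:
            problems.append((lineno, name))
    return problems


# Two [ atomtypes ] blocks before any molecule: acceptable ordering.
GOOD = "\n".join([
    "[ defaults ]",
    "[ atomtypes ]",
    "[ atomtypes ]",
    "[ moleculetype ]",
    "[ atoms ]",
    "[ moleculetype ]",
    "[ atoms ]",
])

# Second [ atomtypes ] comes after the first molecule: flagged.
BAD = "\n".join([
    "[ defaults ]",
    "[ atomtypes ]",
    "[ moleculetype ]",
    "[ atoms ]",
    "[ atomtypes ]",
    "[ moleculetype ]",
])

print(misplaced_directives(GOOD))
print(misplaced_directives(BAD))
```

An empty result means every force field-level directive in the scanned text precedes the first molecule definition. The check says nothing about whether the parameters themselves are correct.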

Thank you Dr. Lemkul for the clarification. I have checked that I have defined the top file in the following way:

#include "ff.itp" ; this is the forcefield file defining the [ defaults ] and the protein atomtypes, bondtypes, angletypes etc.
#include "lgff.itp" ; this is the ligand atomtypes, bondtypes, angletypes etc.
#include "protein.itp" ; this contains the [ moleculetype ] for the protein, [ atoms ], [ bonds ] etc., for the protein
#include "tip3p.itp"
#include "ions.itp"
#include "ligand.itp" ; this contains the [ moleculetype ] for the ligand, [ atoms ], [ bonds ], etc., for the ligand

I then checked that if I move the #include statement for lgff.itp to after the one for protein.itp, gromacs reports an error when executing the grompp command, which confirms both Dr. Lemkul’s explanation and the gromacs manual. Thus, we can define multiple [ atomtypes ] or other force field-level directives, but they all have to come before the first [ moleculetype ] definition.