Pdb2gmx broken for single-residue structures in GROMACS 2021.1

GROMACS version: 2021.1
GROMACS modification: No

I have prepared a single-residue structure (GLU.pdb) using PyMOL:

HEADER GLU
ATOM      1  N   GLU X   1      -0.517  -2.346  -1.038  1.00  0.00           N  
ATOM      2  CA  GLU X   1       0.042  -0.997  -1.038  1.00  0.00           C  
ATOM      3  C   GLU X   1       1.551  -1.037  -1.038  1.00  0.00           C  
ATOM      4  O   GLU X   1       2.180  -2.105  -1.038  1.00  0.00           O  
ATOM      5  CB  GLU X   1      -0.481  -0.212   0.199  1.00  0.00           C  
ATOM      6  CG  GLU X   1      -0.006   1.273   0.310  1.00  0.00           C  
ATOM      7  CD  GLU X   1      -0.430   2.077   1.525  1.00  0.00           C  
ATOM      8  OE1 GLU X   1      -1.158   1.575   2.403  1.00  0.00           O  
ATOM      9  OE2 GLU X   1      -0.000   3.247   1.597  1.00  0.00           O1-
ATOM     10  H   GLU X   1       0.110  -3.225  -1.038  1.00  0.00           H  
ATOM     11  HA  GLU X   1      -0.284  -0.479  -1.958  1.00  0.00           H  
ATOM     12 2HB  GLU X   1      -0.192  -0.743   1.129  1.00  0.00           H  
ATOM     13 3HB  GLU X   1      -1.588  -0.222   0.212  1.00  0.00           H  
ATOM     14 2HG  GLU X   1      -0.316   1.858  -0.571  1.00  0.00           H  
ATOM     15 3HG  GLU X   1       1.094   1.340   0.339  1.00  0.00           H  
END

In GROMACS 2020.3, topology can be prepared for it successfully using gmx pdb2gmx -f GLU.pdb -ff charmm36-jul2020 -ter -ignh:

Command line:
  gmx pdb2gmx -f GLU.pdb -ff charmm36-jul2020 -ter -ignh


Using the Charmm36-jul2020 force field in directory charmm36-jul2020.ff

Opening force field file /home/david/Programs/gromacs-2020.3/share/gromacs/top/charmm36-jul2020.ff/watermodels.dat

Select the Water Model:
 1: TIP3P	TIP 3-point, recommended, by default uses CHARMM TIP3 with LJ on H
 2: TIP4P	TIP 4-point
 3: TIP5P	TIP 5-point
 4: SPC		simple point charge
 5: SPC/E	extended simple point charge
 6: None
1
going to rename charmm36-jul2020.ff/merged.r2b
Opening force field file /home/david/Programs/gromacs-2020.3/share/gromacs/top/charmm36-jul2020.ff/merged.r2b
Reading GLU.pdb...
Read 'GLU', 9 atoms
Analyzing pdb file
Splitting chemical chains based on TER records or chain id changing.
There are 1 chains and 0 blocks of water and 1 residues with 9 atoms

  chain  #res #atoms
  1 'X'     1      9  

All occupancies are one
Opening force field file /home/david/Programs/gromacs-2020.3/share/gromacs/top/charmm36-jul2020.ff/atomtypes.atp
Reading residue database... (Charmm36-jul2020)
Opening force field file /home/david/Programs/gromacs-2020.3/share/gromacs/top/charmm36-jul2020.ff/merged.rtp
Opening force field file /home/david/Programs/gromacs-2020.3/share/gromacs/top/charmm36-jul2020.ff/merged.hdb
Opening force field file /home/david/Programs/gromacs-2020.3/share/gromacs/top/charmm36-jul2020.ff/merged.n.tdb
Opening force field file /home/david/Programs/gromacs-2020.3/share/gromacs/top/charmm36-jul2020.ff/merged.c.tdb

Back Off! I just backed up topol.top to ./#topol.top.12#
Processing chain 1 'X' (9 atoms, 1 residues)
Identified residue GLU1 as a starting terminus.
Identified residue GLU1 as a ending terminus.
8 out of 8 lines of specbond.dat converted successfully
Select start terminus type for GLU-1
 0: NH3+
 1: NH2
 2: 5TER
 3: None
0
Start terminus GLU-1: NH3+
Select end terminus type for GLU-1
 0: COO-
 1: COOH
 2: CT2
 3: 3TER
 4: None
0
End terminus GLU-1: COO-
Opening force field file /home/david/Programs/gromacs-2020.3/share/gromacs/top/charmm36-jul2020.ff/merged.arn
Checking for duplicate atoms....
Generating any missing hydrogen atoms and/or adding termini.
Now there are 1 residues with 18 atoms
Making bonds...
Number of bonds was 17, now 17
Generating angles, dihedrals and pairs...
Before cleaning: 39 pairs
Before cleaning: 39 dihedrals
Keeping all generated dihedrals
Making cmap torsions...
There are   39 dihedrals,    2 impropers,   30 angles
            39 pairs,       17 bonds and     0 virtual sites
Total mass 146.123 a.m.u.
Total charge -1.000 e
Writing topology

Back Off! I just backed up posre.itp to ./#posre.itp.5#

Writing coordinate file...

Back Off! I just backed up conf.gro to ./#conf.gro.5#
		--------- PLEASE NOTE ------------
You have successfully generated a topology from: GLU.pdb.
The Charmm36-jul2020 force field and the tip3p water model are used.
		--------- ETON ESAELP ------------

However, using GROMACS 2021.1, weird things occur. First, using gmx pdb2gmx -f GLU.pdb -ff charmm36-feb2021 -ter -ignh, the following output results:

Command line:
  gmx pdb2gmx -f GLU.pdb -ff charmm36-feb2021 -ter -ignh

Using the Charmm36-feb2021 force field in directory charmm36-feb2021.ff
Opening force field file /home/david/Programs/gromacs-2021.1/share/gromacs/top/charmm36-feb2021.ff/watermodels.dat

Select the Water Model:

 1: TIP3P	TIP 3-point, recommended, by default uses CHARMM TIP3 with LJ on H

 2: TIP4P	TIP 4-point

 3: TIP5P	TIP 5-point

 4: SPC		simple point charge

 5: SPC/E	extended simple point charge

 6: None
1

going to rename charmm36-feb2021.ff/merged.r2b
Opening force field file /home/david/Programs/gromacs-2021.1/share/gromacs/top/charmm36-feb2021.ff/merged.r2b
Reading GLU.pdb...
Read 'GLU', 9 atoms

Analyzing pdb file
Splitting chemical chains based on TER records or chain id changing.

There are 1 chains and 0 blocks of water and 1 residues with 9 atoms

  chain  #res #atoms

  1 'X'     1      9  

All occupancies are one
All occupancies are one
Opening force field file /home/david/Programs/gromacs-2021.1/share/gromacs/top/charmm36-feb2021.ff/atomtypes.atp

Reading residue database... (Charmm36-feb2021)
Opening force field file /home/david/Programs/gromacs-2021.1/share/gromacs/top/charmm36-feb2021.ff/merged.rtp
Opening force field file /home/david/Programs/gromacs-2021.1/share/gromacs/top/charmm36-feb2021.ff/merged.hdb
Opening force field file /home/david/Programs/gromacs-2021.1/share/gromacs/top/charmm36-feb2021.ff/merged.n.tdb
Opening force field file /home/david/Programs/gromacs-2021.1/share/gromacs/top/charmm36-feb2021.ff/merged.c.tdb

Back Off! I just backed up topol.top to ./#topol.top.11#

Processing chain 1 'X' (9 atoms, 1 residues)

Identified residue GLU1 as a starting terminus.

Identified residue GLU1 as a ending terminus.
8 out of 8 lines of specbond.dat converted successfully
Opening force field file /home/david/Programs/gromacs-2021.1/share/gromacs/top/charmm36-feb2021.ff/merged.arn

Checking for duplicate atoms....

Generating any missing hydrogen atoms and/or adding termini.

Now there are 1 residues with 15 atoms

Making bonds...

Number of bonds was 14, now 14

Generating angles, dihedrals and pairs...
Before cleaning: 30 pairs
Before cleaning: 30 dihedrals
Keeping all generated dihedrals

Making cmap torsions...

There are   30 dihedrals,    2 impropers,   23 angles
            30 pairs,       14 bonds and     0 virtual sites

Total mass 128.107 a.m.u.

Total charge -1.000 e

Writing topology

Back Off! I just backed up posre.itp to ./#posre.itp.4#

Writing coordinate file...

Back Off! I just backed up conf.gro to ./#conf.gro.4#

		--------- PLEASE NOTE ------------

You have successfully generated a topology from: GLU.pdb.

The Charmm36-feb2021 force field and the tip3p water model are used.

		--------- ETON ESAELP ------------

Despite the -ter switch, I am not prompted for the termini at all. The topology, however, looks like this:

[ atoms ]
;   nr       type  resnr residue  atom   cgnr     charge       mass  typeB    chargeB      massB
; residue   1 GLU rtp GLU  q -1.0
     1        NH1      1    GLU      N      1      -0.47     14.007
     2          H      1    GLU     HN      2       0.31      1.008
     3        CT1      1    GLU     CA      3       0.07     12.011
     4        HB1      1    GLU     HA      4       0.09      1.008
     5       CT2A      1    GLU     CB      5      -0.18     12.011
     6        HA2      1    GLU    HB1      6       0.09      1.008
     7        HA2      1    GLU    HB2      7       0.09      1.008
     8        CT2      1    GLU     CG      8      -0.28     12.011
     9        HA2      1    GLU    HG1      9       0.09      1.008
    10        HA2      1    GLU    HG2     10       0.09      1.008
    11         CC      1    GLU     CD     11       0.62     12.011
    12         OC      1    GLU    OE1     12      -0.76     15.999
    13         OC      1    GLU    OE2     13      -0.76     15.999
    14          C      1    GLU      C     14       0.51     12.011
    15          O      1    GLU      O     15      -0.51     15.999   ; qtot -1

That is, the termini are not charged – the topology corresponds to a glutamate residue which would be a part of a chain.

Even more weird, when the water model is specified in the command gmx pdb2gmx -f GLU.pdb -ff charmm36-feb2021 -water tip3p -ter -ignh, the following error occurs:

Command line:
  gmx pdb2gmx -f GLU.pdb -ff charmm36-feb2021 -water tip3p -ter -ignh

Using the Charmm36-feb2021 force field in directory charmm36-feb2021.ff

going to rename charmm36-feb2021.ff/merged.r2b
Opening force field file /home/david/Programs/gromacs-2021.1/share/gromacs/top/charmm36-feb2021.ff/merged.r2b
Reading GLU.pdb...
Read 'GLU', 9 atoms

Analyzing pdb file
Splitting chemical chains based on TER records or chain id changing.

There are 1 chains and 0 blocks of water and 1 residues with 9 atoms

  chain  #res #atoms

  1 'X'     1      9  

All occupancies are one
All occupancies are one
Opening force field file /home/david/Programs/gromacs-2021.1/share/gromacs/top/charmm36-feb2021.ff/atomtypes.atp

Reading residue database... (Charmm36-feb2021)
Opening force field file /home/david/Programs/gromacs-2021.1/share/gromacs/top/charmm36-feb2021.ff/merged.rtp
Opening force field file /home/david/Programs/gromacs-2021.1/share/gromacs/top/charmm36-feb2021.ff/merged.hdb
Opening force field file /home/david/Programs/gromacs-2021.1/share/gromacs/top/charmm36-feb2021.ff/merged.n.tdb
Opening force field file /home/david/Programs/gromacs-2021.1/share/gromacs/top/charmm36-feb2021.ff/merged.c.tdb

Back Off! I just backed up topol.top to ./#topol.top.10#

Processing chain 1 'X' (9 atoms, 1 residues)

Identified residue GLU1 as a starting terminus.

Identified residue GLU1 as a ending terminus.
8 out of 8 lines of specbond.dat converted successfully
Opening force field file /home/david/Programs/gromacs-2021.1/share/gromacs/top/charmm36-feb2021.ff/merged.arn

Checking for duplicate atoms....

Generating any missing hydrogen atoms and/or adding termini.

Now there are 1 residues with 15 atoms

Making bonds...

Number of bonds was 14, now 14

Generating angles, dihedrals and pairs...
Before cleaning: 30 pairs

-------------------------------------------------------
Program:     gmx pdb2gmx, version 2021.1
Source file: src/gromacs/gmxpreprocess/pgutil.cpp (line 151)

Fatal error:
Residue 21949 named GLU of a molecule in the input file was mapped
to an entry in the topology database, but the atom N used in
an interaction of type improper in that entry is not found in the
input file. Perhaps your atom and/or residue naming needs to be
fixed.

For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------

This is obviously broken.

Hi,

Did you try gmx2021.1 and charmm36-jul2020 forcefield?

The last case it looks to me a bug (wrong error message?). Please report it in GitLab with all the connected inputs. Please check that someone did not already report it.

Best regards
Alessandra

This looks like an actual bug, will investigate and see that we have an open issue for that.

So, when trying to reproduce this I can confirm that the termini selection doesn’t work, but the water model issue doesn’t happen to me. This might be an issue with the updated charmm forcefield?

I opened pdb2gmx termini selection doesn't work (#4029) · Issues · GROMACS / GROMACS · GitLab for it.

I don’t see how. The only change from February to July 2020 was inclusion of NBFIX terms that were missing, and the OP reports that version 2020.3 works with the same version that fails in 2021.

Ok, then there are likely two bugs, as I can’t reproduce the one coming from selecting the water model locally when using the charmm36-jul2020 forcefield.

Will see that I update the GitLab issue. Thanks @jalemkul for checking1

Hello,

thank you all for your responses. I have done some additional testing on these issues. The “water issue” I described above actually occurs with charmm36-feb2021 only when both the force field and water model are selected from the command line simultaneously (so the command is gmx pdb2gmx -f GLU.pdb -ff charmm36-feb2021 -water tip3p -ter -ignh). When only the FF or the water model (or neither) are selected from the command line (i.e., when interactive selection is needed), pdb2gmx finishes “successfully” (though the termini are still not added as described).

However, when I use the “Charmm27” FF included with GROMACS, the situation is exactly the opposite: both the FF and water model need to be selected from the command line (gmx pdb2gmx -f GLU.pdb -ff charmm27 -water tip3p -ter -ignh), otherwise the same kind of error occurs:

Fatal error:
Residue 22066 named GLU of a molecule in the input file was mapped
to an entry in the topology database, but the atom N used in
an interaction of type improper in that entry is not found in the
input file. Perhaps your atom and/or residue naming needs to be
fixed.

Please note that the “residue number” in the error message is different; this generally changes depending on whether the FF or the water model are selected from the command line.

Both of the described issues (termini selection left out/command line options) disappear when the chains contain more than 1 residue, regardless of the FF and water model used and regardless of whether they are selected interactively or using a command line option (tested with an alanine pentapeptide prepared in PyMOL).

I’ll just add that I’ve built and use GROMACS on Ubuntu 20.04 with CUDA support, and the build completed and was tested with make check without errors.

If I were to guess, could maybe some bugs have been introduced with the new “cyclic molecule” support in pdb2gmx?

David

Found the bug, fix is here: Fix incorrect handling of single residue chains (!1467) · Merge requests · GROMACS / GROMACS · GitLab