Source file: src/gromacs/gmxpreprocess/pdb2gmx.cpp Fatal error Glucid Topology

GROMACS version: version 2023.3-conda_forge
GROMACS modification: No

HEADER    UNCLASSIFIED                            24-Jun-24
TITLE     ALL ATOM STRUCTURE FOR MOLECULE UNK                                   
AUTHOR    AUTOMATED TOPOLOGY BUILDER (ATB) REVISION 2023-06-14 20:38:16
AUTHOR   2  https://atb.uq.edu.au
HETATM    1  H22 9EHK    0      -3.656   1.286   1.701  1.00  0.00           H
HETATM    2   O9 9EHK    0      -4.027   0.648   1.063  1.00  0.00           O
HETATM    3   P1 9EHK    0      -3.049   0.316  -0.161  1.00  0.00           P
HETATM    4  O10 9EHK    0      -2.462   1.468  -0.901  1.00  0.00           O
HETATM    5   O8 9EHK    0      -3.971  -0.568  -1.123  1.00  0.00           O
HETATM    6  H21 9EHK    0      -4.317  -1.380  -0.708  1.00  0.00           H
HETATM    7   O7 9EHK    0      -1.975  -0.653   0.529  1.00  0.00           O
HETATM    8  C16 9EHK    0      -0.832  -1.158  -0.213  1.00  0.00           C
HETATM    9  H28 9EHK    0      -0.916  -0.913  -1.277  1.00  0.00           H
HETATM   10  H29 9EHK    0      -0.847  -2.243  -0.100  1.00  0.00           H
HETATM   11  C13 9EHK    0       0.472  -0.618   0.379  1.00  0.00           C
HETATM   12  H25 9EHK    0       0.394  -0.662   1.478  1.00  0.00           H
HETATM   13   O2 9EHK    0       1.476  -1.523  -0.090  1.00  0.00           O
HETATM   14  C15 9EHK    0       2.773  -1.307   0.478  1.00  0.00           C
HETATM   15  H27 9EHK    0       2.723  -1.480   1.562  1.00  0.00           H
HETATM   16   O6 9EHK    0       3.632  -2.260  -0.057  1.00  0.00           O
HETATM   17  H20 9EHK    0       3.875  -1.939  -0.946  1.00  0.00           H
HETATM   18  C14 9EHK    0       3.275   0.115   0.189  1.00  0.00           C
HETATM   19  H26 9EHK    0       4.184   0.291   0.785  1.00  0.00           H
HETATM   20   O5 9EHK    0       3.586   0.164  -1.200  1.00  0.00           O
HETATM   21  H19 9EHK    0       3.665   1.108  -1.425  1.00  0.00           H
HETATM   22  C12 9EHK    0       2.219   1.156   0.549  1.00  0.00           C
HETATM   23  H24 9EHK    0       2.106   1.188   1.645  1.00  0.00           H
HETATM   24   O4 9EHK    0       2.678   2.415   0.064  1.00  0.00           O
HETATM   25  H18 9EHK    0       1.895   2.995   0.062  1.00  0.00           H
HETATM   26  C11 9EHK    0       0.858   0.809  -0.055  1.00  0.00           C
HETATM   27  H23 9EHK    0       0.951   0.815  -1.150  1.00  0.00           H
HETATM   28   O3 9EHK    0      -0.031   1.825   0.389  1.00  0.00           O
HETATM   29  H17 9EHK    0      -0.823   1.831  -0.191  1.00  0.00           H
CONECT    1    2
CONECT    2    1    3
CONECT    3    2    4    5    7
CONECT    4    3
CONECT    5    3    6
CONECT    6    5
CONECT    7    3    8
CONECT    8    7    9   10   11
CONECT    9    8
CONECT   10    8
CONECT   11    8   12   13   26
CONECT   12   11
CONECT   13   11   14
CONECT   14   13   15   16   18
CONECT   15   14
CONECT   16   14   17
CONECT   17   16
CONECT   18   14   19   20   22
CONECT   19   18
CONECT   20   18   21
CONECT   21   20
CONECT   22   18   23   24   26
CONECT   23   22
CONECT   24   22   25
CONECT   25   24
CONECT   26   11   22   27   28
CONECT   27   26
CONECT   28   26   29
CONECT   29   28
END

This is the PDB file with the 3D structure of glucose 6-phosphate.

[ 9EHK ]

[ atoms ]
;  nr  type  resnr  resid  atom  cgnr  charge    mass
    1  HS14    1    9EHK    H22    1    0.483   1.0080
    2 OEOpt    1    9EHK     O9    2   -0.612  15.9994
    3     P    1    9EHK     P1    3    1.056  30.9738
    4    OM    1    9EHK    O10    4   -0.621  15.9994
    5 OEOpt    1    9EHK     O8    5   -0.601  15.9994
    6  HS14    1    9EHK    H21    6    0.502   1.0080
    7  OAlc    1    9EHK     O7    7   -0.439  15.9994
    8  CPos    1    9EHK    C16    8    0.012  12.0110
    9    HC    1    9EHK    H28    9    0.091   1.0080
   10    HC    1    9EHK    H29   10    0.117   1.0080
   11  CPos    1    9EHK    C13   11    0.303  12.0110
   12    HC    1    9EHK    H25   12    0.034   1.0080
   13    OE    1    9EHK     O2   13   -0.493  15.9994
   14  CPos    1    9EHK    C15   14    0.501  12.0110
   15    HC    1    9EHK    H27   15    0.036   1.0080
   16  OAlc    1    9EHK     O6   16   -0.631  15.9994
   17  HS14    1    9EHK    H20   17    0.453   1.0080
   18     C    1    9EHK    C14   18   -0.208  12.0110
   19    HC    1    9EHK    H26   19    0.142   1.0080
   20  OAlc    1    9EHK     O5   20   -0.592  15.9994
   21     H    1    9EHK    H19   21    0.452   1.0080
   22  CPos    1    9EHK    C12   22    0.320  12.0110
   23    HC    1    9EHK    H24   23    0.018   1.0080
   24  OAlc    1    9EHK     O4   24   -0.627  15.9994
   25  HS14    1    9EHK    H18   25    0.422   1.0080
   26     C    1    9EHK    C11   26    0.032  12.0110
   27    HC    1    9EHK    H23   27    0.076   1.0080
   28  OAlc    1    9EHK     O3   28   -0.651  15.9994
   29     H    1    9EHK    H17   29    0.425   1.0080
; total charge of the molecule:   0.000

[ bonds ]
;  ai   aj  funct   c0         c1
    1    2    2   0.0971   7.9547e+06
    2    3    2   0.1610   4.8400e+06
    3    4    2   0.1480   8.6000e+06
    3    5    2   0.1600   2.1484e+06
    3    7    2   0.1610   4.8400e+06
    5    6    2   0.0975   3.3662e+07
    7    8    2   0.1450   5.2319e+06
    8    9    2   0.1090   1.2300e+07
    8   10    2   0.1090   1.2300e+07
    8   11    2   0.1530   7.1500e+06
   11   12    2   0.1100   1.2100e+07
   11   13    2   0.1430   8.1800e+06
   11   26    2   0.1540   4.0057e+06
   13   14    2   0.1430   8.1800e+06
   14   15    2   0.1090   1.2300e+07
   14   16    2   0.1390   8.6600e+06
   14   18    2   0.1530   7.1500e+06
   16   17    2   0.0971   7.9547e+06
   18   19    2   0.1100   1.2100e+07
   18   20    2   0.1430   8.1800e+06
   18   22    2   0.1530   7.1500e+06
   20   21    2   0.0972   1.9581e+07
   22   23    2   0.1100   1.2100e+07
   22   24    2   0.1430   8.1800e+06
   22   26    2   0.1530   7.1500e+06
   24   25    2   0.0975   3.3662e+07
   26   27    2   0.1090   1.2300e+07
   26   28    2   0.1430   8.1800e+06
   28   29    2   0.0983   9.8314e+06

[ angles ]
;  ai   aj   ak  funct   angle     fc
    1    2    3    2    109.50   450.00
    2    3    4    2    120.00   780.00
    2    3    5    2    103.00   420.00
    2    3    7    2    103.00   420.00
    4    3    5    2    109.60   450.00
    4    3    7    2    114.00  1559.41
    5    3    7    2    109.60   450.00
    3    5    6    2    109.50   450.00
    3    7    8    2    120.00   530.00
    7    8    9    2    111.00   530.00
    7    8   10    2    106.75   503.00
    7    8   11    2    111.00   530.00
    9    8   10    2    108.53   443.00
    9    8   11    2    111.30   632.00
   10    8   11    2    108.53   443.00
    8   11   12    2    108.00   465.00
    8   11   13    2    106.00  1733.55
    8   11   26    2    120.00   560.00
   12   11   13    2    110.30   524.00
   12   11   26    2    109.60   450.00
   13   11   26    2    109.50   520.00
   11   13   14    2    109.50   450.00
   13   14   15    2    109.50   448.00
   13   14   16    2    109.00  1680.51
   13   14   18    2    111.00   530.00
   15   14   16    2    107.57   484.00
   15   14   18    2    110.30   524.00
   16   14   18    2    111.00   530.00
   14   16   17    2    109.50   450.00
   14   18   19    2    108.53   443.00
   14   18   20    2    109.50   520.00
   14   18   22    2    111.00   530.00
   19   18   20    2    110.30   524.00
   19   18   22    2    109.50   448.00
   20   18   22    2    111.00   530.00
   18   20   21    2    109.50   450.00
   18   22   23    2    108.53   443.00
   18   22   24    2    109.50   520.00
   18   22   26    2    111.00   530.00
   23   22   24    2    110.30   524.00
   23   22   26    2    108.00   465.00
   24   22   26    2    111.00   530.00
   22   24   25    2    109.50   450.00
   11   26   22    2    109.50   520.00
   11   26   27    2    108.00   465.00
   11   26   28    2    115.00   610.00
   22   26   27    2    108.53   443.00
   22   26   28    2    109.50   520.00
   27   26   28    2    111.30   632.00
   26   28   29    2    109.50   450.00

[ dihedrals ]
; GROMOS improper dihedrals
;  ai   aj   ak   al  funct   angle     fc
[ dihedrals ]
;  ai   aj   ak   al  funct    ph0      cp     mult
    1    2    3    7    1      0.00     1.05    3
    1    2    3    7    1      0.00     3.14    2
    3    7    8   11    1    180.00     1.00    3
    5    3    7    8    1      0.00     3.19    3
    5    3    7    8    1      0.00     5.09    2
    7    3    5    6    1      0.00     1.05    3
    7    3    5    6    1      0.00     3.14    2
    7    8   11   13    1      0.00     5.92    3
   11   13   14   16    1      0.00     1.26    3
   13   11   26   28    1      0.00     5.92    3
   13   14   16   17    1      0.00     1.26    3
   13   14   18   20    1      0.00     5.92    3
   20   18   22   24    1      0.00     5.92    3
   22   18   20   21    1      0.00     1.26    3
   22   26   28   29    1      0.00     1.26    3
   24   22   26   28    1      0.00     5.92    3
   26   11   13   14    1      0.00     1.26    3
   26   22   24   25    1      0.00     1.26    3

[ exclusions ]
;  ai   aj  funct  ;  GROMOS 1-4 exclusions

These are the lines I’ve added to the aminoacids.rtp file of the GROMOS force field in the GROMACS installation that I built in a conda environment.

(MD_Gromac) ~/Documents/Molecular_dynamics/md-intro-tutorial/Introduction_to_Molecular_Dynamics/md-intro-tutorial-main/data/input_0$ gmx pdb2gmx -f Gluc.pdb -o glucose_processed.gro -water spce
:-) GROMACS - gmx pdb2gmx, 2023.3-conda_forge (-:

Executable: ~/…/anaconda3/envs/MD_Gromac/bin.AVX2_256/gmx
Data prefix: ~/…/anaconda3/envs/MD_Gromac
Working dir: ~/…/Documents/Molecular_dynamics/md-intro-tutorial/Introduction_to_Molecular_Dynamics/md-intro-tutorial-main/data/input_0
Command line:
gmx pdb2gmx -f Gluc.pdb -o glucose_processed.gro -water spce

Select the Force Field:

From ‘~/…/anaconda3/envs/MD_Gromac/share/gromacs/top’:

1: GROMOS96 54a7 force field2 (Eur. Biophys. J. (2011), 40, 843-856, DOI: 10.1007/s00249-011-0700-9)

2: AMBER03 protein, nucleic AMBER94 (Duan et al., J. Comp. Chem. 24, 1999-2012, 2003)

3: AMBER94 force field (Cornell et al., JACS 117, 5179-5197, 1995)

4: AMBER96 protein, nucleic AMBER94 (Kollman et al., Acc. Chem. Res. 29, 461-469, 1996)

5: AMBER99 protein, nucleic AMBER94 (Wang et al., J. Comp. Chem. 21, 1049-1074, 2000)

6: AMBER99SB protein, nucleic AMBER94 (Hornak et al., Proteins 65, 712-725, 2006)

7: AMBER99SB-ILDN protein, nucleic AMBER94 (Lindorff-Larsen et al., Proteins 78, 1950-58, 2010)

8: AMBERGS force field (Garcia & Sanbonmatsu, PNAS 99, 2782-2787, 2002)

9: CHARMM27 all-atom force field (CHARM22 plus CMAP for proteins)

10: GROMOS96 43a1 force field

11: GROMOS96 43a2 force field (improved alkane dihedrals)

12: GROMOS96 45a3 force field (Schuler JCC 2001 22 1205)

13: GROMOS96 53a5 force field (JCC 2004 vol 25 pag 1656)

14: GROMOS96 53a6 force field (JCC 2004 vol 25 pag 1656)

15: GROMOS96 54a7 force field (Eur. Biophys. J. (2011), 40, 843-856, DOI: 10.1007/s00249-011-0700-9)

16: OPLS-AA/L all-atom force field (2001 aminoacid dihedrals)
1

Using the Gromos54a7_atb force field in directory gromos54a7_atb.ff

going to rename gromos54a7_atb.ff/aminoacids.r2b
Opening force field file ~/…/anaconda3/envs/MD_Gromac/share/gromacs/top/gromos54a7_atb.ff/aminoacids.r2b
Reading Gluc.pdb…
WARNING: all CONECT records are ignored
Read ‘ALL ATOM STRUCTURE FOR MOLECULE UNK’, 29 atoms

Analyzing pdb file
Splitting chemical chains based on TER records or chain id changing.

There are 1 chains and 0 blocks of water and 1 residues with 29 atoms

chain #res #atoms

1 ’ ’ 1 29

All occupancies are one
All occupancies are one
Opening force field file ~/…/anaconda3/envs/MD_Gromac/share/gromacs/top/gromos54a7_atb.ff/atomtypes.atp

Reading residue database… (Gromos54a7_atb)
Opening force field file ~/…/anaconda3/envs/MD_Gromac/share/gromacs/top/gromos54a7_atb.ff/aminoacids.rtp

Using default: not generating all possible dihedrals

Using default: excluding 3 bonded neighbors

Using default: generating 1,4 H–H interactions

Using default: removing proper dihedrals found on the same bond as a proper dihedral

Using default: removing proper dihedrals found on the same bond as a proper dihedral
Opening force field file ~/…/anaconda3/envs/MD_Gromac/share/gromacs/top/gromos54a7_atb.ff/aminoacids.hdb
Opening force field file ~/…/anaconda3/envs/MD_Gromac/share/gromacs/top/gromos54a7_atb.ff/aminoacids.n.tdb
Opening force field file ~/…/anaconda3/envs/MD_Gromac/share/gromacs/top/gromos54a7_atb.ff/aminoacids.c.tdb

Back Off! I just backed up topol.top to ./#topol.top.12#

Processing chain 1 (29 atoms, 1 residues)

Problem with chain definition, or missing terminal residues. This chain does not appear to contain a recognized chain molecule. If this is incorrect, you can edit residuetypes.dat to modify the behavior.
8 out of 8 lines of specbond.dat converted successfully


Program: gmx pdb2gmx, version 2023.3-conda_forge
Source file: src/gromacs/gmxpreprocess/pdb2gmx.cpp (line 870)

Fatal error:
Atom H22 in residue 9EHK 0 was not found in rtp entry 9EHK with 29 atoms
while sorting atoms.

For a hydrogen, this can be a different protonation state, or it
might have had a different number in the PDB file and was rebuilt
(it might for instance have been H3, and we only expected H1 & H2).
Note that hydrogens might have been added to the entry for the N-terminus.
Remove this hydrogen or choose a different protonation state to solve it.
Option -ignh will ignore all hydrogens in the input.

For more information and tips for troubleshooting, please check the GROMACS
website at Common Errors — GROMACS webpage https://www.gromacs.org documentation

This is the error I get, but when I compare the two files they look correctly formatted to me:
PDB

COLUMNS        DATA TYPE       FIELD         DEFINITION
--------------------------------------------------------------------------------
 1 -  6        Record name     "HETATM"
 7 - 11        Integer         serial        Atom serial number.
13 - 16        Atom            name          Atom name.
17             Character       altLoc        Alternate location indicator.
18 - 20        Residue name    resName       Residue name.
22             Character       chainID       Chain identifier.
23 - 26        Integer         resSeq        Residue sequence number.
27             AChar           iCode         Code for insertion of residues.
31 - 38        Real(8.3)       x             Orthogonal coordinates for X in Angstroms.
39 - 46        Real(8.3)       y             Orthogonal coordinates for Y in Angstroms.
47 - 54        Real(8.3)       z             Orthogonal coordinates for Z in Angstroms.
55 - 60        Real(6.2)       occupancy     Occupancy.
61 - 66        Real(6.2)       tempFactor    Temperature factor.
77 - 78        LString(2)      element       Element symbol, right-justified.
79 - 80        LString(2)      charge        Charge on the atom.

rtp

COLUMNS        DATA TYPE       FIELD         DEFINITION
--------------------------------------------------------------------------------
 1 -  5        Integer         nr             Atom number.
 7 - 10        String          type           Atom type.
12 - 14        Integer         resnr          Residue number.
16 - 19        String          resid          Residue name.
21 - 24        String          atom           Atom name.
26 - 29        Integer         cgnr           Charge group number.
31 - 38        Real            charge         Atom charge.
40 - 47        Real            mass           Atom mass.

Do you see anything wrong that could explain the error?

Count the columns in your rtp file again. They don’t seem to match the specification: columns 21-24 (atom name) contain “9EHK” for every atom in your rtp file.

By the way, where does the rtp specification in your post come from? I don’t think it should matter how wide each field is. Have a look at File formats - GROMACS 2024.3 documentation. There should be four fields in [ atoms ]: “name”, “type”, “charge” and “chargegroup”.
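To illustrate the four-field layout, here is a minimal Python sketch that reorders itp-style [ atoms ] rows (nr, type, resnr, resid, atom, cgnr, charge, mass) into name, type, charge, chargegroup. The function name itp_atoms_to_rtp is made up for this example, and the two sample rows are copied from the post above. It only covers the [ atoms ] block; a usable rtp entry would still need [ bonds ] and the other sections, and the atom types have to exist in the chosen force field.

```python
def itp_atoms_to_rtp(lines):
    """Reorder itp-style [ atoms ] rows into rtp-style rows.

    itp order: nr type resnr resid atom cgnr charge [mass]
    rtp order: name type charge chargegroup
    """
    rows = []
    for line in lines:
        body = line.split(";", 1)[0].strip()  # drop comments and blank lines
        if not body:
            continue
        fields = body.split()
        atom_type = fields[1]
        name = fields[4]
        cgnr = int(fields[5])
        charge = float(fields[6])
        rows.append(f"{name:>6} {atom_type:>6} {charge:>9.3f} {cgnr:>5d}")
    return rows


# Two rows taken from the [ atoms ] block posted above.
itp_rows = [
    ";  nr  type  resnr  resid  atom  cgnr  charge    mass",
    "    1  HS14    1    9EHK    H22    1    0.483   1.0080",
    "    2 OEOpt    1    9EHK     O9    2   -0.612  15.9994",
]

rtp_rows = itp_atoms_to_rtp(itp_rows)
print("[ 9EHK ]")
print(" [ atoms ]")
for row in rtp_rows:
    print(row)
```

With the rows in this order, the first field on each line is the atom name (H22, O9, ...), which is the name pdb2gmx tries to match against the atom names in the PDB file.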

Hello, oh, indeed. I’ve just modified the file so that the columns in the spec and in the rtp match… without success; the error is the same. ^^’

I actually came across an itp file on ATB and, by looking at the content of the force field’s rtp file, I just tried to infer the format ^^’ I know… not really clean… I’m unfamiliar with the program and I don’t know how to write an rtp entry describing a carbohydrate from scratch. Is there a way to do this? I would love to learn how to do it :D

Itp files and rtp files are different. They are described in the manual. If you’ve already got an itp file you don’t have to use pdb2gmx at all. Just include the itp file in a topology file.
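As a rough sketch of what “include the itp file in a topology file” can look like, the Python snippet below assembles a minimal topol.top. Several things here are assumptions, not taken from the thread: the itp file name 9EHK.itp, the molecule name 9EHK (it has to match the name in the [ moleculetype ] section of the itp), the presence of spce.itp in the force field directory, and the helper name build_topology. The force field directory gromos54a7_atb.ff is the one shown in the pdb2gmx output above.

```python
def build_topology(forcefield_dir, ligand_itp, molecule_name, system_name):
    """Return the text of a minimal GROMACS topology that includes an itp file."""
    lines = [
        "; force field parameters first, so the atom types are known",
        f'#include "{forcefield_dir}/forcefield.itp"',
        "",
        "; molecule definition from the itp file",
        f'#include "{ligand_itp}"',
        "",
        "; water model (assumed to ship with the force field directory)",
        f'#include "{forcefield_dir}/spce.itp"',
        "",
        "[ system ]",
        system_name,
        "",
        "[ molecules ]",
        "; name       count",
        f"{molecule_name:<12} 1",
    ]
    return "\n".join(lines) + "\n"


topology = build_topology(
    forcefield_dir="gromos54a7_atb.ff",
    ligand_itp="9EHK.itp",        # hypothetical file name
    molecule_name="9EHK",         # must match [ moleculetype ] in the itp
    system_name="Glucose 6-phosphate",
)
print(topology)
```

The order of the #include lines matters: the force field has to come before the molecule's itp so that the atom types used there are already defined.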

I see, thanks a lot! But if one day I don’t have access to an itp file, how could I build an “rtp description” of a molecule such as a nucleic acid or something else? Is there an existing script? A platform? A methodology? Thanks again for your advice.

First, I would say that small molecules that are not built from specific residues are best included in the topology as itp files, whereas residues that can be linked together are most conveniently described in rtp files, which in turn can be used by gmx pdb2gmx. You can of course describe complete molecules in rtp files and use them in gmx pdb2gmx, but personally I do not think that is the best way, unless there is a protein-ligand complex (or similar) in the coordinate file and it is difficult to separate them.

If you want to make an rtp entry for a “new” residue (one you don’t have parameters for), you will need to deduce the atom types and partial charges, describe how the atoms are connected, and also list any improper dihedral angles. This requires that the atom types and bonded parameters that would be used are already described in the force field, i.e., that you do not have to add parameter types. I have never used gmx x2top, but I would assume that tool can help you build a “skeleton” of an rtp file, i.e., the connectivity, but not the improper dihedral angles. I would not trust it to generate any parameters or choose correct atom types.
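For the connectivity part only, a small Python sketch along these lines could turn the CONECT records of a PDB file into name pairs for a [ bonds ] section. This is not gmx x2top and not an official tool; the function name pdb_to_rtp_bonds is made up, and the sample lines are the first three atoms from the PDB posted above. It assigns no atom types, charges or bonded parameters, and GROMOS-style rtp entries usually also carry a bond-type macro on each line, which would have to be added by hand following the force field.

```python
def pdb_to_rtp_bonds(pdb_lines):
    """Build rtp-style bond rows (atom-name pairs) from PDB CONECT records."""
    names = {}   # atom serial number -> atom name
    bonds = []   # unique (serial, serial) pairs in order of appearance
    seen = set()
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            # PDB fixed columns: serial in 7-11, atom name in 13-16
            names[int(line[6:11])] = line[12:16].strip()
        elif line.startswith("CONECT"):
            # whitespace split is enough for small serial numbers
            serials = [int(tok) for tok in line.split()[1:]]
            first = serials[0]
            for other in serials[1:]:
                pair = (min(first, other), max(first, other))
                if pair not in seen:   # each bond is listed from both atoms
                    seen.add(pair)
                    bonds.append(pair)
    return [f"{names[a]:>6} {names[b]:>6}" for a, b in bonds]


sample_pdb = [
    "HETATM    1  H22 9EHK    0      -3.656   1.286   1.701  1.00  0.00           H",
    "HETATM    2   O9 9EHK    0      -4.027   0.648   1.063  1.00  0.00           O",
    "HETATM    3   P1 9EHK    0      -3.049   0.316  -0.161  1.00  0.00           P",
    "CONECT    1    2",
    "CONECT    2    1    3",
    "CONECT    3    2",
]

bond_rows = pdb_to_rtp_bonds(sample_pdb)
print(" [ bonds ]")
for row in bond_rows:
    print(row)
```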

If you need to generate any force field parameters, bonded or non-bonded (including partial charges) you should follow the protocol from the force field publications.

Thanks for your valuable advice! All the best.