Adding SEP and TPO to CHARMM36 ff

GROMACS version: 2020.2
GROMACS modification: No

Dear Gromacs Users

My PDB model contains two phosphorylated residues, phosphoserine (SEP) and phosphothreonine (TPO). I am following the step-by-step process for adding a residue to a force field, as outlined at the following link:

https://manual.gromacs.org/documentation/2020.2/how-to/topology.html

I have updated the residuetypes.dat file in the GROMACS installation directory to include the SEP and TPO residues, both marked as Protein. I also copied the charmm36-nov2018.ff folder from the GROMACS installation directory into my working directory so that I can edit merged.rtp and any other necessary files. However, I am unsure where to obtain the correct topology (atoms, bonds, impropers, etc.) for the SEP and TPO residues in the CHARMM36 force field.

Can I use CHARMM-GUI or is there a related topology generation tool to obtain these details and adapt the results to the merged.rtp file format?

Thank you, Iris


CHARMM36 natively supports phosphoserine (SP1, SP2), phosphothreonine (THP1, THP2), and phosphotyrosine (TP1, TP2). You don’t need to do anything aside from renaming the residues in your input coordinate file.


Great, thank you Justin! I appreciate your expedient response. I also reviewed the nomenclature for SP1/2 and THP1/2 atoms and it looks like I will need to change SEP/TPO O3P atom to OT.
Best, Iris
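
The atom rename mentioned above can be scripted rather than done by hand. This is a minimal sketch, assuming a standard fixed-column PDB file and assuming O3P to OT is the only atom name that differs (every atom should still be checked against the SP1/THP1 entries in merged.rtp). `rename_atom` is a hypothetical helper, not part of GROMACS.

```python
def rename_atom(line, resnames=("SEP", "TPO"), old="O3P", new="OT"):
    """Rename one atom in the given residues of a fixed-column PDB line.

    Atom names occupy columns 13-16; the residue name is read from
    columns 18-21 so that 4-character names such as THP1 also match.
    """
    if not line.startswith(("ATOM", "HETATM")):
        return line
    if line[17:21].strip() not in resnames:
        return line
    if line[12:16].strip() != old:
        return line
    if len(new) > 4:
        raise ValueError("PDB atom names are at most 4 characters")
    # Usual PDB convention: names shorter than 4 characters start in column 14.
    field = new if len(new) == 4 else (" " + new).ljust(4)
    return line[:12] + field + line[16:]
```

If the residues have already been renamed, pass `resnames=("SP1", "THP1")` instead of the defaults.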


Hey Justin,

I tried implementing your solution. However, I am still stopped by the error saying THP1211 (211 is the residue number) belongs to the type ‘Other’. This is the same error I got before, when the residue was still named TPO.

Can you please shed some light?
Thanks,
Hemant

Dear Hemant,

If I remember correctly, you should add the affected residue names to the residuetypes.dat file. You will find the “canonical” copy in the “top” directory of your GROMACS installation (e.g. /usr/share/gromacs/top). Copy it into your working directory and add lines like “THP1 Protein” for each new residue name. This way GROMACS tools will recognize these residues as amino acids.
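
The copy-and-append step can be sketched in a few lines of Python. This assumes residuetypes.dat holds one whitespace-separated "NAME Type" pair per line; `add_protein_residues` is a hypothetical helper name, and the installation path in the usage comment is only the example location given above, so adjust it to your system.

```python
from pathlib import Path


def add_protein_residues(canonical, local, names):
    """Copy residuetypes.dat to `local`, appending any missing names as Protein.

    Returns the list of names that were actually added.
    """
    text = Path(canonical).read_text()
    # First column of every non-empty line is a residue name already defined.
    known = {line.split()[0] for line in text.splitlines() if line.split()}
    if text and not text.endswith("\n"):
        text += "\n"
    added = []
    for name in names:
        if name not in known:
            text += f"{name} Protein\n"
            known.add(name)
            added.append(name)
    Path(local).write_text(text)
    return added


# Example call (path is an assumption, see above):
# add_protein_residues("/usr/share/gromacs/top/residuetypes.dat",
#                      "residuetypes.dat",
#                      ["SP1", "SP2", "THP1", "THP2", "TP1", "TP2"])
```

Starting from the complete canonical file, rather than writing a new file with only the extra names, keeps the standard amino acids classified as Protein.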

Hope this helps

Kind regards,

Andras

Dear Andras,

Thank you for your response. Your suggestion worked.
Thanks a lot.

Regards,
Hemant

Thanks for the suggestion.

I got this error after changing the names in the coordinate file:
Atom -C not found in residue THP1 0, rtp entry THP1 while adding hydrogens.

@bilal
Make sure there is no space between THP1 and the chain ID, so that the columns of this residue line up with the rest of the chain, as in the example below. The error is probably related to how chain starts and termini are detected.

ATOM   2878 HD23 LEU A 100      33.178  28.696  15.759  1.00  0.00      PROA
ATOM   2879  C   LEU A 100      36.233  33.254  15.058  1.00  0.00      PROA
ATOM   2880  O   LEU A 100      37.394  33.567  15.338  1.00  0.00      PROA
ATOM   2881  N   THP1A 101      35.282  34.144  14.783  1.00  0.00      PROA
ATOM   2882  HN  THP1A 101      34.363  33.863  14.519  1.00  0.00      PROA
...
ATOM   2897  O   THP1A 101      35.525  35.316  17.248  1.00  0.00      PROA
ATOM   2898  N   GLU A 102      36.885  36.879  16.379  1.00  0.00      PROA
ATOM   2899  HN  GLU A 102      37.203  37.389  15.584  1.00  0.00      PROA

PDB files are parsed by fixed column widths per field; they are not space- or tab-delimited.
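
Because the columns are fixed, a script is less error-prone than a text editor for renaming residues. This is a rough sketch, assuming a standard PDB layout with the residue name written into columns 18-21 and the chain ID in column 22, which matches the example above. `rename_residue`, `rename_file`, and the default mapping are hypothetical; choosing SP1/THP1 over SP2/THP2 is only for illustration, so pick the variant that fits your system.

```python
# Illustrative mapping from PDB residue names to CHARMM36 names (assumption).
RENAME = {"SEP": "SP1", "TPO": "THP1"}


def rename_residue(line, mapping=RENAME):
    """Rewrite the residue name of one ATOM/HETATM line in place.

    The new name is left-justified in columns 18-21, so a 4-character
    name ends directly before the chain ID in column 22.
    """
    if not line.startswith(("ATOM", "HETATM")):
        return line
    new = mapping.get(line[17:21].strip())
    if new is None:
        return line
    if len(new) > 4:
        raise ValueError("residue name does not fit in columns 18-21")
    return line[:17] + new.ljust(4) + line[21:]


def rename_file(src, dst, mapping=RENAME):
    """Apply rename_residue to every line of a PDB file."""
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            fout.write(rename_residue(line, mapping))
```

All other columns (atom name, chain ID, residue number, coordinates) are left exactly where they were.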

Hi, I have the same problem. I renamed my phosphorylated residues THR 337, 338, 339, 340 and SER 342 to THP1 and SP1.

pdb2gmx result:
Warning: Residue THP1337 in chain has different type (‘Other’) from
residue ASN1 (‘Protein’). This chain lacks identifiers, which makes
it impossible to do strict classification of the start/end residues. Here we
need to guess this residue should not be part of the chain and instead
introduce a break, but that will be catastrophic if they should in fact be
linked. Please check your structure, and add THP1 to residuetypes.dat
if this was not correct.

Warning: Residue THP1338 in chain has different type (‘Other’) from
residue ASN1 (‘Protein’). This chain lacks identifiers, which makes
it impossible to do strict classification of the start/end residues. Here we
need to guess this residue should not be part of the chain and instead
introduce a break, but that will be catastrophic if they should in fact be
linked. Please check your structure, and add THP1 to residuetypes.dat
if this was not correct.

Warning: Residue THP1339 in chain has different type (‘Other’) from
residue ASN1 (‘Protein’). This chain lacks identifiers, which makes
it impossible to do strict classification of the start/end residues. Here we
need to guess this residue should not be part of the chain and instead
introduce a break, but that will be catastrophic if they should in fact be
linked. Please check your structure, and add THP1 to residuetypes.dat
if this was not correct.

Warning: Residue THP1340 in chain has different type (‘Other’) from
residue ASN1 (‘Protein’). This chain lacks identifiers, which makes
it impossible to do strict classification of the start/end residues. Here we
need to guess this residue should not be part of the chain and instead
introduce a break, but that will be catastrophic if they should in fact be
linked. Please check your structure, and add THP1 to residuetypes.dat
if this was not correct.

Warning: Residue PHE341 in chain has different type (‘Protein’) from
residue ASN1 (‘Protein’). This chain lacks identifiers, which makes
it impossible to do strict classification of the start/end residues. Here we
need to guess this residue should not be part of the chain and instead
introduce a break, but that will be catastrophic if they should in fact be
linked. Please check your structure, and add PHE to residuetypes.dat
if this was not correct.

So I copied residuetypes.dat into my working directory and ran pdb2gmx again:
gmx_mpi pdb2gmx -f step5_input.pdb -o CCL19_CCR7active_BARR2_g.pdb -p topology.top -ter

But now a different warning appears:

Warning: Starting residue GLY1 in chain not identified as Protein/RNA/DNA.
This chain lacks identifiers, which makes it impossible to do strict
classification of the start/end residues. Here we need to guess this residue
should not be part of the chain and instead introduce a break, but that will
be catastrophic if they should in fact be linked. Please check your structure,
and add GLY to residuetypes.dat if this was not correct.

Warning: Starting residue THR2 in chain not identified as Protein/RNA/DNA.
This chain lacks identifiers, which makes it impossible to do strict
classification of the start/end residues. Here we need to guess this residue
should not be part of the chain and instead introduce a break, but that will
be catastrophic if they should in fact be linked. Please check your structure,
and add THR to residuetypes.dat if this was not correct.

Warning: Starting residue ASN3 in chain not identified as Protein/RNA/DNA.
This chain lacks identifiers, which makes it impossible to do strict
classification of the start/end residues. Here we need to guess this residue
should not be part of the chain and instead introduce a break, but that will
be catastrophic if they should in fact be linked. Please check your structure,
and add ASN to residuetypes.dat if this was not correct.

Warning: Starting residue ASP4 in chain not identified as Protein/RNA/DNA.
This chain lacks identifiers, which makes it impossible to do strict
classification of the start/end residues. Here we need to guess this residue
should not be part of the chain and instead introduce a break, but that will
be catastrophic if they should in fact be linked. Please check your structure,
and add ASP to residuetypes.dat if this was not correct.

Warning: Starting residue ALA5 in chain not identified as Protein/RNA/DNA.
This chain lacks identifiers, which makes it impossible to do strict
classification of the start/end residues. Here we need to guess this residue
should not be part of the chain and instead introduce a break, but that will
be catastrophic if they should in fact be linked. Please check your structure,
and add ALA to residuetypes.dat if this was not correct.

Disabling further warnings about unidentified residues at start of chain.
Problem with chain definition, or missing terminal residues.
This chain does not appear to contain a recognized chain molecule.
If this is incorrect, you can edit residuetypes.dat to modify the behavior.
8 out of 8 lines of specbond.dat converted successfully
Special Atom Distance matrix:
CYS8 CYS9 CYS34 CYS50
SG94 SG104 SG517 SG781
CYS9 SG104 0.756
CYS34 SG517 0.203 0.708
CYS50 SG781 0.900 0.204 0.815
MET72 SD1150 2.970 2.597 2.910 2.485
Linking CYS-8 SG-94 and CYS-34 SG-517…
Linking CYS-9 SG-104 and CYS-50 SG-781…
Opening force field file ./charmm36-jul2022.ff/aminoacids.arn


Program: gmx pdb2gmx, version 2018.6
Source file: src/gromacs/gmxpreprocess/pdb2gmx.cpp (line 753)

Fatal error:
Atom HT1 in residue GLY 1 was not found in rtp entry GLY with 7 atoms
while sorting atoms.
For a hydrogen, this can be a different protonation state, or it
might have had a different number in the PDB file and was rebuilt
(it might for instance have been H3, and we only expected H1 & H2).
Note that hydrogens might have been added to the entry for the N-terminus.
Remove this hydrogen or choose a different protonation state to solve it.
Option -ignh will ignore all hydrogens in the input.