Adding a new residue to Gromos54a7 FF

Hi Gromacs users,

I have a 65 amino-acid protein chain where I added acetyl and amide capping groups to the N- and C-termini, respectively.I want to add a pseudo-amino acid (called ANX) instead of amino acid 61 (ASN61). To be able to do this, according to the link below, I made the following changes:
http://www.gromacs.org/Documentation_of_outdated_version/How-tos/Adding_a_Residue_to_a_Force_Field

1- I made a copy of gromos54a7 force field directory into my working directory.

2- I added the new residue (ANX) to the .rtp file for the force field I chose.
[ ANX ]
[ atoms ]
N N 0.03695 0
CA CH1 0.02889 1
CB CH2 0.05414 1
CG C 0.22287 2
OD1 O -0.25244 2
ND2 NT -0.20901 2
CD CH1 0.34145 3
CE CH1 0.02370 4
NE2 N -0.08297 5
CP CH1 0.20036 6
OP1 OA -0.27220 6
CQ CH3 0.07359 7
CZ CH1 0.16236 8
OZ1 OA -0.29272 8
CM CH1 0.22151 9
OM1 OA -0.29572 9
CN CH1 0.11181 10
CO CH2 0.28394 10
OO1 OA -0.20218 10
OD2 OA -0.21004 11
C C 0.22546 12
O O -0.16973 12
and with [bonds], [angles], [impropers] and [dihedral] sections.

3- I added my residue to residuetypes.dat as: ANX Protein

4- I updated the specbond.dat to make a connectivity of ANX residue to the previous residue and the residue after it.
( SER C 1 ANX N 1 1.34 SER2 ANX1 and
VAL N 1 ANX C 1 1.35 VAL1 ANX1 )

Ps. I didn’t make any changes in the the atomtypes.atp, ffnonbonded.itp and ffbonded.itp; because there are no new atom types or new bonded types.

5- When I use the pdb2gmx command with –ter flag, I get the following:
1: Select start terminus type for ACE-1
(I choose “None”)
2: Select end terminus type for SER-60:
(So; it does not see my new residue (ANX) that is number 61; and it tries to add a terminus to the residue 60 that is previous of ANX. I was expecting to get the following:
Select end terminus type for NH2-66

It seems that gromacs does not see the new residue I added to the force field.

Any help would be appreciated.
Thank you,
Deniz

@Deniz
Is it just a regular peptide bond between SER 60 and the new aa ANX 61? If yes, I think there is no need to mention it in specbond.dat

In your input pdb file, is there a TER record between SER 60 and ANX 61?

Out of curiosity, since you mentioned no changes required in atomtypes.atp and ffnonbonded.itp, what are the fundamental differences between the rtp file for ASN and ANX?

Hi Neena,

Thanks for your reply. Below are my responses to your questions.

— Is it just a regular peptide bond between SER 60 and the new aa ANX 61? If yes, I think there is no need to mention it in specbond.dat—
Yes you’re right, initially I did not update the specbond.dat since it is a regular peptide bond. But pdb2gmx tried to add the end terminus to the 60SER; I was thinking that it might not understand the link between SER60 and ANX61. So then I updated the specbond.dat.

—In your input pdb file, is there a TER record between SER 60 and ANX 61?—
No. TER record is at end of the protein chain; after capping group NH2 66.

—Out of curiosity, since you mentioned no changes required in atomtypes.atp and ffnonbonded.itp, what are the fundamental differences between the rtp file for ASN and ANX?—
I covalently attached a small sugar molecule to the ASN 61 and introduced them to the system as a pseudo aa with the name ANX 61. There is no new atom type in the sugar molecule; it has the same atoms found in ffnonbonded.itp. For example: under the [pairs] section which I obtained from prodrg output of itp;
[ pairs ]
; ai aj fu c0, c1, …
1 4 1 ; O N
1 8 1 ; O CB (=CH2)
These 2 interactions already exist in the ffnonbonded.itp of Gromos54a7 ff under the [pairs] section as the following:
N O 1 2.347562E-03 1.120291E-06
CH2 O 1 3.268799E-03 1.875128E-06

Am I wrong in this step ?

Cheers,
Deniz

@Deniz
I have a modified aminoacid in my peptide too (just an extra H to carbonyl oxygen of alanine), but did not have to make any changes in specbond.dat.

  1. Did you transfer a copy of the modified residuetypes.dat into your working directory? [mentioned here under modifying a forcefield http://www.gromacs.org/Documentation_of_outdated_version/How-tos/Adding_a_Residue_to_a_Force_Field]
  2. Can you test it as a dipeptide first? Like just SER-ANX-NH2
  3. In aminoacid.rtp file, disregarding the sugar moiety, any difference in the impropers between ASN and ANX? In terms of the main peptide bond improper dihedrals. Check the same between ANX and NH2
  4. Before checking your ffnonbonded.itp files, you did not have to add any new opls_atom types in atomtypes.atp? No change in partial charges between ASN and ANX? Since pdb2gmx is failing, it is not a problem with ffbonded or nonbonded.itp yet.

Are there any chain identifiers in the PDB file? If so, are Ser60 and Anx61 labeled the same way? pdb2gmx will detect a new chain with either TER or a change in chain ID.

Can you post the full screen output from pdb2gmx?

Dear Justin,

Thank you for your reply. Sorry for my late response. First let me first answer your questions:

Are there any chain identifiers in the PDB file? YES
If so, are Ser60 and Anx61 labeled the same way? YES, both are labelled by A

To simplify the problem, I prepared the dipeptide.pdb which only includes SER(60)-ANX(61). Below is my dipeptide.pdb file:

TITLE dipeptide
REMARK THIS IS A SIMULATION BOX
CRYST1 105.054 109.972 135.640 90.00 90.00 90.00 P 1 1
MODEL 1
ATOM 380 N SER A 60 182.100 243.640 217.370 1.00 0.00
ATOM 381 H SER A 60 183.080 243.650 217.560 1.00 0.00
ATOM 382 CA SER A 60 181.180 243.290 218.440 1.00 0.00
ATOM 383 CB SER A 60 181.900 242.570 219.560 1.00 0.00
ATOM 384 OG SER A 60 182.800 243.420 220.210 1.00 0.00
ATOM 385 HG SER A 60 183.260 242.920 220.940 1.00 0.00
ATOM 386 C SER A 60 180.450 244.520 218.980 1.00 0.00
ATOM 387 O SER A 60 180.850 245.660 218.690 1.00 0.00
ATOM 388 N ANX A 61 179.380 244.280 219.750 1.00 0.00
ATOM 389 H ANX A 61 179.120 243.320 219.880 1.00 0.00
ATOM 390 CA ANX A 61 178.570 245.290 220.410 1.00 0.00
ATOM 391 CB ANX A 61 177.130 244.820 220.590 1.00 0.00
ATOM 392 CG ANX A 61 176.260 244.890 219.330 1.00 0.00
ATOM 393 OD1 ANX A 61 176.630 245.550 218.330 1.00 0.00
ATOM 394 ND2 ANX A 61 175.110 244.230 219.370 1.00 0.00
ATOM 395 1HD2 ANX A 61 174.860 243.720 220.190 1.00 0.00
ATOM 396 CQ ANX A 61 176.970 240.510 217.800 1.00 0.00
ATOM 397 CP ANX A 61 175.740 241.200 218.330 1.00 0.00
ATOM 398 OP1 ANX A 61 175.410 241.080 219.520 1.00 0.00
ATOM 399 NE2 ANX A 61 175.050 241.950 217.480 1.00 0.00
ATOM 400 CE ANX A 61 173.830 242.690 217.840 1.00 0.00
ATOM 401 CZ ANX A 61 172.860 242.700 216.620 1.00 0.00
ATOM 402 OZ1 ANX A 61 172.480 241.350 216.310 1.00 0.00
ATOM 403 CM ANX A 61 171.590 243.530 216.980 1.00 0.00
ATOM 404 OM1 ANX A 61 170.710 243.540 215.860 1.00 0.00
ATOM 405 CN ANX A 61 172.020 244.980 217.350 1.00 0.00
ATOM 406 CO ANX A 61 170.850 245.840 217.780 1.00 0.00
ATOM 407 OO1 ANX A 61 171.220 247.220 217.830 1.00 0.00
ATOM 408 OD2 ANX A 61 172.980 244.940 218.470 1.00 0.00
ATOM 409 CD ANX A 61 174.200 244.200 218.170 1.00 0.00
ATOM 410 C ANX A 61 179.170 245.700 221.750 1.00 0.00
ATOM 411 O ANX A 61 179.450 244.820 222.600 1.00 0.00
TER
ENDMDL

After giving the following command :

gmx_mpi pdb2gmx -f ANX.pdb -ter -ignh -water spc

I get the following warning:

Processing chain 1 (28 atoms, 2 residues)
Identified residue SER0 as a starting terminus.

Warning: Residue ANX0 in chain has different type (‘Other’) from residue SER0 (‘Protein’). This chain lacks identifiers, which makes it impossible to do strict classification of the start/end residues. Here we need to guess this residue should not be part of the chain and instead introduce a break, but that will be catastrophic if they should in fact be linked. Please check your structure, and add ANX to residuetypes.dat if this was not correct.

Identified residue SER0 as a ending terminus.
8 out of 8 lines of specbond.dat converted successfully
Select start terminus type for SER-0
0: NH3+
1: NH2
2: None
1
1
Start terminus SER-0: NH2
Select end terminus type for SER-0
0: COO-
1: COOH
2: None


I have already added ANX to the end of the resıdue.dat file as “ANX Protein”.

I would appreciate any help . Thank You.

Deniz

Hi Neena,

Thank you for your reply. Sorry for my late response. Let me first answer your questions:

“I have a modified aminoacid in my peptide too (just an extra H to carbonyl oxygen of alanine), but did not have to make any changes in specbond.dat.”

I also tried the original version of specbond.dat but still got the same warning.

  1. Did you transfer a copy of the modified residuetypes.dat into your working directory? [mentioned here under modifying a forcefield http://www.gromacs.org/Documentation_of_outdated_version/How-tos/Adding_a_Residue_to_a_Force_Field]

Yes I did. It is seen as “ANX Protein” at the end of the residuetypes.dat.

  1. Can you test it as a dipeptide first? Like just SER-ANX-NH2

I prepared SER(60)-ANX(61) as a dipeptide.pdb and after the below command

gmx_mpi pdb2gmx -f dipeptide_60SER-61ANX.pdb -ter -ignh -water spc

Identified residue SER60 as a ending terminus.

8 out of 8 lines of specbond.dat converted successfully

Select start terminus type for SER-60

0: NH3+

1: NH2

2: None

1

1

Start terminus SER-60: NH2

Select end terminus type for SER-60

0: COO-

1: COOH

2: None

nothing changed: it still tries to add end-capping group to SER60 instead of ANX61.

  1. In aminoacid.rtp file, disregarding the sugar moiety, any difference in the impropers between ASN and ANX? In terms of the main peptide bond improper dihedrals. Check the same between ANX and NH2.

There is no difference in the impropers between ASN and ANX-without-sugar-moeity.

I couldn’t understand why I should check the same btw ANX and NH2. Because ANX is the 61st residue, NH2 is the 66th and there is no dihedral between them.

  1. Before checking your ffnonbonded.itp files, you did not have to add any new opls_atom types in atomtypes.atp? No change in partial charges between ASN and ANX? Since pdb2gmx is failing, it is not a problem with ffbonded or nonbonded.itp yet.

Sorry, I couldn’t understand this one. Since I am using Gromacs 54a7ff and I did not need to use opls atomtypes.

Thank you so much,
Deniz

This should be all you have to do to get your residue recognized. Is this version of residuetypes.dat in your working directory or in $GMXLIB? Is the correct version being read (check the screen output of pdb2gmx - it is always wise to include the full screen output in your posts because there is a massive amount of diagnostic information in the output).

Hi Justin,

Is this version of residuetypes.dat in your working directory or in $GMXLIB ?
– It is in my working directory.

–Below is the screen output of pdb2gmx; It does not seem to read the residuetypes.dat file in the gromos54a7_AN-NAN.ff directory.

Select the Force Field:
From current directory:
1: GROMOS96 54a7 force field, extended to include NAG parameters
From ‘/truba/sw/centos7.3/app/gromacs/2019-impi-mkl-PS2019-E5V4/share/gromacs/top’:
2: AMBER03 protein, nucleic AMBER94 (Duan et al., J. Comp. Chem. 24, 1999-2012, 2003)
3: AMBER94 force field (Cornell et al., JACS 117, 5179-5197, 1995)
4: AMBER96 protein, nucleic AMBER94 (Kollman et al., Acc. Chem. Res. 29, 461-469, 1996)
5: AMBER99 protein, nucleic AMBER94 (Wang et al., J. Comp. Chem. 21, 1049-1074, 2000)
6: AMBER99SB protein, nucleic AMBER94 (Hornak et al., Proteins 65, 712-725, 2006)
7: AMBER99SB-ILDN protein, nucleic AMBER94 (Lindorff-Larsen et al., Proteins 78, 1950-58, 2010)
8: AMBERGS force field (Garcia & Sanbonmatsu, PNAS 99, 2782-2787, 2002)
9: CHARMM27 all-atom force field (CHARM22 plus CMAP for proteins)
10: GROMOS96 43a1 force field
11: GROMOS96 43a2 force field (improved alkane dihedrals)
12: GROMOS96 45a3 force field (Schuler JCC 2001 22 1205)
13: GROMOS96 53a5 force field (JCC 2004 vol 25 pag 1656)
14: GROMOS96 53a6 force field (JCC 2004 vol 25 pag 1656)
15: GROMOS96 54a7 force field (Eur. Biophys. J. (2011), 40, 843-856, DOI: 10.1007/s00249-011-0700-9)
16: OPLS-AA/L all-atom force field (2001 aminoacid dihedrals)
1

Using the Gromos54a7_AN-NAN force field in directory ./gromos54a7_AN-NAN.ff

going to rename ./gromos54a7_AN-NAN.ff/aminoacids.r2b
Opening force field file ./gromos54a7_AN-NAN.ff/aminoacids.r2b
Reading dipeptide_60SER-61ANX.pdb…
Read ‘SPIKE GLYCOPROTEIN’, 28 atoms
Analyzing pdb file
Splitting chemical chains based on TER records or chain id changing.
There are 1 chains and 0 blocks of water and 2 residues with 28 atoms

chain #res #atoms
1 ’ ’ 2 28

WARNING: there were 2 atoms with zero occupancy and 26 atoms with
occupancy unequal to one (out of 28 atoms). Check your pdb file.

Opening force field file ./gromos54a7_AN-NAN.ff/atomtypes.atp
Atomtype 58
Reading residue database… (Gromos54a7_AN-NAN)
Opening force field file ./gromos54a7_AN-NAN.ff/aminoacids.rtp
Using default: not generating all possible dihedrals
Using default: excluding 3 bonded neighbors
Using default: generating 1,4 H–H interactions
Using default: removing proper dihedrals found on the same bond as a proper dihedral
Residue 110
Sorting it all out…
Opening force field file ./gromos54a7_AN-NAN.ff/aminoacids.hdb
Opening force field file ./gromos54a7_AN-NAN.ff/aminoacids.n.tdb
Opening force field file ./gromos54a7_AN-NAN.ff/aminoacids.c.tdb

Back Off! I just backed up topol.top to ./#topol.top.1#
Processing chain 1 (28 atoms, 2 residues)
Identified residue SER0 as a starting terminus.

Warning: Residue ANX0 in chain has different type (‘Other’) from
residue SER0 (‘Protein’). This chain lacks identifiers, which makes
it impossible to do strict classification of the start/end residues. Here we
need to guess this residue should not be part of the chain and instead
introduce a break, but that will be catastrophic if they should in fact be
linked. Please check your structure, and add ANX to residuetypes.dat
if this was not correct.

Identified residue SER0 as a ending terminus.
8 out of 8 lines of specbond.dat converted successfully
Select start terminus type for SER-0
0: NH3+
1: NH2
2: None

Thank you,
Deniz

Here you’re saying two different things. If you modify residuetypes.dat, it must be in the working directory. This file is not designed to be read from within a force field subdirectory.

Dear Justin,

With your input, the problem is solved. Thank you very much.

Best regards,
Deniz