Include 9AZR force field residue for ATB automatically generated pdb file

GROMACS version: 2021.1
GROMACS modification: No

Dear Gromacs users,
I have downloaded the GROMACS files from ATB (mol = 34570) and found the pdb file and the rtp file. I did not touch the molecule file (9AZR), whose name matches the residue label. I gave the rtp and pdb files the same name (eventually I tried 9AZR itself). Nevertheless, when “blindly” running gmx pdb2gmx -f 9AZR.pdb, I get an unknown 9AZR residue.
What am I doing wrong? I believe GROMACS should read the 9AZR information from the rtp file, but this is not the case.

Thank you for the support
Marco

Hi,
Yes, gmx pdb2gmx needs an rtp entry to build a GROMACS topology for a protein contained in a pdb file.
The residue in the pdb file should be defined in the rtp file under the same name.
For example, if the pdb file contains LYS, the rtp file should contain a directive with the same name, [ LYS ]. For the rtp format, see File formats — GROMACS 2021.1 documentation.

pdb2gmx also has to know which rtp file to read. You provide this information when you select the force field xxxx (which corresponds to the force field directory xxxx.ff).

As far as I understand, your protein contains a residue labelled ‘9AZR’ that gmx pdb2gmx cannot find in the rtp file. You can check 1) that 9AZR is the residue name in the pdb file, and 2) that the rtp file in the selected force field directory includes a [ 9AZR ] directive.
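The two checks above can be scripted. Below is a minimal Python sketch, not a GROMACS tool: the function names are made up for illustration, and the list of rtp sub-directives is an assumption based on the rtp format description. It reads residue names from columns 18-21 of the ATOM/HETATM records (four columns, so that four-letter names such as 9AZR survive) and compares them with the bracketed entries of an rtp file.

```python
import re

# Bracketed names that are sub-directives inside an rtp entry rather than
# residue names. Assumption: this list covers the rtp files being checked.
RTP_SUBDIRECTIVES = {"bondedtypes", "atoms", "bonds", "angles", "dihedrals",
                     "impropers", "exclusions", "cmap"}


def pdb_residue_names(pdb_text):
    """Collect residue names from ATOM/HETATM records (columns 18-21)."""
    names = set()
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            names.add(line[17:21].strip())
    return names


def rtp_residue_names(rtp_text):
    """Collect bracketed entries of an rtp file that are not sub-directives."""
    names = set()
    for line in rtp_text.splitlines():
        line = line.split(";", 1)[0]  # drop comments
        match = re.match(r"\s*\[\s*(\S+)\s*\]", line)
        if match and match.group(1).lower() not in RTP_SUBDIRECTIVES:
            names.add(match.group(1))
    return names


def missing_residues(pdb_text, rtp_text):
    """Residue names used in the pdb file but absent from the rtp file."""
    return sorted(pdb_residue_names(pdb_text) - rtp_residue_names(rtp_text))
```

Calling missing_residues() with the contents of the pdb file and of the rtp file from the selected xxxx.ff directory returns the names pdb2gmx would not find; an empty list means every residue has an entry.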

If you want 9AZR to be recognized as a protein residue, it should also be added to residuetypes.dat (an extra step).

I hope it helps.
Best regards
Alessandra

Dear Alessandra,

absolutely, thank you so much!

There is no [ 9AZR ] directive in any of my files.
I added a “[ 9AZR ] ; mandatory” line to the .rtp, which now looks like this:

[ LIG ]  ; <------- added here. Does the position matter?
[ moleculetype ]
; Name   nrexcl
9AZR     3 
[ atoms ]
....

For now I have copied the xxxx.ff directory into my working directory, and I am running:

gmx pdb2gmx -f 9AZR.pdb -ff oplsaa -water spce
...
Problem with chain definition, or missing terminal residues. This chain does not appear to contain a recognized chain molecule. If this is incorrect, you can edit residuetypes.dat to modify the behavior.
...
Fatal error:
Residue 'LIG' not found in residue topology database

I understand I should add a 9AZR line to $GMXPREFIX/share/gromacs/top/residuetypes.dat, but of which type: Protein or Ion?

So far I have only added the [ 9AZR ] line at the very beginning of the .rtp file, but nothing changed. Do I need an .itp file as well? Or maybe I am misunderstanding the meaning of [ moleculetype ] in the .rtp file.

Once again, thank you very much.
Best Regards

Marco

Hi,

You can copy residuetypes.dat into the working directory (better not to modify the default one). The program should be able to read it from your working directory; you can check this in the standard output.
If you want 9AZR to be recognized as a protein residue, add a line to the file that defines it as “Protein”.
It should work (but I am not 100% sure).

Note also that gmx pdb2gmx is stopping because ‘LIG’ is missing from the rtp file.

best regards
Alessandra

9AZR0 is now recognised as a terminus, but it is still not working.

...
Identified residue 9AZR0 as a starting terminus.
Identified residue 9AZR0 as a ending terminus.
... 
Fatal error:
Residue type '9AZR' not found in residue topology database

BR
Marco

Hi,
I am a little confused. I have just noticed that the file excerpt you posted

does not have the format of an rtp file (see File formats — GROMACS 2021.1 documentation)
but rather that of a topology file (see https://manual.gromacs.org/current/reference-manual/topologies/topology-file-formats.html).

Could it be that you already have a topology file?

Best regards
Alessandra

Hello,

sorry about the confusion, same here I’d say:

I have an ITP file, downloaded from the ATB server as "GROMACS G54A7FF All-Atom":
http://atb.uq.edu.au/download.py?molid=34570&hash=LATEST&outputType=top&ffVersion=54A7&file=rtp_allatom

The confusion came from it being downloaded as rtp_allatom.
I tried to create a topol.top file containing only the “#include file.itp” line, but it does not work.

Regarding LIG/9AZR, sorry for the confusion again.
I must have renamed it at some point, or I just can’t find the right file, but I always get the residue error when downloading files from ATB.

Thank you.

Hi,

Yes, including the itp in the topol.top file should work. If the problem persists, could you share your top and itp files, and maybe the gro file, so I can have a quick look?

Best regards
Alessandra

Dear Alessandra,

that would be so kind of you!
I am attaching the pdb and itp files (downloaded from ATB, plus the [ UYKS ] line, UYKS being the missing residue name) as .log, plus the topology file I created.

I am probably making another mistake here, as I could probably go directly to the grompp stage (which, by the way, gives “ERROR: Invalid directive UYKS”).

Many thanks
Marco

TFSI_itp.log (7.5 KB) TFSI_pdb.log (1.7 KB) topol.top (20 Bytes)

Hi,

Your topol.top should have the following parts (at the moment 1 and 3 are missing). Add them in the order I list them.

  1. The force field, where all the force field information is stored, including the non-bonded parameters. In your case this is GROMOS 54A7.

#include "gromos54a7.ff/forcefield.itp"

  2. The molecule description, which is in the file TFSI.itp (assuming the file is in the working directory). Remember to remove [ UYKS ] from the itp file.

#include "TFSI.itp"

  3. The system description

[ system ]
UYKS

[ molecules ]
;molecule name nr.
UYKS 1

For the long and short versions of a topol.top file, see the urea example (the short one is like your case) in File formats — GROMACS 2021.1 documentation.

Alessandra
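To make the ordering explicit, here is a small Python sketch that writes a topol.top in the three-part layout described above (force field include, molecule includes, then [ system ] and [ molecules ]). The helper build_topol and its arguments are hypothetical names chosen for illustration; the file and molecule names in the usage note are those from this thread.

```python
def build_topol(forcefield_itp, molecule_itps, system_name, molecules):
    """Return the text of a minimal topol.top.

    Order matters: force field first, then molecule definitions,
    then the system description.
    """
    lines = [f'#include "{forcefield_itp}"']
    lines += [f'#include "{itp}"' for itp in molecule_itps]
    lines += ["", "[ system ]", system_name, ""]
    lines += ["[ molecules ]", "; molecule name   nr."]
    # Names here must match the [ moleculetype ] names in the included itp files.
    lines += [f"{name:<16} {count}" for name, count in molecules]
    return "\n".join(lines) + "\n"
```

For the case discussed here, build_topol("gromos54a7.ff/forcefield.itp", ["TFSI.itp"], "UYKS", [("UYKS", 1)]) gives the short form of the file; the result can be written to topol.top and passed to grompp with -p.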

Hello Alessandra,
thank you very much.
By doing this we actually bypassed pdb2gmx and wrote topol.top by hand.
Now my topol.top looks like this:

#include "gromos54a7.ff/forcefield.itp"
#include "TFSI.itp"

[ system ]
UYKS

[ molecules ]
;molecule name nr.
UYKS 1 

and I try to run grompp directly, but I get the following error:

gmx grompp -f ions.mdp -c TFSI.pdb -p topol.top
...
ERROR 1 [file TFSI.itp, line 74]:
  Atomtype NOpt not found
...

“NOpt” does not exist in any of the gromacs-2021.1/share/top/ folders. I am trying to generate the pdb with resources other than ATB (namely LigParGen, charmm-gui, CGenFF and so on…), hoping to find atom types that are recognised by GROMACS.
What should I do in this case?

Thank you
Marco

Hi,
Indeed, NOpt is not listed in gromacs-2021.1/share/top/gromos54a7.ff/ffnonbonded.itp.
You will have to contact the authors of the server to find out which nonbonded parameters describe NOpt, or whether there is an error in the atom name (maybe it is just N).

Once you have the nonbonded parameters (if they are not the standard ones), you can add them at the beginning of topol.top by adding an [ atomtypes ] directive and, if necessary, a [ nonbond_params ] directive.

Best regards
Alessandra

ATB uses several non-standard atom types, which are described in their FAQ. You need to download their modified force field files if your molecule uses any of these non-standard atom types.
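A quick way to see which atom types are affected before running grompp is to compare the types used in the molecule's [ atoms ] section with those defined under [ atomtypes ] in the force field's ffnonbonded.itp. The Python sketch below does this under simple assumptions: the function names are hypothetical, preprocessor lines are skipped rather than evaluated, and types defined elsewhere (for example in an [ atomtypes ] block of another included file) are not taken into account.

```python
def section_rows(text, section):
    """Yield the whitespace-split data rows of every [ section ] block."""
    current = None
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()  # strip comments and padding
        if not line or line.startswith("#"):
            continue  # blank lines and preprocessor directives
        if line.startswith("["):
            current = line.strip("[] \t").lower()
            continue
        if current == section:
            yield line.split()


def undefined_atomtypes(itp_text, nonbonded_text):
    """Atom types used in [ atoms ] that have no [ atomtypes ] definition."""
    # Column 2 of [ atoms ] is the atom type; column 1 of [ atomtypes ] is its name.
    used = {row[1] for row in section_rows(itp_text, "atoms") if len(row) > 1}
    defined = {row[0] for row in section_rows(nonbonded_text, "atomtypes")}
    return sorted(used - defined)
```

Run against TFSI.itp and the stock gromos54a7.ff/ffnonbonded.itp, the returned list should contain NOpt, given the grompp error above; with the modified force field files from ATB the list should come back empty if they define every type the molecule uses.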

Thank you for the answer and the explanations.
Using the ATB force field solves the residue problem.

Unfortunately the box size is too small:

ERROR: The cut-off length is longer than half the shortest box vector or
longer than the smallest box diagonal element. Increase the box size or
decrease rlist.

I tried setting “rlist” to 0.1 and 0.01 (just to see whether it would go through), but nothing seems to change.
I could modify the box size with gmx editconf, but I have no working .gro file (gmx pdb2gmx always gives me: “Fatal error: Residue type ‘UYKS’ not found in residue topology database”).
Any hint would be appreciated

Thank you
Marco

You don’t need to use .gro format, but in any case the transformation should be entirely independent of pdb2gmx. Likely your file has box vectors of zero, which should be set properly using editconf -box or editconf -d.
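Whether the box is the problem can be checked directly in the pdb file, since the box is stored in the CRYST1 record. The following Python sketch (hypothetical helper names, rectangular boxes only, cut-off value supplied by the user) reads the three box lengths and tests them against the condition in the error message, namely that the cut-off must be shorter than half the shortest box vector.

```python
def pdb_box_nm(pdb_text):
    """Return the box lengths (a, b, c) in nm from the CRYST1 record, or None."""
    for line in pdb_text.splitlines():
        if line.startswith("CRYST1"):
            # CRYST1 holds a, b, c in angstrom in columns 7-15, 16-24, 25-33.
            return tuple(float(line[i:i + 9]) / 10.0 for i in (6, 15, 24))
    return None  # no box information in the file


def box_fits_cutoff(box_nm, cutoff_nm):
    """True if the cut-off is shorter than half the shortest box length.

    Only valid for rectangular boxes; angles in CRYST1 are ignored.
    """
    return box_nm is not None and min(box_nm) > 2.0 * cutoff_nm
```

A result of None, or a box of (0.0, 0.0, 0.0), matches the zero box vectors suspected above; in that case the box has to be set with editconf -box or editconf -d before grompp can succeed.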

Dear all,

I could finally get this done.
I downloaded the residues from LigParGen instead, since I could not find a Li residue in GROMOS, while I could in OPLS-AA.
Both residues came with the same atom type label for different atoms, e.g.:

TFSI.itp:        1   opls_800      1    TFS   O00      1    -0.4053    15.9990
diglyme.itp:     1   opls_800      1    DIG   C00      1    -0.0486    12.0110

neither of which was in oplsaa.ff/ffnonbonded.itp.
Moreover, I was really confused because the downloaded .itp files came with the [ atomtypes ] in the top file. So I took that directive from both files and, after renaming the types appropriately, put them into the local oplsaa.ff/ffnonbonded.itp file.
Now they look like this:

grep opls_800  TFSI.itp diglyme.itp oplsaa.ff/ffnonbonded.itp
TFSI.itp:     1   TFS_opls_800      1    TFS   O00      1    -0.4053    15.9990
diglyme.itp:     1   DIG_opls_800      1    DIG   C00      1    -0.0486    12.0110
oplsaa.ff/ffnonbonded.itp: TFS_opls_800  O800    15.9990     0.000    A    2.96000E-01   7.11280E-01 ; TFSI.itp
oplsaa.ff/ffnonbonded.itp: DIG_opls_800  C800    12.0110     0.000    A    3.50000E-01   2.76144E-01 ; diglyme.itp

This made the calculation work. Is what I am doing correct?
Moreover, I see no effect from residuetypes.dat: whether I list the residues as Protein or Other does not seem to matter. I presume that if the file is not found in the local directory, GROMACS will look for the original one.

Thank you for the support.

Marco
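The renaming described in this post can be done mechanically. Below is a Python sketch of one way to do it, assuming the LigParGen layout in which the type name is column 1 of [ atomtypes ] and column 2 of [ atoms ]; the function name and the prefix are illustrative, and column alignment of rewritten rows is not preserved. Other columns, such as the bond type (O800, C800), are left untouched, as in the grep output above.

```python
def prefix_atomtypes(itp_text, prefix):
    """Prefix type names in [ atomtypes ] (column 1) and [ atoms ] (column 2)."""
    column = {"atomtypes": 0, "atoms": 1}
    current, out = None, []
    for raw in itp_text.splitlines():
        body, sep, comment = raw.partition(";")  # keep trailing comments
        fields = body.split()
        if fields and fields[0].startswith("["):
            current = body.strip().strip("[] \t").lower()
        elif fields and not fields[0].startswith("#") and current in column:
            idx = column[current]
            if idx < len(fields):
                fields[idx] = prefix + fields[idx]
            raw = "  ".join(fields) + (" " + sep + comment if sep else "")
        out.append(raw)  # all other lines pass through unchanged
    return "\n".join(out) + "\n"
```

Applying prefix_atomtypes(text, "TFS_") to the contents of TFSI.itp and prefix_atomtypes(text, "DIG_") to diglyme.itp yields type names such as TFS_opls_800 and DIG_opls_800, so the two molecules no longer share a label for different atoms.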

Moving further in @jalemkul's Lysozyme in Water tutorial, when running gmx grompp for equilibration, I encounter:

Fatal error:
Group Protein referenced in the .mdp file was not found in the index file.  
Group names must match either [moleculetype] names or custom index group
names, in which case you must supply an index file to the '-n' option
of grompp.

What input should I give to gmx make_ndx? If I give only TFSI (similar thing for the solvent) I get

0 System : 15 atoms
1 Other : 15 atoms
2 DIG : 15 atoms

If I give the whole neutralised box (with Li ions) I get a much longer list:

0 System : 1373 atoms
1 Other : 1372 atoms
2 TFS : 15 atoms
3 DIG : 1357 atoms
4 LI : 1 atoms
5 Ion : 1 atoms
6 TFS : 15 atoms
7 DIG : 1357 atoms
8 LI : 1 atoms

Should I group together all the LIs, TFSs and DIGs? What is the logic here?
Thank you

Marco

The issue came from the temperature coupling, or rather from not having defined groups in my simulation while still wanting to use tau_t and ref_t from the example.

Once I commented out that part, the simulation could proceed, and I get a temperature profile that drops immediately from 300 K to 200 K and then rises to around 250 K.

I need to figure this out before trusting this result.

Thank you
Marco

You need some form of temperature coupling if you wish to maintain a constant temperature, but you can’t necessarily copy and paste .mdp settings from files designed for different systems. If you’ve got some small system with mixed components, tc-grps = System is the only appropriate setting for this keyword. You then need to specify tau-t and ref-t appropriately.

What do you mean by small mixed components?
I have 3 parts composing my system: an anion (TFSI-), whose itp/pdb I downloaded from LigParGen; a solvent, diglyme, also from LigParGen; and Li cations coming from equilibration.
The topology looks like this:

 [ molecules ]
 ;molecule name nr.
 TFS               1
 DIG         59
 LI               1

Do I need to couple them? And how do I use make_ndx, assuming I need to create groups?

Thank you again
M.