IUPAC Nomenclature in Hydrogen Naming with pdb2gmx

Hi All,

After doing some work with forcefield parameterisation I came across this issue with the hydrogen addition in pdb2gmx.

https://gromacs.org-gmx-users.maillist.sys.kth.narkive.com/0UVosHTc/gmx-users-pdb2gmx-and-hydrogen-nomenclature#

I’ve been tinkering with a quick fix in GROMACS for hydrogen naming conventions, especially regarding the IUPAC standards for methylene hydrogens. I added a command-line option (--iupac) to pdb2gmx. This option ensures that when generating topologies, methylene hydrogen numbering aligns with IUPAC nomenclature, i.e. starting from HB2 instead of HB1.

It’s a very quick and simple solution that seems to work for the cases I’ve tested, but I’m cautious about potential edge cases I might not have covered. At the moment, it just checks if the hydrogen addition method from the .hdb file is type 6 (i.e. two tetrahedral hydrogens). This only requires a few changes to pdb2gmx.cpp and genhydro.cpp. While it’s not exactly super useful currently as most forcefields have already been ported to accommodate this, it could streamline future forcefield porting. Please let me know if you would like me to upload these changes somewhere. I’m interested to get your thoughts, suggestions, or any feedback if you’ve encountered similar issues or have insights into potential pitfalls with this approach.
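
As a rough illustration of the rule described above (not the actual patch to genhydro.cpp), the numbering logic can be sketched in Python. The function name `hydrogen_names` and its parameters are hypothetical. The sketch assumes that a single added hydrogen keeps the bare name stem and that multiple hydrogens get a number appended; the only change under the proposed option is that type 6 entries start counting at 2.

```python
def hydrogen_names(base, count, hdb_type, iupac=False):
    """Return the names of the hydrogens added to one heavy atom.

    base     -- name stem from the .hdb entry, e.g. "HB"
    count    -- number of hydrogens to add
    hdb_type -- hydrogen addition method from the .hdb file
    iupac    -- hypothetical flag mirroring the proposed option
    """
    if count == 1:
        # Assumption: a single hydrogen keeps the bare stem, e.g. "HA".
        return [base]
    # Type 6 (two tetrahedral hydrogens) starts at 2 when the flag is set;
    # every other case keeps the default numbering from 1.
    start = 2 if (iupac and hdb_type == 6) else 1
    return [f"{base}{i}" for i in range(start, start + count)]


print(hydrogen_names("HB", 2, 6))               # default numbering
print(hydrogen_names("HB", 2, 6, iupac=True))   # IUPAC-style numbering
print(hydrogen_names("HH1", 2, 3, iupac=True))  # non-type-6 entry is unaffected
```

Keying the check on the addition type rather than on the atom name means groups such as the arginine HH11/HH12 pair keep their existing names, since they are not added with type 6.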

Looking forward to your thoughts!

Hi!

Would you be able to join the next GROMACS developer video conference this week, on 2024-02-21 from 16:00 to 17:00 UTC?

In the meantime, if you already have the code, you are welcome to open a merge request at our GitLab.

Yes, I am able to join. If I can get some time before then, I’ll see if I can open a merge request on GitLab.

Thanks,
Tye


You may want to study the file atom_nom.tbl, which is part of the GROMACS share/top installation. In short, there is so much history with incorrect hydrogen naming that it is probably not a good idea to “correct it”. Each force field and software package has its own version of hydrogen naming, and changing e.g. the names of H atoms in Amber or Charmm could break a lot of downstream software, e.g. VMD, Chimera, you name it.

I’m not sure I understand what you are trying to say. I have read through atom_nom.tbl, and it includes information for MSI, XPLOR and GROMACS only. The idea I’m proposing is to add an option to pdb2gmx that allows force fields whose nomenclature aligns with PDB/IUPAC standards. This would make it easier to convert force fields from programs such as AMBER. Below is an excerpt of the nomenclature for ARG in the AMBER ff19SB force field. You will see that methylene hydrogens are named HB2 and HB3 rather than HB1 and HB2. Currently GROMACS does not allow for this nomenclature because of the way genhydro.cpp works. The merge request I’ve submitted adds an option to pdb2gmx that allows this naming convention while maintaining functionality for all current GROMACS force fields.

"N" "N" 0 1 131072 1 7 -0.347900
"H" "H" 0 1 131072 2 1 0.274700
"CA" "XC" 0 1 131072 3 6 -0.263700
"HA" "H1" 0 1 131072 4 1 0.156000
"CB" "C8" 0 1 131072 5 6 -0.000700
"HB2" "HC" 0 1 131072 6 1 0.032700
"HB3" "HC" 0 1 131072 7 1 0.032700
"CG" "C8" 0 1 131072 8 6 0.039000
"HG2" "HC" 0 1 131072 9 1 0.028500
"HG3" "HC" 0 1 131072 10 1 0.028500
"CD" "C8" 0 1 131072 11 6 0.048600
"HD2" "H1" 0 1 131072 12 1 0.068700
"HD3" "H1" 0 1 131072 13 1 0.068700
"NE" "N2" 0 1 131072 14 7 -0.529500
"HE" "H" 0 1 131072 15 1 0.345600
"CZ" "CA" 0 1 131072 16 6 0.807600
"NH1" "N2" 0 1 131072 17 7 -0.862700
"HH11" "H" 0 1 131072 18 1 0.447800
"HH12" "H" 0 1 131072 19 1 0.447800
"NH2" "N2" 0 1 131072 20 7 -0.862700
"HH21" "H" 0 1 131072 21 1 0.447800
"HH22" "H" 0 1 131072 22 1 0.447800
"C" "C" 0 1 131072 23 6 0.734100
"O" "O" 0 1 131072 24 8 -0.589400

Sure, I’m aware of these discrepancies. I’m just saying that if you change the names from the force field names to the official IUPAC names, it may break codes outside GROMACS. There may be occasions when people want to move between software packages.
In summary, it sounds like a support nightmare. At the very least, your code should be able to convert back to Amber or Charmm notation, I would say.
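
To make the round-trip requirement concrete, here is a minimal Python sketch of converting methylene hydrogen names in both directions. It is not taken from the patch or from any existing tool: `METHYLENE_STEMS` and `rename_methylene` are hypothetical names, and the table only covers ARG, using the atom names from the ff19SB excerpt earlier in this thread.

```python
# Hypothetical table: name stems of methylene hydrogens per residue.
# Only ARG is filled in, based on the ff19SB excerpt in this thread.
METHYLENE_STEMS = {"ARG": ("HB", "HG", "HD")}


def rename_methylene(resname, names, to_iupac):
    """Renumber methylene hydrogens between the 1/2 and 2/3 schemes.

    to_iupac=True  maps e.g. HB1/HB2 -> HB2/HB3
    to_iupac=False maps e.g. HB2/HB3 -> HB1/HB2
    """
    old, new = ("12", "23") if to_iupac else ("23", "12")
    mapping = {}
    for stem in METHYLENE_STEMS.get(resname, ()):
        for old_digit, new_digit in zip(old, new):
            mapping[stem + old_digit] = stem + new_digit
    # Each name is looked up once, so HB1 -> HB2 is not renamed again to HB3.
    # Names outside the table (e.g. HH11, HH12) pass through unchanged.
    return [mapping.get(name, name) for name in names]


gromacs_style = ["HB1", "HB2", "HG1", "HG2", "HD1", "HD2", "HH11", "HH12"]
iupac_style = rename_methylene("ARG", gromacs_style, to_iupac=True)
print(iupac_style)
# Converting back restores the original names.
assert rename_methylene("ARG", iupac_style, to_iupac=False) == gromacs_style
```

An explicit per-residue table is used here because names alone are ambiguous in the 1/2 scheme: HH11/HH12 in arginine look like a 1/2 pair but are not methylene hydrogens.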

Anyway, I will not engage in commenting on the patch, so you are welcome to ignore my opinion.

But as I understand it ff19SB uses naming that GROMACS currently doesn’t support. So we would need this option exactly for the reasons you mention.

Not necessarily. The names are converted by Gromologist to our currently supported scheme during force field conversion.