I’ve been tinkering with a quick fix in GROMACS for hydrogen naming conventions, specifically the IUPAC standard for methylene hydrogens. I added a command-line option (--iupac) to pdb2gmx. When this option is set, methylene hydrogens in the generated topology are numbered according to IUPAC nomenclature, i.e. HB2 and HB3 instead of HB1 and HB2.
It’s a quick and simple solution that works for the cases I’ve tested, but I’m cautious about edge cases I might not have covered. At the moment, it just checks whether the hydrogen addition method from the .hdb file is type 6 (i.e. two tetrahedral hydrogens). This only requires a few changes to pdb2gmx.cpp and genhydro.cpp. It’s not especially useful right now, since most force fields have already been ported in a way that accommodates this, but it could streamline future force field porting. Please let me know if you would like me to upload these changes somewhere. I’d be interested in your thoughts, suggestions, or feedback, especially if you’ve encountered similar issues or can see pitfalls with this approach.
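To make the rule concrete, here is a minimal sketch in Python of the check described above. It is not the actual change to genhydro.cpp: the function name `hydrogen_names` and its parameters are hypothetical. It assumes the default behaviour is to append 1, 2, ... to the base name from the .hdb entry when more than one hydrogen is added.

```python
def hydrogen_names(base, count, addition_type, iupac=False):
    """Return the names for `count` hydrogens added by one .hdb entry.

    Hypothetical helper for illustration only; not GROMACS code.
    """
    if count == 1:
        # A single hydrogen keeps the base name unchanged (e.g. "HA").
        return [base]
    # Default numbering starts at 1. With the IUPAC option, a methylene
    # group (addition type 6, two tetrahedral hydrogens) starts at 2.
    first = 2 if (iupac and addition_type == 6 and count == 2) else 1
    return [f"{base}{first + i}" for i in range(count)]


print(hydrogen_names("HB", 2, 6))              # ['HB1', 'HB2']
print(hydrogen_names("HB", 2, 6, iupac=True))  # ['HB2', 'HB3']
print(hydrogen_names("HB", 3, 4, iupac=True))  # ['HB1', 'HB2', 'HB3']
```

Only type 6 entries are affected, so methyl groups and single hydrogens keep their usual names with or without the option.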
You may want to study the file atom_nom.tbl in the share/top directory of the GROMACS installation. In short, there is so much history of incorrect hydrogen naming that it is probably not a good idea to “correct” it. Each force field and software package has its own version of hydrogen naming, and changing e.g. the names of H atoms in Amber or Charmm could break a lot of downstream software, e.g. VMD, Chimera, you name it.
I’m not sure I understand what you are trying to say. I have read through atom_nom.tbl and it includes information for MSI, XPLOR and GROMACS only. What I’m proposing is an additional pdb2gmx option that allows force fields whose nomenclature follows the PDB/IUPAC standard. This would make force field conversion between GROMACS and programs such as AMBER easier. Below is an excerpt of the nomenclature for ARG in AMBER for ff19SB. You will see that methylene hydrogens are named HB2 and HB3 rather than HB1 and HB2. Currently GROMACS does not allow this nomenclature because of the way genhydro.cpp works. The merge request I’ve submitted adds an option to pdb2gmx that allows this naming convention while maintaining functionality for all current GROMACS force fields.
Sure, I’m aware of these discrepancies. I’m just saying that if you change the names from the force field names to the official IUPAC names, it may break code outside GROMACS. There may be occasions when people want to move between software packages.
In summary, it sounds like a support nightmare. At the very least, I would say your code should be able to convert back to Amber or Charmm notation.
Anyway, I will not engage in commenting on the patch, so you are welcome to ignore my opinion.
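For the point about converting back, a round trip between the two numbering schemes could look roughly like the Python sketch below. The function `shift_methylene` and the set of methylene base names for ARG are hypothetical and not part of GROMACS or the merge request. The sketch only shifts the number, and it assumes the two hydrogens are listed in the same order in both conventions, which would need checking against the actual force field files.

```python
def shift_methylene(name, methylene_bases, to_iupac):
    """Shift a methylene hydrogen name between 1/2 and 2/3 numbering.

    Hypothetical helper for illustration only; not GROMACS code.
    """
    if len(name) < 2 or not name[-1].isdigit():
        return name
    base, number = name[:-1], int(name[-1])
    if base not in methylene_bases:
        # Not a methylene hydrogen (e.g. HE or HH11 in ARG): leave as is.
        return name
    if to_iupac and number in (1, 2):
        return f"{base}{number + 1}"
    if not to_iupac and number in (2, 3):
        return f"{base}{number - 1}"
    return name


# Assumed methylene base names for ARG, for illustration.
ARG_METHYLENE = {"HB", "HG", "HD"}

names = ["HA", "HB1", "HB2", "HG1", "HG2", "HE", "HH11"]
iupac = [shift_methylene(n, ARG_METHYLENE, to_iupac=True) for n in names]
back = [shift_methylene(n, ARG_METHYLENE, to_iupac=False) for n in iupac]
print(iupac)  # ['HA', 'HB2', 'HB3', 'HG2', 'HG3', 'HE', 'HH11']
print(back == names)  # True
```

Note that a name such as HB2 is valid in both schemes, so the caller has to know which convention a file uses before converting. That ambiguity is part of the support concern raised above.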