Can a lysine residue be set as neutral, i.e. zero charge without protonation?

GROMACS version: 2021
GROMACS modification: No

I have tested many force fields (OPLS-AA/L, amber14sb, AMBER99SB-ILDN) and many proteins with the following pdb2gmx command:
gmx pdb2gmx -f protein.pdb -o protein_processed.gro -water tip3p -inter -ignh -merge interactive

This command assigns the protonation states interactively. Whenever I set a lysine to charge 0 (option 0), I get this error message:

The residues in the chain PRO1--ARG436 do not have a consistent type. The first residue has type 'Protein', while residue LYSN45 is of type 'Other'. Either there is a mistake in your chain, or it includes nonstandard residue names that have not yet been added to the residuetypes.dat file in the GROMACS library directory. If there are other molecules such as ligands, they should not have the same chain ID as the adjacent protein chain since it's a separate molecule.

The error disappears when every lysine is set to charge +1 (option 1).

How can I assign charge 0 to a lysine residue?

The pdb file is uploaded.
protein.pdb.log (522.5 KB)

LSN is simply not a residue name typically encountered in a protein, even if it is built into a force field. The solution is simple. Add a line that says LSN Protein to residuetypes.dat.

Thank you. But as you can see, my PDB only contains LYS, not LSN. When I am prompted to assign the protonation state, the choices are LYS or LYSH. So why does LYSN appear?

Which LYSINE type do you want for residue 45
0. Not protonated (charge 0) (LYS)
1. Protonated (charge +1) (LYSH)

Nomenclature varies across different force fields, and there is no guarantee that “LSN” (which may appear in a PDB file) is valid in the context of the force field.

Hi Justin, I mean that my PDB only contains LYS, not LSN. So why does LSN still appear?

Please provide the complete screen output from pdb2gmx for the command that gave the error you first posted.

log.log (18.9 KB)
topol.top (512 Bytes)

Hi, the log file is attached. Here I set LYS 45 to charge 0 and all other lysines to charge +1. If I set any other lysine to charge 0, the same error occurs.

You can get my PDB in my first post.

I checked residuetypes.dat: the line LSN Protein is already there, but there is no LYSN Protein line, so I added LYSN Protein to the file.
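A minimal sketch of this workaround, for anyone who cannot edit the installed library (for example on a shared cluster). It assumes that GMXDATA is set (normally by sourcing GMXRC) and points at the share/gromacs directory, and that GROMACS tools look for library files in the working directory before the installed library directory. The helper name add_residue_type is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: append "<residue> <type>" to a residuetypes.dat-style
# file unless the residue name is already listed in the first column.
add_residue_type() {
    file=$1; residue=$2; rtype=$3
    if ! awk -v r="$residue" '$1 == r { found = 1 } END { exit !found }' "$file"; then
        # Make sure the file ends with a newline before appending.
        [ -n "$(tail -c 1 "$file")" ] && echo >> "$file"
        printf '%s %s\n' "$residue" "$rtype" >> "$file"
    fi
}

# Assumption: GMXDATA points at .../share/gromacs.
# Work on a copy in the current directory instead of the installed file.
if [ -n "${GMXDATA:-}" ] && [ -f "$GMXDATA/top/residuetypes.dat" ]; then
    [ -f residuetypes.dat ] || cp "$GMXDATA/top/residuetypes.dat" residuetypes.dat
    add_residue_type residuetypes.dat LYSN Protein
fi
```

Run pdb2gmx from the same directory afterwards so that the local copy is the one that gets read.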

I can move on now. But it is still quite strange that the default residuetypes.dat file does not have LYSN Protein.

I also still do not understand why LYSN (not LSN) appears. The prompt only offers LYS or LYSH, not LYSN.

Which LYSINE type do you want for residue 45
0. Not protonated (charge 0) (LYS)
1. Protonated (charge +1) (LYSH)


LYS and LYSH are what the force field understands to be neutral and protonated lysine, hence this is what you are prompted to choose between. The force field includes an aminoacids.r2b file that translates these names into residue names that are common across GROMACS force fields, which means the force field's LYS is interpreted as LYSN.
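To see the translation for a particular force field, one can print the lysine rows of its .r2b files. This is only a sketch: it assumes GMXDATA points at the share/gromacs directory and uses oplsaa.ff as the example force field directory. The function name show_lys_mappings is made up here, and the column layout of the file differs between force fields, so read the header comment of the file itself.

```shell
#!/bin/sh
# Hypothetical helper: print the non-comment rows that mention a lysine
# name (LYS..., LYN...) from every .r2b file in one force field directory.
show_lys_mappings() {
    ffdir=$1
    for f in "$ffdir"/*.r2b; do
        [ -f "$f" ] || continue
        echo "== $f"
        # Drop comment lines (starting with ';'), keep rows mentioning lysine.
        grep -v '^[[:space:]]*;' "$f" | grep -E 'LYS|LYN' || true
    done
}

# Assumption: GMXDATA is set and oplsaa.ff is the force field of interest.
if [ -n "${GMXDATA:-}" ] && [ -d "$GMXDATA/top/oplsaa.ff" ]; then
    show_lys_mappings "$GMXDATA/top/oplsaa.ff"
fi
```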

Thank you. Since LYS is translated to LYSN, LYSN should be included in the default residuetypes.dat file. I hope this can be added in the next GROMACS release.

File a bug report on GitLab. That’s the only way to get it fixed.


I use GROMACS 2021 on our university cluster. Because the LYSN bug is not fixed in that version, the default Protein group contains fewer atoms than the protein actually has.

Does this mean that MD run with this GROMACS 2021 version (with the LYSN bug) is flawed?
For example, tc-grps = Protein Non-Protein would incorrectly treat the LYSN residues as non-protein during the MD.


Make an index that merges the Protein and LSN groups, and then manually merge solvent and ions. Use those instead of the default Protein and Non-Protein groups.
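One possible way to build such an index non-interactively is sketched below. It uses gmx select instead of interactive gmx make_ndx, which is a substitution made for this example. The file names md.tpr and tc_groups.ndx and the group names Protein_all and Non_protein_all are assumptions, as is the residue name LYSN (taken from the error message above). The sketch also assumes gmx select makes the default groups available when no index file is passed; if it does not, pass one with -n.

```shell
#!/bin/sh
# Selection text for gmx select: two named groups that together cover the
# whole system. Assumption: the mis-typed residues are named LYSN.
protein_sel='group "Protein" or resname LYSN'
selections="\"Protein_all\" ${protein_sel}; \"Non_protein_all\" not (${protein_sel})"

# Only runs where GROMACS is installed and the (assumed) md.tpr exists.
if command -v gmx >/dev/null 2>&1 && [ -f md.tpr ]; then
    gmx select -s md.tpr -on tc_groups.ndx -select "$selections"
fi
```

The .mdp file would then use tc-grps = Protein_all Non_protein_all, with the index passed to gmx grompp via -n tc_groups.ndx. Check the group names actually written to the index file before relying on them.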

Thank you. Do you mean I should do this before running the MD, or only during analysis?

Would it make a big difference to the MD if LYSN is kept separate? I have already run quite a long MD and would rather not redo it. I will certainly include LYSN in the protein during analysis.

It’s irrelevant for anything except the thermostat. I have no idea how to predict the problems, but you have an integral residue of the protein with velocities modulated by the properties of the solution, which is clearly wrong.

Hi Justin, thank you. I will redo it. I can see that GROMACS 2022 has fixed the bug, so I will ask our HPC staff to install it.