Error Coordinat not matching gromacs

Hello, when i’m running my EM i’m facing this issue( i procide the file below, could give me any help) :
here is the command

gmx grompp -f em.mdp -c solv_ions.gro -p topol.top -o em.tpr

Program: gmx grompp, version 2024.2
Source file: src/gromacs/gmxpreprocess/grompp.cpp (line 706)

Fatal error:
number of coordinates in coordinate file (solv_ions.gro, 33491)
does not match topology (topol.top, 33509)

topol.top (703.8 KB)
solv_ions.txt (1.4 MB)
topol.top (703.8 KB)

There are six water molecules more in the topol.top file than in the coordinate file. Try

Protein_chain_A     1
LIG                 1
SOL             10277
Cl                  6

Hello,

I tried to resolve the issue, and it works now. I checked the topol.top file, and I found that the water molecules (SOL) start from 165SOL and end at 10441SOL. The subtraction gives 10276, not 10277. I can’t figure out where the extra molecule comes from.

I would also like to ask if this is normal when building the complex in GROMACS. I am following the simple procedure: I run the command gmx pdb2gmx -f PRO.pdb -o PRO_processed.gro for the protein, then for the ligand gmx editconf -f LIG.pdb -o LIG.gro.

After that, I form the ligand-protein complex by adding the specific content of LIG.gro into PRO_processed.gro. However, when I visualize the result (complex.gro) , I get the following image. Is this normal? In other tests I’ve performed with the same protein and two different ligands, one sits directly in the protein while the other is positioned at a relatively long distance.
Here the file content
File and image

Thank you in advance for your help.

Best regards,

165SOL to 10441SOL sounds like 10277 solvent molecules to me.

When you add LIG.gro to the PRO_processed.gro the coordinates must match. Otherwise it can end up anywhere in the combined structure. How did you get the coordinates of the ligand? If there were extracted from a protein-ligand complex, have the protein coordinates changed after that? E.g., have you changed the size of the periodic box? Are the coordinates of the ligand the same in LIG.pdb as in LIG.gro (keep in mind that the units are different)? I’m afraid that the two images in the folder you linked do not help that much since they show the system from different angles.

Thank you for your feedback.

For my first test, I took a protein from the RCSB server with the ligand JZ4. I removed that ligand, added hydrogen atoms, and then generated the topology file using the SwissParam server (not CGenFF beacuse they didn’t provide the .itp file that I needed later). After that, I took the JZ4 ligand and ran the process (mentioned on my first reply) with another protein called Elastase, and the complex was built perfectly.

For the second ligand, I obtained it from PubChem and followed the same process. However, in this case, the ligand ended up a bit far from the Elastase protein. Regarding the periodic box size, I did not see where i can meke changes in that area.

Does this align with what you were expecting?

If you are extracting ligands from other sources (or in complex with other proteins) and just put them in the protein system, the ligands can end up anywhere.

You will have to find a way to create a suitable starting configuration. It might help to center the ligand (LIG.gro) in its own system (e.g., using gmx editconf with the -c option) and then inserting it at a suitable position in the protein system. gmx insert-molecules can help you with that, and you can specify the position you want the molecule by using the -ip option.

Thanks for the help! I believe I was able to resolve the initial issue. After creating the protein.gro, I regenerated a new Protein.gro by defining the size in the XYZ dimensions. For the ligand, I created the .gro file using the same XYZ coordinates as the protein, and it perfectly complexed with the ligand.

I would also like to ask some advice on another problem. I am working with a protein, collagenase 1CGL (PDB ID: 1CGL), and when applying the force fields (either CHARMM36 or AMBER99SB-ILDN), I encounter the following error:

Program: gmx pdb2gmx, version 2024.2
Source file: src/gromacs/gmxpreprocess/pdb2gmx.cpp (line 870)

Fatal error:
Atom HE2 in residue HIS 113 was not found in rtp entry HSD with 17 atoms
while sorting atoms.

For a hydrogen, this can be a different protonation state, or it
might have had a different number in the PDB file and was rebuilt
(it might for instance have been H3, and we only expected H1 & H2).
Note that hydrogens might have been added to the entry for the N-terminus.
Remove this hydrogen or choose a different protonation state to solve it.
Option -ignh will ignore all hydrogens in the input.

Even when using the -ignh option, the error persists. Do have any idea to fix ths kind of issue.

If -ignh doesn’t work, I don’t have any specific suggestions. Would you be able to post the input files somewhere?

Thank you for your message. Here’s what happened in my case:

Test 1: I ran the command:

gmx pdb2gmx -f PRO.pdb -o PRO_processed.gro

I selected force field 6 (AMBER99SB-ILDN), but I encountered the same issue with the CHARMM force field as well. After choosing the water model, I received the following error:

Program:     gmx pdb2gmx, version 2024.2
Source file: src/gromacs/gmxpreprocess/pdb2gmx.cpp (line 870)
Fatal error:
Atom HE2 in residue HIS 113 was not found in rtp entry HID with 17 atoms
while sorting atoms.
For a hydrogen, this can be a different protonation state, or it
might have had a different number in the PDB file and was rebuilt
(it might for instance have been H3, and we only expected H1 & H2).
Note that hydrogens might have been added to the entry for the N-terminus.
Remove this hydrogen or choose a different protonation state to solve it.
Option -ignh will ignore all hydrogens in the input.

Test 2: I then ran the command with the -ignh option:

gmx pdb2gmx -f PRO.pdb -o PRO_processed.gro -ignh

However, this led to another error:

Program:     gmx pdb2gmx, version 2024.2
Source file: src/gromacs/gmxpreprocess/pgutil.cpp (line 154)
Fatal error:
Residue 1 named VAL of a molecule in the input file was mapped
to an entry in the topology database, but the atom CG1 used in
that entry is not found in the input file. Perhaps your atom
and/or residue naming needs to be fixed.

Could you provide any suggestions on how to resolve this? Here is the input file I used: Input file on format Pdb and txt

OK. So -ignh did help.

Are you choosing the correct terminal modification for VAL1?

The process didn’t allow me sufficient time to reach the stage where I could define the N- or C-terminal. I ran the following command:

gmx pdb2gmx -f PRO.pdb -o PRO_processed.gro -ignh -ter

However, I encountered the same issue:

Residue 1 named VAL of a molecule in the input file was mapped
to an entry in the topology database, but the atom CG1 used in
that entry is not found in the input file. Perhaps your atom
and/or residue naming needs to be fixed.

OK. Have you checked if there is a CG1 atom in VAL1 in your PRO.pdb input file?

Observation of Valine Atoms:

  • In Chain A, all Valine (VAL) atoms have the atom CG1, except for VAL101.
  • In Chain B, all Valine (VAL) atoms have the atom CG1.

(See the image below.)


Since Val101 is the first residue, that corresponds to “Residue 1 named VAL”. So that residue is missing an atom (there might be more missing atoms). You will need to find a way to model these with other software. GROMACS cannot fix broken proteins.

Can you guide me a little on the procedure to follow?

I’ve never used such tools myself, but see, e.g., https://www.quora.com/How-can-I-complete-a-missing-atom-in-a-protein-I-am-unable-to-do-the-pdb2gmx-can-someone-please-assist-me-to-complete-the-missing-atom-in-this-protein-4MGP-pdb-you-can-download-the-protein-from-the-link-given-below for some advice. It’s possible that https://www.researchgate.net/post/How_to_build_a_missing_non-terminal_amino_acid_residue_in_a_broken_secondary_structures_eg_alpha_helices_of_a_protein_model can also be of use.

OK, i will check the link and like you said find a way to build a new model by add the missing atoms.

In other hand, do you ever face this issue when running the procces of force field

Fatal error:
The residues in the chain CU400–HD402 do not have a consistent type. The
first residue has type ‘Ion’, while residue HD402 is of type ‘Other’. Either
there is a mistake in your chain, or it includes nonstandard residue names
that have not yet been added to the residuetypes.dat file in the GROMACS
library directory. If there are other molecules such as ligands, they should
not have the same chain ID as the adjacent protein chain since it’s a separate
molecule.

and also this kinf of issue

Program: gmx pdb2gmx, version 2024.2
Source file: src/gromacs/gmxpreprocess/resall.cpp (line 616)
Fatal error:
Residue ‘Cu’ not found in residue topology database

Check your input files. As it says: “If there are other molecules such as ligands, they should not have the same chain ID as the adjacent protein chain since it’s a separate molecule.” Modfiy the chain ID if you have to.

Does the force field you are using have parameters for Cu ions?

Regarding the issue with the CU residues, they are part of the heteroatoms (HETATM) section of my molecule, specifically CU400, CU401, HO402 etc. I am using the Amber99sb-ildn force field.

In my setup:

  1. The CU residue is included in my residuetypes.dat.
  2. I added the entry to aminoacids.rtp as follows
[ CU ]
 [ atoms ]
   CU         CU    0.00000   ; Charge of  0.00000

  1. I verified that CU 63.546 is already present in the atomtypes.atp file.

Despite these steps, I still encounter the same error when running the process.

Fatal error:
Residue 'CU' not found in residue topology database

I’ve attached the relevant images and files used in my setup for further reference.
File

Could you post the PDB file as well? Have you checked the chainID? I don’t think the problem is whether they are ATOM or HETATM.

It would be good to have a possibility to test this. Could you provide a link to the required set of input files, including the whole force field directory? I.e., if you are allowed to share the files at all.