Topology file after pdb2gmx

GROMACS version: 2021.4
After running pdb2gmx, I got the topology file for a protein-water system. The system name in the topology file reads:
Protein name; 6 ARTEFACT. THE FIRST RESIDUE OF THE MATURE PROTEIN AFTER SIGNAL; 7 PEPTIDE CLEAVAGE (A) IS MISSING. in water
Should this be ignored or will it affect my simulation result?

Hi, this name is probably taken from the TITLE entry of your source .pdb file, meaning it’s additional information provided by the authors of the structure when depositing it in the database.

To know if that affects your simulation, read carefully both the description in the PDB database and, more importantly, the article describing how this structure was prepared, solved and postprocessed.
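If you want to see where that name came from, you can print the header records of the source file and compare them with the name in topol.top. Below is a minimal Python sketch; the function name `header_records` and the default choice of records are my own assumptions, and it relies only on the fact that the PDB format stores the record name in the first six columns of each line.

```python
import sys


def header_records(pdb_lines, names=("TITLE", "COMPND", "REMARK")):
    """Return the lines whose record name (columns 1-6) is in `names`.

    `names` is an illustrative default, not something pdb2gmx prescribes.
    """
    wanted = set(names)
    selected = []
    for line in pdb_lines:
        record = line[:6].strip()  # PDB record name occupies columns 1-6
        if record in wanted:
            selected.append(line.rstrip("\n"))
    return selected


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python show_header.py protein.pdb
    with open(sys.argv[1]) as handle:
        for record_line in header_records(handle):
            print(record_line)
```

Any text that shows up both here and in the topology name is annotation from the depositors, not something pdb2gmx derived from the coordinates.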

I passed this protein.pdb file through PDB Reader on CHARMM-GUI, then used the resulting PDB file to generate the topology file (topol.top) with pdb2gmx. This time the message about the missing residue did not appear in the topology file; it just showed ‘Protein in water’. Will there be any problem if I proceed with this topology file? (I am using the charmm36 force field for my simulation.)

It’s not going to be a missing residue per se; it’s information about how protein expression/processing was done experimentally prior to structure determination.

I’m trying to say it’s not a technical question of “is this structure valid”, but a scientific question of “are the details of the signal peptide processing relevant to whatever I’m trying to address with my simulation” (95% of the time they are not, but not always). Hence, it’s the simulator’s job to be aware of the limitations and peculiarities of the initial structure.

After googling, this seems to be the full message:

The first residue (S) of the construct is a cloning artefact. The first residue
of the mature protein after signal peptide cleavage (A) is missing. This is a
chimeric protein of NC2 (Uniprot Q7S3P5, residues 19-80 and 91-97) and EAS
(Uniprot Q04571, residues 88-105).

What it means is: if at some point you do an analysis and find that a residue from that hybrid part is doing something interesting, you should know it is not a physiological effect, because this protein does not correspond to the one found in the living organism. So you can choose between (a) fixing it to make it more like the physiological structure, if worthwhile, or (b) ignoring it and keeping in mind that this part of the system is artificial.
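To keep track of which residues are actually present at the N-terminus of the structure you simulate, it can help to list the residues straight from the ATOM records. Here is a rough Python sketch, assuming a standard fixed-column PDB file; the function name `residue_sequence` is made up for illustration, and the column positions (residue name 18-20, chain 22, residue number 23-26, insertion code 27) follow the PDB format.

```python
def residue_sequence(pdb_lines, chain=None):
    """List (chain, residue number, residue name) once per residue, in file order.

    Only ATOM records are read; HETATM records are skipped.
    Pass `chain` to restrict the listing to a single chain ID.
    """
    residues = []
    previous_key = None
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        res_name = line[17:20].strip()   # columns 18-20
        chain_id = line[21:22]           # column 22
        res_seq = int(line[22:26])       # columns 23-26
        icode = line[26:27]              # column 27 (insertion code)
        if chain is not None and chain_id != chain:
            continue
        key = (chain_id, res_seq, icode)
        if key != previous_key:          # first atom of a new residue
            residues.append((chain_id, res_seq, res_name))
            previous_key = key
    return residues
```

Calling `residue_sequence(open("protein.pdb"))[:5]` (with your own file name) shows the first few residues, which you can then compare against the construct description above and the UniProt entries it cites.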