Weird RMSF curve

GROMACS version: 2020.2
GROMACS modification: No

Dear all,
I calculated the RMSF curve using “gmx rmsf -s step4.0_minimization.tpr -f outputbox.xtc -o rmsf_per_resi.xvg -ox average.pdb -res”.
The trajectory is the concatenated 80 ns simulation, followed by clustering and centering.
The RMSD curve looks OK. Why is the RMSF curve so weird?


Are you simulating a dimer? Chain A has residues ~420 to ~520 and chain B has 1 to ~60?

Yes, it's a dimer. The residue numbers are as you mentioned. How do I fix it?

Analyze each chain separately.

Hello @jalemkul and @veerubiotech,

Do we analyze each group separately by making an ndx file? If so, which selections do we make to generate the RMSF curve for the different residues? I created an ndx file and selected Protein + C-alpha, using the command gmx rmsf -s md1.tpr -f md1_center.xtc -n proteinca.ndx -o rmsf_protein.xvg -res, but I get a similar curve.

Thank you in advance for your suggestions and expertise.

Similar to what? What are the two conditions? Note that “Protein + C-alpha” is already a default group (C-alpha), so there’s no point in creating such a group. You would have to select by residue to get a chain into one group. Using -res has no effect when your selection contains only one atom per residue (Cα).

Sorry for the confusion, and thank you for your reply. When I use -res, I get an RMSF curve with the same kind of connecting lines as in the plot above. I see that you suggested separating the chains, but I have not been able to do so using make_ndx.

You have to create index groups by ranges of residues or provide an input coordinate file (PDB) that supports chain identifiers (and has them) so you can make the selection that way.
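For anyone who would rather script the residue-range approach than type it at the prompt, here is a minimal Python sketch. It assumes the PDB was written from the same .tpr (so the atom order matches), that the two chains do not share residue numbers, and that the ranges 420-520 and 1-60 are only the approximate values mentioned earlier in this thread. The function names and the group names chainA/chainB are made up for illustration. At the make_ndx prompt the equivalent should be a residue-range selection such as r 420-520.

```python
def index_groups_from_pdb(pdb_lines, ranges):
    """Collect 1-based atom numbers per named residue range.

    ranges maps a group name to an inclusive (first_resid, last_resid) pair.
    Atoms are numbered by their position among ATOM/HETATM records, which is
    how GROMACS index files count them.
    """
    groups = {name: [] for name in ranges}
    atom_number = 0
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        atom_number += 1
        resid = int(line[22:26])  # PDB columns 23-26: residue number
        for name, (first, last) in ranges.items():
            if first <= resid <= last:
                groups[name].append(atom_number)
    return groups


def format_ndx(groups, per_line=15):
    """Render the groups in the plain-text .ndx layout: [ name ] then numbers."""
    out = []
    for name, atoms in groups.items():
        out.append(f"[ {name} ]")
        for i in range(0, len(atoms), per_line):
            out.append(" ".join(str(a) for a in atoms[i:i + per_line]))
    return "\n".join(out) + "\n"


# Hypothetical usage (file names and ranges are placeholders):
# with open("protein.pdb") as fh:
#     groups = index_groups_from_pdb(fh, {"chainA": (420, 520), "chainB": (1, 60)})
# with open("chains.ndx", "w") as fh:
#     fh.write(format_ndx(groups))
```

The resulting file can then be passed to gmx rmsf with -n, choosing one group per run so that each chain gets its own output.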

I see, thank you for your reply. I attempted this by using gmx editconf -f protein.tpr -o protein.pdb to create a pdb file. Next, I used gmx make_ndx -f protein.pdb -o prot.ndx. With this, I am not seeing the chain identifiers in the menu. Have I made some error in this approach?

Does “protein.pdb” have a valid chain identifier in it?

If it does, it’s a simple matter of specifying “chain A,” “chain B,” etc. at the make_ndx prompt.

How can I check this? It does not seem to be available through make_ndx.

Inspect the PDB file for chain identifiers. If they’re not there, add them (editconf -label, etc.). The option to select by chain is absolutely part of the make_ndx syntax. I just used it to confirm.

Analysing Protein...

  0 System              :  1079 atoms
  1 Protein             :  1001 atoms
  2 Protein-H           :  1001 atoms
  3 C-alpha             :   129 atoms
  4 Backbone            :   387 atoms
  5 MainChain           :   517 atoms
  6 MainChain+Cb        :   634 atoms
  7 MainChain+H         :   517 atoms
  8 SideChain           :   484 atoms
  9 SideChain-H         :   484 atoms
 10 Prot-Masses         :  1001 atoms
 11 non-Protein         :    78 atoms
 12 Water               :    78 atoms
 13 SOL                 :    78 atoms
 14 non-Water           :  1001 atoms

 nr : group      '!': not  'name' nr name   'splitch' nr    Enter: list groups
 'a': atom       '&': and  'del' nr         'splitres' nr   'l': list residues
 't': atom type  '|': or   'keep' nr        'splitat' nr    'h': help
 'r': residue              'res' nr         'chain' char
 "name": group             'case': case sensitive           'q': save and quit
 'ri': residue index

> chain A

Found 1079 atoms with chain identifier A

 15 chA                 :  1079 atoms
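A quick way to check whether a PDB file actually carries chain identifiers is to tally column 22 of the ATOM/HETATM records. This is only a sketch: the function name is invented here, and it assumes a standard fixed-column PDB file.

```python
from collections import Counter


def chain_atom_counts(pdb_lines):
    """Count atoms per chain identifier (PDB column 22).

    A blank identifier shows up under the key " ".
    """
    counts = Counter()
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            counts[line[21]] += 1
    return counts


# Hypothetical usage (file name is a placeholder):
# with open("protein.pdb") as fh:
#     print(chain_atom_counts(fh))
```

If the only key is a blank, the file has no chain identifiers and "chain A" will find nothing. If a single letter covers every atom, selecting by chain will not separate anything either.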

Thank you, that worked as well. But after using the -res option I still get an RMSF plot like the one above, where some of the peaks are connected. What else may cause this?

@L_k please provide all relevant commands, how you created the index group, and upload/embed the image of your RMSF plot. So far, I’ve been working blind. It’s really hard to make any suggestions that way.

@jalemkul I apologize for the inconvenience. Here are the steps I have carried out:

  1. gmx editconf -f protein.tpr -o protein.pdb
  2. gmx make_ndx -f protein.pdb -o protA.ndx
    - make selection: chain A
  3. gmx rmsf -s protein.tpr -f protein_center.xtc -n protA.ndx -o rmsf.xvg
  4. convert xvg to csv and open in Excel to create plot

The plot appears like this:

Thank you for your time.

What do you think is wrong with the plot? It looks reasonable to me. It appears that there are a couple of segments where the residue number jumps by around 20.

I’m not going to try to guess what Microsoft products do, because it’s usually weird. I would strongly recommend breaking away from such software and using normal scientific plotting tools.
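As one possible alternative to a spreadsheet, the .xvg output can be read directly in Python and cut wherever the residue number stops increasing by one, so that each chain (or each stretch between numbering jumps) becomes its own trace instead of being joined by a connecting line. This is a rough sketch, not a tested workflow: the function names are made up, the plotting part assumes matplotlib is installed, and the max_step threshold is an arbitrary choice you may want to loosen if your chains have small numbering gaps.

```python
def read_xvg(lines):
    """Return (residue, value) pairs, skipping '#', '@' and '&' lines."""
    data = []
    for line in lines:
        line = line.strip()
        if not line or line[0] in "#@&":
            continue
        fields = line.split()
        data.append((int(float(fields[0])), float(fields[1])))
    return data


def split_at_breaks(data, max_step=1):
    """Start a new segment whenever the residue number goes backwards
    or jumps forward by more than max_step."""
    segments = []
    for resid, value in data:
        prev = segments[-1][-1][0] if segments else None
        if prev is None or not (0 < resid - prev <= max_step):
            segments.append([])
        segments[-1].append((resid, value))
    return segments


def plot_segments(segments, out_png="rmsf_by_segment.png"):
    """Draw each segment as a separate line (needs matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # write to file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for seg in segments:
        xs = [r for r, _ in seg]
        ys = [v for _, v in seg]
        ax.plot(xs, ys, label=f"residues {xs[0]}-{xs[-1]}")
    ax.set_xlabel("Residue")
    ax.set_ylabel("RMSF (nm)")
    ax.legend()
    fig.savefig(out_png)


# Hypothetical usage (file name is a placeholder):
# with open("rmsf.xvg") as fh:
#     segments = split_at_breaks(read_xvg(fh))
# print([(s[0][0], s[-1][0], len(s)) for s in segments])
# plot_segments(segments)
```

Printing the first residue, last residue and length of each segment also shows how many separate stretches the output contains and how many residues are in each.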

What do you expect from the plot? How many residues do you have in each chain? How does this output differ from what you were expecting? It looks like there may be three separate traces, indicating a homotrimer?