GROMACS version: 2021.3
GROMACS modification: /No
Dear all,
I am new to GROMACS. I have carried out an MD simulation of a protein & ligand complex and I got confused about RMSF analysis. I have conveniently done RMSD analysis of my complex using my command prompt but I am unable to repeat the same calculation for the generation of RMSF plot, in fact, I am not able to figure out whether I did RMSF calculation for my complex or my only using the command. I see several articles making comparisons of RMSF results between their studied proteins and protein & ligand complexes but I could not figure that out, so I could not apply the same procedure to my research. I’ve searched on the web and on GROMACS forum about it and couldn’t find a proper solution.
Thank you in advance.
Dear Harunalcakan,
I’m not completely sure I understand your problem.
What do you mean by:
You can use the gmx rmsf command for each of your trajectories to get the RMSFs. There are a few different options of output files as well. You can get XVG files (suitable for plotting graphs) or as b-factors in PDB files (good if you want to visualize the results projected onto your protein structure). In these files you should be able to see which parts were included in the calculation.
Can you provide some more details about what you tried to do (paste the commands you ran) and the results your were expecting?
Dear Cathrine,
First of all I’m sorry if I couldn’t express my problem properly since I’m not a native English speaker.
What I tried to do is to calculate and to plot the RMSF of the protein and the protein-ligand complex separately, in order to compare the RMSF results between the complex and the protein (if this kind of investigation is possible). I had no problem with generating rmsf.xvg file and plotting a graph using the file. To be more clear, I’d like to share the command pathway that I followed:
Command line:
gmx rmsf -f md_0_10.xtc -s md_0_10.tpr -o rmsf_residue.xvg -res
Reading file md_0_10.tpr, VERSION 2021.2 (single precision)
Reading file md_0_10.tpr, VERSION 2021.2 (single precision)
Select group(s) for root mean square calculation
Group 0 ( System) has 143213 elements
Group 1 ( Protein) has 9170 elements
Group 2 ( Protein-H) has 4624 elements
Group 3 ( C-alpha) has 581 elements
Group 4 ( Backbone) has 1743 elements
Group 5 ( MainChain) has 2323 elements
Group 6 ( MainChain+Cb) has 2893 elements
Group 7 ( MainChain+H) has 2882 elements
Group 8 ( SideChain) has 6288 elements
Group 9 ( SideChain-H) has 2301 elements
Group 10 ( Prot-Masses) has 9170 elements
Group 11 ( non-Protein) has 134043 elements
Group 12 ( Other) has 72 elements
Group 13 ( LIG) has 58 elements
Group 14 ( SOD) has 14 elements
Group 15 ( Water) has 133971 elements
Group 16 ( SOL) has 133971 elements
Group 17 ( non-Water) has 9242 elements
Select a group: 1
Selected 1: 'Protein'
Reading frame 2000 time 20000.000
GROMACS reminds you: "What if you're wrong about the great Ju Ju at the bottom of the sea?" (Richard Dawkins)
As could be seen, I chose Group 1 and generated an .xvg file to plot the graph. The .xtc file that I used belongs to the trajectory of protein & ligand complex and I do not have a trajectory file containing only protein.
I hope I could explain myself clearly.
No worries! If I understand you correctly, you want to compare the RMSFs of two systems: the apo protein, and the protein+ligand complex? But you also write you only have the trajectory of the protein+ligand complex? Then, I think your problem is that you need to generate the trajectory for the apo protein?
You can of course calculate the RMSFs for the protein and ligand separately or for the whole complex by selecting different groups in gmx rmsf (as you’ve already shown), but you will of course get the same results for the protein regardless of whether you calculate the RMSF with or without the ligand.
1 Like
Yes, that’s right! It’s also true that I only have the XTC file of the complex. But I’m not sure if I’m missing some points. As you stated, we can perform this calculation by selecting the Group 1 for the protein and the Group 13 for the ligand, but which groups do I need to select for the calculation involving the protein+ligand system? For example I can do RMSD calculation by selecting and combining two groups in the command prompt consecutively and I get RMSDs for complex system but I think it’s not possible for the RMSFs.
I think I got confused about this also. I see on several papers that researchers performed RMSFs for their proteins and protein+ligand complexes and it could be seen from their RMSF plots that they got different results. Actually, I’d like to show some examples about it but I’m not sure if I am able to share them here, I don’t want to break the forum rules if there are some :)
Since I am a beginner, these questions may sound weird but I need to clear my mind to go further.
The relevant comparison is between the RMSF of the protein only in each case, if you are interested in the impact of the ligand on the protein’s flexibility. As such, this requires no special use of gmx rmsf
or contents of the trajectory. Just analyze the protein in each and you will have what you want. You do not want to consider the protein and ligand simultaneously because if you fit to dissimilar groups (e.g. the protein in the apo case and protein+ligand in the other), you have different points of reference.
I would also recommend performing RMSF analysis only on the Cα or backbone groups, not the whole protein. You will get a lot of irrelevant motions (like sidechains) if you consider everything.
3 Likes
Sorry for the confusion! Justin’s answer was a lot clearer! I meant that since you only have one trajectory with the protein+ligand complex and I got the impression you wanted to compare with and without ligand for this trajectory alone, and since RMSF is usually calculated per residue (Cα), you would just see the same RMSFs for your protein regardless of whether you looked at the protein alone or at the protein+ligand (given that the initial alignment was done in a consistent manner, as Justin pointed out).
As Justin wrote, you probably want to generate the apo trajectory and then calculate the RMSFs for the protein alone for each of these cases. I just didn’t want to directly assume this was what you wanted to do.
You got the right impression Cathrine! Thank you very much for your interest and your help. Justin and you made me clear.