PCA analysis

martina_C · July 4, 2024, 10:32am

GROMACS version: 2022
GROMACS modification: No

Hi everyone,
I have simulations of the wt and mutated isoforms of a protein. I want to study the system with PCA, but I have a couple of questions because I’m not an expert. After running gmx covar (on only part of the protein, to reduce the computational cost), I checked the size of the eigenvalues. The first eigenvalue is of course larger than the others, but relative to the sum of all of them, the first few eigenvalues account for only slightly more than 50% of the total variance. Does that make them suitable for the subsequent analyses, or are they not representative enough? I was wondering whether the first 2 eigenvectors should account for the majority of the variance (60% or more?) before being used for other analyses. I also tried selecting a smaller region of the protein, but the results are similar. Moreover, does a PCA with so little of the variance in the first eigenvalues suggest that something is wrong with the simulation?

Thank you very much in advance,
Martina
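
For readers who want to check the same quantity, here is a minimal Python sketch of the cumulative fraction of variance captured by the first k eigenvalues. It assumes the eigenvalue file has the usual two-column .xvg layout (index, eigenvalue) with `#` and `@` header lines. The helper names and the toy eigenvalues are illustrative only and are not taken from this thread.

```python
import numpy as np

def read_xvg_columns(text):
    """Parse the numeric rows of .xvg text, skipping '#', '@' and '&' lines."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#@&":
            continue
        rows.append([float(x) for x in line.split()])
    return np.array(rows)

def cumulative_variance(eigenvalues):
    """Cumulative fraction of the total variance captured by the first k eigenvalues."""
    ev = np.asarray(eigenvalues, dtype=float)
    return np.cumsum(ev) / ev.sum()

# Toy data for illustration only (hypothetical, not from the thread).
example = """# comment line
@ title "Eigenvalues"
1 4.0
2 2.0
3 1.0
4 1.0
"""
data = read_xvg_columns(example)
frac = cumulative_variance(data[:, 1])  # column 1 holds the eigenvalues
for k, f in enumerate(frac, start=1):
    print(f"first {k} PCs: {100 * f:.1f}% of the variance")
```

With a real file, the same functions could be applied to the contents of eigenval.xvg, for example `read_xvg_columns(open("eigenval.xvg").read())`. Note that the total is taken only over the atoms in the analysis group, so the percentages depend on which selection was used.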

scinikhil · July 4, 2024, 3:39pm

That is OK, but you need to do more analysis to reach a conclusive result. The dynamics are complex here (assuming there are no mistakes in the PCA calculation).

martina_C · July 4, 2024, 4:33pm

Thank you very much. More specifically, I used this command, selecting C-alpha as the fitting group:

gmx covar -f md_noPBC.xtc -s md.tpr -n index.ndx -o eigenval.xvg -tu ns -v eigenvec.trr

And then I want to analyze the results with something like this:

gmx anaeig -v eigenvec.trr -f md_noPBC.xtc -eig eigenval.xvg -s md.tpr -first -last -2d 2dproj.xvg -comp eigcomp.xvg -rmsf eigrmsf.xvg -tu ns

selecting the most representative frames as first and last. Then I want to use the first 2 eigenvectors for a free energy landscape, but I was wondering whether this is OK even when they represent less than 50% of the total. Maybe this analysis won’t be particularly representative, while the other results from anaeig are still valuable?
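
As a rough illustration of the free energy landscape step, the sketch below estimates G = -kT ln(P / P_max) from a 2D histogram of two projections. This is a generic Boltzmann-inversion sketch, not the output of any GROMACS tool (within GROMACS, gmx sham is commonly used for this). The function name, the temperature of 300 K, the bin count, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

K_B = 0.0083144626  # Boltzmann constant in kJ/(mol K)

def free_energy_landscape(pc_a, pc_b, temperature=300.0, bins=50):
    """Estimate G = -kT ln(P / P_max) on a 2D grid of projection values.

    Bins that were never visited get G = inf.
    temperature (K) and bins are assumed values; adjust to your system.
    """
    hist, a_edges, b_edges = np.histogram2d(pc_a, pc_b, bins=bins)
    prob = hist / hist.sum()
    with np.errstate(divide="ignore"):  # log(0) for empty bins is expected
        g = -K_B * temperature * np.log(prob / prob.max())
    return g, a_edges, b_edges

# Synthetic projections for illustration only (hypothetical data).
rng = np.random.default_rng(0)
pc1 = rng.normal(0.0, 2.0, size=10000)
pc2 = rng.normal(0.0, 1.0, size=10000)

g, a_edges, b_edges = free_energy_landscape(pc1, pc2, temperature=300.0, bins=40)
finite = np.isfinite(g)
print("grid shape:", g.shape)
print("visited bins:", int(finite.sum()))
```

If 2dproj.xvg holds two numeric columns (one per selected eigenvector), it could be loaded with `np.loadtxt("2dproj.xvg", comments=["#", "@"])` and its columns passed in place of the synthetic arrays. The same function can be reused for any other pair of PCs, so landscapes from different pairs can be compared.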

scinikhil · July 4, 2024, 5:09pm

You need to check the FEL with other PCs. How big is the receptor?

martina_C · July 4, 2024, 5:25pm

Is it possible to compute the FEL with more than 2 principal components? I only knew about the possibility of choosing 2 of them. Or do you mean I should run more than one FEL analysis (for instance, one with the first 2 PCs and another with the 3rd and 4th) and then compare them?
The protein is 1046 amino acids long. I’m doing the PCA with an index group that includes about 500 residues close to the mutation site.

scinikhil · July 4, 2024, 6:11pm

How many ns is your simulation, and which protein is it?

martina_C · July 4, 2024, 6:26pm

It is a 1 µs simulation.

scinikhil · July 4, 2024, 7:08pm

1 microsecond is a reasonable length, but it still depends on which protein it is and what you are simulating.
