PCA on simulations of different MD structures

adextre · July 3, 2024, 12:21am

GROMACS version: 2024.1
GROMACS modification: No
Hello everyone, I’ve run three independent simulations of a protein-DNA complex, each with a DNA of different sequence. The simulations ran fine and the results make sense. I am now trying to visually compare the motions of each DNA with respect to the protein, to support my claim that the motions are sequence dependent. I can use gmx covar and then gmx anaeig to do a PCA on an individual simulation, but as I understand it, doing this separately for each one would just give three different sets of principal components that cannot be compared with each other. From a quick search, it seems the way to go is to do the PCA on one simulation and then project the trajectories of all simulations onto that simulation's eigenvectors. For the first structure, this would go something like:
gmx covar -s struc1.tpr -f traj1.xtc -o eigen1.xvg -v eigenvec1.trr -n index1.ndx
gmx anaeig -s struc1.tpr -f traj1.xtc -v eigenvec1.trr -2d proj1.xvg -first 1 -last 2 -n index1.ndx
However, when I try to project the trajectory of the second simulation onto those eigenvectors, like this:
gmx anaeig -s struc1.tpr -f traj2.xtc -v eigenvec1.trr -2d proj2.xvg -first 1 -last 2 -n index2.ndx
I get an error that there are inconsistent shifts over periodic boxes. This makes sense, since PBC was removed from each trajectory using its own reference structure, so the shifts differ. My questions are:

Does my approach make sense?
Do I need to use the trajectory files without removing PBC?
How can I fit different trajectories to the same starting point when each simulation has a different number of atoms? Should I make a .tpr and .xtc file containing only the atoms of the DNA chains I’m interested in comparing?
Any guidance is appreciated.

milosz.wieczor · July 3, 2024, 11:37am

Option 1) Convert your trajectories so that you keep only the DNA backbone + protein. This way the atom sets will be identical across all three simulations, so one set of eigenvectors can be applied to all of them.

Option 2) Extract helical parameters from the base pairs (X3DNA/Curves+ style) and perform PCA in the space of helical parameters of the whole duplex (using, say, scikit-learn).
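To illustrate Option 2, here is a minimal sketch of PCA in a shared feature space, using only numpy. It assumes each simulation has already been reduced to a (frames x features) array of helical parameters with the same number of features in every simulation; the random arrays below are synthetic stand-ins for that data, and the function names are hypothetical. Fitting on the pooled frames is one choice; fitting on a single simulation and projecting the others, as in the original post, works the same way with a different input to the fit step.

```python
import numpy as np


def fit_common_pca(feature_sets, n_components=2):
    """Fit one PCA basis on the pooled frames of all simulations."""
    stacked = np.vstack(feature_sets)
    mean = stacked.mean(axis=0)
    # SVD of the mean-centred data: rows of vt are the principal axes,
    # ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(stacked - mean, full_matrices=False)
    variances = s**2 / (stacked.shape[0] - 1)
    return mean, vt[:n_components], variances[:n_components]


def project(features, mean, components):
    """Project frames onto the shared principal axes."""
    return (features - mean) @ components.T


# Synthetic stand-in data (assumption): 3 simulations, 200 frames each,
# 12 helical parameters per frame. Replace with real parameter tables.
rng = np.random.default_rng(0)
n_frames, n_features = 200, 12
sims = [
    rng.normal(loc=shift, scale=1.0, size=(n_frames, n_features))
    for shift in (0.0, 0.5, 1.0)
]

mean, components, variances = fit_common_pca(sims, n_components=2)
projections = [project(x, mean, components) for x in sims]

for i, p in enumerate(projections, start=1):
    print(f"sim {i}: projection shape {p.shape}, mean PC1 {p[:, 0].mean():+.2f}")
```

Because all three projections share the same axes, they can be drawn on one 2D plot and compared directly. scikit-learn's PCA class (fit on the pooled data, then transform each simulation) should give equivalent projections up to the sign of each component.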
