Unexpected behaviour of gmx cluster

GROMACS version: 2021.4 and 2022
GROMACS modification: No

Using an MD trajectory of a protein dimer, I performed clustering as follows:

gmx cluster -s tc-000-4.tpr -f …/all-340-4.xtc -n cluster-4.ndx -cl clusters-340-4f.pdb -method gromos -cutoff 0.18 -clid clid-340-4f -g cluster-340-4f.log -o cluster-340-4f.xpm

which gives me 33 clusters. When I repeat the calculation with the same input files, but this
time pass the xpm file produced by the previous run as input so that the RMSDs do not need
to be recalculated, I get a quite different result (only 2 clusters):

gmx cluster -dm cluster-340-4f.xpm -s tc-000-4.tpr -f …/all-340-4.xtc -n cluster-4.ndx -cl clusters-340-4g.pdb -method gromos -cutoff 0.18 -clid clid-340-4g -g cluster-340-4g.log -o cluster-340-4g.xpm

I would expect identical results from the two commands above. Is there any reason for this behaviour? And is there a proper way to re-use the calculated RMSD values so that I can, e.g., compare
the effect of different cut-offs without having to recalculate the RMSDs over and over again?


You can provide the RMSD matrix obtained from gmx rms as input to gmx cluster (-dm) to avoid recalculating the RMSDs. The matrix written by gmx cluster has RMSD values in the upper left half and a graphical depiction of the clusters in the lower right half, and I think it is not built to be re-used as input. That might be the reason why you get inconsistent results.
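
For example, something along these lines might work. This is only a sketch and I have not run it on your data: the matrix name rmsd-340-4.xpm, the list of cut-offs and the output file names are made up, and TRAJ is a placeholder for your trajectory path. It only prints the commands unless you set RUN=1. Both tools will still ask you to pick index groups interactively, and you should pick the same groups in gmx rms that you used for clustering. Since an .xpm file stores values on a discrete colour scale, the cluster counts may not match a direct calculation exactly.

```shell
#!/bin/sh
# Sketch: compute the pairwise RMSD matrix once with gmx rms,
# then reuse it in gmx cluster (-dm) for several cut-offs.

# Placeholder for the trajectory path; set TRAJ to your real file.
TRAJ="${TRAJ:-…/all-340-4.xtc}"
# Hypothetical name for the matrix written by gmx rms.
MATRIX="rmsd-340-4.xpm"

n_cmds=0

# Print each command; execute it only when RUN=1 (dry run by default).
run() {
    printf '%s\n' "$*"
    n_cmds=$((n_cmds + 1))
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# Step 1: all-against-all RMSD matrix (-m) of the trajectory.
run gmx rms -s tc-000-4.tpr -f "$TRAJ" -n cluster-4.ndx -m "$MATRIX"

# Step 2: cluster with different cut-offs, reading the matrix instead of
# recalculating the RMSDs. The cut-off values are examples only.
for cutoff in 0.15 0.18 0.20; do
    run gmx cluster -dm "$MATRIX" -s tc-000-4.tpr -f "$TRAJ" -n cluster-4.ndx \
        -method gromos -cutoff "$cutoff" \
        -g "cluster-$cutoff.log" -clid "clid-$cutoff.xvg" \
        -cl "clusters-$cutoff.pdb" -o "cluster-$cutoff.xpm"
done

echo "$n_cmds commands"
```

Comparing the cluster counts in the resulting .log files should then show the effect of the cut-off without repeating the RMSD calculation each time.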


Will try that, thanks!