Using gmx cluster after gmx trjconv

Hi,

When using gmx cluster on a trajectory that has been centered with gmx trjconv, the resulting clusters.pdb and cluster.log files only show information for the first cluster, despite saying that multiple clusters were found.

When using gmx cluster on the same trajectory before using gmx trjconv, there are no problems.

Any help would be appreciated.

Thanks,
Calvin

Hi,
I assume that you want to cluster the structures of a macromolecule. gmx cluster (see the gmx cluster page of the GROMACS 2021.1 documentation) can cluster structures using several different methods. Either the RMS deviation after fitting or the RMS deviation of atom-pair distances can be used to define the distance between structures.

Could it be that, according to the chosen criteria, all the structures belong to the same cluster? In that case you will see only one cluster in the pdb and log files.

Best regards
Alessandra

Thanks for your reply, yes I am clustering the structure of a protein.

The cluster.log file says that there are more structures (1001) than are present in the first cluster (934). Therefore, there must be additional clusters containing these other structures?

Best,
Calvin

Hi,
Which clustering method are you using in gmx cluster?
The standard cluster.log reports the matrix dimension at the beginning, not the total number of structures.
Best regards
Alessandra

Hi,

I used the following command:

gmx cluster -f file.xtc -s tprout.tpr -n index.ndx -method gromos -dt 100 -cutoff 0.18 -o -g -dist -ev -sz -tr -ntr -clid -cl << EOF
26
26
26
EOF

Where file.xtc is my concatenated and centered trajectory file and tprout.tpr was generated using:

gmx convert-tpr -f md_0_1.tpr -n index.ndx -o tprout.tpr

Best,
Calvin

Hi,
I have tested your command with gmx 2021.1
and it works fine (all the structures are assigned to a cluster).
Could you report exactly what is written in the log file?
Thank you
Alessandra

Hi,

I used gromacs 2018.4, single precision multi-threaded (single node), not 2021.1.

In the log file it is written:

Using gromos method for clustering
Using RMSD cutoff 0.18 nm
The RMSD ranges from 0.0840247 to 0.344264 nm
Average RMSD is 0.169713
Number of structures for matrix 1001
Energy of the matrix is 11.1241.

Found 4 clusters

Writing middle structure for each cluster to clusters.pdb
Counted 64 transitions in total, max 16 between two specific clusters

cl. | #st rmsd | middle rmsd | cluster members
1 | 934 0.164 | 42900 .147 | 1700 1800 2500 3100 3200 3300 3400
| | | 3500 3600 3700 3900 4000 4100 4200
| | | 4400 4500 4600 4700 4900 5000 5100
| | | 5200 5300 5400 5500 5600 5700 5800
| | | 5900 6100 6300 6400 6600 6800 7200
| | | 7300 7600 7700 7800 7900 8000 8100
| | | 8200 8300 8400 8500 8600 8700 8800
| | | 8900 9000 9100 9200 9300 9400 9500
| | | 9600 9700 9800 9900 10000 10300 10400
| | | 10500 10600 10700 10800 10900 11100 11200
| | | 11500 11600 11700 12000 12100 12200 12300
| | | 12400 12500 12600 12700 12800 12900 13000
| | | 13100 13200 13300 13400 13500 13700 13800
| | | 13900 14000 14100 14200 14300 14400 14500
| | | 14600 14700 14800 14900 15000 15100 15200
| | | 15300 15400 15500 15600 15700 15800 15900
| | | 16000 16100 16200 16300 16400 16500 16600
| | | 16700 16800 16900 17000 17100 17300 17400
| | | 17500 17600 17700 17800 17900 18000 18100
| | | 18200 18300 18400 18500 18600 18700 18800
| | | 18900 19000 19100 19200 19300 19400 19500
| | | 19600 19700 19800 19900 20000 20100 20200
| | | 20300 20400 20500 20600 20700 20800 20900
| | | 21000 21100 21200 21300 21400 21500 21700
| | | 21800 21900 22000 22100 22200 22300 22400
| | | 22500 22600 22700 22800 22900 23000 23100
| | | 23200 23300 23400 23500 23600 23700 23800
| | | 23900 24000 24100 24200 24300 24400 24500
| | | 24600 24700 24800 24900 25000 25100 25200
| | | 25300 25400 25500 25600 25700 25800 25900
| | | 26000 26100 26400 26500 26600 26700 26800
| | | 26900 27000 27100 27200 27400 27500 27700
| | | 27800 27900 28000 28100 28200 28300 28400
| | | 28500 28700 28900 29000 29100 29300 29500
| | | 29600 29700 29800 29900 30000 30100 30200
| | | 30300 30400 30500 30600 30700 30800 30900
| | | 31000 31100 31200 31300 31400 31500 31600
| | | 31700 31800 31900 32000 32100 32200 32300
| | | 32400 32500 32600 32700 32800 32900 33000
| | | 33100 33200 33300 33400 33500 33600 33700
| | | 33800 33900 34000 34100 34200 34300 34400
| | | 34500 34600 34700 34800 34900 35000 35100
| | | 35200 35300 35400 35500 35600 35700 35800
| | | 35900 36000 36100 36200 36300 36400 36500
| | | 36600 36700 36800 36900 37000 37100 37200
| | | 37300 37400 37500 37600 37700 37800 37900
| | | 38000 38100 38200 38300 38400 38500 38600
| | | 38700 38800 38900 39000 39100 39200 39300
| | | 39400 39500 39600 39700 39800 39900 40000
| | | 40100 40200 40300 40400 40500 40600 40700
| | | 40800 40900 41000 41100 41200 41300 41400
| | | 41500 41600 41700 41800 41900 42000 42100
| | | 42200 42300 42400 42500 42600 42700 42800
| | | 42900 43000 43100 43200 43300 43400 43500
| | | 43600 43700 43800 43900 44000 44100 44200
| | | 44300 44400 44500 44600 44700 44800 44900
| | | 45000 45100 45200 45300 45400 45500 45600
| | | 45700 45800 45900 46000 46100 46200 46300
| | | 46400 46500 46600 46700 46800 46900 47000
| | | 47100 47200 47300 47400 47500 47600 47700
| | | 47800 47900 48000 48100 48200 48300 48400
| | | 48500 48600 48700 48800 48900 49000 49100
| | | 49200 49300 49400 49500 49600 49700 49800
| | | 49900 50000 50100 50200 50300 50400 50500
| | | 50600 50700 50800 50900 51000 51100 51200
| | | 51300 51400 51500 51600 51700 51800 51900
| | | 52000 52100 52200 52300 52400 52500 52600
| | | 52700 52800 52900 53000 53100 53200 53300
| | | 53400 53500 53600 53700 53800 53900 54000
| | | 54100 54200 54300 54400 54500 54600 54700
| | | 54800 54900 55000 55100 55200 55300 55400
| | | 55500 55600 55700 55800 55900 56000 56100
| | | 56200 56300 56400 56500 56600 56700 56800
| | | 56900 57000 57100 57200 57300 57400 57500
| | | 57600 57700 57800 57900 58000 58100 58200
| | | 58300 58400 58500 58600 58700 58800 58900
| | | 59000 59100 59200 59300 59400 59500 59600
| | | 59700 59800 59900 60000 60100 60200 60300
| | | 60400 60500 60600 60700 60800 60900 61000
| | | 61100 61200 61300 61400 61500 61600 61700
| | | 61800 61900 62000 62100 62200 62300 62400
| | | 62500 62600 62700 62800 62900 63000 63100
| | | 63200 63300 63400 63500 63600 63700 63800
| | | 63900 64000 64100 64200 64300 64400 64500
| | | 64600 64700 64800 64900 65000 65100 65200
| | | 65300 65400 65500 65600 65700 65800 65900
| | | 66000 66100 66200 66300 66400 66500 66600
| | | 66700 66800 66900 67000 67100 67200 67300
| | | 67400 67500 67600 67700 67800 67900 68000
| | | 68100 68200 68300 68400 68500 68600 68700
| | | 68800 68900 69000 69100 69200 69300 69400
| | | 69500 69600 69700 69800 69900 70000 70100
| | | 70200 70300 70400 70500 70600 70700 70800
| | | 70900 71000 71100 71200 71300 71400 71500
| | | 71600 71700 71800 72000 72100 72200 72300
| | | 72400 72500 72600 72700 72800 72900 73000
| |

Best,
Calvin

Hi,
Sorry I got lost. According to your log file you have

If your log file is not corrupt, I expect that you will find the details of the other clusters at the end of the file, for example by searching for 2|
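
A quick way to make that check is to compare the number of clusters the log reports with the number of cluster rows it actually contains. The following is only a minimal sketch: the function name summarize_cluster_log is made up for illustration, and it assumes the log layout shown earlier in this thread (a "Found N clusters" line, a "Number of structures for matrix N" line, and one row per cluster starting with the cluster number followed by "|").

```shell
# Hypothetical helper (not part of GROMACS): summarize a gmx cluster log.
# Assumes the layout pasted in this thread; other versions may differ.
summarize_cluster_log() {
    awk '
        # "Found 4 clusters" -> number of clusters the tool reports
        /^Found [0-9]+ clusters/ { reported = $2 }
        # "Number of structures for matrix 1001" -> last field is the count
        /Number of structures for matrix/ { nstruct = $NF }
        # First row of each cluster: "<id> | <#st> <rmsd> | ..."
        $1 ~ /^[0-9]+$/ && $2 == "|" { listed++; total += $3 }
        END {
            printf "reported=%d listed=%d structures_in_matrix=%d structures_listed=%d\n",
                   reported, listed, nstruct, total
        }
    ' "$1"
}

# Only run on a real log if one is present in the current directory.
if [ -f cluster.log ]; then
    summarize_cluster_log cluster.log
fi
```

If "listed" is smaller than "reported", or "structures_listed" is smaller than "structures_in_matrix", the log is probably incomplete rather than the clustering itself being wrong.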

best regards
Alessandra

Hi,

Yes, the problem I have is that the log file ends at the end of the first cluster and does not give any information regarding the subsequent clusters.
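
To see whether the file stops partway through a cluster (the log pasted above ends on an incomplete "| |" row), the members listed for each cluster can be counted and compared with the #st column. This is a rough sketch only: count_cluster_members is a hypothetical name, and it assumes the four-column "|"-separated layout shown in the log above.

```shell
# Hypothetical helper (not part of GROMACS): for each cluster in a
# gmx cluster log, compare the #st column with the members actually listed.
count_cluster_members() {
    awk -F'|' '
        NF >= 4 {
            id = $1
            gsub(/[ \t]/, "", id)
            if (id ~ /^[0-9]+$/) {
                # New cluster row: second column holds "#st rmsd"
                current = id
                split($2, stats, " ")
                expected[current] = stats[1]
                order[++n] = current
            }
            # Fourth column holds the member time stamps (also on
            # continuation rows, whose first three columns are empty)
            if (current != "") listed[current] += split($4, members, " ")
        }
        END {
            for (i = 1; i <= n; i++) {
                c = order[i]
                printf "cluster %s: expected %d, listed %d\n", c, expected[c], listed[c]
            }
        }
    ' "$1"
}

# Only run on a real log if one is present in the current directory.
if [ -f cluster.log ]; then
    count_cluster_members cluster.log
fi
```

A cluster whose "listed" count is lower than its "expected" count suggests that writing of the log was cut short at that point.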

You mentioned previously that my command works fine for you, did you try this on a trajectory file that had been centered with gmx trjconv? Because this is what seems to cause the issue for me, as the command works on uncentered trajectory files fine.

Best,
Calvin

Hi Calvin,

Yes, it was. Among other cases, I tried both the original and the centered xtc, but I cannot reproduce your issue. In general, using a centered xtc file should not affect whether the tool works.
There are options in gmx cluster that affect the number of clusters reported or saved, but based on what you reported that does not seem to be the problem here.

Best regards
Alessandra

Hi @alevilla, I have run an MD simulation of a protein that I am going to use for molecular docking (virtual screening). I want to obtain an average structure of the protein to use in the docking. Can I use gmx cluster for this, and is it suitable for my work?
Thanks