Cluster size histogram

GROMACS version: 2018
GROMACS modification: No

Dear all,

If I am not mistaken, the histo-clust.xvg file written by gmx clustsize -hc is the cluster size distribution histogram, i.e. the number of occurrences of each cluster size (one molecule/atom, two molecules/atoms, … up to the maximum size) over the course of the simulation. Could you please confirm that this is right?

If so, then for a system like mine, in which aggregates form and keep growing during the simulation (so the number of clusters decreases), I would expect the histo-clust.xvg plot to shift toward larger cluster sizes as the simulation proceeds. Instead, the most populated cluster size is always one molecule, as the comparison below of histo-clust.xvg for three time intervals (early, middle and late stages of a 600 ns simulation) shows.

So either I have misunderstood what -hc in gmx clustsize calculates, or something is not quite right. Either way, I would be very grateful if someone could briefly explain this to me.

Thank you,

Any comment, please?

What is reported is the actual distribution of the number of clusters per cluster size, averaged over the trajectory.
Your graph shows a decrease in the number of size-1 clusters going from 1-10 ns to 590-600 ns, which means that more atoms/molecules are involved in large clusters (but the size of those clusters may vary during the simulation).
If a large cluster of constant size (size = 9) is present in every frame of the trajectory, then the graph will show a peak at 9 with height 1, and zero everywhere else.
Best regards
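
P.S. A minimal sketch of the averaging described above, written in plain Python rather than taken from the GROMACS source. The function names and the two toy trajectories are hypothetical; the only assumption is that each bin holds the number of clusters of that size, averaged over the frames.

```python
from collections import Counter


def averaged_histogram(frames):
    """Average the per-frame cluster counts over all frames.

    `frames` is a list of frames; each frame is a list of cluster sizes.
    This is an illustrative sketch, not the GROMACS implementation.
    """
    total = Counter()
    for sizes in frames:
        # Count how many clusters of each size occur in this frame.
        total.update(Counter(sizes))
    n_frames = len(frames)
    return {size: count / n_frames for size, count in sorted(total.items())}


# Hypothetical case 1: a single cluster of size 9 in every frame.
constant = [[9], [9], [9], [9]]
print(averaged_histogram(constant))     # {9: 1.0}

# Hypothetical case 2: 9 molecules that gradually aggregate.
aggregating = [[1] * 9, [1] * 5 + [4], [1] * 2 + [7], [9]]
print(averaged_histogram(aggregating))  # {1: 4.0, 4: 0.25, 7: 0.25, 9: 0.25}
```

In the second case the size-1 bin is still the highest, even though the system ends up as one large cluster. Each monomer counts as a separate cluster, whereas a large aggregate contributes a single count.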

Thanks for the nice clarification, you are right.
The cluster-size/number distribution for a single frame is perfectly fine. However, I doubt that averaging over the frames of the trajectory makes sense for cluster-size/number histograms because, if I am not wrong, histograms cannot simply be averaged like an ordinary quantity.
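
To make this concern concrete, here is a small sketch with hypothetical per-frame histograms (9 molecules that gradually aggregate). It only shows that the mean of the frame-averaged histogram is in general not the same number as the average of the per-frame means; it says nothing about which of the two GROMACS reports.

```python
def mean_size(hist):
    """Number-weighted mean cluster size of a {size: count} histogram."""
    n_clusters = sum(hist.values())
    return sum(size * count for size, count in hist.items()) / n_clusters


# Hypothetical per-frame histograms: {cluster size: number of clusters}.
frames = [{1: 9}, {1: 5, 4: 1}, {1: 2, 7: 1}, {9: 1}]

# Frame-averaged histogram: average every bin over the frames.
sizes = sorted({s for h in frames for s in h})
avg_hist = {s: sum(h.get(s, 0) for h in frames) / len(frames) for s in sizes}

mean_of_avg_hist = mean_size(avg_hist)
avg_of_frame_means = sum(mean_size(h) for h in frames) / len(frames)

print(round(mean_of_avg_hist, 3))    # 1.895
print(round(avg_of_frame_means, 3))  # 3.625
```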

One more issue, regarding the average cluster size given by clustsize -av:

I wonder how GROMACS calculates the average cluster size for each frame. Is the average cluster size the mean cluster size? If so, the mean of a histogram is calculated as $\frac{\sum_{i=1}^{\text{max size}} (\text{size}_i \times \text{number}_i)}{\sum_{i=1}^{\text{max size}} \text{number}_i}$. I used this to calculate the average of histo.xvg for several frames, and the results don't agree with the values reported in average.xvg.
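
For reference, this is roughly how the formula above can be evaluated on a two-column histogram file. It is only a sketch: the function names and the sample data are hypothetical, and it assumes that the first column is the cluster size, the second is the number of clusters, and that lines starting with `#` or `@` are xvg header lines to be skipped.

```python
def read_xvg_histogram(lines):
    """Parse two-column xvg-style lines into a {size: number} dict."""
    hist = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and xvg comment/directive lines.
        if not line or line.startswith(("#", "@")):
            continue
        size, number = line.split()[:2]
        hist[int(float(size))] = float(number)
    return hist


def histogram_mean(hist):
    """sum(size_i * number_i) / sum(number_i), as in the formula above."""
    total = sum(hist.values())
    if total == 0:
        raise ValueError("histogram is empty")
    return sum(size * number for size, number in hist.items()) / total


# Hypothetical single-frame histogram: five monomers and one 4-mer.
sample = """\
# made-up data for illustration
@ title "Cluster size distribution"
    1     5.000
    2     0.000
    3     0.000
    4     1.000
"""
print(histogram_mean(read_xvg_histogram(sample.splitlines())))  # 1.5
```

With a real file, the lines would come from something like `open("histo.xvg")` instead of the sample string.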

Thank you