I am trying to store very high precision position data from my MD simulation of a protein. Since I do not need the solvent, and including it would make my output unnecessarily large, I am suppressing the .trr output and outputting the protein group to a .xtc.
In a single precision install, you can only turn compressed-x-precision up to 1e8, which makes sense. I thought that I could improve my precision by installing GROMACS in double precision and then turning the compressed-x-precision option up higher. However, if I go to 1e9 or higher, then I immediately get a fatal error when using mdrun:
Fatal Error - XTC error - maybe you are out of disk space
I am not out of disk space, and the simulation runs just fine as soon as I turn compressed-x-precision back down to 1e8. I have confirmed by checking output .trr files that GROMACS is successfully working in double precision. But I can’t use the whole .trr file (I need high time resolution and the results would be gigantic) and I feel like I should be able to turn the precision up for the .xtc so that I can output only the protein.
I think the check if wrong. XTC converts coordinates to 32bit integers, so you can use more than an xtc precision of around 1e8, depending on what the largest coordinate is. There should be an overflow check, so you don’t get the generic XTC error.
But why do you need such high precision? The accuracy of MD is not that high at all.
I need the precision to recover good velocities. Since as far as I can tell there is no way to output velocities of only a subgroup (ie: protein and not solvent), the only way I can do my analysis without generating ~100TB .trr files (I need high time resolution) is to get the velocities from finite differences of the .xtc after the fact. Additionally, I coarse-grain the protein before unpacking any binary files to keep the data a manageable size, using gmx traj to get center-of-mass trajectories of clusters. Since these sites move slower than atoms, I need high precision to avoid spurious zero velocities. It all works well enough, I’ve checked that the resulting velocities work for my purposes. But I’m currently only getting one or two digits after coarse-graining and I was hoping double precision could give me just a little more breathing room. It might also allow me to coarse-grain a little more before unpacking to tackle larger proteins.
If .xtc only works with 32-bit integers, I suppose I am already pushing it though.
Pausing to throw out solvent is very slow, and I do need a reasonably long simulation. They are already slow from all the writing I need to do to get high resolution, so I am loathe to slow them down even more. So I have considered this, but it’s a last resort for me.
As far as modifying the trajectory writing in mdrun: do you mean trying to alter the source code itself? Because I’m not aware of any options to output subgroups in the .trr as GROMACS currently stands. If there is an option I have missed, I would be ecstatic to hear that!
You would need to modify the source code. If you’re lucky that it’s the first N atoms you need, then changing the natoms value passed to gmx_trr_write_frame() in src/gromacs/mdlib/mdoutf.cpp would do it.
That seems to have worked! I very much appreciate the help. I might keep tinkering to see if I can arrange things so that I don’t need to reinstall for every molecule (because natoms is hardcoded to a specific number now. I’m wondering if I can feed the natoms from the xtc group to gmx_write_trr instead) but other than that minor inconvenience this is exactly what I needed and is going to save me SO much trouble.
Like you said it will only work as long as the atoms I need are first in the file, but luckily I’m just doing isolated proteins in solvent for now so I should be set for awhile.