Storing protein velocities at high framerate from long simulation

jhall4 · May 2, 2023, 11:28pm

GROMACS version: 2020.4
GROMACS modification: Yes/No

Hi,

For the analysis I need to do, I need to store very high time-resolution positions and velocities for the protein atoms from a several hundred ns simulation. I am struggling to figure out how to do this in GROMACS in a way that doesn’t require absurd amounts of storage.

The .trr stores positions and velocities in a compact format, but it includes the solvent atoms and so results in incredibly large files regardless. I see no way to output only the protein atoms.
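To put rough numbers on the storage problem, here is a back-of-the-envelope sketch. It assumes a single-precision .trr frame is dominated by 3 reals of 4 bytes per atom for each stored vector (positions, velocities), and it ignores the small per-frame header and box. The atom counts, run length, and output interval are hypothetical, not taken from the actual system.

```python
def trr_bytes_per_frame(n_atoms, store_x=True, store_v=True, bytes_per_real=4):
    """Approximate payload of one .trr frame (header and box ignored)."""
    vectors = int(store_x) + int(store_v)
    return n_atoms * 3 * bytes_per_real * vectors


def total_gib(n_atoms, n_frames, **kwargs):
    """Total size in GiB for n_frames frames of n_atoms atoms."""
    return trr_bytes_per_frame(n_atoms, **kwargs) * n_frames / 2**30


# Hypothetical numbers: 100,000 atoms in total, 5,000 of them protein,
# 500 ns written every 10 fs, i.e. 50 million frames.
n_frames = 50_000_000
full_system = total_gib(100_000, n_frames)
protein_only = total_gib(5_000, n_frames)

print(f"full system : {full_system:,.0f} GiB")
print(f"protein only: {protein_only:,.0f} GiB")
```

Under these assumptions the size scales linearly with the number of atoms written, so dropping the solvent is where almost all of the saving would come from.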

I can output an .xtc file instead, which gives me reasonable file sizes with only the protein atoms, but I cannot get velocities from the integrator this way. I could recover the velocities from sufficiently precise positions, but that would require unpacking the .xtc file before I do my coarse-graining instead of after. (To save time and storage space, I currently process the .xtc with gmx traj to get some center-of-mass trajectories before I convert anything to ASCII.) I have tried simply calculating the velocities of my coarse-grained (CG) sites using finite differences, but single-precision data isn’t enough to recover good velocities at this stage: the CG sites move too slowly, so I get lots of spurious zero velocities and the rest are obviously quantized. Converting the all-protein-atom .xtc to ASCII to get velocities first is slow and storage-intensive. Double precision would probably work here, but I don’t currently have access to double-precision GROMACS.
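The zero and quantized velocities described above can be reproduced with a toy model. This is only a sketch: the grid spacing, frame interval, and site speed are made-up values chosen so the effect is easy to see, and `quantize` is a stand-in for any format that rounds positions onto a fixed grid.

```python
def quantize(x, precision):
    """Round a coordinate (nm) onto a grid of spacing 1/precision."""
    return round(x * precision) / precision


def finite_difference_velocities(positions, dt):
    """Forward differences between consecutive frames."""
    return [(b - a) / dt for a, b in zip(positions, positions[1:])]


precision = 1000.0  # grid spacing 0.001 nm (illustrative value)
dt = 0.002          # ps between stored frames (illustrative value)
true_v = 0.1        # nm/ps, a slowly moving site (illustrative value)

# The site moves 0.0002 nm per frame, i.e. one fifth of a grid step.
positions = [quantize(5.0 + true_v * dt * i, precision) for i in range(50)]
velocities = finite_difference_velocities(positions, dt)

zeros = sum(1 for v in velocities if v == 0.0)
print(f"{zeros} of {len(velocities)} velocities are exactly zero")
print("distinct values:", sorted({round(v, 6) for v in velocities}))
```

Whenever the per-frame displacement is smaller than the grid spacing, the recovered velocity can only be zero or a multiple of (grid spacing / dt), never the true value. Averaging over longer windows reduces the effect but also lowers the time resolution.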

So my problem is the asymmetric way GROMACS treats positions and velocities. The positions are no problem at all because they can go into the .xtc file, where the solvent is dropped, and then I can coarse-grain using gmx traj before converting my results to ASCII. But there seems to be no equivalent pipeline for velocities that avoids gigantic file sizes.

I have also considered periodically stopping the MD simulation, using trjconv to strip the solvent out of the .trr, and appending the results as I go. The problem I ran into here is that versions of GROMACS more than a couple of years old develop errors in the timestamps during long simulations, and the concatenation routines in GROMACS align frames by timestamp rather than by frame number, so I get errors at every stitch. So again I would need to convert the results to ASCII and stitch them manually, which is prohibitive. Stopping and restarting also drastically slows down the simulations. I do not have access to a newer version of GROMACS than 2020.
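For reference, stitching by frame number instead of by timestamp amounts to the logic below. This is a sketch of the indexing only: `stitch_by_frame_index` is a hypothetical helper, the frames are plain in-memory `(time, payload)` tuples, and it assumes that each chunk after the first repeats the last frame of the previous chunk as its own first frame. Reading the frames out of .trr files is not shown.

```python
def stitch_by_frame_index(chunks, dt, t0=0.0, drop_duplicate_first=True):
    """Concatenate chunks of (time, payload) frames.

    The stored times are ignored; frame i of the result is stamped t0 + i * dt.
    """
    stitched = []
    for k, chunk in enumerate(chunks):
        # Assumption: a continued chunk starts with a copy of the previous last frame.
        frames = chunk[1:] if (k > 0 and drop_duplicate_first) else chunk
        for _, payload in frames:
            stitched.append((t0 + len(stitched) * dt, payload))
    return stitched


# Two chunks whose timestamps have drifted slightly at the join.
chunk_a = [(0.0, "a"), (1.0, "b"), (2.0000001, "c")]
chunk_b = [(2.0000003, "c"), (3.0000004, "d")]

stitched = stitch_by_frame_index([chunk_a, chunk_b], dt=1.0)
print(stitched)
```

Because the times are regenerated from the frame count, small drifts in the stored timestamps cannot cause a mismatch at the join. The cost is that the helper trusts the frame interval and the duplicate-frame assumption, so both would need checking against the real output.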

If anyone who is more familiar with what’s possible with GROMACS has any ideas, I would greatly appreciate it.

pszilard · May 3, 2023, 11:27am

Hi,

Why not compile a newer (also double-precision) version of the code yourself?

Cheers,
Szilárd

jhall4 · May 3, 2023, 7:15pm

I’ve been looking into that since I posted this. I managed to compile the double-precision version, but I can’t get the output .xtc files to store the positions in double precision. Setting compressed-x-precision = 1e8 works fine and gives essentially full single-precision data, but if I try to set it any higher (e.g., 1e9) I get an immediate fatal error in mdrun, even though this build of GROMACS is double precision. I don’t really understand why.

Fatal Error - XTC error - maybe you are out of disk space
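One possible explanation, offered as an assumption rather than something confirmed in this thread: as far as I understand the XTC format, each coordinate is multiplied by the precision, rounded, and stored as a signed 32-bit integer, independently of whether mdrun itself was built in double precision. If that is right, the largest coordinate that can be written shrinks as the precision grows, as the arithmetic below shows. The helper names are hypothetical.

```python
INT32_MAX = 2**31 - 1


def max_coordinate_nm(precision):
    """Largest |coordinate| (nm) whose scaled value fits in a signed 32-bit int."""
    return INT32_MAX / precision


def fits(coordinate_nm, precision):
    """True if round(coordinate * precision) is representable as int32."""
    return abs(round(coordinate_nm * precision)) <= INT32_MAX


for p in (1e3, 1e8, 1e9):
    print(f"precision {p:g}: |x| up to about {max_coordinate_nm(p):.3f} nm")

# A coordinate of 5 nm (hypothetical) under the two settings from the post.
print(fits(5.0, 1e8), fits(5.0, 1e9))
```

Under this assumption, 1e8 leaves room for coordinates up to roughly 21 nm, while 1e9 only allows about 2.1 nm, which would be exceeded in most simulation boxes and would be consistent with 1e8 working and 1e9 failing immediately.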

I checked that the .trr works fine on short runs, so double precision is working in general, but those files are prohibitively large for what I’m doing. I need to use an .xtc so that I can selectively store the protein without the solvent.
