Mismatch in frames

GROMACS version: 2022.5
GROMACS modification: No

Hi,

I was running a 100 ns membrane+protein+ligand simulation on an HPC cluster using SLURM. The job ran out of time and was killed at approximately 92 ns.

I restarted the simulation to complete the remaining 8 ns using the following command:

gmx_mpi mdrun -s md.tpr -cpi md_prev.cpt -v -append -deffnm md

The simulation then finished successfully. However, when I checked the md.log file, I saw that some time steps are duplicated in it. For instance:

DD step 46263999 load imb.: force 79.1% pme mesh/force 1.054
Step Time
46264000 92528.00000

Energies (kJ/mol)
Bond U-B Proper Dih. Improper Dih. CMAP Dih.
5.34296e+04 2.43940e+05 1.84763e+05 4.48997e+03 -1.78181e+03
LJ-14 Coulomb-14 LJ (SR) Coulomb (SR) Coul. recip.
3.82478e+04 -2.64975e+04 3.25754e+05 -5.84548e+06 2.03223e+04
Potential Kinetic En. Total Energy Conserved En. Temperature
-5.00282e+06 1.12946e+06 -3.87336e+06 7.79489e+06 3.03070e+02
Pressure (bar) Constr. rmsd
-8.74943e+01 4.64061e-06

Writing checkpoint, step 46264500 at Wed May 29 17:44:03 2024

4 2.46595e+05 1.85823e+05 4.56233e+03 -1.95436e+03
LJ-14 Coulomb-14 LJ (SR) Coulomb (SR) Coul. recip.
3.87148e+04 -2.75305e+04 3.24986e+05 -5.84069e+06 1.62945e+04
Potential Kinetic En. Total Energy Conserved En. Temperature
-5.00036e+06 1.13140e+06 -3.86896e+06 7.79472e+06 3.03591e+02
Pressure (bar) Constr. rmsd
1.82992e+01 4.65514e-06

DD step 46278999 load imb.: force 4.4% pme mesh/force 1.029
Step Time
46279000 92558.00000

Energies (kJ/mol)
Bond U-B Proper Dih. Improper Dih. CMAP Dih.
5.35400e+04 2.45639e+05 1.85842e+05 4.61183e+03 -1.92825e+03
LJ-14 Coulomb-14 LJ (SR) Coulomb (SR) Coul. recip.
3.85054e+04 -2.72041e+04 3.24764e+05 -5.84297e+06 1.61553e+04
Potential Kinetic En. Total Energy Conserved En. Temperature
-5.00305e+06 1.13006e+06 -3.87299e+06 7.79500e+06 3.03230e+02
Pressure (bar) Constr. rmsd
2.64827e+01 4.68069e-06

From the above, I can see that the log jumps from step 46264000 (time 92528.00000) to step 46279000 (time 92558.00000).

Interestingly, from step 46279000 (time 92558.00000) the log continues normally until:

DD step 46332999 load imb.: force 3.0% pme mesh/force 1.026
Step Time
46333000 92666.00000

Energies (kJ/mol)
Bond U-B Proper Dih. Improper Dih. CMAP Dih.
5.37965e+04 2.45166e+05 1.86201e+05 4.61486e+03 -2.18204e+03
LJ-14 Coulomb-14 LJ (SR) Coulomb (SR) Coul. recip.
3.82651e+04 -2.70649e+04 3.25583e+05 -5.84266e+06 1.62281e+04
Potential Kinetic En. Total Energy Conserved En. Temperature
-5.00205e+06 1.13090e+06 -3.87115e+06 7.80841e+06 3.03456e+02
Pressure (bar) Constr. rmsd
-9.84917e+00 4.64059e-06

It then jumps back to an earlier time step, the one just after step 46264000 (time 92528.00000):

DD step 46264999 load imb.: force 92.0% pme mesh/force 1.020
Step Time
46265000 92530.00000

Energies (kJ/mol)
Bond U-B Proper Dih. Improper Dih. CMAP Dih.
5.31246e+04 2.45763e+05 1.85537e+05 4.52282e+03 -1.92235e+03
LJ-14 Coulomb-14 LJ (SR) Coulomb (SR) Coul. recip.
3.83409e+04 -2.73014e+04 3.27261e+05 -5.85147e+06 2.04276e+04
Potential Kinetic En. Total Energy Conserved En. Temperature
-5.00572e+06 1.12950e+06 -3.87622e+06 7.79479e+06 3.03080e+02
Pressure (bar) Constr. rmsd
7.18260e+01 4.63354e-06

From there it continues to the end, repeating some of the time steps.
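One way to locate every such out-of-order entry is to scan the log for its "Step Time" headers and compare consecutive step numbers. The following is a minimal Python sketch, not a GROMACS tool: the function names `read_steps` and `backward_jumps` are made up for illustration, and it assumes each "Step Time" header is followed directly by a line holding the two values, as in the excerpts above.

```python
def read_steps(lines):
    """Collect (step, time) pairs that follow each 'Step Time' header line."""
    records = []
    expect_values = False
    for line in lines:
        fields = line.split()
        if expect_values:
            expect_values = False
            if len(fields) == 2:
                try:
                    records.append((int(fields[0]), float(fields[1])))
                except ValueError:
                    pass  # not a step/time line, ignore it
        elif fields == ["Step", "Time"]:
            expect_values = True
    return records


def backward_jumps(records):
    """Return (previous_step, step) pairs where the step counter did not increase."""
    return [
        (prev[0], cur[0])
        for prev, cur in zip(records, records[1:])
        if cur[0] <= prev[0]
    ]


# Step/time values taken from the log excerpts in this post.
sample = """\
Step Time
46264000 92528.00000
Step Time
46279000 92558.00000
Step Time
46333000 92666.00000
Step Time
46265000 92530.00000
""".splitlines()

jumps = backward_jumps(read_steps(sample))
print(jumps)  # [(46333000, 46265000)]

# For a real file (assumed to be named md.log):
# with open("md.log") as fh:
#     print(backward_jumps(read_steps(fh)))
```

Only backward jumps are reported, because a forward gap in the log does not by itself say anything about the step order.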

So the md.log file shows a repetition of some portion of the trajectory. To see whether the trajectory file is affected as well, I ran gmx check, and the output is as follows:

Command line:
gmx check -f md.trr

Checking file md.trr
trr version: GMX_trn_file (single precision)
Reading frame 0 time 0.000

Atoms 426608

Last frame 1000 time 100000.000

Item #frames Timestep (ps)
Step 1001 100
Time 1001 100
Lambda 1001 100
Coords 1001 100
Velocities 1001 100
Forces 1001 100
Box 1001 100

I thought I might have duplicated frames in my trajectory, but after looking at the gmx check output I am a bit confused. How can I get rid of the duplicated frames, and why did this happen? I am not an expert in this field, so I would be really happy if someone could explain what is going on.

Many thanks in advance; I would be very grateful for any help.

You do not have duplicated frames in the trajectory. With a save interval of 100 ps, a 100 ns run yields 1000 frames. GROMACS also saves the t=0 frame, so you get a total of 1001 frames, which is exactly what gmx check reports.
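The frame count can be checked with a line of arithmetic. A small Python sketch, assuming the run length is an exact multiple of the save interval (the function name `expected_frames` is purely illustrative):

```python
def expected_frames(total_time_ps, save_interval_ps):
    """Frames written when output starts at t=0 and recurs every save interval."""
    return int(total_time_ps // save_interval_ps) + 1  # +1 for the t=0 frame


n_frames = expected_frames(100_000, 100)  # 100 ns run, frames every 100 ps
print(n_frames)  # 1001
```

A trajectory with duplicated frames would contain more than this number, so a count of 1001 is consistent with a clean file.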

Many thanks, Dr. Lemkul, for your reply. I have a small follow-up question: why do the step and time entries in my md.log jump ahead and then jump back? Many thanks for your help.

Probably just a glitch in syncing to the existing file before writing. It’s irrelevant because you never need the .log file for anything. The data are all saved in the (correct) .edr file.
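If you want to confirm this for the energy file yourself, one option is to export any energy term to an .xvg file (for example with `gmx energy -f md.edr -o energy.xvg`) and check that the time column only ever increases. A hedged Python sketch of that check follows; the helper names are hypothetical, the sample values are made up for illustration and are not from this simulation, and the parser assumes the usual .xvg layout in which lines starting with `#` or `@` are comments or plot directives.

```python
def xvg_times(lines):
    """Return the first column (time) of every data line in an .xvg file."""
    times = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped[0] in "#@":
            continue  # skip blank lines, comments and plot directives
        times.append(float(stripped.split()[0]))
    return times


def is_strictly_increasing(values):
    """True if every value is larger than the one before it."""
    return all(b > a for a, b in zip(values, values[1:]))


# Illustrative data only; the numbers are invented for this example.
sample = """\
# comment line
@ title "Energy"
0.000000 -5000.0
2.000000 -5001.5
4.000000 -4999.2
""".splitlines()

times = xvg_times(sample)
print(is_strictly_increasing(times))  # True

# For a real file (assumed to be named energy.xvg):
# with open("energy.xvg") as fh:
#     print(is_strictly_increasing(xvg_times(fh)))
```

A result of `False` would indicate repeated or out-of-order time points in the exported data.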

Many thanks Dr. Lemkul for addressing my concerns.

Best Regards