"conserved energy" is not conserved

GROMACS version: 2021.1
GROMACS modification: Yes/No: No
Here post your question: I simulate a protein in vacuum (NVT ensemble) or in SPC/E water (NPT ensemble). dt = 1 fs, no constraints. I use a V-rescale thermostat and a Berendsen barostat. I am interested in the total energy exchange between the thermostat (and barostat) and the system, and the manual of Gromacs tells me that this exchange is included in the computation of the “conserved energy”, the result being available in the edr and log files of the run. If only a thermostat is used, formula 5.40 of the manual holds; if also a barostat is used, its contribution is described by formula 5.51.
Question 1. Formula 5.40 gives the energy exchanges of the thermostat as a positive fraction of the kinetic energy; that is, all these exchanges are taken as positive. Why?
Question 2. The “conserved energy” at the start of the run is equal to the total energy, but it evolves linearly with time, positively (NPT ensemble) or negatively (NVT ensemble). The change in time is very large, it can exceed the total energy over a 100 ns trajectory. Why is the “conserved energy” not conserved at all?
Question 3. Provided the behaviour of the “conserved energy” is understood, how can one disentangle the contribution of the thermostat from the contribution of the barostat?

I can’t find matching formulas 5.40 and 5.41 in the 2021 manual.

The conserved energy quantity is never exactly conserved due to numerical rounding errors and approximations. This causes a linear drift. You can decrease the verlet-buffer-tolerance which will decrease the drift a bit. Then the drift is usually dominated by inaccuracies in the constraint algorithms, which are difficult to improve, except by moving to double precision (and increase the lincs parameters).

The contributions from the thermostat and barostat are only kept track of internally. The only way to see them is by using gmx dump on a checkpoint file and looking for “integral”.

Thank you for your answer.

(i) Sorry, I shouldn’t have called it “manual”; the formulas are in the Gromacs Documentation Release 2021.1: formula 5.40 at p. 319, formula 5.51 at p.

(ii) I attach a file with the time evolution over 100 ns of the Total Energy and of the Conserved Energy. The former has a small drift, but the drift of the latter is huge (the small spikes on the green curve are produced at the time of the extension of the previous trajectory). The protein is not constrained, only the water molecules are rigid (SPC/E model). I will try the double precision, but I doubt that the drift of the conserved energy is caused by the single precision.

(iii) Thank you for the tip, I will look into the checkpoint file.

formula 5.40 at p. 319, formula 5.51 at p. 323

This drift is normal. There is an extremely small drift at each MD step and you are showing around a 100 million steps.

Are you really not using any constraint at all, also not on bonds involving hydrogens? Than you must be using a very small time step and SETTLE will introduce significant drift in single precision.

The first thing to do is do decrease the Verlet buffer tolerance, then increase the LINCS parameters, if there are normal constraints and the switch to double precision.

I don’t understand why you want to look at the thermostat and barostat contributions though.

I don’t know how many atoms your system has. If it’s 50000 then the drift would be 10^-4 kJ/mol/ps, or half that for 100000 atoms. This is exactly in the range shown in Fig 5.5. This shows that double precision is required to get much lower drift.

I looked into the checkpoint file, as you suggested, and found the data for the thermostat’s and barostat’s energy inputs. The drift in the conserved energy that puzzled me is only a fraction of those inputs (less than 10% over 100 ns), and must be due to inaccuracies in the integration. Thank you.