ILVES, a new parallel constraint algorithm

Dear GROMACS developers,

We are working on the development of a new parallel constraint algorithm named ILVES based on SHAKE’s equations. ILVES converges very rapidly, so our focus is to enable lower relative tolerances than the current default (shake-tol = 0.0001) with low performance overhead.

We published an article in the past introducing the implementation of ILVES specifically designed for proteins (ILVES-PC). We are in the final stages of developing a faster and simpler ILVES implementation capable of working with any molecule, and we are concurrently drafting its associated article.

In our tests, ILVES delivers speedups over SHAKE and P-LINCS for low and high tolerances. We believe that ILVES would be valuable to the community, and we would like to know the best way to contribute this feature to GROMACS.

Thank you very much in advance.

This looks like a good algorithm and the article is very will written and informative.

But I am hesitant to consider a new constraint algorithm. P-LINCS, with all it shortcomings, works well for the majority of cases. The vast majority of simulations are done is single precision. Here the constraint tolerance is limited to 10^-5, which can already be problematic as it is close to the machine precision and rounding errors accumulate. The gain of ILVES is rather moderate in this regime. A question for this regime is how rounding errors accumulate with ILVES. With P-LINCS this goes rather badly, as more iterations amplify rounding errors to the point that accuracy gets works with higher iterations and/or expansion order. I guess that ILVES could do better here.

The main gain seems to be in higher precision simulations in double precision with all bonds constrained. But this is a rather niche application area. Nearly all bio-molecular simulations are performed with only bonds involving hydrogens constrained.

And then there is the question of performance with MPI and on GPUs.

Thank you for your kind words.

In order to properly answer your question, I shall first briefly explain the properties of ILVES. I shall then answer your specific question related to the buildup of error in ILVES. Finally, I shall develop my initial points a bit further.

Brief explanation of the ILVES algorithm

ILVES solves the same system of differential algebraic equations as SHAKE, and thus reaches the same solution. It solves the nonlinear constraint equations using Newton’s method. It uses our own linear solvers that target the linear systems which are found in constrained molecular dynamics. By using ILVES it becomes practical to minimize the truncation error. This is a necessary condition for long term constrained simulations and for computing the modelling error in all constrained simulations.

How do rounding errors accumulate with ILVES?

In exact arithmetic, Newton’s method converges locally and quadratically. In floating point arithmetic, Newton’s method converges locally, at least linearly (and usually quadratically), until it stagnates at a level that is determined by (i) the unit roundoff, (ii) the conditioning of the equations, and (iii) the accuracy of the linear solver. We have explained this in detail in our papers [1,2] which consider the impact of rounding error on all methods of Newton’s type.

The novelty of ILVES is not the use of Newton’s method, but the direct linear solvers that are used to solve the necessary linear systems. We do not rely on standard library software. Instead we are writing our linear solvers from scratch. This is necessary, because the state-of-the-art linear solvers do not target the linear systems that we find in constrained MD. State-of-the-art solvers target systems that can be reordered so that they contain large dense blocks that can be processed using dense matrix matrix multiplication (BLAS3). In contrast, many of the systems that we encounter in MD are ultra sparse with at most 7 nonzero entries per row. Moreover, when we compute the LU decomposition, then there is little to no fill-in in the sense that L (and U) are nearly as sparse as the lower (upper) triangular portion of the original matrix. The potential for BLAS3 operations is therefore low, which is why it is advisable to write specialized linear solvers. We do not use iterative solvers because they are memory bound and because good preconditioners are difficult to construct automatically.

As you stated, << With P-LINCS this [accumulation of rounding errors] goes rather badly, as more iterations amplify rounding errors >>. Every computational chemist must balance accuracy and speed and contend with many different sources of error. So far imposing constraints with high accuracy (e.g. tolerance < E-10) usually required a significant percentage of the total simulation time. This is not the case with ILVES as it is expected to converge quadratically.

Further explanations

The modelling error is the difference between the physical reality and the chosen system of differential algebraic equations (DAE). The discretization error is the error introduced by replacing a system of DAE with a system of discrete equations. The truncation error is the error introduced by not solving the system of nonlinear equations exactly. It is not obvious why anyone would be interested in minimizing the truncation error. We shall now explain why it can be desirable to minimize this error.

Firstly, it is well known that the SHAKE algorithm is 2nd order accurate in the time step h and that it preserves the total energy of a system that is close to the system that we wish to simulate. The distance is O(h^2). The proof of these properties hinges on the fact that the constraint equations are solved exactly. It has been observed (see e.g. [3,4]) that if the constraint equations are not solved exactly (up to machine precision), then the total energy is not preserved and the energy drift is an increasing function of the tolerance used by the constraint solver (for a given number of time steps). Undesired distortions are a concern also in the NPT ensemble because low accuracy of constraints implies that the equations of the thermostat are not fully satisfied.

Secondly, it is frequently possible to estimate the discretization error accurately using a technique known as Richardson extrapolation. Specifically, if T is the target value that we seek to compute and A(h) is the result returned by our simulation, then there is frequently an asymptotic error expansion of the type:

T - A(h) = alpha h^p + beta h^q + O(h^r)

which controls the discretization error. Here alpha, beta are constants independent of the time step and the exponents satisfy 0 < p < q < r. In the case of the SHAKE algorithm it is likely that p=2. It is possible to accurately estimate the primary error term alpha h^p by conducting simulations with different step sizes. Specifically, we have:

alpha h^p = [A(h) - A(2h)]/[2^p - 1] + O(h^q).

It is also possible to determine p experimentally, determine when rounding errors are irrelevant and when Richardson’s error estimate, i.e., the quantity given by:

E(h) = [A(h) - A(2h)]/[2^p - 1]

is a good approximation of the discretization error T - A(h). I cannot promise that it is possible to apply Richardson’s techniques in every scenario, but I can promise that minimizing the truncation error is a necessary condition in every case.

We look forward to engaging with you and GROMACS team and we are ready to answer any questions that you might have.

References

1] Mikkelsen, Lopez-Villellas and Garcia-Risueno, “How Accurate Does Newton Have to Be?”," https://arxiv.org/pdf/2303.17911

[2] Mikkelsen, Lopez-Villellas and Garcia-Risueno, “Newton’s method revisited: How accurate do we have to be?”", https://onlinelibrary.wiley.com/doi/full/10.1002/cpe.7853

[3] Thallmair et al., “Nonconverged Constraints Cause Artificial Temperature Gradients in Lipid Bilayer Simulations”, https://pubs.acs.org/doi/pdf/10.1021/acs.jpcb.1c03665?src=getftr

[4] Mikkelsen, Lopez-Villellas et al., “Accurate and efficient constrained molecular dynamics of polymers using Newton’s method and special purpose code”, https://www.sciencedirect.com/science/article/pii/S0010465523000875

In response to your remaining concerns:

Our new implementation of ILVES is much simpler and faster and works with any molecule, not only proteins. We now also support constraining hydrogen bonds and domain decomposition (MPI). At present, we do not have a GPU version. In our tests, the new implementation of ILVES also delivers speedups for high tolerances (e.g., 10^4) in both single and double precision when constraining h-bonds.