Expanded ensemble: Performance impact of perturbing more atoms

GROMACS version: 2024.1
GROMACS modification: No

Hi all,
I am trying to understand the ways in which the performance of expanded ensemble simulations worsens with increasing system size and with larger numbers of atoms that change with lambda. I am using a dual topology I obtained through the PMX tutorial (which I would link here, but I’m limited to two links per post) as my test case. The tutorial demonstrates a W->F mutation in a Trp cage.

I tested three different simulation boxes based on this system:

  1. One Trp cage molecule in a small box of water+NaCl, 5914 atoms
  2. 51 Trp cage molecules in a large box of water+NaCl, 98496 atoms
  3. One Trp cage molecule in a large box of water+NaCl, 97977 atoms

Only atoms in the Trp cage change with lambda, so with these three systems, I hoped to disentangle performance impacts of the larger box from performance impacts of a larger number of perturbed atoms. I observed the following performance (in ns / day):

                System 1     System 2     System 3
   Standard       847          82           82
   EE             474          15           61

The comparison in system 1 shows that switching on expanded ensemble (EE) simulation carries a significant cost: performance drops to about 56 % of standard MD. The larger systems, of course, have worse performance overall. What surprised me, however, was how different the cost of switching on EE is in system 2 and system 3 (down to 18 % and 74 % of standard MD performance, respectively). I am by no means an expert, but the lambda-dependent potential functions described on the "Free energy interactions" page of the GROMACS 2024 documentation don’t seem to me like they should be that much more expensive.

In the runs for system 3 I noticed that, due to the nature of this system, a lot of time is spent waiting for GPU processes. To rule out effects of this, I also ran simulations of system 3 without GPUs and got 15.7 ns/day for standard MD and 12.4 ns/day for EE, i.e. EE runs at 79 % of standard MD performance.
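
For anyone who wants to reproduce the percentages quoted above, here is a minimal Python sketch. The throughput numbers are copied from this post; the dictionary layout and the function name relative_performance are just illustrative choices, not part of any GROMACS tooling.

```python
# Throughput in ns/day, copied from the table and the CPU-only runs above.
perf = {
    "System 1": {"standard": 847.0, "ee": 474.0},
    "System 2": {"standard": 82.0, "ee": 15.0},
    "System 3": {"standard": 82.0, "ee": 61.0},
    "System 3 (CPU only)": {"standard": 15.7, "ee": 12.4},
}


def relative_performance(standard: float, ee: float) -> float:
    """Return EE throughput as a percentage of standard-MD throughput."""
    return 100.0 * ee / standard


# Hypothetical helper output: one percentage per system.
rel = {name: relative_performance(**values) for name, values in perf.items()}

for name, pct in rel.items():
    print(f"{name}: EE runs at {pct:.0f} % of standard MD")
```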

I would be grateful for any suggestions or advice on how I could improve performance here. I am trying to run EE on a relatively large system in which I am changing a large number of atoms (so most like system 2), but the performance penalty I’m getting is rather prohibitive.

MDP, GRO, TOP, and LOG files of these systems and runs are at EE_performance.zip - Google Drive
Thank you!

Alchemical free energy perturbations/transformations are significantly slower than standard MD simulations, although they are considerably faster nowadays (especially since GROMACS 2022) than they were a few years ago. I’m afraid the slowdowns you present are in line with what I’d expect.

Hopefully, the FE interactions kernel will take advantage of the GPU in a future version of GROMACS.

Thanks very much, Magnus. That’s a shame - I was hoping I had made a mistake somewhere. So just to confirm, it is indeed expected that, between two comparable systems simulated in an expanded ensemble, the system in which more atoms are perturbed between an A and a B state would perform noticeably worse? I thought/hoped that using the free energy code in the first place would have the biggest impact on performance, but that the number of atoms that change with lambda might have a negligible impact.

I don’t know exactly what performance loss to expect when perturbing more atoms, but it won’t be negligible. The free energy kernel is slower (per perturbed atom) than the ordinary non-bonded kernels. Likewise, it will be a bit slower if you calculate dH to all neighbouring lambda states (calc-lambda-neighbors) and/or have more states. But you only calculate to the closest neighbor anyhow.

In your case, you get a factor-of-four slowdown (61 vs. 15 ns/day in EE for systems 3 and 2) with about 50 times more perturbed atoms. I’m afraid that sounds like what I’d expect.
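
As a rough back-of-the-envelope check, the numbers from this thread can be turned into wall time per simulated ns. The sketch below assumes that all of the extra time relative to standard MD is attributable to the free energy code, which ignores the GPU waiting noted for system 3, so treat the result as an estimate only. All variable names are illustrative.

```python
# Throughput in ns/day for the two large boxes, copied from the table above.
standard_large = 82.0  # standard MD, systems 2 and 3
ee_system2 = 15.0      # EE, 51 perturbed Trp cage molecules
ee_system3 = 61.0      # EE, 1 perturbed Trp cage molecule

# Overall EE slowdown when going from 1 to 51 perturbed molecules.
slowdown = ee_system3 / ee_system2

# Extra wall time (days per ns) that EE adds on top of standard MD.
# Assumption: this difference is entirely due to the free energy code.
extra_system2 = 1.0 / ee_system2 - 1.0 / standard_large
extra_system3 = 1.0 / ee_system3 - 1.0 / standard_large
overhead_ratio = extra_system2 / extra_system3

print(f"EE slowdown, system 3 -> system 2: {slowdown:.1f}x")
print(f"EE overhead ratio, system 2 / system 3: {overhead_ratio:.1f}x")
```

Under that assumption the EE overhead grows by a smaller factor than the number of perturbed molecules (51), so the cost is far from negligible but is not strictly proportional to the perturbed atom count in these runs either.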

That clears this up, thanks very much!