Energy minimization fails with 2023.2 + gcc on Apple silicon

GROMACS version: 2023.2
GROMACS modification: No
I am running some test simulations with virtual site parameters for DNA and RNA. I started working locally on my MacBook Pro with Gromacs 2023.1 and things went fine. I then started over with 2023.2, using the same scripts as before, but now energy minimization failed. Specifically, it claimed to reach machine precision almost immediately, with a very high positive potential energy and very large forces.

I tested on a supercomputer (Intel) where I installed 2023.2 and then energy minimization worked fine.

I then compiled 2023.2 with clang on my MacBook Pro, and energy minimization worked (although it led to a somewhat higher Epot and converged in far fewer steps than the 2023.1 gcc build).

All builds are simplistic in the sense that I have used a minimum of cmake options, so nothing strange about the builds. FFTW3 was compiled with Gromacs.

The combination 2023.2 and gcc 12.2.0 (installed via homebrew) and Apple silicon (M1 pro) seems to produce a Gromacs incapable of minimizing the energy of my DNA duplex. I have tried to run the tests for the installation, but this doesn’t work (this has been reported before).

Excerpts from log files below.

2023.1 (gcc12):
Steepest Descents did not converge to Fmax < 10 in 5001 steps.
Potential Energy = -1.7327117e+05
Maximum force = 6.1179816e+02 on atom 161
Norm of force = 1.1533582e+01

2023.2 (gcc12):
Steepest Descents converged to machine precision in 16 steps,
but did not reach the requested Fmax < 10.
Potential Energy = 4.0807118e+06
Maximum force = 3.3101590e+07 on atom 33
Norm of force = 7.3828226e+06

2023.2 (clang):
Steepest Descents converged to machine precision in 31 steps,
but did not reach the requested Fmax < 10.
Potential Energy = -1.2083102e+05
Maximum force = 3.6648938e+04 on atom 12
Norm of force = 2.5091512e+03
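
For anyone who wants to compare such excerpts automatically, here is a minimal Python sketch that pulls the three summary values out of a log excerpt and flags a run that looks diverged. The function names and the `fmax_limit` threshold are my own illustrative choices, not anything GROMACS provides, and the "positive Epot is suspicious" rule is only a heuristic that assumes a solvated system like this one.

```python
import math
import re

# Matches lines such as "Potential Energy = -1.7327117e+05"
# or "Maximum force = inf on atom 1" at the start of a line.
_FIELD = re.compile(
    r"^(Potential Energy|Maximum force|Norm of force)\s*=\s*(\S+)",
    re.MULTILINE,
)


def parse_em_summary(text):
    """Return the summary values found in a steepest-descent log excerpt."""
    return {name: float(value) for name, value in _FIELD.findall(text)}


def looks_diverged(summary, fmax_limit=1.0e6):
    """Heuristic: non-finite values, positive Epot, or a huge maximum force.

    fmax_limit is an arbitrary illustrative threshold, not a GROMACS setting.
    """
    epot = summary["Potential Energy"]
    fmax = summary["Maximum force"]
    if not (math.isfinite(epot) and math.isfinite(fmax)):
        return True
    return epot > 0.0 or fmax > fmax_limit


# The 2023.1 (gcc12) and 2023.2 (gcc12) excerpts quoted above.
log_2023_1 = """Steepest Descents did not converge to Fmax < 10 in 5001 steps.
Potential Energy = -1.7327117e+05
Maximum force = 6.1179816e+02 on atom 161
Norm of force = 1.1533582e+01
"""
log_2023_2 = """Steepest Descents converged to machine precision in 16 steps,
but did not reach the requested Fmax < 10.
Potential Energy = 4.0807118e+06
Maximum force = 3.3101590e+07 on atom 33
Norm of force = 7.3828226e+06
"""

print(looks_diverged(parse_em_summary(log_2023_1)))  # False
print(looks_diverged(parse_em_summary(log_2023_2)))  # True
```

Because `float("inf")` parses, the same check also catches runs that report infinite forces.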

These energy minimizations were done without constraints. I got it to work with constraints and virtual sites, and also without virtual sites or constraints. The virtual site parameters seem like an obvious candidate to blame, but I cannot see why they would work with 2023.1 but not 2023.2. Is there a case for a bug report?

Suggestions for additional tests are welcome.

Hi!

That definitely looks to be worthy of a bug report.

As a check, could you please recompile GROMACS with -DGMX_SIMD=None and see if it helps? And share your input files?

"I have tried to run the tests for the installation, but this doesn’t work (this has been reported before)."

Could you please share the link to this report?

Hi,

I will try without SIMD, then file a bug report with input files and provide a link. Hope to have time today but no promises.

I forgot that I compiled with -DGMX_GPU=OpenCL. Full cmake command:

cmake -DCMAKE_INSTALL_PREFIX=$d \
  -DCMAKE_C_COMPILER=gcc-12 \
  -DCMAKE_CXX_COMPILER=g++-12 \
  -DGMX_BUILD_OWN_FFTW=ON \
  -DGMX_GPU=OpenCL \
  -DBUILD_TESTS=ON \
  -DGMX_BUILD_UNITTESTS=ON \
  ../gromacs-2023.2

In any case, it works without SIMD:

Steepest Descents did not converge to Fmax < 10 in 5001 steps.
Potential Energy = -1.7337888e+05
Maximum force = 1.8611575e+03 on atom 33
Norm of force = 3.1299109e+01

I also tried without OpenCL (-DGMX_GPU=OFF), which also seems to work.

Steepest Descents did not converge to Fmax < 10 in 5001 steps.
Potential Energy = -1.7342658e+05
Maximum force = 3.1430362e+02 on atom 33
Norm of force = 9.4941796e+00

But here is the really interesting bit. If I use the tpr-file I made with 2023.1 (same gcc) it works quite well also with 2023.2+gcc+simd+gpu:

Steepest Descents converged to machine precision in 2366 steps,
but did not reach the requested Fmax < 10.
Potential Energy = -1.7031525e+05
Maximum force = 1.0484944e+03 on atom 33
Norm of force = 2.2679551e+01

I will run some additional tests to narrow this down before I file a bug report. All preprocessing is scripted and identical, by the way. The only difference I can imagine is the placement of ions, which I think has a random element.

The previously reported problem with make tests on Apple silicon is in this thread: "Error compiling Gromacs 2023's checks on Mac M2".

Hm. I have tried rerunning the energy minimization a couple of times and it actually works from time to time (maybe using the old tpr was one of those times). I have also noticed that it sometimes fails with clang too.

Three successive runs with the gcc-compiled gromacs, using the same tpr-file each time:

Steepest Descents converged to machine precision in 19 steps,
but did not reach the requested Fmax < 10.
Potential Energy = -8.2929227e+20
Maximum force = inf on atom 1
Norm of force = inf

Steepest Descents did not converge to Fmax < 10 in 5001 steps.
Potential Energy = -1.7313095e+05
Maximum force = 1.8273123e+03 on atom 33
Norm of force = 2.5931479e+01

Steepest Descents converged to machine precision in 16 steps,
but did not reach the requested Fmax < 10.
Potential Energy = 1.3144642e+20
Maximum force = inf on atom 1
Norm of force = inf

Two successive runs with the clang-compiled gromacs, using the same tpr file each time:

Steepest Descents converged to machine precision in 14 steps,
but did not reach the requested Fmax < 10.
Potential Energy = -4.1480585e+10
Maximum force = 1.6431707e+11 on atom 33
Norm of force = 3.6613234e+10

Steepest Descents did not converge to Fmax < 10 in 5001 steps.
Potential Energy = -1.7380114e+05
Maximum force = 2.2310647e+03 on atom 33
Norm of force = 3.1037871e+01
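
To put a number on how much repeated runs from the same tpr file disagree, a small Python sketch along these lines could be used. The function name is hypothetical, and using the first run as the reference is just one simple convention I picked for illustration.

```python
import math


def max_relative_spread(values):
    """Largest |v - ref| / |ref| over the runs, with the first run as ref.

    Returns inf if any value is non-finite, and raises ValueError if the
    reference is zero or the list is empty.
    """
    if not values:
        raise ValueError("need at least one value")
    if any(not math.isfinite(v) for v in values):
        return math.inf
    ref = values[0]
    if ref == 0.0:
        raise ValueError("reference value is zero")
    return max(abs(v - ref) / abs(ref) for v in values)


# Potential energies of the three successive gcc runs quoted above.
gcc_runs = [-8.2929227e+20, -1.7313095e+05, 1.3144642e+20]
# Potential energies of the two successive clang runs quoted above.
clang_runs = [-4.1480585e+10, -1.7380114e+05]

print(max_relative_spread(gcc_runs))
print(max_relative_spread(clang_runs))
```

A fully reproducible build should give a spread of exactly 0.0 for identical inputs, while the runs above differ by roughly the size of the reference value itself.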

I think I am ready to file a bug report.

Ah, that one. So, Apple still has not fixed their linker :(

Be warned that there is an issue with OpenCL on Apple Silicon: "GROMACS 2023.2 tests fail/segfault on Apple Silicon with OpenCL" (GROMACS GitLab issue #4852). Not sure how widespread it is, but if you encounter it, please try the proposed patch.

But your bug also happens without GPU, so that’s not it.

Actually it works without the GPU. I have been a bit more systematic and run it 10 times without GPU now and it always gives exactly the same energy and forces.

Steepest Descents did not converge to Fmax < 10 in 5001 steps.
Potential Energy = -1.7339505e+05
Maximum force = 1.3064794e+03 on atom 33
Norm of force = 1.9743493e+01

So it is probably the OpenCL issue.

I did note however that during the build, OpenCL is mentioned in positive terms.


-- GPU support with OpenCL is deprecated. It is still fully supported (and recommended for AMD, Intel, and Apple GPUs). It may be replaced by different approaches in future releases of GROMACS.

Maybe it should not be recommended for this hardware.

I will try the patch and see.

The patch seems to have worked. The final energies and forces are not identical between runs, but they are similar, so I don’t think a bug report is needed now that the problem is understood and already fixed for future versions.

Thank you very much for guiding me towards a solution!

OpenCL on Apple GPUs got broken in 2023.2 while we were fixing another bug, so we have not had a chance to update that message since then.

Awesome, thanks for checking!

The bug is fixed in GROMACS 2023.3, which will be released later this week.

Thank you. Good to know that it will work in the upcoming release.

By the way, can you share your experience with GPU acceleration on Apple GPUs? Are you getting a significant speed-up?