You might have seen that Apple announced their new ARM-based hardware earlier this week. GROMACS is already fully accelerated for ARM, including Neon SIMD instruction sets, so we expect the code will work great on this hardware.
However, one thing that could complicate usage is that the machines are equipped with a mix of high/low-performance cores (4+4 or 8+4), and it will be important that we only run on the high-performance ones. I don’t expect this will be difficult to fix (if it doesn’t work automatically), so this is just a heads-up that:
We have ordered a machine ourselves and will test the second we have it here.
Unless the hardware is seriously delayed, I expect any changes required will be part of GROMACS-2021 - but no promises about release 2020 right now. Thus, you might have to use a beta for a few weeks until the release is official ;-)
Compiling GROMACS on macOS 11.1 with Xcode and MacPorts installed worked like a charm. I ran the Ribosome benchmark from the Max Planck Institute for Biophysical Chemistry ("A free GROMACS benchmark set") and tried to pin the run to either the 4 efficiency cores or the 4 high-performance cores on my Apple MacBook Air. I tried three different settings of gmx mdrun:
mdrun -nt 4 -pin on -pinoffset 0 (intent was to work on cores 0-3)
mdrun -nt 4 -pin on -pinoffset 3 (intent was to work on cores 4-7)
mdrun (intent was to use all 8 cores)
Using the first two commands I got 0.474 and 0.473 ns/day; using all 8 cores I could push it to 0.574 ns/day.
Pinning won’t have any effect on macOS - Apple simply doesn’t offer the POSIX APIs to select CPU sets. Based on the docs I’ve seen, however, the OS should do a good job of always pushing CPU-intensive jobs to the high-performance cores, which is likely why you get the same performance from both pinned runs.
However, it’s interesting that you actually do get a bit better performance when using all 8 cores!
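To put a number on "a bit better", here is a minimal Python sketch using only the ns/day figures reported above. The variable and function names are just illustrative, and the "ideal" figure assumes eight identical cores with perfect scaling, which is not what this hardware has.

```python
# ns/day figures as reported in the post above
four_thread_runs = [0.474, 0.473]  # -nt 4 with -pinoffset 0 and -pinoffset 3
eight_thread_run = 0.574           # plain mdrun, all 8 cores


def relative_gain(baseline, improved):
    """Fractional throughput gain of `improved` over `baseline`."""
    return improved / baseline - 1.0


# Compare against the better of the two 4-thread runs.
baseline = max(four_thread_runs)
gain = relative_gain(baseline, eight_thread_run)

# Assumption for comparison only: 8 equal cores with perfect scaling.
ideal = 8 / 4 - 1.0

print(f"gain from 4 to 8 threads: {gain:.1%}")
print(f"ideal gain with 8 equal cores: {ideal:.0%}")
```

Going from 4 to 8 threads gives roughly a 21% gain, far short of the 100% that eight equal cores would give in the ideal case. That is at least consistent with the four extra cores being the slower efficiency cores, although a single benchmark run cannot separate that from other scaling effects.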