GROMACS version: 2023.3
GROMACS modification: No
Hi folks
I know there have been several threads on running GROMACS on Apple M1 or M2 chips (e.g. "Error compiling Gromacs 2023's checks on Mac M2"), but I recently got a MacBook Pro with the M3 chip and was interested to see how it would perform. I am starting this thread in case other people have concerns or suggestions, or just need a starting point.
Here are the details:
My MacBook Pro is the 14" version with 18GB of RAM and an M3 Pro chip (12 core CPU, 18 core GPU). OS is Sonoma 14.1.2
I am comparing it to a Linux PC running Ubuntu 20.04 with 32GB RAM, a 24-core i9 processor and a 3080ti card with CUDA 11.2.
Tests were run on HIV-1 protease (1ajx.pdb) with an inhibitor and a 1.5 nm solvent jacket (11,137 atoms total).
Note: the Mac and the PC are approximately equally fast running ML in Keras, with the Mac using the TensorFlow Metal plugin and the PC using CUDA.
Build steps, following the thread "Error compiling Gromacs 2023's checks on Mac M2" (post #5 by hess) and others:
Install homebrew etc
/bin/bash -c “$(curl -fsSL”
brew install wget
brew install cmake
brew install hwloc
brew install subversion
brew install gcc
brew install libomp
brew install opencl-headers
cd gromacs-2023.3
Build, with OpenMP
Note: Always start builds in a new build directory
rm -rf build ; mkdir build ; cd build
-DOpenMP_{C,CXX}_FLAGS="-Xpreprocessor -fopenmp -I/opt/homebrew/opt/libomp/include"
CMake Warning at cmake/gmxDetectCpu.cmake:100 (message):
Did not detect build CPU features - detection program did not compile.
Please file a bug report if this is a common platform.
Call Stack (most recent call first):
cmake/gmxDetectSimd.cmake:69 (gmx_run_cpu_detection)
cmake/gmxDetectSimd.cmake:155 (gmx_suggest_simd)
cmake/gmxManageSimd.cmake:91 (gmx_detect_simd)
CMakeLists.txt:650 (gmx_manage_simd)
Try with Apple clang compiler instead
cd … ; rm -rf build ; mkdir build ; cd build
-DOpenMP_{C,CXX}_FLAGS="-Xpreprocessor -fopenmp -I/opt/homebrew/opt/libomp/include"
make check
100% tests passed, 0 tests failed out of 83
sudo make install
source /usr/local/gromacs/bin/GMXRC
See below for test results.
Build again with no shared libraries and add GPUs and hwloc (still using clang)
cd … ; rm -rf build ; mkdir build ; cd build
-DOpenMP_{C,CXX}_FLAGS="-Xpreprocessor -fopenmp -I/opt/homebrew/opt/libomp/include"
make check
sudo make install
source /usr/local/gromacs/bin/GMXRC
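For anyone who needs a starting point for the configure step of this third build, here is a rough sketch of what such a command could look like. It is a dry run: it only prints a candidate `cmake` command and does not execute it. The OpenMP flags are the ones quoted above; every other option (`CMAKE_C_COMPILER`, `CMAKE_CXX_COMPILER`, `BUILD_SHARED_LIBS`, `GMX_GPU`, `GMX_HWLOC`) is a standard CMake or GROMACS option, but its use here is an assumption based on the description "no shared libraries, add GPUs and hwloc, still using clang", not the exact command used in this post. The helper name `configure_cmd` is hypothetical.

```shell
#!/bin/sh
# Dry run: print a candidate CMake configure command instead of running it.
# ASSUMPTION: all options except the OpenMP flags are guesses at what a
# clang + OpenCL + hwloc + static-library build would need.

# OpenMP flags as quoted in the post (Homebrew libomp on Apple Silicon).
OMP_FLAGS="-Xpreprocessor -fopenmp -I/opt/homebrew/opt/libomp/include"

configure_cmd() {
    # $1 = path to the GROMACS source tree (defaults to the parent directory)
    src=${1:-..}
    printf '%s \\\n' "cmake $src"
    # printf repeats the format once per argument, one option per line.
    printf '  %s \\\n' \
        "-DCMAKE_C_COMPILER=/usr/bin/clang" \
        "-DCMAKE_CXX_COMPILER=/usr/bin/clang++" \
        "-DBUILD_SHARED_LIBS=OFF" \
        "-DGMX_GPU=OpenCL" \
        "-DGMX_HWLOC=ON" \
        "-DOpenMP_C_FLAGS=\"$OMP_FLAGS\""
    printf '  %s\n' "-DOpenMP_CXX_FLAGS=\"$OMP_FLAGS\""
}

configure_cmd ..
```

Printing the command first makes it easy to review or tweak the option list before pasting it into a fresh build directory.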
Looks good:
gmx --version
:-) GROMACS - gmx, 2023.3 (-:
Executable: /usr/local/gromacs/bin/gmx
Data prefix: /usr/local/gromacs
Working dir: /Users/gvigers/Work/programs/gromacs
Command line:
gmx --version
GROMACS version: 2023.3
Precision: mixed
Memory model: 64 bit
MPI library: thread_mpi
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 128)
GPU support: OpenCL
NB cluster size: 8
SIMD instructions: ARM_NEON_ASIMD
CPU FFT library: fftw-3.3.8
GPU FFT library: VkFFT internal (1.2.26-b15cb0ca3e884bdb6c901a12d87aa8aadf7637d8) with OpenCL backend
Multi-GPU FFT: none
TNG support: enabled
Hwloc support: disabled
Tracing support: disabled
C compiler: /usr/bin/clang AppleClang
C compiler flags: -Wno-missing-field-initializers -fno-stack-check -fno-stack-check -O3 -DNDEBUG
C++ compiler: /usr/bin/clang++ AppleClang
C++ compiler flags: -Wno-missing-field-initializers -fno-stack-check -fno-stack-check -Weverything -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-return-std-move-in-c++11 -Wno-source-uses-openmp -Wno-c++17-extensions -Wno-documentation-unknown-command -Wno-covered-switch-default -Wno-switch-enum -Wno-extra-semi-stmt -Wno-weak-vtables -Wno-shadow -Wno-padded -Wno-reserved-id-macro -Wno-double-promotion -Wno-exit-time-destructors -Wno-global-constructors -Wno-documentation -Wno-format-nonliteral -Wno-used-but-marked-unused -Wno-float-equal -Wno-conditional-uninitialized -Wno-conversion -Wno-disabled-macro-expansion -Wno-unused-macros -Wno-unused-parameter -Wno-unused-variable -Wno-newline-eof -Wno-old-style-cast -Wno-zero-as-null-pointer-constant -Wno-sign-compare SHELL:-Xpreprocessor -fopenmp -I/opt/homebrew/opt/libomp/include -O3 -DNDEBUG
BLAS library: External - detected on the system
LAPACK library: External - detected on the system
OpenCL include dir: /Library/Developer/CommandLineTools/SDKs/MacOSX11.3.sdk/System/Library/Frameworks/OpenCL.framework
OpenCL library: /Library/Developer/CommandLineTools/SDKs/MacOSX11.3.sdk/System/Library/Frameworks/OpenCL.framework
OpenCL version: 1.2
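The version output is long, so it can help to pull out just the fields that matter for a build like this one. Below is a minimal sketch that reads `gmx --version` style text on stdin and prints the value of one field. The helper name `version_field` is hypothetical, and the sample lines are copied from the output above so the snippet runs without GROMACS installed. Note that the output above reports "Hwloc support: disabled" even though hwloc was requested for this build.

```shell
#!/bin/sh
# Print the value of one "Name: value" field from `gmx --version` text on stdin.
version_field() {
    # $1 = field name exactly as printed, e.g. "GPU support"
    awk -v key="$1" '
        index($0, key ":") == 1 {                 # line starts with "<key>:"
            value = substr($0, length(key) + 2)   # text after the colon
            sub(/^[ \t]+/, "", value)             # drop alignment padding
            print value
            exit
        }'
}

# Sample lines copied from the output in this post. With GROMACS installed,
# one could pipe `gmx --version` into version_field instead.
sample='GPU support: OpenCL
SIMD instructions: ARM_NEON_ASIMD
Hwloc support: disabled'

for field in "GPU support" "SIMD instructions" "Hwloc support"; do
    printf '%s -> %s\n' "$field" "$(printf '%s\n' "$sample" | version_field "$field")"
done
```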
Timings on 1ajx.pdb:
Mac, Steepest-descents minimization:
Steepest Descents converged to Fmax < 500 in 1932 steps
Potential Energy = -5.4745700e+05
Maximum force = 4.9502078e+02 on atom 3131
Norm of force = 1.1838092e+01
real 0m9.929s
PC, Steepest-descents minimization:
Steepest Descents converged to Fmax < 500 in 1913 steps
Potential Energy = -5.4743575e+05
Maximum force = 4.8873819e+02 on atom 3131
Norm of force = 1.1845837e+01
real 0m4.175s
Mac, 500ps MD run, CPUs only:
               Core t (s)   Wall t (s)        (%)
       Time:    27530.204     2294.186     1200.0
                 (ns/day)    (hour/ns)
Performance:       18.830        1.275
real 38m21.783s
Mac, 500ps MD run, Add GPUs:
               Core t (s)   Wall t (s)        (%)
       Time:     6834.195      569.527     1200.0
                 (ns/day)    (hour/ns)
Performance:       75.853        0.316
real 9m35.661s
PC, 500ps MD run:
               Core t (s)   Wall t (s)        (%)
       Time:     1802.629       75.117     2399.8
                 (ns/day)    (hour/ns)
Performance:      575.103        0.042
real 1m21.512s
Mac, Calculating interaction energies (CPU-only task)
real 10m6.758s
PC, Calculating interaction energies (CPU-only task)
real 13m2.150s
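As a sanity check on the MD timings above, the ns/day and hour/ns figures can be approximately reproduced from the wall time and the 500 ps (0.5 ns) run length. This is a small sketch with hypothetical helper names (`ns_per_day`, `hours_per_ns`); GROMACS computes its own figures internally, so small differences in the last digit are to be expected.

```shell
#!/bin/sh
# ns/day = simulated ns * 86400 s/day / wall-clock seconds
ns_per_day() {
    # $1 = simulated time in ns, $2 = wall time in s
    awk -v ns="$1" -v wall="$2" 'BEGIN { printf "%.2f\n", ns * 86400 / wall }'
}

# hour/ns = wall-clock hours per simulated ns
hours_per_ns() {
    # $1 = simulated time in ns, $2 = wall time in s
    awk -v ns="$1" -v wall="$2" 'BEGIN { printf "%.3f\n", wall / 3600 / ns }'
}

# Wall times taken from the three 500 ps runs in this post.
for wall in 2294.186 569.527 75.117; do
    printf 'wall %s s: %s ns/day, %s hour/ns\n' \
        "$wall" "$(ns_per_day 0.5 "$wall")" "$(hours_per_ns 0.5 "$wall")"
done
```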
The Mac is surprisingly capable, given its energy consumption :)
I couldn’t get the g++-13 compiler to work, but the Apple compiler looks good.
Compiling with GPU support on the M3 chip gives a ~4x boost in MD speed (75.9 vs 18.8 ns/day).
My PC is ~7.5x as fast as the GPU-enabled Mac for MD runs, but slightly slower on the CPU-only interaction-energy task.
Minimizations have worked well on ~450 test cases (data not shown). I have not tested MD runs extensively.
Side note: I tried the same tests on a Mac Studio with the M2 Ultra chip and got similar results, except that compiling with GPU support made it 2x slower! I did not pursue this very far.
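The speed ratios quoted in these conclusions can be checked directly from the timings in this post. A minimal sketch, with hypothetical helper names (`to_seconds`, `ratio`); the input numbers are the ns/day values and `real` times reported above.

```shell
#!/bin/sh
# Convert a `time`-style "XmY.ZZZs" string to seconds.
to_seconds() {
    printf '%s\n' "$1" | awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# Ratio of two numbers, to two decimals.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# MD throughput in ns/day: Mac with GPU vs Mac CPU-only.
printf 'Mac GPU / Mac CPU: %s\n' "$(ratio 75.853 18.830)"
# MD throughput in ns/day: PC vs Mac with GPU.
printf 'PC / Mac GPU:      %s\n' "$(ratio 575.103 75.853)"
# CPU-only task wall time: PC vs Mac (above 1 means the PC took longer).
printf 'PC / Mac CPU task: %s\n' \
    "$(ratio "$(to_seconds 13m2.150s)" "$(to_seconds 10m6.758s)")"
```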
Any suggestions for improvements?
Are there other tests that folks would like to see?
Guy Vigers
P.S. Thanks to the developers for a great program, as always!