GPU acceleration on Mac M1 mini

Hello,

I have just purchased a Mac mini with an M1 chip but failed to install GROMACS with GPU support. May I ask whether a GROMACS GPU installation works with the M1 chip? If so, could you kindly guide me on how to install it? Many thanks.

Kind regards,

Pedro

No, it is unlikely you will be able to get it to work. Apple has deprecated OpenCL support, and as far as I know they do some kind of OpenCL-to-Metal translation, which does not always work reliably.

I saw that OpenCL was going to be deprecated a few years back. If it gets to the point where OpenCL support is removed, that will be very bad. SYCL works on Intel and AMD GPUs, but not on Apple GPUs (only Apple CPUs). Is there anything we can do to salvage OpenCL support, or perhaps migrate to Metal for Apple devices? I’m very skilled in GPGPU and I find GROMACS very appealing because it doesn’t require CUDA.

On M1 Macs, Apple implements OpenCL as a wrapper over Metal. Macs are eternally frozen on OpenCL 1.2, and Apple prevents you from using half-precision in OpenCL on M1 (even though you can do so in Metal). Apple wants everyone to deprecate OpenCL and migrate to Metal, but OpenCL still works. VkFFT [1] and DLPrimitives [2] both run on Apple GPUs through OpenCL.

Apple GPUs lack native double precision. This is unlike every other vendor, which has at least a very small amount of FP64 processing power. Is this a major problem for GROMACS? If there were optimized IEEE-compliant FP64 emulation, would that solve the problem? More context here.

New users can’t put >2 links in a post, so:

  • [1] https:/ /github .com/DTolm/VkFFT
  • [2] https:/ /github .com/artyom-beilis/dlprimitives
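
For readers unfamiliar with the idea of emulating higher precision on single-precision hardware, here is a minimal sketch of the classic "float-float" (double-single) technique, in which one value is stored as an unevaluated sum of two binary32 numbers. It is only an illustration: it is not IEEE-compliant FP64, it is not code from GROMACS or from any project named in this thread, and the helper names (`f32`, `two_sum`, `ff_add`, `accumulate`) are hypothetical. Binary32 arithmetic is imitated in Python by rounding after every operation. As noted later in the thread, GROMACS only uses single precision on GPUs, so this is background rather than a requirement.

```python
import struct


def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def two_sum(a, b):
    """Knuth's error-free addition of two binary32 values.

    Returns (s, err) where s is the rounded sum and err is the rounding error.
    """
    s = f32(a + b)
    bb = f32(s - a)
    err = f32(f32(a - f32(s - bb)) + f32(b - bb))
    return s, err


def ff_add(x, y):
    """Add two float-float numbers, each given as a (hi, lo) pair."""
    s, e = two_sum(x[0], y[0])
    e = f32(e + f32(x[1] + y[1]))
    hi = f32(s + e)
    lo = f32(e - f32(hi - s))  # renormalise so that |lo| is small relative to hi
    return hi, lo


def accumulate(step, n):
    """Sum `step` n times, once in plain binary32 and once in float-float."""
    step = f32(step)
    naive = 0.0
    pair = (0.0, 0.0)
    for _ in range(n):
        naive = f32(naive + step)
        pair = ff_add(pair, (step, 0.0))
    return naive, pair[0] + pair[1]
```

Calling `accumulate(0.1, 100000)` and comparing both results against `100000 * f32(0.1)` shows the float-float sum staying much closer to the exact value than the plain binary32 sum, at the cost of several extra operations per addition.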

Hello,

the reason we have deprecated OpenCL support is that we want to focus our development efforts on targets that give us the widest possible support with the limited resources we have for future development.

The issue is not the actual accelerator code, but always the integration into the rest of the code. Adding support for e.g. Metal on Apple silicon would mean that we need to add whole new layers of code to handle the new accelerator framework, in addition to more complex testing and validation requirements, for a target platform that is likely not very relevant for people wanting to run production simulations.

External contributors are always welcome to add new code that the core team can’t focus on, but please be aware that there is limited chance that it will become part of the official distribution.

Cheers
Paul

The Apple platform as a whole is still relevant. M1 Macs are extremely popular and causing a record-high market share of Macs vs PCs, so a growing body of users have M1. People who use Macs are disproportionately college students, and a lot of computational chemistry research happens at college.

Then perhaps I could just make a SYCL implementation that runs over Metal, with extremely fast performance and native double precision. The SYCL binary might be open-source but distributed separately from the application. If this happens, would you be okay hyperlinking your documentation to a GitHub repository, where I show users how to get GROMACS working on M1 GPUs? That removes the burden of maintenance from your core team.

Also, to keep the OpenCL port alive for the Mac, I wouldn’t mind helping you test it. What would that entail?

To hyperlink this thread with another related one: https://github.com/openmm/openmm/issues/2489#issuecomment-1214444729

philipturner, we’re not planning to remove OpenCL support in GROMACS 2023. Other users reported that OpenCL works on M1 GPUs as long as you explicitly set the GMX_GPU_DISABLE_COMPATIBILITY_CHECK environment variable. So, if that works for you now, it will work in GROMACS 2023 as well.
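
As a rough illustration of the workaround described above, the following sketch prepares an `mdrun` launch with the variable set. Only the variable name comes from this thread; the value `"1"`, the run name `topol`, and the helper `build_mdrun_invocation` are assumptions made for the example, and it presumes a GROMACS build configured with OpenCL support whose `gmx` binary is on the PATH.

```python
import os
import shutil
import subprocess


def build_mdrun_invocation(deffnm="topol", base_env=None):
    """Return (command, environment) for an mdrun launch with the check disabled.

    `deffnm` is a placeholder run name; replace it with your own.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Variable name as reported in the thread; the value "1" is an assumption.
    env["GMX_GPU_DISABLE_COMPATIBILITY_CHECK"] = "1"
    # "-nb gpu" asks mdrun to run the nonbonded interactions on the GPU.
    cmd = ["gmx", "mdrun", "-deffnm", deffnm, "-nb", "gpu"]
    return cmd, env


if __name__ == "__main__":
    cmd, env = build_mdrun_invocation()
    if shutil.which(cmd[0]) is None:
        print("gmx not found on PATH; the command would be:", " ".join(cmd))
    else:
        subprocess.run(cmd, env=env, check=True)
```

Setting the variable only for the child process, rather than globally in the shell profile, keeps the compatibility check active for any other GROMACS installation on the same machine.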

Then perhaps I could just make a SYCL implementation that runs over Metal, with extremely fast performance and native double precision.

If that’s what you want to do, then perhaps adding Metal support to hipSYCL would be easier; see the GitHub issue “Apple (MacOS) MPS back-end?” (Issue #460 in AdaptiveCpp/AdaptiveCpp). That’s just a personal suggestion from me, not an official direction from the GROMACS team.

And side-note: GROMACS only uses single precision on GPUs, no need to worry about FP64/FP16.

The Apple platform as a whole is still relevant.

It is not about Apple per se. GROMACS is more focused on getting good performance on big machines. GPU support on laptops is mostly a by-product of supporting high-end GPUs, and there are no “big” M1/M2.

My post above got flagged as spam. I’m sorry if I missed something important or phrased something in a way that was irrelevant. Could a moderator please DM me about what happened and how to fix it?

Given what happened, I’ve not gotten a good first impression of these forums. Thus, I am hiding the contents of those comments until I’m sure it’s a good idea to reveal them. In those comments, I said that I wanted to help you guys keep OpenCL supported on macOS, and was willing to dedicate time to testing GROMACS on an M1 GPU. I don’t feel the same way now given what happened.

Hello

I have now tried restoring some of the posts. The system here is quite aggressive when it comes to spam filtering, sorry about that :)

Concerning your message above, I agree with Andrey that contributing Metal support to hipSYCL or any other accelerator framework is likely the best way forward.

Also, to be more concrete about what I said regarding M1 laptops not being relevant for HPC: I meant that I don’t expect anyone to run actual production-length simulations purely on M1 hardware, nor do I expect an HPC system to be built on it. We will of course try to support any available hardware so that people can use it to run GROMACS, but we can’t add dedicated accelerator support for a single system like this.

Cheers

Paul

Regarding the spam strike: I suspect the mangled external links might have triggered the automatic system. There is certainly nothing wrong with your message’s content from a human perspective.

If this site is Discourse, then it’s a spam automation issue. I was recently flagged as a spammer on the LLVM Discourse for attempting to post a link to a reputable website.

It is Discourse, so that will be the issue. Any idea how the LLVM people handle this?

The LLVM moderators manually approved/restored all of my posts, and then - I think coincidentally - I attained Discourse trust level 1 (Understanding Discourse Trust Levels). I think the new user restrictions were what caused my problem.

I’m unable to restore the first of my posts because I passed the 3-day mark. Could a moderator manually undo my edits that hide everything? Also, I’m pretty sure it was a human who marked it as spam:

Hello,

I restored your first post again. No one deliberately flagged your post; from what I can see, the system just reacted badly to the multiple links in the posts and decided it was spam based on this (and maybe @avilla could have a look into changing the heuristics there).

Your contributions would be very welcome, but please understand that the core team has limited time to work on implementing and verifying another accelerator framework. If you can help us with doing this then it would be awesome.

Let me know if you want to have a more focused discussion with one of the developers, or want to continue talking on the developer mailing list.

Cheers

Paul

I have more details about what happened. I made a large, enthusiastic comment on the hipSYCL thread, then linked it here. The system rejected my comment for linking to GitHub. While rewriting it, I noticed that the other two comments were hidden. I got spooked that someone from the GitHub thread didn’t like what I wrote, then clicked a link here and flagged the comments to express disapproval.

If it was a false alarm, I will gladly see how I can help. I am thinking of making a third-party OpenCL backend for M1, which integrates FP64 emulation into a shader transpiler. I would donate “MoltenCL” to the Khronos Group, then make a SYCL backend based on the work with OpenCL.

I would be interested in doing this. Would you mind summarizing the pros/cons of each approach: Metal, OpenCL, SYCL? Please do so in this forum thread for now.

From the GROMACS perspective:

  • Metal is a no-go. “Never say never,” but it will be really hard to persuade us to add yet another GPU backend to the mainline codebase. That said, GROMACS is LGPL, so you can fork the project and use Metal, HIP, or even DirectCompute; but then please explicitly state that it’s not an official GROMACS release.
  • OpenCL: it was reported to already work with Apple GPUs, so it’s the easiest starting point. It’s not in active development, but we are open to small patches tuning the existing kernels for M1/M2 GPUs. OpenCL lacks some fancy GPU features present in CUDA and SYCL, but they are unlikely to be critical for laptop GPUs. Another downside is that we’re now pretty close to the 2023 release, so any non-trivial change will have to be delayed to GROMACS 2024.
  • SYCL: that’s perhaps the best option long-term, but not the easiest one. We would greatly prefer if the Metal support were added to hipSYCL or oneAPI, rather than supporting yet another SYCL implementation (especially the one that only runs on Macs). Another “pro” here is that all SYCL projects would benefit, not just GROMACS.

I’ve been talking with the person who owns the hipSYCL repository, and so far things seem good. I think we can make a Metal backend for SYCL, but it might require a unique approach. With Metal, host and device code must be separate. This is different than CUDA and HIP, which are single-source. Other than that, we haven’t thought of any other roadblocks yet.

There are a couple of other interesting projects in this space.
Check out Sylkan, CLVK, MoltenVK, clspv. It might be possible to combine some of those to get something working.

We’re now leaning more toward a custom code base that translates LLVM IR to Metal Shading Language before feeding into Apple’s command-line tools. Sorry these discussions are private and over email, but I can announce the plans when they’re final.

Edit: We might use SPIR-V to accomplish this.

That seems like quite a large effort. Make sure to check out at least MoltenVK. It can convert SPIR-V to MSL (Metal Shading Language). And of course you can convert LLVM IR to SPIR-V.
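
To make the chain LLVM IR → SPIR-V → MSL → Apple's command-line tools more concrete, here is a hedged sketch that only assembles the commands for each stage. The thread does not say which tools would be used; `llvm-spirv` (from the Khronos SPIRV-LLVM-Translator) and `spirv-cross` (the SPIR-V to MSL converter that MoltenVK builds on) are assumptions, as are the file names and the helper `msl_pipeline`. Whether the SPIR-V emitted by the first tool is accepted by the second depends on the SPIR-V flavour (compute kernel versus graphics shader) and has not been verified here.

```python
from pathlib import Path


def msl_pipeline(bitcode):
    """Return the commands that would take LLVM bitcode to a Metal library.

    Nothing is executed; the caller decides whether and how to run each step.
    """
    src = Path(bitcode)
    spv = src.with_suffix(".spv")
    msl = src.with_suffix(".metal")
    air = src.with_suffix(".air")
    lib = src.with_suffix(".metallib")
    return [
        # LLVM bitcode -> SPIR-V (assumed tool: llvm-spirv)
        ["llvm-spirv", str(src), "-o", str(spv)],
        # SPIR-V -> Metal Shading Language source (assumed tool: spirv-cross)
        ["spirv-cross", str(spv), "--msl", "--output", str(msl)],
        # MSL source -> AIR object, using Apple's Metal compiler
        ["xcrun", "-sdk", "macosx", "metal", "-c", str(msl), "-o", str(air)],
        # AIR object -> Metal library that an application can load
        ["xcrun", "-sdk", "macosx", "metallib", str(air), "-o", str(lib)],
    ]
```

Each returned list can be passed to `subprocess.run(..., check=True)` in order, so a failure at any stage stops the chain at the point where the intermediate format was rejected.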