This is the log file of a case where GROMACS mdrun got stuck on gfx1100:
B.log (19.9 KB)
All input files (including tpr files, bash scripts, and input files for other MD apps) can be downloaded from the aforementioned Google Cloud Drive link, which is called the “benchmark datasets” in my post. Anyone can use these datasets to run benchmarks.
I usually scan GROMACS mdrun performance across different CPU core counts and bonded options, as shown in Part Ⅰ- Section 2.1. However, this hardly works on gfx1100: mdrun gets stuck after a few dozen runs, and then the OS crashes. I’ve tried different gfx1100 (7900XTX) GPUs and other hardware but got the same result. As a result, measuring GROMACS mdrun performance on gfx1100 took me a lot of time.
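For anyone who wants to reproduce this kind of scan, here is a minimal sketch of how it could be driven from Python. It is not the script from the benchmark datasets: the helper names, the output naming scheme, the `benchmark.tpr` file name, and the specific choice of flags (`-ntmpi 1`, `-nb gpu`, `-pme gpu`, `-nsteps`, `-resethway`, `-noconfout`) are my assumptions for illustration. The timeout only guards against a run that hangs; it cannot help if the OS itself crashes.

```python
import itertools
import subprocess


def build_mdrun_commands(tpr, core_counts, bonded_options, nsteps=20000):
    """Return one `gmx mdrun` command per (core count, bonded option) pair.

    The flag selection is an assumed, typical single-GPU benchmark setup,
    not the exact one used for the results discussed in this thread.
    """
    commands = []
    for ntomp, bonded in itertools.product(core_counts, bonded_options):
        name = f"bench_ntomp{ntomp}_bonded-{bonded}"  # hypothetical naming scheme
        commands.append([
            "gmx", "mdrun",
            "-s", tpr,
            "-deffnm", name,
            "-ntmpi", "1",            # one thread-MPI rank
            "-ntomp", str(ntomp),     # OpenMP threads, i.e. CPU cores used
            "-nb", "gpu",
            "-pme", "gpu",
            "-bonded", bonded,        # "cpu" or "gpu"
            "-nsteps", str(nsteps),
            "-resethway",             # reset timers halfway through the run
            "-noconfout",             # skip writing the final configuration
        ])
    return commands


def run_with_timeout(cmd, timeout_s):
    """Run one command; return its exit code, or None if it timed out."""
    try:
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        return None


# Example usage (commented out so that nothing is launched on import):
# for cmd in build_mdrun_commands("benchmark.tpr", [4, 8, 16], ["cpu", "gpu"]):
#     status = run_with_timeout(cmd, timeout_s=600)
#     print(" ".join(cmd), "->", "TIMED OUT" if status is None else status)
```

Running each test as a separate process with a timeout at least lets the scan continue past a stuck mdrun and records which combination of settings hung.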
BTW, in last year’s RTX4090 test, the most important conclusion was that the single-core (or “per-core”) performance of today’s CPUs severely limits the RTX4090’s potential, so I have been encouraging my peers to choose CPUs with strong single-core performance (e.g., 7950X, 13900KF, Threadripper-WX, and overclockable Xeon-W), rather than server CPUs that are primarily focused on multi-core performance. I have even suggested that some manufacturers produce multi-GPU servers based on overclockable workstation platforms, and there has been some progress.
The final data in my post (Part Ⅰ- Section 2.3) is based on the OpenSYCL development branch as of 12:31 AM GMT+8 on July 25, 2023 (after commit 485ea8089cfc051d1d5ed916f4cf3fd6800c6335). In this version, the PR #1054 you mentioned has already been merged and verified (as shown in Commits · AdaptiveCpp/AdaptiveCpp · GitHub). Therefore, my tests already include these performance optimizations.
This is why I specifically identified the OpenSYCL “develop 25Jul2023” in both blog posts.
However, I’ve noticed that a lot of new verified commits have been made after July 25, and I’m looking forward to the potential performance improvements!
Sure! I have always been a loyal user of GROMACS and will continue to learn, use and explore GROMACS.