Recently, I tested a 3070 with a 16-core processor.
I used 20 threads and 10 threads, and the performance was the same. So would an 8-core processor do well with GROMACS (in a dual-CPU node with a GPU)?
It depends on your use case and the GROMACS version used, but in a lot of cases you only need 2-4 cores per GPU.
I would be using the latest version. So is a dual 8-core CPU enough for the GPU to get the best performance?
How many GPUs are you planning to use with those two 8-core CPUs?
Take a look at Fig 10 of https://doi.org/10.1063/5.0018516
One node with 2 x 3070 and dual 8-core processors. Should I go for more cores?
I personally have not run on 3070s, so that makes it hard to judge their performance, but assuming they’re in the 2080 SUPER performance territory, for vanilla MD 8 cores per GPU will be sufficient as long as the GPU-resident code-path supports your use case (do try!).
However, if you are doing anything computationally expensive on the CPU (free energy calculations, large pull groups, etc.), you may get slightly better performance with more cores.
Take a look at Fig 10 of Páll, S., et al. (2020). Heterogeneous parallelization and acceleration of molecular dynamics simulations in GROMACS. The Journal of Chemical Physics, 153(13), 134110. https://doi.org/10.1063/5.0018516
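To actually "do try" this, a small thread-count scan is usually enough. Below is a minimal sketch in Python that only builds the `gmx mdrun` command lines for such a scan; it does not run them. The input name `topol.tpr`, the log file naming, the step count, and the helper names are assumptions made for illustration. The mdrun options themselves are standard ones, but `-update gpu` needs GROMACS 2020 or later and is not supported for every setup, so it can be switched off.

```python
import shlex


def build_mdrun_cmd(ntomp, tpr="topol.tpr", nsteps=20000, gpu_update=True):
    """Return the argument list for one short benchmark run.

    ntomp      -- number of OpenMP threads (CPU cores) to use
    tpr        -- run input file (hypothetical name, adjust to your system)
    nsteps     -- benchmark length; an assumed value, not a recommendation
    gpu_update -- request GPU update/constraints (the GPU-resident path)
    """
    cmd = [
        "gmx", "mdrun",
        "-s", tpr,
        "-g", f"md_{ntomp}t.log",   # separate log per thread count
        "-ntmpi", "1",              # one rank driving one GPU
        "-ntomp", str(ntomp),
        "-nb", "gpu", "-pme", "gpu", "-bonded", "gpu",
        "-pin", "on",
        "-nsteps", str(nsteps),
        "-resethway",               # reset timers halfway to skip start-up cost
        "-noconfout",
    ]
    if gpu_update:
        cmd += ["-update", "gpu"]
    return cmd


def scan_commands(thread_counts, **kwargs):
    """Map each thread count to a shell-ready command string."""
    return {n: shlex.join(build_mdrun_cmd(n, **kwargs)) for n in thread_counts}


if __name__ == "__main__":
    for line in scan_commands([2, 4, 8, 12]).values():
        print(line)
```

Comparing the runs with and without `gpu_update=True` should show whether the GPU-resident path works for your system and how much the core count still matters.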
This is a bit confusing… all my previous questions were answered along the lines that 4-6 cores are sufficient for recent GROMACS versions. However, Figure 10 shows better performance going from 8 to 12 cores. We are mainly running normal MD, but with systems of 100k-200k atoms. Considering this, is a higher-core-count processor necessary here, given that we offload everything to the GPU?
There is of course some improvement past 4-6 cores, that’s natural given that GROMACS does use the CPU.
However, on just 4 cores we get 94-95% of the peak measured (on 12 cores); here’s the relative performance profile for panel B, a GluCl membrane system:
For the smaller case on panel A the profile is similar, but steeper as expected from the performance data.
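If you want the same kind of relative profile for your own hardware, it can be computed from the `Performance:` line that mdrun prints at the end of each log. The following is a rough sketch, assuming the usual two-column layout of that line (ns/day first, hour/ns second); the function names are made up for illustration.

```python
import re

# Matches the final summary line of md.log, e.g.
#                (ns/day)    (hour/ns)
# Performance:    <ns/day>    <hour/ns>
PERF_RE = re.compile(r"^Performance:\s+([0-9.]+)\s+([0-9.]+)", re.MULTILINE)


def parse_ns_per_day(log_text):
    """Extract the ns/day value from the text of an mdrun log."""
    match = PERF_RE.search(log_text)
    if match is None:
        raise ValueError("no 'Performance:' line found (did the run finish?)")
    return float(match.group(1))


def relative_to_peak(ns_per_day_by_cores):
    """Normalize each measurement by the best one in the scan.

    Takes {core_count: ns_per_day} and returns {core_count: fraction_of_peak},
    ordered by core count.
    """
    peak = max(ns_per_day_by_cores.values())
    return {
        cores: perf / peak
        for cores, perf in sorted(ns_per_day_by_cores.items())
    }
```

Feed `parse_ns_per_day` the contents of each log from a core-count scan, collect the values in a dict keyed by core count, and `relative_to_peak` gives the fraction of the best measured performance at each point.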
In conclusion, there is a peak achievable performance, assuming e.g. the fastest existing CPU/GPU combination, which is close to the top-right data point in the referenced Fig 10 (at least for previous-gen workstation hardware), but of course this comes at a cost. We try our best to balance the features, flexibility, and performance of the GROMACS engine so that the performance obtainable over a wide range of hardware is as high as possible.
If you care to have the absolute best performance, get many fast cores and the fastest GPUs available. However, if you want to optimize for performance per buck (and/or per energy consumed), it is worth tweaking the hardware balance for the use-case.
Cheers,
Szilárd