Set MPI GROMACS to run on a certain number of CPU cores/threads

GROMACS version: 2020.2_cu9.2.88 for linux x86_64 + CUDA
GROMACS modification: Yes/No

Hi All,

I understand how to set up an MPI run specifying the number of GPUs with -gpu_id. However, I am confused as to how to set up an MPI run specifying a number of CPU cores or threads. I have 2x 16-core CPUs.

gmx_mpi mdrun -gpu_id 01 -s protein.tpr -v -deffnm protein_mdout

When I try to specify the number of CPU threads using -nt, I get:
“Setting the total number of threads is only supported with thread-MPI and
GROMACS was compiled without thread-MPI”

Is there some syntax I’m misunderstanding?

thanks!

Hello!

By default, Gromacs will use all available cores when it launches.

If you want to specify how many MPI processes to use, launch it using your MPI task manager. For example, to start with 4 processes using mpiexec:

$ mpiexec -np 4 gmx_mpi mdrun

To further specify how many OpenMP threads per MPI task to use, use -ntomp. With 2 MPI processes and 8 OpenMP threads per process:

$ mpiexec -np 2 gmx_mpi mdrun -ntomp 8

By default, Gromacs should also handle that automatically.

But, if you are running on a single machine, it may be sufficient to compile Gromacs with thread-MPI instead of MPI.
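For reference, with a thread-MPI build (typically installed as gmx rather than gmx_mpi) the rank and thread counts are given directly on the mdrun command line; a rough sketch for a 16-core machine (the numbers are just placeholders) would be

$ gmx mdrun -ntmpi 2 -ntomp 8

or, to set only the total thread count and let mdrun decide the split,

$ gmx mdrun -nt 16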

Regards,
Petter


Hi Petter,

Thank you so much for your reply! I will give it a try. Another question: does GROMACS “bow” to other programs? If I start a run with no thread parameters specified, it will take advantage of all available hardware. However, if someone then starts another process, will GROMACS tone down its usage to make room for it, and then ramp back up when the other process is done?

Well, Gromacs itself will throw everything it has at all the CPUs it’s assigned to, pushing them to 100%. But your operating system’s scheduler will typically tone it down if other tasks need to do something. That’s external to Gromacs itself, though. In the end, it’s typically possible to “use” the computer in a limited capacity while a simulation is running.

If you want to share CPU resources of a single compute-node, you have two options:

  • partition the CPU resources between the jobs, which can be done by launching the desired number of total threads (adjusting #ranks × #threads per rank), e.g. on a 16-core machine run 2x4=8 threads assigned to GROMACS, leaving half of the cores empty. Thread pinning is also important (it is done by default when all resources are used) and can be done manually using the mdrun -pin on option (or using an MPI launcher/job scheduler, e.g. you can tell SLURM to assign N cores of a node to a job); see the sketch after this list;
  • oversubscribe the CPU resources by launching multiple jobs that, in total, require more CPU resources than are available (hence competing for them); as @pjohansson noted, the operating system will make a best effort to allow execution of all work, but unless the jobs launched alongside mdrun are quite lightweight, I would recommend against this, as it can often lead to worse performance than the former approach (e.g. because it can cause imbalance).
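As a minimal sketch of the first (partitioning) approach, assuming the 16-core machine and the 2x4=8 thread split from the example above (the exact pin offset and stride depend on your hardware layout):

$ mpirun -np 2 gmx_mpi mdrun -ntomp 4 -pin on -pinoffset 0 -pinstride 1

A second job sharing the node could then be pinned to the remaining cores with a different -pinoffset (e.g. 8 on this example machine, assuming one hardware thread per core).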

Hello !

I recently installed GROMACS 2020 on my cluster, and I have found a particular issue…

As pjohansson said, GROMACS tries to allocate all available resources; instead, setting -np X makes it run on X ranks.

However, in our cluster the nodes are shared and rarely fully occupied by a single job.
We therefore tried to run GROMACS across several nodes, but with a number of CPUs smaller
than the maximum number of CPUs present in each node.

For example: we have 4 nodes of 28 cores each, but we are able to allocate just 8 cores on
each node (so a total of 32 MPI tasks).
We tried allocating the resources via our TORQUE scheduler: #PBS -l nodes=4:ppn=8
and calling GROMACS with: mpirun -np 32 gmx_mpi mdrun -deffnm name
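For reference, the full submission script looks roughly like this (the shebang and working-directory lines are just what we typically use; only the resource request and the mpirun line matter here):

#!/bin/bash
#PBS -l nodes=4:ppn=8
cd $PBS_O_WORKDIR
export OMP_NUM_THREADS=1
mpirun -np 32 gmx_mpi mdrun -deffnm name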

But GROMACS gives the error:

There are not enough slots available in the system to satisfy the 32 slots
that were requested by the application:
gmx_mpi

Either request fewer slots for your application, or make more slots available
for use.

Do you know how I can avoid this? (Note that I have also set OMP_NUM_THREADS=1.)

Best regards
quim