I have been running MD simulations on our HPC cluster. So far we have been able to carry out our work without any problems. However, we had to stop our last couple of simulations due to a warning popping up saying “Your jobs are running below the expected efficiency, there might be some problems with your input files, libraries, parameters or with applications that you use. Finish your queued jobs and send them back to the queue with the appropriate configuration.”
-u: username -p: servername -N:1 -n:28 Eff:3.56%
What could be the cause of the problem here? I can provide the required files in case of need.
Do you happen to know where exactly that warning is coming from? It’s not coming from GROMACS :)
Do you have reason to believe that your MD performance is a lot lower than expected? Can you compare your MD performances to other settings that ran fine?
I’m using a SLURM-based cluster to perform the simulations. I’m not sure, but maybe it’s coming from some problem on the server. When we contacted the specialist, she stated that the problem might be related to the -ntmpi and -ntomp parameters. I haven’t used these parameters before; actually, I’m new to GROMACS and could not figure it out.
Actually, the settings of the MD simulations that worked properly are similar to the problematic ones. I did not change anything.
There are two unusual things here which could potentially lead to problems:
- You set OMP_NUM_THREADS=1, although later you instruct mdrun to use 4 OpenMP threads (-ntomp 4).
- The name gmx_mpi suggests that this executable is linked against a real MPI library, although then it would choke on the -ntmpi 2 command line argument. Maybe it’s just the naming, or is there something on stdout/stderr?
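One way to check which flavor your binary actually is (a sketch, assuming your executable is really called gmx_mpi) is to look at the version header, which reports the MPI library the build was configured with:

```shell
# Print the build information and filter for the MPI line
# (binary name is an assumption - adapt to your installation):
gmx_mpi -version 2>&1 | grep -i "MPI library"
# A thread-MPI build reports "MPI library: thread_mpi",
# a real MPI build reports "MPI library: MPI".
```

A thread-MPI build accepts -ntmpi; a real MPI build gets its rank count from the launcher (mpirun/srun) instead.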
How big is your system in terms of number of atoms?
Actually, I was facing the efficiency problem before adding these parameters (-ntmpi and -ntomp). The problem didn’t automatically stop the simulation, but I had to terminate it manually so it wouldn’t cause any trouble. And I should state that I directly used these figures (2 and 4), which were taken from the “examples for mdrun on one node” section of the manual page entitled “getting good performance from mdrun”.
My command was like this before adding the parameters:
After adding the parameters, I faced the following error and couldn’t run the simulation.
> Fatal error:
> Setting the number of thread-MPI ranks is only supported with thread-MPI and
> GROMACS was compiled without thread-MPI
Since I am a fresh GROMACS user, I can’t say that I am very familiar with the terminology, and I may not have fully answered your question. What values should I use to run the simulation properly and increase the efficiency of my work?
My system is around 9,300 atoms, composed of a protein and a ligand.
A total of 8 cores seems ok for the small system size that you have, although for the sake of performance I would try to use either only MPI or only OpenMP parallelization, to get rid of one source of parallelization overhead.
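For illustration, the single-parallelization alternatives could look like this (a sketch only; binary name and -deffnm file names are assumptions to adapt to your setup):

```shell
# OpenMP only: 1 rank with 8 threads (thread-MPI build)
gmx mdrun -ntmpi 1 -ntomp 8 -deffnm md

# (thread-)MPI only: 8 ranks with 1 thread each
gmx mdrun -ntmpi 8 -ntomp 1 -deffnm md

# With a real MPI build, the launcher sets the rank count instead:
srun -n 8 gmx_mpi mdrun -ntomp 1 -deffnm md
```

Either way, the total (ranks × threads) should match the cores you requested from SLURM.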
In the header of your job script there should be some command to instruct SLURM how many MPI ranks to start - can you share that line?
Ah, now I understand why you get the efficiency warning. With #SBATCH -n 28 you are asking for 28 compute cores, but you are only using 8 of them (2 ranks x 4 threads), leaving the remaining 20 cores unused. You should only ask for 8 cores if that is possible on your cluster.
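If your cluster allows partial-node allocations, a matching job header could look roughly like this (a hypothetical sketch; partition/account lines omitted, adapt to your site):

```shell
#!/bin/bash
#SBATCH -N 1               # one node
#SBATCH -n 2               # 2 MPI ranks
#SBATCH --cpus-per-task=4  # 4 OpenMP threads per rank -> 8 cores total

# Match OpenMP threads to what SLURM allocated per rank
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun gmx_mpi mdrun -ntomp $SLURM_CPUS_PER_TASK -deffnm md
```

The key point is that ranks × threads (2 × 4 = 8) equals the cores requested, so the efficiency monitor sees all allocated cores in use.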
I see, but my cluster requires that the number of cores per job be 28 or a multiple of 28 :) Then, if I set these parameters to add up to 28, can I run the simulation without problems? For example, would settings like (2 ranks x 14 threads) or (4 ranks x 7 threads, I’m not sure if odd numbers are valid) or something like these be suitable?
I’m really sorry that I asked so many naive questions but I just need to make my mind more clear on this.
The problem is that such a small simulation will likely run significantly slower with 28 threads than with 8. You could, however, make better use of the compute power of the node by running several similar simulations for better statistics. For example, you could generate 4 input .tpr files with different initial velocities and then run them as a multi-simulation, where each rank works on one simulation with 7 OpenMP threads. More information about this can be found here: https://manual.gromacs.org/current/user-guide/mdrun-features.html#running-multi-simulations
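A rough sketch of such a multi-simulation setup filling the 28-core node (directory and file names are placeholders; each .tpr should be generated with a different gen-seed, and gen-vel = yes, in its .mdp so the runs get different initial velocities):

```shell
# One directory per replica, each holding its own topol.tpr
mkdir sim1 sim2 sim3 sim4
# ... run gmx grompp once per directory with a different gen-seed ...

# 4 ranks x 7 OpenMP threads = 28 cores; each rank runs one replica
mpirun -np 4 gmx_mpi mdrun -multidir sim1 sim2 sim3 sim4 -ntomp 7
```

This way the whole node is busy, but each individual simulation keeps the modest thread count that suits a 9,300-atom system.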