Extreme RAM consumption of an MD simulation

GROMACS version: 2022.5
GROMACS modification: No
I encountered an ‘OUT_OF_MEMORY’ error for an MD simulation with ~250,000 particles. The job details were:

Nodes: 16
Cores per node: 64
CPU Utilized: 310-17:42:30
CPU Efficiency: 97.49% of 318-17:50:56 core-walltime
Job Wall-clock time: 07:28:14
Memory Utilized: 5.31 TB (estimated maximum)
Memory Efficiency: 392.86% of 1.35 TB (86.43 GB/node)

srun -n 512 -c 2 gmx_mpi mdrun -ntomp 2 -deffnm md
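
(For reference, a SLURM allocation consistent with that layout would be roughly the following; these directives are reconstructed to match the numbers above, not copied verbatim from the batch script:)

#SBATCH --nodes=16
#SBATCH --ntasks-per-node=32
#SBATCH --cpus-per-task=2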

I’m interested in understanding why the simulation required such an excessive amount of RAM and what steps I can take to optimize memory usage in future GROMACS simulations on this cluster.

The workflow up to the MD run (solvation, ions, energy minimization, …) is very similar to the GROMACS tutorial.

Any insights or recommendations would be greatly appreciated.

Thanks in advance!

Max

We can’t say much without more information about your setup. But it is strange that you can run energy minimization but not MD. What changes did you make in the mdp parameters between EM and MD?

Thank you for your answer!

Here is the setup:

  • I am investigating a protein
  • ff: amber99sb-star-ildn-q-tip4pd.ff
  • As solvent model I used TIP4P in a dodecahedron box (-c -d 1.0 -bt dodecahedron); see the box-setup sketch after this list
  • EM.mdp is the same as in the lysozyme-in-water tutorial:
    emtol = 1000.0
    emstep = 0.01
    nsteps = 50000
  • MD.mdp changes (the changed lines are also sketched after this list):
    nsteps = 500000000 (1 µs with dt = 0.002)
    dt = 0.002
    continuation = no
    gen_vel = yes
    gen_seed = -1
    For the rest of the MD.mdp file I used the default values.
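
For reference, the box setup corresponds to a command along these lines (the file names here are placeholders, not the actual ones):

gmx editconf -f protein.gro -o newbox.gro -c -d 1.0 -bt dodecahedron

And the changed md.mdp lines, with everything else left at the GROMACS defaults:

nsteps       = 500000000    ; 1 µs at dt = 0.002 ps
dt           = 0.002
continuation = no
gen_vel      = yes
gen_seed     = -1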

CPU: Intel Gold 6130 S2/C16/T2

In our group we have also had this problem with other simulations as soon as 16 nodes are requested.
With 8 nodes the simulations run, but the performance is not satisfying.

I hope this is useful information.
Thanks for the help

Does it make a difference if you run fewer tasks per node?
E.g. srun -n 64 -c 16 gmx_mpi mdrun -ntomp 16 -deffnm md

It is strange that it works on 8 but not on 16 nodes. Using more nodes reduces the memory requirement per node.

What is the memory usage on 8 nodes?

@hess

The MaxRSS on 8 nodes was 87148K.
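
(Assuming this value comes from SLURM accounting, a query along these lines, with a placeholder job ID, reports MaxRSS per job step:)

sacct -j <jobid> --format=JobID,JobName,MaxRSS,Elapsed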

@MagnusL

I haven't done it yet, but that's something I will try.

87 MB can’t be correct.

@hess

Yes, that was a mistake, sorry.
This should be right:

Nodes: 8
Cores per node: 64
CPU Utilized: 2027-09:21:42
CPU Efficiency: 98.99% of 2048-01:25:20 core-walltime
Job Wall-clock time: 4-00:00:10
Memory Utilized: 21.28 GB (estimated maximum)
Memory Efficiency: 3.08% of 691.41 GB (86.43 GB/node)

So the memory usage goes from 21 GB to 5.3 TB when doubling the number of nodes. Or do I misunderstand something? Such a change is very unlikely to come from GROMACS. It could be some hidden bug that suddenly triggers orders of magnitude more memory usage when doubling the number of nodes.

What are the last few lines in the log file of the run that goes out of memory?
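
For example, something like:

tail -n 50 md.log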

Yes, exactly. The problem is that I have no idea where it is coming from.
I already talked to our cluster support, but their only answer was to double the memory capacity for the next run, which won't help if we are talking about terabytes.

I don't have the log files anymore, but I am currently waiting for my SLURM job to start with the adapted command line MagnusL provided.
If I encounter the same problem again, I can provide the log file.

Thanks for the input!!

Hey @MagnusL

the simulation with your provided command is running smoothly at 90 ns/day.
Thank you for your advice. But I still do not really understand the problem; can you explain on what grounds you decided to change the values of -n, -c and -ntomp?

Thanks in advance!

Good to hear that it helped at least.

Unfortunately I don’t have any specific grounds for that recommendation. It’s just that my personal experience has shown that when running on more than 3 or 4 nodes it has been more efficient not to increase the total number of MPI tasks, i.e. to lower the number of tasks per node. I haven’t had the reported RAM issues, though, so it was just that I thought that 512 MPI tasks sounded a bit high.

Edit: It’s quite possible that srun -n 128 -c 8 gmx_mpi mdrun -ntomp 8 -deffnm md would be worth trying, perhaps also srun -n 256 -c 4 gmx_mpi mdrun -ntomp 4 -deffnm md. You might see a difference in performance.
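
(Assuming the same 16 nodes with 64 cores each, -n 64 -c 16, -n 128 -c 8 and -n 256 -c 4 correspond to 4, 8 and 16 MPI ranks per node with 16, 8 and 4 OpenMP threads each; every layout fills the same 1024 cores as the original 512 × 2 run.)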