Enhancing the simulation performance for inhomogeneous systems such as droplets

Hi,

I have run into a performance issue when simulating water vapor systems as e.g. a single ~100,000 atom TIP3P water droplet in vacuum. As such, the ~30,000 water molecules fit into a box of approx 10 x 10 x 10 nm. I place this initial water cube centered into cubic boxes of length 20, 50, 100, … 1000 nm side length, leaving a vacuum layer around the droplet.

Without PME and only cutoffs (1 nm) I have naively assumed that the mdrun performance would be almost unaffected by the simulation box size. However, while 50 time steps take about 5.5 seconds (using 1 core on my workstation, no GPU) in the 20 nm box, that already takes 25 seconds in a 100 nm box and 50 seconds in a 200 nm box.

From the time accounting at the end of the log files it becomes clear that the extra time is almost exclusively spent in Neighbor search and Force. The flops accounting shows that (although in principle the exact same interactions need to be computed for the different box sizes) actually the number of interactions that are computed (NxN*) grows with box size, in the n times larger boxes, n times the number of interactions are evaluated (I assume these are zeroed out later).

The reason seems to be that during pair search, regardless of the box size, the particles are always sorted into (in my case) a 23 by 23 grid in x and y dimensions. For the smallest box, most grid cells contain a small number of particles, but for the larger boxes, only a handful of grid cells contain a large number of particles, whereas all other cells are empty. So the particles of the droplet concentrate in less and less NS grid cells when the box gets larger.

While the paper A flexible algorithm for calculating pair interactions … states that the search grid spacing depends also on the particle density, that does not seem to be the case in the code, where -1 is passed for the particle density in the calls to nbnxn_put_on_grid() in sim_util.cpp and nbnxm.cpp.

When I pass 100 nm-3 for the particle density of TIP3P water to nbnxm_put_on_grid(),
I am getting a good performance independent of the box size – 50 steps now only take about 4.5 seconds in all cases!

So my question would be: Shouldn’t the pair search grid dimensions also depend on
the box size and the particle density?

Best regards,
Carsten

Yes, the search grid dimensions should actually depend on the effective density of where the particles are. But to compute an effective density, the best way is to put particles on a grid … So I haven’t done that. But it would not be hard to check the effective density based on the current grid and choose a finer grid with some criterion. The question is what the overhead of that operation is.

What about using the cutoff length as grid spacing?

The cut-off can differ while the density of a system remains the same, so that is not a good measure.

An alternative that does not require to write many lines of code would be to allow to set a density with an environment variable (or .mdp parameter).