Domain decomposition

GROMACS version: 2021
GROMACS modification: Yes (LUMI-specific build; reported as MODIFIED, but I am not sure what was modified)

I am running a simulation of a protein in a membrane, solvated with water and ions. No virtual sites are used. On the cluster the run fails with the following output:

There are: 236833 Atoms
Atom distribution over 448 domains: av 528 stddev 42 min 450 max 610
Center of mass motion removal mode is Linear
We have the following groups for center of mass motion removal:
  0:  PROT
  1:  MEMB
  2:  SOL_ION
Not all bonded interactions have been properly assigned to the domain decomposition cells
A list of missing interactions:
                Bond of  33824 missing    317
                 U-B of 145864 missing   2420
         Proper Dih. of 231288 missing   6656
       Improper Dih. of   5824 missing     24
           CMAP Dih. of   1992 missing      6
               LJ-14 of 205712 missing   3300
Molecule type 'PROA'
the first 10 missing interactions, except for exclusions:
         Proper Dih. atoms 5809 5817 5819 5821 global  5809  5817  5819  5821
         Proper Dih. atoms 5809 5817 5819 5821 global  5809  5817  5819  5821
               LJ-14 atoms 5809 5821           global  5809  5821
                 U-B atoms 5817 5819 5821      global  5817  5819  5821
         Proper Dih. atoms 5817 5819 5821 5823 global  5817  5819  5821  5823
         Proper Dih. atoms 5817 5819 5821 5834 global  5817  5819  5821  5834
           CMAP Dih. atoms 5817 5819 5821 5834 5836 global  5817  5819  5821  5834  5836
               LJ-14 atoms 5817 5822           global  5817  5822

and so on to the end…

-------------------------------------------------------
Program:     gmx mdrun, version 2021-MODIFIED
Source file: src/gromacs/domdec/domdec_topology.cpp (line 453)
MPI rank:    0 (out of 512)

Fatal error:
12723 of the 721465 bonded interactions could not be calculated because some
atoms involved moved further apart than the multi-body cut-off distance
(1.04734 nm) or the two-body cut-off distance (1.60275 nm), see option -rdd,
for pairs and tabulated bonds also see option -ddcheck

For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------
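
For reference, the error text points at the mdrun option -rdd. Below is a minimal sketch of how that option, or a smaller rank count (which gives larger domain decomposition cells), could be passed. The values 1.4 nm and 256 ranks are illustrative assumptions, not tuned settings, and the script only prints the command lines rather than running them. This error often indicates an unstable or inconsistent starting state rather than a cut-off that is too small, so these variants are a diagnostic at best.

```shell
#!/bin/bash
# Sketch only: prints candidate mdrun command lines instead of running them.
# RDD_NM and FEWER_RANKS are illustrative assumptions, not recommended values.

RDD_NM=1.4        # assumed larger bonded-interaction distance for -rdd, in nm
FEWER_RANKS=256   # assumed smaller MPI rank count, giving larger domains

# Variant 1: same rank count as the original job, with an explicit -rdd.
CMD_RDD="srun -n 512 gmx_mpi mdrun -deffnm md -v -rdd ${RDD_NM}"

# Variant 2: fewer MPI ranks, so each domain decomposition cell is larger.
# (#SBATCH --nodes would need adjusting to match.)
CMD_FEWER="srun -n ${FEWER_RANKS} gmx_mpi mdrun -deffnm md -v"

echo "${CMD_RDD}"
echo "${CMD_FEWER}"
```

If mdrun rejects the larger -rdd because the cells become too small for 448 domains, the second variant is the one to try.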

My submission script is:

#!/bin/bash -l

#SBATCH -A project_465000054
#SBATCH -J Q1_MD

# 24 hours wall-clock time will be given to this job
#SBATCH -t 24:00:00

# Number of nodes
#SBATCH --nodes=4

#SBATCH --partition=standard

#SBATCH -e job-%j.err -o job-%j.out
#SBATCH -d singleton

module load LUMI/21.08
module load partition/L
module load GROMACS/2021-cpeGNU-21.08-PLUMED-2.7.2-CPU

srun -n 512 gmx_mpi mdrun -deffnm md -v

My mdp file is:

integrator              = md
dt                      = 0.002
nsteps                  = 500000000000
nstlog                  = 2000
nstxout-compressed      = 10000
nstvout                 = 0
nstfout                 = 0
nstcalcenergy           = 100
nstenergy               = 1000
;
cutoff-scheme           = Verlet
nstlist                 = 20
rlist                   = 1.2
coulombtype             = pme
rcoulomb                = 1.2
vdwtype                 = Cut-off
vdw-modifier            = Force-switch
rvdw_switch             = 1.0
rvdw                    = 1.2
fourier_spacing         = 0.15
;
tcoupl                  = v-rescale
tc_grps                 = PROT MEMB  SOL_ION
tau_t                   = 1.0    1.0    1.0
ref_t                   = 303.15 303.15 303.15
;
pcoupl                  = c-rescale
pcoupltype              = semiisotropic
tau_p                   = 5.0
compressibility         = 4.5e-5  4.5e-5
ref_p                   = 1.0     1.0
;
constraints             = h-bonds
constraint_algorithm    = LINCS
continuation            = yes
;
nstcomm                 = 100
comm_mode               = linear
comm_grps               = PROT MEMB   SOL_ION
;
refcoord_scaling        = com

This simulation runs fine on my workstation, just not on the LUMI cluster.

An exact copy of the simulation, but with the ligand bound, runs fine on both the cluster and the workstation.

Any idea what to do?


SOLVED:

I realised that the problem was the continuation: the run carried over velocities from a simulation that had been running on a different cluster, hence the error.

The simulations will be repeated, so no continuation is needed from there.
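
A minimal sketch of what repeating the run without a continuation could look like, under one possible reading of the above: new velocities are generated and no checkpoint is passed to grompp. The file names (md_fresh.mdp, equil.gro, topol.top, index.ndx) are hypothetical placeholders, and the grompp command is only printed, not executed.

```shell
#!/bin/bash
# Sketch only. File names other than md.mdp are hypothetical placeholders.

# Settings for a fresh start: do not treat the run as a continuation and
# draw new velocities at the reference temperature used in the mdp above.
FRESH_SETTINGS="continuation            = no
gen_vel                 = yes
gen_temp                = 303.15
gen_seed                = -1"

# If md.mdp is present, build md_fresh.mdp from it: drop the old
# continuation line, then append the fresh-start settings.
if [ -f md.mdp ]; then
    grep -v '^continuation' md.mdp > md_fresh.mdp
    printf '%s\n' "${FRESH_SETTINGS}" >> md_fresh.mdp
fi

# grompp command without -t, so no checkpoint state (velocities) is read in.
GROMPP_CMD="gmx_mpi grompp -f md_fresh.mdp -c equil.gro -p topol.top -n index.ndx -o md.tpr"
echo "${GROMPP_CMD}"
```

After generating new velocities, a short re-equilibration would normally come before production sampling.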