GROMACS version: 2020.4-MODIFIED
This program has been built from source code that has been altered and does not match the code released as part of the official GROMACS version 2020.4-MODIFIED. If you did not intend to use an altered GROMACS version, make sure to download an intact source distribution and compile that before proceeding.
If you have modified the source code, you are strongly encouraged to set your custom version suffix (using -DGMX_VERSION_STRING_OF_FORK) which will can help later with scientific reproducibility but also when reporting bugs.
Release checksum: 79c2857291b034542c26e90512b92fd4b184a1c9d6fa59c55f2e24ccf14e7281
Computed checksum: 71730c3e53f008bf8a6c6ee90f305b5807e001d8824f2de8ace37d9da6377c65
Precision: single
Memory model: 64 bit
MPI library: MPI
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 64)
GPU support: CUDA
SIMD instructions: IBM_VSX
FFT library: fftw-3.3.7
RDTSCP usage: disabled
TNG support: enabled
Hwloc support: hwloc-1.11.8
Tracing support: disabled
C compiler: /apps/GCC/7.3.0/bin/gcc GNU 7.3.0
C compiler flags: -mcpu=power9 -mtune=power9 -mvsx -pthread -fexcess-precision=fast -funroll-all-loops -O3 -DNDEBUG
C++ compiler: /apps/GCC/7.3.0/bin/g++ GNU 7.3.0
C++ compiler flags: -mcpu=power9 -mtune=power9 -mvsx -pthread -fexcess-precision=fast -funroll-all-loops -fopenmp -O3 -DNDEBUG
CUDA compiler: /usr/local/cuda-10.2/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2019 NVIDIA Corporation;Built on Thu_Oct_24_17:58:26_PDT_2019;Cuda compilation tools, release 10.2, V10.2.89
CUDA compiler flags:-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_35,code=compute_35;-gencode;arch=compute_50,code=compute_50;-gencode;arch=compute_52,code=compute_52;-gencode;arch=compute_60,code=compute_60;-gencode;arch=compute_61,code=compute_61;-gencode;arch=compute_70,code=compute_70;-gencode;arch=compute_75,code=compute_75;-use_fast_math;;-mcpu=power9 -mtune=power9 -mvsx -pthread -fexcess-precision=fast -funroll-all-loops -fopenmp -O3 -DNDEBUG
CUDA driver: 10.20
CUDA runtime: 10.20
Dear forum, I am having an issue with multi-replica simulation systems. For maximum flexibility, the simulations are run in 72 h chunks and restarted from checkpoint. However, in 2 of the 3 systems that include the -nsteps flag to limit the number of steps to simulate, I run into an error when the simulations are restarted from the last checkpoint ("init_step+nsteps is not equal for all subsystems"). This happens even though the -maxh flag is used to make sure that the run is terminated cleanly by GROMACS rather than being killed by SLURM.
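In case it is useful, the step recorded in each replica's last checkpoint can be listed with something like the following (a rough sketch, assuming the prod_* directory layout and md.cpt naming shown in the Annex; the exact label printed by gmx dump may vary between versions):

for d in prod_*; do
    printf '%s: ' "$d"
    gmx dump -cp "$d/md.cpt" 2>/dev/null | grep -m 1 -w 'step'
done

The per-subsystem values listed in the error below already suggest that the replicas did not all write their final checkpoint at the same step.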
Could this be related to the use of the -nsteps flag? None of the other 5 systems, which do not use -nsteps, have failed. On the other hand, GROMACS does seem to acknowledge -maxh, as can be seen in the Annex.
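If -nsteps is indeed the culprit, I assume the alternative suggested by the deprecation notice in the log (see Annex) would look roughly like this, writing the step limit into each replica's tpr instead of passing it to mdrun (just a sketch; GROMACS should back up the old tpr files automatically):

for d in prod_*; do
    gmx convert-tpr -s "$d/md.tpr" -nsteps 100000000 -o "$d/md.tpr"
done

and then restarting without the -nsteps flag. Still, I would like to understand why the replicas end up with checkpoints at different steps in the first place.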
The GROMACS version had to be modified due to an incompatibility with POWER9 CPUs.
I attach the GROMACS command line used to launch the simulation, as well as an extract of the error reported by GROMACS (see Annex below).
Thanks!
Annex:
### GROMACS COMMAND LINE ###
gmx_mpi mdrun -multidir prod_0 prod_1 prod_2 prod_3 prod_4 prod_5 prod_6 prod_7 prod_8 prod_9 prod_10 prod_11 prod_12 prod_13 prod_14 prod_15 prod_16 prod_17 prod_18 prod_19 prod_20 prod_21 prod_22 prod_23 prod_24 prod_25 prod_26 prod_27 prod_28 prod_29 prod_30 prod_31 prod_32 prod_33 prod_34 prod_35 prod_36 prod_37 prod_38 prod_39 -maxh 72 -nsteps 100000000 -cpi md.cpt -deffnm md -replex 2000 -plumed plumed_PTWTE.dat
###################
### ERROR BELOW ###
###################
Step 12397840: Run time exceeded 71.280 hours, will terminate the run within 400 steps
Replica exchange at step 12398000 time 24796.00000
Repl 0 <-> 1 dE_term = 3.214e+00 (kT)
dpV = -6.787e-04 d = 3.213e+00
dplumed = -3.738e+00 dE_Term = -5.249e-01 (kT)
Repl ex 0 x 1 2 3 4 x 5 6 7 8 x 9 10 11 12 13 14 15 16 x 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 x 37 38 39
Repl pr 1.0 .34 .76 .79 1.0 .37 .03 .02 .96 .00 .19 .00 .33 .06 .26 .08 .04 .22 1.0 .03
Step Time
12398001 24796.00200
Writing checkpoint, step 12398001 at Sun Jan 7 13:27:36 2024
Reading checkpoint file md.cpt
file generated by: /apps/GROMACS/2020.4-plumed.2.7.0-fftw3.3.7/GCC/OPENMPI/bin/gmx_mpi
file generated at: Sun Jan 7 13:27:36 2024
GROMACS double prec.: 0
simulation part #: 1
step: 12398001
time: 24796.002000
-----------------------------------------------------------
Restarting from checkpoint, appending to previous log file.
:-) GROMACS - gmx mdrun, 2020.4-MODIFIED (-:
Executable: /apps/GROMACS/2020.4-plumed.2.7.0-fftw3.3.7/GCC/OPENMPI/bin/gmx_mpi
Data prefix: /apps/GROMACS/2020.4-plumed.2.7.0-fftw3.3.7/GCC/OPENMPI
Working dir: /gpfs/projects/csic35/md_folders/PACAP/solo/WTE_plumed/prod_0
Process ID: 74107
Command line:
gmx_mpi mdrun -multidir prod_0 prod_1 prod_2 prod_3 prod_4 prod_5 prod_6 prod_7 prod_8 prod_9 prod_10 prod_11 prod_12 prod_13 prod_14 prod_15 prod_16 prod_17 prod_18 prod_19 prod_20 prod_21 prod_22 prod_23 prod_24 prod_25 prod_26 prod_27 prod_28 prod_29 prod_30 prod_31 prod_32 prod_33 prod_34 prod_35 prod_36 prod_37 prod_38 prod_39 -maxh 72 -nsteps 100000000 -cpi md.cpt -deffnm md -replex 2000 -plumed plumed_PTWTE.dat
GROMACS version: 2020.4-MODIFIED
This program has been built from source code that has been altered and does not match the code released as part of the official GROMACS version 2020.4-MODIFIED. If you did not intend to use an altered GROMACS version, make sure to download an intact source distribution and compile that before proceeding.
If you have modified the source code, you are strongly encouraged to set your custom version suffix (using -DGMX_VERSION_STRING_OF_FORK) which will can help later with scientific reproducibility but also when reporting bugs.
Release checksum: 79c2857291b034542c26e90512b92fd4b184a1c9d6fa59c55f2e24ccf14e7281
Computed checksum: 71730c3e53f008bf8a6c6ee90f305b5807e001d8824f2de8ace37d9da6377c65
Precision: single
Memory model: 64 bit
MPI library: MPI
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 64)
GPU support: CUDA
SIMD instructions: IBM_VSX
FFT library: fftw-3.3.7
RDTSCP usage: disabled
TNG support: enabled
Hwloc support: hwloc-1.11.8
Tracing support: disabled
C compiler: /apps/GCC/7.3.0/bin/gcc GNU 7.3.0
C compiler flags: -mcpu=power9 -mtune=power9 -mvsx -pthread -fexcess-precision=fast -funroll-all-loops -O3 -DNDEBUG
C++ compiler: /apps/GCC/7.3.0/bin/g++ GNU 7.3.0
C++ compiler flags: -mcpu=power9 -mtune=power9 -mvsx -pthread -fexcess-precision=fast -funroll-all-loops -fopenmp -O3 -DNDEBUG
CUDA compiler: /usr/local/cuda-10.2/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2019 NVIDIA Corporation;Built on Thu_Oct_24_17:58:26_PDT_2019;Cuda compilation tools, release 10.2, V10.2.89
CUDA compiler flags:-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_35,code=compute_35;-gencode;arch=compute_50,code=compute_50;-gencode;arch=compute_52,code=compute_52;-gencode;arch=compute_60,code=compute_60;-gencode;arch=compute_61,code=compute_61;-gencode;arch=compute_70,code=compute_70;-gencode;arch=compute_75,code=compute_75;-use_fast_math;;-mcpu=power9 -mtune=power9 -mvsx -pthread -fexcess-precision=fast -funroll-all-loops -fopenmp -O3 -DNDEBUG
CUDA driver: 10.20
CUDA runtime: 10.20
The -nsteps functionality is deprecated, and may be removed in a future version. Consider using gmx convert-tpr -nsteps or changing the appropriate .mdp file field.
Overriding nsteps with value passed on the command line: 100000000 steps, 2e+05 ps
Changing nstlist from 10 to 80, rlist from 1.003 to 1.155
4 GPUs selected for this run.
Mapping of GPU IDs to the 80 GPU tasks in the 40 ranks on this node:
PP:0,PME:0,PP:0,PME:0,PP:0,PME:0,PP:0,PME:0,PP:0,PME:0,PP:0,PME:0,PP:0,PME:0,PP:0,PME:0,PP:0,PME:0,PP:0,PME:0,PP:1,PME:1,PP:1,PME:1,PP:1,PME:1,PP:1,PME:1,PP:1,PME:1,PP:1,PME:1,PP:1,PME:1,PP:1,PME:1,PP:1,PME:1,PP:1,PME:1,PP:2,PME:2,PP:2,PME:2,PP:2,PME:2,PP:2,PME:2,PP:2,PME:2,PP:2,PME:2,PP:2,PME:2,PP:2,PME:2,PP:2,PME:2,PP:2,PME:2,PP:3,PME:3,PP:3,PME:3,PP:3,PME:3,PP:3,PME:3,PP:3,PME:3,PP:3,PME:3,PP:3,PME:3,PP:3,PME:3,PP:3,PME:3,PP:3,PME:3
PP tasks will do (non-perturbed) short-ranged interactions on the GPU
PP task will update and constrain coordinates on the CPU
PME tasks will do all aspects on the GPU
This is simulation 0 out of 40 running as a composite GROMACS
multi-simulation job. Setup for this simulation:
Using 1 MPI process
Non-default thread affinity set, disabling internal thread affinity
Using 1 OpenMP thread
System total charge: -0.000
Will do PME sum in reciprocal space for electrostatic interactions.
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee and L. G. Pedersen
A smooth particle mesh Ewald method
J. Chem. Phys. 103 (1995) pp. 8577-8592
-------- -------- --- Thank You --- -------- --------
Using a Gaussian width (1/beta) of 0.320163 nm for Ewald
Potential shift: LJ r^-12: -1.000e+00 r^-6: -1.000e+00, Ewald -1.000e-05
Initialized non-bonded Ewald tables, spacing: 9.33e-04 size: 1073
Generated table with 1077 data points for 1-4 COUL.
Tabscale = 500 points/nm
Generated table with 1077 data points for 1-4 LJ6.
Tabscale = 500 points/nm
Generated table with 1077 data points for 1-4 LJ12.
Tabscale = 500 points/nm
Using GPU 8x8 nonbonded short-range kernels
Using a dual 8x8 pair-list setup updated with dynamic, rolling pruning:
outer list: updated every 80 steps, buffer 0.155 nm, rlist 1.155 nm
inner list: updated every 8 steps, buffer 0.002 nm, rlist 1.002 nm
At tolerance 0.005 kJ/mol/ps per atom, equivalent classical 1x1 list would be:
outer list: updated every 80 steps, buffer 0.291 nm, rlist 1.291 nm
inner list: updated every 8 steps, buffer 0.033 nm, rlist 1.033 nm
Using Lorentz-Berthelot Lennard-Jones combination rule
Long Range LJ corr.: <C6> 2.4138e-04
Initializing LINear Constraint Solver
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
B. Hess and H. Bekker and H. J. C. Berendsen and J. G. E. M. Fraaije
LINCS: A Linear Constraint Solver for molecular simulations
J. Comp. Chem. 18 (1997) pp. 1463-1472
-------- -------- --- Thank You --- -------- --------
The number of constraints is 339
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
S. Miyamoto and P. A. Kollman
SETTLE: An Analytical Version of the SHAKE and RATTLE Algorithms for Rigid
Water Models
J. Comp. Chem. 13 (1992) pp. 952-962
-------- -------- --- Thank You --- -------- --------
There are: 80862 Atoms
There are: 26682 VSites
Initializing Replica Exchange
Repl There are 40 replicas:
Multi-checking the number of atoms ... OK
Multi-checking the integrator ... OK
Multi-checking init_step+nsteps ...
init_step+nsteps is not equal for all subsystems
subsystem 0: 112398001
subsystem 1: 112398001
subsystem 2: 112398080
subsystem 3: 112398080
subsystem 4: 112398001
subsystem 5: 112398001
subsystem 6: 112398080
subsystem 7: 112398080
subsystem 8: 112398001
subsystem 9: 112398001
subsystem 10: 112398080
subsystem 11: 112398080
subsystem 12: 112398080
subsystem 13: 112398080
subsystem 14: 112398080
subsystem 15: 112398080
subsystem 16: 112398001
subsystem 17: 112398001
subsystem 18: 112398080
subsystem 19: 112398080
subsystem 20: 112398080
subsystem 21: 112398080
subsystem 22: 112398080
subsystem 23: 112398080
subsystem 24: 112398080
subsystem 25: 112398080
subsystem 26: 112398080
subsystem 27: 112398080
subsystem 28: 112398080
subsystem 29: 112398080
subsystem 30: 112398080
subsystem 31: 112398080
subsystem 32: 112398050
subsystem 33: 112398050
subsystem 34: 112398050
subsystem 35: 112398050
subsystem 36: 112398001
subsystem 37: 112398001
subsystem 38: 112398050
subsystem 39: 112398050
-------------------------------------------------------
Program: gmx mdrun, version 2020.4-MODIFIED
Source file: src/gromacs/mdrunutility/multisim.cpp (line 381)
MPI rank: 0 (out of 40)
Fatal error:
The 40 subsystems are not compatible
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------