GROMACS version: 2023.4 (Local)
GROMACS modification: Yes/No
Hey there!
I have been stuck on this issue for some time now, and no amount of user discussions seem to address my issue. I am trying to simulate an atomistic ion channel, and yet the NVT step (first step I am running) appears stuck on step zero and no attempted solution resolves the issue. It works fine on local, but the second I try to run it on significant GPU or cluster it gets stuck. Note there is a pressure scaling issue it warns about. That is a nonissue as that disappears in around 50 simulation steps in local hardware. Here are the details regarding submitted commands, NVT file commands, and log files:
Simulation Size: 980000 molecules.
Hardware: NVIDIA A100 GPU
NVT commands:
gmx_mpi grompp -f NVT.mdp -c EM.gro -p 8HKQAtom00N00K00Z.top -o NVT.tpr -maxwarn 4
gmx_mpi mdrun -ntomp 8 -v -deffnm NVT
NVT.mdp file details:
integrator = md
dt = 0.002
nsteps = 100000000
nstxout-compressed = 10000
compressed-x-precision = 10000
nstlog = 10000
nstenergy = 10000
nstcalcenergy = 500
cutoff-scheme = verlet
nstlist = 1
ns_type = grid
pbc = xyz
verlet-buffer-tolerance = 0.005
coulombtype = reaction-field
rcoulomb = 1.2
epsilon_r = 15
epsilon_rf = 0
vdw_type = cutoff
vdw-modifier = Potential-shift-verlet
rvdw = 1.2
tcoupl = v-rescale
tc-grps = Protein Non-Protein
tau-t = 1.0 1.0
ref-t = 310 310
gen-temp = 310
Pcoupl = C-rescale
Pcoupltype = Isotropic
ref-p = 1.0 1.0
tau-p = 5.0
compressibility = 3e-4 3e-4
gen-vel = no
constraints = h-bonds
constraint_algorithm = Lincs
lincs_order = 4
lincs_warnangle = 90
NVT.log file:
There are: 948976 Atoms
Updating coordinates and applying constraints on the GPU.
Constraining the starting coordinates (step 0)
Constraining the coordinates at t0-dt (step 0)
Center of mass motion removal mode is Linear
We have the following groups for center of mass motion removal:
0: rest
RMS relative constraint deviation after constraining: 2.78e-06
Initial temperature: 3.12762e-06 K
Started mdrun on rank 0 Thu Jul 11 11:01:38 2024
Step Time
0 0.00000
Step 0 Warning: pressure scaling more than 1%, mu: 1.09091 1.09091 1.09091
Energies (kJ/mol)
Bond U-B Proper Dih. Improper Dih. CMAP Dih.
4.33573e+04 2.37301e+05 4.67317e+05 3.27539e+03 -6.58257e+03
LJ-14 Coulomb-14 LJ (SR) Coulomb (SR) Potential
1.13006e+05 3.08084e+04 1.05483e+06 -9.56765e+05 9.86545e+05
Kinetic En. Total Energy Conserved En. Temperature Pressure (bar)
3.65085e+03 9.90196e+05 4.59899e+06 4.36281e-01 2.17355e+04
Constr. rmsd
2.77985e-06
Slurm file:
Reading file NVT.tpr, VERSION 2024.2 (single precision)
1 GPU selected for this run.
Mapping of GPU IDs to the 1 GPU task in the 1 rank on this node:
PP:0
PP tasks will do (non-perturbed) short-ranged interactions on the GPU
PP task will update and constrain coordinates on the GPU
Using 1 MPI process
Non-default thread affinity set, disabling internal thread affinity
Using 8 OpenMP threads
WARNING: This run will generate roughly 45347 Mb of data
starting mdrun ‘Title’
100000000 steps, 200000.0 ps.
Step 0 Warning: pressure scaling more than 1%, mu: 1.09091 1.09091 1.09091
^Mstep 0^Mstep 100, will finish Fri Dec 27 19:34:19 2024^Mstep 200, will finish Tue Dec 24 08:57:18 2024^Mstep 300, will finish Mon Dec 23 17:08:24 2024^Mstep 400, will finish Mon Dec 23 04:02:50 2024^Mstep 500, will finish Mon Dec 23 00:05:26 2024^Mstep 600, will finish Sun Dec 22 20:13:45 2024^Mstep 700, will finish Sun Dec 22 23:06:03 2024^Mstep 800, will finish Sun Dec 22 21:45:25 2024^Mstep 900, will finish Sun Dec 22 20:18:56 2024^Mstep 1000, will finish Sun Dec 22 22:47:05 2024^Mstep 1100, will finish Mon Dec 23 01:33:16 2024^Mstep 1200, will finish Mon Dec 23 02:20:51 2024^Mstep 1300, will finish Mon Dec 23 07:51:27 2024^Mstep 1400, will finish Mon Dec 23 08:27:20 2024^Mstep 1500, will finish Mon Dec 23 09:41:09 2024^Mstep 1600, will finish Mon Dec 23 14:06:54 2024^Mstep 1700, will finish Mon Dec 23 15:24:17 2024^Mstep 1800, will finish Mon Dec 23 16:26:06 2024^Mstep 1900, will finish Mon Dec 23 19:46:15 2024^Mstep 2000, will finish Mon Dec 23 22:14:16 2024^Mstep 2100, will finish Mon Dec 23 23:32:51 2024^Mstep 2200, will finish Tue Dec 24 03:15:20 2024^Mstep 2300, will finish Tue Dec 24 04:53:59 2024^Mstep 2400, will finish Tue Dec 24 05:54:57 2024^Mstep 2500, will finish Tue Dec 24 07:48:14 2024^Mstep 2600, will finish Tue Dec 24 09:30:01 2024^Mstep 2700, will finish Tue Dec 24 11:22:09 2024^Mstep 2800, will finish Tue Dec 24 14:08:10 2024^Mstep 2900, will finish Tue Dec 24 16:25:38 2024^Mstep 3000, will finish Tue Dec 24 18:33:43 2024^Mstep 3100, will finish Tue Dec 24 20:39:32 2024^Mstep 3200, will finish Tue Dec 24 22:46:45 2024^Mstep 3300, will finish Wed Dec 25 01:11:41 2024^Mstep 3400, will finish Wed Dec 25 03:06:53 2024^Mstep 3500, will finish Wed Dec 25 05:05:51 2024^Mstep 3600, will finish Wed Dec 25 07:47:20 2024^Mstep 3700, will finish Wed Dec 25 10:17:53 2024^Mstep 3800, will finish Wed Dec 25 12:27:23 2024^Mstep 3900, will finish Wed Dec 25 15:14:03 2024^Mstep 4000, will finish Wed Dec 25 18:13:58 2024^Mstep 4100, will finish Wed Dec 25 20:30:35 2024^Mstep 4200, will finish Wed Dec@@@
Solutions I have tried thus far:
Change between srun gmx_mpi and gmx_mpi
Change between GROMACS versions 2023.1 and 2024.1
Change pressure coupling
Change pressure coupling type
Change constraints
Change checkpoints and frequency