Performance details are not available in the log file

GROMACS version:
GROMACS modification: Yes/No
I am testing performance over 20 nodes on an HPC cluster. I submit the job through PBS and end the program after 2 minutes, but sometimes the log file does not contain the performance report, especially when I use more nodes. Can someone tell me why?

Running on 25 nodes with total 1700 cores, 6800 logical cores
  Cores per node:           68
  Logical cores per node:   272

The last part of the log is:

Intra-simulation communication will occur every 20 steps.
There are: 338569 Atoms
Atom distribution over 960 domains: av 352 stddev 27 min 307 max 402
Center of mass motion removal mode is Linear
We have the following groups for center of mass motion removal:
  1:  SOLV

Started mdrun on rank 0 Tue Jan 11 21:08:21 2022

           Step           Time
              0        0.00000

   Energies (kJ/mol)
           Bond            U-B    Proper Dih.  Improper Dih.      CMAP Dih.
    4.47063e+04    2.14866e+05    1.52041e+05    3.18953e+03   -2.07321e+03
          LJ-14     Coulomb-14        LJ (SR)   Coulomb (SR)   Coul. recip.
    2.85712e+04   -8.40430e+04    2.20345e+05   -4.16962e+06    1.63367e+04
      Potential    Kinetic En.   Total Energy  Conserved En.    Temperature
   -3.57568e+06    9.20262e+05   -2.65541e+06   -2.65521e+06    3.10677e+02
 Pressure (bar)   Constr. rmsd
   -1.66843e+02    4.37461e-06

If the process is killed, the log will be incomplete. You can run with -maxh 0.17 to get a 2-minute run with clean termination.

But the performance details do appear in the log file when I run with different settings, for example on 10 nodes.

Is that behavior still exhibited with a clean exit?

Batch systems often stop jobs by sending a TERM signal, waiting a bit, then sending KILL. I suspect that using more nodes leads to a slower shutdown, so GROMACS does not have time to exit cleanly in these cases. (You can also look at the standard output of the jobs you ran: if you don’t see the “GROMACS reminds you:” quote at the end, then the job was killed prematurely.)
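To make that check less manual, here is a minimal sketch of a helper that looks for the closing quote in the job output and for the final summary in the log. The function names and file arguments are made up for illustration, and it assumes that a complete log contains a line with "Performance:" near the end, which is worth confirming against a log from one of your successful 10-node runs.

```shell
#!/usr/bin/env bash
# Sketch: decide whether a GROMACS run appears to have exited cleanly.
# Assumptions (not from the thread): the job output was saved to a file,
# and a complete log contains a "Performance:" summary line.

# Succeeds if the job output contains the closing "GROMACS reminds you" quote.
has_closing_quote() {
    grep -q "GROMACS reminds you" "$1"
}

# Succeeds if the log contains the final performance summary line.
has_performance_report() {
    grep -q "Performance:" "$1"
}

# Usage: check_run JOB_OUTPUT_FILE LOG_FILE
check_run() {
    stdout_file=$1
    log_file=$2
    if has_closing_quote "$stdout_file" && has_performance_report "$log_file"; then
        echo "clean exit"
    else
        echo "killed early"
    fi
}
```

For example, `check_run job.out md.log` (both file names hypothetical) prints either "clean exit" or "killed early", which makes it easy to scan a batch of runs at different node counts.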


-maxh 0.17

This also gives the same exit. When the number of nodes is smaller and I end the run within 2 minutes, I get all the reports.

When using -maxh, are you letting the job finish, or are you ending it yourself?

-maxh 0.17 is roughly 10 minutes, not 2. Try 0.033 or lower, since GROMACS will quit at 0.99 * maxh, which leaves a very small margin for a run of only 2 minutes (which means it may not have enough time to write the log).
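Since -maxh takes hours, the conversion is easy to get wrong. A small sketch of the arithmetic, with helper names that are purely illustrative:

```shell
#!/usr/bin/env bash
# Sketch: convert between minutes and the hour value that -maxh expects.
# LC_ALL=C keeps awk printing a "." decimal separator regardless of locale.

# Usage: minutes_to_maxh MINUTES  -> hours with 4 decimals
minutes_to_maxh() {
    LC_ALL=C awk -v m="$1" 'BEGIN { printf "%.4f\n", m / 60 }'
}

# Usage: maxh_to_minutes HOURS  -> minutes with 1 decimal
maxh_to_minutes() {
    LC_ALL=C awk -v h="$1" 'BEGIN { printf "%.1f\n", h * 60 }'
}

minutes_to_maxh 2      # prints 0.0333
maxh_to_minutes 0.17   # prints 10.2
```

So a 2-minute limit corresponds to about -maxh 0.033, while 0.17 gives a little over 10 minutes.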

You could also consider running a fixed number of steps per simulation, which works better for resetting performance counters and allows the simulation to “warm up” before measurement.

gmx mdrun -resethway -nsteps 15000 resets the counters at step 7500. Or,
gmx mdrun -resetstep 10000 -nsteps 15000 resets the counters at step 10000.
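If you benchmark several node counts, it can help to generate these command lines in one place. The following is only a sketch: `bench_cmd` is a hypothetical helper that prints the command rather than running it, and it leaves out your MPI launcher and any other mdrun options you need.

```shell
#!/usr/bin/env bash
# Sketch: print (not run) an mdrun command for a fixed-length benchmark.
# Usage: bench_cmd NSTEPS [RESETSTEP]
#   Without RESETSTEP, counters are reset halfway through via -resethway.
#   With RESETSTEP, counters are reset at that step via -resetstep.
bench_cmd() {
    nsteps=$1
    resetstep=${2:-}
    if [ -z "$resetstep" ]; then
        echo "gmx mdrun -nsteps $nsteps -resethway"
    elif [ "$resetstep" -lt "$nsteps" ]; then
        echo "gmx mdrun -nsteps $nsteps -resetstep $resetstep"
    else
        # Resetting at or after the last step would leave nothing to measure.
        echo "error: reset step must be smaller than nsteps" >&2
        return 1
    fi
}

bench_cmd 15000          # prints: gmx mdrun -nsteps 15000 -resethway
bench_cmd 15000 10000    # prints: gmx mdrun -nsteps 15000 -resetstep 10000
```

Because the run length is fixed in steps rather than wall time, the run ends on its own and the job does not need to be stopped by the batch system.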