GROMACS version: 2024.1
GROMACS modification: No
Hi all,
I am running a simulation that hangs partway through gmx mdrun while the CPU continues to run.
For about the first 10 hours the simulation runs fine and trajectory output keeps being written. Then, at a random point, it stops writing any new trajectory frames or other files. When I restart it, it resumes from the checkpoint, and the trajectory files are not corrupted. I have checked that there is plenty of free RAM and storage, so I do not think it is a storage problem.
This has happened a couple of times, and the exact same files run perfectly on another HPC system (the simulation was tested on the new 96-core server). What is the best way to debug this, and has anyone had a similar experience?
Best regards,
Ben