Calculation stopped but job not finished

GROMACS version: 2023.2
GROMACS modification: No

Hi,

I launched 12 independent NPT runs of 1 µs each, and I noticed that all of them got stuck at 502.5 ns without any error.

Because there was no error, the jobs themselves did not terminate, which led to quite a waste of time and money…

Has anyone experienced the same problem?

Thanks in advance
Kotaro Tanaka

It is very difficult to say anything specific without more information. Did you run the 12 simulations at the same time, e.g., with the -multidir option? Otherwise it would be improbable that all simulations stopped at exactly the same time.

Did you get any output to the terminal explaining why they stopped? Have you checked the end of the log files? Did you run out of disk space? Were they run on your own local computer or on a remote machine with a queueing system?
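If it helps, here is a minimal Python sketch of those two checks: it prints the last few lines of each log file and the free space on the filesystem that holds it. The directory layout (`run_*/md.log`) is only an assumption for illustration; adjust the pattern to wherever your runs actually write their logs.

```python
import shutil
from collections import deque
from pathlib import Path


def tail(path, n=20):
    """Return the last n lines of a text file as a list of strings."""
    with open(path, errors="replace") as fh:
        return [line.rstrip("\n") for line in deque(fh, maxlen=n)]


def free_gib(path="."):
    """Return the free space, in GiB, on the filesystem that holds path."""
    return shutil.disk_usage(path).free / 2**30


if __name__ == "__main__":
    # Hypothetical layout: one directory per run, each containing md.log.
    for log in sorted(Path(".").glob("run_*/md.log")):
        print(f"== {log} ({free_gib(log.parent):.1f} GiB free)")
        print("\n".join(tail(log, 5)))
```

On a cluster, keep in mind that a disk quota can be exhausted even when the filesystem itself still reports free space, so the quota tool of the site is worth checking as well.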

Thank you for your reply

Did you run the 12 simulations at the same time, e.g., with the -multidir option?

Almost at the same time. Not with the -multidir option. I ran the 12 jobs independently with a shell script.

Did you get any output to the terminal explaining why they stopped?

No.

Have you checked the end of the log files?

Yes, but there was no error, warning, or related message.

Did you run out of disk space?

No.

Were they run on your own local computer or on a remote machine with a queueing system?

A remote machine with a queueing system (Fujitsu TCS)

step7_production.mdp (1.07 KB)

Since the jobs ran for 500 ns I don’t think the mdp parameters are the cause. With the available information I can’t say for sure whether the problem was caused by GROMACS or by some temporary problem with the hardware/software stack of the server. But we’re of course interested if others have experienced the same thing. It’s not impossible that it is related to GROMACS issue #4766, “PBC wrapping hangs after a system explodes”, but as I said above, since the runs had already gone for 500 ns I don’t think that’s the problem.
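Whatever the cause, a run that hangs without printing anything can only be noticed from the outside. One simple heuristic is to check how long ago the log file was last written. The sketch below is an assumption-laden example, not a GROMACS feature: the one-hour threshold is arbitrary and should be chosen to be comfortably longer than the interval at which your run normally writes to the log (set by `nstlog` in the mdp file).

```python
import os
import time


def seconds_since_update(path, now=None):
    """Return how many seconds ago the file at path was last modified."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(path)


def is_stalled(path, max_idle_s=3600, now=None):
    """Return True if the file has not been modified for more than max_idle_s seconds.

    max_idle_s is a hypothetical threshold; pick a value well above the
    normal interval between log writes for your system.
    """
    return seconds_since_update(path, now) > max_idle_s
```

A job script could call `is_stalled("md.log")` periodically and cancel the job through the queueing system when it returns True, so that a silent hang does not keep consuming allocation until the wall-time limit.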

Thank you for the comments.

I’ll just continue the calculations from where they stopped for now.

Thanks
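For anyone landing here later, continuing from the last checkpoint is done with the `-cpi` option of `gmx mdrun`. The small Python helper below only assembles that command line. It assumes the run was started with `-deffnm`, so that the checkpoint is named `<deffnm>.cpt`; the name `step7_production` is taken from the attached mdp file, and whether the tpr and checkpoint files really use it is an assumption.

```python
import shlex


def restart_command(deffnm, gmx="gmx"):
    """Build an mdrun command that continues a run from its checkpoint file.

    Assumes all files share the -deffnm prefix, so the checkpoint is
    <deffnm>.cpt. Use gmx="gmx_mpi" for an MPI build.
    """
    return [gmx, "mdrun", "-deffnm", deffnm, "-cpi", f"{deffnm}.cpt"]


if __name__ == "__main__":
    # Prints: gmx mdrun -deffnm step7_production -cpi step7_production.cpt
    print(shlex.join(restart_command("step7_production")))
```

When restarted this way, mdrun by default appends to the existing output files and runs until the number of steps stored in the tpr file is reached, so the original output files should be left in place and unmodified.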