Creating .cpt file


Hi,

I am performing energy minimization with a tight emtol. I have set the wall time for the job to 72 hours, but with about an hour remaining, it seems unlikely that the minimization will converge within this time.

To prevent this kind of issue in the future, I tried to create checkpoint files using:

srun gmx_d mdrun -v -deffnm emnm3 -cpt 10 -ntomp $OMP_NUM_THREADS

However, no checkpoint files were produced, even when trying a less strict emtol of 100, where the EM converges in 1313 steps.

Is it correct that checkpoint files cannot be produced during energy minimization due to the integrator used? Are checkpoint files only generated for simulations using integrator = md?

I use this flag and filename: -cpi example.cpt will create a file in the working directory.

Hi Ashutosh,

srun gmx_d mdrun -v -deffnm emnm18 -cpt 10 -cpi emnm18.cpt -ntomp $OMP_NUM_THREADS

Using this I am not getting any checkpoint files. If you don’t mind, could you please post your command?

I’ve never seen a cpt file generated by an energy minimization. Also usually EM runs to completion in seconds or max minutes. What are you minimising that requires that much time? What hardware are you using (how many CPUs)? And why do you want to be so strict in terms of force tolerance?

Yes — it appears checkpoint files (.cpt) aren’t produced during our energy-minimization step on the cluster. I’m preparing the system for normal-mode analysis, which requires a much stricter em tolerance than usual. The system is a dimer (~14,000 atoms including water), so it’s fairly large and the EM takes much longer than for lysozyme (~960 atoms). I’ve achieved emtol = 1e-9 for lysozyme, but for the dimer I’m stuck around 1e-3 because jobs hit the SLURM time limit. I tried to use checkpointing/continuation but couldn’t get a stable .cpt write during EM. I am using 16 CPUs for the job. If you have any suggestions, please let me know. Much appreciated.

If the time limit is a problem, then I would honestly try submitting consecutive energy minimizations with sbatch, e.g., first you energy minimize to < 1000, then push it to < 100, < 10 and so on, so that you can always continue from the previously minimized structure. However, I’m quite sure that this becomes a problem at low tolerances, as the output precision of the .gro file is too low. Maybe you can work around this by also printing the full-precision .trr file very infrequently, checking that the last .trr frame is the final frame of the minimization, and using that as the seed for the next round of minimization. I would also take a look at the EM algorithms in the manual and see if you find a combination that fits your case. Also be sure that GROMACS is compiled in double precision!
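Something along these lines is what I mean (just a sketch, untested; the file names em_stage1, em_stage2, system.gro and topol.top are placeholders for your own setup, and it assumes nstxout is set in the .mdp so that a full-precision .trr gets written):

# em_stage1.mdp and em_stage2.mdp are identical except for emtol
# (e.g. 1000 for stage 1, 100 for stage 2, and so on)

# Stage 1: start from the prepared structure
gmx_d grompp -f em_stage1.mdp -c system.gro -p topol.top -o em_stage1.tpr
srun gmx_d mdrun -v -deffnm em_stage1 -ntomp $OMP_NUM_THREADS

# Stage 2: seed from the full-precision trajectory of stage 1
# (grompp -t reads the coordinates in full precision, avoiding the rounding of the .gro file)
gmx_d grompp -f em_stage2.mdp -c em_stage1.gro -t em_stage1.trr -p topol.top -o em_stage2.tpr
srun gmx_d mdrun -v -deffnm em_stage2 -ntomp $OMP_NUM_THREADS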

As @obZehn said, EM can’t create a .cpt file, so what you can try is to split the EM into small chunks, running them one after another until you reach your goal.

Hi,

How big is this system? You should try running it on a single thread; I have found that energy minimisations tend to perform far better on a single thread.

Best,
Jasmine

Hi obZehn, thank you for the suggestions. But I do have one question. During energy minimization, especially for large systems, we may see that the final Fmax of the saved structure is higher than the Fmax of the starting structure. In that case, to continue an energy minimization that was stopped due to the time limit, is it valid to use the final saved structure even though it has a higher Fmax? And if we use the starting structure again to continue the energy minimization, we will be in a never-ending loop.

Thank you, Ashutosh!

Hi Jasmine,

Thank you for your suggestions. I wanted to clarify my observations regarding the energy minimization runs:

  • For the large system (~14,000 atoms), I am already hitting the maximum walltime limit. Using a single thread would likely increase the required walltime, so it wouldn’t resolve the issue in this case. But if you have a different reasoning in mind, please share your thoughts.

  • For the smaller system (~4,000 atoms), I have been experiencing segmentation faults. I believe running EM on a single thread will resolve this problem. Walltime is not a limiting factor for this smaller system, so achieving very low Fnorm values should be feasible.

Thank you again for the helpful idea regarding single-threaded EM.

• If you want to check, I have pasted below the error from the segmentation fault of the 4,000-atom system.

Low-Memory BFGS Minimizer:
   Tolerance (Fmax)   = 1.00000e-09
   Number of steps    = 50000000
   Using 10 BFGS correction steps.

   F-max  = 1.63328e-09 on atom 19
   F-Norm = 7.49291e-10

Step 0, Epot=-6.784555e+04, Fnorm=7.490e-10, Fmax=1.671e-09 (atom 684)
Step 1, Epot=-6.784555e+04, Fnorm=7.466e-10, Fmax=1.669e-09 (atom 3975)
Step 2, Epot=-6.784555e+04, Fnorm=7.511e-10, Fmax=1.868e-09 (atom 3344)
Step 3, Epot=-6.784555e+04, Fnorm=7.622e-10, Fmax=2.301e-09 (atom 3344)
Step 4, Epot=-6.784555e+04, Fnorm=7.622e-10, Fmax=2.301e-09 (atom 3344)

srun: error: cpn-d02-37: task 0: Segmentation fault
srun: Terminating StepId=21844713.0
ERROR: mdrun failed.

Hi Pooja, if you want, I can help you with the minimization. Your 14k-atom system is very small, so I think my machine can handle it easily, and if the run turns out to be long I can switch to a supercomputer with 112 CPU cores.

Do you have a cluster of 16 different CPUs, or do you have 16 CPU cores?

That’s the point, I would aim for always converging the force to the tolerance requested and then start again from that last structure. If it doesn’t converge, I would put a larger threshold that is compatible with the time constraint of the cluster, so that you can always feed a decently minimised structure to the next minimization step.
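As an illustration (the numbers are purely made up), each round could reuse the same .mdp with only emtol tightened between rounds:

integrator   = l-bfgs    ; or steep / cg, whichever works best for your system
emtol        = 100       ; tighten in later rounds: 10, 1, 0.1, ...
nsteps       = 500000
nstxout      = 100       ; keep full-precision frames in the .trr to seed the next round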

#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --requeue
#SBATCH --cpus-per-task=16

Since I am currently using 16 CPU cores, I think it might be a good idea to first try increasing the number of nodes, provided it doesn’t affect the calculations. Increasing the number of CPUs could delay job allocation. Thank you, Ashutosh, for raising this point — I’ll first explore this approach and see how it works. I also truly appreciate your generous offer, but I would prefer to try resolving this on my own cluster for now. I will definitely ask you if I need any help.

Dear Pooja,

I have found that even on systems with hundreds of thousands of atoms, single-thread performance tends to be better. However, this is only for EM. 14 000 atoms is a tiny system; please let me know how it does on a single thread (1 MPI rank, 1 OpenMP thread).
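For example (assuming your build supports thread-MPI; otherwise ask SLURM for a single task), something like:

# one rank, one OpenMP thread
gmx_d mdrun -v -deffnm emnm3 -ntmpi 1 -ntomp 1

# or, launched through SLURM with a single task:
# srun -n 1 gmx_d mdrun -v -deffnm emnm3 -ntomp 1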

Best,
Jasmine

You mean you have 8 physical CPU cores and 16 threads? Most CPUs have hyperthreading. Check your maximum number of CPU cores or threads and then assign all of them on one node, as using multiple nodes may delay the process. You can also try what @Jassu suggested; that might also reduce the time.
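To check what a node actually has, something like:

lscpu | grep -E 'Socket|Core|Thread|^CPU\(s\)'   # physical cores vs. hardware threads
nproc --all                                      # total logical CPUs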