Creating .cpt file


Hi,

I am performing energy minimization with a tight emtol. I have set the wall time for the job to 72 hours, but with about an hour remaining, it seems unlikely that the minimization will converge within this time.

To prevent this kind of issue in the future, I tried to create checkpoint files using:

srun gmx_d mdrun -v -deffnm emnm3 -cpt 10 -ntomp $OMP_NUM_THREADS

However, no checkpoint files were produced, even when trying a less strict emtol of 100, where the EM converges in 1313 steps.

Is it correct that checkpoint files cannot be produced during energy minimization due to the integrator used? Are checkpoint files only generated for simulations using integrator = md?

I use this flag and filename: -cpi example.cpt will create a file in the working directory.

Hi Ashutosh,

srun gmx_d mdrun -v -deffnm emnm18 -cpt 10 -cpi emnm18.cpt -ntomp $OMP_NUM_THREADS

Using this I am not getting any checkpoint files. If you don’t mind, could you please post your command?

I’ve never seen a cpt file generated by an energy minimization. Also usually EM runs to completion in seconds or max minutes. What are you minimising that requires that much time? What hardware are you using (how many CPUs)? And why do you want to be so strict in terms of force tolerance?

Yes — it appears checkpoint files (.cpt) aren’t produced during our energy-minimization step on the cluster. I’m preparing the system for normal-mode analysis, which requires a much stricter em tolerance than usual. The system is a dimer (~14,000 atoms including water), so it’s fairly large and the EM takes much longer than for lysozyme (~960 atoms). I’ve achieved emtol = 1e-9 for lysozyme, but for the dimer I’m stuck around 1e-3 because jobs hit the SLURM time limit. I tried to use checkpointing/continuation but couldn’t get a stable .cpt write during EM. I am using 16 CPUs for the job. If you have any suggestions, please let me know. Much appreciated.

If the time limit is a problem, then I would honestly try submitting consecutive energy minimizations with sbatch, e.g., first you energy minimize to < 1000, then push it to < 100, < 10 and so on, so that you can always continue from the previously minimized structure. However, I’m quite sure that this becomes a problem at low tolerances, as the output precision of the .gro file is too low. Maybe you can work around this by also printing the full-precision .trr file very infrequently, checking that the last .trr frame is the final frame of the minimization, and using that as the seed for the next round of minimization. I would also take a look at the EM algorithms in the manual and see if you find a combination that fits your case. Also be sure that GROMACS is compiled in double precision!
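Something along these lines is what I mean (just a sketch, untested; the file names em_stage1, em_stage2, system.gro and topol.top are placeholders for your own setup, and it assumes nstxout is set in the .mdp so that a full-precision .trr gets written):

# em_stage1.mdp and em_stage2.mdp are identical except for emtol
# (e.g. 1000 for stage 1, 100 for stage 2, and so on)

# Stage 1: start from the prepared structure
gmx_d grompp -f em_stage1.mdp -c system.gro -p topol.top -o em_stage1.tpr
srun gmx_d mdrun -v -deffnm em_stage1 -ntomp $OMP_NUM_THREADS

# Stage 2: seed from the full-precision trajectory of stage 1
# (grompp -t reads the coordinates in full precision, avoiding the rounding of the .gro file)
gmx_d grompp -f em_stage2.mdp -c em_stage1.gro -t em_stage1.trr -p topol.top -o em_stage2.tpr
srun gmx_d mdrun -v -deffnm em_stage2 -ntomp $OMP_NUM_THREADS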

As @obZehn said, EM can’t create a .cpt file, so what you can try is to split the EM into small chunks, running them one after another until you reach your goal.

Hi,

How big is this system? You should try running it on a single thread; I have found that energy minimisations tend to perform far better on a single thread.

Best,
Jasmine

Hi obZehn, thank you for the suggestions. But I do have one question. During energy minimization, especially for large systems, we may see that the final Fmax of the saved structure is higher than the Fmax of the starting structure. In that case, to continue an energy minimization that was stopped due to the time limit, is it valid to use the final saved structure even though it has a higher Fmax? And if we use the starting structure again to continue the energy minimization, we will be in a never-ending loop.

Thank you, Ashutosh!

Hi Jasmine,

Thank you for your suggestions. I wanted to clarify my observations regarding the energy minimization runs:

  • For the large system (~14,000 atoms), I am already hitting the maximum walltime limit. Using a single thread would likely increase the required walltime, so it wouldn’t resolve the issue in this case. But if you have a different reasoning in mind, please share your thoughts.

  • For the smaller system (~4,000 atoms), I have been experiencing segmentation faults. I believe running EM on a single thread will resolve this problem. Walltime is not a limiting factor for this smaller system, so achieving very low Fnorm values should be feasible.

Thank you again for the helpful idea regarding single-threaded EM.

• If you want to check, I have pasted below the error from the segmentation fault of the 4,000-atom system.

Low-Memory BFGS Minimizer:
   Tolerance (Fmax)   = 1.00000e-09
   Number of steps    = 50000000
   Using 10 BFGS correction steps.

   F-max  = 1.63328e-09 on atom 19
   F-Norm = 7.49291e-10

Step 0, Epot=-6.784555e+04, Fnorm=7.490e-10, Fmax=1.671e-09 (atom 684)
Step 1, Epot=-6.784555e+04, Fnorm=7.466e-10, Fmax=1.669e-09 (atom 3975)
Step 2, Epot=-6.784555e+04, Fnorm=7.511e-10, Fmax=1.868e-09 (atom 3344)
Step 3, Epot=-6.784555e+04, Fnorm=7.622e-10, Fmax=2.301e-09 (atom 3344)
Step 4, Epot=-6.784555e+04, Fnorm=7.622e-10, Fmax=2.301e-09 (atom 3344)

srun: error: cpn-d02-37: task 0: Segmentation fault
srun: Terminating StepId=21844713.0
ERROR: mdrun failed.

Hi Pooja, if you want, I can help you with the minimization. Your 14k-atom system is very small, so I think my machine can handle it easily, and if the run turns out to be long I can switch to a supercomputer with 112 CPU cores.

Do you have a cluster of 16 different CPUs, or do you have 16 CPU cores?

That’s the point, I would aim for always converging the force to the tolerance requested and then start again from that last structure. If it doesn’t converge, I would put a larger threshold that is compatible with the time constraint of the cluster, so that you can always feed a decently minimised structure to the next minimization step.
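As an illustration (the numbers are purely made up), each round could reuse the same .mdp with only emtol tightened between rounds:

integrator   = l-bfgs    ; or steep / cg, whichever works best for your system
emtol        = 100       ; tighten in later rounds: 10, 1, 0.1, ...
nsteps       = 500000
nstxout      = 100       ; keep full-precision frames in the .trr to seed the next round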

#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --requeue
#SBATCH --cpus-per-task=16

Since I am currently using 16 CPU cores, I think it might be a good idea to first try increasing the number of nodes, provided it doesn’t affect the calculations. Increasing the number of CPUs could delay job allocation. Thank you, Ashutosh, for raising this point — I’ll first explore this approach and see how it works. I also truly appreciate your generous offer, but I would prefer to try resolving this on my own cluster for now. I will definitely ask you if I need any help.

Dear Pooja,

I have found that even on systems with hundreds of thousands of atoms, single-thread performance tends to be better. However, this is only for EM. 14 000 atoms is a tiny system; please let me know how it does on a single thread (1 MPI rank, 1 OpenMP thread).
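For example (assuming your build supports thread-MPI; otherwise ask SLURM for a single task), something like:

# one rank, one OpenMP thread
gmx_d mdrun -v -deffnm emnm3 -ntmpi 1 -ntomp 1

# or, launched through SLURM with a single task:
# srun -n 1 gmx_d mdrun -v -deffnm emnm3 -ntomp 1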

Best,
Jasmine

You mean you have 8 physical CPU cores and 16 threads? Most CPUs have hyperthreading. Check your maximum number of CPU cores or threads and then assign all of them on one node, as using multiple nodes may delay the process. You can also try what @Jassu suggested; that might also reduce the time.
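To check what a node actually has, something like:

lscpu | grep -E 'Socket|Core|Thread|^CPU\(s\)'   # physical cores vs. hardware threads
nproc --all                                      # total logical CPUs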