I started a 1 ns NPT run; it has been running for 36 hours and still has not completed. Can anyone please guide me on this?
How many atoms are in the system? What hardware are you running on? Is mdrun still producing output or has it frozen? Have you done benchmarking to determine how long you expect 1 ns to take? Huge systems might take days to complete 1 ns if the available hardware isn’t very powerful, but small systems will run more quickly. Without knowing more details about what you’re doing and what hardware you have available, it’s impossible to say whether this outcome is to be expected.
There are 3101 atoms in the system.
#PBS -S /bin/bash
#PBS -N GROMACS_MD
#PBS -l select=1:ncpus=32
#PBS -l walltime=360:00:00
#PBS -q testq
#PBS -o out.txt
#PBS -e err.txt
## export environment variable from current session to job run-time … better to use this always.
module load intel/2018.3.222
module load gromacs/2019
wc -l < $PBS_NODEFILE
mpirun -np 32 gmx_mpi mdrun -ntomp 32 -deffnm npt
And these are the system details.
Have you looked inside the log file, e.g. npt.log? mdrun will keep writing information to it if things are still running, and it should also contain messages if something has gone wrong.
And what about the PBS output files, what do they contain? Look in the out.txt and err.txt files.
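As a rough sketch of that check, something like the following could be run in the job directory. The function name inspect_log is hypothetical (it is not part of GROMACS or PBS), and the file names are assumed from the job script above (-deffnm npt, -o out.txt, -e err.txt).

```shell
# Hypothetical helper (not a GROMACS or PBS tool): show the end of a file
# and any lines in it that mention an error or a warning.
inspect_log() {
    local file="$1"
    if [ ! -s "$file" ]; then
        echo "$file: missing or empty"
        return 1
    fi
    echo "== last lines of $file =="
    tail -n 20 "$file"
    echo "== error/warning lines in $file =="
    grep -n -i -E 'error|warning' "$file" || echo "(none found)"
}

# File names assumed from the job script: npt.log, out.txt, err.txt
for f in npt.log out.txt err.txt; do
    inspect_log "$f" || true
done
```

If the tail of npt.log keeps changing between two runs of this a few minutes apart, mdrun is still advancing; if it does not, the run has probably stalled.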
With only 3,101 atoms and 32 threads, 1 ns should complete quickly. Something has gone wrong.
Did you do some scaling tests/benchmarking as Justin asked? It is always a good idea to optimise the number of cores the calculation is spread over for a given system size. For your 3,000 atoms, using 32 cores may be inefficient (but if you need it done as fast as possible, then that may not be an issue). Benchmarking also gives you an idea of how long each ns of simulation time should take.
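One thing that may be worth checking while benchmarking, offered as a guess rather than a diagnosis: the job script posted above asks for 32 MPI ranks (mpirun -np 32) and 32 OpenMP threads per rank (-ntomp 32) on a node with ncpus=32. A minimal sketch of the arithmetic, with the three numbers copied from that script:

```shell
# Values copied from the job script in this thread
ncpus=32        # "#PBS -l select=1:ncpus=32"
mpi_ranks=32    # "mpirun -np 32"
omp_threads=32  # "mdrun -ntomp 32" (OpenMP threads per MPI rank)

# Total threads mdrun is asked to start = ranks x threads per rank
total=$((mpi_ranks * omp_threads))
echo "threads requested: $total, cores available: $ncpus"

if [ "$total" -gt "$ncpus" ]; then
    echo "oversubscribed by a factor of $((total / ncpus))"
fi
```

If the product of ranks and threads per rank exceeds the core count, the threads compete for cores and the run can slow down badly. Which split of ranks and threads is fastest for a system this small is something only a benchmark on that machine can tell you.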
I benchmarked the server earlier and it ran at 2.5 ns per hour. But this NPT job is not completing. Please suggest what to do. Earlier, while running NVT, I got some warnings, but I ignored them and carried on.
Answer the questions I asked you i.e. about the output files. Without further information we can’t say what is going on and why.
Since the expected rate is 2.5 ns/hour, simply kill the job. Something has definitely gone wrong.
Then, as suggested, look at the various log files to find out why.
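To put numbers on that, here is a small sketch using only the figures quoted in this thread (1 ns requested, 2.5 ns/hour benchmarked, 36 hours elapsed). The variable names are made up for illustration.

```shell
sim_ns=1        # length of the NPT run, in ns
rate=2.5        # benchmarked speed, in ns per hour
elapsed_h=36    # wall-clock hours the job has already used

# Expected wall-clock time in minutes at the benchmarked rate
expected_min=$(awk -v ns="$sim_ns" -v r="$rate" 'BEGIN { printf "%.0f", ns / r * 60 }')

# How many times longer than expected the job has been running
ratio=$(( elapsed_h * 60 / expected_min ))

echo "expected: ${expected_min} min, elapsed: ${elapsed_h} h (${ratio}x longer)"
# On PBS the job can then be removed with: qdel <job id>
```

At the benchmarked rate the run should need about 24 minutes, so 36 hours is roughly 90 times longer than expected.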