GROMACS version: 2021
GROMACS modification: No
md_0_1.log (74.8 KB)
The core files (e.g. core.245674, core.134940, core.31056) appear from time to time during my production run. The simulation time keeps advancing, so they do not seem to affect the MD. But why do they appear? The log files are attached. Is this related to my high-performance computing resources?
These are probably stdout/stderr from the compute cores. They are not GROMACS output.
output.log (3.1 MB)
Here is the output log. I can see Segmentation fault (core dumped) in it. How can I fix this? Thank you.
GERun: Contents of machinefile:
GERun: node-e00a-014
GERun: node-e00a-014
GERun: node-e00a-014
GERun: node-e00a-014
GERun: node-e00a-014
GERun: node-e00a-014
GERun: node-e00a-014
GERun: node-e00a-014
GERun: node-e00a-014
GERun: node-e00a-014
GERun: node-e00a-014
GERun: node-e00a-014
GERun:
GERun: GErun command being run:
GERun: mpirun --rsh=ssh -machinefile /tmpdir/job/9970527.undefined/machines.unique -np 12 -rr mdrun_mpi -v -deffnm md_0_1 -cpi -append -maxh 1
/shared/ucl/apps/intel/2020/impi/2019.6.166/intel64/bin/mpirun: line 103: 245674 Segmentation fault (core dumped) mpiexec.hydra -machinefile $machinefile "$@" 0<&0
/var/opt/sge/node-j00a-002/active_jobs/9970528.1/pe_hostfile
node-j00a-002
node-j00a-002
node-j00a-002
node-j00a-002
node-j00a-002
node-j00a-002
node-j00a-002
node-j00a-002
node-j00a-002
node-j00a-002
node-j00a-002
node-j00a-002
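To see which process actually crashed, the "Segmentation fault" lines in the output log can be summarised. This is only a minimal sketch: the function name segfault_summary is made up for illustration, and it assumes the shell's usual message layout, i.e. the PID directly before "Segmentation fault" and the command directly after "(core dumped)", as in the line above.

```shell
# Print the PID and command name for every "Segmentation fault" line
# that the shell wrote to a job's output log.
# Assumed layout: "<script>: line N: <PID> Segmentation fault (core dumped) <command> ..."
segfault_summary() {
    # $1: path to the job output log (e.g. output.log)
    awk '/Segmentation fault/ {
        pid = ""; cmd = ""
        for (i = 2; i <= NF; i++) {
            if ($i == "Segmentation") pid = $(i - 1)   # field before the message
            if ($i == "dumped)")      cmd = $(i + 1)   # field after "(core dumped)"
        }
        print "pid=" pid " command=" cmd
    }' "$1"
}

# Usage: segfault_summary output.log
```

For the line in the log above this reports PID 245674 and the command mpiexec.hydra. That PID matches the file name core.245674, which suggests the core file was written by the MPI launcher rather than by mdrun_mpi itself.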
Hi, here is an update on the core dumped issue. I submit a sequence of jobs, each appending to the previous run, to reach a long MD duration. When a job completes or hits the core dumped issue, the next job follows.
job_conti.sh.log (483 Bytes)
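For readers without the attachment, the idea of chained runs can be sketched as below. This is not the content of job_conti.sh. It is a hypothetical loop inside a single script, and the function name run_chain is made up for illustration. It relies on mdrun continuing from the latest checkpoint when called with -cpi -append, as in the command shown in the output log.

```shell
# Run the same command several times in a row, moving on to the next
# chunk whether or not the previous one exited cleanly.
run_chain() {
    # $1: number of chunks; remaining arguments: the command for each chunk
    n_chunks=$1
    shift
    i=1
    while [ "$i" -le "$n_chunks" ]; do
        echo "chunk $i: starting"
        if "$@"; then
            echo "chunk $i: exit status 0"
        else
            # $? here is the exit status of the command that just failed
            echo "chunk $i: exit status $? (continuing from last checkpoint)"
        fi
        i=$((i + 1))
    done
}

# Usage (launcher line taken from the log above; adapt to your site):
# run_chain 24 mpirun -np 12 mdrun_mpi -v -deffnm md_0_1 -cpi -append -maxh 1
```

Logging the exit status of each chunk makes it easier to tell afterwards which chunks ended with a crash and which simply reached the -maxh limit.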
I have tested:
gromacs/2020.4/intel-2020, with error file md_0_1.e9979959.log (1.2 KB) and core.184587.log (612 KB);
gromacs/2019.3/intel-2018, with error file md_0_1.e9979960.log (289.2 KB).
For both GROMACS versions, the MD can proceed. The “core dumped” issue happens only occasionally with the 2020 version and does not affect the run.
I will ask our HPC administrator about possible causes, but I would appreciate it if anyone here knows the reason.