Reproducible segfault with 2020.4

GROMACS version: 2020.4 single prec.
GROMACS modification: No

Hi
gmx keeps giving me segfaults with a particular system (see below for input files and commands).
I tried this on different computers (all Debian 10 on Intel Core™ i7-4930K with NVIDIA GeForce GTX 1060), and with both gmx 2020.4 and 2019.3. At first it seemed to happen only sometimes; eventually I found that with one particular ld-seed I get the segfault, while with a different one the simulation finishes gracefully - and reproducibly so …

As the input files are fairly large I cannot include them in this post, and there seems to be no way to upload attachments in this forum (or I am too dumb to figure out how that works), so I put the files (all input and output, except for the trr trajectories) in an archive on my own web page, from where they can easily be downloaded:
http://www.brunsteiner.net/segfault.tgz

commands used:
gmx grompp -f md01.mdp -c aae-01.pdb -p faf-aae-13-14-000008-01.top -o faf-aae-13-14-000008-01 -po faf-aae-13-14-000008-01.mdp
gmx mdrun -v -deffnm faf-aae-13-14-000008-01 > stdouterr 2>&1 &

Any help is highly appreciated!
thanks,
Michael

What prior minimization and equilibration have you done? The fact that ld-seed influences the result suggests something about an instability of the velocities/thermostat. I notice that you are setting the temperature of the system to 295 K but not generating velocities, so the initial step has T = 0 K and effectively is trying to instantaneously thermalize to 295 K. The .log file has very little in it so it is hard to see what the response of the temperature is, but to me, your problem is most likely in the initial conditions and inadequate equilibration.
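
A minimal sketch of the velocity-generation settings this reply alludes to, written as a shell heredoc. The file name genvel-fragment.mdp is hypothetical, and the fragment is meant to be merged by hand into the real md01.mdp (whose contents are not shown in this thread), replacing any existing gen-vel line rather than duplicating it:

```shell
# Write an .mdp fragment that draws initial velocities from a
# Maxwell distribution at the target temperature of 295 K.
# genvel-fragment.mdp is a hypothetical file name.
cat > genvel-fragment.mdp <<'EOF'
; generate initial velocities instead of starting from T = 0 K
gen-vel   = yes
gen-temp  = 295
gen-seed  = -1    ; -1 asks grompp to pick a pseudo-random seed
EOF
```

After merging, grompp has to be rerun so that the new .tpr actually contains the generated velocities.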

Jalemkul - I understand your comment, but I am pretty sure that equilibration is not an issue here. If what you suggest were the case, the system would blow up and I’d get a more meaningful error message (e.g. a LINCS error, or the like) - as it is, all I get is gmx stopping without further comment, and the OS telling me that there was a segfault … also, the initial structure is perfectly fine: it’s an organic molecular crystal with the structure taken from X-ray data, so there are no overlaps or funny conformations to start with …

At first I thought my memory might be going bad, but then this started happening on at least 4 different machines - it’s very unlikely that the memory starts failing at about the same time on several different computers that have worked for years … I believe the issue might be the force field (this work is about FF optimization) … I use a modified GAFF2 (both LJ and bonded parameters changed), but I try to keep the individual parameters within reasonable bounds so as to avoid numerical issues (no overly small sigma, no overly stiff force constants, etc.) … but still there should be some meaningful error message, which is not the case … perhaps I need to go and try valgrind … can gmx be compiled with debug info?
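
On the debug-info question, a rough sketch of one way to go about it, assuming a CMake-based GROMACS source tree. The function names, source and install paths are all hypothetical; RelWithDebInfo is a standard CMake build type that keeps optimization while adding debug symbols. The commands are wrapped in functions so that nothing runs until they are called:

```shell
# Configure and build GROMACS with debug symbols.
# $1 = path to the GROMACS source tree, $2 = install prefix (both hypothetical).
build_gmx_debug() {
    mkdir -p build-debug && cd build-debug || return 1
    cmake "$1" -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_INSTALL_PREFIX="$2" &&
        make -j 4 && make install
}

# Run mdrun under gdb in batch mode and print a backtrace when it crashes.
# $1 = the -deffnm base name used for the run.
backtrace_mdrun() {
    gdb -batch -ex run -ex bt --args gmx mdrun -v -deffnm "$1"
}
```

Calling backtrace_mdrun faf-aae-13-14-000008-01 with the debug build on the PATH should show which function the segfault occurs in, which is usually more informative than the bare OS message.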

BTW … ld-seed does seem to make a difference here - I double-checked with gmx dump -s … the two tpr files differ ONLY by this seed, but one dies and the other runs, on the same computer. According to the documentation ld-seed only affects BD or SD dynamics, while I use MD … as far as I can see, its effect on plain MD is not documented anywhere …
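
The comparison described above can be sketched as a small shell helper that dumps both run input files to text and diffs them. The function name diff_tpr and the .tpr file names in the usage line are hypothetical:

```shell
# Dump two .tpr files with "gmx dump -s" and show the lines that differ.
# Returns the exit status of diff (0 = identical, 1 = differences found).
diff_tpr() {
    a_txt=$(mktemp)
    b_txt=$(mktemp)
    gmx dump -s "$1" > "$a_txt" 2>/dev/null
    gmx dump -s "$2" > "$b_txt" 2>/dev/null
    diff "$a_txt" "$b_txt"
    status=$?
    rm -f "$a_txt" "$b_txt"
    return $status
}

# Usage (hypothetical file names):
#   diff_tpr run-crashing.tpr run-ok.tpr
```

If only the ld-seed line shows up in the output, the two inputs are otherwise identical, as reported here. gmx check with the -s1 and -s2 options offers a similar comparison.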

mic

ld-seed affects your thermostat:

https://manual.gromacs.org/current/user-guide/mdp-options.html#temperature-coupling

That’s why I suggested a more robust protocol, especially if the quality of your FF is unknown. How quickly does the segfault occur? Immediately? In most cases you will get a specific GROMACS error before a failure, but not always. A segfault can occur when the initial conditions are unphysical and some algorithm simply can’t survive them.
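
To see how the temperature responds in the first steps, one option is to extract it from the energy file and look at the first few rows. The sketch below assumes the .edr file contains a term named Temperature and that energies were written often enough to resolve the start of the run; the helper name first_temps and the output file name temperature.xvg are hypothetical:

```shell
# Produce the .xvg first, e.g. (not run here):
#   echo Temperature | gmx energy -f faf-aae-13-14-000008-01.edr -o temperature.xvg

# Print the first N data rows (time, value) of an .xvg file,
# skipping the '#' comment and '@' plot-directive header lines.
# $1 = .xvg file, $2 = number of rows (default 10).
first_temps() {
    awk -v n="${2:-10}" '!/^[#@]/ && NF >= 2 { print $1, $2; if (++c >= n) exit }' "$1"
}
```

A temperature that starts near 0 K and overshoots wildly within the first rows would support the initial-conditions explanation. For a short diagnostic run, lowering nstenergy and nstlog in the .mdp gives finer-grained output.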