Protein membrane system explosion

GROMACS version: 2020.2
GROMACS modification: No

Hi all,

I’m running a set of GPU-accelerated GPCR simulations in a POPC:SDPC:CHL1 bilayer with three different ligands bound. I’ve run two of the three ligands in duplicate for 600 ns each so far, but I’m running into issues with the third system:

The settings are identical to the other two ligands, and the system was generated the same way as before (using the CHARMM-GUI Membrane Builder). But ~200 ns into the simulation it crashed, so I used the checkpoint file to resume; it ran okay for a while but crashed again. I repeated this a few times and left the lab, but when I came back the computer had become unresponsive and wasn’t outputting any video to the monitors. The first time, I did a hard reset and found that only one of my two monitors was working, and in the display settings (Ubuntu 20.04 LTS) the working monitor was no longer recognised. nvidia-smi also gave me an error message (I forget what it was). So I apt-get purged the NVIDIA drivers and autoinstalled the newest version, which fixed both the monitor and nvidia-smi issues. When I checked the simulation, it had crashed at ~320 ns with the same error as before:

The Z-size of the box (9.272714) times the triclinic skew factor (1.000000) is smaller than the number of DD cells (7) times the smallest allowed cell size (1.326000)

I also noticed that in the last frame before the crash, the gmx mdrun -v output reported vol as 1.00!

I thought maybe the system wasn’t equilibrated enough, so I dumped a frame from a few ns into the production run, re-minimised and re-equilibrated it, and started a new run. This time it ran fine until ~150 ns, as I found out after rebooting because there was no video output again. I didn’t need to reinstall the drivers this time; they were still working. I checked the volume from the EDR file, and there was a noticeable sharp dip just before the simulation crashed. Hoping it was just a one-off, I resumed the simulation, but once again my colleague informs me it has become unresponsive. I haven’t been back in the lab to check where it crashed, but regardless, I don’t think finishing the 600 ns run under these circumstances is realistic.
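For anyone wanting to pin down where the box collapses, the volume can be extracted with gmx energy (e.g. `echo Volume | gmx energy -f md.edr -o volume.xvg`) and the resulting .xvg scanned for sudden drops. A minimal sketch; the function name and the 2%-per-frame threshold are my own arbitrary choices:

```python
def find_volume_dips(lines, rel_drop=0.02):
    """Return (time, volume) pairs where the volume fell by more than
    rel_drop relative to the previous frame in a GROMACS .xvg file."""
    data = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "@")):  # skip xvg header lines
            continue
        t, v = map(float, line.split()[:2])
        data.append((t, v))
    dips = []
    for (t0, v0), (t1, v1) in zip(data, data[1:]):
        if v0 > 0 and (v0 - v1) / v0 > rel_drop:
            dips.append((t1, v1))
    return dips

# Tiny synthetic example: a stable box volume with one sharp dip
sample = [
    "# volume.xvg sample",
    '@ title "Volume"',
    "0.0 797.1",
    "10.0 797.3",
    "20.0 750.0",   # sharp dip here
    "30.0 797.0",
]
print(find_volume_dips(sample))  # -> [(20.0, 750.0)]
```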

Does anyone know why a simulation would suddenly blow up so far into the run, on a system pretty much identical to the others, which ran fine?

I’m considering increasing the pressure coupling time constant, but would that affect the dynamics to the extent that it makes comparisons to the previous runs invalid? I compared the initial cell size to the other runs and there wasn’t much difference. The only irregularity I noticed was a paucity of CHL1 in a particular part of the membrane. Perhaps rebuilding the system would fix the issue?
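For reference, the relevant settings for a membrane system look something like this (values here are illustrative, not taken from my files; a larger tau_p makes the box respond more slowly to pressure fluctuations):

```
; Semiisotropic pressure coupling for a bilayer (illustrative values;
; check against your own CHARMM-GUI mdp files)
pcoupl           = Parrinello-Rahman
pcoupltype       = semiisotropic
tau_p            = 5.0            ; increase to damp box fluctuations
ref_p            = 1.0 1.0
compressibility  = 4.5e-5 4.5e-5
```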

I think this is an old bug in the treatment of CMAP terms when they cross periodic boundaries. Use the latest version of GROMACS and it should fix the issue (the fix went into 2020.3, IIRC, but it is always good to use the most up-to-date version).


Interesting… I did notice some transient Z-axis boundary crossing.

Will running this system on 2020.3+ or 2021.x pose any issues for comparing to the previous runs with 2020.2? I’d rather not run another 2.4 μs if possible :)

Patch releases are designed to be scientifically compatible, so use the latest 2020.x version to see if it fixes your problem.

I’ve patched to version 2020.6 now, but unfortunately the problem is still recurring with the original trajectory.

I’ve also re-generated the system with the same options (just a different randomly generated lipid and solvent box) and am running it in the lab now to see if that resolves the issue.

I’ve just found one point of difference between the runs:

Since generating the systems, CHARMM-GUI has changed their MDP file options slightly.

Instead of specifying SYSTEM for tc_grps and comm_grps, the new MDP files specify SOLU, MEMB, and SOLV separately (and an index.ndx file is provided with these groups defined, which needs to be supplied to grompp via -n).

I don’t see why this would lead to such errors, but I’ll try testing with the old format.

Looks like that was it! Again, not sure why it would cause this issue but changing it back to SYSTEM has resolved it.

For membranes (and any layered system), the comm-grps should be set to those of the layers. If you have separate protein and lipid components, they can have their velocities reset differently such that they crash into each other over time due to COM motion removal. It’s the same reason why a globular protein should never use comm-grps = Protein Non-Protein. The appropriate setting for your case would be comm-grps = SOLU_MEMB SOLV. The different layers should be set to different groups because their intrinsic diffusion is different.
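In mdp terms, the suggested setting would be something like the fragment below (SOLU_MEMB must exist as a merged protein+membrane group in the index file, which can be created with gmx make_ndx):

```
; Remove centre-of-mass motion separately for the protein+membrane
; layer and the solvent layer
comm-mode = Linear
comm-grps = SOLU_MEMB SOLV
```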

Ah, I wonder why CHARMM-GUI originally used comm-grps = SYSTEM? That’s how I’ve run all of my protein-ligand-membrane simulations so far.

Comparing the volume and box size fluctuations with comm-grps = SOLU_MEMB SOLV, it does look marginally more stable than comm-grps = SYSTEM. I’m not looking at any diffusion-related properties, though, so I think it’d be best to complete the last runs with comm-grps = SYSTEM to avoid having to rerun the previous 2.4 μs, unless there are issues with that?

I expect any errors would be minor and perhaps not even noticeable. I’m digging into the issue with the developers as to why a change was made and why the optimal setting was never offered to users…
