Mutating frames from a trajectory and then running a new simulation with the same topology file?

GROMACS version: 2018 GPU
GROMACS modification: No

Hi,

I am using enhanced sampling methods to probe the effects of point mutations in activation energies for conformational changes in enzymes. One thing that I am now trying to do to improve sampling is taking representative states from a Markov State Model of wild-type trajectories from conformation A to B, making the amino acid substitution in these states, and then seeding a new simulation to observe if probabilities (calculated with the weighted ensemble method) equilibrate to their expected values for a control mutant.

However, I am running into a simulation setup problem when I try to do so. The reason is: I extract frames from the MSM (that was built with ~10 us aggregate simulation time from an enhanced sampling simulation), then I make the mutation on PyMol, and when I build and equilibrate the systems, they all have different topologies due to a different number of water molecules (the conformational change changes the volume of the solute sightly).

Since they all have different topologies (although the protein is the same), I can’t feed these ~250 starting structures to my enhanced sampling method.

I was wondering if there is any way to “graft” a water box from a previous simulation onto a new solute (since I’m just changing one residue), or specify the number of water molecules I want in the box, or if theres’ a way to directly make the mutation in the .gro and .top files so I don’t have to re-make the box.

I realize that there might be good reasons why that would not be smart in many cases, but I was hoping someone faced a similar issue before and was able to figure out a solution that they would like to share.

Here’s my script for building the system (further steps ommited for clarity)

gmx pdb2gmx -f wt_$SLURM_ARRAY_TASK_ID.pdb -o processed_$SLURM_ARRAY_TASK_ID.gro -p processed_$SLURM_ARRAY_TASK_ID.top -i processed_$SLURM_ARRAY_TASK_ID.itp -ignh -ff amber99sb-ildn -water tip3p

gmx editconf -f processed_$SLURM_ARRAY_TASK_ID.gro -o newbox_$SLURM_ARRAY_TASK_ID.gro -bt dodecahedron -d 1.0

gmx solvate -cp newbox_$SLURM_ARRAY_TASK_ID.gro -p processed_$SLURM_ARRAY_TASK_ID.top -o solv_$SLURM_ARRAY_TASK_ID.gro

Thanks!

I think I might have found a solution by defining the dimensions for a box large enough to solvate the conformations with most volume, then running the -maxsol argument in gmx solvate to make sure all boxes have the same amount of solvent molecules.