Assessing the stability of a mutant protein withGROMACS

GROMACS version: 2021.4-Ubuntu-2021.4-2
GROMACS modification: No

Hi,

So I am trying to assess the stability of a mutant protein with respect to its native conformation, in essence obtain a ddg value as well as possibly a break down of the components that contribute to the change in stability.
I have gone through the free energy of solvation tutorial, the lysozome tutorial, as well as the umbrella sampling tutorial. A superior recommended I use umbrella sampling but I don’t understand how the pulling concept would be applied to my single chain protein.

It might be asking for a lot but if someone who perhaps has experience in obtaining ddg values from gromacs or have done similar simulations could assist in directing me to a tutorial or protocol that may provide me with some guidance.

I did want to make use of thermodynamic integration for the MD simulations, however I was unable to find a tutorial to guide this process.

Indeed, alchemical simulations (TI) are a better option than US for estimating the effects of mutation on a relative free energy between conformations.

You’ll have to have a model for the reference (here: unfolded) state, and it can be just a tiny peptide with the same transition, but you need to close the thermodynamic cycle to have a meaningful estimate.

Here’s a great paper that explains the key concept and choices: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8388617/

The PMX webserver will help you create alchemical amino acids: http://pmx.mpibpc.mpg.de/

In Fig 8 of that paper you will find the most popular strategies for running alchemical simulations. Depending on your choice, you will find different tools and tutorials; I should soon have a tutorial for case (D), I have been postponing it for a while, but there are excellent ones for a similar case (solvation free energy):

http://www.mdtutorials.com/gmx/free_energy/index.html
https://tutorials.gromacs.org/free-energy-of-solvation.html

Hi,

So I have made use of PMX for the unfolded state and used their tripeptide database for the CHARMM36 ff. However when I attempt the command pdb2gmx on the tripeptide:

gmx_mpi pdb2gmx -f tripep_V2C.pdb -o conf.pdb -p topol.top -ff charmm36m-mut -water tip3p

I get the following error:
:-) GROMACS - gmx pdb2gmx, 2023.2 (-:

Executable: /usr/local/gromacs/bin/gmx_mpi
Data prefix: /usr/local/gromacs
Working dir: /home/tammara/Documents/gmx_test
Command line:
gmx_mpi pdb2gmx -f tripep_V2C.pdb -o conf.pdb -p topol.top -ff charmm36m-mut -water tip3p

Using the Charmm36m-mut force field in directory /usr/local/lib/python3.10/dist-packages/pmx/data/mutff/charmm36m-mut.ff

going to rename /usr/local/lib/python3.10/dist-packages/pmx/data/mutff/charmm36m-mut.ff/merged.r2b
Opening force field file /usr/local/lib/python3.10/dist-packages/pmx/data/mutff/charmm36m-mut.ff/merged.r2b
Reading tripep_V2C.pdb…
Read ‘’, 42 atoms

Analyzing pdb file
Splitting chemical chains based on TER records or chain id changing.

There are 1 chains and 0 blocks of water and 5 residues with 42 atoms

chain #res #atoms

1 ‘X’ 5 42

All occupancies are one
All occupancies are one
Opening force field file /usr/local/lib/python3.10/dist-packages/pmx/data/mutff/charmm36m-mut.ff/atomtypes.atp

Reading residue database… (Charmm36m-mut)
Opening force field file /usr/local/lib/python3.10/dist-packages/pmx/data/mutff/charmm36m-mut.ff/merged.rtp
Opening force field file /usr/local/lib/python3.10/dist-packages/pmx/data/mutff/charmm36m-mut.ff/mutres.rtp
Opening force field file /usr/local/lib/python3.10/dist-packages/pmx/data/mutff/charmm36m-mut.ff/mutres_dna.rtp
Opening force field file /usr/local/lib/python3.10/dist-packages/pmx/data/mutff/charmm36m-mut.ff/merged.hdb
Opening force field file /usr/local/lib/python3.10/dist-packages/pmx/data/mutff/charmm36m-mut.ff/merged.n.tdb
Opening force field file /usr/local/lib/python3.10/dist-packages/pmx/data/mutff/charmm36m-mut.ff/merged.c.tdb

Back Off! I just backed up topol.top to ./#topol.top.8#

Processing chain 1 ‘X’ (42 atoms, 5 residues)

Identified residue ACE1 as a starting terminus.

Identified residue CT35 as a ending terminus.
8 out of 8 lines of specbond.dat converted successfully
Start terminus ACE-1: NH3+
End terminus CT3-5: COO-


Program: gmx pdb2gmx, version 2023.2
Source file: src/gromacs/gmxpreprocess/pdb2top.cpp (line 1115)

Fatal error:
atom N not found in buiding block 1ACE while combining tdb and rtp

For more information and tips for troubleshooting, please check the GROMACS
website at Common Errors — GROMACS webpage https://www.gromacs.org documentation


MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.

To that effect I realised I had to use the -ter option to specify the capped terminal ends, and so I added the option and selected none for both ends, however I get the following:

gmx_mpi pdb2gmx -f tripep_V2C.pdb -o conf.pdb -p topol.top -ff charmm36m-mut -water tip3p -ter
:-) GROMACS - gmx pdb2gmx, 2023.2 (-:

Executable: /usr/local/gromacs/bin/gmx_mpi
Data prefix: /usr/local/gromacs
Working dir: /home/tammara/Documents/gmx_test
Command line:
gmx_mpi pdb2gmx -f tripep_V2C.pdb -o conf.pdb -p topol.top -ff charmm36m-mut -water tip3p -ter

Using the Charmm36m-mut force field in directory /usr/local/lib/python3.10/dist-packages/pmx/data/mutff/charmm36m-mut.ff

going to rename /usr/local/lib/python3.10/dist-packages/pmx/data/mutff/charmm36m-mut.ff/merged.r2b
Opening force field file /usr/local/lib/python3.10/dist-packages/pmx/data/mutff/charmm36m-mut.ff/merged.r2b
Reading tripep_V2C.pdb…
Read ‘’, 42 atoms

Analyzing pdb file
Splitting chemical chains based on TER records or chain id changing.

There are 1 chains and 0 blocks of water and 5 residues with 42 atoms

chain #res #atoms

1 ‘X’ 5 42

All occupancies are one
All occupancies are one
Opening force field file /usr/local/lib/python3.10/dist-packages/pmx/data/mutff/charmm36m-mut.ff/atomtypes.atp

Reading residue database… (Charmm36m-mut)
Opening force field file /usr/local/lib/python3.10/dist-packages/pmx/data/mutff/charmm36m-mut.ff/merged.rtp
Opening force field file /usr/local/lib/python3.10/dist-packages/pmx/data/mutff/charmm36m-mut.ff/mutres.rtp
Opening force field file /usr/local/lib/python3.10/dist-packages/pmx/data/mutff/charmm36m-mut.ff/mutres_dna.rtp
Opening force field file /usr/local/lib/python3.10/dist-packages/pmx/data/mutff/charmm36m-mut.ff/merged.hdb
Opening force field file /usr/local/lib/python3.10/dist-packages/pmx/data/mutff/charmm36m-mut.ff/merged.n.tdb
Opening force field file /usr/local/lib/python3.10/dist-packages/pmx/data/mutff/charmm36m-mut.ff/merged.c.tdb

Back Off! I just backed up topol.top to ./#topol.top.9#

Processing chain 1 ‘X’ (42 atoms, 5 residues)

Identified residue ACE1 as a starting terminus.

Identified residue CT35 as a ending terminus.
8 out of 8 lines of specbond.dat converted successfully
Select start terminus type for ACE-1
0: NH3+
1: NH2
2: 5TER
3: None
3
Start terminus ACE-1: None
Select end terminus type for CT3-5
0: COO-
1: COOH
2: CT2
3: 3TER
4: None
4
End terminus CT3-5: None
Opening force field file /usr/local/lib/python3.10/dist-packages/pmx/data/mutff/charmm36m-mut.ff/merged.arn
Opening force field file /usr/local/lib/python3.10/dist-packages/pmx/data/mutff/charmm36m-mut.ff/mutres.arn

Checking for duplicate atoms…

Generating any missing hydrogen atoms and/or adding termini.

Now there are 5 residues with 42 atoms

Making bonds…

atom HV is missing in residue V2C 3 in the pdb file

atom HV is missing in residue V2C 3 in the pdb file

You might need to add atom HV to the hydrogen database of building block V2C in the file mutres.hdb (see the manual)

You might need to add atom HV to the hydrogen database of building block V2C in the file mutres.hdb (see the manual)


Program: gmx pdb2gmx, version 2023.2
Source file: src/gromacs/gmxpreprocess/pdb2top.cpp (line 1567)

Fatal error:
There were 1 missing atoms in molecule Protein_chain_X, if you want to use
this incomplete topology anyhow, use the option -missing

For more information and tips for troubleshooting, please check the GROMACS
website at Common Errors — GROMACS webpage https://www.gromacs.org documentation


MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.

Arguably working with the local version of pmx is a bit more tricky - in principle, the webserver should give you working topologies so that you don’t have to go through pdb2gmx in order to obtain them. What you’re getting are pretty common errors that come from missing hydrogen database entries for alchemical residues, or automatic selection of terminal patches.

Hi,

So I have gotten pretty far with these free energy simulations however something strange is happening when I get to the non-equilibrium part of the MD. When I try to extract 50 snapshots from the 5ns equilibrium simulation at every 100ps starting from 100ps, I only generate 47 frames instead of 50. I have gone through my MD parameters but I don’t see anything that particularly stands out. Below is the command and the output, along with the output from gmx check on the .trr file. Attached is the mdp files used for equil and non-equil simulations.
f_equil.mdp (4.0 KB)

The command and out:
Command line:
gmx_mpi trjconv -f equil.trr -s equil.tpr -sep -b 100 -o frame_.gro

Will write gro: Coordinate file in Gromos-87 format
Reading file equil.tpr, VERSION 2023.2 (single precision)
Reading file equil.tpr, VERSION 2023.2 (single precision)
Group 0 ( System) has 3003 elements
Group 1 ( Protein) has 45 elements
Group 2 ( Protein-H) has 21 elements
Group 3 ( C-alpha) has 3 elements
Group 4 ( Backbone) has 11 elements
Group 5 ( MainChain) has 15 elements
Group 6 ( MainChain+Cb) has 16 elements
Group 7 ( MainChain+H) has 19 elements
Group 8 ( SideChain) has 26 elements
Group 9 ( SideChain-H) has 6 elements
Group 10 ( Prot-Masses) has 45 elements
Group 11 ( non-Protein) has 2958 elements
Group 12 ( Water) has 2958 elements
Group 13 ( SOL) has 2958 elements
Group 14 ( non-Water) has 45 elements
Select a group: trr version: GMX_trn_file (single precision)

Skipping frame 0 time 0.000
Skipping frame 1 time 40.000
Skipping frame 2 time 80.000
Reading frame 3 time 120.000
Reading frame 4 time 160.000 → frame 0 time 120.000

Reading frame 5 time 200.000 → frame 1 time 160.000

Reading frame 6 time 240.000 → frame 2 time 200.000

Reading frame 7 time 280.000 → frame 3 time 240.000

Reading frame 8 time 320.000 → frame 4 time 280.000

Reading frame 9 time 360.000 → frame 5 time 320.000

Reading frame 10 time 400.000 → frame 6 time 360.000

Reading frame 11 time 440.000 → frame 7 time 400.000

Reading frame 12 time 480.000 → frame 8 time 440.000

Reading frame 13 time 520.000 → frame 9 time 480.000

Reading frame 14 time 560.000 → frame 10 time 520.000

Reading frame 15 time 600.000 → frame 11 time 560.000

Reading frame 16 time 640.000 → frame 12 time 600.000

Reading frame 17 time 680.000 → frame 13 time 640.000

Reading frame 18 time 720.000 → frame 14 time 680.000

Reading frame 19 time 760.000 → frame 15 time 720.000

Reading frame 20 time 800.000 → frame 16 time 760.000

Reading frame 30 time 1200.000 → frame 26 time 1160.000

Reading frame 40 time 1600.000 → frame 36 time 1560.000

Reading frame 50 time 2000.000 → frame 46 time 1960.000

Last frame 50 time 2000.000
→ frame 47 time 2000.000
Last written: frame 47 time 2000.000

The trr file check:
gmx_mpi check -f equil.trr

                  :-) GROMACS - gmx check, 2023.2 (-:

Executable: /usr/local/gromacs/bin/gmx_mpi
Data prefix: /usr/local/gromacs
Working dir: /home/user/Documents/test_nonequil_process
Command line:
gmx_mpi check -f equil.trr

Checking file equil.trr
trr version: GMX_trn_file (single precision)
Reading frame 0 time 0.000

Atoms 3003

Last frame 50 time 2000.000

Item #frames Timestep (ps)
Step 51 40
Time 51 40
Lambda 51 40
Coords 51 40
Velocities 51 40
Forces 51 40
Box 51 40

GROMACS reminds you: “Make the Floor Burn” (2 Unlimited)

f_nonequil.mdp (4.2 KB)

You say “When I try to extract 50 snapshots from the 5ns equilibrium simulation at every 100 ps starting from 100 ps, I only generate 47 frames instead of 50.” From the output you posted it seems like the simulation is 2 ns and the trajectory is written every 40ps. As you can also see from the output, starting from 100ps means that you discard the frames at 0, 40, 80 ps. So, what you do get is 48 frames (0 to 47), instead of 51 (0 to 50). If you expect the simulation to write every 100 ps and run for 5 ns, you need to check your mdp file and also that it matches the tpr file.

Edit: Checking your f_equil.mdp it can be seen that the source of the problem is that dt = 0.0008. It seems like you assume that it is 0.002.

You are absolutely right, thank you for pointing that out.