Using -membed option in gromacs

GROMACS version: 2020.7
GROMACS modification: No

Hi,

I want to embed a protein in an equilibrated membrane model. I cannot use CHARMM-GUI for this purpose, as I need the membrane to be completely equilibrated before embedding the protein inside.

One option is to use gmx mdrun -membed from older versions of GROMACS. But I cannot find any tutorial or manual about this process. I would appreciate it if you could help me find a manual or explain how to use this method. I am not even sure how exactly one can derive .dat file that is needed. Alternatively, I would appreciate it if there is another method for embedding the protein inside a membrane that might be in the newer versions of GROMACS.

Bests,
Maryam

Hello, Maryam!

I’m currently working on a similar problem. In my case, I was trying to use CHARMM-GUI (lots of love & gratitude to the Im lab and all contributors!) to generate a membrane containing customized glycolipids with an inserted protein. Unfortunately, even with the many options they offer, I can’t get past the lipid packing/ion adding steps.
You’re absolutely right, it’s very hard to find information on how to use gmx mdrun -membed. In my opinion CHARMM-GUI is just so awesome that it essentially has the monopoly on this kind of tasks and it’s hard to find alternative approaches for the few cases that it can’t handle straight away.

So I’m currently gathering info and tinkering on a hybrid approach to embed a protein into a membrane using gmx mdrun -membed from gromacs 2019 with CHARMM-GUI generated membrane-only files (after equilibration) and the protein .pdb as input. Please note that the -membed feature is currently non-functional in gromacs versions >2019, so for this you cannot use 2020 or newer at this point of time (July 2025).
My current workflow is still very messy, has lots of manual editing steps and is pretty specific for my case, but I am working on posting a refined version here in the next days. Just wanted to let you know that I will get back to this.

Greetings :)

Hi,
the CHARMM-GUI is awesome!
However, I found the following tool quite useful to setup membranes containing non-standard lipids (i.e., lipids with custom parameters not provided by the CHARMM-GUI): PACKMOL-Memgen (DOI: https://doi.org/10.1021/acs.jcim.9b00269).
There is also a tutorial available on the Amber-Website (Membrane System Setup with PACKMOL-Memgen).

More information about -membed can be probably found in the original publication (DOI: https://doi.org/10.1002/jcc.21507).

Hope that could help you a little bit.

Best,
Marius

Hello, people!

Thanks for pointing to Packmol, Marius. I’m a little bit upset I didn’t find this before I started digging into membed :O

But well, who knows what this is good for…


The following is based on:
Yesylevskyy, 2007 publication on ProtSqueeze: https://doi.org/10.1021/ci600553y
Wolf et al., 2009 publication on g_membed: https://doi.org/10.1002/jcc.21507
Aponte-Santamaría’s Tutorial: biological membranes
and the gromacs forum, e.g. Problem with energy exclusion groups using gmx mdrun -membed

One interesting point of using gmx mdrun -membed is that it works on an existing patch of membrane and doesn’t assemble a new one. If this is an advantage or disadvantage depends on what you need :)

gmx mdrun -membed workflow

(atomistic, using CHARMM36m forcefield)

I’m heavily relying on gromacs specific files from CHARMM-GUI to set up my insertion into an already equilibrated membrane:

membrane without protein after the equilibration run:

  • step6.6_equilibration.gro
  • topol.top
  • index.ndx
  • whole toppar/ folder
  • step6.0_minimization.mdp (any plausible .mdp should do, it’s only for adding ions with gmx genion, see below)

protein:

  • protein.pdb
  • optional: protein topology data (.top / .itp) e.g. PROA.itp from some other CHARMM-GUI setup which includes the very same protein and forcefield
    In my case I couldn’t just use the topology files created by CHARMM-GUI as they were, so I had to create a protein topology from the .pdb (see below), but I still wanted the PROA.itp file for the position restraints.
    For making a new topology, I also needed the charmm36m forcefield ported for gromacs (put folder charmm36.ff/ into your working directory) which I obtained from the MacKerell Lab website: MacKerell Lab)

It might not be necessary for you to create a new protein topology if you already have one.

membed specific files:
(modified from Aponte-Santamaría’s Tutorial)

I included some explanations in membed.dat.

Many entries in the .mdp are probably not needed, but I know too little to dare remove them :D

Put it all into a new directory.


For less experienced users like me - A bit more basic info, which I only understood correctly when figuring out this workflow

  1. A .gro file is a list of coordinates holding the position of all atoms in the system at a certain point in time.

  2. .top and .itp are topology files that contain info to:

  • define (as lists) the properties of all kinds of atoms, bonds, angles, etc. used in the system in the sections [ ...types ]. This info originally comes from the forcefield.
  • then define for each molecule [ moleculetype ] which atoms, bonds, angles, etc. to actually use [ atoms ], [ bonds ],… in which order. All entries here look up the respective info in the [ ...types ] defined before.
  • then define the number of each molecule in [ molecules ] in the [ system ] in an order. All entries here look up their info in the [ moleculetypes ]

.top is the main topology file, which usually imports info from many “include topology” .itp files via the #include "xyz.itp" statement. Those can in turn also include files. When loading a topology, the whole include hierarchy is read.

  1. The gro. file holds no information which atoms belong to which molecule, how it is connected, angles, etc. (but .pdb files might). This information comes from the definitions in the topology. The entries from .gro and .top are matched by the order dictated by the topology. Therefore, the order of entries in the .gro file and the topology must match: If [ molecules ] ends with a certain protein, then that protein atom’s coordinates must be last in the .gro file.

Steps:

Installs:

I recommend using gromologist, a nice python package for manipulating gromacs structure and topology files by Miłosz Wieczór (KBM / Gromologist · GitLab). It’s optional but makes life easier. Make sure Python is installed and install gromologist:

pip install gromologist

Install gromacs 2019, the last version where -membed was functional. I used conda:
conda create -n gromacs_2019 -c conda-forge gromacs=2019.6=nompi_h5f56185_100

Let’s hope this version doesn’t disappear from the conda channels any time soon :P

Generate from the protein structure the .gro and topology files:
gmx pdb2gmx -f <yourprotein>.pdb -o prot.gro -p prot.top

  • type to select a forcefield: e.g. for me 1 for the CHARMM36 ff in the working directory
  • type to select a water model: e.g. for me 1 for CHARMM36 modified TIP3
  • this generates a .gro coordinate file, a main topology file prot.top and several topology include *.itp files

It might not be necessary to convert the structure to .gro. A .pdb file should also work fine, but I didn’t test it yet. pdb2gmx can also output .pdb files, if you just need the topology.

handle macro/flag style protein position restraints:
CHARMM-GUI uses macro flags such as POSRES_FC_BB in their .itp files to flexibly set the position restraint values for protein backbone and sidechains in each step of the equilibration (actual values are assigned in the mdrun via the .mdp files). I wanted to keep this feature, so I

  • copypasted the position restraint block…
#ifdef POSRES
[ position_restraints ]
...
#endif

…from the PROA.itp file of my protein (created using CHARMM-GUI) to a new file prot_posres.itp.

I used the position restraint block from PROA.itp from some other simulation I created with CHARMM-GUI where I used the exact same protein sequence and forcefield. Of course you can also try using the position restraints generated by pdb2gmx or whatever you like.
To #include the new prot_posres.itp file into the final merged topology, see below.

  • If you use CHARMM-GUI position restraints like me, also delete the posre...itp files created by pdb2gmx
  • and delete…
; Include Position restraint file
#ifdef POSRES
#include "posre_Protein_chain_A.itp"
#endif

… from the end of the prot_Protein_chain_A, B, ... .itp files that were created by pdb2gmx.

orient protein and position at membrane’s center of mass coordinates
I wanted to align the protein principal axes to global coordinate axes (the .pdb file had some awkward rotation) and then place it in the center of the membrane.

  • get the center of mass coordinates of the membrane:
    gmx traj -f step6.6_equilibration.gro -s step6.6_equilibration.gro -ox -com -n
    type to select: MEMB

  • find the COM coordinates in the last line of the new coord.xvg output file (ignore the first value) e.g.:
    0 6.38741 6.53658 5.56973

  • center the protein to these coordinates and align it’s principal axes to global coordinate axes:
    gmx editconf -f prot.gro -o prot_c.gro -center 6.38741 6.53658 5.56973 -princ
    type to select: System

  • if you copy+paste the COM coordinates from coord.xvg be aware that there’s a funny separator between the values (tab?). Replace with a whitespace.

merge protein and membrane .gro files (structure):
These steps can also be done by hand, but that’s a bit error prone. So I was happy to find gromologist.

Merge the structure files:

In python (I used the interactive python shell)

import gromologist as gml
#first, load the membrane .gro
p = gml.Pdb("step6.6_equilibration.gro")
#then add the oriented and centered prot_c.gro
p.insert_from("prot_c.gro")
p.save_gro("combined.gro")

Clean the protein topology:

(python)

ptop = gml.Top("prot.top")
#removes unused entries
ptop.clear_sections()
#might take a while:
ptop.clear_ff_params()
#makes the file self contained, i.e. explicitly adds the info from all #include statements
ptop.explicit_defines()
ptop.add_ff_params()
ptop.save_top("prot_clean.top")

Merge the topologies:

(python)

#same order as with the .gro files! load first the MEMBRANE topology...
#ignore_missing_defines to allow macro/flag style position restraints
t = gml.Top("topol.top",ignore_missing_defines=True)
#...then add the protein topology
t.add_parameters_from_file("prot_clean.top")
t.add_molecule_from_file("prot_clean.top")
#maybe you will need this?
#t.find_missing_ff_params()

#add all the proteins you need, in my case it's a homotrimer of chain A
t.add_molecules_to_system(molname="Protein_chain_A",nmol=3)
#saving here overwrites the original topol.top
t.save_top("topol.top")

Prepare for membed:
membed requires certain index groups that correspond to the entries in membed.mdp.

Make a new index:

gmx make_ndx -f combined.gro

enter the correct group numbers to copy the following groups and rename them accordingly:

  • copy protein → rename new group to “SOLU”, e.g.
11
name 22 SOLU
  • copy solvent (TIP3?) and solvated ions → rename new group to “SOLV”
    use a pipe “|” to combine groups, e.g.
10|8|9
name 23 SOLV
  • copy all lipids/lipid constituents of the membrane → rename new group to “MEMB”, e.g.
2|3|4|5|6|7
name 24 MEMB
  • finish by entering q

gmx mdrun -membed:

This method works by first shrinking the protein in the XY dimensions (to 0.1*original size in my case). Z can also be shrunken, but that’s not usually recommended. Then, all molecules overlapping this shrunken molecule will be removed. Secondly, a short MD run is carried out where the protein is gradually inflated until it reaches it’s original size, pushing all colliding molecules aside.

To run it:

#fire up gromacs 2019
conda activate gromacs_2019
#two warnings are to be expected, hence the -maxwarn 2
gmx grompp -f membed.mdp -p topol.top -c combined.gro -n index.ndx -o membed.tpr -maxwarn 2
gmx mdrun -membed membed.dat -s membed.tpr -mn index.ndx -v -c membed_out.gro -x membed_out.xtc -e membed_out.edr -g membed_out.log

Type to select SOLU, then select MEMB

  • -membed is implemented to run on just a single core, no parallelization, no GPU. So it will take some time (> 10 min for me @ ~5 GHz) for the default 1000 steps, even on mighty machines.

  • Take care that membed states in the console how many of each molecule it is removing in the first step. It’s best to copy that info into a new file. membed produces an updated .gro file, but does not update the topology so you have to do it by hand.

  • Therefore, back up your topol.top file and then open the original topol.top and in the very end, in the [ molecules ] section carefully subtract for each molecule the number that membed removed e.g.

[ molecules ]
; Compound	#mols
GLPD 252   ;I subtracted 13
GLPB 180   ;I subtracted 7
GLPC 36    ;I subtracted 1
...
Protein_chain_A 3
  • I really recommend visualizing membed_out.gro and membed_out.xtc in vmd to check for unwanted effects and simply because the process looks funny.
  • You might have to experiment a bit with the initial size of the protein (in membed.dat) and the size of the membrane you supply. The larger the protein starts out, the more lipids will be more removed, resulting in a bit of a loss of area. If you start with a factor of 0.1 like me, the total area of your simulation might actually increase a bit.
  • My protein complex is larger in the upper leaflet than it is in the lower leaflet, so in the membed.dat I usually use an ndiff of 30 to remove 30 lipids more in the upper leaflet. In the membed.dat provided here I set ndiff=0 though.

Afterwards, you should work on with a newer version of gromacs e.g. by exiting the shell and opening a new session.

add ions to neutralize net charge:

A new protein usually introduces a net charge which should be neutralized.

  • Make a new index file

gmx make_ndx -f membed_out.gro
You don’t have to make any new groups here, just enter q to save the updated index.

  • Then generate a .tpr. I think you can use pretty much any plausible .mdp file here. I used one from a CHARMM-GUI setup. We’re not running a simulation with this.

gmx grompp -f step6.0_minimization.mdp -p topol.top -c membed_out.gro -r membed_out.gro -o membed_temp.tpr -maxwarn 1

  • Now add ions. Replace CLA and SOD with the names of your negative and positive ions you want to add. They have to already be defined in the topology of course.

gmx genion -s membed_temp.tpr -neutral -nname CLA -pname SOD -p topol.top -o step5_input.gro -n index.ndx
Type to select the group of solvent molecules (TIP3 in my case)

Note: gmx genion DOES update the topology, no worries :)

I chose step5_input.gro as output name because I wanted compatibility with the default CHARMM-GUI MDS run script.

Now make the final index:

New ions have been added so
gmx make_ndx -f step5_input.gro

If you want to use CHARMM-GUI .mdp files for minimizing, equilibrating and running you have to create groups as we did before, e.g. for me it was:

  • copy protein → rename new group to “SOLU”
11
name 22 SOLU
  • copy solvent (TIP3?) and solvated ions → rename new group to “SOLV”
    use a pipe “|” to combine groups, e.g.
10|8|9
name 23 SOLV
  • copy all lipids/lipid constituents of the membrane → rename new group to “MEMB”, e.g.
2|3|4|5|6|7
name 24 MEMB
  • Lastly, make a group that combines SOLU and MEMB, e.g.

22|24

  • finish by entering q

include macro/flag style position restraints:

Finally, as I wanted to use position restraints macro/flags for my protein, I had to #include the prot_posres.itp created above into the merged topol.top topology. For this, use a text editor to open topol.top and Ctrl+F search for the [ moleculetype ] definition(s) of the protein(s) (e.g. Protein_Chain_A, _B, …). Here find the end of the section, e.g. by searching for the next [ moleculetype ] or [ system ] and add as the last line for each of your proteins:

#include "prot_posres.itp"

Should look something like this:

...
[ moleculetype ]
  ; Name            nrexcl
Protein_chain_A 3
...
[ cmap ]
...
 4080  4082  4084  4090  4092     1
 4090  4092  4094  4097  4099     1
  ; Include water topology

#include "prot_posres.itp"

[ system ]
  ; Name
Title

[ molecules ]
  ; Compound	#mols
GLPD 246 ;252
...

For me, the last entry in every protein’s [ moleculetype ] section was [ cmap ] so I added the #include statement after each [ cmap ] section.

When using gromologist, this can’t be done at an earlier stage, because then it would try to read prot_posres.itp but it can’t handle the flags (yet).


And that’s it!
Get/Create .mdp files to minimize and equilibrate your setup (e.g. CHARMM-GUI mdp’s and script) and run your simulation.

I’m also trying to get it to work with coarse grained simulations using the Martini22 forcefield. I will report on this later.

Good luck!