Umbrella-sampling setup of molecule permeation through interface surfactant membrane

GROMACS version: 2022.4
GROMACS modification: No

Dear GROMACS users,

I am planning to use umbrella sampling to calculate the energy profile of a molecule’s permeation through an interface surfactant membrane. However, I’m uncertain about how to establish an appropriate pulling pathway to obtain a reliable energy profile.

This is my box (with water, surfactant-cyan, small molecule-green):

When I run the initial pulling simulation, I pulled the molecule COM upwards using

pull_coord1_type             = umbrella  
pull_coord1_geometry         = direction-periodic
pull_coord1_groups           = 0 1  
pull_coord1_dim              = N N Y
pull_coord1_vec		         = 0.0 0.0 1.0 

However, I observed that the molecule’s movement in the x/y dimensions was not constrained, resulting in a trajectory where the molecule wound its way upward. I believe this is not the only pathway for the molecule to exit the solution.

If I use pull_coord1_geometry = cylinder or restrict the x/y dimensions of the COM, this would confine the molecule to exit the solution from a specific region of the interface. Given that the surfactants do not align uniformly, I suspect that a single PMF simulation may not provide an accurate energy profile for the permeation process.

Any comments or suggestions are welcome.

In my experience with these systems, boxes with membranes tend to be treated as homogeneous along a plane (usually xy) and heterogeneous along an axis (usually z). In this way you can use z as a collective variable and sample some quantity with respect to it, like in your case the free energy would be with respect to the normal of the membrane (z, usually). Implicitly, this means that the position on xy along which you permeate is not relevant, that is, the free energy profile is independent of WHERE you go through in the bilayer, it just depends on the depth. As such, in my understanding it is to an extent mandatory to diffuse laterally as being independent from the position is clearly an approximation, and in principle you would want to sample decently the whole xy plane for a given z to have a good statistical description of that specific slab. Since lateral diffusion can be slow, you can actually combine several profiles with molecules at the same depth but different xy positions to sample better and faster.

The effect might be more pronounced especially for large inhomogeneous membranes, where you can have local distribution of lipids with different local permeabilities. Consider also that you are neglecting the inner freedom of the solute, as you are treating it as a point localized at a certain depth. The larger and the more flexible the solute, the worse the approximation becomes, as other solute-related degrees of freedom become relevant (e.g. orientation).

Ideally when running your umbrella sampling simulations, whilst restraining your molecules [z] movements, the molecule should diffuse freely in the x and y plane, you do not need to define the pathway through the membrane. Your umbrella sampling simulations should be long enough for your molecule to explore the [xy] plane at each sampled [z] level.

Depending on the density of your surfactant membrane, exploring the [xy] plane can be very slow and if your small molecule is large enough, your pulling simulation can cause disruption to your membrane that can take a long while to relax. You may want to look at an alternative way to place your molecule in the membrane at the different [z] levels, a series of short free energy perturbation simulations can be used to make your small molecule ‘appear’ in the membrane without the disruption that pulling may cause. If you find this to be the case, I’m happy to advise further.

Thanks for your kind reply!

That’s true, fully explore the xy plane will take a long time. I’m much interested in the alternative method you metioned that could make molecule appear in the membrane, could you tell me more about that? (like how to place the molecule at a specific region without cause any conflicts?)

Thanks for your helpful reply! Your comments have clarified my question.

I don’t quite understand this claim. In the simulation, the solute is not treated as a rigid object. Could you tell me why the inner freedom is neglected? Is this a result of using the COM as the collective variable?

Sorry I didn’t express properly what I mean. You are considering only the position of the center of mass of your solute. However, think about a complex solute, like a long flexible molecule that is polar (like a lipid, for example). Then, the orientation of the solute itself will be very relevant for the partition/permeation properties, and all these degrees of freedom should be properly explored if you want a well converged free energy profile. The issue is explored in literature, like here for example.

Got it, that’s clear! Thanks for sharing the valuable information😊 I’ll keep this in mind.

I don’t understand you argumentation for not using geometry=cylinder. I would say your arguments are exactly the reason to use geometry=cylinder. What matters is precisely where your molecule resides with respect to the local center of the membrane.

Thanks, Hess. I’m conducting this simulation to understand the global energy profile of a molecule passing through the surfactant membrane, rather than focusing on the energy profile of a specific pathway (is this achievable?). I’m uncertain whether using geometry=cylinder can effectively capture global information and provide a comprehensive understanding within just one simulation.

Yes, cylinder does that. A localized COM is required with a deformable membrane. Note that the cylinder is not fixed in space, it moves in the x-y plane along the with molecule.

That’s great! I used to think that the cylinder is fixed.

I’ll try this method right now. Thanks for your help!

Hi hess, I’m meeting some problems when trying to use geometry=cylinder. When I was generating tpr file, there was message showed:

Pull group    natoms      pbc atom     distance at start    reference at t=0
1               26          0
2             16800         0           -nan nm              -nan nm

The pull code I used was:

pull                        = yes
pull_ngroups                = 2
pull_ncoords                = 1
pull_group1_name            = LIM ;molecule
pull_group2_name            = SDS ;layer
pull-group1-pbcatom	        = -1
pull-group2-pbcatom 	    = -1
pull_coord1_type            = umbrella  
pull_coord1_geometry        = cylinder
pull_coord1_groups          = 1 2  
pull_coord1_dim             = N N Y
pull_coord1_vec	   	        = 0.0 0.0 1.0 
pull_coord1_rate            = 0.0001 ;nm/ps
pull_coord1_k               = 1000  
pull_coord1_start           = yes 

If I remove the definition of pull-group1-pbcatom and pull-group2-pbcatom, there would be a error occur:

ERROR 1 [file pull.mdp]:
  When the maximum distance from a pull group reference atom to other atoms
  in the group is larger than 0.5 times half the box size a centrally
  placed atom should be chosen as pbcatom. Pull group 2 is larger than that
  and does not have a specific atom selected as reference atom.

I also discovered statements in the mdp document that mentioned,

pull-group1-pbcatom: This parameter is not used with pull-coord1-geometry cylinder.

So I’m confused and don’t know how to fix the error. Could you please let me know how to set mdp parameters correctly to run the pull code?

p.s. I didn’t find much detailed tutorial about pull code (particularly with geometry=cylinder) online, I’ll appreciate if you could share any helpful resources.

The cylinder group should be the first group, so swap the two groups.

That works!! Thank you so much hess!

Hi hess, thanks for your previous guide on pull=cylinder, now I’m facing another problem:

If I’d like to calculate the free energy profile of such a process, e.g., a small molecule transiting across the membrane into and out of a liquid phase.

Here I defined small molecule as pull group2 and the bottom membrane as group1,

pull                    = yes
pull_ngroups            = 2
pull_ncoords            = 1

pull_group1_name        = BTN
pull_group2_name        = LIM 

pull_coord1_type        = umbrella  
pull_coord1_geometry    = cylinder
pull_coord1_groups      = 1 2  
pull_coord1_dim         = N N Y
pull_coord1_vec		    = 0.0 0.0 -1.0 
pull_coord1_rate        = 0.0001 ;nm/ps
pull_coord1_k           = 1000  
pull_coord1_start       = yes  

However, when the pull starts, the simulation collapsed and showed a lot of LINCS warning:

Step 405, time 0.81 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 1588.394653, max 132962.640625 (between atoms 18299 and 18300)
bonds that rotated more than 30 degrees:

In another case, if I put the small molecule close to the top membrane, the pull simulation could run normally.

I’ll appreciate if you can tell me why the first simulation does not work and how to set the parameters appropriately. Thank you.

Are the mdp options above all the pull options you provide? Does the simulation run normally when you start from the same state without pulling active?

Hi hess, thanks for your reply.

Are the mdp options above all the pull options you provide?

Yes, this is all the pull option I used. Here’s the mdp file

; Run control
integrator      = md
dt              = 0.002 
nsteps          = 40000000
nstcomm         = 100
comm-grps  	    = system

; Output control
nstxout-compressed           = 10000
compressed-x-grps            = system 
nstlog                       = 10000
nstenergy                    = 10000

; Neighbor searching
nstlist         = 100
rlist           = 1.0
pbc		        = xyz

; Elecrostatics & VdW
coulombtype     = PME
rcoulomb        = 1.0
vdwtype       	= cut-off
rvdw            = 1.0
DispCorr        = EnerPres


; Temperature coupling
Tcoupl          = V-rescale
tc_grps         = System
tau_t           = 0.2
ref_t           = 298.15


; Generate velocites
gen_vel         = yes
gen_temp        = 298.15
gen_seed        = -1


; Constrains of Bonds
constraints     = h-bonds
freezegrps  	= 
freezedim   	= 


; distance restraints for hydrated ions
disre           = simple
disre_fc        = 50000

; Pull
pull                    = yes
pull_ngroups            = 2
pull_ncoords            = 1
pull_group1_name        = BTN
pull_group2_name        = LIM 
pull_coord1_type        = umbrella  
pull_coord1_geometry    = cylinder
pull_coord1_groups      = 1 2  
pull_coord1_dim         = N N Y
pull_coord1_vec		    = 0.0 0.0 -1.0 
pull_coord1_rate        = 0.01 ;nm/ps
pull_coord1_k           = 1000  
pull_coord1_start       = yes 

Does the simulation run normally when you start from the same state without pulling active?

Yes, I removed the pull code and rerun the simulation, it ended successfully.

What confused me is, if I put the small molecule close to the top membrane, the pull simulation could run normally.

Then I have not clue what could be wrong. All your settings are very sensible. What molecule are the two atoms in the warning part of?

Hi hess, I checked the outputs. Verbose showed it was atoms from BTN group (pull group2, membrane layer) caused the error.

Step 386, time 0.772 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.000067, max 0.005776 (between atoms 14659 and 14661)
bonds that rotated more than 30 degrees:
 atom 1 atom 2  angle  previous, current, constraint length
   8598   8599   35.4    0.1090   0.1090      0.1090
  13176  13178   48.3    0.1090   0.1091      0.1090
  13186  13189   30.3    0.1090   0.1090      0.1090
  13193  13194   38.6    0.1090   0.1090      0.1090
  13209  13211   31.8    0.1090   0.1090      0.1090
  14656  14658   32.5    0.1090   0.1090      0.1090
  14659  14660   57.0    0.1090   0.1087      0.1090
  18290  18292   30.1    0.1090   0.1090      0.1090

You mean group1, I suppose.

The only suggestion I have it to include both membranes in pull group 1 and maybe also the liquid. That should improve your results. You may be lucky in that that also solves your crash.