Using AWH for bilayer permeability

GROMACS version: 2023
GROMACS modification: No

Hi!

I’m attempting to pull a ribose molecule through a lipid bilayer using AWH and pull-code. I’ve run this system for 600ns without seeing any sampling of the bilayer core region. Before committing a longer run time for this experiment, I want to check that the mdp-options are reasonable.

My initial setup has a ribose molecule about 5nm above the bilayer along the Z axis (COM at Z=11nm). I want to pull it through the bilayer and 5nm beyond the far lipid leaflet (to COM at Z=1nm). Here are my pull-related mdp options:

pull = yes
pull-ncoords = 1
pull-ngroups = 2
pull-group1-name = MEMB
pull-group2-name = LIG
pull-nstxout = 10
pull-nstfout = 10
pull-coord1-groups = 1 2
pull-pbc-ref-prev-step-com = yes
pull-group1-pbcatom = 9263 ; a POPC tail atom near the bilayer core region
pull-coord1-type = external-potential ; use AWH potential
pull-coord1-potential-provider = awh
pull-coord1-geometry = direction ; pull the LIG in the direction of the vector
pull-coord1-vec = 0 0 1 ; axis is in +ve Z direction
pull-coord1-dim = N N Y
pull-coord1-rate = 0
pull-coord1-start = yes
pull-coord1-k = 1000

awh = yes
awh-nstout = 500000
awh-potential = convolved
awh-nbias = 1
awh-nstsample = 100
awh-nsamples-update = 10
awh1-growth = exp-linear
awh1-target = constant
awh1-user-data = no

awh1-ndim = 1
awh1-dim1-coord-provider = pull
awh1-dim1-coord-index = 1
awh1-dim1-start = -5.588
awh1-dim1-end = 5.588
awh1-dim1-force-constant = 25000
awh1-dim1-diffusion = 5e-5
awh1-error-init = 5

The coordinate distribution from gmx awh -more shows I’m not yet sampling the bilayer core region:

Screenshot from 2023-07-02 10-06-37

Are these mdp settings incorrect for what I’m trying to do here?

Thanks!
TYD

You are using too low diffusion coefficient and target error, making your initial update size small. The unit for the AWH diffusion is nm^2/ps. Try setting it to 1e-3. I’d also recommend leaving error-init at the default 10 or even raising it to, e.g. 40. At least you should cross the bilayer that way. If you get very high free energy barriers in your output (and non-converging results) you might want to reduce the initial update size.

When you’re pulling through a bilayer, I’d also consider setting the awh range (-start and -end) large enough to get a periodic dimension. But there might of course be cases where you don’t want that.

Thanks for your advice, I’ll try this with a larger initial update size and error-init value, and see if the sampling crosses the bilayer. I’ll post an update here for comparison of the sampling,

Thanks!
TYD

I now see a total of 6 transitions across the bilayer within 1us, and the coordinate distribution reflects this:

However, this sampling doesn’t look great around the bilayer region. Following the tutorial, I see this simulation hasn’t exited the initial stage, so I’ll try a multi-walker simulation for 1us and see if that improves the sampling.

Thanks!
TYD

Thanks for the update.

At least it looks a lot better now. One reason why your coordinate distribution is so skewed in the bilayer region might be that the diffusion is a lot slower there (higher friction). If you have time and resources it might be a good idea to:

  1. Compare the current results with simulations with 1/4 of the awh diffusion coefficient, as well as with 4 times higher diffusion coefficient.
  2. Start a simulation with a higher (try 10 times higher) target distribution in the bilayer region, using an AWH input file and the corresponding mdp option.

Cheers!

Thanks for the suggestions!

Just to be clear about mdp options for point (2), I’m not certain how to start with a higher target distribution in that specific region. Would you suggest using awh1-target = cutoff or boltzmann to enhance the sampling around the bilayer region?

You can still use awh1-target = constant, but set awh1-user-data = yes. Then you supply an awh input file. The input file has the same format as the output from awh analyses. You can then specify an input PMF and/or a custom target distribution.

Thanks for the clarification! I’ll pursue these options and update here with the sampling comparison.

Here’s an update comparing the sampling of the options discussed above:

All these were run for 1us. The coordinate distribution seems closest to flat when using a diffusion coefficient of 0.25e-3, and the PMF profiles of both the multi-walker and the 10x target distribution simulations seem further from convergence. Only the multi-walker run exited the initial stage (@ 342ns).

I’ll run these out to 2us and see if the PMF profiles converge. Are there any other settings I should explore to improve the sampling?

Indeed, I would think that a diffusion coefficient of 2.5e-4 is reasonable for your system. But you might want to try 5e-4.

It still seems like your reaction coordinate is not periodic - the difference is so large between the end points (~-5.6 and 5.6). How large is your system in the Z direction? I’d recommend setting your AWH reaction coordinate range to cover the whole system. I’m still surprised by the huge difference in the PMF of the water phase with 4 walkers. Do the walkers all start from the same “side” of the lipid bilayer?

I would actually recommend using awh1-equilibrate-histogram = yes. I would argue it’s always a good idea, even if the manual (currently) only recommends it when using multiple walkers.

One more thing, it might be worth also setting pull-coord1-k = 5000 or even 10000 in order to make sure that you can properly sample the curvature of the free energy landscape.

EDIT: Sorry, the pull-coord1-k does not matter. Your awh1-dim1-force-constant is what matters and that is high enough.

For the run with multiple walkers, you should also set awh1-dim1-cover-diameter. I would recommend awh1-dim1-cover-diameter = 8 to ensure that all walkers cross the lipid membrane to count towards a covering.

I’ll try using a diffusion coefficient of 5e-4 as well, and see how this affects the coordinate distribution.

My system is 11.29 nm in the Z-dimension, with the ligand COM initially at 11.1 nm. I set the reaction coordinate from -5.6 to +5.6 between the ligand and membrane COMs, which I expect should move the ligand down to the box boundary close to 0 nm.

As for the multi-walker simulation, yes all walkers started from the same coordinate file. Would you recommend starting each walker from a randomised coordinate?

Thank you for the additional recommendations, I’ll make those changes.

Strange. Then the reaction coordinate should be periodic (>95% of the system dimension). Anyhow, to ensure proper overlap and facilitate crossing the periodic boundary I would suggest:

awh1-dim1-start = -5.644
awh1-dim1-end = 5.644

With proper periodic reaction coordinates I don’t think your starting configuration will have a large influence anymore. But in general it’s good to start from different positions.

Thank you for all the assistance. Here’s an update on the sampling achieved using 4 walkers with random starting positions over 1us:

Other modified parameters for this run:

  • awh1-dim1-diffusion = 5e-4
  • awh1-dim1-cover-diameter = 8
  • awh1-dim1-start = -5.644
  • awh1-dim1-end = 5.644
  • awh1-equilibrate-histogram = yes

The simulation exited the initial stage at 634ns. The PMF convergence looks faster compared to the previous single-walker runs, with a more accurate value than the previous multi-walker experiment.

The coordinate distribution still seems a bit bumpy to me, comparable to the single-walker with diffusion coeff 0.25e-3. If I were to run this for longer to get a flatter coordinate distribution, would the PMF profile be expected to change significantly?

Thanks!
TYD

Thanks for the update. I think your PMF looks stable and reasonable now. I’m not surprised that your coord distribution is not perfectly flat. It might be interesting to check the sampling distribution (the column after the coordinate distribution) as well. It is the sampling distribution that is targetting 1. It will probably still not be flat by any means, but it may be slightly less bumpy.

But I would not be worried if your sampling would look like your current coordinate distribution. With different diffusion coefficients/friction in different parts of the system it is difficult to get the sampling completely flat. I don’t think your PMF would change much if you sampled a lot longer.