[umbrella sampling] Can I use "pull_coord1_geometry = direction" for complexes with cyclodextrins?

GROMACS version: 2025.3
GROMACS modification: No

I would like to kindly ask whether the below described approach is valid from the thermodnamical point of view.

Cyclodextrins (CD) are cyclic ologosaccharides, composed of several glucose units. They are known to form inclusion complexes with different organic molecules. The value of Gibbs free energy of the formaton of such a complex can be calculated, among others, using the umbrella sampling technique.

As shown in the figure shown below, a guest molecule may enter the CD cavity through a wider or a narrower rim. When the gmx trjconv command option is executed with -princ option, CD is aligned so that its axis is parallel to the z axis of coordinate system, and the narrower rim is placed at the site of positive z values, while the wider rim is placed at the negative site of z axis. Such an arrangement encourages to execute two pulling simulations along the z axis - first with a positive pulling rate, and the second with a negative rate, which corresponds to pulling the guest away from the host in two different directions.

The following image shows the cross-section of the β-CD molecule accompanied with the included guest molecule, shown in green. It shows the high symmetry of a β-CD moleucle.

Umbrella sampling tutorials often contain many warnings that the researcher should choose the reaction coordinate geometry based on the studied system. Therefore, I would like to kindly ask the GROMACS community whether the symmetry of β-CD allows for the treatment described below.

  1. Two pull mdp files are used, differing in the sign of the pulling rate along the z axis, and containing, among others, the following options:

pull = yes
pull_ncoords = 1
pull_ngroups = 2
pull_coord1_groups = 1 2
pull_coord1_type = umbrella
pull_coord1_geometry = direction
pull-coord1-vec = 0.0 0.0 1.0
pull_coord1_dim = N N Y
pull_coord1_start = yes
pull_coord1_rate = 0.005 ; or -0.005 for the negative direction
pull_coord1_k = 2000

  1. Snapshots are selected in 0.1 nm intervals, acting as the starting points for individual MD simulatons. Their mdp files contain, among others, the following options:

define = -DPOSRES_REC ; keep the CD molecule in its original position
pull = yes
pull_ncoords = 1
pull_ngroups = 2
pull_coord1_groups = 1 2
pull_coord1_type = umbrella
pull_coord1_geometry = direction
pull-coord1-vec = 0.0 0.0 1.0
pull_coord1_dim = N N Y
pull_coord1_start = no
pull_coord1_init = XX ; XX is replaced with appropriate distance each time
pull_coord1_rate = 0
pull_coord1_k = 1000

3.The reuslting .tpr and pullf files are arranged in an order from a most negative z, through 0, up to most positive z, the gmx wham command is executed, and PMF profile is plotted.

I am aware that many researchers prefer another set of options, i.e.

pull_coord1_geometry = distance
pull_coord1_dim = Y Y Y

On the other hand, it seems to me that the use of this settings results in losing the control through which rim the guest leaves the host, and it is also unsure whether the initial position of the guest relative to the host was really the most preferred one. This is why I thnik of an approach utilizing the z coordinate, not sole COM-COM distance. Is it acceptable for the studied system?

I would be very grateful for clarifying whether this proceudre is valid from the point of view of thermodynamic statistics; and if not - what would be a more correct approach? Thank you in advance!

If I wanted to determine the Gibbs free energy of complex formation based on the PMF profile, would it be sufficient to calculate teh difference between the minimum and the plateau corresponding to large distance between the host and the guest?

Are there any corrections on ΔG that should be taken into acoount?

My question from a previous post is also still valid.