Umbrella sampling for protein-ligand complex

I performed umbrella sampling for a protein-ligand complex.
While running gmx grompp -f md_pull.mdp -c npt.gro -p topol.top -r npt.gro -n index.ndx -t npt.cpt -o pull.tpr
I got the following error:
Distance between pull groups 1 and 2 (5.818415 nm) is larger than 0.49 times
the box size (5.818141).
Please help me solve this issue. Thanks in advance.

Have a look at the Molecular dynamics parameters (.mdp options) page in the GROMACS 2024.4 documentation. What pull-coord1-geometry are you using?

Yes, here is my pull code:

; Pull code
pull = yes
pull_ncoords = 1 ; only one reaction coordinate
pull_ngroups = 2 ; two groups defining one reaction coordinate
pull_group1_name = Protein
pull_group2_name = lig
pull_coord1_type = umbrella ; harmonic potential
pull_coord1_geometry = distance ; simple distance increase
pull_coord1_dim = N N Y
pull_coord1_groups = 1 2
pull_coord1_start = yes ; define initial COM distance > 0
pull_coord1_rate = 0.01 ; 0.01 nm per ps = 10 nm per ns
pull_coord1_k = 1000 ; kJ mol^-1 nm^-2
pull-group1-pbcatom =24
pull-pbc-ref-prev-step-com = yes

I increased the box size to 16 16 16.

I got the error again:
Distance between pull groups 1 and 2 (7.773946 nm) is larger than 0.49 times
the box size (7.773864).
This is with the same pull code.

Please help me solve this issue.

Yes, with the distance and direction geometries you can’t pull further than half (0.49 times) the box size. Try direction-periodic, but read its documentation first.

How far do you need to pull the molecules apart? Are you sure you need to separate them this much?
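
To make the 0.49 limit concrete, here is a minimal Python sketch of the check. It is not GROMACS code: the function names are made up for illustration, and it assumes a rectangular box and a pull along a single box edge (as with pull_coord1_dim = N N Y), so only that edge length matters.

```python
def max_pull_distance(box_length_nm, factor=0.49):
    """Largest group-group distance allowed for a given box edge (nm)."""
    return factor * box_length_nm


def within_limit(distance_nm, box_length_nm, factor=0.49):
    """True if the distance is still below the allowed maximum."""
    return distance_nm < max_pull_distance(box_length_nm, factor)


# Example: a box edge of 16 nm along the pull direction (assumed value).
box_z = 16.0
print(f"limit for a {box_z} nm box edge: {max_pull_distance(box_z):.2f} nm")
print("5.0 nm allowed:", within_limit(5.0, box_z))
print("8.0 nm allowed:", within_limit(8.0, box_z))
```

With pressure coupling the box size changes during the run, so the limit is not a fixed number.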

Thank you, I will try it and let you know. I am a beginner in this kind of study. On what basis do I decide how far to pull the ligand from the protein?
How can I reduce this separation distance?
Please correct me if I am wrong.
Thank you

I don’t know what you are studying, but I assume you want to pull the ligand away from the protein into a pure aqueous environment. When the ligand is outside (with some margin) the cutoff of the LJ and Coulomb interactions with the protein, the interactions between the ligand and the protein can be considered negligible.

Currently you are using the COM distances, which means that there is no easy way to estimate how far away the ligand is from the closest protein residues, but if you have a look at the trajectories when you pull you can get an idea about this.

I don’t see any reason to pull much further than the cutoff + 1 nm away from the protein (not the COM of the protein). But in the end it is you who must decide how far you need to pull.

I don't want to pull the ligand as far as 0.49 times the box size. How can I change or reduce the pulling distance?
Can you clarify this for me, please?

You can lower the pull rate and/or lower the number of steps (the simulation time).

I reduced the pull rate. Can you check whether this is correct?
; Run parameters
integrator = md
dt = 0.002
tinit = 0
nsteps = 1000000 ; 2 ns
nstcomm = 10

; Pull code
pull = yes
pull_ncoords = 1 ; only one reaction coordinate
pull_ngroups = 2 ; two groups defining one reaction coordinate
pull_group1_name = Protein
pull_group2_name = lig
pull_coord1_type = umbrella ; harmonic potential
pull_coord1_geometry = distance; simple distance increase
pull_coord1_dim = N N Y
pull_coord1_groups = 1 2
pull_coord1_start = yes ; define initial COM distance > 0
pull_coord1_rate = 0.001 ; 0.001 nm per ps = 1 nm per ns
pull_coord1_k = 1000 ; kJ mol^-1 nm^-2
pull-group1-pbcatom =24
pull-pbc-ref-prev-step-com = yes

With those settings you will pull the ligand (or its target position) 2 nm in the Z direction from its initial position. If that is what you want, it looks correct.

Can you guide me, please? I got the error again after changing the pull rate:

Distance between pull groups 1 and 2 (7.772358 nm) is larger than 0.49 times
the box size (7.771990).

How can I fix this? Can you point me to some basic material on protein-ligand umbrella sampling?

What’s the initial distance between the groups?

gmx_mpi distance -f md.xtc -s md.tpr -n prolig.ndx -oav distave.xvg -oall dist.xvg -oxyz distxyz.xvg -oh disthist.xvg -oallstat diststat.xvg -select -tu ns -dt 50

I selected option 22 (protein ligand),
but it is not working. Can you correct my mistake?

Here is my pull_md.mdp file:

title = Umbrella pulling simulation
define = -DPOSRES_B
; Run parameters
integrator = md
; Start time and Timestep in ps
dt = 0.002
tinit = 0
nsteps = 250000 ; 500 ps
nstcomm = 10
; Output parameters
nstxout = 5000 ; every 10 ps
nstvout = 5000
nstfout = 500
nstxtcout = 500 ; every 1 ps
nstenergy = 500
; Bond parameters
constraint_algorithm = lincs
constraints = all-bonds
continuation = yes ; continuing from NPT
; Single-range cutoff scheme
cutoff-scheme = Verlet
nstlist = 20
ns_type = grid
rlist = 1.4
rcoulomb = 1.4
rvdw = 1.4
; PME electrostatics parameters
coulombtype = PME
fourierspacing = 0.12
fourier_nx = 0
fourier_ny = 0
fourier_nz = 0
pme_order = 4
ewald_rtol = 1e-5
optimize_fft = yes
; V-rescale temperature coupling is on in two groups
Tcoupl = V-rescale
tc_grps = Protein_lig Water_and_ions
tau_t = 1.0 1.0
ref_t = 310 310
; Pressure coupling is on
Pcoupl = Parrinello-Rahman
pcoupltype = isotropic
tau_p = 1.0
compressibility = 4.5e-5
ref_p = 1.0
refcoord_scaling = com
; Generate velocities is off
gen_vel = no
; Periodic boundary conditions are on in all directions
pbc = xyz
; Long-range dispersion correction
DispCorr = EnerPres
; Pull code
pull = yes
pull_ncoords = 1 ; only one reaction coordinate
pull_ngroups = 2 ; two groups defining one reaction coordinate
pull_group1_name = lig
pull_group2_name = Protein
pull_coord1_type = umbrella ; harmonic potential
pull_coord1_geometry = distance; simple distance increase
pull_coord1_dim = N N Y
pull_coord1_groups = 1 2
pull_coord1_start = yes ; define initial COM distance > 0
pull_coord1_rate = 0.01 ; 0.01 nm per ps = 10 nm per ns
pull_coord1_k = 1000 ; kJ mol^-1 nm^-2
pull-group1-pbcatom =24
pull-pbc-ref-prev-step-com = yes

With this file, gmx mdrun -deffnm pull -pf pullf.xvg -px pullx.xvg
worked.
pullf.xvg (82.9 KB)
pullx.xvg (82.7 KB)
I have attached pullf.xvg and pullx.xvg. Is this the correct way to do it? If I am wrong, please correct me.

The difference I see between your mdp settings is that you have changed the order of the pull groups. This is of course fine - should not make any difference (with this pull geometry). What you have done, though, is that you are now using atom 24 of the ligand as the reference to define the center of mass of the ligand, and no PBC atom for the protein. Before you used atom 24 of the protein to define the center of mass of the protein, but you did not specify an atom for the ligand. The ligand is so small (I guess) that you don’t really need a pbcatom, but you probably need one for the protein, especially since you are using pull-pbc-ref-prev-step-com = yes. If atom 24 was not correct before, the distances might have been wrong.

With the new settings you are pulling 5 nm from the initial distance. Before, you only pulled 2 nm. How long do you want to pull?

You may be lucky that it works in the second case, but that mostly indicates an error in the first case.

I would suggest:

; Pull code
pull = yes
pull_ncoords = 1 ; only one reaction coordinate
pull_ngroups = 2 ; two groups defining one reaction coordinate
pull_group1_name = Protein
pull_group2_name = lig
pull_coord1_type = umbrella ; harmonic potential
pull_coord1_geometry = distance; simple distance increase
pull_coord1_dim = N N Y
pull_coord1_groups = 1 2
pull_coord1_start = yes ; define initial COM distance > 0
pull_coord1_rate = 0.001 ; 0.001 nm per ps = 1 nm per ns
pull_coord1_k = 1000 ; kJ mol^-1 nm^-2
pull-group1-pbcatom = <A CENTRALLY LOCATED ATOM IN THE PROTEIN>
pull-pbc-ref-prev-step-com = yes

I don’t promise it will help your problem, but it’s almost certainly more correct.

To get the initial distances, just look at the first line of the pullx.xvg file.
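
As a small illustration, the first data row can also be read with a few lines of Python. This is only a sketch: the helper name is hypothetical, and it assumes the usual .xvg layout, where lines starting with # or @ are comments or plot directives and the first two columns of pullx.xvg are the time (ps) and the pull coordinate value (nm). The sample lines below are invented for illustration.

```python
def first_xvg_row(lines):
    """Return the first data row of .xvg content as a list of floats."""
    for line in lines:
        stripped = line.strip()
        # Skip blank lines, '#' comments and '@' plot directives.
        if not stripped or stripped[0] in "#@":
            continue
        return [float(value) for value in stripped.split()]
    raise ValueError("no data rows found")


# Invented sample content; for a real file, pass open("pullx.xvg") instead.
sample = [
    "# illustrative comment line",
    '@    title "Pull COM"',
    "0.0000  0.500000",
    "1.0000  0.510000",
]

time_ps, distance_nm = first_xvg_row(sample)[:2]
print(f"initial distance at t = {time_ps} ps: {distance_nm} nm")
```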

Thank you.
In pullx.xvg the initial distance is 0.0000 (ps) 0.362757 (nm).

In the previous case it worked for nsteps = 250000 ; 500 ps. Then I
increased it to nsteps = 500000 ; 1000 ps (1 ns) and it is not working; I get the same error:
Fatal error:
Distance between pull groups 1 and 2 (6.703034 nm) is larger than 0.49 times
the box size (6.702723).

Is it correct if I run pull.tpr for nsteps = 250000 ; 500 ps? How can I confirm this?

If you start at 0.36 nm and pull for 500 ps at 10 nm/ns, you should end at a distance of 5.36 nm, which is below 6.7 nm, which is why it works. It’s no surprise that it does not work when you pull for 1 ns (with the same pull rate), as you will then end at 10.36 nm.
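
The arithmetic above can be written out as a short Python sketch. The function name is made up for illustration. It computes where the umbrella reference position ends up (initial distance plus rate times simulated time), which is the target and not necessarily where the ligand actually is at that moment.

```python
def final_reference_distance(initial_nm, rate_nm_per_ps, nsteps, dt_ps):
    """Reference (target) distance at the end of a constant-rate pull."""
    simulated_time_ps = nsteps * dt_ps
    return initial_nm + rate_nm_per_ps * simulated_time_ps


initial = 0.36  # nm, rounded initial distance from pullx.xvg
rate = 0.01     # nm/ps, pull_coord1_rate
dt = 0.002      # ps, time step
limit = 6.7     # nm, rounded limit reported in the error message

for nsteps in (250000, 500000):
    end = final_reference_distance(initial, rate, nsteps, dt)
    status = "below the limit" if end < limit else "exceeds the limit"
    print(f"nsteps = {nsteps}: ends at {end:.2f} nm ({status})")
```

The same function can be used to check a lower rate, for example 0.001 nm/ps, before running grompp.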

It is difficult to say if it’s correct or not. If you generate the conformations you are interested in, as input to the umbrella sampling, you can do almost whatever you want. Remember, you are still just trying to generate input conformations to the actual simulations. But keep in mind, if you pull quickly (what is quick is system dependent, but for simplicity let’s say > 1 nm/ns) or the umbrella sampling starting conformations are generated far from equilibrium in other ways, you will probably have to run the umbrella sampling windows longer and discard more data in the beginning of them.

Thank you so much.
You mean that if I reduce the pull rate, it may work for the longer run, nsteps = 500000 ; 1000 ps (1 ns),
but I should not use a pull rate > 10 nm/ns.
Is that right?

Yes, with lower pull rate you will not pull as far in the same period of time (same number of steps).