CHARMM MDP FILES FOR Umbrella Sampling

GROMACS version:2018.8
GROMACS modification: No
Here post your question
Can someone help me fixing the mdp files npt.mdp and md_pull.mdp file. The available .mdp file at Umbrella Sampling was utilized for GROMOS96 53A6 parameter with SPC water. Currently I am using CHARMM36 ff for pulling of triphosphate analogue from a protein. What changes should I make in mdp files ( npt.mdp and md_pull.mdp file) for CHARMM36.

Hi,

Here mdp settings in harmony with CHARMM force field

constraints = h-bonds ; bonds involving H are constrained
rcoulomb = 1.2 ; short-range electrostatic cutoff (in nm)
rvdw = 1.2 ; short-range van der Waals cutoff (in nm)
vdw-modifier = Force-switch ; specific CHARMM
rvdw_switch = 1.0 ;
DispCorr = EnerPres ; account for cut-off vdW scheme
coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
fourierspacing = 0.15 ; grid spacing for FFT

Best regards
Alessandra

Any specific thermostat and barostat parameters? Also I have somewhere earlier read that" DispCorr-no; note that dispersing correction should be only applied in case of lipid monolayer. "

Hi,
I am aware on discussion on the topic. Propably, @jalemkul knows more about the this
Best regards
Alessandra

1 Like

Thermostats and barostats are not force field-dependent.

DispCorr = EnerPres is only for monolayers, as the documentation says. The force field is parametrized explicitly to account for long-range vdW forces. Phospholipid monolayers are the only systems that are imbalanced in this regard and require analytical correction.

1 Like

Thank You @jalemkul Sir and @alevilla mam.
Chain-A -Protein +MG(In binding pocket)
Chain-B- GTP
FF- CHARMM36 FF
Trying to pull GTP Chain B (ligand from pocket)
Interaction between triphospahte and metal ion is strong, so along with ligand MG is also coming out.

Getting warning and NOTES while running pulling simualtion (md_pull.mdp)
NOTE 1 [file md_pull.mdp]:
nstcomm < nstcalcenergy defeats the purpose of nstcalcenergy, setting
nstcomm to nstcalcenergy
NOTE 2 [file topol.top, line 57]:
You are combining position restraints with Parrinello-Rahman pressure
coupling, which can lead to instabilities. If you really want to combine
position restraints with pressure coupling, we suggest to use Berendsen
pressure coupling instead.
WARNING 1 [file md_pull.mdp]:
You are using pressure coupling with absolute position restraints, this
will give artifacts. Use the refcoord_scaling option.

What should be an ideal strategy for pulling out GTP from the binding pocket.
How should I proceed.

NPT Parameter
*title = Protein-ligand complex NPT equilibration *
define = -DPOSRES ; position restrain the protein
; Run parameters
integrator = md ; leap-frog integrator
nsteps = 50000 ; 2 * 50000 = 100 ps
dt = 0.002 ; 2 fs
; Output control
nstvout = 1000 ; save velocities every 1.0 ps
nstenergy = 1000 ; save energies every 1.0 ps
nstlog = 1000 ; update log file every 1.0 ps
nstxout-compressed = 1000 ; save coordinates every 1.0 ps
; Bond parameters
*continuation = no ; continuing from NVT *
*constraint_algorithm = lincs ; holonomic constraints *
*constraints = h-bonds ; bonds to H are constrained *
lincs_iter = 1 ; accuracy of LINCS
lincs_order = 4 ; also related to accuracy
; Neighbor searching and vdW
cutoff-scheme = Verlet
ns_type = grid ; search neighboring grid cells
nstlist = 20 ; largely irrelevant with Verlet
rlist = 1.2
vdwtype = cutoff
vdw-modifier = force-switch
rvdw-switch = 1.0
rvdw = 1.2 ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
rcoulomb = 1.2
pme_order = 4 ; cubic interpolation
fourierspacing = 0.16 ; grid spacing for FFT
; Temperature coupling
tcoupl = V-rescale ; modified Berendsen thermostat
tc-grps = Protein_UNK Water_and_ions ; two coupling groups - more accurate
tau_t = 0.1 0.1 ; time constant, in ps
ref_t = 300 300 ; reference temperature, one for each group, in K
; Pressure coupling
pcoupl = Berendsen ; pressure coupling is on for NPT
pcoupltype = isotropic ; uniform scaling of box vectors
tau_p = 2.0 ; time constant, in ps
ref_p = 1.0 ; reference pressure, in bar
compressibility = 4.5e-5 ; isothermal compressibility of water, bar^-1
refcoord_scaling = com
; Periodic boundary conditions
pbc = xyz ; 3-D PBC
; Dispersion correction is not used for proteins with the C36 additive FF
*DispCorr = no *
; Velocity generation
gen_vel = yes ; Velocity generation is on
gen_temp = 300 ; temperature for velocity generation
*gen_seed = -1 ; random seed *
; COM motion removal
; These options remove COM motion of the system
nstcomm = 10
comm-mode = Linear
comm-grps = System

Pull_md.mdp
*title = Umbrella pulling simulation *
define = -DPOSRES_A
; Run parameters
integrator = md
dt = 0.002
tinit = 0
nsteps = 350000 ; 700 ps
nstcomm = 10
; Output parameters
nstxout = 5000 ; every 10 ps
*nstvout = 5000 *
nstfout = 500
nstxtcout = 500 ; every 1 ps
nstenergy = 500
; Bond parameters
constraint_algorithm = lincs
*constraints = h-bonds *
*continuation = yes ; continuing from NPT *
; Neighbor searching and vdW
cutoff-scheme = Verlet
ns_type = grid ; search neighboring grid cells
nstlist = 20 ; largely irrelevant with Verlet
rlist = 1.2
vdwtype = cutoff
vdw-modifier = force-switch
rvdw-switch = 1.0
rvdw = 1.2 ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
rcoulomb = 1.2
pme_order = 4 ; cubic interpolation
fourierspacing = 0.16 ; grid spacing for FFT
; Temperature coupling
tcoupl = V-rescale ; modified Berendsen thermostat
tc-grps = Protein_UNK Water_and_ions ; two coupling groups - more accurate
tau_t = 0.1 0.1 ; time constant, in ps
ref_t = 300 300 ; reference temperature, one for each group, in K
*; Pressure coupling *
pcoupl = Parrinello-Rahman ; pressure coupling is on for NPT
pcoupltype = isotropic ; uniform scaling of box vectors
tau_p = 2.0 ; time constant, in ps
ref_p = 1.0 ; reference pressure, in bar
compressibility = 4.5e-5 ; isothermal compressibility of water, bar^-1
; Generate velocities is off
*gen_vel = no *
; Periodic boundary conditions are on in all directions
pbc = xyz
; Long-range dispersion correction
DispCorr = no
; Pull code
pull = yes
*pull_ncoords = 1 ; only one reaction coordinate *
*pull_ngroups = 2 ; two groups defining one reaction coordinate *
*pull_group1_name = Chain_B *
*pull_group2_name = Chain_A *
pull_coord1_type = umbrella ; harmonic potential
*pull_coord1_geometry = distance ; simple distance increase *
pull_coord1_dim = N N Y
pull_coord1_groups = 1 2
pull_coord1_start = yes ; define initial COM distance > 0
pull_coord1_rate = 0.01 ; 0.01 nm per ps = 10 nm per ns
pull_coord1_k = 1000 ; kJ mol^-1 nm^-2

I would expect that to happen. Nucleoside triphosphates usually have divalent ions bound in solution; it is not generally believed that the ion and NTP bind to enzymes in separate steps.

In this case only GTP is getting pulled out of the pocket. MG stays inside the pocket

Problem is that in the presence of GTP, the MG2+ stays in the binding pocket. Only the GTP is getting pulled out from the pocket. While in the presence of GDP, the MG2+ along with GDP (both) are getting pulled out of the binding pocket. It doesn’t matter whether I am using certain constraints or not! I want to calculate the delG value for both GDP AND GTP so that I can compare the affinity.

You do realize the conditions you apply here are quite drastic by biological standards? If your enforced dissociation is thousands to millions times faster than in reality, you shouldn’t expect it to follow any biologically relevant pathway.

1 Like

Thank you @milosz.wieczor !

Yeah! You may be right. But there is strong interaction like salt bridges and metal-coordinate bond between metal ion and ligand.

Magnesium ions are usually strongly (hexa-)coordinated in biological systems, with residence times well into micro- to miliseconds. Some of these interactions are with the ligand’s TP/DP, some are with carboxylates. At very high forces, which ones will yield first is largely random.

To put it in terms of free energy barriers in a complex landscape - if you pull slowly, you’re only adding a bit of energy, and your system has to follow a low-energetic (= realistic) pathway to escape the bound state. If you pull quickly, you’re adding lots of energy, and whichever pathway happens to be the shortest will be chosen - even if it goes across very high barriers, like breaking the “wrong” coordinate bond.

Ok I got your point. @milosz.wieczor

So should I decrease the pull rate and try repulling at different decreased rate? Please suggest for mdp parameters. Doing this for first time.

pull_coord1_rate = 0.01 ; 0.01 nm per ps = 10 nm per ns

How should be the COM graph look like?

The general rule is “the slower, the better”. ATP/ADP and magnesium are notorious hard cases due to high charge densities, so if breaking the coordinate bonds with protein happens in less than 50-100 ns (so rate closer to 1 nm per 100 000 ps), I would still consider it excessive force. Of course, this assumes you have access to a modern cluster, and your system is of reasonable size.

The rest depends on your further goals. To get a qualitative difference between GDP and GTP, it might be enough to run 3-5 repeats and compare the average dissociation work. To get correct quantitative data like free energies, we’re talking microseconds of cumulative simulation time.

If I want to pull it for 5 nm, I have to simulate it for 500 ns. Then running the umbrella sampling from individual conf Extracted from the pulling simulation will be quite tedious task? I will end up having too much conf files. Protein is of medium size.

5 nm is a very long interval to sample properly, I imagine already after 2-3 you’re out of contact with your protein of interest, and already after 0.5 you’ve broken all native contacts. Remember - the bigger the interval, the more costly it is to achieve convergence. US is tedious in general unless your system is extremely simple (and protein-based systems are hardly ever simple systems).

You might want to consult e.g. this paper for typical issues and solutions employed in free energy calculations for protein-ligand systems, to avoid common pitfalls that render a good portion of published free energy profiles largely unrealistic. Even though the paper refers to metadynamics, many issues and rules of thumb will apply elsewhere.

Thank you @milosz.wieczor! I have got your point. Will be extremely useful for my research work.

Hi! @milosz.wieczor
I decreased the *pull_coord1_rate = to 1nm per 100 ns. But kept the spring constant I. e. pull_coord1_k = 1000 ; kJ mol^-1 nm^-2. Should I also decrease spring constant value.

For the pulling, it’s reasonable to use a high force constant; otherwise your harmonic restraint will grow until your ligand suddenly jumps out of the pocket. For umbrella sampling, it might take a bit of experimentation to get the force constant right - try reproducing someone else’s setup from the literature, paying special attention to the units.