Getting started with gromacs

Dear All,

Can someone please help me with the following questions?

  1. If I am simulating a protein complex (for protein-protein interaction), should I create an index file for each group then add the index file to each step of the gmx grompp throughout or can I just simulate my molecule without the index file?
    gmx grompp -f md.mdp -c npt.gro -t npt.cpt -p topol.top -n protein.ndx -o md.tpr

What’s the idea behind freezing groups as described in this website (https://bioinformaticsreview.com/20210615/how-to-create-an-index-file-in-gromacs-for-md-simulation/?__cf_chl_tk=Az2.3TjrMkvU0LHJ4lE1YS9JPuvI05sek_OeJZNgvLM-1704611370-0-gaNycGzNDmU)

  1. Regarding RMSD,

How should I check for convergence? Do I treat each group’s RMSD (RMSD for group 1, RMSD for group 2) separately or is convergence described using the whole system (RMSD for system)?

Now, let’s say after I run my simulation for 20 ns, I see my simulation converged at 10ns. Should I extract my trajectory as from 10 ns to analyse the RMSF, radius of gyration, H-bonds etc or should I use the whole trajectory?
If I need to only analyse as from 10 ns. How do I do that? Extraction is rather done using frames. Right?

  1. Regarding trajectory,

If I want to generate a new trajectory, do I have to repeat nvt and npt since gen_vel = yes only occurs in nvt and thereafter it is gen_vel = no

; Velocity generation in nvt.mdp
gen_vel = yes
gen_temp = 300
gen_seed = -1

  1. Regarding tc-grps

What’s the difference between Protein Non Protein and Protein SOL? And if I have an index file with 2 groups defined, should I change tc-grps to group1_group2_SOL which means 3 tau_t and ref_t?

  1. I see so many ref_t being used. Some use, 298, 300, 303. But plysiological temperature is 310.2K. If I’m using 0.15 mmol to mimic physiological conditions, then why is temperature not taken at 310.2K? Is temperature related to force field?

  2. I am using amberff99SB, should I change fourierspacing to 0.125 and rcoulomb and rvdw to 0.8?
    At what point do I add vdw-modifier = Potential-shift as described in this discussion (Mdp parameters for non-bonded interaction using the AMBER force field - #4 by dbischoff)

the mdp part of my script is as follows:

; Neighborsearching
cutoff-scheme = Verlet
ns_type = grid
nstlist = 10
rcoulomb = 0.8
rvdw = 0.8

  1. How does pme_order change? I saw a file where it was given as 6.

  2. How does linc_order change? I saw a file where it was given as 9.

  3. I see several mdp files online include rlist but the gromacs tutorial mdp files don’t have it. When can it be ignored?

  4. I wanted to do conjugate gradient in addition to steepest descent. I read that it should be done for nmode and I plan to do mmgbsa with nmode afterwards.

I don’t see any default conjugate gradient mdp file available online? I came up with the following. Please help me write one.

define = -DFLEXIBLE
integrator = cg
emtol = 1000.0
emstep = 0.01
nstcgsteep = 1000
nsteps = 5000
nstcomm =1
ns_type = grid
rlist =1
coulombtype = PME
rcoulomb = 0.8
rvdw = 0.8

  1. I saw a paper where they say the complex was simulated in a box of yy × yy × yy nm3 with xxxx water molecules. Where are these informations obtained? I know the water molecules will appear at a certain step but I’m not sure about the size.

Thank you.

That’s a lot of questions. In general, I’d advice you to read through the manual and have a look at updated tutorials, a good place to start is https://tutorials.gromacs.org/. You will learn more from that and you will probably get better answers when your questions on the forum are more specific. Admittedly it requires more work on your part, but less for those who are helping you. Anyhow, I will try to help you with some of them.

  1. You only need an index file if you are using index groups in your input parameter file (.mdp). But if you specify the -n option, such as -n protein.ndx then that file must exist.

  2. When your observable fluctuates around an average rather than changes over time you can expect it to be converged. However, it is for you to show that. A good idea might be to check other observables as well, such as the potential energy of the system.
    When calculating the RMSD you probably do not want to take the whole system into account. Keep in mind that water is moving a lot.
    Regarding your analysis, see the help text for the tools you are using. For gmx rms (and other analysis tools) you will find that there is a -b option. That will let you set the starting time of the analyses, e.g. -b 10000 if your RMSD converged at 10 ns.

  3. If you want to start a new trajectory from an equilibrated system it might be a good idea to regenerate velocities - or use a different random seed for a v-rescale thermostat.

  4. You have to decide for yourself, and motivate to yourself and your future readers, whether it makes sense to divide your system into multiple temperature groups. The difference between your example: Protein Non-protein and Protein SOL depends on how the selection groups are defined. If you only have a protein and water they would be expected to be equivalent. Unless there are specific reasons to divide your system into multiple temperature coupling groups I’d recommend just using system.

  5. It’s up to you to choose your reference temperature. 310 K makes sense if you want to mimic physiological conditions. If you want to compare with measurements from the lab, then you’d better choose a temperature that matches those conditions - assuming that your force field is relevant at that temperature.

  6. You just add vdw-modifier = Potential-shift, or did I misunderstand you?

  7. https://manual.gromacs.org/current/user-guide/mdp-options.html#ewald and the reference manual will help you understand the PME settings.

  8. Is there a specific reason why you’d want to do a conjugate gradient energy minimization?

  9. I’m not sure what you mean. You create a simulation box around your protein and you solvate it. If you want to know the size of an existing simulation system you can check the last line of a .gro file or analyse the system size of simulation results using gmx energy. You can find the number of water molecules in the .top file, if you don’t remember how many were added.

I hope that helps you getting started. Good luck.

1 Like

Thank you for taking the time to answer my questions.

Ok, so I can just add, vdw-modifier = Potential-shift in my nvt.mdp, npt.mdp and md.mdp files as shown below?

; Neighborsearching
cutoff-scheme = Verlet
ns_type = grid
nstlist = 10
rcoulomb = 0.8
rvdw = 0.8
vdw-modifier = Potential-shift

I would like to use conjugate gradient energy minimization as well because I plan to do gmx_mmgbsa later with nmode and I read that conjugate gradient is good for nmode.

Yes, it should work with those settings, gmx grompp should tell you if there are any problems. You don’t need ns_type = grid any more.

I see. I think your cg parameters look good.

Thank you.

To generate another trajectory, did you mean that I can add ld_seed to the md.mdp file as shown below instead of repeating nvt and npt?

; Temperature coupling is on
tcoupl = V-rescale
tc-grps = Protein Non-Protein
tau_t = 0.1 0.1
ref_t = 300 300
ld_seed = -1

Yes, that will give a new trajectory with other velocities, so it will be different. However, since it will still start from the npt coordinates the beginning of the two trajectories might be similar. If you want to reduce the “memory” of the starting structure it might be better to rerun the equilibration stages as well. In most cases the equilibration is significantly shorter than the production runs anyhow.

Thank you.

Hi

I have been reading and I saw the following link ([gmx-users] what's the difference between gen_seed and ld_seed?) where they say that ld_seed should be used for bd and sd but I am using md. Please let me know if I should simply run nvt and npt again with the following in nvt.mdp to generate another trajectory?

;Velocity generation in nvt.mdp
gen_vel = yes
gen_temp = 300
gen_seed = -1

But they also spoke about V-scale later on on the page and v-scale is something I am using for t-couple. So I’m not sure what to use. Could you please help?

The v-rescale thermostat uses ld-seed for the stochastic part. So, two simulations (with different ld-seeds, or two tpr-files generated with ld-seed = -1) will get different velocities. You don’t need to generate new velocities. But as I mentioned, it also depends on “how different” you want the second trajectory to be. It might be good to start from a different configuration.

Ok sure thank you.