GROMACS version: 2024.1
GROMACS modification: No
Hello GROMACS community.
I am attempting to follow jalemkul’s umbrella sampling tutorial.
Everything worked until I reached the Generating configurations section.
In particular, I cannot work out how the following instructions actually work:
gmx make_ndx -f npt.gro
...
> r 1-27
> name 19 Chain_A
> r 28-54
> name 20 Chain_B
> q
To be more specific:
a) How does make_ndx know which of the five chains I am selecting residues 1-27 from?
b) What does name 19 mean? Why not, for example, name 5, which would save one character?
Could a kind-hearted member explain those commands verbosely?
Things that I have done before deciding to post:
- I have agreed with an old post stating that “make_ndx is quite cryptic to understand” (May 2021).
- I have tried gmx make_ndx followed by h. Interesting, but still fully cryptic.
- I have tried gmx make_ndx -h, which offers no explanation as to how the logic works.
- I have read the reference manual, specifically the part about Using groups. (And lost all hope.)
Thank you.
Ivan
Yes, it would be great to make make_ndx more user-friendly. Anyhow …
a) make_ndx uses the residue numbering in the .gro file, which starts at 1 and does not restart in the following chains. So, r 1-27 is the first 27 residues (chain A), and r 28-54 is the next 27 residues (chain B).
b) name 19 Chain_A means that you are renaming selection group 19 to Chain_A. Group 19 is the selection you just created with r 1-27; new groups are appended after the existing ones. I think you will find a list of the other selection groups (generated automatically, unless you specify an index file as input to gmx make_ndx) if you look in the make_ndx output. Otherwise, they should be listed if you just enter a blank line at its prompt.
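To make the pattern explicit, here is a minimal Python sketch that writes out the make_ndx input for consecutive chains of equal length. The function name make_ndx_commands is made up for illustration, and the first new group number (19) is an assumption taken from the tutorial system; check the group list that make_ndx prints for your own system before relying on it.

```python
from string import ascii_uppercase


def make_ndx_commands(n_chains, chain_length, first_new_group):
    """Build make_ndx input lines for consecutive, equally long chains.

    Residue numbers run continuously through the .gro file, so chain i
    (counting from 0) spans residues i*chain_length+1 .. (i+1)*chain_length.
    Each 'r' selection is appended as a new group, so the group numbers
    count up from first_new_group (assumed, not queried from GROMACS).
    """
    commands = []
    for i in range(n_chains):
        first = i * chain_length + 1
        last = (i + 1) * chain_length
        commands.append(f"r {first}-{last}")
        commands.append(f"name {first_new_group + i} Chain_{ascii_uppercase[i]}")
    commands.append("q")  # save the index file and quit
    return commands


# Two chains of 27 residues, with the new groups assumed to start at 19
print("\n".join(make_ndx_commands(2, 27, 19)))
```

With these arguments the printed lines are the same five commands as in the tutorial.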
I hope that helps for now at least.
Hi MagnusL.
I understand better now.
So, to make it a bit clearer for the next user getting acquainted with GROMACS: the .gro format is leaner than you might expect.
This is the starting structure for the tutorial:
from pymol import cmd
cmd.load('2BEG_model1_capped.pdb')
cmd.get_chains()
['A', 'B', 'C', 'D', 'E']
which shows that there are five chains. All identical, all 27 residues long (not shown).
However, this is the .gro derived from that .pdb:
from pymol import cmd
cmd.load('npt.gro')
cmd.get_chains()
['']
As shown here, the .gro does not have a chain concept.
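The reason is visible in the layout of a .gro atom line, which has fixed-width fields for residue number, residue name, atom name, atom number and coordinates, but none for a chain. A rough sketch of a parser follows; parse_gro_atom_line is a made-up helper, the sample coordinates are invented, and velocities (which may follow the coordinates) are ignored.

```python
def parse_gro_atom_line(line):
    """Split one .gro atom line into its fixed-width fields.

    Columns: residue number (5), residue name (5), atom name (5),
    atom number (5), then x, y, z in nm (8 characters each).
    There is no chain identifier anywhere in the line.
    """
    return {
        "resnr": int(line[0:5]),
        "resname": line[5:10].strip(),
        "atomname": line[10:15].strip(),
        "atomnr": int(line[15:20]),
        "xyz": (float(line[20:28]), float(line[28:36]), float(line[36:44])),
    }


# Hypothetical atom line built with the same field widths; coordinates are invented
sample = "%5d%-5s%5s%5d%8.3f%8.3f%8.3f" % (1, "ACE", "CH3", 1, 1.0, 2.0, 3.0)
record = parse_gro_atom_line(sample)
print(record)
```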
Residues are included (not shown) but they must be picked from a continuum of concatenated positions:
1, 2, ..., 27, 28, ..., 54, 55, ...
|<-chain A-->|<-chain B-->|<-chain C, etc->
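Going the other way, from a continuous residue number back to the original chain letter, is a single integer division. This sketch assumes five chains of 27 residues each, as in this tutorial system; chain_of_residue is a hypothetical helper, not part of GROMACS or PyMOL.

```python
from string import ascii_uppercase


def chain_of_residue(resnr, chain_length=27, n_chains=5):
    """Return the chain letter for a continuous .gro residue number.

    Returns None for residues outside the protein (solvent, ions).
    """
    if resnr < 1 or resnr > chain_length * n_chains:
        return None
    return ascii_uppercase[(resnr - 1) // chain_length]


for resnr in (1, 27, 28, 54, 55, 135, 136):
    print(resnr, chain_of_residue(resnr))
```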
Thank you MagnusL.
Ivan
And there is an additional twist to this story:
For make_ndx, the word r (residue) is not synonymous with amino acid.
Residue 1 is not an amino acid.
echo 'list residues' | gmx make_ndx -f npt.gro
output
>
1 ACE 2 LEU 3 VAL 4 PHE 5 PHE 6 ALA 7 GLU 8 ASP 9 VAL
10 GLY 11 SER 12 ASN 13 LYS 14 GLY 15 ALA 16 ILE 17 ILE 18 GLY
19 LEU 20 MET 21 VAL 22 GLY 23 GLY 24 VAL 25 VAL 26 ILE 27 ALA
28 ACE 29 LEU 30 VAL 31 PHE 32 PHE 33 ALA 34 GLU 35 ASP 36 VAL
37 GLY 38 SER 39 ASN 40 LYS 41 GLY 42 ALA 43 ILE 44 ILE 45 GLY
46 LEU 47 MET 48 VAL 49 GLY 50 GLY 51 VAL 52 VAL 53 ILE 54 ALA
55 ACE 56 LEU 57 VAL 58 PHE 59 PHE 60 ALA 61 GLU 62 ASP 63 VAL
64 GLY 65 SER 66 ASN 67 LYS 68 GLY 69 ALA 70 ILE 71 ILE 72 GLY
73 LEU 74 MET 75 VAL 76 GLY 77 GLY 78 VAL 79 VAL 80 ILE 81 ALA
82 ACE 83 LEU 84 VAL 85 PHE 86 PHE 87 ALA 88 GLU 89 ASP 90 VAL
91 GLY 92 SER 93 ASN 94 LYS 95 GLY 96 ALA 97 ILE 98 ILE 99 GLY
100 LEU 101 MET 102 VAL 103 GLY 104 GLY 105 VAL 106 VAL 107 ILE 108 ALA
109 ACE 110 LEU 111 VAL 112 PHE 113 PHE 114 ALA 115 GLU 116 ASP 117 VAL
118 GLY 119 SER 120 ASN 121 LYS 122 GLY 123 ALA 124 ILE 125 ILE 126 GLY
127 LEU 128 MET 129 VAL 130 GLY 131 GLY 132 VAL 133 VAL 134 ILE 135 ALA
136 - 11224 SOL 11225 - 11255 NA 11256 - 11276 CL
Notice that residues 1, 28, 55, 82, and 109 are not amino acids but N-terminal acetyl caps (ACE).
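A quick way to spot such entries is to scan the protein part of the listing for names outside the 20 standard amino acids. The sketch below works on a short excerpt of the output above (the first row of chains A and B); find_non_amino_acids is a made-up helper. Note that force-field-specific names (for example, protonation variants of histidine) would also be flagged, and the range lines for SOL, NA and CL should not be fed to it.

```python
import re

STANDARD_AMINO_ACIDS = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}


def find_non_amino_acids(listing):
    """Return (number, name) pairs whose name is not a standard amino acid.

    Expects 'number NAME' pairs separated by whitespace, as printed for
    the protein residues by make_ndx's residue listing.
    """
    pairs = re.findall(r"(\d+)\s+([A-Z0-9]+)", listing)
    return [(int(nr), name) for nr, name in pairs
            if name not in STANDARD_AMINO_ACIDS]


# Excerpt of the listing above: first row of chain A and of chain B
excerpt = """
1 ACE 2 LEU 3 VAL 4 PHE 5 PHE 6 ALA 7 GLU 8 ASP 9 VAL
28 ACE 29 LEU 30 VAL 31 PHE 32 PHE 33 ALA 34 GLU 35 ASP 36 VAL
"""
caps = find_non_amino_acids(excerpt)
print(caps)
```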
What a twist for the unguarded structural biologist!
:)
Indeed, this can be a bit confusing. In most computational tools, a residue is not as strictly defined as you would expect from a biochemical point of view.
I believe this has to do with the assumption that everything belongs to a residue. This has been the case in, e.g., PDB files for a very long time. It is a convenient unit for separating monomers, as well as for separating the small molecules of the system.