I am running a simple test to get atom types for arginine dipeptide with CHARMM36m force field. The arginine dipeptide should have 12 atom types, but it only shows 11 atom types from the information printed by gmx dump command. I think the HC atom type is not recognized, and it is mistakenly assigned as H atom type. In addition, the energy of the arginine dipeptide is different from other programs. The major difference is the electrostatic energy.
Not only Arg dipeptide, but also some other dipeptides, such as Glu and Lys, have the incorrect number of atom types in Gromacs. Can you tell me why some atom atypes are missing in Gromacs?
HC and H have the same LJ parameters, so they get merged into a single atom type by grompp but the associated bonded parameters are different.
Please provide more detail on the types of tests you are doing. I have rigorously tested the C36m implementation, explicitly using dipeptides, and the energies and forces match between GROMACS and CHARMM to at least the third decimal place (some deviation in electrostatic terms occurs because of the difference in Coulomb’s constant used by each program).
For the tests of dipeptides, the simulations were run without solvent and without long-range PME correction. I did such a simple test becasue I just wanted to get the information of each energy term and the potential energy of a dipeptide at the first MD step. Here are the results between different programs (All energies are in the unit of kcal/mol).
For Arg dipeptide:
Gromacs
NAMD
Difference
E(bond)
19.54404876
19.5442
0.000151243
E(angle)
20.3956979
20.3958
0.000102103
E(dihedral+cmap)
9.936431644
9.9364
-3.16444E-05
E(improper)
0.243979446
0.244
2.05545E-05
E(vdw)
-0.29831979
-0.3228
-0.02448021
E(ele)
-251.2767686
-237.4397
13.83706864
E(pot)
-201.4549307
-187.6421
13.81283069
For Ala dipeptide:
Gromacs
NAMD
Difference
E(bond)
10.41295411
10.413
4.58891E-05
E(angle)
16.26589388
16.2658
-9.38815E-05
E(dihedral+cmap)
1.942441683
1.9424
-4.16826E-05
E(improper)
1.222153442
1.2221
-5.34417E-05
E(vdw)
-1.378117543
-1.3841
-0.005982457
E(ele)
-16.00597514
-16.0059
7.51434E-05
E(pot)
12.45935043
12.4533
-0.00605043
All energy terms are consistent between Gromacs and NAMD for Ala dipeptide, but for Arg dipeptide, the electorstatic energy has large difference. Could you explain the reason for this difference?
I also found two possible errors in the C36m topology file.
For the most recent C36m force field (charmm36-feb2021.ff) on CHARMM force field website, I saw that the atom names of ACE and NME in merged.rtp file were renamed to the same names as in origin top_all36_prot.rtf file, e.g. CH3 → CAY for ACE. However, the name is still CH3 for ACE in the merged.hdb file, which causes errors in gmx pdb2gmx step when adding hydrogens. The earlier version of C36m force field (e.g. charmm36-jul2017.ff) has consistent atom names between merged.rtp and merged.hdb file, which works very well in topology preparation. I think changing the atom names back to the previous version can eliminate the error in gmx pdb2gmx step for the latest version.
In all versions of C36m for Gromacs files, there are two impropers for NME. The first one “N -C CH3 HN” is correct, but I don’t know why the second improper “-C CH3 N -O” is used. Is this an improper angle? Other tool such as psfgen does not generate this improper, so Gromacs will always have one more improper in “topol.top” file if we cap C-terminus with NME. I guess what this improper wants to represent is “-C -CA N -O”, but it could be constructed from “C CA +N O” improper in the topology of each amino acid. Therefore, I think it is necessary to delete this improper in the merged.rtp file.
Could you please check if what I said is correct or not?
Are you performing a single-point energy evaluation, or actually running 1 step of MD? There can be a big difference. Without your actual inputs, I can’t really comment on the robustness of your approach, but single-point energy and force evaluations in CHARMM (not NAMD) vs. GROMACS yield matching results, as I noted before.
The better solution is just to fix the atom names in the .hdb file, and the same applies for the NME entry. I changed things to be consistent with CHARMM for better interoperability. The corrected entries are:
ACE 1
3 4 HY CAY C O
NME 2
1 1 HN N -C CAT
3 4 HT CAT N -C
I will update those in the next release of the force field port.
The fix you propose is correct, and I’ll make that change in the next version, too. There should be two impropers there, to match the CHARMM .rtf (patch) entry, which specifies two impropers:
PRES CT3 0.00 ! N-Methylamide C-terminus
GROUP ! use in generate statement
ATOM C C 0.51 !
ATOM O O -0.51 ! |
GROUP ! C=O
ATOM NT NH1 -0.47 ! |
ATOM HNT H 0.31 ! NT-HNT
ATOM CAT CT3 -0.11 ! |
ATOM HT1 HA3 0.09 ! HT1-CAT-HT3
ATOM HT2 HA3 0.09 ! |
ATOM HT3 HA3 0.09 ! HT2
!
BOND C NT NT HNT NT CAT CAT HT1 CAT HT2 CAT HT3
IMPR NT C CAT HNT C CA NT O
I can’t speak to what psfgen is doing, but again using CHARMM as the reference rather than VMD/NAMD tools, the implementations should agree here that there are two impropers pertaining to this patch, one centered on NT and the other on C.
I ran 1 step of MD to test these results. As your suggestion, I performed single-point energy evaluation by gmx mdrun -rerun option, and the results are same for each energy term as I listed above. Here is my Gromacs input for dipeptide test:
I think my input for comparison is okay because I got same energies between Gromacs and NAMD for Ala dipeptide. I did this test because I want the potential energy of a dipeptide to be same in different MD programs, and it should be same if different MD prgrams have the same force field functional forms and read the same parameters. The weird thing is that I can get the matched energy terms for Ala dipeptide but not for Arg dipeptide. From more tests, I found the electrostatic energies are more likely to be same between Gromacs and NAMD for neutral dipeptides, but they are different for charged dipeptides (e.g. Ala, Lys, Glu). Is my
In addition, I also did the same test on LAMMPS, and all energy terms are matched very well between LAMMPS and NAMD for Arg dipeptide. Since I thought you might compare the dipeptides energies between Gromcas and CHARMM long time ago during implementation, I did the same test on older Gromacs versions, such as 2018.2 and 5.1.4, and all versions gave me same energies for Arg dipeptide, which are different from both NAMD and LAMMPs. I didn’t perform the test on CHARMM because I haven’t installed CHARMM and never perform simulations on it before. Could you please do the comparison on Arg dipeptide between Gromcas and CHARMM, or tell me if something is wrong in my test?
Here’s the clue. The large side chain is wrong but the small side chain is right. The programs are using different neighbor lists, I’m willing to bet. This is why all comparisons should be done with infinite cutoffs (all-vs-all comparison). Sadly, this has been disabled in recent GROMACS versions but we have done these comparisons for single molecules, zwitterions, tripeptide, etc. Everything matches between CHARMM and GROMACS. I don’t have a per-energy term breakdown on hand, but I have done it. The overall energies match between CHARMM and GROMACS:
=== Protein ===
AA CHARMM GMX
ala 37.31776 37.31739
arg -172.87158 -172.86615
asp 42.78600 42.78632
asn 67.73579 67.73661
cys 47.28588 47.28537
glu 46.22594 46.22562
gln 8.33029 8.33111
gly 23.93051 23.93021
hsd 63.71331 63.71295
hse 9.40274 9.40317
hsp 116.47855 116.47872
ile 55.21422 55.21391
leu 15697.24964 15697.53824
lys 48.68499 48.68475
met 44.32551 44.32528
phe 69.03321 69.03298
pro 1418.32281 1418.36759
ser 50.78526 50.78465
thr 30.00300 30.00310
trp 80.95839 80.95769
tyr 58.48646 58.48661
val 136.13354 136.13432
=== DNA ===
BASE CHARMM GMX
ade 212.61300 212.61520
cyt -74.35659 -74.34918
gua -184.29789 -184.28847
thy -28.78346 -28.77652
=== RNA ===
BASE CHARMM GMX
ade 900.43821 900.43738
cyt 615.83186 615.83652
gua 496.22720 496.23565
ura 588.27669 588.28632
=== LIPID ===
LIP CHARMM GMX
popc 11.68598 11.68740
pope 0.51746 .51885
pops -18.83684 -18.83494
popa -40.73156 -40.72920
dppc 7.75118 7.75239
=== CARB ===
SUG CHARMM GMX
aglc 125.21478 125.21319
bglc 133.08417 133.08221
agal 123.95685 123.95506
bgal 131.01627 131.01481
I have tested different cutoff schemes with diffrent cutoffs. Even though we cannot do inifite cutoffs, I think we can use some large cutoffs to make sure all non-bonded interactions are included even for the dipeptide with large side chain. Here are the potential energy results for Arg dipeptide.
Epot
Gromacs Verlet 12A
-201.4549
NAMD 12A
-187.6420
Gromacs Verlet 22A
-195.1886
NAMD 22A
-187.6420
Gromacs Group 12A
-187.6420
In this table, if the cutoff is increased from 12 A to 22 A in Verlet cutoff scheme, the energy is indeed changed, but it is also not close to NAMD energy. In NAMD, the 22 A cutoff provided the same energy as 12 A cutoff, meaning that 12 A is enough for including all non-bonded interaction for Arg dipeptide. Interestingly, I run a test with Group cutoff scheme and 12 A cutoff on Gromacs version 2018.2, the energy are same as NAMD. I don’t know why the Group scheme is removed in current Gromacs version but it seems to have better consistency on energy compared to other MD programs.
Here I also calculated the potential energy on all dipeptides. For Gromacs, the Verlet cutoff scheme and 12 A cutoff are used as before.
Gromacs
NAMD
Difference (NAMD - Gromacs)
ala
32.1171
32.1111
-0.0060
arg
-168.9135
-155.1019
13.8116
asp
-14.1699
-0.3446
13.8253
asn
-27.3021
-27.3136
-0.0115
cys
69.3255
69.3164
-0.0091
glu
18.8536
32.6762
13.8226
gln
13.1125
13.0976
-0.0149
gly
12.9022
12.8976
-0.0046
hsd
147.7385
147.7160
-0.0225
hse
129.0425
129.0203
-0.0222
hsp
86.6663
100.4869
13.8206
ile
74.2459
74.2317
-0.0142
leu
66.1360
66.1216
-0.0144
lys
80.7010
94.5198
13.8188
met
59.0194
59.0022
-0.0172
phe
63.1496
63.1281
-0.0215
pro
92.9489
92.9404
-0.0085
ser
44.8286
44.8214
-0.0072
thr
69.1549
69.1458
-0.0091
trp
90.7715
90.7412
-0.0303
tyr
53.1451
53.1221
-0.0230
val
63.8470
63.8364
-0.0106
There are several interesting things. First, in this test I used different coordinates for Arg dipeptide, so the energy is different from the previous test for both Gromacs (-168.9135 vs -201.4549) and NAMD (-155.1019 vs -187.6420). However, the energy difference (NAMD - Gromacs) are around 13.81 for the previous coordinates and current coordinates. Seond, from the table we can observe that the differences for all neutral dipeptides are very small, but for charged dipeptides (Arg, Asp, Glu, Hsp, Lys), the differences are quite large and the magnitudes are round 13.8, no matter the dipeptide is positively charged or negatively charged. Third, the size of the side chain is not likely to be the reason for the difference of energy, because some neutral dipeptides also have large side chains as Arg, but their energy differences are small. One more direct evidence is hse/hsd vs hsp. These three dipeptides have very similar size of side chains, except that hsp has 1 more hydrogen than hse/hsd. The energy differences for hse/hsd are good but it is large for hsp.
This test indicates that there might be some systematic differences between Gromacs and NAMD (~13.8 kcal/mol) for calculating the energy of dipeptides with Verlet cutoff scheme. Could you check the codes of electrostatic energy calculation with Verlet scheme, or are there any other settings I need to put into input file for Verlet scheme to get the same energy as Cutoff scheme or other programs such as NAMD?
I have no explanation for this systematic difference, but something is bizarre if it is exactly the same value for every charged side chain. Do the energies match if you use infinite cutoffs? This is the only meaningful comparison. There are too many differences between the codes, and frankly the Verlet code is far too complicated for me to try to figure out why GROMACS is different from NAMD to try to dig into it. We have demonstrated that results of MD simulations match among CHARMM, NAMD, and GROMACS, so I am not concerned that there is actually a meaningful differences when it comes to the forces, and the only apples-to-apples comparisons that I have done show perfect matching. I’d be interested to know if the source of the difference can be pinpointed, but it’s not something I have time to dig into.
As you can see in previous results, the Group cutoff scheme has the same energy as NAMD/LAMMPS for Arg dipeptide, which means even inside Gromacs, the energies from the Group scheme and from the Verlet scheme are different. Is it normal that the electrostatic energies from these two cutoff schemes are different only for the charged dipeptides?
In my opinion, the energies for a simple dipeptide should be same no matter what MD programs or cutoff schemes we used, if all bonded and nonbonded interactions are included in the calculation. The weird thing is that the energies of neutral dipeptides with Verlet scheme are same with other programs, but the energies of charged dipeptdies are not, and the differences are all around 13.8 kcal/mol. However, the Group scheme with the same cutoff can provide the same energy for the charged dipeptides. That’s why I doubt that there might be some systematic errors in Verlet scheme when calculating energies for charged dipeptides.
I want the energies are consistent between different MD programs because it is important for library development. If a library is based on CHARMM force field energy, it cannot be generalized to all MD programs if the MM energy is not consistent in them. If you don’t want to dig into the Verlet codes, could you tell me what parameters about Verlet scheme I can set in mdp input file to obtain the same energy as the Group scheme, since the Group scheme is deleted in the current Gromacs version?
I don’t know how to do this. Maybe @hess can comment. I truly hope that the ability to do a proper in vacuo calculation will be restored into GROMACS in the near future. This is a critically important feature that needs to be restored for validating force fields.
Without looking into the details, my guess is that the difference are due to the shift in the Coulomb potential. I expect the difference go away when you set coulomb-modifier = none
You might want to do the same for vdw-modifier
It is indeed convenient to be able to do vacuum simulations, but the same results can be achieved in nearly all cases by using a large periodic box.
Thank you so much, @jalemkul and @hess! By setting coulomb-modifier=none and vdw-modifier=none, the dipeptide energies from Gromacs are same as NAMD/LAMMPS.
I would like to ask one more question. Are there any documents or tutorials for developing library used in Gromacs?
I mean the external library, which can read coordinates and other desired variables from Gromacs, perform user-defined calculations, and return the energies and forces back to Gromacs.