Creating a topology file for a R-3C space group crystal structure (calcite)

GROMACS version: 2019
GROMACS modification: No

Dear all,
I write this post after having made some extensive research. Unfortunately, couldn’t find the correct hints to solve the problem - forgive me, please bear this in mind. I’d highly appreciate all your help.

Statement of the problem
I’m trying to build the topology file for the crystal structure of calcite (CaCO3) to run MD.

Some info about the crystallographic cell
The space group is R-3C, highly symmetric. The primitive unit cell is rhombohedral and the conventional cell is hexagonal.

I think the hexagonal cell could be a good starting point for the Gromacs MD simulation, but please feel free to suggest other better options.

Force field for the topology
The force field I’d like to use is the one described here (Geochimica et Cosmochimica Acta 141 (2014) 547–566) in Table 3.
I transcribe here the content:

Table 3 :
The CO3 groups are assumed to be flexible with C–O bonds modeled with harmonic oscillators (stretching constant kC–O = 6118.17 kJ/mol/A˚^2 at an equilibrium distance r(C–O) = 1.16 A˚)
and with an exponential repulsion between the oxygen atoms to prevent the folding of the molecule (uOO(r)= Bexp(-r/rho) where B = 2611707.2 kJ/mol and rho = 0.22 A˚).
In the melt the Ca cations and the CO3 groups interact with each other through coulombic forces (with qCa =+1.64202 e, qC = +1.04085 e and qO = -0.89429 e), atom–atom repulsive forces (urep_ij (r) = B_ij exp (r/rho_ij) and dispersive forces (udisp_ij (r) = C_ij / r^6)
where Bij, qij and Cij are parameters given below:

What have I tried
I’ve intensely studied the documentation about the topology file format here,
and using this knowledge I’ve transcribed the information in Table 3 onto Gromacs topology format, as shown here:
topol_caco3.top (2.0 KB)

However, you may see there are only one Ca, one C and one O entry per parameter. I do not quite know how to extend this parameter info to the 30 atom hexagonal cell.
Is there a way of doing this?

Many thanks,
All the best

Unfortunately, I wonder why this is not getting any replies. I wonder whether is there anything unclear or not precise enough? I’ll work hard to try to make it clear if this is the case, but I’d love some kind of feedback.
Many thanks again and sorry for any inconvenience.
All the best.

I’m not entirely sure I understand your question, so pardon me if this obvious, but when using gmx grompp to generate the .tpr runfile, you will pass a .gro structure file containing the starting coordinates of the atoms (one line for Ca, one for C, etc). If you want 30 molecules of CaCO3, you repeat that 30 times with the coordinate shifted appropriately, and in the [ molecules ] section of the topology, put MOL 30 instead of MOL 1.

So you could start from the provider PDB file (gmx editconf to convert to .gro), however you would also need to reorder the atoms first (atoms within a molecule should appear in order of [ atoms ], ie Ca, C, O)

@ebriand Many thanks for your reply.

So I’ve used:

gmx editconf -f Calcite_I_exp_ase_convert.pdb -o Calcite_I_exp_from_editconf.gro

The gro file reads:

Great Red Oystrich Makes All Chemists Sane
   30
    1MOL     Ca    1   0.000   0.000   0.000
    1MOL     Ca    2   0.249   0.144   0.569
    1MOL     Ca    3   0.000   0.288   1.137
    1MOL     Ca    4   0.000   0.000   0.853
    1MOL     Ca    5   0.249   0.144   1.422
    1MOL     Ca    6   0.000   0.288   0.284
    1MOL      C    7   0.000   0.000   0.426
    1MOL      C    8   0.249   0.144   0.995
...
    1MOL      O   26   0.185   0.033   0.995
    1MOL      O   27  -0.064   0.177   1.564
    1MOL      O   28  -0.185   0.321   1.280
    1MOL      O   29   0.313   0.033   0.142
    1MOL      O   30   0.064   0.177   0.711
   0.49880   0.43197   1.70610   0.00000   0.00000  -0.24940   0.00000   0.00000   0.00000

Now we need the topology file to be consistent with this .gro file.

Now, that newly .gro file produced above contains 30 atoms. However, this topol_caco3.top file I’ve compiled is a manual transcription of the above force field info and contains only 3 atom entries: Ca, C and O (please click on the topol_caco3.top file to see it!)

How could I make this topology file consistent with the .gro file generated?

Many thanks,
All the best

So the topology file should describe each type of molecule of the system once, then the number of each of them should be specificied in the [ molecules ] section.

Because the CO3 and the Ca are not linked with bonded interaction (only Coulomb and LJ), that counts as two types of molecules, which you could name, say “CA” and “CO3”. If you want the CaCO3 to not separate, you would use a single molecule for CACO3. However that is not what the publication seem to describe.

The sections [ defaults ], [ atomtypes ] and [ nonbond_params ] are common to all molecules, so do not repeat them. However, you will need to repeat the [ moleculetype ], [ atoms ] and [ bonds ] section for each type of molecule.

And finally [ system ] and [ molecules ] are also common.

So you would get something like that:

;
[ defaults ]
; nbfunc=2  for Buckingham
; comb-rule is the number of the combination rule. A 1 indicates that gromacs will interpret the parameters written in the topology file
; nbfunc    comb-rule   gen-pairs   fudgeLJ   fudgeQQ
2            1 

[ atomtypes ]
;name   bond_type  mass    charge    ptype    a         b        c6
;                          (q/e)           (kJ/mol)    (nm^-1)   (kJ mol^-1 nm^6)
Ca      Ca         40.078   1.64202   A      0          0           0 
C       C          12.0107  1.04085   A      0          0           0  
O       O          15.999  -0.89429   A      500000     39.6     0.023  


[ nonbond_params ]
;i   j   func              a           b               c6
;        (2=Buckingham) (kJ/mol)    (nm^-1)       (kJ mol^-1 nm^6)  
Ca   C   2                0           0                0  
Ca   O   2               380509    3.975036769090      0  
C    O   2                 0           0               0 
O    O   2              2611707.2    45.45454545       0 ; This line is the only way I came out to 
; include the "exponential repulsion between the oxygen atoms to prevent the folding of the molecule"
; (see caption for Table 3 in the post),
; but I wonder whether this aforementioned repulsion should be included in another way?

[ moleculetype ]
; name  nrexcl
CA        1
; A molecule that only contains the calcium ion

[ atoms ]
;nr  atom type      residue  residue  atom      charge  charge    mass
;    One single atom
1     Ca             1        MOL       Ca         1     1.64202   40.078

[ bonds ]
; i  j  func  b0      k_b
; Single atom so no bonds


[ moleculetype ]
; name  nrexcl
CO3        1
; Molecule for the carbonate group

[ atoms ]
;nr  atom type      residue  residue  atom      charge  charge    mass
; Central C and three identical oxygens
1     C              1        MOL       C          1     1.04085   12.0107 
2     O              1        MOL       O          1    -0.89429   15.999
3     O              1        MOL       O          1    -0.89429   15.999
4     O              1        MOL       O          1    -0.89429   15.999

[ bonds ]
; i and j should be the atom numbers defined in the [ atoms ] section for the molecule
; three bond between C and O
; i  j  func  b0      k_b
;            (nm)   (kJ mol^-1 nm^-2)
1    2   6   0.116   61.1817
1    3   6   0.116   61.1817
1    4   6   0.116   61.1817

[ system ]
calcite_work

[ molecules ]
; The number of each species, as they appear in the gro file
CA 6
CO3 6

You have 6 CaCO3 in the PDB file, so the numbers in the last section should match.

Finally, you should order the atoms in the PDB file such that they appear in the same order as in the topology [ molecules ] and [ atoms ] section. Here, the order will be 6 Ca atoms first (this is already the case), then a C atom followed by three O; then another C followed by 3 other O, etc. So you need to reorder your file.

Note that, I have not checked the values nor whether the nonbonded/bonded interaction types are correct. I would suggest making a system with only two CaCO3 and computing the forces by hand, then compare to the 0th step of a simulation to ensure everything works as described in the paper.

@ebriand Thank you so much for your useful and detailed reply.

Alright.
Since this seems to be a Buckingham FF, then I think [ defaults ] should read:

;--------- Sections common to all molecules:
[ defaults ]
; nbfunc=2  for Buckingham
; comb-rule is the number of the combination rule. A 1 indicates that gromacs will interpret the parameters written in the topology file
; nbfunc    comb-rule   gen-pairs   fudgeLJ   fudgeQQ
2            1

Regarding [ nonbond_params ] I guess they’re the numbers on the table:

Hence:

;--------- Sections common to all molecules:
[ nonbond_params ]
;i   j   func              a           b            c6
;        (2=Buckingham) (kJ/mol)    (nm^-1)     (kJ mol^-1 nm^6)  
Ca   Ca  2                0           0            0  
Ca   C   2                0           0            0  
Ca   O   2               380509    39.75037        0  
C    C   2                 0           0           0                                                     
C    O   2                 0           0           0 
O    O   2              500000     39.60004        0.0023 

Even though they’re zero, I’ve also included the Ca Ca, C C and O O interactions (to be consistent with all the rows on the screenshot table).

However, I’m not sure about the [ atomtypes ] : This is my guess:

;--------- Sections common to all molecules:
[ atomtypes ]
;name  mass    charge    ptype    a         b        c6
;              (q/e)           (kJ/mol)    (nm^-1)   (kJ mol^-1 nm^6)
Ca     40.078   1.64202   A      0          0           0 
C      12.0107  1.04085   A      0          0           0  
O      15.999  -0.89429   A      500000     39.60004    0.0023  

It seems [ atomtypes ] needs
a, b and c6 parameters per atom (unless I’m interpreting this wrongly).
I wonder where should I be getting those parameters for Ca, C and O.
Just like I show above,
should a, b and c6
in [ atomtypes ] be taken from
those Ca Ca, C C and O O rows from the [ nonbond_params ] section?

Again, thanks a lot for your effort and useful explanation. Seriously, this is helping me a lot.

@ebriand Hope all is well. Just wanted to check: was my last question clear? Otherwise please let me know I’ll work harder to make it clearer.

Many thanks,
All the best

The parameters a/b/c6 given in the atom types are the “defaults” for the atom. If you declare atomtypes X and Y, and no nonbond_params, then the NB interaction between X and Y will be according to the param for types and the combination rule. If you explicitly specify a pair interaction in nonbond_params for X and Y, then these parameters will be used instead - they can be different from the default, of course.

Here you already have a table of parameters per interaction, so using nonbond_params makes sense.

See Parameter files — GROMACS 2019 documentation

Note that you can see a list of the generated non-bonded pair interactions (with parameters, including those that were generated through atomtypes), and which atom they apply to with gmx dump -s file.tpr. This is useful to debug/double check that you have not forgotten a line in nonbond_params, for instance.

@ebriand Many thanks for this explanation, truly.

Thanks a lot!, this makes total sense.
Hence, I will remove the a, b and c6 parameters from [ atom types ] ,
and will just leave the charge, mass and ptype:

;--------- Sections common to all molecules:
[ atomtypes ]
;name  mass    charge    ptype
;              (q/e)
Ca     40.078   1.64202   A
C      12.0107  1.04085   A
O      15.999  -0.89429   A

However, this may be redundant, since the charge and mass for Ca, C and O are defined below, in the sections for Ca and CO3 respectively:

;######### Sections for Ca:
[ atoms ]
;nr  atom type      residue  residue  atom      charge  charge    mass
;   (real atom or    number   name    name      group    (q/e)         
;    dummy)
1     Ca             1        MOL       Ca         1     1.64202   40.078


;######### Sections for CO3:
[ atoms ]
; Central C and three identical oxygens
;nr  atom type      residue  residue  atom      charge  charge    mass
;   (real atom or    number   name    name      group    (q/e)         
;    dummy)
1     C              1        MOL       C          1     1.04085   12.0107 
2     O              1        MOL       O          1    -0.89429   15.999
3     O              1        MOL       O          1    -0.89429   15.999
4     O              1        MOL       O          1    -0.89429   15.999

Maybe you have comments on this redundancy…
But most importantly, there’s an extra interaction in the force field that we are missing out and I cannot figure out how to include it:

“The CO3 groups […] and with an exponential repulsion between the oxygen atoms to prevent the folding of the molecule (uOO(r)= Bexp(-r/rho) where B = 2611707.2 kJ/mol and rho = 0.22 A˚)”

(below the screenshot and text to see the context).

How could we include that extra repulsion?

This is the topology we both have constructed so far:
topol_caco3-v4.top (2.5 KB)

Again, thank you so much, truly.

Removing the nonbonded parameters from atomtype will lead to warning when running grompp. Warnings will not stop you from proceeding forward, but suppressing warnings (-maxwarn) is considered bad practice as you could miss an important problem. I would suggest to leave the non bonded parameters in the atomtypes directives.

The general rule for redundancy is that the most specialized directive will override the more general directives. So the mass and charge in the atoms directive will be the ones used.

The geometry of the carbonate (CO_3^2-) ion is trigonal planar (all atoms in the same plane, and 360/3 degree angle between the oxygen), so beyond what the publication say, this is what the parametrization should aim to reproduce.

The proposed exponential repulsion is a way to maintain this geometry, however it is not a very good way to do so. This is because the very large force constant B will require an unecessarily small integration step.

So there are two options here:

  • Either you want to follow as closely as possible the protocol. Then you could use an additional NB potential with only repulsion between the oxygens, and use a small integration step.

  • Or you define bonded parameters to ensure a trigonal planar geometry. This is formally a deviation from the protocol, and you will have to figure out suitable parameters (find parameters that produce numerically similar forces to the described potential for small deviation from rest state position), or find a publication with bonded parameters for a carbonate ions, ie an harmonic bond angle potential for the OCO angle, and improper dihedral(s) to maintain planarity (see here and here).

Also, with the exclusion nrexcl = 1, the non bonded O-O interaction from the table will apply within the molecule, which is not wanted. If you increase the nrexcl, you will need to add the repulsive potential in intermolecular_interactions.