Infinite DNA (across boundary conditions)

GROMACS version: 2021
GROMACS modification: No

Hi fellow GROMACSers,

I am trying to simulate an infinite DNA using periodic boundary conditions. I have used a process that I believe was suggested by @jalemkul in a previous version of the gromacs forum, but it is very likely that I misunderstood it.

I first created generated a 12bp DNA pdb (Using this website: Sequence to Structure . If there are better ways of doing this, please offer suggestions.) I then ran it through pdb2gmx to create a topology. Finally, I manually removed the first and last basepairs from the .gro and topology outputs to create a 10bp DNA. I also renumbered the topology file(s). Finally, I put it in a box that perfectly fit the DNA in the Z-axis (3.4nm) with periodic boundary conditions.

The part I am pretty sure I messed up is that I connected the top phosphates to the bottom phosphates in the topology file, which I am pretty sure is incorrect as it led to bonds that stretched across the entire molecule.

What is the correct way to do this? I tried using “periodic-molecule = yes” in my .mdp file, but I didn’t think this actually added bonds, but merely led to a more detailed calculation across the PBCs. Should I have just used the 12bp DNA and put it in a simulation box that was undersized? In any case, I would appreciate any help. I tried to search the manual and forums and find something similar, but please let me know if I missed something obvious or need to just RTM.


I do not know the details of the recipe that you are following. Thus I can not comment on that point. Here some general suggestions:

  • One thing that you have to account when you remove the termini, is to define/to keep in the topology file the definition for the bond, angles and torsions between the molecule and its image such to ensure that the molecule is periodic.
  • One thing that it is crucial is that you start from a DNA structure in the correct conformation, since with this option you keep fix the number of residues per turn. You have 10 residues thus I guess you are interested in B-like structure (I also suggest to use 20 bp in place of 10 bp to allow more flexibility in the system).
    B-DNA has on average a rise of 0.33 nm, thus one expect a length of around 3.3 nm along z.
  • “periodic-molecule = yes” in .mdp file is the appropriate option to use here and it does not generate bond automatically, you have to manually including them in the topology file (see first points)
  • I suggest to use semi-isotropic for pressure coupling
  • As a tool to build DNA from a sequence, I usually use 3DNA

I hope it helps
Best regards

In the 2021 release, support for cyclic molecules was added to pdb2gmx. But pdb2gmx does not detect cyclic molecules over periodic boundary conditions.

If you know how to apply diffs to source code, the changes below should add support for PBC, but I haven’t tested this myself.

diff --git a/src/gromacs/gmxpreprocess/pdb2gmx.cpp b/src/gromacs/gmxpreprocess/pdb2gmx.cpp
index 6153e72b3e..44b0335e5a 100644
--- a/src/gromacs/gmxpreprocess/pdb2gmx.cpp
+++ b/src/gromacs/gmxpreprocess/pdb2gmx.cpp
@@ -2549,7 +2549,7 @@ int pdb2gmx::run()
             top_file2 = top_file;
-        pdb2top(top_file2, posre_fn.c_str(), molname.c_str(), pdba, &x, &atype, &symtab, rtpFFDB,
+        pdb2top(top_file2, posre_fn.c_str(), molname.c_str(), pdba, &x, box, &atype, &symtab, rtpFFDB,
                 restp_chain, hb_chain, bAllowMissing_, bVsites_, bVsiteAromatics_, ffdir_, mHmult_,
                 ssbonds, long_bond_dist_, short_bond_dist_, bDeuterate_, bChargeGroups_, bCmap_,
                 bRenumRes_, bRTPresname_, cc->cyclicBondsIndex, logger);
diff --git a/src/gromacs/gmxpreprocess/pdb2top.cpp b/src/gromacs/gmxpreprocess/pdb2top.cpp
index 351168fdc1..2a6de9a746 100644
--- a/src/gromacs/gmxpreprocess/pdb2top.cpp
+++ b/src/gromacs/gmxpreprocess/pdb2top.cpp
@@ -63,6 +63,7 @@
 #include "gromacs/gmxpreprocess/toputil.h"
 #include "gromacs/math/functions.h"
 #include "gromacs/math/vec.h"
+#include "gromacs/pbcutil/pbc.h"
 #include "gromacs/topology/residuetypes.h"
 #include "gromacs/topology/symtab.h"
 #include "gromacs/utility/binaryinformation.h"
@@ -754,6 +755,7 @@ static void at2bonds(InteractionsOfType*                  psb,
                      gmx::ArrayRef<MoleculePatchDatabase> globalPatches,
                      t_atoms*                             atoms,
                      gmx::ArrayRef<const gmx::RVec>       x,
+                     matrix                               box,
                      real                                 long_bond_dist,
                      real                                 short_bond_dist,
                      gmx::ArrayRef<const int>             cyclicBondsIndex,
@@ -765,6 +767,9 @@ static void at2bonds(InteractionsOfType*                  psb,
     long_bond_dist2  = gmx::square(long_bond_dist);
     short_bond_dist2 = gmx::square(short_bond_dist);
+    t_pbc pbc;
+    set_pbc(&pbc, PbcType::Unset, box);
     if (debug)
         ptr = "bond";
@@ -789,7 +794,9 @@ static void at2bonds(InteractionsOfType*                  psb,
             int aj = search_atom(patch.aj().c_str(), i, atoms, ptr, TRUE, cyclicBondsIndex);
             if (ai != -1 && aj != -1)
-                real dist2 = distance2(x[ai], x[aj]);
+                rvec dx;
+                pbc_dx(&pbc, x[ai], x[aj], dx);
+                real dist2 = norm2(dx);
                 if (dist2 > long_bond_dist2)
@@ -1483,6 +1490,7 @@ void pdb2top(FILE*                                  top_file,
              const char*                            molname,
              t_atoms*                               atoms,
              std::vector<gmx::RVec>*                x,
+             matrix                                 box,
              PreprocessingAtomTypes*                atype,
              t_symtab*                              tab,
              gmx::ArrayRef<const PreprocessResidue> rtpFFDB,
@@ -1514,7 +1522,7 @@ void pdb2top(FILE*                                  top_file,
     ResidueType rt;
     /* Make bonds */
-    at2bonds(&(plist[F_BONDS]), globalPatches, atoms, *x, long_bond_dist, short_bond_dist,
+    at2bonds(&(plist[F_BONDS]), globalPatches, atoms, *x, box, long_bond_dist, short_bond_dist,
              cyclicBondsIndex, logger);
     /* specbonds: disulphide bonds & heme-his */
diff --git a/src/gromacs/gmxpreprocess/pdb2top.h b/src/gromacs/gmxpreprocess/pdb2top.h
index 77d7f4b8cd..cd1243bc81 100644
--- a/src/gromacs/gmxpreprocess/pdb2top.h
+++ b/src/gromacs/gmxpreprocess/pdb2top.h
@@ -150,6 +150,7 @@ void pdb2top(FILE*                                  top_file,
              const char*                            molname,
              t_atoms*                               atoms,
              std::vector<gmx::RVec>*                x,
+             matrix                                 box,
              PreprocessingAtomTypes*                atype,
              t_symtab*                              tab,
              gmx::ArrayRef<const PreprocessResidue> rtpFFDB,

Thank you so much, Alessandra. All of your suggestions are extremely helpful. The first tip is the part that I struggled with:

In my original topology file (the 12 bp),after I cut off the ends, the 31st atom is my initial P and the 352nd atom is my last O. Before cutting off the ends, the 30th atom (O) was connected to the 31st ( P) and the 352nd (O) was connected to the 353rd atom ( P). So, for example, I simply deleted the bond reference linking the 30th atom to the 31st and modified the last bond to connect the 352nd atom to the 31st atom.
So now, for instance, my [ bond ] section (before renumbering, looks like):

31 32
31 33
31 34
34 35
352 31

(I did a similar thing to all bonds, angles, dihedrals, etc. that would cross the PBC.)

I guess the question is: Is this the correct way to link the molecule to its image? Or is this simply creating a strange, artificial bond that stretches from the top to the bottom of the DNA molecule?

Thank you, hess. I don’t know if I’ve got the chops for this, but I’ll check it out.

I get a little lost in the number. you have to look each strand independently. Assuming your strand is going from resid 1 to resid 12 and you have the following atoms numbering in the strand
Then what you wrote is correct.

If you visualize the gro file with VMD (and one of the drawing options) it occurs that you see a long bond going through the box from the first strand residue to the last strand residue. That is due to the fact that vmd think that it is one molecule and do not know that it is a continuum system.
I hope that makes thing more clear.
Best regards

This is perfectly clear. Thank you so much, Alessandra.