DNA-RNA Simulations

Does there exist a forcefield meant for simulating complexes of both DNA and RNA? I’m hoping to run a simulation of an R-loop, but I’d read from this post that there are usually issues with RNA forcefields and so far haven’t been able to find any mention of forcefields that are used for complexes with both DNA and RNA. Any advice/suggestions would be greatly appreciated!

Reiterating what I wrote in that thread - for an R-loop, the best option (at least until we release the next iteration of the nucleic acid FF ;)) would be to use DE Shaw’s RNA/DNA FFs due to the significant impact of the single-stranded part, and they certainly improve ssDNA/ssRNA description over alternatives. However, for that you have to ask them for the files directly, and they’re not in the most convenient format, so some knowledge of Gromacs technical details is necessary.

Meanwhile, in the user-contributed FFs, there is a ready-made combination of OL3 (for RNA) and OL15 (for DNA) that should work right away. You can also combine OL3 with parmbsc1 by preparing them separately, adding the parameters explicitly with gromologist, and merging the two molecules into one topology. These options should be much easier to implement but might yield a less flexible ssDNA region than desired; look at this paper for a discussion of how much less.

Would you be able to help explain an error I’m encountering with gromologist? I’ve been trying to run .add_ff_params() and am getting an error RuntimeError: Wildcard (‘X’) entries were found prior to regular ones, please fixyour FF parameters, but I don’t know what is meant by Wildcard entries or how to fix them

Oh yes, wildcard dihedral entries (ones containing “any”, or “X” types) are the ones that should only be applied if explicitly defined dihedrals are not specified, and it used to be an issue until perhaps 2016 or 2018 that changing the order of [ dihedraltypes ] would change which parameters were used (obviously not a great thing).

You should be able to fix that manually by moving the blocks with Xs to the bottom of the dihedrals section, but perhaps the issue is ancient enough that I should change it to a warning rather than throw an error.

Thank you for your reply! There are a few more points which I still need some guidance on.

  • When I tried using the .add_parameters_from_file() function to merge the topologies of a DNA and RNA, I get multiple instances of ‘Found mismatch between original entry’. My understanding is that the forcefield for RNA and DNA have defined parameters for atoms, bonds and angles that overlap, and while not all of them are used for RNA and DNA, the ones that are might have conflicting values, so one is chosen over the other. Since the user-contributed FFs seem to have cases where different forcefields are used for DNA and RNA, I was wondering how this conflict is usually solved. Is there a way to have only those parameters that are needed for DNA and RNA loaded into the topology file? If some of the values parametrizing the bonds that DNA and RNA have in common do conflict, is there a way to make sure the values specific to DNA are only used for DNA, and only those for RNA are used for RNA, or is it necessary to choose the values from one forcefield over the other and you just have to assume that one of the molecules are going to be parametrized incorrectly?
  • Upon removing the wildcard entries, it seems .add_ff_params() worked, but upon adding the parameters and using .find_missing_ff_params(), the function returned multiple instances of ‘Couldn’t find params for interaction type dihedrals’, so I also wanted to know whether this issue is separate from the removal of the wildcards or if there’s some other way to get past this
  • When I tried a different forcefield combination, using .add_parameters_from_file() returned this error “RuntimeError: Found multiple entries matching the types of (‘HC’, ‘CT’, ‘CM’) in section angletypes, make sure this is intentional” which I’m again not sure what to make of

Thank you in advance!

Hi! First of all, don’t remove wildcards, they are needed. It’s just that Gromologist requires them to be at the bottom of the list, in order to avoid them being applied wrongly in older Gromacs versions.

It is entirely possible that the DNA and RNA FFs will have conflicting entries for some bonded terms, that’s why the best strategy is to (in this order):

  1. First prepare the RNA and DNA topologies separately
  2. Run t.add_ff_params() and t.explicit_defines() on both topologies and save them separately - this way all the bonded terms will be self-contained
  3. Run t.add_molecule_from_file(...) and t.add_molecule_to_system(...) to move, say, the RNA molecule into the DNA system’s topology
  4. Run t.find_missing_ff_params() to check for the last part that’s possibly missing, i.e. nonbonded parameters; it’s next to impossible for nucleic acids’ atom types in chiOL3 and bsc1 to have different LJ parameters, but other atom types (protein, ions etc) might differ slightly, and if any are missing, make sure to transfer them too.

This should be generally more robust than using add_parameters_from_file, and sorry if that wasn’t clear from my previous suggestion.

Either way, let me know if it ends up working!

I’ve tried the method for both the OL3/parmbsc1 combination and for the parameters for the DE Shaw forcefields, so far it seems to work up to the .find_missing_ff_params() step, at which point I get ‘Couldn’t find definition of atom type’ messages, the atoms being the RNA’s C5 for the OL3/parmbsc1 and the RNA’s CA_CA and NC_NA for the DE Shaw forcefields. When I checked the .top files saved from step 2, these atoms were present under the RNA .top file but not in the DNA .top file, so is the issue that not all the parameters transferred from the RNA to the DNA topology file? If so, is there a way to specify adding those specific atoms, or is it best to just manually copy-paste the parameters involved with those atoms from the RNA to the DNA topology before merging the topologies?

I’d say if they are just 1-2 atom types, it’s safer to transfer the corresponding 1-2 lines manually. You can merge automatically but then the chance of something getting overwritten becomes somewhat larger. Of course, you don’t need to worry about the bonded terms, as these should be already explicitly embedded in the molecules’ definitions.