MD set up for all-atom Protein-ssDNA complex systems

GROMACS version: 2020.3
GROMACS modification: No

I am fairly new to MD and would like to simulate a protein-ssDNA complex. I have been reading the literature to try to establish the best setup for my system, and I would be happy to hear suggestions from the community or to discuss the current state of best practice for these types of simulations in general.

My simulation setup will contain the protein and a short ssDNA fragment of ~20 nucleotides in 0.15 M NaCl, with the TIP3P water model. I am interested in analysing protein-DNA interactions to investigate DNA arrangement and dynamics, as well as protein-ion interactions. The only information I have about potential interactions is from the structure, which suggests a weak coordination of the phosphate backbone to positively charged residues.
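As a rough sanity check on the 0.15 M NaCl figure, the number of ion pairs can be estimated from the box volume. This is a minimal sketch only: the function name, the 8 nm cubic box and the net solute charge of -12 are hypothetical illustration values, and it uses the total box volume rather than the solvent volume, which is an approximation.

```python
AVOGADRO = 6.02214076e23   # particles per mole
LITRES_PER_NM3 = 1e-24     # 1 nm^3 = 1e-27 m^3 = 1e-24 L


def salt_ion_counts(box_volume_nm3, conc_molar=0.15, solute_charge=0):
    """Estimate (n_na, n_cl) for a box of the given volume.

    Salt pairs come from concentration * volume; extra counterions are
    added so that the whole system (solute + ions) is neutral.
    """
    n_pairs = round(conc_molar * AVOGADRO * box_volume_nm3 * LITRES_PER_NM3)
    n_na, n_cl = n_pairs, n_pairs
    if solute_charge < 0:
        n_na += -solute_charge   # extra Na+ to neutralise a negative solute
    else:
        n_cl += solute_charge    # extra Cl- to neutralise a positive solute
    return n_na, n_cl


# Hypothetical example: 8 nm cubic box, complex with net charge -12
n_na, n_cl = salt_ion_counts(8.0 ** 3, conc_molar=0.15, solute_charge=-12)
print(n_na, n_cl)
```

In a box this size there are only a few dozen ions of each type, which is worth keeping in mind when judging how well protein-ion interactions can be sampled.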

I gather that the CHARMM and AMBER forcefields are most commonly used to simulate protein-DNA systems. However, a couple of years ago this paper came out highlighting some artefacts with the CHARMM36 forcefield when simulating dsDNA over longer, microsecond timescales: https://pubs.acs.org/doi/10.1021/acs.jpcb.9b09106. Although my simulation setup will be very different from what was done in this paper, I am leaning towards using the AMBER forcefields because of the potential problems with CHARMM. That said, the CHARMM36 forcefield may model protein-ssDNA dynamics fine over long timescales. I need to do more reading on this, and I don't actually know whether the problem is limited to dsDNA. I wondered if anyone had any input.

According to the latest AMBER manual on their website (the 2021 manual), the recommended forcefields are ff14SB/OL15 for protein/DNA, or alternatively ff14SB/bsc1. I gather that one of the main problems I need to be aware of is that both the CHARMM and AMBER forcefields, combined with the TIP3P water model, produce overly strong NA-NA and protein-DNA interactions. To counteract this, I have noticed a couple of papers using CUFIX, a non-bonded fix for the CHARMM/AMBER forcefields. However, as highlighted in their review, there are some concerns with using this approach: "New tricks for old dogs: improving the accuracy of biomolecular force fields by pair-specific corrections to non-bonded interactions" (Physical Chemistry Chemical Physics, RSC Publishing). I have also noticed some papers that propose novel forcefields, for example Tumuc1, but I haven't seen many papers testing or validating these newer forcefields yet.

So there is a lot to consider. Is there any sort of consensus on best practice for this type of setup, given the tested forcefields and additional fixes available to us at the moment, with the end goal being to analyse protein-DNA and protein-ion interactions? Do people tend to use the standard forcefields but acknowledge that interactions may be overestimated, or do they use fixes like CUFIX?

Hi,

According to my recent investigation, the best forcefield for ssDNA would indeed be amber14 rather than charmm36: https://doi.org/10.1016/j.omtn.2021.07.015

The problem with charmm36 is that ssDNA tends to unfold completely in this forcefield. However, amber14bsc1 has the opposite problem: because it was created for dsDNA, it tends to make everything more rigid, which prevents conformational sampling of loops. So I would indeed suggest using normal amber14 for ssDNA. I did not check more recent Amber forcefields, but you could try them as well by running ~100 ns of dynamics and then clustering the trajectories.
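
To illustrate what clustering the trajectories involves, here is a minimal sketch of a gromos-style clustering on a precomputed pairwise RMSD matrix (the same idea as `gmx cluster -method gromos`, but not the GROMACS implementation). The function name, the toy 5-frame matrix and the 0.2 nm cutoff are assumptions made up for illustration; in practice the matrix would come from the ssDNA coordinates of your own trajectory.

```python
def gromos_clusters(rmsd, cutoff):
    """Cluster frames from a symmetric pairwise RMSD matrix (list of lists).

    Repeatedly pick the frame with the most neighbours within `cutoff`
    as a cluster centre, assign it and its neighbours to one cluster,
    remove them from the pool, and continue until no frames remain.
    Returns a list of (centre_index, sorted_member_indices) tuples.
    """
    remaining = set(range(len(rmsd)))
    clusters = []
    while remaining:
        # Neighbours of every remaining frame (each frame counts itself)
        neighbours = {
            i: [j for j in remaining if rmsd[i][j] <= cutoff]
            for i in remaining
        }
        # Most neighbours wins; ties are broken by the lowest frame index
        centre = min(remaining, key=lambda i: (-len(neighbours[i]), i))
        members = sorted(neighbours[centre])
        clusters.append((centre, members))
        remaining -= set(members)
    return clusters


# Toy RMSD matrix in nm (made-up values): frames 0-2 are similar, 3-4 are similar
toy_rmsd = [
    [0.00, 0.10, 0.15, 0.90, 1.00],
    [0.10, 0.00, 0.10, 0.80, 0.90],
    [0.15, 0.10, 0.00, 0.85, 0.95],
    [0.90, 0.80, 0.85, 0.00, 0.10],
    [1.00, 0.90, 0.95, 0.10, 0.00],
]
for centre, members in gromos_clusters(toy_rmsd, cutoff=0.2):
    print(centre, members)
```

If one forcefield gives a single dominant cluster while another gives many small ones, that is a quick indication of the rigidity versus unfolding behaviour described above.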