How can I get reproducible production mdrun results when running multiple times with the same inputs?

GROMACS version: 2021.1
GROMACS modification: No


I just finished 2 identical 300-ns production mdruns with all the same inputs and commands. When I check the RMSDs of these 2 mdruns, I find the 2 RMSD series look very different (e.g., series 1 and series 2 have opposite positions, and each series also gives a different profile in the 2 identical production mdruns), as shown below.
Does it mean what I did is wrong?

I thought that with all the same inputs, the final production mdrun results should look similar. If so, what is a good way to generate similar results for multiple replicates with all the same inputs (in the spirit of reproducible research)?

If a production mdrun cannot be reproduced exactly, what is a good way to explain these different results?



Hi Ming,

I think what you did is not wrong. Even though you use the same initial coordinates and input files, two simulations can differ because they will have different initial velocities if you do not set the same velocity-generation seed in your inputs. The RMSD measures the difference between the coordinates at time t and the initial coordinates, which can differ between two simulations started with different velocities, because the systems move in different ways.

You can set gen-seed to the same integer (any value other than -1; -1 is the default and produces a different random seed, and thus different velocities, in each simulation). If you are running Langevin (stochastic) dynamics or using the v-rescale thermostat, you should also set ld-seed to the same integer (again, not -1) to make sure everything is identical between the runs you want to reproduce.
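As a minimal sketch, the relevant lines in the .mdp file could look like this (the temperature and seed values below are arbitrary examples, not recommendations; any integer other than -1 works as a seed):

```
; Fixed seeds so that replicate runs start identically.
gen-vel   = yes       ; generate initial velocities from a Maxwell distribution
gen-temp  = 300       ; temperature for velocity generation, K (example value)
gen-seed  = 173529    ; same seed in every run -> same initial velocities
ld-seed   = 173529    ; seed for v-rescale / sd thermostat random numbers
```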



From my understanding, perfect reproducibility in MD simulations is a very tricky subject. Even with identical initial conditions (positions, forces, and velocities), the trajectory is chaotic: any difference at any point will propagate and cause two simulations to diverge. To prevent this on most computer architectures (and at floating-point precision), all calculations must be performed with the same precision and in exactly the same order.

But since GROMACS dynamically balances the load by shifting work between domains, among many other techniques, a lot of things can affect the order in which these calculations are done, which will affect the final trajectory. Even if these settings were held constant, I believe there is also the risk of random bit flips in memory, although I don't know how common those would be.
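For this reason mdrun has a -reprod option, which disables dynamic load balancing and other optimizations that introduce run-to-run variation. A sketch of how it might be invoked (file names and thread counts below are placeholders; even with -reprod, binary-identical trajectories are only expected on the same hardware with the same number of ranks and threads):

```
# -reprod avoids run-to-run variation from dynamic load balancing etc.
gmx mdrun -reprod -deffnm production -ntmpi 1 -ntomp 8
```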

Still, while the trajectories can be different, they both represent the same physics, so neither of the two produced trajectories is incorrect.

This is all to the best of my knowledge, although I'm not sure whether this or, as suggested above, a different RNG seed is what explains your divergence.

Kind regards,

Hello Pan and Petter,

Thanks so much for your help and explanation!
I will check the results with setting the seeds.

More questions:
(1) Generally, if one does not set the seed and gets different RMSD trajectories, which one should be selected to explain the experimental results or phenomena?

(2) Or, what is a good way to determine which trajectory is more appropriate for explaining the phenomena?

I have read some papers in which it seems only one trajectory was used to discuss the results (either from a single sufficiently long simulation, or perhaps picked from multiple simulations as the one that explains the results well?).

(3) Is it reasonable to pick only one trajectory out of multiple simulations because it explains the wet-lab results well? If not, what is a good way to determine which trajectory should be used to explain the wet-lab results?


All replicate simulations should be considered equally valid representations of possible outcomes of a given dynamical system. You wouldn’t do an experiment once and say your error bar is zero, would you? Sampling is key. So even if one trajectory is a perfect explanation of a given phenomenon, cherry-picking because you don’t like the other outcomes is dishonest and bad science. When different simulations give different results, the appropriate action is to comment on distinct behavior in the simulations, the other behaviors that were shared, and the probability of each happening. Rare events are interesting, too!
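The point above about reporting all replicates together can be sketched numerically: instead of choosing one RMSD series, report the per-frame mean and spread across replicates. A minimal example, assuming each replicate's RMSD series has already been extracted (e.g. with gmx rms) and loaded as equal-length arrays (the numbers below are toy data, not real simulation output):

```python
import numpy as np

def replicate_stats(rmsd_runs):
    """Per-frame mean and sample standard deviation across replicate RMSD series."""
    runs = np.asarray(rmsd_runs)              # shape: (n_replicates, n_frames)
    return runs.mean(axis=0), runs.std(axis=0, ddof=1)

# Toy data standing in for three replicate RMSD series (nm):
rmsd_runs = [
    [0.10, 0.15, 0.20, 0.22],
    [0.11, 0.18, 0.19, 0.25],
    [0.09, 0.14, 0.23, 0.21],
]
mean, std = replicate_stats(rmsd_runs)
```

Plotting the mean with an error band (or all replicates overlaid) shows both the shared behavior and the run-to-run spread, rather than presenting one cherry-picked trajectory.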


Hi Justin,

Thanks for your help and suggestions. I am running 3 identical 300-ns simulations and waiting for the 3rd result.

With the help of all three of you (Pan, Petter, and Justin), I am learning how to set seeds when running simulations and how to analyze the results properly.

Thanks so much!

