Sunday, 13 December 2015

Reproducibility in molecular dynamics

When I was a PhD student I once asked David Osguthorpe for some advice. I was using Discover to carry out molecular dynamics simulations on a small peptide. He told me that the version Biosym were selling was unreliable and that you got different answers every time you ran it. He also told me that my loop simulations were wrong and that I must have forgotten to cap the peptide ends, as otherwise the peptide would not have folded into any loop form.

A few ideas and lessons came from this:

  1. I ignored my peptide simulation results, so they were never published. At the time it was the largest and longest peptide simulation ANYONE had done (this was 1994).
  2. The structure of the peptide bound to an enzyme became available and it was in the conformation that I had observed in the calculations! This structure was later retracted as it was based on a poor enzyme crystal structure (I corrected the structure in the PDB).
  3. I now know to run some simulations with a fixed random seed to check that they are reproducible. This separates the variation that comes from the computer or the code from the variation that comes from the simulation itself.
  4. I have shown that, despite the variability, the simulations reflect some sort of physical reality in that they follow the Arrhenius equation. This holds for an ensemble, that is, an average over multiple simulations.
  5. Now I run all simulations a number of times, and it worries me how irreproducible they are when the seeds are not fixed. This seriously undermines the reproducibility of the field and supports my proposed second doctoral supervisor's reasons for not taking that position on. His comment was that it is garbage in and garbage out.
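
The seed check in point 3 and the repeated runs in point 5 can be illustrated with a small sketch. This is not Discover or any real molecular dynamics package: it is a toy one-dimensional Langevin walk in a double-well potential, and the function name `run_trajectory` and all of its parameters are made up for the illustration. In this single-threaded toy the same seed gives an identical trajectory, so any difference between two fixed-seed runs of a real package would point to the computer or the code rather than to the simulated physics.

```python
import math
import random
import statistics


def run_trajectory(seed, n_steps=2000, dt=0.01, kT=0.3):
    """Toy overdamped Langevin walk in the double well U(x) = (x^2 - 1)^2.

    Hypothetical stand-in for a real MD run; all parameters are illustrative.
    """
    rng = random.Random(seed)  # private generator: the seed fully determines the noise
    x = -1.0                   # start in the left-hand well
    traj = []
    for _ in range(n_steps):
        force = -4.0 * x * (x * x - 1.0)                  # -dU/dx
        noise = math.sqrt(2.0 * kT * dt) * rng.gauss(0.0, 1.0)
        x += force * dt + noise
        traj.append(x)
    return traj


# Fixed seed: two runs should agree exactly.
same_a = run_trajectory(seed=42)
same_b = run_trajectory(seed=42)
# Different seed: the trajectory is expected to differ.
other = run_trajectory(seed=43)

print("same seed identical:", same_a == same_b)
print("different seed identical:", same_a == other)

# Unfixed (here: varied) seeds: look at the ensemble, not at a single run.
finals = [run_trajectory(seed=s)[-1] for s in range(20)]
print("ensemble mean of final x:", statistics.mean(finals))
print("ensemble spread (stdev):", statistics.stdev(finals))
```

The spread across seeds is the simulated variation; it is the ensemble average over such runs, not any single trajectory, that is worth comparing with something like the Arrhenius equation.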

