Importance of .tpr file for generating trajectories

GROMACS version: 4.5.5

Hello everyone,

I am trying to generate .pdb files using the -pbc whole and -pbc nojump options. As far as I know, recovering broken molecules always requires an initial structure file, which can be either a .tpr or a .g96 file.

So I am curious what exactly the .tpr file does to recover broken molecules that the .g96 file does not, given that both files (.g96 and .tpr) contain the initial coordinates of the molecules in the system.

Regards
Dalip

Molecules can only be made whole when topological information (e.g. bonded connectivity) is available. That information is in a .tpr file but not in coordinate files.
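To make the role of connectivity concrete, here is a minimal Python sketch of how a molecule can be reassembled once the bonds are known. It is not the GROMACS implementation. The function names make_whole and min_image, the rectangular box, and the toy coordinates are all assumptions made for illustration.

```python
from collections import deque


def min_image(d, length):
    """Shift a 1D separation d to its nearest periodic image."""
    return d - length * round(d / length)


def make_whole(coords, bonds, box):
    """Reassemble bonded molecules that were split by the periodic box.

    coords: list of (x, y, z) tuples, possibly wrapped into the box
    bonds:  list of (i, j) atom index pairs (the topological information)
    box:    (lx, ly, lz) edge lengths of a rectangular box (an assumption)
    """
    neighbours = {i: [] for i in range(len(coords))}
    for i, j in bonds:
        neighbours[i].append(j)
        neighbours[j].append(i)

    fixed = [None] * len(coords)
    for start in range(len(coords)):
        if fixed[start] is not None:
            continue
        # The first atom of each molecule keeps its position.
        fixed[start] = tuple(coords[start])
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in neighbours[i]:
                if fixed[j] is not None:
                    continue
                # Place j at the periodic image closest to its bonded partner i.
                fixed[j] = tuple(
                    fixed[i][k] + min_image(coords[j][k] - fixed[i][k], box[k])
                    for k in range(3)
                )
                queue.append(j)
    return fixed


# Toy diatomic molecule split across the x boundary of a 3 x 3 x 3 box.
coords = [(2.95, 1.0, 1.0), (0.05, 1.0, 1.0)]
box = (3.0, 3.0, 3.0)
print(make_whole(coords, [(0, 1)], box))  # bond known: atom 1 moves next to atom 0
print(make_whole(coords, [], box))        # no bonds: nothing can be repaired
```

With the bond supplied, the second atom is shifted by one box length to roughly x = 3.05, next to its partner. With an empty bond list the coordinates come back unchanged, which is the situation a plain coordinate file leaves you in.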

Hello Professor,
Thank you so much for highlighting this. I was unaware of it; I thought that the initial structure file itself contained the connectivity of the atoms.

Regards
Dalip

Hello Professor,
I have one more concern; kindly give your expert advice.
I am using the following commands to recover the broken molecules (using the .tpr file), but some molecules are still broken:
trjconv -f prod.trr -s prod.tpr -pbc whole -b XX -e YY -o ABC.trr
trjconv -f ABC.trr -s prod.tpr -pbc nojump -b xx -e yy -o abc{xx}.pdb
Or, instead of the above commands, should I use the initial coordinate file for the nojump step?
trjconv -f prod.trr -s prod.tpr -pbc whole -b XX -e YY -o ABC.trr
trjconv -f ABC.trr -s in.g96 -pbc nojump -b xx -e yy -o abc{xx}.pdb

Please advise: why is the initial coordinate file important for the nojump option?

With -pbc whole there should never be broken molecules.

Using -pbc nojump just requires that the molecules in the reference structure are imaged as you would like them.

There are very few instances for which -pbc mol -center doesn’t work, so that may be an alternative, at least if your system is relatively simple.
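To illustrate the point about the reference structure for -pbc nojump, here is a minimal Python sketch of jump removal. It is an illustration of the idea only, not the trjconv source. The function name nojump, the rectangular box, and the toy coordinates are assumptions.

```python
def nojump(reference, frames, box):
    """Remove box-sized jumps, atom by atom, relative to the previous frame.

    reference: list of (x, y, z) tuples whose imaging should be preserved
    frames:    list of frames, each a list of (x, y, z) tuples
    box:       (lx, ly, lz) edge lengths of a rectangular box (an assumption)
    """
    previous = [tuple(atom) for atom in reference]
    result = []
    for frame in frames:
        current = []
        for prev_atom, atom in zip(previous, frame):
            # Pick the periodic image of each coordinate closest to where
            # the same atom was in the previous (already corrected) frame.
            current.append(tuple(
                x - length * round((x - p) / length)
                for p, x, length in zip(prev_atom, atom, box)
            ))
        result.append(current)
        previous = current
    return result


box = (3.0, 3.0, 3.0)
# One wrapped frame in which a diatomic molecule is split across the x boundary.
frame = [(2.95, 1.0, 1.0), (0.05, 1.0, 1.0)]

whole_ref = [(2.85, 1.0, 1.0), (2.95, 1.0, 1.0)]   # molecule whole in the reference
broken_ref = [(2.95, 1.0, 1.0), (0.05, 1.0, 1.0)]  # molecule split in the reference

print(nojump(whole_ref, [frame], box))   # second atom ends up near x = 3.05
print(nojump(broken_ref, [frame], box))  # second atom stays near x = 0.05
```

In this sketch the output simply inherits the imaging of the reference: a whole reference yields whole molecules, while a reference with split molecules keeps them split. No connectivity is used at any point.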

Hello Professor,
I am using -pbc whole in the first step and -pbc nojump in the second step, and the second step leaves molecules broken. I am surprised; how is this possible?

Here is the overall scenario. Let’s say I have a simulation trajectory of 5 ns.

Step 1: Convert the whole trajectory (0 to 5 ns) into .trr or .pdb format

trjconv -f prod.trr -s prod.tpr -pbc whole -o trj.trr (or trj.pdb)
(This makes all molecules whole. This works perfectly; I have checked it in VMD.)

Step 2: Separate the trajectory into individual frames at a time interval of 5 ps (1000 frames)
trjconv -f trj.trr (or trj.pdb) -s prod.tpr -pbc nojump -b xx -e yy -o framesXXX.pdb

Here I found that the molecules are broken again. Why does this happen, even though I have already repaired the broken molecules? Or does the nojump option itself cause molecules to break?

Regards
Dalip

Can you please provide some screenshots of equivalent frames after -pbc whole and then after -pbc nojump? There is nothing about the nojump option that would cause intact molecules to become broken again.

Hello Professor,
Sorry for the late response; I was trying to understand -pbc whole and -pbc nojump. Below I explain the whole procedure in detail. Please comment on whether it is correct.

  1. I have a long trajectory file, say 10 ns, in which the coordinates are saved at a time interval of 5 ps.

  2. I split this long (10 ns) trajectory into 1 ns trajectories:
    trjconv -f prod.trr -b 0 -e 1000 -o trj1ns.trr
    trjconv -f prod.trr -b 1000 -e 2000 -o trj2ns.trr
    .
    .
    .
    trjconv -f prod.trr -b 9000 -e 10000 -o trj10ns.trr

[For the 1st segment, i.e. 0-1 ns]

  3. I generate the initial frame using the command below:
    trjconv -f prod.trr -s prod.tpr -b 0 -e 0 -pbc whole -o input1st.pdb

  4. I generate the smooth trajectory using -pbc nojump from the first 1 ns trajectory, trj1ns.trr:
    trjconv -f trj1ns.trr -s input1st.pdb -pbc nojump -o trjnojump1ns.pdb

  5. I then extract all the frames from trjnojump1ns.pdb:
    trjconv -f trjnojump1ns.pdb -s input1st.pdb -b 0 -e 0 -o PDBGEN0.pdb
    trjconv -f trjnojump1ns.pdb -s input1st.pdb -b 5 -e 5 -o PDBGEN5.pdb
    trjconv -f trjnojump1ns.pdb -s input1st.pdb -b 10 -e 10 -o PDBGEN10.pdb
    .
    .
    trjconv -f trjnojump1ns.pdb -s input1st.pdb -b 1000 -e 1000 -o PDBGEN1000.pdb

[For the 2nd segment, i.e. 1-2 ns]

  6. I generate the smooth trajectory using -pbc nojump from the second 1 ns trajectory, trj2ns.trr:
    trjconv -f trj2ns.trr -s PDBGEN1000.pdb -b 0 -e 1000 -pbc nojump -o trjnojump2ns.pdb

  7. Again, I extract the frames from trjnojump2ns.pdb:
    trjconv -f trjnojump2ns.pdb -s PDBGEN1000.pdb -b 1000 -e 1000 -o PDBGEN1000.pdb
    trjconv -f trjnojump2ns.pdb -s PDBGEN1000.pdb -b 1005 -e 1005 -o PDBGEN1005.pdb
    .
    .
    trjconv -f trjnojump2ns.pdb -s PDBGEN1000.pdb -b 2000 -e 2000 -o PDBGEN2000.pdb

and so on, up to PDBGEN10000.pdb

Here are my questions:
Q1. Should the last frame of the 1st segment (e.g. PDBGEN1000.pdb) be used as the starting frame for the next segment (the 1-2 ns segment)?
Q2. Or should I always use the 1st frame (e.g. input1st.pdb) for all subsequent segments (e.g. 1 ns, 2 ns, 3 ns, … 10 ns)?
Q3. What exactly does -s input1st.pdb/PDBGEN1000.pdb/PDBGEN2000.pdb do together with -pbc nojump in this step?

Kindly share your thoughts: does this seem correct to you for generating smooth trajectory frames?

You’re doing way too much work here. I don’t understand why you’re splitting the trajectory up into intervals and trying to adjust PBC individually; that makes no sense. Also, never use a PDB file with -s when also trying to use -pbc. Always use a .tpr file.

All you should need is

gmx trjconv -s prod.tpr -f prod.trr -o whole.trr -pbc whole
gmx trjconv -s prod.tpr -f whole.trr -o nojump.trr -pbc nojump

Or even more simply

gmx trjconv -s prod.tpr -f prod.trr -o mol.trr -pbc mol -center

Hello Professor,
I have ten very long trajectories of 2000 ns each. If I generate all possible frames (400000) from one such trajectory (2000 ns) with the commands you mentioned, it will take months. So I need to break it into small segments of 1 ns. My major issue is how to handle PBC when dealing with small segments of a trajectory. For my analysis (my own program), I need all the frames as .gro or .pdb files to do the calculations. Also, the .trr file is a binary file and I cannot read it directly.

The .tpr file provides the initial connectivity of the molecules, and -pbc whole makes them whole. After that, their motion simply follows how the system behaves.

In my understanding, each segment needs the coordinate file from that point in time (the last frame of the previous segment serves as the first frame of the next segment), so that the diffusion of the molecules is not affected. Please correct me if I am wrong!

You should be able to use one reference .tpr to correct all PBC issues with a single trajectory. I still do not understand why you would break it into pieces; this will be very slow the way you’re doing it. I have never seen trjconv take “months” to process any trajectory. You can easily drop out frames individually from a corrected trajectory file with gmx trjconv -sep rather than explicitly using -b -e with the same time (which also is redundant with -dump).
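For anyone scripting this from their own analysis program, here is a minimal Python sketch that builds the -sep call. The helper names sep_frames_command and run, and the file names nojump.trr and frame.pdb, are hypothetical. It also assumes that index group 0 is the default System group. GROMACS 4.5.x installs trjconv as a standalone binary, so pass gmx=None there to drop the gmx prefix.

```python
import subprocess


def sep_frames_command(tpr, trajectory, out_name, gmx="gmx"):
    """Build the argument list that writes every frame to its own file.

    gmx: name of the wrapper binary, or None for GROMACS 4.5.x-style
         standalone tools (trjconv invoked directly).
    """
    prefix = [gmx, "trjconv"] if gmx else ["trjconv"]
    return prefix + ["-s", tpr, "-f", trajectory, "-o", out_name, "-sep"]


def run(command, group="0\n"):
    """Run trjconv, answering its output-group prompt on stdin.

    Group 0 is assumed here to be the default System group.
    """
    subprocess.run(command, input=group, text=True, check=True)


# Print the command instead of running it, so the sketch works without GROMACS.
print(" ".join(sep_frames_command("prod.tpr", "nojump.trr", "frame.pdb")))
print(" ".join(sep_frames_command("prod.tpr", "nojump.trr", "frame.pdb", gmx=None)))
```

Calling run(...) on the returned list should write the frames as separately numbered .pdb files in a single pass over the already corrected trajectory, so there is no need for one trjconv call per frame.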

Thank you very much, Professor. Now, things make more sense to me. Thank you for your time and fruitful discussion. I hope this discussion will help many more!