Trajectory in bash

S.Z · June 3, 2020, 7:12pm

GROMACS version: 2018
GROMACS modification: No

To calculate the distance between the C1 and C2 atoms of each molecule during a simulation, I loop over the molecules in the trajectory (3000 molecules), as in the inefficient bash script below:

for i in $(seq 1 3000)
do
    gmx distance -s topol.tpr -f traj.trr -n index.ndx -select 'group "r_'$i'_&_C1" plus group "r_'$i'_&_C2"' -oxyz $i.xvg
done

As expected, this is very time-consuming: for each $i (each molecule), the parts of the trajectory belonging to the other 2999 molecules are also loaded, processed, and then discarded.
So I wonder whether there is a better way in bash to read and store the whole trajectory just once, and then do the distance calculations.

I also tried partitioning the trajectory into single-molecule trajectories $i.trr first and then running gmx distance on each $i.trr, but I noticed no improvement.

Regards,
Salman

mick · June 7, 2020, 8:59pm

How ‘fat’ are your nodes? You could run the distance commands as background processes, something like:

for ((i=1; i<=3000; i++)); do
    # launch each calculation in the background
    gmx distance -s topol.tpr -f traj.trr -n index.ndx -select 'group "r_'$i'_&_C1" plus group "r_'$i'_&_C2"' -oxyz $i.xvg &
    # after every 10th job, wait for the running batch to finish
    if (( i % 10 == 0 )); then
        wait
    fi
done
# wait for any jobs still running after the loop
wait

(be warned, I haven’t tested this, it is just a first guess).

Alternatively, GNU parallel (https://www.gnu.org/software/bash/manual/html_node/GNU-Parallel.html) would also work.
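
A minimal sketch of the same bounded-parallelism idea, using xargs -P as a stand-in for GNU parallel (a swapped-in tool, not the one named here), because xargs hands the selection string to gmx as a single argument without re-parsing it through a shell. It assumes an xargs that supports -P and -I (the GNU and BSD versions do; -P is not POSIX), and the job count of 10 is an assumption carried over from the loop above. It has not been tested against a real trajectory, and every job still reads the whole trajectory.

```shell
# Number of gmx processes to run at once (assumption: tune to the node's core count).
JOBS=10

# Selection template; xargs replaces every {} with the molecule number.
SEL='group "r_{}_&_C1" plus group "r_{}_&_C2"'

# Run only when gmx is on PATH.
if command -v gmx >/dev/null 2>&1; then
    seq 1 3000 | xargs -P "$JOBS" -I{} \
        gmx distance -s topol.tpr -f traj.trr -n index.ndx \
        -select "$SEL" -oxyz {}.xvg
fi
```

Replacing gmx with echo in the xargs call gives a dry run that prints the arguments each job would receive.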

As a final alternative, you could do this in VMD as well (lots of Tcl scripts exist for this).

S.Z · June 9, 2020, 7:16pm

Thanks Micholas for your comment.
The nodes have either 44 or 32 cores. However, even if I had a node with 3000 cores, one per molecule, I believe that sending the distance commands to the background still wouldn’t solve the issue of unnecessarily reading/loading the whole trajectory 2999 extra times. I am looking more for a way in bash to read the trajectory only once.

JohnWhittaker · June 11, 2020, 9:09pm

I’m not really sure that you can tell bash to read the trajectory only once because you’re using a GROMACS tool to read the trajectory and do the distance calculation. The only thing bash is useful for there is doing the looping to rerun the command over and over.
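
One possible way to avoid the repeated reads while staying with bash and GROMACS, sketched under the assumption that gmx distance treats each selection it is given as an independent set of distances and that selections can be read from a file with -sf: bash only writes the 3000 selections, and a single gmx distance call then passes over the trajectory once. The file names selections.dat and dist_xyz.xvg are hypothetical, and the group names follow the r_<i>_&_C1 pattern from the original loop. This has not been tested on a real system.

```shell
# Number of molecules, taken from the original question.
nmol=3000
# Hypothetical file name for the list of selections (one per line).
sel_file=selections.dat

# Write one selection per molecule: the C1 and C2 atoms of molecule i.
i=1
while [ "$i" -le "$nmol" ]; do
    printf 'group "r_%d_&_C1" plus group "r_%d_&_C2"\n' "$i" "$i"
    i=$((i + 1))
done > "$sel_file"

# Single gmx call over the trajectory; skipped when gmx is not on PATH.
if command -v gmx >/dev/null 2>&1; then
    gmx distance -s topol.tpr -f traj.trr -n index.ndx \
        -sf "$sel_file" -oxyz dist_xyz.xvg
fi
```

With this approach all distance components should end up in one output file rather than in one $i.xvg per molecule, so the columns would need to be split afterwards if per-molecule files are required.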

If you’re comfortable with using Python then MDAnalysis might work better for you.

https://www.mdanalysis.org/

With tools like this, you typically load the trajectory only once and then iterate over its frames to calculate the distances of interest. Hopefully this helps.

John
