Visualizing clusters of molecules with trjconv with pbc

GROMACS version: 2016
GROMACS modification: No

I am running simulations which involve some proteins and some lipids which end up forming a cluster. I am having some issues visualizing it in some cases where the cluster of the molecules crosses the periodic boundary conditions. An example is below:

aBeta_double_48dmpcAndMonomer_150mMSalt

Proteins are blue and lipids are grey (with other colors highlighting residues or atoms in the lipids to determine termini and lipid orientation). The two large clusters of grey appear to be part of one larger cluster broken across the boundary conditions. I am having trouble figuring out the correct way to “stitch” this together using gmx trjconv. I have tried the following:

  • each type of pbc option. The cluster option actually broke this system up into multiple small clusters, > 4, which clearly is not the case here (at least from my perspective). whole and mol inherently do not help here except keeping molecules whole.
  • I have incorporated “-ur compact”
  • Using “-center yes” for any group (Protein, Lipid, Protein + Lipid using an index file, the whole system) does not help and only shifts the box at best.

There are total of 98 residues so I’d like to avoid the manual labor of finding which lipids are across the box and just doing to calculation by hand to change the coordinates. I am using VMD to visualize in case that’s useful. Would anyone have some suggestions for how to tackle this problem?

Thanks in advance!

1 Like

Heya,

I just finished writing a small script which fixes exactly this issue. I do not know if you still need it, but you can find it here:

It uses voxels and a bit of graph theory to solve the issue, for any amount of consecutive PBC crossings and makes it whole (pdb, gro, xtc, etc.). You can specify the density using the MDAnalysis syntax, usually it is fine to just specify ‘not solvent’. For whatever your solvent is It works near realtime for pretty large systems on a laptop and makes no use of copies of the original box, making the RAM usage very low (again a laptop can do it near realtime for the system you describe).

Cheers,

Bart

This looks great I will check it out!

Please feel free to open as many issues/topics as you want at the github page, we are still trying to improve the user experience.

hello @BrianAndrews
Did u resolve how to extract? Actually I am working on a Protein dimer and struggling with extraction procedure and facing difficulty similar to yours.
Can u help if possible?
Thank you in advance.

Hi priyanka,

I never ended up trying the package mdvwhole in this thread (although i want to). But it claims to do exactly what you are looking for.

But it mostly depends on the specifics of what information are you trying to get. Is it just a dimer in water with neutralizing ions? Does the dimer dissociate during the times you want to export the structure?

The simplest case is a dimer in water that remains dimeric during the simulation times you will extract. In that case, you can use the ‘-center’ flag in gmx trjconv and choose the protein group to center the dimer in the box in each frame of your trajectory. Then use ‘-pbc whole’ to ensure the dimer does not jump across the box. That way you should get a whole dimer to observe the structure ovet time. If your case is more complicated, you may need to use different PBC treatment to observe the desired behavior.

Hope this helps,

Brian

Thank you brian for responding.

Actually about the mdvwhole, I won’t be able to use it, because I want dimer along with surrounding water for analysis.
And I am not sure whether the dimer is dissociating or not as one extraction methods shows some frames having separation, and another extraction method shows no separation at all in any of the frames.
I have tried 3 command using gmx_trjconv and obtained the results as follows:

  1. -center -pbc whole (centering on chain A) - some frames show separation but one chain is out of the box.
  2. -center -pbc mol -ur compact (centering on whole protein) - the same results as above method but the frames which show separation , have chains at opposite ends of box with some part of the chain outside the box.
  3. -center -pbc mol -ur compact (centering on chain A) - shows both chains at middle of the box with no separation at all in any of the frames.

Note: Structural conformation of each chain individually remains same in all the 3 cases.

This is what confuses me and I am not sure which extraction method is correct.
Any suggestions?

With regards,
Priyanka

Indeed by default it returns only the selection. One could combine the water from the solvated trajectory using MDAnalysis, however, I will add this functionality into the software. Maybe if you state what kind of analysis you have in mind, I could make a suggestion on how to overcome this problem.

I’d suggest two things.

  1. Try -center (whole protein, so both monomers) and -pbc whole. Just in case.

  2. A nonvisual analysis. You could do gmx clustsize for the protein group. This would output an xvg file and it would give you a distribution of the time in the monomeric state and the dimeric state. The x axis is atom numbers so you should see only two peaks. If you only see one, its either entirely monomeric or entirely dimeric. This would help you not rely on potentially weird artifacts of the visualizations.

Hope this helps!

The feature to write all atoms of the input file as well as multiple selections to make whole has been added to the package as well as the pip distribution.


  1. As suggested -center and -pbc whole gave the same result as I had got with -center and -pbc mol -ur compact (centering on whole protein dimer).

  2. And gmx_clustsize is giving fatal error with
    gmx_mpi clustsize -f whole.xtc -s equil1/md.tpr -n index.ndx -o size.xpm -nc nclust.xvg

Group 0 ( System) has 90938 elements
Group 1 ( Protein) has 1768 elements
Group 2 ( Protein-H) has 860 elements
Group 3 ( C-alpha) has 124 elements
Group 4 ( Backbone) has 372 elements
Group 5 ( MainChain) has 494 elements
Group 6 ( MainChain+Cb) has 600 elements
Group 7 ( MainChain+H) has 622 elements
Group 8 ( SideChain) has 1146 elements
Group 9 ( SideChain-H) has 366 elements
Group 10 ( Prot-Masses) has 1768 elements
Group 11 ( non-Protein) has 89170 elements
Group 12 ( Water) has 89166 elements
Group 13 ( SOL) has 89166 elements
Group 14 ( non-Water) has 1772 elements
Group 15 ( Ion) has 4 elements
Group 16 ( CL) has 4 elements
Group 17 ( Water_and_ions) has 89170 elements

Total number of atoms in clusters = 1768
cmid: 1, cmax: 1, max_size: 1768
Fatal error:
Lo: 0.000000, Mid: 1.000000, Hi: 1.000000

Thank you @Bart . I will try using this.

I get an error when I try to install on windows as below:

C:\Users\DELL>pip install mdvwhole
Collecting mdvwhole
Downloading mdvwhole-0.0.4-py3-none-any.whl (15 kB)
Collecting mdvoxelsegmentation
Downloading mdvoxelsegmentation-1.1.5-py3-none-any.whl (26 kB)
Collecting mdvwhole
Downloading mdvwhole-0.0.3-py3-none-any.whl (14 kB)
Downloading mdvwhole-0.0.2-py3-none-any.whl (14 kB)
Downloading mdvwhole-0.0.1-py3-none-any.whl (14 kB)
ERROR: Cannot install mdvwhole==0.0.1, mdvwhole==0.0.2, mdvwhole==0.0.3 and mdvwhole==0.0.4 because these package versions have conflicting dependencies.

The conflict is caused by:
mdvwhole 0.0.4 depends on open3d
mdvwhole 0.0.3 depends on open3d
mdvwhole 0.0.2 depends on open3d
mdvwhole 0.0.1 depends on open3d

To fix this you could try to:

  1. loosen the range of package versions you’ve specified
  2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit Dependency Resolution - pip documentation v22.1.dev0

For now I did not test the dependencies on Windows yet. Also I can still trim them down quite a bit. To my understanding linux is a more common development platform, but I will look into the possibility to run on Windows too. Thanks for pointing this out.