Is there an easy way to find all clashing atoms in a .gro file from energy minimization?

GROMACS version: 2023.4
GROMACS modification: No
Is there an easy way to find all clashing atoms in a .gro file?

I am running an energy minimization of a large ion channel and I am running into the issue that I am having atom clashes of water molecules with my protein. I am fairly confident I know why the water molecules are clashing, and I was wondering if there was a quick way to find all clashing atoms in one go?

I can find them one at a time by looking for any infinite force levels, but considering there could be a couple hundred that are clashing the prospect of running energy minimization that many times is daunting. Thus my question is again, is there a way to detect all atoms with infinite force in one energy minimization run?

Thanks so much!

Dear @zamydm

How did you build the system? Did you use gmx solvate? It seems weird that you have all that clashes, especially if you used tools like solvate or CHARMM-GUI web server. I would go back a step and take a look at what is generating such a weird starting structure.

As a side note, I can give you a couple of hints on how to fix this, but take my words with caution and double check what you do, as in my opinion fixing this things by hand can really hide problems that need proper fixing and trouble shooting. In my experience (embedding of proteins in lipid bilayers, CHARMM36m ff) most of atoms that are distant < 0.2 A will give rise to infinite forces, and will break your energy minimization (kinda of force field dependent, though). You can take a look at the structure with VMD and select only those atoms, do something like “water and within 0.2 of protein” and the selection will show you which water atoms are withing 0.2 A from the protein. In my cases, they couldn’t be removed as they were whole lipids and I would have had other problems to handle, but that’s another story. Now, if these are all water molecules that can be removed without much of a problem (the box is going to relax anyway in the first NVT/NPT runs), you have two options. If these are just a couple of molecules, get their names, remove them by hand, fix the topology and the gro file and you are good to go. Otherwise, if they are many, then do the same but with gmx select tool. Run something like

gmx select -f your_structure.gro -s your_tpr.tpr -n index_if_you_need_it.ndx -select 'resname TIP3 and same residue as not within 0.02 of group Protein' -on good_water.ndx

The command will require some fine tuning on your side depending on the name of the water model, the distance you want to select, the group name of the Protein etc, but the general idea is to select all the water molecules that DO NOT have atoms that are distant < 0.2 A from the protein. As such, as an output you will get an index file that contains the indexes of the water molecules that are okay, i.e., not clashing. Then just make a new index with gmx make_ndx and generate an entry that has everything that is not water plus your good water molecules, and use gmx trjconv with that index file to get as an output the final .gro file without all the water molecules. You will still have to fix topology by hand, since you will have to reduce the number of water molecules in your system accordingly.

I never found a better way to fix these problems, so maybe someone here has better tools than my tedious procedure. Also, again, if these are hundreds of atoms, then I strongly advice you to go back one step and try to understand why this is happening.

Hope this helps at least a bit!

Oh my goodness, thank you!

To answer your first question, yes I used gmx solvate. Let me elaborate for a bit on it. The ion channel I am working with was repaired and inserted into a lipid membrane using Charmm-Gui that was then solvated with pure water. This system was then extracted. However, rather than simply running the system, researchers that I am working with suggested to mirror the system so there would be two ion channel systems facing each other in opposite directions to better represent a cellular environment. I did this, and filled the gap between them using gmx solvate. However, I believe some of the water molecules added were placed on top of existing proteins, thus creating the clashing issues.

As for your concern regarding hundreds of atoms, I understand your worry, but my system is extremely large such that even if one thousand water molecules are clashing (much greater than even my worst case estimates), that would still be <0.5% the total volume of water molecules, making me confident that removing those molecules would not have a significant impact on the system behavior.

I will try your suggested method. Thanks again, you are awesome!

Glad I helped!

Yes, if the system is so large then going by hand is not feasible. If you extracted all the structures and resolvated the full system then pay attention that gmx solvate tends to put also a ton of water molecules in between the leaflets of the lipid bilayer. This is usually not a big problem, as water leaves the intra-bilayer region quite fast, but I saw protein simulations from colleagues break down for some weird water molecule trapped between the lipids and the channel, so keep an eye on those as well!

Have you considered packmol?