Fatal error: An input file contains a line longer than 4095 characters

GROMACS version: VERSION 4.6.5
GROMACS modification: No

I’m trying to run an RNA potential published in [1]:

## Create topology files (make sure your pdb file has Amber format)
pdb2gmx -f 1duq.pdb -o 1duq.gro -p 1duq.top
# Choose AMBER03 (Option 1 on my machine)
# Choose None for water model (Option 6 on my machine)

## Make index file
make_ndx -f 1duq.gro
# Keep only the system group
# Create groups for each specific KB atom type
# ex:
#    r A & a P
#    name 1 aP
# the list of groups and commands is provided in groups.txt
# Be careful if you want to include the residues at the end of each chain, as they are labelled differently (this is not done in the supplied group file).

## Prepare the energy computation
grompp -c 1duq.gro -p 1duq.top -f score.mdp -n index.ndx  -o run.tpr -maxwarn 15
# You can ignore the 2 warnings on sigma/epsilon and oscillation period

## Add the directory with the table files to your GMXLIB env variable (with export or setenv depending on your shell)

## Compute the energy
mdrun -table table.xvg -tablep tablep.xvg -s run.tpr -e run.edr
# The KB energy is labelled LJ (SR) in the log file 

The authors provide a score.mdp file for the coarse-grained RNA model with 5 atoms per residue (“5pt”), which defines the groups as follows:

;Selection of energy groups
energygrps               = aP aC4s aC2 aC4 aC6 uP uC4s uC2 uC4 uC6 gP gC4s gC2 gC4 gC6 cP cC4s cC2 cC4 cC6 other

; Separate tables between energy group pairs
energygrp_table          = aP aP aP aC4s aP aC2 aP aC4 aP aC6 aP uP aP uC4s aP uC2 aP uC4 aP uC6 aP gP aP gC4s aP gC2 aP gC4 aP gC6 aP cP aP cC4s aP cC2 aP cC4 aP cC6 aC4s aC4s aC4s aC2 aC4s aC4 aC4s aC6 aC4s uP aC4s uC4s aC4s uC2 aC4s uC4 aC4s uC6 aC4s gP aC4s gC4s aC4s gC2 aC4s gC4 aC4s gC6 aC4s cP aC4s cC4s aC4s cC2 aC4s cC4 aC4s cC6 aC2 aC2 aC2 aC4 aC2 aC6 aC2 uP aC2 uC4s aC2 uC2 aC2 uC4 aC2 uC6 aC2 gP aC2 gC4s aC2 gC2 aC2 gC4 aC2 gC6 aC2 cP aC2 cC4s aC2 cC2 aC2 cC4 aC2 cC6 aC4 aC4 aC4 aC6 aC4 uP aC4 uC4s aC4 uC2 aC4 uC4 aC4 uC6 aC4 gP aC4 gC4s aC4 gC2 aC4 gC4 aC4 gC6 aC4 cP aC4 cC4s aC4 cC2 aC4 cC4 aC4 cC6 aC6 aC6 aC6 uP aC6 uC4s aC6 uC2 aC6 uC4 aC6 uC6 aC6 gP aC6 gC4s aC6 gC2 aC6 gC4 aC6 gC6 aC6 cP aC6 cC4s aC6 cC2 aC6 cC4 aC6 cC6 uP uP uP uC4s uP uC2 uP uC4 uP uC6 uP gP uP gC4s uP gC2 uP gC4 uP gC6 uP cP uP cC4s uP cC2 uP cC4 uP cC6 uC4s uC4s uC4s uC2 uC4s uC4 uC4s uC6 uC4s gP uC4s gC4s uC4s gC2 uC4s gC4 uC4s gC6 uC4s cP uC4s cC4s uC4s cC2 uC4s cC4 uC4s cC6 uC2 uC2 uC2 uC4 uC2 uC6 uC2 gP uC2 gC4s uC2 gC2 uC2 gC4 uC2 gC6 uC2 cP uC2 cC4s uC2 cC2 uC2 cC4 uC2 cC6 uC4 uC4 uC4 uC6 uC4 gP uC4 gC4s uC4 gC2 uC4 gC4 uC4 gC6 uC4 cP uC4 cC4s uC4 cC2 uC4 cC4 uC4 cC6 uC6 uC6 uC6 gP uC6 gC4s uC6 gC2 uC6 gC4 uC6 gC6 uC6 cP uC6 cC4s uC6 cC2 uC6 cC4 uC6 cC6 gP gP gP gC4s gP gC2 gP gC4 gP gC6 gP cP gP cC4s gP cC2 gP cC4 gP cC6 gC4s gC4s gC4s gC2 gC4s gC4 gC4s gC6 gC4s cP gC4s cC4s gC4s cC2 gC4s cC4 gC4s cC6 gC2 gC2 gC2 gC4 gC2 gC6 gC2 cP gC2 cC4s gC2 cC2 gC2 cC4 gC2 cC6 gC4 gC4 gC4 gC6 gC4 cP gC4 cC4s gC4 cC2 gC4 cC4 gC4 cC6 gC6 gC6 gC6 cP gC6 cC4s gC6 cC2 gC6 cC4 gC6 cC6 cP cP cP cC4s cP cC2 cP cC4 cP cC6 cC4s cC4s cC4s cC2 cC4s cC4 cC4s cC6 cC2 cC2 cC2 cC4 cC2 cC6 cC4 cC4 cC4 cC6 cC6 cC6 
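For readers who want to check or regenerate that pair list rather than type it by hand, here is a minimal Python sketch. It assumes the 20 named groups are simply the four bases (a, u, g, c) combined with the five atom labels, in the order shown in energygrps, and that "other" has no dedicated table. It is an illustration, not the authors' script.

```python
from itertools import combinations_with_replacement

# Assumption: group names are base letter + atom label, in the same
# order as the energygrps line above ("other" is deliberately left out).
bases = ["a", "u", "g", "c"]
atoms = ["P", "C4s", "C2", "C4", "C6"]
groups = [b + a for b in bases for a in atoms]

# Every unordered pair of groups, self-pairs included, in definition order.
pairs = list(combinations_with_replacement(groups, 2))

# Rebuild the single-line mdp entry from the pairs.
line = "energygrp_table          = " + " ".join(f"{g1} {g2}" for g1, g2 in pairs)

print(len(groups), "groups give", len(pairs), "pairs")
print("line length:", len(line), "characters")
```

For 20 groups this gives 210 pairs, and the resulting line stays well below the 4095-character limit, which is why the 5pt model runs without trouble.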

The problem appears when I use the full-atom potential instead.

There are about 3.6K table files in the potential that the authors provide:

RNA_aa_full$ ls -l | wc -l
    3662 # files like table_uP_uC4s.xvg
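As a rough way to see how long the energygrp_table entry would get, the pairs can be read back from the table file names. The sketch below is only an illustration: the helper names are made up, it assumes file names of the form table_X_Y.xvg with no underscores inside group names, and the group order it reports is just first appearance in the sorted file list, which is not necessarily the order needed in energygrps.

```python
import re

# Matches per-pair tables such as table_uP_uC4s.xvg, but not table.xvg or tablep.xvg.
TABLE_RE = re.compile(r"^table_([^_]+)_([^_]+)\.xvg$")

def pairs_from_table_files(filenames):
    """Return (groups, pairs) parsed from names like table_uP_uC4s.xvg."""
    groups, pairs = [], []
    for name in sorted(filenames):
        match = TABLE_RE.match(name)
        if match is None:  # skip the default tables and unrelated files
            continue
        pair = match.groups()
        pairs.append(pair)
        for group in pair:
            if group not in groups:
                groups.append(group)
    return groups, pairs

def table_line_length(pairs, key="energygrp_table"):
    """Length of the one-line mdp entry that lists all pairs."""
    return len(f"{key} = " + " ".join(f"{a} {b}" for a, b in pairs))

# Small made-up listing; a real run could pass os.listdir("RNA_aa_full") instead.
example = ["table.xvg", "tablep.xvg", "table_uP_uP.xvg", "table_uP_uC4s.xvg"]
groups, pairs = pairs_from_table_files(example)
print(groups, pairs, table_line_length(pairs))
```

With several thousand pairs, each needing two group names plus spaces, the entry is certain to exceed 4095 characters.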

With that many tables, the energygrp_table line gets so long that GROMACS complains:

Fatal error: An input file contains a line longer than 4095 characters, while the buffer passed to fgets2 has size 4095. The line starts with: '20s'.
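Since the message does not name the offending option, a quick scan of the .mdp file can show which line is too long. This is a minimal sketch with a hypothetical helper, `find_long_lines`; it flags lines at or above the reported limit, which is a deliberately conservative assumption about how the buffer is checked.

```python
import os
import tempfile

def find_long_lines(path, limit=4095):
    """Return (line_number, key, length) for lines with at least `limit` characters."""
    hits = []
    with open(path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            text = line.rstrip("\n")
            if len(text) >= limit:
                # The text before '=' is the option name in an mdp file.
                key = text.split("=", 1)[0].strip()
                hits.append((number, key, len(text)))
    return hits

# Demonstration on a throwaway file with one short and one oversized line.
with tempfile.NamedTemporaryFile("w", suffix=".mdp", delete=False) as tmp:
    tmp.write("energygrps = aP aC4s\n")
    tmp.write("energygrp_table = " + "aP aP " * 1000 + "\n")
    demo_path = tmp.name

hits = find_long_lines(demo_path)
os.remove(demo_path)
print(hits)
```

Running the same function on score.mdp would list each option whose line needs a larger buffer.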

Is there any way to split the energygrp_table option across multiple lines? Or is there another workaround?

Marcin

energygrp-table:
When user tables are used for electrostatics and/or VdW, here one can give pairs of energy groups for which separate user tables should be used. The two energy groups will be appended to the table file name, in order of their definition in energygrps, separated by underscores. For example, if energygrps = Na Cl Sol and energygrp-table = Na Na Na Cl, mdrun will read table_Na_Na.xvg and table_Na_Cl.xvg in addition to the normal table.xvg which will be used for all other energy group pairs.
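The naming rule in that manual excerpt can be sketched in a few lines of Python. The function name `expected_table_files` is hypothetical, and the code only mirrors the description above (pair members ordered by their position in energygrps, joined with underscores). It does not reproduce how mdrun itself is implemented.

```python
def expected_table_files(energygrps, energygrp_table, base="table"):
    """List the table files implied by an energygrp-table entry."""
    # Position of each group in energygrps decides the order inside the file name.
    order = {name: index for index, name in enumerate(energygrps.split())}
    names = energygrp_table.split()
    if len(names) % 2 != 0:
        raise ValueError("energygrp-table must contain an even number of group names")
    files = []
    for g1, g2 in zip(names[0::2], names[1::2]):
        # Raises KeyError if a group is not defined in energygrps.
        first, second = sorted((g1, g2), key=order.__getitem__)
        files.append(f"{base}_{first}_{second}.xvg")
    return files

# The example from the manual text above.
print(expected_table_files("Na Cl Sol", "Na Na Na Cl"))
```

For the manual's example this prints the two file names mentioned there, table_Na_Na.xvg and table_Na_Cl.xvg.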

[1] J. Bernauer, X. Huang, A. Y. L. Sim, and M. Levitt, “Fully differentiable coarse-grained and all-atom knowledge-based potentials for RNA structure evaluation,” RNA, vol. 17, no. 6, pp. 1066–1075, Jun. 2011.

You are using a very old GROMACS version. We always suggest using the newest version. Since versions 2019 and 2020 do not support user tables, that means version 2018 in your case. In that version you should be able to enlarge the buffer for the energy group table input by replacing the macro STRLEN with a sufficiently large number (of characters) in egptable[STRLEN] on line 88 of the file src/gromacs/gmxpreprocess/readir.cpp.

Thank you @hess! Yes, the authors actually suggest GROMACS 4, so I either have to use that version or skip this potential.

I tried using \n to break this long string into several lines, but that didn’t work.

Thank you for a great tip. I will try this and let you know (maybe others will have the same issue).

@hess ok, thank you very much.

It seems that I have to stick with the old version of GROMACS.

I changed STRLEN to 40000 in gromacs-4.6.5/include/types/simple.h [line 294] and it worked! This changes the constant for the whole code base, so I guess it is not recommended, but it worked in my case. Thanks again!