Test failure in FourCenter tests in GROMACS 2021

GROMACS version: 2021
GROMACS modification: No

I’m getting failures in the FourCenter tests when building 2021.

Hardware: Intel® Xeon® CPU E5-2690 v4
Compiler: GCC/9.3.0
Libraries: OpenBlas/0.3.9, FFTW/3.3.8

No cmake options set.

[ RUN ] FourCenter/ProperDihedralTest.CheckListed/0
/scratch/ake/gromacs/gromacs-2021.orig/src/testutils/refdata.cpp:934: Failure
In item: /forces/[1]/Z
Actual: 2.9845130443572998
Reference: -14.707831382751465
Difference: 17.6923 (2175423882 single-prec. ULPs, rel. 1.2), signs differ
Tolerance: abs. 1e-12, rel. 1e-12, 1000 ULPs
/scratch/ake/gromacs/gromacs-2021.orig/src/testutils/refdata.cpp:934: Failure
In item: /forces/[2]/Z
Actual: -5.9854903221130371
Reference: 29.496799468994141
Difference: 35.4823 (2192278165 single-prec. ULPs, rel. 1.2), signs differ
Tolerance: abs. 1e-12, rel. 1e-12, 1000 ULPs
/scratch/ake/gromacs/gromacs-2021.orig/src/testutils/refdata.cpp:934: Failure
In item: /forces/[3]/Z
Actual: 3.0009772777557373
Reference: -14.788968086242676
Difference: 17.7899 (2175578016 single-prec. ULPs, rel. 1.2), signs differ
Tolerance: abs. 1e-12, rel. 1e-12, 1000 ULPs
[ FAILED ] FourCenter/ProperDihedralTest.CheckListed/0, where GetParam() = ({ 12-byte object <7F-92 EA-BF 00-00 70-41 02-00 00-00> }, { 12-byte object <00-00 00-00 00-00 00-00 00-00 00-00>, 12-byte object <00-00 00-00 00-00 00-00 CD-CC 4C-3E>, 12-byte object <0A-D7 A3-3B 00-00 00-00 CD-CC CC-3D>, 12-byte object <6F-12 83-BA CD-CC CC-3D 00-00 00-00> }) (0 ms)

There are more failures like these.

Are they expected?

With a -DCMAKE_BUILD_TYPE=Debug build there are no problems.

Building on our Skylake nodes (Intel® Xeon® Gold 6132 CPU) with otherwise the same compilers and libraries, the tests run without errors.

This does not reproduce with gcc 9.1 or 10.2, so I assume it is quite gcc version-specific. Can you check another gcc version? Could you also try the -O2 and -O1 optimization levels, just so we know at which level it appears?

The numbers seem quite off, so if this does reproduce, I suggest filing an issue on GitLab.

Cheers,
Szilárd
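
To put a number on how far off the reported forces are, the three actual/reference pairs from the log can be compared directly. This is only a rough sketch, assuming a POSIX shell with awk; `compare_force` is a hypothetical helper name, and it reproduces just the absolute difference and the sign check, not the ULP or relative figures computed by the GROMACS test utilities.

```shell
#!/bin/sh
# Print |actual - reference| with 4 decimals (the precision used in the log)
# and note when the two values have opposite signs.
compare_force() {
  LC_ALL=C awk -v a="$1" -v r="$2" 'BEGIN {
    d = a - r
    if (d < 0) d = -d
    note = (a * r < 0) ? ", signs differ" : ""
    printf "%.4f%s\n", d, note
  }'
}

# Actual and reference values for /forces/[1..3]/Z, copied from the log.
compare_force 2.9845130443572998 -14.707831382751465
compare_force -5.9854903221130371 29.496799468994141
compare_force 3.0009772777557373 -14.788968086242676
```

The printed differences (17.6923, 35.4823 and 17.7899, each with differing signs) match the Difference lines in the log and are far beyond the stated tolerances, so this is not rounding noise.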

I don’t have problems using GCC/10.2.0 either.

Some more data:
env CFLAGS="-O2 -march=native -fno-math-errno" CXXFLAGS="-O2 -march=native -fno-math-errno" cmake -DCMAKE_INSTALL_PREFIX=/scratch/ake/gromacs/inst -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_BUILD_TYPE=None …/gromacs-2021
No errors

env CFLAGS="-O3 -march=native -fno-math-errno" CXXFLAGS="-O3 -march=native -fno-math-errno" cmake -DCMAKE_INSTALL_PREFIX=/scratch/ake/gromacs/inst -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_BUILD_TYPE=None …/gromacs-2021
Errors.

env CFLAGS="-O3 " CXXFLAGS="-O3 " cmake -DCMAKE_INSTALL_PREFIX=/scratch/ake/gromacs/inst -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_BUILD_TYPE=None …/gromacs-2021
No errors

env CFLAGS="-O3 -march=native" CXXFLAGS="-O3 -march=native" cmake -DCMAKE_INSTALL_PREFIX=/scratch/ake/gromacs/inst -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_BUILD_TYPE=None …/gromacs-2021
No errors

env CFLAGS="-O3 -fno-math-errno" CXXFLAGS="-O3 -fno-math-errno" cmake -DCMAKE_INSTALL_PREFIX=/scratch/ake/gromacs/inst -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_BUILD_TYPE=None …/gromacs-2021
Errors

So it's the combination of -O3 and -fno-math-errno that causes problems with GCC/9.3.0; adding or removing -march=native makes no difference.

I do not see why disabling the setting of errno would cause such failures by itself, so I suspect it is only indirectly causing the error.
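
The five flag sets tried above can be generated in one loop, which makes it easier to repeat the comparison with another compiler. This is a minimal sketch, assuming a POSIX shell; `SRC` and `PREFIX` are placeholders (the real source path is elided in the commands above), and the script is a dry run that only prints the configure commands instead of executing them.

```shell
#!/bin/sh
# Dry run: print one configure command per flag set tested in this thread.
# SRC and PREFIX are placeholders, not the paths used in the original post.
SRC="${SRC:-/path/to/gromacs-2021}"
PREFIX="${PREFIX:-/path/to/inst}"

print_configure_commands() {
  # One flag set per line; the same flags go into CFLAGS and CXXFLAGS.
  while IFS= read -r flags; do
    printf 'env CFLAGS="%s" CXXFLAGS="%s" cmake -DCMAKE_INSTALL_PREFIX=%s -DCMAKE_BUILD_TYPE=None %s\n' \
      "$flags" "$flags" "$PREFIX" "$SRC"
  done <<'EOF'
-O2 -march=native -fno-math-errno
-O3 -march=native -fno-math-errno
-O3
-O3 -march=native
-O3 -fno-math-errno
EOF
}

print_configure_commands
```

Each printed line would be run in a fresh build directory, followed by the build and the test suite; -DCMAKE_BUILD_TYPE=None is kept so that CMake does not add its own optimization flags on top.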

Can you please check whether this reproduces with other SIMD targets as well (e.g. AVX or SSE4)?

I don’t have any such old systems available, but I guess there is a CMake flag in GROMACS to target that?

The cmake option is GMX_SIMD; see the installation guide in the GROMACS 2021 documentation.

GMX_SIMD:
SSE4.1 works
AVX2_128 fails
AVX_256 works
AVX2_256 fails

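So I guess it’s FMA related: both failing targets are AVX2 builds, which use FMA instructions, while SSE4.1 and AVX_256 do not.

I’ll try the same and more on our Skylakes later.

Can you please file a GitLab issue and record your findings there?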
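
The SIMD comparison can be scripted the same way. This is a sketch under assumptions: the post does not say which compiler flags were used for these four builds, so `FLAGS` is set here to the failing -O3 -fno-math-errno pair, and `SRC` is a placeholder path. Only GMX_SIMD and its four values come from the thread. As before, it is a dry run that prints the commands.

```shell
#!/bin/sh
# Dry run: print a configure command for each GMX_SIMD target tried above.
SRC="${SRC:-/path/to/gromacs-2021}"   # placeholder source path
FLAGS="-O3 -fno-math-errno"           # assumed: the flag pair that triggered the failures

print_simd_commands() {
  for simd in SSE4.1 AVX_256 AVX2_128 AVX2_256; do
    printf 'env CFLAGS="%s" CXXFLAGS="%s" cmake -DGMX_SIMD=%s -DCMAKE_BUILD_TYPE=None %s\n' \
      "$FLAGS" "$FLAGS" "$simd" "$SRC"
  done
}

print_simd_commands
```

Setting GMX_SIMD explicitly overrides the SIMD level that GROMACS would otherwise detect for the build host, so all four targets can be configured on the same machine.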

I’m also seeing this error with GROMACS 2022.2 when building it with the EasyBuild gofbf-2020a toolchain.

[----------] 1 test from FourCenter
[ RUN      ] FourCenter.ListedForcesProperDihedralTest
/tmp/stuekero/avx2/GROMACS/2022.2/gofbf-2020a/gromacs-2022.2/src/testutils/refdata.cpp:949: Failure
   In item: /forces/[1]/Z
    Actual: 2.9845132827758789
 Reference: -14.707831382751465
Difference: 17.6923 (2175423883 single-prec. ULPs, rel. 1.2), signs differ
 Tolerance: abs. 1e-06, rel. 0.0001, 200 ULPs, sign must match
/tmp/stuekero/avx2/GROMACS/2022.2/gofbf-2020a/gromacs-2022.2/src/testutils/refdata.cpp:949: Failure
   In item: /forces/[2]/Z
    Actual: -5.9854907989501953
 Reference: 29.496799468994141
Difference: 35.4823 (2192278166 single-prec. ULPs, rel. 1.2), signs differ
 Tolerance: abs. 1e-06, rel. 0.0001, 200 ULPs, sign must match
/tmp/stuekero/avx2/GROMACS/2022.2/gofbf-2020a/gromacs-2022.2/src/testutils/refdata.cpp:949: Failure
   In item: /forces/[3]/Z
    Actual: 3.0009775161743164
 Reference: -14.788968086242676
Difference: 17.7899 (2175578017 single-prec. ULPs, rel. 1.2), signs differ
 Tolerance: abs. 1e-06, rel. 0.0001, 200 ULPs, sign must match
[  FAILED  ] FourCenter.ListedForcesProperDihedralTest (0 ms)
[----------] 1 test from FourCenter (0 ms total)
[...]
99% tests passed, 1 tests failed out of 84

Label Time Summary:
GTest              = 131.61 sec*proc (81 tests)
IntegrationTest    =  43.81 sec*proc (25 tests)
MpiTest            =  85.80 sec*proc (19 tests)
QuickGpuTest       =  15.77 sec*proc (17 tests)
SlowTest           =  84.00 sec*proc (13 tests)
UnitTest           =   3.80 sec*proc (43 tests)

The toolchain consists of:

  • gcc/9.3.0
  • openmpi/4.0.3
  • flexiblas/3.0.4
  • fftw-mpi/3.3.8
  • scalapack/2.1.0

Since our build node has Intel Skylake CPUs (though I’m building for AVX2 first), FlexiBLAS should be using the BLAS and LAPACK implementations from Intel MKL (imkl/2020.1.217).

When I don’t use -fno-math-errno (*), the tests pass just fine.

@ake_s Have you opened a GitLab issue for this?
@pszilard Shall I open a GitLab issue, or is gcc/9.3.0 now too old for you to worry about?

Oliver

*) To all the easybuilders (like Åke):
I’ve disabled -fno-math-errno by adding 'precise': True to the toolchainopts.