GROMACS version: 2025.1
GROMACS modification: Yes/No
Hi, I am trying to enable GPU-aware MPI support. I configured the build with
cmake .. -DGMX_BUILD_OWN_FFTW=ON -DREGRESSIONTEST_DOWNLOAD=ON -DGMX_GPU=CUDA -DGMX_MPI=ON
and everything compiled successfully, but when I ran a simulation it showed:
:-) GROMACS - gmx mdrun, 2025.1 (-:
Copyright 1991-2025 The GROMACS Authors.
GROMACS is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License
as published by the Free Software Foundation; either version 2.1
of the License, or (at your option) any later version.
Current GROMACS contributors:
Mark Abraham Andrey Alekseenko Brian Andrews
Vladimir Basov Paul Bauer Hugh Bird
Eliane Briand Ania Brown Mahesh Doijade
Giacomo Fiorin Stefan Fleischmann Sergey Gorelov
Gilles Gouaillardet Alan Gray M. Eric Irrgang
Farzaneh Jalalypour Petter Johansson Carsten Kutzner
Grzegorz Łazarski Justin A. Lemkul Magnus Lundborg
Pascal Merz Vedran Miletić Dmitry Morozov
Lukas Müllender Julien Nabet Szilárd Páll
Andrea Pasquadibisceglie Michele Pellegrino Nicola Piasentin
Daniele Rapetti Muhammad Umair Sadiq Hubert Santuz
Roland Schulz Michael Shirts Tatiana Shugaeva
Alexey Shvetsov Philip Turner Alessandra Villa
Sebastian Wingbermühle
Previous GROMACS contributors:
Emile Apol Rossen Apostolov James Barnett
Herman J.C. Berendsen Cathrine Bergh Par Bjelkmar
Christian Blau Viacheslav Bolnykh Kevin Boyd
Aldert van Buuren Carlo Camilloni Rudi van Drunen
Anton Feenstra Oliver Fleetwood Vytas Gapsys
Gaurav Garg Gerrit Groenhof Bert de Groot
Anca Hamuraru Vincent Hindriksen Victor Holanda
Aleksei Iupinov Joe Jordan Christoph Junghans
Prashanth Kanduri Dimitrios Karkoulis Peter Kasson
Sebastian Kehl Sebastian Keller Jiri Kraus
Per Larsson Viveca Lindahl Erik Marklund
Pieter Meulenhoff Teemu Murtola Sander Pronk
Alfons Sijbers Balint Soproni David van der Spoel
Peter Tieleman Carsten Uphoff Jon Vincent
Teemu Virolainen Christian Wennberg Maarten Wolf
Artem Zhmurov
Coordinated by the GROMACS project leaders:
Berk Hess and Erik Lindahl
GROMACS: gmx mdrun, version 2025.1
Executable: /usr/local/gromacs/bin/gmx_mpi
Data prefix: /usr/local/gromacs
Working dir: /-------------------------------
Process ID: 467
Command line:
gmx_mpi mdrun -v -deffnm md
GROMACS version: 2025.1
Precision: mixed
Memory model: 64 bit
MPI library: MPI
MPI library version: Open MPI v4.1.6, package: Debian OpenMPI, ident: 4.1.6, repo rev: v4.1.6, Sep 30, 2023
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 128)
GPU support: CUDA
NBNxM GPU setup: super-cluster 2x2x2 / cluster 8 (cluster-pair splitting on)
SIMD instructions: AVX2_256
CPU FFT library: fftw-3.3.10-sse2-avx-avx2-avx2_128
GPU FFT library: cuFFT
Multi-GPU FFT: none
RDTSCP usage: enabled
TNG support: enabled
Hwloc support: disabled
Tracing support: disabled
C compiler: /usr/bin/cc GNU 13.3.0
C compiler flags: -fexcess-precision=fast -funroll-all-loops -mavx2 -mfma -Wno-missing-field-initializers -O3 -DNDEBUG
C++ compiler: /usr/bin/c++ GNU 13.3.0
C++ compiler flags: -fexcess-precision=fast -funroll-all-loops -mavx2 -mfma -Wno-missing-field-initializers -Wno-old-style-cast -Wno-cast-qual -Wno-suggest-override -Wno-suggest-destructor-override -Wno-zero-as-null-pointer-constant -Wno-cast-function-type-strict SHELL:-fopenmp -O3 -DNDEBUG
BLAS library: External - detected on the system
LAPACK library: External - detected on the system
CUDA compiler: /usr/local/cuda-12.6/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2024 NVIDIA Corporation;Built on Thu_Sep_12_02:18:05_PDT_2024;Cuda compilation tools, release 12.6, V12.6.77;Build cuda_12.6.r12.6/compiler.34841621_0
CUDA compiler flags:-DONNX_NAMESPACE=onnx_c2;-gencode;arch=compute_89,code=sm_89;-Xcudafe;--diag_suppress=cc_clobber_ignored,--diag_suppress=field_without_dll_interface,--diag_suppress=base_class_has_different_dll_interface,--diag_suppress=dll_interface_conflict_none_assumed,--diag_suppress=dll_interface_conflict_dllexport_assumed,--diag_suppress=bad_friend_decl;--expt-relaxed-constexpr;--expt-extended-lambda-fexcess-precision=fast -funroll-all-loops -mavx2 -mfma -Wno-missing-field-initializers -Wno-old-style-cast -Wno-cast-qual -Wno-suggest-override -Wno-suggest-destructor-override -Wno-zero-as-null-pointer-constant -Wno-cast-function-type-strict SHELL:-fopenmp -O3 -DNDEBUG
CUDA driver: 12.80
CUDA runtime: 12.60
Running on 1 node with total 8 cores, 16 processing units, 1 compatible GPU
Hardware detected on host Alpha (the node of MPI rank 0):
CPU info:
Vendor: Intel
Brand: 12th Gen Intel(R) Core(TM) i5-12500H
Family: 6 Model: 154 Stepping: 3
Features: aes apic avx avx2 clfsh cmov cx8 cx16 f16c fma htt intel lahf mmx msr nonstop_tsc pcid pclmuldq pdpe1gb popcnt pse rdrnd rdtscp sha sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
Hardware topology: Basic
Packages, cores, and logical processors:
[indices refer to OS logical processors]
Package 0: [ 0 1] [ 2 3] [ 4 5] [ 6 7] [ 8 9] [ 10 11] [ 12 13] [ 14 15]
CPU limit set by OS: -1 Recommended max number of threads: 16
GPU info:
Number of GPUs detected: 1
#0: NVIDIA NVIDIA GeForce RTX 4060 Laptop GPU, compute cap.: 8.9, ECC: no, stat: compatible
-------------------------------------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
M. J. Abraham, T. Murtola, R. Schulz, S. Páll, J. C. Smith, B. Hess, E.
Lindahl
GROMACS: High performance molecular simulations through multi-level
parallelism from laptops to supercomputers
SoftwareX (2015)
DOI: 10.1016/j.softx.2015.06.001
-------- -------- --- Thank You --- -------- --------
Input Parameters:
----------
Changing nstlist from 20 to 100, rlist from 1.22 to 1.365
Update groups can not be used for this system because atoms that are (in)directly constrained together are interdispersed with other atoms
GPU-aware MPI was not detected, will not use direct GPU communication. Check the GROMACS install guide for recommendations for GPU-aware support. If you are certain about GPU-aware support in your MPI library, you can force its use by setting the GMX_FORCE_GPU_AWARE_MPI environment variable.
Local state does not use filler particles
1 GPU selected for this run.
Mapping of GPU IDs to the 2 GPU tasks in the 1 rank on this node:
PP:0,PME:0
PP tasks will do (non-perturbed) short-ranged interactions on the GPU
PP task will update and constrain coordinates on the GPU
PME tasks will do all aspects on the GPU
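From the note above about GPU-aware MPI not being detected, my understanding is that the Debian Open MPI 4.1.6 package may simply have been built without CUDA support. If I understand correctly, one way to check this (assuming ompi_info comes from the same Open MPI installation that GROMACS was linked against) is:
ompi_info --parsable --all | grep mpi_built_with_cuda_support:value
If this prints a line ending in value:false, the MPI library itself was built without CUDA awareness, so GROMACS cannot use direct GPU communication regardless of how it was compiled.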
Should I recompile GROMACS, or is there another way to get GPU-aware MPI working?
Any suggestions would help.
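For context, this is what I am considering trying next (the paths and versions are just examples from my setup, not a verified recipe): rebuild Open MPI with CUDA support enabled, e.g.
./configure --with-cuda=/usr/local/cuda-12.6
make -j && sudo make install
and then, once the MPI library really is GPU-aware, force direct GPU communication as the log message suggests:
export GMX_FORCE_GPU_AWARE_MPI=1
mpirun -np 1 gmx_mpi mdrun -v -deffnm md
Is this the right direction, or does GROMACS itself also need to be reconfigured or rebuilt against the CUDA-aware MPI?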