How to enable GMX_ENABLE_DIRECT_GPU_COMM

GROMACS version: 2025.1
GROMACS modification: Yes/No
Hi, I am trying to build GROMACS with GPU-aware MPI support. I configured with

cmake .. -DGMX_BUILD_OWN_FFTW=ON -DREGRESSIONTEST_DOWNLOAD=ON -DGMX_GPU=CUDA -DGMX_MPI=ON

and everything compiled successfully, but when I ran a simulation it showed

                     :-) GROMACS - gmx mdrun, 2025.1 (-:

Copyright 1991-2025 The GROMACS Authors.
GROMACS is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License
as published by the Free Software Foundation; either version 2.1
of the License, or (at your option) any later version.

                        Current GROMACS contributors:
      Mark Abraham           Andrey Alekseenko           Brian Andrews       
     Vladimir Basov              Paul Bauer                Hugh Bird         
     Eliane Briand               Ania Brown              Mahesh Doijade      
     Giacomo Fiorin          Stefan Fleischmann          Sergey Gorelov      
  Gilles Gouaillardet            Alan Gray              M. Eric Irrgang      
  Farzaneh Jalalypour         Petter Johansson          Carsten Kutzner      
   Grzegorz Łazarski         Justin A. Lemkul          Magnus Lundborg      
      Pascal Merz             Vedran Miletić            Dmitry Morozov      
     Lukas Müllender            Julien Nabet             Szilárd Páll      
Andrea Pasquadibisceglie     Michele Pellegrino         Nicola Piasentin     
    Daniele Rapetti         Muhammad Umair Sadiq         Hubert Santuz       
     Roland Schulz             Michael Shirts           Tatiana Shugaeva     
    Alexey Shvetsov            Philip Turner            Alessandra Villa     
Sebastian Wingbermühle  

                        Previous GROMACS contributors:
       Emile Apol             Rossen Apostolov           James Barnett       
 Herman J.C. Berendsen         Cathrine Bergh             Par Bjelkmar       
     Christian Blau          Viacheslav Bolnykh            Kevin Boyd        
   Aldert van Buuren          Carlo Camilloni           Rudi van Drunen      
     Anton Feenstra           Oliver Fleetwood            Vytas Gapsys       
      Gaurav Garg             Gerrit Groenhof            Bert de Groot       
     Anca Hamuraru           Vincent Hindriksen          Victor Holanda      
    Aleksei Iupinov              Joe Jordan            Christoph Junghans    
   Prashanth Kanduri        Dimitrios Karkoulis           Peter Kasson       
     Sebastian Kehl           Sebastian Keller             Jiri Kraus        
      Per Larsson              Viveca Lindahl            Erik Marklund       
   Pieter Meulenhoff           Teemu Murtola              Sander Pronk       
     Alfons Sijbers            Balint Soproni         David van der Spoel    
     Peter Tieleman            Carsten Uphoff             Jon Vincent        
    Teemu Virolainen         Christian Wennberg           Maarten Wolf       
     Artem Zhmurov       

                 Coordinated by the GROMACS project leaders:
                          Berk Hess and Erik Lindahl

GROMACS:      gmx mdrun, version 2025.1
Executable:   /usr/local/gromacs/bin/gmx_mpi
Data prefix:  /usr/local/gromacs
Working dir:  /-------------------------------
Process ID:   467
Command line:
 gmx_mpi mdrun -v -deffnm md

GROMACS version:     2025.1
Precision:           mixed
Memory model:        64 bit
MPI library:         MPI
MPI library version: Open MPI v4.1.6, package: Debian OpenMPI, ident: 4.1.6, repo rev: v4.1.6, Sep 30, 2023
OpenMP support:      enabled (GMX_OPENMP_MAX_THREADS = 128)
GPU support:         CUDA
NBNxM GPU setup:     super-cluster 2x2x2 / cluster 8 (cluster-pair splitting on)
SIMD instructions:   AVX2_256
CPU FFT library:     fftw-3.3.10-sse2-avx-avx2-avx2_128
GPU FFT library:     cuFFT
Multi-GPU FFT:       none
RDTSCP usage:        enabled
TNG support:         enabled
Hwloc support:       disabled
Tracing support:     disabled
C compiler:          /usr/bin/cc GNU 13.3.0
C compiler flags:    -fexcess-precision=fast -funroll-all-loops -mavx2 -mfma -Wno-missing-field-initializers -O3 -DNDEBUG
C++ compiler:        /usr/bin/c++ GNU 13.3.0
C++ compiler flags:  -fexcess-precision=fast -funroll-all-loops -mavx2 -mfma -Wno-missing-field-initializers -Wno-old-style-cast -Wno-cast-qual -Wno-suggest-override -Wno-suggest-destructor-override -Wno-zero-as-null-pointer-constant -Wno-cast-function-type-strict SHELL:-fopenmp -O3 -DNDEBUG
BLAS library:        External - detected on the system
LAPACK library:      External - detected on the system
CUDA compiler:       /usr/local/cuda-12.6/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2024 NVIDIA Corporation;Built on Thu_Sep_12_02:18:05_PDT_2024;Cuda compilation tools, release 12.6, V12.6.77;Build cuda_12.6.r12.6/compiler.34841621_0
CUDA compiler flags:-DONNX_NAMESPACE=onnx_c2;-gencode;arch=compute_89,code=sm_89;-Xcudafe;--diag_suppress=cc_clobber_ignored,--diag_suppress=field_without_dll_interface,--diag_suppress=base_class_has_different_dll_interface,--diag_suppress=dll_interface_conflict_none_assumed,--diag_suppress=dll_interface_conflict_dllexport_assumed,--diag_suppress=bad_friend_decl;--expt-relaxed-constexpr;--expt-extended-lambda-fexcess-precision=fast -funroll-all-loops -mavx2 -mfma -Wno-missing-field-initializers -Wno-old-style-cast -Wno-cast-qual -Wno-suggest-override -Wno-suggest-destructor-override -Wno-zero-as-null-pointer-constant -Wno-cast-function-type-strict SHELL:-fopenmp -O3 -DNDEBUG
CUDA driver:         12.80
CUDA runtime:        12.60


Running on 1 node with total 8 cores, 16 processing units, 1 compatible GPU
Hardware detected on host Alpha (the node of MPI rank 0):
 CPU info:
   Vendor: Intel
   Brand:  12th Gen Intel(R) Core(TM) i5-12500H
   Family: 6   Model: 154   Stepping: 3
   Features: aes apic avx avx2 clfsh cmov cx8 cx16 f16c fma htt intel lahf mmx msr nonstop_tsc pcid pclmuldq pdpe1gb popcnt pse rdrnd rdtscp sha sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
 Hardware topology: Basic
   Packages, cores, and logical processors:
   [indices refer to OS logical processors]
     Package  0: [   0   1] [   2   3] [   4   5] [   6   7] [   8   9] [  10  11] [  12  13] [  14  15]
   CPU limit set by OS: -1   Recommended max number of threads: 16
 GPU info:
   Number of GPUs detected: 1
   #0: NVIDIA NVIDIA GeForce RTX 4060 Laptop GPU, compute cap.: 8.9, ECC:  no, stat: compatible

-------------------------------------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
M. J. Abraham, T. Murtola, R. Schulz, S. Páll, J. C. Smith, B. Hess, E.
Lindahl
GROMACS: High performance molecular simulations through multi-level
parallelism from laptops to supercomputers
SoftwareX (2015)
DOI: 10.1016/j.softx.2015.06.001
-------- -------- --- Thank You --- -------- --------

Input Parameters:
----------
Changing nstlist from 20 to 100, rlist from 1.22 to 1.365

Update groups can not be used for this system because atoms that are (in)directly constrained together are interdispersed with other atoms

GPU-aware MPI was not detected, will not use direct GPU communication. Check the GROMACS install guide for recommendations for GPU-aware support. If you are certain about GPU-aware support in your MPI library, you can force its use by setting the GMX_FORCE_GPU_AWARE_MPI environment variable.

Local state does not use filler particles

1 GPU selected for this run.
Mapping of GPU IDs to the 2 GPU tasks in the 1 rank on this node:
 PP:0,PME:0
PP tasks will do (non-perturbed) short-ranged interactions on the GPU
PP task will update and constrain coordinates on the GPU
PME tasks will do all aspects on the GPU

Should I recompile GROMACS, or is there another way to enable GPU-aware MPI?
Any suggestions would help.

Hi!

You are using the standard Debian OpenMPI package; it is not GPU-aware.
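
For reference, you can check whether an Open MPI installation was built with CUDA support using Open MPI's own query tool:

ompi_info --parsable --all | grep mpi_built_with_cuda_support:value
# should end in ":value:true" for a CUDA-aware build, ":value:false" otherwise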

You need to build Open MPI or MPICH yourself with CUDA support enabled. Tip: running sudo apt remove openmpi-common openmpi-bin beforehand helps avoid mixing up several MPI libraries.
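
As a rough sketch only (paths, versions, and install prefixes are illustrative, adjust them to your system), building Open MPI 4.1.x with CUDA support and then rebuilding GROMACS against it would look something like this:

# In the unpacked Open MPI 4.1.x source tree (illustrative CUDA path and prefix)
./configure --with-cuda=/usr/local/cuda-12.6 --prefix=$HOME/opt/openmpi-cuda
make -j$(nproc) && make install
export PATH=$HOME/opt/openmpi-cuda/bin:$PATH

# Then, from a clean GROMACS build directory, reconfigure so CMake picks up the new MPI wrappers
cmake .. -DGMX_BUILD_OWN_FFTW=ON -DREGRESSIONTEST_DOWNLOAD=ON -DGMX_GPU=CUDA -DGMX_MPI=ON
make -j$(nproc) && sudo make install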

However, in the example above you are running a single rank with a single GPU. GPU-awareness only matters when you have multiple ranks, and it rarely makes sense to use multiple ranks with only one GPU. So going for GPU-aware MPI in such a setup seems unusual.
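
If you do later run on a node with several GPUs and a CUDA-aware MPI, direct GPU communication is requested at run time via the GMX_ENABLE_DIRECT_GPU_COMM environment variable from the thread title; a hypothetical multi-rank invocation would look something like:

# Hypothetical example: 2 MPI ranks on a multi-GPU node with a CUDA-aware MPI
export GMX_ENABLE_DIRECT_GPU_COMM=1
mpirun -np 2 gmx_mpi mdrun -v -deffnm md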


Thanks for your guidance. Can you or your team members also help me with my proteasome query about geometry distortion when converting the protein from .gro to .pdb, to verify whether the structure is intact?

This forum is for help with technical issues with GROMACS, not for solving your scientific problems.