I am trying to run GROMACS 2023.2 with GPU support on an HPC cluster, but I am getting very slow performance. As a test I ran the calculation on 8 cores for 10 minutes. The projected throughput was about 15,000 steps in those 10 minutes, but unfortunately the run fell well short of that. The SLURM script and the last lines of the output file are shown below.
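To put the "15,000 steps in 10 minutes" figure in more familiar units, here is a small sketch that converts a step count over a wall-clock interval into ns/day. The 2 fs timestep is my assumption (a common choice for protein systems); the post does not state the actual dt, so adjust `dt_fs` to your .mdp value.

```python
def ns_per_day(steps, wall_seconds, dt_fs=2.0):
    """Convert steps completed in wall_seconds into ns/day.

    dt_fs is the MD timestep in femtoseconds (assumed 2 fs here,
    since the post does not state it).
    """
    steps_per_day = steps / wall_seconds * 86400
    return steps_per_day * dt_fs * 1e-6  # fs -> ns

# The projected 15,000 steps in 10 minutes:
print(ns_per_day(15000, 600))  # -> 4.32 ns/day at dt = 2 fs
```

Even the projected rate is modest for a single A100, which suggests the launch configuration rather than the hardware is the bottleneck.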
SLURM SCRIPT FILE
#!/bin/bash
#SBATCH -N 1
#SBATCH --ntasks-per-node=8
#SBATCH --gres=gpu:A100-SXM4:1
#SBATCH --partition=testp
#SBATCH --time=00:10:00
#SBATCH --error=error_test.%J.err
#SBATCH --output=output_test.%J.out
echo "Starting at $(date)"
echo "Running on hosts: $SLURM_NODELIST"
echo "Running on $SLURM_NNODES nodes."
echo "Running $SLURM_NTASKS tasks."
echo "Job id is $SLURM_JOBID"
echo "Job submission directory is : $SLURM_SUBMIT_DIR"
cd $SLURM_SUBMIT_DIR
source /opt/hpcx-v2.9.0-gcc-MLNX_OFED_LINUX-5.4-1.0.3.0-ubuntu20.04-x86_64/env.sh
source /nlsasfs/home/groupiiiv/sarthakt/software_cdac/gromacs-2023.2/install/bin/GMXRC
mpirun -mca pml ucx -x UCX_NET_DEVICES -np 8 /nlsasfs/home/groupiiiv/sarthakt/software_cdac/gromacs-2023.2/build/bin/gmx_mpi mdrun -ntomp 4 -deffnm md_0_10 -cpi md_0_10.cpt -noappend
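One thing worth checking: the script allocates 8 cores (`--ntasks-per-node=8`) but launches 8 MPI ranks with 4 OpenMP threads each, i.e. 32 threads on 8 cores, which oversubscribes the node; and with a single GPU, one MPI rank with more OpenMP threads is often faster. Below is a hedged sketch of an alternative launch, assuming the same build and file names; the offload flags and thread counts are illustrative starting points, not tuned values:

```shell
#!/bin/bash
#SBATCH -N 1
#SBATCH --ntasks-per-node=1        # one MPI rank for the single GPU
#SBATCH --cpus-per-task=8          # give all 8 allocated cores to OpenMP
#SBATCH --gres=gpu:A100-SXM4:1
#SBATCH --partition=testp
#SBATCH --time=00:10:00

# Offload nonbonded, PME, bonded, and the update/constraints to the GPU;
# ranks x threads (1 x 8) now matches the 8 allocated cores.
mpirun -np 1 gmx_mpi mdrun -ntomp 8 \
    -nb gpu -pme gpu -bonded gpu -update gpu \
    -deffnm md_0_10 -cpi md_0_10.cpt -noappend
```

If `-update gpu` is incompatible with your .mdp settings, mdrun will say so; dropping that flag still leaves the main force computation on the GPU.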
OUTPUT FILE
Started mdrun on rank 0 Thu Oct 26 11:14:28 2023
           Step           Time
              0        0.00000

   Energies (kJ/mol)
           Bond            U-B    Proper Dih.  Improper Dih.      CMAP Dih.
    5.17908e+03    1.42265e+04    1.66364e+04    8.73544e+02   -8.25547e+02
          LJ-14     Coulomb-14        LJ (SR)   Coulomb (SR)   Coul. recip.
    5.15878e+03    7.27696e+04    8.95562e+04   -1.15064e+06    3.48482e+03
      Potential    Kinetic En.   Total Energy  Conserved En.    Temperature
   -9.43579e+05    1.87862e+05   -7.55718e+05   -7.55674e+05    3.01866e+02
 Pressure (bar)   Constr. rmsd
    2.79474e+02    2.98607e-06
DD step 99 load imb.: force 5.3%
step 600: timed with pme grid 56 56 56, coulomb cutoff 1.200: 65983.2 M-cycles
step 800: timed with pme grid 48 48 48, coulomb cutoff 1.400: 58358.5 M-cycles
step 1000: timed with pme grid 44 44 44, coulomb cutoff 1.527: 60890.5 M-cycles
step 1200: timed with pme grid 40 40 40, coulomb cutoff 1.680: 60729.1 M-cycles
step 1400: timed with pme grid 36 36 36, coulomb cutoff 1.866: 58230.7 M-cycles
step 1400: the maximum allowed grid scaling limits the PME load balancing to a coulomb cut-off of 1.866
step 1600: timed with pme grid 36 36 36, coulomb cutoff 1.866: 60691.8 M-cycles
step 1800: timed with pme grid 40 40 40, coulomb cutoff 1.680: 56804.9 M-cycles
step 2000: timed with pme grid 42 42 42, coulomb cutoff 1.600: 61399.7 M-cycles
Received the TERM signal, stopping within 100 steps
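The "timed with pme grid" lines are mdrun's PME load balancing: it benchmarks several grid/cutoff combinations and settles on the cheapest one in M-cycles. This happens internally; the sketch below just reproduces the selection from the timings printed above, to show which combination the tuner would prefer:

```python
import re

# PME tuning lines copied from the mdrun log above.
log = """\
step 600: timed with pme grid 56 56 56, coulomb cutoff 1.200: 65983.2 M-cycles
step 800: timed with pme grid 48 48 48, coulomb cutoff 1.400: 58358.5 M-cycles
step 1000: timed with pme grid 44 44 44, coulomb cutoff 1.527: 60890.5 M-cycles
step 1200: timed with pme grid 40 40 40, coulomb cutoff 1.680: 60729.1 M-cycles
step 1400: timed with pme grid 36 36 36, coulomb cutoff 1.866: 58230.7 M-cycles
step 1600: timed with pme grid 36 36 36, coulomb cutoff 1.866: 60691.8 M-cycles
step 1800: timed with pme grid 40 40 40, coulomb cutoff 1.680: 56804.9 M-cycles
step 2000: timed with pme grid 42 42 42, coulomb cutoff 1.600: 61399.7 M-cycles
"""

pat = re.compile(r"pme grid (\d+) \d+ \d+, coulomb cutoff ([\d.]+): ([\d.]+) M-cycles")
# Sort key is M-cycles, so min() picks the cheapest combination.
timings = [(float(m[3]), float(m[2]), int(m[1])) for m in pat.finditer(log)]
cycles, cutoff, grid = min(timings)
print(f"cheapest: grid {grid}^3, cutoff {cutoff} nm, {cycles} M-cycles")
```

Note that all the timings are within about 15% of each other, so PME tuning is not the main performance problem here; the run was cut off by the 10-minute wall-time limit while still in the tuning phase.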
Thank you all in advance. Feel free to ask for more details, so that I can start my calculations happily. :D
Can anyone suggest solutions to enhance the performance?