Hi all,
I’m running GROMACS (2019.6, GPU build) on one node of my HPC cluster (gpunode002: 24 cores, 256 GB RAM, one NVIDIA Tesla P40 GPU) with the following SLURM script:
#!/bin/bash
#SBATCH --job-name=WT
#SBATCH --partition=gpuq
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=20
#SBATCH --mem=200G
#SBATCH --gres=gpu:1
#SBATCH --nodelist=gpunode002
#SBATCH --output=gmx.%j.out
#SBATCH --error=gmx.%j.err
set -euo pipefail
module purge
module load cuda80/toolkit/8.0.61
module load cuda80/fft/8.0.61
module load gromacs/single/gpu/2019.6
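# Resolve the CUDA install root (CUDA_HOME, then CUDA_PATH, then the nvcc location),
# expose its libraries, and match the OpenMP thread count to the SLURM allocation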
export CUDA_ROOT=${CUDA_HOME:-${CUDA_PATH:-$(dirname "$(dirname "$(which nvcc)")")}}
export LD_LIBRARY_PATH="$CUDA_ROOT/lib64:${LD_LIBRARY_PATH:-}"
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
cd "$SLURM_SUBMIT_DIR"
# Production MD
gmx grompp -f md.mdp -c npt.gro -t npt.cpt -p topol.top -o md_0_1.tpr -maxwarn 10
gmx mdrun -deffnm md_0_1 -ntmpi 1 -ntomp 20 -nb gpu -pme gpu -pin on -dlb auto -gpu_id 0
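For completeness, this is roughly the pre-flight check I run on the node before submitting, just to confirm that the loaded gmx is actually the CUDA build and that the P40 is visible (same module names as in the script above; I’m not sure this catches every possible misconfiguration):
# Sanity check on gpunode002 with the same modules as the job script
module purge
module load cuda80/toolkit/8.0.61 cuda80/fft/8.0.61 gromacs/single/gpu/2019.6
gmx --version | grep -iE "gpu|cuda"   # expect "GPU support: CUDA" plus CUDA compiler/driver/runtime versions
nvidia-smi                            # expect the Tesla P40 listed and idle before the job starts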
But the throughput is only ~5 ns/day, which seems extremely low for a GPU run. Could you please review the script and the mdrun options and suggest how to improve performance? Thank you!
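For reference, the ~5 ns/day figure is what mdrun reports in the summary at the end of md_0_1.log; I pull it out roughly like this (the cycle and time accounting table just above it presumably shows where the time is going, but I’m not sure how to interpret it):
# Throughput and timing breakdown from the end of the mdrun log
grep "Performance:" md_0_1.log   # ns/day and hour/ns line
tail -n 60 md_0_1.log            # includes the "R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G" table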