Sure.
Thank you very much.
gmx_mpi mdrun -v -deffnm 6pij_equilnpt -s npt.tpr
GROMACS version: 2018
Precision: single
Memory model: 64 bit
MPI library: MPI
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 256)
GPU support: CUDA
SIMD instructions: AVX_256
FFT library: fftw-3.3.3-sse2
RDTSCP usage: enabled
TNG support: enabled
Hwloc support: disabled
Tracing support: disabled
Built on: 2018-02-20 21:19:31
Built by: mmadrid@gpu048.pvt.bridges.psc.edu [CMAKE]
Build OS/arch: Linux 3.10.0-693.11.6.el7.x86_64 x86_64
Build CPU vendor: Intel
Build CPU brand: Intel® Xeon® CPU E5-2683 v4 @ 2.10GHz
Build CPU family: 6 Model: 79 Stepping: 1
Build CPU features: aes apic avx avx2 clfsh cmov cx8 cx16 f16c fma hle htt intel lahf mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp rtm sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
C compiler: /usr/lib64/ccache/cc GNU 4.8.5
C compiler flags: -mavx -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
C++ compiler: /usr/lib64/ccache/c++ GNU 4.8.5
C++ compiler flags: -mavx -std=c++11 -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
CUDA compiler: /opt/packages/cuda/9.0RC/bin/nvcc nvcc: NVIDIA ® Cuda compiler driver;Copyright © 2005-2017 NVIDIA Corporation;Built on Mon_Jun_26_16:13:28_CDT_2017;Cuda compilation tools, release 9.0, V9.0.102
CUDA compiler flags:-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_70,code=compute_70;-use_fast_math;;; ;-mavx;-std=c++11;-O3;-DNDEBUG;-funroll-all-loops;-fexcess-precision=fast;
CUDA driver: 10.20
CUDA runtime: 9.0
Running on 1 node with total 28 cores, 28 logical cores, 4 compatible GPUs
Hardware detected on host gpu013.pvt.bridges.psc.edu (the node of MPI rank 0):
CPU info:
Vendor: Intel
Brand: Intel® Xeon® CPU E5-2695 v3 @ 2.30GHz
Family: 6 Model: 63 Stepping: 2
Features: aes apic avx avx2 clfsh cmov cx8 cx16 f16c fma htt intel lahf mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
Hardware topology: Basic
Sockets, cores, and logical processors:
Socket 0: [ 0] [ 14] [ 1] [ 15] [ 2] [ 16] [ 3] [ 17] [ 4] [ 18] [ 5] [ 19] [ 6] [ 20]
Socket 1: [ 7] [ 21] [ 8] [ 22] [ 9] [ 23] [ 10] [ 24] [ 11] [ 25] [ 12] [ 26] [ 13] [ 27]
GPU info:
Number of GPUs detected: 4
#0: NVIDIA Tesla K80, compute cap.: 3.7, ECC: yes, stat: compatible
#1: NVIDIA Tesla K80, compute cap.: 3.7, ECC: yes, stat: compatible
#2: NVIDIA Tesla K80, compute cap.: 3.7, ECC: yes, stat: compatible
#3: NVIDIA Tesla K80, compute cap.: 3.7, ECC: yes, stat: compatible
Highest SIMD level requested by all nodes in run: AVX2_256
SIMD instructions selected at compile time: AVX_256
This program was compiled for different hardware than you are running on,
which could influence performance.
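(Note on the two lines above: the binary was built with AVX_256 kernels, while the compute node supports AVX2_256, so mdrun flags a possible performance loss. If rebuilding is an option, the sketch below shows the relevant CMake step; GMX_SIMD, GMX_GPU and GMX_MPI are standard GROMACS build options, but the paths and any other flags used for the original Bridges build are unknown to me and therefore assumptions.)
# Minimal reconfigure sketch, assuming a fresh build directory; only the
# three GMX_* options follow from what the log header implies.
cmake .. -DGMX_SIMD=AVX2_256 -DGMX_GPU=ON -DGMX_MPI=ON -DCMAKE_BUILD_TYPE=Release
make -j 8 && make install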
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
M. J. Abraham, T. Murtola, R. Schulz, S. Páll, J. C. Smith, B. Hess, E.
Lindahl
GROMACS: High performance molecular simulations through multi-level
parallelism from laptops to supercomputers
SoftwareX 1 (2015) pp. 19-25
-------- -------- — Thank You — -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
S. Páll, M. J. Abraham, C. Kutzner, B. Hess, E. Lindahl
Tackling Exascale Software Challenges in Molecular Dynamics Simulations with
GROMACS
In S. Markidis & E. Laure (Eds.), Solving Software Challenges for Exascale 8759 (2015) pp. 3-27
-------- -------- — Thank You — -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
S. Pronk, S. Páll, R. Schulz, P. Larsson, P. Bjelkmar, R. Apostolov, M. R.
Shirts, J. C. Smith, P. M. Kasson, D. van der Spoel, B. Hess, and E. Lindahl
GROMACS 4.5: a high-throughput and highly parallel open source molecular
simulation toolkit
Bioinformatics 29 (2013) pp. 845-54
-------- -------- — Thank You — -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
B. Hess and C. Kutzner and D. van der Spoel and E. Lindahl
GROMACS 4: Algorithms for highly efficient, load-balanced, and scalable
molecular simulation
J. Chem. Theory Comput. 4 (2008) pp. 435-447
-------- -------- — Thank You — -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
D. van der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark and H. J. C.
Berendsen
GROMACS: Fast, Flexible and Free
J. Comp. Chem. 26 (2005) pp. 1701-1719
-------- -------- — Thank You — -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
E. Lindahl and B. Hess and D. van der Spoel
GROMACS 3.0: A package for molecular simulation and trajectory analysis
J. Mol. Mod. 7 (2001) pp. 306-317
-------- -------- — Thank You — -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
H. J. C. Berendsen, D. van der Spoel and R. van Drunen
GROMACS: A message-passing parallel molecular dynamics implementation
Comp. Phys. Comm. 91 (1995) pp. 43-56
-------- -------- — Thank You — -------- --------
Input Parameters:
integrator = md
tinit = 0
dt = 0.002
nsteps = 500000
init-step = 0
simulation-part = 1
comm-mode = Linear
nstcomm = 100
bd-fric = 0
ld-seed = -1404411713
emtol = 10
emstep = 0.01
niter = 20
fcstep = 0
nstcgsteep = 1000
nbfgscorr = 10
rtpi = 0.05
nstxout = 1000
nstvout = 1000
nstfout = 1000
nstlog = 1000
nstcalcenergy = 100
nstenergy = 1000
nstxout-compressed = 1000
compressed-x-precision = 1000
cutoff-scheme = Verlet
nstlist = 10
ns-type = Grid
pbc = xyz
periodic-molecules = false
verlet-buffer-tolerance = 0.005
rlist = 1.2
coulombtype = PME
coulomb-modifier = Potential-shift
rcoulomb-switch = 0
rcoulomb = 1.2
epsilon-r = 1
epsilon-rf = inf
vdw-type = Cut-off
vdw-modifier = Potential-shift
rvdw-switch = 0.8
rvdw = 1.2
DispCorr = EnerPres
table-extension = 1
fourierspacing = 0.12
fourier-nx = 128
fourier-ny = 160
fourier-nz = 192
pme-order = 4
ewald-rtol = 1e-05
ewald-rtol-lj = 0.001
lj-pme-comb-rule = Geometric
ewald-geometry = 0
epsilon-surface = 0
implicit-solvent = No
gb-algorithm = Still
nstgbradii = 1
rgbradii = 1
gb-epsilon-solvent = 80
gb-saltconc = 0
gb-obc-alpha = 1
gb-obc-beta = 0.8
gb-obc-gamma = 4.85
gb-dielectric-offset = 0.009
sa-algorithm = Ace-approximation
sa-surface-tension = 2.05016
tcoupl = Nose-Hoover
nsttcouple = 10
nh-chain-length = 1
print-nose-hoover-chain-variables = false
pcoupl = Parrinello-Rahman
pcoupltype = Isotropic
nstpcouple = 10
tau-p = 2.5
compressibility (3x3):
compressibility[ 0]={ 4.50000e-05, 0.00000e+00, 0.00000e+00}
compressibility[ 1]={ 0.00000e+00, 4.50000e-05, 0.00000e+00}
compressibility[ 2]={ 0.00000e+00, 0.00000e+00, 4.50000e-05}
ref-p (3x3):
ref-p[ 0]={ 1.00000e+00, 0.00000e+00, 0.00000e+00}
ref-p[ 1]={ 0.00000e+00, 1.00000e+00, 0.00000e+00}
ref-p[ 2]={ 0.00000e+00, 0.00000e+00, 1.00000e+00}
refcoord-scaling = No
posres-com (3):
posres-com[0]= 0.00000e+00
posres-com[1]= 0.00000e+00
posres-com[2]= 0.00000e+00
posres-comB (3):
posres-comB[0]= 0.00000e+00
posres-comB[1]= 0.00000e+00
posres-comB[2]= 0.00000e+00
QMMM = false
QMconstraints = 0
QMMMscheme = 0
MMChargeScaleFactor = 1
qm-opts:
ngQM = 0
constraint-algorithm = Lincs
continuation = true
Shake-SOR = false
shake-tol = 0.0001
lincs-order = 4
lincs-iter = 1
lincs-warnangle = 30
nwall = 0
wall-type = 9-3
wall-r-linpot = -1
wall-atomtype[0] = -1
wall-atomtype[1] = -1
wall-density[0] = 0
wall-density[1] = 0
wall-ewald-zfac = 3
pull = false
awh = false
rotation = false
interactiveMD = false
disre = No
disre-weighting = Conservative
disre-mixed = false
dr-fc = 1000
dr-tau = 0
nstdisreout = 100
orire-fc = 0
orire-tau = 0
nstorireout = 100
free-energy = no
cos-acceleration = 0
deform (3x3):
deform[ 0]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
deform[ 1]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
deform[ 2]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
simulated-tempering = false
swapcoords = no
userint1 = 0
userint2 = 0
userint3 = 0
userint4 = 0
userreal1 = 0
userreal2 = 0
userreal3 = 0
userreal4 = 0
applied-forces:
electric-field:
x:
E0 = 0
omega = 0
t0 = 0
sigma = 0
y:
E0 = 0
omega = 0
t0 = 0
sigma = 0
z:
E0 = 0
omega = 0
t0 = 0
sigma = 0
grpopts:
nrdf: 147287 3010.99 5098.98 959295
ref-t: 300 300 300 300
tau-t: 1 1 1 1
annealing: No No No No
annealing-npoints: 0 0 0 0
acc: 0 0 0
nfreeze: N N N
energygrp-flags[ 0]: 0
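(For anyone trying to reproduce this run: the parameter dump above corresponds roughly to an npt.mdp along the lines of the excerpt below. The values are copied from the log, but I do not have the original file, so treat it as a reconstruction rather than the actual input; in particular, the temperature-coupling group names are not printed in the log.)
; Reconstructed from the mdrun parameter listing above (not the original npt.mdp)
integrator              = md
dt                      = 0.002          ; 2 fs
nsteps                  = 500000         ; 1 ns
cutoff-scheme           = Verlet
coulombtype             = PME
rcoulomb                = 1.2
rvdw                    = 1.2
DispCorr                = EnerPres
tcoupl                  = Nose-Hoover
; tc-grps               = <four coupling groups; names not shown in the log>
tau-t                   = 1 1 1 1
ref-t                   = 300 300 300 300
pcoupl                  = Parrinello-Rahman
pcoupltype              = isotropic
tau-p                   = 2.5
ref-p                   = 1.0
compressibility         = 4.5e-5
constraint-algorithm    = lincs
continuation            = yes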
Changing nstlist from 10 to 100, rlist from 1.2 to 1.337
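(The line above is mdrun's automatic Verlet-buffer tuning: with GPUs it raises nstlist and pads rlist so the pair list stays valid for more steps between searches. If the .mdp value should be kept, it can be forced at run time; -nstlist is a standard mdrun option, and 10 is simply the pre-tuning value from this log.)
# Optional: keep nstlist at the .mdp value instead of the auto-tuned 100
gmx_mpi mdrun -v -deffnm 6pij_equilnpt -s npt.tpr -nstlist 10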
Initializing Domain Decomposition on 4 ranks
Dynamic load balancing: locked
Initial maximum inter charge-group distances:
two-body bonded interactions: 0.446 nm, LJ-14, atoms 36965 36972
multi-body bonded interactions: 0.446 nm, Proper Dih., atoms 36965 36972
Minimum cell size due to bonded interactions: 0.490 nm
Maximum distance for 5 constraints, at 120 deg. angles, all-trans: 0.218 nm
Estimated maximum distance required for P-LINCS: 0.218 nm
Using 0 separate PME ranks
Scaling the initial minimum size with 1/0.8 (option -dds) = 1.25
Optimizing the DD grid for 4 cells with a minimum initial size of 0.613 nm
The maximum allowed number of cells is: X 24 Y 29 Z 33
Domain decomposition grid 1 x 4 x 1, separate PME ranks 0
PME domain decomposition: 1 x 4 x 1
Domain decomposition rank 0, coordinates 0 0 0
The initial number of communication pulses is: Y 1
The initial domain decomposition cell size is: Y 4.47 nm
The maximum allowed distance for charge groups involved in interactions is:
non-bonded interactions 1.337 nm
(the following are initial values, they could change due to box deformation)
two-body bonded interactions (-rdd) 1.337 nm
multi-body bonded interactions (-rdd) 1.337 nm
atoms separated by up to 5 constraints (-rcon) 4.465 nm
When dynamic load balancing gets turned on, these settings will change to:
The maximum number of communication pulses is: Y 1
The minimum size for domain decomposition cells is 1.337 nm
The requested allowed shrink of DD cells (option -dds) is: 0.80
The allowed shrink of domain decomposition cells is: Y 0.30
The maximum allowed distance for charge groups involved in interactions is:
non-bonded interactions 1.337 nm
two-body bonded interactions (-rdd) 1.337 nm
multi-body bonded interactions (-rdd) 1.337 nm
atoms separated by up to 5 constraints (-rcon) 1.337 nm
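(All of the decomposition limits above were chosen automatically for 4 ranks. The same grid and cut-off distances can be requested explicitly, which is useful when a larger rank count fails the DD setup; the sketch below reuses the real -dd, -rdd and -dds options with the values mdrun reports in this log.)
# Optional: pin the domain decomposition that mdrun chose automatically
gmx_mpi mdrun -v -deffnm 6pij_equilnpt -s npt.tpr -dd 1 4 1 -rdd 1.337 -dds 0.8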
Using 4 MPI processes
Using 7 OpenMP threads per MPI process
On host gpu013.pvt.bridges.psc.edu 4 GPUs auto-selected for this run.
Mapping of GPU IDs to the 4 GPU tasks in the 4 ranks on this node:
PP:0,PP:1,PP:2,PP:3
NOTE: GROMACS was configured without NVML support hence it can not exploit
application clocks of the detected Tesla K80 GPU to improve performance.
Recompile with the NVML library (compatible with the driver used) or set application clocks manually.
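(On the application-clocks note above: if rebuilding against NVML is not practical, the clocks can usually be raised by hand with nvidia-smi, provided the cluster admins allow it. Both commands below are standard nvidia-smi options; the 2505,875 pair is the commonly quoted maximum for a Tesla K80, but it should be checked against the supported-clocks query first.)
# List the memory,graphics clock pairs this K80 supports
nvidia-smi -q -d SUPPORTED_CLOCKS
# Set application clocks (memory MHz, graphics MHz); typically needs root
sudo nvidia-smi -ac 2505,875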
Non-default thread affinity set probably by the OpenMP library,
disabling internal thread affinity
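(So the run uses 4 PP ranks x 7 OpenMP threads with one K80 per rank, and mdrun's own core pinning is off because an external affinity mask was detected. The same layout can be requested explicitly; -ntomp, -nb, -gpu_id, -npme and -pin are standard mdrun options, while the mpirun launcher line is an assumption about the Bridges batch environment.)
# Explicit version of the resource mapping mdrun chose automatically,
# with internal thread pinning re-enabled (launcher syntax is assumed)
mpirun -np 4 gmx_mpi mdrun -v -deffnm 6pij_equilnpt -s npt.tpr \
       -ntomp 7 -nb gpu -gpu_id 0123 -npme 0 -pin on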
System total charge: 0.000
Will do PME sum in reciprocal space for electrostatic interactions.
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee and L. G. Pedersen
A smooth particle mesh Ewald method
J. Chem. Phys. 103 (1995) pp. 8577-8592
-------- -------- — Thank You — -------- --------
Using a Gaussian width (1/beta) of 0.384195 nm for Ewald
Potential shift: LJ r^-12: -1.122e-01 r^-6: -3.349e-01, Ewald -8.333e-06
Initialized non-bonded Ewald correction tables, spacing: 1.02e-03 size: 1176
Long Range LJ corr.: 3.3098e-04
Generated table with 1168 data points for Ewald.
Tabscale = 500 points/nm
Generated table with 1168 data points for LJ6.
Tabscale = 500 points/nm
Generated table with 1168 data points for LJ12.
Tabscale = 500 points/nm
Generated table with 1168 data points for 1-4 COUL.
Tabscale = 500 points/nm
Generated table with 1168 data points for 1-4 LJ6.
Tabscale = 500 points/nm
Generated table with 1168 data points for 1-4 LJ12.
Tabscale = 500 points/nm
Using GPU 8x8 nonbonded short-range kernels
Using a dual 8x4 pair-list setup updated with dynamic, rolling pruning:
outer list: updated every 100 steps, buffer 0.137 nm, rlist 1.337 nm
inner list: updated every 12 steps, buffer 0.002 nm, rlist 1.202 nm
At tolerance 0.005 kJ/mol/ps per atom, equivalent classical 1x1 list would be:
outer list: updated every 100 steps, buffer 0.290 nm, rlist 1.490 nm
inner list: updated every 12 steps, buffer 0.051 nm, rlist 1.251 nm
Using Lorentz-Berthelot Lennard-Jones combination rule
Initializing Parallel LINear Constraint Solver
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
B. Hess
P-LINCS: A Parallel Linear Constraint Solver for molecular simulation
J. Chem. Theory Comput. 4 (2008) pp. 116-122
-------- -------- — Thank You — -------- --------
The number of constraints is 30135
There are inter charge-group constraints,
will communicate selected coordinates each lincs iteration
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
S. Miyamoto and P. A. Kollman
SETTLE: An Analytical Version of the SHAKE and RATTLE Algorithms for Rigid
Water Models
J. Comp. Chem. 13 (1992) pp. 952-962
-------- -------- — Thank You — -------- --------
Linking all bonded interactions to atoms
Intra-simulation communication will occur every 10 steps.
Center of mass motion removal mode is Linear
We have the following groups for center of mass motion removal:
0: Protein
1: non-Protein
There are: 541456 Atoms
Atom distribution over 4 domains: av 135364 stddev 1229 min 134101 max 136761
NOTE: DLB will not turn on during the first phase of PME tuning
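(The NOTE above only means dynamic load balancing is deferred until mdrun has finished its PME load-balancing scan. For a short equilibration both behaviours can be forced; -tunepme/-notunepme and -dlb are standard mdrun options.)
# Optional: skip PME tuning and turn dynamic load balancing on immediately
gmx_mpi mdrun -v -deffnm 6pij_equilnpt -s npt.tpr -notunepme -dlb yes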
Started mdrun on rank 0 Tue Dec 8 10:12:41 2020
Step Time
0 0.00000