Simulation speed on different GPU cards with the same simulation settings

GROMACS version: 2020.4
GROMACS modification: NO
CUDA: 11

Dear users,
I have two types of GPU in two different machines: the first is a GeForce RTX 2080 Ti and the second is a Tesla V100-SXM2-32GB. Apart from this difference, everything else (the compilers used, the GROMACS version, and the CUDA version) is the same.

For comparison, I am running the same simulation system on both machines. However, the simulation is much slower (up to 6-8 times) on the V100-SXM2-32GB card. This is surprising, because this card should perform better than the GeForce RTX 2080 Ti. Also, the CPUs in the machine with the V100-SXM2-32GB are better than those in the old machine.
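
To put a number on the slowdown, the two runs can be compared from their log files. Below is a minimal sketch in Python, assuming each run's md.log contains the usual "Performance:" line (ns/day first, then hour/ns). The function names are only illustrative and are not part of GROMACS.

```python
import re


def ns_per_day(log_text):
    """Return the ns/day value from the 'Performance:' line of an md.log.

    Assumes the line has the form 'Performance:  <ns/day>  <hour/ns>'.
    Returns None if no such line is found.
    """
    match = re.search(
        r"^Performance:\s+([0-9.]+)\s+([0-9.]+)", log_text, re.MULTILINE
    )
    return float(match.group(1)) if match else None


def slowdown(reference_log_text, other_log_text):
    """Return how many times slower the second run is than the first.

    Returns None if either log has no usable 'Performance:' line.
    """
    reference = ns_per_day(reference_log_text)
    other = ns_per_day(other_log_text)
    if reference is None or not other:
        return None
    return reference / other
```

Reading the md.log from each machine into a string and passing the RTX 2080 Ti log first and the V100 log second gives the slowdown factor directly.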

I wonder why this is happening. Looking forward to your reply.

Thank you