TensorFlow™ Tesla® instances benchmark

Summary of image classification model test results on LeaderGPU® Tesla® servers

LeaderGPU® is a new player in the GPU computing market, and it intends to change the rules of the game. The market currently includes several large players such as Amazon AWS and Google Cloud. However, a large player does not always mean the best market offer. Unlike Amazon AWS and Google Cloud, which offer VPS instances whose hardware resources can be shared among several dozen users, the LeaderGPU® project provides physical servers.

Tests were conducted on LeaderGPU® Tesla® computing systems using synthetic data with the following network models: Inception V3, ResNet-50, ResNet-152, VGG16 and AlexNet. Results for other models are given at the end of this article. Synthetic-data testing used a tf.Variable set to the same shape as the input data each model expects for ImageNet.

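The following commands were used to run the tests. The second command is shown for AlexNet with a batch size of 32; other models (vgg11, vgg16, etc.) and batch sizes (64, 128, 256, 512) were substituted in the same way:

git clone https://github.com/tensorflow/benchmarks.git
python3.5 benchmarks/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py --num_gpus=2 --model alexnet --batch_size 32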
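To repeat the run over several models and batch sizes, the command can be generated by a small script. The following is a minimal sketch, assuming the repository was cloned into the current directory. The helper names build_command and sweep, and the MODELS and BATCH_SIZES lists, are illustrative and not part of the benchmark suite; only the flags shown in the command above are used.

```python
import itertools
import shlex

# Path of the benchmark script inside the cloned repository (from the command above).
SCRIPT = "benchmarks/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py"

# Illustrative selection; extend with any other model name the script accepts.
MODELS = ["alexnet", "vgg11", "vgg16"]
BATCH_SIZES = [32, 64, 128, 256, 512]


def build_command(model, batch_size, num_gpus=2, python="python3.5"):
    """Return the benchmark command as an argument list for one configuration."""
    return [
        python,
        SCRIPT,
        "--num_gpus={}".format(num_gpus),
        "--model", model,
        "--batch_size", str(batch_size),
    ]


def sweep(models=MODELS, batch_sizes=BATCH_SIZES):
    """Return one command per (model, batch size) combination."""
    return [build_command(m, b) for m, b in itertools.product(models, batch_sizes)]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sweep can be reviewed first.
    for cmd in sweep():
        print(" ".join(shlex.quote(part) for part in cmd))
```

Each argument list can be passed to subprocess.run(cmd, check=True) to execute the run. Printing first makes it easier to spot combinations that are not worth starting.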

LeaderGPU® Tesla® instances

Testing environment

Instance type: 2 x Tesla® P100 PCI (ltbv32), 2 x Tesla® V100 PCI (ltbv20), 2 x Tesla® V100 NVLink™ (ltbv46)
GPUs: Nvidia® Tesla® cards
OS: CentOS 7
CUDA® / cuDNN: 9.0 / 7.0.5
TensorFlow™: 1.7 from repo
Benchmark GitHub hash: 9165a70
Date of testing: 25.04.2018

Options InceptionV3 VGG16 ResNet-50 ResNet-152 Alexnet
Batch size on GPU 64 32 64 32 512
Optimization sgd sgd sgd sgd sgd

Testing synthetic data (images / s)

GPUs InceptionV3 VGG16 ResNet-50 ResNet-152 Alexnet
2x P100 268.24 224.90 446.08 150.04 5252.43
2x PCI V100 430.77 309.82 667.62 213.04 7545.40
2x NVLink™ V100 450.75 417.22 698.97 236.90 8786.56
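The throughput figures are easier to compare as ratios against the slowest configuration. The sketch below copies the three rows from the summary table and divides each value by the 2x P100 result for the same model. The speedup helper and the dictionary layout are illustrative choices, not part of the benchmark.

```python
# Images per second on synthetic data, copied from the summary table.
RESULTS = {
    "2x P100": {
        "InceptionV3": 268.24, "VGG16": 224.90, "ResNet-50": 446.08,
        "ResNet-152": 150.04, "Alexnet": 5252.43,
    },
    "2x PCI V100": {
        "InceptionV3": 430.77, "VGG16": 309.82, "ResNet-50": 667.62,
        "ResNet-152": 213.04, "Alexnet": 7545.40,
    },
    "2x NVLink V100": {
        "InceptionV3": 450.75, "VGG16": 417.22, "ResNet-50": 698.97,
        "ResNet-152": 236.90, "Alexnet": 8786.56,
    },
}


def speedup(results, baseline):
    """Return each throughput divided by the baseline configuration's value."""
    base = results[baseline]
    return {
        gpu: {model: round(value / base[model], 2) for model, value in row.items()}
        for gpu, row in results.items()
    }


if __name__ == "__main__":
    for gpu, row in speedup(RESULTS, "2x P100").items():
        cells = "  ".join("{}: {:.2f}x".format(m, r) for m, r in row.items())
        print("{:<16} {}".format(gpu, cells))
```

For example, VGG16 on 2x NVLink V100 comes out at roughly 1.86 times the 2x P100 figure. These ratios only hold for the batch sizes listed in the options table.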

Other results

Testing synthetic data (images / s)

2x PCI Tesla® P100

Batch size Alexnet vgg11 vgg16 vgg19 lenet googlenet
32 1411.48 378.47 224.90 199.87 14944.76 788.43
64 2460.54 473.82 256.68 225.58 29215.60 913.38
128 3576.26 539.08 278.83 243.67 47375.83 1035.37
256 4545.45 561.73 - - 67116.75 1127.05
512 5252.43 - - - 83665.27 1165.75
Batch size overfeat inceptionv3 inception4 resnet50 resnet101 resnet152
32 548.55 248.72 122.22 389.73 220.26 150.04
64 952.51 268.24 133.96 446.08 253.86 176.09
128 1437.54 283.39 - 483.51 - -
256 1847.21 - - - - -
512 2186.47 - - - - -

2x PCI Tesla® V100

Batch size Alexnet vgg11 vgg16 vgg19 lenet googlenet
32 1665.82 526.55 309.82 282.81 17583.47 1268.95
64 3056.89 695.42 374.22 331.41 32271.30 1487.77
128 4660.06 831.39 410.27 360.79 62652.62 1704.92
256 6255.16 729.42 - - 98828.17 1921.02
512 7545.40 - - - 136553.56 2039.60
Batch size overfeat inceptionv3 inception4 resnet50 resnet101 resnet152
32 625.35 371.94 186.38 579.01 318.30 213.04
64 1194.50 430.77 210.41 667.62 379.37 259.16
128 1934.71 462.09 - 746.73 - -
256 2690.65 - - - - -
512 3267.15 - - - - -

2x NVLink™ Tesla® V100

Batch size Alexnet vgg11 vgg16 vgg19 lenet googlenet
32 3743.79 775.95 417.22 360.08 12460.77 1250.49
64 5514.97 904.65 447.46 386.92 28038.87 1546.01
128 6990.88 982.62 465.05 401.43 50064.03 1791.36
256 7960.86 805.59 - - 94842.75 1895.35
512 8786.56 - - - 131914.42 2158.45
Batch size overfeat inceptionv3 inception4 resnet50 resnet101 resnet152
32 1404.21 397.70 195.51 602.97 341.20 236.90
64 2216.08 450.75 220.00 698.97 395.01 272.37
128 3005.20 475.38 - 781.50 - -
256 3656.48 - - - - -
512 4073.38 - - - - -
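The tables above can be loaded into a script for further analysis, for example to see how throughput scales with batch size. The following is a minimal sketch that parses one table in the whitespace-separated layout used here. It assumes that a "-" cell means no result was reported for that combination; the article does not state the reason. The parse_table helper is illustrative.

```python
# First table for 2x PCI Tesla P100, copied verbatim from the results above.
P100_TABLE = """
Batch size Alexnet vgg11 vgg16 vgg19 lenet googlenet
32 1411.48 378.47 224.90 199.87 14944.76 788.43
64 2460.54 473.82 256.68 225.58 29215.60 913.38
128 3576.26 539.08 278.83 243.67 47375.83 1035.37
256 4545.45 561.73 - - 67116.75 1127.05
512 5252.43 - - - 83665.27 1165.75
"""


def parse_table(text):
    """Parse a results table into {batch_size: {model: images_per_s or None}}."""
    rows = [line.split() for line in text.strip().splitlines()]
    models = rows[0][2:]  # skip the two header words "Batch" and "size"
    table = {}
    for row in rows[1:]:
        batch_size = int(row[0])
        table[batch_size] = {
            # Assumption: "-" marks a combination with no reported result.
            model: None if cell == "-" else float(cell)
            for model, cell in zip(models, row[1:])
        }
    return table


if __name__ == "__main__":
    table = parse_table(P100_TABLE)
    # Throughput gain from the smallest to the largest batch size for one model.
    gain = table[512]["Alexnet"] / table[32]["Alexnet"]
    print("Alexnet, batch 512 vs 32: {:.2f}x".format(gain))
```

Keeping missing cells as None rather than zero avoids treating an absent measurement as a throughput of 0 images / s in later calculations.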



Updated: 18.03.2025

Published: 26.04.2018

