Tesla TensorFlow
Summary of image classification model test results on LeaderGPU Tesla servers
LeaderGPU is a new player in the GPU computing market, and it intends to change the rules of the game. At present, the market includes several large players such as Amazon AWS and Google Cloud. However, a large player does not always make the best offer. Unlike Amazon AWS and Google Cloud, LeaderGPU provides physical servers rather than VPS instances, on which hardware resources can be shared among several dozen users.
Tests were conducted on LeaderGPU Tesla computing systems using synthetic data with the following network models: Inception V3, ResNet-50, ResNet-152, VGG16 and AlexNet. Results for other models are given at the end of this article. Synthetic-data testing used a tf.Variable shaped like the input that each model expects for ImageNet.
The following commands were used to run the test:
# git clone https://github.com/tensorflow/benchmarks.git
# python3.5 benchmarks/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py --num_gpus=2 --model alexnet --batch_size 32
To test another model or batch size, replace alexnet with vgg11, vgg16, etc., and 32 with 64, 128, 256 or 512.
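The sweep over models and batch sizes can be scripted rather than typed by hand. The following is a minimal sketch, not the procedure actually used for these tests: it only builds the command lines for each combination. The helper names `build_command` and `build_sweep` are hypothetical, and the model list is limited to the three names given in the command above.

```python
import shlex

# Script path as it appears in the command above (relative to the clone directory).
SCRIPT = "benchmarks/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py"

# Model names and batch sizes mentioned in the command above.
MODELS = ["alexnet", "vgg11", "vgg16"]
BATCH_SIZES = [32, 64, 128, 256, 512]


def build_command(model, batch_size, num_gpus=2, python="python3.5"):
    """Return the argument list for one benchmark run (hypothetical helper)."""
    return [
        python, SCRIPT,
        "--num_gpus={}".format(num_gpus),
        "--model", model,
        "--batch_size", str(batch_size),
    ]


def build_sweep(models=MODELS, batch_sizes=BATCH_SIZES):
    """Return one command per (model, batch size) combination."""
    return [build_command(m, b) for m in models for b in batch_sizes]


if __name__ == "__main__":
    # Print the commands instead of running them. To launch a run,
    # pass each list to subprocess.run(cmd, check=True).
    for cmd in build_sweep():
        print(" ".join(shlex.quote(part) for part in cmd))
```

Printing the commands first makes it easy to check the sweep before committing GPU time. Some combinations may not complete on a given card, so a real runner would need to tolerate failed runs.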
LeaderGPU Tesla instances
- Testing environment: 2 x Tesla P100 PCI (ltbv32), 2 x Tesla V100 PCI (ltbv20), 2 x Tesla V100 NVLink (ltbv46)
- Instance type: 2 x Tesla P100 PCI (ltbv32), 2 x Tesla V100 PCI (ltbv20), 2 x Tesla V100 NVLink (ltbv46)
- GPUs: NVIDIA Tesla cards
- OS: CentOS 7
- CUDA / cuDNN: 9.0 / 7.0.5
- TensorFlow: 1.7 from repo
- Benchmark GitHub hash: 9165a70
- Date of testing: 25.04.2018
Options | Inception V3 | VGG16 | ResNet-50 | ResNet-152 | AlexNet |
---|---|---|---|---|---|
Batch size per GPU | 64 | 32 | 64 | 32 | 512 |
Optimization | sgd | sgd | sgd | sgd | sgd |
Testing synthetic data (images / s)
GPUs | Inception V3 | VGG16 | ResNet-50 | ResNet-152 | AlexNet |
---|---|---|---|---|---|
2x P100 | 268.24 | 224.90 | 446.08 | 150.04 | 5252.43 |
2x PCI V100 | 430.77 | 309.82 | 667.62 | 213.04 | 7545.40 |
2x NVLink V100 | 450.75 | 417.22 | 698.97 | 236.90 | 8786.56 |
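To compare the configurations, the throughput figures can be turned into speedup ratios. The sketch below copies the numbers from the table above and divides one configuration's images/s by another's. The `speedup` helper and the dictionary layout are illustrative assumptions, not part of the benchmark suite.

```python
# Throughput in images/s, copied from the table above.
RESULTS = {
    "2x P100": {
        "Inception V3": 268.24, "VGG16": 224.90, "ResNet-50": 446.08,
        "ResNet-152": 150.04, "AlexNet": 5252.43,
    },
    "2x PCI V100": {
        "Inception V3": 430.77, "VGG16": 309.82, "ResNet-50": 667.62,
        "ResNet-152": 213.04, "AlexNet": 7545.40,
    },
    "2x NVLink V100": {
        "Inception V3": 450.75, "VGG16": 417.22, "ResNet-50": 698.97,
        "ResNet-152": 236.90, "AlexNet": 8786.56,
    },
}


def speedup(config, baseline, model):
    """Ratio of images/s for `config` over `baseline` on one model (hypothetical helper)."""
    return RESULTS[config][model] / RESULTS[baseline][model]


if __name__ == "__main__":
    # Print V100 speedups relative to the P100 pair for every model.
    for model in RESULTS["2x P100"]:
        pci = speedup("2x PCI V100", "2x P100", model)
        nvlink = speedup("2x NVLink V100", "2x P100", model)
        print("{:<13} PCI V100: {:.2f}x  NVLink V100: {:.2f}x".format(model, pci, nvlink))
```

For example, `speedup("2x PCI V100", "2x P100", "ResNet-50")` is roughly 1.5, and the NVLink advantage over PCI is most visible on VGG16.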
Other results
Testing synthetic data (images / s)
2x PCI Tesla P100
Batch size | alexnet | vgg11 | vgg16 | vgg19 | lenet | googlenet |
---|---|---|---|---|---|---|
32 | 1411.48 | 378.47 | 224.90 | 199.87 | 14944.76 | 788.43 |
64 | 2460.54 | 473.82 | 256.68 | 225.58 | 29215.60 | 913.38 |
128 | 3576.26 | 539.08 | 278.83 | 243.67 | 47375.83 | 1035.37 |
256 | 4545.45 | 561.73 | - | - | 67116.75 | 1127.05 |
512 | 5252.43 | - | - | - | 83665.27 | 1165.75 |
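The table above also shows how throughput grows with batch size. A minimal sketch of that calculation follows, using the 2x Tesla P100 figures as given and treating the "-" cells as missing values. The `gain_over_smallest_batch` helper is a hypothetical name introduced for illustration.

```python
# Batch sizes and images/s for 2x PCI Tesla P100, copied from the table above.
# None marks a cell shown as "-" (no result reported).
BATCH_SIZES = [32, 64, 128, 256, 512]
THROUGHPUT = {
    "alexnet": [1411.48, 2460.54, 3576.26, 4545.45, 5252.43],
    "vgg11": [378.47, 473.82, 539.08, 561.73, None],
    "vgg16": [224.90, 256.68, 278.83, None, None],
    "vgg19": [199.87, 225.58, 243.67, None, None],
    "lenet": [14944.76, 29215.60, 47375.83, 67116.75, 83665.27],
    "googlenet": [788.43, 913.38, 1035.37, 1127.05, 1165.75],
}


def gain_over_smallest_batch(model):
    """Map each batch size to its throughput relative to batch size 32.

    Missing results stay None so they are not mistaken for zero throughput.
    """
    values = THROUGHPUT[model]
    base = values[0]
    return {
        batch: (value / base if value is not None else None)
        for batch, value in zip(BATCH_SIZES, values)
    }


if __name__ == "__main__":
    for model in THROUGHPUT:
        gains = gain_over_smallest_batch(model)
        cells = ["-" if g is None else "{:.2f}x".format(g) for g in gains.values()]
        print("{:<10} {}".format(model, "  ".join(cells)))
```

On these figures, the smaller networks (lenet, alexnet) gain the most from larger batches, while the VGG models gain comparatively little before results stop being reported.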