Docker GPU

The use of Docker containers for machine learning and deep learning has been gaining popularity recently. Containers make it almost effortless to install deep learning libraries with all their dependencies, and they make deployment and scaling simpler and more convenient.

The Docker Hub Registry (https://hub.docker.com/) contains pre-built container images for all popular deep learning libraries, such as TensorFlow™, Caffe2, Torch, CNTK, Theano, and others.

We decided to conduct a study to see whether there is any performance degradation when using Docker containers on GPU servers for deep learning tasks. For testing, the official Docker image of the TensorFlow™ deep learning library was used (https://hub.docker.com/r/tensorflow/tensorflow/).

The tests were performed on a GPU server with the following configuration (www.leadergpu.com):

  • GPU: NVIDIA® Tesla® P100 (16 GB)
  • CPU: 2 x Intel® Xeon® E5-2630v4 2.2 GHz
  • RAM: 128 GB
  • SSD: 960 GB
  • Ports: 40 Gbps
  • OS: CentOS 7
  • Python 2.7
  • TensorFlow™ 1.3

Benchmark settings:

Test procedure on the local machine

The following commands were used to run the test:

Synthetic tests

# mkdir ~/Anaconda
# cd ~/Anaconda
# git clone https://github.com/tensorflow/benchmarks.git
# cd ~/Anaconda/benchmarks/scripts/tf_cnn_benchmarks
# python tf_cnn_benchmarks.py --num_gpus=1 --model inception3 --batch_size 32

Result: total images/sec: 126.34
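The reported throughput can be related to the time of a single training step via the batch size. A quick sanity check, assuming the batch_size=32 from the command above:

```python
# Convert the benchmark's images/sec figure into seconds per training step.
# batch_size=32 matches the --batch_size flag used above; 126.34 images/sec
# is the result reported by tf_cnn_benchmarks.py.
batch_size = 32
images_per_sec = 126.34

sec_per_step = batch_size / images_per_sec
print(f"{sec_per_step:.3f} s per step")  # ~0.253 s per step
```

At roughly 126 images/sec, each 32-image Inception v3 step takes about a quarter of a second on the P100.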

Tests on real data

# cd ~/Anaconda
# git clone https://github.com/tensorflow/models.git
# cd ~/Anaconda/models/tutorials/image/cifar10
# python cifar10_train.py

Result: sec/batch 0.009-0.028
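To put the sec/batch range on the same scale as the synthetic benchmark, it can be converted to images/sec. Note that the batch size of 128 used below is an assumption (it was the default of cifar10_train.py at the time, and is configurable via a flag):

```python
# Convert the reported sec/batch range into an images/sec range.
# A batch size of 128 is assumed (the tutorial script's default);
# 0.009-0.028 sec/batch is the range reported above.
batch_size = 128
best_sec_per_batch = 0.009
worst_sec_per_batch = 0.028

fastest = batch_size / best_sec_per_batch   # throughput at the fastest steps
slowest = batch_size / worst_sec_per_batch  # throughput at the slowest steps
print(f"~{slowest:.0f} to ~{fastest:.0f} images/sec")
```

The much higher throughput compared to the synthetic test is expected: CIFAR-10 uses 32x32 images and a far smaller network than Inception v3.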

Test procedure on the Docker container

The following commands were used to run the test:

Synthetic tests

# docker pull tensorflow/tensorflow:latest-devel-gpu
# nvidia-docker run -it --rm -v ~/Anaconda:/root/Anaconda -p 8880:8888 -p 6000:6006 tensorflow/tensorflow:latest-devel-gpu
# cd ~/Anaconda/benchmarks/scripts/tf_cnn_benchmarks
# python tf_cnn_benchmarks.py --num_gpus=1 --model inception3 --batch_size 32

Result: total images/sec: 126.34

Tests on real data

# cd ~/Anaconda
# git clone https://github.com/tensorflow/models.git
# cd ~/Anaconda/models/tutorials/image/cifar10
# python cifar10_train.py

Result: sec/batch 0.009-0.028

Test results

                 Local                     Docker
Synthetic data   images/sec: 126.34        images/sec: 126.34
Real data        sec/batch: 0.009-0.028    sec/batch: 0.009-0.028
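The relative overhead implied by these numbers can be computed directly; with identical throughput in both environments it comes out to zero:

```python
# Percentage overhead of the Docker run relative to the local run,
# using the synthetic-data throughput numbers from the table above.
local_ips = 126.34
docker_ips = 126.34

overhead_pct = (local_ips - docker_ips) / local_ips * 100
print(f"Docker overhead: {overhead_pct:.1f}%")  # Docker overhead: 0.0%
```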

From the results of the tests on both synthetic and real data, it can be concluded that using Docker containers does not reduce performance on GPU servers for deep learning tasks.