Docker GPU

The use of Docker containers for machine learning and deep learning has been gaining popularity recently. Containers make it almost effortless to install deep learning libraries with all their dependencies, and they make deployment and scaling simpler and more convenient.

The Docker Hub Registry (https://hub.docker.com/) contains pre-built container images for all popular deep learning libraries, such as TensorFlow™, Caffe2, Torch, CNTK, Theano, and others.

We decided to conduct a study to see whether there is any performance degradation when using Docker containers on GPU servers for deep learning tasks. For testing, the official Docker image of the TensorFlow™ deep learning library was used (https://hub.docker.com/r/tensorflow/tensorflow/).

The tests were performed on a GPU server with the following configuration (www.leadergpu.com):

  • GPU: NVIDIA® Tesla® P100 (16 GB)
  • CPU: 2 x Intel® Xeon® E5-2630v4 2.2 GHz
  • RAM: 128 GB
  • SSD: 960 GB
  • Ports: 40 Gbps
  • OS: CentOS 7
  • Python 2.7
  • TensorFlow™ 1.3

Benchmark settings:

Test procedure on the local machine

The following commands were used to run the test:

Synthetic tests

# mkdir ~/Anaconda
# cd ~/Anaconda
# git clone https://github.com/tensorflow/benchmarks.git
# cd ~/Anaconda/benchmarks/scripts/tf_cnn_benchmarks
# python tf_cnn_benchmarks.py --num_gpus=1 --model inception3 --batch_size 32

Result: total images/sec: 126.34
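The reported throughput can be related to the time of a single training step via the batch size. A quick sanity check, assuming the batch_size=32 from the command above:

```python
# Convert the benchmark's images/sec figure into seconds per training step.
# batch_size=32 matches the --batch_size flag used above; 126.34 images/sec
# is the result reported by tf_cnn_benchmarks.py.
batch_size = 32
images_per_sec = 126.34

sec_per_step = batch_size / images_per_sec
print(f"{sec_per_step:.3f} s per step")  # ~0.253 s per step
```

At roughly 126 images/sec, each 32-image Inception v3 step takes about a quarter of a second on the P100.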

Tests on real data

# cd ~/Anaconda
# git clone https://github.com/tensorflow/models.git
# cd ~/Anaconda/models/tutorials/image/cifar10
# python cifar10_train.py

Result: sec/batch 0.009-0.028
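To put the sec/batch range on the same scale as the synthetic benchmark, it can be converted to images/sec. Note that the batch size of 128 used below is an assumption (it was the default of cifar10_train.py at the time, and is configurable via a flag):

```python
# Convert the reported sec/batch range into an images/sec range.
# A batch size of 128 is assumed (the tutorial script's default);
# 0.009-0.028 sec/batch is the range reported above.
batch_size = 128
best_sec_per_batch = 0.009
worst_sec_per_batch = 0.028

fastest = batch_size / best_sec_per_batch   # throughput at the fastest steps
slowest = batch_size / worst_sec_per_batch  # throughput at the slowest steps
print(f"~{slowest:.0f} to ~{fastest:.0f} images/sec")
```

The much higher throughput compared to the synthetic test is expected: CIFAR-10 uses 32x32 images and a far smaller network than Inception v3.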

Test procedure on the Docker container

The following commands were used to run the test:

Synthetic tests

# docker pull tensorflow/tensorflow:latest-devel-gpu
# nvidia-docker run -it --rm -v ~/Anaconda:/root/Anaconda -p 8880:8888 -p 6000:6006 tensorflow/tensorflow:latest-devel-gpu
# cd ~/Anaconda/benchmarks/scripts/tf_cnn_benchmarks
# python tf_cnn_benchmarks.py --num_gpus=1 --model inception3 --batch_size 32

Result: total images/sec: 126.34

Tests on real data

# cd ~/Anaconda
# git clone https://github.com/tensorflow/models.git
# cd ~/Anaconda/models/tutorials/image/cifar10
# python cifar10_train.py

Result: sec/batch 0.009-0.028

Test results

                 Local                     Docker
Synthetic data   images/sec: 126.34        images/sec: 126.34
Real data        sec/batch: 0.009-0.028    sec/batch: 0.009-0.028
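The relative overhead implied by these numbers can be computed directly; with identical throughput in both environments it comes out to zero:

```python
# Percentage overhead of the Docker run relative to the local run,
# using the synthetic-data throughput numbers from the table above.
local_ips = 126.34
docker_ips = 126.34

overhead_pct = (local_ips - docker_ips) / local_ips * 100
print(f"Docker overhead: {overhead_pct:.1f}%")  # Docker overhead: 0.0%
```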

From the results of the tests on both synthetic and real data, it can be concluded that using Docker containers does not reduce performance on GPU servers for deep learning tasks.