You ask — we answer!

How to check multi-GPU support in PyTorch

Renting a server with multiple GPUs solves one simple problem: reducing computation time by parallelizing the workload. However, the GPUs alone can’t guarantee parallel execution; that is always the developer’s responsibility. In most cases no separate mechanism is needed: if your application is built on a framework such as PyTorch, this functionality is already included out of the box.
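As a first sanity check, you can ask PyTorch directly how many GPUs it can see. A minimal sketch (the helper name is ours; it only assumes the torch package is installed):

```python
import torch

def report_gpus() -> int:
    """Print CUDA availability and list every GPU visible to PyTorch."""
    available = torch.cuda.is_available()
    count = torch.cuda.device_count()
    print(f"CUDA available: {available}, visible GPUs: {count}")
    for i in range(count):
        # Device name of each visible card, e.g. "NVIDIA A100-SXM4-40GB"
        print(f"  cuda:{i} -> {torch.cuda.get_device_name(i)}")
    return count

if __name__ == "__main__":
    report_gpus()
```

On the 8-GPU server described below you would expect a count of 8; on a machine without CUDA the function simply reports 0.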

Let’s say you’ve rented a server with 8 GPUs and want to make sure they are all available to your PyTorch-based application. For a quick test, you can use one of the small standard datasets, such as CIFAR-10, which contains 60,000 images: 50,000 for training and 10,000 for testing.

To save time on writing your own scripts, you can use existing solutions on GitHub. For instance, you could clone the following repository:

git clone https://github.com/kentaroy47/pytorch-mgpu-cifar10.git

Navigate to the downloaded directory:

cd pytorch-mgpu-cifar10

Now you need to set the CUDA_VISIBLE_DEVICES variable based on the GPUs installed in the server. This variable doesn’t specify the number of GPUs, but rather their identification numbers. For example, if the server has two cards, you’d specify “0,1”. In our case, with 8 cards, we specify IDs from 0 to 7:

export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
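Note that after this export, PyTorch renumbers the visible devices starting from zero: whatever physical IDs you list become cuda:0, cuda:1, and so on, in that order. A small stdlib-only sketch of that mapping (the helper function is ours, not part of PyTorch):

```python
def visible_device_map(env_value: str) -> dict:
    """Map PyTorch's logical device indices (cuda:0, cuda:1, ...)
    to the physical GPU IDs listed in CUDA_VISIBLE_DEVICES."""
    physical_ids = [int(x) for x in env_value.split(",") if x.strip()]
    return dict(enumerate(physical_ids))

# With all 8 cards exposed, logical and physical IDs coincide:
print(visible_device_map("0,1,2,3,4,5,6,7"))
# Restricting to two specific cards renumbers them as cuda:0 and cuda:1:
print(visible_device_map("6,7"))  # {0: 6, 1: 7}
```

This is why a script written against cuda:0 keeps working even when you pin it to a different physical card via the environment variable.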

You can now begin the neural network training process using this dataset:

python train_cifar10.py
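Under the hood, scripts like this typically spread each batch across the cards by wrapping the model in torch.nn.DataParallel. A hedged sketch of that pattern (the toy model below is just for illustration; the repository’s actual network differs):

```python
import torch
import torch.nn as nn

# A toy model standing in for the CIFAR-10 network.
model = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 10))

if torch.cuda.is_available():
    # DataParallel splits each input batch across all visible GPUs,
    # runs the forward pass in parallel, and gathers results on cuda:0.
    model = nn.DataParallel(model).cuda()

batch = torch.randn(8, 32)
if torch.cuda.is_available():
    batch = batch.cuda()

out = model(batch)
print(out.shape)  # torch.Size([8, 10])
```

With CUDA_VISIBLE_DEVICES set as above, DataParallel picks up all eight cards automatically; on a CPU-only machine the wrapper is skipped and the model runs unchanged.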

If no errors occur, you can install and run the nvtop utility in a separate SSH session to monitor the real-time load on each GPU:

sudo apt update && sudo apt -y install nvtop && nvtop

This check confirms that all GPUs are accessible to PyTorch and that the load is distributed evenly across them.

Updated: 28.03.2025

Published: 22.10.2024

