Stable Diffusion: Riffusion

In our previous articles, we explored the capabilities of Stable Diffusion for generating captivating images. However, this powerful generative neural network has even more to offer.
Riffusion is a Stable Diffusion model for music creation and editing. With Riffusion, you can generate a spectrogram of a desired musical segment and effortlessly transform it into a musical excerpt. Let’s install Riffusion on a LeaderGPU server and see it in action.
Prerequisites
Start by updating the package repository cache and the installed packages:
sudo apt update && sudo apt -y upgrade
Install the NVIDIA® drivers, either with the autoinstall command or manually by following our step-by-step guide:
sudo ubuntu-drivers autoinstall
Reboot the server:
sudo shutdown -r now
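Once the server is back online, you can check that the driver loaded correctly with the standard NVIDIA® utility (the exact output depends on your GPU model):
nvidia-smi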
To create a virtual environment, the developers suggest a tool named Anaconda. You can also use venv, which we discussed in the Linux system utilities tutorial. Download the Anaconda installation script using curl:
curl --output anaconda.sh https://repo.anaconda.com/archive/Anaconda3-5.3.1-Linux-x86_64.sh
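Optionally, verify the integrity of the downloaded script and compare the result with the checksum published in the Anaconda archive:
sha256sum anaconda.sh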
Make it executable:
chmod +x anaconda.sh
And run:
./anaconda.sh
Answer YES to all questions except the last one (installing Microsoft VSCode). Then log in to the SSH console again and create a new virtual environment with Python 3.9:
conda create --name riffusion python=3.9
Activate the new virtual environment:
conda activate riffusion
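To make sure the environment is active and provides the expected interpreter, check the Python version:
python --version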
If you want to use audio formats other than WAV, install the FFmpeg library set as well:
conda install -c conda-forge ffmpeg
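You can confirm that FFmpeg is available inside the environment:
ffmpeg -version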
Install Riffusion
Clone the Riffusion repository:
git clone https://github.com/riffusion/riffusion.git
Open the downloaded directory:
cd riffusion
Let’s make some changes to the requirements file to prevent torch compatibility errors:
nano requirements.txt
Find and pin the following package versions:
diffusers==0.9.0
torchaudio==2.0.1
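If you prefer to apply these pins non-interactively instead of editing the file in nano, a sed one-liner like the following should work (assuming both packages are already listed on their own lines in requirements.txt):
sed -i 's/^diffusers==.*/diffusers==0.9.0/; s/^torchaudio==.*/torchaudio==2.0.1/' requirements.txt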
After saving the changes, proceed with preparing the virtual environment. The following command installs all the necessary packages:
python -m pip install -r requirements.txt
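Before launching anything, it’s worth checking that PyTorch detects the GPU; the command below should print True if the drivers and the installed torch build match:
python -c "import torch; print(torch.cuda.is_available())"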
Finally, you can open a “playground”. This is a simple web interface that helps you learn more about Riffusion’s features:
python -m riffusion.streamlit.playground
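By default, Streamlit listens on port 8501. If you need to bind to a specific address or port, you can launch the same app through Streamlit directly (the script path below matches the current repository layout and may change in future revisions):
python -m streamlit run riffusion/streamlit/playground.py --server.address 0.0.0.0 --server.port 8501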
Open your favorite browser and enter the address http://[SERVER_IP]:8501/
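If the server’s firewall blocks port 8501, you can forward it over SSH instead and open http://localhost:8501/ locally (replace USER with your own account name):
ssh -L 8501:127.0.0.1:8501 USER@[SERVER_IP]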
Test the playground
Now you can create music using text prompts and by adjusting the other parameters.

You can also do trickier things, such as splitting audio into separate components. For example, you can extract the vocals from Bohemian Rhapsody by Queen.

Remember, this is just one example of how Riffusion can be used. By building your own application, you can achieve far more interesting results. Powerful LeaderGPU servers will take care of the calculations.
Updated: 26.03.2025
Published: 21.01.2025