<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
  xmlns:content="http://purl.org/rss/1.0/modules/content/"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/"
  xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">
  <channel>
    <title>LeaderGPU® | GPU solutions for high-performance computing</title>
    <link>https://www.leadergpu.com</link>
    <description>A catalog of solutions in which you will find the best libraries, programs and tools for high-performance computing in different categories and areas.</description>
    <language>en</language>
    <item>
      <title>Qwen3-Coder: A Broken Paradigm</title>
      <link>https://www.leadergpu.com/catalog/628-qwen3-coder-a-broken-paradigm</link>
      <description>&lt;p&gt;We’re used to thinking that open-source models always lag behind their commercial counterparts in quality. It may seem that they’re developed exclusively by enthusiasts who cannot afford to invest vast sums in creating high-quality datasets and training models on tens of thousands of modern GPUs.&lt;/p&gt;
&lt;p&gt;It’s a different story when large corporations like OpenAI, Anthropic, or Meta take on the task. They not only have the resources but also the world’s top neural network specialists. Unfortunately, the models they create, especially the latest versions, are closed-source. Developers explain this by citing the risks of uncontrolled use and the need to ensure AI safety.&lt;/p&gt;
&lt;p&gt;On one hand, their reasoning is understandable: many ethical questions remain unresolved, and the very nature of neural network models allows only indirect influence on the final output. On the other hand, keeping models closed and offering access only through their own API is also a solid business model.&lt;/p&gt;
&lt;p&gt;Not all companies behave this way, however. For instance, the French company Mistral AI offers both commercial and open-source models, enabling researchers and enthusiasts to use them in their projects. But special attention should be paid to the achievements of Chinese companies, most of which build open-weight and open-source models capable of seriously competing with proprietary solutions.&lt;/p&gt;
&lt;h2&gt;DeepSeek, Qwen3, and Kimi K2&lt;/h2&gt;
&lt;p&gt;The first major breakthrough came with DeepSeek-V3. This large language model from DeepSeek AI was built on the Mixture of Experts (MoE) approach with an impressive 671B parameters, of which only the 37B most relevant are activated for each token. Most importantly, all its components (model weights, inference code, and training pipelines) were released openly.&lt;/p&gt;
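&lt;p&gt;To illustrate the idea, here’s a toy sketch (not DeepSeek’s actual code) of how MoE routing activates only a few experts per token; the sizes and the gating function are invented for demonstration:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;import numpy as np

def moe_layer(x, experts, gate_w, top_k=2):
    # The router scores every expert for the current token
    logits = x @ gate_w                        # (num_experts,)
    top = np.argsort(logits)[-top_k:]          # keep only the top-k experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                   # softmax over the chosen experts
    # Only the selected experts run; the others stay idle for this token
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
dim, num_experts = 8, 4
experts = [lambda x, W=rng.normal(size=(dim, dim)): x @ W for _ in range(num_experts)]
gate_w = rng.normal(size=(dim, num_experts))
print(moe_layer(rng.normal(size=dim), experts, gate_w).shape)  # (8,)&lt;/code&gt;&lt;/pre&gt;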
&lt;p&gt;That openness instantly made DeepSeek-V3 one of the most attractive LLMs for AI application developers and researchers alike. The next headline-grabber was DeepSeek-R1 - the first open-source reasoning model. On its release day, it rattled the U.S. stock market after its developers claimed that training such an advanced model had cost only $6 million.&lt;/p&gt;
&lt;p&gt;While the hype around DeepSeek eventually cooled down, the next releases were no less important for the global AI industry. We’re talking, of course, about Qwen 3. We covered its features in detail in our &lt;a href=&quot;https://www.leadergpu.com/catalog/624-what-s-new-in-qwen-3&quot; target=&quot;_blank&quot;&gt;What&#39;s new in Qwen 3&lt;/a&gt; review, so we won’t linger on it here. Soon after, another player appeared: Kimi K2 from Moonshot AI.&lt;/p&gt;
&lt;p&gt;With its MoE architecture, 1T parameters (32B activated per token), and open-source code, Kimi K2 quickly drew community attention. Rather than focusing on reasoning, Moonshot AI aimed for state-of-the-art performance in mathematics, programming, and deep cross-disciplinary knowledge.&lt;/p&gt;
&lt;p&gt;The ace up Kimi K2’s sleeve was its optimization for integration into AI agents. This network was literally designed to make full use of all available tools. It excels in tasks requiring not only code writing but also iterative testing at each development stage. However, it has weaknesses too, which we’ll discuss later.&lt;/p&gt;
&lt;p&gt;Kimi K2 is a large language model in every sense. Running the full-size version requires ~2 TB of VRAM (FP8: ~1 TB). For obvious reasons, this isn’t something you can do at home, and even many GPU servers won’t handle it. The model needs at least 8 NVIDIA® H200 accelerators. Quantized versions can help, but at a noticeable cost to accuracy.&lt;/p&gt;
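&lt;p&gt;These figures are easy to sanity-check with back-of-the-envelope arithmetic: the weights alone take roughly two bytes per parameter in FP16 and one in FP8, and a real deployment also needs headroom for the KV cache and activations:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;params = 1.0e12                     # Kimi K2: ~1T total parameters
bytes_fp16, bytes_fp8 = 2, 1

print(params * bytes_fp16 / 1e12)   # ~2.0 TB for FP16 weights
print(params * bytes_fp8 / 1e12)    # ~1.0 TB for FP8 weights

h200_vram_gb = 141                  # NVIDIA H200: 141 GB of HBM3e
print(8 * h200_vram_gb)             # 1128 GB: 8 GPUs cover the FP8 weights&lt;/code&gt;&lt;/pre&gt;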
&lt;h2&gt;Qwen3-Coder&lt;/h2&gt;
&lt;p&gt;Seeing Moonshot AI’s success, Alibaba developed its own Kimi K2-like model, but with significant advantages that we’ll discuss shortly. Initially, it was released in two versions:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://huggingface.co/Qwen/Qwen3-Coder-480B-A35B-Instruct&quot; target=&quot;_blank&quot;&gt;Qwen3-Coder-480B-A35B-Instruct&lt;/a&gt; (~250 GB VRAM)&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://huggingface.co/Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8&quot; target=&quot;_blank&quot;&gt;Qwen3-Coder-480B-A35B-Instruct-FP8&lt;/a&gt; (~120 GB VRAM)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A few days later, smaller models without the reasoning mechanism appeared, requiring far less VRAM:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct&quot; target=&quot;_blank&quot;&gt;Qwen3-Coder-30B-A3B-Instruct&lt;/a&gt; (~32 GB VRAM)&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct-FP8&quot; target=&quot;_blank&quot;&gt;Qwen3-Coder-30B-A3B-Instruct-FP8&lt;/a&gt; (~18 GB VRAM)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Qwen3-Coder was designed for integration with development tools. It includes a special parser for function calls (qwen3coder_tool_parser.py, analogous to OpenAI’s function calling). Alongside the model, a console utility was released, capable of tasks ranging from code compilation to querying a knowledge base. The idea isn’t new: essentially, it’s a heavily reworked fork of Google’s Gemini CLI.&lt;/p&gt;
&lt;p&gt;The model is compatible with the OpenAI API, allowing it to be deployed locally or on a remote server and connected to most systems that support this API. This includes both ready-made client apps and machine learning libraries. That makes it viable not only for the B2C segment but also for B2B, offering a seamless drop-in replacement for OpenAI’s product without any changes to application logic.&lt;/p&gt;
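&lt;p&gt;As a quick sketch of what “drop-in” means in practice, the standard openai Python client can talk to a locally served Qwen3-Coder simply by changing the base URL. The address and model path below assume the vLLM server we launch later in this guide:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;from openai import OpenAI

# Point the standard OpenAI client at the local vLLM server
# launched in the inference section below; nothing else changes.
client = OpenAI(base_url=&quot;http://127.0.0.1:8000/v1&quot;, api_key=&quot;none&quot;)

response = client.chat.completions.create(
    model=&quot;/home/usergpu/Qwen3-30B&quot;,   # vLLM serves the model under its path
    messages=[
        {&quot;role&quot;: &quot;system&quot;, &quot;content&quot;: &quot;You are a helpful coding assistant.&quot;},
        {&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;Write a function that reverses a string.&quot;},
    ],
    max_tokens=180,
)
print(response.choices[0].message.content)&lt;/code&gt;&lt;/pre&gt;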
&lt;p&gt;One of its most in-demand features is extended context length. By default, it supports 256K tokens, and this can be increased to 1M using the &lt;b translate=&quot;no&quot;&gt;YaRN&lt;/b&gt; (Yet another RoPE extensioN) mechanism. Modern LLMs are typically trained on short sequences (2K–8K tokens), so at much larger context lengths they tend to lose track of earlier content.&lt;/p&gt;
&lt;p&gt;YaRN is an elegant “trick” that makes the model think it’s working with its usual short sequences while actually processing much longer ones. The key idea is to “stretch” or “dilate” the positional space while preserving the mathematical structure the model expects. This allows effective processing of sequences tens of thousands of tokens long without retraining or extra memory required by traditional context extension methods.&lt;/p&gt;
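&lt;p&gt;The real YaRN mechanism rescales different RoPE frequency bands unevenly and adjusts attention scaling, but the core “stretch” idea can be sketched in a few lines: positions are compressed by a factor so that a much longer input still lands inside the angle range the model saw during training:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;import numpy as np

def rope_angles(positions, dim, base=10000.0, scale=1.0):
    # Rotary embeddings turn each position into a set of rotation angles
    inv_freq = 1.0 / base ** (np.arange(0, dim, 2) / dim)
    # Dividing positions by the scale factor stretches the positional space
    return np.outer(positions / scale, inv_freq)

short = rope_angles(np.arange(8192), 64)               # training-length input
long = rope_angles(np.arange(32768), 64, scale=4.0)    # 4x longer input
print(short.max(), long.max())   # both ~8191: the long input stays in range&lt;/code&gt;&lt;/pre&gt;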
&lt;h2&gt;Downloading and Running Inference&lt;/h2&gt;
&lt;p&gt;Make sure you’ve installed CUDA® beforehand, either using NVIDIA®’s official instructions or the &lt;a href=&quot;https://www.leadergpu.com/articles/615-install-cuda-toolkit-in-linux&quot; target=&quot;_blank&quot;&gt;Install CUDA® toolkit in Linux&lt;/a&gt; guide. To check for the required compiler:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;nvcc --version&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Expected output:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Tue_Feb_27_16:19:38_PST_2024
Cuda compilation tools, release 12.4, V12.4.99
Build cuda_12.4.r12.4/compiler.33961263_0&lt;/pre&gt;
&lt;p&gt;If you get:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;Command &#39;nvcc&#39; not found, but can be installed with:
sudo apt install nvidia-cuda-toolkit&lt;/pre&gt;
&lt;p&gt;you need to add the CUDA® binaries to your system’s $PATH.&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;export PATH=/usr/local/cuda-12.4/bin:$PATH&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;export LD_LIBRARY_PATH=/usr/local/cuda-12.4/lib64:$LD_LIBRARY_PATH&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is a temporary solution. For a permanent one, edit &lt;b translate=&quot;no&quot;&gt;~/.bashrc&lt;/b&gt; and add the same two lines at the end.&lt;/p&gt;
&lt;p&gt;Now, prepare your system for managing virtual environments. You can use Python’s built-in venv or the more advanced Miniforge. Assuming Miniforge is installed:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;conda create -n venv python=3.10&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;conda activate venv&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Install PyTorch with CUDA® support matching your system:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-venv&quot;&gt;pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu124&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then install the essential libraries:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;Transformers&lt;/b&gt; – Hugging Face’s main model library&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;Accelerate&lt;/b&gt; – enables multi-GPU inference&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;HuggingFace Hub&lt;/b&gt; – for downloading/uploading models &amp; datasets&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;Safetensors&lt;/b&gt; – safe model weight format&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;vLLM&lt;/b&gt; – recommended inference library for Qwen&lt;/li&gt;
&lt;/ul&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-venv&quot;&gt;pip install transformers accelerate huggingface_hub safetensors vllm&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Download the model:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-venv&quot;&gt;hf download Qwen/Qwen3-Coder-30B-A3B-Instruct --local-dir ./Qwen3-30B&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Run inference with tensor parallelism, which splits each layer’s tensors across multiple GPUs (8 in this example):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-venv&quot;&gt;python -m vllm.entrypoints.openai.api_server \
--model /home/usergpu/Qwen3-30B \
--tensor-parallel-size 8 \
--gpu-memory-utilization 0.9 \
--dtype auto \
--host 0.0.0.0 \
--port 8000&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This launches the vLLM OpenAI API Server.&lt;/p&gt;
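&lt;p&gt;Before connecting any clients, you can confirm the server is alive by listing the models it exposes; this is a standard endpoint of the OpenAI-compatible API. A quick check in Python:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;import requests

# List the models the vLLM OpenAI API Server is currently serving
resp = requests.get(&quot;http://127.0.0.1:8000/v1/models&quot;, timeout=10)
resp.raise_for_status()
for model in resp.json()[&quot;data&quot;]:
    print(model[&quot;id&quot;])   # should print /home/usergpu/Qwen3-30B&lt;/code&gt;&lt;/pre&gt;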
&lt;h2&gt;Testing and Integration&lt;/h2&gt;
&lt;h3&gt;cURL&lt;/h3&gt;
&lt;p&gt;Install &lt;b translate=&quot;no&quot;&gt;jq&lt;/b&gt; for pretty-printing JSON:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-venv&quot;&gt;sudo apt -y install jq&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Test the server:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-venv&quot;&gt;curl -s http://127.0.0.1:8000/v1/chat/completions -H &quot;Content-Type: application/json&quot; -d &#39;{
  &quot;model&quot;: &quot;/home/usergpu/Qwen3-30B&quot;,
  &quot;messages&quot;: [
    {&quot;role&quot;: &quot;system&quot;, &quot;content&quot;: &quot;You are a helpful assistant.&quot;},
    {&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;Hello! What can you do?&quot;}
  ],
  &quot;max_tokens&quot;: 180
}&#39; | jq -r &#39;.choices[0].message.content&#39;&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;VSCode&lt;/h3&gt;
&lt;p&gt;To integrate with &lt;b translate=&quot;no&quot;&gt;Visual Studio Code&lt;/b&gt;, install the &lt;b translate=&quot;no&quot;&gt;Continue&lt;/b&gt; extension and add the following entry under the &lt;b translate=&quot;no&quot;&gt;models&lt;/b&gt; section of &lt;b translate=&quot;no&quot;&gt;config.yaml&lt;/b&gt;:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;- name: Qwen3-Coder 30B
  provider: openai
  apiBase: http://[server_IP_address]:8000/v1
  apiKey: none
  model: /home/usergpu/Qwen3-30B
  roles:
    - chat
    - edit
    - apply&lt;/code&gt;&lt;/pre&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/183/original/sh_qwen3_coder_a_broken_paradigm_1.png?1755000294&quot; alt=&quot;Continue extension&quot;&gt;
&lt;h3&gt;Qwen-Agent&lt;/h3&gt;
&lt;p&gt;For a GUI-based setup with Qwen-Agent (including RAG, MCP, and code interpreter):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-venv&quot;&gt;pip install -U &quot;qwen-agent[gui,rag,code_interpreter,mcp]&quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Open the nano editor:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-venv&quot;&gt;nano script.py&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Example Python script to launch Qwen-Agent with a Gradio WebUI:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;from qwen_agent.agents import Assistant
from qwen_agent.gui import WebUI

llm_cfg = {
    &#39;model&#39;: &#39;/home/usergpu/Qwen3-30B&#39;,
    &#39;model_server&#39;: &#39;http://localhost:8000/v1&#39;,
    &#39;api_key&#39;: &#39;EMPTY&#39;,
    &#39;generate_cfg&#39;: {&#39;top_p&#39;: 0.8},
}

tools = [&#39;code_interpreter&#39;]

bot = Assistant(
    llm=llm_cfg,
    system_message=&quot;You are a helpful coding assistant.&quot;,
    function_list=tools
)

WebUI(bot).run()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Run the script:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-venv&quot;&gt;python script.py&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The server will be available at: http://127.0.0.1:7860&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/184/original/sh_qwen3_coder_a_broken_paradigm_2.png?1755000323&quot; alt=&quot;Qwen-Agent with tools&quot;&gt;
&lt;p&gt;You can also integrate Qwen3-Coder into agent frameworks like CrewAI for automating complex tasks with toolsets such as web search or vector database memory.&lt;/p&gt;
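&lt;p&gt;As a minimal sketch of that integration, a CrewAI agent can be pointed at the same vLLM endpoint. The exact model prefix depends on your CrewAI/LiteLLM version, so treat the identifiers below as assumptions to adapt:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;from crewai import Agent, LLM

# Route a CrewAI agent through the local OpenAI-compatible vLLM server;
# the openai/ prefix tells the library to use that protocol.
qwen = LLM(
    model=&quot;openai//home/usergpu/Qwen3-30B&quot;,
    base_url=&quot;http://127.0.0.1:8000/v1&quot;,
    api_key=&quot;none&quot;,
)

coder = Agent(
    role=&quot;Senior Python Developer&quot;,
    goal=&quot;Write and iteratively test small utilities&quot;,
    backstory=&quot;An experienced engineer who favors simple, tested code.&quot;,
    llm=qwen,
)&lt;/code&gt;&lt;/pre&gt;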
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/627-how-to-install-crewai-with-gui&quot; target=&quot;_blank&quot;&gt;How to install CrewAI with GUI&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/601-low-code-ai-app-builder-langflow&quot; target=&quot;_blank&quot;&gt;Low-code AI app builder Langflow&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/602-how-to-monitor-langflow-application&quot; target=&quot;_blank&quot;&gt;How to monitor LangFlow application&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/001/182/original/il_qwen3_coder_a_broken_paradigm.png?1755000263"
        length="0"
        type="image/jpeg"/>
      <pubDate>Tue, 12 Aug 2025 14:11:06 +0200</pubDate>
      <guid isPermaLink="false">628</guid>
      <dc:date>2025-08-12 14:11:06 +0200</dc:date>
    </item>
    <item>
      <title>How to install CrewAI with GUI</title>
      <link>https://www.leadergpu.com/catalog/627-how-to-install-crewai-with-gui</link>
      <description>&lt;p&gt;The capabilities of neural network models are growing every day. Researchers and commercial companies are investing more and more into training them. But on their own, these models can’t act autonomously. To solve specific tasks, they need guidance: context extension and direction setting. This approach isn’t always efficient, especially for complex problems.&lt;/p&gt;
&lt;p&gt;But what if we allowed a neural network to act autonomously? And what if we provided it with many tools to interact with the external world? You’d get an AI agent capable of solving tasks by independently determining which tools to use. Sounds complicated, but it works very well. However, even for an advanced user, creating an AI agent from scratch can be a non-trivial task.&lt;/p&gt;
&lt;p&gt;The reason is that most popular libraries lack a graphical user interface. They require interaction through a programming language like Python. This drastically raises the entry threshold and makes AI agents too complex for independent implementation. This is exactly the case with CrewAI.&lt;/p&gt;
&lt;h2&gt;What is CrewAI&lt;/h2&gt;
&lt;p&gt;CrewAI is a very popular and convenient library, but it doesn’t come with a GUI by default. This prompted independent developers to create an unofficial interface. The open source nature of CrewAI made the task much easier, and soon the community released the project CrewAI Studio.&lt;/p&gt;
&lt;p&gt;Developers and enthusiasts gained deeper insight into the system’s architecture and could build tools tailored to specific tasks. Regular users could create AI agents without writing a single line of code. It became easier to assign tasks and manage access to neural networks and tools. It also allowed for exporting and importing agents from server to server and sharing them with friends, colleagues, or the open source community.&lt;/p&gt;
&lt;p&gt;A separate advantage of CrewAI Studio is its deployment flexibility. It can be installed as a regular app or as a Docker container - the preferred method since it includes all necessary libraries and components for running the system.&lt;/p&gt;
&lt;h2&gt;Installation&lt;/h2&gt;
&lt;p&gt;Update your OS packages and installed apps to the latest versions:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update &amp;&amp; sudo apt -y upgrade&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Use the automatic driver installation script or follow our guide &lt;a href=&quot;https://www.leadergpu.com/articles/499-install-nvidia-drivers-in-linux&quot; target=&quot;_blank&quot;&gt;Install NVIDIA® drivers in Linux&lt;/a&gt;:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo ubuntu-drivers autoinstall&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Reboot the server for changes to take effect:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo shutdown -r now&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After reconnecting via SSH, install Apache 2 web server utilities, which will give you access to the &lt;b translate=&quot;no&quot;&gt;.htpasswd&lt;/b&gt; file generator used for basic user authentication:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt install -y apache2-utils&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Install Docker Engine using the official shell script:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -sSL https://get.docker.com/ | sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Add Docker Compose to the system:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt install -y docker-compose&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Clone the repository:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git clone https://github.com/strnad/CrewAI-Studio.git&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Navigate to the downloaded directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd CrewAI-Studio&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Create a &lt;b translate=&quot;no&quot;&gt;.htpasswd&lt;/b&gt; file for the &lt;b translate=&quot;no&quot;&gt;usergpu&lt;/b&gt; user. You’ll be prompted to enter a password twice:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;htpasswd -c .htpasswd usergpu&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now edit the container deployment file. By default, there are two containers:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo nano docker-compose.yaml&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Delete the section:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;ports:
  - &quot;5432:5432&quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And add the following service:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;nginx:
  image: nginx:latest
  container_name: crewai_nginx
  ports:
    - &quot;80:80&quot;
  volumes:
    - ./nginx.conf:/etc/nginx/nginx.conf:ro
    - ./.htpasswd:/etc/nginx/.htpasswd:ro
  depends_on:
    - web&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Nginx will need a config file, so create one:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo nano nginx.conf&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Paste in the following:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;events {}

http {
  server {
    listen 80;

    location / {
      proxy_pass http://web:8501;

      # WebSocket headers
      proxy_http_version 1.1;
      proxy_set_header Upgrade $http_upgrade;
      proxy_set_header Connection &quot;upgrade&quot;;

      # Forward headers
      proxy_set_header Host $host;
      proxy_set_header X-Real-IP $remote_addr;
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header X-Forwarded-Proto $scheme;

      auth_basic &quot;Restricted Content&quot;;
      auth_basic_user_file /etc/nginx/.htpasswd;
    }
  }
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;All important service variables for CrewAI are defined in the &lt;b translate=&quot;no&quot;&gt;.env&lt;/b&gt; file. Open the &lt;b translate=&quot;no&quot;&gt;.env_example&lt;/b&gt; file for editing:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;nano .env_example&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Add the following lines:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;OLLAMA_HOST=&quot;http://open-webui:11434&quot;
OLLAMA_MODELS=&quot;ollama/llama3.2:latest&quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And add Postgres config:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;POSTGRES_USER=&quot;admin&quot;
POSTGRES_PASSWORD=&quot;your_password&quot;
POSTGRES_DB=&quot;crewai_db&quot;
AGENTOPS_ENABLED=&quot;False&quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now copy the example file and rename it to &lt;b translate=&quot;no&quot;&gt;.env&lt;/b&gt; so the system can read it during container deployment:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cp .env_example .env&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this example, we’ll use local models with inference handled by Ollama. We recommend our guide &lt;a href=&quot;https://www.leadergpu.com/catalog/584-open-webui-all-in-one&quot; target=&quot;_blank&quot;&gt;Open WebUI: All in one&lt;/a&gt;, and during deployment add &lt;b translate=&quot;no&quot;&gt;-e OLLAMA_HOST=0.0.0.0&lt;/b&gt; to allow CrewAI to connect directly to the Ollama container. Download the desired model (e.g., llama3.2:latest) via WebUI or by connecting to the container console and running:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-root&quot;&gt;ollama pull llama3.2:latest&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Once everything is set up, launch the deployment:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo docker-compose up -d --build&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, visiting &lt;b translate=&quot;no&quot;&gt;http://[your_server_ip]/&lt;/b&gt; will prompt for login credentials. Once they are entered correctly, the CrewAI interface will appear.&lt;/p&gt;
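&lt;p&gt;You can also verify the authentication wall without a browser. A small sketch using Python’s requests library (replace the placeholder address and password with your own):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;import requests

url = &quot;http://[your_server_ip]/&quot;   # replace with your server address

# Anonymous requests should be rejected by Nginx with 401
print(requests.get(url, timeout=10).status_code)

# The .htpasswd credentials created earlier should return 200
print(requests.get(url, auth=(&quot;usergpu&quot;, &quot;your_password&quot;), timeout=10).status_code)&lt;/code&gt;&lt;/pre&gt;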
&lt;h2&gt;Features&lt;/h2&gt;
&lt;p&gt;Let’s explore the key entities CrewAI uses. This will help you understand how to configure workflows. The central entity is the &lt;b translate=&quot;no&quot;&gt;Agent&lt;/b&gt;, an autonomous task executor. Each agent has attributes that help it fulfill its duties:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;Role&lt;/b&gt;. A brief, 2-3 word job description.&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;Backstory&lt;/b&gt;. Optional; helps the language model understand how the agent should behave and what experiences to rely on.&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;Goal&lt;/b&gt;. The objective the agent should pursue.&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;Allow delegation&lt;/b&gt;. Enables the agent to delegate tasks (or parts of them) to others.&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;Verbose&lt;/b&gt;. Tells the agent to log detailed actions.&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;LLM Provider and Model&lt;/b&gt;. Specifies the model and provider to use.&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;Temperature&lt;/b&gt;. Determines response creativity. Higher = more creative.&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;Max iterations&lt;/b&gt;. Number of tries the agent has to succeed, acting as a safeguard (e.g., against infinite loops).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Agents operate by iteratively analyzing input, reasoning, and drawing conclusions using available tools.&lt;/p&gt;
&lt;p&gt;Input is defined by a &lt;b translate=&quot;no&quot;&gt;Task&lt;/b&gt; entity. Each task includes a description, an assigned agent, and, optionally, an expected result. Tasks run sequentially by default but can be parallelized using the &lt;b translate=&quot;no&quot;&gt;Async execution&lt;/b&gt; flag.&lt;/p&gt;
&lt;p&gt;Autonomous agent work is supported by &lt;b translate=&quot;no&quot;&gt;Tools&lt;/b&gt; that enable real-world interaction. CrewAI includes tools for web searches, site parsing, API calls, and file handling, enhancing context and helping agents achieve goals.&lt;/p&gt;
&lt;p&gt;Lastly, there is the &lt;b translate=&quot;no&quot;&gt;Crew entity&lt;/b&gt;. It unites agents with different roles into a team to tackle complex problems. They can communicate, delegate, review, and correct one another, essentially forming a collective intelligence.&lt;/p&gt;
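&lt;p&gt;CrewAI Studio builds all of this visually, but the same entities exist in the underlying Python library. A minimal sketch, assuming a recent crewai release, looks like this:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;from crewai import Agent, Task, Crew

analyst = Agent(
    role=&quot;Oncology Drug Pipeline Analyst&quot;,
    goal=&quot;Track new cancer drug developments up to clinical trials&quot;,
    backstory=&quot;A pharma analyst who follows early-stage research closely.&quot;,
    verbose=True,
    allow_delegation=False,
)

report = Task(
    description=&quot;Summarize this month&#39;s notable oncology pipeline updates.&quot;,
    expected_output=&quot;A short bulleted summary with sources.&quot;,
    agent=analyst,
)

# A Crew unites agents and tasks into one team and runs them
crew = Crew(agents=[analyst], tasks=[report])
print(crew.kickoff())&lt;/code&gt;&lt;/pre&gt;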
&lt;h2&gt;Using&lt;/h2&gt;
&lt;p&gt;Now that you’re familiar with the entities, let’s build and run a minimal CrewAI workflow. In this example, we’ll track global progress in cancer drug development.&lt;/p&gt;
&lt;p&gt;We’ll use three agents:&lt;/p&gt;
&lt;ol&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;Oncology Drug Pipeline Analyst&lt;/b&gt; - tracks new developments from early stages to clinical trials.&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;Regulatory and Approval Watchdog&lt;/b&gt; - monitors new drug approvals and regulatory changes.&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;Scientific Literature and Innovation Scout&lt;/b&gt; - scans scientific publications and patents related to oncology.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Open the Agents section and create the first agent:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/174/original/sh_how_to_install_crewai_with_gui_1.png?1753263006&quot; alt=&quot;Agent creation&quot;&gt;
&lt;p&gt;For now, we’re using the previously downloaded &lt;b translate=&quot;no&quot;&gt;llama3.2:latest&lt;/b&gt; model, but in a real scenario, choose the one that best fits the task. Repeat the process for the remaining agents and move on to task creation.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/175/original/sh_how_to_install_crewai_with_gui_2.png?1753263034&quot; alt=&quot;Task creation&quot;&gt;
&lt;p&gt;Gather all agents into a crew and assign the prepared task to them:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/176/original/sh_how_to_install_crewai_with_gui_3.png?1753263057&quot; alt=&quot;Crew creation&quot;&gt;
&lt;p&gt;Activate necessary tools from the list:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/177/original/sh_how_to_install_crewai_with_gui_4.png?1753263096&quot; alt=&quot;Tools selection&quot;&gt;
&lt;p&gt;Finally, go to the &lt;b translate=&quot;no&quot;&gt;Kickoff!&lt;/b&gt; page and click &lt;b translate=&quot;no&quot;&gt;Run Crew!&lt;/b&gt; After some iterations, the system will return a result, such as:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/178/original/sh_how_to_install_crewai_with_gui_5.png?1753263118&quot; alt=&quot;Example CrewAI result&quot;&gt;
&lt;p&gt;Before we finish, let’s check the &lt;b translate=&quot;no&quot;&gt;Import/export&lt;/b&gt; section. Your workflow or crew can be exported as JSON to transfer to another CrewAI server. You can also create a Single-Page Application (SPA) with a single click - perfect for production deployment:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/179/original/sh_how_to_install_crewai_with_gui_6.png?1753263147&quot; alt=&quot;Import and export settings&quot;&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;CrewAI significantly simplifies the creation of AI agents, allowing integration into any application or standalone use. The library is based on the idea of distributed intelligence, where each agent is a domain expert, and the combined team outperforms a single generalist agent.&lt;/p&gt;
&lt;p&gt;Since it’s written in Python, CrewAI integrates easily with ML platforms and tools. Its open source nature allows for extension through third-party modules. Inter-agent communication reduces token usage by distributing context processing.&lt;/p&gt;
&lt;p&gt;As a result, complex tasks are completed faster and more efficiently. The lower entry barrier provided by CrewAI Studio expands the reach of AI agents and multi-agent systems. And support for local models ensures better control over sensitive data.&lt;/p&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/601-low-code-ai-app-builder-langflow&quot;&gt;Low-code AI app builder Langflow&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/622-how-to-install-n8n&quot;&gt;How to install N8N&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/623-mcp-server-based-on-n8n&quot;&gt;MCP server based on N8N&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/001/180/original/il_how_to_install_crewai_with_gui.png?1753275220"
        length="0"
        type="image/jpeg"/>
      <pubDate>Wed, 23 Jul 2025 15:05:43 +0200</pubDate>
      <guid isPermaLink="false">627</guid>
      <dc:date>2025-07-23 15:05:43 +0200</dc:date>
    </item>
    <item>
      <title>What&#39;s new in Qwen 3</title>
      <link>https://www.leadergpu.com/catalog/624-what-s-new-in-qwen-3</link>
      <description>&lt;p&gt;The global AI race is accelerating. Research institutions, private corporations, and even entire nations are now competing for leadership in the AI domain. Broadly speaking, this race can be divided into several phases. The first stage involved the creation of narrow AI. Existing neural network models such as GPT, MidJourney, and AlphaFold show that this stage has been successfully achieved.&lt;/p&gt;
&lt;p&gt;The next step envisions the evolution of AI into AGI (Artificial General Intelligence). AGI should match human intelligence in solving a wide range of tasks, from writing stories and performing scientific calculations to understanding social situations and learning independently. As of the time of writing, this level has not been reached yet.&lt;/p&gt;
&lt;p&gt;The ultimate stage in AI development is referred to as ASI (Artificial Super Intelligence). It would far exceed human capabilities in all areas. This would make it possible to develop technologies we can’t even imagine today and to manage global systems with a precision beyond human capabilities. However, this might only become a reality after decades (or even centuries) of continuous advancement.&lt;/p&gt;
&lt;p&gt;As a result, most AI race participants are focused on reaching AGI while retaining control over it. The development of AGI is closely tied to a host of complex technical, ethical, and legal challenges. Still, the potential rewards far outweigh the costs, which is why corporations like Alibaba Group are investing heavily in this area.&lt;/p&gt;
&lt;p&gt;The release of &lt;a href=&quot;https://github.com/QwenLM/Qwen3&quot; target=&quot;_blank&quot;&gt;Qwen 3&lt;/a&gt; marks a significant milestone not only for one company’s neural networks but also on the global stage. Compared to its predecessor, the model introduces several important innovations.&lt;/p&gt;
&lt;h2&gt;Features&lt;/h2&gt;
&lt;p&gt;Qwen 2.5 was pretrained on a dataset of 18T tokens, while the new model has doubled that amount to 36T tokens. This much larger dataset has significantly improved the base model’s accuracy. Interestingly, in addition to publicly available internet data gathered through parsing, the system was also trained on PDF documents. These are typically well-structured and knowledge-dense, which helps the model provide more accurate answers and better understand complex formulations.&lt;/p&gt;
&lt;p&gt;One of the most promising directions in AI development is building models capable of reasoning, which can expand the task context through an iterative process. On one hand, this allows for more comprehensive problem-solving, but on the other hand, reasoning tends to slow the process down considerably. Therefore, the developers of Qwen 3 have introduced two operational modes:&lt;/p&gt;
&lt;ol&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;Thinking mode.&lt;/b&gt; The model builds up context step-by-step before providing a final answer. This makes it possible to tackle complex problems that require deep understanding.&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;Non-thinking mode.&lt;/b&gt; The model responds almost instantly but may produce more superficial answers without in-depth analysis.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This manual control over model behavior enhances user experience for handling many routine tasks. Reducing the use of thinking mode also significantly lowers GPU load, allowing more tokens to be processed within the same time frame.&lt;/p&gt;
&lt;p&gt;In addition to this binary choice, there’s also a soft-switching mechanism. This hybrid behavior allows the model to adapt to context using internal weighting mechanisms. If the model deems a task difficult, it will automatically trigger reasoning or even self-verification. It can also respond to user cues such as “Let’s think step by step”.&lt;/p&gt;
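&lt;p&gt;According to the Qwen 3 model cards, both switches are exposed through the chat template. A short sketch with Transformers; the enable_thinking flag and the /no_think tag are taken from Qwen’s documentation:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(&quot;Qwen/Qwen3-8B&quot;)
messages = [{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;Prove that sqrt(2) is irrational.&quot;}]

# Hard switch: disable the reasoning phase for this prompt entirely
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,   # True (the default) enables thinking mode
)

# Soft switch: steer a single turn with /think or /no_think in the message
messages[0][&quot;content&quot;] += &quot; /no_think&quot;&lt;/code&gt;&lt;/pre&gt;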
&lt;p&gt;Another significant improvement is expanded multilingual support. While Qwen 2.5 supported only 29 languages, version 3 can now understand and generate text in 119 languages and dialects. This has greatly improved instruction following and contextual comprehension. As a result, Qwen 3 can now be effectively used in non-English environments.&lt;/p&gt;
&lt;p&gt;In addition, Qwen 3 is now significantly better integrated with MCP servers, giving the model tools to dive deeper into problem-solving and execute actions. It can now interact with external sources and manage complex processes directly.&lt;/p&gt;
&lt;h2&gt;Model training&lt;/h2&gt;
&lt;h3&gt;Pre-Training&lt;/h3&gt;
&lt;p&gt;Such a substantial leap forward wouldn’t have been possible without a multi-stage training system. Initially, the model was pretrained on 30T tokens with a 4K context length, allowing it to acquire general knowledge and basic language skills.&lt;/p&gt;
&lt;p&gt;This was followed by a refinement stage using more scientific and well-structured data. During this stage, the model also gained the ability to effectively write applications in multiple programming languages.&lt;/p&gt;
&lt;p&gt;Finally, it was trained on a high-quality dataset with extended context. As a result, Qwen 3 now supports an effective context length of 128K tokens, which is roughly 350 pages of typed text, depending on the language. For instance, Cyrillic-based languages often have shorter tokens due to morphology and the use of prefixes, suffixes, etc.&lt;/p&gt;
&lt;h3&gt;Reasoning Pipeline&lt;/h3&gt;
&lt;p&gt;Building reasoning-capable models is a fascinating but labor-intensive process that combines various existing techniques aimed at simulating human thought. Based on publicly available information, we can assume that Qwen 3’s reasoning training involved four main stages:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;Cold start for long chains of thought.&lt;/b&gt; Training the model to break problems into multiple steps without prior adaptation. This helps it learn iterative thinking and develop a basic layer of reasoning skills.&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;Reinforcement learning based on reasoning.&lt;/b&gt; At this stage, rewards depend not only on the final answer but also on how well the model constructs logical, interpretable, and structured reasoning chains. The absence of errors and hallucinations is also evaluated.&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;Merging reasoning modes.&lt;/b&gt; Humans typically rely on two thinking styles: fast (intuitive) and slow (analytical). Depending on the task type, the neural model should learn to both switch between and integrate these styles. This is usually done using examples that mix both styles or through special tokens indicating which style to apply.&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;General reinforcement learning.&lt;/b&gt; This final stage resembles a sandbox environment where the model learns to interact with tools, perform multi-step tasks, and develop adaptive behavior. Here, it also becomes attuned to user preferences.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Qwen 3 is a major milestone for Alibaba Group. Its training quality and methodology make it a serious contender against established players like OpenAI and Anthropic. The improvements over the previous version are substantial.&lt;/p&gt;
&lt;p&gt;An added benefit is its open-source nature, with the codebase publicly available on GitHub under the Apache 2.0 license.&lt;/p&gt;
&lt;p&gt;Further development of the Qwen model family will help strengthen its position in the global AI arena and narrow the gap with closed-source commercial models. And all current achievements are, in one way or another, steps toward humanity’s progress in building AGI.&lt;/p&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/578-your-own-qwen-using-hf&quot;&gt;Your own Qwen using HF&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/579-qwen-2-vs-llama-3&quot;&gt;Qwen 2 vs Llama 3&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/001/168/original/il_whats_new_in_qwen_3.png?1752240562"
        length="0"
        type="image/jpeg"/>
      <pubDate>Mon, 14 Jul 2025 08:05:08 +0200</pubDate>
      <guid isPermaLink="false">624</guid>
      <dc:date>2025-07-14 08:05:08 +0200</dc:date>
    </item>
    <item>
      <title>MCP server based on N8N</title>
      <link>https://www.leadergpu.com/catalog/623-mcp-server-based-on-n8n</link>
      <description>&lt;p&gt;The development of generative neural networks has accelerated significantly in recent years. They’ve become noticeably faster and more accurate in their responses and have learned to reason. However, their capabilities are still fundamentally limited by their architecture. For example, every existing LLM at the time of writing has a knowledge cutoff date. This means that with each passing day, such an LLM becomes more likely to produce incorrect answers, simply because it lacks information about events that occurred after that date.&lt;/p&gt;
&lt;p&gt;This limitation necessitates retraining the model entirely on fresher data, which is expensive and time-consuming. But there is another way. If you enable the model to interact with the outside world, it can independently find and update the information requested during a user conversation, without requiring retraining.&lt;/p&gt;
&lt;p&gt;This is roughly how the RAG (Retrieval Augmented Generation) mechanism works. When answering a question, the model first queries a pre-prepared vector database, and if it finds relevant information, it incorporates it into the prompt. Thus, by expanding and updating the vector DB, the quality of LLM responses can be greatly improved.&lt;/p&gt;
&lt;p&gt;But there is another, even more interesting way to embed up-to-date context into prompts. It’s called MCP, which stands for Model Context Protocol. It was originally developed by Anthropic for its Claude model. The key moment came when the source code for MCP was made open-source, allowing thousands of AI researchers to build custom servers for various purposes.&lt;/p&gt;
&lt;p&gt;The essence of MCP is to give a neural network model access to tools with which it can independently update its knowledge and perform various actions to efficiently solve given tasks. The model itself decides which tool to use and whether it’s appropriate in each situation.&lt;/p&gt;
&lt;p&gt;Support for MCP soon appeared in various IDEs like Cursor, as well as in automation platforms like N8N. The latter is especially intuitive, as workflows are created visually, making it easier to understand. Within N8N, you can either connect to an existing MCP server or create your own. Moreover, you can even organize a direct connection within a single workflow. But let’s go step by step.&lt;/p&gt;
&lt;h2&gt;Creating a Simple AI Agent&lt;/h2&gt;
&lt;p&gt;Before getting started, make sure the main requirement is met: you have an LLM ready for connections. This could be a locally running model using Ollama or an external service like OpenAI’s ChatGPT. In the first case, you’ll need to know the local Ollama API address (and optionally its authentication), and in the second case, you’ll need an active OpenAI account with sufficient credits.&lt;/p&gt;
&lt;p&gt;Building an agent starts with the key AI Agent node. At a minimum, it must be linked with two other nodes, one to act as a trigger, and the other to connect to the LLM. If you don’t specify a trigger, the system will create one automatically, triggering the agent upon receiving any message in the internal chat:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/158/original/sh_mcp_server_based_on_n8n_1.png?1751458377&quot; alt=&quot;AI Agent only&quot;&gt;
&lt;p&gt;The only missing piece is the LLM. For instance, you can use our &lt;a href=&quot;https://www.leadergpu.com/catalog/584-open-webui-all-in-one&quot;&gt;Open WebUI: All in one&lt;/a&gt; guide to set up Ollama with a web interface. The only change required is that the containers for N8N and Open WebUI must be on the same network. For example, if the N8N container is on a network named &lt;b translate=&quot;no&quot;&gt;web&lt;/b&gt;, then in the deployment command for Open WebUI, replace &lt;b translate=&quot;no&quot;&gt;--network=host&lt;/b&gt; with &lt;b translate=&quot;no&quot;&gt;--network=web&lt;/b&gt;.&lt;/p&gt;
&lt;p&gt;In some cases, you will also need to manually set the &lt;b translate=&quot;no&quot;&gt;OLLAMA_HOST&lt;/b&gt; environment variable, for example: &lt;b translate=&quot;no&quot;&gt;-e OLLAMA_HOST=0.0.0.0&lt;/b&gt;. This allows connections to the Ollama API not only from localhost but also from other containers. Suppose Ollama is deployed in a container named &lt;b translate=&quot;no&quot;&gt;open-webui&lt;/b&gt;. Then the base URL for connecting from N8N would be:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;http://open-webui:11434&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Before connecting the Ollama Chat Model node, don’t forget to download at least one model. You can do this either from the web interface or via the container CLI. The following command will download the Llama 3.1 model with 8 billion parameters:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;ollama pull llama3.1:8b&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Once downloaded and installed, the model will automatically appear in the list of available ones:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/159/original/sh_mcp_server_based_on_n8n_2.png?1751458416&quot; alt=&quot;Model select&quot;&gt;
&lt;p&gt;A minimal working AI Agent workflow looks like this:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/160/original/sh_mcp_server_based_on_n8n_3.png?1751458451&quot; alt=&quot;Minimal working AI Agent&quot;&gt;
&lt;p&gt;In this form, the agent can use only one model and doesn’t store input data or enhance prompts using external tools. So it makes sense to add at least the &lt;b translate=&quot;no&quot;&gt;Simple Memory&lt;/b&gt; node; for light loads, it’s sufficient for storing requests and responses.&lt;/p&gt;
&lt;p&gt;But let&#39;s go back to MCP. To start, create a server using the special &lt;b translate=&quot;no&quot;&gt;MCP Server Trigger&lt;/b&gt; node:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/161/original/sh_mcp_server_based_on_n8n_4.png?1751458483&quot; alt=&quot;MCP Server Trigger only&quot;&gt;
&lt;p&gt;This node is fully self-contained and doesn’t require external activation. It’s triggered solely by an incoming external request to its webhook address. By default, there are two URLs: &lt;b translate=&quot;no&quot;&gt;Test URL&lt;/b&gt; and &lt;b translate=&quot;no&quot;&gt;Production URL&lt;/b&gt;. The first is used during development, while the second works only when the workflow is saved and activated.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/162/original/sh_mcp_server_based_on_n8n_5.png?1751458569&quot; alt=&quot;MCP Server Trigger settings&quot;&gt;
&lt;p&gt;The trigger is useless on its own; it needs connected tools. For example, let’s connect one of the simplest tools: a calculator. It will expect a mathematical expression as input. Nodes communicate using plain JSON, so for the calculator to compute 2 + 2, the input should be:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;[
  {
    &quot;query&quot;: {
      &quot;input&quot;: &quot;2 + 2&quot;
    }
  }
]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;LLMs can easily generate such JSON from plain text task descriptions and send them to the node, which performs the calculations and returns the result. Let’s connect the MCP client to the agent:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/163/original/sh_mcp_server_based_on_n8n_6.png?1751458623&quot; alt=&quot;AI Agent with tools&quot;&gt;
&lt;p&gt;It’s worth noting that this node doesn’t need any additional connections. In its settings, it’s enough to specify the endpoint address where it will send data from the AI Agent. In our example, this address points to the container named &lt;b translate=&quot;no&quot;&gt;n8n&lt;/b&gt;.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/164/original/sh_mcp_server_based_on_n8n_7.png?1751458735&quot; alt=&quot;MCP Client Settings&quot;&gt;
&lt;p&gt;Of course, at this stage you can specify any external MCP server address available to you. But for this article, we’ll use a local instance running within N8N. Let’s see how the client and server behave when the AI Agent is asked to perform a simple math operation:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/165/original/sh_mcp_server_based_on_n8n_8.png?1751458808&quot; alt=&quot;MCP Client calculations example&quot;&gt;
&lt;p&gt;Upon receiving the request, the AI Agent will:&lt;/p&gt;
&lt;ol&gt;
    &lt;li&gt;Search in Simple Memory to see if the user asked this before or if any context can be reused.&lt;/li&gt;
    &lt;li&gt;Send the prompt to the LLM, which will correctly break down the math expression and prepare the corresponding JSON.&lt;/li&gt;
    &lt;li&gt;Send the JSON to the Calculator tool and receive the result.&lt;/li&gt;
    &lt;li&gt;Use the LLM to generate the final response and insert the result into the reply.&lt;/li&gt;
    &lt;li&gt;Store the result in Simple Memory.&lt;/li&gt;
    &lt;li&gt;Output the message in the chat.&lt;/li&gt;
&lt;/ol&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/166/original/sh_mcp_server_based_on_n8n_9.png?1751458859&quot; alt=&quot;MCP Client calculations JSON&quot;&gt;
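&lt;p&gt;Outside of N8N, step 3 of this chain is easy to picture. Here’s a toy Python sketch of what the Calculator tool conceptually does with the JSON produced by the LLM (the real node is an N8N built-in, not this code):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;import ast
import json

# Parse the payload the LLM prepared for the tool
payload = json.loads(&#39;[{&quot;query&quot;: {&quot;input&quot;: &quot;2 + 2&quot;}}]&#39;)
expression = payload[0][&quot;query&quot;][&quot;input&quot;]

# ast.literal_eval safely evaluates simple constant expressions like this one
result = ast.literal_eval(expression)
print(json.dumps({&quot;result&quot;: result}))   # {&quot;result&quot;: 4}&lt;/code&gt;&lt;/pre&gt;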
&lt;p&gt;Similarly, agents can work with other tools on the MCP server. Instead of Simple Memory, you can use more advanced options like MongoDB, Postgres, Redis, or even something like Zep. Of course, these require minimal database maintenance, but overall performance will increase significantly.&lt;/p&gt;
&lt;p&gt;There are also far more options for tool selection. Out of the box, the &lt;b translate=&quot;no&quot;&gt;MCP Server Trigger&lt;/b&gt; node supports over 200 tools. These can be anything, from simple HTTP requests to prebuilt integrations with public internet services. Within a single workflow, you can create both a server and a client. One important thing to note: these nodes can’t be visually connected in the editor, and that’s expected behavior:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/167/original/sh_mcp_server_based_on_n8n_10.png?1751458939&quot; alt=&quot;MCP Server and Client with tools&quot;&gt;
&lt;p&gt;Instead of the default trigger, you can use other options such as receiving a message via a messenger, submitting a website form, or executing on a schedule. This lets you set up workflows that react to events or perform routine operations like daily data exports from Google Ads.&lt;/p&gt;
&lt;p&gt;And that’s not the end of what’s possible with AI agents. You can build multi-agent systems using different neural network models that work together to solve tasks with greater accuracy, considering many more influencing factors in the process.&lt;/p&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/622-how-to-install-n8n&quot;&gt;How to install N8N&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/601-low-code-ai-app-builder-langflow&quot;&gt;Low-code AI app builder Langflow&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/602-how-to-monitor-langflow-application&quot;&gt;How to monitor LangFlow application&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/001/157/original/il_mcp_server_based_on_n8n.png?1751457996"
        length="0"
        type="image/jpeg"/>
      <pubDate>Wed, 02 Jul 2025 15:28:18 +0200</pubDate>
      <guid isPermaLink="false">623</guid>
      <dc:date>2025-07-02 15:28:18 +0200</dc:date>
    </item>
    <item>
      <title>How to install N8N</title>
      <link>https://www.leadergpu.com/catalog/622-how-to-install-n8n</link>
      <description>&lt;p&gt;AI agents in 2025 remain one of the most promising approaches for solving complex tasks using large language models. These agents are autonomous and capable of selecting various tools on their own to accomplish assigned tasks. This approach enables achieving results with less human involvement and higher quality. It also opens up opportunities for discovering more original and effective ways of dealing with problems.&lt;/p&gt;
&lt;p&gt;Instead of just formulating a task, you instruct the neural network to solve it independently, based on the resources allocated to it. However, for this scheme to work, there needs to be a mechanism that connects neural network interfaces with various tools, whether it’s web search or a vector database for storing intermediate results.&lt;/p&gt;
&lt;p&gt;n8n is an automation platform that supports integration with various neural networks and public services. Users can visually design how data will be processed and what final result needs to be achieved. Unlike classic no-code solutions, n8n allows arbitrary code to be included at any stage of the process, which is especially useful when built-in functionality is not sufficient.&lt;/p&gt;
&lt;p&gt;The result is a system that combines the simplicity of no-code with the flexibility of traditional programming. However, to fully understand it, you&#39;ll still need to spend some time exploring and reviewing workflow examples for better comprehension. In this article, we’ll walk you through how to deploy n8n on LeaderGPU servers.&lt;/p&gt;
&lt;h2&gt;Preparing the server&lt;/h2&gt;
&lt;h3&gt;Update the system&lt;/h3&gt;
&lt;p&gt;Update the package list and upgrade all installed packages:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update &amp;&amp; sudo apt -y upgrade&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Automatically install the recommended NVIDIA® driver (proprietary) or use our step-by-step guide &lt;a href=&quot;https://www.leadergpu.com/articles/499-install-nvidia-drivers-in-linux&quot; target=&quot;_blank&quot;&gt;Install NVIDIA® drivers in Linux&lt;/a&gt;:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo ubuntu-drivers autoinstall&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now reboot the server:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo shutdown -r now&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Install Docker&lt;/h3&gt;
&lt;p&gt;You can use the official installation script:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -sSL https://get.docker.com/ | sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let’s add the NVIDIA® container toolkit GPG key and repository for Docker integration:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&amp;&amp; curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed &#39;s#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g&#39; | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Update the package list and install the NVIDIA® container toolkit:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update &amp;&amp; sudo apt -y install nvidia-container-toolkit&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Restart Docker to apply the changes and enable the installed toolkit:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl restart docker&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Install n8n&lt;/h3&gt;
&lt;p&gt;To let n8n store its data persistently, create a volume before launching the container:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo docker volume create n8n_data&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, let’s launch a container that will open port 5678 for external connections and mount the created &lt;b translate=&quot;no&quot;&gt;n8n_data&lt;/b&gt; volume to the directory &lt;b translate=&quot;no&quot;&gt;/home/node/.n8n&lt;/b&gt; inside the container:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo docker run -d --name n8n -p 5678:5678 -v n8n_data:/home/node/.n8n docker.n8n.io/n8nio/n8n&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The first time you launch the application, you might be puzzled by the following error message:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/152/original/sh_how_to_install_n8n_1.png?1750667132&quot; alt=&quot;TLS-error N8N&quot;&gt;
&lt;p&gt;This isn’t exactly an error; it’s more of a warning about how to properly configure the system for access. The issue is that, by default, the system doesn’t have a TLS/HTTPS certificate, and without one the connection won’t be secure. You have three options:&lt;/p&gt;
&lt;ol&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;Connect your own certificate&lt;/b&gt;. You can do this by specifying the paths to the certificate files via environment variables, or by configuring a reverse proxy server.&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;Create an SSH tunnel and forward port 5678&lt;/b&gt; to localhost on the computer you’re connecting from. This way, you’ll immediately get a secure personal connection. However, no one else will be able to access the server externally.&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;Bypass the warning&lt;/b&gt;. If this is a test server not intended for production use and you don’t care about security, you can disable the warning by setting the &lt;b translate=&quot;no&quot;&gt;N8N_SECURE_COOKIE&lt;/b&gt; environment variable to &lt;b translate=&quot;no&quot;&gt;FALSE&lt;/b&gt;. This is strongly discouraged as it makes the server vulnerable to potential attacks. Still, it might be acceptable in specific scenarios.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This article will explore each option in detail so you can choose the right one.&lt;/p&gt;
&lt;h2&gt;Connecting to the server&lt;/h2&gt;
&lt;p&gt;If you don’t yet have an SSL certificate, we recommend ordering one from &lt;a href=&quot;https://www.leaderssl.com/&quot; target=&quot;_blank&quot;&gt;LeaderSSL&lt;/a&gt;. It can be used for any website or online store, or to verify an email’s authenticity.&lt;/p&gt;
&lt;h3&gt;Using Environment Variables&lt;/h3&gt;
&lt;p&gt;The simplest way to configure HTTPS is to upload your certificate to the server and specify it via Docker environment variables. Start by creating a directory for the certificate files:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;mkdir ~/n8n-certs&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can upload these files (typically cert.crt and privkey.key) to this directory using any convenient method, for example with scp as shown below. For more detailed info, see:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/articles/495-file-exchange-from-windows&quot; target=&quot;_blank&quot;&gt;File exchange from Windows&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/articles/494-file-exchange-from-linux&quot; target=&quot;_blank&quot;&gt;File exchange from Linux&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/articles/496-file-exchange-from-macos&quot; target=&quot;_blank&quot;&gt;File exchange from macOS&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
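&lt;p&gt;As a minimal example, from a Linux or macOS machine both files could be copied with scp (the username, server IP, and file names below are placeholders):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;scp cert.crt privkey.key username@your_server_ip:~/n8n-certs/&lt;/code&gt;&lt;/pre&gt;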
&lt;p&gt;Now, let’s launch the container using one full command:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo docker run -d \
--name n8n \
-p 5678:5678 \
-v n8n_data:/home/node/.n8n \
-v ~/n8n-certs:/certs \
-e N8N_PROTOCOL=https \
-e N8N_SSL_CERT=&quot;/certs/cert.crt&quot; \
-e N8N_SSL_KEY=&quot;/certs/privkey.key&quot; \
docker.n8n.io/n8nio/n8n&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here’s a breakdown of each argument:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;sudo docker run -d&lt;/b&gt; launches the Docker container in daemon (background) mode&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;--name n8n&lt;/b&gt; assigns the container the name &lt;b translate=&quot;no&quot;&gt;n8n&lt;/b&gt;&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;-p 5678:5678&lt;/b&gt; forwards port &lt;b translate=&quot;no&quot;&gt;5678&lt;/b&gt; to the container&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;-v n8n_data:/home/node/.n8n&lt;/b&gt; creates and mounts a volume named &lt;b translate=&quot;no&quot;&gt;n8n_data&lt;/b&gt; to the hidden directory &lt;b translate=&quot;no&quot;&gt;/home/node/.n8n&lt;/b&gt; inside the container&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;-v ~/n8n-certs:/certs&lt;/b&gt; mounts the certificate directory&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;-e N8N_PROTOCOL=https&lt;/b&gt; forces N8N to use the HTTPS protocol&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;-e N8N_SSL_CERT=&quot;/certs/cert.crt&quot;&lt;/b&gt; sets the path to the certificate file&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;-e N8N_SSL_KEY=&quot;/certs/privkey.key&quot;&lt;/b&gt; sets the path to the certificate key&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;docker.n8n.io/n8nio/n8n&lt;/b&gt; specifies the container image source&lt;/li&gt;
&lt;/ul&gt;
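&lt;p&gt;To make sure the container picked up the certificate and started without errors, you can follow its logs (assuming the container name n8n from the command above):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo docker logs -f n8n&lt;/code&gt;&lt;/pre&gt;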
&lt;h3&gt;Traefik&lt;/h3&gt;
&lt;p&gt;A slightly more complex but more flexible setup involves using the Traefik reverse proxy server to secure the connection to N8N. The configuration file below is based on the official method described in the documentation. First, install the &lt;b translate=&quot;no&quot;&gt;docker-compose&lt;/b&gt; tool:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt -y install docker-compose&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We’ll deploy Traefik and N8N together, so they need to be on the same network. Create a network called &lt;b translate=&quot;no&quot;&gt;web&lt;/b&gt;:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo docker network create web&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, create a &lt;b translate=&quot;no&quot;&gt;docker-compose.yml&lt;/b&gt; file to define and run both containers:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;nano docker-compose.yml&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;services:
  traefik:
    image: &quot;traefik&quot;
    container_name: &quot;proxy&quot;
    restart: always
    command:
      - &quot;--api.insecure=true&quot;
      - &quot;--providers.docker=true&quot;
      - &quot;--providers.docker.exposedbydefault=false&quot;
      - &quot;--entrypoints.web.address=:80&quot;
      - &quot;--entrypoints.web.http.redirections.entryPoint.to=websecure&quot;
      - &quot;--entrypoints.web.http.redirections.entrypoint.scheme=https&quot;
      - &quot;--entrypoints.websecure.address=:443&quot;
      - &quot;--certificatesresolvers.mytlschallenge.acme.tlschallenge=true&quot;
      - &quot;--certificatesresolvers.mytlschallenge.acme.email=${SSL_EMAIL}&quot;
      - &quot;--certificatesresolvers.mytlschallenge.acme.storage=/letsencrypt/acme.json&quot;
    ports:
      - &quot;80:80&quot;
      - &quot;443:443&quot;
    volumes:
      - traefik_data:/letsencrypt
      - /var/run/docker.sock:/var/run/docker.sock:ro
    networks:
      - web

  n8n:
    image: docker.n8n.io/n8nio/n8n
    container_name: &quot;n8n&quot;
    restart: always
    ports:
      - &quot;127.0.0.1:5678:5678&quot;
    labels:
      - traefik.enable=true
      - traefik.http.routers.n8n.rule=Host(`${SUBDOMAIN}.${DOMAIN_NAME}`)
      - traefik.http.routers.n8n.tls=true
      - traefik.http.routers.n8n.entrypoints=web,websecure
      - traefik.http.routers.n8n.tls.certresolver=mytlschallenge
      - traefik.http.middlewares.n8n.headers.SSLRedirect=true
      - traefik.http.middlewares.n8n.headers.STSSeconds=315360000
      - traefik.http.middlewares.n8n.headers.browserXSSFilter=true
      - traefik.http.middlewares.n8n.headers.contentTypeNosniff=true
      - traefik.http.middlewares.n8n.headers.forceSTSHeader=true
      - traefik.http.middlewares.n8n.headers.SSLHost=${DOMAIN_NAME}
      - traefik.http.middlewares.n8n.headers.STSIncludeSubdomains=true
      - traefik.http.middlewares.n8n.headers.STSPreload=true
      - traefik.http.routers.n8n.middlewares=n8n@docker
    environment:
      - N8N_HOST=${SUBDOMAIN}.${DOMAIN_NAME}
      - N8N_PORT=5678
      - N8N_PROTOCOL=https
      - NODE_ENV=production
      - WEBHOOK_URL=https://${SUBDOMAIN}.${DOMAIN_NAME}/
      - GENERIC_TIMEZONE=${GENERIC_TIMEZONE}
    volumes:
      - n8n_data:/home/node/.n8n
      - ./local-files:/files
    networks:
      - web

volumes:
  n8n_data:
  traefik_data:

networks:
  web:
    name: web&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In addition to the &lt;b translate=&quot;no&quot;&gt;docker-compose.yml&lt;/b&gt; file, we will create another file named &lt;b translate=&quot;no&quot;&gt;.env&lt;/b&gt;. This file will contain variables such as the domain name and email address used to request an SSL certificate from Let&#39;s Encrypt. If we ever need to change something, like the domain name, we&#39;ll only need to update it in this file and then recreate the container.&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;nano .env&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;DOMAIN_NAME=example.com
SUBDOMAIN=n8n
GENERIC_TIMEZONE=Europe/Amsterdam
SSL_EMAIL=user@example.com&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Finally, deploy both containers:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo docker-compose up -d&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, N8N is available at &lt;b translate=&quot;no&quot;&gt;https://n8n.example.com&lt;/b&gt;.&lt;/p&gt;
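&lt;p&gt;If the page doesn’t open over HTTPS right away, certificate issuance may still be in progress. You can follow it in the logs of the Traefik container, which the compose file above names proxy:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo docker logs -f proxy&lt;/code&gt;&lt;/pre&gt;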
&lt;h3&gt;Nginx Proxy Manager&lt;/h3&gt;
&lt;p&gt;Unlike Traefik, which is configured via files, Nginx Proxy Manager offers a user-friendly web interface. However, it doesn’t detect services dynamically; you must add them manually. Still, it works well for static services like N8N.&lt;/p&gt;
&lt;p&gt;Create another &lt;b translate=&quot;no&quot;&gt;docker-compose.yml&lt;/b&gt; file in a separate directory with the following content:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;services:
  app:
    image: &#39;jc21/nginx-proxy-manager:latest&#39;
    container_name: proxy
    restart: unless-stopped
    ports:
      - &#39;80:80&#39;
      - &#39;443:443&#39;
      - &#39;81:81&#39;
    volumes:
      - ./data:/data
      - ./letsencrypt:/etc/letsencrypt
    networks:
      - web

  n8n:
    image: docker.n8n.io/n8nio/n8n
    container_name: n8n
    restart: unless-stopped
    environment:
      - N8N_HOST=n8n.example.com
      - N8N_PORT=5678
      - WEBHOOK_URL=https://n8n.example.com/
      - N8N_PROTOCOL=http
    volumes:
      - n8n_data:/home/node/.n8n
    networks:
      - web

volumes:
  n8n_data:

networks:
  web:
    external: true&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Deploy with:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo docker-compose up -d&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then open the web interface at: &lt;b translate=&quot;no&quot;&gt;http://your_hostname_or_ip:81&lt;/b&gt;&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;Username: &lt;b translate=&quot;no&quot;&gt;admin@example.com&lt;/b&gt;&lt;/li&gt;
    &lt;li&gt;Password: &lt;b translate=&quot;no&quot;&gt;changeme&lt;/b&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You’ll be prompted to update your credentials. After that, open &lt;b translate=&quot;no&quot;&gt;Hosts → Proxy Hosts → Add Proxy Host&lt;/b&gt;, enter your domain name (e.g., &lt;b translate=&quot;no&quot;&gt;n8n.example.com&lt;/b&gt;):&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/153/original/sh_how_to_install_n8n_2.png?1750667229&quot; alt=&quot;Add domain N8N&quot;&gt;
&lt;p&gt;Fill in the necessary fields:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;Set &lt;b translate=&quot;no&quot;&gt;Forward Hostname / IP&lt;/b&gt; to &lt;b translate=&quot;no&quot;&gt;n8n&lt;/b&gt;.&lt;/li&gt;
    &lt;li&gt;Set &lt;b translate=&quot;no&quot;&gt;Forward Port&lt;/b&gt; to &lt;b translate=&quot;no&quot;&gt;5678&lt;/b&gt;.&lt;/li&gt;
    &lt;li&gt;Under the &lt;b translate=&quot;no&quot;&gt;SSL&lt;/b&gt; tab, choose &lt;b translate=&quot;no&quot;&gt;Request a new SSL certificate with Let’s Encrypt&lt;/b&gt;.&lt;/li&gt;
    &lt;li&gt;Enter your email and agree to the terms.&lt;/li&gt;
    &lt;li&gt;Enable &lt;b translate=&quot;no&quot;&gt;Websockets Support&lt;/b&gt;.&lt;/li&gt;
    &lt;li&gt;Optionally enable &lt;b translate=&quot;no&quot;&gt;Force SSL&lt;/b&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;After pressing the &lt;b translate=&quot;no&quot;&gt;Save&lt;/b&gt; button, the certificate will be requested and installed:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/154/original/sh_how_to_install_n8n_3.png?1750667362&quot; alt=&quot;Nginx Proxy Manager ready&quot;&gt;
&lt;p&gt;Once done, opening your domain will lead to the N8N interface.&lt;/p&gt;
&lt;h3&gt;SSH tunnel&lt;/h3&gt;
&lt;p&gt;If you don’t need external access to N8N, you can forward port 5678 via SSH. This encrypts all traffic, and N8N will be available at &lt;b translate=&quot;no&quot;&gt;http://localhost:5678/&lt;/b&gt;.&lt;/p&gt;
&lt;p&gt;&lt;i&gt;Note: This setup won’t work for integrations with external services like messengers that require public HTTPS access.&lt;/i&gt;&lt;/p&gt;
&lt;p&gt;The easiest way to forward the port is with the popular SSH client &lt;a href=&quot;https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html&quot; target=&quot;_blank&quot;&gt;PuTTY&lt;/a&gt;. Once installed, open &lt;b translate=&quot;no&quot;&gt;SSH → Tunnels&lt;/b&gt;, set &lt;b translate=&quot;no&quot;&gt;Source port&lt;/b&gt; to &lt;b translate=&quot;no&quot;&gt;5678&lt;/b&gt; and &lt;b translate=&quot;no&quot;&gt;Destination&lt;/b&gt; to &lt;b translate=&quot;no&quot;&gt;localhost:5678&lt;/b&gt;, then click &lt;b translate=&quot;no&quot;&gt;Add&lt;/b&gt;.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/155/original/sh_how_to_install_n8n_4.png?1750667446&quot; alt=&quot;PuTTY port forwarding&quot;&gt;
&lt;p&gt;Go back to &lt;b translate=&quot;no&quot;&gt;Session&lt;/b&gt;, enter your server’s IP, and click &lt;b translate=&quot;no&quot;&gt;Open&lt;/b&gt;. Once authenticated, the tunnel is active. Open &lt;b translate=&quot;no&quot;&gt;http://localhost:5678&lt;/b&gt; in a browser to access N8N.&lt;/p&gt;
&lt;p&gt;&lt;i&gt;Note: The connection only works while the SSH session is active. Closing PuTTY will terminate the tunnel.&lt;/i&gt;&lt;/p&gt;
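&lt;p&gt;On Linux or macOS, the same tunnel can be created with a single OpenSSH command (replace the username and IP address with your own):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;ssh -L 5678:localhost:5678 username@your_server_ip&lt;/code&gt;&lt;/pre&gt;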
&lt;h3&gt;Bypass&lt;/h3&gt;
&lt;p&gt;This method is not recommended for use on public networks. If you launch the container with the &lt;b translate=&quot;no&quot;&gt;N8N_SECURE_COOKIE=false&lt;/b&gt; environment variable, the warning will disappear, and you’ll get access via HTTP:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo docker run -d --name n8n -p 5678:5678 -e N8N_SECURE_COOKIE=false -v n8n_data:/home/node/.n8n docker.n8n.io/n8nio/n8n&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;b translate=&quot;no&quot;&gt;Warning:&lt;/b&gt; this exposes the N8N admin panel via unencrypted HTTP, making it vulnerable to MITM (Man-In-The-Middle) attacks and potentially allowing an attacker to fully take over your server.&lt;/p&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/584-open-webui-all-in-one&quot;&gt;Open WebUI: All in one&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/601-low-code-ai-app-builder-langflow&quot;&gt;Low-code AI app builder Langflow&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/602-how-to-monitor-langflow-application&quot;&gt;How to monitor LangFlow application&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/001/151/original/il_how_to_install_n8n.png?1750667003"
        length="0"
        type="image/jpeg"/>
      <pubDate>Mon, 23 Jun 2025 14:30:26 +0200</pubDate>
      <guid isPermaLink="false">622</guid>
      <dc:date>2025-06-23 14:30:26 +0200</dc:date>
    </item>
    <item>
      <title>Triton™ Inference Server</title>
      <link>https://www.leadergpu.com/catalog/614-triton-inference-server</link>
      <description>&lt;p&gt;Business requirements may vary, but they all share one core principle: systems must operate quickly and deliver the highest possible quality. When dealing with neural network inference, efficient use of computing resources becomes crucial. Any GPU underutilization or idle time directly translates to financial losses.&lt;/p&gt;
&lt;p&gt;Consider a marketplace as an example. These platforms host numerous products, each with multiple attributes: text descriptions, technical specifications, categories, and multimedia content like photos and videos. All content requires moderation to maintain fair conditions for sellers and prevent prohibited goods or illegal content from appearing on the platform.&lt;/p&gt;
&lt;p&gt;While manual moderation is possible, it’s slow and inefficient. In today’s competitive environment, sellers need to expand their product range quickly: the faster items appear on the marketplace, the better the chances of them being discovered and purchased. Manual moderation is also costly and prone to human error, potentially allowing inappropriate content through.&lt;/p&gt;
&lt;p&gt;Automatic moderation using specially trained neural networks offers a solution. This approach brings multiple benefits: it substantially reduces moderation costs while typically improving quality. Neural networks process content much faster than humans, allowing sellers to clear the moderation stage more quickly, especially when handling large product volumes.&lt;/p&gt;
&lt;p&gt;The approach does have its challenges. Implementing automated moderation requires developing and training neural network models, demanding both skilled personnel and substantial computing resources. However, the benefits become apparent quickly after initial implementation. Adding automated model deployment can significantly streamline ongoing operations.&lt;/p&gt;
&lt;h2&gt;Inference&lt;/h2&gt;
&lt;p&gt;Assume we’ve figured out the machine learning procedures. The next step is determining how to run model inference on a rented server. For a single model, you typically choose a tool that works well with the specific framework it was built on. However, when dealing with multiple models created in different frameworks, you have two options.&lt;/p&gt;
&lt;p&gt;You can either convert all models to a single format, or choose a tool that supports multiple frameworks. Triton™ Inference Server fits perfectly with the second approach. It supports the following backends:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;TensorRT™&lt;/li&gt;
    &lt;li&gt;TensorRT-LLM&lt;/li&gt;
    &lt;li&gt;vLLM&lt;/li&gt;
    &lt;li&gt;Python&lt;/li&gt;
    &lt;li&gt;PyTorch (LibTorch)&lt;/li&gt;
    &lt;li&gt;ONNX Runtime&lt;/li&gt;
    &lt;li&gt;TensorFlow&lt;/li&gt;
    &lt;li&gt;FIL&lt;/li&gt;
    &lt;li&gt;DALI&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Additionally, you can use any application as a backend. For instance, if you need post-processing with a C/C++ application, you can integrate it seamlessly.&lt;/p&gt;
&lt;h2&gt;Scaling&lt;/h2&gt;
&lt;p&gt;Triton™ Inference Server efficiently manages computing resources on a single server by running multiple models simultaneously and distributing the workload across GPUs.&lt;/p&gt;
&lt;p&gt;Installation is done through a Docker container. DevOps engineers can control GPU allocation at startup, choosing to use all GPUs or limit their number. While the software doesn’t handle horizontal scaling directly, you can use traditional load balancers like HAProxy or deploy applications in a Kubernetes cluster for this purpose.&lt;/p&gt;
&lt;h2&gt;Preparing the system&lt;/h2&gt;
&lt;p&gt;To set up Triton™ on a LeaderGPU server running Ubuntu 22.04, first update the system using this command:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update &amp;&amp; sudo apt -y upgrade&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Next, install the NVIDIA® drivers using the autoinstaller script:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo ubuntu-drivers autoinstall&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Reboot the server to apply the changes:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo shutdown -r now&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Once the server is back online, install Docker using the following installation script:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -sSL https://get.docker.com/ | sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Since Docker can’t pass through GPUs to containers by default, you’ll need the NVIDIA® Container Toolkit. Add the NVIDIA® repository by downloading and registering its GPG key:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&amp;&amp; curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed &#39;s#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g&#39; | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Update the package cache and install the toolkit:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update &amp;&amp; sudo apt -y install nvidia-container-toolkit&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Restart Docker to enable the new capabilities:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl restart docker&lt;/code&gt;&lt;/pre&gt;
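&lt;p&gt;To verify that containers can now access the GPUs, you can run nvidia-smi inside a disposable CUDA® container (the image tag below is just an example):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi&lt;/code&gt;&lt;/pre&gt;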
&lt;p&gt;The operating system is now ready to use.&lt;/p&gt;
&lt;h2&gt;Installing Triton™ Inference Server&lt;/h2&gt;
&lt;p&gt;Let’s download the project repository:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git clone https://github.com/triton-inference-server/server&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This repository contains pre-configured neural network samples and a model download script. Navigate to the examples directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd server/docs/examples&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Download the models by running the following script, which will save them to &lt;b translate=&quot;no&quot;&gt;~/server/docs/examples/model_repository&lt;/b&gt;:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;./fetch_models.sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Triton™ Inference Server’s architecture requires models to be stored separately. You can store them either locally in any server directory or on network storage. When starting the server, you’ll need to mount this directory to the container at the /models mount point. This serves as a repository for all model versions.&lt;/p&gt;
&lt;p&gt;Launch the container with this command:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo docker run --gpus=all --rm -p8000:8000 -p8001:8001 -p8002:8002 -v ~/server/docs/examples/model_repository:/models nvcr.io/nvidia/tritonserver:25.01-py3 tritonserver --model-repository=/models&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here’s what each parameter does:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;--gpus=all&lt;/b&gt; specifies that all available GPUs will be used in the server;&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;--rm&lt;/b&gt; destroys the container after the process completes or is stopped;&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;-p8000:8000&lt;/b&gt; forwards port 8000 to receive HTTP requests;&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;-p8001:8001&lt;/b&gt; forwards port 8001 to receive gRPC requests;&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;-p8002:8002&lt;/b&gt; forwards port 8002 to request metrics;&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;-v ~/server/docs/examples/model_repository:/models&lt;/b&gt; forwards the directory with models;&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;nvcr.io/nvidia/tritonserver:25.01-py3&lt;/b&gt; specifies the container image from the NGC™ catalog;&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;tritonserver --model-repository=/models&lt;/b&gt; launches the Triton™ Inference Server with the model repository located at /models.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The command output will show all available models in the repository, each ready to accept requests:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;+----------------------+---------+--------+
| Model                | Version | Status |
+----------------------+---------+--------+
| densenet_onnx        | 1       | READY  |
| inception_graphdef   | 1       | READY  |
| simple               | 1       | READY  |
| simple_dyna_sequence | 1       | READY  |
| simple_identity      | 1       | READY  |
| simple_int8          | 1       | READY  |
| simple_sequence      | 1       | READY  |
| simple_string        | 1       | READY  |
+----------------------+---------+--------+&lt;/pre&gt;
&lt;p&gt;The three services have been successfully launched on ports 8000, 8001, and 8002:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;I0217 08:00:34.930188 1 grpc_server.cc:2466] Started GRPCInferenceService at 0.0.0.0:8001
I0217 08:00:34.930393 1 http_server.cc:4636] Started HTTPService at 0.0.0.0:8000
I0217 08:00:34.972340 1 http_server.cc:320] Started Metrics Service at 0.0.0.0:8002&lt;/pre&gt;
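&lt;p&gt;You can additionally confirm readiness from another terminal using Triton’s standard HTTP health endpoint, which returns HTTP 200 once the server is ready to accept requests:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -v localhost:8000/v2/health/ready&lt;/code&gt;&lt;/pre&gt;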
&lt;p&gt;Using the nvtop utility, we can verify that all GPUs are ready to accept the load:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/132/original/sh_triton_inference_server_1.png?1740580538&quot; alt=&quot;8 x A6000 Triton Inference Server examples&quot;&gt;
&lt;h2&gt;Installing the client&lt;/h2&gt;
&lt;p&gt;To access our server, we’ll need to generate an appropriate request using the client included in the SDK. We can download this SDK as a Docker container:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo docker pull nvcr.io/nvidia/tritonserver:25.01-py3-sdk&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Run the container in interactive mode to access the console:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo docker run -it --gpus=all --rm --net=host nvcr.io/nvidia/tritonserver:25.01-py3-sdk&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let’s test this with the DenseNet model in ONNX format, using the INCEPTION method to preprocess and analyze image &lt;b translate=&quot;no&quot;&gt;mug.jpg&lt;/b&gt;:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-root&quot;&gt;/workspace/install/bin/image_client -m densenet_onnx -c 3 -s INCEPTION /workspace/images/mug.jpg&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The client will contact the server, which will create a batch and process it using the container’s available GPUs. Here’s the output:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;Request 0, batch size 1
Image &#39;/workspace/images/mug.jpg&#39;:
   15.349562 (504) = COFFEE MUG
   13.227461 (968) = CUP
   10.424891 (505) = COFFEEPOT&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Preparing the repository&lt;/h2&gt;
&lt;p&gt;For Triton™ to manage models correctly, you must prepare the repository in a specific way. Here’s the directory structure:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;model_repository/ 
        └── your_model/ 
                ├── config.pbtxt 
                └── 1/
                    └── model.*&lt;/pre&gt;
&lt;p&gt;Each model needs its own directory containing a &lt;b translate=&quot;no&quot;&gt;config.pbtxt&lt;/b&gt; configuration file with its description. Here’s an example:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;name: &quot;Test&quot;
platform: &quot;pytorch_libtorch&quot;
max_batch_size: 8
input [
  {
    name: &quot;INPUT_0&quot;
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: &quot;OUTPUT_0&quot;
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this example, a model named &lt;b translate=&quot;no&quot;&gt;Test&lt;/b&gt; will run on the PyTorch backend. The &lt;b translate=&quot;no&quot;&gt;max_batch_size&lt;/b&gt; parameter sets the maximum number of items that can be processed simultaneously, enabling efficient load balancing across resources. Setting this value to zero disables batching, causing the model to process requests sequentially.&lt;/p&gt;
&lt;p&gt;The model accepts one input and produces one output, both using the FP32 number type. The parameters must match the model’s requirements exactly. For image processing, a typical dimension specification is &lt;b translate=&quot;no&quot;&gt;dims: [ 3, 224, 224 ]&lt;/b&gt;, where:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;3&lt;/b&gt; - number of color channels (RGB);&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;224&lt;/b&gt; - image height in pixels;&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;224&lt;/b&gt; - image width in pixels.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The output &lt;b translate=&quot;no&quot;&gt;dims: [ 1000 ]&lt;/b&gt; represents a one-dimensional vector of 1000 elements, which suits image classification tasks. To determine the correct dimensionality for your model, consult its documentation. If the configuration file is incomplete, Triton™ will attempt to generate any missing parameters automatically.&lt;/p&gt;
&lt;h2&gt;Launching a custom model&lt;/h2&gt;
&lt;p&gt;Let’s launch the inference of the distilled DeepSeek-R1 model we &lt;a href=&quot;https://www.leadergpu.com/catalog/613-deepseek-r1-future-of-llms&quot;&gt;discussed&lt;/a&gt; earlier. First, we’ll create the necessary directory structure:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;mkdir -p ~/model_repository/deepseek/1&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Navigate to the model directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd ~/model_repository/deepseek&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Create a configuration file &lt;b translate=&quot;no&quot;&gt;config.pbtxt&lt;/b&gt;:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;nano config.pbtxt&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Paste the following:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;# Copyright 2023, NVIDIA CORPORATION &amp; AFFILIATES. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#  * Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
#  * Redistributions in binary form must reproduce the above copyright
#    notice, this list of conditions and the following disclaimer in the
#    documentation and/or other materials provided with the distribution.
#  * Neither the name of NVIDIA CORPORATION nor the names of its
#    contributors may be used to endorse or promote products derived
#    from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS&#39;&#39; AND ANY
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
# PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT OWNER OR
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
    
# Note: You do not need to change any fields in this configuration.
    
backend: &quot;vllm&quot;
    
# The usage of device is deferred to the vLLM engine
instance_group [
  {
    count: 1
    kind: KIND_MODEL
  }
]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Save the file by pressing &lt;b translate=&quot;no&quot;&gt;Ctrl + O&lt;/b&gt;, then exit the editor with &lt;b translate=&quot;no&quot;&gt;Ctrl + X&lt;/b&gt;. Navigate to the directory &lt;b translate=&quot;no&quot;&gt;1&lt;/b&gt;:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd 1&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Create a model configuration file &lt;b translate=&quot;no&quot;&gt;model.json&lt;/b&gt; with the following parameters:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;{
    &quot;model&quot;:&quot;deepseek-ai/DeepSeek-R1-Distill-Llama-8B&quot;,
    &quot;disable_log_requests&quot;: true,
    &quot;gpu_memory_utilization&quot;: 0.9,
    &quot;enforce_eager&quot;: true
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note that the &lt;b translate=&quot;no&quot;&gt;gpu_memory_utilization&lt;/b&gt; value varies by GPU and should be determined experimentally. For this guide, we’ll use &lt;b translate=&quot;no&quot;&gt;0.9&lt;/b&gt;. Your directory structure inside &lt;b translate=&quot;no&quot;&gt;~/model_repository&lt;/b&gt; should now look like this:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;└── deepseek
        ├── 1
        │   └── model.json
        └── config.pbtxt&lt;/pre&gt;
&lt;p&gt;Set the &lt;b translate=&quot;no&quot;&gt;LOCAL_MODEL_REPOSITORY&lt;/b&gt; variable for convenience:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;LOCAL_MODEL_REPOSITORY=~/model_repository/&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Start the inference server with this command:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo docker run --rm -it --net host --shm-size=2g  --ulimit memlock=-1 --ulimit stack=67108864 --gpus all -v $LOCAL_MODEL_REPOSITORY:/opt/tritonserver/model_repository  nvcr.io/nvidia/tritonserver:25.01-vllm-python-py3 tritonserver --model-repository=model_repository/&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here’s what each parameter does:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;--rm&lt;/b&gt; automatically removes the container after stopping;&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;-it&lt;/b&gt; runs the container in interactive mode with terminal output;&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;--net host&lt;/b&gt; uses the host’s network stack instead of container isolation;&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;--shm-size=2g&lt;/b&gt; sets shared memory to 2 GB;&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;--ulimit memlock=-1&lt;/b&gt; removes memory lock limit;&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;--ulimit stack=67108864&lt;/b&gt; sets stack size to 64 MB;&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;--gpus all&lt;/b&gt; enables access to all server GPUs;&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;-v $LOCAL_MODEL_REPOSITORY:/opt/tritonserver/model_repository&lt;/b&gt; mounts the local model directory in the container;&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;nvcr.io/nvidia/tritonserver:25.01-vllm-python-py3&lt;/b&gt; specifies the container with vLLM backend support;&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;tritonserver --model-repository=model_repository/&lt;/b&gt; launches the Triton™ Inference Server with the model repository located at model_repository/.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Test the server by sending a request with &lt;b translate=&quot;no&quot;&gt;curl&lt;/b&gt;, using a simple prompt and a 4096 token response limit:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -X POST localhost:8000/v2/models/deepseek/generate -d &#39;{&quot;text_input&quot;: &quot;Tell me about the Netherlands?&quot;, &quot;max_tokens&quot;: 4096}&#39;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The server successfully receives and processes the request.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/133/original/sh_triton_inference_server_2.png?1740580601&quot; alt=&quot;Triton Inference Server processed the test request&quot;&gt;
&lt;p&gt;The internal Triton™ task scheduler handles all incoming requests when the server is under load.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Triton™ Inference Server excels at deploying machine learning models in production by efficiently distributing requests across available GPUs. This maximizes the use of rented server resources and reduces computing infrastructure costs. The software works with various backends, including vLLM for large language models.&lt;/p&gt;
&lt;p&gt;Since it installs as a Docker container, you can easily integrate it into any modern CI/CD pipeline. Try it yourself by &lt;a href=&quot;https://www.leadergpu.com/#chose-best&quot;&gt;renting a server&lt;/a&gt; from LeaderGPU.&lt;/p&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/001/134/original/il_triton_inference_server.png?1740583888"
        length="0"
        type="image/jpeg"/>
      <pubDate>Wed, 26 Feb 2025 16:40:21 +0100</pubDate>
      <guid isPermaLink="false">614</guid>
      <dc:date>2025-02-26 16:40:21 +0100</dc:date>
    </item>
    <item>
      <title>DeepSeek-R1: future of LLMs</title>
      <link>https://www.leadergpu.com/catalog/613-deepseek-r1-future-of-llms</link>
      <description>&lt;p&gt;While generative neural networks have been developing rapidly, their progress in recent years has remained fairly steady. This changed with the arrival of DeepSeek, a Chinese neural network that not only impacted the stock market but also captured the attention of developers and researchers worldwide. In contrast to other major projects, DeepSeek’s code was released under the permissive MIT license. This move towards open source earned praise from the community, who eagerly began exploring the new model’s capabilities.&lt;/p&gt;
&lt;p&gt;The most impressive aspect was that training this new neural network reportedly cost 20 times less than competitors offering similar quality. The model required just 55 days and $5.6 million to train. When DeepSeek was released, it triggered one of the largest single-day drops in US stock market history. Though markets eventually stabilized, the impact was significant.&lt;/p&gt;
&lt;p&gt;This article will examine how accurately media headlines reflect reality and explore which LeaderGPU configurations are suitable for installing this neural network yourself.&lt;/p&gt;
&lt;h2&gt;Architectural features&lt;/h2&gt;
&lt;p&gt;DeepSeek has chosen a path of maximum optimization, unsurprising given China’s U.S. export restrictions. These restrictions prevent the country from officially using the most advanced GPU models for AI development.&lt;/p&gt;
&lt;p&gt;The model employs Multi Token Prediction (MTP) technology, which predicts multiple tokens in a single inference step instead of just one. This works through parallel token decoding combined with special masked layers that maintain autoregressivity.&lt;/p&gt;
&lt;p&gt;MTP testing has shown remarkable results, increasing generation speeds by 2-4 times compared to traditional methods. The technology’s excellent scalability makes it valuable for current and future natural language processing applications.&lt;/p&gt;
&lt;p&gt;The Multi-Head Latent Attention (MLA) model features an enhanced attention mechanism. As the model builds long chains of reasoning, it maintains focused attention on the context at each stage. This enhancement improves its handling of abstract concepts and text dependencies.&lt;/p&gt;
&lt;p&gt;MLA’s key feature is its ability to dynamically adjust attention weights across different abstraction levels. When processing complex queries, MLA examines data from multiple perspectives: word meanings, sentence structures, and overall context. These perspectives form distinct layers that influence the final output. To maintain clarity, MLA carefully balances each layer’s impact while staying focused on the primary task.&lt;/p&gt;
&lt;p&gt;DeepSeek’s developers incorporated Mixture of Experts (MoE) technology into the model. It contains 256 pre-trained expert neural networks, each specialized for different tasks. The system activates 8 of these networks for each token input, enabling efficient data processing without increasing computational costs.&lt;/p&gt;
&lt;p&gt;In the full model with 671B parameters, only 37B are activated for each token. The model intelligently selects the most relevant parameters for processing each incoming token. This efficient optimization saves computational resources while maintaining high performance.&lt;/p&gt;
&lt;p&gt;A crucial feature of any neural network chatbot is its context window length. Llama 2 has a context limit of 4,096 tokens, GPT-3.5 handles 16,384 tokens, while GPT-4 and DeepSeek can process up to 128,000 tokens (about 100,000 words, equivalent to 300 pages of typewritten text).&lt;/p&gt;
&lt;h2&gt;R stands for Reasoning&lt;/h2&gt;
&lt;p&gt;DeepSeek-R1 has acquired a reasoning mechanism similar to OpenAI o1, enabling it to handle complex tasks more efficiently and accurately. Instead of providing immediate answers, the model expands the context by generating step-by-step reasoning in small paragraphs. This approach enhances the neural network’s ability to identify complex data relationships, resulting in more comprehensive and precise answers.&lt;/p&gt;
&lt;p&gt;When faced with a complex task, DeepSeek uses its reasoning mechanism to break down the problem into components and analyze each one separately. The model then synthesizes these findings to generate a user response. While this appears to be an ideal approach for neural networks, it comes with significant challenges.&lt;/p&gt;
&lt;p&gt;All modern LLMs share a concerning trait - artificial hallucinations. When presented with a question it cannot answer, instead of acknowledging its limitations, the model might generate fictional answers supported by made-up facts.&lt;/p&gt;
&lt;p&gt;When applied to a reasoning neural network, these hallucinations could compromise the thought process by basing conclusions on fictional rather than factual information. This could lead to incorrect conclusions - a challenge that neural network researchers and developers will need to address in the future.&lt;/p&gt;
&lt;h2&gt;VRAM consumption&lt;/h2&gt;
&lt;p&gt;Let’s explore how to run and test DeepSeek R1 on a dedicated server, focusing on the GPU video memory requirements.&lt;/p&gt;
&lt;table style=&quot;margin: auto;&quot; width=&quot;50%&quot;&gt;
    &lt;tr&gt;
        &lt;th&gt;Model&lt;/th&gt;
        &lt;th&gt;VRAM (MB)&lt;/th&gt;
        &lt;th&gt;Model size (GB)&lt;/th&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;deepseek-r1:1.5b&lt;/td&gt;
        &lt;td&gt;1,952&lt;/td&gt;
        &lt;td&gt;1.1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;deepseek-r1:7b&lt;/td&gt;
        &lt;td&gt;5,604&lt;/td&gt;
        &lt;td&gt;4.7&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;deepseek-r1:8b&lt;/td&gt;
        &lt;td&gt;6,482&lt;/td&gt;
        &lt;td&gt;4.9&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;deepseek-r1:14b&lt;/td&gt;
        &lt;td&gt;10,880&lt;/td&gt;
        &lt;td&gt;9&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;deepseek-r1:32b&lt;/td&gt;
        &lt;td&gt;21,758&lt;/td&gt;
        &lt;td&gt;20&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;deepseek-r1:70b&lt;/td&gt;
        &lt;td&gt;39,284&lt;/td&gt;
        &lt;td&gt;43&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;deepseek-r1:671b&lt;/td&gt;
        &lt;td&gt;470,091&lt;/td&gt;
        &lt;td&gt;404&lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;
&lt;p&gt;The first three options (1.5b, 7b, 8b) are basic models that can handle most tasks efficiently. These models run smoothly on any consumer GPU with 6-8 GB of video memory. The mid-tier versions (14b and 32b) are ideal for professional tasks but require more VRAM. The largest models (70b and 671b) require specialized GPUs and are primarily used for research and industrial applications.&lt;/p&gt;
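&lt;p&gt;The model tags in the table follow the Ollama naming scheme. Assuming Ollama is installed on the server, a mid-tier model can be pulled and started with a single command:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;ollama run deepseek-r1:14b&lt;/code&gt;&lt;/pre&gt;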
&lt;h2&gt;Server selection&lt;/h2&gt;
&lt;p&gt;To help you choose a server for DeepSeek inference, here are the ideal LeaderGPU configurations for each model group:&lt;/p&gt;
&lt;h3&gt;1.5b / 7b / 8b / 14b / 32b / 70b&lt;/h3&gt;
&lt;p&gt;For this group, any server with the following GPU types will be suitable. Most LeaderGPU servers will run these neural networks without any issues. Performance will mainly depend on the number of CUDA® cores. We recommend servers with multiple GPUs, such as:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/?fltr_type%5B%5D=a40#filter_block&quot;&gt;A40&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/?fltr_type%5B%5D=l20#filter_block&quot;&gt;L20&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;671b&lt;/h3&gt;
&lt;p&gt;Now for the most challenging case: how do you run inference on a model with a 404 GB base size? This means approximately 470 GB of video memory will be required. LeaderGPU offers multiple configurations with the following GPUs capable of handling this load:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/?fltr_type%5B%5D=a100#filter_block&quot;&gt;A100&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/?fltr_type%5B%5D=h100#filter_block&quot;&gt;H100&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Both configurations handle the model load efficiently, distributing it evenly across multiple GPUs. For example, this is what a server with 8xH100 looks like after loading the deepseek-r1:671b model:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/126/original/sh_deepseek-r1_future_of_LLMs_1.png?1739199426&quot; alt=&quot;deepseek-r1:671b on 8xH100&quot;&gt;
&lt;p&gt;The computational load balances dynamically across GPUs, while high-speed NVLink® interconnects prevent data exchange bottlenecks, ensuring maximum performance.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;DeepSeek-R1 combines many innovative technologies like Multi Token Prediction, Multi-Head Latent Attention, and Mixture of Experts into one significant model. This open-source software demonstrates that LLMs can be developed more efficiently with fewer computational resources. The model comes in versions ranging from the small 1.5b to the huge 671b, the largest of which requires specialized hardware with multiple high-end GPUs working in parallel.&lt;/p&gt;
&lt;p&gt;By renting a server from LeaderGPU for DeepSeek-R1 inference, you get a wide range of configurations, reliability, and fault tolerance. Our technical support team will help you with any problems or questions, while the automatic operating system installation reduces deployment time.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://www.leadergpu.com/#chose-best&quot;&gt;Choose your LeaderGPU server&lt;/a&gt; and discover the possibilities that open up when using modern neural network models. If you have any questions, don’t hesitate to ask them in our chat or &lt;a href=&quot;mailto:info@leadergpu.com&quot;&gt;email&lt;/a&gt;.&lt;/p&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/001/125/original/il_deepseek-r1_future_of_LLMs.png?1739198303"
        length="0"
        type="image/jpeg"/>
      <pubDate>Wed, 19 Feb 2025 15:10:33 +0100</pubDate>
      <guid isPermaLink="false">613</guid>
      <dc:date>2025-02-19 15:10:33 +0100</dc:date>
    </item>
    <item>
      <title>Intel Habana Gaudi 2: install and test</title>
      <link>https://www.leadergpu.com/catalog/611-intel-habana-gaudi-2-install-and-test</link>
      <description>&lt;p&gt;Before you start installing the Gaudi 2 accelerator software, there is one important feature worth mentioning. We are accustomed to the fact that neural network training and inference are performed on GPUs. However, Intel Habana Gaudi 2 is very different from a GPU and represents a separate class of devices designed solely for accelerating AI tasks.&lt;/p&gt;
&lt;p&gt;Many familiar applications and frameworks will not work without first preparing the operating system and, in some cases, without a special &lt;a href=&quot;https://docs.habana.ai/en/latest/PyTorch/PyTorch_Model_Porting/GPU_Migration_Toolkit/GPU_Migration_Toolkit.html&quot;&gt;GPU Migration Toolkit&lt;/a&gt;. This explains the large number of preparatory steps that we describe in this article. Let’s start in order.&lt;/p&gt;
&lt;h2&gt;Step 1. Install SynapseAI Software Stack&lt;/h2&gt;
&lt;p&gt;To start working with Intel Habana Gaudi 2 accelerators, you need to install the SynapseAI stack. It includes a special graph compiler that transforms the topology of the neural network model to effectively optimize execution on Gaudi architecture, API libraries for horizontal scaling, as well as a separate SDK for creating high-performance algorithms and machine learning models.&lt;/p&gt;
&lt;p&gt;Separately, we note that SynapseAI is the component that bridges popular frameworks like PyTorch and TensorFlow with the Gaudi 2 AI accelerators. This lets you work with familiar abstractions while Gaudi 2 independently optimizes the calculations. Specific operators for which the accelerators lack hardware support are executed on the CPU.&lt;/p&gt;
&lt;p&gt;To simplify the installation of individual SynapseAI components, a convenient shell script has been created. Let’s download it:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;wget -nv https://vault.habana.ai/artifactory/gaudi-installer/latest/habanalabs-installer.sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Make the file executable:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;chmod +x habanalabs-installer.sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Run the script:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;./habanalabs-installer.sh install --type base&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Follow the system prompts during installation. A detailed report is written to a log file, showing which packages were installed and whether the accelerators were successfully found and initialized.&lt;/p&gt;
&lt;p&gt;The log is located at /var/log/habana_logs/install-YYYY-MM-DD-HH-MM-SS.log:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;[  +3.881647] habanalabs hl5: Found GAUDI2 device with 96GB DRAM
[  +0.008145] habanalabs hl0: Found GAUDI2 device with 96GB DRAM
[  +0.032034] habanalabs hl3: Found GAUDI2 device with 96GB DRAM
[  +0.002376] habanalabs hl4: Found GAUDI2 device with 96GB DRAM
[  +0.005174] habanalabs hl1: Found GAUDI2 device with 96GB DRAM
[  +0.000390] habanalabs hl2: Found GAUDI2 device with 96GB DRAM
[  +0.007065] habanalabs hl7: Found GAUDI2 device with 96GB DRAM
[  +0.006256] habanalabs hl6: Found GAUDI2 device with 96GB DRAM&lt;/pre&gt;
&lt;p&gt;Just as the nvidia-smi utility provides information about installed GPUs and running compute processes, SynapseAI has a similar program. You can run it to get a report on the current state of the Gaudi 2 AI accelerators:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;hl-smi&lt;/code&gt;&lt;/pre&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/981/original/sh_intel_habana_gaudi_2_install_and_test_1.png?1714555709&quot; alt=&quot;hl-smi screenshot&quot;&gt;
&lt;h2&gt;Step 2. TensorFlow test&lt;/h2&gt;
&lt;p&gt;TensorFlow is one of the most popular platforms for machine learning. Using the same installation script, you can install a pre-built version of TensorFlow with support for Gaudi 2 accelerators. Let’s start by installing the general dependencies:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;./habanalabs-installer.sh install -t dependencies&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Next, we’ll install dependencies for TensorFlow:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;./habanalabs-installer.sh install -t dependencies-tensorflow&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Install the TensorFlow platform inside a Python virtual environment (venv):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;./habanalabs-installer.sh install --type tensorflow --venv&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let’s activate the created virtual environment:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;source habanalabs-venv/bin/activate&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Create a simple Python code example that will utilize the capabilities of Gaudi 2 accelerators:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-venv&quot;&gt;nano example.py&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import torchvision
import torchvision.transforms as transforms
import os
# Import Habana Torch Library
import habana_frameworks.torch.core as htcore
class SimpleModel(nn.Module):
   def __init__(self):
       super(SimpleModel, self).__init__()
       self.fc1   = nn.Linear(784, 256)
       self.fc2   = nn.Linear(256, 64)
       self.fc3   = nn.Linear(64, 10)
   def forward(self, x):
       out = x.view(-1,28*28)
       out = F.relu(self.fc1(out))
       out = F.relu(self.fc2(out))
       out = self.fc3(out)
       return out
def train(net,criterion,optimizer,trainloader,device):
   net.train()
   train_loss = 0.0
   correct = 0
   total = 0
   for batch_idx, (data, targets) in enumerate(trainloader):
       data, targets = data.to(device), targets.to(device)
       optimizer.zero_grad()
       outputs = net(data)
       loss = criterion(outputs, targets)
       loss.backward()
       # API call to trigger execution
       htcore.mark_step()
       optimizer.step()
       # API call to trigger execution
       htcore.mark_step()
       train_loss += loss.item()
       _, predicted = outputs.max(1)
       total += targets.size(0)
       correct += predicted.eq(targets).sum().item()
   train_loss = train_loss/(batch_idx+1)
   train_acc = 100.0*(correct/total)
   print(&quot;Training loss is {} and training accuracy is {}&quot;.format(train_loss,train_acc))
def test(net,criterion,testloader,device):
   net.eval()
   test_loss = 0
   correct = 0
   total = 0
   with torch.no_grad():
       for batch_idx, (data, targets) in enumerate(testloader):
           data, targets = data.to(device), targets.to(device)
           outputs = net(data)
           loss = criterion(outputs, targets)
           # API call to trigger execution
           htcore.mark_step()
           test_loss += loss.item()
           _, predicted = outputs.max(1)
           total += targets.size(0)
           correct += predicted.eq(targets).sum().item()
   test_loss = test_loss/(batch_idx+1)
   test_acc = 100.0*(correct/total)
   print(&quot;Testing loss is {} and testing accuracy is {}&quot;.format(test_loss,test_acc))
def main():
   epochs = 20
   batch_size = 128
   lr = 0.01
   milestones = [10,15]
   load_path = &#39;./data&#39;
   save_path = &#39;./checkpoints&#39;
   if(not os.path.exists(save_path)):
       os.makedirs(save_path)
   # Target the Gaudi HPU device
   device = torch.device(&quot;hpu&quot;)
   # Data
   transform = transforms.Compose([
       transforms.ToTensor(),
   ])
   trainset = torchvision.datasets.MNIST(root=load_path, train=True,
                                           download=True, transform=transform)
   trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size,
                                           shuffle=True, num_workers=2)
   testset = torchvision.datasets.MNIST(root=load_path, train=False,
                                       download=True, transform=transform)
   testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size,
                                           shuffle=False, num_workers=2)
   net = SimpleModel()
   net.to(device)
   criterion = nn.CrossEntropyLoss()
   optimizer = optim.SGD(net.parameters(), lr=lr,
                       momentum=0.9, weight_decay=5e-4)
   scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=milestones, gamma=0.1)
   for epoch in range(1, epochs+1):
       print(&quot;=====================================================================&quot;)
       print(&quot;Epoch : {}&quot;.format(epoch))
       train(net,criterion,optimizer,trainloader,device)
       test(net,criterion,testloader,device)
       torch.save(net.state_dict(), os.path.join(save_path,&#39;epoch_{}.pth&#39;.format(epoch)))
       scheduler.step()
if __name__ == &#39;__main__&#39;:
   main()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Finally, execute the application:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-venv&quot;&gt;python3 example.py&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To exit the virtual environment, run the following command:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-venv&quot;&gt;deactivate&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Step 3. Clone training repository&lt;/h2&gt;
&lt;p&gt;Clone the repository with the MLPerf code:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git clone https://github.com/mlcommons/training_results_v3.0&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Create a separate directory that will be used by the Docker container with MLPerf:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;mkdir -p mlperf&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Change the directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd mlperf&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let’s export some environment variables:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;export MLPERF_DIR=/home/usergpu/mlperf&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;export SCRATCH_DIR=/home/usergpu/mlperf/scratch&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;export DATASETS_DIR=/home/usergpu/mlperf/datasets&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Create new directories using the variables you just defined:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;mkdir -p $MLPERF_DIR/Habana&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;mkdir -p $SCRATCH_DIR&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;mkdir -p $DATASETS_DIR&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Copy the benchmark app to $MLPERF_DIR/Habana:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cp -R training_results_v3.0/Intel-HabanaLabs/benchmarks/ $MLPERF_DIR/Habana&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Export another variable that stores the name and tag of the Docker image to be pulled:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;export MLPERF_DOCKER_IMAGE=vault.habana.ai/gaudi-docker-mlperf/ver3.1/pytorch-installer-2.0.1:1.13.99-41&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Step 4. Install Docker&lt;/h2&gt;
&lt;p&gt;Our instance runs Ubuntu Linux 22.04 LTS, which does not ship with Docker preinstalled. So, before downloading and running containers, you need to install Docker. Let’s refresh the package cache and install some basic packages that you’ll need later:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update &amp;&amp; sudo apt -y install apt-transport-https ca-certificates curl software-properties-common&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To install Docker, you need to add the project’s digitally signed repository. Download the GPG signing key and add it to the operating system’s key store:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Docker can run on platforms with various architectures. The following command will detect your server’s architecture and add the corresponding repository line to the APT package manager list:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;echo &quot;deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable&quot; | sudo tee /etc/apt/sources.list.d/docker.list &gt; /dev/null&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Update the package cache, check the installation candidate, and install docker-ce (Docker Community Edition):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update &amp;&amp; apt-cache policy docker-ce &amp;&amp; sudo apt install docker-ce&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Finally, check that Docker daemon is up and running:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl status docker&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Step 5. Run Docker container&lt;/h2&gt;
&lt;p&gt;Let’s launch the container in privileged mode using the previously specified variables:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo docker run --privileged --security-opt seccomp=unconfined \
  --name mlperf3.0 -td                    \
  -v /dev:/dev                            \
  --device=/dev:/dev                      \
  -e LOG_LEVEL_ALL=6                      \
  -v /sys/kernel/debug:/sys/kernel/debug  \
  -v /tmp:/tmp                            \
  -v $MLPERF_DIR:/root/MLPERF             \
  -v $SCRATCH_DIR:/root/scratch           \
  -v $DATASETS_DIR:/root/datasets/        \
  --cap-add=sys_nice --cap-add=SYS_PTRACE \
  --user root --workdir=/root --net=host  \
  --ulimit memlock=-1:-1 $MLPERF_DOCKER_IMAGE&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For convenience, you can enable SSH access to the container’s terminal by starting its SSH service:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo docker exec mlperf3.0 bash -c &quot;service ssh start&quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To open a command shell (bash) in the current session, run the following command:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo docker exec -it mlperf3.0 bash&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Step 6. Prepare a dataset&lt;/h2&gt;
&lt;p&gt;To run the BERT benchmark from MLPerf, you need a prepared dataset. The optimal method is to generate the dataset from preloaded data. The MLPerf repository includes a dedicated script, prepare_data.sh, which requires a specific set of packages to run. Let’s navigate to the following directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd /root/MLPERF/Habana/benchmarks/bert/implementations/PyTorch&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Install all required packages using the pre-generated list and the pip package manager:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;pip install -r requirements.txt&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Set the PYTORCH_BERT_DATA variable to instruct the script where to store data:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;export PYTORCH_BERT_DATA=/root/datasets/pytorch_bert&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Run the script:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;bash input_preprocessing/prepare_data.sh -o $PYTORCH_BERT_DATA&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The generation procedure is quite long and can take several hours. Please be patient and do not interrupt the process. If you plan to disconnect from the SSH session, launch the &lt;a href=&quot;https://www.geeksforgeeks.org/screen-command-in-linux-with-examples/&quot;&gt;screen&lt;/a&gt; utility before starting the Docker container so the process keeps running after you disconnect.&lt;/p&gt;
&lt;h2&gt;Step 7. Pack the dataset&lt;/h2&gt;
&lt;p&gt;The next step is to “cut” the dataset into equal pieces for the subsequent MLPerf run. Let’s create a separate directory for the packed data:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;mkdir $PYTORCH_BERT_DATA/packed&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Run the packing script:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;python3 pack_pretraining_data_pytorch.py \
  --input_dir=$PYTORCH_BERT_DATA/hdf5/training-4320/hdf5_4320_shards_uncompressed \
  --output_dir=$PYTORCH_BERT_DATA/packed \
  --max_predictions_per_seq=76&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Step 8. Run a test&lt;/h2&gt;
&lt;p&gt;Now that the dataset is prepared, it’s time to run the test. However, some preparation is still required: the BERT test authors left hard-coded values in the launch script that will interfere with the test execution. First, rename the following directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;mv $PYTORCH_BERT_DATA/packed $PYTORCH_BERT_DATA/packed_data_500_pt&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Change the directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd /root/MLPERF/Habana/benchmarks/bert/implementations/HLS-Gaudi2-PT&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Since the GNU Nano editor isn’t installed inside the container, it must be installed separately. Alternatively, you can use the built-in Vi editor:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;apt update &amp;&amp; apt -y install nano&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, edit the test launch script:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;nano launch_bert_pytorch.sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Find the first line:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;DATA_ROOT=/mnt/weka/data/pytorch/bert_mlperf/packed_data&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Replace with the following:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;DATA_ROOT=/root/datasets/pytorch_bert&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Find the second line:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;INPUT_DIR=$DATA_ROOT/packed&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Replace with the following:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;INPUT_DIR=$DATA_ROOT/packed_data_500_pt&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Save the file and exit.&lt;/p&gt;
&lt;p&gt;The test code includes a gradient clipping function that keeps gradients from exceeding certain values, preventing them from growing uncontrollably (the “exploding gradients” problem). For reasons unknown to us, this function is absent from the PyTorch build used in the container, causing the test to terminate abnormally during the warm-up stage.&lt;/p&gt;
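&lt;p&gt;For reference, here is a minimal sketch of what such a limiter does conceptually, using the standard torch.nn.utils.clip_grad_norm_ utility on a toy model. It illustrates global-norm gradient clipping in plain PyTorch and is not the Habana-specific fused implementation removed below:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;import torch
import torch.nn as nn

# Toy model and a single training step, for illustration only.
model = nn.Linear(10, 2)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

inputs, targets = torch.randn(4, 10), torch.randn(4, 2)
loss = criterion(model(inputs), targets)
loss.backward()

# Rescale all gradients so that their global L2 norm does not
# exceed 1.0, preventing a single step from blowing up the weights.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()&lt;/code&gt;&lt;/pre&gt;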
&lt;p&gt;A potential workaround is to temporarily remove this function from the code in the fastddp.py file. To do this, open the file:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;nano ../PyTorch/fastddp.py&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Find and comment out the following three lines of code using the # (hash) symbol so they look like this:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;#from habana_frameworks.torch import _hpex_C
#    clip_global_grad_norm = _hpex_C.fused_lamb_norm(grads, 1.0)
#    _fusion_buffer.div_((clip_global_grad_norm * _all_reduce_group_size).to(_fusion_buffer.dtype))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Save the file and exit, then change back to the launch directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd ../HLS-Gaudi2-PT&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Finally, run the script. It will take approximately 20 minutes to complete:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;./launch_bert_pytorch.sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/606-what-is-knowledge-distillation&quot;&gt;What is Knowledge Distillation&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/607-advantages-and-disadvantages-of-gpu-sharing&quot;&gt;Advantages and Disadvantages of GPU sharing&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/609-nvidia-rtx-50-expectations-and-reality&quot;&gt;NVIDIA® RTX™ 50: expectations vs reality&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/000/980/original/il_intel_habana_gaudi_2_install_and_test.png?1714555676"
        length="0"
        type="image/jpeg"/>
      <pubDate>Thu, 23 Jan 2025 13:41:09 +0100</pubDate>
      <guid isPermaLink="false">611</guid>
      <dc:date>2025-01-23 13:41:09 +0100</dc:date>
    </item>
    <item>
      <title>NVIDIA® RTX™ 50: expectations and reality</title>
      <link>https://www.leadergpu.com/catalog/609-nvidia-rtx-50-expectations-and-reality</link>
      <description>&lt;p&gt;&lt;b translate=&quot;no&quot;&gt;The highlight of CES 2025 was NVIDIA® CEO Jensen Huang’s speech. The revelation of new GPU specifications within minutes caught many off guard. In this article, we’ll examine how expert predictions matched the actual announcements.&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Let’s look at the lineup first. The RTX™ 40 series launched with 6 models, ranging from the RTX™ 4060 to the RTX™ 4090. While many expected a similar range for the RTX™ 50 series, that didn’t happen. Instead, the RTX™ 50 family includes just 4 models: RTX™ 5070, RTX™ 5070 Ti, RTX™ 5080, and RTX™ 5090. We may see both the RTX™ 5050 and RTX™ 5060 in the future, but no official sources have confirmed these graphics cards yet.&lt;/p&gt;
&lt;h2&gt;Technological process&lt;/h2&gt;
&lt;p&gt;Moore’s law, the empirical observation that “the number of transistors in an integrated circuit doubles about every two years”, is often said to be no longer relevant to chip performance. Since 2022, Jensen Huang has repeatedly declared Moore’s law dead. Instead, he proposed a new concept that emphasizes the simultaneous development of architecture, microchips, software libraries, and algorithms.&lt;/p&gt;
&lt;p&gt;This shift allows us to focus on overall system performance rather than transistor count alone. The concept of computing efficiency has sparked ongoing discussions in the tech community. While views on this topic vary, the industry clearly faces both physical and economic barriers to further miniaturization.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/098/original/sh_nvidia_rtx_50_expectation_and_reality_1.png?1736938750&quot; alt=&quot;RTX 5090 vs RTX 4090 - Core clock and boost clock&quot;&gt;
&lt;p&gt;Let’s take a look at the new generation GPU process technology. The presentation didn’t specifically mention this, but all previous generation cards were built on the 4N process. &lt;b translate=&quot;no&quot;&gt;The RTX™ 50 series uses a different 4NP process technology&lt;/b&gt;. At the same time, it’s important to understand that 4N and 4NP are just marketing names. The transistors themselves remain 5 nm in size.&lt;/p&gt;
&lt;p&gt;The improved 4NP process technology primarily enables higher transistor density on the chip and faster clock speeds. While experts predicted that the RTX™ 50 would use the same process technology as the RTX™ 40, they were technically incorrect, though not by much, since the transistor size remains unchanged and TSMC continues as the manufacturer.&lt;/p&gt;
&lt;h2&gt;Number of cores&lt;/h2&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/099/original/sh_nvidia_rtx_50_expectation_and_reality_2.png?1736938790&quot; alt=&quot;RTX 5090 vs RTX 4090 - CUDA cores count&quot;&gt;
&lt;p&gt;Prior to the RTX™ 50 series release, numerous data leaks revealed the GPU’s basic characteristics. Initial insider reports from July 2024 suggested the flagship would feature 24,576 cores, 192 Ray-tracing cores, and 768 Tensor cores. However, subsequent leaks adjusted these numbers to more realistic values.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/100/original/sh_nvidia_rtx_50_expectation_and_reality_3.png?1736938814&quot; alt=&quot;RTX 5090 vs RTX 4090 - AI cores count&quot;&gt;
&lt;p&gt;The final RTX™ 5090 shipped with &lt;b translate=&quot;no&quot;&gt;21,760 CUDA® cores&lt;/b&gt; (up from the RTX™ 4090’s 16,384), &lt;b translate=&quot;no&quot;&gt;170 Ray-tracing cores&lt;/b&gt;, and &lt;b translate=&quot;no&quot;&gt;680 Tensor cores&lt;/b&gt;. This aligns with the company’s recent strategy of boosting performance not just through increased transistor count, but through comprehensive architectural optimization.&lt;/p&gt;
&lt;h2&gt;Memory&lt;/h2&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/101/original/sh_nvidia_rtx_50_expectation_and_reality_4.png?1736938865&quot; alt=&quot;RTX 5090 vs RTX 4090 - Memory capacity&quot;&gt;
&lt;p&gt;The new GPUs’ use of GDDR7 memory came as no surprise. Industry experts had predicted this move in 2024 after the three major manufacturers (Samsung, Micron, and SK hynix) showcased their GDDR7 prototypes in succession. NVIDIA® was generous with memory distribution: the base &lt;b translate=&quot;no&quot;&gt;RTX™ 5070&lt;/b&gt; model features &lt;b translate=&quot;no&quot;&gt;12 GB GDDR7&lt;/b&gt; on a &lt;b translate=&quot;no&quot;&gt;192-bit&lt;/b&gt; bus, while the &lt;b translate=&quot;no&quot;&gt;RTX™ 5070 Ti and RTX™ 5080&lt;/b&gt; both carry &lt;b translate=&quot;no&quot;&gt;16 GB GDDR7&lt;/b&gt; on a &lt;b translate=&quot;no&quot;&gt;256-bit&lt;/b&gt; bus. At the top end, the flagship &lt;b translate=&quot;no&quot;&gt;RTX™ 5090&lt;/b&gt; comes with a massive &lt;b translate=&quot;no&quot;&gt;32 GB GDDR7&lt;/b&gt; on a &lt;b translate=&quot;no&quot;&gt;512-bit&lt;/b&gt; bus.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/102/original/sh_nvidia_rtx_50_expectation_and_reality_5.png?1736938884&quot; alt=&quot;RTX 5090 vs RTX 4090 - Memory throughput&quot;&gt;
&lt;p&gt;Experts initially predicted that the maximum throughput of this memory configuration would be 1.5 TB/s. However, reality surpassed these expectations, &lt;b translate=&quot;no&quot;&gt;achieving a throughput of 1.7 TB/s&lt;/b&gt;. This dramatic improvement primarily benefits the GPU’s AI processing capabilities rather than gaming performance. The new generation’s combination of high capacity and fast memory is particularly valuable for large language models and generative neural networks.&lt;/p&gt;
&lt;h2&gt;Technologies&lt;/h2&gt;
&lt;h3&gt;For gamers&lt;/h3&gt;
&lt;p&gt;Real-time ray tracing has become one of the most revolutionary GPU technologies, marking the beginning of the RTX™ line. For many consumers, this feature has been a key factor in their purchase decisions. In RTX™ 50 series cards, DLSS (Deep Learning Super Sampling) version 4 may play an equally important role. This technology significantly boosts GPU performance in games through its hybrid frame rendering approach.&lt;/p&gt;
&lt;p&gt;With DLSS enabled, instead of rendering every frame conventionally, some frames are generated in real time using AI. While early versions of this technology could only upscale frames to higher resolutions, DLSS 3 introduced a more advanced capability: for every conventionally rendered frame, it can generate an additional AI-created frame.&lt;/p&gt;
&lt;p&gt;DLSS 4 will generate three AI-powered frames for every traditionally rendered frame. &lt;b translate=&quot;no&quot;&gt;This significantly increases the frames per second (FPS) without putting heavy load on the GPU.&lt;/b&gt; For example, a game rendering 30 frames per second conventionally could display around 120 FPS. The AI analyzes object and scene movement to ensure the generated frames closely match conventionally rendered ones.&lt;/p&gt;
&lt;p&gt;This raises an important question: how do we handle input lag? Since frame generation takes time, each iteration adds to the response time. A smooth picture with slow response to player actions can severely impact the gaming experience. &lt;b translate=&quot;no&quot;&gt;To address this, NVIDIA® has improved their Reflex 2 technology alongside DLSS to minimize latency.&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Specifically, Frame Warp was integrated into the system. This technology reduces game latency by updating rendered frames with the latest mouse input just before display. It enhances both multiplayer competition and single-player responsiveness.&lt;/p&gt;
&lt;h3&gt;For content creators&lt;/h3&gt;
&lt;p&gt;The RTX™ 50 series isn’t just for gaming. Video content creators will find significant value in these new GPUs. The flagship RTX™ 5090 model comes equipped with 3 encoders and 2 decoders, compared to the RTX™ 4090’s 2 encoders and 1 decoder. These components have been enhanced through collaborative development with industry leaders: Adobe, Blackmagic Design, ByteDance, and Wondershare. &lt;b translate=&quot;no&quot;&gt;As a result, the RTX™ 5090 renders video 60% faster than the RTX™ 4090 and four times faster than the RTX™ 3090.&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Beyond raw speed improvements, the quality has also been enhanced. &lt;b translate=&quot;no&quot;&gt;The 9th generation NVENC encoder delivers 5% better quality in HEVC and AV1 tasks. The AV1 Ultra Quality mode achieves better data compression while maintaining image quality, reducing file sizes by 5%.&lt;/b&gt; This means faster video rendering on the RTX™ 5090 and less time between editing and final production.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Looking back six months, the experts’ predictions and expectations proved overly optimistic. As the release date approached, it became evident that the new GPUs would offer more than just additional computing units. &lt;b translate=&quot;no&quot;&gt;The key innovation would be new optimization and AI technologies enhancing existing frame rendering systems.&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;At CES 2025, during the RTX™ 50 series presentation, a new AI era was unveiled. This vision portrayed a world where digital assistants and robots handle complex tasks. At its core would be an ecosystem combining supercomputers for AI training, affordable inference accelerators for consumer devices, and versatile software operating both locally and in the cloud. While the full extent of this future remains uncertain, one thing is clear: we stand at the threshold of turning science fiction into reality.&lt;/p&gt;
&lt;p&gt;&lt;b translate=&quot;no&quot;&gt;LeaderGPU remains committed to providing reliable access to these cutting-edge technologies. Order your first GPU server today and begin transforming your ideas into reality.&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/606-what-is-knowledge-distillation&quot;&gt;What is Knowledge Distillation&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/607-advantages-and-disadvantages-of-gpu-sharing&quot;&gt;Advantages and Disadvantages of GPU sharing&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/611-intel-habana-gaudi-2-install-and-test&quot;&gt;Intel Habana Gaudi 2: install and test&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/001/098/original/sh_nvidia_rtx_50_expectation_and_reality_1.png?1736938750"
        length="0"
        type="image/jpeg"/>
      <pubDate>Thu, 23 Jan 2025 13:34:30 +0100</pubDate>
      <guid isPermaLink="false">609</guid>
      <dc:date>2025-01-23 13:34:30 +0100</dc:date>
    </item>
    <item>
      <title>Advantages and Disadvantages of GPU sharing</title>
      <link>https://www.leadergpu.com/catalog/607-advantages-and-disadvantages-of-gpu-sharing</link>
      <description>&lt;p&gt;Moore’s Law has remained relevant for nearly half a century. Processor chips continue to pack in more transistors, and technologies advance daily. As technology evolves, so does our approach to computing. The rise of certain computing tasks has significantly influenced hardware development. For instance, devices originally designed for graphics processing are now key, affordable tools for modern neural networks.&lt;/p&gt;
&lt;p&gt;The management of computing resources has also transformed. Mass services now rarely use mainframes, as they did in the 1970s and ‘80s. Instead, they prefer cloud services or building their own infrastructure. This shift has changed customer demands, with a focus on rapid, on-demand scaling and maximizing the use of allocated computing resources.&lt;/p&gt;
&lt;p&gt;Virtualization and containerization technologies emerged as solutions. Applications are now packaged in containers with all necessary libraries, simplifying deployment and scaling. However, manual management became impractical as container numbers soared into the thousands. Specialized orchestrators like Kubernetes now handle effective management and scaling. These tools have become an essential part of any modern IT infrastructure.&lt;/p&gt;
&lt;h2&gt;Server virtualization&lt;/h2&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/075/original/sh_advantages_and_disadvantages_of_gpu_sharing_1.png?1731504150&quot; alt=&quot;Server virtualization&quot;&gt;
&lt;p&gt;Concurrently, virtualization technologies evolved, enabling the creation of isolated environments within a single physical server. Virtual machines behave identically to regular physical servers, allowing the use of standard management tools. Depending on the hypervisor, a specialized API is often included, facilitating the automation of routine procedures.&lt;/p&gt;
&lt;p&gt;However, this flexibility comes with reduced security. Attackers have shifted their focus from targeting individual virtual machines to exploiting hypervisor vulnerabilities. By gaining control of a hypervisor, attackers can access all associated virtual machines at will. Despite ongoing security improvements, modern hypervisors remain attractive targets.&lt;/p&gt;
&lt;p&gt;Traditional virtualization addresses two key issues. The first is ensuring the isolation of virtual machines from one another. Bare-metal solutions avoid this problem, since customers rent entire physical servers under their own control. For virtual machines, however, isolation is software-based at the hypervisor level. A code error or random bug can compromise this isolation, risking data leakage or corruption.&lt;/p&gt;
&lt;p&gt;The second issue concerns resource management. While it’s possible to guarantee resource allocation to specific virtual machines, managing numerous machines presents a dilemma. Resources can be underutilized, resulting in fewer virtual machines per physical server. This scenario is unprofitable for infrastructure and inevitably leads to price increases.&lt;/p&gt;
&lt;p&gt;Alternatively, you can use automatic resource management mechanisms. Although a virtual machine is nominally allocated its declared resources, in practice only the minimum currently required is provided within those limits. If the machine needs more processor time or RAM, the hypervisor will attempt to provide it, but can’t guarantee it. This situation is similar to airplane overbooking, where airlines sell more tickets than there are seats available.&lt;/p&gt;
&lt;p&gt;The logic is identical. If statistics show that about 10% of passengers don&#39;t show up for their flight, airlines can sell 10% more tickets with minimal risk. If all passengers do show up, some won’t fit on board. The airline will face minor consequences in the form of compensation but will likely continue this practice.&lt;/p&gt;
&lt;p&gt;Many infrastructure providers employ a similar strategy. Some are transparent about it, stating they don’t guarantee constant availability of computing resources but offer significantly reduced prices. Others use similar mechanisms without advertising it. They’re betting that not all customers will consistently use 100% of their server resources, and even if some do, they’ll be in the minority. Meanwhile, idle resources generate profit.&lt;/p&gt;
&lt;p&gt;In this context, bare-metal solutions have an advantage. They guarantee that allocated resources are fully managed by the customer and not shared with other users of the infrastructure provider. This eliminates scenarios where high load from a neighboring server’s user negatively impacts performance.&lt;/p&gt;
&lt;h2&gt;GPU virtualization&lt;/h2&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/076/original/sh_advantages_and_disadvantages_of_gpu_sharing_2.png?1731504219&quot; alt=&quot;GPU virtualization&quot;&gt;
&lt;p&gt;Classic virtualization inevitably faces the challenge of emulating physical devices. To reduce this overhead, special technologies have been developed that allow virtual machines to directly access the server’s physical devices. This approach works well in many cases, but when applied to graphics processors, it creates immediate limitations. For instance, if a server has 8 GPUs installed, only 8 virtual machines can access them.&lt;/p&gt;
&lt;p&gt;To overcome this limitation, vGPU technology was invented. It divides one GPU into several logical ones, which can then be assigned to virtual machines. This allows each virtual machine to get its “piece of cake”, and their total number is no longer limited by the number of video cards installed in the server.&lt;/p&gt;
&lt;p&gt;Virtual GPUs are most commonly used when building VDI (Virtual Desktop Infrastructure) in areas where virtual machines require 3D acceleration. For example, a virtual workplace for a designer or planner typically involves graphics processing. Most applications in these fields perform calculations on both the central processor and the GPU. This hybrid approach significantly increases productivity and ensures optimal use of available computing resources.&lt;/p&gt;
&lt;p&gt;However, this technology has several drawbacks. It’s not supported by all GPUs and is only available in the server segment. Support also depends on the installed version of the operating system and the GPU driver. vGPU has a separate licensing mechanism, which substantially increases operating costs. Additionally, its software components can potentially serve as attack vectors.&lt;/p&gt;
&lt;p&gt;Recently, information &lt;a href=&quot;https://www.tomshardware.com/pc-components/gpu-drivers/nvidia-gpu-driver-addresses-eight-major-high-severity-vulnerabilities-nvidia-gpu-owners-should-update-asap&quot;&gt;was published&lt;/a&gt; about eight vulnerabilities affecting all users of NVIDIA® GPUs. Six vulnerabilities were identified in GPU drivers, and two were found in the vGPU software. These issues were quickly addressed, but the incident serves as a reminder that isolation mechanisms in such systems are not flawless. Constant monitoring and timely installation of updates remain the primary ways to ensure security.&lt;/p&gt;
&lt;p&gt;When building infrastructure to process confidential and sensitive user data, any virtualization becomes a potential risk factor. In such cases, a bare-metal approach may offer better quality and security.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Building a computing infrastructure always requires risk assessment. Key questions to consider include: Is customer data securely protected? Do the chosen technologies create additional attack vectors? How can potential vulnerabilities be isolated and eliminated? Answering these questions helps make informed choices and safeguard against future problems.&lt;/p&gt;
&lt;p&gt;At LeaderGPU, we’ve reached a clear conclusion: currently, bare-metal technology is superior in ensuring user data security while serving as an excellent foundation for building a bare-metal cloud. This approach allows our customers to maintain flexibility without taking on the added risks associated with GPU virtualization.&lt;/p&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/606-what-is-knowledge-distillation&quot;&gt;What is Knowledge Distillation&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/609-nvidia-rtx-50-expectations-and-reality&quot;&gt;NVIDIA® RTX™ 50: expectations vs reality&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/611-intel-habana-gaudi-2-install-and-test&quot;&gt;Intel Habana Gaudi 2: install and test&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/001/076/original/sh_advantages_and_disadvantages_of_gpu_sharing_2.png?1731504219"
        length="0"
        type="image/jpeg"/>
      <pubDate>Thu, 23 Jan 2025 13:24:12 +0100</pubDate>
      <guid isPermaLink="false">607</guid>
      <dc:date>2025-01-23 13:24:12 +0100</dc:date>
    </item>
    <item>
      <title>What is Knowledge Distillation</title>
      <link>https://www.leadergpu.com/catalog/606-what-is-knowledge-distillation</link>
      <description>&lt;p&gt;Large Language Models (LLMs) have become an integral part of our lives through their unique capabilities. They comprehend context and generate coherent, extensive texts based on it. They can process and respond in any language while considering the cultural nuances of each.&lt;/p&gt;
&lt;p&gt;LLMs excel at complex problem-solving, programming, maintaining conversations, and more. This versatility comes from processing vast amounts of training data, hence the term &quot;large&quot;. These models can contain tens or hundreds of billions of parameters, making them resource-intensive for everyday use.&lt;/p&gt;
&lt;p&gt;Training is the most demanding process. Neural network models learn by processing enormous datasets, adjusting their internal &quot;weights&quot; to form stable connections between neurons. These connections store knowledge that the trained neural network can later use on end devices.&lt;/p&gt;
&lt;p&gt;However, most end devices lack the necessary computing power to run these models. For instance, running the full version of Llama 2 (70B parameters) requires a GPU with 48 GB of video memory, hardware that few users have at home, let alone on mobile devices.&lt;/p&gt;
&lt;p&gt;Consequently, most modern neural networks operate in cloud infrastructure rather than on portable devices, which access them through APIs. Still, device manufacturers are making progress in two ways: equipping devices with specialized computing units like NPUs, and developing methods to improve the performance of compact neural network models.&lt;/p&gt;
&lt;h2&gt;Reducing the size&lt;/h2&gt;
&lt;h3&gt;Cut off the excess&lt;/h3&gt;
&lt;p&gt;Quantization is the first and most effective method for reducing neural network size. Neural network weights typically use 32-bit floating point numbers, but we can shrink them by changing this format. Using 8-bit values cuts the size roughly fourfold, and binary weights in extreme cases reduce it even further, though this significantly decreases answer accuracy.&lt;/p&gt;
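&lt;p&gt;As a minimal sketch of the idea, here is naive symmetric per-tensor int8 quantization; production frameworks such as PyTorch’s quantization tooling add calibration and per-channel scales:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;import torch

# Naive symmetric int8 quantization of a float32 weight tensor.
weights = torch.randn(256, 256)                # float32: 4 bytes per value
scale = weights.abs().max() / 127.0            # single scale for the tensor
q_weights = torch.clamp((weights / scale).round(), -127, 127).to(torch.int8)

# Dequantize for computation: 1 byte per value instead of 4,
# at the cost of a small reconstruction error.
deq_weights = q_weights.float() * scale
print(torch.mean((weights - deq_weights) ** 2))&lt;/code&gt;&lt;/pre&gt;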
&lt;p&gt;Pruning is another approach, which removes unimportant connections in the neural network. It can be applied both during training and to fully trained networks. Beyond individual connections, pruning can remove neurons or entire layers. This reduction in parameters and connections leads to lower memory requirements.&lt;/p&gt;
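&lt;p&gt;A quick sketch of magnitude-based pruning using PyTorch’s torch.nn.utils.prune module (a simple unstructured variant; structured pruning of whole neurons or layers works similarly):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(256, 64)
# Zero out the 50% of weights with the smallest absolute values.
prune.l1_unstructured(layer, name=&quot;weight&quot;, amount=0.5)
# Make the pruning permanent by removing the re-parametrization.
prune.remove(layer, &quot;weight&quot;)
print(float((layer.weight == 0).float().mean()))  # roughly 0.5&lt;/code&gt;&lt;/pre&gt;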
&lt;p&gt;Matrix or tensor decomposition is the third common size-reduction technique. Breaking down one large matrix into a product of three smaller matrices reduces the total parameter count while maintaining quality, and can shrink the network by an order of magnitude or more. Tensor decomposition offers even better results, though it requires more hyperparameters.&lt;/p&gt;
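&lt;p&gt;To see where the savings come from, here is an illustrative truncated-SVD calculation (the matrix size and rank are arbitrary choices for the example):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;import torch

# A 1024x1024 weight matrix holds 1,048,576 parameters.
W = torch.randn(1024, 1024)
U, S, Vh = torch.linalg.svd(W)

# Keep only the top-64 singular values (a rank-64 approximation).
k = 64
W_approx = U[:, :k] @ torch.diag(S[:k]) @ Vh[:k, :]

# Storing the three factors takes 1024*64 + 64 + 64*1024 = 131,136
# parameters instead of 1,048,576, roughly an 8x reduction.
print(W_approx.shape, ((W - W_approx).norm() / W.norm()).item())&lt;/code&gt;&lt;/pre&gt;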
&lt;p&gt;While these methods effectively reduce size, they all face the challenge of quality loss. Large compressed models outperform their smaller, uncompressed counterparts, but each compression risks reducing answer accuracy. Knowledge distillation represents an interesting attempt to balance quality with size.&lt;/p&gt;
&lt;h3&gt;Let’s try it together&lt;/h3&gt;
&lt;p&gt;Knowledge distillation is best explained through the analogy of a student and a teacher. While students learn, teachers teach while continuously updating their own knowledge. When both encounter something new, the teacher has an advantage: they can draw upon broad knowledge from other areas, a foundation the student still lacks.&lt;/p&gt;
&lt;p&gt;This principle applies to neural networks. When training two neural networks of the same type but different sizes on identical data, the larger network typically performs better. Its greater capacity for &quot;knowledge&quot; enables more accurate responses than its smaller counterpart. This raises an interesting possibility: why not train the smaller network not just on the dataset, but also on the more accurate outputs of the larger network?&lt;/p&gt;
&lt;p&gt;This process is knowledge distillation: a form of supervised learning where a smaller model learns to replicate the predictions of a larger one. While this technique helps offset the quality loss from reducing neural network size, it does require extra computational resources and training time.&lt;/p&gt;
&lt;h2&gt;Software and logic&lt;/h2&gt;
&lt;p&gt;With the theoretical foundation now clear, let&#39;s examine the process from a technical perspective. We&#39;ll begin with software tools that can guide you through the training and knowledge distillation stages.&lt;/p&gt;
&lt;p&gt;Python, along with the &lt;a href=&quot;https://pytorch.org/torchtune/stable/index.html&quot;&gt;TorchTune&lt;/a&gt; library from the &lt;a href=&quot;https://pytorch.org/&quot;&gt;PyTorch&lt;/a&gt; ecosystem, offers the simplest approach for studying and fine-tuning large language models. Here&#39;s how the application works:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/092/original/il_what_is_knowledge_distillation.png?1733302736&quot; alt=&quot;What is Knowledge Distillation main illustration&quot;&gt;
&lt;p&gt;Two models are loaded: a full model (teacher) and a reduced model (student). During each training iteration, the teacher model generates high-temperature predictions while the student model processes the dataset to make its own predictions.&lt;/p&gt;
&lt;p&gt;Both models&#39; raw output values (logits) are evaluated through a loss function (a numerical measure of how much a prediction deviates from the correct value). Weight adjustments are then applied to the student model through backpropagation. This enables the smaller model to learn and replicate the teacher model&#39;s predictions.&lt;/p&gt;
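&lt;p&gt;A minimal sketch of such a distillation loss in PyTorch (the temperature T softens both distributions, alpha balances the soft teacher targets against the ground-truth labels; the names here are illustrative):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.5):
    # Soft part: the student mimics the teacher&#39;s softened distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction=&quot;batchmean&quot;,
    ) * (T * T)  # standard scaling to keep gradient magnitudes comparable
    # Hard part: ordinary cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1 - alpha) * hard

# Random logits for a batch of 8 examples over 100 classes:
s, t = torch.randn(8, 100), torch.randn(8, 100)
labels = torch.randint(0, 100, (8,))
print(distillation_loss(s, t, labels))&lt;/code&gt;&lt;/pre&gt;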
&lt;p&gt;The primary configuration file in the application code is called a recipe. This file stores all distillation parameters and settings, making experiments reproducible and allowing researchers to track how different parameters influence the final outcome.&lt;/p&gt;
&lt;p&gt;When selecting parameter values and iteration counts, maintaining balance is crucial. A model that&#39;s distilled too much may lose its ability to recognize subtle details and context, defaulting to templated responses. While perfect balance is nearly impossible to achieve, careful monitoring of the distillation process can substantially improve the prediction quality of even modest neural network models.&lt;/p&gt;
&lt;p&gt;It is also worth monitoring the training process itself, which helps you spot problems early and correct them promptly. For this, you can use the &lt;a href=&quot;https://www.tensorflow.org/tensorboard&quot;&gt;TensorBoard&lt;/a&gt; tool. It integrates seamlessly into PyTorch projects and lets you visually evaluate many metrics, such as accuracy and loss. It can also build a model graph and track memory usage and the execution time of operations.&lt;/p&gt;
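&lt;p&gt;Logging from PyTorch takes only a few lines via torch.utils.tensorboard (the log directory and tag names below are arbitrary examples):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir=&quot;./runs/distillation&quot;)
# Inside the training loop, log any scalar you want to track:
for step in range(100):
    loss = 1.0 / (step + 1)  # placeholder value for illustration
    writer.add_scalar(&quot;train/loss&quot;, loss, step)
writer.close()
# Then run: tensorboard --logdir ./runs and open http://localhost:6006&lt;/code&gt;&lt;/pre&gt;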
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Knowledge distillation is an effective method for optimizing neural networks to improve compact models. It works best when balancing performance with answer quality is essential.&lt;/p&gt;
&lt;p&gt;Though knowledge distillation requires careful monitoring, its results can be remarkable. Models become much smaller while maintaining prediction quality, and they perform better with fewer computing resources.&lt;/p&gt;
&lt;p&gt;When planned well with appropriate parameters, knowledge distillation serves as a key tool for creating compact neural networks without sacrificing quality.&lt;/p&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/607-advantages-and-disadvantages-of-gpu-sharing&quot;&gt;Advantages and Disadvantages of GPU sharing&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/609-nvidia-rtx-50-expectations-and-reality&quot;&gt;NVIDIA® RTX™ 50: expectations vs reality&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/611-intel-habana-gaudi-2-install-and-test&quot;&gt;Intel Habana Gaudi 2: install and test&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/001/092/original/il_what_is_knowledge_distillation.png?1733302736"
        length="0"
        type="image/jpeg"/>
      <pubDate>Thu, 23 Jan 2025 13:21:29 +0100</pubDate>
      <guid isPermaLink="false">606</guid>
      <dc:date>2025-01-23 13:21:29 +0100</dc:date>
    </item>
    <item>
      <title>AudioCraft by MetaAI: create music by description</title>
      <link>https://www.leadergpu.com/catalog/604-audiocraft-by-metaai-create-music-by-description</link>
      <description>&lt;p&gt;Modern generative neural networks are becoming smarter. They are writing stories, engaging in conversations with people, and creating ultra-realistic images. Now, they can produce simple music tracks without the need for professional artists. This future has become a reality today. It’s expected, as musical harmonies and rhythms are rooted in mathematical principles.&lt;/p&gt;
&lt;p&gt;Meta has demonstrated its commitment to the world of open-source software by making three neural network models publicly available that enable the creation of sounds and music from text descriptions:&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;https://musicgen.com/&quot;&gt;MusicGen&lt;/a&gt; — generates music from text.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://audiocraft.metademolab.com/audiogen.html&quot;&gt;AudioGen&lt;/a&gt; — generates audio from text.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/facebookresearch/encodec&quot;&gt;EnCodec&lt;/a&gt; — high quality neural audio compressor.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;MusicGen was trained on 20,000 hours of music. You can run it locally using dedicated LeaderGPU servers as a platform.&lt;/p&gt;
&lt;h2&gt;Standard installation&lt;/h2&gt;
&lt;p&gt;Update the package cache and upgrade the installed packages:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update &amp;&amp; sudo apt -y upgrade&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Install the Python package manager, pip, and the ffmpeg libraries:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt -y install python3-pip ffmpeg&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Install torch 2.0 or newer using pip:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;pip install &#39;torch&gt;=2.0&#39;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The next command automatically installs &lt;b translate=&quot;no&quot;&gt;audiocraft&lt;/b&gt; and all necessary dependencies:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;pip install -U audiocraft&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let’s write a simple Python app, using the &lt;a href=&quot;https://huggingface.co/facebook/musicgen-large&quot;&gt;large pre-trained MusicGen model&lt;/a&gt; with 3.3B parameters:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;nano generate.py&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write
model = MusicGen.get_pretrained(&quot;facebook/musicgen-large&quot;)
model.set_generation_params(duration=30)  # generate a 30-second sample.
descriptions = [&quot;rock solo&quot;]
wav = model.generate(descriptions)  # generates one sample per description.
for idx, one_wav in enumerate(wav):
    # Will save under {idx}.wav, with loudness normalization at -14 db LUFS.
    audio_write(f&#39;{idx}&#39;, one_wav.cpu(), model.sample_rate, strategy=&quot;loudness&quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Execute the created app:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;python3 generate.py&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After a few seconds, the generated file (0.wav) will appear in the directory.&lt;/p&gt;
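&lt;p&gt;Since model.generate accepts a list, you can batch several prompts in a single call. Below is a small variation on the example above (the prompts and file names are arbitrary):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

model = MusicGen.get_pretrained(&quot;facebook/musicgen-large&quot;)
model.set_generation_params(duration=15)  # shorter 15-second samples
# One track is generated for each description in the list.
descriptions = [&quot;rock solo&quot;, &quot;ambient piano&quot;, &quot;8-bit chiptune&quot;]
wav = model.generate(descriptions)
for idx, one_wav in enumerate(wav):
    audio_write(f&#39;batch_{idx}&#39;, one_wav.cpu(), model.sample_rate, strategy=&quot;loudness&quot;)&lt;/code&gt;&lt;/pre&gt;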
&lt;h2&gt;Coffee Vampir 3&lt;/h2&gt;
&lt;p&gt;Clone a project repository:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git clone https://github.com/CoffeeVampir3/audiocraft-webui.git&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Open the cloned directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd audiocraft-webui&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Run the command that prepares your system and installs all the necessary packages:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;pip install -r requirements.txt&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then, run the Coffee Vampir 3 server with the following command:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;python3 webui.py&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Coffee Vampir 3 uses Flask as its web framework. By default, it listens on localhost, port 5000. If you want remote access, use the port forwarding feature of your SSH client or set up a VPN connection to the server.&lt;/p&gt;
&lt;p&gt;&lt;font color=&quot;red&quot;&gt;&lt;i&gt;Attention! This is a potentially dangerous action; use at your own risk:&lt;/i&gt;&lt;/font&gt;&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;nano webui.py&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Scroll down to the end and replace &lt;b translate=&quot;no&quot;&gt;socketio.run(app)&lt;/b&gt; with &lt;b translate=&quot;no&quot;&gt;socketio.run(app, host=&#39;0.0.0.0&#39;, port=5000)&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Save the file and run the server using the command above. This allows access to the server from the public internet without any authentication.&lt;/p&gt;
&lt;p&gt;Don’t forget to &lt;b translate=&quot;no&quot;&gt;disable AdBlock software&lt;/b&gt;, as it can block the music player on the right side of the webpage. You can start by entering the prompt and confirming with the &lt;b translate=&quot;no&quot;&gt;Submit&lt;/b&gt; button:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/902/original/sh_audiocraft_by_metaai_create_music_by_description_1.png?1713360831&quot; alt=&quot;Main page Audiocraft WebUI&quot;&gt;
&lt;h2&gt;TTS Generation WebUI&lt;/h2&gt;
&lt;h3&gt;Step 1. Drivers&lt;/h3&gt;
&lt;p&gt;Update the package cache and upgrade the installed packages:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update &amp;&amp; sudo apt -y upgrade&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Install NVIDIA® drivers using automatic installer or our guide &lt;a href=&quot;https://www.leadergpu.com/articles/499-install-nvidia-drivers-in-linux&quot;&gt;Install NVIDIA® drivers in Linux&lt;/a&gt;:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo ubuntu-drivers autoinstall&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Reboot the server:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo shutdown -r now&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Step 2. Docker&lt;/h3&gt;
&lt;p&gt;The next step is to install Docker. Let’s install a few packages that are needed to add the Docker repository:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt -y install apt-transport-https curl gnupg-agent ca-certificates software-properties-common&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Download the Docker GPG key and store it:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Add the repository:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo add-apt-repository &quot;deb [arch=amd64] https://download.docker.com/linux/ubuntu focal stable&quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Install &lt;b translate=&quot;no&quot;&gt;Docker CE&lt;/b&gt; (Community Edition) with CLI and the &lt;b translate=&quot;no&quot;&gt;containerd&lt;/b&gt; runtime:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt -y install docker-ce docker-ce-cli containerd.io&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Add the current user to the docker group:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo usermod -aG docker $USER&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Apply the change without logging out and back in:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;newgrp docker&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Step 3. GPU passthrough&lt;/h3&gt;
&lt;p&gt;Let’s enable NVIDIA® GPU passthrough in Docker. The following command reads the current OS version into the distribution variable, which we can use in the next step:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;distribution=$(. /etc/os-release;echo $ID$VERSION_ID)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Download the NVIDIA® repository GPG key and store it:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Download the NVIDIA® repository list and store it for use by the standard APT package manager:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Update the package cache and install the GPU passthrough toolkit:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt-get update &amp;&amp; sudo apt-get install -y nvidia-container-toolkit&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Restart the Docker daemon:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl restart docker&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Step 4. WebUI&lt;/h3&gt;
&lt;p&gt;Download the repository archive:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;wget https://github.com/rsxdalv/tts-generation-webui/archive/refs/heads/main.zip&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Unpack it:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;unzip main.zip&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Open the project’s directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd tts-generation-webui-main&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Start building the image:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;docker build -t rsxdalv/tts-generation-webui .&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Start a container from the built image:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;docker compose up -d&lt;/code&gt;&lt;/pre&gt;
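&lt;p&gt;You can check that the service started and follow its logs while the application initializes:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;docker compose ps
docker compose logs -f&lt;/code&gt;&lt;/pre&gt;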
&lt;p&gt;Now you can open &lt;b translate=&quot;no&quot;&gt;http://[server_ip]:7860&lt;/b&gt;, type your prompt, select the necessary model, and click the &lt;b translate=&quot;no&quot;&gt;Generate&lt;/b&gt; button:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/903/original/sh_audiocraft_by_metaai_create_music_by_description_2.png?1713360865&quot; alt=&quot;Audiocraft generated sound&quot;&gt;
&lt;p&gt;The system automatically downloads the selected model during the first generation. Enjoy!&lt;/p&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/594-stable-diffusion-riffusion&quot;&gt;Stable Diffusion: Riffusion&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/597-stable-video-diffusion&quot;&gt;Stable Video Diffusion&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/598-easy-diffusion-ui&quot;&gt;Easy Diffusion UI&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/001/117/original/il_audiocraft_by_metaai_create_music_by_description.png?1737557205"
        length="0"
        type="image/jpeg"/>
      <pubDate>Wed, 22 Jan 2025 15:51:35 +0100</pubDate>
      <guid isPermaLink="false">604</guid>
      <dc:date>2025-01-22 15:51:35 +0100</dc:date>
    </item>
    <item>
      <title>How to monitor LangFlow application</title>
      <link>https://www.leadergpu.com/catalog/602-how-to-monitor-langflow-application</link>
      <description>&lt;p&gt;In our article &lt;a href=&quot;https://www.leadergpu.com/articles/558-low-code-ai-app-builder-langflow&quot;&gt;Low-code AI app builder Langflow&lt;/a&gt; we explored how to get started with this low-code AI app builder’s visual programming environment. It enables anyone, even those without programming knowledge, to build applications powered by large neural network models. These could be AI chatbots or document processing applications that can analyze and summarize content.&lt;/p&gt;
&lt;p&gt;Langflow uses a building-block approach where users connect pre-made components to create their desired application. However, two key challenges often arise: troubleshooting when neural networks behave unexpectedly, and managing costs. Neural networks require substantial computing resources, making it essential to monitor and predict infrastructure expenses.&lt;/p&gt;
&lt;p&gt;LangWatch addresses both challenges. This specialized tool helps Langflow developers monitor user requests, track costs, and detect anomalies, such as when applications are used in unintended ways.&lt;/p&gt;
&lt;p&gt;This tool was originally designed as a service but can be deployed on any server, including locally. It integrates with most LLM providers, whether cloud-based or on-premise. Being open source, LangWatch can be adapted to almost any project: adding new features or connecting with internal systems.&lt;/p&gt;
&lt;p&gt;LangWatch lets you set up alerts when specific metrics exceed defined thresholds. This helps you quickly detect unexpected increases in request costs or unusual response delays. Early detection helps prevent unplanned expenses and potential service attacks.&lt;/p&gt;
&lt;p&gt;For neural network researchers, this application enables both monitoring and optimization of common user requests. It also provides tools to evaluate model response quality and make adjustments when needed.&lt;/p&gt;
&lt;h2&gt;Quick start&lt;/h2&gt;
&lt;h3&gt;System preparation&lt;/h3&gt;
&lt;p&gt;Like Langflow, the simplest way to run the application is through a Docker container. Before installing LangWatch, you’ll need to install Docker Engine on your server. First, update your package cache and the packages to their latest versions:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update &amp;&amp; sudo apt -y upgrade&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Install additional packages required by Docker:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt -y install apt-transport-https ca-certificates curl software-properties-common&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Download the GPG key to add the official Docker repository:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Add the repository to APT using the key you downloaded and installed earlier:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;echo &quot;deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable&quot; | sudo tee /etc/apt/sources.list.d/docker.list &gt; /dev/null&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Refresh the package list:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To ensure that Docker will be installed from the newly added repository and not from the system one, you can run the following command:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;apt-cache policy docker-ce&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Install Docker Engine:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt install docker-ce&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Verify that Docker has been installed successfully and the corresponding daemon is running and in the &lt;b translate=&quot;no&quot;&gt;active (running)&lt;/b&gt; status:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl status docker&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;● docker.service - Docker Application Container Engine
    Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset&gt;
    Active: active (running) since Mon 2024-11-18 08:26:35 UTC; 3h 27min ago
TriggeredBy: ● docker.socket
      Docs: https://docs.docker.com
  Main PID: 1842 (dockerd)
     Tasks: 29
    Memory: 1.8G
       CPU: 3min 15.715s
    CGroup: /system.slice/docker.service&lt;/pre&gt;
&lt;h3&gt;Build and run&lt;/h3&gt;
&lt;p&gt;With Docker Engine installed and running, you can download the LangWatch application repository:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git clone https://github.com/langwatch/langwatch&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The application includes a sample configuration file with environment variables. Copy this file so the image build utility can process it:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cp langwatch/.env.example langwatch/.env&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now you’re ready for the first launch:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo docker compose up --build&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The system will take some time to download all the necessary container layers for LangWatch. Once complete, you’ll see a console message indicating the application is available at:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;http://[LeaderGPU_IP_address]:3000&lt;/pre&gt;
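&lt;p&gt;The command above runs the stack in the foreground. Once you’ve confirmed it starts correctly, you can stop it with &lt;b translate=&quot;no&quot;&gt;Ctrl + C&lt;/b&gt; and relaunch it in the background with the standard detached flag:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo docker compose up -d&lt;/code&gt;&lt;/pre&gt;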
&lt;p&gt;Navigate to this page in your browser, where you’ll be prompted to create a user account:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/089/original/sh_how_to_monitor_langflow_application_1.png?1732712766&quot; alt=&quot;LangWatch login screen&quot;&gt;
&lt;p&gt;Unlike Langflow, this system has authentication enabled by default. After logging in, you’ll need to configure the system to collect data from your Langflow server.&lt;/p&gt;
&lt;h2&gt;Langflow integration&lt;/h2&gt;
&lt;p&gt;LangWatch needs a data source to function. The server listens on port 3000 and uses a RESTful API, which authenticates incoming data through an automatically generated API key.&lt;/p&gt;
&lt;p&gt;To enable data transfer, you’ll need to set two variables in the Langflow configuration files: &lt;b translate=&quot;no&quot;&gt;LANGWATCH_ENDPOINT&lt;/b&gt; and &lt;b translate=&quot;no&quot;&gt;LANGWATCH_API_KEY&lt;/b&gt;. First, establish an SSH connection to your Langflow server (the Langflow service should be stopped during this process).&lt;/p&gt;
&lt;p&gt;Navigate to the directory with the sample configuration for Docker:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd langflow/docker_example&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Open the configuration file for editing:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;nano docker-compose.yml&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In the “environment:” section, add the following variables (without brackets [] or quotation marks):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;- LANGWATCH_API_KEY=[YOUR_API_KEY]
- LANGWATCH_ENDPOINT=http://[IP_ADDRESS]:3000&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The YML file requires specific formatting. Follow these two key rules (you can validate the edited file afterwards, as shown below):&lt;/p&gt;
&lt;ol&gt;
    &lt;li&gt;Use spaces (2 or 4) for indentation, never tabs.&lt;/li&gt;
    &lt;li&gt;Maintain proper hierarchical structure with consistent indentation.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Save the file with &lt;b translate=&quot;no&quot;&gt;Ctrl + O&lt;/b&gt; and exit the editor with &lt;b translate=&quot;no&quot;&gt;Ctrl + X&lt;/b&gt;.&lt;/p&gt;
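&lt;p&gt;Before relaunching, you can ask Docker Compose to parse the edited file; if the indentation is broken, it prints an error instead of the resolved configuration:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;docker compose config&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Langflow is now ready to launch:&lt;/p&gt;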
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo docker compose up&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After launching, verify that everything works properly. Create a new project or open an existing one, then initiate a dialogue through Playground. Langflow will automatically send data to LangWatch for monitoring, which you can view in the web interface.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/090/original/sh_how_to_monitor_langflow_application_2.png?1732712788&quot; alt=&quot;LangWatch integration checks&quot;&gt;
&lt;p&gt;In the integration verification section, a check mark appears on the “Sync your first message” item. This indicates that data from Langflow is successfully flowing into LangWatch, confirming your setup is correct. Let’s examine what appears in the &lt;b translate=&quot;no&quot;&gt;Messages&lt;/b&gt; section:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/091/original/sh_how_to_monitor_langflow_application_3.png?1732712853&quot; alt=&quot;LangWatch messages lookup&quot;&gt;
&lt;p&gt;The Messages section displays the data entered into the application, the parameters used for response generation, and the neural network’s response itself. You can evaluate response quality and use various filters to sort through the data, even with hundreds or thousands of messages.&lt;/p&gt;
&lt;p&gt;After this initial setup, explore the application’s features systematically. In the &lt;b translate=&quot;no&quot;&gt;Evaluations&lt;/b&gt; section, you can set up dialogue verification algorithms for either dialogue moderation or data recognition, such as &lt;b translate=&quot;no&quot;&gt;PII Detection&lt;/b&gt;. This feature scans input for sensitive information like social security numbers or phone numbers.&lt;/p&gt;
&lt;p&gt;The application offers both local and cloud-based options through providers like Azure or Cloudflare. To use cloud features, you’ll need accounts with these services, along with their endpoint addresses and API keys. Keep in mind that these are third-party providers, so check their service costs directly.&lt;/p&gt;
&lt;p&gt;For local options, the application features sophisticated RAG (Retrieval-augmented generation) capabilities. You can measure the accuracy and relevance of RAG-generated content, and use the gathered statistics to optimize the RAG system for more accurate neural network responses.&lt;/p&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/601-low-code-ai-app-builder-langflow&quot;&gt;Low-code AI app builder Langflow&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/586-photogrammetry-with-meshroom&quot;&gt;Photogrammetry with Meshroom&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/588-blender-remote-rendering-with-flamenco&quot;&gt;Blender remote rendering with Flamenco&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/001/088/original/il_how_to_monitor_langflow_application.png?1732712732"
        length="0"
        type="image/jpeg"/>
      <pubDate>Wed, 22 Jan 2025 15:14:55 +0100</pubDate>
      <guid isPermaLink="false">602</guid>
      <dc:date>2025-01-22 15:14:55 +0100</dc:date>
    </item>
    <item>
      <title>Low-code AI app builder Langflow</title>
      <link>https://www.leadergpu.com/catalog/601-low-code-ai-app-builder-langflow</link>
      <description>&lt;p&gt;Software development has evolved dramatically in recent years. Modern programmers now have access to hundreds of programming languages and frameworks. Beyond traditional imperative and declarative approaches, a new and exciting method of creating applications is emerging. This innovative approach harnesses the power of neural networks, opening up fantastic possibilities for developers.&lt;/p&gt;
&lt;p&gt;People have grown accustomed to AI assistants in IDEs helping with code autocompletion and modern neural networks easily generating code for simple Python games. However, new hybrid tools are emerging that could revolutionize the development landscape. One such tool is Langflow.&lt;/p&gt;
&lt;p&gt;Langflow serves multiple purposes. For professional developers, it offers better control over complex systems like neural networks. For those unfamiliar with programming, it enables the creation of simple yet practical applications. These goals are achieved through different means, which we’ll explore in more detail.&lt;/p&gt;
&lt;h2&gt;Neural networks&lt;/h2&gt;
&lt;p&gt;The concept of a neural network can be simplified for users. Imagine a black box that receives input data and parameters influencing the final result. This box processes the input using complex algorithms, often referred to as “magic”, and produces output data that can be presented to the user.&lt;/p&gt;
&lt;p&gt;The inner workings of this black box vary based on the neural network’s design and training data. It’s crucial to understand that developers and users can never achieve 100% certainty in results. Unlike traditional programming where 2 + 2 always equals 4, a neural network might give this answer with 99% certainty, always maintaining a margin of error.&lt;/p&gt;
&lt;p&gt;Control over a neural network&#39;s &quot;thinking&quot; process is indirect. We can only adjust certain parameters, such as &quot;temperature.&quot; This parameter determines how creative or constrained the neural network can be in its approach. A low temperature value limits the network to a more formal, structured approach to tasks and solutions. Conversely, high temperature values grant the network more freedom, potentially leading to reliance on less reliable facts or even the creation of fictional information.&lt;/p&gt;
&lt;p&gt;This example illustrates how users can influence the final output. For traditional programming, this uncertainty poses a significant challenge - errors may appear unexpectedly, and specific results become unpredictable. However, this unpredictability is primarily a problem for computers, not for humans who can adapt to and interpret varying outputs.&lt;/p&gt;
&lt;p&gt;If a neural network’s output is intended for a human, the specific wording used to describe it is generally less important. Given the context, people can correctly interpret various results from the machine’s perspective. While concepts like “positive value”, “result achieved”, or “positive decision” might mean roughly the same thing to a person, traditional programming would struggle with this flexibility. It would need to account for all possible answer variations, which is nearly impossible.&lt;/p&gt;
&lt;p&gt;On the other hand, if further processing is handed off to another neural network, it can correctly understand and process the obtained result. Based on this, it can then form its own conclusion with a certain degree of confidence, as mentioned earlier.&lt;/p&gt;
&lt;h2&gt;Low-code&lt;/h2&gt;
&lt;p&gt;Most programming languages involve writing code. Programmers create the logic for each part of an application in their minds, then describe it using language-specific expressions. This process forms an algorithm: a clear sequence of actions leading to a specific, predetermined result. It’s a complex task requiring significant mental effort and a deep understanding of the language’s capabilities.&lt;/p&gt;
&lt;p&gt;However, there is no need to reinvent the wheel. Many problems faced by modern developers have already been solved in various ways. Relevant code snippets can often be &lt;a href=&quot;https://stackoverflow.com/&quot;&gt;found&lt;/a&gt; on StackOverflow. Modern programming can be likened to assembling a whole from parts of different construction sets. The Lego system offers a successful model, having standardized different sets of parts to ensure compatibility.&lt;/p&gt;
&lt;p&gt;The low-code programming method follows a similar principle. Various code pieces are modified to fit together seamlessly and are presented to developers as ready-made blocks. Each block can have data inputs and outputs. Documentation specifies the task each block type solves and the format in which it accepts or outputs data.&lt;/p&gt;
&lt;p&gt;By connecting these blocks in a specific sequence, developers can form an application’s algorithm and clearly visualize its operational logic. Perhaps the most well-known example of this programming method is the turtle graphics method, commonly used in educational settings to introduce programming concepts and develop algorithmic thinking.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/078/original/sh_low-code_ai_app_builder_langflow_1.png?1732099423&quot; alt=&quot;Turtle graphics&quot;&gt;
&lt;p&gt;The essence of this method is simple: drawing images on the screen using a virtual turtle that leaves a trail as it crawls across the canvas. Using ready-made blocks, such as moving a set number of pixels, turning at specific angles, or raising and lowering the pen, developers can create programs that draw their desired pictures. Creating applications using a low-code constructor is similar to turtle graphics, but it allows users to solve a wide range of problems, not just drawing on a canvas.&lt;/p&gt;
&lt;p&gt;This method was best implemented in IBM’s Node-RED programming tool. It was developed as a universal means of ensuring the joint operation of diverse devices, online services, and APIs. The equivalent of code snippets were nodes from the standard library (palette).&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/079/original/sh_low-code_ai_app_builder_langflow_2.png?1732099465&quot; alt=&quot;Node-RED canvas&quot;&gt;
&lt;p&gt;Node-RED’s capabilities can be expanded by installing add-ons or creating custom nodes that perform specific data actions. Developers place nodes from the palette onto the desktop and build relationships between them. This process creates the application’s logic, with visualization helping to maintain clarity.&lt;/p&gt;
&lt;p&gt;Adding neural networks to this concept yields an intriguing system. Instead of processing data with specific mathematical formulas, you can feed it into a neural network and specify the desired output. Although the input data may vary slightly each time, the results remain suitable for interpretation by humans or other neural networks.&lt;/p&gt;
&lt;h2&gt;Retrieval Augmented Generation (RAG)&lt;/h2&gt;
&lt;p&gt;The accuracy of data in large language models is a pressing concern. These models rely solely on knowledge gained during training, which depends on the relevance of the datasets used. Consequently, large language models may lack sufficient relevant data, potentially leading to incorrect results.&lt;/p&gt;
&lt;p&gt;To address this issue, data updating methods are necessary. Allowing neural networks to extract context from additional sources, such as websites, can significantly improve the quality of answers. This is precisely how RAG (Retrieval-Augmented Generation) works. Additional data is converted into vector representations and stored in a database.&lt;/p&gt;
&lt;p&gt;In operation, neural network models can convert user requests into vector representations and compare them with those stored in the database. When similar vectors are found, the data is extracted and used in forming a response. Vector databases are fast enough to support this scheme in real-time.&lt;/p&gt;
&lt;p&gt;For this system to function correctly, interaction between the user, the neural network model, external data sources, and the vector database must be established. Langflow simplifies this setup with its visual component - users simply build standard blocks and &quot;link&quot; them, creating a path for data flow.&lt;/p&gt;
&lt;p&gt;The first step is to populate the vector database with relevant sources. These can include files from a local computer or web pages from the Internet. Here&#39;s a simple example of loading data into the database:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/080/original/sh_low-code_ai_app_builder_langflow_3.png?1732099495&quot; alt=&quot;RAG data load&quot;&gt;
&lt;p&gt;Now that we have a vector database in addition to the trained LLM, we can incorporate it into the general scheme. When a user submits a request in the chat, it simultaneously forms a prompt and queries the vector database. If similar vectors are found, the extracted data is parsed and added as context to the formed prompt. The system then sends a request to the neural network and outputs the received response to the user in the chat.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/081/original/sh_low-code_ai_app_builder_langflow_4.png?1732099527&quot; alt=&quot;RAG scheme&quot;&gt;
&lt;p&gt;While the example mentions cloud services like OpenAI and AstraDB, you can use any compatible services, including those deployed locally on LeaderGPU servers. If you can&#39;t find the integration you need in the list of available blocks, you can either write it yourself or add one created by someone else.&lt;/p&gt;
&lt;h2&gt;Quick start&lt;/h2&gt;
&lt;h3&gt;System preparation&lt;/h3&gt;
&lt;p&gt;The simplest way to deploy Langflow is within a Docker container. To set up the server, begin by installing Docker Engine. Then, update both the package cache and the packages to their latest versions:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update &amp;&amp; sudo apt -y upgrade&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Install additional packages required by Docker:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt -y install apt-transport-https ca-certificates curl software-properties-common&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Download the GPG key to add the official Docker repository:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Add the repository to APT using the key you downloaded and installed earlier:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;echo &quot;deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable&quot; | sudo tee /etc/apt/sources.list.d/docker.list &gt; /dev/null&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Refresh the package list:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To ensure that Docker will be installed from the newly added repository and not from the system one, you can run the following command:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;apt-cache policy docker-ce&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Install Docker Engine:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt install docker-ce&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Verify that Docker has been installed successfully and the corresponding daemon is running and in the &lt;b translate=&quot;no&quot;&gt;active (running)&lt;/b&gt; status:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl status docker&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;● docker.service - Docker Application Container Engine
  Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset&gt;
  Active: active (running) since Mon 2024-11-18 08:26:35 UTC; 3h 27min ago
TriggeredBy: ● docker.socket
    Docs: https://docs.docker.com
Main PID: 1842 (dockerd)
   Tasks: 29
  Memory: 1.8G
     CPU: 3min 15.715s
  CGroup: /system.slice/docker.service
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Build and run&lt;/h3&gt;
&lt;p&gt;Everything is ready to build and run a Docker container with Langflow. However, there&#39;s one caveat: at the time of writing this guide, the latest version (tagged v1.1.0) has an error and won&#39;t start. To avoid this issue, we&#39;ll use the previous version, v1.0.19.post2, which works flawlessly right after download.&lt;/p&gt;
&lt;p&gt;The simplest approach is to download the project repository from GitHub:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git clone https://github.com/langflow-ai/langflow&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Navigate to the directory containing the sample deployment configuration:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd langflow/docker_example&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now you will need to do two things. First, change the release tag so that a working version (at the time of writing this guide) is built. Second, add simple authorization so that no one can use the system without knowing the login and password.&lt;/p&gt;
&lt;p&gt;Open the configuration file:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo nano docker-compose.yml&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Find the following line:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;image: langflowai/langflow:latest&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;and replace the &lt;b translate=&quot;no&quot;&gt;latest&lt;/b&gt; tag with a specific version:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;image: langflowai/langflow:v1.0.19.post2&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You also need to add three variables to the &lt;b translate=&quot;no&quot;&gt;environment&lt;/b&gt; section:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;  - LANGFLOW_AUTO_LOGIN=false
  - LANGFLOW_SUPERUSER=admin
  - LANGFLOW_SUPERUSER_PASSWORD=your_secure_password&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The first variable disables access to the web interface without authorization. The second adds the username that will receive system administrator rights. The third adds the corresponding password.&lt;/p&gt;
&lt;p&gt;If you plan to store the &lt;b translate=&quot;no&quot;&gt;docker-compose.yml&lt;/b&gt; file in a version control system, avoid writing the password directly in this file. Instead, create a separate file with a &lt;b translate=&quot;no&quot;&gt;.env&lt;/b&gt; extension in the same directory and store the variable value there.&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;LANGFLOW_SUPERUSER_PASSWORD=your_secure_password&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In the &lt;b translate=&quot;no&quot;&gt;docker-compose.yml&lt;/b&gt; file, you can now reference a variable instead of directly specifying a password:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;- LANGFLOW_SUPERUSER_PASSWORD=${LANGFLOW_SUPERUSER_PASSWORD}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To prevent accidentally exposing the &lt;b translate=&quot;no&quot;&gt;*.env&lt;/b&gt; file on GitHub, remember to add it to &lt;b translate=&quot;no&quot;&gt;.gitignore&lt;/b&gt;. This will keep your password reasonably secure from unwanted access.&lt;/p&gt;
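&lt;p&gt;Assuming you run the command from the project directory, appending the file name is enough:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;echo &quot;.env&quot; &gt;&gt; .gitignore&lt;/code&gt;&lt;/pre&gt;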
&lt;p&gt;Now, all that&#39;s left is to build our container and run it:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo docker compose up&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Open the web page at &lt;b translate=&quot;no&quot;&gt;http://[LeaderGPU_IP_address]:7860&lt;/b&gt;, and you&#39;ll see the authorization form:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/082/original/sh_low-code_ai_app_builder_langflow_5.png?1732099559&quot; alt=&quot;Login screen&quot;&gt;
&lt;p&gt;Once you enter your login and password, the system grants access to the web interface where you can create your own applications. For more in-depth guidance, we suggest consulting &lt;a href=&quot;https://docs.langflow.org/&quot;&gt;the official documentation&lt;/a&gt;. It provides details on various environment variables that allow easy customization of the system to suit your needs.&lt;/p&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/602-how-to-monitor-langflow-application&quot;&gt;How to monitor LangFlow application&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/586-photogrammetry-with-meshroom&quot;&gt;Photogrammetry with Meshroom&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/588-blender-remote-rendering-with-flamenco&quot;&gt;Blender remote rendering with Flamenco&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/001/077/original/il_low-code_ai_app_builder_langflow.png?1732099387"
        length="0"
        type="image/jpeg"/>
      <pubDate>Wed, 22 Jan 2025 15:11:30 +0100</pubDate>
      <guid isPermaLink="false">601</guid>
      <dc:date>2025-01-22 15:11:30 +0100</dc:date>
    </item>
    <item>
      <title>Easy Diffusion UI</title>
      <link>https://www.leadergpu.com/catalog/598-easy-diffusion-ui</link>
      <description>&lt;p&gt;Easy Diffusion UI is an open source software available for download on GitHub. Here’s how to install it on Ubuntu 22.04 LTS. If you’ve just rented a server, install the GPU drivers and extend your home directory. Then, download the latest release of Easy Diffusion UI:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;wget https://github.com/cmdr2/stable-diffusion-ui/releases/latest/download/Easy-Diffusion-Linux.zip&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Unpack the downloaded ZIP archive:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;unzip Easy-Diffusion-Linux.zip&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Change directory to easy-diffusion:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd easy-diffusion&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Start the installation:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;./start.sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is a script collection that automatically downloads and installs all necessary components. It also downloads the standard Stable Diffusion model in SafeTensors format. Once all downloads and installations are complete, the Easy Diffusion UI will launch automatically.&lt;/p&gt;
&lt;h2&gt;Usage&lt;/h2&gt;
&lt;p&gt;The previous article, &lt;a href=&quot;https://www.leadergpu.com/catalog/565-stable-diffusion-webui&quot;&gt;Stable Diffusion WebUI&lt;/a&gt;, outlines a method for accepting connections from the public internet and provides simple login and password authorization. In this case, we aim to demonstrate another universal method for forwarding ports through an SSH connection. We use PuTTY to establish a secure connection to the remote server. You can find more information about it in our guide &lt;a href=&quot;https://www.leadergpu.com/articles/488-connect-to-a-linux-server&quot;&gt;Connect to a Linux server&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;To choose which ports to forward, please open &lt;b translate=&quot;no&quot;&gt;Connection &gt; SSH &gt; Tunnels&lt;/b&gt; in the left option tree. Type &lt;b translate=&quot;no&quot;&gt;9000&lt;/b&gt; in the &lt;b translate=&quot;no&quot;&gt;Source Port&lt;/b&gt; field and &lt;b translate=&quot;no&quot;&gt;127.0.0.1:9000&lt;/b&gt; in the &lt;b translate=&quot;no&quot;&gt;Destination&lt;/b&gt; field. Then click the &lt;b translate=&quot;no&quot;&gt;Add&lt;/b&gt; button:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/823/original/sh_easy_diffusion_ui_1.png?1712299445&quot; alt=&quot;Port forwarding in PuTTY&quot;&gt;
&lt;p&gt;After that, you can return to &lt;b translate=&quot;no&quot;&gt;Session&lt;/b&gt; and save it for later use. Connect to the remote server as usual. Now, all data that you send or receive at port 9000 on the loopback address 127.0.0.1 will be redirected to the remote server. This method creates a virtual secure tunnel that exists as long as the connection does.&lt;/p&gt;
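&lt;p&gt;If you connect from a Linux or macOS machine instead of PuTTY, the same tunnel can be created with the standard OpenSSH &lt;b translate=&quot;no&quot;&gt;-L&lt;/b&gt; option (replace the username and server address with your own):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;ssh -L 9000:127.0.0.1:9000 usergpu@[server_ip]&lt;/code&gt;&lt;/pre&gt;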
&lt;p&gt;Once Easy Diffusion UI starts and port forwarding is on, you can open a web browser and navigate to the address &lt;a href=&quot;http://127.0.0.1:9000&quot;&gt;http://127.0.0.1:9000&lt;/a&gt;. We recommend downloading and installing custom models, as described in this article, instead of relying solely on the standard model to generate images. Don’t forget to increase the number of inference steps and adjust the desired image resolution (marked with asterisks).&lt;/p&gt;
&lt;p&gt;One of the major benefits of the Easy Diffusion UI is its support for multiple GPUs. When creating a batch of images, you can choose how many will be generated in parallel. For example, if you have a dual-GPU configuration:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/824/original/sh_easy_diffusion_ui_2.png?1712299546&quot; alt=&quot;Easy Diffusion UI change threads number&quot;&gt;
&lt;p&gt;You can monitor the GPUs’ load during the image generation process. Establish another SSH connection and execute a single command:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;watch -n 1 nvidia-smi&lt;/code&gt;&lt;/pre&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/825/original/sh_easy_diffusion_ui_3.png?1712299806&quot; alt=&quot;nvidia-smi two threads&quot;&gt;
&lt;p&gt;Easy Diffusion UI also simplifies prompt creation by providing numerous examples of image modifiers. You can mix them to achieve more accurate results:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/826/original/sh_easy_diffusion_ui_4.png?1712299873&quot; alt=&quot;Image modifiers&quot;&gt;
&lt;p&gt;It’s a good idea to explore &lt;a href=&quot;https://openart.ai/promptbook&quot;&gt;PromptBook by OpenArt&lt;/a&gt;. This guide can significantly enhance your prompt creation skills. With the Easy Diffusion UI, once an image is generated, you can download it, use it as an example for generating the next image, or make modifications with just one click:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/827/original/sh_easy_diffusion_ui_5.png?1712299912&quot; alt=&quot;Control elements&quot;&gt;
&lt;p&gt;The most common use of the &lt;b translate=&quot;no&quot;&gt;Upscale&lt;/b&gt; button is to increase an image’s resolution. The generative neural network uses the original image as a basis and adds additional pixels, thereby interpolating the source image to the desired size.&lt;/p&gt;
&lt;p&gt;When generating faces, issues may arise such as misaligned eyes, disproportionate sizes, or malformed parts. Fortunately, these problems can be solved using the &lt;b translate=&quot;no&quot;&gt;Fix Faces&lt;/b&gt; button. Additionally, negative prompts may be utilized to prevent incorrect faces from being generated.&lt;/p&gt;
&lt;h2&gt;Uninstall&lt;/h2&gt;
&lt;p&gt;All files, scripts, libraries, and models are stored in a single directory. If you want to remove Easy Diffusion UI from your server, just delete this directory along with all the content:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo rm -rf easy-diffusion&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/565-stable-diffusion-webui&quot;&gt;Stable Diffusion WebUI&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/584-open-webui-all-in-one&quot;&gt;Open WebUI: All in one&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/590-fooocus-rethinking-of-sd-and-mj&quot;&gt;Fooocus: Rethinking of SD and MJ&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/000/822/original/il_easy_diffusion_ui.jpg?1712299313"
        length="0"
        type="image/jpeg"/>
      <pubDate>Wed, 22 Jan 2025 12:13:37 +0100</pubDate>
      <guid isPermaLink="false">598</guid>
      <dc:date>2025-01-22 12:13:37 +0100</dc:date>
    </item>
    <item>
      <title>Stable Video Diffusion</title>
      <link>https://www.leadergpu.com/catalog/597-stable-video-diffusion</link>
      <description>&lt;p&gt;Generative neural networks can create various types of content. Stable Diffusion was created to generate images from text description. However, it can also be used to create music, sounds, and even videos. Today, we’ll show you how to create short videos from a single image using Stable Diffusion with WebUI and ComfyUI.&lt;/p&gt;
&lt;h2&gt;Install Stable Diffusion&lt;/h2&gt;
&lt;p&gt;Let’s begin by installing Stable Diffusion using our &lt;a href=&quot;https://www.leadergpu.com/articles/506-stable-diffusion-webui&quot;&gt;step-by-step guide&lt;/a&gt;. After installation, interrupt the &lt;b translate=&quot;no&quot;&gt;webui.sh&lt;/b&gt; script by pressing &lt;b translate=&quot;no&quot;&gt;Ctrl + C&lt;/b&gt; and close the SSH connection. WebUI doesn’t allow installing extensions while the --listen (or --share) option is enabled. This means you need to set up port forwarding (ports 7860 and 8189) from your local machine to the remote server. The first port is needed for WebUI and the second for ComfyUI.&lt;/p&gt;
&lt;p&gt;For example, in PuTTY, you need to open &lt;b translate=&quot;no&quot;&gt;Connection &gt;&gt; SSH &gt;&gt; Tunnels&lt;/b&gt; and add two new forwarded ports as shown in the following screenshot:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/954/original/sh_stable_video_diffusion_1.png?1714024360&quot; alt=&quot;PuTTY port forwarding&quot;&gt;
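&lt;p&gt;If you prefer OpenSSH over PuTTY, both tunnels can be created in a single command (replace the username and server address with your own):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;ssh -L 7860:127.0.0.1:7860 -L 8189:127.0.0.1:8189 usergpu@[server_ip]&lt;/code&gt;&lt;/pre&gt;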
&lt;p&gt;Now, you can reconnect to the remote server and run ./webui.sh again.&lt;/p&gt;
&lt;p&gt;Open this URL in your browser:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;http://127.0.0.1:7860&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Navigate to &lt;b translate=&quot;no&quot;&gt;Extensions &gt;&gt; Available&lt;/b&gt;, then click on the &lt;b translate=&quot;no&quot;&gt;Load from:&lt;/b&gt; button:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/955/original/sh_stable_video_diffusion_2.png?1714024393&quot; alt=&quot;Load available extensions&quot;&gt;
&lt;p&gt;The system will download the JSON file with all available extensions. Type &lt;b translate=&quot;no&quot;&gt;ComfyUI&lt;/b&gt; in the search input box and click the &lt;b translate=&quot;no&quot;&gt;Install&lt;/b&gt; button:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/956/original/sh_stable_video_diffusion_3.png?1714024430&quot; alt=&quot;Download ComfyUI&quot;&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/957/original/sh_stable_video_diffusion_4.png?1714024463&quot; alt=&quot;Reload UI&quot;&gt;
&lt;p&gt;The web page will reload, and a new &lt;b translate=&quot;no&quot;&gt;ComfyUI&lt;/b&gt; tab will appear in the main panel. Switch to it and click &lt;b translate=&quot;no&quot;&gt;Install ComfyUI&lt;/b&gt;:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/958/original/sh_stable_video_diffusion_5.png?1714024493&quot; alt=&quot;Install ComfyUI&quot;&gt;
&lt;p&gt;When the installation is finished, interrupt the execution of the webui.sh script again by pressing &lt;b translate=&quot;no&quot;&gt;Ctrl + C&lt;/b&gt;.&lt;/p&gt;
&lt;h2&gt;Install Stable Video Diffusion model&lt;/h2&gt;
&lt;p&gt;Open the models directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd stable-diffusion-webui/models/Stable-diffusion/&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Download the full Stable Video Diffusion model:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -L https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt/resolve/main/svd_xt.safetensors?download=true --output svd_xt.safetensors&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Return to the home directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd ~/&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And run the Stable Diffusion service again:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;./webui.sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Download the &lt;a href=&quot;https://github.com/enikolair/comfyui-workflow-svd/blob/main/workflow.json&quot;&gt;example&lt;/a&gt; of the Stable Video Diffusion workflow in JSON format. Erase the ComfyUI default workflow by pressing &lt;b translate=&quot;no&quot;&gt;Clear&lt;/b&gt;, then &lt;b translate=&quot;no&quot;&gt;Load&lt;/b&gt; the downloaded example:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/959/original/sh_stable_video_diffusion_6.png?1714024532&quot; alt=&quot;ComfyUI workflow example&quot;&gt;
&lt;p&gt;Ensure that you have the correct model selected in the &lt;b translate=&quot;no&quot;&gt;Image Only Checkpoint Loader (img2vid model)&lt;/b&gt; node:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/960/original/sh_stable_video_diffusion_7.png?1714024570&quot; alt=&quot;Select CKPT model&quot;&gt;
&lt;p&gt;Click on the &lt;b translate=&quot;no&quot;&gt;choose file to upload&lt;/b&gt; button in the &lt;b translate=&quot;no&quot;&gt;Load Image&lt;/b&gt; node and select any single image that generative neural network will transform into a video:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/961/original/sh_stable_video_diffusion_8.png?1714024606&quot; alt=&quot;Upload an image to ComfyUI&quot;&gt;
&lt;p&gt;Try generating a video with all default parameters by clicking the &lt;b translate=&quot;no&quot;&gt;Queue Prompt&lt;/b&gt; button:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/962/original/sh_stable_video_diffusion_9.png?1714024638&quot; alt=&quot;Send task to queue&quot;&gt;
&lt;p&gt;After the process is completed, you’ll get your video in WEBP format in the &lt;b translate=&quot;no&quot;&gt;SaveAnimatedWEBP&lt;/b&gt; node. Right-click on the generated video and choose &lt;b translate=&quot;no&quot;&gt;Save Image&lt;/b&gt;:&lt;/p&gt;
&lt;p&gt;Here is the &lt;a href=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/963/original/sh_stable_video_diffusion_10.gif?1714024668&quot;&gt;final result GIF&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Troubleshooting&lt;/h2&gt;
&lt;p&gt;If you get an error message: &lt;b translate=&quot;no&quot;&gt;ModuleNotFoundError: No module named &#39;utils.json_util&#39;; &#39;utils&#39; is not a package&lt;/b&gt;, please follow these steps:&lt;/p&gt;
&lt;p&gt;Rename the utils directory to utilities:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;mv /home/usergpu/stable-diffusion-webui/extensions/sd-webui-comfyui/ComfyUI/utils /home/usergpu/stable-diffusion-webui/extensions/sd-webui-comfyui/ComfyUI/utilities&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Edit &lt;b translate=&quot;no&quot;&gt;custom_node_manager.py&lt;/b&gt;:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;nano /home/usergpu/stable-diffusion-webui/extensions/sd-webui-comfyui/ComfyUI/app/custom_node_manager.py&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Replace this line:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;from utils.json_util import merge_json_recursive&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;with:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;from utilities.json_util import merge_json_recursive&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Save the file (&lt;b translate=&quot;no&quot;&gt;Ctrl + O&lt;/b&gt;) and exit the editor (&lt;b translate=&quot;no&quot;&gt;Ctrl + X&lt;/b&gt;). Then edit &lt;b translate=&quot;no&quot;&gt;main.py&lt;/b&gt;:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;nano /home/usergpu/stable-diffusion-webui/extensions/sd-webui-comfyui/ComfyUI/main.py&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Replace this line:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;import utils.extra_config&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;with:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot;&gt;import utilities.extra_config&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Save the file, exit the editor, and run the Stable Diffusion service again:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;./webui.sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/594-stable-diffusion-riffusion&quot;&gt;Stable Diffusion: Riffusion&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/598-easy-diffusion-ui&quot;&gt;Easy Diffusion UI&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/604-audiocraft-by-metaai-create-music-by-description&quot;&gt;AudioCraft by MetaAI: create music by description&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/000/953/original/il_stable_video_diffusion.png?1714024295"
        length="0"
        type="image/jpeg"/>
      <pubDate>Wed, 22 Jan 2025 11:53:04 +0100</pubDate>
      <guid isPermaLink="false">597</guid>
      <dc:date>2025-01-22 11:53:04 +0100</dc:date>
    </item>
    <item>
      <title>PyTorch for Windows</title>
      <link>https://www.leadergpu.com/catalog/596-pytorch-for-windows</link>
      <description>&lt;p&gt;Before you begin installing PyTorch, you need to install the Python interpreter and Microsoft Visual C++ Redistributable. Open a web-browser and navigate to Python’s &lt;a href=&quot;https://www.python.org/downloads/windows/&quot;&gt;download page&lt;/a&gt;. Find the latest Python 3 release and click on the link:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/828/original/sh_pytorch_for_windows_1.png?1712305722&quot; alt=&quot;Download Python release&quot;&gt;
&lt;p&gt;Then scroll down the page and click on &lt;b translate=&quot;no&quot;&gt;Windows Installer (64-bit)&lt;/b&gt;:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/829/original/sh_pytorch_for_windows_2.png?1712305818&quot; alt=&quot;Select binary&quot;&gt;
&lt;p&gt;Open the downloaded file to proceed with installation:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/830/original/sh_pytorch_for_windows_3.png?1712306001&quot; alt=&quot;Run the installer&quot;&gt;
&lt;p&gt;Check the box for &lt;b translate=&quot;no&quot;&gt;Add python.exe to PATH&lt;/b&gt; and click on &lt;b translate=&quot;no&quot;&gt;Install Now&lt;/b&gt;:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/831/original/sh_pytorch_for_windows_4.png?1712306095&quot; alt=&quot;Select Install Now and Add to PATH&quot;&gt;
&lt;p&gt;Wait a minute for the installation process to complete:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/832/original/sh_pytorch_for_windows_5.png?1712314249&quot; alt=&quot;Python setup process&quot;&gt;
&lt;p&gt;You can optionally &lt;b translate=&quot;no&quot;&gt;Disable path length limit&lt;/b&gt; if you plan to use long names that could exceed the &lt;b translate=&quot;no&quot;&gt;MAX_PATH&lt;/b&gt; limits:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/833/original/sh_pytorch_for_windows_6.png?1712314332&quot; alt=&quot;Python setup complete&quot;&gt;
&lt;h2&gt;Install MS Visual C++&lt;/h2&gt;
&lt;p&gt;Next, download Microsoft Visual C++ Redistributable using &lt;a href=&quot;https://aka.ms/vs/16/release/vc_redist.x64.exe&quot;&gt;this link&lt;/a&gt; and click on the installer:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/834/original/sh_pytorch_for_windows_7.png?1712314944&quot; alt=&quot;Run Microsoft visual C++ redistributable installer&quot;&gt;
&lt;p&gt;You must tick the &lt;b translate=&quot;no&quot;&gt;I agree to the license terms and conditions&lt;/b&gt; box and click the &lt;b translate=&quot;no&quot;&gt;Install&lt;/b&gt; button:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/835/original/sh_pytorch_for_windows_8.png?1712315044&quot; alt=&quot;Visual C++ accept EULA&quot;&gt;
&lt;p&gt;After a few seconds, this software will be installed and you can &lt;b translate=&quot;no&quot;&gt;Close&lt;/b&gt; the installer:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/836/original/sh_pytorch_for_windows_9.png?1712315122&quot; alt=&quot;Visual C++ installation complete&quot;&gt;
&lt;p&gt;Now, everything is ready for PyTorch installation. Click the &lt;b translate=&quot;no&quot;&gt;Start&lt;/b&gt; button and type &lt;b translate=&quot;no&quot;&gt;cmd&lt;/b&gt; on the keyboard. Right-click on &lt;b translate=&quot;no&quot;&gt;Command Prompt&lt;/b&gt; and select &lt;b translate=&quot;no&quot;&gt;Run as administrator&lt;/b&gt; from the context menu:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/837/original/sh_pytorch_for_windows_10.png?1712315294&quot; alt=&quot;PyTorch install using PIP&quot;&gt;
&lt;h2&gt;Install PyTorch&lt;/h2&gt;
&lt;p&gt;Execute the following command:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;pip install torch torchvision&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you want to install a specific version of PyTorch, you can specify it during the installation:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;pip install torch==1.9.0 torchvision==0.10.0&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;When the installation is complete, let’s check that PyTorch is working properly. Execute the following command to open the Python interpreter:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;python&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Type these two lines, pressing the &lt;b translate=&quot;no&quot;&gt;Enter&lt;/b&gt; key after each:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;import torch
print(torch.__version__)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you get a result like this, it means that PyTorch was installed correctly:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;2.0.1+cu117&lt;/pre&gt;
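&lt;p&gt;On a GPU server, you may also want to confirm that PyTorch can access the GPU. Note that the default PyPI wheels for Windows may be CPU-only; CUDA-enabled builds are distributed through PyTorch’s own package index (see the selector on the official PyTorch website). A quick check from the command prompt:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;python -c &quot;import torch; print(torch.cuda.is_available())&quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If the command prints &lt;b translate=&quot;no&quot;&gt;True&lt;/b&gt;, PyTorch can use the GPU.&lt;/p&gt;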
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/595-pytorch-for-linux&quot;&gt;PyTorch for Linux&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/581-privategpt-ai-for-documents&quot;&gt;PrivateGPT: AI for documents&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/584-open-webui-all-in-one&quot;&gt;Open WebUI: All in one&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/001/114/original/il_pytorch_for_windows.png?1737541812"
        length="0"
        type="image/jpeg"/>
      <pubDate>Wed, 22 Jan 2025 11:35:30 +0100</pubDate>
      <guid isPermaLink="false">596</guid>
      <dc:date>2025-01-22 11:35:30 +0100</dc:date>
    </item>
    <item>
      <title>PyTorch for Linux</title>
      <link>https://www.leadergpu.com/catalog/595-pytorch-for-linux</link>
      <description>&lt;p&gt;Modern Linux distributions are highly dependent on the installed version of Python. Therefore, before installing PyTorch, we recommend creating a virtual environment using our step-by-step guide &lt;a href=&quot;https://www.leadergpu.com/articles/510-linux-system-utilities&quot;&gt;Linux system utilities&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Activate the created venv and proceed with the pip3 upgrade:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-venv&quot;&gt;pip3 install --upgrade pip&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Start the PyTorch installation:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-venv&quot;&gt;pip3 install torch torchvision&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you want to install a specific version of PyTorch, just type the required version number:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-venv&quot;&gt;pip3 install torch==1.9.0 torchvision==0.10.0&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;When the installation is finished, let’s check that PyTorch was installed correctly. Open the Python interpreter:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-venv&quot;&gt;python3&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Type these two lines, pressing the &lt;b translate=&quot;no&quot;&gt;Enter&lt;/b&gt; key after each:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;import torch
print(torch.__version__)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you get a result like this, it means that PyTorch has been installed correctly:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;2.0.1+cu117&lt;/pre&gt;
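&lt;p&gt;As a final smoke test, you can run a small computation. The snippet below is a minimal sketch: it picks the GPU when one is available and falls back to the CPU otherwise:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;import torch

# Use the GPU if available, otherwise fall back to the CPU
device = &quot;cuda&quot; if torch.cuda.is_available() else &quot;cpu&quot;

# Run a small matrix multiplication on the selected device
x = torch.rand(3, 3, device=device)
print((x @ x).device)
&lt;/code&gt;&lt;/pre&gt;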
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/596-pytorch-for-windows&quot;&gt;PyTorch for Windows&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/601-low-code-ai-app-builder-langflow&quot;&gt;Low-code AI app builder Langflow&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/604-audiocraft-by-metaai-create-music-by-description&quot;&gt;AudioCraft by MetaAI: create music by description&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/001/113/original/il_pytorch_for_linux.png?1737536957"
        length="0"
        type="image/jpeg"/>
      <pubDate>Wed, 22 Jan 2025 10:14:16 +0100</pubDate>
      <guid isPermaLink="false">595</guid>
      <dc:date>2025-01-22 10:14:16 +0100</dc:date>
    </item>
    <item>
      <title>Stable Diffusion: Riffusion</title>
      <link>https://www.leadergpu.com/catalog/594-stable-diffusion-riffusion</link>
      <description>&lt;p&gt;In our previous articles, we explored the fascinating capabilities of Stable Diffusion for generating captivating images. However, it’s important to note that this powerful generative neural network has even more to offer.&lt;/p&gt;
&lt;p&gt;Riffusion is a Stable Diffusion model for music creation and editing. With Riffusion, you can generate a spectrogram of a desired musical segment and effortlessly transform it into a musical excerpt. Let’s install Riffusion on a LeaderGPU server and try it in action.&lt;/p&gt;
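&lt;p&gt;To illustrate the core idea (a spectrogram image is treated as audio data and inverted back into a waveform), here is a minimal, self-contained sketch based on the Griffin-Lim algorithm from torchaudio. This is not Riffusion’s own code; the file names and STFT parameters are assumptions for illustration:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;import numpy as np
import torch
import torchaudio
from PIL import Image

# Load a generated spectrogram image as a magnitude matrix
# (assumed to be 256 pixels tall, matching n_fft // 2 + 1 below)
img = Image.open(&quot;spectrogram.png&quot;).convert(&quot;L&quot;)
mag = torch.from_numpy(np.array(img, dtype=np.float32) / 255.0)

# Reconstruct a waveform from the magnitudes with Griffin-Lim
griffin_lim = torchaudio.transforms.GriffinLim(n_fft=510, hop_length=256, power=1.0)
waveform = griffin_lim(mag.unsqueeze(0))

torchaudio.save(&quot;excerpt.wav&quot;, waveform, sample_rate=44100)
&lt;/code&gt;&lt;/pre&gt;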
&lt;h2&gt;Prerequisites&lt;/h2&gt;
&lt;p&gt;Start by updating the package repository cache and the installed packages:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update &amp;&amp; sudo apt -y upgrade&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Don’t forget to install NVIDIA® drivers using the &lt;b translate=&quot;no&quot;&gt;autoinstall&lt;/b&gt; command or manually, using our &lt;a href=&quot;https://www.leadergpu.com/articles/499-install-nvidia-drivers-in-linux&quot;&gt;step-by-step&lt;/a&gt; guide:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo ubuntu-drivers autoinstall&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Reboot the server:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo shutdown -r now&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To create a virtual environment, the developers suggest using Anaconda. You can also use venv, which we covered in the Linux system utilities tutorial. Download the Anaconda installation script using curl:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl --output anaconda.sh https://repo.anaconda.com/archive/Anaconda3-5.3.1-Linux-x86_64.sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Make it executable:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;chmod +x anaconda.sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And run:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;./anaconda.sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Answer YES to every question except the last one (installing Microsoft VSCode). Then re-login to the SSH console and create a new virtual environment with Python 3.9:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;conda create --name riffusion python=3.9&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Activate the new virtual environment:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;conda activate riffusion&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you want to use audio formats other than WAV, you also need to install the FFmpeg library set:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;conda install -c conda-forge ffmpeg&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Install Riffusion&lt;/h2&gt;
&lt;p&gt;Clone the Riffusion repository:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git clone https://github.com/riffusion/riffusion.git&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Open the downloaded directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd riffusion&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let’s make a few changes to the requirements file to prevent torch compatibility errors:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;nano requirements.txt&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Find these packages and pin their versions:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;diffusers==0.9.0
torchaudio==2.0.1&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Save changes and proceed with preparing a virtual environment. The following command installs all necessary packages:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;python -m pip install -r requirements.txt&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Finally, you can open a “playground”. This is a simple web interface that helps you learn more about Riffusion’s features:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;python -m riffusion.streamlit.playground&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Open your favorite browser and enter the address &lt;b translate=&quot;no&quot;&gt;http://[SERVER_IP]:8501/&lt;/b&gt;&lt;/p&gt;
&lt;h2&gt;Test a playground&lt;/h2&gt;
&lt;p&gt;Now, you can create music using text prompts and by changing the other parameters:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/913/original/sh_stable_diffusion_riffusion_1.png?1713769543&quot; alt=&quot;Text to audio prompt line&quot;&gt;
&lt;p&gt;You can also do trickier things, like splitting audio into separate components. For example, you can extract the vocals from Bohemian Rhapsody by Queen:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/914/original/sh_stable_diffusion_riffusion_2.png?1713769583&quot; alt=&quot;Generated results&quot;&gt;
&lt;p&gt;Remember, this is just one example of how Riffusion can be used. By building your own application, you can achieve far more captivating results. Powerful LeaderGPU servers will take care of the computation.&lt;/p&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/597-stable-video-diffusion&quot;&gt;Stable Video Diffusion&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/598-easy-diffusion-ui&quot;&gt;Easy Diffusion UI&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/604-audiocraft-by-metaai-create-music-by-description&quot;&gt;AudioCraft by MetaAI: create music by description&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/000/912/original/il_stable_diffusion_riffusion.png?1713769486"
        length="0"
        type="image/jpeg"/>
      <pubDate>Tue, 21 Jan 2025 14:12:29 +0100</pubDate>
      <guid isPermaLink="false">594</guid>
      <dc:date>2025-01-21 14:12:29 +0100</dc:date>
    </item>
    <item>
      <title>Stable Diffusion: Generate repeatable faces</title>
      <link>https://www.leadergpu.com/catalog/593-stable-diffusion-generate-repeatable-faces</link>
<description>&lt;p&gt;Repeatability is the most important aspect of creating graphical content with generative neural networks. This holds true regardless of the type of content you create, be it a film or game character, a landscape, or a scene environment. The main problem can be formulated as: “How can I repeat my result?”. Every time you start generating images with the same positive and negative prompts, you get different results. Sometimes the differences are minor and acceptable, but in most cases they pose a problem.&lt;/p&gt;
&lt;p&gt;Stable Diffusion was trained on a large dataset captured from the real world, which explains why repeatability isn’t this neural network’s strong point. However, this rule doesn’t apply to celebrity photos. They appear far more frequently in the real world and, therefore, in the dataset on which Stable Diffusion was trained. You can use such photos as a “constant” or a “starting point” in the generation process.&lt;/p&gt;
&lt;h2&gt;Method 1. “Shaken, not stirred”&lt;/h2&gt;
&lt;p&gt;Of course, you don’t need to create only celebrity images, but you can use multiple relevant prompts to get more or less consistent results. For example, we can take two famous Greek singers: Elena Paparizou and Marina Satti, and get repeatable results:&lt;/p&gt;
&lt;p&gt;&lt;b translate=&quot;no&quot;&gt;Model&lt;/b&gt;: &lt;a href=&quot;https://civitai.com/models/4201/realistic-vision-v60-b1&quot;&gt;Realistic Vision v6.0 beta 1&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;b translate=&quot;no&quot;&gt;Positive prompts:&lt;/b&gt;&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;Elena Paparizou, Marina Satti, fashion portrait, alone, solo, greek woman in beautiful clothes, natural skin, 8k uhd, high quality, film grain, Canon EOS&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;b translate=&quot;no&quot;&gt;Negative prompts:&lt;/b&gt;&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;bad anatomy, bad hands, three hands, three legs, bad arms, missing legs, missing arms, poorly drawn face, bad face, fused face, cloned face, worst face, three crus, extra crus, fused crus, worst feet, three feet, fused feet, fused thigh, three thigh, fused thigh, extra thigh, worst thigh, missing fingers, extra fingers, ugly fingers, long fingers, horn, extra eyes, huge eyes, 2girl, amputation, disconnected limbs, cartoon, cg, 3d, unreal, animate, nsfw, nude, censored&lt;/code&gt;&lt;/pre&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/934/original/sh_stable_diffusion_generate_repeatable_faces_1.png?1713873195&quot; alt=&quot;Greek singer generated&quot;&gt;
&lt;p&gt;This works with any celebrities, as Stable Diffusion tries to reproduce their most prominent facial features. Here, we use the same model and “shake” two Hollywood stars (Dwayne Johnson and Danny Trejo) into one new synthetic character.&lt;/p&gt;
&lt;p&gt;&lt;b translate=&quot;no&quot;&gt;Positive prompts:&lt;/b&gt;&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;Dwayne Johnson, Danny Trejo, fashion portrait, alone, solo, 8k uhd, high quality, film grain, Canon EOS&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;b translate=&quot;no&quot;&gt;Negative prompts:&lt;/b&gt;&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;bad anatomy, bad hands, three hands, three legs, bad arms, missing legs, missing arms, poorly drawn face, bad face, fused face, cloned face, worst face, three crus, extra crus, fused crus, worst feet, three feet, fused feet, fused thigh, three thigh, fused thigh, extra thigh, worst thigh, missing fingers, extra fingers, ugly fingers, long fingers, horn, extra eyes, huge eyes, amputation, disconnected limbs, cartoon, cg, 3d, unreal, animate, nsfw, nude, censored&lt;/code&gt;&lt;/pre&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/935/original/sh_stable_diffusion_generate_repeatable_faces_2.png?1713873232&quot; alt=&quot;Hollywood stars generated&quot;&gt;
&lt;p&gt;Every time you mix the same celebrities, you get similar results. Let’s look at another method to generate repeatable characters.&lt;/p&gt;
&lt;h2&gt;Method 2. Name anchor&lt;/h2&gt;
&lt;p&gt;Celebrities are a good start, but let’s consider other methods for achieving repeatable results. The answer is quite simple: we can use a combination of human names. Every nation has unique names rooted in its language. For example, the Greek name Kostas can translate to “labor” or “effort”, while Nikos means “victory of the people”. Together, two such names create a unique image of a generated person, helping the neural network model understand our creative objectives.&lt;/p&gt;
&lt;p&gt;&lt;b translate=&quot;no&quot;&gt;Positive prompts:&lt;/b&gt;&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;Portrait of [Kostas | Nikos] on a white background, greek man, short haircut, beard&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;b translate=&quot;no&quot;&gt;Negative prompts:&lt;/b&gt;&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;woman, bad anatomy, bad hands, three hands, three legs, bad arms, missing legs, missing arms, poorly drawn face, bad face, fused face, cloned face, worst face, three crus, extra crus, fused crus, worst feet, three feet, fused feet, fused thigh, three thigh, fused thigh, extra thigh, worst thigh, missing fingers, extra fingers, ugly fingers, long fingers, horn, extra eyes, huge eyes, 2girl, amputation, disconnected limbs, cartoon, cg, 3d, unreal, animate, nsfw, nude, censored&lt;/code&gt;&lt;/pre&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/936/original/sh_stable_diffusion_generate_repeatable_faces_3.png?1713873262&quot; alt=&quot;Greek person generated&quot;&gt;
&lt;p&gt;Let’s generate a large number of images (80-100) to build a dataset from later. The main prompt was chosen to produce convenient images whose backgrounds can be easily removed. The negative prompts keep random distorted images, as well as images of women, out of the dataset.&lt;/p&gt;
&lt;p&gt;&lt;i&gt;Tip: if the generated images differ too much from one another, try raising the CFG Scale parameter from 7.5 to 15. This forces the neural network to follow the prompts more strictly.&lt;/i&gt;&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/937/original/sh_stable_diffusion_generate_repeatable_faces_4.png?1713873299&quot; alt=&quot;Greek person dataset&quot;&gt;
&lt;p&gt;You can select your own unique names with a simple name generator, like &lt;a href=&quot;https://www.behindthename.com/names/list&quot;&gt;Behind the Name&lt;/a&gt;. Also, you can use the ControlNet feature to gain more control.&lt;/p&gt;
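&lt;p&gt;If you script generation with the diffusers library instead of a web UI, fixing the random seed is another way to make results repeatable. A minimal sketch, assuming a hypothetical base model ID and shortened prompts:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    &quot;runwayml/stable-diffusion-v1-5&quot;, torch_dtype=torch.float16
).to(&quot;cuda&quot;)

# The same seed with the same prompts and settings yields the same image
generator = torch.Generator(&quot;cuda&quot;).manual_seed(42)
image = pipe(
    &quot;Portrait of a greek man on a white background, short haircut, beard&quot;,
    negative_prompt=&quot;woman, bad anatomy, bad hands&quot;,
    guidance_scale=15.0,  # higher CFG Scale forces stricter prompt adherence
    generator=generator,
).images[0]
image.save(&quot;portrait.png&quot;)
&lt;/code&gt;&lt;/pre&gt;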
&lt;h2&gt;Method 3. Teach appearance&lt;/h2&gt;
&lt;p&gt;We can’t directly influence the final result, but we observe that some tokens (such as celebrity image tokens) carry more weight than others. This means we can create our conditional “celebrity” token by creating an appropriate prompt for it and further training the model on it. This is how LoRA (Low-Rank Adaptation of Large Language Models) operates. You can use &lt;a href=&quot;https://www.leadergpu.com/articles/546-stable-diffusion-lora-selfie&quot;&gt;our step-by-step guide&lt;/a&gt; to train your own LoRA model based on a self-made dataset.&lt;/p&gt;
&lt;p&gt;After removing the background, we obtain clear portraits and use them to create a specific LoRA model. This model helps to replicate a face with a few minor changes:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/938/original/sh_stable_diffusion_generate_repeatable_faces_5.png?1713873334&quot; alt=&quot;Dataset without background&quot;&gt;
&lt;p&gt;Now, we can generate this character in different locations, create stories, and place him in various roles: from gardener to businessman. His face will be consistent, recognizable, and repeatable:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/939/original/sh_stable_diffusion_generate_repeatable_faces_6.png?1713873384&quot; alt=&quot;Greek person with various backgrounds&quot;&gt;
&lt;p&gt;This method isn’t ideal, but it works well in a variety of situations. You don’t need to build a dataset from a real person; it can be generated entirely remotely:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/940/original/sh_stable_diffusion_generate_repeatable_faces_7.jpg?1713873419&quot; alt=&quot;Greek person generated result&quot;&gt;
&lt;p&gt;You can attempt to create such a virtual character yourself, without the assistance of a professional designer or 3D-modeling specialist. All you need are fast GPUs, which you can find in dedicated servers by &lt;a href=&quot;https://www.leadergpu.com/&quot;&gt;LeaderGPU&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/594-stable-diffusion-riffusion&quot;&gt;Stable Diffusion: Riffusion&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/597-stable-video-diffusion&quot;&gt;Stable Video Diffusion&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/598-easy-diffusion-ui&quot;&gt;Easy Diffusion UI&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/000/933/original/il_stable_diffusion_generate_repeatable_faces.jpg?1713873147"
        length="0"
        type="image/jpeg"/>
      <pubDate>Tue, 21 Jan 2025 13:51:05 +0100</pubDate>
      <guid isPermaLink="false">593</guid>
      <dc:date>2025-01-21 13:51:05 +0100</dc:date>
    </item>
    <item>
      <title>Stable Diffusion: LoRA selfie</title>
      <link>https://www.leadergpu.com/catalog/592-stable-diffusion-lora-selfie</link>
<description>&lt;p&gt;You can create your first dataset using a simple camera and a fairly uniform background, such as a white wall or a monotone blackout curtain. For a sample dataset, I used an Olympus OM-D E-M5 Mark II mirrorless camera with a 14-42mm kit lens. This camera supports remote control from any smartphone and a very fast continuous shooting mode.&lt;/p&gt;
&lt;p&gt;I mounted the camera on a tripod and set focus priority to faces. After that, I selected a mode in which the camera captures 10 consecutive frames every 3 seconds and started shooting. While shooting, I slowly turned my head in one direction and changed direction after every 10 frames:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/916/original/sh_stable_diffusion_lora_selfie_1.jpg?1713785705&quot; alt=&quot;Face directions&quot;&gt;
&lt;p&gt;The result was around 100 frames with a monotone background:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/917/original/sh_stable_diffusion_lora_selfie_2.png?1713785735&quot; alt=&quot;Photos with background&quot;&gt;
&lt;p&gt;The next step is to remove the background and leave the portrait on a white background.&lt;/p&gt;
&lt;h2&gt;Delete background&lt;/h2&gt;
&lt;p&gt;You can use Adobe Photoshop’s standard &lt;b translate=&quot;no&quot;&gt;Remove background&lt;/b&gt; function together with batch processing. Let’s record the actions we want to apply to every picture in the dataset. Open any image, click the triangle icon, then click the &lt;b translate=&quot;no&quot;&gt;+&lt;/b&gt; symbol:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/918/original/sh_stable_diffusion_lora_selfie_3.png?1713785770&quot; alt=&quot;Create new PS action&quot;&gt;
&lt;p&gt;Type the name of the new action, for example, &lt;b translate=&quot;no&quot;&gt;Remove Background&lt;/b&gt; and click &lt;b translate=&quot;no&quot;&gt;Record&lt;/b&gt;:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/919/original/sh_stable_diffusion_lora_selfie_4.png?1713785802&quot; alt=&quot;Type the name of action&quot;&gt;
&lt;p&gt;On the &lt;b translate=&quot;no&quot;&gt;Layers&lt;/b&gt; tab, find the lock symbol and click on it:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/920/original/sh_stable_diffusion_lora_selfie_5.png?1713785831&quot; alt=&quot;Lock the layer&quot;&gt;
&lt;p&gt;Next click on the &lt;b translate=&quot;no&quot;&gt;Remove background&lt;/b&gt; button on the floating panel:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/921/original/sh_stable_diffusion_lora_selfie_6.png?1713785977&quot; alt=&quot;Click remove background&quot;&gt;
&lt;p&gt;Right-click on &lt;b translate=&quot;no&quot;&gt;Layer 0&lt;/b&gt; and select &lt;b translate=&quot;no&quot;&gt;Flatten Image&lt;/b&gt;:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/922/original/sh_stable_diffusion_lora_selfie_7.png?1713786013&quot; alt=&quot;Select Flatten Image&quot;&gt;
&lt;p&gt;All our actions have been recorded. Let’s stop this process:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/923/original/sh_stable_diffusion_lora_selfie_8.png?1713786077&quot; alt=&quot;Stop action recording&quot;&gt;
&lt;p&gt;Now, you can close the open file without saving changes and select &lt;b translate=&quot;no&quot;&gt;File &gt;&gt; Scripts &gt;&gt; Image Processor…&lt;/b&gt;&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/924/original/sh_stable_diffusion_lora_selfie_9.png?1713786112&quot; alt=&quot;Multiple image processor&quot;&gt;
&lt;p&gt;Select the input and output directories, choose the previously recorded &lt;b translate=&quot;no&quot;&gt;Remove Background&lt;/b&gt; action in step 4, and click the &lt;b translate=&quot;no&quot;&gt;Run&lt;/b&gt; button:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/925/original/sh_stable_diffusion_lora_selfie_10.png?1713786144&quot; alt=&quot;Image processor options&quot;&gt;
&lt;p&gt;Please be patient. Adobe Photoshop will open every picture in the selected directory, repeat the recorded actions (turn off layer lock, delete background, flatten image) and save it in another selected directory. This process can take a couple of minutes, depending on the number of images.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/926/original/sh_stable_diffusion_lora_selfie_11.png?1713786182&quot; alt=&quot;Photos without background&quot;&gt;
&lt;p&gt;When the process is finished, you can go to the next step.&lt;/p&gt;
&lt;h2&gt;Upload to server&lt;/h2&gt;
&lt;p&gt;Use one of the following guides (tailored to your PC operating system) to upload the &lt;b translate=&quot;no&quot;&gt;dataset&lt;/b&gt; directory to the remote server. For example, place it in the default user’s home directory, &lt;b translate=&quot;no&quot;&gt;/home/usergpu&lt;/b&gt;:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/articles/494-file-exchange-from-linux&quot;&gt;File exchange from Linux&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/articles/495-file-exchange-from-windows&quot;&gt;File exchange from Windows&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/articles/496-file-exchange-from-macos&quot;&gt;File exchange from macOS&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Pre-installation&lt;/h2&gt;
&lt;p&gt;Update existing system packages:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update &amp;&amp; sudo apt -y upgrade&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Install two additional packages:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt install -y python3-tk python3.10-venv&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let’s install CUDA® Toolkit 11.8. First, download the repository pin file:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The following command places the downloaded file into the system directory, which is controlled by the &lt;b translate=&quot;no&quot;&gt;apt&lt;/b&gt; package manager:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The next step is to download the main CUDA® repository package:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda-repo-ubuntu2204-11-8-local_11.8.0-520.61.05-1_amd64.deb&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After that, proceed with the package installation using the standard &lt;b translate=&quot;no&quot;&gt;dpkg&lt;/b&gt; utility:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo dpkg -i cuda-repo-ubuntu2204-11-8-local_11.8.0-520.61.05-1_amd64.deb&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Copy the GPG keyring to the system directory. This will make it available for use by operating system utilities, including the apt package manager:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo cp /var/cuda-repo-ubuntu2204-11-8-local/cuda-*-keyring.gpg /usr/share/keyrings/&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Update the package repository cache:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt-get update&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Install the CUDA® toolkit using apt:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt-get -y install cuda&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Add CUDA® to PATH. Open the bash shell config:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;nano ~/.bashrc&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Add the following lines at the end of the file:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Save the file and reboot the server:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo shutdown -r now&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Install trainer&lt;/h2&gt;
&lt;p&gt;Clone the Kohya project’s repository to the server:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git clone https://github.com/bmaltais/kohya_ss.git&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Open the downloaded directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd kohya_ss&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Make the setup script executable:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;chmod +x ./setup.sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Run the script:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;./setup.sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You’ll receive a warning message from the accelerate utility. Let’s resolve the issue. Activate the project’s virtual environment:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;source venv/bin/activate&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Install the missing package:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-venv&quot;&gt;pip install scipy&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And manually configure the accelerate utility:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-venv&quot;&gt;accelerate config&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Be careful: selecting an odd number of GPUs here will cause an error. For example, if you have 5 GPUs, only 4 can be used with this software; otherwise, an error occurs when the process starts. You can immediately check the new configuration by running the built-in test:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-venv&quot;&gt;accelerate test&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If everything is okay, you’ll receive a message like this:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;Test is a success! You are ready for your distributed training!&lt;/pre&gt;
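&lt;p&gt;A quick way to check how many GPUs PyTorch actually sees on the server (useful when choosing the number of processes during &lt;b translate=&quot;no&quot;&gt;accelerate config&lt;/b&gt;) is a minimal sketch like this:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;import torch

# Number of CUDA GPUs visible to PyTorch on this server
print(torch.cuda.device_count())
&lt;/code&gt;&lt;/pre&gt;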
&lt;p&gt;Deactivate the virtual environment:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-venv&quot;&gt;deactivate&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, you can start the trainer’s public server with a &lt;a href=&quot;https://www.gradio.app/&quot;&gt;Gradio GUI&lt;/a&gt; and simple login/password authentication (change the username/password to your own):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;./gui.sh --share --username user --password password&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You’ll see two URLs:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;Running on local URL: http://127.0.0.1:7860
Running on public URL: https://&lt;random_numbers_and_letters&gt;.gradio.live&lt;/pre&gt;
&lt;p&gt;Open your web browser and enter the public URL in the address bar. Type your username and password in the appropriate fields, then click Login:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/927/original/sh_stable_diffusion_lora_selfie_12.png?1713786225&quot; alt=&quot;Login screen&quot;&gt;
&lt;h2&gt;Prepare the dataset&lt;/h2&gt;
&lt;p&gt;Start by creating a new folder where you will store the trained LoRA model:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;mkdir /home/usergpu/myloramodel&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Open the following tabs: &lt;b translate=&quot;no&quot;&gt;Utilities &gt;&gt; Captioning &gt;&gt; BLIP captioning&lt;/b&gt;. Fill in the fields as shown in the picture and click &lt;b translate=&quot;no&quot;&gt;Caption images&lt;/b&gt;:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/928/original/sh_stable_diffusion_lora_selfie_13.png?1713786286&quot; alt=&quot;Set folders&quot;&gt;
&lt;p&gt;The trainer will download and run a dedicated neural network model (1.6 GB) that creates a text caption for each image file in the selected directory. It runs on a single GPU and takes around a minute.&lt;/p&gt;
&lt;p&gt;Switch to the &lt;b translate=&quot;no&quot;&gt;LoRA &gt;&gt; Tools &gt;&gt; Dataset preparation &gt;&gt; Dreambooth/LoRA folder preparation&lt;/b&gt; tab, fill in the fields, then press &lt;b translate=&quot;no&quot;&gt;Prepare training data&lt;/b&gt; followed by &lt;b translate=&quot;no&quot;&gt;Copy info to Folders Tab&lt;/b&gt;:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/929/original/sh_stable_diffusion_lora_selfie_14.png?1713786320&quot; alt=&quot;Set options&quot;&gt;
&lt;p&gt;In this example, we use the name &lt;b translate=&quot;no&quot;&gt;nikolai&lt;/b&gt; as the &lt;b translate=&quot;no&quot;&gt;Instance prompt&lt;/b&gt; and “person” as the &lt;b translate=&quot;no&quot;&gt;Class prompt&lt;/b&gt;. We also set &lt;b translate=&quot;no&quot;&gt;/home/usergpu/dataset&lt;/b&gt; as the &lt;b translate=&quot;no&quot;&gt;Training Images&lt;/b&gt; directory and &lt;b translate=&quot;no&quot;&gt;/home/usergpu/myloramodel&lt;/b&gt; as the &lt;b translate=&quot;no&quot;&gt;Destination training directory&lt;/b&gt;.&lt;/p&gt;
&lt;p&gt;Switch to the &lt;b translate=&quot;no&quot;&gt;LoRA &gt;&gt; Training &gt;&gt; Folders&lt;/b&gt; tab again. Ensure that the &lt;b translate=&quot;no&quot;&gt;Image folder&lt;/b&gt;, &lt;b translate=&quot;no&quot;&gt;Output folder&lt;/b&gt;, and &lt;b translate=&quot;no&quot;&gt;Logging folder&lt;/b&gt; are correctly filled. If desired, you can change the &lt;b translate=&quot;no&quot;&gt;Model output name&lt;/b&gt; to your own. Finally, click the &lt;b translate=&quot;no&quot;&gt;Start training&lt;/b&gt; button:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/930/original/sh_stable_diffusion_lora_selfie_15.png?1713786423&quot; alt=&quot;Start training&quot;&gt;
&lt;p&gt;The system will start downloading additional files and models (~10 GB). After that, the training process will begin. Depending on the quantity of images and the settings applied, this can take several hours. Once the training is completed, you can download the &lt;b translate=&quot;no&quot;&gt;/home/usergpu/myloramodel&lt;/b&gt; directory to your computer for future use.&lt;/p&gt;
&lt;h2&gt;Test your LoRA&lt;/h2&gt;
&lt;p&gt;We’ve prepared several articles about Stable Diffusion and its forks. You can install Easy Diffusion with our guide &lt;a href=&quot;https://www.leadergpu.com/articles/508-easy-diffusion-ui&quot;&gt;Easy Diffusion UI&lt;/a&gt;. Once the system is installed and running, you can upload your LoRA model in SafeTensors format directly to &lt;b translate=&quot;no&quot;&gt;/home/usergpu/easy-diffusion/models/lora&lt;/b&gt;.&lt;/p&gt;
&lt;p&gt;Refresh the Easy Diffusion web page and select your model from the drop-down list:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/931/original/sh_stable_diffusion_lora_selfie_16.png?1713786465&quot; alt=&quot;Select LoRA model&quot;&gt;
&lt;p&gt;Let’s write a simple prompt, &lt;b translate=&quot;no&quot;&gt;portrait of &amp;lt;nikolai&amp;gt; wearing a cowboy hat&lt;/b&gt;, and generate our first images. Here, we used a &lt;a href=&quot;https://www.leadergpu.com/articles/507-stable-diffusion-models-customization-and-options&quot;&gt;custom Stable Diffusion model&lt;/a&gt; downloaded from &lt;a href=&quot;https://civitai.com/&quot;&gt;civitai.com&lt;/a&gt;: &lt;a href=&quot;https://civitai.com/models/4201/realistic-vision-v51&quot;&gt;Realistic Vision v6.0 B1&lt;/a&gt;:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/932/original/sh_stable_diffusion_lora_selfie_17.png?1713786492&quot; alt=&quot;Generate the image&quot;&gt;
&lt;p&gt;You can experiment with prompts and Stable Diffusion-based models to achieve better results. Enjoy!&lt;/p&gt;
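&lt;p&gt;If you prefer scripting to a web interface, the diffusers library can load the same SafeTensors LoRA. Below is a minimal sketch; the base model ID and the LoRA file name are assumptions for illustration:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    &quot;runwayml/stable-diffusion-v1-5&quot;, torch_dtype=torch.float16
).to(&quot;cuda&quot;)

# Load the trained LoRA weights (hypothetical directory and file name)
pipe.load_lora_weights(&quot;/home/usergpu/myloramodel&quot;, weight_name=&quot;last.safetensors&quot;)

# The instance token &quot;nikolai&quot; triggers the trained appearance
image = pipe(&quot;portrait of nikolai wearing a cowboy hat&quot;).images[0]
image.save(&quot;nikolai.png&quot;)
&lt;/code&gt;&lt;/pre&gt;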
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/593-stable-diffusion-generate-repeatable-faces&quot;&gt;Stable Diffusion: Generate repeatable faces&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/594-stable-diffusion-riffusion&quot;&gt;Stable Diffusion: Riffusion&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/597-stable-video-diffusion&quot;&gt;Stable Video Diffusion&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/000/915/original/il_stable_diffusion_lora_selfie.jpg?1713785674"
        length="0"
        type="image/jpeg"/>
      <pubDate>Tue, 21 Jan 2025 13:44:25 +0100</pubDate>
      <guid isPermaLink="false">592</guid>
      <dc:date>2025-01-21 13:44:25 +0100</dc:date>
    </item>
    <item>
      <title>Stable Diffusion: What is ControlNet</title>
      <link>https://www.leadergpu.com/catalog/591-stable-diffusion-what-is-controlnet</link>
      <description>&lt;p&gt;A common misconception among those first encountering generative neural networks is that controlling the final output is tremendously challenging, especially when attempting to alter the output through different prompt phrasing. Currently, a suite of tools known as ControlNet exists to facilitate relatively straightforward and effective control over the generation results.&lt;/p&gt;
&lt;p&gt;In this article, we’ll demonstrate how to easily manipulate the pose of generated characters using pre-existing images and custom “skeletons”, with the help of one such tool, &lt;a href=&quot;https://huggingface.co/lllyasviel/ControlNet-v1-1/tree/main&quot;&gt;OpenPose&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Step 1. Install Stable Diffusion&lt;/h2&gt;
&lt;p&gt;Please use &lt;a href=&quot;https://www.leadergpu.com/articles/506-stable-diffusion-webui&quot;&gt;our step-by-step guide&lt;/a&gt; to install Stable Diffusion with the basic model and WebUI. This guide is based on the AUTOMATIC1111 script.&lt;/p&gt;
&lt;h2&gt;Step 2. Install ControlNet extension&lt;/h2&gt;
&lt;p&gt;We strongly advise against installing the ControlNet extension (sd-webui-controlnet) from the &lt;a href=&quot;https://raw.githubusercontent.com/AUTOMATIC1111/stable-diffusion-webui-extensions/master/index.json&quot;&gt;standard repository&lt;/a&gt; due to potential functionality issues. One significant issue we encountered during the preparation of this guide was the web interface freezing. Although the image is initially generated successfully, the WebUI becomes unresponsive when generating the image a second time. An alternative solution would be to install the same extension from an external source.&lt;/p&gt;
&lt;p&gt;Open WebUI and follow the tabs: &lt;b translate=&quot;no&quot;&gt;Extensions &gt; Install from URL&lt;/b&gt;. Paste this URL in the appropriate field:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;https://github.com/Mikubill/sd-webui-controlnet&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then click &lt;b translate=&quot;no&quot;&gt;Install&lt;/b&gt; button:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/942/original/sh_stable_diffusion_what_is_controlnet_1.png?1713962546&quot; alt=&quot;Install sd-webui-controlnet&quot;&gt;
&lt;p&gt;When the process is completed successfully, the following message should appear:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;Installed into /home/usergpu/stable-diffusion-webui/extensions/sd-webui-controlnet. Use Installed tab to restart.&lt;/pre&gt;
&lt;p&gt;Let’s restart the UI by pressing the &lt;b translate=&quot;no&quot;&gt;Apply and restart UI&lt;/b&gt; button on the &lt;b translate=&quot;no&quot;&gt;Installed&lt;/b&gt; tab:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/943/original/sh_stable_diffusion_what_is_controlnet_2.png?1713962703&quot; alt=&quot;ControlNet Restart UI&quot;&gt;
&lt;p&gt;After rebooting the interface, the new ControlNet element with many additional options will appear:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/944/original/sh_stable_diffusion_what_is_controlnet_3.png?1713962785&quot; alt=&quot;ControlNet enabled&quot;&gt;
&lt;h2&gt;Step 3. Download OpenPose&lt;/h2&gt;
&lt;h3&gt;Add HF key&lt;/h3&gt;
&lt;p&gt;Let’s generate an SSH key pair and add the public key to your Hugging Face account:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd ~/.ssh &amp;&amp; ssh-keygen&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;When the keypair is generated, you can display the public key in the terminal emulator:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cat id_rsa.pub&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Copy all information starting from ssh-rsa and ending with usergpu@gpuserver, as shown in the following screenshot:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/907/original/sh_llama3_quick_start_3.png?1713533169&quot; alt=&quot;Copy RSA key&quot;&gt;
&lt;p&gt;Open a web browser, type &lt;a href=&quot;https://huggingface.co/&quot;&gt;https://huggingface.co/&lt;/a&gt; into the address bar, and press &lt;b translate=&quot;no&quot;&gt;Enter&lt;/b&gt;. Login into your HF-account and open &lt;a href=&quot;https://huggingface.co/settings/profile&quot;&gt;Profile settings&lt;/a&gt;. Then choose &lt;b translate=&quot;no&quot;&gt;SSH and GPG Keys&lt;/b&gt; and click on the &lt;b translate=&quot;no&quot;&gt;Add SSH Key&lt;/b&gt; button:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/908/original/sh_llama3_quick_start_4.png?1713533229&quot; alt=&quot;Add SSH key&quot;&gt;
&lt;p&gt;Fill in the &lt;b translate=&quot;no&quot;&gt;Key name&lt;/b&gt; and paste the copied &lt;b translate=&quot;no&quot;&gt;SSH Public key&lt;/b&gt; from the terminal. Save the key by pressing &lt;b translate=&quot;no&quot;&gt;Add key&lt;/b&gt;:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/909/original/sh_llama3_quick_start_5.png?1713533267&quot; alt=&quot;Paste the key&quot;&gt;
&lt;p&gt;Now, your HF-account is linked with the public SSH-key. The second part (the private key) is stored on the server.&lt;/p&gt;
&lt;h3&gt;Install Git LFS&lt;/h3&gt;
&lt;p&gt;The next step is to install a specific Git LFS (Large File Storage) extension, which is used for downloading large files such as neural network models. Open your home directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd ~/&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Download and run the shell script. This script installs a new third-party repository with git-lfs:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, you can install it using the standard package manager:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt-get install git-lfs&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let’s configure git to use our HF nickname:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git config --global user.name &quot;John&quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And the email linked to your HF account:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git config --global user.email &quot;john.doe@example.com&quot;&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Download the repository&lt;/h3&gt;
&lt;p&gt;We recommend, if possible, using a local hard drive to download and store models. You can learn more about this from our guide, &lt;a href=&quot;https://www.leadergpu.com/articles/492-disk-partitioning-in-linux&quot;&gt;Disk partitioning in Linux&lt;/a&gt;. For this example, we have mounted an SSD-drive to the /mnt/fastdisk mountpoint. Let’s make it owned by the default user:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo chown usergpu:usergpu /mnt/fastdisk&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Open the directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd /mnt/fastdisk&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Clone the ControlNet repository from HuggingFace. Previously installed Git-LFS will automatically replace pointers with real files:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git clone git@hf.co:lllyasviel/ControlNet-v1-1&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this example, we add only one model to Stable Diffusion WebUI. However, you can copy all available models from the repository (~18GB):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cp /mnt/fastdisk/ControlNet-v1-1/control_v11p_sd15_openpose.pth /home/usergpu/stable-diffusion-webui/models/ControlNet/&lt;/code&gt;&lt;/pre&gt;
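&lt;p&gt;As an alternative to the WebUI extension, the same OpenPose ControlNet model can also be driven programmatically with the diffusers library. A minimal sketch, where the base model ID and the input file name are assumptions:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    &quot;lllyasviel/control_v11p_sd15_openpose&quot;, torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    &quot;runwayml/stable-diffusion-v1-5&quot;,
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to(&quot;cuda&quot;)

# A pre-rendered OpenPose &quot;skeleton&quot; image conditions the generation
pose = load_image(&quot;openpose_skeleton.png&quot;)
image = pipe(&quot;dancing bear, by Pixar&quot;, image=pose).images[0]
image.save(&quot;dancing_bear.png&quot;)
&lt;/code&gt;&lt;/pre&gt;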
&lt;h2&gt;Step 4. Run the generation process&lt;/h2&gt;
&lt;p&gt;The current model provided is quite basic and might not yield satisfactory results. Therefore, we suggest replacing it with a custom model. Guidelines on how to do this can be found in this article: &lt;a href=&quot;https://www.leadergpu.com/articles/507-stable-diffusion-models-customization-and-options&quot;&gt;Stable Diffusion Models: customization &amp; options&lt;/a&gt;. For this example, we downloaded &lt;a href=&quot;https://civitai.com/api/download/models/130072&quot;&gt;RealisticVision v6.0 B1&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;To generate your first image using OpenPose, open the &lt;b translate=&quot;no&quot;&gt;ControlNet&lt;/b&gt; tab, choose &lt;b translate=&quot;no&quot;&gt;OpenPose&lt;/b&gt;, and tick &lt;b translate=&quot;no&quot;&gt;Enable&lt;/b&gt; and &lt;b translate=&quot;no&quot;&gt;Allow Preview&lt;/b&gt;. Then click &lt;b translate=&quot;no&quot;&gt;Upload&lt;/b&gt; to add an image containing the desired pose:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/945/original/sh_stable_diffusion_what_is_controlnet_4.png?1713962881&quot; alt=&quot;Enable OpenPose and Preview&quot;&gt;
&lt;p&gt;You can request the system to generate a pose preview by clicking the button with the explosion icon:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/946/original/sh_stable_diffusion_what_is_controlnet_5.png?1713963007&quot; alt=&quot;Show preview&quot;&gt;
&lt;p&gt;On the left, the original image is displayed. On the right, you can see the “skeleton” representing the pose as recognized by the neural network model:&lt;/p&gt;
&lt;table border=&quot;0&quot;&gt;
  &lt;tr&gt;
    &lt;td&gt;&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/947/original/sh_stable_diffusion_what_is_controlnet_6.png?1713963067&quot; alt=&quot;Dancing woman&quot;&gt;&lt;/td&gt;
    &lt;td&gt;&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/948/original/sh_stable_diffusion_what_is_controlnet_7.png?1713963111&quot; alt=&quot;OpenPose skeleton&quot;&gt;&lt;/td&gt;
  &lt;/tr&gt;
&lt;/table&gt;
&lt;p&gt;Now you can type the main prompt, for example “&lt;b translate=&quot;no&quot;&gt;dancing bear, by Pixar&lt;/b&gt;” or “&lt;b translate=&quot;no&quot;&gt;dancing fox, by Pixar&lt;/b&gt;” and click the &lt;b translate=&quot;no&quot;&gt;Generate&lt;/b&gt; button. After a few seconds you’ll get results like this:&lt;/p&gt;
&lt;table border=&quot;0&quot;&gt;
  &lt;tr&gt;
    &lt;td&gt;&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/949/original/sh_stable_diffusion_what_is_controlnet_8.png?1713963180&quot; alt=&quot;Dancing bear&quot;&gt;&lt;/td&gt;
    &lt;td&gt;&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/950/original/sh_stable_diffusion_what_is_controlnet_9.png?1713963213&quot; alt=&quot;Dancing fox&quot;&gt;&lt;/td&gt;
  &lt;/tr&gt;
&lt;/table&gt;
&lt;p&gt;The system will attempt to generate a new picture, given the “skeleton” obtained from the original image. In some cases, the pose may not be accurate, but this can be easily corrected by manually editing the “skeleton”.&lt;/p&gt;
&lt;h2&gt;Step 5. Changing pose&lt;/h2&gt;
&lt;p&gt;While it may seem like magic, the model isn’t perfect, and occasional errors can impact the final image. To avoid issues during image generation, you have the option to manually adjust the “skeleton” by clicking on the &lt;b translate=&quot;no&quot;&gt;Edit&lt;/b&gt; button:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/951/original/sh_stable_diffusion_what_is_controlnet_10.png?1713963246&quot; alt=&quot;Edit the skeleton&quot;&gt;
&lt;p&gt;In the provided editor, you can easily adjust the pose by dragging and dropping, or remove unwanted points with a right-click. After that, just click the &lt;b translate=&quot;no&quot;&gt;Send pose to ControlNet&lt;/b&gt; button and the new pose will be applied:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/952/original/sh_stable_diffusion_what_is_controlnet_11.png?1713963279&quot; alt=&quot;Send pose to ControlNet&quot;&gt;
&lt;p&gt;Beyond OpenPose, ControlNet offers a variety of tools to customize and perfect your results. Moreover, the dedicated servers provided by LeaderGPU ensure a quick and convenient process.&lt;/p&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/592-stable-diffusion-lora-selfie&quot;&gt;Stable Diffusion: LoRA selfie&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/593-stable-diffusion-generate-repeatable-faces&quot;&gt;Stable Diffusion: Generate repeatable faces&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/594-stable-diffusion-riffusion&quot;&gt;Stable Diffusion: Riffusion&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/000/941/original/il_stable_diffusion_what_is_controlnet.png?1713962506"
        length="0"
        type="image/jpeg"/>
      <pubDate>Tue, 21 Jan 2025 10:42:39 +0100</pubDate>
      <guid isPermaLink="false">591</guid>
      <dc:date>2025-01-21 10:42:39 +0100</dc:date>
    </item>
    <item>
      <title>Fooocus: Rethinking of SD and MJ</title>
      <link>https://www.leadergpu.com/catalog/590-fooocus-rethinking-of-sd-and-mj</link>
      <description>&lt;p&gt;The advent of Stable Diffusion and MidJourney has revolutionized our understanding of the potential of generative neural networks. These tools have unveiled a fresh perspective on the process of image creation and the extent to which we can manipulate it. The primary approach involves providing the system with prompts about the desired outcome. Essentially, we highlight three important aspects: object, style, and environment.&lt;/p&gt;
&lt;p&gt;Additional prompts that provide more specific instructions, such as the desired composition, type of camera/lens, and colorization, are also important, but not indispensable. The more comprehensive the instructions, the easier it is for the neural network to process. The role of a prompt engineer has even emerged in the professional space. However, this role can be easily replaced by the same generative neural networks. By combining image creation with text creation skills, we can generate extra prompts to achieve an optimal outcome.&lt;/p&gt;
&lt;p&gt;This is the fundamental concept of Fooocus. It integrates the XL Stable Diffusion model and a GPT2-based prompt generator, which enriches and details your simple prompt. Moreover, Fooocus is equipped with various enhancements and extensions. These features facilitate the generation of spectacular images through a straightforward interface, devoid of complex tools. Let’s delve into its functionality and install Fooocus on a LeaderGPU dedicated server.&lt;/p&gt;
&lt;h2&gt;Prerequisites&lt;/h2&gt;
&lt;p&gt;Begin with the installation prerequisites and reboot afterward:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update &amp;&amp; sudo apt -y upgrade &amp;&amp; sudo ubuntu-drivers autoinstall &amp;&amp; sudo shutdown -r now&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Download the shell script that installs Anaconda for managing virtual environments:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;wget https://repo.anaconda.com/archive/Anaconda3-2023.09-0-Linux-x86_64.sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Make the script executable:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;chmod a+x Anaconda3-2023.09-0-Linux-x86_64.sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Run the installation script:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;./Anaconda3-2023.09-0-Linux-x86_64.sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After the process is finished, we recommend disconnecting the SSH session and preparing for port forwarding. You need to forward port 7865 from the remote server to the local loopback address, 127.0.0.1:7865. For more information, please refer to one of our previous guides: &lt;a href=&quot;https://www.leadergpu.com/articles/528-stable-video-diffusion&quot;&gt;Stable Video Diffusion&lt;/a&gt;. Then, reconnect and proceed with cloning the project’s repository from GitHub.&lt;/p&gt;
&lt;h2&gt;Fooocus install&lt;/h2&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git clone https://github.com/lllyasviel/Fooocus.git&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Change directory to Fooocus:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd Fooocus&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Create a virtual environment using Anaconda and the YAML-config prepared by the project’s author:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;conda env create -f environment.yaml&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let’s switch from the base environment to the newly created one:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;conda activate fooocus&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The next step is to install the Python libraries:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-venv&quot;&gt;pip install -r requirements_versions.txt&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, everything is ready to start:&lt;/p&gt;
&lt;h2&gt;Fooocus start&lt;/h2&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-venv&quot;&gt;python entry_with_update.py&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The initial startup may take some time as the application verifies and downloads all the necessary files for operation. You might want to grab a cup of coffee in the meantime. Once the process is complete, open your browser and type the following URL into the address bar:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;http://127.0.0.1:7865&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Enter your simple prompt and click the &lt;b translate=&quot;no&quot;&gt;Generate&lt;/b&gt; button. If you want more control, tick &lt;b translate=&quot;no&quot;&gt;Advanced&lt;/b&gt; and select the necessary options:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/976/original/sh_fooocus_rethinking_of_sd_and_mj_1.png?1714481840&quot; alt=&quot;Fooocus WebUI&quot;&gt;
&lt;p&gt;The real magic unfolds behind the scenes. The moment you hit the &lt;b translate=&quot;no&quot;&gt;Generate&lt;/b&gt; button, your input prompt is transferred to the GPT2-based language model. This model transforms your brief prompt into a mix of elaborative positive and negative prompts. This mix is subsequently input into the Stable Diffusion XL model, fine-tuned to emulate MidJourney style. As a result, even a brief prompt can generate impressive results.&lt;/p&gt;
&lt;p&gt;Of course, there’s no restriction on writing your own detailed prompts. However, after a few iterations it becomes evident that even without them, the generated content remains intriguing and diverse.&lt;/p&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/598-easy-diffusion-ui&quot;&gt;Easy Diffusion UI&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/565-stable-diffusion-webui&quot;&gt;Stable Diffusion WebUI&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/597-stable-video-diffusion&quot;&gt;Stable Video Diffusion&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/000/975/original/il_fooocus_rethinking_of_sd_and_mj.png?1714481802"
        length="0"
        type="image/jpeg"/>
      <pubDate>Tue, 21 Jan 2025 10:36:52 +0100</pubDate>
      <guid isPermaLink="false">590</guid>
      <dc:date>2025-01-21 10:36:52 +0100</dc:date>
    </item>
    <item>
      <title>Blender remote rendering with Flamenco</title>
      <link>https://www.leadergpu.com/catalog/588-blender-remote-rendering-with-flamenco</link>
      <description>&lt;p&gt;When rendering heavy scenes in &lt;a href=&quot;https://www.blender.org/&quot;&gt;Blender&lt;/a&gt; begins to consume too much of your team’s time, you have two options: either upgrade each team member’s computer or outsource rendering to a dedicated farm. Many companies offer ready-made rendering solutions, but if you require full control over the infrastructure, these solutions may not be the most reliable option.&lt;/p&gt;
&lt;p&gt;An alternative approach could involve creating a hybrid infrastructure. In this setup, you would keep your data storage and rendering farm management within your existing infrastructure. The only element that would be located outside would be the rented &lt;a href=&quot;https://www.leadergpu.com/&quot;&gt;GPU servers&lt;/a&gt; on which the rendering would be performed.&lt;/p&gt;
&lt;p&gt;In general, the rendering farm infrastructure for Blender looks like this:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/888/original/sh_blender_remote_rendering_with_flamenco_1.jpg?1713174084&quot; alt=&quot;Basic components scheme&quot;&gt;
&lt;p&gt;Here, we have a central &lt;b translate=&quot;no&quot;&gt;Manager&lt;/b&gt; node that organizes all processes. It receives rendering tasks from users via a specific &lt;b translate=&quot;no&quot;&gt;Blender Add-on&lt;/b&gt; and moves all necessary files to &lt;b translate=&quot;no&quot;&gt;Shared Storage&lt;/b&gt;. Then, the &lt;b translate=&quot;no&quot;&gt;Manager&lt;/b&gt; distributes the tasks to &lt;b translate=&quot;no&quot;&gt;Worker nodes&lt;/b&gt;. They receive a job containing all information about where the Worker can find files to render and what to do with the results obtained. To implement this scheme, you can use a completely free and open-source application called &lt;a href=&quot;https://flamenco.blender.org/&quot;&gt;Flamenco&lt;/a&gt;. In this guide, we show how to prepare all nodes, especially the &lt;b translate=&quot;no&quot;&gt;Manager&lt;/b&gt; and &lt;b translate=&quot;no&quot;&gt;Worker&lt;/b&gt;.&lt;/p&gt;
&lt;p&gt;The &lt;b translate=&quot;no&quot;&gt;Storage&lt;/b&gt; node doesn’t have any specific requirements. It can be used with any operating system that supports SMB/CIFS or NFS protocols. The only requirement is that the storage directory needs to be mounted and accessible by the operating system. In your infrastructure, this can be any shared folder accessible to all nodes.&lt;/p&gt;
&lt;p&gt;Each node has a different IP address, and the &lt;b translate=&quot;no&quot;&gt;Wireguard VPN&lt;/b&gt; server will be the central point that joins them into a single virtual network. This server, located on the external perimeter, allows you to work without making changes to the existing NAT policy.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/889/original/sh_blender_remote_rendering_with_flamenco_2.jpg?1713174131&quot; alt=&quot;Virtual components scheme&quot;&gt;
&lt;p&gt;For this example, we create the following mixed configuration:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;10.0.0.1 - Wireguard VPN server&lt;/b&gt; (virtual server by any infrastructure provider) with an external IP;&lt;/li&gt;
  &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;10.0.0.2 - Worker node&lt;/b&gt; (dedicated server by LeaderGPU) with an external IP;&lt;/li&gt;
  &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;10.0.0.3 - Manager node&lt;/b&gt; (virtual server in office network) located behind NAT;&lt;/li&gt;
  &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;10.0.0.4 - Storage node&lt;/b&gt; (virtual server in office network) located behind NAT;&lt;/li&gt;
  &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;10.0.0.5 - User node&lt;/b&gt; (consumer laptop in office network) located behind NAT.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Step 1. Wireguard&lt;/h2&gt;
&lt;h3&gt;VPN Server&lt;/h3&gt;
&lt;p&gt;You can install and configure Wireguard manually, using the official guide and examples. However, there is an easier alternative: an unofficial script by a software engineer from Paris (Stanislas, aka &lt;a href=&quot;https://github.com/angristan&quot;&gt;angristan&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;Download the script from GitHub:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-root&quot;&gt;wget https://raw.githubusercontent.com/angristan/wireguard-install/master/wireguard-install.sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Make it executable:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-root&quot;&gt;sudo chmod +x wireguard-install.sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Execute:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-root&quot;&gt;sudo ./wireguard-install.sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Follow the instructions and set the IP address range &lt;b translate=&quot;no&quot;&gt;10.0.0.1/24&lt;/b&gt;. The script will ask you to immediately create a configuration file for the first client. According to the plan, this client will be the worker node, named &lt;b translate=&quot;no&quot;&gt;Worker&lt;/b&gt;, with the address &lt;b translate=&quot;no&quot;&gt;10.0.0.2&lt;/b&gt;. When the script completes, a configuration file will appear in the root directory: &lt;b translate=&quot;no&quot;&gt;/root/wg0-client-Worker.conf&lt;/b&gt;.&lt;/p&gt;
&lt;p&gt;Execute the following command to view this configuration:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-root&quot;&gt;cat /root/wg0-client-Worker.conf&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;[Interface]
PrivateKey = [CLIENT_PRIVATE_KEY]
Address = 10.0.0.2/32,fd42:42:42::2/128
DNS = 1.1.1.1,1.0.0.1
[Peer]
PublicKey = [SERVER_PUBLIC_KEY]
PresharedKey = [SERVER_PRESHARED_KEY]
Endpoint = [IP_ADDRESS:PORT]
AllowedIPs = 10.0.0.0/24,::/0&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Execute the installation script again to create another client. Add all future clients this way; finally, you can check that all the configuration files were created:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-root&quot;&gt;cd ~/&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-root&quot;&gt;ls -l | grep wg0&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;-rw-r--r-- 1 root    root      529 Jul 14 12:59 wg0-client-Manager.conf
-rw-r--r-- 1 root    root      529 Jul 14 12:59 wg0-client-Storage.conf
-rw-r--r-- 1 root    root      529 Jul 14 12:59 wg0-client-User.conf
-rw-r--r-- 1 root    root      529 Jul 14 12:58 wg0-client-Worker.conf&lt;/pre&gt;
&lt;h3&gt;VPN Clients&lt;/h3&gt;
&lt;p&gt;VPN clients include all nodes that need to be connected to a single network. In our guide, this refers to the manager node, storage node, client node (if using Linux), and worker nodes. If the VPN server is running on a worker node, it does not need to be configured as a client (this step can be skipped).&lt;/p&gt;
&lt;p&gt;Update the package cache, then install the Wireguard and CIFS support packages:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update &amp;&amp; sudo apt -y install wireguard cifs-utils&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Elevate privileges to superuser:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo -i&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Open the Wireguard configuration directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-root&quot;&gt;cd /etc/wireguard&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Execute the &lt;b translate=&quot;no&quot;&gt;umask&lt;/b&gt; command so that only the superuser has access to files in this directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-root&quot;&gt;umask 077&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Generate a private key and save it into a file:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-root&quot;&gt;wg genkey &gt; private-key&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Generate a public key using the private key:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-root&quot;&gt;wg pubkey &gt; public-key &lt; private-key&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Create a configuration file:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-root&quot;&gt;nano /etc/wireguard/wg0.conf&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Paste your own configuration, created for this client:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;[Interface]
PrivateKey = [CLIENT_PRIVATE_KEY]
Address = 10.0.0.2/32,fd42:42:42::2/128
DNS = 1.1.1.1,1.0.0.1
[Peer]
PublicKey = [SERVER_PUBLIC_KEY]
PresharedKey = [SERVER_PRESHARED_KEY]
Endpoint = [SERVER_IP_ADDRESS:PORT]
AllowedIPs = 10.0.0.0/24,::/0
PersistentKeepalive = 1&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Don’t forget to add the &lt;b translate=&quot;no&quot;&gt;PersistentKeepalive = 1&lt;/b&gt; option (where 1 means 1 second) on every node located behind NAT. You can tune this period experimentally; the value recommended by Wireguard’s authors is 25. Save the file and exit using the &lt;b translate=&quot;no&quot;&gt;CTRL + X&lt;/b&gt; shortcut and the &lt;b translate=&quot;no&quot;&gt;Y&lt;/b&gt; key to confirm.&lt;/p&gt;
&lt;p&gt;If you want to pass all internet traffic through the tunnel, set &lt;b translate=&quot;no&quot;&gt;AllowedIPs&lt;/b&gt; to &lt;b translate=&quot;no&quot;&gt;0.0.0.0/0,::/0&lt;/b&gt;.&lt;/p&gt;
&lt;p&gt;Then, log out from the root account:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-root&quot;&gt;exit&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Start the connection using systemctl:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl start wg-quick@wg0.service&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Check that everything is OK and the service has started successfully:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl status wg-quick@wg0.service&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;● wg-quick@wg0.service - WireGuard via wg-quick(8) for wg0
Loaded: loaded (/lib/systemd/system/wg-quick@.service; enabled; vendor preset: enabled)
Active: active (exited) since Mon 2023-10-23 09:47:53 UTC; 1h 45min ago
  Docs: man:wg-quick(8)
        man:wg(8)
        https://www.wireguard.com/
        https://www.wireguard.com/quickstart/
        https://git.zx2c4.com/wireguard-tools/about/src/man/wg-quick.8
        https://git.zx2c4.com/wireguard-tools/about/src/man/wg.8
Process: 4128 ExecStart=/usr/bin/wg-quick up wg0 (code=exited, status=0/SUCCESS)
Main PID: 4128 (code=exited, status=0/SUCCESS)
  CPU: 76ms&lt;/pre&gt;
&lt;p&gt;If you encounter an error such as «resolvconf: command not found» in Ubuntu 22.04, simply create a symbolic link:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo ln -s /usr/bin/resolvectl /usr/local/bin/resolvconf&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Enable the new service to connect automatically while the operating system is booting:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl enable wg-quick@wg0.service&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, you can check connectivity by sending echo packets:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;ping 10.0.0.1&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=145 ms
64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=72.0 ms
64 bytes from 10.0.0.1: icmp_seq=3 ttl=64 time=72.0 ms
64 bytes from 10.0.0.1: icmp_seq=4 ttl=64 time=72.2 ms
--- 10.0.0.1 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3004ms
rtt min/avg/max/mdev = 71.981/90.230/144.750/31.476 ms&lt;/pre&gt;
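&lt;p&gt;You can also inspect the tunnel state directly. A recent handshake and non-zero transfer counters confirm the link is working (an optional check):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo wg show&lt;/code&gt;&lt;/pre&gt;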
&lt;h2&gt;Step 2. NAS node&lt;/h2&gt;
&lt;p&gt;Connect to the VPN server using the guide from Step 1. Then, install the server and client Samba packages:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt install samba smbclient&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Backup your default configuration:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo cp /etc/samba/smb.conf /etc/samba/smb.conf.bak&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Create a directory that will be used as a share:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo mkdir /mnt/share&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Create a new user group that will get access to the new share:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo groupadd smbusers&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Add an existing user to the created group:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo usermod -aG smbusers user&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Set a password for this user. This is a necessary step because the system password and the Samba password are different entities:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo smbpasswd -a $USER&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Remove the default configuration:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo rm /etc/samba/smb.conf&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And create a new one:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo nano /etc/samba/smb.conf&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;[global]
workgroup = WORKGROUP
security = user
map to guest = bad user
wins support = no
dns proxy = no
[private]
path = /mnt/share
valid users = @smbusers
guest ok = no
browsable = yes
writable = yes&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Save the file and test the new parameters:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;testparm -s&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Restart both Samba services:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo service smbd restart&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo service nmbd restart&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Finally, set the ownership of the shared folder so the group can access it:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo chown user:smbusers /mnt/share&lt;/code&gt;&lt;/pre&gt;
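&lt;p&gt;To make sure the share is actually exported, you can list the server’s shares locally, authenticating as the user created above (an optional check):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;smbclient -L //localhost -U user&lt;/code&gt;&lt;/pre&gt;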
&lt;h2&gt;Step 3. Samba client connection&lt;/h2&gt;
&lt;p&gt;All nodes in Flamenco use a shared directory located at /mnt/flamenco. You must mount this directory on each node before running the flamenco-worker or flamenco-manager binaries. In this example, we use a worker node hosted on LeaderGPU with the username &lt;b translate=&quot;no&quot;&gt;usergpu&lt;/b&gt;. Please replace these details with your own if they differ.&lt;/p&gt;
&lt;p&gt;Create a hidden file where you can store SMB share credentials:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;nano /home/usergpu/.smbcredentials&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Type these two lines, replacing the values with your own Samba username and password. Don’t add inline comments or extra spaces: everything after the equals sign becomes part of the credential:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;username=user
password=password&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Save this file and exit. Then, secure this file by changing the access permissions:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo chmod 600 /home/usergpu/.smbcredentials&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Create a new directory that can be used as a mount point to attach the remote storage:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo mkdir /mnt/flamenco&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And make the user the owner of this directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo chown usergpu:users /mnt/flamenco&lt;/code&gt;&lt;/pre&gt;
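&lt;p&gt;Optionally, you can test the mount by hand before automating it. This is a sketch that assumes the NAS address from Step 2 and the credentials file created above:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo mount -t cifs //10.0.0.4/private /mnt/flamenco -o credentials=/home/usergpu/.smbcredentials,uid=usergpu,gid=users&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If the share appears under &lt;b translate=&quot;no&quot;&gt;/mnt/flamenco&lt;/b&gt;, unmount it with &lt;b translate=&quot;no&quot;&gt;sudo umount /mnt/flamenco&lt;/b&gt; before proceeding.&lt;/p&gt;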
&lt;p&gt;The only thing left is to mount the network directory automatically. Create a systemd mount unit:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo nano /etc/systemd/system/mnt-flamenco.mount&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;[Unit]
Description=Mount Remote Storage
[Mount]
What=//10.0.0.4/private
Where=/mnt/flamenco
Type=cifs
Options=mfsymlinks,credentials=/home/usergpu/.smbcredentials,uid=usergpu,gid=users
[Install]
WantedBy=multi-user.target&lt;/code&gt;&lt;/pre&gt;
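&lt;p&gt;After saving the unit, reload systemd so it picks up the new file, and start the mount once to verify that it works:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl daemon-reload&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl start mnt-flamenco.mount&lt;/code&gt;&lt;/pre&gt;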
&lt;p&gt;Add two lines to the &lt;b translate=&quot;no&quot;&gt;[Interface]&lt;/b&gt; section of your VPN configuration, so the share is mounted when the tunnel comes up (the ping simply waits until the storage node becomes reachable):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo -i&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-root&quot;&gt;nano /etc/wireguard/wg0.conf&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;…
PostUp = ping 10.0.0.4 -c 4 &amp;&amp; systemctl start mnt-flamenco.mount
PostDown = systemctl stop mnt-flamenco.mount
…&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Reboot the server:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo shutdown -r now&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Check that the services are loaded and the shared directory is successfully mounted:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;df -h&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;Filesystem          Size  Used Avail Use% Mounted on
tmpfs                35G  3.3M   35G   1% /run
/dev/sda2            99G   18G   77G  19% /
tmpfs               174G     0  174G   0% /dev/shm
tmpfs               5.0M     0  5.0M   0% /run/lock
tmpfs                35G  8.0K   35G   1% /run/user/1000
//10.0.0.4/private   40G  9.0G   31G  23% /mnt/flamenco&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Step 4. Manager node&lt;/h2&gt;
&lt;p&gt;Set up a VPN connection using the guide from Step 1. Stop the VPN service before continuing:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl stop wg-quick@wg0.service&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;First, install the utilities required for automatic mounting over the CIFS protocol:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt -y install cifs-utils&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The next important step is to install Blender. You can do this using the standard APT package manager, but this will most likely install one of the older versions (below v3.6.4). Let’s use Snap to install the latest version:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo snap install blender --classic&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Check the installed version using the following command:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;blender --version&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;Blender 4.4.3
build date: 2025-04-29
build time: 15:12:13
build commit date: 2025-04-29
build commit time: 14:09
build hash: 802179c51ccc
build branch: blender-v4.4-release
build platform: Linux
build type: Release
…&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you receive an error message indicating missing libraries, simply install them. All these libraries are included in the XOrg package:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt -y install xorg&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Download the application:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;wget https://flamenco.blender.org/downloads/flamenco-3.7-linux-amd64.tar.gz&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Unpack the downloaded archive:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;tar xvfz flamenco-3.7-linux-amd64.tar.gz&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Go to the created directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd flamenco-3.7-linux-amd64/&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And start Flamenco for the first time:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;./flamenco-manager&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Open the following address in your web browser: &lt;a href=&quot;http://10.0.0.3:8080/&quot;&gt;http://10.0.0.3:8080/&lt;/a&gt;. Click the &lt;b translate=&quot;no&quot;&gt;Let&#39;s go&lt;/b&gt; button. Type &lt;b translate=&quot;no&quot;&gt;/mnt/flamenco&lt;/b&gt; in the required field, then click &lt;b translate=&quot;no&quot;&gt;Next&lt;/b&gt;:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/890/original/sh_blender_remote_rendering_with_flamenco_3.png?1713174175&quot; alt=&quot;Shared storage setup&quot;&gt;
&lt;p&gt;Flamenco will attempt to locate the Blender executable file. If you have installed Blender from Snap, the path will be &lt;b translate=&quot;no&quot;&gt;/snap/bin/blender&lt;/b&gt;. Check this point and click &lt;b translate=&quot;no&quot;&gt;Next&lt;/b&gt;:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/891/original/sh_blender_remote_rendering_with_flamenco_4.png?1713174210&quot; alt=&quot;PATH environment setup&quot;&gt;
&lt;p&gt;Check the summary and click &lt;b translate=&quot;no&quot;&gt;Confirm&lt;/b&gt;:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/892/original/sh_blender_remote_rendering_with_flamenco_5.png?1713174240&quot; alt=&quot;Check summary settings&quot;&gt;
&lt;p&gt;Return to the SSH session and use the &lt;b translate=&quot;no&quot;&gt;Ctrl + C&lt;/b&gt; keyboard shortcut to interrupt the application. The first launch generates the configuration file &lt;b translate=&quot;no&quot;&gt;flamenco-manager.yaml&lt;/b&gt;. Let’s add some options to the &lt;b translate=&quot;no&quot;&gt;variables&lt;/b&gt; and &lt;b translate=&quot;no&quot;&gt;blenderArgs&lt;/b&gt; sections:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;nano flamenco-manager.yaml&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;# Configuration file for Flamenco.
# For an explanation of the fields, refer to flamenco-manager-example.yaml
#
# NOTE: this file will be overwritten by Flamenco Manager&#39;s web-based configuration system.
#
# This file was written on 2023-10-17 12:41:28 +00:00 by Flamenco 3.7
_meta:
  version: 3
manager_name: Flamenco Manager
database: flamenco-manager.sqlite
listen: :8080
autodiscoverable: true
local_manager_storage_path: ./flamenco-manager-storage
shared_storage_path: /mnt/flamenco
shaman:
  enabled: true
  garbageCollect:
    period: 24h0m0s
    maxAge: 744h0m0s
    extraCheckoutPaths: []
task_timeout: 10m0s
worker_timeout: 1m0s
blocklist_threshold: 3
task_fail_after_softfail_count: 3
variables:
  blender:
    values:
    - platform: linux
      value: blender
    - platform: windows
      value: blender
    - platform: darwin
      value: blender
  storage:
    is_twoway: true
    values:
    - platform: linux
      value: /mnt/flamenco
    - platform: windows
      value: Z:\
    - platform: darwin
      value: /Volumes/shared/flamenco
  blenderArgs:
    values:
    - platform: all
      value: -b -y -E CYCLES -P gpurender.py&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The first added block describes &lt;a href=&quot;https://flamenco.blender.org/usage/variables/multi-platform/&quot;&gt;Two-way variables&lt;/a&gt;, which are needed for multi-platform farms. This solves the main problem with slashes in paths: Linux uses the forward slash (/) as a separator, while Windows uses the backslash (\). Here, we create the replacement rule for all available platforms: Linux, Windows, and macOS (&lt;a href=&quot;https://en.wikipedia.org/wiki/Darwin_(operating_system)&quot;&gt;Darwin&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;When you mount a network share in Windows, you need to choose a drive letter. For example, our &lt;b translate=&quot;no&quot;&gt;Storage&lt;/b&gt; is mounted as the &lt;b translate=&quot;no&quot;&gt;Z:&lt;/b&gt; drive. The replacement rule tells the system that, for the Windows platform, the &lt;b translate=&quot;no&quot;&gt;/mnt/flamenco&lt;/b&gt; path will be located at &lt;b translate=&quot;no&quot;&gt;Z:\&lt;/b&gt;. For macOS, this path will be &lt;b translate=&quot;no&quot;&gt;/Volumes/shared/flamenco&lt;/b&gt;.&lt;/p&gt;
&lt;p&gt;Look at the second added block. It instructs Blender to use the &lt;a href=&quot;https://www.cycles-renderer.org/&quot;&gt;Cycles&lt;/a&gt; rendering engine and to run a simple Python script, &lt;b translate=&quot;no&quot;&gt;gpurender.py&lt;/b&gt;, whenever Blender executes (we create this script on the worker node in Step 5). This is a simple trick to select the GPU instead of the CPU: there is no standard option to do this directly, and you can’t invoke &lt;b translate=&quot;no&quot;&gt;blender --use-gpu&lt;/b&gt; or anything similar. However, you can invoke any external Python script using the &lt;b translate=&quot;no&quot;&gt;-P&lt;/b&gt; option. This setting instructs the &lt;b translate=&quot;no&quot;&gt;Worker&lt;/b&gt; to find the script in its local directory and execute it whenever an assigned job invokes the Blender executable.&lt;/p&gt;
&lt;p&gt;Now, we can delegate control of the application to the &lt;a href=&quot;https://systemd.io/&quot;&gt;systemd&lt;/a&gt; init subsystem. Let’s inform the system about the location of the working directory, the executable file, and the user privileges required for launching. Create a new file:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo nano /etc/systemd/system/flamenco-manager.service&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Fill it with the following strings:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;[Unit]
Description=Flamenco Manager service
[Service]
User=user
WorkingDirectory=/home/user/flamenco-3.7-linux-amd64
ExecStart=/home/user/flamenco-3.7-linux-amd64/flamenco-manager
Restart=always
[Install]
WantedBy=multi-user.target&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Save the file and exit the nano text editor. Then reload the systemd configuration, start the service, and check its status:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl daemon-reload&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl start flamenco-manager.service&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl status flamenco-manager.service&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;● flamenco-manager.service - Flamenco Manager service
Loaded: loaded (/etc/systemd/system/flamenco-manager.service; disabled; vendor preset: enabled)
Active: active (running) since Tue 2023-10-17 11:03:50 UTC; 7s ago
Main PID: 3059 (flamenco-manage)
 Tasks: 7 (limit: 4558)
  Memory: 28.6M
     CPU: 240ms
CGroup: /system.slice/flamenco-manager.service
        └─3059 /home/user/flamenco-3.7-linux-amd64/flamenco-manager&lt;/pre&gt;
&lt;p&gt;Enable automatic start when the system boots:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl enable flamenco-manager.service&lt;/code&gt;&lt;/pre&gt;
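&lt;p&gt;Optionally, confirm that the Manager is listening on its default port (8080, as set by the &lt;b translate=&quot;no&quot;&gt;listen&lt;/b&gt; option in the configuration file):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo ss -ltn | grep 8080&lt;/code&gt;&lt;/pre&gt;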
&lt;h2&gt;Step 5. Worker node&lt;/h2&gt;
&lt;p&gt;Connect to the VPN server using the guide from Step 1 and mount the share from Step 3. Then, install Blender via Snap, just as on the Manager node:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo snap install blender --classic&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Modern *.blend files are compressed with the Zstandard algorithm. To avoid errors, install support for this algorithm:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt -y install python3-zstd&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Download the application:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;wget https://flamenco.blender.org/downloads/flamenco-3.7-linux-amd64.tar.gz&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Unpack the downloaded archive:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;tar xvfz flamenco-3.7-linux-amd64.tar.gz&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Navigate to the created directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd flamenco-3.7-linux-amd64/&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Create an additional script that enables GPU rendering when a Flamenco job runs:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;nano gpurender.py&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;import bpy

def enable_gpus(device_type, use_cpus=False):
    # Read the Cycles add-on preferences and refresh the list of compute devices
    preferences = bpy.context.preferences
    cycles_preferences = preferences.addons[&quot;cycles&quot;].preferences
    cycles_preferences.refresh_devices()
    devices = cycles_preferences.devices
    if not devices:
        raise RuntimeError(&quot;Unsupported device type&quot;)
    activated_gpus = []
    for device in devices:
        # Keep CPUs disabled (unless requested) and enable every GPU found
        if device.type == &quot;CPU&quot;:
            device.use = use_cpus
        else:
            device.use = True
            activated_gpus.append(device.name)
            print(&#39;activated gpu&#39;, device.name)
    # Select the compute backend and switch the scene to GPU rendering
    cycles_preferences.compute_device_type = device_type
    bpy.context.scene.cycles.device = &quot;GPU&quot;
    return activated_gpus

enable_gpus(&quot;CUDA&quot;)&lt;/code&gt;&lt;/pre&gt;
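&lt;p&gt;Save the file and exit. Before wiring the script into Flamenco jobs, you can test it with a standalone render. This sketch assumes a test scene named &lt;b translate=&quot;no&quot;&gt;test.blend&lt;/b&gt; on the shared storage and renders its first frame, using the same arguments as the Manager’s &lt;b translate=&quot;no&quot;&gt;blenderArgs&lt;/b&gt; section:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;blender -b /mnt/flamenco/test.blend -y -E CYCLES -P gpurender.py -f 1&lt;/code&gt;&lt;/pre&gt;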
&lt;p&gt;Next, create a separate service to run Flamenco from systemd:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo nano /etc/systemd/system/flamenco-worker.service&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;[Unit]
Description=Flamenco Worker service
[Service]
User=usergpu
WorkingDirectory=/home/usergpu/flamenco-3.7-linux-amd64
ExecStart=/home/usergpu/flamenco-3.7-linux-amd64/flamenco-worker
Restart=always
[Install]
WantedBy=multi-user.target&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Reload configuration and start the new service:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl daemon-reload&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl start flamenco-worker.service&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl status flamenco-worker.service&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;● flamenco-worker.service - Flamenco Worker service
Loaded: loaded (/etc/systemd/system/flamenco-worker.service; enabled; preset: enabled)
Active: active (running) since Tue 2023-10-17 13:56:18 EEST; 47s ago
Main PID: 636 (flamenco-worker)
 Tasks: 5 (limit: 23678)
Memory: 173.9M
   CPU: 302ms
CGroup: /system.slice/flamenco-worker.service
        └─636 /home/user/flamenco-3.7-linux-amd64/flamenco-worker&lt;/pre&gt;
&lt;p&gt;Enable automatic start when the system boots:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl enable flamenco-worker.service&lt;/code&gt;&lt;/pre&gt;
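&lt;p&gt;To watch the Worker register with the Manager and pick up tasks, you can follow its log:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo journalctl -u flamenco-worker.service -f&lt;/code&gt;&lt;/pre&gt;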
&lt;h2&gt;Step 6. User node&lt;/h2&gt;
&lt;p&gt;The user node can run any operating system. For this guide, we show how to set up a node with Windows 11 and the four necessary components:&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;VPN connection&lt;/li&gt;
  &lt;li&gt;Mounted remote directory&lt;/li&gt;
  &lt;li&gt;Blender installed&lt;/li&gt;
  &lt;li&gt;Flamenco add-on&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Download and install Wireguard from the &lt;a href=&quot;https://download.wireguard.com/windows-client/wireguard-installer.exe&quot;&gt;official website&lt;/a&gt;. Create a new text file and paste the configuration generated for this client in Step 1. Rename the file to &lt;b translate=&quot;no&quot;&gt;flamenco.conf&lt;/b&gt; and add it to Wireguard using the &lt;b translate=&quot;no&quot;&gt;Add tunnel&lt;/b&gt; button:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/893/original/sh_blender_remote_rendering_with_flamenco_6.png?1713174282&quot; alt=&quot;Wireguard Add Tunnel&quot;&gt;
&lt;p&gt;Connect to your server by pressing the &lt;b translate=&quot;no&quot;&gt;Activate&lt;/b&gt; button:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/894/original/sh_blender_remote_rendering_with_flamenco_7.png?1713174312&quot; alt=&quot;Activate the tunnel&quot;&gt;
&lt;p&gt;Let’s mount a remote directory. Right-click on &lt;b translate=&quot;no&quot;&gt;This PC&lt;/b&gt; and select &lt;b translate=&quot;no&quot;&gt;Map network drive…&lt;/b&gt;&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/895/original/sh_blender_remote_rendering_with_flamenco_8.png?1713174340&quot; alt=&quot;Mount the remote directory&quot;&gt;
&lt;p&gt;Choose &lt;b translate=&quot;no&quot;&gt;Z:&lt;/b&gt; as the drive letter, type the Samba share address &lt;b translate=&quot;no&quot;&gt;\\10.0.0.4\private&lt;/b&gt; and don’t forget to tick &lt;b translate=&quot;no&quot;&gt;Connect using different credentials&lt;/b&gt;. Then click &lt;b translate=&quot;no&quot;&gt;Finish&lt;/b&gt;. The system will ask you to enter a username and password for the share. After that, the network directory will be mounted as the Z: drive.&lt;/p&gt;
&lt;p&gt;Download and install Blender from the &lt;a href=&quot;https://www.blender.org/download/&quot;&gt;official website&lt;/a&gt;. Then, open the URL &lt;a href=&quot;http://10.0.0.3:8080/flamenco3-addon.zip&quot;&gt;http://10.0.0.3:8080/flamenco3-addon.zip&lt;/a&gt; and install the Flamenco add-on. Activate it in preferences: &lt;b translate=&quot;no&quot;&gt;Edit &gt; Preferences &gt; Add-ons&lt;/b&gt;. Tick &lt;b translate=&quot;no&quot;&gt;System: Flamenco 3&lt;/b&gt;, enter the Manager URL &lt;a href=&quot;http://10.0.0.3:8080&quot;&gt;http://10.0.0.3:8080&lt;/a&gt;, and click the refresh button. The system will connect to the manager node and load storage settings automatically:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/896/original/sh_blender_remote_rendering_with_flamenco_9.png?1713174378&quot; alt=&quot;Enable Flamenco add-on&quot;&gt;
&lt;p&gt;Open the file that you need to render. On the &lt;b translate=&quot;no&quot;&gt;Scene&lt;/b&gt; tab, choose &lt;b translate=&quot;no&quot;&gt;Cycles&lt;/b&gt; from the Render &lt;b translate=&quot;no&quot;&gt;Engine&lt;/b&gt; drop-down list. Don’t forget to save the file, because these settings are stored directly in the *.blend file:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/897/original/sh_blender_remote_rendering_with_flamenco_10.png?1713174408&quot; alt=&quot;Select Render Engine&quot;&gt;
&lt;p&gt;Scroll down and find the &lt;b translate=&quot;no&quot;&gt;Flamenco 3&lt;/b&gt; section. Click &lt;b translate=&quot;no&quot;&gt;Fetch job types&lt;/b&gt; to get a list of available types. Select &lt;b translate=&quot;no&quot;&gt;Simple Blender Render&lt;/b&gt; from the drop-down list and set other options, such as the number of frames, chunk size, and output folder. Finally, click &lt;b translate=&quot;no&quot;&gt;Submit to Flamenco&lt;/b&gt;:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/898/original/sh_blender_remote_rendering_with_flamenco_11.png?1713174432&quot; alt=&quot;Set rendering parameters&quot;&gt;
&lt;p&gt;The Flamenco add-on creates a new job and uploads a blend file to shared storage. The system will submit the job to an available worker and start the rendering process:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/899/original/sh_blender_remote_rendering_with_flamenco_12.png?1713174462&quot; alt=&quot;Check added rendering job&quot;&gt;
&lt;p&gt;If you check the GPU load with nvtop or a similar utility, you will see that all GPUs are busy with compute tasks:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/900/original/sh_blender_remote_rendering_with_flamenco_13.png?1713174495&quot; alt=&quot;Check GPU load&quot;&gt;
&lt;p&gt;You will find the result in the directory that you selected in the previous step. Example &lt;a href=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/901/original/sh_blender_remote_rendering_with_flamenco_14.gif?1713175121&quot;&gt;here&lt;/a&gt; (&lt;a href=&quot;https://www.blender.org/download/demo-files/&quot;&gt;Ripple Dreams&lt;/a&gt; by &lt;a href=&quot;https://twitter.com/redjam_9&quot;&gt;James Redmond&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/586-photogrammetry-with-meshroom&quot;&gt;Photogrammetry with Meshroom&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/590-fooocus-rethinking-of-sd-and-mj&quot;&gt;Fooocus: Rethinking of SD and MJ&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/565-stable-diffusion-webui&quot;&gt;Stable Diffusion WebUI&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/000/888/original/sh_blender_remote_rendering_with_flamenco_1.jpg?1713174084"
        length="0"
        type="image/jpeg"/>
      <pubDate>Tue, 21 Jan 2025 09:47:24 +0100</pubDate>
      <guid isPermaLink="false">588</guid>
      <dc:date>2025-01-21 09:47:24 +0100</dc:date>
    </item>
    <item>
      <title>Photogrammetry with Meshroom</title>
      <link>https://www.leadergpu.com/catalog/586-photogrammetry-with-meshroom</link>
      <description>&lt;p&gt;Photogrammetry is a method of transforming physical objects into three-dimensional digital models that can be edited with 3D software. This process typically uses specialized devices called 3D scanners, which come in two main types: optical and laser.&lt;/p&gt;
&lt;p&gt;Optical scanners often use one or more digital cameras and special lighting to evenly illuminate the object during scanning. This allows for the creation of a 3D model. Laser scanners, on the other hand, use laser beams. These devices emit multiple laser beams and measure the time it takes for each beam to bounce back from the object. Using this data, along with information from position sensors, the scanner calculates the distance to each point on the object. This creates a “point cloud” that forms the basis of the 3D model.&lt;/p&gt;
&lt;h3&gt;Point cloud&lt;/h3&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/050/original/sh_photogrammetry_with_meshroom_1.png?1724833519&quot; alt=&quot;Points cloud rabbit&quot;&gt;
&lt;p&gt;To build the future framework of an object, the system needs to know the coordinates of each vertex in three-dimensional space. The set of vertices is called a point cloud. The more vertices there are, the more detailed the object will be. Creating a point cloud is the first and one of the most crucial steps in recreating a 3D model from photographs.&lt;/p&gt;
&lt;p&gt;It’s important to note that each vertex in the point cloud is initially unconnected to other vertices. This allows for easy filtering: keeping the necessary points and removing the rest, before starting to recreate the object’s mesh.&lt;/p&gt;
&lt;h3&gt;Mesh objects&lt;/h3&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/051/original/sh_photogrammetry_with_meshroom_2.png?1724833553&quot; alt=&quot;Mesh object rabbit&quot;&gt;
&lt;p&gt;A mesh object is a type of 3D model consisting of triangular geometric primitives, often referred to as meshes or polymeshes. Once object points are formed, the application can independently compose triangular primitives from them. By connecting these primitives, it’s possible to create a 3D model of almost any shape. At this stage, the model lacks color and remains unpainted.&lt;/p&gt;
&lt;p&gt;The subsequent texturing stage addresses this issue.&lt;/p&gt;
&lt;h3&gt;Texturing&lt;/h3&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/052/original/sh_photogrammetry_with_meshroom_3.png?1724833589&quot; alt=&quot;Textured rabbit&quot;&gt;
&lt;p&gt;In the final stage, the application stretches the image texture extracted from the photos onto the prepared mesh object. The quality of the photos and their resolution play a key role here: if they are low, the final result will not look its best. But if a sufficient number of good-quality shots were taken, the output will be a fully ready-to-use 3D model of a real object. Below, we’ll give some useful tips on preparing the original photos.&lt;/p&gt;
&lt;h2&gt;Camera settings&lt;/h2&gt;
&lt;p&gt;To avoid disappointment with your first attempts at creating a 3D model from photographs, consider these simple basic rules. Each rule will help prevent issues that typically arise during the mesh object creation stage.&lt;/p&gt;
&lt;p&gt;First, don’t rely on your digital camera’s automatic settings. Modern cameras try to balance four key parameters independently:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;ISO,&lt;/li&gt;
    &lt;li&gt;white balance,&lt;/li&gt;
    &lt;li&gt;shutter speed,&lt;/li&gt;
    &lt;li&gt;aperture.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In automatic mode, even slight changes in external conditions can cause these settings to vary between frames. These variations can lead to noticeable inconsistencies during the texturing stage.&lt;/p&gt;
&lt;p&gt;To maintain consistent parameters across frames, use the &lt;b translate=&quot;no&quot;&gt;Manual&lt;/b&gt; mode (M). The aperture is a crucial setting here. Depending on your lens, aim for a position where it’s nearly closed. This helps to achieve maximum depth of field: the less open the aperture, the better. However, avoid extreme values. If your lens can be closed down to &lt;b translate=&quot;no&quot;&gt;f/22&lt;/b&gt;, you’ll get good results using values between &lt;b translate=&quot;no&quot;&gt;f/11&lt;/b&gt; and &lt;b translate=&quot;no&quot;&gt;f/20&lt;/b&gt;.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/053/original/sh_photogrammetry_with_meshroom_4.png?1724833619&quot; alt=&quot;Makarios aperture difference&quot;&gt;
&lt;p align=&quot;center&quot;&gt;&lt;sup&gt;&lt;i&gt;Left f/11, right f/22&lt;/i&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;Closing the aperture, however, creates another problem: insufficient light. This can be addressed in two ways: by increasing ISO sensitivity or lengthening shutter speed. Both methods will affect the final result, albeit differently. Raising the ISO to 6400 introduces digital noise in the image, so it’s best to use the lowest possible values. For near-ideal results, setting the ISO to 100 makes sense. Yet, this means the issue of insufficient lighting persists:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/054/original/sh_photogrammetry_with_meshroom_5.png?1724833649&quot; alt=&quot;Makarios ISO difference&quot;&gt;
&lt;p align=&quot;center&quot;&gt;&lt;sup&gt;&lt;i&gt;Left ISO 100, right ISO 6400&lt;/i&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;The most effective way to increase the light reaching the camera sensor in low-light conditions is to lengthen the shutter speed. The longer the shutter remains open, the more photons hit the sensor, resulting in better image quality. However, this approach presents a challenge: without a tripod, a shutter speed of 1/50 second or longer can blur the image. Using a tripod eliminates this problem.&lt;/p&gt;
&lt;p&gt;White balance is the final crucial parameter. It’s important to disable the automatic setting and choose either a preset profile (such as “Sunny day”) or a custom value in Kelvin. For instance, 5200K is a common setting. Lower values shift the hue towards yellow, while higher values lean towards blue. To avoid time-consuming color corrections in post-processing, use the same white balance profile for all photos in a series.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/055/original/sh_photogrammetry_with_meshroom_6.png?1724833681&quot; alt=&quot;Makarios white balance&quot;&gt;
&lt;p align=&quot;center&quot;&gt;&lt;sup&gt;&lt;i&gt;WB profiles. Left “Sunny day”, right “Auto”&lt;/i&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;In summary, to capture high-quality photos for photogrammetry:&lt;/p&gt;
&lt;ol&gt;
    &lt;li&gt;Use a tripod when there is insufficient light.&lt;/li&gt;
    &lt;li&gt;Close the aperture nearly to its minimum.&lt;/li&gt;
    &lt;li&gt;Set the ISO to its minimum value.&lt;/li&gt;
    &lt;li&gt;Choose a shutter speed that gives you the desired result (or use your camera’s built-in exposure meter).&lt;/li&gt;
    &lt;li&gt;Use the same white balance preset.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;Taking photos&lt;/h2&gt;
&lt;p&gt;Let’s discuss how many photos to take and from which angles. The type of object and its background significantly influence the final result. Objects without shiny, transparent, or reflective surfaces are ideal for photogrammetry. In practice, objects like windows and glass often require correction in a 3D editor later. However, the general shooting technique remains the same.&lt;/p&gt;
&lt;p&gt;For small objects placed on a surface, imagine a sphere around the object. Take photos as if your camera is circling the object three times: once from below, once at the middle, and once from above.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/056/original/sh_photogrammetry_with_meshroom_7.png?1724833713&quot; alt=&quot;Rabbit camera positions&quot;&gt;
&lt;p&gt;It’s crucial that the object occupies at least half, preferably three-quarters, of each frame. Instead of using zoom, try to get physically closer to the object. When creating a point cloud, the software needs as many pixels as possible.&lt;/p&gt;
&lt;p&gt;When shooting, remember that the software combines frames into a single object for correct geometry. Make it a rule to take at least three frames from each angle. Once you’ve centered the object in the frame, mentally divide it vertically into three equal parts. Take three pictures, each focusing on one-third of the object. This provides the necessary overlap for the application to accurately calculate each point’s location in 3D space. After photographing the object from all possible sides and angles, you can start preparing the software.&lt;/p&gt;
&lt;h2&gt;Install Meshroom&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://alicevision.org/&quot;&gt;Meshroom&lt;/a&gt; is a free, cross-platform application that sequentially performs all processing stages, utilizing CPU and GPU resources. While it can run on a standard home computer, each stage may be time-consuming. For large-scale projects involving 3D reconstruction of numerous objects, such as creating an impressive 3D scene, renting a &lt;a href=&quot;https://www.leadergpu.com/&quot;&gt;dedicated GPU server&lt;/a&gt; might be a practical solution.&lt;/p&gt;
&lt;p&gt;Let’s consider a LeaderGPU server with the following configuration: &lt;b translate=&quot;no&quot;&gt;2 x NVIDIA® RTX™ 3090, 2 x Intel® Xeon® Silver 4210 (3.20 GHz), 128GB RAM&lt;/b&gt;. We’ll use Windows Server 2022 as the operating system. Before installing Meshroom, you’ll need to perform some preliminary steps:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/articles/489-connect-to-a-windows-server&quot;&gt;Connect to a Windows server&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/articles/500-install-nvidia-drivers-in-windows&quot;&gt;Install NVIDIA® drivers in Windows&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/articles/513-gpu-rendering-in-rdp&quot;&gt;GPU rendering in RDP&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Visit the project’s official website to &lt;a href=&quot;https://alicevision.org/#meshroom&quot;&gt;download Meshroom&lt;/a&gt;. Unpack the resulting archive to find a ready-to-use application that doesn’t require additional installation. Launch &lt;b translate=&quot;no&quot;&gt;Meshroom.exe&lt;/b&gt; to begin.&lt;/p&gt;
&lt;h3&gt;Upload images&lt;/h3&gt;
&lt;p&gt;The main window of the application is divided into two parts: upper and lower. The upper section contains the Image Gallery, Image Viewer, and 3D Viewer. The lower section houses the Graph editor and Task Manager. To start, drag and drop your captured photos into the designated area. Both compressed (for example, JPG) and RAW file formats are supported. It is recommended to use RAW files because they contain significantly more data for each frame.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/057/original/sh_photogrammetry_with_meshroom_8.png?1724833779&quot; alt=&quot;Meshroom main window&quot;&gt;
&lt;p&gt;Please note that you already have a ready-made standard pipeline by default, which is schematically displayed in the Graph Editor. This is one of the most important controls that helps to configure all aspects of image processing at each stage. You can manually run each stage by right-clicking and selecting &lt;b translate=&quot;no&quot;&gt;Compute&lt;/b&gt; from the drop-down menu.&lt;/p&gt;
&lt;p&gt;But for the first time, you can simply click the green &lt;b translate=&quot;no&quot;&gt;Start&lt;/b&gt; button, and the application will do everything for you. It will prompt you to save the project, so that you do not accidentally lose the results of the calculation. Click &lt;b translate=&quot;no&quot;&gt;Save&lt;/b&gt;, specify a name and directory and save the project:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/058/original/sh_photogrammetry_with_meshroom_9.png?1724833809&quot; alt=&quot;Meshroom save project&quot;&gt;
&lt;p&gt;Next, the application transfers all processing stages from the Graph Editor to the Task Manager, which handles their execution in a specific order. To check the status of each stage, select the corresponding block in the Graph Editor and click the &lt;b translate=&quot;no&quot;&gt;Log&lt;/b&gt; button in the lower right corner of the screen. You can also see in real time which stage is currently being processed:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/059/original/sh_photogrammetry_with_meshroom_10.png?1724833857&quot; alt=&quot;Meshroom task manager&quot;&gt;
&lt;p&gt;On the right side, you can see the point cloud you’ve built. The final result, generated using the standard pipeline, is available in the directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;[Your_Project_Path]\MeshroomCache\Texturing\[Random_Symbols]\texturedMesh.obj&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Of course, if you set the output path in the final node of the pipeline beforehand, the object will end up at the path you specified. You can then import it into any 3D editor to fix surfaces, add light sources, and apply other effects before rendering.&lt;/p&gt;
&lt;h2&gt;Integration&lt;/h2&gt;
&lt;p&gt;While the initial result may look impressive, it often requires refinement in a 3D editor. Meshroom simplifies this process by allowing you to import not just the model, but also the point cloud and camera positions into third-party editors like &lt;a href=&quot;https://www.sidefx.com/&quot;&gt;Houdini&lt;/a&gt; or &lt;a href=&quot;https://www.blender.org/&quot;&gt;Blender&lt;/a&gt;. In the following section, we’ll explore how to do this.&lt;/p&gt;
&lt;h3&gt;Houdini&lt;/h3&gt;
&lt;p&gt;In fact, Meshroom is a user-friendly interface for the AliceVision engine, which handles all computation-related operations. This interface implements the corresponding pipeline and task manager. If you use Houdini, you can create your own pipeline directly within the application and use it alongside other tools, eliminating the need to launch Meshroom separately.&lt;/p&gt;
&lt;p&gt;To get started, it’s best to &lt;a href=&quot;https://www.sidefx.com/download/download-houdini/120709/&quot;&gt;download&lt;/a&gt; and install a dedicated launcher that will manage Houdini updates and plugins. Next, add the SideFX Labs plugin, which offers numerous additional tools, including specific nodes for AliceVision. To do this, click the &lt;b translate=&quot;no&quot;&gt;+&lt;/b&gt; button, then select &lt;b translate=&quot;no&quot;&gt;Shelves&lt;/b&gt;:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/060/original/sh_photogrammetry_with_meshroom_11.png?1724833888&quot; alt=&quot;Houdini add Shelves Houdini add Shelves&quot;&gt;
&lt;p&gt;Scroll down the list and select &lt;b translate=&quot;no&quot;&gt;SideFX Labs&lt;/b&gt;, then click the &lt;b translate=&quot;no&quot;&gt;Update Toolset&lt;/b&gt; button:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/061/original/sh_photogrammetry_with_meshroom_12.png?1724833916&quot; alt=&quot;Houdini SideFX Labs Update Toolset&quot;&gt;
&lt;p&gt;To install a plugin, follow these steps: Click the &lt;b translate=&quot;no&quot;&gt;Start Launcher&lt;/b&gt; button, navigate to the &lt;b translate=&quot;no&quot;&gt;Labs/Packages&lt;/b&gt; section in the left menu, and select &lt;b translate=&quot;no&quot;&gt;Install packages&lt;/b&gt;. This will open a window where you can choose packages to install:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/062/original/sh_photogrammetry_with_meshroom_13.png?1724833946&quot; alt=&quot;Add Houdini plugin&quot;&gt;
&lt;p&gt;Choose the &lt;b translate=&quot;no&quot;&gt;Production Build&lt;/b&gt; for your version of Houdini and click &lt;b translate=&quot;no&quot;&gt;Install&lt;/b&gt;. Afterward, restart the application to ensure the new effect icons appear at the top:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/063/original/sh_photogrammetry_with_meshroom_14.png?1724833974&quot; alt=&quot;Houdini new items&quot;&gt;
&lt;p&gt;It’s crucial to note that you won’t find any mention of AliceVision or Meshroom here. This is because the corresponding plugin only functions within the geometry context pipeline. To verify this, click the &lt;b translate=&quot;no&quot;&gt;+&lt;/b&gt; icon, then select &lt;b translate=&quot;no&quot;&gt;New Pane Tab Type&lt;/b&gt;, and choose &lt;b translate=&quot;no&quot;&gt;Network View&lt;/b&gt;:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/064/original/sh_photogrammetry_with_meshroom_15.png?1724834004&quot; alt=&quot;Houdini Network View&quot;&gt;
&lt;p&gt;Press the &lt;b translate=&quot;no&quot;&gt;Tab&lt;/b&gt; key and add a &lt;b translate=&quot;no&quot;&gt;Geometry&lt;/b&gt; node:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/065/original/sh_photogrammetry_with_meshroom_16.png?1724834038&quot; alt=&quot;Houdini add Geometry&quot;&gt;
&lt;p&gt;Double click to open the created node and type &lt;b translate=&quot;no&quot;&gt;av&lt;/b&gt; on your keyboard. The system will instantly display a list of available nodes starting with the Labs AV symbols. These nodes allow you to control the AliceVision engine and integrate it into your own pipelines:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/066/original/sh_photogrammetry_with_meshroom_17.png?1724834061&quot; alt=&quot;Houdini AliceVision nodes&quot;&gt;
&lt;p&gt;To create a proper pipeline, refer to the &lt;a href=&quot;https://www.sidefx.com/tutorials/alicevision-plugin/&quot;&gt;official documentation&lt;/a&gt; for the plugin. Additionally, consider adding the AliceVision directory to the list of environment variables in the houdini.env file. For a standard installation using the launcher, this file is typically located in the directory &lt;b translate=&quot;no&quot;&gt;C:\Users\Administrator\Documents\houdini20.5\&lt;/b&gt;.&lt;/p&gt;
&lt;p&gt;Open the houdini.env file with any text editor and add the following line:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;ALICEVISION_PATH = [path to alicevision directory in Meshroom folder]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For example, if you installed Meshroom in the root directory of the D: drive, your path might look like this:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;ALICEVISION_PATH = D:\Meshroom\aliceVision&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Save the file, then restart the Houdini application.&lt;/p&gt;
&lt;h3&gt;Blender&lt;/h3&gt;
&lt;p&gt;For Blender users, we recommend the &lt;b translate=&quot;no&quot;&gt;Meshroom2Blender&lt;/b&gt; plugin. While it functions differently from the Houdini plugin, it allows you to export point clouds and camera positions calculated by Meshroom to Blender. To access the plugin code, open the link in your browser:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;https://raw.githubusercontent.com/tibicen/meshroom2blender/master/view3d_point_cloud_visualizer.py&lt;/code&gt;&lt;/pre&gt;
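&lt;p&gt;If you prefer working in a terminal, you can also fetch the same file directly, for example with curl (available out of the box in modern Windows and Linux alike):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -o view3d_point_cloud_visualizer.py https://raw.githubusercontent.com/tibicen/meshroom2blender/master/view3d_point_cloud_visualizer.py&lt;/code&gt;&lt;/pre&gt;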
&lt;p&gt;Save the code as &lt;b translate=&quot;no&quot;&gt;view3d_point_cloud_visualizer.py&lt;/b&gt; in a convenient directory. Next, open Blender and navigate to &lt;b translate=&quot;no&quot;&gt;Edit&lt;/b&gt; - &lt;b translate=&quot;no&quot;&gt;Preferences&lt;/b&gt;. From there, select the &lt;b translate=&quot;no&quot;&gt;Add-ons&lt;/b&gt; tab:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/067/original/sh_photogrammetry_with_meshroom_18.png?1724834088&quot; alt=&quot;Blender Preferences&quot;&gt;
&lt;p&gt;Click the down arrow and select &lt;b translate=&quot;no&quot;&gt;Install from Disk&lt;/b&gt;:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/068/original/sh_photogrammetry_with_meshroom_19.png?1724834111&quot; alt=&quot;Blender install addons&quot;&gt;
&lt;p&gt;In the newly opened window, navigate to the directory where you saved the plugin. Select the plugin file and click the &lt;b translate=&quot;no&quot;&gt;Install from Disk&lt;/b&gt; button:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/069/original/sh_photogrammetry_with_meshroom_20.png?1724834139&quot; alt=&quot;Blender choose plugin file&quot;&gt;
&lt;p&gt;The plugin is now installed. It’s recommended to restart the application. After restarting, you’ll see the &lt;b translate=&quot;no&quot;&gt;Point Cloud Visualizer&lt;/b&gt; item in the viewing mode. The plugin requires you to specify the path to a file with the &lt;b translate=&quot;no&quot;&gt;.ply&lt;/b&gt; extension:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/070/original/sh_photogrammetry_with_meshroom_21.png?1724834177&quot; alt=&quot;Blender new option&quot;&gt;
&lt;p&gt;By default, Meshroom doesn’t generate this type of file. To create it, open the pipeline and add the &lt;b translate=&quot;no&quot;&gt;ConvertSfMFormat&lt;/b&gt; node. Use the &lt;b translate=&quot;no&quot;&gt;SfMData&lt;/b&gt; from the &lt;b translate=&quot;no&quot;&gt;StructureFromMotion&lt;/b&gt; node as input. For output, specify the &lt;b translate=&quot;no&quot;&gt;Images Folder&lt;/b&gt; of the &lt;b translate=&quot;no&quot;&gt;Texturing&lt;/b&gt; node.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/071/original/sh_photogrammetry_with_meshroom_22.png?1724834206&quot; alt=&quot;Meshroom add Convert node&quot;&gt;
&lt;p&gt;The final step is to specify the format. Click on &lt;b translate=&quot;no&quot;&gt;SfM File Format&lt;/b&gt; in the &lt;b translate=&quot;no&quot;&gt;ConvertSfMFormat&lt;/b&gt; node and select &lt;b translate=&quot;no&quot;&gt;ply&lt;/b&gt; from the drop-down list:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/072/original/sh_photogrammetry_with_meshroom_23.png?1724834239&quot; alt=&quot;Meshroom Convert format&quot;&gt;
&lt;p&gt;Right click on the created node and select &lt;b translate=&quot;no&quot;&gt;Compute&lt;/b&gt;:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/073/original/sh_photogrammetry_with_meshroom_24.png?1724834267&quot; alt=&quot;Meshroom compute task&quot;&gt;
&lt;p&gt;Once the process is complete, you’ll find the required file in the directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;[Your_Project_Path]\MeshroomCache\ConvertSfMFormat\[Random_Symbols]\sfm.ply&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can load it into Blender in two ways: through the aforementioned plugin or via the standard import process &lt;b translate=&quot;no&quot;&gt;File&lt;/b&gt; - &lt;b translate=&quot;no&quot;&gt;Import&lt;/b&gt; - &lt;b translate=&quot;no&quot;&gt;Stanford PLY (.ply)&lt;/b&gt;:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/074/original/sh_photogrammetry_with_meshroom_25.png?1724834291&quot; alt=&quot;Blender import points cloud&quot;&gt;
&lt;p&gt;For more information on using this plugin, we suggest consulting the &lt;a href=&quot;https://github.com/tibicen/meshroom2blender&quot;&gt;project repository&lt;/a&gt; or a specialized web resource.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Photogrammetry is a vast field of knowledge, and here we have covered only a few basic techniques for converting 2D images into a 3D model. These techniques are used in many industries, from architecture to computer game development.&lt;/p&gt;
&lt;p&gt;Having gained your first experience of shooting a dataset and progressively turning it into a 3D model, you will be able to improve your skills and transfer physical objects into virtual 3D space. And LeaderGPU will help you with computing power, reducing calculation time and freeing up your workstation for other, often higher-priority tasks.&lt;/p&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/588-blender-remote-rendering-with-flamenco&quot;&gt;Blender remote rendering with Flamenco&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/590-fooocus-rethinking-of-sd-and-mj&quot;&gt;Fooocus: Rethinking of SD and MJ&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/565-stable-diffusion-webui&quot;&gt;Stable Diffusion WebUI&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/001/049/original/il_photogrammetry_with_meshroom.png?1724833423"
        length="0"
        type="image/jpeg"/>
      <pubDate>Tue, 21 Jan 2025 09:38:44 +0100</pubDate>
      <guid isPermaLink="false">586</guid>
      <dc:date>2025-01-21 09:38:44 +0100</dc:date>
    </item>
    <item>
      <title>Open WebUI: All in one</title>
      <link>https://www.leadergpu.com/catalog/584-open-webui-all-in-one</link>
      <description>&lt;p&gt;Open WebUI was originally developed for Ollama, which we talked about in one of our articles. Previously, it was called Ollama WebUI, but over time, the focus shifted to universality of application, and the name was changed to Open WebUI. This software solves the key problem of convenient work with large neural network models placed locally or on user-controlled servers.&lt;/p&gt;
&lt;h2&gt;Installation&lt;/h2&gt;
&lt;p&gt;The main and most preferred installation method is to deploy a Docker container. This frees you from worrying about dependencies and other components needed for the software to run correctly. However, you can also install Open WebUI by cloning the project repository from GitHub and building it from source code. In this article, we’ll consider both options.&lt;/p&gt;
&lt;p&gt;Before you begin, make sure that the GPU drivers are installed on the server. Our instruction &lt;a href=&quot;https://www.leadergpu.com/articles/499-install-nvidia-drivers-in-linux&quot;&gt;Install NVIDIA® drivers in Linux&lt;/a&gt; will help you do this.&lt;/p&gt;
&lt;h3&gt;Using Docker&lt;/h3&gt;
&lt;p&gt;If you’ve just ordered a server, then the Docker Engine itself and the necessary set of tools for passing GPUs to the container will be missing. We don’t recommend installing Docker from the standard Ubuntu repository, since it may be outdated and not support all modern options. It would be better to use the installation script posted on the official website:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -sSL https://get.docker.com/ | sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In addition to Docker, you need to install the NVIDIA® Container Toolkit, so enable the NVIDIA® repository:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&amp;&amp; curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed &#39;s#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g&#39; | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Update your package cache and install NVIDIA® Container Toolkit:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update &amp;&amp; sudo apt -y install nvidia-container-toolkit&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For the toolchain to work, you’ll need to restart the Docker daemon:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl restart docker&lt;/code&gt;&lt;/pre&gt;
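&lt;p&gt;Optionally, you can check that the GPU is now visible from inside containers. A quick diagnostic, assuming the drivers are installed correctly on the host, is to run nvidia-smi in a disposable container:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo docker run --rm --gpus=all ubuntu nvidia-smi&lt;/code&gt;&lt;/pre&gt;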
&lt;p&gt;Now you can run the desired container. Note that the following command doesn&#39;t isolate the container from the host network, so that you can later enable additional integrations, such as image generation via the Stable Diffusion WebUI. The command automatically downloads all layers of the image and runs it:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo docker run -d --network=host --gpus=all -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama&lt;/code&gt;&lt;/pre&gt;
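&lt;p&gt;Since the container starts in the background (the -d key), it may take a short while to initialize. You can check its status and follow the startup logs with standard Docker commands:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo docker ps
sudo docker logs -f open-webui&lt;/code&gt;&lt;/pre&gt;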
&lt;h3&gt;Using Git&lt;/h3&gt;
&lt;h4&gt;Ubuntu 22.04&lt;/h4&gt;
&lt;p&gt;First, you need to clone the contents of the repository:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git clone https://github.com/open-webui/open-webui.git&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Open the downloaded directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd open-webui/&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Copy the example configuration (you can modify it if necessary), which will set the environment variables for the build:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cp -RPp .env.example .env&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Install the NVM installer, which will help you install the required version of Node.js on the server:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After that, you need to close and reopen the SSH session so that the next command works correctly.&lt;/p&gt;
&lt;p&gt;Install Node Package Manager:&lt;/p&gt;  
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt -y install npm&lt;/code&gt;&lt;/pre&gt; 
&lt;p&gt;Install Node.js version 22 (current at the time of writing this article):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;nvm install 22&lt;/code&gt;&lt;/pre&gt;
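&lt;p&gt;You can verify which version is now active:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;node -v&lt;/code&gt;&lt;/pre&gt;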
&lt;p&gt;Install the dependencies required for further assembly:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;npm install&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let’s start the build. Please note that it requires more than 4GB of free RAM:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;npm run build&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The frontend is ready; now it’s time to prepare the backend. Go to the directory with the same name:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd ./backend&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Install pip and ffmpeg packages:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt -y install python3-pip ffmpeg&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Before installing the Python dependencies, you need to add a new path to the PATH environment variable. Open your shell configuration file:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;nano ~/.bashrc&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Add the following line to the end of the file:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;export PATH=&quot;/home/usergpu/.local/bin:$PATH&quot;&lt;/code&gt;&lt;/pre&gt;
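&lt;p&gt;Apply the change to the current session (or reconnect via SSH):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;source ~/.bashrc&lt;/code&gt;&lt;/pre&gt;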
&lt;p&gt;Update pip itself to the latest version:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;python3 -m pip install --upgrade pip&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now you can install the dependencies:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;pip install -r requirements.txt -U&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Install Ollama:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -fsSL https://ollama.com/install.sh | sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Everything is ready to launch the application:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;bash start.sh&lt;/code&gt;&lt;/pre&gt;
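&lt;p&gt;By default, the backend listens on port 8080 (the same port used later in this article). As a quick check that it is up, you can, for example, query it from another terminal:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -I http://localhost:8080&lt;/code&gt;&lt;/pre&gt;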
&lt;h4&gt;Ubuntu 24.04 / 24.10&lt;/h4&gt;
&lt;p&gt;When installing Open WebUI on Ubuntu 24.04/24.10, you&#39;ll face a key challenge: the operating system uses Python 3.12 by default, while Open WebUI only supports version 3.11. You can&#39;t simply downgrade Python, as doing so would break the operating system. Since the python3.11 package isn&#39;t available in the standard repositories, you&#39;ll need to create a virtual environment with the correct Python version.&lt;/p&gt;
&lt;p&gt;The best solution is to use the Conda package management system. Conda works like pip but adds virtual environment support similar to venv. Since you only need basic functionality, you&#39;ll use Miniforge, a lightweight Conda-based distribution. Download the latest release from GitHub:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -L -O &quot;https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh&quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Run the script:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;bash Miniforge3-$(uname)-$(uname -m).sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let&#39;s create a virtual environment named pyenv and specify Python version 3.11:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;conda create -n pyenv python=3.11&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Activate the created environment:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;conda activate pyenv&lt;/code&gt;&lt;/pre&gt;
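&lt;p&gt;Verify that the environment provides the expected interpreter:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;python --version&lt;/code&gt;&lt;/pre&gt;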
&lt;p&gt;Now you can proceed with the standard Open WebUI installation steps for Ubuntu 22.04. The virtual environment ensures that all installation scripts run smoothly, without package version conflicts.&lt;/p&gt;
&lt;h2&gt;Models&lt;/h2&gt;
&lt;h3&gt;Ollama library&lt;/h3&gt;
&lt;p&gt;Open WebUI allows you to upload models directly from the web interface, specifying only the name in the format &lt;b translate=&quot;no&quot;&gt;model:size&lt;/b&gt;. To do this, navigate to &lt;a href=&quot;http://192.168.88.20:8080/admin/settings&quot;&gt;http://192.168.88.20:8080/admin/settings&lt;/a&gt; and click &lt;b translate=&quot;no&quot;&gt;Connections&lt;/b&gt;. Then click the wrench icon next to the &lt;b translate=&quot;no&quot;&gt;http://localhost:11434&lt;/b&gt; entry. Look up the model’s name in the &lt;a href=&quot;https://ollama.com/library&quot;&gt;library&lt;/a&gt;, enter it, and click the download icon:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/027/original/sh_open_webui_all_in_one_1.png?1722870065&quot; alt=&quot;Open WebUI manage models&quot;&gt;
&lt;p&gt;After that, the system will automatically download the required model, and it will immediately become available for use. Depending on the selected size, the download time will vary. Before downloading, make sure that there is enough space on the disk drive. For more information, see the article &lt;a href=&quot;https://www.leadergpu.com/articles/492-disk-partitioning-in-linux&quot;&gt;Disk partitioning in Linux&lt;/a&gt;.&lt;/p&gt;
&lt;h3&gt;Custom models&lt;/h3&gt;
&lt;p&gt;If you need to integrate a neural network model that is not in the Ollama library, you can use an experimental function and load an arbitrary model in GGUF format. To do this, go to &lt;b translate=&quot;no&quot;&gt;Settings - Admin Settings - Connections&lt;/b&gt; and click the wrench icon next to &lt;b translate=&quot;no&quot;&gt;http://localhost:11434&lt;/b&gt;. Click &lt;b translate=&quot;no&quot;&gt;Show&lt;/b&gt; in the &lt;b translate=&quot;no&quot;&gt;Experimental&lt;/b&gt; section. By default, file mode is active, which lets you upload a file from your local computer. If you click &lt;b translate=&quot;no&quot;&gt;File Mode&lt;/b&gt;, it switches to &lt;b translate=&quot;no&quot;&gt;URL Mode&lt;/b&gt;, which lets you specify the URL of the model file so the server downloads it automatically:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/028/original/sh_open_webui_all_in_one_2.png?1736411361&quot; alt=&quot;Open WebUI upload gguf model&quot;&gt;
&lt;h2&gt;RAG&lt;/h2&gt;
&lt;p&gt;In addition to a convenient and functional web interface, Open WebUI helps expand the capabilities of different models by making it possible to use them together. For example, it’s easy to upload documents to form a RAG (Retrieval-Augmented Generation) vector database. When generating a response to the user, the LLM can then rely not only on data obtained during training, but also on data placed in such a vector database.&lt;/p&gt;
&lt;h3&gt;Documents&lt;/h3&gt;
&lt;p&gt;By default, Open WebUI scans the /data/docs directory for files to embed into the vector database, performing the transformation with the built-in &lt;a href=&quot;https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2&quot;&gt;all-MiniLM-L6-v2&lt;/a&gt; model. This is not the only model suited to the task, so it makes sense to try other options as well.&lt;/p&gt;
&lt;p&gt;Text documents, cleared of tags and other special characters, are best suited for RAG. Of course, you can upload documents as is, but this can greatly affect the accuracy of the generated answers. For example, if you have a knowledge base in Markdown format, you can first clear it of formatting and only then upload it to /data/docs.&lt;/p&gt;
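&lt;p&gt;One possible way to strip the formatting in bulk is the pandoc converter (the file name knowledge.md here is just an illustration):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt -y install pandoc
pandoc -f markdown -t plain knowledge.md -o knowledge.txt&lt;/code&gt;&lt;/pre&gt;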
&lt;h3&gt;Web search&lt;/h3&gt;
&lt;p&gt;In addition to local documents, the neural network model can be instructed to use any websites as a data source. This will allow it to answer questions using not only the data it was trained on, but also data hosted on websites specified by the user.&lt;/p&gt;
&lt;p&gt;In fact, this is a type of RAG that takes HTML pages as input, transforms them in a special way, and stores the result in the vector database. Searching such a database is very fast, so the neural network model can quickly generate a response based on the results. Open WebUI supports different search engines but can only work with one at a time, which is specified in the settings.&lt;/p&gt;
&lt;p&gt;To include web search results in neural network responses, click &lt;b translate=&quot;no&quot;&gt;+&lt;/b&gt; (plus symbol) and slide the &lt;b translate=&quot;no&quot;&gt;Web Search&lt;/b&gt; switch:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/029/original/sh_open_webui_all_in_one_3.png?1722870140&quot; alt=&quot;Open WebUI enable Web Search&quot;&gt;
&lt;h2&gt;Image generation&lt;/h2&gt;
&lt;p&gt;The highlight of Open WebUI is that this software allows you to combine several neural networks with different tasks to solve a single problem. For example, Llama 3.1 perfectly conducts a dialogue with the user in several languages, but its answers will be exclusively text. It can’t generate images, so there is no way to illustrate its answers.&lt;/p&gt;
&lt;p&gt;Stable Diffusion, which we often wrote about, is the opposite: this neural network generates images perfectly, but it can’t work with texts at all. The developers of Open WebUI tried to combine the strengths of both neural networks in one dialogue and implemented the following scheme of work.&lt;/p&gt;
&lt;p&gt;When you conduct a dialogue in Open WebUI, a special button appears next to each neural network response. By clicking on it, you’ll receive an illustration of this response directly in the chat:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/030/original/sh_open_webui_all_in_one_4.png?1722870173&quot; alt=&quot;Open WebUI images in dialogue&quot;&gt;
&lt;p&gt;This is achieved by calling the Stable Diffusion WebUI API; at the moment, connections to Automatic1111 builds and to ComfyUI are available. You can also generate images via the DALL-E neural network, but it can’t be deployed locally: it is a paid, closed-source image generation service.&lt;/p&gt;
&lt;p&gt;This feature will only work if, in addition to Open WebUI with Ollama, Stable Diffusion WebUI is installed on the server. You can find the installation instructions &lt;a href=&quot;https://www.leadergpu.com/articles/506-stable-diffusion-webui&quot;&gt;here&lt;/a&gt;. The only thing worth mentioning is that when running the ./webui.sh script, you’ll need to specify an additional key to enable the API:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;./webui.sh --listen --api --gradio-auth user:password&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Another pitfall may arise from a lack of video memory. If you encounter this, two useful keys can help: &lt;b translate=&quot;no&quot;&gt;--medvram&lt;/b&gt; and &lt;b translate=&quot;no&quot;&gt;--lowvram&lt;/b&gt;. They help avoid out-of-memory errors when starting generation.&lt;/p&gt;
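&lt;p&gt;For example, to launch with the reduced-VRAM profile, simply append the key to the command shown above:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;./webui.sh --listen --api --medvram --gradio-auth user:password&lt;/code&gt;&lt;/pre&gt;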
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/583-how-does-ollama-work&quot;&gt;How does Ollama work&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/574-your-own-llama-2-in-linux&quot;&gt;Your own LLaMa 2 in Linux&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/573-llama-3-using-hugging-face&quot;&gt;Llama 3 using Hugging Face&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/001/026/original/il_open_webui_all_in_one.png?1722870022"
        length="0"
        type="image/jpeg"/>
      <pubDate>Mon, 20 Jan 2025 15:21:46 +0100</pubDate>
      <guid isPermaLink="false">584</guid>
      <dc:date>2025-01-20 15:21:46 +0100</dc:date>
    </item>
    <item>
      <title>How does Ollama work</title>
      <link>https://www.leadergpu.com/catalog/583-how-does-ollama-work</link>
      <description>&lt;p&gt;Ollama is a tool for running large neural network models locally. The use of public services is often perceived by businesses as a potential risk for leakage of confidential and sensitive data. Therefore, deploying LLM on a controlled server allows you to independently manage the data placed on it while utilizing the strengths of LLM.&lt;/p&gt;
&lt;p&gt;This also helps avoid the unpleasant situation of vendor lock-in, where any public service can unilaterally stop providing services. Of course, the initial goal is to enable the use of generative neural networks in locations where internet access is absent or difficult (for example, on an airplane).&lt;/p&gt;
&lt;p&gt;The idea was to simplify the launch, control, and fine-tuning of LLMs. Instead of complex multi-step instructions, Ollama allows you to execute one simple command and, after some time, receive a working result: a locally running neural network model that you can talk to through a web interface, plus an API for easy integration into other applications.&lt;/p&gt;
&lt;p&gt;For many developers, this became a very useful tool, since in most cases Ollama can be integrated with the IDE they use, providing recommendations or ready-made code right while they work on an application.&lt;/p&gt;
&lt;p&gt;Ollama was originally intended only for computers with the macOS operating system, but was later ported to Linux and Windows. A special version has also been released for working in containerized environments such as Docker. Currently, it works equally well on both desktops and any dedicated server with a GPU. Ollama supports the ability to switch between different models out-of-the-box and maximizes all available resources. Of course, these models may not perform as well on a regular desktop, but they function quite adequately.&lt;/p&gt;
&lt;h2&gt;How to install Ollama&lt;/h2&gt;
&lt;p&gt;Ollama can be installed in two ways: without containerization, using an installation script, or as a ready-made Docker container. The first method makes it easier to manage the components of the installed system and the models, but is less fault-tolerant. The second method is more fault-tolerant, but you need to take into account everything inherent to containers: slightly more complex management and a different approach to data storage.&lt;/p&gt;
&lt;p&gt;Regardless of the chosen method, several additional steps are needed to prepare the operating system.&lt;/p&gt;
&lt;h3&gt;Prerequisites&lt;/h3&gt;
&lt;p&gt;Update the package cache repository and installed packages:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update &amp;&amp; sudo apt -y upgrade&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Install all necessary GPU drivers using the auto-install feature:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo ubuntu-drivers autoinstall&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Reboot the server:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo shutdown -r now&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Installation via script&lt;/h3&gt;
&lt;p&gt;The following script detects the current operating system architecture and installs the appropriate version of Ollama:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -fsSL https://ollama.com/install.sh | sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;During operation, the script will create a separate &lt;b translate=&quot;no&quot;&gt;ollama&lt;/b&gt; user, under which the corresponding daemon will be launched. Incidentally, the same script functions well in WSL2, enabling the installation of the Linux version of Ollama on Windows Server.&lt;/p&gt;
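&lt;p&gt;Once the script finishes, you can make sure the daemon is active:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl status ollama&lt;/code&gt;&lt;/pre&gt;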
&lt;h3&gt;Installation via Docker&lt;/h3&gt;
&lt;p&gt;There are various methods to install Docker Engine on a server. The easiest way is to use a specific script that installs the current Docker version. This approach is effective for Ubuntu Linux, from version 20.04 (LTS) up to the latest version, Ubuntu 24.04 (LTS):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -sSL https://get.docker.com/ | sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For Docker containers to interact properly with the GPU, an additional toolkit must be installed. Since it’s not available in the basic Ubuntu repositories, you need to first add a third-party repository using the following command:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&amp;&amp; curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed &#39;s#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g&#39; | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Update the package cache repository:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And install the &lt;a href=&quot;https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html&quot;&gt;nvidia-container-toolkit&lt;/a&gt; package:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt install nvidia-container-toolkit&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Don’t forget to restart the docker daemon via systemctl:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl restart docker&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It’s time to download and run Ollama with the Open WebUI web interface:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo docker run -d -p 3000:8080 --gpus=all -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Open a web browser and navigate to &lt;b translate=&quot;no&quot;&gt;http://[server-ip]:3000&lt;/b&gt;.&lt;/p&gt;
&lt;h2&gt;Download and run the models&lt;/h2&gt;
&lt;h3&gt;Via command line&lt;/h3&gt;
&lt;p&gt;Just run the following command:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;ollama run llama3&lt;/code&gt;&lt;/pre&gt;
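&lt;p&gt;The same model can also be queried through the Ollama REST API, which listens on port 11434 by default. For example:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl http://localhost:11434/api/generate -d &#39;{
  &quot;model&quot;: &quot;llama3&quot;,
  &quot;prompt&quot;: &quot;Why is the sky blue?&quot;,
  &quot;stream&quot;: false
}&#39;&lt;/code&gt;&lt;/pre&gt;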
&lt;h3&gt;Via WebUI&lt;/h3&gt;
&lt;p&gt;Open &lt;b translate=&quot;no&quot;&gt;Settings &gt; Models&lt;/b&gt;, type the name of the required model, for example &lt;b translate=&quot;no&quot;&gt;llama3&lt;/b&gt;, and click the button with the download symbol:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/990/original/sh_how_does_ollama_work_1.png?1717153168&quot; alt=&quot;Models download&quot;&gt;
&lt;p&gt;The model will download and install automatically. Once completed, close the settings window and select the downloaded model. After this you can begin a dialogue with it:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/991/original/sh_how_does_ollama_work_2.png?1717153253&quot; alt=&quot;Start chatting&quot;&gt;
&lt;h2&gt;VSCode integration&lt;/h2&gt;
&lt;p&gt;If you have installed Ollama using the installation script, you can launch any of the supported models almost instantly. In the next example, we will run the default model expected by the Ollama Autocoder extension (&lt;b translate=&quot;no&quot;&gt;openhermes2.5-mistral:7b-q4_K_M&lt;/b&gt;):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;ollama run openhermes2.5-mistral:7b-q4_K_M&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;By default, Ollama’s API only accepts connections from the local host. So, before installing and using the extension for Visual Studio Code, you need to set up port forwarding: specifically, forward remote port &lt;b translate=&quot;no&quot;&gt;11434&lt;/b&gt; to your local computer. You can find an example of how to do this in our article about &lt;a href=&quot;https://www.leadergpu.com/articles/508-easy-diffusion-ui&quot;&gt;Easy Diffusion WebUI&lt;/a&gt;.&lt;/p&gt;
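&lt;p&gt;As a quick illustration, the forwarding can be set up with a single SSH command (substitute your own user name and server address):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;ssh -L 11434:127.0.0.1:11434 usergpu@[LeaderGPU_server_IP_address]&lt;/code&gt;&lt;/pre&gt;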
&lt;p&gt;Type &lt;b translate=&quot;no&quot;&gt;Ollama Autocoder&lt;/b&gt; in a search field, then click &lt;b translate=&quot;no&quot;&gt;Install&lt;/b&gt;:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/992/original/sh_how_does_ollama_work_3.png?1717153306&quot; alt=&quot;Install Ollama Autocoder&quot;&gt;
&lt;p&gt;After installing the extension, a new item titled &lt;b translate=&quot;no&quot;&gt;Autocomplete with Ollama&lt;/b&gt; will be available in the command palette. Begin coding and initiate this command.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/993/original/sh_how_does_ollama_work_4.png?1717153542&quot; alt=&quot;Autocomplete with Ollama&quot;&gt;
&lt;p&gt;The extension will connect to the LeaderGPU server using port forwarding and within a few seconds, the generated code will display on your screen:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/994/original/sh_how_does_ollama_work_5.png?1717153572&quot; alt=&quot;Test Python example&quot;&gt;
&lt;p&gt;You can assign this command to a hotkey and use it whenever you want to supplement your code with a generated fragment. This is just one example of the available VSCode extensions. Forwarding a port from a remote server to local computers enables you to set up a single server with a running LLM for an entire development team, ensuring that the submitted code never reaches third-party companies or attackers.&lt;/p&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/584-open-webui-all-in-one&quot;&gt;Open WebUI: All in one&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/574-your-own-llama-2-in-linux&quot;&gt;Your own LLaMa 2 in Linux&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/573-llama-3-using-hugging-face&quot;&gt;Llama 3 using Hugging Face&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/000/989/original/il_how_does_ollama_work.png?1717153121"
        length="0"
        type="image/jpeg"/>
      <pubDate>Mon, 20 Jan 2025 15:16:02 +0100</pubDate>
      <guid isPermaLink="false">583</guid>
      <dc:date>2025-01-20 15:16:02 +0100</dc:date>
    </item>
    <item>
      <title>PrivateGPT: AI for documents</title>
      <link>https://www.leadergpu.com/catalog/581-privategpt-ai-for-documents</link>
      <description>&lt;p&gt;Large language models have greatly evolved over the past few years and have become effective tools for many tasks. The only problem with their use is that most products based on these models utilize ready-made services from third-party companies. This usage has the potential to leak sensitive data, so many companies avoid uploading internal documents into public LLM services.&lt;/p&gt;
&lt;p&gt;A project like PrivateGPT could be a solution. It is initially designed for completely local use. Its strength is that you can submit various documents as input, and the neural network will read them for you and provide its own comments in response to your requests. For example, you can “feed” large texts to it and ask it to draw some conclusions based on the user’s request. This allows you to significantly save time on proofreading.&lt;/p&gt;
&lt;p&gt;This is particularly true for professional fields like medicine. For instance, a doctor can make a diagnosis and request the neural network to confirm it based on the uploaded array of documents. This enables obtaining an additional independent opinion, thereby reducing the number of medical errors. Since requests and documents do not leave the server, one can be assured that the received data will not appear in the public domain.&lt;/p&gt;
&lt;p&gt;Today, we’ll show you how to deploy a neural network on dedicated LeaderGPU servers with the Ubuntu 22.04 LTS operating system in just 20 minutes.&lt;/p&gt;
&lt;h2&gt;System preparation&lt;/h2&gt;
&lt;p&gt;Begin by updating your packages to the latest version:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update &amp;&amp; sudo apt -y upgrade&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, install additional packages, libraries, and the NVIDIA® graphics driver. All of these will be needed to successfully build the software and run it on the GPU:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt -y install build-essential git gcc cmake make openssl libssl-dev libbz2-dev libreadline-dev libsqlite3-dev zlib1g-dev libncursesw5-dev libgdbm-dev libc6-dev tk-dev libffi-dev lzma liblzma-dev&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;CUDA® 12.4 install&lt;/h2&gt;
&lt;p&gt;In addition to the driver, you need to install the NVIDIA® CUDA® toolkit. These instructions were tested on CUDA® 12.4, but everything should also work on CUDA® 12.2. Keep in mind, however, that you’ll need to indicate the version you installed when specifying the path to the executable files.&lt;/p&gt;
&lt;p&gt;Run the following commands sequentially:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;wget https://developer.download.nvidia.com/compute/cuda/12.4.0/local_installers/cuda-repo-ubuntu2204-12-4-local_12.4.0-550.54.14-1_amd64.deb&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo dpkg -i cuda-repo-ubuntu2204-12-4-local_12.4.0-550.54.14-1_amd64.deb&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo cp /var/cuda-repo-ubuntu2204-12-4-local/cuda-*-keyring.gpg /usr/share/keyrings/&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt-get update &amp;&amp; sudo apt-get -y install cuda-toolkit-12-4&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;More information on installing CUDA® can be &lt;a href=&quot;https://www.leadergpu.com/articles/615-install-cuda-toolkit-in-linux&quot;&gt;found&lt;/a&gt; in our Knowledge Base. Now, reboot the server:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo shutdown -r now&lt;/code&gt;&lt;/pre&gt;
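&lt;p&gt;After the reboot, you can confirm that the toolkit is in place by querying the compiler at the same path that will be used later in this guide:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;/usr/local/cuda-12/bin/nvcc --version&lt;/code&gt;&lt;/pre&gt;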
&lt;h2&gt;PyEnv install&lt;/h2&gt;
&lt;p&gt;It’s time to install a simple Python version control utility called PyEnv. This is a significantly improved fork of a similar project for Ruby (&lt;a href=&quot;https://github.com/rbenv/rbenv&quot;&gt;rbenv&lt;/a&gt;), adapted to work with Python. It can be installed with a one-line script:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl https://pyenv.run | bash&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, you need to add some variables to the end of the .bashrc file, which is executed at login. The first three lines are responsible for the correct operation of PyEnv, and the fourth is needed for Poetry, which will be installed later:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;nano .bashrc&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;export PYENV_ROOT=&quot;$HOME/.pyenv&quot;
[[ -d $PYENV_ROOT/bin ]] &amp;&amp; export PATH=&quot;$PYENV_ROOT/bin:$PATH&quot;
eval &quot;$(pyenv init -)&quot;
export PATH=&quot;/home/usergpu/.local/bin:$PATH&quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Apply the settings you’ve made:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;source .bashrc&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Install Python version 3.11:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;pyenv install 3.11&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Set Python 3.11 as the version for the current directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;pyenv local 3.11&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Poetry install&lt;/h2&gt;
&lt;p&gt;The next piece of the puzzle is Poetry. This is an analogue of pip for managing dependencies in Python projects. The author of Poetry was tired of constantly dealing with different configuration methods, such as &lt;b translate=&quot;no&quot;&gt;setup.cfg&lt;/b&gt;, &lt;b translate=&quot;no&quot;&gt;requirements.txt&lt;/b&gt;, &lt;b translate=&quot;no&quot;&gt;MANIFEST.ini&lt;/b&gt;, and others. This became the driver for the development of a new tool that uses a &lt;b translate=&quot;no&quot;&gt;pyproject.toml&lt;/b&gt; file, which stores all the basic information about a project, not just a list of dependencies.&lt;/p&gt;
&lt;p&gt;Install Poetry:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -sSL https://install.python-poetry.org | python3 -&lt;/code&gt;&lt;/pre&gt;
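&lt;p&gt;Check that Poetry is available in your PATH (this relies on the export line added to .bashrc earlier):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;poetry --version&lt;/code&gt;&lt;/pre&gt;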
&lt;h2&gt;PrivateGPT install&lt;/h2&gt;
&lt;p&gt;Now that everything is ready, you can clone the PrivateGPT repository:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git clone https://github.com/imartinez/privateGPT&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Go to the downloaded repository:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd privateGPT&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Run dependency installation using Poetry while enabling additional components:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;ui&lt;/b&gt; - adds a &lt;a href=&quot;https://www.gradio.app/&quot;&gt;Gradio&lt;/a&gt; based management web interface to the backend application;&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;embeddings-huggingface&lt;/b&gt; - enables support for embedding models downloaded from &lt;a href=&quot;https://huggingface.co/&quot;&gt;HuggingFace&lt;/a&gt;;&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;llms-llama-cpp&lt;/b&gt; - adds support for direct inference of models in GGUF format;&lt;/li&gt;
    &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;vector-stores-qdrant&lt;/b&gt; - adds the qdrant vector database.&lt;/li&gt;
&lt;/ul&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;poetry install --extras &quot;ui embeddings-huggingface llms-llama-cpp vector-stores-qdrant&quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Set your Hugging Face access token. For additional information please read &lt;a href=&quot;https://huggingface.co/docs/hub/security-tokens&quot; target=&quot;_blank&quot;&gt;this article&lt;/a&gt;:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;export HF_TOKEN=&quot;YOUR_HUGGING_FACE_ACCESS_TOKEN&quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, run the installation script, which will automatically download the model and weights (Meta Llama 3.1 8B Instruct by default):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;poetry run python scripts/setup&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The following command recompiles &lt;b translate=&quot;no&quot;&gt;llms-llama-cpp&lt;/b&gt; separately with NVIDIA® CUDA® support enabled, in order to offload workloads to the GPU:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;CUDACXX=/usr/local/cuda-12/bin/nvcc CMAKE_ARGS=&quot;-DGGML_CUDA=on -DCMAKE_CUDA_ARCHITECTURES=native&quot; FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir --force-reinstall --upgrade&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you get an error like &lt;b&gt;nvcc fatal : Unsupported gpu architecture &#39;compute_&#39;&lt;/b&gt;, just specify the exact architecture of the GPU you are using. For example: &lt;b&gt;-DCMAKE_CUDA_ARCHITECTURES=86&lt;/b&gt; for the NVIDIA® RTX™ 3090.&lt;/p&gt;
&lt;p&gt;The final step before beginning is to install support for asynchronous calls (async/await):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;pip install asyncio&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;PrivateGPT run&lt;/h2&gt;
&lt;p&gt;Run PrivateGPT using a single command:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;make run&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Open your web browser and go to the page &lt;b translate=&quot;no&quot;&gt;http://[LeaderGPU_server_IP_address]:8001&lt;/b&gt;&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/984/original/sh_privategpt_ai_for_documents_1.png?1714731952&quot; alt=&quot;PrivateGPT WebUI&quot;&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/571-starcoder-your-local-coding-assistant&quot;&gt;StarCoder: your local coding assistant&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/590-fooocus-rethinking-of-sd-and-mj&quot;&gt;Fooocus: Rethinking of SD and MJ&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/565-stable-diffusion-webui&quot;&gt;Stable Diffusion WebUI&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/000/983/original/il_privategpt_ai_for_documents.png?1714731899"
        length="0"
        type="image/jpeg"/>
      <pubDate>Mon, 20 Jan 2025 12:01:00 +0100</pubDate>
      <guid isPermaLink="false">581</guid>
      <dc:date>2025-01-20 12:01:00 +0100</dc:date>
    </item>
    <item>
      <title>Qwen 2 vs Llama 3</title>
      <link>https://www.leadergpu.com/catalog/579-qwen-2-vs-llama-3</link>
      <description>&lt;p&gt;Large Language Models (LLMs) have significantly impacted our lives. Despite understanding their internal structure, these models remain a focal point for scientists who often liken them to a “black box”. The final result depends not only on the LLM’s design but also on its training and the data used for training.&lt;/p&gt;
&lt;p&gt;While scientists find research opportunities, end-users are primarily interested in two things: speed and quality. These criteria play a crucial role in the selection process. To accurately compare two LLMs, many seemingly unrelated factors need to be standardized.&lt;/p&gt;
&lt;p&gt;The equipment used for inference and the software environment, including the operating system, driver versions, and software packages, have the most significant impact. It’s essential to select an LLM version that operates on various equipment and choose a speed metric that’s easily comprehensible.&lt;/p&gt;
&lt;p&gt;We selected ‘tokens per second’ (tokens/s) as this metric. It’s important to note that a token ≠ a word. The LLM breaks words into simpler components, typical of a specific language, referred to as tokens.&lt;/p&gt;
&lt;p&gt;The statistical predictability of the next character varies across languages, so tokenization will differ. For instance, in English, approximately 100 tokens are derived from every 75 words. In languages using the Cyrillic alphabet, the number of tokens per word may be higher. So, 75 words in a Cyrillic language, like Russian, could equate to 120-150 tokens.&lt;/p&gt;
&lt;p&gt;You can verify this using OpenAI’s &lt;a href=&quot;https://platform.openai.com/tokenizer&quot;&gt;Tokenizer&lt;/a&gt; tool. It shows how many tokens a text fragment is broken into, making ‘tokens per second’ a good indicator of an LLM’s natural language processing speed and performance.&lt;/p&gt;
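&lt;p&gt;To put this metric into perspective: at a generation speed of 15 tokens/s, a typical 75-word English answer (roughly 100 tokens) takes about 100 / 15 ≈ 7 seconds to appear in full.&lt;/p&gt;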
&lt;p&gt;Each test was conducted on the Ubuntu 22.04 LTS operating system with NVIDIA® drivers version 535.183.01 and the NVIDIA® CUDA® 12.5 Toolkit installed. Questions were formulated to assess the LLM’s quality and speed. The processing speed of each answer was recorded and will contribute to the average value for each tested configuration.&lt;/p&gt;
&lt;p&gt;We began testing various GPUs, from the latest models to the older ones. A crucial condition for the test was that we measured the performance of only one GPU, even if multiple GPUs were present in the server configuration. This is because the performance of a configuration with multiple GPUs depends on additional factors, such as the presence of a high-speed interconnect between them (NVLink®).&lt;/p&gt;
&lt;p&gt;In addition to speed, we also attempted to evaluate the quality of responses on a 5-point scale, where 5 represents the best outcome. This information is provided here for general understanding only. Each time, we’ll pose the same questions to the neural network and attempt to discern how accurately each one comprehends what the user wants from it.&lt;/p&gt;
&lt;h2&gt;Qwen 2&lt;/h2&gt;
&lt;p&gt;Recently, a team of developers from Alibaba Group presented the second version of their generative neural network Qwen. It understands 27 languages and is well optimized for them. Qwen 2 comes in different sizes to make it easy to deploy on any device (from highly resource-constrained embedded systems to a dedicated server with GPUs):&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;0.5B: suitable for IoT and embedded systems;&lt;/li&gt;
    &lt;li&gt;1.5B: an extended version for embedded systems, used where the capabilities of 0.5B will not be enough;&lt;/li&gt;
    &lt;li&gt;7B: medium-sized model, well suited for natural language processing;&lt;/li&gt;
    &lt;li&gt;57B: high-performance large model suitable for demanding applications;&lt;/li&gt;
    &lt;li&gt;72B: the ultimate Qwen 2 model, designed to solve the most complex problems and process large volumes of data.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Versions 0.5B and 1.5B were trained on datasets with a context length of 32K. Versions 7B and 72B were already trained on the 128K context. The compromise model 57B was trained on datasets with a context length of 64K. The creators position Qwen 2 as an analog of Llama 3 capable of solving the same problems, but much faster.&lt;/p&gt;
&lt;h2&gt;Llama 3&lt;/h2&gt;
&lt;p&gt;The third version of the generative neural network from the MetaAI Llama family was introduced in April 2024. It was released, unlike Qwen 2, in only two versions: 8B and 70B. These models were positioned as a universal tool for solving many problems in various cases. It continued the trend towards multilingualism and multimodality, while simultaneously becoming faster than the previous versions and supporting a longer context length.&lt;/p&gt;
&lt;p&gt;The creators of Llama 3 tried to fine-tune the models to reduce the percentage of statistical hallucinations and increase the variety of answers. So Llama 3 is quite capable of giving practical advice, helping to write a business letter, or speculating on a topic specified by the user. The datasets on which Llama 3 models were trained had a context length of 128K and more than 5% included data in 30 languages. However, as stated in the press release, generation performance in English will be significantly higher than in any other language.&lt;/p&gt;
&lt;h2&gt;Comparison&lt;/h2&gt;
&lt;h3&gt;NVIDIA® RTX™ A6000&lt;/h3&gt;
&lt;p&gt;Let’s start our speed measurements with the NVIDIA® RTX™ A6000 GPU, based on the Ampere architecture (not to be confused with the NVIDIA® RTX™ A6000 Ada). This card has rather modest specifications, but its 48 GB of VRAM allow it to work with fairly large neural network models. Unfortunately, its low clock speed and memory bandwidth result in low inference speed for text LLMs.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/995/original/il_qwen_2_vs_llama_3_1.png?1720184216&quot; alt=&quot;Nvidia A6000 chart qwen2-vs-llama3&quot;&gt;
&lt;p&gt;Immediately after launch, the Qwen 2 neural network began to outperform Llama 3. When answering the same questions, the average difference in speed was 24% in favor of Qwen 2. The speed of generating answers was in the range of 11-16 tokens per second. This is 2-3 times faster than trying to run generation even on a powerful CPU, but it is still the most modest result in our ranking.&lt;/p&gt;
&lt;h3&gt;NVIDIA® RTX™ 3090&lt;/h3&gt;
&lt;p&gt;The next GPU is also built on the Ampere architecture and has half the video memory, but it operates at a higher effective memory frequency (19500 MHz versus 16000 MHz). Its video memory bandwidth is also higher (936.2 GB/s versus 768 GB/s). Both of these factors considerably increase the performance of the RTX™ 3090, even taking into account the fact that it has 256 fewer CUDA® cores.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/996/original/il_qwen_2_vs_llama_3_2.png?1720184259&quot; alt=&quot;Nvidia RTX 3090 chart qwen2-vs-llama3&quot;&gt;
&lt;p&gt;Here you can clearly see that Qwen 2 is much faster (up to 23%) than Llama 3 when performing the same tasks. Regarding generation quality, the multilingual support of Qwen 2 is truly worthy of praise: the model always answers in the same language in which the question was asked. With Llama 3, it often happens that the model understands the question itself but prefers to formulate its answers in English.&lt;/p&gt;
&lt;h3&gt;NVIDIA® RTX™ 4090&lt;/h3&gt;
&lt;p&gt;Now for the most interesting part: let’s see how the NVIDIA® RTX™ 4090 copes with the same task. This card is built on the Ada Lovelace architecture, named after the English mathematician Augusta Ada King, Countess of Lovelace. She is famous as the first programmer in history, even though at the time she wrote her first program there was no assembled computer that could execute it. Nevertheless, her algorithm for calculating Bernoulli numbers is recognized as the world’s first program written to be executed by a machine.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/997/original/il_qwen_2_vs_llama_3_3.png?1720184288&quot; alt=&quot;Nvidia RTX 4090 chart qwen2-vs-llama3&quot;&gt;
&lt;p&gt;The graph clearly shows that the RTX™ 4090 handled inference for both models almost twice as fast. Interestingly, in one of the iterations Llama 3 managed to outperform Qwen 2 by 1.2%. However, taking the other iterations into account, Qwen 2 retained its leadership, remaining 7% faster than Llama 3. In all iterations, the quality of responses from both neural networks was high, with a minimal number of hallucinations. The only flaw was that, in rare cases, one or two Chinese characters were mixed into the answers, which did not affect the overall meaning in any way.&lt;/p&gt;
&lt;h3&gt;NVIDIA® RTX™ A40&lt;/h3&gt;
&lt;p&gt;The next card, the NVIDIA® RTX™ A40, on which we ran similar tests, is again built on the Ampere architecture and has 48 GB of video memory on board. Compared to the RTX™ 3090, this memory is slightly faster (20000 MHz vs. 19500 MHz) but has lower bandwidth (695.8 GB/s versus 936.2 GB/s). This is offset by the larger number of CUDA® cores (10752 versus 10496), which overall allows the RTX™ A40 to perform slightly faster than the RTX™ 3090.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/998/original/il_qwen_2_vs_llama_3_4.png?1720184316&quot; alt=&quot;Nvidia A40 chart qwen2-vs-llama3&quot;&gt;
&lt;p&gt;As for comparing the speed of the models, Qwen 2 is again ahead of Llama 3 in all iterations. When running on the RTX™ A40, the difference in speed is about 15% with comparable answers. In some tasks, Qwen 2 provided slightly more relevant information, while Llama 3 was as specific as possible and gave examples. Even so, everything has to be double-checked, since both models sometimes produce questionable answers.&lt;/p&gt;
&lt;h3&gt;NVIDIA® L20&lt;/h3&gt;
&lt;p&gt;The last participant in our testing was the NVIDIA® L20. Like the RTX™ 4090, this GPU is built on the Ada Lovelace architecture. It is a fairly new model, introduced in the fall of 2023, with 48 GB of video memory and 11776 CUDA® cores on board. Its memory bandwidth is lower than that of the RTX™ 4090 (864 GB/s versus 936.2 GB/s), as is its effective frequency, so the NVIDIA® L20 inference results for both models are closer to those of the RTX™ 3090 than the RTX™ 4090.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/999/original/il_qwen_2_vs_llama_3_5.png?1720184358&quot; alt=&quot;Nvidia L20 chart qwen2-vs-llama3&quot;&gt;
&lt;p&gt;The final test didn’t bring any surprises. Qwen 2 turned out to be faster than Llama 3 in all iterations.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Let’s combine all the collected results into one chart. Qwen 2 was faster than Llama 3 by 7% to 24%, depending on the GPU used. Based on this, we can conclude that if you need high-speed inference for models such as Qwen 2 or Llama 3 on single-GPU configurations, the undoubted leader is the RTX™ 4090, with the RTX™ 3090, A40, or L20 as possible alternatives. However, it isn’t worth running inference for these models on the Ampere-generation RTX™ A6000.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/001/000/original/il_qwen_2_vs_llama_3_6.png?1720184380&quot; alt=&quot;Conclusion chart qwen2-vs-llama3&quot;&gt;
&lt;p&gt;We deliberately didn’t include cards with a smaller amount of video memory, such as the NVIDIA® RTX™ 2080Ti, in the tests, since the above-mentioned 7B and 8B models can’t fit into their memory without quantization. The 1.5B version of Qwen 2, unfortunately, doesn’t produce high-quality answers and can’t serve as a full replacement for the 7B model.&lt;/p&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/578-your-own-qwen-using-hf&quot;&gt;Your own Qwen using HF&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/573-llama-3-using-hugging-face&quot;&gt;Llama 3 using Hugging Face&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/576-your-own-vicuna-in-linux&quot;&gt;Your own Vicuna in Linux&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/001/107/original/il_qwen_2_vs_llama_3.png?1737368521"
        length="0"
        type="image/jpeg"/>
      <pubDate>Mon, 20 Jan 2025 11:27:11 +0100</pubDate>
      <guid isPermaLink="false">579</guid>
      <dc:date>2025-01-20 11:27:11 +0100</dc:date>
    </item>
    <item>
      <title>Your own Qwen using HF</title>
      <link>https://www.leadergpu.com/catalog/578-your-own-qwen-using-hf</link>
<description>&lt;p&gt;Large neural network models, with their extraordinary abilities, are firmly rooted in our lives. Recognizing this as an opportunity for future development, large corporations began to develop their own versions of these models. The Chinese giant Alibaba didn’t stand by either. They created their own model, Qwen (Tongyi Qianwen), which became the basis for many other neural network models.&lt;/p&gt;
&lt;h2&gt;Prerequisites&lt;/h2&gt;
&lt;h3&gt;Update cache and packages&lt;/h3&gt;
&lt;p&gt;Let’s update the package cache and upgrade your operating system before you start setting up Qwen. Also, we need to add Python Installer Packages (PIP), if it isn’t already present in the system. Please note that for this guide, we are using Ubuntu 22.04 LTS as the operating system:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update &amp;&amp; sudo apt -y upgrade &amp;&amp; sudo apt install python3-pip&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Install NVIDIA® drivers&lt;/h3&gt;
&lt;p&gt;You can use the automated utility that is included in Ubuntu distributions by default:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo ubuntu-drivers autoinstall&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Alternatively, you can install NVIDIA® drivers manually using our &lt;a href=&quot;https://www.leadergpu.com/articles/499-install-nvidia-drivers-in-linux&quot;&gt;step-by-step guide&lt;/a&gt;. Don’t forget to reboot the server:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo shutdown -r now&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Text generation web UI&lt;/h2&gt;
&lt;h3&gt;Clone the repository&lt;/h3&gt;
&lt;p&gt;Open the working directory on the SSD:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd /mnt/fastdisk&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Clone the project’s repository:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git clone https://github.com/oobabooga/text-generation-webui.git&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Install requirements&lt;/h3&gt;
&lt;p&gt;Open the downloaded directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd text-generation-webui&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Check and install all missing components:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;pip install -r requirements.txt&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Add SSH key to HF&lt;/h2&gt;
&lt;p&gt;Before starting, you need to set up port forwarding (remote port 7860 to 127.0.0.1:7860) in your SSH-client. You can find additional information in the following article: &lt;a href=&quot;https://www.leadergpu.com/articles/488-connect-to-a-linux-server&quot;&gt;Connect to Linux server&lt;/a&gt;.&lt;/p&gt;
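&lt;p&gt;With a command-line OpenSSH client, such forwarding can be set up like this (the username and server address below are placeholders; replace them with your own):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;ssh -L 7860:127.0.0.1:7860 usergpu@[IP_ADDRESS]&lt;/code&gt;&lt;/pre&gt;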
&lt;p&gt;Update the package cache repository and installed packages:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update &amp;&amp; sudo apt -y upgrade&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Generate and add an SSH-key that you can use in Hugging Face:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd ~/.ssh &amp;&amp; ssh-keygen&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;When the keypair is generated, you can display the public key in the terminal emulator:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cat id_rsa.pub&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Copy all information starting from &lt;b translate=&quot;no&quot;&gt;ssh-rsa&lt;/b&gt; and ending with &lt;b translate=&quot;no&quot;&gt;usergpu@gpuserver&lt;/b&gt; as shown in the following screenshot:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/907/original/sh_llama3_quick_start_3.png?1713533169&quot; alt=&quot;Copy RSA key&quot;&gt;
&lt;p&gt;Open a web browser, type &lt;a href=&quot;https://huggingface.co/&quot;&gt;https://huggingface.co/&lt;/a&gt; into the address bar and press &lt;b translate=&quot;no&quot;&gt;Enter&lt;/b&gt;. Log into your HF-account and open &lt;a href=&quot;https://huggingface.co/settings/profile&quot;&gt;Profile settings&lt;/a&gt;. Then choose &lt;b translate=&quot;no&quot;&gt;SSH and GPG Keys&lt;/b&gt; and click on the &lt;b translate=&quot;no&quot;&gt;Add SSH Key&lt;/b&gt; button:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/908/original/sh_llama3_quick_start_4.png?1713533229&quot; alt=&quot;Add SSH key&quot;&gt;
&lt;p&gt;Fill in the &lt;b translate=&quot;no&quot;&gt;Key name&lt;/b&gt; and paste the copied &lt;b translate=&quot;no&quot;&gt;SSH Public key&lt;/b&gt; from the terminal. Save the key by pressing &lt;b translate=&quot;no&quot;&gt;Add key&lt;/b&gt;:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/909/original/sh_llama3_quick_start_5.png?1713533267&quot; alt=&quot;Paste the key&quot;&gt;
&lt;p&gt;Now, your HF-account is linked with the public SSH-key. The second part (private key) is stored on the server. The next step is to install a specific Git LFS (Large File Storage) extension, which is used for downloading large files such as neural network models. Open your home directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd ~/&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Download and run the shell script. This script installs a new third-party repository with git-lfs:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, you can install it using the standard package manager:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt-get install git-lfs&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let’s configure git to use our HF nickname:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git config --global user.name &quot;John&quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And link it to the email address of the HF account:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git config --global user.email &quot;john.doe@example.com&quot;&lt;/code&gt;&lt;/pre&gt;
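&lt;p&gt;Before cloning anything, it’s worth checking that Hugging Face accepts your key. Assuming the key was added correctly, the following command should greet you with your HF username:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;ssh -T git@hf.co&lt;/code&gt;&lt;/pre&gt;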
&lt;h2&gt;Download the model&lt;/h2&gt;
&lt;p&gt;The next step is to download the model using the repository cloning technique commonly used by software developers. The only difference is that the previously installed Git-LFS will automatically process the marked pointer files and download all the content. Open the necessary directory (/mnt/fastdisk in our example):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd /mnt/fastdisk&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This command may take some time to complete:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git clone git@hf.co:Qwen/Qwen1.5-32B-Chat-GGUF&lt;/code&gt;&lt;/pre&gt;
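&lt;p&gt;Once cloning finishes, you can check that the large model files were actually fetched rather than left as LFS pointers. In the output of the following command, an asterisk next to a file means its content has been downloaded in full, while a dash marks a pointer only:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git -C Qwen1.5-32B-Chat-GGUF lfs ls-files&lt;/code&gt;&lt;/pre&gt;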
&lt;h2&gt;Run the model&lt;/h2&gt;
&lt;p&gt;Execute a script that will start the web server and specify /mnt/fastdisk as the working directory with models. This script may download some additional components upon first launch.&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;./start_linux.sh --model-dir /mnt/fastdisk&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Open your web browser and select &lt;b translate=&quot;no&quot;&gt;llama.cpp&lt;/b&gt; from the &lt;b translate=&quot;no&quot;&gt;Model loader&lt;/b&gt; drop-down list:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/986/original/sh_your_own_qwen_using_hf_1.png?1716463522&quot; alt=&quot;llama.cpp settings&quot;&gt;
&lt;p&gt;Be sure to set the &lt;b translate=&quot;no&quot;&gt;n-gpu-layers&lt;/b&gt; parameter. It determines how many of the model’s layers are offloaded to the GPU. If you leave it at 0, all calculations will be performed on the CPU, which is quite slow. Once all parameters are set, click the &lt;b translate=&quot;no&quot;&gt;Load&lt;/b&gt; button. After that, go to the &lt;b translate=&quot;no&quot;&gt;Chat&lt;/b&gt; tab and select &lt;b translate=&quot;no&quot;&gt;Instruct mode&lt;/b&gt;. Now, you can enter any prompt and receive a response:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/987/original/sh_your_own_qwen_using_hf_2.png?1716463543&quot; alt=&quot;Qwen chat example&quot;&gt;
&lt;p&gt;Processing will be performed by default on all available GPUs, taking into account the previously specified parameters:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/988/original/sh_your_own_qwen_using_hf_3.png?1716463565&quot; alt=&quot;Qwen task GPU loading&quot;&gt;
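&lt;p&gt;A convenient way to check how well the chosen &lt;b translate=&quot;no&quot;&gt;n-gpu-layers&lt;/b&gt; value fits into VRAM is to watch GPU memory usage in a second SSH session while the model is loaded:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;watch -n 1 nvidia-smi&lt;/code&gt;&lt;/pre&gt;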
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/574-your-own-llama-2-in-linux&quot;&gt;Your own LLaMa 2 in Linux&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/573-llama-3-using-hugging-face&quot;&gt;Llama 3 using Hugging Face&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/576-your-own-vicuna-in-linux&quot;&gt;Your own Vicuna in Linux&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/000/985/original/il_your_own_qwen_using_hf.png?1716463472"
        length="0"
        type="image/jpeg"/>
      <pubDate>Mon, 20 Jan 2025 09:43:46 +0100</pubDate>
      <guid isPermaLink="false">578</guid>
      <dc:date>2025-01-20 09:43:46 +0100</dc:date>
    </item>
    <item>
      <title>Your own Vicuna in Linux</title>
      <link>https://www.leadergpu.com/catalog/576-your-own-vicuna-in-linux</link>
      <description>&lt;p&gt;This article will guide you through the process of deploying a basic LLaMA alternative on a LeaderGPU server. We will utilize the &lt;a href=&quot;https://github.com/lm-sys/FastChat&quot;&gt;FastChat&lt;/a&gt; project and the freely available &lt;a href=&quot;https://lmsys.org/blog/2023-03-30-vicuna/&quot;&gt;Vicuna&lt;/a&gt; model for this purpose. &lt;/p&gt;
&lt;p&gt;The model we&#39;ll be using is based on Meta&#39;s LLaMA architecture but has been optimized for efficient deployment on consumer hardware. This setup provides a good balance between performance and resource requirements, making it suitable for both testing and production environments.&lt;/p&gt;
&lt;h2&gt;Preinstallation&lt;/h2&gt;
&lt;p&gt;Let’s prepare to install FastChat by updating the packages cache repository:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update &amp;&amp; sudo apt -y upgrade&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Install NVIDIA® drivers automatically using the following command:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo ubuntu-drivers autoinstall&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can also install these drivers manually with &lt;a href=&quot;https://www.leadergpu.com/articles/499-install-nvidia-drivers-in-linux&quot;&gt;our step-by-step guide&lt;/a&gt;. Then, reboot the server:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo shutdown -r now&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The next step is to install PIP (Package Installer for Python):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt install python3-pip&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Install FastChat&lt;/h2&gt;
&lt;h3&gt;From PyPi&lt;/h3&gt;
&lt;p&gt;There are two possible ways to install FastChat. You can install it directly from PyPi:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;pip3 install &quot;fschat[model_worker,webui]&quot;&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;From GitHub&lt;/h3&gt;
&lt;p&gt;Alternatively, you can clone the FastChat repository from GitHub and install it:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git clone https://github.com/lm-sys/FastChat.git&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd FastChat&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Don’t forget to upgrade PIP before proceeding:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;pip3 install --upgrade pip&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;pip3 install -e &quot;.[model_worker,webui]&quot;&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Run FastChat&lt;/h2&gt;
&lt;h3&gt;First start&lt;/h3&gt;
&lt;p&gt;To ensure a successful initial launch, it’s recommended to manually call FastChat directly from the command line:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;python3 -m fastchat.serve.cli --model-path lmsys/vicuna-7b-v1.5&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This action automatically retrieves and downloads the designated model, which should be specified using the --model-path parameter. The 7b in the name denotes a model with 7 billion parameters. This is the lightest model, suitable for GPUs with 16 GB of video memory. Links to models with a larger number of parameters can be found in the project’s &lt;a href=&quot;https://github.com/lm-sys/FastChat/blob/main/README.md&quot;&gt;Readme&lt;/a&gt; file.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/965/original/sh_your_own_vicuna_in_linux_1.png?1714043790&quot; alt=&quot;Sample Vicuna conversation&quot;&gt;
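&lt;p&gt;If your GPU has more than 16 GB of video memory, the same command works with larger checkpoints; for example, the 13B version (the model name here is taken from the FastChat model list, so verify it in the Readme before use):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;python3 -m fastchat.serve.cli --model-path lmsys/vicuna-13b-v1.5&lt;/code&gt;&lt;/pre&gt;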
&lt;p&gt;Now you can engage in a conversation with the chatbot directly within the command-line interface, or you can set up a web interface, which consists of three components:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;Controller&lt;/li&gt;
    &lt;li&gt;Workers&lt;/li&gt;
    &lt;li&gt;Gradio web server&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Set up services&lt;/h3&gt;
&lt;p&gt;Let’s transform each component into a separate systemd service. Create 3 separate files with the following contents:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo nano /etc/systemd/system/vicuna-controller.service&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;[Unit]
Description=Vicuna controller service
[Service]
User=usergpu
WorkingDirectory=/home/usergpu
ExecStart=python3 -m fastchat.serve.controller
Restart=always
[Install]
WantedBy=multi-user.target&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo nano /etc/systemd/system/vicuna-worker.service&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;[Unit]
Description=Vicuna worker service
[Service]
User=usergpu
WorkingDirectory=/home/usergpu
ExecStart=python3 -m fastchat.serve.model_worker --model-path lmsys/vicuna-7b-v1.5
Restart=always
[Install]
WantedBy=multi-user.target&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo nano /etc/systemd/system/vicuna-webserver.service&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;[Unit]
Description=Vicuna web server
[Service]
User=usergpu
WorkingDirectory=/home/usergpu
ExecStart=python3 -m fastchat.serve.gradio_web_server
Restart=always
[Install]
WantedBy=multi-user.target&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Systemd usually updates its daemons database during the system&#39;s startup process. However, you can do this manually using the following command:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl daemon-reload&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, let’s add three new services to the startup and immediately launch them using the &lt;b translate=&quot;no&quot;&gt;--now&lt;/b&gt; option:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl enable vicuna-controller.service --now &amp;&amp; sudo systemctl enable vicuna-worker.service --now &amp;&amp; sudo systemctl enable vicuna-webserver.service --now&lt;/code&gt;&lt;/pre&gt;
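&lt;p&gt;You can confirm that all three units are active before moving on:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;systemctl status vicuna-controller.service vicuna-worker.service vicuna-webserver.service&lt;/code&gt;&lt;/pre&gt;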
&lt;p&gt;However, if you attempt to open a web interface at http://[IP_ADDRESS]:7860, you’ll encounter a completely unusable interface with no available models. To resolve this issue, stop the Web interface service:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl stop vicuna-webserver.service&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Execute the web service manually:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;python3 -m fastchat.serve.gradio_web_server&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This action calls another script, which registers the previously downloaded model in Gradio’s internal database. Wait a few seconds, then interrupt the process using the &lt;b translate=&quot;no&quot;&gt;Ctrl + C&lt;/b&gt; shortcut.&lt;/p&gt;
&lt;h3&gt;Add authentication&lt;/h3&gt;
&lt;p&gt;We’ll also take care of security and activate a simple authentication mechanism for accessing the web interface. Open the following file if you installed FastChat from PyPI:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo nano /home/usergpu/.local/lib/python3.10/site-packages/fastchat/serve/gradio_web_server.py&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;or&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo nano /home/usergpu/FastChat/fastchat/serve/gradio_web_server.py&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Scroll down to the end. Find this line:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;auth=auth,&lt;/pre&gt;
&lt;p&gt;Change it by setting whatever username and password you want:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;auth=(&quot;username&quot;, &quot;password&quot;),&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Save the file and exit using the &lt;b translate=&quot;no&quot;&gt;Ctrl + X&lt;/b&gt; shortcut. Finally, start the web interface:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo systemctl start vicuna-webserver.service&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Open &lt;b translate=&quot;no&quot;&gt;http://[IP_ADDRESS]:7860&lt;/b&gt; in your browser and enjoy FastChat with Vicuna:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/966/original/sh_your_own_vicuna_in_linux_2.png?1714043825&quot; alt=&quot;Sample Vicuna poem&quot;&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/574-your-own-llama-2-in-linux&quot;&gt;Your own LLaMa 2 in Linux&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/573-llama-3-using-hugging-face&quot;&gt;Llama 3 using Hugging Face&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/578-your-own-qwen-using-hf&quot;&gt;Your own Qwen using HF&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/000/964/original/il_your_own_vicuna_in_linux.jpg?1714043750"
        length="0"
        type="image/jpeg"/>
      <pubDate>Mon, 20 Jan 2025 09:25:01 +0100</pubDate>
      <guid isPermaLink="false">576</guid>
      <dc:date>2025-01-20 09:25:01 +0100</dc:date>
    </item>
    <item>
      <title>Your own LLaMa 2 in Linux</title>
      <link>https://www.leadergpu.com/catalog/574-your-own-llama-2-in-linux</link>
      <description>&lt;h2&gt;Step 1. Prepare operating system&lt;/h2&gt;
&lt;h3&gt;Update cache and packages&lt;/h3&gt;
&lt;p&gt;Let’s update the package cache and upgrade your operating system before you start setting up LLaMa 2. Please note that for this guide, we are using Ubuntu 22.04 LTS as the operating system:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update &amp;&amp; sudo apt -y upgrade&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Also, we need to add Python Installer Packages (PIP), if it isn’t already present in the system:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt install python3-pip&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Install NVIDIA® drivers&lt;/h3&gt;
&lt;p&gt;You can use the automated utility that is included in Ubuntu distributions by default:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo ubuntu-drivers autoinstall&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Alternatively, you can install NVIDIA® drivers manually using &lt;a href=&quot;https://www.leadergpu.com/articles/499-install-nvidia-drivers-in-linux&quot;&gt;our step-by-step guide&lt;/a&gt;. Don’t forget to reboot the server:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo shutdown -r now&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Step 2. Get models from MetaAI&lt;/h2&gt;
&lt;h3&gt;Official request&lt;/h3&gt;
&lt;p&gt;Open the following address in your browser: &lt;a href=&quot;https://ai.meta.com/resources/models-and-libraries/llama-downloads/&quot;&gt;https://ai.meta.com/resources/models-and-libraries/llama-downloads/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Fill in all necessary fields, read the user agreement, and click on the &lt;b translate=&quot;no&quot;&gt;Agree and Continue&lt;/b&gt; button. After a few minutes (or hours, or days), you’ll receive a special download URL, which grants you permission to download the models for a 24-hour period.&lt;/p&gt;
&lt;h3&gt;Clone the repository&lt;/h3&gt;
&lt;p&gt;Before downloading, please check the available storage:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;df -h&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;Filesystem      Size  Used Avail Use% Mounted on
tmpfs            38G  3.3M   38G   1% /run
/dev/sda2        99G   24G   70G  26% /
tmpfs           189G     0  189G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
/dev/nvme0n1    1.8T   26G  1.7T   2% /mnt/fastdisk
tmpfs            38G  8.0K   38G   1% /run/user/1000&lt;/pre&gt;
&lt;p&gt;If you have unmounted local disks, please follow the instructions in &lt;a href=&quot;https://www.leadergpu.com/articles/492-disk-partitioning-in-linux&quot;&gt;Disk partitioning in Linux&lt;/a&gt;. This is important because the downloaded models can be very large, and you need to plan their storage location in advance. In this example, we have a local SSD mounted in the /mnt/fastdisk directory. Let’s open it:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd /mnt/fastdisk&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Create a copy of the original LLaMa repository:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git clone https://github.com/facebookresearch/llama&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you encounter a permission error, simply grant permissions to the usergpu user:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo chown -R usergpu:usergpu /mnt/fastdisk/&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Download via script&lt;/h3&gt;
&lt;p&gt;Open the downloaded directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd llama&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Run the script:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;./download.sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Paste the URL provided by MetaAI and select all the necessary models. We recommend downloading all available models to avoid having to request permission again. However, if you only need a specific model, download just that one.&lt;/p&gt;
&lt;h3&gt;Fast test via example app&lt;/h3&gt;
&lt;p&gt;To begin, we can check for any missing components. If any libraries or applications are missing, the package manager will automatically install them:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;pip install -e .&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The next step is to add new binaries to PATH:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;export PATH=/home/usergpu/.local/bin:$PATH&lt;/code&gt;&lt;/pre&gt;
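&lt;p&gt;This export only lasts for the current shell session. To make it permanent, you can append the same line to your shell profile:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;echo &#39;export PATH=/home/usergpu/.local/bin:$PATH&#39; &gt;&gt; ~/.bashrc&lt;/code&gt;&lt;/pre&gt;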
&lt;p&gt;Run the demo example:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;torchrun --nproc_per_node 1 /mnt/fastdisk/llama/example_chat_completion.py --ckpt_dir /mnt/fastdisk/llama-2-7b-chat/ --tokenizer_path /mnt/fastdisk/llama/tokenizer.model --max_seq_len 512 --max_batch_size 6&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The application will create a compute process on the first GPU and simulate a simple dialog with typical requests, generating answers using LLaMa 2.&lt;/p&gt;
&lt;h2&gt;Step 3. Get llama.cpp&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/ggerganov/llama.cpp/tree/master&quot;&gt;LLaMa C++&lt;/a&gt; is a project created by Bulgarian physicist and software developer Georgi Gerganov. It has many useful utilities that make working with this neural network model easier. All parts of llama.cpp are open source software and are distributed under the &lt;a href=&quot;https://github.com/ggerganov/llama.cpp/blob/master/LICENSE&quot;&gt;MIT license&lt;/a&gt;.&lt;/p&gt;
&lt;h3&gt;Clone the repository&lt;/h3&gt;
&lt;p&gt;Open the working directory on the SSD:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd /mnt/fastdisk&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Clone the project’s repository:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git clone https://github.com/ggerganov/llama.cpp.git&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Compile apps&lt;/h3&gt;
&lt;p&gt;Open the cloned directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd llama.cpp&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Start the compilation process with the following command:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;make&lt;/code&gt;&lt;/pre&gt;
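&lt;p&gt;By default, this builds CPU-only binaries. At the time of writing, llama.cpp also offered an optional cuBLAS build for GPU offloading; the flag below comes from the project’s Makefile of that period, and build options change frequently, so check the current Readme before using it:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;make clean &amp;&amp; make LLAMA_CUBLAS=1&lt;/code&gt;&lt;/pre&gt;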
&lt;h2&gt;Step 4. Get text-generation-webui&lt;/h2&gt;
&lt;h3&gt;Clone the repository&lt;/h3&gt;
&lt;p&gt;Open the working directory on the SSD:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd /mnt/fastdisk&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Clone the project’s repository:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git clone https://github.com/oobabooga/text-generation-webui.git&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Install requirements&lt;/h3&gt;
&lt;p&gt;Open the downloaded directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd text-generation-webui&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Check and install all missing components:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;pip install -r requirements.txt&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Step 5. Convert PTH to GGUF&lt;/h2&gt;
&lt;h3&gt;Common formats&lt;/h3&gt;
&lt;p&gt;&lt;b translate=&quot;no&quot;&gt;PTH (Python TorcH)&lt;/b&gt; — A consolidated format. Essentially, it’s a standard ZIP-archive with a serialized PyTorch state dictionary. However, this format has faster alternatives such as GGML and GGUF.&lt;/p&gt;
&lt;p&gt;&lt;b translate=&quot;no&quot;&gt;GGML (Georgi Gerganov’s Machine Learning)&lt;/b&gt; — This is a file format created by Georgi Gerganov, the author of llama.cpp. It is based on a library of the same name, written in C++, which has significantly increased the performance of large language models. It has now been replaced with the modern GGUF format.&lt;/p&gt;
&lt;p&gt;&lt;b translate=&quot;no&quot;&gt;GGUF (Georgi Gerganov’s Unified Format)&lt;/b&gt; — A widely used file format for LLMs, supported by various applications. It offers enhanced flexibility, scalability, and compatibility for most use cases.&lt;/p&gt;
&lt;h3&gt;llama.cpp convert.py script&lt;/h3&gt;
&lt;p&gt;Edit the parameters of the model before converting:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;nano /mnt/fastdisk/llama-2-7b-chat/params.json&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Correct &lt;b translate=&quot;no&quot;&gt;&quot;vocab_size&quot;: -1&lt;/b&gt; to &lt;b translate=&quot;no&quot;&gt;&quot;vocab_size&quot;: 32000&lt;/b&gt;. Save the file and exit. Then, open the llama.cpp directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd /mnt/fastdisk/llama.cpp&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Execute the script, which will convert the model to GGUF format:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;python3 convert.py /mnt/fastdisk/llama-2-7b-chat/ --vocab-dir /mnt/fastdisk/llama&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If all the previous steps are correct, you’ll receive a message like this:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;Wrote /mnt/fastdisk/llama-2-7b-chat/ggml-model-f16.gguf&lt;/pre&gt;
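&lt;p&gt;The resulting f16 file is large. If needed, the &lt;b translate=&quot;no&quot;&gt;quantize&lt;/b&gt; utility compiled earlier in Step 3 can shrink it at the cost of some accuracy; for example, a 4-bit variant (the quantization type name is taken from the llama.cpp of that period, so verify it against your build):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;./quantize /mnt/fastdisk/llama-2-7b-chat/ggml-model-f16.gguf /mnt/fastdisk/llama-2-7b-chat/ggml-model-q4_k_m.gguf q4_K_M&lt;/code&gt;&lt;/pre&gt;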
&lt;h2&gt;Step 6. WebUI&lt;/h2&gt;
&lt;h3&gt;How to start WebUI&lt;/h3&gt;
&lt;p&gt;Open the directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd /mnt/fastdisk/text-generation-webui/&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Execute the start script with some useful parameters:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;--model-dir&lt;/b&gt; indicates the correct path to the models&lt;/li&gt;
  &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;--share&lt;/b&gt; creates a temporary public link (if you don’t want to forward a port through SSH)&lt;/li&gt;
  &lt;li&gt;&lt;b translate=&quot;no&quot;&gt;--gradio-auth&lt;/b&gt; adds authorization with a login and password (replace user:password with your own)&lt;/li&gt;
&lt;/ul&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;./start_linux.sh --model-dir /mnt/fastdisk/llama-2-7b-chat/ --share --gradio-auth user:password&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After successful launch, you’ll receive a local and temporary share link for access:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;Running on local URL:  http://127.0.0.1:7860
Running on public URL: https://e9a61c21593a7b251f.gradio.live
&lt;/pre&gt;
&lt;p&gt;This share link expires in 72 hours.&lt;/p&gt;
&lt;h3&gt;Load the model&lt;/h3&gt;
&lt;p&gt;Authorize in the WebUI using the selected username and password and follow these 5 simple steps:&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;Navigate to the &lt;b translate=&quot;no&quot;&gt;Model&lt;/b&gt; tab.&lt;/li&gt;
  &lt;li&gt;Select &lt;b translate=&quot;no&quot;&gt;ggml-model-f16.gguf&lt;/b&gt; from the drop-down menu.&lt;/li&gt;
  &lt;li&gt;Choose how many layers you want to compute on the GPU (&lt;b translate=&quot;no&quot;&gt;n-gpu-layers&lt;/b&gt;).&lt;/li&gt;
  &lt;li&gt;Choose how many threads you want to start (&lt;b translate=&quot;no&quot;&gt;threads&lt;/b&gt;).
  &lt;/li&gt;
  &lt;li&gt;Click on the &lt;b translate=&quot;no&quot;&gt;Load&lt;/b&gt; button.&lt;/li&gt;
&lt;/ol&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/967/original/sh_your_own_llama_2_in_linux_1.png?1714136367&quot; alt=&quot;Loading the model&quot;&gt;
&lt;h3&gt;Start the dialog&lt;/h3&gt;
&lt;p&gt;Change the tab to &lt;b translate=&quot;no&quot;&gt;Chat&lt;/b&gt;, type your prompt, and click &lt;b translate=&quot;no&quot;&gt;Generate&lt;/b&gt;:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/968/original/sh_your_own_llama_2_in_linux_2.png?1714136407&quot; alt=&quot;Start the dialog&quot;&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/573-llama-3-using-hugging-face&quot;&gt;Llama 3 using Hugging Face&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/578-your-own-qwen-using-hf&quot;&gt;Your own Qwen using HF&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/579-qwen-2-vs-llama-3&quot;&gt;Qwen 2 vs Llama 3&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/001/025/original/il_your_own_llama_2_in_Linux.png?1721999193"
        length="0"
        type="image/jpeg"/>
      <pubDate>Mon, 20 Jan 2025 09:13:25 +0100</pubDate>
      <guid isPermaLink="false">574</guid>
      <dc:date>2025-01-20 09:13:25 +0100</dc:date>
    </item>
    <item>
      <title>Llama 3 using Hugging Face</title>
      <link>https://www.leadergpu.com/catalog/573-llama-3-using-hugging-face</link>
<description>&lt;p&gt;On April 18, 2024, the newest major language model from MetaAI, Llama 3, was released. Two versions were presented to users: 8B and 70B. Both were trained on over 15T tokens: the 8B version on data available up to March 2023, and the 70B version on data available up to December 2023.&lt;/p&gt;

&lt;h2&gt;Step 1. Prepare operating system&lt;/h2&gt;

&lt;h3&gt;Update cache and packages&lt;/h3&gt;

&lt;p&gt;Let’s update the package cache and upgrade your operating system before you start setting up LLaMa 3. Please note that for this guide, we are using Ubuntu 22.04 LTS as the operating system:&lt;/p&gt;


&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update &amp;amp;&amp;amp; sudo apt -y upgrade&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Also, we need to add Python Installer Packages (PIP), if it isn’t already present in the system:&lt;/p&gt;


&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt install python3-pip&lt;/code&gt;&lt;/pre&gt;

&lt;h3&gt;Install NVIDIA® drivers&lt;/h3&gt;

&lt;p&gt;You can use the automated utility that is included in Ubuntu distributions by default:&lt;/p&gt;


&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo ubuntu-drivers autoinstall&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Alternatively, you can install NVIDIA® drivers manually. Don’t forget to reboot the server:&lt;/p&gt;


&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo shutdown -r now&lt;/code&gt;&lt;/pre&gt;

&lt;h2&gt;Step 2. Get the model&lt;/h2&gt;

&lt;p&gt;Log in to &lt;a href=&quot;https://huggingface.co/&quot;&gt;Hugging Face&lt;/a&gt; using your username and password. Go to the page corresponding to the desired LLM version: &lt;a href=&quot;https://huggingface.co/meta-llama/Meta-Llama-3-8B&quot;&gt;Meta-Llama-3-8B&lt;/a&gt; or &lt;a href=&quot;https://huggingface.co/meta-llama/Meta-Llama-3-70B&quot;&gt;Meta-Llama-3-70B&lt;/a&gt;. At the time of publication of this article, access to the model is granted on an individual basis. Fill in a short form and click the &lt;b translate=&quot;no&quot;&gt;Submit&lt;/b&gt; button:&lt;/p&gt;

&lt;h3&gt;Request access from HF&lt;/h3&gt;

&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/905/original/sh_llama3_quick_start_1.png?1713533099&quot; alt=&quot;Fill the form&quot; unselectable=&quot;on&quot;&gt;
&lt;p&gt;Then you will receive a message that your request has been submitted:&lt;/p&gt;

&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/906/original/sh_llama3_quick_start_2.png?1713533131&quot; alt=&quot;Form submitted&quot; unselectable=&quot;on&quot;&gt;
&lt;p&gt;You will be granted access within 30-40 minutes and notified about it via email.&lt;/p&gt;

&lt;h3&gt;Add SSH key to HF&lt;/h3&gt;

&lt;p&gt;Generate and add an SSH-key that you can use in Hugging Face:&lt;/p&gt;


&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd ~/.ssh &amp;amp;&amp;amp; ssh-keygen&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;When the keypair is generated, you can display the public key in the terminal emulator:&lt;/p&gt;


&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cat id_rsa.pub&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Copy all information starting from &lt;b translate=&quot;no&quot;&gt;ssh-rsa&lt;/b&gt; and ending with &lt;b translate=&quot;no&quot;&gt;usergpu@gpuserver&lt;/b&gt; as shown in the following screenshot:&lt;/p&gt;

&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/907/original/sh_llama3_quick_start_3.png?1713533169&quot; alt=&quot;Copy RSA key&quot; unselectable=&quot;on&quot;&gt;
&lt;p&gt;Open Hugging Face &lt;a href=&quot;https://huggingface.co/settings/profile&quot;&gt;Profile settings&lt;/a&gt;. Then choose &lt;b translate=&quot;no&quot;&gt;SSH and GPG Keys&lt;/b&gt; and click on the Add SSH Key button:&lt;/p&gt;

&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/908/original/sh_llama3_quick_start_4.png?1713533229&quot; alt=&quot;Add SSH key&quot; unselectable=&quot;on&quot;&gt;
&lt;p&gt;Fill in the &lt;b translate=&quot;no&quot;&gt;Key name&lt;/b&gt; and paste the copied &lt;b translate=&quot;no&quot;&gt;SSH Public key&lt;/b&gt; from the terminal. Save the key by pressing &lt;b translate=&quot;no&quot;&gt;Add key&lt;/b&gt;:&lt;/p&gt;

&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/909/original/sh_llama3_quick_start_5.png?1713533267&quot; alt=&quot;Paste the key&quot; unselectable=&quot;on&quot;&gt;
&lt;p&gt;Now, your HF-account is linked with the public SSH-key. The second part (private key) is stored on the server. The next step is to install a specific Git LFS (Large File Storage) extension, which is used for downloading large files such as neural network models. Open your home directory:&lt;/p&gt;


&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd ~/&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Download and run the shell script. This script installs a new third-party repository with git-lfs:&lt;/p&gt;


&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Now, you can install it using the standard package manager:&lt;/p&gt;


&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt-get install git-lfs&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Let’s configure git to use our HF nickname:&lt;/p&gt;


&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git config --global user.name &quot;John&quot;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;And link it to the email address of the HF account:&lt;/p&gt;


&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git config --global user.email &quot;john.doe@example.com&quot;&lt;/code&gt;&lt;/pre&gt;

&lt;h3&gt;Download the model&lt;/h3&gt;

&lt;p&gt;Open the target directory:&lt;/p&gt;


&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd /mnt/fastdisk&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;And start downloading the repository. For this example, we chose the 8B version:&lt;/p&gt;


&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git clone git@hf.co:meta-llama/Meta-Llama-3-8B&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This process takes up to 5 minutes. You can monitor it by executing the following command in another SSH-console:&lt;/p&gt;


&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;watch -n 0.5 df -h&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Here, you’ll see how the free disk space on the mounted disk decreases, ensuring that the download is progressing and the data is being saved. The status will refresh every half-second. To stop viewing manually, press the &lt;b translate=&quot;no&quot;&gt;Ctrl + C&lt;/b&gt; shortcut.&lt;/p&gt;

&lt;p&gt;Alternatively, you can install &lt;a href=&quot;https://github.com/aristocratos/btop&quot;&gt;btop&lt;/a&gt; and monitor the process using this utility:&lt;/p&gt;


&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt -y install btop &amp;amp;&amp;amp; btop&lt;/code&gt;&lt;/pre&gt;

&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/910/original/sh_llama3_quick_start_6.png?1713533300&quot; alt=&quot;Btop view&quot; unselectable=&quot;on&quot;&gt;
&lt;p&gt;To quit the btop utility, press the &lt;b translate=&quot;no&quot;&gt;Esc&lt;/b&gt; key and select &lt;b translate=&quot;no&quot;&gt;Quit&lt;/b&gt;.&lt;/p&gt;

&lt;h2&gt;Step 3. Run the model&lt;/h2&gt;

&lt;p&gt;Open the directory:&lt;/p&gt;


&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd /mnt/fastdisk&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Download the Llama 3 repository:&lt;/p&gt;


&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git clone https://github.com/meta-llama/llama3&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Change the directory:&lt;/p&gt;


&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd llama3&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Run the example:&lt;/p&gt;


&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;torchrun --nproc_per_node 1 example_text_completion.py \
--ckpt_dir /mnt/fastdisk/Meta-Llama-3-8B/original \
--tokenizer_path /mnt/fastdisk/Meta-Llama-3-8B/original/tokenizer.model \
--max_seq_len 128 \
--max_batch_size 4&lt;/code&gt;&lt;/pre&gt;

&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/911/original/sh_llama3_quick_start_7.png?1713533328&quot; alt=&quot;Llama3 example result&quot; unselectable=&quot;on&quot;&gt;
&lt;p&gt;Now you can use Llama 3 in your applications.&lt;/p&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/574-your-own-llama-2-in-linux&quot;&gt;Your own LLaMa 2 in Linux&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/578-your-own-qwen-using-hf&quot;&gt;Your own Qwen using HF&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/579-qwen-2-vs-llama-3&quot;&gt;Qwen 2 vs Llama 3&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/000/904/original/il_llama3_quick_start.jpg?1713533056"
        length="0"
        type="image/jpeg"/>
      <pubDate>Mon, 20 Jan 2025 09:05:10 +0100</pubDate>
      <guid isPermaLink="false">573</guid>
      <dc:date>2025-01-20 09:05:10 +0100</dc:date>
    </item>
    <item>
      <title>StarCoder: your local coding assistant</title>
      <link>https://www.leadergpu.com/catalog/571-starcoder-your-local-coding-assistant</link>
<description>&lt;p&gt;GitHub Copilot from Microsoft has brought about a revolution in the field of software development. This AI assistant greatly helps developers with various coding tasks, making their lives easier. However, one drawback is that it isn’t a standalone application but rather a cloud-based service. This means that users must agree to the terms and conditions of the service and pay for a subscription.&lt;/p&gt;
&lt;p&gt;Fortunately, the world of open-source software provides us with numerous alternatives. As of the time of writing this article, the most notable alternative to Copilot is StarCoder, developed by the BigCode project. StarCoder is a large neural network model with 15.5B parameters, trained on more than 80 programming languages.&lt;/p&gt;
&lt;p&gt;This model is distributed on Hugging Face (HF) as a &lt;a href=&quot;https://huggingface.co/docs/hub/models-gated&quot; target=&quot;_blank&quot;&gt;gated model&lt;/a&gt; under the &lt;a href=&quot;https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement&quot; target=&quot;_blank&quot;&gt;BigCode OpenRAIL-M v1 license agreement&lt;/a&gt;. You can download and use this model for free, but you need an HF account with a linked SSH key. There are a few additional steps to take before you can download it.&lt;/p&gt;
&lt;h2&gt;Add SSH key to HF&lt;/h2&gt;
&lt;p&gt;Before starting, you need to set up port forwarding (remote port 7860 to 127.0.0.1:7860) in your SSH-client. You can find additional information in the following articles:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/articles/528-stable-video-diffusion&quot;&gt;Stable Video Diffusion&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/articles/488-connect-to-a-linux-server&quot;&gt;Connect to a Linux server&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Update the package cache repository and installed packages:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update &amp;&amp; sudo apt -y upgrade&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let’s install Python’s system package manager (PIP):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt install python3-pip&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Generate and add an SSH-key that you can use in Hugging Face:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd ~/.ssh &amp;&amp; ssh-keygen&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;When the keypair is generated, you can display the public key in the terminal emulator:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cat id_rsa.pub&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Copy all information starting from &lt;b translate=&quot;no&quot;&gt;ssh-rsa&lt;/b&gt; and ending with &lt;b translate=&quot;no&quot;&gt;usergpu@gpuserver&lt;/b&gt; as shown in the following screenshot:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/907/original/sh_llama3_quick_start_3.png?1713533169&quot; alt=&quot;Copy RSA key&quot;&gt;
&lt;p&gt;Open a web browser, type &lt;a href=&quot;https://huggingface.co/&quot; target=&quot;_blank&quot;&gt;https://huggingface.co/&lt;/a&gt; into the address bar and press &lt;b translate=&quot;no&quot;&gt;Enter&lt;/b&gt;. Log into your HF-account and open &lt;a href=&quot;https://huggingface.co/settings/profile&quot; target=&quot;_blank&quot;&gt;Profile settings&lt;/a&gt;. Then choose &lt;b translate=&quot;no&quot;&gt;SSH and GPG Keys&lt;/b&gt; and click on the &lt;b translate=&quot;no&quot;&gt;Add SSH Key&lt;/b&gt; button:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/908/original/sh_llama3_quick_start_4.png?1713533229&quot; alt=&quot;Add SSH key&quot;&gt;
&lt;p&gt;Fill in the &lt;b translate=&quot;no&quot;&gt;Key name&lt;/b&gt; and paste the copied &lt;b translate=&quot;no&quot;&gt;SSH Public key&lt;/b&gt; from the terminal. Save the key by pressing &lt;b translate=&quot;no&quot;&gt;Add key&lt;/b&gt;:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/909/original/sh_llama3_quick_start_5.png?1713533267&quot; alt=&quot;Paste the key&quot;&gt;
&lt;p&gt;Now, your HF account is linked with the public SSH key, while the private key remains stored on the server. The next step is to install Git LFS (Large File Storage), a git extension used for downloading large files such as neural network models. Open your home directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd ~/&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Download and run the shell script. It adds a third-party repository that provides git-lfs:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, you can install it using the standard package manager:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt-get install git-lfs&lt;/code&gt;&lt;/pre&gt;
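&lt;p&gt;It’s also worth initializing the LFS hooks once for your user, so that git fetches large files automatically during cloning:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git lfs install&lt;/code&gt;&lt;/pre&gt;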
&lt;p&gt;Let’s configure git to use our HF nickname:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git config --global user.name &quot;John&quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And the email address linked to the HF account:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git config --global user.email &quot;john.doe@example.com&quot;&lt;/code&gt;&lt;/pre&gt;
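&lt;p&gt;Before cloning, you can optionally check that Hugging Face accepts your key; if everything is linked correctly, the service should greet you with your username:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;ssh -T git@hf.co&lt;/code&gt;&lt;/pre&gt;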
&lt;h2&gt;Download the model&lt;/h2&gt;
&lt;p&gt;&lt;font color=&quot;red&quot;&gt;&lt;i&gt;Please note that StarCoder in binary format may take up a significant amount of disk space (&gt;75 GB). Don’t forget to refer to &lt;a href=&quot;https://www.leadergpu.com/articles/492-disk-partitioning-in-linux&quot;&gt;this article&lt;/a&gt; to ensure you’re using the correct mounted partition.&lt;/i&gt;&lt;/font&gt;&lt;/p&gt;
&lt;p&gt;Everything is ready for the model download. Open the target directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd /mnt/fastdisk&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And start downloading the repository:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git clone git@hf.co:bigcode/starcoder&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This process takes up to 15 minutes, so please be patient. You can monitor the progress by executing the following command in another SSH session:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;watch -n 0.5 df -h&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here, you’ll see the free disk space on the mounted disk decrease, confirming that the download is progressing and the data is being saved. The status refreshes every half-second. To stop watching, press the &lt;b translate=&quot;no&quot;&gt;Ctrl + C&lt;/b&gt; shortcut.&lt;/p&gt;
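&lt;p&gt;Once the clone completes, you can check the final size of the downloaded repository (the path assumes the target directory used above):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;du -sh /mnt/fastdisk/starcoder&lt;/code&gt;&lt;/pre&gt;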
&lt;h2&gt;Run the full model with WebUI&lt;/h2&gt;
&lt;p&gt;Clone the project’s repository:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git clone https://github.com/oobabooga/text-generation-webui.git&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Open the downloaded directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd text-generation-webui&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Execute the start script:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;./start_linux.sh --model-dir /mnt/fastdisk&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The script will check for the presence of the necessary dependencies on the server. Any missing dependencies will be installed automatically. When the application starts, open your web browser and type the following address:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;http://127.0.0.1:7860&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Open the &lt;b translate=&quot;no&quot;&gt;Model&lt;/b&gt; tab and select the downloaded model &lt;b translate=&quot;no&quot;&gt;starcoder&lt;/b&gt; from the drop-down list. Click on the &lt;b translate=&quot;no&quot;&gt;Model loader&lt;/b&gt; list and choose &lt;b translate=&quot;no&quot;&gt;Transformers&lt;/b&gt;. Set the GPU memory slider to its maximum for each installed GPU. This is very important: leaving it at 0 restricts VRAM usage and prevents the model from loading correctly. Also set the maximum RAM usage. Now, click the &lt;b translate=&quot;no&quot;&gt;Load&lt;/b&gt; button and wait for the loading process to complete:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/969/original/sh_starcoder_your_local_coding_assistant_1.png?1714386546&quot; alt=&quot;Load StarCoder model&quot;&gt;
&lt;p&gt;Switch to the &lt;b translate=&quot;no&quot;&gt;Chat&lt;/b&gt; tab and test a conversation with the model. Please note that StarCoder isn’t intended for dialogue like ChatGPT. However, it can be useful for checking code for errors and suggesting solutions.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/970/original/sh_starcoder_your_local_coding_assistant_2.png?1714386599&quot; alt=&quot;Run the StarCoder&quot;&gt;
&lt;p&gt;If you want a full-fledged dialogue model, you could try two other models: &lt;a href=&quot;https://huggingface.co/HuggingFaceH4/starchat-alpha&quot; target=&quot;_blank&quot;&gt;starchat-alpha&lt;/a&gt; and &lt;a href=&quot;https://huggingface.co/HuggingFaceH4/starchat-beta&quot; target=&quot;_blank&quot;&gt;starchat-beta&lt;/a&gt;. These models were fine-tuned to conduct a dialogue just like ChatGPT does. The following commands download these models:&lt;/p&gt;
&lt;p&gt;For starchat-alpha:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git clone git@hf.co:HuggingFaceH4/starchat-alpha&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For starchat-beta:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git clone git@hf.co:HuggingFaceH4/starchat-beta&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The loading procedure is the same as described above. There is also a &lt;a href=&quot;https://github.com/bigcode-project/starcoder.cpp/tree/main&quot; target=&quot;_blank&quot;&gt;C++ implementation&lt;/a&gt; of StarCoder, which is effective for CPU inference.&lt;/p&gt;
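&lt;p&gt;As a rough sketch, getting starcoder.cpp usually starts with cloning and building it; the exact build and model-conversion steps may differ, so consult the repository’s README:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;git clone https://github.com/bigcode-project/starcoder.cpp
cd starcoder.cpp
make&lt;/code&gt;&lt;/pre&gt;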
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/574-your-own-llama-2-in-linux&quot;&gt;Your own LLaMa 2 in Linux&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/576-your-own-vicuna-in-linux&quot;&gt;Your own Vicuna in Linux&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/578-your-own-qwen-using-hf&quot;&gt;Your own Qwen using HF&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/000/971/original/il_starcoder_your_local_coding_assistant.jpg?1714386646"
        length="0"
        type="image/jpeg"/>
      <pubDate>Fri, 17 Jan 2025 14:52:58 +0100</pubDate>
      <guid isPermaLink="false">571</guid>
      <dc:date>2025-01-17 14:52:58 +0100</dc:date>
    </item>
    <item>
      <title>Stable Diffusion Models: customization and options</title>
      <link>https://www.leadergpu.com/catalog/566-stable-diffusion-models-customization-and-options</link>
      <description>&lt;p&gt;Tuning is an excellent way to enhance every car or gadget. Generative neural networks can be tuned as well. Today, we don&#39;t want to delve deeply into the structure of Stable Diffusion, but we aim to achieve better results than a standard setup.&lt;/p&gt;
&lt;p&gt;There are two easy ways to do this: installing custom models and utilizing standard optimization options. In this article, we’ll learn how to install new models into Stable Diffusion and which options allow us to use hardware more effectively.&lt;/p&gt;
&lt;p&gt;If you want to share funny pictures of cute cats or great looking food, you usually post them on Instagram. If you develop applications and want to make the code available to everyone, you post it on GitHub. But if you train a graphical AI-model and want to share it, you should pay attention to &lt;a href=&quot;https://civitai.com/&quot;&gt;CivitAI&lt;/a&gt;. This is a huge platform to share knowledge and results with community members.&lt;/p&gt;
&lt;p&gt;Before you start downloading, you need to change the working directory. All AI models in Stable Diffusion are placed in the &quot;models&quot; directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;cd stable-diffusion-webui/models/Stable-diffusion&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let&#39;s check which models are provided by default:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;ls -a&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&#39;Put Stable Diffusion checkpoints here.txt&#39;
v1-5-pruned-emaonly.safetensors&lt;/pre&gt;
&lt;p&gt;There is only one model with the name “v1-5-pruned-emaonly” and the extension “safetensors”. This model is a good starting point, but we have five more interesting models. Let’s download and compare them with the standard model.&lt;/p&gt;
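&lt;p&gt;For models hosted on Hugging Face, the checkpoint can usually be fetched straight into this directory with wget; the file name below is an assumption, so check the repository’s file list for the actual checkpoint:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;wget https://huggingface.co/XpucT/Deliberate/resolve/main/Deliberate_v2.safetensors&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Models from CivitAI can be downloaded the same way using the download link on the model’s page.&lt;/p&gt;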
&lt;h2&gt;Stable Diffusion prompts&lt;/h2&gt;
&lt;p&gt;To visually show the difference between them, we came up with a simple prompt:&lt;/p&gt;
&lt;p&gt;&lt;b translate=&quot;no&quot;&gt;princess, magic, fairy tales, portrait, 85mm, colorful&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;For many models, accurately representing geometry and facial features can be a significant challenge. To address this, add negative prompts to ensure images are generated without these characteristics:&lt;/p&gt;
&lt;p&gt;&lt;b translate=&quot;no&quot;&gt;poorly rendered face, poorly drawn face, poor facial details, poorly drawn hands, poorly rendered hands, low resolution, bad composition, mutated body parts, blurry image, disfigured, oversaturated, bad anatomy, deformed body features&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Set the maximum value of sampling steps (150) to get more details in the result.&lt;/p&gt;
&lt;h3&gt;Standard model&lt;/h3&gt;
&lt;p&gt;The standard model performs well in such tasks. However, some details are not quite accurate. For example, there is a problem with the eyes: they are clearly out of proportion:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/816/original/sh_stable_diffusion_models_customization_and_options_1.png?1712233278&quot; alt=&quot;Stable Diffusion Models standard&quot;&gt;
&lt;p&gt;If you look at the diadem, it is also crooked and asymmetrical. The rest of the details are well-executed and correspond to the given prompts. The background is blurry because we set the prompt “85mm”. This is a very commonly used focal length for portraits in professional photography.&lt;/p&gt;
&lt;h3&gt;Realistic Vision&lt;/h3&gt;
&lt;p&gt;This model is great for portraits. The image appears as if taken with a quality lens with the specified focal length. The proportions of the face and body are accurate, the dress fits perfectly, and the diadem on the head looks aesthetically pleasing:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/817/original/sh_stable_diffusion_models_customization_and_options_2.png?1712233379&quot; alt=&quot;Stable Diffusion Models Realistic Vision&quot;&gt;
&lt;p&gt;By the way, the author recommends using the following template for negative prompts:&lt;/p&gt;
&lt;p&gt;&lt;b translate=&quot;no&quot;&gt;deformed iris, deformed pupils, semi-realistic, cgi, 3d, render, sketch, cartoon, drawing, anime:1.4), text, close up, cropped, out of frame, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;But even with our quite simple prompts, the result is excellent.&lt;/p&gt;
&lt;p&gt;Download the model here: &lt;a href=&quot;https://civitai.com/models/4201/realistic-vision-v20&quot;&gt;Realistic Vision&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;Deliberate&lt;/h3&gt;
&lt;p&gt;Another amazing model for such purposes. The details are also well worked out here, but be careful and monitor the number of fingers. This is a very common problem with neural networks: they can often draw extra fingers or even entire limbs.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/818/original/sh_stable_diffusion_models_customization_and_options_3.png?1712233625&quot; alt=&quot;Stable Diffusion Models Deliberate&quot;&gt;
&lt;p&gt;Leading lines are one of cinematography’s favorite techniques, so this model also chose to place the subject against the backdrop of a forest path.&lt;/p&gt;
&lt;p&gt;Download the model here: &lt;a href=&quot;https://huggingface.co/XpucT/Deliberate&quot;&gt;Deliberate&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;OpenJourney&lt;/h3&gt;
&lt;p&gt;Among generative neural networks, Midjourney (MJ) has received special attention. MJ was a pioneer in this field and is often held up as an example to others. The images it creates have a unique style. OpenJourney is inspired by the MJ style and is a suitably tuned Stable Diffusion.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/819/original/sh_stable_diffusion_models_customization_and_options_4.png?1712233730&quot; alt=&quot;Stable Diffusion Models OpenJourney&quot;&gt;
&lt;p&gt;The generated images look like cartoons: vibrant and bright. For better results, add the &lt;b translate=&quot;no&quot;&gt;mdjrny-v4&lt;/b&gt; style prompt.&lt;/p&gt;
&lt;p&gt;Download the model here: &lt;a href=&quot;https://huggingface.co/prompthero/openjourney&quot;&gt;OpenJourney&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;Anything&lt;/h3&gt;
&lt;p&gt;This model creates images akin to those of a professional manga artist (an artist who draws Japanese comics). Thus, we got an anime-style princess.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/820/original/sh_stable_diffusion_models_customization_and_options_5.png?1712233804&quot; alt=&quot;Stable Diffusion Models Anything&quot;&gt;
&lt;p&gt;This model was trained on images with a resolution of 768x768, so you may set this resolution to get better results than the standard 512x512.&lt;/p&gt;
&lt;p&gt;Download the model here: &lt;a href=&quot;https://civitai.com/models/66/anything-v3&quot;&gt;Anything&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;Corporate Memphis&lt;/h3&gt;
&lt;p&gt;This style of images gained wild popularity in the early 2020s and was widely used as a corporate style in different high-tech companies. Despite criticism, it is often found in presentations and websites.&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/821/original/sh_stable_diffusion_models_customization_and_options_6.png?1712233943&quot; alt=&quot;Stable Diffusion Models Corporate Memphis&quot;&gt;
&lt;p&gt;The princess turned out to be minimalistic, but quite pretty. Particularly amusing were the details that the model placed on the background.&lt;/p&gt;
&lt;p&gt;Download the model here: &lt;a href=&quot;https://huggingface.co/jinofcoolnes/corporate_memphis&quot;&gt;Corporate Memphis&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Stable Diffusion Options&lt;/h2&gt;
&lt;p&gt;Stable Diffusion consumes a lot of resources, so many launch options have been developed for it. The most popular of them is &lt;b translate=&quot;no&quot;&gt;--xformers&lt;/b&gt;. This option enables two optimization mechanisms: the first reduces memory consumption, and the second increases speed.&lt;/p&gt;
&lt;p&gt;If you try to add --xformers without additional steps, you will get an error saying that the packages (&lt;a href=&quot;https://pypi.org/project/torch/&quot;&gt;torch&lt;/a&gt; and &lt;a href=&quot;https://pypi.org/project/torchvision/&quot;&gt;torchvision&lt;/a&gt;) are compiled for different versions of CUDA®. To fix this, we need to enter the Python virtual environment (venv) used by Stable Diffusion. After that, install the packages built for the desired version of CUDA® (11.8).&lt;/p&gt;
&lt;p&gt;First, update the apt package cache and install the package installer for Python (pip), for example:&lt;/p&gt;
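&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update &amp;&amp; sudo apt -y install python3-pip&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The next step is to activate the Python venv with the &lt;b translate=&quot;no&quot;&gt;activate&lt;/b&gt; script:&lt;/p&gt;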
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;source stable-diffusion-webui/venv/bin/activate&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After that, the command line prompt changes to &lt;b translate=&quot;no&quot;&gt;(venv) username@hostname:~$&lt;/b&gt;. Let’s install the torch and torchvision packages built for CUDA® 11.8:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-venv&quot;&gt;pip install torch==2.0.0+cu118 torchvision==0.15.1+cu118 --index-url https://download.pytorch.org/whl/cu118&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This process may take several minutes because the packages are quite large; you’ll have just enough time to pour yourself some coffee. Finally, deactivate the virtual environment and start Stable Diffusion with the &lt;b translate=&quot;no&quot;&gt;--xformers&lt;/b&gt; option (replace &lt;b translate=&quot;no&quot;&gt;[user]&lt;/b&gt; and &lt;b translate=&quot;no&quot;&gt;[password]&lt;/b&gt; with your own values):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-venv&quot;&gt;deactivate&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;./webui.sh --xformers --listen --gradio-auth [user]:[password]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;An alternative to &lt;b translate=&quot;no&quot;&gt;--xformers&lt;/b&gt; is &lt;b translate=&quot;no&quot;&gt;--opt-sdp-no-mem-attention&lt;/b&gt;: it consumes more memory but works a bit faster, and it requires no additional steps.&lt;/p&gt;
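&lt;p&gt;For example, launching with this option instead (same placeholders as above):&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;./webui.sh --opt-sdp-no-mem-attention --listen --gradio-auth [user]:[password]&lt;/code&gt;&lt;/pre&gt;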
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Today, we examined the capabilities of Stable Diffusion when combined with custom models and optimization options. Remember, by increasing or decreasing the number of sampling steps, you can adjust the level of detail in the final image.&lt;/p&gt;
&lt;p&gt;Of course, this is only a small part of what you can do with such a generative neural network. So &lt;a href=&quot;https://www.leadergpu.com/#chose-best&quot;&gt;order a GPU-server right now&lt;/a&gt; and start experimenting. Many more discoveries and opportunities await you. High-speed and powerful video cards will help you save time and generate cool images.&lt;/p&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/565-stable-diffusion-webui&quot;&gt;Stable Diffusion WebUI&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/598-easy-diffusion-ui&quot;&gt;Easy Diffusion UI&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/595-pytorch-for-linux&quot;&gt;PyTorch for Linux&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/596-pytorch-for-windows&quot;&gt;PyTorch for Windows&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/000/815/original/il_stable_diffusion_models_customization_and_options.png?1712233216"
        length="0"
        type="image/jpeg"/>
      <pubDate>Mon, 25 Nov 2024 13:30:16 +0100</pubDate>
      <guid isPermaLink="false">566</guid>
      <dc:date>2024-11-25 13:30:16 +0100</dc:date>
    </item>
    <item>
      <title>Stable Diffusion WebUI</title>
      <link>https://www.leadergpu.com/catalog/565-stable-diffusion-webui</link>
      <description>&lt;p&gt;Generative neural networks seem magical. They answer questions, create images, and even write code in various programming languages. The success of these networks has two components: pre-trained models and hardware accelerators. Certainly, it&#39;s possible to use CPU cores for this workload, but it would be like a snail race. Generating one small picture can take a significant amount of time - tens of minutes. Generating the same picture on a GPU would take hundreds of times less.&lt;/p&gt;
&lt;p&gt;The first secret lies in the number of cores. CPU cores are universal and can handle complex instructions. However, conventional server processors have a maximum of 64 cores. Even in multiprocessor systems, the number of cores rarely exceeds 256. GPU cores are simpler, but as a result, many more of them fit on the chip. For example, one NVIDIA® RTX™ 4090 has 16,384 cores.&lt;/p&gt;
&lt;p&gt;The second secret is that the workload can be divided into many simple tasks, which can be run in parallel threads on dedicated GPU cores. This trick significantly speeds up data processing. Today, we will see how it works and deploy a generative neural network &lt;a href=&quot;https://github.com/Stability-AI/stablediffusion&quot;&gt;Stable Diffusion Web UI&lt;/a&gt; on the &lt;a href=&quot;https://www.leadergpu.com/&quot;&gt;LeaderGPU&lt;/a&gt; infrastructure. Take, for example, a server with an NVIDIA® RTX™ 4090 which has 16,384 GPU cores. As an operating system, we selected the current LTS-release Ubuntu 22.04 and chose the “Install NVIDIA® drivers and CUDA® 11.8” option.&lt;/p&gt;
&lt;h2&gt;System preparation&lt;/h2&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/811/original/sh_stable_diffusion_webui_1.png?1712212269&quot; alt=&quot;Stable Diffusion WebUI system prepare&quot;&gt;
&lt;p&gt;Before we start, let&#39;s consider storage. Stable Diffusion is a large system that can occupy up to 13G on your hard disk. The standard virtual disk in this LeaderGPU installation is 50G, and the operating system takes up about 25G of it. If we deploy Stable Diffusion without extending the home partition, we’ll exhaust all free space and encounter a &quot;No space left on device&quot; error. It&#39;s a good idea to extend our home directory.&lt;/p&gt;
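&lt;p&gt;You can check how much space is currently available in the home partition before starting:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;df -h /home&lt;/code&gt;&lt;/pre&gt;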
&lt;h3&gt;Extend home directory&lt;/h3&gt;
&lt;p&gt;First, we need to check all available disks.&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo fdisk -l&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;Disk /dev/sda: 447.13 GiB, 480103981056 bytes, 937703088 sectors
Disk model: INTEL SSDSC2KB48
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

Disk /dev/sdb: 50 GiB, 53687091200 bytes, 104857600 sectors
Disk model: VIRTUAL-DISK
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 9D4C1F0C-D4A7-406E-AECB-BF57E4726437&lt;/pre&gt;
&lt;p&gt;Then we need to create a new Linux partition on our physical SSD-drive, /dev/sda:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo fdisk /dev/sda&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Press the following keys, one by one: &lt;b translate=&quot;no&quot;&gt;g → n → Enter → Enter → Enter → w&lt;/b&gt;. This will result in a new /dev/sda1 partition without a filesystem. Now, create an ext4 filesystem on it:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo mkfs.ext4 /dev/sda1&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;When the process is finished, we move to the next step.&lt;/p&gt;
&lt;p&gt;&lt;font color=&quot;red&quot;&gt;&lt;i&gt;Warning! Please proceed with the following operation with great care. Any mistake made while modifying the fstab file can result in your server being unable to boot normally and may require a complete reset of the operating system.&lt;/i&gt;&lt;/font&gt;&lt;/p&gt;
&lt;p&gt;First, list the block devices and their UUIDs:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo blkid&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;/dev/sdb2: UUID=&quot;6b17e542-0934-4dba-99ca-a00bd260c247&quot; BLOCK_SIZE=&quot;4096&quot; TYPE=&quot;ext4&quot; PARTUUID=&quot;70030755-75d8-4339-a4e0-26a97f1d1c5d&quot;
/dev/loop1: TYPE=&quot;squashfs&quot;
/dev/sdb1: PARTUUID=&quot;63ff1714-bd29-4062-be04-21af32423c0a&quot;
/dev/loop4: TYPE=&quot;squashfs&quot;
/dev/loop0: TYPE=&quot;squashfs&quot;
/dev/sda1: UUID=&quot;fb2ba455-2b8d-4da0-8719-ce327d0026bc&quot; BLOCK_SIZE=&quot;4096&quot; TYPE=&quot;ext4&quot; PARTUUID=&quot;6e0108df-b000-5848-8328-b187daf37a4f&quot;
/dev/loop5: TYPE=&quot;squashfs&quot;
/dev/loop3: TYPE=&quot;squashfs&quot;&lt;/pre&gt;
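&lt;p&gt;If you prefer to extract just the UUID non-interactively, blkid can print that single field:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo blkid -s UUID -o value /dev/sda1&lt;/code&gt;&lt;/pre&gt;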
&lt;p&gt;Copy the &lt;b translate=&quot;no&quot;&gt;UUID&lt;/b&gt; of the &lt;b translate=&quot;no&quot;&gt;/dev/sda1&lt;/b&gt; partition (fb2ba455-2b8d-4da0-8719-ce327d0026bc in this example). Next, we will instruct the system to automatically mount this drive by its UUID at boot time:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo nano /etc/fstab&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Add the following line before the &lt;b translate=&quot;no&quot;&gt;/swap.img&lt;/b&gt;… line, replacing the UUID placeholder with the value you copied:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash&quot;&gt;/dev/disk/by-uuid/&lt;PARTITION UUID&gt; /home/usergpu ext4 defaults 0 0&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Example:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;# /etc/fstab: static file system information.
#
# Use &#39;blkid&#39; to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# &lt;file system&gt; &lt;mount point&gt;   &lt;type&gt;  &lt;options&gt;       &lt;dump&gt;  &lt;pass&gt;
# / was on /dev/sdb2 during curtin installation
/dev/disk/by-uuid/6b17e542-0934-4dba-99ca-a00bd260c247 / ext4 defaults,_netdev 0 1
/dev/disk/by-uuid/fb2ba455-2b8d-4da0-8719-ce327d0026bc /home/usergpu ext4 defaults 0 0
/swap.img       none    swap    sw      0       0&lt;/pre&gt;
&lt;p&gt;Exit with the &lt;b translate=&quot;no&quot;&gt;Ctrl + X&lt;/b&gt; keyboard shortcut, press &lt;b translate=&quot;no&quot;&gt;Y&lt;/b&gt; to confirm saving the file, then press &lt;b translate=&quot;no&quot;&gt;Enter&lt;/b&gt;. The new settings will be applied at the next system start. Let’s reboot the server:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo shutdown -r now&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After rebooting, we can check all mounted directories with the following command:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;df -h&lt;/code&gt;&lt;/pre&gt;
&lt;pre translate=&quot;no&quot;&gt;Filesystem      Size  Used Avail Use% Mounted on
tmpfs           6.3G  1.7M  6.3G   1% /run
/dev/sdb2        49G   23G   24G  50% /
tmpfs            32G     0   32G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
/dev/sda1       440G   28K  417G   1% /home/usergpu
tmpfs           6.3G  4.0K  6.3G   1% /run/user/1000&lt;/pre&gt;
&lt;p&gt;Superb! But we can’t write to our home directory yet, because the freshly mounted partition is owned by root. It’s time to reclaim ownership of the directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo chown -R usergpu /home/usergpu&lt;/code&gt;&lt;/pre&gt;
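&lt;p&gt;You can confirm the new owner of the directory:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;ls -ld /home/usergpu&lt;/code&gt;&lt;/pre&gt;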
&lt;p&gt;Good job! Let’s move to the next step.&lt;/p&gt;
&lt;h3&gt;Install basic packages&lt;/h3&gt;
&lt;p&gt;Update the software cache from the official Ubuntu repositories and upgrade some packages:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt update &amp;&amp; sudo apt -y upgrade&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The system may inform you that a new kernel was installed and will become operational after a reboot. Select &lt;b translate=&quot;no&quot;&gt;OK&lt;/b&gt; twice.&lt;/p&gt;
&lt;p&gt;Next, we need to install the dependencies that Stable Diffusion requires. The first package adds Python virtual environment functionality:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt install python3-venv&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The second package adds Google’s customized implementation of the C programming language’s &lt;b translate=&quot;no&quot;&gt;malloc()&lt;/b&gt; function (TCMalloc). It prevents the &lt;b translate=&quot;no&quot;&gt;“Cannot locate TCMalloc”&lt;/b&gt; error and improves system memory usage:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt install -y --no-install-recommends google-perftools&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Finally, reboot the server again:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo shutdown -r now&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Stable Diffusion AUTOMATIC1111: install script&lt;/h2&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/812/original/sh_stable_diffusion_webui_2.png?1712212341&quot; alt=&quot;Stable Diffusion WebUI install script&quot;&gt;
&lt;p&gt;The easiest way to install Stable Diffusion with WebUI is by using the premade script written by GitHub user &lt;a href=&quot;https://github.com/AUTOMATIC1111&quot;&gt;AUTOMATIC1111&lt;/a&gt;. This script downloads and sets up both components while resolving all necessary dependencies.&lt;/p&gt;
&lt;p&gt;Let’s download the script:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;wget https://raw.githubusercontent.com/AUTOMATIC1111/stable-diffusion-webui/master/webui.sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then, make it executable:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;chmod a+x webui.sh&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Execute the downloaded script:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;./webui.sh &lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This process may take a couple of minutes. Everything is ready to create perfect images with Stable Diffusion.&lt;/p&gt;
&lt;h3&gt;Troubleshooting&lt;/h3&gt;
&lt;p&gt;If you encounter the error “Torch is not able to use GPU”, you can fix it by reinstalling the NVIDIA® driver via apt:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo apt -y install nvidia-driver-535&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You need to reboot the operating system to enable the driver:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;sudo shutdown -r now&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Generate&lt;/h2&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/813/original/sh_stable_diffusion_webui_3.png?1712212549&quot; alt=&quot;Stable Diffusion WebUI run script&quot;&gt;
&lt;p&gt;The installation script &lt;b translate=&quot;no&quot;&gt;./webui.sh&lt;/b&gt; has another function. It simultaneously serves both the Stable Diffusion backend and the WebUI. However, if you run it without arguments, the server will only be available locally at &lt;a href=&quot;http://127.0.0.1:7860&quot;&gt;http://127.0.0.1:7860&lt;/a&gt;. This can be solved in two ways: port forwarding through an SSH tunnel or allowing connections from external IPs.&lt;/p&gt;
&lt;p&gt;The second way is simpler: just add the &lt;b translate=&quot;no&quot;&gt;--listen&lt;/b&gt; option and you can connect to the web interface at &lt;b translate=&quot;no&quot;&gt;http://[YOUR_LEADERGPU_SERVER_IP_ADDRESS]:7860&lt;/b&gt;. However, this is completely insecure, as every internet user will have access. To prevent unauthorized usage, add the &lt;b translate=&quot;no&quot;&gt;--gradio-auth&lt;/b&gt; option with a username and password separated by a colon:&lt;/p&gt;
&lt;pre translate=&quot;no&quot;&gt;&lt;code translate=&quot;no&quot; class=&quot;bash-user&quot;&gt;./webui.sh --listen --gradio-auth user:password&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This adds a login page to your WebUI instance. On the first run, the script will download the basic models and required dependencies:&lt;/p&gt;
&lt;img src=&quot;https://assets.getwildcard.com/system/images/imgs/000/000/814/original/sh_stable_diffusion_webui_4.png?1712212654&quot; alt=&quot;Stable Diffusion WebUI Gradio&quot;&gt;
&lt;p&gt;You can enjoy the result. Just enter a few prompts, separate them by commas, and click the Generate button. After a few seconds, an image generated by the neural network will be displayed.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;We&#39;ve come all the way from an empty LeaderGPU server with just a pre-installed operating system to a ready instance with Stable Diffusion and a WebUI interface. Next time, we’ll learn more about software performance tuning and how to properly enhance your Stable Diffusion instance with new versions of drivers and packages.&lt;/p&gt;
&lt;p&gt;See also:&lt;/p&gt;
&lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/566-stable-diffusion-models-customization-and-options&quot;&gt;Stable Diffusion Models: customization and options&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/598-easy-diffusion-ui&quot;&gt;Easy Diffusion UI&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/595-pytorch-for-linux&quot;&gt;PyTorch for Linux&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;https://www.leadergpu.com/catalog/596-pytorch-for-windows&quot;&gt;PyTorch for Windows&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
      <enclosure url="https://assets.getwildcard.com/system/images/imgs/000/000/810/original/il_stable_diffusion_webui.png?1712212156"
        length="0"
        type="image/jpeg"/>
      <pubDate>Mon, 25 Nov 2024 13:24:45 +0100</pubDate>
      <guid isPermaLink="false">565</guid>
      <dc:date>2024-11-25 13:24:45 +0100</dc:date>
    </item>
  </channel>
</rss>