MCP Server Based on N8N

The development of generative neural networks has accelerated significantly in recent years. They’ve become noticeably faster and more accurate in their responses and have learned to reason. However, their capabilities are still fundamentally limited by their architecture. For example, every existing LLM at the time of writing has a knowledge cutoff date. This means that with each passing day, such an LLM becomes more likely to produce incorrect answers, simply because it lacks information about events that occurred after that date.
This limitation necessitates retraining the model entirely on fresher data, which is expensive and time-consuming. But there is another way. If you enable the model to interact with the outside world, it can look up the information a user asks about on its own, without any retraining.
This is roughly how the RAG (Retrieval-Augmented Generation) mechanism works. When answering a question, the model first queries a pre-prepared vector database, and if it finds relevant information, it incorporates it into the prompt. Thus, by populating and updating the vector database, the quality of LLM responses can be greatly improved.
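As a rough sketch of that flow, in Python-like pseudocode (the vector_db and llm objects here are purely illustrative and not tied to any specific library):

def answer_with_rag(question, vector_db, llm, top_k=3):
    # 1. Look up documents related to the question in the vector database
    documents = vector_db.search(question, top_k=top_k)
    # 2. Inject the retrieved text into the prompt as additional context
    context = "\n".join(doc.text for doc in documents)
    prompt = f"Use the context below to answer.\n\nContext:\n{context}\n\nQuestion: {question}"
    # 3. The model generates the final answer with the fresh context in hand
    return llm.generate(prompt)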
But there is another, even more interesting way to embed up-to-date context into prompts. It's called MCP, which stands for Model Context Protocol. It was originally developed by Anthropic for its Claude models. The turning point came when MCP was open-sourced, allowing thousands of AI researchers and engineers to build custom servers for various purposes.
The essence of MCP is to give a neural network model access to tools with which it can independently update its knowledge and perform various actions to efficiently solve given tasks. The model itself decides which tool to use and whether it’s appropriate in each situation.
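Under the hood, MCP clients and servers exchange JSON-RPC messages. For illustration, a tool invocation request sent by a client looks roughly like this (the tool name and arguments are just an example):

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "calculator",
    "arguments": { "input": "2 + 2" }
  }
}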
Support for MCP soon appeared in various IDEs like Cursor, as well as in automation platforms like N8N. The latter is especially intuitive, as workflows are created visually, making it easier to understand. Within N8N, you can either connect to an existing MCP server or create your own. Moreover, you can even organize a direct connection within a single workflow. But let’s go step by step.
Creating a Simple AI Agent
Before getting started, make sure the main requirement is met: you have an LLM ready to accept connections. This could be a locally running model served by Ollama or an external service like OpenAI's ChatGPT. In the first case, you'll need to know the local Ollama API address (and, optionally, its authentication credentials); in the second, you'll need an active OpenAI account with sufficient credits.
Building an agent starts with the key AI Agent node. At a minimum, it must be linked to two other nodes: one to act as a trigger and the other to connect to the LLM. If you don't specify a trigger, the system will create one automatically, triggering the agent upon receiving any message in the internal chat:

The only missing piece is the LLM. For instance, you can use our Open WebUI: All in one guide to set up Ollama with a web interface. The one change required is that the N8N and Open WebUI containers must be on the same Docker network. For example, if the N8N container is on a network named web, then in the deployment command for Open WebUI, replace --network=host with --network=web.
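For illustration only, the relevant commands could look something like this; the network name web, the container names, and the image tags are assumptions, so keep the rest of the flags from the original guide:

# A shared Docker network for both containers (assumed name: web)
docker network create web

# N8N attached to the shared network
docker run -d --name n8n --network=web -p 5678:5678 n8nio/n8n

# Open WebUI with bundled Ollama: --network=web replaces --network=host
docker run -d --name open-webui --network=web ghcr.io/open-webui/open-webui:ollama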
In some cases, you will also need to set the OLLAMA_HOST environment variable manually, for example: -e OLLAMA_HOST=0.0.0.0. This allows connections to the Ollama API not only from localhost but also from other containers. Suppose Ollama is deployed in a container named open-webui. Then the base URL for connecting from N8N would be:
http://open-webui:11434
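A quick way to verify that the Ollama API is reachable over the shared network is to run a throwaway container on it (again assuming the network is named web and the Ollama container is named open-webui):

# Should return a JSON list of installed models if the connection works
docker run --rm --network=web curlimages/curl http://open-webui:11434/api/tags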
Before connecting the Ollama Chat Model node, don’t forget to download at least one model. You can do this either from the web interface or via the container CLI. The following command will download the Llama 3.1 model with 8 billion parameters:
ollama pull llama3.1:8b
Once downloaded and installed, the model will automatically appear in the list of available ones:

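You can also confirm from the CLI that the model has been pulled (assuming the container is named open-webui, as above):

docker exec -it open-webui ollama list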
A minimal working AI Agent workflow looks like this:

In this form, the agent can use only one model and doesn't store input data or enrich prompts using external tools. So it makes sense to add at least the Simple Memory node; for light loads, it is sufficient for storing requests and responses.
But let's go back to MCP. To start, create a server using the special MCP Server Trigger node:

This node is fully self-contained and doesn’t require external activation. It’s triggered solely by an incoming external request to its webhook address. By default, there are two URLs: Test URL and Production URL. The first is used during development, while the second works only when the workflow is saved and activated.

The trigger is useless on its own: it needs connected tools. For example, let's connect one of the simplest tools, a calculator, which expects a mathematical expression as input. Nodes communicate using plain JSON, so for the calculator to compute 2 + 2, the input should be:
[
  {
    "query": {
      "input": "2 + 2"
    }
  }
]
LLMs can easily generate such JSON from a plain-text task description and send it to the node, which performs the calculation and returns the result. Let's connect the MCP client to the agent:

It’s worth noting that this node doesn’t need any additional connections. In its settings, it’s enough to specify the endpoint address where it will send data from the AI Agent. In our example, this address points to the container named n8n.

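Since both containers sit on the same Docker network, the endpoint can reference the N8N container by name and its default port 5678. Purely as an illustration (copy the exact URL from the MCP Server Trigger node rather than typing it by hand), it could look something like:

http://n8n:5678/mcp/<path-generated-by-the-trigger-node>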
Of course, at this stage you can specify any external MCP server address available to you. But for this article, we’ll use a local instance running within N8N. Let’s see how the client and server behave when the AI Agent is asked to perform a simple math operation:

Upon receiving the request, the AI Agent will:
- Search in Simple Memory to see if the user asked this before or if any context can be reused.
- Send the prompt to the LLM, which will correctly break down the math expression and prepare the corresponding JSON.
- Send the JSON to the Calculator tool and receive the result.
- Use the LLM to generate the final response and insert the result into the reply.
- Store the result in Simple Memory.
- Output the message in the chat.
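In other words, the agent runs a classic tool-calling loop. Here is a rough Python-style sketch of that logic; the objects and method names are hypothetical and have nothing to do with N8N internals:

def handle_message(user_message, memory, llm, tools):
    history = memory.load()                                   # reuse earlier context, if any
    decision = llm.plan(user_message, history, list(tools))   # the LLM picks a tool or answers directly
    if decision.tool:
        # e.g. the Calculator tool receives {"input": "2 + 2"} and returns 4
        result = tools[decision.tool].call(decision.arguments)
        answer = llm.respond(user_message, history, tool_result=result)
    else:
        answer = decision.answer
    memory.save(user_message, answer)                         # store the exchange for next time
    return answer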

Similarly, agents can work with other tools on the MCP server. Instead of Simple Memory, you can use more advanced options such as MongoDB, Postgres, Redis, or even something like Zep. These do require a bit of database maintenance, but overall performance will improve significantly.
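For example, a Redis instance for chat memory could be started on the same Docker network (the container name and the web network are assumptions carried over from the earlier setup):

docker run -d --name redis --network=web redis:7

The memory node would then reach it by the container name redis on the default port 6379.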
There are also far more options for tool selection. Out of the box, the MCP Server Trigger node supports over 200 tools. These can be anything, from simple HTTP requests to prebuilt integrations with public internet services. Within a single workflow, you can create both a server and a client. One important thing to note: these nodes can’t be visually connected in the editor, and that’s expected behavior:

Instead of the default trigger, you can use other options such as receiving a message via a messenger, submitting a website form, or executing on a schedule. This lets you set up workflows that react to events or perform routine operations like daily data exports from Google Ads.
And that’s not the end of what’s possible with AI agents. You can build multi-agent systems using different neural network models that work together to solve tasks with greater accuracy, considering many more influencing factors in the process.