Build Your First AI Agent Workflow with n8n

April 14, 2026
16 min read

Most AI agent tutorials start with fifty lines of Python and a vague promise about “autonomous systems.” By the end, you’ve got a script that calls an API and prints to the terminal. Not exactly the autonomous workflow you were picturing.

n8n takes a different approach. Since version 2.0 dropped in January 2026, it ships with native LangChain integration, 70+ AI-specific nodes, and the ability to wire up agents that actually do things — send emails, query databases, update spreadsheets, post to Slack. The whole point of an AI agent is that it takes action, and n8n gives you the connective tissue to make that happen without writing a custom integration for every tool.

This guide walks through two real workflows you can build today. The first is a research agent that pulls web content and produces structured summaries. The second is a customer support agent with RAG and persistent memory. Both are practical enough to use in production, not just demos to show off at a meetup.

Why n8n for AI Agents (and Not Just Another Framework)

There are a dozen ways to build AI agents right now. LangGraph, CrewAI, AutoGen, raw API calls with your own orchestration. So why drag a workflow automation tool into this?

Three reasons, and they’re all practical.

You can see what’s happening. When an agent makes a decision in a Python script, you’re reading logs. In n8n, you’re looking at a visual graph where each step shows its input and output. When something goes wrong — and with agents, something always goes wrong — you click the node that failed and see exactly what the LLM received and what it returned. This alone saves hours of debugging.

Connecting to external services is trivial. n8n has 400+ built-in integrations. Want your agent to check a Google Sheet, send a Slack message, and create a Jira ticket based on what it finds? That’s three nodes, no API wrappers needed. With LangGraph you’d be writing custom tool definitions for each of those.

Self-hosting keeps costs predictable. The community edition is free. You run it on a $10/month VPS and pay only for the LLM API calls your agents make. Compare that to managed agent platforms that charge per execution on top of your model costs.

The trade-off? You give up some flexibility. Complex multi-agent architectures with dynamic routing and shared state are easier to build in code. n8n agents work best when you have a clear workflow with defined tools, not when you need agents negotiating with each other in unpredictable ways.

Setting Up n8n 2.0: Cloud vs. Self-Hosted

You have two paths. Cloud gets you running in five minutes. Self-hosted gives you full control and zero per-execution fees.

n8n Cloud

Head to n8n.io and start the 14-day free trial — no credit card required. The Starter plan runs about EUR 24/month for 2,500 executions. The Pro plan is EUR 60/month for 10,000 executions. For AI agent workflows, where a single execution might involve multiple LLM calls, keep an eye on execution counts. Each time a workflow triggers counts as one execution, regardless of how many nodes run inside it.

For learning and prototyping, the trial is plenty. For production agent workflows that run frequently, self-hosting usually makes more financial sense.

Self-Hosted with Docker

If you have Docker installed, this takes about three minutes:

docker run -d --name n8n \
  -p 5678:5678 \
  -v n8n_data:/home/node/.n8n \
  -e N8N_AI_ENABLED=true \
  n8nio/n8n:latest

Open http://localhost:5678, create your account, and you’re in. The N8N_AI_ENABLED=true flag unlocks the AI nodes in the editor.

For a more permanent setup on a VPS, you’ll want Docker Compose with a PostgreSQL backend instead of the default SQLite:

version: '3.8'
services:
  n8n:
    image: n8nio/n8n:latest
    ports:
      - "5678:5678"
    environment:
      - N8N_AI_ENABLED=true
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres
      - DB_POSTGRESDB_DATABASE=n8n
      - DB_POSTGRESDB_USER=n8n
      - DB_POSTGRESDB_PASSWORD=changeme
    volumes:
      - n8n_data:/home/node/.n8n
    depends_on:
      - postgres
  postgres:
    image: postgres:16
    environment:
      - POSTGRES_DB=n8n
      - POSTGRES_USER=n8n
      - POSTGRES_PASSWORD=changeme
    volumes:
      - postgres_data:/var/lib/postgresql/data
 
volumes:
  n8n_data:
  postgres_data:

Running docker compose up -d gives you a production-ready n8n instance backed by PostgreSQL. Total hosting cost on something like a Hetzner CX22: around EUR 5-10/month.

Adding Your API Keys

Before building anything, you need to configure credentials. In n8n, go to Settings > Credentials and add:

  • OpenAI or Anthropic credentials (your LLM API key)
  • Any service credentials your agents will use (Google Sheets, Slack, etc.)

n8n stores credentials encrypted, and if you’re self-hosting, they never leave your server.

Understanding n8n’s AI Building Blocks

Before we build anything, a quick tour of the node types you’ll be working with. n8n organizes its AI capabilities into a cluster-node architecture — a fancy way of saying you snap components together like building blocks.

AI Agent Node — The brain. This is the orchestration node that receives input, reasons about what to do, and decides which tools to call. It wraps LangChain’s agent executor under the hood. You’ll use this in every AI agent workflow.

Model Nodes — Connect to your LLM. Options include OpenAI (GPT-4o, GPT-4.1), Anthropic (Claude Sonnet 4, Claude Opus 4), Google (Gemini 2.5), or local models through Ollama. Drag one onto the canvas and attach it to your Agent node.

Tool Nodes — These are the actions your agent can take. HTTP Request for API calls, Code for custom JavaScript/Python, Calculator for math, Wikipedia for lookups, or any of n8n’s 400+ integration nodes configured as tools. The agent decides when and whether to use each tool based on the input it receives.

Memory Nodes — Give your agent conversation history. Window Buffer Memory remembers the last N messages. Summary Buffer Memory condenses older messages into a summary to save tokens. For most use cases, Window Buffer with a 10-message window works fine.

Vector Store Nodes — For RAG. Connect to Pinecone, Qdrant, Supabase, or an in-memory store to give your agent access to a searchable knowledge base.

The pattern for every agent workflow is the same: trigger -> Agent node -> output. You attach a model, optional memory, and tools to the Agent node as sub-nodes. The agent uses the model to reason about the input and calls tools as needed.

Workflow 1: Build a Research Agent That Summarizes Web Pages

This first project is intentionally simple. You’ll build an agent that takes a URL, fetches the page content, and produces a structured summary with key takeaways. It’s useful on its own and teaches the core pattern you’ll reuse in more complex workflows.

Step 1: Create the Trigger

Start a new workflow. Add a Manual Trigger node (or a Webhook node if you want to call it from other apps). For now, manual trigger keeps things simple — you click “Test Workflow” to run it.

Step 2: Add a Set Node for Input

Add a Set node after the trigger. Create a field called url and set it to the page you want to summarize. Later you can replace this with dynamic input from a webhook or form, but hardcoding it while building makes debugging easier.

Add another field called focus — a string describing what the summary should focus on. Something like “technical architecture decisions” or “pricing and feature comparison.” This gives your agent direction beyond just “summarize everything.”

Step 3: Configure the AI Agent

Add an AI Agent node and configure it as follows:

Agent Type: Tools Agent (this lets the agent decide which tools to call and in what order)

System Prompt:

You are a research assistant. Given a URL and a focus area, you will:
1. Fetch the content from the URL
2. Read and analyze the content
3. Produce a structured summary with:
- Title and source
- 3-5 key takeaways (focused on the specified area)
- Notable quotes or data points
- One-paragraph executive summary
Keep summaries factual. Flag any claims that seem unsubstantiated.

Step 4: Attach the Model

Drag an OpenAI Chat Model node (or an Anthropic Chat Model node, whichever you prefer) onto the canvas and connect it to the Agent’s “Model” input. Set the model to GPT-4o or Claude Sonnet 4 — both work well for summarization. A temperature of 0.3 keeps output focused and consistent.

Step 5: Add the HTTP Request Tool

This is the tool your agent will use to fetch web pages. Add an HTTP Request node and configure it as a tool:

  • Name: fetch_webpage
  • Description: “Fetches the content of a web page given a URL. Returns the page HTML.”
  • Method: GET
  • URL: Let the agent provide this (set it to the expression {{ $fromAI('url') }})

The $fromAI() expression is n8n’s way of letting the agent fill in parameters dynamically. When the agent decides to use this tool, it passes the URL from the conversation context.

Step 6: Add a Code Tool for HTML Parsing

Raw HTML is noisy. Add a Code node as a tool that strips HTML tags and extracts the main text content:

// Raw HTML returned by the fetch_webpage tool
const html = $input.first().json.data;

const text = html
  .replace(/<script[^>]*>[\s\S]*?<\/script>/gi, '') // drop <script> blocks
  .replace(/<style[^>]*>[\s\S]*?<\/style>/gi, '')   // drop <style> blocks
  .replace(/<[^>]+>/g, ' ')                         // strip remaining tags
  .replace(/\s+/g, ' ')                             // collapse whitespace
  .trim()
  .substring(0, 12000);                             // cap length to limit tokens

return [{ json: { content: text } }];

The 12,000 character limit prevents token overflow on long pages. Adjust based on your model’s context window and budget.
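The 12,000-character cap maps to roughly 3,000 tokens under the common rule of thumb of about four characters per token for English text. A minimal sketch of that budgeting logic — the 4-to-1 ratio is an approximation, not a real tokenizer:

```javascript
// Rough token estimate: ~4 characters per token for English text.
// This is an approximation, not an exact tokenizer count.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Truncate input to stay under a token budget before sending it to the LLM.
function truncateToTokenBudget(text, maxTokens) {
  const maxChars = maxTokens * 4;
  return text.length > maxChars ? text.substring(0, maxChars) : text;
}
```

If you switch to a model with a larger context window, raise the budget rather than removing the cap entirely — an uncapped scrape of a long page is the usual cause of surprise bills.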

Step 7: Connect and Test

Wire it all together: Manual Trigger -> Set (url, focus) -> AI Agent -> output. The Agent node should have the model and both tools connected as sub-nodes.

Click Test Workflow. Watch the execution log as the agent:

  1. Receives the URL and focus area
  2. Calls fetch_webpage with the URL
  3. Passes the HTML to the code tool for parsing
  4. Generates the structured summary

The execution view shows you exactly what happened at each step. If the summary isn’t focused enough, tweak the system prompt. If the HTML parsing misses content, adjust the code tool.

Making It Production-Ready

A few upgrades to consider once the basic flow works:

  • Replace the manual trigger with a Webhook so other apps can send URLs
  • Add an IF node after the agent to check if the summary meets a minimum quality threshold (word count, presence of key sections)
  • Pipe the output to Google Sheets or Notion to build a research library automatically
  • Add error handling with a Try/Catch pattern — if the HTTP request fails, the agent should report that gracefully instead of crashing
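For the quality-threshold idea, here is a sketch of the check you might run in a Code node feeding the IF node. The field name, required sections, and word-count threshold are all assumptions — tune them to whatever your agent actually outputs:

```javascript
// Hypothetical quality gate for the agent's summary. Thresholds and
// required section names are illustrative, not n8n defaults.
function passesQualityCheck(summary) {
  const wordCount = summary.trim().split(/\s+/).length;
  const requiredSections = ['key takeaways', 'executive summary'];
  const hasSections = requiredSections.every(s =>
    summary.toLowerCase().includes(s)
  );
  return wordCount >= 100 && hasSections;
}
```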

Workflow 2: Customer Support Agent with RAG and Memory

This second workflow is more ambitious. You’re building a support agent that answers questions using your documentation as a knowledge base, remembers conversation history, and escalates when it can’t help. This is closer to what you’d actually deploy in a business context.

The Architecture

The flow goes like this: a chat message comes in -> the agent searches your knowledge base for relevant docs -> it generates an answer grounded in those docs -> if confidence is low, it flags the conversation for human review. Memory ensures the agent tracks context across multiple messages in the same conversation.

Step 1: Set Up the Vector Store

Before building the workflow, you need to index your documents. Create a separate workflow for this — call it “Knowledge Base Indexer.”

Add a Google Drive Trigger (or whatever holds your docs) that fires when new files are added. Connect it to a Document Loader node (PDF, text, or HTML depending on your docs), then to a Text Splitter node to chunk the documents into 500-1000 token pieces with 50 token overlap. Finally, connect to a Vector Store node to embed and store the chunks.
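To make the chunk-size and overlap settings concrete, here is a character-based sketch of what the Text Splitter does. Real splitters work on tokens and try to break at sentence boundaries; this only illustrates the size/overlap mechanics:

```javascript
// Character-based illustration of chunking with overlap. The Text Splitter
// node operates on tokens and respects natural boundaries; this does not.
function chunkText(text, chunkSize, overlap) {
  const chunks = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.substring(start, start + chunkSize));
    start += chunkSize - overlap; // step back by `overlap` each time
  }
  return chunks;
}
```

The overlap matters: without it, a sentence split across a chunk boundary is unretrievable as a unit, which degrades answer quality for questions that happen to land on a boundary.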

For the vector store, you have options:

  • In-Memory — Fine for testing, gone when n8n restarts
  • Qdrant — Self-hostable, pairs well with a self-hosted n8n setup
  • Pinecone — Managed service, free tier covers small knowledge bases
  • Supabase — If you’re already using Supabase, its pgvector extension works well

For a self-hosted stack, I’d go with Qdrant. It runs in Docker alongside n8n and handles everything you need without external dependencies.

# Add to the services section of your docker-compose.yml
qdrant:
  image: qdrant/qdrant:latest
  ports:
    - "6333:6333"
  volumes:
    - qdrant_data:/qdrant/storage
# ...and register qdrant_data under the top-level volumes: key

Step 2: Build the Support Agent Workflow

Create a new workflow. Start with a Chat Trigger node — this gives you a built-in chat widget for testing, and later you can swap it for a webhook connected to your actual chat system.

Step 3: Configure the Agent with RAG

Add an AI Agent node with this system prompt:

You are a customer support agent for [Your Company]. Your job is to answer
customer questions accurately using the knowledge base provided.
Rules:
- ONLY answer based on information in the knowledge base
- If the knowledge base doesn't contain relevant information, say so honestly
- Never make up product features, pricing, or policies
- For billing or account-specific questions, collect the customer's email
and escalate to the support team
- Keep responses concise but complete

Step 4: Attach the Vector Store Tool

Add a Vector Store Tool node connected to your Qdrant (or Pinecone) collection. Configure it:

  • Name: search_knowledge_base
  • Description: “Search the company knowledge base for information about products, features, policies, and common questions. Use this tool whenever the customer asks a question.”
  • Top K: 4 (returns the 4 most relevant chunks)

When the agent receives a question, it’ll automatically search the knowledge base and use the retrieved context to formulate its answer. This is RAG in action — the agent retrieves relevant documents before generating its response, which keeps answers grounded in your actual documentation.
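Conceptually, the retrieval step scores the query embedding against every stored chunk embedding and keeps the top K. A toy sketch with two-dimensional vectors — real stores like Qdrant or Pinecone use approximate indexes rather than a full scan, and real embeddings have hundreds of dimensions:

```javascript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk against the query and return the K best matches.
function topK(queryEmbedding, chunks, k) {
  return chunks
    .map(c => ({ ...c, score: cosineSimilarity(queryEmbedding, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```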

Step 5: Add Memory

Drag a Window Buffer Memory node and connect it to the Agent’s memory input. Set the window size to 10 messages. This means the agent remembers the last 10 exchanges in each conversation, enough context for follow-up questions without burning excessive tokens.

For the session ID, use the expression {{ $json.sessionId }} from the chat trigger. This ensures each conversation gets its own memory — customer A’s chat doesn’t bleed into customer B’s.
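The session-scoped windowing is roughly this, sketched as a plain data structure (this is an illustration of the behavior, not n8n's actual implementation):

```javascript
// Each sessionId gets its own buffer, capped at the last N messages --
// approximately what Window Buffer Memory maintains per session.
class SessionMemory {
  constructor(windowSize) {
    this.windowSize = windowSize;
    this.sessions = new Map();
  }
  add(sessionId, message) {
    const history = this.sessions.get(sessionId) || [];
    history.push(message);
    // Keep only the most recent windowSize messages
    this.sessions.set(sessionId, history.slice(-this.windowSize));
  }
  get(sessionId) {
    return this.sessions.get(sessionId) || [];
  }
}
```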

Step 6: Add Escalation Logic

After the Agent node, add an IF node that checks whether the agent’s response contains your escalation trigger phrase (like “I’ll connect you with our support team” or whatever you put in the system prompt).

  • True path: Send a notification to your support team via Slack or email, including the conversation history and customer’s question
  • False path: Return the agent’s response directly to the customer

This gives you a clean handoff. The agent handles routine questions, and humans take over for anything complex or sensitive.
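The IF-node condition amounts to a case-insensitive substring check. A sketch — the trigger phrase here is hypothetical and must match whatever you told the agent to say in its system prompt:

```javascript
// Keep this phrase in sync with the system prompt, or escalations
// will silently stop being detected.
const ESCALATION_PHRASE = "i'll connect you with our support team";

function needsEscalation(agentResponse) {
  return agentResponse.toLowerCase().includes(ESCALATION_PHRASE);
}
```

The fragility of this string match is exactly why the next step replaces it with an explicit tool.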

Step 7: Add an Escalation Tool

Here’s a better pattern than keyword matching: give the agent an explicit escalation tool. Add a Code node as a tool:

  • Name: escalate_to_human
  • Description: “Use this tool when you cannot answer the customer’s question from the knowledge base, or when the customer requests to speak with a human agent. Pass the reason for escalation.”
// Reason and history passed in by the agent when it invokes the tool
const reason = $input.first().json.reason;
const chatHistory = $input.first().json.chatHistory;

return [{
  json: {
    escalated: true,
    reason: reason,
    chatHistory: chatHistory,
    timestamp: new Date().toISOString()
  }
}];

Now the agent can actively decide to escalate rather than you having to parse its output for keywords. The downstream IF node checks the escalated flag instead.

Cost Analysis: What Will This Actually Cost to Run?

Everyone talks about AI being cheap until they see their first real API bill. Here’s what these workflows actually cost.

n8n Hosting

  • Self-hosted on Hetzner CX22: Around EUR 5-10/month for the VPS, plus maybe EUR 5/month for Qdrant if you’re running RAG
  • n8n Cloud Starter: EUR 24/month for 2,500 executions
  • n8n Cloud Pro: EUR 60/month for 10,000 executions

LLM API Costs

This is where the real spending happens. For the research agent summarizing a typical 3,000-word article:

  • Input tokens: ~5,000 (article content + system prompt)
  • Output tokens: ~800 (structured summary)
  • Cost per summary with GPT-4o: roughly $0.02-0.03
  • Cost per summary with Claude Sonnet 4: roughly $0.02-0.03

For the support agent with RAG, each customer interaction involves:

  • Vector search: negligible (self-hosted) or ~$0.001 (managed)
  • Input tokens: ~3,000 (retrieved docs + conversation history + system prompt)
  • Output tokens: ~300 (response)
  • Cost per interaction with GPT-4o: roughly $0.01-0.02

At 1,000 support conversations per month, you’re looking at $10-20 in API costs plus your hosting. Compare that to a managed AI support platform that charges $0.10-0.50 per conversation, and the economics of self-hosting become obvious.
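The arithmetic above can be sketched as a small cost model. The per-million-token prices are illustrative assumptions (roughly GPT-4o's published rates at the time of writing) — check your provider's current pricing before relying on this:

```javascript
// Back-of-envelope LLM cost model. Prices in USD per million tokens
// are assumptions, not authoritative figures.
const PRICE_PER_MILLION = { input: 2.5, output: 10.0 };

function costPerCall(inputTokens, outputTokens) {
  return (inputTokens * PRICE_PER_MILLION.input +
          outputTokens * PRICE_PER_MILLION.output) / 1_000_000;
}

function monthlyCost(callsPerMonth, inputTokens, outputTokens) {
  return callsPerMonth * costPerCall(inputTokens, outputTokens);
}
```

With the support-agent numbers above (3,000 input / 300 output tokens), costPerCall comes to about $0.0105, so 1,000 conversations land near $10-11/month — consistent with the $10-20 range once retries and longer conversations are factored in.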

Total Monthly Cost Estimates

| Setup | 100 executions/month | 1,000 executions/month | 10,000 executions/month |
| --- | --- | --- | --- |
| Self-hosted + GPT-4o | ~EUR 12-17 | ~EUR 25-40 | ~EUR 115-215 |
| n8n Cloud Starter + GPT-4o | ~EUR 27-30 | ~EUR 35-50 | Over plan limit |
| n8n Cloud Pro + GPT-4o | ~EUR 63-65 | ~EUR 75-85 | ~EUR 160-260 |

At scale, self-hosting wins by a wide margin. The break-even point is around 500 executions per month — below that, cloud is simpler and the cost difference is negligible.

Common Gotchas and How to Avoid Them

After building a dozen agent workflows in n8n, here’s what trips people up most.

Token limits with large documents. If your research agent fetches a 10,000-word page, you’re sending a lot of tokens to the LLM. Always truncate or chunk large inputs. The Code tool with a character limit (like the 12,000 cap in Workflow 1) prevents surprise bills.

Agent loops. Sometimes the agent calls the same tool repeatedly because it’s not getting the result it expects. Set a maximum iterations limit in the Agent node settings (the default is usually fine, but check it). Also make your tool descriptions precise — vague descriptions lead to confused agents.

Memory token accumulation. Window Buffer Memory stores raw messages. A 10-message window with long responses can eat 8,000+ tokens of context per interaction. If costs spike, switch to Summary Buffer Memory, which condenses older messages into a shorter summary.

Credential exposure in logs. n8n’s execution log shows full node inputs and outputs. If your workflow handles API keys or sensitive customer data, turn off detailed logging in production or self-host so the data stays on your server.

Webhook timeouts. If your agent workflow takes 30+ seconds (common with multiple tool calls), the webhook caller might time out. Use an async pattern: acknowledge the webhook immediately, process in the background, and send results via a callback URL or Slack notification.

Where to Go From Here

You’ve got two working agent workflows. The research agent handles information gathering and structuring, the support agent handles customer-facing interactions with RAG. Both patterns extend naturally.

A few directions worth exploring: chain multiple agents together by having one workflow trigger another via webhook. Build a content moderation agent that reviews user submissions before they go live. Create a data pipeline agent that monitors a Google Sheet for new entries and enriches them with external data.

The n8n community template library has over 6,000 workflow templates, and the AI-specific ones have been growing fast since the 2.0 launch. Browse them at n8n.io/workflows — even if you don’t use them directly, they’re good for understanding what’s possible.

One thing I’d push back on: don’t try to make your agent handle everything. The best agent workflows do one thing reliably. If you need ten capabilities, build ten focused workflows and connect them, rather than cramming everything into one agent with fifteen tools. A confused agent is worse than no agent at all.