
Answer questions from documents with RAG using Supabase, OpenAI & Cohere reranker

By Luan Correia

This workflow contains community nodes that are only compatible with the self-hosted version of n8n.

This comprehensive RAG workflow enables your AI agents to answer user questions with contextual knowledge pulled from your own documents — using metadata-rich embeddings stored in Supabase.

🔧 Key Features:

RAG Agents powered by GPT-4.5 or GPT-3.5 via OpenRouter or OpenAI.

Supabase Vector Store to store and retrieve document embeddings.

Cohere Reranker to improve response relevance and quality (see the sketch after this list).

Metadata Agent to enrich vectorized data before ingestion.

PDF Extraction Flow to automatically parse and upload documents with metadata.
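
To make the reranking step concrete, here is a minimal sketch of what the Cohere reranker node does under the hood, assuming the official cohere-ai SDK; the model name and topN value are illustrative choices, not taken from the workflow.

```typescript
import { CohereClient } from "cohere-ai";

const cohere = new CohereClient({ token: process.env.COHERE_API_KEY! });

// Reorder retrieved chunks by relevance to the query and keep the best few.
async function rerank(query: string, chunks: string[]): Promise<string[]> {
  const res = await cohere.rerank({
    model: "rerank-english-v3.0", // illustrative model choice
    query,
    documents: chunks,
    topN: 4,                      // keep only the 4 most relevant chunks
  });
  // Results come back sorted by relevance score, highest first.
  return res.results.map((r) => chunks[r.index]);
}
```

Reranking matters because vector similarity alone can surface chunks that share vocabulary with the question but not its intent; a reranker reads the query and each chunk together and scores actual relevance.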

✅ Setup Steps:

Connect your Supabase Vector Store.

Use OpenAI Embeddings (e.g. text-embedding-3-small).

Add API keys for OpenAI and/or OpenRouter.

Connect a reranker like Cohere.

Process documents with metadata before embedding (a sketch of this step follows the list).

Start chatting — your AI agent now returns context-rich answers from your own knowledge base!
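
As a rough sketch of the embed-and-store step, the following assumes the openai and @supabase/supabase-js SDKs and a "documents" table laid out as in Supabase's pgvector guide (content, metadata, and embedding columns); your table and column names may differ from this assumption.

```typescript
import OpenAI from "openai";
import { createClient } from "@supabase/supabase-js";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_KEY!);

async function ingestChunk(content: string, metadata: Record<string, string>) {
  // Generate a 1536-dimension embedding for the chunk text.
  const emb = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: content,
  });
  // Store the chunk text, its metadata, and the embedding vector together.
  const { error } = await supabase.from("documents").insert({
    content,
    metadata,                       // e.g. { source: "report.pdf", page: "3" }
    embedding: emb.data[0].embedding,
  });
  if (error) throw error;
}
```

Attaching metadata at this stage is what lets the retrieval step later filter results or attribute answers to a source document.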

Perfect for building AI assistants that can reason, search and answer based on internal company data, academic papers, support docs, or personal notes.

Answer Questions from Documents with RAG using Supabase, OpenAI, and Cohere Reranker

This n8n workflow demonstrates a sophisticated Retrieval-Augmented Generation (RAG) system for answering user questions based on a corpus of documents. It leverages Supabase as a vector store, OpenAI for embeddings and chat models, and Cohere for reranking search results, providing highly relevant and accurate answers.

What it does

This workflow automates the process of ingesting documents, vectorizing their content, storing them in a Supabase vector database, and then using this knowledge base to answer user questions with enhanced relevance.

  1. Triggers on Chat Message: The workflow is activated when a new chat message (user question) is received.
  2. Ingests Documents from Google Drive (Manual Trigger):
    • A manual trigger allows for on-demand execution to process documents.
    • It retrieves files from a specified Google Drive folder.
    • Files are extracted and loaded using a default data loader.
    • OpenAI Embeddings are generated for the document content.
    • These embeddings and the document content are stored in a Supabase Vector Store.
  3. Processes User Questions (Chat Trigger):
    • When a user asks a question via the "Chat Trigger", the AI Agent is activated.
    • The AI Agent uses an OpenAI Chat Model (via OpenRouter) to understand the query.
    • It performs a similarity search against the Supabase Vector Store to retrieve relevant document chunks.
    • A Cohere Reranker is applied to the search results to improve the relevance of the retrieved documents.
    • The reranked documents are then used as context by the OpenAI Chat Model to generate a comprehensive answer to the user's question (the sketch after this list walks through this path end to end).
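
Here is a minimal end-to-end sketch of that question-answering path, reusing the rerank helper from the earlier sketch. The "match_documents" RPC is the similarity-search function from Supabase's pgvector guide, an assumption here, since the workflow's vector store node may expose a different function name; the chat model id is likewise illustrative.

```typescript
import OpenAI from "openai";
import { createClient } from "@supabase/supabase-js";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_KEY!);

async function answer(question: string): Promise<string> {
  // 1. Embed the question with the same model used at ingestion time.
  const q = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: question,
  });

  // 2. Similarity search over the stored embeddings (assumed RPC name).
  const { data: matches, error } = await supabase.rpc("match_documents", {
    query_embedding: q.data[0].embedding,
    match_count: 10,
  });
  if (error) throw error;

  // 3. Rerank the candidates down to the most relevant few (rerank() as above).
  const context = await rerank(
    question,
    (matches ?? []).map((m: { content: string }) => m.content)
  );

  // 4. Answer strictly from the reranked context.
  const chat = await openai.chat.completions.create({
    model: "gpt-3.5-turbo", // or route via OpenRouter, see the setup notes below
    messages: [
      { role: "system", content: "Answer using only this context:\n" + context.join("\n---\n") },
      { role: "user", content: question },
    ],
  });
  return chat.choices[0].message.content ?? "";
}
```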

Prerequisites/Requirements

To use this workflow, you will need:

  • n8n Account: A running n8n instance (cloud or self-hosted).
  • Google Drive Account: To store the documents you want to use as a knowledge base.
  • OpenAI API Key: For generating embeddings and powering the chat model.
  • Supabase Account: To host the vector database for document embeddings.
  • Cohere API Key: For the reranking functionality to improve search relevance.
  • OpenRouter API Key (Optional but recommended): To access OpenAI and other providers' models through a single API, which can offer more model flexibility or lower cost.

Setup/Usage

  1. Import the workflow: Download the JSON definition and import it into your n8n instance.
  2. Configure Credentials:
    • Set up your Google Drive credentials.
    • Set up your OpenAI credentials for both the Embeddings OpenAI node and the OpenRouter Chat Model (if using OpenRouter; otherwise configure OpenAI directly in the AI Agent). A sketch of the OpenRouter wiring follows this section.
    • Set up your Supabase credentials (API URL, API Key, and table name for the vector store).
    • Set up your Cohere credentials for the Reranker Cohere node.
  3. Ingest Documents:
    • Open the workflow.
    • Locate the Google Drive node and configure it to point to the folder containing your source documents.
    • Execute the workflow manually by clicking "Execute workflow" on the Manual Trigger node. This will process your documents and populate the Supabase vector store.
    • Note: This part of the workflow is designed for initial document ingestion. You might want to disable the Google Drive branch after initial setup or integrate it with a separate trigger for continuous updates.
  4. Start Chatting:
    • Activate the workflow.
    • Use the Chat Trigger (e.g., via the n8n chat interface or by sending a message to the configured chat platform) to ask questions. The AI Agent will retrieve relevant information from your Supabase knowledge base and provide an answer.
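
If you route the chat model through OpenRouter, the usual integration path is the OpenAI SDK with a baseURL override, as in this minimal sketch (the model identifier is illustrative):

```typescript
import OpenAI from "openai";

// OpenRouter speaks the OpenAI chat-completions API, so only the base URL
// and API key change; model ids are prefixed with the upstream provider.
const openrouter = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

const reply = await openrouter.chat.completions.create({
  model: "openai/gpt-3.5-turbo", // illustrative OpenRouter model id
  messages: [{ role: "user", content: "ping" }],
});
console.log(reply.choices[0].message.content);
```

This can be handy for verifying your OpenRouter key outside n8n before configuring the OpenRouter Chat Model node.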
