
Build a tax code assistant with Qdrant, Mistral.ai and OpenAI

By Jimleuk
2/3/2026

This n8n workflow builds another example of a knowledge-base assistant, but demonstrates how a more deliberate and targeted approach to ingesting the data can produce much better results for your chatbot.

In this example, a government tax code policy document is used. Whilst we could split the document into chunks by content length, we would often lose the chapter and section context that the user may need.

Our approach, then, is to first split the document into chapters and sections before importing it into our vector store. Additionally, using metadata correctly is key to allowing filtered and scoped queries.
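To make the idea concrete, here is a minimal sketch (plain JavaScript, outside n8n) of how a chapter's text might be split into section-level documents with metadata attached. The heading regex and the source tag are assumptions for illustration, not values taken from the actual workflow:

```javascript
// Hypothetical sketch: split a chapter's text into sections on headings
// shaped like "Sec. 11.25." and attach source/chapter/section metadata.
// The regex and the "source" value are assumptions, not workflow values.
function splitIntoSections(chapterText, chapterNumber) {
  const parts = chapterText.split(/(?=Sec\.\s+\d+\.\d+\.)/);
  return parts
    .filter((p) => p.trim().length > 0)
    .map((text) => {
      const match = text.match(/Sec\.\s+(\d+\.\d+)\./);
      return {
        pageContent: text.trim(),
        metadata: {
          source: "texas-property-tax-code", // assumed tag
          chapter: chapterNumber,
          section: match ? match[1] : null,
        },
      };
    });
}
```

Each resulting document can then be embedded and upserted individually, so a query can later be scoped to a single chapter or section via its metadata.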

Example

Human: "Tell me about what the tax code says about cargo for intentional commerce?"

AI: "Section 11.25 of the Texas Property Tax Code pertains to "MARINE CARGO CONTAINERS USED EXCLUSIVELY IN INTERNATIONAL COMMERCE." In this section, a person who is a citizen of a foreign country or an en..."

How it works

  • The tax code policy document is downloaded as a zip file from the government website and its pages are extracted as separate chapters.
  • Each chapter is then parsed and split into its sections using data manipulation expressions.
  • Each section is then inserted into our Qdrant vector store tagged with its source, chapter and section numbers as metadata.
  • When our AI Agent needs to retrieve data from our vector store, we use a custom workflow tool to perform the query to Qdrant.
  • Because we're relying on Qdrant's advanced filtering capabilities, we perform the search using the Qdrant API rather than the Qdrant node.
  • When the AI Agent needs to pull full wording or extracts, we use Qdrant's scroll API with metadata filtering to do so. This makes Qdrant behave like a key-value store for our document.
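As an illustration of the last two points, here is a hedged sketch of the kind of scroll request the custom workflow tool could send to Qdrant's REST API (`POST /collections/{collection}/points/scroll`). The host, collection name and metadata keys are assumptions, not values from the actual workflow:

```javascript
// Build a Qdrant scroll request that fetches every stored point for a
// given chapter and section, using metadata filtering instead of a
// vector search. Host, collection and key names are assumptions.
function buildScrollRequest(chapter, section) {
  return {
    url: "http://localhost:6333/collections/tax_code/points/scroll",
    body: {
      filter: {
        must: [
          { key: "metadata.chapter", match: { value: chapter } },
          { key: "metadata.section", match: { value: section } },
        ],
      },
      limit: 10,
      with_payload: true,  // return the stored text and metadata
      with_vector: false,  // vectors are not needed for extracts
    },
  };
}
```

Posting the `body` as JSON to the `url` (e.g. from an HTTP Request node) returns the matching points directly, which is what makes Qdrant usable as a key-value store here.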

Requirements

  • A Qdrant instance is required for the vector store, specifically for its filtering functionality.
  • A Mistral.ai account for embeddings and AI models.

Customising this workflow

Depending on your use case, consider returning the actual PDF pages (or links to them) to the user for extra confirmation and to build trust.

Not using Mistral? You can swap in another embedding provider, but be sure to match the distance metric and dimension size of the Qdrant collection to your chosen embedding model.

Tax Code Assistant with Qdrant and Mistral AI (n8n Workflow)

This n8n workflow creates a conversational AI assistant that can answer questions about tax codes. It leverages Qdrant for vector storage, Mistral AI for embeddings, and OpenAI for the chat model, enabling it to provide relevant information based on a knowledge base.

What it does

This workflow orchestrates several AI components to build a robust tax code assistant:

  1. Triggers on Chat Input: The workflow starts when a chat message is received, initiating a conversation.
  2. Manages Conversation History: It utilizes a simple memory buffer to maintain the context of the ongoing conversation.
  3. Embeds User Queries: User questions are converted into numerical vector embeddings using Mistral AI.
  4. Retrieves Relevant Documents: These embeddings are then used to query a Qdrant vector store, retrieving the most relevant tax code documents from a pre-loaded knowledge base.
  5. Generates AI Response: The retrieved documents, along with the conversation history and the user's query, are fed into an OpenAI Chat Model to generate a comprehensive and contextually appropriate answer.

Prerequisites/Requirements

To use this workflow, you will need:

  • n8n Instance: A running n8n instance.
  • OpenAI API Key: For the OpenAI Chat Model.
  • Mistral AI API Key: For generating embeddings.
  • Qdrant Instance: A running Qdrant vector database instance.
  • Tax Code Documents: Your tax code documents (e.g., PDF, TXT) to be loaded into Qdrant.

Setup/Usage

  1. Import the workflow: Download the JSON provided and import it into your n8n instance.
  2. Configure Credentials:
    • OpenAI Chat Model: Configure your OpenAI API Key credential.
    • Embeddings Mistral Cloud: Configure your Mistral AI API Key credential.
    • Qdrant Vector Store: Configure your Qdrant connection details (host, port, API key if applicable).
  3. Prepare your Knowledge Base:
    • Use the "Extract from File" and "Split Out" nodes (or similar data loading/processing nodes) to load your tax code documents.
    • The "Recursive Character Text Splitter" node is crucial for breaking down large documents into manageable chunks for embedding.
    • The "Default Data Loader" node will prepare your text for embedding.
    • Run the workflow once (or a separate ingestion workflow) to embed your tax code documents using "Embeddings Mistral Cloud" and store them in your "Qdrant Vector Store".
  4. Activate the Workflow: Once all credentials and the knowledge base are set up, activate the workflow.
  5. Start Chatting: You can now interact with your tax code assistant via the "Chat Trigger" (e.g., through a connected chat service or by manually triggering with a test message).
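As a rough illustration of what the text splitter in step 3 does, here is a simplified stand-in for the Recursive Character Text Splitter: fixed-size chunks with overlap. The real node splits recursively on separators (paragraphs, sentences, words); the sizes below are arbitrary examples:

```javascript
// Simplified chunker: fixed-size windows with overlap between
// consecutive chunks, so context isn't lost at chunk boundaries.
// This only illustrates the chunk-size/overlap idea, not the node.
function chunkText(text, chunkSize = 500, overlap = 50) {
  const chunks = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
    start += chunkSize - overlap; // step back by the overlap amount
  }
  return chunks;
}
```

Keeping a modest overlap means a sentence cut at a chunk boundary still appears whole in the neighbouring chunk, which tends to improve retrieval quality.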

Related Templates

Generate PDF documents from HTML with PDF Generator API, Gmail and Supabase

Who’s this for 💼

This template is designed for teams and developers who need to generate PDF documents automatically from HTML templates. It’s suitable for use cases such as invoices, confirmations, reports, certificates, or any custom document that needs to be created dynamically based on incoming data.

What this workflow does ⚙️

This workflow automates the full lifecycle of document generation, from request validation to delivery and storage. It is triggered by a POST webhook that receives structured JSON data describing the requested document and client information. Before generating the document, the workflow validates the client’s email address using Hunter Email Verification to prevent invalid or mistyped emails. If the email is valid, the workflow loads the appropriate HTML template from a Postgres database, fills it with the incoming data, and converts it into a PDF using PDF Generator API. Once the PDF is generated, it is sent to the client via Gmail, uploaded to Supabase Storage, and the transaction is recorded in the database for tracking and auditing purposes.

How it works 🛠️

  • Receives a document generation request via a POST webhook.
  • Validates the client’s email address using Hunter.
  • Generates a PDF document from an HTML template using PDF Generator API.
  • Sends the PDF via Gmail and uploads it to Supabase Storage.
  • Stores a document generation record in the database.

How to set up 🖇️

Before activating the workflow, make sure all required services and connections are prepared and available in your n8n environment.

  • Create a POST webhook endpoint that accepts structured JSON input.
  • Add Hunter API credentials for email verification.
  • Add PDF Generator API credentials for HTML to PDF conversion.
  • Prepare a Postgres database with tables for HTML templates and document generation records.
  • Set up Gmail or SMTP credentials for email delivery.
  • Configure Supabase Storage for storing generated PDF files.

Requirements ✅

  • PDF Generator API account
  • Hunter account
  • Postgres database
  • Gmail or SMTP-compatible email provider
  • Supabase project with Storage enabled

How to customize the workflow 🤖

This workflow can be adapted to different document generation scenarios by extending or modifying its existing steps:

  • Add extra validation steps before document generation if required.
  • Extend delivery options by sending the generated PDF to additional services or webhooks.
  • Enhance security by adding document encryption or access control.
  • Add support for additional document types by storing more HTML templates in the database.
  • Modify the database schema or queries to store additional metadata related to generated documents.
  • Adjust the data mapping logic in the Code node to match your input structure.
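The template-filling step described above can be sketched in a few lines. This is a hedged illustration: the `{{name}}` placeholder syntax is an assumption, not necessarily what the template's Code node uses:

```javascript
// Illustrative template filler: replace {{key}} placeholders in a
// stored HTML template with values from the incoming webhook JSON.
// Unknown placeholders are left intact rather than blanked out.
function fillTemplate(html, data) {
  return html.replace(/\{\{(\w+)\}\}/g, (whole, key) =>
    key in data ? String(data[key]) : whole
  );
}
```

The filled HTML string is what gets posted to PDF Generator API for conversion into the final PDF.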

By Marián Današ

Sum or aggregate a column of spreadsheet or table data

This workflow shows how to sum multiple items of data, like you would in Excel or Airtable when totalling a column. It uses a Function node with some JavaScript to perform the aggregation of numeric data. The first node is simply mock data, to avoid needing a credential to run the workflow. The second node performs the summation; the JavaScript is commented in case you need to edit it. This works for anything in tabular form (Airtable, Google Sheets, Postgres, etc.).
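A minimal sketch of the kind of JavaScript the Function node runs: in n8n, each incoming item carries its data under `json`, and the node totals one numeric column across all items. The field name `"amount"` used in the example is an assumption:

```javascript
// Sum one numeric column across a list of n8n-style items, where each
// item stores its row data under item.json. Missing or non-numeric
// values count as 0 rather than producing NaN.
function sumColumn(items, field) {
  return items.reduce(
    (total, item) => total + Number(item.json[field] || 0),
    0
  );
}
```

Inside an actual Function node you would call this with the node's `items` input and return the total wrapped as a single item, e.g. `return [{ json: { total: sumColumn(items, "amount") } }];`.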

By Max Tkacz

Fetch dynamic prompts from GitHub and auto-populate n8n expressions in prompt

Who Is This For?

This workflow is designed for AI engineers, automation specialists, and content creators who need a scalable system to dynamically manage prompts stored in GitHub. It eliminates manual updates, enforces required variable checks, and ensures that AI interactions always receive fully processed prompts.

🚀 What Problem Does This Solve?

Manually managing AI prompts can be inefficient and error-prone. This workflow:

  • ✅ Fetches dynamic prompts from GitHub
  • ✅ Auto-populates placeholders with values from the setVars node
  • ✅ Ensures all required variables are present before execution
  • ✅ Processes the formatted prompt through an AI agent

🛠 How This Workflow Works

This workflow consists of three key branches, ensuring smooth prompt retrieval, variable validation, and AI processing.

1️⃣ Retrieve the Prompt from GitHub (HTTP Request → Extract from File → SetPrompt)

The workflow starts manually or via an external trigger. It fetches a text-based prompt stored in a GitHub repository. The Extract from File node retrieves the content from the GitHub file, and the SetPrompt node stores the prompt, making it accessible for processing.

📌 Note: The prompt must contain n8n expression format variables (e.g., {{ $json.company }}) so they can be dynamically replaced.

2️⃣ Extract & Auto-Populate Variables (Check All Prompt Vars → Replace Variables)

A Code node scans the prompt for placeholders in the n8n expression format ({{ $json.variableName }}). The workflow compares required variables against the setVars node:

  • ✅ If all variables are present, it proceeds to variable replacement.
  • ❌ If any variables are missing, the workflow stops and returns an error listing them.

The Replace Variables node replaces all placeholders with values from setVars.

📌 Example of a properly formatted GitHub prompt:

Hello {{ $json.company }}, your product {{ $json.features }} launches on {{ $json.launch_date }}.

This ensures seamless replacement when processed in n8n.

3️⃣ AI Processing & Output (AI Agent → Prompt Output)

The Set Completed Prompt node stores the final, processed prompt. The AI Agent node (Ollama Chat Model) processes it, and the Prompt Output node returns the fully formatted response.

📌 Optional: Modify this to use OpenAI, Claude, or other AI models.

⚠️ Error Handling: Missing Variables

If a required variable is missing, the workflow stops execution and provides an error message:

⚠️ Missing Required Variables: ["launch_date"]

This ensures no incomplete prompts are sent to AI agents.

✅ Example Use Case

📜 GitHub prompt file (using n8n expressions):

Hello {{ $json.company }}, your product {{ $json.features }} launches on {{ $json.launch_date }}.

🔹 Variables in the setVars node:

{ "company": "PropTechPro", "features": "AI-powered Property Management", "launch_date": "March 15, 2025" }

✅ Successful output:

Hello PropTechPro, your product AI-powered Property Management launches on March 15, 2025.

🚨 Error output (if launch_date is missing):

⚠️ Missing Required Variables: ["launch_date"]

🔧 Setup Instructions

  1. Connect Your GitHub Repository: Store your prompt in a public or private GitHub repo. The workflow will fetch the raw file using the GitHub API.
  2. Configure the SetVars Node: Define the required variables in the setVars node, making sure the variable names match those used in the prompt.
  3. Test & Run: Click Test Workflow to execute. If variables are missing, it will show an error; if everything is correct, it will output the fully formatted prompt.

⚡ How to Customize This Workflow

  • 💡 Need CRM or database integration? Connect the setVars node to an Airtable, Google Sheets, or HubSpot API to pull variables dynamically.
  • 💡 Want to modify the AI model? Replace the Ollama Chat Model with OpenAI, Claude, or a custom LLM endpoint.

📌 Why Use This Workflow?

  • ✅ No Manual Updates Required – Fetches prompts dynamically from GitHub.
  • ✅ Prevents Broken Prompts – Ensures required variables exist before execution.
  • ✅ Works for Any Use Case – Handles AI chat prompts, marketing messages, and chatbot scripts.
  • ✅ Compatible with All n8n Deployments – Works on Cloud, Self-Hosted, and Desktop versions.
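The scan-validate-replace step at the heart of this template can be sketched in plain JavaScript. The regex and error format follow the description above, but the function itself is illustrative rather than the workflow's actual Code node:

```javascript
// Scan a prompt for {{ $json.variableName }} placeholders, fail fast if
// any are missing from the vars object, otherwise substitute the values.
function populatePrompt(prompt, vars) {
  const pattern = /\{\{\s*\$json\.(\w+)\s*\}\}/g;
  const required = [...prompt.matchAll(pattern)].map((m) => m[1]);
  const missing = required.filter((name) => !(name in vars));
  if (missing.length > 0) {
    // Mirrors the workflow's error, e.g. Missing Required Variables: ["launch_date"]
    throw new Error(`Missing Required Variables: ${JSON.stringify(missing)}`);
  }
  return prompt.replace(pattern, (_, name) => vars[name]);
}
```

Validating before replacing is the key design choice here: it guarantees an incomplete prompt can never reach the AI agent half-filled.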

By RealSimple Solutions