Document RAG & chat agent: Google Drive to Qdrant with Mistral OCR

2/3/2026

Knowledge RAG & AI Chat Agent: Google Drive to Qdrant

Description

This workflow transforms a Google Drive folder into an intelligent, searchable knowledge base and provides a chat agent to query it.
It’s composed of two distinct flows:

An ingestion pipeline to process documents.
A live chat agent that uses RAG (Retrieval-Augmented Generation) and optional web search to answer user questions.

This system fully automates the creation of a “Chat with your docs” solution and enhances it with external web-searching capabilities.

Quick Implementation Steps

Import the workflow JSON into your n8n instance.
Set up credentials for Google Drive, Mistral AI, OpenAI, and Qdrant.
Open the Web Search node and add your Tavily AI API key to the Authorization header.
In the Google Drive (List Files) node, set the Folder ID you want to ingest.
Run the workflow manually once to populate your Qdrant database (Flow 1).
Activate the workflow to enable the chat trigger (Flow 2).
Copy the public webhook URL from the When chat message received node and open it in a new tab to start chatting.

What It Does

The workflow is divided into two primary functions:

1. Knowledge Base Ingestion (Manual Trigger)

This flow populates your vector database.

Scans Google Drive: Lists all files from a specified folder.
Processes Files Individually: Downloads each file.
Extracts Text via OCR: Uses Mistral AI OCR API for text extraction from PDFs, images, etc.
Generates Smart Metadata: A Mistral LLM assigns metadata like document_type, project, and assigned_to.
Chunks & Embeds: Text is cleaned, chunked, and embedded via OpenAI’s text-embedding-3-small model.
Stores in Qdrant: Text chunks, embeddings, and metadata are stored in a Qdrant collection (docaiauto).
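The OCR step above can be sketched as three HTTP requests plus a small helper that joins the per-page text. This is a minimal sketch, not the workflow's actual node configuration: the endpoint paths, the `mistral-ocr-latest` model name, and the `pages`/`markdown` response fields are assumptions based on Mistral's public API, and all function names are hypothetical.

```python
MISTRAL_BASE = "https://api.mistral.ai/v1"  # assumption: public Mistral API base URL


def auth_headers(api_key: str) -> dict:
    """Bearer-token header shared by all three calls."""
    return {"Authorization": f"Bearer {api_key}"}


def upload_request(filename: str, api_key: str) -> dict:
    """Step 1: describe the multipart upload (file bytes go in a 'file' form part)."""
    return {
        "method": "POST",
        "url": f"{MISTRAL_BASE}/files",
        "headers": auth_headers(api_key),
        "form": {"purpose": "ocr", "filename": filename},
    }


def signed_url_request(file_id: str, api_key: str) -> dict:
    """Step 2: ask for a temporary signed URL for the uploaded file."""
    return {
        "method": "GET",
        "url": f"{MISTRAL_BASE}/files/{file_id}/url",
        "headers": auth_headers(api_key),
    }


def ocr_request(signed_url: str, api_key: str, model: str = "mistral-ocr-latest") -> dict:
    """Step 3: run OCR against the signed URL."""
    return {
        "method": "POST",
        "url": f"{MISTRAL_BASE}/ocr",
        "headers": {**auth_headers(api_key), "Content-Type": "application/json"},
        "json": {
            "model": model,
            "document": {"type": "document_url", "document_url": signed_url},
        },
    }


def join_pages(ocr_response: dict) -> str:
    """Concatenate per-page markdown in page order (assumed response shape)."""
    pages = sorted(ocr_response.get("pages", []), key=lambda p: p.get("index", 0))
    return "\n\n".join(p.get("markdown", "") for p in pages)
```

Each builder only returns a description of the request, so the sketch can be checked without network access; in n8n the equivalent calls are made by the Upload, Signed URL, and DOC OCR nodes.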

2. AI Chat Agent (Chat Trigger)

This flow powers the conversational interface.

Handles User Queries: Triggered when a user sends a chat message.
Internal RAG Retrieval: Searches Qdrant Vector Store first for answers.
Web Search Fallback: If unavailable internally, the agent offers to perform a Tavily AI web search.
Contextual Responses: Combines internal and external info for comprehensive answers.

Who's It For

Ideal for:

Teams building internal AI knowledge bases from Google Drive.
Developers creating AI-powered support, research, or onboarding bots.
Organizations implementing RAG pipelines.
Anyone making unstructured Google Drive documents searchable via chat.

Requirements

n8n instance (self-hosted or cloud).
Google Drive Credentials (to list and download files).
Mistral AI API Key (for OCR & metadata extraction).
OpenAI API Key (for embeddings and chat LLM).
Qdrant instance (cloud or self-hosted).
Tavily AI API Key (for web search).

How It Works

The workflow contains two independent flows, each started by its own trigger:

Flow 1: Ingestion Pipeline (Manual Trigger)

List Files: Fetch files from Google Drive using the Folder ID.
Loop & Download: Each file is processed one by one.
OCR Processing:
- Upload file to Mistral
- Retrieve signed URL
- Extract text using Mistral DOC OCR
Metadata Extraction: Analyze text using a Mistral LLM.
Text Cleaning & Chunking: Split into 1000-character chunks.
Embeddings Creation: Use OpenAI embeddings.
Vector Insertion: Push chunks + metadata into Qdrant.
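The cleaning and chunking step can be illustrated with a short sketch. The 1000-character chunk size comes from the description above; the 200-character overlap, the whitespace-collapsing cleanup, and the function name are assumptions for illustration, not the exact logic of the workflow's Code node.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """Collapse whitespace, then split into fixed-size chunks that overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    cleaned = " ".join(text.split())  # assumed cleanup: normalize all whitespace
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(cleaned), step):
        chunks.append(cleaned[start:start + chunk_size])
        if start + chunk_size >= len(cleaned):
            break  # the current chunk already reaches the end of the text
    return chunks
```

A larger overlap keeps more context across chunk boundaries at the cost of more embeddings to compute and store.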

Flow 2: AI Chat Agent (Chat Trigger)

Chat Trigger: Starts when a chat message is received.
AI Agent: Uses OpenAI + Simple Memory to process context.
RAG Retrieval: Queries Qdrant for related data.
Decision Logic:
- Found → Form answer.
- Not found → Ask if user wants web search.
Web Search: Performs Tavily web lookup.
Final Response: Synthesizes internal + external info.
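The decision logic in Flow 2 can be pictured as a simple branch. In the workflow itself the AI Agent makes this choice from its system message; the deterministic function below is only a stand-in, and the `Hit` type, the `decide` name, and the 0.5 score threshold are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    """One retrieved chunk with its similarity score (hypothetical shape)."""
    text: str
    score: float


def decide(hits: List[Hit], min_score: float = 0.5) -> dict:
    """Answer from internal documents if any hit is relevant enough,
    otherwise offer a web search instead of running one unprompted."""
    relevant = [h for h in hits if h.score >= min_score]
    if relevant:
        return {"action": "answer", "context": [h.text for h in relevant]}
    return {"action": "offer_web_search", "context": []}
```

Asking before searching the web keeps the internal knowledge base as the primary source and makes external lookups an explicit user choice.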

How To Set Up

1. Import the Workflow

Upload the provided JSON into your n8n instance.

2. Configure Credentials

Create and assign:

Google Drive → Google Drive nodes
Mistral AI → Upload, Signed URL, DOC OCR, Cloud Chat Model
OpenAI → Embeddings + Chat Model nodes
Qdrant → Vector Store nodes

3. Add Tavily API Key

Open Web Search node → Parameters → Headers
Add your key under Authorization as a Bearer token (e.g., Bearer tvly-xxxx).
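For reference, the request the Web Search node sends might look like the sketch below. It assumes Tavily's `/search` endpoint with Bearer authentication and the `query` and `max_results` body fields; the function name and default values are hypothetical, and the node's real body may contain other options.

```python
def tavily_search_request(query: str, api_key: str, max_results: int = 5) -> dict:
    """Describe a Tavily search call (assumed endpoint and auth scheme)."""
    return {
        "method": "POST",
        "url": "https://api.tavily.com/search",
        "headers": {
            "Authorization": f"Bearer {api_key}",  # key looks like tvly-xxxx
            "Content-Type": "application/json",
        },
        "json": {"query": query, "max_results": max_results},
    }
```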

4. Node Configuration

Google Drive (List Files): Set Folder ID.
Qdrant Nodes: Ensure same collection name (docaiauto).
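To show why both Qdrant nodes must agree, here is a rough sketch of the collection definition and of one stored point. The vector size of 1536 is the default output dimension of text-embedding-3-small; the Cosine distance, the `content`/`metadata` payload keys, and the function names are assumptions, so check them against your own Qdrant collection.

```python
import uuid
from typing import List

COLLECTION = "docaiauto"  # must match in the insert and retrieve nodes
EMBEDDING_DIM = 1536      # default size of text-embedding-3-small vectors


def create_collection_request(base_url: str, collection: str = COLLECTION) -> dict:
    """Describe the Qdrant REST call that creates the collection."""
    return {
        "method": "PUT",
        "url": f"{base_url}/collections/{collection}",
        "json": {"vectors": {"size": EMBEDDING_DIM, "distance": "Cosine"}},
    }


def build_point(vector: List[float], chunk: str, metadata: dict) -> dict:
    """Build one point: embedding plus the chunk text and its metadata."""
    if len(vector) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM} dimensions, got {len(vector)}")
    return {
        "id": str(uuid.uuid4()),  # Qdrant accepts UUIDs or unsigned integers as IDs
        "vector": vector,
        "payload": {"content": chunk, "metadata": metadata},  # assumed payload keys
    }
```

If the chat flow points at a different collection name, or at a collection built with another embedding size, retrieval returns nothing or fails outright.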

5. Run Ingestion (Flow 1)

Click Test workflow to populate Qdrant with your Drive documents.

6. Activate Chat (Flow 2)

Toggle the workflow ON to enable real-time chat.

7. Test

Open the webhook URL and start chatting!

How To Customize

Change LLMs: Swap models in the OpenAI or Mistral nodes (e.g., GPT-4o), or replace a chat model node with another provider's node (e.g., Anthropic for Claude 3).
Modify Prompts: Edit the system message in ai chat agent to alter tone or logic.
Chunking Strategy: Adjust chunkSize and chunkOverlap in the Code node.
Different Sources: Replace Google Drive with AWS S3, Local Folder, etc.
Automate Updates: Add a Schedule Trigger node (the successor to the Cron node) for scheduled ingestion.
Validation: Add post-processing steps after metadata extraction.
Expand Tools: Add more functional nodes like Google Calendar or Calculator.
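As an example of the validation idea above, a post-processing step could normalize the LLM-generated metadata before it reaches Qdrant. The three field names come from the description; the `"unknown"` fallback and the function name are hypothetical choices.

```python
ALLOWED_FIELDS = ("document_type", "project", "assigned_to")


def validate_metadata(raw: dict) -> dict:
    """Keep only the expected fields, trim strings, and fill gaps with 'unknown'."""
    clean = {}
    for field in ALLOWED_FIELDS:
        value = raw.get(field)
        if isinstance(value, str) and value.strip():
            clean[field] = value.strip()
        else:
            clean[field] = "unknown"  # missing, empty, or non-string value
    return clean
```

Dropping unexpected keys keeps the payload schema stable, which makes metadata filters in later queries more predictable.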

Use Case Examples

Internal HR Bot: Answer HR-related queries from stored policy docs.
Tech Support Assistant: Retrieve troubleshooting steps for products.
Research Assistant: Summarize and compare market reports.
Project Management Bot: Query document ownership or project status.

Troubleshooting Guide

| Issue | Possible Solution |
|-------|-------------------|
| Chat agent doesn’t respond | Check OpenAI API key and model availability (e.g., gpt-4.1-mini). |
| Known documents not found | Ensure ingestion flow ran and both Qdrant nodes use same collection name. |
| OCR node fails | Verify Mistral API key and input file integrity. |
| Web search not triggered | Re-check Tavily API key in Web Search node headers. |
| Incorrect metadata | Tune Information Extractor prompt or use a stronger Mistral model. |

Need Help or More Workflows?

Want to customize this workflow for your business or integrate it with your existing tools?
Our team at Digital Biz Tech can tailor it precisely to your use case, from automation logic to AI-powered enhancements.

We can help you set it up for free, from connecting credentials to deploying it live.

Contact: shilpa.raju@digitalbiz.tech
Website: https://www.digitalbiz.tech
LinkedIn: https://www.linkedin.com/company/digital-biz-tech/
You can also DM us on LinkedIn for any help.

Document RAG Chat Agent: Google Drive to Qdrant with Mistral OCR

This n8n workflow creates a powerful Retrieval Augmented Generation (RAG) chat agent. It automates the process of extracting text from documents stored in Google Drive, processing them with OCR (if needed), embedding the text, storing it in Qdrant, and then making this information queryable via a Mistral-powered AI chat agent.

What it does

This workflow automates the following steps:

Triggers on Chat Message: Initiates the workflow when a chat message is received.
Initializes AI Agent: Sets up an AI agent using a Mistral Chat Model and a Simple Memory for conversational context.
Checks for Document Ingestion Request: Determines if the user's chat message is a request to ingest documents from Google Drive.
If Ingestion Requested:
- Lists Google Drive Files: Retrieves a list of files from a specified Google Drive folder.
- Loops Over Files: Processes each file individually.
- Downloads File: Downloads the content of each Google Drive file.
- Extracts Text with OCR (if needed): Uses an HTTP Request node to send the file to an OCR service (e.g., Google Document AI or a custom OCR solution) to extract text, especially from image-based documents.
- Loads Document: Uses a Default Data Loader to process the extracted text.
- Splits Text: Breaks down the document text into smaller chunks using a Character Text Splitter.
- Embeds Text: Generates vector embeddings for each text chunk using OpenAI Embeddings.
- Stores in Qdrant: Upserts the embedded text chunks into a Qdrant vector store, making them searchable.
- Confirms Ingestion: Sends a chat message confirming that the documents have been ingested.
If Chat Query:
- Retrieves Relevant Documents: Queries the Qdrant vector store to find documents relevant to the user's chat message.
- Generates Response: Uses the AI Agent (Mistral Chat Model) to generate a conversational response based on the retrieved documents and chat history.
- Responds to Chat: Sends the generated AI response back to the chat.

Prerequisites/Requirements

n8n Instance: A running n8n instance.
Google Drive Account: With access to the folder containing the documents to be ingested.
Google Drive Credential: Configured in n8n for accessing Google Drive.
OCR Service: An external OCR service (e.g., Google Document AI, or a self-hosted solution) accessible via an HTTP API. The "HTTP Request" node is configured for this.
OpenAI API Key: For the "Embeddings OpenAI" node to generate text embeddings.
Mistral Cloud API Key: For the "Mistral Cloud Chat Model" node to power the AI agent.
Qdrant Instance: A running Qdrant vector database instance.
Qdrant Credential: Configured in n8n for connecting to Qdrant.

Setup/Usage

Import the workflow: Import the provided JSON into your n8n instance.
Configure Credentials:
- Set up your Google Drive OAuth2 or API Key credential.
- Set up your OpenAI API Key credential.
- Set up your Mistral Cloud API Key credential.
- Set up your Qdrant API Key and host credential.
Configure Google Drive Node (ID: 58):
- Specify the Folder ID of the Google Drive folder containing your documents.
Configure HTTP Request (ID: 19) for OCR:
- Update the URL to point to your OCR service endpoint.
- Adjust Headers and Body as required by your OCR service API (e.g., for authentication, file format).
- Ensure the Binary Data field is correctly mapped to the downloaded Google Drive file.
Configure Qdrant Vector Store (ID: 1248):
- Specify the Collection Name where documents will be stored.
Activate the workflow: Once all credentials and configurations are set, activate the workflow.
Interact with the Chat Agent: Send a chat message to the configured chat trigger.
- To ingest documents, send a message that matches the condition in the "If" node (ID: 20), for example, "ingest documents".
- To query, send a question related to the ingested documents.
