Interactive knowledge base chat with Supabase RAG using AI 📚💬


2/3/2026


Google Drive File Ingestion to Supabase for Knowledge Base 📂💾

Overview 🌟

This n8n workflow automates the process of ingesting files from Google Drive into a Supabase database, preparing them for a knowledge base system. It supports text-based files (PDF, DOCX, TXT, etc.) and tabular data (XLSX, CSV, Google Sheets), extracting content, generating embeddings, and storing data in structured tables. This is a foundational workflow for building a company knowledge base that can be queried via a chat interface (e.g., using a RAG workflow). 🚀

Problem Solved 🎯

Manually managing a knowledge base with files from Google Drive is time-consuming and error-prone. This workflow solves that by:

Automatically ingesting files from Google Drive as they are created or updated.
Extracting content from various file types (text and tabular).
Generating embeddings for text-based files to enable vector search.
Storing data in Supabase for efficient retrieval.
Handling duplicates and errors to ensure data consistency.

Target Audience:

Knowledge Managers: Build a centralized knowledge base from company files.
Data Teams: Automate the ingestion of spreadsheets and documents.
Developers: Integrate with other workflows (e.g., RAG for querying the knowledge base).

Workflow Description 🔍

This workflow listens for new or updated files in Google Drive, processes them based on their type, and stores the extracted data in Supabase tables for later retrieval. Here’s how it works:

File Detection: Triggers when a file is created or updated in Google Drive.
File Processing: Loops through each file, extracts metadata, and validates the file type.
Duplicate Check: Ensures the file hasn’t been processed before.
Content Extraction:
- Text-based Files: Downloads the file, extracts text, splits it into chunks, generates embeddings, and stores the chunks in Supabase.
- Tabular Files: Extracts data from spreadsheets and stores it as rows in Supabase.
Metadata Storage: Stores file metadata and basic info in Supabase tables.
Error Handling: Logs errors for unsupported formats or duplicates.

Nodes Breakdown 🛠️

1. Detect New File 🔔

Type: Google Drive Trigger
Purpose: Triggers the workflow when a new file is created in Google Drive.
Configuration:
- Credential: Google Drive OAuth2
- Event: File Created
Customization:
- Specify a folder to monitor specific directories.

2. Detect Updated File 🔔

Type: Google Drive Trigger
Purpose: Triggers the workflow when a file is updated in Google Drive.
Configuration:
- Credential: Google Drive OAuth2
- Event: File Updated
Customization:
- This trigger is currently disconnected in the workflow; reconnect it if updated files should be reprocessed.

3. Process Each File 🔄

Type: Loop Over Items
Purpose: Processes each file individually from the Google Drive trigger.
Configuration:
- Input: {{ $json.files }}
Customization:
- Adjust the batch size if processing multiple files at once.

4. Extract File Metadata 🆔

Type: Set
Purpose: Extracts metadata like file_id, file_name, mime_type, and web_view_link.
Configuration:
- Fields:
  - file_id: {{ $json.id }}
  - file_name: {{ $json.name }}
  - mime_type: {{ $json.mimeType }}
  - web_view_link: {{ $json.webViewLink }}
Customization:
- Add more metadata fields if needed (e.g., size, createdTime).
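
The field mapping above can be sketched outside n8n as a plain function. This is only an illustration of what the Set node produces: the function name `extract_file_metadata` and the sample record are hypothetical, and the input keys are the Google Drive fields referenced in the expressions above.

```python
def extract_file_metadata(drive_file: dict) -> dict:
    """Map a Google Drive file record to the fields used by the workflow."""
    return {
        "file_id": drive_file.get("id"),
        "file_name": drive_file.get("name"),
        "mime_type": drive_file.get("mimeType"),
        "web_view_link": drive_file.get("webViewLink"),
    }


# Hypothetical sample record, shaped like a Drive trigger item.
sample = {
    "id": "abc123",
    "name": "policy.pdf",
    "mimeType": "application/pdf",
    "webViewLink": "https://drive.google.com/file/d/abc123/view",
}
metadata = extract_file_metadata(sample)
```

Missing keys come back as None rather than raising, which keeps the later type check responsible for rejecting incomplete items.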

5. Check File Type ✅

Type: IF
Purpose: Validates the file type by checking the MIME type.
Configuration:
- Condition: mime_type contains supported types (e.g., application/pdf, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet).
Customization:
- Add more supported MIME types as needed.

6. Find Duplicates 🔍

Type: Supabase
Purpose: Checks if the file has already been processed by querying knowledge_base.
Configuration:
- Operation: Select
- Table: knowledge_base
- Filter: file_id = {{ $node['Extract File Metadata'].json.file_id }}
Customization:
- Add additional duplicate checks (e.g., by file name).

7. Handle Duplicates 🔄

Type: IF
Purpose: Routes the workflow based on whether a duplicate is found.
Configuration:
- Condition: {{ $node['Find Duplicates'].json.length > 0 }}
Customization:
- Add notifications for duplicates if desired.

8. Remove Old Text Data 🗑️

Type: Supabase
Purpose: Deletes old text data from documents if the file is a duplicate.
Configuration:
- Operation: Delete
- Table: documents
- Filter: metadata->>'file_id' = {{ $node['Extract File Metadata'].json.file_id }}
Customization:
- Add logging before deletion.

9. Remove Old Data 🗑️

Type: Supabase
Purpose: Deletes old tabular data from document_rows if the file is a duplicate.
Configuration:
- Operation: Delete
- Table: document_rows
- Filter: dataset_id = {{ $node['Extract File Metadata'].json.file_id }}
Customization:
- Add logging before deletion.
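
Taken together, nodes 6 to 9 implement a "replace on re-upload" rule: if the file ID is already known, its old chunks and rows are removed before the new content is stored. A minimal in-memory sketch of that rule, assuming tables are represented as lists of dicts (the function names are hypothetical, not part of the workflow):

```python
def is_duplicate(file_id, knowledge_base):
    """True if the file ID already has an entry in knowledge_base."""
    return any(entry["file_id"] == file_id for entry in knowledge_base)


def remove_previous_versions(file_id, documents, document_rows):
    """Return both tables without the rows that belong to file_id."""
    kept_documents = [
        d for d in documents if d["metadata"].get("file_id") != file_id
    ]
    kept_rows = [r for r in document_rows if r["dataset_id"] != file_id]
    return kept_documents, kept_rows


# Hypothetical table contents for illustration.
knowledge_base = [{"file_id": "f1"}]
documents = [
    {"content_text": "old chunk", "metadata": {"file_id": "f1"}},
    {"content_text": "other file", "metadata": {"file_id": "f2"}},
]
document_rows = [{"dataset_id": "f1", "row_data": {"a": 1}}]

if is_duplicate("f1", knowledge_base):
    documents, document_rows = remove_previous_versions(
        "f1", documents, document_rows
    )
```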

10. Route by File Type 🔀

Type: Switch
Purpose: Routes the workflow based on the file’s MIME type (text-based or tabular).
Configuration:
- Rules: Based on mime_type (e.g., application/pdf for text, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet for tabular).
Customization:
- Add more routes for additional file types.
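
The type check (node 5) and the routing step can be pictured as a single lookup. The sketch below is an assumption about which MIME types fall into each branch, based on the file types named in the overview; the actual Switch rules in the workflow may differ.

```python
# Assumed sets; extend them to match the rules in your Switch node.
TEXT_MIME_TYPES = {
    "application/pdf",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document",  # DOCX
    "text/plain",
}
TABULAR_MIME_TYPES = {
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",  # XLSX
    "text/csv",
    "application/vnd.google-apps.spreadsheet",  # Google Sheets
}


def route_by_mime_type(mime_type: str) -> str:
    """Return the branch name for a MIME type: text, tabular, or unsupported."""
    if mime_type in TEXT_MIME_TYPES:
        return "text"
    if mime_type in TABULAR_MIME_TYPES:
        return "tabular"
    return "unsupported"
```

Files that land in the "unsupported" branch correspond to the cases the workflow writes to error_log.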

11. Download File Content 📥

Type: Google Drive
Purpose: Downloads the file content for text-based files.
Configuration:
- Credential: Google Drive OAuth2
- File ID: {{ $node['Extract File Metadata'].json.file_id }}
Customization:
- Add error handling for download failures.

12. Extract PDF Text 📜

Type: Extract from File (PDF)
Purpose: Extracts text from PDF files.
Configuration:
- File Content: {{ $node['Download File Content'].binary.data }}
Customization:
- Adjust extraction settings for better accuracy.

13. Extract DOCX Text 📜

Type: Extract from File (DOCX)
Purpose: Extracts text from DOCX files.
Configuration:
- File Content: {{ $node['Download File Content'].binary.data }}
Customization:
- Add support for other text formats (e.g., TXT, RTF).

14. Extract XLSX Data 📊

Type: Extract from File (XLSX)
Purpose: Extracts tabular data from XLSX files.
Configuration:
- File ID: {{ $node['Extract File Metadata'].json.file_id }}
Customization:
- Add support for CSV or Google Sheets.

15. Split Text into Chunks ✂️

Type: Text Splitter
Purpose: Splits extracted text into manageable chunks for embedding.
Configuration:
- Chunk Size: 1000
- Chunk Overlap: 200
Customization:
- Adjust chunk size and overlap based on document length.
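
To see what a chunk size of 1000 with an overlap of 200 means in practice, here is a simplified character-based splitter. It is a sketch only: the n8n Text Splitter may split on separators rather than at fixed offsets, so its chunk boundaries can differ from these.

```python
from typing import List


def split_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """Split text into fixed-size chunks where neighbours share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # each chunk starts this far after the previous one
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


# A 2500-character sample yields chunks starting at offsets 0, 800, and 1600.
sample_text = "".join(str(i % 10) for i in range(2500))
chunks = split_text(sample_text)
```

With these settings each chunk starts 800 characters after the previous one, so the last 200 characters of one chunk are repeated at the start of the next.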

16. Generate Text Embeddings 🌐

Type: OpenAI
Purpose: Generates embeddings for text chunks using OpenAI.
Configuration:
- Credential: OpenAI API key
- Operation: Embedding
- Model: text-embedding-ada-002
Customization:
- Switch to a different embedding model if needed.

17. Store Text in Supabase 💾

Type: Supabase Vector Store
Purpose: Stores text chunks and embeddings in the documents table.
Configuration:
- Credential: Supabase credentials
- Operation: Insert Documents
- Table Name: documents
Customization:
- Add metadata fields to store additional context.

18. Store Tabular Data 💾

Type: Supabase
Purpose: Stores tabular data in the document_rows table.
Configuration:
- Operation: Insert
- Table: document_rows
- Columns: dataset_id, row_data
Customization:
- Add validation for tabular data structure.
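
The shape of a document_rows record can be illustrated with a small CSV example. This sketch assumes each spreadsheet row becomes one record whose row_data is a JSON object keyed by the column headers; the function name and sample data are hypothetical.

```python
import csv
import io
import json


def rows_to_records(csv_text: str, dataset_id: str) -> list:
    """Turn CSV text into records with the dataset_id and row_data columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {"dataset_id": dataset_id, "row_data": json.dumps(row)}
        for row in reader
    ]


# Hypothetical budget sheet; all values are read as strings.
csv_text = "department,year,budget\nHR,2025,1000\nIT,2025,2500\n"
records = rows_to_records(csv_text, dataset_id="file-123")
```

Because the csv module reads every cell as a string, numeric columns would need explicit conversion if the chat workflow is expected to compare or sum them.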

19. Store File Metadata 📋

Type: Supabase
Purpose: Stores file metadata in the document_metadata table.
Configuration:
- Operation: Insert
- Table: document_metadata
- Columns: file_id, file_name, file_type, file_url
Customization:
- Add more metadata fields as needed.

20. Record in Knowledge Base 📚

Type: Supabase
Purpose: Stores basic file info in the knowledge_base table.
Configuration:
- Operation: Insert
- Table: knowledge_base
- Columns: file_id, file_name, file_type, file_url, upload_date
Customization:
- Add indexes for faster lookups.

21. Log File Errors ⚠️

Type: Supabase
Purpose: Logs errors for unsupported file types.
Configuration:
- Operation: Insert
- Table: error_log
- Columns: error_type, error_message
Customization:
- Add notifications for errors.

22. Log Duplicate Errors ⚠️

Type: Supabase
Purpose: Logs errors for duplicate files.
Configuration:
- Operation: Insert
- Table: error_log
- Columns: error_type, error_message
Customization:
- Add notifications for duplicates.

Interactive Knowledge Base Chat with Supabase RAG using GPT-4o-mini 📚💬

Introduction 🌟

This n8n workflow creates an interactive chat interface that allows users to query a company knowledge base using Retrieval-Augmented Generation (RAG). It retrieves relevant information from text documents and tabular data stored in Supabase, then generates natural language responses using OpenAI’s GPT-4o-mini model. Designed for teams managing internal knowledge, this workflow enables users to ask questions like “What’s the remote work policy?” or “Show me the latest budget data” and receive accurate, context-aware responses in a conversational format. 🚀

Problem Statement 🎯

Managing a company knowledge base is difficult: employees often struggle to find specific information buried in documents or spreadsheets, which wastes time. Traditional search methods may not understand natural language queries or return contextually relevant results. This workflow addresses these issues by:

Offering a chat-based interface for natural language queries, making it easy for users to ask questions in their own words.
Leveraging RAG to retrieve relevant text and tabular data from Supabase, ensuring responses are accurate and context-aware.
Supporting diverse file types, including text-based files (e.g., PDFs, DOCX) and tabular data (e.g., XLSX, CSV), for comprehensive knowledge access.
Maintaining conversation history to provide context during interactions, improving the user experience.

Target Audience 👥

This workflow is ideal for:

HR Teams: Quickly access company policies, employee handbooks, or benefits documents.
Finance Teams: Retrieve budget data, expense reports, or financial summaries from spreadsheets.
Knowledge Managers: Build a centralized assistant for internal documentation, streamlining information access.
Developers: Extend the workflow with additional tools or integrations for custom use cases.

Workflow Description 🔍

This workflow consists of a chat interface powered by n8n’s Chat Trigger node, an AI Agent node for RAG, and several tools to retrieve data from Supabase. Here’s how it works step-by-step:

User Initiates a Chat: The user interacts with a chat interface, sending queries like “Summarize our remote work policy” or “Show budget data for Q1 2025.”
Query Processing with RAG: The AI Agent processes the query using RAG, retrieving relevant data from Supabase tables and generating a response with OpenAI’s GPT-4o-mini model.
Data Retrieval and Response Generation: The workflow uses multiple tools to fetch data:
- Retrieves text chunks from the documents table using vector search.
- Fetches tabular data from the document_rows table based on file IDs.
- Extracts full document text or lists available files as needed.
- Generates a natural language response combining the retrieved data.
Conversation History Management: Stores the conversation history in Supabase to maintain context for follow-up questions.
Response Delivery: Formats and sends the response back to the chat interface for the user to view.

Nodes Breakdown 🛠️

1. Start Chat Interface 💬

Type: Chat Trigger
Purpose: Provides the interactive chat interface for users to input queries and receive responses.
Configuration:
- Chat Title: Company Knowledge Base Assistant
- Chat Subtitle: Ask me anything about company documents!
- Welcome Message: Hello! I’m your Company Knowledge Base Assistant. How can I help you today?
- Suggestions: What is the company policy on remote work?, Show me the latest budget data., List all policy documents.
- Output Chat Session ID: true
- Output User Message: true
Customization:
- Update the title and welcome message to align with your company branding (e.g., HR Knowledge Assistant).
- Add more suggestions relevant to your use case (e.g., What are the company benefits?).

2. Process Query with RAG 🧠

Type: AI Agent
Purpose: Orchestrates the RAG process by retrieving relevant data using tools and generating responses with OpenAI’s GPT-4o-mini.
Configuration:
- Credential: OpenAI API key
- Model: gpt-4o-mini
- System Prompt: You are a helpful assistant for a company knowledge base. Use the provided tools to retrieve relevant information from documents and tabular data. If the query involves tabular data, format it clearly in your response. If no relevant data is found, respond with "I couldn’t find any relevant information. Can you provide more details?"
- Input Field: {{ $node['Start Chat Interface'].json.message }}
Customization:
- Switch to a different model (e.g., gpt-3.5-turbo) to adjust cost or performance.
- Modify the system prompt to change the tone (e.g., more formal for HR use cases).

3. Retrieve Text Chunks 📄

Type: Supabase Vector Store (Tool)
Purpose: Retrieves relevant text chunks from the documents table using vector search.
Configuration:
- Credential: Supabase credentials
- Operation Mode: Retrieve Documents (As Tool for AI Agent)
- Table Name: documents
- Embedding Field: embedding
- Content Field: content_text
- Metadata Field: metadata
- Embedding Model: OpenAI text-embedding-ada-002
- Top K: 10
Customization:
- Adjust Top K to retrieve more or fewer results (e.g., 5 for faster responses).
- Ensure the match_documents function (see prerequisites) is defined in Supabase.

4. Fetch Tabular Data 📊

Type: Supabase (Tool, Execute Query)
Purpose: Retrieves tabular data from the document_rows table based on a file ID.
Configuration:
- Credential: Supabase credentials
- Operation: Execute Query
- Query: SELECT row_data FROM document_rows WHERE dataset_id = $1 LIMIT 10
- Tool Description: Run a SQL query - use this to query from the document_rows table once you know the file ID you are querying. dataset_id is the file_id and you are always using the row_data for filtering, which is a jsonb field that has all the keys from the file schema given in the document_metadata table.
Customization:
- Modify the query to filter specific columns or add conditions (e.g., WHERE dataset_id = $1 AND row_data->>'year' = '2025').
- Increase the LIMIT for larger datasets.
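
The query above, including the optional row_data filter, behaves roughly like the following in-memory lookup. This is a sketch for reasoning about the tool's behavior, assuming row_data is already parsed into a dict; the function name and sample rows are hypothetical.

```python
def fetch_rows(document_rows, dataset_id, filters=None, limit=10):
    """Emulate SELECT row_data ... WHERE dataset_id = $1 [AND row_data->>key = value] LIMIT n."""
    filters = filters or {}
    matches = []
    for row in document_rows:
        if len(matches) >= limit:
            break
        if row["dataset_id"] != dataset_id:
            continue
        data = row["row_data"]
        # ->> returns text, so values are compared as strings.
        # Simplification: str() only mirrors ->> for strings and integers.
        if all(k in data and str(data[k]) == v for k, v in filters.items()):
            matches.append(data)
    return matches


# Hypothetical rows for illustration.
rows = [
    {"dataset_id": "f1", "row_data": {"year": 2025, "amount": 10}},
    {"dataset_id": "f1", "row_data": {"year": 2024, "amount": 7}},
    {"dataset_id": "f2", "row_data": {"year": 2025, "amount": 3}},
]
rows_2025 = fetch_rows(rows, "f1", {"year": "2025"})
```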

5. Extract Full Document Text 📜

Type: Supabase (Tool, Execute Query)
Purpose: Fetches the full text of a document by concatenating all text chunks for a given file_id.
Configuration:
- Credential: Supabase credentials
- Operation: Execute Query
- Query: SELECT string_agg(content_text, ' ') as document_text FROM documents WHERE metadata->>'file_id' = $1 GROUP BY metadata->>'file_id'
- Tool Description: Given file id fetch the text from the documents
Customization:
- Add filters to the query if needed (e.g., limit to specific metadata fields).

6. List Available Files 📋

Type: Supabase (Tool, Select)
Purpose: Lists all files in the knowledge base from the document_metadata table.
Configuration:
- Credential: Supabase credentials
- Operation: Select
- Schema: public
- Table: document_metadata
- Tool Description: Use this tool to fetch all documents including the table schema if the file is csv, excel or xlsx
Customization:
- Add filters to list specific file types (e.g., WHERE file_type = 'application/pdf').
- Modify the columns selected to include additional metadata (e.g., file_size).

7. Manage Chat History 💾

Type: Postgres Chat Memory (Tool)
Purpose: Stores and retrieves conversation history to maintain context.
Configuration:
- Credential: Supabase credentials (Postgres-compatible)
- Table Name: n8n_chat_history
- Session ID Field: session_id
- Session ID Value: {{ $node['Start Chat Interface'].json.sessionId }}
- Message Field: message
- Sender Field: sender
- Timestamp Field: timestamp
- Context Window Length: 5
Customization:
- Increase the context window length for longer conversations (e.g., 10 messages).
- Add indexes on session_id and timestamp in Supabase for better performance.
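
The context window can be thought of as "keep everything, read back only the most recent entries". A minimal sketch of that idea follows, assuming the window counts individual messages (the node may count interactions instead); the class and its methods are hypothetical and not the node's API.

```python
from collections import defaultdict


class ChatMemory:
    """Stores all messages per session and returns only the most recent ones."""

    def __init__(self, window: int = 5):
        self.window = window
        self._history = defaultdict(list)

    def add(self, session_id: str, sender: str, message: str) -> None:
        self._history[session_id].append({"sender": sender, "message": message})

    def context(self, session_id: str) -> list:
        # Only the last `window` messages are handed to the model.
        return self._history.get(session_id, [])[-self.window:]


memory = ChatMemory(window=5)
for i in range(7):
    memory.add("session-1", "user", f"message {i}")
recent = memory.context("session-1")
```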

8. Format and Send Response 📤

Type: Set
Purpose: Formats the AI Agent’s response and sends it back to the chat interface.
Configuration:
- Fields:
  - response: {{ $node['Process Query with RAG'].json.output }}
Customization:
- Add additional formatting to the response if needed (e.g., prepend with a timestamp or apply markdown formatting).

Setup Instructions 🛠️

Prerequisites 📋

n8n Setup:
- Ensure you’re using n8n version 1.0 or higher.
- Enable the AI features in n8n settings.
Supabase:
- Create a Supabase project and set up the following tables:
  - documents: id (uuid), content_text (text), embedding (vector(1536)), metadata (jsonb)
  - document_rows: id (uuid), dataset_id (varchar), row_data (jsonb)
  - document_metadata: file_id (varchar), file_name (varchar), file_type (varchar), file_url (text)
  - knowledge_base: id (serial), file_id (varchar), file_name (varchar), file_type (varchar), file_url (text), upload_date (timestamp)
  - n8n_chat_history: id (serial), session_id (varchar), message (text), sender (varchar), timestamp (timestamp)
- Add the match_documents function to Supabase to enable vector search:
```
CREATE OR REPLACE FUNCTION match_documents (
  query_embedding vector(1536),
  match_count int DEFAULT 5,
  filter jsonb DEFAULT '{}'
) RETURNS TABLE (
  id uuid,
  content_text text,
  metadata jsonb,
  similarity float
) LANGUAGE plpgsql AS $$
BEGIN
  RETURN QUERY
  SELECT
    documents.id,
    documents.content_text,
    documents.metadata,
    1 - (documents.embedding <=> query_embedding) as similarity
  FROM documents
  WHERE documents.metadata @> filter
  ORDER BY similarity DESC
  LIMIT match_count;
END;
$$;
```
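
For intuition about what match_documents computes: the pgvector `<=>` operator is cosine distance, so `1 - distance` is cosine similarity, and the function returns the top rows by that score after a metadata filter. The Python sketch below mirrors this on in-memory data. It is an illustration only: the metadata filter checks top-level keys, which is a simplification of jsonb containment, and a zero vector is scored as 0.0 here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_documents(query_embedding, documents, match_count=5, metadata_filter=None):
    """Return the match_count most similar documents that pass the metadata filter."""
    metadata_filter = metadata_filter or {}
    scored = []
    for doc in documents:
        # Simplified stand-in for `metadata @> filter`: top-level key equality.
        if all(doc["metadata"].get(k) == v for k, v in metadata_filter.items()):
            similarity = cosine_similarity(query_embedding, doc["embedding"])
            scored.append({**doc, "similarity": similarity})
    scored.sort(key=lambda d: d["similarity"], reverse=True)
    return scored[:match_count]


# Hypothetical 2-dimensional embeddings; real ones have 1536 dimensions.
docs = [
    {"id": "a", "embedding": [1.0, 0.0], "metadata": {"file_id": "f1"}},
    {"id": "b", "embedding": [0.0, 1.0], "metadata": {"file_id": "f1"}},
    {"id": "c", "embedding": [1.0, 1.0], "metadata": {"file_id": "f2"}},
]
top = match_documents([1.0, 0.0], docs, match_count=2)
```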

n8n Interactive Knowledge Base Chat with Supabase RAG

This n8n workflow enables an interactive chat experience with a knowledge base, leveraging Supabase for data storage and Retrieval Augmented Generation (RAG) with OpenAI. It automates the process of ingesting documents from Google Drive into a Supabase vector store, and then uses an AI agent to answer user queries based on this knowledge base, maintaining chat history in a Postgres database.

What it does

This workflow automates the following key processes:

Document Ingestion (Google Drive Trigger):
- Monitors Google Drive: Listens for new or updated files in a specified Google Drive folder.
- Extracts File Content: Downloads and extracts text content from various file types (e.g., PDF, TXT).
- Splits Text into Chunks: Divides the extracted document text into smaller, manageable chunks suitable for embeddings.
- Generates Embeddings: Uses OpenAI's embedding model to create vector representations of these text chunks.
- Stores in Supabase Vector Store: Ingests the text chunks and their corresponding embeddings into a Supabase vector database, making them searchable for RAG.
- Email Notification (Optional): Can be configured to send an email notification upon successful document ingestion.
Interactive Chat (Chat Trigger):
- Listens for Chat Messages: Triggers when a new chat message is received (e.g., from a custom chat interface).
- Manages Chat History: Stores and retrieves conversation history using a Postgres Chat Memory.
- Retrieves Relevant Information: Queries the Supabase Vector Store to find the most relevant document chunks based on the user's query.
- Generates AI Response: Uses an OpenAI Chat Model, augmented with the retrieved information (RAG), to generate a coherent and contextually relevant response.
- Responds to User: Sends the AI-generated response back to the chat interface.

Prerequisites/Requirements

To use this workflow, you will need:

n8n Instance: A running n8n instance.
Google Drive Account: For storing your knowledge base documents.
Supabase Account:
- A Supabase project with a Postgres database.
- A table configured as a vector store (e.g., using the pgvector extension).
- A table for storing chat memory.
OpenAI API Key: For generating embeddings and AI chat responses.
Postgres Database: For storing chat memory (this can be the same database as your Supabase project).
Gmail Account (Optional): If you want to receive email notifications.

Setup/Usage

Import the Workflow: Download the provided JSON and import it into your n8n instance.
Configure Credentials:
- Google Drive: Set up a Google Drive credential to access your documents.
- Supabase: Configure a Supabase credential with your project URL and API key.
- OpenAI: Set up an OpenAI credential with your API key.
- Postgres: Configure a Postgres credential pointing to your database for chat memory.
- Gmail (Optional): Set up a Gmail credential if you wish to use email notifications.
Customize Nodes:
- Google Drive Trigger (Node 531): Specify the folder ID in Google Drive where your knowledge base documents are located.
- Supabase Vector Store (Node 1231): Configure the table name and schema for your vector store in Supabase.
- Embeddings OpenAI (Node 1141): Select your desired OpenAI embedding model.
- OpenAI Chat Model (Node 1153): Select your desired OpenAI chat model (e.g., gpt-4, gpt-3.5-turbo).
- Postgres Chat Memory (Node 1267): Configure the table name for storing chat history in your Postgres database.
- Gmail (Node 356): If enabled, set the recipient email address for notifications.
Activate the Workflow: Once all credentials and configurations are set, activate the workflow.
Ingest Documents: To ingest documents, simply add or update files in the specified Google Drive folder. The workflow will automatically process them.
Start Chatting: Interact with your knowledge base via the Chat Trigger endpoint. You'll need to integrate this trigger with your frontend chat application.
