
Automate invoice processing with OCR, GPT-4 & Salesforce opportunity creation

Le Nguyen
2/3/2026

PDF Invoice Extractor (AI)

End-to-end pipeline: Watch Drive ➜ Download PDF ➜ OCR text ➜ AI normalize to JSON ➜ Upsert Buyer (Account) ➜ Create Opportunity ➜ Map Products ➜ Create OLI via Composite API ➜ Archive to OneDrive.


Node by node (what it does & key setup)

1) Google Drive Trigger

  • Purpose: Fire when a new file appears in a specific Google Drive folder.
  • Key settings:
    • Event: fileCreated
    • Folder ID: <your Google Drive folder ID>
    • Polling: everyMinute
    • Creds: googleDriveOAuth2Api
  • Output: Metadata { id, name, ... } for the new file.

2) Download File From Google

  • Purpose: Get the file binary for processing and archiving.
  • Key settings:
    • Operation: download
    • File ID: ={{ $json.id }}
    • Creds: googleDriveOAuth2Api
  • Output: Binary (default key: data) and original metadata.

3) Extract from File

  • Purpose: Extract text from PDF (OCR as needed) for AI parsing.
  • Key settings:
    • Operation: pdf
    • OCR: enable for scanned PDFs (in options)
  • Output: JSON with OCR text at {{ $json.text }}.

4) Message a model (AI JSON Extractor)

  • Purpose: Convert OCR text into strict normalized JSON array (invoice schema).
  • Key settings:
    • Node: @n8n/n8n-nodes-langchain.openAi
    • Model: gpt-4.1 (or gpt-4.1-mini)
    • Message role: system (the strict prompt; references {{ $json.text }})
    • jsonOutput: true
    • Creds: openAiApi
  • Output (per item): $.message.content → the parsed JSON (ensure it’s an array).
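
The note above asks you to ensure the model output is an array. One way to enforce that before any Salesforce write is a small Code-node helper like the sketch below. The function name normalizeInvoices and the specific checks are hypothetical; the field names (buyer.tax_id, invoice.code, products) are taken from the expressions used by the later nodes.

```javascript
// Hypothetical helper: coerce the model output into a validated array of invoices.
// The checks are an assumption, not part of the original workflow.
function normalizeInvoices(content) {
  // jsonOutput normally yields parsed JSON, but tolerate a raw string too.
  const parsed = typeof content === "string" ? JSON.parse(content) : content;
  // Wrap a single invoice object so downstream code can always iterate.
  const invoices = Array.isArray(parsed) ? parsed : [parsed];

  invoices.forEach((inv, i) => {
    if (!inv || typeof inv !== "object") {
      throw new Error(`Invoice ${i}: not an object`);
    }
    if (!inv.buyer || !inv.buyer.tax_id) {
      throw new Error(`Invoice ${i}: missing buyer.tax_id`);
    }
    if (!inv.invoice || !inv.invoice.code) {
      throw new Error(`Invoice ${i}: missing invoice.code`);
    }
    if (!Array.isArray(inv.products) || inv.products.length === 0) {
      throw new Error(`Invoice ${i}: no product lines`);
    }
  });
  return invoices;
}
```

In a Code node this would typically be called with the previous item's message.content and its result returned as one item per invoice.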

5) Create or update an account (Salesforce)

  • Purpose: Upsert Buyer as Account using an external ID.
  • Key settings:
    • Resource: account
    • Operation: upsert
    • External Id Field: tax_id__c
    • External Id Value: ={{ $json.message.content.buyer.tax_id }}
    • Name: ={{ $json.message.content.buyer.name }}
    • Creds: salesforceOAuth2Api
  • Output: Account record (captures Id) for downstream Opportunity.

6) Create an opportunity (Salesforce)

  • Purpose: Create Opportunity linked to the Buyer (Account).
  • Key settings:
    • Resource: opportunity
    • Name: ={{ $('Message a model').item.json.message.content.invoice.code }}
    • Close Date: ={{ $('Message a model').item.json.message.content.invoice.issue_date }}
    • Stage: Closed Won
    • Amount: ={{ $('Message a model').item.json.message.content.summary.grand_total }}
    • AccountId: ={{ $json.id }} (from Upsert Account output)
    • Creds: salesforceOAuth2Api
  • Output: Opportunity Id for OLI creation.
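
As a rough illustration of the mapping above, the sketch below builds the Opportunity field set from one invoice object. The function name buildOpportunityFields is hypothetical, and the YYYY-MM-DD check on issue_date is an added assumption (Salesforce date fields expect that format); the Salesforce node does this mapping through its UI fields rather than code.

```javascript
// Hypothetical helper mirroring the Opportunity field mapping described above.
function buildOpportunityFields(invoice, accountId) {
  const closeDate = invoice.invoice.issue_date;
  // Assumption: the AI prompt normalizes dates to YYYY-MM-DD.
  if (!/^\d{4}-\d{2}-\d{2}$/.test(String(closeDate))) {
    throw new Error(`issue_date is not YYYY-MM-DD: ${closeDate}`);
  }
  return {
    Name: invoice.invoice.code,
    CloseDate: closeDate,
    StageName: "Closed Won",
    Amount: invoice.summary.grand_total,
    AccountId: accountId, // Id returned by the Account upsert
  };
}
```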

7) Build SOQL (Code / JS)

  • Purpose: Collect unique product codes from AI JSON and build a SOQL query for PricebookEntry by Pricebook2Id.
  • Key settings:
    • pricebook2Id (hardcoded in script): e.g., 01sxxxxxxxxxxxxxxx
    • Source lines: $('Message a model').first().json.message.content.products
  • Output: { soql, codes }
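
The original script is not shown on this page, so the following is only a minimal sketch of what such a SOQL builder might look like. It assumes each product line carries its code in a name field (as the setup checklist implies); buildSoql and escapeSoql are hypothetical names.

```javascript
// Escape a value for use inside a single-quoted SOQL string literal.
function escapeSoql(value) {
  return String(value).replace(/\\/g, "\\\\").replace(/'/g, "\\'");
}

// Hypothetical builder: unique product codes -> PricebookEntry query.
function buildSoql(products, pricebook2Id) {
  // De-duplicate codes while preserving first-seen order; skip blanks.
  const codes = [...new Set(
    products.map((p) => String(p.name || "").trim()).filter((c) => c !== "")
  )];
  if (codes.length === 0) {
    throw new Error("No product codes found");
  }

  const inList = codes.map((c) => `'${escapeSoql(c)}'`).join(", ");
  const soql =
    "SELECT Id, Product2.ProductCode FROM PricebookEntry " +
    `WHERE Pricebook2Id = '${escapeSoql(pricebook2Id)}' ` +
    `AND Product2.ProductCode IN (${inList})`;
  return { soql, codes };
}
```

Escaping matters here because product codes come from OCR and model output, so a stray quote would otherwise break the query.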

8) Query PricebookEntries (Salesforce)

  • Purpose: Fetch PricebookEntry.Id for each Product2.ProductCode.
  • Key settings:
    • Resource: search
    • Query: ={{ $json.soql }}
    • Creds: salesforceOAuth2Api
  • Output: Items with Id, Product2.ProductCode (used for mapping).

9) Code in JavaScript (Build OLI payloads)

  • Purpose: Join lines with PBE results and Opportunity Id ➜ build OpportunityLineItem payloads.
  • Inputs:
    • OpportunityId: ={{ $('Create an opportunity').first().json.id }}
    • Lines: ={{ $('Message a model').first().json.message.content.products }}
    • PBE rows: from previous node items
  • Output: { body: { allOrNone:false, records:[{ OpportunityLineItem... }] } }
  • Notes:
    • Converts discount_total ➜ per-unit if needed (currently commented out for standard pricing).
    • Throws on missing PBE mapping or empty lines.
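
The join described above could look roughly like the sketch below. It is an illustration rather than the workflow's actual script: buildOliBody is a hypothetical name, and the line fields name, quantity, and unit_price are assumptions about the AI schema. The attributes.type entry is what the Composite sObject Collections endpoint uses to identify each record's object type.

```javascript
// Hypothetical join: invoice lines + PricebookEntry rows -> Composite API body.
function buildOliBody(opportunityId, lines, pbeRows) {
  if (!Array.isArray(lines) || lines.length === 0) {
    throw new Error("No invoice lines to convert");
  }
  // Index PricebookEntry Ids by product code for O(1) lookup.
  const pbeByCode = new Map(
    pbeRows.map((row) => [row.Product2.ProductCode, row.Id])
  );

  const records = lines.map((line) => {
    const pricebookEntryId = pbeByCode.get(line.name);
    if (!pricebookEntryId) {
      throw new Error(`No PricebookEntry for product code "${line.name}"`);
    }
    return {
      attributes: { type: "OpportunityLineItem" },
      OpportunityId: opportunityId,
      PricebookEntryId: pricebookEntryId,
      Quantity: line.quantity,     // assumed field name
      UnitPrice: line.unit_price,  // assumed field name
    };
  });
  return { body: { allOrNone: false, records } };
}
```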

10) Create Opportunity Line Items (HTTP Request)

  • Purpose: Bulk create OLIs via Salesforce Composite API.
  • Key settings:
    • Method: POST
    • URL: https://<your-instance>.my.salesforce.com/services/data/v65.0/composite/sobjects
    • Auth: salesforceOAuth2Api (predefined credential)
    • Body (JSON): ={{ $json.body }}
  • Output: Composite API results (per-record statuses).
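
Because the request is sent with allOrNone:false, individual records can fail while the call as a whole still succeeds, so it may be worth inspecting the per-record statuses. The sketch below assumes the response is the usual array of { id, success, errors } objects returned by the sObject Collections endpoint; summarizeCompositeResults is a hypothetical helper, not part of the original workflow.

```javascript
// Hypothetical helper: split Composite sObject Collections results into
// created Ids and per-record failures.
function summarizeCompositeResults(results) {
  const created = [];
  const failed = [];
  results.forEach((res, index) => {
    if (res.success) {
      created.push(res.id);
    } else {
      failed.push({
        index, // position of the record in the request body
        messages: (res.errors || []).map((e) => `${e.statusCode}: ${e.message}`),
      });
    }
  });
  return { created, failed };
}
```

A follow-up node could then stop the run or send an alert whenever failed is non-empty.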

11) Update File to One Drive

  • Purpose: Archive the original PDF in OneDrive.
  • Key settings:
    • Operation: upload
    • File Name: ={{ $json.name }}
    • Parent Folder ID: <your OneDrive folder ID>
    • Binary Data: true (from the Download node)
    • Creds: microsoftOneDriveOAuth2Api
  • Output: Uploaded file metadata.

Data flow (wiring)

  1. Google Drive Trigger ➜ Download File From Google
  2. Download File From Google ➜
    • Extract from File
    • Update File to One Drive
  3. Extract from File ➜ Message a model
  4. Message a model ➜ Create or update an account
  5. Create or update an account ➜ Create an opportunity
  6. Create an opportunity ➜ Build SOQL
  7. Build SOQL ➜ Query PricebookEntries
  8. Query PricebookEntries ➜ Code in JavaScript
  9. Code in JavaScript ➜ Create Opportunity Line Items

Quick setup checklist

  • 🔐 Credentials: Connect Google Drive, OneDrive, Salesforce, OpenAI.
  • 📂 IDs:
    • Drive Folder ID (watch)
    • OneDrive Parent Folder ID (archive)
    • Salesforce Pricebook2Id (in the JS SOQL builder)
  • 🧠 AI Prompt: Use the strict system prompt; jsonOutput = true.
  • 🧾 Field mappings:
    • Buyer tax id/name → Account upsert fields
    • Invoice code/date/amount → Opportunity fields
    • Product name in the AI JSON must equal the matching Product2.ProductCode in Salesforce.
  • Test: Drop a sample PDF → verify:
    • AI returns array JSON only
    • Account/Opportunity created
    • OLI records created
    • PDF archived to OneDrive

Notes & best practices

  • If PDFs are scans, enable OCR in Extract from File.
  • If AI returns non-JSON, keep “Return only a JSON array” as the last line of the prompt and keep jsonOutput enabled.
  • Consider adding validation on parsing.warnings to gate Salesforce writes.
  • For discounts/taxes in OLI:
    • Standard OLI fields don’t support per-line discount amounts directly; model them in UnitPrice or custom fields.
  • Replace the Composite API URL with your org’s domain or use the Salesforce node’s Bulk Upsert for simplicity.
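
Two of the notes above (gating on parsing.warnings and folding a line discount into UnitPrice) can be sketched as small helpers. Both function names are hypothetical. The sketch assumes parsing.warnings is an array and that discount_total applies to the whole line, and rounding to two decimals is an assumption about your currency.

```javascript
// Hypothetical gate: true when the AI reported any parsing warnings,
// in which case Salesforce writes should probably be skipped or reviewed.
function hasBlockingWarnings(invoice) {
  const warnings = invoice && invoice.parsing && invoice.parsing.warnings;
  return Array.isArray(warnings) && warnings.length > 0;
}

// Hypothetical conversion: spread a line-level discount across units so it
// can be expressed through UnitPrice.
function netUnitPrice(unitPrice, quantity, discountTotal) {
  if (!(quantity > 0)) {
    throw new Error("Quantity must be positive");
  }
  const net = (unitPrice * quantity - discountTotal) / quantity;
  // Round to 2 decimal places (assumed currency precision).
  return Math.round(net * 100) / 100;
}
```

In n8n, the gate would usually sit in an IF node or Code node between "Message a model" and the Account upsert.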

Automate Invoice Processing with OCR, GPT-4, and Salesforce Opportunity Creation

This n8n workflow automates the end-to-end processing of invoices, from detecting a new PDF in Google Drive to extracting key information using AI, and finally creating the Account, Opportunity, and Opportunity Line Items in Salesforce. It replaces a manual, multi-step task, reducing data-entry errors and saving time.

What it does

  1. Monitors Google Drive for New Invoices: The workflow is triggered when a new file (assumed to be an invoice PDF) is created in a specified Google Drive folder, and the file is then downloaded.
  2. Extracts Text from Invoice: The "Extract from File" node reads the text of the PDF, with OCR enabled for scanned documents.
  3. Analyzes Invoice Data with OpenAI (gpt-4.1 or gpt-4.1-mini): The extracted text is sent to the "Message a model" node, which returns normalized JSON containing the buyer (name, tax ID), the invoice code and issue date, the product lines, and the grand total.
  4. Upserts the Account and Creates the Opportunity: The buyer is upserted as a Salesforce Account keyed on the tax_id__c external ID, and a new Opportunity is created with Name, Account, Amount, Close Date, and Stage (Closed Won) populated from the invoice.
  5. Creates Opportunity Line Items (HTTP Request): Two Code nodes and a Salesforce query map each product line to a PricebookEntry, and the HTTP Request node posts the resulting OpportunityLineItem records to the Salesforce Composite API.
  6. Archives the PDF: The original file is uploaded to a OneDrive folder. Any Sticky Note on the canvas is documentation only and does not trigger or execute anything.

Prerequisites/Requirements

To use this workflow, you will need:

  • n8n Instance: A running instance of n8n.
  • Google Drive Account (with credentials configured in n8n) to monitor for new invoice files.
  • Microsoft OneDrive Account (with credentials configured in n8n) to archive the processed PDFs.
  • OpenAI API Key (with access to gpt-4.1 or gpt-4.1-mini) configured as a credential in n8n.
  • Salesforce Account (with credentials configured in n8n) and appropriate permissions to upsert Accounts and to create Opportunities and Opportunity Line Items.
  • A Salesforce Price Book whose entries cover the product codes on your invoices, plus the tax_id__c external ID field on Account.

Setup/Usage

  1. Import the Workflow: Download the provided JSON and import it into your n8n instance.
  2. Configure Credentials:
    • Set up your Google Drive and Microsoft OneDrive credentials.
    • Set up your OpenAI API key credential.
    • Set up your Salesforce credentials.
  3. Configure Cloud Storage Nodes:
    • In the "Google Drive Trigger" node, select your Google Drive credential and specify the folder ID where new invoice files will be uploaded.
    • In the "Update File to One Drive" node, select your OneDrive credential and set the parent folder ID for the archive.
  4. Configure OpenAI Node:
    • In the "Message a model" node, ensure your OpenAI credential is selected.
    • Review and adjust the prompt to accurately extract the desired information from your invoices.
  5. Configure Salesforce Nodes:
    • In each Salesforce node ("Create or update an account", "Create an opportunity", "Query PricebookEntries"), select your Salesforce credential.
    • Map the extracted data from the "Message a model" node to the Account and Opportunity fields (e.g., Opportunity Name, Account ID, Amount, Close Date, Stage).
    • Set your Pricebook2Id in the "Build SOQL" Code node.
  6. Configure HTTP Request Node:
    • In the "Create Opportunity Line Items" node, replace <your-instance> in the Composite API URL with your org's domain and select the Salesforce credential.
  7. Activate the Workflow: Once all configurations are complete, activate the workflow. It will now automatically process new invoices uploaded to your specified cloud storage folder.
