Back to Catalog

Extract and structure invoice data with DocSumo and export to Excel

AnuragAnurag
502 views
2/3/2026
Official Page

Description

This workflow automates the extraction of structured data from invoices or similar documents using Docsumo's API. Users can upload a PDF via an n8n form trigger, which is then sent to Docsumo for processing and structured parsing. The workflow fetches key document metadata and all line items, reconstructs each invoice row with combined header and item details, and finally exports all results as an Excel file. Ideal for automating invoice data entry, reporting, or integrating with accounting systems.

How It Works

  • A user uploads a PDF document using the integrated n8n form trigger.
  • The workflow securely sends the document to Docsumo via REST API.
  • After uploading, it checks and retrieves the parsed document results.
  • Header information and table line items are extracted and mapped into structured records.
  • The complete result is exported as an Excel (.xls) file.

Setup Steps

  1. Docsumo Account:
    • Register and obtain your API key from Docsumo.
  2. n8n Credentials Manager:
    • Add your Docsumo API key as an HTTP header credential (never hardcode the key in the workflow).
  3. Workflow Configuration:
    • In the HTTP Request nodes, set the authentication to your saved Docsumo credentials.
    • Update the file type or document type in the request (e.g., "type": "invoice") as needed for your use case.
  4. Testing:
    • Enable the workflow and use the built-in form to upload a sample invoice for extraction.

Features

  • Supports PDF uploads via n8n’s built-in form or via API/webhook extension.
  • Sends files directly to Docsumo for document data extraction using secure credentials.
  • Extracts invoice-level metadata (number, date, vendor, totals) and full line item tables.
  • Consolidates all data in easy-to-use Excel format for download or integration.
  • Modular node structure, easily extensible for further automation.

Prerequisites

  • Docsumo account with API access enabled.
  • n8n instance with form, HTTP Request, Code, and Excel/Convert to File nodes.
  • Working Docsumo API Key stored securely in n8n’s credential manager.

Example Use Cases

| Scenario | Benefit | |---------------------|-----------------------------------------| | Invoice Automation | Extract line items and metadata rapidly | | Receipts Processing | Parse and digitize business receipts | | Bulk Bill Imports | Batch process bills for analytics |

Notes

  • Credentials Security: Do not store your API key directly in HTTP Request nodes; always use n8n credentials manager.
  • Sticky Notes: The workflow includes sticky notes for setup, input, API call, extraction, and output steps to assist template users.
  • Custom Columns: You can customize header or line item extraction by editing the Code node as needed.

n8n Form to Docsumo and Excel Export Workflow

This n8n workflow provides a streamlined solution for extracting and structuring invoice data submitted via an n8n form, and then exporting that data into an Excel-compatible file. It simplifies the process of collecting invoice information and preparing it for further analysis or record-keeping.

What it does

This workflow automates the following steps:

  1. Triggers on Form Submission: It starts when an n8n form is submitted, likely containing invoice-related data or a file.
  2. Processes Data with Code: A Code node is used to process the incoming data, potentially transforming or preparing it for subsequent steps.
  3. Makes an HTTP Request: An HTTP Request node is included, which could be used to send the processed data to an external API (e.g., Docsumo for OCR and data extraction) or retrieve additional information.
  4. Converts to File: The final step involves converting the processed data into a file format, likely for export as an Excel spreadsheet (.xlsx or .csv).

Prerequisites/Requirements

  • n8n Instance: A running n8n instance to host and execute the workflow.
  • n8n Form: An n8n form configured to collect the initial invoice data.
  • External API (Optional): If the HTTP Request node is configured to interact with a service like Docsumo, an account and API key for that service would be required. The current JSON does not explicitly define the HTTP Request node's configuration, so this is speculative based on the directory name.

Setup/Usage

  1. Import the Workflow: Import the provided JSON into your n8n instance.
  2. Configure the n8n Form Trigger:
    • Open the "On form submission" node.
    • Ensure the form is configured to capture the necessary invoice data.
    • Activate the workflow.
  3. Configure the Code Node:
    • Review and modify the JavaScript code within the "Code" node to match your specific data processing needs. This might involve parsing form fields, preparing data for an API, or structuring it for the final export.
  4. Configure the HTTP Request Node (if applicable):
    • If the HTTP Request node is intended to send data to an external service (e.g., Docsumo), configure its URL, method, headers (including API keys), and body according to the external API's documentation.
  5. Configure the Convert to File Node:
    • Ensure the "Convert to File" node is configured to output the desired file format (e.g., CSV, XLSX) and that the input data matches the expected structure for the conversion.
  6. Activate the Workflow: Once all nodes are configured, activate the workflow.

Upon form submission, the workflow will automatically process the data and generate an output file. You can then download this file from the workflow execution history or configure further nodes to store it in a cloud storage service (e.g., Google Drive, S3).

Related Templates

Auto-create TikTok videos with VEED.io AI avatars, ElevenLabs & GPT-4

💥 Viral TikTok Video Machine: Auto-Create Videos with Your AI Avatar --- 🎯 Who is this for? This workflow is for content creators, marketers, and agencies who want to use Veed.io’s AI avatar technology to produce short, engaging TikTok videos automatically. It’s ideal for creators who want to appear on camera without recording themselves, and for teams managing multiple brands who need to generate videos at scale. --- ⚙️ What problem this workflow solves Manually creating videos for TikTok can take hours — finding trends, writing scripts, recording, and editing. By combining Veed.io, ElevenLabs, and GPT-4, this workflow transforms a simple Telegram input into a ready-to-post TikTok video featuring your AI avatar powered by Veed.io — speaking naturally with your cloned voice. --- 🚀 What this workflow does This automation links Veed.io’s video-generation API with multiple AI tools: Analyzes TikTok trends via Perplexity AI Writes a 10-second viral script using GPT-4 Generates your voiceover via ElevenLabs Uses Veed.io (Fabric 1.0 via FAL.ai) to animate your avatar and sync the lips to the voice Creates an engaging caption + hashtags for TikTok virality Publishes the video automatically via Blotato TikTok API Logs all results to Google Sheets for tracking --- 🧩 Setup Telegram Bot Create your bot via @BotFather Configure it as the trigger for sending your photo and theme Connect Veed.io Create an account on Veed.io Get your FAL.ai API key (Veed Fabric 1.0 model) Use HTTPS image/audio URLs compatible with Veed Fabric Other APIs Add Perplexity, ElevenLabs, and Blotato TikTok keys Connect your Google Sheet for logging results --- 🛠️ How to customize this workflow Change your Avatar: Upload a new image through Telegram, and Veed.io will generate a new talking version automatically. Modify the Script Style: Adjust the GPT prompt for tone (educational, funny, storytelling). Adjust Voice Tone: Tweak ElevenLabs stability and similarity settings. Expand Platforms: Add Instagram, YouTube Shorts, or X (Twitter) posting nodes. Track Performance: Customize your Google Sheet to measure your most successful Veed.io-based videos. --- 🧠 Expected Outcome In just a few seconds after sending your photo and theme, this workflow — powered by Veed.io — creates a fully automated TikTok video featuring your AI avatar with natural lip-sync and voice. The result is a continuous stream of viral short videos, made without cameras, editing, or effort. --- ✅ Import the JSON file in n8n, add your API keys (including Veed.io via FAL.ai), and start generating viral TikTok videos starring your AI avatar today! 🎥 Watch This Tutorial --- 📄 Documentation: Notion Guide Need help customizing? Contact me for consulting and support : Linkedin / Youtube

Dr. FirasBy Dr. Firas
39510

Two-way property repair management system with Google Sheets & Drive

This workflow automates the repair request process between tenants and building managers, keeping all updates organized in a single spreadsheet. It is composed of two coordinated workflows, as two separate triggers are required — one for new repair submissions and another for repair updates. A Unique Unit ID that corresponds to individual units is attributed to each request, and timestamps are used to coordinate repair updates with specific requests. General use cases include: Property managers who manage multiple buildings or units. Building owners looking to centralize tenant repair communication. Automation builders who want to learn multi-trigger workflow design in n8n. --- ⚙️ How It Works Workflow 1 – New Repair Requests Behind the Scenes: A tenant fills out a Google Form (“Repair Request Form”), which automatically adds a new row to a linked Google Sheet. Steps: Trigger: Google Sheets rowAdded – runs when a new form entry appears. Extract & Format: Collects all relevant form data (address, unit, urgency, contacts). Generate Unit ID: Creates a standardized identifier (e.g., BUILDING-UNIT) for tracking. Email Notification: Sends the building manager a formatted email summarizing the repair details and including a link to a Repair Update Form (which activates Workflow 2). --- Workflow 2 – Repair Updates Behind the Scenes:\ Triggered when the building manager submits a follow-up form (“Repair Update Form”). Steps: Lookup by UUID: Uses the Unit ID from Workflow 1 to find the existing row in the Google Sheet. Conditional Logic: If photos are uploaded: Saves each image to a Google Drive folder, renames files consistently, and adds URLs to the sheet. If no photos: Skips the upload step and processes textual updates only. Merge & Update: Combines new data with existing repair info in the same spreadsheet row — enabling a full repair history in one place. --- 🧩 Requirements Google Account (for Forms, Sheets, and Drive) Gmail/email node connected for sending notifications n8n credentials configured for Google API access --- ⚡ Setup Instructions (see more detail in workflow) Import both workflows into n8n, then copy one into a second workflow. Change manual trigger in workflow 2 to a n8n Form node. Connect Google credentials to all nodes. Update spreadsheet and folder IDs in the corresponding nodes. Customize email text, sender name, and form links for your organization. Test each workflow with a sample repair request and a repair update submission. --- 🛠️ Customization Ideas Add Slack or Telegram notifications for urgent repairs. Auto-create folders per building or unit for photo uploads. Generate monthly repair summaries using Google Sheets triggers. Add an AI node to create summaries/extract relevant repair data from repair request that include long submissions.

Matt@VeraisonLabsBy Matt@VeraisonLabs
208

Automate invoice processing with OCR, GPT-4 & Salesforce opportunity creation

PDF Invoice Extractor (AI) End-to-end pipeline: Watch Drive ➜ Download PDF ➜ OCR text ➜ AI normalize to JSON ➜ Upsert Buyer (Account) ➜ Create Opportunity ➜ Map Products ➜ Create OLI via Composite API ➜ Archive to OneDrive. --- Node by node (what it does & key setup) 1) Google Drive Trigger Purpose: Fire when a new file appears in a specific Google Drive folder. Key settings: Event: fileCreated Folder ID: google drive folder id Polling: everyMinute Creds: googleDriveOAuth2Api Output: Metadata { id, name, ... } for the new file. --- 2) Download File From Google Purpose: Get the file binary for processing and archiving. Key settings: Operation: download File ID: ={{ $json.id }} Creds: googleDriveOAuth2Api Output: Binary (default key: data) and original metadata. --- 3) Extract from File Purpose: Extract text from PDF (OCR as needed) for AI parsing. Key settings: Operation: pdf OCR: enable for scanned PDFs (in options) Output: JSON with OCR text at {{ $json.text }}. --- 4) Message a model (AI JSON Extractor) Purpose: Convert OCR text into strict normalized JSON array (invoice schema). Key settings: Node: @n8n/n8n-nodes-langchain.openAi Model: gpt-4.1 (or gpt-4.1-mini) Message role: system (the strict prompt; references {{ $json.text }}) jsonOutput: true Creds: openAiApi Output (per item): $.message.content → the parsed JSON (ensure it’s an array). --- 5) Create or update an account (Salesforce) Purpose: Upsert Buyer as Account using an external ID. Key settings: Resource: account Operation: upsert External Id Field: taxid_c External Id Value: ={{ $json.message.content.buyer.tax_id }} Name: ={{ $json.message.content.buyer.name }} Creds: salesforceOAuth2Api Output: Account record (captures Id) for downstream Opportunity. --- 6) Create an opportunity (Salesforce) Purpose: Create Opportunity linked to the Buyer (Account). Key settings: Resource: opportunity Name: ={{ $('Message a model').item.json.message.content.invoice.code }} Close Date: ={{ $('Message a model').item.json.message.content.invoice.issue_date }} Stage: Closed Won Amount: ={{ $('Message a model').item.json.message.content.summary.grand_total }} AccountId: ={{ $json.id }} (from Upsert Account output) Creds: salesforceOAuth2Api Output: Opportunity Id for OLI creation. --- 7) Build SOQL (Code / JS) Purpose: Collect unique product codes from AI JSON and build a SOQL query for PricebookEntry by Pricebook2Id. Key settings: pricebook2Id (hardcoded in script): e.g., 01sxxxxxxxxxxxxxxx Source lines: $('Message a model').first().json.message.content.products Output: { soql, codes } --- 8) Query PricebookEntries (Salesforce) Purpose: Fetch PricebookEntry.Id for each Product2.ProductCode. Key settings: Resource: search Query: ={{ $json.soql }} Creds: salesforceOAuth2Api Output: Items with Id, Product2.ProductCode (used for mapping). --- 9) Code in JavaScript (Build OLI payloads) Purpose: Join lines with PBE results and Opportunity Id ➜ build OpportunityLineItem payloads. Inputs: OpportunityId: ={{ $('Create an opportunity').first().json.id }} Lines: ={{ $('Message a model').first().json.message.content.products }} PBE rows: from previous node items Output: { body: { allOrNone:false, records:[{ OpportunityLineItem... }] } } Notes: Converts discount_total ➜ per-unit if needed (currently commented for standard pricing). Throws on missing PBE mapping or empty lines. --- 10) Create Opportunity Line Items (HTTP Request) Purpose: Bulk create OLIs via Salesforce Composite API. Key settings: Method: POST URL: https://<your-instance>.my.salesforce.com/services/data/v65.0/composite/sobjects Auth: salesforceOAuth2Api (predefined credential) Body (JSON): ={{ $json.body }} Output: Composite API results (per-record statuses). --- 11) Update File to One Drive Purpose: Archive the original PDF in OneDrive. Key settings: Operation: upload File Name: ={{ $json.name }} Parent Folder ID: onedrive folder id Binary Data: true (from the Download node) Creds: microsoftOneDriveOAuth2Api Output: Uploaded file metadata. --- Data flow (wiring) Google Drive Trigger → Download File From Google Download File From Google → Extract from File → Update File to One Drive Extract from File → Message a model Message a model → Create or update an account Create or update an account → Create an opportunity Create an opportunity → Build SOQL Build SOQL → Query PricebookEntries Query PricebookEntries → Code in JavaScript Code in JavaScript → Create Opportunity Line Items --- Quick setup checklist 🔐 Credentials: Connect Google Drive, OneDrive, Salesforce, OpenAI. 📂 IDs: Drive Folder ID (watch) OneDrive Parent Folder ID (archive) Salesforce Pricebook2Id (in the JS SOQL builder) 🧠 AI Prompt: Use the strict system prompt; jsonOutput = true. 🧾 Field mappings: Buyer tax id/name → Account upsert fields Invoice code/date/amount → Opportunity fields Product name must equal your Product2.ProductCode in SF. ✅ Test: Drop a sample PDF → verify: AI returns array JSON only Account/Opportunity created OLI records created PDF archived to OneDrive --- Notes & best practices If PDFs are scans, enable OCR in Extract from File. If AI returns non-JSON, keep “Return only a JSON array” as the last line of the prompt and keep jsonOutput enabled. Consider adding validation on parsing.warnings to gate Salesforce writes. For discounts/taxes in OLI: Standard OLI fields don’t support per-line discount amounts directly; model them in UnitPrice or custom fields. Replace the Composite API URL with your org’s domain or use the Salesforce node’s Bulk Upsert for simplicity.

Le NguyenBy Le Nguyen
942