
Scrape Trustpilot reviews 📊 with ScrapeGraphAI and OpenAI for reputation analysis

Davide
2/3/2026

This workflow automates the collection, analysis, and reporting of Trustpilot reviews for a specific company using ScrapeGraphAI, transforming unstructured customer feedback into structured insights and actionable intelligence.


Key Advantages

1. ✅ End-to-End Automation

The entire process—from scraping reviews to delivering a polished management report—is fully automated, eliminating manual data collection and analysis.

2. ✅ Structured Insights from Unstructured Data

The workflow transforms raw, unstructured review text into structured fields and standardized sentiment categories, making analysis reliable and repeatable.

3. ✅ Company-Level Reputation Intelligence

Instead of focusing on individual products, the analysis evaluates the overall brand, service quality, customer experience, and operational performance, which is critical for leadership and strategic teams.

4. ✅ Action-Oriented Outputs

The AI-generated report goes beyond summaries by:

  • Identifying reputational risks
  • Highlighting improvement opportunities
  • Proposing concrete actions with priorities, effort estimates, and KPIs

5. ✅ Visual & Executive-Friendly Reporting

Automatic sentiment charts and structured executive summaries make insights immediately understandable for non-technical stakeholders.
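As an illustration, the counts behind such a sentiment chart could be aggregated in an n8n Code node along these lines (the `sentiment` field name is an assumption, not taken from the workflow JSON; only the three category labels come from the workflow description):

```javascript
// Aggregate per-review sentiment labels into counts for a pie chart.
// Assumes each review item carries a `sentiment` field set to one of
// the workflow's three categories.
function sentimentDistribution(reviews) {
  const counts = { Positive: 0, Neutral: 0, Negative: 0 };
  for (const review of reviews) {
    // Ignore anything outside the expected label set.
    if (counts[review.sentiment] !== undefined) {
      counts[review.sentiment] += 1;
    }
  }
  return counts;
}

const demo = sentimentDistribution([
  { sentiment: "Positive" },
  { sentiment: "Positive" },
  { sentiment: "Negative" },
]);
// demo is { Positive: 2, Neutral: 0, Negative: 1 }
```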

6. ✅ Scalable and Configurable

  • Easily adaptable to different companies or review volumes
  • Page limits and batching protect against rate limits and excessive API usage

7. ✅ Cross-Team Value

The output is tailored for multiple internal teams:

  • Management
  • Marketing
  • Customer Support
  • Operations
  • Product & UX

Ideal Use Cases

  • Brand reputation monitoring
  • Voice-of-the-customer programs
  • Executive reporting
  • Customer experience optimization
  • Competitive benchmarking (by reusing the workflow across brands)

How It Works

This workflow automates the complete process of scraping Trustpilot reviews, extracting structured data, analyzing sentiment, and generating comprehensive reports. The workflow follows this sequence:

  1. Trigger & Configuration: The workflow starts with a manual trigger, allowing users to set the target company URL and the number of review pages to scrape.

  2. Review Scraping: An HTTP request node fetches review pages from Trustpilot with pagination support, extracting review links from the HTML content.

  3. Review Processing: The workflow processes individual review pages in batches (limited to 5 reviews per execution for efficiency). Each review page is converted to clean markdown using ScrapeGraphAI.

  4. Data Extraction: An information extractor using OpenAI's GPT-4.1-mini model parses the markdown to extract structured review data including author, rating, date, title, text, review count, and country.

  5. Sentiment Analysis: Another OpenAI model performs sentiment classification on each review text, categorizing it as Positive, Neutral, or Negative.

  6. Data Aggregation: Processed reviews are collected and compiled into a structured dataset.

  7. Analytics & Visualization:

    • A pie chart is generated showing sentiment distribution
    • A comprehensive reputation analysis report is created using an AI agent that evaluates company-level insights, recurring themes, and provides actionable recommendations
  8. Reporting & Delivery: The analysis is converted to HTML format and sent via email, providing stakeholders with immediate insights into customer feedback and company reputation.
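The link-extraction part of step 2 could be sketched in a Code node roughly as follows. The `href` pattern is an assumption about Trustpilot's current markup, not taken from the workflow, and would need checking against the live page:

```javascript
// Sketch of step 2: pull individual review links out of a scraped
// Trustpilot page. The href pattern is an assumption about the
// markup and may need adjusting.
function extractReviewLinks(html) {
  const links = new Set(); // Set dedupes repeated links
  const pattern = /href="(\/reviews\/[0-9a-f]+)"/g;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    links.add(`https://www.trustpilot.com${match[1]}`);
  }
  return [...links];
}

const sample =
  '<a href="/reviews/abc123"></a><a href="/reviews/abc123"></a>';
const found = extractReviewLinks(sample);
```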

Set Up Steps

To configure and run this workflow:

  1. Credential Setup:

    • Configure OpenAI API credentials for the chat models and information extraction
    • Set up ScrapeGraphAI credentials for webpage-to-markdown conversion
    • Configure Gmail OAuth2 credentials for email notifications
  2. Company Configuration:

    • In the "Set Parameters" node, update company_id to the target Trustpilot company URL
    • Adjust max_page to control how many review pages to scrape
  3. Review Processing Limits:

    • The "Limit" node restricts processing to 5 reviews per execution to manage API costs and processing time
    • Adjust this value based on your needs and OpenAI usage limits
  4. Email Configuration:

    • Update the "Send a message" node with the recipient email address
    • Customize the email subject and content as needed
  5. Analysis Customization:

    • Modify the prompt in the "Company Reputation Analyst" node to tailor the report format
    • Adjust sentiment analysis categories if different classification is needed
  6. Execution:

    • Click "Test workflow" to execute the manual trigger
    • Monitor execution in the n8n editor to ensure all API calls succeed
    • Check the configured email inbox for the generated report
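Putting the `company_id` and `max_page` parameters together, the paginated URLs the workflow requests could be built like this (the `?page=` query parameter is an assumption about Trustpilot's pagination scheme, not confirmed by the workflow JSON):

```javascript
// Build the paginated review URLs from the "Set Parameters" values.
// companyId is the Trustpilot review slug; the ?page= parameter is an
// assumption about Trustpilot's pagination.
function buildPageUrls(companyId, maxPage) {
  const urls = [];
  for (let page = 1; page <= maxPage; page += 1) {
    urls.push(`https://www.trustpilot.com/review/${companyId}?page=${page}`);
  }
  return urls;
}

const urls = buildPageUrls("example.com", 3);
```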

Note: Be mindful of API rate limits and costs associated with OpenAI and ScrapeGraphAI services when processing large numbers of reviews. The workflow includes a 5-second delay between paginated requests to comply with Trustpilot's terms of service.
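That throttling could look roughly like this if done in a Code node (n8n's dedicated Wait node is the more idiomatic choice; `fetchPage` here is a placeholder for the actual HTTP call, not a function from the workflow):

```javascript
// Pause between paginated requests so scraping stays polite.
// fetchPage is a placeholder for the actual HTTP request logic.
function delay(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

async function fetchAllPages(urls, fetchPage, pauseMs = 5000) {
  const pages = [];
  for (const url of urls) {
    pages.push(await fetchPage(url));
    await delay(pauseMs); // default mirrors the workflow's 5-second gap
  }
  return pages;
}
```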


👉 Subscribe to my new YouTube channel. Here I’ll share videos and Shorts with practical tutorials and FREE templates for n8n.



Need help customizing?

Contact me for consulting and support, or add me on LinkedIn.

n8n Workflow: Scrape Trustpilot Reviews with AI Analysis

This n8n workflow demonstrates how to scrape reviews from Trustpilot and then use AI (via Langchain and OpenAI) to perform sentiment analysis and extract key information from each review. This allows for automated reputation analysis and data extraction from customer feedback.

What it does

This workflow performs the following steps:

  1. Manual Trigger: The workflow is initiated manually.
  2. Sticky Note: Provides a descriptive note about the workflow's purpose.
  3. HTTP Request: Intended to fetch the Trustpilot review pages.
  4. Edit Fields (Set): Intended to set or modify data fields.
  5. Loop Over Items (Split in Batches): Intended to process items in batches.
  6. Gmail: Intended for Gmail operations (e.g., sending the report email).
  7. Code: Intended to execute custom JavaScript code.
  8. HTML: Intended to parse scraped HTML content.
  9. AI Agent: Intended to act as an AI agent using Langchain.
  10. Basic LLM Chain: Intended to run a basic Large Language Model chain using Langchain.
  11. OpenAI Chat Model: Configures an OpenAI chat model for AI tasks.
  12. Limit: Intended to cap the number of items processed.
  13. Split Out: Intended to split items out of a list or array.
  14. Sentiment Analysis: Performs sentiment analysis on text using Langchain.
  15. Information Extractor: Extracts structured information from text using Langchain.

Important Note on Current Workflow State: In the provided JSON, every node except the "Manual Trigger" and "Sticky Note" is disconnected. As defined, the workflow will only trigger manually and display a sticky note. The scraping, AI-analysis, and reporting nodes are present as individual nodes but are not wired into a functional flow; to use the described features, they must first be connected appropriately.

Prerequisites/Requirements

To make this workflow fully functional (once connected):

  • n8n Instance: A running n8n instance.
  • OpenAI API Key: For the "OpenAI Chat Model" node and potentially other AI-related Langchain nodes (e.g., Sentiment Analysis, Information Extractor).
  • Trustpilot URL: The URL of the Trustpilot page you wish to scrape (would be configured in the HTTP Request or HTML node once connected).
  • Gmail Account (Optional): If the Gmail node were to be connected and used for notifications or reporting.

Setup/Usage

  1. Import the Workflow: Import the provided JSON into your n8n instance.
  2. Connect Nodes: As noted, most functional nodes are disconnected. To enable the scraping and AI analysis, you would need to:
    • Connect the "Manual Trigger" to the "HTTP Request" or "HTML" node to initiate scraping.
    • Connect the output of the scraping node to the "Split Out" and "Loop Over Items" nodes to process individual reviews.
    • Connect the output of the looping mechanism to the "Sentiment Analysis" and "Information Extractor" nodes.
    • Ensure the "OpenAI Chat Model" is linked as a credential or configured within the Langchain nodes.
  3. Configure Credentials:
    • Add your OpenAI API Key as an n8n credential and select it in the relevant Langchain nodes (e.g., "OpenAI Chat Model").
    • If using Gmail, configure your Gmail Account credential.
  4. Configure Node Settings:
    • In the "HTTP Request" or "HTML" node, specify the Trustpilot URL and any necessary scraping parameters (e.g., CSS selectors for reviews, titles, ratings).
    • In the "Sentiment Analysis" and "Information Extractor" nodes, define the input fields containing the review text and the desired output schema for extracted information.
  5. Activate the Workflow: Once configured and connected, activate the workflow.
  6. Execute: Trigger the workflow manually by clicking "Execute Workflow" in the n8n editor.
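For step 4's output schema, a shape along these lines would cover the review attributes described earlier (author, rating, date, title, text, review count, country). The field names and `required` set are illustrative, not taken from the workflow JSON:

```javascript
// Hypothetical JSON Schema for the Information Extractor output,
// covering the review attributes described above. Field names are
// illustrative, not taken from the workflow JSON.
const reviewSchema = {
  type: "object",
  properties: {
    author: { type: "string" },
    rating: { type: "number" },       // 1-5 stars
    date: { type: "string" },         // review date as a string
    title: { type: "string" },
    text: { type: "string" },
    review_count: { type: "number" }, // reviews written by this author
    country: { type: "string" },
  },
  required: ["author", "rating", "text"],
};
```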
