Scrape multi-page websites recursively with Google Sheets storage
Configurable Multi-Page Web Scraper
Introduction
This n8n workflow provides a robust and highly reusable solution for scraping data from paginated websites. Instead of building a complex series of nodes for every new site, you only need to update a simple JSON configuration in the initial Input Node, making your scraping tasks faster and more standardized.
Purpose
The core purpose of this template is to automate the extraction of structured data (e.g., product details, quotes, articles) from websites with multiple pages. It is designed to be fully recursive: it follows the "next page" link until no link is found, aggregates the results from all pages, and cleanly structures the final output into a single list of items.
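The pagination behavior described above can be sketched outside n8n as a simple loop. This is a minimal illustration, not the workflow's actual implementation: `scrape_all`, `fetch_page`, `FAKE_SITE`, and the `max_pages` and `visited` guards are hypothetical names and safeguards introduced here, and the fake site stands in for real HTTP requests.

```python
from typing import Callable, Dict, List, Optional, Tuple
from urllib.parse import urljoin

# One scraped page: (items found on the page, href of the "next" link or None).
Page = Tuple[List[Dict[str, str]], Optional[str]]


def scrape_all(start_url: str,
               fetch_page: Callable[[str], Page],
               max_pages: int = 100) -> List[Dict[str, str]]:
    """Follow "next page" links from start_url and return every item found."""
    results: List[Dict[str, str]] = []
    visited = set()
    url: Optional[str] = start_url
    # Stop when no next link is found; the visited set and max_pages are
    # extra guards (assumptions, not part of the template) against loops.
    while url is not None and url not in visited and len(visited) < max_pages:
        visited.add(url)
        items, next_href = fetch_page(url)
        results.extend(items)
        # Resolve relative hrefs such as "/page/2" against the current URL.
        url = urljoin(url, next_href) if next_href else None
    return results


# Hypothetical three-page site used instead of real network calls.
FAKE_SITE: Dict[str, Page] = {
    "https://example.com/page/1": ([{"quote": "a"}], "/page/2"),
    "https://example.com/page/2": ([{"quote": "b"}], "/page/3"),
    "https://example.com/page/3": ([{"quote": "c"}], None),
}

all_items = scrape_all("https://example.com/page/1", FAKE_SITE.__getitem__)
print(len(all_items))
```

In the workflow itself, the same roles are played by the HTTP Request node (fetching), the HTML extraction step (items and next link), and the If node (the loop condition).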
Setup and Configuration
- Locate the Input Node: The entire configuration for the scraper is held within the first node of the workflow.
- Update the JSON: Replace the existing JSON content with your target website's details:
  - startUrl: The URL of the first page to begin scraping.
  - nextPageSelector: The CSS selector for the "Next" or "Continue" link element that leads to the next page. This is crucial for the pagination loop.
  - fields: An array of objects defining the data to extract on each page. For each field, specify the name (the output key), the selector (the CSS selector pointing to the data), and the value (the HTML attribute to pull, usually text or href).
- Run the Workflow: After updating the configuration, execute the workflow. It will automatically loop through all pages and deliver a final, structured list of the scraped data.
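As an illustration of the configuration shape described in the steps above, the sketch below builds a sample configuration, checks it for the three documented keys, and prints it as JSON. The URL, the selectors, and the `validate_config` helper are hypothetical and only show the structure; the real values depend on your target site.

```python
import json
from typing import List

# Hypothetical example values; replace them with your target site's details.
CONFIG = {
    "startUrl": "https://example.com/quotes/page/1",
    "nextPageSelector": "li.next a",
    "fields": [
        {"name": "quote", "selector": ".quote .text", "value": "text"},
        {"name": "author", "selector": ".quote .author", "value": "text"},
        {"name": "link", "selector": ".quote a", "value": "href"},
    ],
}

REQUIRED_FIELD_KEYS = ("name", "selector", "value")


def validate_config(cfg: dict) -> List[str]:
    """Return a list of problems; an empty list means the shape looks right."""
    problems: List[str] = []
    for key in ("startUrl", "nextPageSelector"):
        if not isinstance(cfg.get(key), str) or not cfg[key]:
            problems.append(f"'{key}' must be a non-empty string")
    fields = cfg.get("fields")
    if not isinstance(fields, list) or not fields:
        problems.append("'fields' must be a non-empty array")
        return problems
    for i, field in enumerate(fields):
        for key in REQUIRED_FIELD_KEYS:
            if not isinstance(field, dict) or not field.get(key):
                problems.append(f"fields[{i}] is missing '{key}'")
    return problems


print(validate_config(CONFIG))
# JSON text in the shape the steps above describe.
print(json.dumps(CONFIG, indent=2))
```

Running a check like this before pasting the JSON into the Input Node can catch a missing selector before the workflow starts looping.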
For a detailed breakdown of the internal logic, including how the loop is constructed using the Set, If, and HTTP Request nodes, refer to the original blog post, "Flexible Web Scraping with n8n: A Configurable, Multi-Page Template".
Scrape Multi-Page Websites Recursively with Google Sheets Storage
This n8n workflow automates the process of recursively scraping data from multi-page websites and storing the extracted information in a Google Sheet. It's designed to efficiently navigate through linked pages, extract specific data, and centralize it for analysis or further processing.
What it does
- Manual Trigger: Initiates the workflow upon manual execution.
- HTTP Request: Fetches the HTML content of the initial web page.
- HTML (Scrape Initial Page): Extracts specific data points from the initial page's HTML.
- If: Checks if there are more pages to scrape (e.g., a "next page" link or pagination indicator).
- If True (More Pages):
  - HTTP Request (Next Page): Fetches the HTML content of the next page.
  - HTML (Scrape Next Page): Extracts data from the subsequent page.
  - Merge: Combines the data from the current page with data from previous pages.
  - Edit Fields (Set): Prepares the combined data for storage, potentially cleaning or reformatting it.
  - If: Loops back to check for more pages, creating a recursive scraping mechanism.
- If False (No More Pages):
  - Aggregate: Collects all scraped data into a single structure.
  - Split Out: Splits the aggregated data back into individual items.
  - Google Sheets: Appends the final, structured data to a specified Google Sheet.
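The final Aggregate and Split Out steps in the list above can be pictured with a small sketch. This is an analogy in plain Python rather than n8n's own code; the `aggregate` and `split_out` helpers and the `data` key are hypothetical names chosen for illustration.

```python
from itertools import chain
from typing import Dict, List

Item = Dict[str, str]


def aggregate(pages: List[List[Item]]) -> Dict[str, List[List[Item]]]:
    """Collect the per-page result lists into a single structure."""
    return {"data": list(pages)}


def split_out(aggregated: Dict[str, List[List[Item]]]) -> List[Item]:
    """Flatten the aggregated structure into one list of individual items."""
    return list(chain.from_iterable(aggregated["data"]))


# Hypothetical results from two scraped pages.
pages = [
    [{"title": "A"}, {"title": "B"}],
    [{"title": "C"}],
]

rows = split_out(aggregate(pages))
print(rows)
```

Each element of `rows` corresponds to one item that would be appended to the sheet.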
Prerequisites/Requirements
- n8n Instance: A running instance of n8n.
- Google Account: A Google account with access to Google Sheets.
- Google Sheets Credential: An n8n credential configured for Google Sheets to allow the workflow to write data.
Setup/Usage
- Import the Workflow: Import the provided JSON into your n8n instance.
- Configure Credentials:
  - Set up a Google Sheets credential in n8n.
- Configure HTTP Request Nodes:
  - Update the "HTTP Request" nodes with the starting URL of the website you want to scrape.
  - Adjust the "HTTP Request (Next Page)" node to correctly identify and fetch subsequent pages (e.g., by extracting the "next page" URL from the initial scrape).
- Configure HTML Nodes:
  - Modify the "HTML (Scrape Initial Page)" and "HTML (Scrape Next Page)" nodes to define the CSS selectors for the data you wish to extract from the web pages (e.g., product names, prices, descriptions, links to other pages).
- Configure If Nodes:
  - Adjust the conditions in the "If" nodes to accurately detect the presence of more pages to scrape. This typically involves checking for the existence of a "next page" link or a specific pagination element.
- Configure Google Sheets Node:
  - Specify the Spreadsheet ID and Sheet Name where you want to store the scraped data.
  - Ensure the column headers in your Google Sheet match the keys of the data being output by the "Edit Fields (Set)" node.
- Activate and Execute:
  - Save and activate the workflow.
  - Click "Execute Workflow" on the "Manual Trigger" node to run it.
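The requirement that sheet column headers match the output keys can be checked before writing anything. The sketch below is a hypothetical pre-check in plain Python, not part of the workflow or the Google Sheets node; `to_rows`, the sample headers, and the sample items are assumptions for illustration.

```python
from typing import Dict, List


def to_rows(items: List[Dict[str, str]], headers: List[str]) -> List[List[str]]:
    """Order each item's values by the sheet headers, rejecting unknown keys."""
    header_set = set(headers)
    rows: List[List[str]] = []
    for index, item in enumerate(items):
        unknown = sorted(set(item) - header_set)
        if unknown:
            raise ValueError(
                f"item {index} has keys with no matching column: {unknown}"
            )
        # Missing keys become empty cells so every row has the same width.
        rows.append([item.get(header, "") for header in headers])
    return rows


# Hypothetical sheet headers and scraped items.
HEADERS = ["name", "price", "link"]
ITEMS = [
    {"name": "Widget", "price": "9.99", "link": "/widget"},
    {"name": "Gadget", "price": "19.99"},
]

print(to_rows(ITEMS, HEADERS))
```

A key such as `Price` instead of `price` would raise an error here, which is the kind of mismatch worth catching before the append step runs.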