Back to Catalog

ScrapingBee and Google Sheets integration template

Sahil SunnySahil Sunny
204 views
2/3/2026
Official Page

This workflow contains community nodes that are only compatible with the self-hosted version of n8n.

Screenshot 20250904 at 9.57.43 AM.png

This workflow allows users to extract sitemap links using ScrapingBee API. It only needs the domain name www.example.com and it automatically checks robots.txt and sitemap.xml to find the links. It is also designed to recursively run the workflow when new .xml links are found while scraping the sitemap.

How It Works

  1. Trigger: The workflow waits for a webhook request that contains domain=www.example.com
  2. It then looks for robots.txt file, if not found it checks sitemap.xml
  3. Once it finds xml links, it recursively scrapes them to extract the website links
  4. For each xml file, first it checks whether it's a binary file and whether it's a compressed xml
  5. If it's a text response, it directly runs a code that extracts normal website link and another code to extract xml links
  6. If it's a binary that is not compressed, it just extracts text from the binary and then extract webiste links and xml links
  7. If it's a compressed binary, it first decompresses it and then extracts the text and then the links and xml
  8. After extracting website links, it appends those links directly to a sheet
  9. After extracting xml links, it scrapes them recursively until it finds all website links

When the workflow is finished, you will see the output in the links column of the Google Sheet that we added to the workflow.

Screenshot 20250904 at 10.08.42 AM.png

Set Up Steps

  1. Get your ScrapingBee API Key here
  2. Create a new google sheet with an empty column named links. Connect to the sheet by signing in using your Google Credential and add the link to your sheet.
  3. Copy the webhook url, and send a GET request with domain as query parameter. Example:
curl "https://webhook_link?domain=scrapingbee.com"

Customisation Options

  1. If the website you are scraping is blocking your request, you can try using premium or stealth proxy in Scrape robots.txt file, Scrape sitemap.xml file, and Scrape xml file nodes.
  2. If you wish to store the data in a different app/tool or store it as a file, you would just need to replace Append links to sheet node with a relevant node.

Next Steps

If you wish to scrape the pages using the extracted links, then you can implement a new workflow that reads the sheet or file (output generated by this workflow) for links and for each link send a request to ScrapingBee's HTML API and save the returned data.


NOTE: Some heavy sitemaps could result in a crash if the workflow consumes more memory than what is available in your n8n plan or self-hosted system. If this happens, we would recommend you to either upgrade your plan or use a self-hosted solution with a higher memory.

n8n ScrapingBee and Google Sheets Integration Template

This n8n workflow provides a template for integrating web scraping results from ScrapingBee with Google Sheets. It demonstrates how to fetch data, process it, and conditionally store it based on specific criteria.

What it does

This workflow is designed to:

  1. Receive a webhook trigger: It starts by listening for incoming data via a webhook. This webhook is expected to contain a binary file, likely a CSV or similar data export from a scraping process.
  2. Decompress the received file: If the incoming binary data is compressed (e.g., a ZIP file), it decompresses it.
  3. Extract data from the file: It then extracts structured data (e.g., CSV content) from the decompressed file.
  4. Process extracted data with custom code: A Code node is used to manipulate the extracted data. This step is where you would implement specific logic to transform or filter the scraped data.
  5. Conditionally filter data: An If node evaluates the processed data based on a defined condition.
  6. Write data to Google Sheets (if condition is met): If the condition in the If node evaluates to true, the data is then appended to a specified Google Sheet.

Prerequisites/Requirements

To use this workflow, you will need:

  • n8n instance: A running n8n instance.
  • Webhook: An external service or application configured to send data to the n8n webhook URL. This data should ideally be a binary file (e.g., a compressed CSV).
  • Google Account: A Google account with access to Google Sheets.
  • Google Sheets Credential: An n8n credential configured for Google Sheets to allow the workflow to write data.
  • ScrapingBee (Optional but implied by directory name): While not directly present in the JSON, the directory name suggests this workflow is intended to work with data from ScrapingBee. You would typically use ScrapingBee to generate the data that your webhook sends.

Setup/Usage

  1. Import the workflow:
    • Download the provided JSON workflow definition.
    • In your n8n instance, go to "Workflows" and click "New".
    • Click the three dots next to "New Workflow" and select "Import from JSON".
    • Paste the JSON content or upload the file.
  2. Configure the Webhook:
    • Open the "Webhook" node.
    • Copy the "Webhook URL". This is the URL you will use to send data to this workflow from your external application (e.g., your ScrapingBee post-processing script).
  3. Configure Google Sheets Credential:
    • Open the "Google Sheets" node.
    • Click "Create New Credential" or select an existing Google Sheets OAuth2 credential.
    • Follow the n8n instructions to authenticate with your Google account and grant necessary permissions.
    • Specify the "Spreadsheet ID" and "Sheet Name" where you want the data to be written.
  4. Customize the Compression Node:
    • If your incoming data is compressed, ensure the "Compression" node is configured for the correct "Operation" (e.g., "Decompress") and "File Format" (e.g., "ZIP").
  5. Customize the Extract from File Node:
    • Configure this node to correctly parse the format of your extracted data (e.g., "CSV", "JSON").
  6. Customize the Code Node:
    • Open the "Code" node and modify the JavaScript code to implement your specific data processing logic. This is where you would transform, clean, or enrich the scraped data before it's stored.
  7. Customize the If Node:
    • Open the "If" node and define the condition(s) for filtering your data. For example, you might want to only store items where a certain field meets a specific value.
  8. Activate the workflow: Once configured, activate the workflow by toggling the "Active" switch in the top right corner of the n8n editor.

Now, whenever data is sent to the webhook URL, the workflow will execute, process the data, and conditionally add it to your Google Sheet.

Related Templates

Generate verified job offer letters with OpenAI, Gmail and Slack

📄 AI-Powered Verified Job Offer Letter Generator Description Creating job offer letters manually is time-consuming, error-prone, and difficult to scale. This AI-powered workflow automates the entire job offer letter process — from validating candidate emails to generating and delivering professional PDF offer letters. This intelligent workflow eliminates repetitive drafting, reduces human errors, and ensures offer letters are sent only to verified email addresses, helping HR teams move faster while maintaining professionalism and accuracy. --- What This Workflow Does Transforms manual offer letter creation into a seamless, automated HR process: 📝 Capture Candidate & Job Details – Receives candidate name, email, job role, salary, joining date, and company details via webhook or form. 📧 Email Verification – Validates the candidate’s email address before sending any communication to prevent delivery errors. 🧠 AI-Powered Offer Letter Generation – Uses AI to generate a clear, professional, and structured job offer letter. 📄 HTML Offer Letter Formatting – Converts the AI-generated content into a clean and readable HTML layout. 📑 PDF Generation – Automatically converts the offer letter into a professional PDF document. 📧 Offer Letter Delivery – Sends the PDF offer letter directly to the verified candidate email. 🗂️ Document Storage – Saves a copy of the offer letter for internal records and future reference. 🔁 Confirmation Response – Returns a success response confirming completion. --- Key Features 🤖 AI Offer Letter Writing – Generates professional, ready-to-send offer letters automatically. 📧 Email Verification Built-In – Ensures offer letters are only sent to valid email addresses. 📑 PDF Generation – Creates clean, official-looking offer letters. ⚙️ End-to-End Automation – No manual drafting, formatting, or sending required. 📂 Centralized Record Keeping – Keeps copies of all generated offer letters. 🔄 Flexible Triggering – Can be triggered from HR systems, forms, or internal tools. --- Perfect For 🏢 HR & Recruitment Teams – Automate offer letter creation and delivery. 🚀 Startups & Growing Companies – Send professional offer letters without extra admin work. 🏫 Staffing & Hiring Agencies – Generate offer letters quickly for multiple candidates. 💻 Remote & Distributed Teams – Ensure consistent communication across locations. 🧠 Operations Teams – Maintain accurate records and reduce manual errors. --- What You’ll Need Required Integrations 🌐 Webhook Trigger – Receives candidate and job details. 🤖 OpenAI – Generates offer letter content. 📧 VerifyEmail – Validates candidate email addresses. 📄 HTMLCSS to PDF – Converts HTML into PDF offer letters. 📧 Gmail – Sends the offer letter email. ☁️ Google Drive (optional) – Stores generated offer letters. --- Optional Enhancements 🎨 Brand Customization – Add company logo, colors, and formatting to offer letters. 🧾 HR System Integration – Connect with ATS or HR tools for automatic triggering. 🌍 Multilingual Offer Letters – Generate offer letters in different languages. 🔐 Approval Step – Add internal approval before sending offer letters. 📊 Audit Logging – Store offer letter data in Google Sheets or databases. 📎 Additional Attachments – Include policies or onboarding documents with the offer letter. --- Quick Start 1️⃣ Import the workflow template into your n8n workspace 2️⃣ Connect credentials for OpenAI, VerifyEmail, Gmail, and HTMLCSS to PDF 3️⃣ Send test candidate data to the webhook 4️⃣ Review the generated PDF offer letter 5️⃣ Activate the workflow and start sending offer letters automatically --- Customization Options 1️⃣ Offer Letter Tone – Adjust AI prompt for formal or friendly tone. 2️⃣ Company Branding – Customize HTML layout and styling. 3️⃣ Email Content – Modify subject line and email message. 4️⃣ PDF Layout – Adjust spacing, fonts, and structure. 5️⃣ Storage Location – Change where offer letters are saved. 6️⃣ Validation Rules – Extend email or input checks. --- Expected Results ⚡ Faster Hiring Process – Generate offer letters in minutes. 🤖 Consistent Quality – Every offer letter follows a professional format. 📧 Error-Free Delivery – Verified emails reduce failed communication. 🗂️ Organized Records – All offer letters stored automatically. 🏢 Professional Candidate Experience – Clean, official documents every time. --- Workflow Structure Visualization 📝 Candidate & Job Details ↓ 📧 Email Verification ↓ 🧠 AI Offer Letter Generation ↓ 📄 HTML Formatting ↓ 📑 PDF Conversion ↓ 📧 Email Delivery ↓ 🔁 Confirmation Response --- 🚀 Ready to Automate Job Offer Letters? Import this template today and let AI handle offer letter creation, verification, and delivery — so your team can focus on hiring the right talent faster. ✨ ---

Jitesh DugarBy Jitesh Dugar
5

Create an AI Telegram bot using Google Drive, Qdrant, and OpenAI GPT-4.1

How it works This workflow creates an intelligent Telegram bot with a knowledge base powered by Qdrant vector database. The bot automatically processes documents uploaded to Google Drive, stores them as embeddings, and uses this knowledge to answer questions in Telegram. It consists of two independent flows: document processing (Google Drive → Qdrant) and chat interaction (Telegram → AI Agent → Telegram). Step-by-step Document Processing Flow: New File Trigger: The workflow starts when the New File Trigger node detects a new file created in the specified Google Drive folder (polling every 15 minutes). Download File: The Download File (Google Drive) node downloads the detected file from Google Drive. Text Splitting: The Split Text into Chunks node splits the document text into chunks of 3000 characters with 300 character overlap for optimal embedding. Load Document Data: The Load Document Data node processes the binary file data and prepares it for vectorization. OpenAI Embeddings: The OpenAI Embeddings node generates vector embeddings for each text chunk. Insert into Qdrant: The Insert into Qdrant node stores the embeddings in the Qdrant vector database collection. Move to Processed Folder: After successful processing, the Move to Processed Folder (Google Drive) node moves the file to a "Qdrant Ready" folder to keep files organized. Telegram Chat Flow: Telegram Message Trigger: The Telegram Message Trigger node receives new messages from the Telegram bot. Filter Authorized User: The Filter Authorized User node checks if the message is from an authorized chat ID (26899549) to restrict bot access. AI Agent Processing: The AI Agent receives the user's message text and processes it using the fine-tuned GPT-4.1 model with access to the Qdrant knowledge base tool. Qdrant Knowledge Base: The Qdrant Knowledge Base node retrieves relevant information from the vector database to provide context for the AI agent's responses. Conversation Memory: The Conversation Memory node maintains conversation history per chat ID, allowing the bot to remember context. Send Response to Telegram: The Send Response to Telegram node sends the AI-generated response back to the user in Telegram. Set up steps Estimated set up time: 15 minutes Google Drive Setup: Add your Google Drive OAuth2 credentials to the New File Trigger, Download File, and Move to Processed Folder nodes. Create two folders in your Google Drive: one for incoming files and one for processed files. Copy the folder IDs from the URLs and update them in the New File Trigger (folderToWatch) and Move to Processed Folder (folderId) nodes. Qdrant Setup: Add your Qdrant API credentials to the Insert into Qdrant and Qdrant Knowledge Base nodes. Create a collection in your Qdrant instance (e.g., "Test-youtube-adept-ecom"). Update the collection name in both Qdrant nodes. OpenAI Setup: Add your OpenAI API credentials to the OpenAI Chat Model and OpenAI Embeddings nodes. (Optional) Replace the fine-tuned model ID in OpenAI Chat Model with your own model or use a standard model like gpt-4-turbo. Telegram Setup: Create a Telegram bot via @BotFather and obtain the bot token. Add your Telegram bot credentials to the Telegram Message Trigger and Send Response to Telegram nodes. Update the authorized chat ID in the Filter Authorized User node (replace 26899549 with your Telegram user ID). Customize System Prompt (Optional): Modify the system message in the AI Agent node to customize your bot's personality and behavior. The current prompt is configured for an n8n automation expert creating social media content. Activate the Workflow: Toggle "Active" in the top-right to enable both the Google Drive trigger and Telegram trigger. Upload a document to your Google Drive folder to test the document processing flow. Send a message to your Telegram bot to test the chat interaction flow.

KonstantinBy Konstantin
167

Sync Zendesk tickets with subsequent comments to Asana tasks

This workflow creates an Asana task when a new ticket is created in Zendesk. Subsequent comments on the ticket in Zendesk are added as comments to the task in Asana. Prerequisites Zendesk account and Zendesk credentials. Asana account and Asana credentials. Asana workspace to create tasks in. How it works The workflow listens for new tickets in Zendesk. When a new ticket is created, the workflow creates a new task in Asana. The Asana GID is then saved in one of the ticket's fields (in setup we call this "Asana GID"). The next time a comment is added to the ticket, the workflow retrieves the Asana GID from the ticket's field and adds the comment to the task in Asana. Setup This workflow requires that you set up a webhook in Zendesk. To do so, follow the steps below: In the workflow, open the On new Zendesk ticket node and copy the webhook URL. In Zendesk, navigate to Admin Center > Apps and integrations > Webhooks > Actions > Create Webhook. Add all the required details which can be retrieved from the On new Zendesk ticket node. The webhook URL gets added to the “Endpoint URL” field, and the “Request method” should match what is shown in n8n. Save the webhook. In Zendesk, navigate to Admin Center > Objects and rules > Business rules > Triggers > Add trigger. Give trigger a name such as “New tickets”. Under “Conditions” in “Meet ALL of the following conditions”, add “Status is New”. Under “Actions”, select “Notify active webhook” and select the webhook you created previously. In the JSON body, add the following: { "id": "{{ticket.id}}", "comment": "{{ticket.latestcommenthtml}}" } Save the Zendesk trigger. You will also need to set up a field in Zendesk to store the Asana GID. To do so, follow the steps below: In Zendesk, navigate to Admin Center > Objects and rules > Tickets > Fields > Add field. Use the number field option and give the field a name such as “Asana GID”. Save the field. In n8n, open the Update ticket node and select the field you created in Zendesk.

n8n TeamBy n8n Team
1060