Real estate intelligence tracker with Bright Data & OpenAI

Ranjan Dailata
2/3/2026

Who this is for

The Real Estate Intelligence Tracker is a powerful automated workflow designed for real estate analysts, investors, proptech startups, and market researchers who need to collect and analyze structured data from real estate listings across the web at scale.

This workflow is tailored for:

  • Real Estate Analysts - Tracking property prices, locations, and market trends

  • Investment Firms - Sourcing high-opportunity listings for portfolio decisions

  • PropTech Developers - Automating listing insights for SaaS platforms

  • Market Researchers - Extracting insights from competitive housing data

  • Growth Teams - Monitoring geographic property trends and pricing fluctuations

What problem is this workflow solving?

Collecting structured real estate listing data from property websites is difficult due to bot protections and unstructured HTML content. Manual data collection is slow and error-prone, and traditional scrapers often get blocked or miss context.

This workflow solves:

  • Automated bypass of anti-bot protection using Bright Data Web Unlocker

  • Conversion of unstructured HTML content into clean text using a Markdown-to-text LLM pipeline

  • Structured extraction of key listing data like price, location, property type, and features using OpenAI

  • Aggregation and delivery of insights to Google Sheets, local storage, and webhook-based alerts

What this workflow does

Convert to Text: Transforms scraped HTML/markdown into clean text using a Basic LLM Chain

Structured Data Extraction: Uses OpenAI GPT-4o with the Information Extractor node to parse property attributes (price, address, area, type, etc.)

Aggregate & Merge: Combines data from multiple pages or listings into a cohesive structure

Outbound Data Handling:

  • Google Sheets – Appends the structured real estate data for further analysis

  • Save to Disk – Persists structured JSON/text data locally

  • Webhook Notification – Sends data alerts or summaries to any third-party platform
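
The aggregation and outbound steps above can be sketched roughly as follows. This is a minimal illustration, not the workflow's actual node logic: the column names, the shape of a listing record, and the webhook body layout are all assumptions made for the example.

```python
import json

# Hypothetical column order for the target Google Sheet.
COLUMNS = ["address", "price", "property_type", "area"]


def aggregate_pages(pages):
    """Flatten a list of per-page listing lists into one list of listings."""
    return [listing for page in pages for listing in page]


def to_sheet_rows(listings, columns=COLUMNS):
    """Turn listing dicts into rows; absent fields become empty cells."""
    return [[listing.get(col, "") for col in columns] for listing in listings]


def to_webhook_body(listings):
    """Serialize a small summary payload for a webhook notification."""
    return json.dumps({"count": len(listings), "listings": listings})


if __name__ == "__main__":
    pages = [
        [{"address": "1 Main St", "price": 500000}],
        [{"address": "2 Oak Ave", "price": 350000, "property_type": "condo"}],
    ]
    listings = aggregate_pages(pages)
    print(to_sheet_rows(listings))
    print(to_webhook_body(listings))
```

Keeping one flat list of records means the same data can feed all three destinations (sheet rows, a JSON file on disk, and a webhook body) without re-running extraction.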

Pre-conditions

  1. You need a Bright Data account, configured as described in the "Setup" section below.
  2. You need an OpenAI account.

Setup

  • Sign up at Bright Data.
  • Navigate to Proxies & Scraping and create a new Web Unlocker zone by selecting Web Unlocker API under Scraping Solutions.
  • In n8n, configure the Header Auth account under Credentials (Generic Auth Type: Header Authentication). Set the Value field to Bearer XXXXXXXXXXXXXX, replacing XXXXXXXXXXXXXX with your Web Unlocker token.
  • In n8n, configure the Google Sheets credentials with your own account. Follow this documentation - Set Google Sheet Credential
  • In n8n, configure the OpenAI account credentials.
  • Ensure the URL and Bright Data zone name are correctly set in the Set URL, Filename and Bright Data Zone node.
  • Set the desired local path in the Write a file to disk node to save the responses.
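
To show what the Header Auth credential and zone name amount to on the wire, here is a minimal sketch that builds (but does not send) a Web Unlocker request. The endpoint and the payload keys ("zone", "url", "format") are assumptions based on Bright Data's public request API and should be checked against the workflow's HTTP Request node; the zone name and target URL are placeholders.

```python
import json
import urllib.request

# Assumption: Web Unlocker is reached through Bright Data's REST request endpoint.
BRIGHT_DATA_ENDPOINT = "https://api.brightdata.com/request"


def build_unlocker_request(token: str, zone: str, url: str) -> urllib.request.Request:
    """Build a POST request for one target URL without sending it."""
    payload = {
        "zone": zone,     # name of the Web Unlocker zone created in Bright Data
        "url": url,       # listing page to fetch
        "format": "raw",  # assumption: ask for the page body as-is
    }
    headers = {
        # Same value as the n8n Header Auth credential: "Bearer <token>".
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        BRIGHT_DATA_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder token, zone, and URL; replace with your own values.
    req = build_unlocker_request(
        "XXXXXXXXXXXXXX", "web_unlocker1", "https://example.com/listings"
    )
    print(req.get_method(), req.full_url)
    print(req.get_header("Authorization"))
```

Passing the returned object to urllib.request.urlopen would perform the call; building it separately makes it easy to inspect the header and body first.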

How to customize this workflow to your needs

Target Multiple Sites or Locations

  • Update the Bright Data URL node dynamically with a list of regional real estate websites

  • Loop through different city/state filter URLs
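
One way to produce the list of city/state filter URLs is sketched below. The base URL and the "city" and "state" query parameter names are hypothetical; real property sites use their own URL schemes, so adapt the pattern to the site being targeted.

```python
from urllib.parse import urlencode


def build_search_urls(base_url, locations):
    """Return one search URL per (city, state) pair."""
    return [
        f"{base_url}?{urlencode({'city': city, 'state': state})}"
        for city, state in locations
    ]


if __name__ == "__main__":
    # Hypothetical site and locations, for illustration only.
    urls = build_search_urls(
        "https://example.com/search",
        [("Austin", "TX"), ("San Jose", "CA")],
    )
    for url in urls:
        print(url)
```

Each generated URL can then be passed to the workflow as a separate item, so the scrape and extraction steps run once per location.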

Customize Extracted Fields

Modify the Information Extractor prompt to extract fields like:

  • Property size, number of bedrooms/bathrooms

  • Days on market

  • Nearby amenities or schools

  • Agent contact details
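
If the Information Extractor is driven by a JSON schema rather than free-text instructions (an assumption; check how the node is configured in your copy of the workflow), the extra fields above could be described along these lines. All field names and types here are illustrative choices, not the template's actual schema.

```python
import json

# Hypothetical schema covering the base fields plus the extra ones listed above.
LISTING_SCHEMA = {
    "type": "object",
    "properties": {
        "price": {"type": "number"},
        "address": {"type": "string"},
        "property_type": {"type": "string"},
        "area_sqft": {"type": "number"},
        "bedrooms": {"type": "integer"},
        "bathrooms": {"type": "number"},
        "days_on_market": {"type": "integer"},
        "nearby_amenities": {"type": "array", "items": {"type": "string"}},
        "agent_contact": {"type": "string"},
    },
    "required": ["price", "address"],
}


def schema_as_json(schema=LISTING_SCHEMA) -> str:
    """Serialize the schema so it can be pasted into a node parameter."""
    return json.dumps(schema, indent=2)


if __name__ == "__main__":
    print(schema_as_json())
```

Marking only price and address as required keeps extraction tolerant of listings that omit optional details such as days on market.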

Integrate with More Destinations

  • Add nodes to export data to Notion, Airtable, HubSpot, or your custom database

  • Generate automated reports using PDF generators and email them

Data Quality and Logging

  • Add validation checks (e.g., missing price or address)

  • Save intermediate files (markdown, raw HTML, JSON output) to disk for audit purposes
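
A validation check for missing price or address might look like the sketch below. The field names and the choice to treat None, empty strings, and empty lists as missing are assumptions for illustration; the same logic could be ported to JavaScript for use in an n8n code node.

```python
def find_missing_fields(listing, required=("price", "address")):
    """Return the required field names that are absent or empty in a listing."""
    return [name for name in required if listing.get(name) in (None, "", [])]


def split_valid_invalid(listings):
    """Separate complete listings from those missing required fields."""
    valid, invalid = [], []
    for listing in listings:
        missing = find_missing_fields(listing)
        if missing:
            # Keep the record and the reason so it can be logged for audit.
            invalid.append({"listing": listing, "missing": missing})
        else:
            valid.append(listing)
    return valid, invalid


if __name__ == "__main__":
    sample = [
        {"price": 450000, "address": "12 Elm St"},
        {"price": None, "address": "9 Pine Rd"},
        {"address": ""},
    ]
    valid, invalid = split_valid_invalid(sample)
    print(len(valid), "valid;", len(invalid), "invalid")
```

Routing the invalid records to a separate file or sheet, rather than dropping them, preserves the audit trail described above.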

Real Estate Intelligence Tracker with Bright Data & OpenAI

This n8n workflow automates the process of extracting, enriching, and storing real estate listing data. It combines web scraping via Bright Data (the specific node is not present in the provided JSON, but its use is implied by the directory name), OpenAI for intelligent data extraction, and Google Sheets for structured storage.

The workflow simplifies the task of monitoring real estate markets by automatically pulling listing information, identifying key details, and organizing them in a spreadsheet.

What it does

  1. Manual Trigger: Initiates the workflow upon manual execution.
  2. Read/Write Files from Disk: (Likely a placeholder or part of a larger external process) Reads data from disk. In the context of "Bright Data," this might involve loading a list of URLs or configurations for scraping.
  3. Edit Fields (Set): Prepares the input for the OpenAI model by setting specific fields, likely formatting the raw real estate listing text into a prompt.
  4. OpenAI Chat Model: Processes the prepared text using an OpenAI chat model (e.g., GPT-4o) to extract structured information.
  5. Information Extractor: Further refines the output from the OpenAI model, focusing on extracting specific data points relevant to real estate listings (e.g., price, address, number of bedrooms).
  6. Basic LLM Chain: (Potentially used for additional processing or re-prompting the LLM) Applies a basic LangChain LLM chain to the extracted information.
  7. Function: Executes custom JavaScript code, likely for data manipulation, transformation, or validation of the extracted real estate data.
  8. Merge: Combines data from different branches or previous steps. This could be merging the original listing data with the newly extracted and enriched information.
  9. HTTP Request: (Likely a placeholder or part of a larger external process) Makes an HTTP request. In the context of "Bright Data," this could be for triggering a scrape job, fetching results, or interacting with another API.
  10. Aggregate: Collects and combines items into a single item or a structured list, preparing the data for final storage.
  11. Google Sheets: Appends the processed and enriched real estate listing data as new rows to a specified Google Sheet.
  12. Sticky Note: Provides a comment or note within the workflow for documentation purposes.

Prerequisites/Requirements

  • n8n Instance: A running n8n instance (cloud or self-hosted).
  • OpenAI API Key: An API key for accessing the OpenAI API, configured as an n8n credential.
  • Google Account: A Google account with access to Google Sheets, configured as an n8n credential.
  • Bright Data Account: (Implied by directory name, but not explicitly in JSON) If using Bright Data for web scraping, an account and associated API keys/credentials would be required.

Setup/Usage

  1. Import the Workflow: Download the provided JSON and import it into your n8n instance.
  2. Configure Credentials:
    • Set up your OpenAI API Key credential.
    • Set up your Google Sheets credential, granting it access to the target spreadsheet.
    • (If applicable) Configure any Bright Data credentials if you integrate a Bright Data node.
  3. Customize Nodes:
    • Read/Write Files from Disk: Adjust the file path if you're loading initial data from disk.
    • Edit Fields (Set): Review and adjust the fields being set to match your input data structure.
    • OpenAI Chat Model: Select the desired OpenAI model and review the prompt for optimal information extraction.
    • Information Extractor: Define the schema or instructions for the specific real estate data points you want to extract.
    • Function: Modify the JavaScript code if custom data transformations are needed.
    • Google Sheets: Specify the Spreadsheet ID and Sheet Name where the data should be appended.
  4. Activate the Workflow: Once configured, activate the workflow.
  5. Execute: Manually trigger the workflow using the "When clicking ‘Execute workflow’" node to start processing.
