
Dynamic AI web researcher: From plain text to custom CSV with GPT-4 and Linkup

Guillaume Duvernay
2/3/2026

This template introduces a revolutionary approach to automated web research. Instead of a rigid workflow that can only find one type of information, this system uses a "thinker" and "doer" AI architecture. It dynamically interprets your plain-English research request, designs a custom spreadsheet (CSV) with the perfect columns for your goal, and then deploys a web-scraping AI to fill it out.

It's like having an expert research assistant who not only finds the data you need but also builds the perfect container for it on the fly. Whether you're looking for sales leads, competitor data, or market trends, this workflow adapts to your request and delivers a perfectly structured, ready-to-use dataset every time.

Who is this for?

  • Sales & marketing teams: Generate targeted lead lists, compile competitor analysis, or gather market intelligence with a simple text prompt.
  • Researchers & analysts: Quickly gather and structure data from the web for any topic without needing to write custom scrapers.
  • Entrepreneurs & business owners: Perform rapid market research to validate ideas, find suppliers, or identify opportunities.
  • Anyone who needs structured data: Transform unstructured, natural language requests into clean, organized spreadsheets.

What problem does this solve?

  • Eliminates rigid, single-purpose workflows: This workflow isn't hardcoded to find just one thing. It dynamically adapts its entire research plan and data structure based on your request.
  • Automates the entire research process: It handles everything from understanding the goal and planning the research to executing the web search and structuring the final data.
  • Bridges the gap between questions and data: It translates your high-level goal (e.g., "I need sales leads") into a concrete, structured spreadsheet with all the necessary columns (Company Name, Website, Key Contacts, etc.).
  • Optimizes for cost and efficiency: It intelligently uses a combination of deep-dive and standard web searches from Linkup.so to gather high-quality initial results and then enrich them cost-effectively.
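
To make that cost trade-off concrete, here is a minimal Python sketch (outside of n8n) showing how the two Linkup search modes differ only by a depth parameter. The endpoint and field names follow Linkup's public API documentation but may differ from your account or the workflow's exact HTTP Request configuration, and the example queries and company name are invented, so treat this as illustrative only.

```python
import os
import requests

LINKUP_API_URL = "https://api.linkup.so/v1/search"  # verify against Linkup's current docs
HEADERS = {"Authorization": f"Bearer {os.environ['LINKUP_API_KEY']}"}

def linkup_search(query: str, depth: str) -> dict:
    """Run one Linkup web search.

    depth="deep" digs further and costs more credits (used once, for discovery);
    depth="standard" is fast and cheap (used once per item, for enrichment).
    """
    payload = {"q": query, "depth": depth, "outputType": "sourcedAnswer"}
    response = requests.post(LINKUP_API_URL, json=payload, headers=HEADERS, timeout=120)
    response.raise_for_status()
    return response.json()

# One expensive deep search to build the initial list...
discovery = linkup_search("List 50 US-based fashion companies for B2B sales outreach", depth="deep")
# ...then many cheap standard searches, one per discovered item, to fill in details.
details = linkup_search("Official website and key contacts for 'Acme Apparel Inc.'", depth="standard")
```

In the workflow itself, these calls are made by the Linkup HTTP Request nodes rather than Python; the sketch only shows why one deep search plus many standard searches keeps the cost per run low.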

How it works (The "Thinker & Doer" Method)

The process is cleverly split into two main phases:

  1. The "Thinker" (AI Planner): You submit a research request via the built-in form (e.g., "Find 50 US-based fashion companies for a sales outreach campaign").
    • The first AI node acts as the "thinker." It analyzes your request and determines the optimal structure for your final spreadsheet.
    • It dynamically generates a plan, which includes a discoveryQuery to find the initial list, an enrichmentQuery to get details for each item, and the JSON schemas that define the exact columns for your CSV (an example plan is sketched after this list).
  2. The "Doer" (AI Researcher): The rest of the workflow is the "doer," which executes the plan.
    • Discovery: It uses a powerful "deep search" with Linkup.so to execute the discoveryQuery and find the initial list of items (e.g., the 50 fashion companies).
    • Enrichment: It then loops through each item in the list. For each one, it performs a fast and cost-effective "standard search" with Linkup to execute the enrichmentQuery, filling in all the detailed columns defined by the "thinker."
    • Final Output: The workflow consolidates all the enriched data and converts it into a final CSV file, ready for download or further processing.
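
As a rough illustration of the plan the "thinker" hands to the "doer", here is a hedged Python sketch. Only the discoveryQuery and enrichmentQuery field names come from the description above; the schema shapes, column names, and the structured-output call are assumptions based on Linkup's documented structured search option, so your workflow's actual JSON may look different.

```python
import json
import os
import requests

# Example of what the "thinker" might return. Only discoveryQuery and enrichmentQuery
# are named in the template description; the schemas and columns are illustrative.
plan = {
    "discoveryQuery": "List 50 US-based fashion companies suitable for B2B sales outreach",
    "enrichmentQuery": "For {companyName}, find the official website, headquarters city, "
                       "approximate employee count, and one key sales contact.",
    "discoverySchema": {                      # shape of the deep-search result: just the list
        "type": "object",
        "properties": {"companies": {"type": "array", "items": {
            "type": "object", "properties": {"companyName": {"type": "string"}}}}},
    },
    "enrichmentSchema": {                     # one object per CSV row, i.e. the columns
        "type": "object",
        "properties": {
            "companyName": {"type": "string"},
            "website": {"type": "string"},
            "headquarters": {"type": "string"},
            "employeeCount": {"type": "string"},
            "keyContact": {"type": "string"},
        },
    },
}

def linkup_structured(query: str, schema: dict, depth: str) -> dict:
    """Linkup search that asks for output matching a JSON schema.
    Parameter names follow Linkup's public API docs; double-check against your account."""
    resp = requests.post(
        "https://api.linkup.so/v1/search",
        headers={"Authorization": f"Bearer {os.environ['LINKUP_API_KEY']}"},
        json={"q": query, "depth": depth, "outputType": "structured",
              "structuredOutputSchema": json.dumps(schema)},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()

# The "doer": one deep discovery search, then a cheap standard search per item.
discovered = linkup_structured(plan["discoveryQuery"], plan["discoverySchema"], depth="deep")
rows = []
for company in discovered.get("companies", []):
    query = plan["enrichmentQuery"].format(companyName=company["companyName"])
    rows.append(linkup_structured(query, plan["enrichmentSchema"], depth="standard"))
# `rows` now holds one dict per company, matching the columns the "thinker" defined.
```

In the actual workflow, the enrichment loop is handled by n8n's looping over items and the two Linkup HTTP Request nodes; the sketch only mirrors that flow so the plan-to-CSV pipeline is easier to picture.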

Setup

  1. Connect your AI provider: In the OpenAI Chat Model node, add your AI provider's credentials.
  2. Connect your Linkup account: In the two Linkup (HTTP Request) nodes, add your Linkup API key (free accounts are available at linkup.so). We recommend creating a "Generic Credential" of type "Bearer Token" for this. Linkup offers €5 of free credits monthly, enough for about 1,000 standard searches or 100 deep searches; a rough per-run cost estimate is sketched after these steps.
  3. Activate the workflow: Toggle the workflow to "Active." You can now use the form to submit your first research request!
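
As a back-of-the-envelope budget check, assuming the free-tier figures quoted above (€5 per month, roughly 1,000 standard or 100 deep searches), a single 50-item research run costs on the order of a few tens of cents:

```python
# Rough credit math based on the free-tier figures above (approximate).
COST_STANDARD = 5 / 1000   # ≈ €0.005 per standard search
COST_DEEP = 5 / 100        # ≈ €0.05 per deep search

items = 50                                        # e.g. a 50-company research request
run_cost = 1 * COST_DEEP + items * COST_STANDARD  # 1 discovery + 1 enrichment per item
print(f"≈ €{run_cost:.2f} per run, ≈ {int(5 // run_cost)} runs on the free tier")
# -> ≈ €0.30 per run, ≈ 16 runs on the free tier
```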

Taking it further

  • Add a custom dashboard: Replace the form trigger and final CSV output with a more polished user experience. For example, build a simple web app where users can submit requests and download their completed research files.
  • Make it company-aware: Modify the "thinker" AI's prompt to include context about your company (a sample context block is sketched after this list). This will allow it to generate research plans that are automatically tailored to finding leads or data relevant to your specific products and services.
  • Add an AI summary layer: After the CSV is generated, add a final AI node to read the entire file and produce a high-level summary, such as "Here are the top 5 leads to contact first and why," turning the raw data into an instant, actionable report.
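
For the company-aware idea above, a hypothetical context block you might prepend to the "thinker" prompt could look like the following; the company, products, and suggested columns are invented purely for illustration.

```python
# Hypothetical context block to prepend to the "thinker" system prompt.
# Company name, products, and column suggestions below are invented examples.
COMPANY_CONTEXT = """
You are planning research for Acme Analytics, a B2B SaaS vendor selling
inventory-forecasting software to mid-size fashion and apparel retailers.
When the user asks for leads, prefer columns relevant to this sales motion,
e.g. e-commerce platform used, estimated SKU count, and current forecasting tools.
"""
```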

Dynamic AI Web Researcher: From Plain Text to Custom CSV with GPT-4 and Linkup

This n8n workflow automates web research based on a plain text input, leveraging GPT-4 for summarization and the Linkup web search service to gather relevant information. The results are then compiled into a structured CSV file.

What it does

This workflow streamlines your research process by:

  1. Receiving a research query: It starts by accepting a plain text input, likely a research topic or a list of items to research, via an n8n form.
  2. Preparing the query for AI: The input is processed to ensure it's in a suitable format for the AI model.
  3. Generating search terms with AI: A Basic LLM Chain (likely powered by GPT-4 via OpenAI Chat Model) takes the initial query and generates specific search terms or research questions.
  4. Performing web searches: For each generated search term, an HTTP Request node (likely configured to interact with a web scraping or search API like Linkup) fetches relevant web data.
  5. Extracting and structuring data: The raw web data is then processed and transformed into a structured format. This might involve parsing JSON responses or extracting specific fields.
  6. Summarizing and refining with AI: The gathered web data is fed back into the Basic LLM Chain (GPT-4) to summarize findings, extract key insights, or answer specific questions based on the research.
  7. Structuring output for CSV: The AI-processed data is further refined and prepared for tabular output using Edit Fields (Set) and Split Out nodes.
  8. Generating a CSV file: Finally, the structured data is converted into a CSV file, ready for download or further use.
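
Outside n8n, the equivalent of steps 7 and 8 is roughly the following Python sketch; the field names are placeholders, since the real columns are whatever the upstream AI step defines.

```python
import csv
import io

# Stand-in for the Edit Fields / Split Out / Convert to File steps:
# flatten one dict per researched item into CSV text. Column names are examples only.
rows = [
    {"companyName": "Acme Apparel Inc.", "website": "https://example.com",
     "headquarters": "New York, NY", "keyContact": "Jane Doe (Head of Sales)"},
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())   # CSV text ready to be saved or attached downstream
```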

Prerequisites/Requirements

To use this workflow, you will need:

  • n8n instance: A running n8n instance to import and execute the workflow.
  • OpenAI API Key: For the OpenAI Chat Model node (likely GPT-4). You will need to configure an OpenAI credential in n8n.
  • Web Scraping/Search API Key: An API key for a web scraping or search service (e.g., Linkup or the Google Search API) that the HTTP Request node is configured to use. You will need to configure the appropriate credential in n8n.

Setup/Usage

  1. Import the workflow: Download the workflow JSON and import it into your n8n instance.
  2. Configure Credentials:
    • Set up your OpenAI API Key credential within n8n.
    • Set up the credential for your Web Scraping/Search API (e.g., Linkup) within n8n.
  3. Activate the workflow: Ensure the workflow is active.
  4. Access the n8n Form Trigger: The workflow is triggered by an n8n form. You can find the URL for this form by clicking on the "n8n Form Trigger" node and looking for the "Webhook URL" or "Form URL" in its settings.
  5. Submit your research query: Open the form URL in your browser, enter your plain text research query, and submit it.
  6. Receive the CSV: After the workflow runs, the generated CSV file will be available in the output of the final Convert to File node. Depending on your n8n setup, you might download it directly from the execution logs or configure a subsequent node to send it to a specific destination (e.g., email, cloud storage).

Related Templates

Synchronizing WooCommerce inventory and creating products with Google Gemini AI and BrowserAct


By Madame AI Team | Kai

AI website scraper & company intelligence


By DIGITAL BIZ TECH

Tax deadline management & compliance alerts with GPT-4, Google Sheets & Slack


By Oneclick AI Squad