Scrape ProductHunt using Google Gemini
Workflow Description: Product Data Extractor

This workflow automates the extraction of product data from Product Hunt by combining webhook interactions, HTML processing, AI-based data analysis, and structured output formatting. It is designed to handle incoming requests dynamically and return detailed JSON responses for further usage.

Overview

The workflow processes a product name submitted through a webhook. It fetches the corresponding Product Hunt page, extracts and analyzes inline scripts, and structures the data into a well-defined JSON format using AI tools. The final JSON response is returned to the client through the webhook.

Workflow Steps

Webhook Listener
Node: Receive Product Request
Function: Captures incoming requests containing the product name to process.
Details: Accepts HTTP requests and extracts the product parameter from the query string, such as <customwebhookurl>/?product=epigram.

Fetch Product HTML
Node: Fetch Product HTML
Function: Sends an HTTP request to retrieve the HTML content of the specified Product Hunt page.
Details: Constructs a dynamic URL using the product name and fetches the page data.

Extract Inline Scripts
Node: Extract Inline Scripts
Function: Parses the HTML content to extract inline scripts located within the <head> section.
Details: Excludes scripts that have a src attribute and validates that at least one inline script is present.

Process Data with LLM
Node: Process Script with LLM
Function: Analyzes the extracted scripts using a language model to identify key product data.
Details: Processes the script to derive structured and meaningful insights.

Refine Data with Google Gemini
Node: Analyze Script with Google Gemini
Function: Leverages Google Gemini AI for enhanced analysis of script data.
Details: Ensures the extracted data is precise and enriched.

Format Product Data to JSON
Node: Format Product Data to JSON
Function: Structures the processed data into a clean JSON format.
Details: Defines a schema to ensure all relevant fields are included in the output.

Send JSON Response to Client
Node: Send JSON Response to Client
Function: Returns the final structured JSON response to the client.
Details: Sends the response back via the same webhook that initiated the request. For example, <customwebhookurl>.

Key Features

Versatile Use Cases: This workflow can be used to gather Product Hunt data for creating blog posts or as a tool for AI agents to research products efficiently.
Dynamic Processing: Adapts to various product names through dynamic URL construction.
AI Integration: Uses the Gemini 1.5 Flash-8B model for data extraction and refinement, offering reduced latency and minimal or no cost depending on the use case.
Selector Independence: Functions even if Product Hunt's DOM structure changes, as it does not rely on direct DOM selectors.
Reliable Data Output: A temperature of 0 and a precisely defined JSON schema keep the output accurate and grounded in the data actually present on the page.
Structured Output: Ensures the output JSON adheres to a predefined schema for consistency.
Error Handling: Includes validations to handle missing or malformed data gracefully.

Customization Options

Modify the webhook path to suit your application.
Adjust the prompt for the language model to include additional fields.
Extend the JSON schema to capture more data fields as needed.

Limitations

Dependency on Product Hunt: Significant changes to the way Product Hunt loads data on its pages might require modifications to the workflow.
Adaptability: Even if such changes occur, the workflow can be updated to keep working because it relies on AI rather than direct DOM selectors.

Performance Metrics

Response Time: Typically ~6 seconds per product.
Accuracy: Data is extracted with >95% precision, aided by the pre-defined JSON schema.

Expected Output
A JSON object containing detailed information about the specified product. Below is an example of a complete response for the product Epigram:

    {
      "id": "861675",
      "slug": "epigram",
      "followersCount": 181,
      "name": "Epigram",
      "tagline": "Open-Source, Free, and AI-Powered News in Short",
      "reviewsRating": 0,
      "logoUuid": "735c2528-554c-467c-9dcf-745ee4b8bbdd.png",
      "postsCount": 1,
      "websiteUrl": "https://epigram.news",
      "websiteDomain": "epigram.news",
      "metaTitle": "Epigram - Open-source, free, and ai-powered news in short",
      "postName": "Epigram",
      "postTagline": "Open-source, free, and ai-powered news in short",
      "dailyRank": "3",
      "description": "An open-source, AI-powered news app for busy people. Stay updated with bite-sized news, real-time updates, and in-depth analysis. Experience balanced, trustworthy reporting tailored for fast-paced lifestyles in a sleek, user-friendly interface.",
      "pricingType": "free",
      "userName": "Fazle Rahman",
      "userHeadline": "Co-founder & CEO, Hashnode",
      "userUsername": "fazlerocks",
      "userAvatarUrl": "https://ph-avatars.imgix.net/129147/f84e1796-548b-4d6f-9dcf-745ee4b8bbdd.jpeg",
      "makerName1": "Fazle Rahman",
      "makerHeadline1": "Co-founder & CEO, Hashnode",
      "makerUsername1": "fazlerocks",
      "makerAvatarUrl1": "https://ph-avatars.imgix.net/129147/f84e1796-548b-4d6f-9dcf-745ee4b8bbdd.jpeg",
      "makerName2": "Sandeep Panda",
      "makerHeadline2": "Co-Founder @ Hashnode",
      "makerUsername2": "sandeepg33k",
      "makerAvatarUrl2": "https://ph-avatars.imgix.net/101872/80b0b618-a540-4110-a6d1-74df39675ad0.jpeg",
      "primaryLinkUrl": "https://epigram.news/",
      "media1OriginalHeight": 1080,
      "media1OriginalWidth": 1440,
      "media1ImageUuid": "ac426fd1-3854-4734-b43d-34a5e06347ea.gif",
      "media1MediaType": "video",
      "media1MetadataUrl": "https://www.loom.com/share/b1a48a9b3cac4ba89ce772a3fbcc2847?sid=75efc771-25fa-4ac0-bb1b-5e38fc447deb",
      "media1VideoId": "b1a48a9b3cac4ba89ce772a3fbcc2847",
      "media2OriginalHeight": 630,
      "media2OriginalWidth": 1200,
      "media2ImageUuid": "8521a6bd-7640-487b-abd6-29b9f65fee32",
      "media2MediaType": "image",
      "media2MetadataUrl": null,
      "launchState": "featured",
      "thumbnailImageUuid": "735c2528-554c-467c-9dcf-745ee4b8bbdd.png",
      "link1StoreName": "Website",
      "link1WebsiteName": "epigram.news",
      "link2StoreName": "Github",
      "link2WebsiteName": "github.com",
      "latestScore": 233,
      "launchDayScore": 233,
      "userId": "129147",
      "topic1": "News",
      "topic2": "Open Source",
      "topic3": "Artificial Intelligence",
      "weeklyRank": "24",
      "commentsCount": 20,
      "postUrl": "https://www.producthunt.com/posts/epigram"
    }

Target Audience

This workflow is ideal for developers, marketers, and data analysts seeking to automate the extraction and structuring of product data from Product Hunt for analytics, reporting, or integration with other tools.
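The "Fetch Product HTML" and "Extract Inline Scripts" steps described in the workflow above can be illustrated with a minimal Python sketch. This is not the code inside the n8n nodes. The function and class names are hypothetical, and the URL pattern is an assumption inferred from the `postUrl` field in the example response.

```python
from html.parser import HTMLParser
from urllib.parse import quote


def build_product_url(product: str) -> str:
    """Build the page URL from the product name (the URL pattern is an assumption)."""
    return f"https://www.producthunt.com/posts/{quote(product.strip().lower())}"


class HeadInlineScripts(HTMLParser):
    """Collect the text of <script> tags inside <head> that have no src attribute."""

    def __init__(self) -> None:
        super().__init__()
        self.in_head = False
        self.in_inline_script = False
        self.scripts: list[str] = []
        self._chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "script" and self.in_head:
            # Skip external scripts: anything carrying a src attribute.
            if not any(name == "src" for name, _ in attrs):
                self.in_inline_script = True
                self._chunks = []

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False
        elif tag == "script" and self.in_inline_script:
            body = "".join(self._chunks).strip()
            if body:
                self.scripts.append(body)
            self.in_inline_script = False

    def handle_data(self, data):
        # Script content may arrive in several chunks, so accumulate it.
        if self.in_inline_script:
            self._chunks.append(data)


def extract_inline_scripts(html: str) -> list[str]:
    """Return inline <head> scripts, raising if none are present."""
    parser = HeadInlineScripts()
    parser.feed(html)
    parser.close()
    if not parser.scripts:
        raise ValueError("no inline scripts found in <head>")
    return parser.scripts
```

In a setup like this, the strings returned by `extract_inline_scripts` would be passed to the language model together with the JSON schema, which is how the workflow avoids depending on DOM selectors for individual fields.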
Automated candidate screening & response using GPT-4, Mistral OCR and Slack notifications
📊 Description

Streamline your HR recruitment process with this intelligent automation that reads candidate emails and resumes, analyzes them using GPT-4, and automatically shortlists or rejects applicants based on skill and experience match. 📩🤖 The workflow updates your HR Google Sheet with detailed AI evaluations, notifies recruiters on Slack about high-scoring candidates, and sends personalized shortlist or rejection emails to applicants, all in one seamless flow. 🚀

What This Template Does

1️⃣ Trigger – Monitors the HR Gmail inbox for new job applications with attachments. 📬
2️⃣ Extracts Resume Data – Uploads attached resumes to Mistral OCR to extract text for analysis. 📄
3️⃣ Combines Inputs – Merges candidate email data and resume content for complete context. 🔗
4️⃣ AI Evaluation – GPT-4 analyzes the candidate’s qualifications against job requirements in a connected Google Sheet. 🧠
5️⃣ Scoring & Recommendation – Generates a structured JSON output with job fit summary, skill match, AI score, and recommendation (Shortlist or Reject). 📊
6️⃣ Record Update – Logs AI evaluation results in a Google Sheet for centralized tracking. 📋
7️⃣ Communication – Sends professional shortlist or rejection emails to applicants via Gmail. 💌
8️⃣ Team Alert – Notifies HR on Slack when a high-scoring candidate is detected. 🔔

Key Benefits

✅ Saves hours of manual resume screening and sorting
✅ Ensures consistent, unbiased candidate evaluation
✅ Provides detailed AI-driven insights for every applicant
✅ Automates communication and record-keeping
✅ Improves HR productivity and response speed

Features

Gmail trigger for new candidate emails
Resume text extraction via Mistral OCR API
GPT-4–powered resume and email evaluation
Integration with Google Sheets for HR requirement mapping
Slack notifications for shortlisted candidates
Automated shortlist/rejection emails with custom templates
Structured AI output for analytics and reporting

Requirements

Gmail OAuth2 credentials for inbox and email automation
Google Sheets OAuth2 credentials with edit access
OpenAI API key (GPT-4 or GPT-4o-mini)
Slack Bot token with chat:write permissions
Mistral AI OCR API key for resume text extraction

Target Audience

HR and recruitment teams managing large applicant volumes 🧑‍💼
Talent acquisition managers looking for AI-driven screening 🤖
Organizations standardizing hiring communication 💬
Agencies building automated candidate evaluation systems 📈

Step-by-Step Setup Instructions

1️⃣ Connect your Gmail account and configure the inbox trigger.
2️⃣ Add Mistral API credentials for resume OCR extraction.
3️⃣ Set up your Google Sheet with job role requirements and access credentials.
4️⃣ Add OpenAI credentials (GPT-4 or GPT-4o-mini) for AI evaluation.
5️⃣ Configure Slack credentials and HR channel ID for alerts.
6️⃣ Test with a sample application to ensure correct data mapping.
7️⃣ Activate the workflow to start automated recruitment processing. ✅
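The scoring, communication, and alert steps in this template (Scoring & Recommendation, Record Update, Communication, Team Alert) amount to parsing a structured evaluation and branching on it. The Python sketch below shows one way that branching could look. The JSON field names, the 0-100 score scale, the threshold of 80, and the action names are all assumptions made for illustration; the template's actual schema and cutoff may differ.

```python
import json
from dataclasses import dataclass

HIGH_SCORE_THRESHOLD = 80  # hypothetical cutoff; assumes a 0-100 score scale


@dataclass
class Evaluation:
    job_fit_summary: str
    skill_match: str
    ai_score: int
    recommendation: str  # expected to be "Shortlist" or "Reject"


def parse_evaluation(raw: str) -> Evaluation:
    """Parse the model's JSON reply and reject anything outside the expected shape."""
    data = json.loads(raw)
    evaluation = Evaluation(
        job_fit_summary=str(data["job_fit_summary"]),
        skill_match=str(data["skill_match"]),
        ai_score=int(data["ai_score"]),
        recommendation=str(data["recommendation"]),
    )
    if evaluation.recommendation not in ("Shortlist", "Reject"):
        raise ValueError(f"unexpected recommendation: {evaluation.recommendation!r}")
    if not 0 <= evaluation.ai_score <= 100:
        raise ValueError(f"score out of range: {evaluation.ai_score}")
    return evaluation


def plan_actions(evaluation: Evaluation) -> list[str]:
    """Return the follow-up actions in the order the workflow would run them."""
    actions = ["update_sheet"]  # every evaluation is logged
    if evaluation.recommendation == "Shortlist":
        actions.append("send_shortlist_email")
    else:
        actions.append("send_rejection_email")
    if evaluation.ai_score >= HIGH_SCORE_THRESHOLD:
        actions.append("notify_slack")  # alert HR only for high scores
    return actions
```

Validating the reply before acting on it means a malformed model response stops the run instead of sending the wrong email to a candidate.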
Extract website intelligence & classify ecommerce URLs with Gemini & Firecrawl to Google Sheets
Description

This n8n template automates website analysis and ecommerce URL classification using AI. It scrapes a website, extracts business intelligence, maps all internal pages, and categorises them into products, categories, or non-commerce pages. All outputs are saved in Google Sheets for easy access.

---

Use cases

Lead enrichment for sales and marketing teams
Ecommerce product & category discovery
Competitor website analysis
Website audits and content mapping
Market and industry research

---

How it works

A user submits a website URL via an n8n form.
The homepage is scraped and cleaned.
AI extracts company insights (value proposition, industry, audience, B2B/B2C).
Firecrawl maps all internal URLs.
URLs are enriched with metadata.
AI classifies each URL as product, category, or other.
Results are written into structured Google Sheets tabs.

---

How to use

Import the workflow into n8n.
Connect Google Sheets, Firecrawl, and AI credentials.
Update the Google Sheets document links.
Open the form URL and submit a website.
Let the workflow run and review the results in Sheets.

---

Requirements

n8n (self-hosted or cloud)
Firecrawl API key
Google Gemini or compatible LLM credentials
Google Sheets account

---

Customising this workflow

Change AI prompts to match your niche (SaaS, ecommerce, services).
Add filters to exclude unwanted URLs (blogs, legal pages, etc.).
Extend Sheets with scoring, tagging, or lead qualification logic.
Replace the LLM with another supported model if needed.

---

What this template demonstrates

End-to-end website intelligence extraction
Safe, rule-based AI classification (no hallucinations)
Scalable URL processing with batching
Clean data pipelines into Google Sheets
Practical AI usage for real business workflows

This template is designed to work out-of-the-box for website intelligence, ecommerce mapping, and lead research.

Feel free to reach out for custom implementation or enhancements:
📧 Email: dinakars2003@gmail.com
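The batching and the fixed label set (product, category, other) mentioned in this template can be sketched in Python as follows. This is an illustration only, not the template's implementation: `classify_batch` is a hypothetical stand-in for the LLM call, the batch size of 20 is arbitrary, and treating any unexpected label as "other" is an assumed safeguard.

```python
from typing import Callable, Iterator

ALLOWED_LABELS = {"product", "category", "other"}


def batches(urls: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` URLs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def normalize_label(raw_label: str) -> str:
    """Map a model reply onto the allowed labels, defaulting to 'other'."""
    label = raw_label.strip().lower()
    return label if label in ALLOWED_LABELS else "other"


def classify_all(
    urls: list[str],
    classify_batch: Callable[[list[str]], list[str]],
    size: int = 20,
) -> dict[str, str]:
    """Classify URLs batch by batch; classify_batch stands in for the LLM call."""
    results: dict[str, str] = {}
    for batch in batches(urls, size):
        labels = classify_batch(batch)
        if len(labels) != len(batch):
            raise ValueError("classifier returned a different number of labels than URLs")
        for url, label in zip(batch, labels):
            results[url] = normalize_label(label)
    return results
```

Constraining every reply to the allowed set keeps an off-list model answer from reaching the sheet as a new, unplanned category.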