Hybroht

A company dedicated to delivering tailored software solutions and data-driven experiences through effective technology. We develop workflows that leverage AI agents to maximize the productive benefits of artificial intelligence.

Total views: 6,249
Templates: 6

Templates by Hybroht

Tech news curator - analyze daily news using AI agents with Mistral

Using the Mistral API, this n8n workflow automates the process of collecting, filtering, analyzing, and summarizing news articles from multiple sources. The sources come from pre-built RSS feeds and a custom DuckDuckGo node, which you can change as needed. It delivers the most relevant news of the day in a concise manner.

**How It Works**

The workflow begins each weekday at noon. News is gathered from RSS feeds and a custom DuckDuckGo node, using HTTP GET requests when needed. Articles not from today or containing unwanted keywords are filtered out. The first AI agent selects the top news from their titles alone and generates a general title and summary. The next AI agent summarizes the full content of the selected top articles. The general summary and title are combined with the top 10 news summaries into a final output.

**Requirements**

- An active n8n instance (self-hosted or cloud).
- The custom DuckDuckGo node installed: n8n-nodes-duckduckgo-search
- A Mistral API key.
- The sub-workflow for content that requires HTTP GET requests, configured as provided in the template itself.

**Fair Notice**

This is an older version of the template. A superior, updated version is not restricted to tech news and adds enhanced capabilities such as delivery through different channels (email, social media) and advanced keyword filtering. It was recently published on n8n; you can find it here. If you are interested or would like to discuss specific needs, feel free to contact us.
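The date and keyword filtering step described above can be sketched outside n8n in a few lines. This is a minimal illustration, not the template's actual code; the feed items and the unwanted-keyword list are made up for the example:

```python
from datetime import date, datetime

# Hypothetical feed items; in the workflow these come from RSS/DuckDuckGo nodes.
items = [
    {"title": "New GPU benchmark released", "published": "2025-07-15T09:00:00"},
    {"title": "Celebrity gossip roundup", "published": "2025-07-15T08:00:00"},
    {"title": "Old kernel patch notes", "published": "2025-07-10T12:00:00"},
]

UNWANTED = {"gossip", "horoscope"}  # illustrative keyword list, not from the template

def keep(item, today=date(2025, 7, 15)):
    published = datetime.fromisoformat(item["published"]).date()
    if published != today:  # drop anything not from "today"
        return False
    title = item["title"].lower()
    return not any(word in title for word in UNWANTED)  # drop unwanted keywords

fresh = [i["title"] for i in items if keep(i)]
print(fresh)  # ['New GPU benchmark released']
```

Only items that survive both checks would be passed on to the AI selection agents.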

By Hybroht · 1,815 views

Simulate debates between AI agents using Mistral to optimize answers

This workflow contains community nodes that are only compatible with the self-hosted version of n8n.

AI Arena - Debate of AI Agents to Optimize Answers and Simulate Diverse Scenarios

Overview

Version: 1.0

The AI Arena Workflow is designed to facilitate a refined answer generation process by enabling a structured debate among multiple AI agents. This workflow allows diverse perspectives to be considered before arriving at a final output, enhancing the quality and depth of the generated responses.

✨ Features

- Multi-Agent Debate Simulation: Engage multiple AI agents in a debate to generate nuanced responses.
- Configurable Rounds and Agents: Easily adjust the number of debate rounds and participating agents to fit your needs.
- Contextualized AI Responses: Each agent operates based on predefined roles and characteristics, ensuring relevant and focused discussions.
- JSON Output: The final output is structured in JSON format, making it easy to integrate with other systems or workflows.

πŸ‘€ Who is this for?

This workflow is ideal for developers, data scientists, content creators, and businesses looking to leverage AI for decision-making, content generation, or any scenario requiring diverse viewpoints. It is particularly useful for those who need to synthesize information from multiple personalities or perspectives.

πŸ’‘ What problem does this solve?

The workflow addresses the challenge of generating nuanced responses by simulating a debate among AI agents. This approach ensures that multiple perspectives are considered, reducing bias and enhancing the overall quality of the output. Use-case examples:

- πŸ—“οΈ Meeting/Interview Simulation
- βœ”οΈ Quality Assurance
- πŸ“– Storywriter Test Environment
- πŸ›οΈ Forum/Conference/Symposium Simulation

πŸ” What this workflow does

The workflow orchestrates a debate among AI agents, allowing them to discuss, critique, and suggest rewrites for a given input based on their roles and predefined characteristics. This collaborative process leads to a more refined and comprehensive final output.

πŸ”„ Workflow Steps

1. Input & Setup: The initial input is provided, and the AI environment is configured with the necessary parameters.
2. Round Execution: AI agents execute their roles, providing replies and actions based on the input and their individual characteristics.
3. Round Results: The results of each round are aggregated, and a summary is created to capture the key points discussed by the agents.
4. Continue to Next Round: If more rounds are defined, the process repeats until the specified number of rounds is completed.
5. Final Output: The final output is generated from the agents' discussions and suggestions, providing a cohesive response.

⚑ How to Use/Setup

πŸ” Credentials

Obtain an API key for the Mistral API or another LLM API. This key is necessary for the AI agents to function properly.

πŸ”§ Configuration

Set up the workflow in n8n, ensuring all nodes are correctly configured according to the workflow requirements. This includes setting the appropriate input parameters and defining the roles of each AI agent. This workflow uses a custom node for global variables called n8n-nodes-globals. Alternatively, you can use the Edit Field (Set) node to achieve the same functionality.

✏️ Customizing this workflow

To customize the workflow, adjust the AI agent parameters in the JSON configuration. This includes defining their roles, personalities, and preferences, which influence how they interact during the debate. One of the notes includes a ready-to-use example of how to customize the agents and the environment. You can simply edit it and insert it as your credential in the Global Variables node.

πŸ“Œ Example

An example with both input and final output is provided in a note within the workflow.

πŸ› οΈ Tools Used

- n8n: A workflow automation tool that allows users to connect various applications and services.
- Mistral API: A powerful language model API used for generating AI responses. (You can replace it with any LLM API of your choice.)
- Podman: A container management tool that allows users to create, manage, and run containers without requiring a daemon. (It serves as an alternative to Docker for container orchestration.)

βš™οΈ n8n Setup Used

- n8n Version: 1.100.1
- n8n-nodes-globals: 1.1.0
- Running n8n via: Podman 4.3.1
- Operating System: Linux

⚠️ Notes, Assumptions & Warnings

- Ensure that the AI agents are configured with clear roles to maximize the effectiveness of the debate. Each agent's characteristics should align with the overall goals of the workflow.
- The workflow can be adapted for various use cases, including meeting simulations, content generation, and brainstorming sessions.
- This workflow assumes that users have a basic understanding of n8n and JSON configuration.
- This workflow assumes that users have access to the necessary API keys and permissions to use the Mistral API or other LLM APIs.
- Ensure that the input provided to the AI agents is clear and concise to avoid confusion in the debate process. Ambiguous inputs may lead to unclear or irrelevant outputs.
- Monitor the output for relevance and accuracy; AI-generated content may require human oversight to ensure it meets standards and expectations before being used in production.

ℹ️ About Us

This workflow was developed by the Hybroht team of AI enthusiasts and developers dedicated to enhancing the capabilities of AI through collaborative processes. Our goal is to create tools that harness the possibilities of AI technology and more.
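The round structure above (execute agents, aggregate results, feed the summary into the next round) can be sketched as a plain loop. This is a hedged illustration only: `ask_agent` is a hypothetical stand-in for the Mistral (or other LLM) API call, and the agent roles are made up:

```python
# Minimal sketch of the debate round loop. ask_agent() is a placeholder
# for a real LLM API call (e.g. Mistral); roles are illustrative only.
def ask_agent(role, prompt):
    # A real implementation would send `prompt` to an LLM with a role-specific
    # system message and return its reply.
    return f"[{role}] critique of: {prompt[:40]}"

def debate(question, agents, rounds=2):
    transcript = []
    current = question
    for round_no in range(1, rounds + 1):
        replies = [ask_agent(role, current) for role in agents]  # Round Execution
        summary = " | ".join(replies)                            # Round Results
        transcript.append((round_no, summary))
        # Continue to Next Round: the aggregated summary becomes context.
        current = f"{question}\nPrevious round: {summary}"
    return transcript

log = debate("Should we cache API responses?", ["Skeptic", "Pragmatist"])
print(len(log))  # one entry per round
```

In the actual workflow, the final output step would pass the last round's summary to a concluding agent and parse its reply into the JSON structure.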

By Hybroht · 1,690 views

Generate dynamic JSON output formats for AI agents with Mistral

This workflow contains community nodes that are only compatible with the self-hosted version of n8n.

JSON Architect - Dynamically Generate JSON Output Formats for Any AI Agent

Overview

Version: 1.0

The JSON Architect Workflow is designed to instruct AI agents on the required JSON structure for a given context and create the appropriate JSON output format. This workflow ensures that the generated JSON is validated and tested, providing a reliable JSON output format for use in various applications.

✨ Features

- Dynamic JSON Generation: Automatically generate the JSON format based on the input requirements.
- Validation and Testing: Validate the generated JSON format and test its functionality, ensuring reliability before output.
- Iterative Improvement: If the generated JSON is invalid or fails testing, the workflow attempts to regenerate it until it succeeds or a defined maximum number of rounds is reached.
- Structured Output: The final output is the generated JSON output format, making it easy to integrate with other systems or workflows.

πŸ‘€ Who is this for?

This workflow is ideal for developers, data scientists, and businesses that require dynamic JSON structures for the responses of AI agents. It is particularly useful for those involved in procedural generation, data interchange formats, configuration management, and machine learning model input/output.

πŸ’‘ What problem does this solve?

The workflow addresses the challenge of generating optimal JSON structures by automating the process of creation, validation, and testing. This approach ensures that the JSON format is appropriate for its intended use, reducing errors and enhancing the overall quality of data interchange. Use-case examples:

- πŸ”„ Data Interchange Formats
- πŸ› οΈ Procedural Generation
- πŸ“Š Machine Learning Model Input/Output
- βš™οΈ Configuration Management

πŸ” What this workflow does

The workflow orchestrates a process where AI agents generate, validate, and test JSON output formats based on the provided input. This approach leads to a more refined and functional JSON output parser.

πŸ”„ Workflow Steps

1. Input & Setup: The initial input is provided, and the workflow is configured with the necessary parameters.
2. Round Start: Initiates a round of JSON construction, ensuring the input is as expected.
3. JSON Generation & Validation: Generates and validates the JSON output format according to the input.
4. JSON Test: Verifies whether the generated JSON output format works as intended.
5. Validation or Test Fails: If the JSON fails validation or testing, the process loops back to the Round Start for correction.
6. Final Output: The final output is generated after successful JSON construction, providing a cohesive response.

πŸ“Œ Expected Input

- input: The input that requires a proper JSON structure.
- max_rounds: The maximum number of rounds before stopping the loop if it fails to produce and test a valid JSON structure. Suggested: 10.
- rounds: The initial number of rounds. Default: 0.

πŸ“¦ Expected Output

- input: The original input used to create the JSON structure.
- json_format_name: A snake_case identifier for the generated JSON format. Useful if you plan to reuse it across multiple AI agents or workflows.
- json_format_usage: A description of how to use the JSON output format in an input. Meant to be used by AI agents receiving the JSON output format in their output parser.
- json_format_valid_reason: The reason provided by the AI agents explaining why this JSON format works for the input.
- json_format_structure: The JSON format itself, intended for application through the Advanced JSON Output Parser custom node.
- json_format_input: The input after the JSON output format (json_format_structure) has been applied in an AI agent's output parser.

πŸ“Œ Example

An example that includes both the input and the final output is provided in a note within the workflow.

βš™οΈ n8n Setup Used

- n8n Version: 1.100.1
- n8n-nodes-advanced-output-parser: 1.0.1
- Running n8n via: Podman 4.3.1
- Operating System: Linux

⚑ Requirements to Use/Setup

πŸ”πŸ”§ Credentials & Configuration

Obtain the necessary LLM API key and permissions to use the workflow effectively. This workflow depends on a custom node for dynamically inputting JSON output formats called n8n-nodes-advanced-output-parser. You can find the repository here. Warning: as of 2025-07-09, the custom node's creator has warned that this node is not production-ready. Exercise caution before relying on it in production environments.

⚠️ Notes, Assumptions & Warnings

- This workflow assumes that users have a basic understanding of n8n and JSON configuration.
- This workflow assumes that users have access to the necessary API keys and permissions to use the Mistral API or other LLM APIs.
- Ensure that the input provided to the AI agents is clear and concise to avoid confusion in the JSON generation process. Ambiguous inputs may lead to invalid or irrelevant JSON output formats.

ℹ️ About Us

This workflow was developed by the Hybroht team of AI enthusiasts and developers dedicated to enhancing the capabilities of AI through collaborative processes. Our goal is to create tools that harness the possibilities of AI technology and more.
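The generate-validate-test loop with a `max_rounds` cap can be sketched as follows. This is a simplified illustration: `generate_format` is a hypothetical stand-in for the AI generation step (deliberately returning invalid JSON on the first attempt so the retry path is visible), and the "test" step is reduced to a key check:

```python
import json

def generate_format(task, attempt):
    # Placeholder for the AI generation step; fails on purpose on the first
    # attempt to demonstrate the loop-back behavior.
    if attempt == 1:
        return '{"fields": ["title", "summary",]}'  # invalid: trailing comma
    return '{"fields": ["title", "summary"]}'

def build_json_format(task, max_rounds=10):
    rounds = 0
    while rounds < max_rounds:
        rounds += 1                                  # Round Start
        candidate = generate_format(task, rounds)    # JSON Generation
        try:
            parsed = json.loads(candidate)           # Validation
        except json.JSONDecodeError:
            continue                                 # loop back to Round Start
        if "fields" in parsed:                       # stand-in JSON Test
            return {"rounds": rounds, "format": parsed}
    return None                                      # gave up after max_rounds

result = build_json_format("summarize an article")
print(result["rounds"])  # 2: the first attempt failed validation
```

In the real workflow, both the validation and the test steps are performed by nodes and AI agents rather than a simple `json.loads` and key check.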

By Hybroht · 1,645 views

Multi-platform source discovery with SerpAPI, DuckDuckGo, GitHub, Reddit & Bluesky

Source Discovery - Automatically Search More Up-to-Date Information Sources

🎬 Overview

Version: 1.0

This workflow utilizes various nodes to discover and analyze potential sources of information from platforms like Google, Reddit, GitHub, Bluesky, and others. It is designed to streamline the process of finding relevant sources based on specified search themes.

✨ Features

- Automated source discovery from multiple platforms.
- Filtering of existing and undesired sources.
- Error handling for API requests.
- User-friendly configuration options.

πŸ‘€ Who is this for?

This workflow is ideal for researchers, content marketers, journalists, and anyone looking to efficiently gather and analyze information from various online sources.

πŸ’‘ What problem does this solve?

This workflow addresses the challenge of manually searching for relevant information sources, saving time and effort while ensuring that users have access to the most pertinent content. Ideal use-cases include:

- Resource Compilation for Academic and Educational Purposes
- Journalism and Research
- Content Marketing
- Competitor Analysis

πŸ” What this workflow does

The workflow gathers data from selected platforms through search terms. It filters out known and undesired sources, analyzes the content, and provides insights into potential sources relevant to the user's needs.

πŸ”„ Workflow Steps

1. Search Queries: Fetch sources using SerpAPI search, DuckDuckGo, and Bluesky; utilize GitHub repositories to find relevant links; leverage RSS feeds from subreddits to identify potential sources.
2. Filtering Step: Remove existing and undesired sources from the results.
3. Source Selection: Analyze the content of the identified sources for relevance.

πŸ“Œ Expected Input / Configuration

The workflow is primarily configured via the Configure Workflow Args (Manual) node or the Global Variables custom node:

- Search themes: Keywords or phrases relevant to the desired content.
- Lists of known sources and undesired sources for filtering.

πŸ“¦ Expected Output

A curated list of potential sources relevant to the specified search themes, along with insights into their content.

πŸ“Œ Example

βš™οΈ n8n Setup Used

- n8n version: 1.105.3
- n8n-nodes-serpapi: 0.1.6
- n8n-nodes-globals: 1.1.0
- n8n-nodes-bluesky-enhanced: 1.6.0
- n8n-nodes-duckduckgo-search: 30.0.4
- LLM Model: mistral-small-latest (API)
- Platform: Podman 4.3.1 on Linux
- Date: 2025-08-06

⚑ Requirements to Use / Setup

- Self-hosted or cloud n8n instance.
- Install the following custom nodes for SerpAPI, Bluesky, and DuckDuckGo Search: n8n-nodes-serpapi, n8n-nodes-duckduckgo-search, n8n-nodes-bluesky-enhanced.
- Install the Global Variables node for enhanced configuration: n8n-nodes-globals (or use the Edit Field (Set) node instead).
- Provide valid credentials to the nodes for your preferred LLM model, SerpAPI, and Bluesky. GitHub credentials are recommended.

⚠️ Notes, Assumptions & Warnings

- Ensure compliance with the terms of service of any platforms accessed or discovered by this workflow, particularly concerning data usage and attribution.
- Monitor API usage to avoid hitting rate limits.
- The workflow may encounter errors such as 403 responses; in such cases, it continues by ignoring the affected substep.
- Duplicate removal is applied, but occasional overlaps might still appear depending on the sources.
- This workflow assumes familiarity with n8n, APIs, and search engines.
- Using AI agents (Mistral or substitute LLMs) requires access to their API services and keys.
- This is not a news curator: it is designed to find websites that are relevant and useful to your searches. If you are looking for a relevant news selector, please check this workflow.

ℹ️ About Us

This workflow was developed by the Hybroht team. Our goal is to create tools that harness the possibilities of technology and more. We aim to continuously improve and expand functionalities based on community feedback and evolving use cases. For questions, reach out via contact@hybroht.com.
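The filtering step, removing sources you already know and sources you never want, can be sketched by comparing base domains. This is an illustration only; the candidate URLs and both lists are made up:

```python
from urllib.parse import urlparse

KNOWN = {"example.com"}          # sources already in your list (illustrative)
UNDESIRED = {"spam.example"}     # sources you never want (illustrative)

candidates = [
    "https://example.com/blog/post-1",
    "https://fresh.example.org/article",
    "https://spam.example/clickbait",
]

def domain(url):
    # Normalize to the lowercase host so comparisons are case-insensitive.
    return urlparse(url).netloc.lower()

new_sources = [
    url for url in candidates
    if domain(url) not in KNOWN and domain(url) not in UNDESIRED
]
print(new_sources)  # ['https://fresh.example.org/article']
```

Only the surviving URLs would proceed to the content-analysis (source selection) step.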
--- βš–οΈ Warranty & Legal Notice This free workflow is provided "as-is" without any warranties of any kind, either express or implied, including but not limited to the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. By using this workflow, you acknowledge that you do so at your own risk. We shall not be held responsible for any damages, losses, or liabilities arising from the use or inability to use this workflow, including but not limited to any direct, indirect, incidental, or consequential damages. It is your responsibility to ensure that your use of this workflow complies with all applicable laws and regulations. ---

By Hybroht · 810 views

Multi-source news curator with Mistral AI analysis, summaries & custom channels

Flexible News Curator - Multi-Source, AI Analysis, Summaries, Translation, and Settable Channels

🎬 Overview

The Flexible News Curator workflow automates the collection, filtering, AI-driven analysis, and summarization of news from diverse sources of your interest. Using customizable search themes, RSS feeds, and (optional) video descriptions, it delivers concise, quality news summaries via configurable channels. This workflow is designed to help reduce information overload and keep you updated effortlessly.

Click the image below to watch the video guide: [](https://www.youtube.com/watch?v=ajtmTstc6Lo)

✨ Features

- Multi-Source News Aggregation: Collect news from customizable RSS feeds, SerpAPI, and video channel feeds (if enabled).
- AI-Powered News Selection & Summarization: Uses advanced AI agents (Mistral Cloud Chat Model by default) to select, analyze, and summarize top news.
- Quality Assurance Step: Optional AI-powered filtering to improve news selection quality before analysis.
- Multi-Language Translation & Tone Customization: Translate summaries and customize tone for localized or tailored consumption.
- Multi-Channel Delivery: Send outputs via email, Telegram, WhatsApp, webhook, or save to disk.
- Advanced Filtering: Regex-based filtering on URLs, titles, and content to exclude unwanted articles.
- Sub-Workflow Architecture: Modular handling of video transcripts, content retrieval, multi-theme searching, and more.
- Flexible Scheduling & Trigger Options: Supports schedule-based triggering, email (IMAP) triggers, and webhook-based activation.
- (Optional) Video Search: Video content descriptions via video channel feeds.

πŸ‘€ Who is this for?

This workflow benefits professionals, researchers, marketers, and anyone who needs to stay informed about specific news themes without wasting time on irrelevant information or reading through too many articles to find the most interesting ones.

πŸ’‘ What problem does this solve?

The workflow tackles the challenge of information overload by automatically filtering, summarizing, and delivering the essential news tailored to your interests and preferences. It integrates various data sources and channels for comprehensive yet efficient news consumption. Ideal use-cases include:

- Monitoring breakthroughs in research fields
- Receiving daily business opportunity updates (e.g., real estate)
- Lowering the cognitive load required to follow your favorite news
- Translating news summaries into your chosen language

πŸ” What this workflow does

The workflow gathers news from RSS feeds, search engines, and social media, then:

- Filters duplicates and irrelevant content via custom regex filters and date ranges.
- Applies optional AI-powered quality assurance for headline evaluation.
- Selects top news articles with AI analysis focused on user-defined criteria and audience.
- Summarizes individual articles using AI summarization agents, ensuring structured, consistent outputs.
- Optionally translates and adjusts the tone of summaries.
- Distributes summaries through configured channels such as email, social media, messaging apps, or webhook calls.

πŸ”„ Workflow Steps

News Gathering: Fetch news using RSS feeds, SerpAPI search, and optionally video channel feeds. Standardize output structures for seamless merging. Employ sub-workflows for video transcript retrieval and for looping over custom RSS feed lists.

Filtering: Remove duplicates and news outside the specified date range. Exclude articles matching user-defined keywords via regex filters on URLs, titles, and content. Limit the number of news articles sent to AI analysis.

News Selection: Optionally invoke an AI quality assurance agent to pre-filter headlines. Aggregate news for AI analysis. Select and summarize top news articles with AI agents using customizable criteria. Parse AI responses into defined JSON structures to ensure consistent data.

News Summarization: Prepare individual article content.
Summarize content with AI agents and validate the structured output.

Sender Preparation: Combine general summaries with the selected top news summaries. Format final summaries as text and HTML suitable for delivery. Optionally apply translation and tone adjustment.

Sending: Deliver summaries through the selected channels (email, Telegram, WhatsApp, webhook). Optionally save the output to disk as JSON.

πŸ“Œ Expected Input / Configuration

The workflow is primarily configured via the Configure Workflow Args node or the Global Variables custom node, with these key parameters:

| Parameter | Description | Type |
| :-- | :-- | :-- |
| search_themes | List of keywords/themes to search in SerpAPI | List of strings |
| datetime_delta | Number of days back to include news from; e.g., 0 = today | Integer |
| link_censor | Regex to exclude unwanted URLs | Regex string |
| title_censor | Regex to exclude unwanted titles | Regex string |
| content_censor | Regex to exclude unwanted content | Regex string |
| use_qa | Flag to enable AI Quality Assurance for headline filtering | Boolean |
| max_news_analysis | Max number of articles sent to the News Analyzer | Integer |
| qa_max_news | Number of headlines the QA Agent analyzes | Integer |
| qa_max_top_news | Number of headlines selected by the QA Agent | Integer |
| qa_check_criteria | Criteria used by the QA Agent to discard low-quality headlines | List of strings |
| qa_select_criteria | Criteria used by the QA Agent to rank/select the best headlines | List of strings |
| news_focus | What the News Analyzer should focus on while selecting news | String |
| news_target_audience | Target audience description for the News Analyzer | String |
| news_criteria | Instructions for the News Analyzer to identify relevant news | List of strings |
| language | Language for news summaries; triggers translation if not English | String |
| translator_tone | Tone for translation (e.g., casual, professional) | String |
| translator_notes | Additional instructions for the translator | String |
| email_sender | Email address used for sending (via SMTP) | String |
| email_recipients | Recipient email addresses (comma-separated) | String |
| email_subject | Email subject line | String |
| telegram_chat_id | Telegram chat ID for sending notifications | String |
| phone_number | Phone number for WhatsApp messages | String |
| rss_feeds | Custom list of RSS feeds (objects with link and needs_content_search properties) | JSON array of objects |
| video_rss_feeds | Custom list of video RSS feeds | JSON array of objects |
| enable_video_search | Enable/disable video search functionality | Boolean |
| enabled_senders | List of enabled delivery channels (email, telegram, whatsapp, webhook, save-to-disk) | List of strings |

Hint: To add or combine keywords in censors, use the pattern "keyword1|keyword2|keyword3".

πŸ“¦ Expected Output

Structured JSON containing a general news summary, top news with summaries, and metadata, suitable for delivery through your preferred channel.

πŸ“Œ Example

An example that includes workflow parameters is provided in a note within the workflow.

βš™οΈ n8n Setup Used

- n8n version: 1.100.1
- n8n-nodes-serpapi: 0.1.6
- n8n-nodes-globals: 1.1.0
- LLM Model: mistral-small-latest (API)
- IMAP: imap.gmail.com (Port 993)
- Platform: Podman 4.3.1 on Linux
- Date: 2025-07-15

⚑ Requirements to Use / Setup

- Self-hosted n8n instance. (This workflow contains community nodes that are only compatible with the self-hosted version of n8n.)
- Install the necessary custom nodes: n8n-nodes-serpapi and n8n-nodes-globals (or use the Edit Field (Set) node instead).
- Configure all sub-workflows bundled within this template (see the Sub-Workflows Guide).
- Provide valid credentials to the nodes for SerpAPI, Telegram, WhatsApp, Mistral Cloud Chat API, and SMTP (for email).
- Set your custom RSS feed list in the workflow args.
- Either install the SerpAPI custom node or deactivate it.
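The censor parameters are plain alternation regexes, as the hint above suggests with its "keyword1|keyword2|keyword3" pattern. A small sketch of how such a `title_censor` behaves (the keywords and headlines here are made up, not defaults from the template):

```python
import re

# Illustrative censor combining keywords via alternation, per the hint:
# "keyword1|keyword2|keyword3".
title_censor = re.compile(r"sponsored|giveaway|horoscope", re.IGNORECASE)

titles = [
    "Rust 1.80 released",
    "SPONSORED: the best VPN deals",
    "Weekly horoscope for developers",
]

# An article is excluded when the censor matches its title.
kept = [t for t in titles if not title_censor.search(t)]
print(kept)  # ['Rust 1.80 released']
```

The same pattern style applies to `link_censor` and `content_censor`, matched against URLs and article bodies respectively.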
⚠️ Notes, Assumptions & Warnings

- The workflow timeout is set to 30 minutes by default; adjust it depending on your setup and workload.
- Duplicate removal is applied, but occasional overlaps might still appear depending on feed sources.
- This workflow assumes familiarity with n8n, RSS feeds, API key management, and regular expressions.
- Video search only works for configured video channels; remember to respect the rights of these channels.
- Using AI agents (Mistral or substitute LLMs) requires access to their API services and keys.
- Out-of-the-box customization is done via the Global Variables node or direct workflow argument edits.

ℹ️ About Us

This workflow was developed by the Hybroht team of AI enthusiasts and developers dedicated to enhancing the capabilities of AI through collaborative processes. Our goal is to create tools that harness the possibilities of AI technology and more. For questions, support, or feature requests, reach out via contact@hybroht.com.

---

❓ Questions & Issues

We will answer any questions related to this workflow. Please contact us if you encounter any bug or issue with this workflow, and we will assist you.

βš–οΈ Warranty & Legal Notice

You can view the full license terms here. Please review them before making your purchase. By purchasing this product, you agree to these terms.

---

By Hybroht · 258 views

Filter URLs with AI-powered robots.txt compliance & source verification

URL Officer - Respect robots.txt and Avoid Undesirable Sources

🎬 Overview

Version: 1.0

The URL Officer workflow automates the filtering of URLs by checking them against a database of forbidden sources and the rules defined in robots.txt files. It proactively respects robot exclusion protocols and user-defined banned sources to aid lawful and ethical web automation. Designed primarily as a sub-workflow, it serves automation pipelines with robust URL validation to avoid undesirable or restricted sources.

✨ Features

- Dual-Layer URL Filtering: Checks URLs against a manually maintained forbidden-sources list and robots.txt restrictions.
- Automated robots.txt Retrieval & Update: Automatically fetches and updates robots.txt content for new or outdated sources (older than 3 days).
- AI-Backed robots.txt Interpretation: Uses AI models to interpret robots.txt comments and restrictions, ensuring nuanced compliance.
- Configurable User-Agent Identification: Allows customization of the User-Agent strings that are checked against robots.txt directives.
- Sub-Workflow Ready: Easily integrates as a sub-workflow for link validation in larger automation pipelines.
- Multi-Model AI Support: Supports mistral, groq, and gemini AI models for enhanced robots.txt compliance checks.
- Detailed Diagnostic Outputs: Returns comprehensive link allowance statuses and metadata for use in downstream processing.
- Database Integration: Uses PostgreSQL to store and manage robots.txt content and banned-source lists.

πŸ‘€ Who is this for?

Ideal for developers, data engineers, researchers, or businesses implementing web crawlers, scrapers, or any automation that processes URLs. This workflow helps you comply with source restrictions and avoid content from blacklisted sites, reducing legal exposure and promoting ethical data use.

πŸ’‘ What problem does this solve?

URL Officer addresses the challenge of automating URL validation by combining manual blacklist filtering with automated and AI-assisted robots.txt parsing. It prevents accidental scraping or processing of undesirable or disallowed sources, helping automate respect for webmasters' policies and legal boundaries.

πŸ” What this workflow does

When given a URL, the workflow:

1. Extracts the base URL.
2. Checks the URL against a manually configured banned-sources list (stored in the database).
3. Fetches robots.txt for new or stale sources (older than 3 days).
4. Performs a programmatic parse and check of robots.txt directives against the URL using the specified User-Agent.
5. Runs an AI model to analyze the robots.txt content and confirm whether the URL is allowed, taking into account any special comments or prohibitions relevant to the automation goal.
6. Returns a final allow-or-disallow determination for both the URL and its base URL, along with metadata about the robots.txt fetch status and timing.

πŸ”„ Workflow Steps

Input Parsing & Base URL Extraction: Accepts workflow arguments including the URL, User-Agent information, automation goal, and AI model choice. Extracts and normalizes the base URL for processing.

Forbidden Source Check: Queries the PostgreSQL tables containing banned sources. Immediately rejects URLs matching forbidden sources.

robots.txt Handling: Checks whether robots.txt content for the source is in the database and recent (under 3 days old). If missing or outdated, fetches the robots.txt file from the base URL and updates the database.

Code-Based robots.txt Analysis: Parses robots.txt directives, matching the User-Agent to the appropriate groups. Checks whether the URL and base URL paths are allowed according to the parsed rules, using a conservative URL- and agent-matching algorithm for prefix-based allow/disallow checks.

AI-Based robots.txt Verification: Uses the selected AI model (mistral, groq, or gemini) to analyze robots.txt content and comments regarding allowed automation use.
Applies AI understanding to confirm or override the automated code checks based on the automation's goal.

Output Preparation: Produces output indicating permission statuses (allow_link and allow_baseUrl), the original URLs, User-Agent info, fetch timestamps, and whether robots.txt was successfully retrieved. Designed to be consumed by other workflows as a validation step.

πŸ”€ Expected Input / Configuration

The workflow is configured primarily via workflow input arguments:

| Parameter | Description | Type |
| :-- | :-- | :-- |
| link | The URL to be checked. | String |
| userAgent | User-Agent string representing your automation, used for robots.txt checks. | String |
| userAgent_extra | Additional User-Agent information such as version or contact info. | String |
| automationGoal | Description of your automation's purpose, used by the AI to verify suitability against robots.txt. | String |
| model | AI model to use for the robots.txt compliance check. Options: mistral, groq, gemini. | String |

Database Requirements

- PostgreSQL database configured with credentials accessible to the workflow.
- Two tables: one for banned sources (manually maintained) and one for robots.txt content with timestamps. The workflow auto-creates and manages these tables.
- A containerized PostgreSQL instance (Podman or Docker) is recommended.

πŸ“¦ Expected Output

A structured JSON object containing:

| Output Key | Description |
| :-- | :-- |
| link | The URL that was checked. |
| baseUrl | The base URL of the checked link. |
| allow_link | Boolean indicating if the link is allowed according to the checks. |
| allow_baseUrl | Boolean indicating if the base URL is allowed. |
| userAgent | User-Agent string used in the check. |
| userAgent_extra | Additional User-Agent metadata. |
| robots_fetched | Boolean, true if robots.txt content was successfully fetched. |
| fetched_at | Timestamp of the last robots.txt content fetch. |

πŸ“Œ Example

Example input payload:

βš™οΈ n8n Setup Used

- n8n version: 1.108.2
- Platform: Podman 4.3.1 on Linux
- PostgreSQL: Running in a Podman 4.3.1 container
- LLM Models: mistral-small-latest, llama-3.1-8b-instant (Groq), gemini-2.5-flash
- Date: 2025-08-29

⚑ Requirements to Use / Setup

- Self-hosted or cloud n8n instance with database connectivity.
- PostgreSQL database configured and accessible by n8n. Set up PostgreSQL using the recommended containerized deployment or your preferred method.
- Configure database credentials inside the workflow.
- Provide API credentials for your chosen AI model (mistral, groq, or gemini).
- Manually maintain the banned-sources list in the database.
- Familiarity with n8n variables and sub-workflow integration is recommended.
- Internet connectivity for fetching robots.txt files.

⚠️ Notes, Assumptions & Warnings

- Database tables used by this workflow are automatically created and managed by the workflow.
- The robots.txt refresh interval is set to every 3 days; this can be adjusted by modifying the workflow.
- The robots.txt parser is relatively simple and does not support wildcard (*) or end-of-string ($) rules.
- User-Agent matching is substring-based, and longer string matches take precedence.
- AI analysis adds a human-like understanding of robots.txt comments and prohibitions but depends on the quality and capability of the chosen AI model.
- This workflow does NOT handle: Terms of Service compliance; preference for official APIs over HTML scraping; rate limiting or request throttling; handling of paywalled or restricted content; de-duplication or filtering beyond the banned-sources list; encryption or secure storage.
- You remain responsible for ensuring your automation complies with legal, ethical, and platform-specific rules.
- The workflow is designed as a sub-workflow; integrate it into larger automation processes to validate URLs.

πŸ›  PostgreSQL Setup Instructions (Self-Hosted Route)

Available inside the workflow notes, alongside Podman commands.

ℹ️ About Us

This workflow was developed by the Hybroht team. Our goal is to create tools that harness the possibilities of technology and more. We aim to continuously improve and expand functionalities based on community feedback and evolving use cases. For questions, support, or feedback, please contact us at contact@hybroht.com.

---

βš–οΈ Warranty & Legal Notice

This workflow is provided "as-is" without warranties of any kind. By using this workflow, you agree that you are responsible for complying with all applicable laws, regulations, and terms of service related to your data sources and automations. Please review all relevant legal terms and use this workflow responsibly. Hybroht disclaims any liability arising from the use or misuse of this workflow. This tool assists with robots.txt compliance but is not a substitute for full legal or compliance advice.

You can view the full license terms here. Please review them before making your purchase. By purchasing this product, you agree to these terms.

---
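The conservative prefix-based robots.txt check described above, no wildcard (*) or end-of-string ($) support, longest matching rule wins, can be sketched as follows. This is an illustrative simplification of the workflow's code-based analysis step, not its actual implementation, and the robots.txt content is made up:

```python
# Sketch of a conservative prefix-based robots.txt check: comments are
# stripped, only Allow/Disallow lines are read, and the longest matching
# prefix decides. User-Agent group selection is omitted for brevity.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /private/press/
"""

def parse_rules(text):
    rules = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        if field.lower() in ("allow", "disallow") and value:
            rules.append((field.lower() == "allow", value))
    return rules

def allowed(path, rules):
    best = (True, -1)  # default when no rule matches: allowed
    for allow, prefix in rules:
        if path.startswith(prefix) and len(prefix) > best[1]:
            best = (allow, len(prefix))  # longest matching prefix wins
    return best[0]

rules = parse_rules(ROBOTS_TXT)
print(allowed("/private/data", rules))        # False
print(allowed("/private/press/2025", rules))  # True
print(allowed("/blog/post", rules))           # True
```

In the workflow, this programmatic verdict is then cross-checked by the AI-based verification step, which also reads the comments that this parser discards.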

By Hybroht · 31 views