
Vision RAG and image embeddings using Cohere Command-A and Embed v4

Jimleuk
2/3/2026

Cohere's new multimodal model releases make building your own Vision RAG agents a breeze. For the purposes of this template, multimodal RAG means embedding document scans, retrieving only the scans relevant to a query, and then having a vision model read those scans to answer.

This has two benefits: (1) the vision model doesn't need to keep every document scan in context, which is expensive, and (2) you can query graphical content such as charts, graphs and tables.

How it works

  • Page extracts from a technology report containing graphs and charts are downloaded, converted to base64 and embedded using Cohere's Embed v4 model.
  • This produces embedding vectors, which we associate with the original page URL and store in our Qdrant vector store collection using the Qdrant community node.
  • Our Vision RAG agent is split into two parts: a regular AI agent for chat and a second Q&A agent powered by Cohere's Command-A-vision model, which is required to read the contents of images.
  • When a query requires access to the technology report, the Q&A agent branch is activated. This branch performs a vector search on our image embeddings and returns a list of matching image URLs. These URLs are then passed to our vision model along with the user's original query.
  • The Q&A vision agent can then reply to the user using the "respond to chat" node.
  • Because both agents share the same memory space, the user experiences a single, continuous conversation.
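
The retrieval step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the template's implementation: the `PageIndex` class is a hypothetical in-memory stand-in for the Qdrant collection, the three-dimensional vectors are toy values rather than real Embed v4 output, the `example.com` URLs are placeholders, and `build_vision_input` is an invented helper whose output shape is not the Cohere request format.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class PageIndex:
    """Hypothetical in-memory stand-in for the Qdrant collection."""

    def __init__(self):
        self._points = []  # list of (vector, page_url) pairs

    def add(self, vector, url):
        # Each embedding is stored together with the URL of the page it came from.
        self._points.append((list(vector), url))

    def search(self, query_vector, limit=3):
        # Rank stored pages by similarity to the query and return their URLs.
        ranked = sorted(
            self._points,
            key=lambda point: cosine(query_vector, point[0]),
            reverse=True,
        )
        return [url for _, url in ranked[:limit]]


def build_vision_input(query, image_urls):
    """Bundle the user's query with the retrieved page URLs (illustrative shape)."""
    return {"query": query, "image_urls": list(image_urls)}


# Toy vectors; real embeddings would come from the embedding model.
index = PageIndex()
index.add([0.9, 0.1, 0.0], "https://example.com/report/page-1.png")
index.add([0.1, 0.9, 0.0], "https://example.com/report/page-2.png")
index.add([0.0, 0.2, 0.9], "https://example.com/report/page-3.png")

query_vector = [0.8, 0.2, 0.0]
matches = index.search(query_vector, limit=2)
vision_input = build_vision_input("Which chart shows adoption by year?", matches)
```

Only the URLs of the top-ranked pages reach the vision model, which is what keeps the rest of the report out of its context.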

How to use

  • Ensure you have a Cohere account and sufficient credit to avoid rate limit or token usage restrictions.
  • For embeddings, swap out the page extracts for your own. You may need to split and convert document pages to images if you want to use image embeddings.
  • For chat, you may want to structure the agent(s) differently to suit your environment, e.g., by using MCP servers.
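
When swapping in your own page images, the base64 conversion might look like the sketch below. The template only says that pages are "converted to base64"; wrapping the result in a data URI with a MIME type is an assumption here, as is the helper name `to_data_uri`. The eight bytes used in the example are the standard PNG file signature, standing in for a real page image.

```python
import base64


def to_data_uri(image_bytes, mime_type="image/png"):
    """Encode raw image bytes as a base64 data URI (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Stand-in for a real page image: just the PNG file signature.
png_signature = b"\x89PNG\r\n\x1a\n"
uri = to_data_uri(png_signature)

# The payload after the comma decodes back to the original bytes.
payload = uri.split(",", 1)[1]
decoded = base64.b64decode(payload)
```

In practice you would read each page image from disk or from an HTTP response and pass those bytes in instead.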

Requirements

  • Cohere account for Embeddings and LLM
  • Qdrant for vector store

Vision RAG and Image Embeddings using Cohere Command-A and Embed v4

This n8n workflow demonstrates a sophisticated Retrieval Augmented Generation (RAG) system that leverages Cohere's Command-A for chat and Embed v4 for image embeddings. It enables an AI agent to process chat messages, potentially extract files (like images), generate embeddings, store them in a Qdrant vector store, and use this information to respond to queries.

What it does

This workflow automates the following steps:

  1. Triggers on Chat Message: The workflow starts when a new chat message is received.
  2. Initial Response: Immediately sends an initial "Thinking..." message to the chat.
  3. File Extraction (if any): Attempts to extract file data from the incoming chat message.
  4. Conditional Processing:
    • If a file is detected:
      • Loops over each detected file.
      • Generates Cohere Embeddings (Embed v4) for the file.
      • Stores these embeddings in a Qdrant Vector Store.
      • Aggregates the results.
      • Responds to the chat confirming the file processing.
    • If no file is detected:
      • Sets up a Cohere Chat Model (Command-A).
      • Initializes a Simple Memory for conversational context.
      • Configures a Qdrant Vector Store for retrieval.
      • Uses an AI Agent to process the chat message, potentially using the Qdrant store for RAG.
      • Responds to the chat with the AI Agent's generated answer.
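
The conditional step above amounts to a simple branch on whether the message carries files. The following is a rough sketch of that routing logic only; the field names `files`, `fileName` and `chatInput`, the branch labels, and the function `route_message` are assumptions for illustration and are not taken from the workflow JSON.

```python
def route_message(message):
    """Pick a branch for an incoming chat message (illustrative only)."""
    files = message.get("files") or []
    if files:
        # File branch: each file would be embedded and stored in the vector store.
        return {"branch": "ingest", "items": [f["fileName"] for f in files]}
    # No-file branch: the text goes to the AI agent, which may use retrieval.
    return {"branch": "agent", "query": message.get("chatInput", "")}


with_file = route_message(
    {"chatInput": "Please index this", "files": [{"fileName": "page-1.png"}]}
)
without_file = route_message({"chatInput": "What does the chart on page 1 show?"})
```

Here `with_file` takes the ingest branch and `without_file` takes the agent branch, mirroring the two paths in the list above.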

Prerequisites/Requirements

To use this workflow, you will need:

  • n8n Instance: A running n8n instance.
  • Cohere API Key: For generating chat responses (Command-A) and image embeddings (Embed v4).
  • Qdrant Instance: A running Qdrant vector database to store and retrieve embeddings.
  • n8n Langchain Nodes: Ensure the @n8n/n8n-nodes-langchain package is installed in your n8n instance.

Setup/Usage

  1. Import the Workflow: Import the provided JSON into your n8n instance.
  2. Configure Credentials:
    • Cohere Chat Model: Set up your Cohere API credentials for the Cohere Chat Model node.
    • Embeddings Cohere: Set up your Cohere API credentials for the Embeddings Cohere node.
    • Qdrant Vector Store: Configure your Qdrant connection details (host, API key if applicable, collection name) for both Qdrant Vector Store nodes.
  3. Activate the Workflow: Once configured, activate the workflow.
  4. Interact via Chat: Send chat messages to the configured Chat Trigger endpoint. If you include files (e.g., images) in your chat messages, the workflow will process them to create embeddings. For text-based queries, the AI agent will use the RAG system to generate responses.
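
If you need to create the Qdrant collection yourself before running the workflow, the request body could be assembled as in this sketch. Everything specific here is an assumption: the collection name `vision_rag_pages` is hypothetical, the vector size of 1536 assumes Embed v4 is used at its default output dimension, and cosine distance is a common choice rather than something the template states. Check the dimension your embedding node actually produces before relying on it.

```python
import json

# Hypothetical collection name; use whatever you configure in the Qdrant nodes.
COLLECTION_NAME = "vision_rag_pages"

# Assumed settings: 1536-dimensional vectors compared by cosine distance.
collection_config = {
    "vectors": {
        "size": 1536,
        "distance": "Cosine",
    }
}

# Path and body for Qdrant's create-collection request (PUT).
request_path = f"/collections/{COLLECTION_NAME}"
request_body = json.dumps(collection_config)
```

The vector size must match the embedding dimension exactly, otherwise inserts into the collection will be rejected.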
