Back to Catalog

Process voice, images & documents with GPT-4o, MongoDB & Gmail tools

NanaBNanaB
716 views
2/3/2026
Official Page

What it does

This n8n workflow creates a cutting-edge, multi-modal AI Memory Assistant designed to capture, understand, and intelligently recall your personal or business information from diverse sources. It automatically processes voice notes, images, documents (like PDFs), and text messages sent via Telegram. Leveraging GPT-4o for advanced AI processing (including visual analysis, document parsing, transcription, and semantic understanding) and MongoDB Atlas Vector Search for persistent and lightning-fast recall, this assistant acts as an external brain. Furthermore, it integrates with Gmail, allowing the AI to send and search emails as part of its memory and response capabilities. This end-to-end solution blurprint provides a powerful starting point for personal knowledge management and intelligent automation.

How it works

1. Multi-Modal Input Ingestion πŸ—£οΈπŸ“ΈπŸ“„πŸ’¬

Your memories begin when you send a voice note, an image, a document (e.g., PDF), or a text message to your Telegram bot. The workflow immediately identifies the input type.

2. Advanced AI Content Processing 🧠✨

Each input type undergoes specialized AI processing by GPT-4o:

  1. Voice notes are transcribed into text using OpenAI Whisper.

  2. Images are visually analyzed by GPT-4o Vision, generating detailed textual descriptions.

  3. Documents (PDFs) are processed for text extraction, leveraging GPT-4o for robust parsing and understanding of content and structure. Unsupported document types are gracefully handled with a user notification.

  4. Text messages are directly forwarded for further processing.

This phase transforms all disparate input formats into a unified, rich textual representation.

3. Intelligent Memory Chunking & Vectorization βœ‚οΈπŸ·οΈβž‘οΈπŸ”’

The processed content (transcriptions, image descriptions, extracted document text, or direct text) is then fed back into GPT-4o. The AI intelligently chunks the information into smaller, semantically coherent pieces, extracts relevant keywords and tags, and generates concise summaries. Each of these enhanced memory chunks is then converted into a high-dimensional vector embedding using OpenAI Embeddings.

4. Persistent Storage & Recall (MongoDB Atlas Vector Search) πŸ’ΎπŸ”

These vector embeddings, along with their original content, metadata, and tags, are stored in your MongoDB Atlas cluster, which is configured with Atlas Vector Search. This allows for highly efficient and semantically relevant retrieval of memories based on user queries, forming the core of your "smart recall" system.

5. AI Agent & External Tools (Gmail Integration) πŸ€–πŸ› οΈ

When you ask a question, the AI Agent (powered by GPT-4o) acts as the central intelligence. It uses the MongoDB Chat Memory to maintain conversational context and, crucially, queries the MongoDB Atlas Vector Search store to retrieve relevant past memories. The agent also has access to Gmail tools, enabling it to send emails on your behalf or search your past emails to find information or context that might not be in your personal memory store.

6. Smart Response Generation & Delivery πŸ’¬βž‘οΈπŸ“±

Finally, using the retrieved context from MongoDB and the conversational history, GPT-4o synthesizes a concise, accurate, and contextually aware answer. This response is then delivered back to you via your Telegram bot.

How to set it up (~20 Minutes)

Getting this powerful workflow running requires a few key configurations and external service dependencies.

Telegram Bot Setup:

  1. Use BotFather in Telegram to create a new bot and obtain its API Token.

  2. In your n8n instance, add a new Telegram API credential. Give it a clear name (e.g., "My AI Memory Bot") and paste your API Token.

OpenAI API Key Setup:

  1. Log in to your OpenAI account and generate a new API key.

  2. Within n8n, create a new OpenAI API credential. Name it appropriately (e.g., "My OpenAI Key for GPT-4o") and paste your API key. This credential will be used by the OpenAI Chat Model (GPT-4o for processing, chunking, and RAG), Analyze Image, and Transcribe Audio nodes.

MongoDB Atlas Setup:

  1. If you don't have one, create a free-tier or paid cluster on MongoDB Atlas.

  2. Create a database and a collection within your cluster to store your memory chunks and their vector embeddings.

  3. Crucially, configure an Atlas Vector Search index on your chosen collection. This index will be on the field containing your embeddings (e.g., embedding field, type knnVector). Refer to MongoDB Atlas documentation for detailed instructions on creating vector search indexes.

  4. In n8n, add a new MongoDB credential. Provide your MongoDB Atlas connection string (ensure it includes your username, password, and database name), and give it a clear name (e.g., "My Atlas DB"). This credential will be used by the MongoDB Chat Memory node and for any custom HTTP requests you might use for Atlas Vector Search insertion/querying.

Gmail Account Setup:

  1. Go to Google Cloud Console, enable the Gmail API for your project, and configure your OAuth consent screen.

  2. Create an OAuth 2.0 Client ID for a Desktop app (or Web application, depending on your n8n setup and redirect URI). Download the JSON credentials.

  3. In n8n, add a new Gmail OAuth2 API credential. Follow the n8n instructions to configure it using your Google Client ID and Client Secret, and authenticate with your Gmail account, ensuring it has sufficient permissions to send and search emails.

External API Services:

  1. If your Extract from File node relies on an external service for robust PDF/DocX text extraction, ensure you have an API key and the service is operational. The current flow uses ConvertAPI.

  2. Add the necessary credential (e.g., ConvertAPI) in n8n.

How you could enhance it ✨

This workflow offers numerous avenues for advanced customization and expansion:

  1. Expanded Document Type Support: Enhance the "Document Processing" section to handle a wider range of document types beyond just PDFs (e.g., .docx, .xlsx, .pptx, markdown, CSV) by integrating additional conversion APIs or specialized parsing libraries (e.g., using a custom code node or dedicated third-party services like Apache Tika, Unstructured.io).

  2. Fine-tuned Memory Chunks & Metadata: Implement more sophisticated chunking strategies for very long documents, perhaps based on semantic breaks or document structure (headings, sections), to improve recall accuracy. Add more metadata fields (e.g., original author, document date, custom categories) to your MongoDB entries for richer filtering and context.

  3. Advanced AI Prompting: Allow users to dynamically set parameters for their memory inputs (e.g., "This is a high-priority meeting note," "This image contains sensitive information") which can influence how GPT-4o processes, tags, and stores the memory, or how it's retrieved later.

  4. n8n Tool Expansion for Proactive Actions: Significantly expand the AI Agent's capabilities by providing it with access to a wider range of n8n tools, moving beyond just information retrieval and email

  5. External Data Source Integration (APIs): Expand the AI Agent's tools to query other external APIs (e.g., weather, stock prices, news, CRM systems) so it can provide real-time information relevant to your memories.

Getting Assistance & More Resources

Need assistance setting this up, adapting it to a unique use case, or exploring more advanced customizations? Don't hesitate to reach out! You can contact me directly at nanabrownsnr@gmail.com. Also, feel free to check out my Youtube Channel where I discuss other n8n templates, as well as Innovation and automation solutions.

Process Voice, Images & Documents with GPT-4o, MongoDB & Telegram

This n8n workflow provides a powerful and flexible solution for processing various types of user input (text, images, and documents) received via Telegram. It leverages GPT-4o for intelligent analysis and response generation, stores chat history in MongoDB, and sends contextual replies back to the user.

What it does

This workflow automates the following steps:

  1. Listens for Telegram Messages: It continuously monitors a Telegram bot for incoming messages.
  2. Initial Message Processing: Extracts the message text, user ID, and checks for binary data (files/images).
  3. Determines Message Type:
    • If a file is attached: It extracts text from the file (PDF, images via OCR, etc.) and uses this text as the prompt for the AI agent.
    • If no file is attached: The message text directly becomes the prompt for the AI agent.
  4. Manages Chat History (MongoDB): It retrieves the conversation history for the specific user from MongoDB.
  5. Engages AI Agent (GPT-4o): It sends the user's prompt (and extracted file content if any) along with the chat history to an OpenAI GPT-4o model via a LangChain AI Agent.
  6. Stores New Chat Entry: The AI agent's response is added to the chat history and saved back to MongoDB.
  7. Responds via Telegram: The AI agent's generated response is sent back to the user on Telegram.

Prerequisites/Requirements

To use this workflow, you will need:

  • n8n Instance: A running instance of n8n.
  • Telegram Bot: A Telegram bot token and a chat ID where the bot will listen for messages and send replies.
  • OpenAI API Key: An API key for OpenAI (specifically for GPT-4o). This will be used by the "OpenAI Chat Model" and "OpenAI" nodes.
  • MongoDB Database: Access to a MongoDB database to store chat history. You'll need connection details (host, port, database name, collection name, credentials).

Setup/Usage

  1. Import the Workflow:
    • Download the provided JSON file.
    • In your n8n instance, click "Workflows" in the left sidebar.
    • Click "New" -> "Import from JSON" and paste the workflow JSON.
  2. Configure Credentials:
    • Telegram Trigger: Configure your Telegram Bot API Token.
    • Telegram Node: Configure your Telegram Bot API Token.
    • OpenAI Chat Model: Configure your OpenAI API Key.
    • OpenAI Node: Configure your OpenAI API Key.
    • MongoDB Chat Memory: Configure your MongoDB credentials (e.g., connection string, database, collection).
  3. Activate the Workflow: Once all credentials are set up, activate the workflow by toggling the "Active" switch in the top right corner of the workflow editor.
  4. Interact with your Telegram Bot: Send messages, images, or documents to your configured Telegram bot, and the workflow will process them and respond.

Related Templates

Automate Dutch Public Procurement Data Collection with TenderNed

TenderNed Public Procurement What This Workflow Does This workflow automates the collection of public procurement data from TenderNed (the official Dutch tender platform). It: Fetches the latest tender publications from the TenderNed API Retrieves detailed information in both XML and JSON formats for each tender Parses and extracts key information like organization names, titles, descriptions, and reference numbers Filters results based on your custom criteria Stores the data in a database for easy querying and analysis Setup Instructions This template comes with sticky notes providing step-by-step instructions in Dutch and various query options you can customize. Prerequisites TenderNed API Access - Register at TenderNed for API credentials Configuration Steps Set up TenderNed credentials: Add HTTP Basic Auth credentials with your TenderNed API username and password Apply these credentials to the three HTTP Request nodes: "Tenderned Publicaties" "Haal XML Details" "Haal JSON Details" Customize filters: Modify the "Filter op ..." node to match your specific requirements Examples: specific organizations, contract values, regions, etc. How It Works Step 1: Trigger The workflow can be triggered either manually for testing or automatically on a daily schedule. Step 2: Fetch Publications Makes an API call to TenderNed to retrieve a list of recent publications (up to 100 per request). Step 3: Process & Split Extracts the tender array from the response and splits it into individual items for processing. Step 4: Fetch Details For each tender, the workflow makes two parallel API calls: XML endpoint - Retrieves the complete tender documentation in XML format JSON endpoint - Fetches metadata including reference numbers and keywords Step 5: Parse & Merge Parses the XML data and merges it with the JSON metadata and batch information into a single data structure. Step 6: Extract Fields Maps the raw API data to clean, structured fields including: Publication ID and date Organization name Tender title and description Reference numbers (kenmerk, TED number) Step 7: Filter Applies your custom filter criteria to focus on relevant tenders only. Step 8: Store Inserts the processed data into your database for storage and future analysis. Customization Tips Modify API Parameters In the "Tenderned Publicaties" node, you can adjust: offset: Starting position for pagination size: Number of results per request (max 100) Add query parameters for date ranges, status filters, etc. Add More Fields Extend the "Splits Alle Velden" node to extract additional fields from the XML/JSON data, such as: Contract value estimates Deadline dates CPV codes (procurement classification) Contact information Integrate Notifications Add a Slack, Email, or Discord node after the filter to get notified about new matching tenders. Incremental Updates Modify the workflow to only fetch new tenders by: Storing the last execution timestamp Adding date filters to the API query Only processing publications newer than the last run Troubleshooting No data returned? Verify your TenderNed API credentials are correct Check that you have setup youre filter proper Need help setting this up or interested in a complete tender analysis solution? Get in touch πŸ”— LinkedIn – Wessel Bulte

Wessel BulteBy Wessel Bulte
247

πŸŽ“ How to transform unstructured email data into structured format with AI agent

This workflow automates the process of extracting structured, usable information from unstructured email messages across multiple platforms. It connects directly to Gmail, Outlook, and IMAP accounts, retrieves incoming emails, and sends their content to an AI-powered parsing agent built on OpenAI GPT models. The AI agent analyzes each email, identifies relevant details, and returns a clean JSON structure containing key fields: From – sender’s email address To – recipient’s email address Subject – email subject line Summary – short AI-generated summary of the email body The extracted information is then automatically inserted into an n8n Data Table, creating a structured database of email metadata and summaries ready for indexing, reporting, or integration with other tools. --- Key Benefits βœ… Full Automation: Eliminates manual reading and data entry from incoming emails. βœ… Multi-Source Integration: Handles data from different email providers seamlessly. βœ… AI-Driven Accuracy: Uses advanced language models to interpret complex or unformatted content. βœ… Structured Storage: Creates a standardized, query-ready dataset from previously unstructured text. βœ… Time Efficiency: Processes emails in real time, improving productivity and response speed. *βœ… Scalability: Easily extendable to handle additional sources or extract more data fields. --- How it works This workflow automates the transformation of unstructured email data into a structured, queryable format. It operates through a series of connected steps: Email Triggering: The workflow is initiated by one of three different email triggers (Gmail, Microsoft Outlook, or a generic IMAP account), which constantly monitor for new incoming emails. AI-Powered Parsing & Structuring: When a new email is detected, its raw, unstructured content is passed to a central "Parsing Agent." This agent uses a specified OpenAI language model to intelligently analyze the email text. Data Extraction & Standardization: Following a predefined system prompt, the AI agent extracts key information from the email, such as the sender, recipient, subject, and a generated summary. It then forces the output into a strict JSON structure using a "Structured Output Parser" node, ensuring data consistency. Data Storage: Finally, the clean, structured data (the from, to, subject, and summarize fields) is inserted as a new row into a specified n8n Data Table, creating a searchable and reportable database of email information. --- Set up steps To implement this workflow, follow these configuration steps: Prepare the Data Table: Create a new Data Table within n8n. Define the columns with the following names and string type: From, To, Subject, and Summary. Configure Email Credentials: Set up the credential connections for the email services you wish to use (Gmail OAuth2, Microsoft Outlook OAuth2, and/or IMAP). Ensure the accounts have the necessary permissions to read emails. Configure AI Model Credentials: Set up the OpenAI API credential with a valid API key. The workflow is configured to use the model, but this can be changed in the respective nodes if needed. Connect the Nodes: The workflow canvas is already correctly wired. Visually confirm that the email triggers are connected to the "Parsing Agent," which is connected to the "Insert row" (Data Table) node. Also, ensure the "OpenAI Chat Model" and "Structured Output Parser" are connected to the "Parsing Agent" as its AI model and output parser, respectively. Activate the Workflow: Save the workflow and toggle the "Active" switch to ON. The triggers will begin polling for new emails according to their schedule (e.g., every minute), and the automation will start processing incoming messages. --- Need help customizing? Contact me for consulting and support or add me on Linkedin.

DavideBy Davide
1616

Tax deadline management & compliance alerts with GPT-4, Google Sheets & Slack

AI-Driven Tax Compliance & Deadline Management System Description Automate tax deadline monitoring with AI-powered insights. This workflow checks your tax calendar daily at 8 AM, uses GPT-4 to analyze upcoming deadlines across multiple jurisdictions, detects overdue and critical items, and sends intelligent alerts via email and Slack only when immediate action is required. Perfect for finance teams and accounting firms who need proactive compliance management without manual tracking. πŸ›οΈπŸ€–πŸ“Š Good to Know AI-Powered: GPT-4 provides risk assessment and strategic recommendations Multi-Jurisdiction: Handles Federal, State, and Local tax requirements automatically Smart Alerts: Only notifies executives when deadlines are overdue or critical (≀3 days) Priority Classification: Categorizes deadlines as Overdue, Critical, High, or Medium priority Dual Notifications: Critical alerts to leadership + daily summaries to team channel Complete Audit Trail: Logs all checks and deadlines to Google Sheets for compliance records How It Works Daily Trigger - Runs at 8:00 AM every morning Fetch Data - Pulls tax calendar and company configuration from Google Sheets Analyze Deadlines - Calculates days remaining, filters by jurisdiction/entity type, categorizes by priority AI Analysis - GPT-4 provides strategic insights and risk assessment on upcoming deadlines Smart Routing - Only sends alerts if overdue or critical deadlines exist Critical Alerts - HTML email to executives + Slack alert for urgent items Team Updates - Slack summary to finance channel with all upcoming deadlines Logging - Records compliance check results to Google Sheets for audit trail Requirements Google Sheets Structure Sheet 1: TaxCalendar DeadlineID | DeadlineName | DeadlineDate | Jurisdiction | Category | AssignedTo | IsActive FED-Q1 | Form 1120 Q1 | 2025-04-15 | Federal | Income | John Doe | TRUE Sheet 2: CompanyConfig (single row) Jurisdictions | EntityType | FiscalYearEnd Federal, California | Corporation | 12-31 Sheet 3: ComplianceLog (auto-populated) Date | AlertLevel | TotalUpcoming | CriticalCount | OverdueCount 2025-01-15 | HIGH | 12 | 3 | 1 Credentials Needed Google Sheets - Service Account OAuth2 OpenAI - API Key (GPT-4 access required) SMTP - Email account for sending alerts Slack - Bot Token with chat:write permission Setup Steps Import workflow JSON into n8n Add all 4 credentials Replace these placeholders: YOURTAXCALENDAR_ID - Tax calendar sheet ID YOURCONFIGID - Company config sheet ID YOURLOGID - Compliance log sheet ID C12345678 - Slack channel ID tax@company.com - Sender email cfo@company.com - Recipient email Share all sheets with Google service account email Invite Slack bot to channels Test workflow manually Activate the trigger Customizing This Workflow Change Alert Thresholds: Edit "Analyze Deadlines" node: Critical: Change <= 3 to <= 5 for 5-day warning High: Change <= 7 to <= 14 for 2-week notice Medium: Change <= 30 to <= 60 for 2-month lookout Adjust Schedule: Edit "Daily Tax Check" trigger: Change hour/minute for different run time Add multiple trigger times for tax season (8 AM, 2 PM, 6 PM) Add More Recipients: Edit "Send Email" node: To: cfo@company.com, director@company.com CC: accounting@company.com BCC: archive@company.com Customize Email Design: Edit "Format Email" node to change colors, add logo, or modify layout Add SMS Alerts: Insert Twilio node after "Is Critical" for emergency notifications Integrate Task Management: Add HTTP Request node to create tasks in Asana/Jira for critical deadlines Troubleshooting | Issue | Solution | |-------|----------| | No deadlines found | Check date format (YYYY-MM-DD) and IsActive = TRUE | | AI analysis failed | Verify OpenAI API key and account credits | | Email not sending | Test SMTP credentials and check if critical condition met | | Slack not posting | Invite bot to channel and verify channel ID format | | Permission denied | Share Google Sheets with service account email | πŸ“ž Professional Services Need help with implementation or customization? Our team offers: 🎯 Custom workflow development 🏒 Enterprise deployment support πŸŽ“ Team training sessions πŸ”§ Ongoing maintenance πŸ“Š Custom reporting & dashboards πŸ”— Additional API integrations Discover more workflows – Get in touch with us

Oneclick AI SquadBy Oneclick AI Squad
93