Automate a 'Chat with your PDF' Bot on Telegram with Google Gemini & Pinecone
This n8n template from Intuz provides a complete solution to automate a powerful, AI-driven 'Chat with your PDF' bot on Telegram.
It uses Retrieval-Augmented Generation (RAG) to allow users to upload documents, which are then indexed into a vector database, enabling the bot to answer questions based only on the provided content.
Who's this workflow for?
- Researchers & Students
- Legal & Compliance Teams
- Business Analysts & Financial Advisors
- Anyone needing to quickly find information within large documents
How it works
This workflow has two primary functions: indexing a new document and answering questions about it.
1. Uploading & Indexing a Document:
- A user sends a PDF file to the Telegram bot.
- n8n downloads the document, extracts the text, and splits it into small, manageable chunks.
- Using Google Gemini, each text chunk is converted into a numerical representation (an "embedding").
- These embeddings are stored in a Pinecone vector database, making the document's content searchable.
- The bot sends a confirmation message to the user that the document has been successfully saved.
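The chunk-and-embed step above can be sketched outside n8n in plain Python. This is a minimal illustration of the pattern, not the workflow's internals: `embed` is a stand-in for a real Gemini embeddings call, and the in-memory dict stands in for Pinecone.

```python
# Minimal sketch of the indexing flow: split text into overlapping chunks,
# embed each chunk, and store the vectors in an index keyed by chunk id.
# `embed` is a placeholder for a real Gemini embeddings call (e.g. embedding-001).

def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of at most `chunk_size` characters with `overlap`."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

def embed(text: str) -> list[float]:
    # Placeholder embedding: a character-frequency vector. A real setup would
    # call the Gemini embeddings API and get a 768-dimensional vector back.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def index_document(text: str) -> dict[str, list[float]]:
    """Chunk the document and store one embedding per chunk, as Pinecone would."""
    return {f"chunk-{i}": embed(c) for i, c in enumerate(split_into_chunks(text))}

index = index_document("PDF text goes here... " * 30)
```

The overlap between consecutive chunks preserves context across chunk boundaries, which is why the workflow uses a Recursive Character Text Splitter rather than hard cuts.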
2. Asking a Question (RAG):
- A user sends a regular text message (a question) to the bot.
- n8n converts the user's question into an embedding using Google Gemini.
- It then searches the Pinecone database to find the most relevant text chunks from the uploaded PDF that match the question.
- These relevant chunks (the "context") are sent to the Gemini chat model along with the original question.
- Gemini generates a new, accurate answer based only on the provided context and sends it back to the user in Telegram.
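The retrieval step above reduces to a nearest-neighbour search over the stored vectors. A toy sketch of top-k cosine similarity follows; Pinecone performs this at scale server-side, and the 3-dimensional vectors here are hand-made stand-ins for real 768-dimensional Gemini embeddings.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query: list[float], store: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the ids of the k stored vectors most similar to the query."""
    ranked = sorted(store, key=lambda cid: cosine(query, store[cid]), reverse=True)
    return ranked[:k]

# Hand-made 3-d vectors standing in for 768-d Gemini embeddings.
store = {
    "refunds":  [0.9, 0.1, 0.0],
    "shipping": [0.1, 0.9, 0.0],
    "warranty": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.0]            # e.g. "how do refunds work?"
context_ids = top_k(query, store)  # most similar chunk ids first
```

The chunks behind the returned ids become the "context" that is passed to the chat model in the next step.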
Key Requirements to Use This Template
1. n8n Instance & Required Nodes:
- An active n8n account (Cloud or self-hosted).
- This workflow uses the official n8n LangChain integration (@n8n/n8n-nodes-langchain). If you are using a self-hosted version of n8n, please ensure this package is installed.
2. Telegram Account:
- A Telegram bot created via the BotFather, along with its API token.
3. Google Gemini AI Account:
- A Google Cloud account with the Vertex AI API enabled and an associated API Key.
4. Pinecone Account:
- A Pinecone account with an API key.
- You must have a vector index created in Pinecone. For use with Google Gemini's embedding-001 model, the index must be configured with 768 dimensions.
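A dimension mismatch between the embedding model and the index is a common failure mode: Pinecone rejects upserts whose vector length differs from the index dimension. A hypothetical pre-flight check is sketched below; `fake_gemini_embed` is a stub standing in for a real embedding-001 call.

```python
INDEX_DIMENSION = 768  # must match the Pinecone index configuration

def fake_gemini_embed(text: str) -> list[float]:
    # Stub standing in for a Gemini embedding-001 call, which returns
    # a 768-dimensional vector for any input text.
    return [0.0] * 768

def check_vector(vec: list[float], expected_dim: int = INDEX_DIMENSION) -> None:
    """Fail fast before upserting instead of letting Pinecone reject the batch."""
    if len(vec) != expected_dim:
        raise ValueError(f"embedding has {len(vec)} dims, index expects {expected_dim}")

check_vector(fake_gemini_embed("sample chunk"))  # passes for a 768-dim vector
```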
Setup Instructions
1. Telegram Configuration:
- In the "Telegram Message Trigger" node, create a new credential and add your Telegram bot's API token.
- Do the same for the "Telegram Response" and "Telegram Response about Database" nodes.
2. Pinecone Configuration:
- In both "Pinecone Vector Store" nodes, create a new credential and add your Pinecone API key.
- In the "Index" field of both nodes, enter the name of your pre-configured Pinecone index (e.g., telegram).
3. Google Gemini Configuration:
- In all three Google Gemini nodes (Embeddings Google Gemini, Embeddings Google Gemini1, and Google Gemini Chat Model), create a new credential and add your Google Gemini (PaLM) API key.
4. Activate and Use:
- Save the workflow and toggle the "Active" switch to ON.
- To use: First, send a PDF document to your bot. Wait for the confirmation message. Then, you can start asking questions about the content of that PDF.
Connect with us
- Website: https://www.intuz.com/services
- Email: getstarted@intuz.com
- LinkedIn: https://www.linkedin.com/company/intuz
- Get Started: https://n8n.partnerlinks.io/intuz
For Custom Workflow Automation
Click here to Get Started
Automate a Chat with Your PDF Bot on Telegram with Google Gemini & Pinecone
This n8n workflow enables you to create an interactive Telegram bot that can answer questions based on the content of a PDF document. It leverages Google Gemini for language understanding and Pinecone as a vector store for efficient document retrieval, allowing users to chat with their PDF bot directly through Telegram.
What it does
This workflow automates the following steps:
- Listens for Telegram Messages: It acts as a Telegram bot, waiting for incoming messages from users.
- Validates Input: Checks if the incoming message contains text. If not, it stops with an error.
- Processes PDF Content (Initial Setup/Update):
- Loads a PDF document using a Default Data Loader.
- Splits the document into smaller, manageable chunks using a Recursive Character Text Splitter.
- Generates embeddings for these text chunks using Google Gemini Embeddings.
- Stores these embeddings in a Pinecone Vector Store, making the document searchable.
- Limits the number of processed items for efficiency.
- Retrieves Relevant Information: When a user sends a query, it uses the Pinecone Vector Store to retrieve the most relevant document chunks based on the user's question.
- Generates Answers with Google Gemini: It then uses the retrieved information and the user's query to generate a comprehensive answer using the Google Gemini Chat Model.
- Sends Response to Telegram: Finally, it sends the generated answer back to the user via Telegram.
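The "generate answers" step above amounts to stuffing the retrieved chunks into the model prompt alongside the user's question. A minimal sketch of that prompt assembly follows; the exact template used by n8n's LangChain nodes differs, this just illustrates the RAG pattern.

```python
def build_rag_prompt(question: str, context_chunks: list[str]) -> str:
    """Combine retrieved chunks and the user question into a grounded prompt."""
    context = "\n\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(context_chunks))
    return (
        "Answer the question using ONLY the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_rag_prompt(
    "What is the refund window?",
    ["Refunds are accepted within 30 days of purchase.",
     "Shipping takes 3-5 business days."],
)
# The assembled prompt would then be sent to the Gemini chat model.
```

Instructing the model to answer only from the supplied context is what keeps the bot's replies grounded in the uploaded PDF rather than in the model's general knowledge.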
Prerequisites/Requirements
To use this workflow, you will need:
- n8n Instance: A running n8n instance.
- Telegram Bot Token: A Telegram bot token obtained from BotFather.
- Google Gemini API Key: An API key for Google Gemini (for both embeddings and the chat model).
- Pinecone Account and API Key: An account with Pinecone, including an API key and environment details.
- PDF Document: The PDF document you want your bot to answer questions from. This PDF needs to be accessible by the workflow (e.g., uploaded as binary data to the "Default Data Loader" node or fetched from a URL).
Setup/Usage
- Import the Workflow:
- Download the provided JSON file.
- In your n8n instance, go to "Workflows" and click "New".
- Click the three dots next to "New Workflow" and select "Import from JSON".
- Paste the workflow JSON or upload the file.
- Configure Credentials:
- Telegram Trigger: Create a new Telegram API credential using your bot token.
- Telegram (Send Message): Use the same Telegram API credential.
- Embeddings Google Gemini: Create a new Google Gemini API credential with your API key.
- Google Gemini Chat Model: Create a new Google Gemini API credential with your API key.
- Pinecone Vector Store: Create a new Pinecone API credential with your API key and environment details.
- Configure PDF Document:
- Locate the "Default Data Loader" node (ID 1243).
- Ensure your PDF content is correctly loaded into this node. You may need to manually upload a binary file, or configure the node to fetch from a URL if your PDF is hosted elsewhere.
- Activate the Workflow: Once all credentials and configurations are set, activate the workflow.
Your Telegram bot will now be ready to answer questions based on your PDF content!