Why another Telegram bot framework?
Building Telegram bots is straightforward — the API is well-documented, and there are dozens of libraries in every language. However, most existing Java frameworks share the same fundamental limitation: they process updates sequentially on blocking threads. This works fine for hobby projects handling a few messages per minute, but it becomes a bottleneck the moment your bot needs to serve thousands of concurrent users, handle complex multi-step workflows, or integrate with other microservices in a distributed system.
The challenge is not building a bot — it is building a bot that scales horizontally, recovers gracefully from failures, and fits into a modern reactive architecture without becoming a blocking chokepoint.
This is the problem that motivated the creation of telegram-bot — a modular, fully reactive framework built on Project Reactor and Spring WebFlux. In this article, I will walk through the design decisions, architecture, and practical patterns that make it suitable for production workloads.
Design Goals
Before writing any code, I established several non-negotiable requirements:
- Fully non-blocking I/O. Every operation — from polling the Telegram API to publishing messages on a queue — must return `Mono` or `Flux`. No thread is ever blocked waiting for a response.
- Separation of polling and processing. The component that fetches updates from Telegram should be independent from the component that handles business logic. This enables them to scale independently.
- Pluggable message transport. The framework should not be coupled to a specific message broker. Today we use Apache Pulsar; tomorrow it could be Kafka or RabbitMQ.
- Type safety at the boundary. Bot responses should be modeled as a sealed type hierarchy so that the compiler catches missing cases, not production logs.
- Minimal boilerplate. A developer implementing a bot should only need to write a single function: `Update → Flux<BotResponse>`.
High-Level Architecture
The framework follows a producer-consumer pattern with two independently deployable services communicating through reactive queues:
```mermaid
graph TD
    TG["Telegram API"]
    TG -- "poll updates" --> Poller
    Poller -- "send responses" --> TG
    subgraph Framework
        Poller["TelegramUpdatePoller<br/><i>polls updates, dispatches responses</i>"]
        IQ[("Inbound Queue<br/>(Update)")]
        OQ[("Outbound Queue<br/>(BotResponse)")]
        Worker["TelegramUpdateWorker<br/><i>invokes handler, emits responses</i>"]
        Poller -- "publish Update" --> IQ
        IQ -- "subscribe Update" --> Worker
        Worker -- "publish BotResponse" --> OQ
        OQ -- "subscribe BotResponse" --> Poller
    end
```
Poller is responsible for two things: fetching new updates from the Telegram API on a configurable interval, and dispatching outbound responses back to Telegram. It is stateless apart from tracking the current update offset.
Worker subscribes to inbound updates, invokes the user-provided UpdateHandler, and publishes the resulting BotResponse objects to the outbound queue.
Multiple workers can run in parallel — each processing a subset of updates from a shared subscription.
This separation means you can run a single poller with ten workers behind it, scaling processing power without opening additional connections to the Telegram API.
Module Breakdown
The framework is organized into five Gradle modules, each with a single responsibility:
| Module | Responsibility | Key Classes |
|---|---|---|
| `telegram-bot-client` | Reactive HTTP client for the Telegram Bot API | `TelegramBotClient`, 37 DTO records |
| `telegram-bot-core` | Shared abstractions | `ReactiveQueue<T>`, `UpdateHandler`, `BotResponse` |
| `telegram-bot-poller` | Update polling and response dispatch | `TelegramUpdatePoller` |
| `telegram-bot-worker` | Update processing orchestration | `TelegramUpdateWorker` |
| `telegram-bot-queue-pulsar` | Apache Pulsar queue implementation | `PulsarReactiveQueue<T>` |
This modularity allows consumers to depend only on what they need. If you are building a simple bot that does not require distributed processing, you can use `telegram-bot-client` alone and call the Telegram API directly.
Core Abstractions
ReactiveQueue
The entire distributed architecture rests on a single interface:
```java
public interface ReactiveQueue<T> {
    Mono<Void> publish(T message);
    Flux<T> subscribe();
    Mono<Void> close();
}
```
This interface is intentionally minimal. Any message broker that supports publish-subscribe semantics can implement it. The framework ships with an Apache Pulsar implementation, but adding Kafka or an in-memory queue for testing is a matter of implementing three methods.
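As an illustration of how small the contract is, here is a hypothetical in-memory implementation backed by a Reactor sink. It is not part of the framework; the interface is repeated so the sketch compiles on its own, and it is only suitable for a single JVM (local development, unit tests).

```java
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;
import reactor.core.publisher.Sinks;

// The ReactiveQueue contract from above, repeated so this sketch is self-contained.
interface ReactiveQueue<T> {
    Mono<Void> publish(T message);
    Flux<T> subscribe();
    Mono<Void> close();
}

// Hypothetical in-memory queue: the smallest thing that satisfies the contract.
class InMemoryReactiveQueue<T> implements ReactiveQueue<T> {
    private final Sinks.Many<T> sink = Sinks.many().multicast().onBackpressureBuffer();

    @Override
    public Mono<Void> publish(T message) {
        // A real broker would return an async acknowledgment here; for a
        // single-JVM sketch, emitting into the sink on subscription is enough.
        return Mono.fromRunnable(() ->
            sink.emitNext(message, Sinks.EmitFailureHandler.FAIL_FAST));
    }

    @Override
    public Flux<T> subscribe() {
        return sink.asFlux();
    }

    @Override
    public Mono<Void> close() {
        return Mono.fromRunnable(() ->
            sink.emitComplete(Sinks.EmitFailureHandler.FAIL_FAST));
    }
}
```

Swapping this class in for `PulsarReactiveQueue` lets the poller and worker run end-to-end without a broker.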
UpdateHandler
The developer-facing API is a single functional interface:
```java
@FunctionalInterface
public interface UpdateHandler {
    Flux<BotResponse> handle(Update update);
}
```
Returning Flux<BotResponse> rather than Mono is a deliberate design choice. A single incoming message might require the bot to send a typing indicator,
process a payment, and reply with a confirmation — three distinct responses from one update. The framework handles dispatching each response individually.
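To make the fan-out concrete, here is a dependency-light sketch in which plain strings stand in for `BotResponse` subtypes (the `Handler` interface here is a stand-in, not the framework's `UpdateHandler`):

```java
import reactor.core.publisher.Flux;

// One incoming update expands into three ordered responses. Strings replace
// BotResponse subtypes so the sketch needs nothing beyond Reactor itself.
interface Handler {
    Flux<String> handle(String update);
}

Handler handler = update -> Flux.just(
    "chat_action:typing",            // tell the user work has started
    "charge:" + update,              // e.g. kick off a payment
    "send_message:confirmed"         // final confirmation message
);
```

Because the return type is a `Flux`, the framework can dispatch each element as soon as it is emitted, rather than waiting for the whole handler to finish.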
BotResponse — Sealed Type Hierarchy
Bot responses are modeled as a sealed interface with over 20 subtypes:
```java
@JsonTypeInfo(use = JsonTypeInfo.Id.NAME, property = "@type")
@JsonSubTypes({
    @JsonSubTypes.Type(value = BotResponse.SendMessage.class, name = "send_message"),
    @JsonSubTypes.Type(value = BotResponse.SendPhoto.class, name = "send_photo"),
    @JsonSubTypes.Type(value = BotResponse.SendDocument.class, name = "send_document"),
    // ... 17 more subtypes
})
public sealed interface BotResponse {
    record SendMessage(long chatId, String text, ...) implements BotResponse {}
    record SendPhoto(long chatId, String photo, ...) implements BotResponse {}
    record EditMessageText(long chatId, int messageId, String text, ...) implements BotResponse {}
    // ...
}
```
Using sealed interfaces with records provides three benefits:
- Exhaustive pattern matching. The compiler ensures every response type is handled when dispatching to the Telegram API.
- Immutability. Records guarantee that response objects cannot be mutated after creation.
- Serialization. Jackson’s `@JsonTypeInfo` enables polymorphic serialization over the queue, with a `@type` discriminator field that maps each record to its Telegram API method.
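To illustrate the wire format, here is a minimal, self-contained stand-in hierarchy (two subtypes instead of twenty-plus, with simplified fields that are assumptions, not the framework's real records) showing how the `@type` discriminator round-trips through Jackson:

```java
import com.fasterxml.jackson.annotation.JsonSubTypes;
import com.fasterxml.jackson.annotation.JsonTypeInfo;
import com.fasterxml.jackson.databind.ObjectMapper;

// Miniature version of the sealed hierarchy, for demonstration only.
@JsonTypeInfo(use = JsonTypeInfo.Id.NAME, property = "@type")
@JsonSubTypes({
    @JsonSubTypes.Type(value = Reply.Send.class, name = "send_message"),
    @JsonSubTypes.Type(value = Reply.Edit.class, name = "edit_message_text")
})
sealed interface Reply {
    record Send(long chatId, String text) implements Reply {}
    record Edit(long chatId, int messageId, String text) implements Reply {}
}
```

Serializing a `Reply.Send` produces JSON carrying an `"@type": "send_message"` discriminator, and deserializing that JSON back through the `Reply` interface yields the original record, which is exactly what the queue relies on.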
The Poller — Bridging Telegram and the Queue
TelegramUpdatePoller runs a continuous polling loop on a boundedElastic scheduler:
```java
public void start() {
    // Polling loop: fetch updates, advance the offset, hand off to the inbound queue
    Flux.interval(pollingInterval)
        .flatMap(tick -> client.getUpdates(offset, limit, timeout, allowedUpdates))
        .flatMap(response -> Flux.fromIterable(response.getResult()))
        .doOnNext(update -> offset = update.updateId() + 1)
        .flatMap(inboundQueue::publish)
        .retryWhen(Retry.backoff(Long.MAX_VALUE, Duration.ofSeconds(1)))
        .subscribe();

    // Dispatch loop: pick up responses produced by workers and send them to Telegram
    outboundQueue.subscribe()
        .flatMap(this::dispatchResponse)
        .retry()
        .subscribe();
}
```
Key design decisions:
- Long polling with configurable timeout reduces unnecessary API calls while maintaining low latency for new messages.
- Offset tracking ensures each update is fetched from Telegram only once; combined with a persistent, acknowledged queue, this gives at-least-once processing that survives restarts.
- Retry with backoff ensures transient network failures do not crash the polling loop.
- The poller also subscribes to the outbound queue and dispatches responses using pattern matching over the sealed `BotResponse` type.
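The dispatch step can be sketched with a self-contained miniature of the hierarchy (two stand-in subtypes; the real poller maps each of its twenty-plus cases to the corresponding Telegram API call):

```java
// Two stand-in subtypes instead of the framework's full hierarchy.
sealed interface Resp permits Resp.Send, Resp.Edit {
    record Send(long chatId, String text) implements Resp {}
    record Edit(long chatId, int messageId, String text) implements Resp {}
}

// No default branch: if a new subtype is added to Resp, this switch stops
// compiling until the new case is handled — errors surface at build time.
String dispatch(Resp r) {
    return switch (r) {
        case Resp.Send s -> "POST /sendMessage chat_id=" + s.chatId();
        case Resp.Edit e -> "POST /editMessageText message_id=" + e.messageId();
    };
}
```

This is the concrete payoff of the sealed hierarchy: forgetting to wire up a new response type is a compile error rather than a silent runtime gap.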
The Worker — Processing at Scale
TelegramUpdateWorker subscribes to the inbound queue and processes updates concurrently:
```java
public void start() {
    inboundQueue.subscribe()
        .flatMap(
            update -> handler.handle(update)
                .flatMap(outboundQueue::publish)
                .onErrorResume(e -> {
                    log.error("Handler error for update {}", update.updateId(), e);
                    return Mono.empty();
                }),
            concurrency // default: 256
        )
        .retry()
        .subscribe();
}
```
The concurrency parameter in flatMap controls how many updates are processed simultaneously. At the default of 256, a single worker instance can handle 256
concurrent handler invocations — all without allocating a thread per request, thanks to Project Reactor’s event-loop model.
Error isolation is critical here: if a handler throws for one update, that error is caught and logged, but the subscription continues processing subsequent updates. The bot does not crash because one user sent an unexpected message.
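The isolation pattern can be demonstrated in miniature, independent of any queue or Telegram types: the failure is absorbed inside the inner pipeline, so the outer subscription keeps consuming.

```java
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

// Stripped-down version of the worker's error handling. The "boom" element
// fails inside its own inner Mono, is swallowed there, and the stream goes on.
Flux<String> processed = Flux.just("ok-1", "boom", "ok-2")
    .flatMap(update -> Mono.fromCallable(() -> {
            if (update.equals("boom")) {
                throw new IllegalStateException("handler blew up");
            }
            return "handled:" + update;
        })
        .onErrorResume(e -> Mono.empty())); // log-and-drop, as in the worker
```

Had `onErrorResume` been placed on the outer `Flux` instead, the first failure would cancel the whole subscription, which is precisely the behavior the worker avoids.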
Queue Implementation — Apache Pulsar
The Pulsar queue implementation wraps the Pulsar async client in Reactor types:
```java
public class PulsarReactiveQueue<T> implements ReactiveQueue<T> {

    @Override
    public Mono<Void> publish(T message) {
        return Mono.fromFuture(() -> producer.sendAsync(serialize(message)))
            .then();
    }

    @Override
    public Flux<T> subscribe() {
        return Flux.defer(() -> Mono.fromFuture(consumer::receiveAsync))
            .repeat()
            .flatMap(msg -> {
                try {
                    T value = deserialize(msg.getData());
                    consumer.acknowledgeAsync(msg.getMessageId());
                    return Mono.just(value);
                } catch (Exception e) {
                    consumer.negativeAcknowledge(msg.getMessageId());
                    return Mono.empty();
                }
            });
    }
}
```
Notable patterns:
- Lazy initialization. Producers and consumers are created on first use, not at construction time.
- Negative acknowledgment. If a message cannot be deserialized (e.g., schema version mismatch), it is negatively acknowledged, causing Pulsar to redeliver it later or route it to a dead-letter topic.
- Shared subscription. Multiple worker instances can subscribe to the same topic, and Pulsar distributes messages across them — enabling horizontal scaling without code changes.
Practical Example — Building an Echo Bot
With the framework in place, building a bot requires minimal code:
```java
// 1. Create the Telegram client
TelegramBotClient client = new TelegramBotClient("YOUR_BOT_TOKEN");

// 2. Set up queues (Pulsar, or in-memory for development)
ReactiveQueue<Update> inbound = new PulsarReactiveQueue<>(...);
ReactiveQueue<BotResponse> outbound = new PulsarReactiveQueue<>(...);

// 3. Define your handler — the only code you need to write
UpdateHandler handler = update -> {
    if (update.message() != null && update.message().text() != null) {
        return Flux.just(new BotResponse.SendMessage(
            update.message().chat().id(),
            "Echo: " + update.message().text(),
            ParseMode.HTML,
            null
        ));
    }
    return Flux.empty();
};

// 4. Start the poller and worker
TelegramUpdatePoller poller = new TelegramUpdatePoller(
    client, inbound, outbound,
    Duration.ofSeconds(1), 100, 30,
    List.of("message")
);
TelegramUpdateWorker worker = new TelegramUpdateWorker(inbound, outbound, handler);

poller.start();
worker.start();
```
The entire bot logic lives in the UpdateHandler lambda. Everything else — polling, queuing, dispatching, error handling, concurrency — is handled by the
framework.
Benefits of This Approach
Performance
Non-blocking I/O means the framework uses a small, fixed thread pool regardless of the number of concurrent operations. A single worker instance with 256 concurrency can sustain thousands of messages per second without the memory overhead of thread-per-request models.
Horizontal Scalability
Because the poller and workers communicate through a message queue, scaling is straightforward. The diagram below illustrates how a single poller fans out to multiple workers, and how the system scales from a minimal setup to handling peak load:
```mermaid
graph LR
    TG["Telegram API"]
    subgraph Polling
        P1["Poller"]
    end
    subgraph Broker
        IQ[("Inbound Queue")]
        OQ[("Outbound Queue")]
    end
    subgraph Workers
        W1["Worker 1"]
        W2["Worker 2"]
        W3["Worker 3"]
        WN["Worker N"]
    end
    TG -- "updates" --> P1
    P1 -- "publish" --> IQ
    IQ -- "shared sub" --> W1
    IQ -- "shared sub" --> W2
    IQ -- "shared sub" --> W3
    IQ -. "scale out" .-> WN
    W1 -- "publish" --> OQ
    W2 -- "publish" --> OQ
    W3 -- "publish" --> OQ
    WN -. "publish" .-> OQ
    OQ -- "subscribe" --> P1
    P1 -- "responses" --> TG
```
Each worker processes updates independently with its own concurrency pool. Scaling from one to ten workers requires no code changes — Pulsar’s shared subscription distributes messages across all active consumers automatically.
- More processing power? Add more worker instances.
- Multiple bot tokens? Run separate pollers, each publishing to the same queue infrastructure.
- Geographic distribution? Deploy workers closer to your downstream services while keeping the poller in a single region.
Operational Resilience
- Backpressure. If workers cannot keep up, messages accumulate in the queue rather than causing out-of-memory errors.
- Retry semantics. Both the poller and worker retry on transient failures without losing messages.
- Graceful shutdown. The `dispose()` method on both components ensures in-flight operations complete before the process exits.
Developer Experience
- One interface to implement. The `UpdateHandler` functional interface keeps the developer-facing API surface minimal.
- Type-safe responses. The sealed `BotResponse` hierarchy makes it impossible to construct an invalid response at compile time.
- Testability. Handlers are pure functions from `Update` to `Flux<BotResponse>` — no mocking of HTTP clients or queue infrastructure is needed for unit tests.
Testing Strategy
The framework employs multiple testing layers:
- Unit tests for handlers — pure function testing with `StepVerifier`:
```java
@Test
void shouldEchoMessage() {
    Update update = createTestUpdate("hello");

    StepVerifier.create(handler.handle(update))
        .assertNext(response -> {
            assertInstanceOf(BotResponse.SendMessage.class, response);
            assertEquals("Echo: hello", ((BotResponse.SendMessage) response).text());
        })
        .verifyComplete();
}
```
- Integration tests for the HTTP client using MockServer to simulate the Telegram API.
- End-to-end tests for the Pulsar queue using Docker Compose to spin up a Pulsar standalone instance.
This layered approach ensures that each component can be tested in isolation, while the full system is validated against real infrastructure in CI.
Tech Stack Summary
| Layer | Technology |
|---|---|
| Language | Java 25+ (records, sealed interfaces, pattern matching) |
| Reactive runtime | Project Reactor 3.7 |
| HTTP client | Spring WebFlux (WebClient) |
| Message broker | Apache Pulsar 3.3 |
| Serialization | Jackson with polymorphic type support |
| Observability | Micrometer + Prometheus |
| Build | Gradle 9.1 with version catalogs |
| CI | GitHub Actions |
Conclusion
Building a Telegram bot framework from scratch might seem like reinventing the wheel. But existing solutions were not designed for the constraints of modern distributed systems — non-blocking I/O, horizontal scalability, backpressure handling, and type-safe message contracts.
By separating concerns into distinct modules, modeling the queue as a pluggable abstraction, and leveraging Java’s sealed interfaces for compile-time safety, the framework remains simple for common use cases while supporting complex production deployments.
The project is open-source and available on GitHub. Contributions, feedback, and alternative queue implementations are welcome.