Why another Telegram bot framework?

Building Telegram bots is straightforward — the API is well-documented, and there are dozens of libraries in every language. However, most existing Java frameworks share the same fundamental limitation: they process updates sequentially on blocking threads. This works fine for hobby projects handling a few messages per minute, but it becomes a bottleneck the moment your bot needs to serve thousands of concurrent users, handle complex multi-step workflows, or integrate with other microservices in a distributed system.

The challenge is not building a bot — it is building a bot that scales horizontally, recovers gracefully from failures, and fits into a modern reactive architecture without becoming a blocking chokepoint.

This is the problem that motivated the creation of telegram-bot — a modular, fully reactive framework built on Project Reactor and Spring WebFlux. In this article, I will walk through the design decisions, architecture, and practical patterns that make it suitable for production workloads.


Design Goals

Before writing any code, I established several non-negotiable requirements:

  1. Fully non-blocking I/O. Every operation — from polling the Telegram API to publishing messages on a queue — must return Mono or Flux. No thread is ever blocked waiting for a response.
  2. Separation of polling and processing. The component that fetches updates from Telegram should be independent from the component that handles business logic. This enables them to scale independently.
  3. Pluggable message transport. The framework should not be coupled to a specific message broker. Today we use Apache Pulsar; tomorrow it could be Kafka or RabbitMQ.
  4. Type safety at the boundary. Bot responses should be modeled as a sealed type hierarchy so that the compiler catches missing cases, not production logs.
  5. Minimal boilerplate. A developer implementing a bot should only need to write a single function: Update → Flux<BotResponse>.

High-Level Architecture

The framework follows a producer-consumer pattern with two independently deployable services communicating through reactive queues:

  graph TD
    TG["Telegram API"]

    TG -- "poll updates" --> Poller
    Poller -- "send responses" --> TG

    subgraph Framework
        Poller["TelegramUpdatePoller<br/><i>polls updates, dispatches responses</i>"]
        IQ[("Inbound Queue<br/>(Update)")]
        OQ[("Outbound Queue<br/>(BotResponse)")]
        Worker["TelegramUpdateWorker<br/><i>invokes handler, emits responses</i>"]

        Poller -- "publish Update" --> IQ
        IQ -- "subscribe Update" --> Worker
        Worker -- "publish BotResponse" --> OQ
        OQ -- "subscribe BotResponse" --> Poller
    end

Poller is responsible for two things: fetching new updates from the Telegram API on a configurable interval, and dispatching outbound responses back to Telegram. It is stateless apart from tracking the current update offset.

Worker subscribes to inbound updates, invokes the user-provided UpdateHandler, and publishes the resulting BotResponse objects to the outbound queue. Multiple workers can run in parallel — each processing a subset of updates from a shared subscription.

This separation means you can run a single poller with ten workers behind it, scaling processing power without opening additional connections to the Telegram API.


Module Breakdown

The framework is organized into five Gradle modules, each with a single responsibility:

Module                     Responsibility                                  Key Classes
telegram-bot-client        Reactive HTTP client for the Telegram Bot API   TelegramBotClient, 37 DTO records
telegram-bot-core          Shared abstractions                             ReactiveQueue<T>, UpdateHandler, BotResponse
telegram-bot-poller        Update polling and response dispatch            TelegramUpdatePoller
telegram-bot-worker        Update processing orchestration                 TelegramUpdateWorker
telegram-bot-queue-pulsar  Apache Pulsar queue implementation              PulsarReactiveQueue<T>

This modularity allows consumers to depend only on what they need. If you are building a simple bot that does not require distributed processing, you can use telegram-bot-client alone and call the Telegram API directly.


Core Abstractions

ReactiveQueue

The entire distributed architecture rests on a single interface:

public interface ReactiveQueue<T> {
    Mono<Void> publish(T message);
    Flux<T> subscribe();
    Mono<Void> close();
}

This interface is intentionally minimal. Any message broker that supports publish-subscribe semantics can implement it. The framework ships with an Apache Pulsar implementation, but adding Kafka or an in-memory queue for testing is a matter of implementing three methods.
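The interface is small enough that an in-memory version for unit tests fits in a few lines. The sketch below is illustrative (the class name InMemoryReactiveQueue is not part of the framework) and builds on Reactor's Sinks API:

```java
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;
import reactor.core.publisher.Sinks;

// The interface from telegram-bot-core, repeated so this sketch compiles standalone.
interface ReactiveQueue<T> {
    Mono<Void> publish(T message);
    Flux<T> subscribe();
    Mono<Void> close();
}

// Illustrative in-memory queue for tests: a multicast sink that buffers
// messages published before the first subscriber attaches.
class InMemoryReactiveQueue<T> implements ReactiveQueue<T> {

    private final Sinks.Many<T> sink = Sinks.many().multicast().onBackpressureBuffer();

    @Override
    public Mono<Void> publish(T message) {
        // emitNext fails fast on overflow or a terminated sink, which is fine for tests
        return Mono.fromRunnable(() -> sink.emitNext(message, Sinks.EmitFailureHandler.FAIL_FAST));
    }

    @Override
    public Flux<T> subscribe() {
        return sink.asFlux();
    }

    @Override
    public Mono<Void> close() {
        return Mono.fromRunnable(sink::tryEmitComplete);
    }
}
```

A handler test can then wire a worker against two such queues with no broker running at all.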

UpdateHandler

The developer-facing API is a single functional interface:

@FunctionalInterface
public interface UpdateHandler {
    Flux<BotResponse> handle(Update update);
}

Returning Flux<BotResponse> rather than Mono is a deliberate design choice. A single incoming message might require the bot to send a typing indicator, process a payment, and reply with a confirmation — three distinct responses from one update. The framework handles dispatching each response individually.
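To make the fan-out concrete, here is a self-contained miniature. The Update, BotResponse, and UpdateHandler types below are simplified stand-ins for the framework's real ones, and the SendChatAction subtype is an assumption:

```java
import reactor.core.publisher.Flux;

public class FanOutExample {

    // Simplified stand-ins for the framework types, kept minimal on purpose.
    record Update(long chatId, String text) {}

    sealed interface BotResponse {
        record SendChatAction(long chatId, String action) implements BotResponse {}
        record SendMessage(long chatId, String text) implements BotResponse {}
    }

    @FunctionalInterface
    interface UpdateHandler {
        Flux<BotResponse> handle(Update update);
    }

    // One incoming update yields two responses: a "typing" indicator first,
    // then the actual reply. The framework dispatches each element in turn.
    static final UpdateHandler HANDLER = update -> Flux.just(
            new BotResponse.SendChatAction(update.chatId(), "typing"),
            new BotResponse.SendMessage(update.chatId(), "Confirmed: " + update.text())
    );
}
```

With a Mono-based signature, the handler would have to choose one of these responses or smuggle the rest through side effects; Flux makes the multiplicity explicit.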

BotResponse — Sealed Type Hierarchy

Bot responses are modeled as a sealed interface with over 20 subtypes:

@JsonTypeInfo(use = JsonTypeInfo.Id.NAME, property = "@type")
@JsonSubTypes({
    @JsonSubTypes.Type(value = BotResponse.SendMessage.class, name = "send_message"),
    @JsonSubTypes.Type(value = BotResponse.SendPhoto.class, name = "send_photo"),
    @JsonSubTypes.Type(value = BotResponse.SendDocument.class, name = "send_document"),
    // ... 17 more subtypes
})
public sealed interface BotResponse {
    record SendMessage(long chatId, String text, ...) implements BotResponse {}
    record SendPhoto(long chatId, String photo, ...) implements BotResponse {}
    record EditMessageText(long chatId, int messageId, String text, ...) implements BotResponse {}
    // ...
}

Using sealed interfaces with records provides three benefits:

  1. Exhaustive pattern matching. The compiler ensures every response type is handled when dispatching to the Telegram API.
  2. Immutability. Records guarantee that response objects cannot be mutated after creation.
  3. Serialization. Jackson’s @JsonTypeInfo enables polymorphic serialization over the queue, with a @type discriminator field that maps each record to its Telegram API method.
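Benefit 1 is easiest to see in the dispatch step. The sketch below uses a trimmed two-subtype hierarchy; because the interface is sealed, the switch needs no default branch, and a new subtype without a matching case fails compilation instead of surfacing in production:

```java
public class DispatchExample {

    // Trimmed two-subtype version of the sealed hierarchy, for illustration only.
    sealed interface BotResponse {
        record SendMessage(long chatId, String text) implements BotResponse {}
        record SendPhoto(long chatId, String photo) implements BotResponse {}
    }

    // Exhaustive pattern-matching switch: the compiler verifies that every
    // permitted subtype is covered, so no default branch is required.
    static String telegramMethod(BotResponse response) {
        return switch (response) {
            case BotResponse.SendMessage m -> "sendMessage";
            case BotResponse.SendPhoto p -> "sendPhoto";
        };
    }
}
```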

The Poller — Bridging Telegram and the Queue

TelegramUpdatePoller runs a continuous polling loop on a boundedElastic scheduler:

public void start() {
    Flux.interval(pollingInterval, Schedulers.boundedElastic())
        // concatMap, not flatMap: long-poll requests must never overlap,
        // or two in-flight calls would reuse the same offset
        .concatMap(tick -> client.getUpdates(offset, limit, timeout, allowedUpdates))
        .flatMap(response -> Flux.fromIterable(response.getResult()))
        .doOnNext(update -> offset = update.updateId() + 1)
        .flatMap(inboundQueue::publish)
        .retryWhen(Retry.backoff(Long.MAX_VALUE, Duration.ofSeconds(1)))  // reactor.util.retry.Retry
        .subscribe();

    outboundQueue.subscribe()
        .flatMap(this::dispatchResponse)
        .retryWhen(Retry.backoff(Long.MAX_VALUE, Duration.ofSeconds(1)))
        .subscribe();
}

Key design decisions:

  • Long polling with configurable timeout reduces unnecessary API calls while maintaining low latency for new messages.
  • Offset tracking ensures each update is fetched from Telegram only once; combined with a persistent queue, this gives at-least-once processing across restarts.
  • Retry with backoff ensures transient network failures do not crash the polling loop.
  • The poller also subscribes to the outbound queue and dispatches responses using pattern matching over the sealed BotResponse type.

The Worker — Processing at Scale

TelegramUpdateWorker subscribes to the inbound queue and processes updates concurrently:

public void start() {
    inboundQueue.subscribe()
        .flatMap(
            update -> handler.handle(update)
                .flatMap(outboundQueue::publish)
                .onErrorResume(e -> {
                    log.error("Handler error for update {}", update.updateId(), e);
                    return Mono.empty();
                }),
            concurrency  // default: 256
        )
        .retry()
        .subscribe();
}

The concurrency parameter in flatMap controls how many updates are processed simultaneously. At the default of 256, a single worker instance can handle 256 concurrent handler invocations — all without allocating a thread per request, since Reactor multiplexes the work over a small, fixed scheduler pool.

Error isolation is critical here: if a handler throws for one update, that error is caught and logged, but the subscription continues processing subsequent updates. The bot does not crash because one user sent an unexpected message.
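The containment pattern can be exercised on its own, outside the framework. In this sketch one element fails, but since onErrorResume is applied to the inner Mono (per element), the outer subscription keeps going:

```java
import java.util.List;

import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

public class ErrorIsolationExample {

    static List<String> process() {
        return Flux.just("ok-1", "boom", "ok-2")
                .flatMap(s -> Mono.fromCallable(() -> {
                            if (s.equals("boom")) {
                                throw new IllegalStateException("handler failed for " + s);
                            }
                            return s.toUpperCase();
                        })
                        // Error handling inside flatMap: only this element is
                        // dropped, the subscription itself stays alive.
                        .onErrorResume(e -> Mono.empty()))
                .collectList()
                .block();
    }
}
```

Without the inner onErrorResume, the IllegalStateException would terminate the entire Flux; the worker applies this same per-element containment around the user's handler.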


Queue Implementation — Apache Pulsar

The Pulsar queue implementation wraps the Pulsar async client in Reactor types:

public class PulsarReactiveQueue<T> implements ReactiveQueue<T> {

    @Override
    public Mono<Void> publish(T message) {
        return Mono.fromFuture(() -> producer.sendAsync(serialize(message)))
                   .then();
    }

    @Override
    public Flux<T> subscribe() {
        return Flux.defer(() -> Mono.fromFuture(consumer::receiveAsync))
                   .repeat()
                   .flatMap(msg -> {
                       try {
                           T value = deserialize(msg.getData());
                           consumer.acknowledgeAsync(msg.getMessageId());
                           return Mono.just(value);
                       } catch (Exception e) {
                           consumer.negativeAcknowledge(msg.getMessageId());
                           return Mono.empty();
                       }
                   });
    }
}

Notable patterns:

  • Lazy initialization. Producers and consumers are created on first use, not at construction time.
  • Negative acknowledgment. If a message cannot be deserialized (e.g., schema version mismatch), it is negatively acknowledged, causing Pulsar to redeliver it later or route it to a dead-letter topic.
  • Shared subscription. Multiple worker instances can subscribe to the same topic, and Pulsar distributes messages across them — enabling horizontal scaling without code changes.
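These patterns map directly onto the Pulsar consumer configuration. The following sketch shows roughly how such a consumer could be built; the topic names, subscription name, and redelivery numbers are illustrative, and a running broker is required:

```java
import java.util.concurrent.TimeUnit;

import org.apache.pulsar.client.api.Consumer;
import org.apache.pulsar.client.api.DeadLetterPolicy;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.SubscriptionType;

public class ConsumerSetup {

    static Consumer<byte[]> inboundConsumer(PulsarClient client) throws Exception {
        return client.newConsumer()
                .topic("persistent://public/default/bot-inbound")  // illustrative topic
                .subscriptionName("bot-workers")
                // Shared subscription: messages are distributed across all
                // connected workers, enabling horizontal scaling.
                .subscriptionType(SubscriptionType.Shared)
                // Negatively acknowledged messages are redelivered after a delay...
                .negativeAckRedeliveryDelay(30, TimeUnit.SECONDS)
                // ...and after too many redeliveries they move to a dead-letter topic.
                .deadLetterPolicy(DeadLetterPolicy.builder()
                        .maxRedeliverCount(5)
                        .deadLetterTopic("persistent://public/default/bot-inbound-dlq")
                        .build())
                .subscribe();
    }
}
```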

Practical Example — Building an Echo Bot

With the framework in place, building a bot requires minimal code:

// 1. Create the Telegram client
TelegramBotClient client = new TelegramBotClient("YOUR_BOT_TOKEN");

// 2. Set up queues (Pulsar, or in-memory for development)
ReactiveQueue<Update> inbound = new PulsarReactiveQueue<>(...);
ReactiveQueue<BotResponse> outbound = new PulsarReactiveQueue<>(...);

// 3. Define your handler — the only code you need to write
UpdateHandler handler = update -> {
    if (update.message() != null && update.message().text() != null) {
        return Flux.just(new BotResponse.SendMessage(
            update.message().chat().id(),
            "Echo: " + update.message().text(),
            ParseMode.HTML,
            null
        ));
    }
    return Flux.empty();
};

// 4. Start the poller and worker
TelegramUpdatePoller poller = new TelegramUpdatePoller(
    client, inbound, outbound,
    Duration.ofSeconds(1),  // polling interval
    100,                    // max updates per poll
    30,                     // long-poll timeout, seconds
    List.of("message")      // allowed update types
);
TelegramUpdateWorker worker = new TelegramUpdateWorker(inbound, outbound, handler);

poller.start();
worker.start();

The entire bot logic lives in the UpdateHandler lambda. Everything else — polling, queuing, dispatching, error handling, concurrency — is handled by the framework.


Benefits of This Approach

Performance

Non-blocking I/O means the framework uses a small, fixed thread pool regardless of the number of concurrent operations. A single worker instance with 256 concurrency can sustain thousands of messages per second without the memory overhead of thread-per-request models.

Horizontal Scalability

Because the poller and workers communicate through a message queue, scaling is straightforward. The diagram below illustrates how a single poller fans out to multiple workers, and how the system scales from a minimal setup to handling peak load:

  graph LR
    TG["Telegram API"]

    subgraph Polling
        P1["Poller"]
    end

    subgraph Broker
        IQ[("Inbound Queue")]
        OQ[("Outbound Queue")]
    end

    subgraph Workers
        W1["Worker 1"]
        W2["Worker 2"]
        W3["Worker 3"]
        WN["Worker N"]
    end

    TG -- "updates" --> P1
    P1 -- "publish" --> IQ
    IQ -- "shared sub" --> W1
    IQ -- "shared sub" --> W2
    IQ -- "shared sub" --> W3
    IQ -. "scale out" .-> WN
    W1 -- "publish" --> OQ
    W2 -- "publish" --> OQ
    W3 -- "publish" --> OQ
    WN -. "publish" .-> OQ
    OQ -- "subscribe" --> P1
    P1 -- "responses" --> TG

Each worker processes updates independently with its own concurrency pool. Scaling from one to ten workers requires no code changes — Pulsar’s shared subscription distributes messages across all active consumers automatically.

  • More processing power? Add more worker instances. Pulsar’s shared subscription distributes load automatically.
  • Multiple bot tokens? Run separate pollers, each publishing to the same queue infrastructure.
  • Geographic distribution? Deploy workers closer to your downstream services while keeping the poller in a single region.

Operational Resilience

  • Backpressure. If workers cannot keep up, messages accumulate in the queue rather than causing out-of-memory errors.
  • Retry semantics. Both the poller and worker retry on transient failures without losing messages.
  • Graceful shutdown. The dispose() method on both components ensures in-flight operations complete before the process exits.

Developer Experience

  • One interface to implement. The UpdateHandler functional interface keeps the developer-facing API surface minimal.
  • Type-safe responses. The sealed BotResponse hierarchy makes it impossible to construct an invalid response at compile time.
  • Testability. Handlers are pure functions from Update to Flux<BotResponse> — no mocking of HTTP clients or queue infrastructure needed for unit tests.

Testing Strategy

The framework employs multiple testing layers:

  1. Unit tests for handlers — pure function testing with StepVerifier:
@Test
void shouldEchoMessage() {
    Update update = createTestUpdate("hello");

    StepVerifier.create(handler.handle(update))
        .assertNext(response -> {
            assertInstanceOf(BotResponse.SendMessage.class, response);
            assertEquals("Echo: hello", ((BotResponse.SendMessage) response).text());
        })
        .verifyComplete();
}
  2. Integration tests for the HTTP client using MockServer to simulate the Telegram API.
  3. End-to-end tests for the Pulsar queue using Docker Compose to spin up a Pulsar standalone instance.

This layered approach ensures that each component can be tested in isolation, while the full system is validated against real infrastructure in CI.


Tech Stack Summary

Layer             Technology
Language          Java 25+ (records, sealed interfaces, pattern matching)
Reactive runtime  Project Reactor 3.7
HTTP client       Spring WebFlux (WebClient)
Message broker    Apache Pulsar 3.3
Serialization     Jackson with polymorphic type support
Observability     Micrometer + Prometheus
Build             Gradle 9.1 with version catalogs
CI                GitHub Actions

Conclusion

Building a Telegram bot framework from scratch might seem like reinventing the wheel. But existing solutions were not designed for the constraints of modern distributed systems — non-blocking I/O, horizontal scalability, backpressure handling, and type-safe message contracts.

By separating concerns into distinct modules, modeling the queue as a pluggable abstraction, and leveraging Java’s sealed interfaces for compile-time safety, the framework remains simple for common use cases while supporting complex production deployments.

The project is open-source and available on GitHub. Contributions, feedback, and alternative queue implementations are welcome.


References

  1. Telegram Bot API Documentation
  2. Project Reactor Reference Guide
  3. Spring WebFlux Documentation
  4. Apache Pulsar Documentation
  5. Reactive Streams Specification
  6. JEP 409: Sealed Classes