Long Polling Implementation With Java and Spring Boot

Long polling was used heavily in the past. It was the technique that first made the web feel real-time.

I think a little history will help you understand it better.

The Brief History Of Internet

If you are old enough, you will remember that the web in its early days was very boring. By boring I mean static, with no moving parts.

It worked synchronously and in one direction only, under the HTTP request/response model.

Figure: Request/Response model (unidirectional)

In this model, the client asks the server to send a particular web page, and the server sends it to the client. The story ends there.

If the client wants another page, it will send another request for the page’s data and the server will send the requested data as an HTTP Response.

There is nothing fancy. Very simple, very monotonous.

So in the early days of the web, HTTP made sense as a simple request/response protocol because it was designed to serve documents from the server on the client’s demand. Those documents would then contain links (hyperlinks) to other documents.

There was no JavaScript. It was all HTML.

The Era After JavaScript

JavaScript was born in 1995, when Netscape Communications hired Brendan Eich to implement scripting capabilities in Netscape Navigator. The first version of the language was written over a ten-day period.

How much can you do with a language that was built in 10 days? Well, not much. It was only used to complement HTML, for example by providing form validation and lightweight insertion of dynamic HTML.

But it opened up a whole new direction for web development. The web was moving from the static era to the dynamic one: pages were no longer fixed, they could change themselves dynamically.

Over the next five years, as the browser wars heated up, Microsoft introduced the XMLHttpRequest object. At first, however, it was only supported by Microsoft Internet Explorer.

On April 5, 2006, the World Wide Web Consortium published a working draft specification of the XMLHttpRequest object. This gave real power to web developers: they could now send asynchronous requests to get data from the server and update parts of the DOM with the new data, without reloading the entire page.

This made the web more exciting and less monotonous.

This was the point in history when real-time chat applications became practical. They were implemented using a long polling mechanism.

The Birth Of Long Polling

Long polling was born out of necessity. Before long polling, people used short polling.

The entire web works on the request/response protocol, so there was no way for the server to push a message to the client. It was always the client that had to request the data.

Short Polling Technique

In the short polling technique, the client repeatedly sends requests to the server, asking for new data. If there is no new data, the server sends back an empty response; if there is new data, the server sends it back.

Figure: Short polling technique
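The client side of short polling boils down to a loop. The following is a minimal sketch in plain Java, not code from the project: `ShortPollingSketch`, `pollUntilData` and the `Supplier` that stands in for the HTTP call are hypothetical names chosen for illustration, and a real client would issue an HTTP request at that point.

```java
import java.util.List;
import java.util.function.Supplier;

public class ShortPollingSketch {

    /**
     * Asks the (simulated) server for data every intervalMillis until it
     * returns something or maxAttempts is reached.
     * Returns the number of requests that were made.
     */
    public static int pollUntilData(Supplier<List<String>> server, int maxAttempts, long intervalMillis)
            throws InterruptedException {
        int requests = 0;
        while (requests < maxAttempts) {
            requests++;                        // one full request/response cycle
            List<String> data = server.get();  // stands in for an HTTP GET
            if (!data.isEmpty()) {
                return requests;               // new data arrived, stop polling
            }
            Thread.sleep(intervalMillis);      // empty response: wait, then ask again
        }
        return requests;
    }
}
```

Every pass through the loop that ends in an empty response is a complete request/response cycle that delivered nothing.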

This seems to be a working model but it has several drawbacks.

The obvious drawback is how chatty the client and server become: the clients keep sending HTTP requests to the server whether or not anything has changed.

Processing HTTP requests is costly, because every request involves a lot of work:

  • A new connection needs to be established every time.
  • The HTTP headers must be parsed.
  • A query for new data must be performed.
  • A response (usually with no new data to offer) must be generated and delivered.
  • The connection must then be closed, and any resources must be cleaned up.

Now imagine the above steps taking place for every single request coming into the server from every single client. A lot of resources are wasted doing practically no work.
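To get a feel for the scale, here is a small back-of-the-envelope calculation. The client count and polling interval are made-up illustrative numbers, and `PollingLoad` and `requestsPerMinute` are hypothetical names, not part of the project.

```java
public class PollingLoad {

    /** Requests per minute produced by `clients` clients that each poll once every `intervalSeconds`. */
    public static long requestsPerMinute(long clients, long intervalSeconds) {
        return clients * 60 / intervalSeconds;
    }

    public static void main(String[] args) {
        // Illustrative numbers only: 1,000 clients, each polling every 2 seconds.
        System.out.println(requestsPerMinute(1_000, 2)); // prints 30000
    }
}
```

If new messages are rare, almost all of those requests come back empty.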

I tried hard to show the processing, chaos and data transfer involved in the image below (:p).

Figure: Short polling with multiple clients

So, how can we improve the above nasty scenario?

A simple solution was the long polling technique.

Long Polling

The solution is pretty simple: make the connection once and let it wait for as long as possible, so that if new data reaches the server in the meantime, the server can respond right away. This way we can reduce the number of request/response cycles involved.

Let’s understand the scenario with the help of an image.

Figure: Long polling technique

Here, every client sends a request to the server. The server checks for new data. If none is available, it does not send back a response immediately; instead, it waits for some time before responding. If data becomes available in the meantime, it sends back the response with the data.

By implementing the long polling technique, we can easily reduce the number of request/response cycles that were taking place with short polling.

In short, long polling is a technique that holds the connection between client and server open until data is available.

You must be wondering about the connection timeout issue.
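The "hold until data is available" idea can be sketched in plain Java without any web framework. Note that this sketch uses `wait`/`notifyAll` to wake waiting callers, which is a different mechanism from the sleep-and-redirect approach implemented later in this article. The class and method names (`LongPollingSketch`, `publish`, `poll`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class LongPollingSketch {
    private final List<String> messages = new ArrayList<>();

    /** Stores a message and wakes up every caller that is currently waiting in poll(). */
    public synchronized void publish(String message) {
        messages.add(message);
        notifyAll();
    }

    /**
     * Returns the messages after the first lastSeen ones. If there are none yet,
     * the caller is held for up to timeoutMillis; on timeout an empty list is returned.
     */
    public synchronized List<String> poll(int lastSeen, long timeoutMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (messages.size() <= lastSeen) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return List.of();              // timed out: the client should ask again
            }
            wait(remaining);                   // releases the lock while waiting
        }
        return new ArrayList<>(messages.subList(lastSeen, messages.size()));
    }
}
```

A caller of `poll` returns as soon as another thread calls `publish`, rather than at the next polling interval.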

There are multiple ways to deal with this situation:

  • Wait until the connection times out, and then send a new request.
  • Use a Keep-Alive header: with long polling, the client may be configured to allow a longer timeout period (via a Keep-Alive header) when listening for a response. This is usually avoided, because the timeout period is generally used to indicate problems communicating with the server.
  • Maintain the state of the client’s request and keep redirecting the client to a /poll endpoint after a certain amount of time.
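The first option in the list above, re-sending the request after a timeout, could look roughly like this on the client side. It is a sketch under assumptions: the request is abstracted as a `Callable`, and `RepollOnTimeout` and `pollUntilResponse` are hypothetical names. `HttpTimeoutException` is the exception the JDK's `java.net.http.HttpClient` throws when a request timeout elapses.

```java
import java.net.http.HttpTimeoutException;
import java.util.concurrent.Callable;

public class RepollOnTimeout {

    /**
     * Runs one long-poll request at a time. A timeout is treated as
     * "no data yet" and the request is issued again, up to maxRounds times.
     */
    public static String pollUntilResponse(Callable<String> request, int maxRounds) throws Exception {
        for (int round = 1; round <= maxRounds; round++) {
            try {
                return request.call();         // e.g. an HttpClient.send(...) call with a request timeout
            } catch (HttpTimeoutException timeout) {
                // Expected with long polling: reconnect instead of failing.
            }
        }
        throw new IllegalStateException("no response after " + maxRounds + " rounds");
    }
}
```

The cap on rounds is a design choice for the sketch; a chat client would typically keep polling for as long as it is open.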

The long polling technique is a lot better than short polling, but it still consumes resources. In the next article, I will write about WebSockets and show why they are a much better solution for real-time applications.

Now, let’s jump to the implementation of long polling in Java using the Spring Boot framework.

Implementation Of Long Polling Technique

The idea of writing this article came from a technical problem that I faced while working on one of my client’s projects.

The problem was that a task executed within a request was taking too long to complete. Because of this long wait, the client was running into connection timeouts.

A WebSocket solution was not possible because the client was not under our control. Something had to be done on our end to make it work.

I thought of using the Keep-Alive header but, as I said, it is not the right thing to do: a longer timeout is usually avoided, because the timeout period is generally used to indicate problems communicating with the server. A request that hangs for that long alerts the monitoring systems that something is wrong.

The solution that seemed feasible was a continuous redirection to the polling endpoint with the unique id of the task.

That was a bit complex, so here I’m going to give you a glimpse of long polling with a simple real-time chat API.

Project Directory Structure

Figure: Project’s directory structure

The only files you need to focus on here are LongPollingController.java and LongPollingControllerTest.java.


Let’s look into the LongPollingController.java code:

First, take a look at the /sendMessage endpoint:

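    private static final List<CustomMessage> messageStore = new ArrayList<>();

    @PostMapping("/sendMessage")
    public ResponseEntity<List<CustomMessage>> saveMessage(@RequestBody CustomMessage message) {
        message.setId(messageStore.size() + 1);  // ids start at 1 and grow by one per message
        messageStore.add(message);               // keep the message in memory
        return ResponseEntity.ok(messageStore);
    }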

It is a very generic endpoint: it takes the input from a POST request and stores the message in the message store. For simplicity, I’m storing the messages in memory using an ArrayList instance.

The message id is the unique key with which we identify what needs to be sent back in the response.

Suppose you send a new POST request: the message id will be one more than the id of the last message.

The next part of this implementation is the /getMessages endpoint:

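    @GetMapping("/getMessages")
    public ResponseEntity<List<CustomMessage>> getMessage(GetMessage input) throws InterruptedException {
        if (lastStoredMessage().isPresent() && lastStoredMessage().get().getId() > input.getId()) {
            List<CustomMessage> output = new ArrayList<>();
            // ids equal position + 1, so index input.getId() is the first message the client has not seen
            for (int index = input.getId(); index < messageStore.size(); index++) {
                output.add(messageStore.get(index));
            }
            return ResponseEntity.ok(output);
        }
        return keepPolling(input);  // nothing new yet: hold the connection
    }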

The method expects two parameters:

  • id – This is the message id after which the client wants the messages.
  • to – This field represents the person for whom the message was intended. The client wants all the messages that were sent to that person.

The first part of the logic checks whether the last message id stored in the messageStore is greater than the requested id. If it is, a new message has arrived since the client last checked, so the method builds the output from the messages received after that index and sends it back in the response.
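The selection step can be tried out on its own. The sketch below is plain Java with a hypothetical `Message` class standing in for `CustomMessage`. It also filters on the recipient, which is an assumption based on the description of the `to` parameter, not something confirmed by the controller code.

```java
import java.util.ArrayList;
import java.util.List;

public class MessageFilterSketch {

    /** Hypothetical stand-in for the article's CustomMessage class. */
    public static final class Message {
        public final int id;
        public final String to;
        public final String text;

        public Message(int id, String to, String text) {
            this.id = id;
            this.to = to;
            this.text = text;
        }
    }

    /** Returns the messages stored after lastSeenId that are addressed to recipient. */
    public static List<Message> newMessagesFor(List<Message> store, int lastSeenId, String recipient) {
        List<Message> output = new ArrayList<>();
        // Ids start at 1 and equal position + 1, so index lastSeenId is the first unseen message.
        for (int index = lastSeenId; index < store.size(); index++) {
            Message message = store.get(index);
            if (message.to.equals(recipient)) {   // assumption: only the recipient's messages are returned
                output.add(message);
            }
        }
        return output;
    }
}
```

The indexing trick only works because ids are assigned as `size() + 1` and messages are never removed from the store.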

KeepPolling Method

Another part of the logic is the one that stalls the connection.

    private ResponseEntity<List<CustomMessage>> keepPolling(GetMessage input) throws InterruptedException {
        Thread.sleep(5000);  // hold the connection for 5 seconds
        HttpHeaders headers = new HttpHeaders();
        // send the client back to the same endpoint with the same parameters
        headers.setLocation(URI.create("/getMessages?id=" + input.getId() + "&to=" + input.getTo()));
        return new ResponseEntity<>(headers, HttpStatus.TEMPORARY_REDIRECT);
    }

If there is no new message in the store, the method simply waits for 5 seconds (holding the connection) and then redirects the client back to the /getMessages endpoint. If a message is available by then, the endpoint sends back the response with the data; otherwise, the process repeats.

Sure, you can make it more efficient by checking once more before sending back the temporary redirect response, but you get the idea, right?
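That extra check could look something like the sketch below. It is framework-free so it can run on its own: `PollResult` is a hypothetical stand-in for `ResponseEntity`, and the `Supplier` stands in for a lookup against the message store.

```java
import java.util.List;
import java.util.function.Supplier;

public class RecheckBeforeRedirect {

    /** Carries either messages (status 200) or a redirect target (status 307). */
    public static final class PollResult {
        public final int status;
        public final List<String> messages;
        public final String location;

        PollResult(int status, List<String> messages, String location) {
            this.status = status;
            this.messages = messages;
            this.location = location;
        }
    }

    public static PollResult keepPolling(Supplier<List<String>> newMessages, long waitMillis, String pollUrl)
            throws InterruptedException {
        Thread.sleep(waitMillis);                        // hold the connection
        List<String> found = newMessages.get();          // look again before giving up
        if (!found.isEmpty()) {
            return new PollResult(200, found, null);     // answer directly, saving one round trip
        }
        return new PollResult(307, List.of(), pollUrl);  // still nothing: send the client back
    }
}
```

The saving is one redirect and one extra request for every message that arrives while the connection is being held.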

And the main reason I’m sending back a redirect header response is that if I stall the connection for too long then it will time out. And I don’t want that. I want the client to keep coming back until I have served it with the data.

This implementation was exactly suited for the kind of situation I was in during my project, but you can also take the learnings from this approach and bend it and make it work for yourself.

That said, if you want to build a chat application, go with WebSockets. They are the most efficient and modern way of creating real-time applications.


I also want you to take a look at the test class to understand the implementation of the API.

Here I have created one asynchronous thread that acts as the sender. It waits for 10 seconds and then sends a message addressed to a user named Bar.

The receiver sends a GET request with the id of the latest message it has, i.e. 0 the first time. You will see that the receiver is kept waiting for about 10 seconds, and as soon as the sender sends the message, the receiver gets the response.


This was a very simple and straightforward implementation of the Long Polling technique.

There is a lot of complex stuff that still needs to be taken care of, such as maintaining the state of each client’s request, tracking the data that needs to be sent to each client, and handling many concurrent connections at once.

This technique is fairly intuitive to understand and can prove useful in many applications.

That said, I would not want you to implement Long Polling in any real-time application that you are building now. There are WebSockets available and they are very feasible and efficient when it comes to bi-directional communication. In short, use WebSockets to build real-time applications.


  • Article By: Varun Shrivastava

  • Varun Shrivastava is an innovative Full Stack Developer at ThoughtWorks with around 4 years of experience in building enterprise software systems in finance and retail domain. Experienced in design, development, and deployment of scalable software. He is a passionate blogger and loves to write about philosophy, programming, tech and relationships. This is his space, you can get in touch with him here anytime you want.