If you have worked with Amazon Web Services, you have probably heard of Amazon SQS, a reliable, highly scalable, distributed, fully managed message queuing service in the cloud.
This blog post explains what is involved in consuming messages from Amazon SQS, along with its nuances and best practices. As you may know, Amazon SQS has supported standard queues for more than a decade now, and has recently started supporting FIFO queues as well. Although the basics of consumption remain the same, some key factors differ when using FIFO queues. To keep the discussion simple and consistent, this post is restricted to standard queues, and FIFO queues will be covered in another blog post. Let us jump right into the topic for today’s blog post: consuming SQS messages.
Basic mechanism – Push vs Poll
Because Amazon SQS is an HTTP-based service, it does not support push-based message consumption the way some other message brokers do, especially those complying with the JMS specification. Thus, a consumer has to poll Amazon SQS to consume messages.
There are two types of polling mechanisms supported by Amazon SQS for consuming messages: Short Polling and Long Polling. Both have their own use cases, peculiarities, benefits and limitations, although, as we will see through this blog post, Long Polling is the one you will end up using in most cases. Let us understand each of these mechanisms in detail and then compare them against each other, talking about the use cases they cater to and the benefits they offer.
Short Polling is the default polling mechanism, used by Amazon SQS queue consumers that do not explicitly turn on the Long Polling option while consuming messages.
The basic idea in this type of polling is to respond to a read request as soon as possible, almost instantaneously, typically within tens (or low hundreds) of milliseconds. Because the Amazon SQS infrastructure is highly distributed, it may not be able to check all the servers and still maintain that quick sub-second response time. Instead, it typically checks a few servers based on a weighted random distribution and returns messages only from those servers. According to the Amazon documentation, if a consumer keeps issuing read requests continuously, subsequent requests eventually return messages from all the servers; more on this in the experiments described below.
A direct side effect of this is that a read request may not return any messages even if there are messages in the queue. Yes, it may sound somewhat surprising, but that is true and one can practically experience this by using Short Polling for a queue having a handful of messages.
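To make the sampling effect concrete, here is a toy simulation in plain Java. It is only a sketch under my own simplifying assumptions (20 servers, 2 servers checked per poll, uniform rather than weighted sampling, and messages that are never consumed); it is not how SQS is implemented, and the class and method names are made up for illustration.

```java
import java.util.Random;

/**
 * Toy model of Short Polling. NOT the real SQS implementation: it only
 * illustrates why checking a subset of servers can miss existing messages.
 */
final class ShortPollSimulation {

    /**
     * Counts how many of `polls` reads come back empty when `messages`
     * messages sit on `servers` servers and each read checks only `sampled`
     * of them. Uniform placement and sampling are simplifying assumptions.
     */
    static int countEmptyPolls(int servers, int sampled, int messages, int polls, long seed) {
        if (servers <= 0 || sampled <= 0 || sampled > servers) {
            throw new IllegalArgumentException("need 0 < sampled <= servers");
        }
        Random random = new Random(seed);
        // Place each message on a randomly chosen server.
        boolean[] hasMessage = new boolean[servers];
        for (int i = 0; i < messages; i++) {
            hasMessage[random.nextInt(servers)] = true;
        }
        int empty = 0;
        for (int p = 0; p < polls; p++) {
            if (!pollFindsMessage(hasMessage, sampled, random)) {
                empty++;
            }
        }
        return empty;
    }

    /** Checks `sampled` distinct servers and reports whether any of them holds a message. */
    private static boolean pollFindsMessage(boolean[] hasMessage, int sampled, Random random) {
        int n = hasMessage.length;
        int[] order = new int[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        // Partial Fisher-Yates shuffle: the first `sampled` slots form a random subset.
        for (int i = 0; i < sampled; i++) {
            int j = i + random.nextInt(n - i);
            int tmp = order[i];
            order[i] = order[j];
            order[j] = tmp;
            if (hasMessage[order[i]]) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] queueDepths = {1, 10, 100, 1000};
        for (int depth : queueDepths) {
            int empty = countEmptyPolls(20, 2, depth, 200, 42L);
            System.out.println(depth + " message(s): " + empty + " of 200 polls were empty");
        }
    }
}
```

In this model, a queue holding a single message is missed by most polls, while a deep queue has messages on nearly every server, so almost every poll finds something.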
Following are some observations from a few experiments I carried out using Short Polling based consumers for Amazon SQS.
As part of these experiments, I created a standard queue in Amazon SQS and tried consuming messages from it with varying numbers of messages in the queue. I used the AWS SDK for Java to create the consumer application. Sample consumer code is shown below.
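import com.amazonaws.services.sqs.AmazonSQS;
import com.amazonaws.services.sqs.AmazonSQSClient;
import com.amazonaws.services.sqs.model.*;
import java.util.List;

final AmazonSQS sqs = AmazonSQSClient.builder().build();
String queueName = "some-queue";
final String queueUrl = sqs.getQueueUrl(new GetQueueUrlRequest().withQueueName(queueName)).getQueueUrl();
// Request up to 10 messages per read (the default is 1).
ReceiveMessageRequest receiveMessageRequest = new ReceiveMessageRequest(queueUrl).withMaxNumberOfMessages(10);
ReceiveMessageResult receiveMessage = sqs.receiveMessage(receiveMessageRequest);
List<Message> messages = receiveMessage.getMessages();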
The read request lets us specify how many messages we would like Amazon SQS to return in the response. The valid values for this parameter are 1 through 10, so we can request at most 10 messages per response. The default value is 1. In these experiments, to be consistent across the tests performed, the number of messages to be returned was always set to 10. Also, although one can run multiple consumers (using threads, multiple machines etc.) against the same queue, these experiments ran a single consumer, so that we do not run into scenarios where messages are unavailable to one consumer because another is currently consuming them. Every experiment ran the same operation 200 times to average out the findings.
For each of these runs, the following metrics and observations are noted.
- Response times – With Short Polling, the average response time observed for a read request was around 50-60 milliseconds.
- Number of messages returned – The results here somewhat surprised me, because Amazon SQS does not guarantee that it will return the number of messages requested in the read request. A careful read of the API documentation confirms that it may return up to the requested number of messages (the maximum being 10). In fact, in these experiments, the number of messages returned in the response approached the requested number as the number of messages in the queue increased. So, the more messages in the queue, the higher the chance of getting the requested (or nearly the requested) number of messages per read request. This can be seen in the graphs from the experiments below. With around 1000 messages in the queue, all 200 requests returned 10 messages every time, whereas with fewer messages in the queue, the number of returned messages was much lower than the requested 10.
- Number of requests – Since Amazon SQS pricing is based on the number of requests made to SQS (of any type: send, read, delete etc.), this is another important factor to consider while using SQS. However, since the number of messages returned is somewhat non-deterministic (as noted in the previous observation), the number of read requests required to consume a set of messages is also somewhat fluid. Likewise, since a read request can return an empty response whether or not there are messages in the queue, Short Polling can lead to a higher number of requests to Amazon SQS, and therefore to higher cost.
- False empty responses – As we have already noted, Amazon SQS read requests may return empty responses even if there are messages in the queue. For example, as noted in the graphs below, when the queue contained just 1 message, Short Polling read requests came back with an empty response 191 out of 200 times.
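To get a feel for what false empty responses mean for request counts, the figure quoted above (191 empty responses out of 200 reads with a single message in the queue) can be turned into a rough estimate. The sketch below assumes that each read succeeds independently with the same observed probability, which is my own simplification and not something SQS documents; the class and method names are illustrative.

```java
/** Back-of-the-envelope estimate; assumes independent reads with a fixed hit rate. */
final class EmptyResponseEstimate {

    /** Fraction of reads that returned at least one message. */
    static double hitRate(int emptyResponses, int totalReads) {
        if (totalReads <= 0 || emptyResponses < 0 || emptyResponses > totalReads) {
            throw new IllegalArgumentException("need 0 <= emptyResponses <= totalReads and totalReads > 0");
        }
        return (totalReads - emptyResponses) / (double) totalReads;
    }

    /** Mean of a geometric distribution: expected reads until the first non-empty response. */
    static double expectedReadsUntilHit(int emptyResponses, int totalReads) {
        double p = hitRate(emptyResponses, totalReads);
        if (p == 0.0) {
            throw new IllegalArgumentException("no non-empty responses were observed");
        }
        return 1.0 / p;
    }

    public static void main(String[] args) {
        System.out.printf("hit rate: %.3f%n", hitRate(191, 200));
        System.out.printf("expected reads until a message is returned: %.1f%n",
                expectedReadsUntilHit(191, 200));
    }
}
```

Under that assumption, only 9 in 200 reads (4.5%) find the message, so a consumer would need roughly 22 billable read requests on average to pick up a single message.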
The following graphs represent the above findings in terms of the actual observations noted during the experiments.
Alright, so we now have some understanding of how Short Polling works and what its characteristics are when consuming messages from Amazon SQS. Let us now try to understand how its close friend, Long Polling, works and how it fares in similar circumstances.
The basic idea of Long Polling, as the name suggests, is to hold the poll open for longer while trying to read messages. As part of the read request, Amazon lets us specify a timeout period (in seconds) for which the read request blocks on the server if there are no messages available to consume. If there are messages in the queue to consume, the read request does not block. Essentially, the focus here is to return at least one message as long as messages are available for consumption. If the queue is empty, the read request waits for the specified time (1 to 20 seconds, as set in the request) before returning an empty response.
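The blocking behaviour described above can be mimicked locally with a java.util.concurrent.BlockingQueue. This is only an in-memory analogy under my own assumptions, not SQS itself or the AWS SDK: the LongPollModel class and its receive method are hypothetical names, and the wait is expressed in milliseconds to keep the demo quick.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** In-memory analogy of a long-polling read; it does not talk to SQS. */
final class LongPollModel {

    /**
     * Returns up to maxMessages items. If something is already queued, the call
     * returns right away; otherwise it waits up to waitMillis for the first
     * item and returns an empty list if none arrives in time.
     */
    static <T> List<T> receive(BlockingQueue<T> queue, int maxMessages, long waitMillis) {
        List<T> batch = new ArrayList<>();
        if (maxMessages < 1) {
            return batch;
        }
        try {
            // Blocks only while the queue is empty, and for at most waitMillis.
            T first = queue.poll(waitMillis, TimeUnit.MILLISECONDS);
            if (first != null) {
                batch.add(first);
                // Take whatever else is already available, up to the requested maximum.
                queue.drainTo(batch, maxMessages - 1);
            }
        } catch (InterruptedException e) {
            // Preserve the interrupt for the caller and return what we have (nothing).
            Thread.currentThread().interrupt();
        }
        return batch;
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("m1");
        queue.add("m2");
        System.out.println("non-empty queue: " + receive(queue, 10, 2000));

        long start = System.nanoTime();
        List<String> none = receive(queue, 10, 500);
        long waitedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("empty queue: " + none + " after about " + waitedMs + " ms");
    }
}
```

The first call returns without waiting because messages are present; the second waits out the full timeout before coming back empty, which is the same trade-off a Long Polling read makes against an empty queue.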
It's experimentation time again!
Let us try to see results from a similar set of experiments I carried out while using the Long Polling mechanism to consume SQS messages.
Again, the conditions remain the same as described for the Short Polling experiments: the number of messages to return is always requested as 10, and there is just a single consumer trying to consume messages from the queue using Long Polling. I set the Long Polling wait timeout to 20 seconds, which means that the request blocks on the server for a maximum of 20 seconds if there are no messages to return.
Following are the observations and the metrics noted from this experiment.
- Response times – With Long Polling, the average response time for a read request varied significantly based on the number of messages in the queue. When the queue had one or more messages available for consumption, the response times varied from 7 seconds (yes, you read it right, 7 seconds) down to about 40 milliseconds. Generally, the response time decreases (up to a certain point) as the number of messages in the queue increases. The primary reason for the increased response time is that Amazon SQS tries to ensure that the response contains at least one message as long as the queue is non-empty, so it may have to check all the distributed nodes for messages before returning the response.
- Number of messages returned – The results here are similar to what was described in the corresponding section for Short Polling. Amazon SQS does not guarantee that it will return the number of messages requested in the read request. One can request up to 10 messages, and the response may contain anywhere from 1 to 10 messages as long as there are messages in the queue. Similarly, the more messages you have in the queue, the higher the chance of getting the requested (or nearly the requested) number of messages per read request.
- Number of requests – Once again, there is no guarantee that a fixed number of requests will always be sufficient to consume a fixed number of messages in the queue. However, it is still better than Short Polling, because you are guaranteed to get at least one message per request if there are messages in the queue. Since Amazon SQS pricing is based on the number of requests made to SQS (of any type: send, read, delete etc.), Long Polling can definitely result in reduced costs as compared to Short Polling.
- False empty responses – Long Polling can never result in false empty responses, since you are guaranteed to get at least one message in the response as long as the queue is non-empty. Please note that Long Polling may still return empty responses if the queue really is empty.
The following graphs represent the above findings in terms of the actual observations noted during the experiments.
Considerations & Recommendations
As can be inferred from the experiments and observations described earlier in this blog post, a lot of factors come into play when deciding between Long Polling and Short Polling, although Long Polling definitely seems to have an edge over Short Polling in most cases. It not only eliminates false empty responses but also reduces the number of empty responses overall, leading to significant cost benefits. In fact, it almost makes me wonder why Amazon supports Short Polling at all or, if it had to be supported for the limited use cases it caters to, why Short Polling is the default option when consuming messages from SQS. That brings up a good question: which use cases does Short Polling cater to? Let us try to unravel that a little.
On the face of it (and going by the name), it may seem that Short Polling should be used where messages need to be processed as soon as they are received. A closer look reveals that it is the other way around: if you want your messages processed as close as possible to the time they are sent, Long Polling is the option to go with. Think about it: a Short Polling request may return immediately, potentially without any messages even if there are messages to consume, so there is no guarantee as to when a given message will be returned. A Long Polling request, on the other hand, may take a little longer to complete but is guaranteed to return a message as long as there are messages in the queue. In fact, we saw in the experiments above that when the queue is sufficiently full, Long Polling and Short Polling requests take almost the same amount of time to return. Thus, Long Polling is recommended for use cases where messages need to be processed as soon as possible, while Short Polling can be used where the timing of message processing is not as critical.
Likewise, since Short Polling requests always return within tens of milliseconds, irrespective of the number of messages in the queue, Short Polling is more suitable when the consuming application is sensitive to the response time of the queue read request itself, maybe because the queue is being read in the main application thread as opposed to a background thread.
That brings up a good point: from an application design perspective, Long Polling requests should always be made from background threads rather than the application's main thread, since a Long Polling request can block for up to 20 seconds on the server.
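One way to keep a blocking Long Polling read off the main thread is to run the polling loop on a dedicated worker. The following is a minimal sketch, not a definitive implementation: BackgroundPoller and its MessageSource interface are hypothetical names introduced here for illustration and are not part of the AWS SDK, and message deletion and error handling are deliberately left out.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

/** Runs a long-polling loop on its own thread so that callers never block on it. */
final class BackgroundPoller implements AutoCloseable {

    /** Hypothetical abstraction over a long-polling read; not an AWS SDK type. */
    interface MessageSource {
        List<String> receive(int maxMessages, int waitSeconds);
    }

    // A single daemon worker, so a blocked read cannot keep the JVM alive.
    private final ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
        Thread thread = new Thread(runnable, "sqs-long-poller");
        thread.setDaemon(true);
        return thread;
    });
    private volatile boolean running = true;

    /** Starts polling; each received message body is passed to the handler on the worker thread. */
    void start(MessageSource source, Consumer<String> handler) {
        executor.execute(() -> {
            while (running && !Thread.currentThread().isInterrupted()) {
                // May block for up to 20 seconds, but only this worker thread waits.
                for (String body : source.receive(10, 20)) {
                    handler.accept(body);
                }
            }
        });
    }

    /** Stops the loop and interrupts a read that is currently blocked. */
    @Override
    public void close() {
        running = false;
        executor.shutdownNow();
    }
}
```

With the AWS SDK for Java, a MessageSource could wrap sqs.receiveMessage with a ReceiveMessageRequest built using withMaxNumberOfMessages and withWaitTimeSeconds; the rest of the application then only sees the handler callbacks.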
Alright, so we started this blog post by introducing message consumption in SQS and the two polling mechanisms it offers. We looked at how each mechanism behaves under different conditions, evaluated the characteristics each exhibits, went on to understand the use cases where each can be leveraged, and touched upon a few application design and implementation considerations along the way. This should hopefully provide a good view into what is involved in designing and implementing efficient, cost-effective message-consuming systems using Amazon SQS.
I hope you enjoyed reading. As always, I appreciate any feedback or comments you may have.
Happy learning, happy sharing!