The Bergens Tidende team is a cross-functional team located in Kraków. They deliver web solutions for the Bergens Tidende newspaper.
In this article I want to share some basic information about three AWS messaging services (SQS, SNS and Kinesis) that might help you decide which one is best for you. I will skip some advanced topics to keep the article clear and simple for beginners.
SQS
Amazon Simple Queue Service (SQS) is a distributed queue system that enables web services to quickly and reliably queue messages that one component in the application generates to be consumed by another component. A queue is a temporary repository for messages that await processing.
How it works – developer perspective
Your first component, usually called the producer, creates a message and puts it into a queue. You can have multiple producers and add multiple messages to the queue at the same time – you don’t have to worry about traffic or peaks, SQS handles that for you.
The queued messages now wait to be processed by a second component, called the consumer.
The consumer periodically requests new messages from the queue. You can have multiple consumers, but you have to remember that each message can be processed only once. It means that you can have multiple instances of the same consumer, but you can’t read the same message from one queue in two different components. Each of these components should use a separate SQS queue. There will be an example later showing a quick and simple way to do that.
After a consumer processes the message, it has to delete it from the queue. Deleting is important because SQS assumes that processing can fail. To guard against that, after the consumer receives a message, the message is hidden from the queue for a defined period of time (the visibility timeout); if it is not deleted within that time, it shows up in the queue again.
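The receive, hide and delete cycle described above can be modelled in a few lines. This is a minimal in-memory sketch, not the AWS SDK: the `InMemoryQueue` class, its method names and the explicit `now` argument (used instead of a real clock) are assumptions made for illustration.

```javascript
// In-memory model of an SQS-style queue (illustration only, not the AWS SDK).
class InMemoryQueue {
  constructor(visibilityTimeoutMs) {
    this.visibilityTimeoutMs = visibilityTimeoutMs;
    this.messages = []; // each entry: { id, body, visibleAt }
    this.nextId = 1;
  }

  // Producer side: append a message that is visible immediately.
  send(body, now) {
    const id = String(this.nextId++);
    this.messages.push({ id, body, visibleAt: now });
    return id;
  }

  // Consumer side: return the first visible message and hide it
  // for the duration of the visibility timeout.
  receive(now) {
    const message = this.messages.find((m) => m.visibleAt <= now);
    if (!message) return null;
    message.visibleAt = now + this.visibilityTimeoutMs;
    return { id: message.id, body: message.body };
  }

  // Consumer side: confirm successful processing by removing the message.
  deleteMessage(id) {
    this.messages = this.messages.filter((m) => m.id !== id);
  }
}

const queue = new InMemoryQueue(30000);
queue.send('resize image 42', 0);

const firstAttempt = queue.receive(1000);   // handed out and hidden
const whileHidden = queue.receive(2000);    // null: still within the timeout
const secondAttempt = queue.receive(31000); // never deleted, so it reappears
queue.deleteMessage(secondAttempt.id);      // processed successfully this time
const afterDelete = queue.receive(70000);   // null: the message is gone
```

The first consumer never deletes the message, which models a crash during processing; the message becomes visible again once the timeout has passed.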
Limitations
Because SQS is a distributed queue and a message may not get deleted by a consumer, a message can be received more than once: delivery is “at least once”. If your component shouldn’t process the same message more than once, the best way to prevent duplication is to include a unique ID in each message when it is produced, so that the consumer can recognize messages it has already processed.
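One way to make a consumer tolerate repeated deliveries is to remember the IDs it has already handled. The sketch below assumes every message carries a unique `id` field and that an in-memory `Set` is good enough; `createIdempotentHandler` is a hypothetical helper, and a real system would probably need a persistent store shared by all consumer instances.

```javascript
// Wraps a handler so that each message ID is processed at most once.
// Assumption: every message carries a unique `id` field.
function createIdempotentHandler(handler) {
  const seenIds = new Set(); // in-memory only; lost when the process restarts

  return function handle(message) {
    if (seenIds.has(message.id)) {
      return false; // duplicate delivery, skip it
    }
    handler(message);
    seenIds.add(message.id); // mark as done only after the handler succeeded
    return true;
  };
}

const processed = [];
const handle = createIdempotentHandler((message) => processed.push(message.body));

const results = [
  handle({ id: 'a1', body: 'index article 7' }),
  handle({ id: 'a1', body: 'index article 7' }), // redelivered by the queue
  handle({ id: 'b2', body: 'index article 8' }),
];
```

The second call returns `false` and the wrapped handler is not run again for the same ID.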
The message size is limited to 256 KB (the body and all attributes together). If you want to process larger messages, you should send a pointer to data stored elsewhere, e.g. on S3.
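The "pointer to data stored elsewhere" idea could look roughly like this. It is only a sketch: a `Map` stands in for S3, the `buildMessage` and `readMessage` helpers and the `payloads/...` keys are made up for the example, and only the body size is checked (attributes are ignored).

```javascript
const MAX_MESSAGE_BYTES = 256 * 1024; // the limit quoted above

// `store` is a stand-in for external storage such as S3 (here just a Map).
function buildMessage(payload, store, key) {
  const body = JSON.stringify(payload);
  if (Buffer.byteLength(body, 'utf8') <= MAX_MESSAGE_BYTES) {
    return { type: 'inline', body };
  }
  store.set(key, body); // keep the large payload outside the queue
  return { type: 'pointer', body: JSON.stringify({ key }) };
}

// Consumer side: resolve the pointer back into the original payload.
function readMessage(message, store) {
  if (message.type === 'inline') return JSON.parse(message.body);
  const { key } = JSON.parse(message.body);
  return JSON.parse(store.get(key));
}

const store = new Map();
const small = buildMessage({ articleId: 7 }, store, 'payloads/small');
const large = buildMessage({ text: 'x'.repeat(300 * 1024) }, store, 'payloads/large');
const restored = readMessage(large, store);
```

Only the large payload ends up in the store; the queue message then carries just the key.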
Messages are not retained indefinitely in the queue. By default messages are deleted after 4 days, but you can extend this time to 14 days.
SQS, being a distributed queue, can return messages to consumers in any order, with no ordering guarantee. You can add your own sequence identifier as part of the message, but the approximate order is probably good enough in most cases.
“Expect failure but don’t accept it”. Some messages can generate errors every time they are processed – in SQS you can define a special “dead letter queue” for them. It receives messages that have been processed with errors more than n times (where 1⩽n<1000). This is especially useful if you don’t want to lose messages.
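The dead letter idea can be illustrated with two plain arrays. This is a simplified model, not how SQS is implemented: `drain` is a hypothetical function, and the `maxReceiveCount` parameter plays the role of the n mentioned above.

```javascript
// Moves a message to a dead letter queue once it has been received more than
// `maxReceiveCount` times without success (illustration only).
function drain(queue, deadLetterQueue, handler, maxReceiveCount) {
  while (queue.length > 0) {
    const message = queue.shift();
    message.receiveCount = (message.receiveCount || 0) + 1;
    if (message.receiveCount > maxReceiveCount) {
      deadLetterQueue.push(message); // park it instead of losing it
      continue;
    }
    try {
      handler(message);
    } catch (error) {
      queue.push(message); // failed: make it available again
    }
  }
}

const queue = [{ body: 'ok' }, { body: 'broken' }];
const deadLetterQueue = [];
const handled = [];
let brokenAttempts = 0;

drain(queue, deadLetterQueue, (message) => {
  if (message.body === 'broken') {
    brokenAttempts += 1;
    throw new Error('cannot process');
  }
  handled.push(message.body);
}, 3);
```

The failing message is tried three times and then parked, so it no longer blocks the main queue but can still be inspected later.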
Example use cases
- file processing – image scaling, video recompression
- form processing – when it can’t be done during a single request
- communication with external services (asynchronous)
- sending emails
- search engine indexing.
Code
Here is example code in NodeJS which shows how to work with SQS: github.com/wojtekk/aws-messaging-services/tree/master/sqs
The project shows you how to:
- create an SQS queue
- run example app as a worker
- send messages
- delete the SQS queue.
SNS
Amazon Simple Notification Service (SNS) is a push messaging service. It allows you to send messages to multiple services. The most significant targets are HTTP(S), SQS and Lambda, but you can also use SMS, email or mobile push notifications. The latter group can be useful if you use SNS to notify about errors or other important situations, as Amazon does in CloudWatch Alarms.
How it works – developer perspective
Before you start working with SNS you have to create an SNS topic and add subscribers to it. To each subscriber, SNS sends a special message with a subscription confirmation URL. Only once the subscription is confirmed will your component start receiving messages.
After you do that, your first component can begin sending messages to the topic. SNS immediately pushes messages to the subscribers and waits until processing is finished, e.g. HTTP(S) subscribers should return HTTP code 200 if everything goes well.
For me, the most interesting subscriber type is HTTP(S). Thanks to that, your app can be as simple as possible – you don’t have to create and maintain workers if you don’t really need them, that is if you can process messages immediately and quickly. If you have spikes causing delays in processing, the SNS topic will buffer requests and send only a limited number of parallel messages to your component (defined in configuration).
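For an HTTP(S) subscriber, the two kinds of requests described above (the subscription confirmation and regular messages) might be handled roughly as follows. This is a sketch under assumptions: `handleSnsRequest` and the returned `action` values are invented for the example, the `Type`, `SubscribeURL` and `Message` field names follow the SNS JSON format as I understand it and should be checked against the SNS documentation, and signature verification is left out entirely.

```javascript
// Decides how an HTTP(S) endpoint should react to a request body sent by SNS.
// Returns the HTTP status code to reply with and an action for the caller.
function handleSnsRequest(rawBody, onMessage) {
  let payload;
  try {
    payload = JSON.parse(rawBody);
  } catch (error) {
    return { status: 400, action: 'reject' };
  }

  if (payload.Type === 'SubscriptionConfirmation') {
    // The caller is expected to visit this URL to confirm the subscription.
    return { status: 200, action: 'confirm', url: payload.SubscribeURL };
  }
  if (payload.Type === 'Notification') {
    try {
      onMessage(payload.Message);
      return { status: 200, action: 'processed' };
    } catch (error) {
      return { status: 500, action: 'retry' }; // a 5xx reply signals failure
    }
  }
  return { status: 400, action: 'reject' };
}

const received = [];
const confirmation = handleSnsRequest(
  JSON.stringify({ Type: 'SubscriptionConfirmation', SubscribeURL: 'https://example.invalid/confirm' }),
  (message) => received.push(message)
);
const notification = handleSnsRequest(
  JSON.stringify({ Type: 'Notification', Message: 'article 7 changed' }),
  (message) => received.push(message)
);
```

Keeping the decision logic in a plain function like this makes it easy to plug into whatever HTTP framework the app already uses.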
SNS is also very useful for one more use case – the design pattern called “fanout”. In this pattern, messages published to an SNS topic are distributed in parallel to multiple SQS queues. Thanks to that you can build applications that are able to do multiple asynchronous processes in parallel. For example, you could publish a message to a topic when an article has changed: independent processes could clear cache, feed search engine index, rebuild RSS feed and more – reading from separate SQS queues.
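The fanout pattern can be pictured with a tiny in-memory model. The `Topic` class and the queue names are hypothetical; plain arrays stand in for the SQS queues, and no AWS calls are made.

```javascript
// In-memory model of fanout: one topic, several subscribed queues.
class Topic {
  constructor() {
    this.queues = [];
  }

  subscribe(queue) {
    this.queues.push(queue);
  }

  // Every subscribed queue receives its own copy of the message.
  publish(message) {
    for (const queue of this.queues) {
      queue.push({ ...message });
    }
  }
}

const cacheQueue = [];
const searchQueue = [];
const rssQueue = [];

const articleChanged = new Topic();
articleChanged.subscribe(cacheQueue);
articleChanged.subscribe(searchQueue);
articleChanged.subscribe(rssQueue);

articleChanged.publish({ articleId: 7, event: 'changed' });
```

Each worker then reads from its own queue at its own pace, so a slow search indexer does not hold up cache clearing.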
Limitations
If a subscriber fails, SNS will send the message again a defined number of times, with a delay between requests. After that, the message will be dropped.
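The retry-then-drop behaviour can be sketched as below. The function name, the fixed delay and the retry count are assumptions for illustration, not SNS defaults; the delays are only recorded, not actually waited for.

```javascript
// Tries to deliver a message, retrying a fixed number of times.
// `deliver` returns true on success; delays are recorded, not waited for.
function deliverWithRetries(deliver, maxRetries, delayMs) {
  const waits = [];
  for (let attempt = 1; attempt <= maxRetries + 1; attempt++) {
    if (deliver(attempt)) {
      return { delivered: true, attempts: attempt, waits };
    }
    if (attempt <= maxRetries) waits.push(delayMs);
  }
  // All retries used up: the message is dropped.
  return { delivered: false, attempts: maxRetries + 1, waits };
}

// A subscriber that recovers on the third attempt.
const flaky = deliverWithRetries((attempt) => attempt === 3, 3, 20000);

// A subscriber that never answers successfully.
const dead = deliverWithRetries(() => false, 3, 20000);
```

In the second case the message is gone after the last retry, which is why important messages may be better off in SQS.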
SNS doesn’t have a solution like the “dead letter Q” in SQS. If your messages are really important you can add a delivery confirmation (for failure or success), although I think in that situation the better solution would be to use SQS.
SNS, like SQS and many other cloud/AWS services, delivers messages “at least once”.
Example use cases
- notifications about errors or important events
- integration with an external system – if you don’t want your app accessed directly
- duplicate messages to multiple SQS queues.
Code
Here is example code in NodeJS showing how to work with SNS: github.com/wojtekk/aws-messaging-services/tree/master/sns
The example code shows you how to:
- create SNS topic
- run example app as HTTP endpoint
- subscribe example app to the topic
- send messages
- delete SNS topic.
Kinesis
The latest tool from Amazon available on AWS is Amazon Kinesis – a platform for data streaming. Kinesis Streams enables you to build applications that process or analyze streaming data in real time.
How it works – developer perspective
The producer sends a message to a Kinesis stream. Kinesis buffers it for 24 hours and allows consumers to read and process messages in the same order as they were received.
The consumer doesn’t have to confirm successful processing or remove the message, but you should store the last processed sequence number somewhere – it allows the consumer to continue from where it left off after a restart or crash.
In contrast to the previous solutions, Kinesis keeps the order of the messages.
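Storing the last sequence number amounts to keeping a checkpoint. In this sketch an array stands in for the stream and a `Map` for durable storage; `consume` and the `'last'` key are hypothetical, and small integers are used as sequence numbers purely for readability.

```javascript
// Processes records newer than the stored checkpoint, then advances it.
// Assumptions: `stream` is an ordered array of { sequenceNumber, data } and
// `checkpoints` is a Map standing in for durable storage.
function consume(stream, checkpoints, handler) {
  const last = checkpoints.has('last') ? checkpoints.get('last') : -1;
  for (const record of stream) {
    if (record.sequenceNumber <= last) continue; // handled before the restart
    handler(record.data);
    checkpoints.set('last', record.sequenceNumber); // remember our position
  }
}

const stream = [
  { sequenceNumber: 1, data: 'page view /' },
  { sequenceNumber: 2, data: 'page view /sport' },
];
const checkpoints = new Map();
const seen = [];

consume(stream, checkpoints, (data) => seen.push(data)); // first run
stream.push({ sequenceNumber: 3, data: 'click #subscribe' });
consume(stream, checkpoints, (data) => seen.push(data)); // after a restart
```

The second run skips the two records it has already seen and picks up only the new one, while the records themselves stay in the stream.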
Limitations
A data blob can be up to 1MB.
Messages are deleted from the stream after 24 hours. You can extend that time to 7 days.
Data can be split into multiple shards. For reads, each shard supports up to 5 transactions per second, up to a maximum total data read rate of 2 MB per second. For writes, each shard supports up to 1,000 records per second, up to a maximum total data write rate of 1 MB per second (including partition keys).
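Because Kinesis has to be scaled manually, it helps to estimate the number of shards from the per-shard limits quoted above. A rough sketch, where `requiredShards` and its parameter names are made up for the example and the read transaction limit is ignored for simplicity:

```javascript
// Per-shard limits as quoted in the text above.
const WRITE_MB_PER_SHARD = 1;
const WRITE_RECORDS_PER_SHARD = 1000;
const READ_MB_PER_SHARD = 2;

// Returns the smallest shard count that satisfies all three limits.
function requiredShards({ writeMBPerSec, writeRecordsPerSec, readMBPerSec }) {
  return Math.max(
    1,
    Math.ceil(writeMBPerSec / WRITE_MB_PER_SHARD),
    Math.ceil(writeRecordsPerSec / WRITE_RECORDS_PER_SHARD),
    Math.ceil(readMBPerSec / READ_MB_PER_SHARD)
  );
}

const shards = requiredShards({
  writeMBPerSec: 3.5,
  writeRecordsPerSec: 2500,
  readMBPerSec: 7,
});
// Write volume needs 4 shards, record rate 3, read volume 4, so 4 overall.
```

The strictest of the limits decides the result, so many tiny records can force more shards than the raw data volume would suggest.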
Example use cases
- click stream – information about page views and events generated by your users
- Internet of Things (IoT) – analyse information from multiple sensors
- application log aggregation.
Code
Here is example code in NodeJS which shows how to work with Kinesis: github.com/wojtekk/aws-messaging-services/tree/master/kinesis
The example code shows you how to:
- create Kinesis Stream
- run example app as a worker
- send messages
- delete Kinesis Stream.
Summary
Each of the described services can be useful depending on your needs.
If you want a simple solution to notify other systems about something, you can choose SNS.
When you start working with messaging systems you’ll most probably need SQS – this is the service you should get to know first.
Kinesis is a good solution if you have a really large number of messages or if their order is important.
Comparison table
Feature | SNS | SQS | Kinesis |
---|---|---|---|
High Throughput | auto scalable | auto scalable | manually scalable |
Throughput spikes | ok | ok | pre-scale |
Failure Guarantees | at least once | at least once | at least once |
Reliable jobs (dead letter Q) | nope | supported | nope |
FIFO (retain order) | not guaranteed | not guaranteed | supported |
Multiple recipients | supported | nope | supported |
Multiple writers and readers | yes | yes | yes |
Message size | 256 KB | 256 KB | 1 MB |
Data retention | nope | 1-14 days (if not deleted) | 1-7 days |
Open source alternative | ~Apache Camel with extensions | RabbitMQ | Kafka |
Mock for local development | fake_sns for http(s) endpoints | fake_sqs or ElasticMQ | kinesalite |
Wojciech Krawczyk is the team leader of the Bergens Tidende team located in Krakow.