New – Enhanced Dead-letter Queue Management Experience for Amazon SQS Standard Queues


Hundreds of thousands of customers use Amazon Simple Queue Service (SQS) to build message-based applications that decouple and scale microservices, distributed systems, and serverless apps. When a message can't be successfully processed by the queue consumer, you can configure SQS to store it in a dead-letter queue (DLQ).

As a software developer or architect, you'd like to examine and review unconsumed messages in your DLQs to figure out why they couldn't be processed, identify patterns, resolve code errors, and ultimately reprocess these messages in the original queue. The life cycle of these unconsumed messages is part of your error-handling workflow, which is often manual and time consuming.

Today, I'm happy to announce the general availability of a new enhanced DLQ management experience for SQS standard queues that lets you easily redrive unconsumed messages from your DLQ to the source queue.

This new functionality is available in the SQS console and helps you focus on the important phase of your error-handling workflow, which consists of identifying and resolving processing errors. With this new development experience, you can easily inspect a sample of the unconsumed messages and move them back to the original queue with a click, without writing, maintaining, and securing any custom code. This new experience also takes care of redriving messages in batches, reducing overall costs.

DLQ and Lambda Processor Setup
If you're already comfortable with the DLQ setup, then skip the setup and jump into the new DLQ redrive experience.

First, I create two queues: the source queue and the dead-letter queue.

I edit the source queue and configure the Dead-letter queue section. Here, I pick the DLQ and configure the Maximum receives, which is the maximum number of times a message can be received before it is sent to the DLQ. For this demonstration, I've set it to one. This means that every failed message goes to the DLQ immediately. In a real-world environment, you might want to set a higher number depending on your requirements and on what a failure means for your application.

I also edit the DLQ to make sure that only my source queue is allowed to use this DLQ. This configuration is optional: when this Redrive allow policy is disabled, any SQS queue can use this DLQ. There are cases where you want to reuse a single DLQ for multiple queues. But it's usually considered best practice to set up independent DLQs per source queue to simplify the redrive phase without affecting cost. Keep in mind that you're charged based on the number of API calls, not the number of queues.
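
The two console settings above correspond to two queue attributes: RedrivePolicy on the source queue and RedriveAllowPolicy on the DLQ. As a minimal sketch of what those attributes look like, the helpers below only build the payloads. The helper names and the ARNs are placeholders of mine, not part of any SDK, and the boto3 calls are shown in comments rather than executed.

```python
import json


def redrive_policy(dlq_arn: str, max_receive_count: int = 1) -> dict:
    """Attributes for the SOURCE queue: target DLQ and receive limit."""
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    policy = {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": max_receive_count}
    # SQS expects the policy as a JSON string, not a nested dict.
    return {"RedrivePolicy": json.dumps(policy)}


def redrive_allow_policy(source_queue_arns: list) -> dict:
    """Attributes for the DLQ: restrict which source queues may use it."""
    policy = {"redrivePermission": "byQueue", "sourceQueueArns": list(source_queue_arns)}
    return {"RedriveAllowPolicy": json.dumps(policy)}


# Placeholder ARNs for illustration only.
SOURCE_ARN = "arn:aws:sqs:us-east-1:123456789012:my-source-queue"
DLQ_ARN = "arn:aws:sqs:us-east-1:123456789012:my-dlq"

source_attrs = redrive_policy(DLQ_ARN, max_receive_count=1)
dlq_attrs = redrive_allow_policy([SOURCE_ARN])

# With boto3 (not imported here), these would be applied roughly as:
#   sqs = boto3.client("sqs")
#   sqs.set_queue_attributes(QueueUrl=source_queue_url, Attributes=source_attrs)
#   sqs.set_queue_attributes(QueueUrl=dlq_url, Attributes=dlq_attrs)
```

Keeping the payload construction separate from the API call makes the policy easy to inspect or unit test before touching a real queue.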

Once the DLQ is correctly set up, I need a processor. Let's implement a simple message consumer using AWS Lambda.

The Lambda function written in Python will iterate over the batch of incoming messages, fetch two values from the message body, and print the sum of these two values.

import json

def lambda_handler(event, context):
    for record in event['Records']:  # one record per SQS message in the batch
        payload = json.loads(record['body'])  # the message body is a JSON string

        value1 = payload['value1']
        value2 = payload['value2']

        value_sum = value1 + value2
        print("the sum is %s" % value_sum)
        
    return "OK"

The code above assumes that every message's body contains two integer values that can be summed, without dealing with any validation or error handling. As you can imagine, this will lead to trouble later on.

Before processing any messages, you must grant this Lambda function enough permissions to read messages from SQS and configure its trigger. For the IAM permissions, I use the managed policy named AWSLambdaSQSQueueExecutionRole, which grants permissions to invoke sqs:ReceiveMessage, sqs:DeleteMessage, and sqs:GetQueueAttributes.

I use the Lambda console to set up the SQS trigger. I could achieve the same from the SQS console too.

Now I'm ready to process new messages using Send and receive messages for my source queue in the SQS console. I write {"value1": 10, "value2": 5} in the message body, and select Send message.

When I check the CloudWatch logs of my Lambda function, I see a successful invocation.

START RequestId: 637888a3-c98b-5c20-8113-d2a74fd9edd1 Version: $LATEST
the sum is 15
END RequestId: 637888a3-c98b-5c20-8113-d2a74fd9edd1
REPORT RequestId: 637888a3-c98b-5c20-8113-d2a74fd9edd1	Duration: 1.31 ms	Billed Duration: 2 ms	Memory Size: 128 MB	Max Memory Used: 39 MB	Init Duration: 116.90 ms	
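
The same invocation can be approximated locally without deploying anything. This is only a sketch: make_sqs_event is a hypothetical helper, and the event it builds contains just the body field the handler reads, whereas a real SQS event carries more fields per record (message ID, receipt handle, attributes, and so on).

```python
import json


def lambda_handler(event, context):
    # Same logic as the function above.
    for record in event['Records']:
        payload = json.loads(record['body'])
        value_sum = payload['value1'] + payload['value2']
        print("the sum is %s" % value_sum)
    return "OK"


def make_sqs_event(*bodies):
    """Build a minimal SQS-style event; only 'body' is populated."""
    return {"Records": [{"body": json.dumps(body)} for body in bodies]}


# Mirrors the message sent from the console; context is unused, so None is fine.
result = lambda_handler(make_sqs_event({"value1": 10, "value2": 5}), None)
```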

Troubleshooting powered by DLQ Redrive
Now what if a different producer starts publishing messages with the wrong format? For example, {"value1": "10", "value2": 5}. The first number is a string, and this is quite likely to become a problem in my processor.

In fact, this is what I find in the CloudWatch logs:

START RequestId: 542ac2ca-1db3-5575-a1fb-98ce9b30f4b3 Version: $LATEST
[ERROR] TypeError: can only concatenate str (not "int") to str
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 8, in lambda_handler
    value_sum = value1 + value2
END RequestId: 542ac2ca-1db3-5575-a1fb-98ce9b30f4b3
REPORT RequestId: 542ac2ca-1db3-5575-a1fb-98ce9b30f4b3	Duration: 1.69 ms	Billed Duration: 2 ms	Memory Size: 128 MB	Max Memory Used: 39 MB	

To figure out what's wrong in the offending message, I use the new SQS redrive functionality, selecting DLQ redrive in my dead-letter queue.

I use Poll for messages and fetch all unconsumed messages from the DLQ.

And then I inspect the unconsumed message by selecting it.

The problem is clear, and I decide to update my processing code to handle this case properly. In an ideal world, this is an upstream issue that should be fixed in the message producer. But let's assume that I can't control that system and it's critically important for the business that I process this new type of message.

Therefore, I update the processing logic as follows:

import json

def lambda_handler(event, context):
    for record in event['Records']:  # one record per SQS message in the batch
        payload = json.loads(record['body'])  # the message body is a JSON string
        value1 = int(payload['value1'])
        value2 = int(payload['value2'])
        value_sum = value1 + value2
        print("the sum is %s" % value_sum)
        # do some extra stuff
        
    return "OK"
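
Before redriving, a quick local check can confirm that the updated logic accepts both message formats. The snippet below is a sketch under my own assumptions: the batch is hypothetical, the records contain only the body field, and stdout is captured just to make the printed lines easy to inspect.

```python
import io
import json
from contextlib import redirect_stdout


def lambda_handler(event, context):
    # Updated logic from above: coerce both values to int before summing.
    for record in event['Records']:
        payload = json.loads(record['body'])
        value_sum = int(payload['value1']) + int(payload['value2'])
        print("the sum is %s" % value_sum)
    return "OK"


# Hypothetical batch mixing the original format with the one that failed.
event = {"Records": [
    {"body": json.dumps({"value1": 10, "value2": 5})},
    {"body": json.dumps({"value1": "10", "value2": 5})},
]}

buffer = io.StringIO()
with redirect_stdout(buffer):
    result = lambda_handler(event, None)
printed = buffer.getvalue().splitlines()

# The unpatched expression still fails on the string value, as in the logs.
try:
    "10" + 5
    old_code_failed = False
except TypeError:
    old_code_failed = True
```

Note that int() only helps with numeric strings: a body such as {"value1": "ten", "value2": 5} would raise a ValueError, and with Maximum receives set to one that message would go back to the DLQ.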

Now that my code is ready to process the unconsumed message, I start a new redrive task from the DLQ to the source queue.

By default, SQS will redrive unconsumed messages to the source queue. But you can also specify a different destination and provide a custom velocity to set the maximum number of messages per second.
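
This post covers the console flow only. As an assumption that goes beyond it: SQS also exposes message move tasks through its API as StartMessageMoveTask, which boto3 surfaces as start_message_move_task. The sketch below shows how the destination and velocity options might map onto that call. The start_redrive helper, the stub client, and the ARN are hypothetical, and the stub stands in for boto3.client("sqs") so the example runs without AWS credentials. Check the parameter names against the current SQS API reference before relying on them.

```python
def start_redrive(sqs_client, dlq_arn, destination_arn=None, max_per_second=None):
    """Start moving messages out of a DLQ.

    When no destination is given, the parameter is omitted so that SQS
    redrives messages to their source queue (the default behavior).
    """
    kwargs = {"SourceArn": dlq_arn}
    if destination_arn is not None:
        kwargs["DestinationArn"] = destination_arn
    if max_per_second is not None:
        kwargs["MaxNumberOfMessagesPerSecond"] = max_per_second
    return sqs_client.start_message_move_task(**kwargs)


class StubSqsClient:
    """Stand-in for boto3.client("sqs") that only records the call."""

    def __init__(self):
        self.calls = []

    def start_message_move_task(self, **kwargs):
        self.calls.append(kwargs)
        return {"TaskHandle": "stub-task-handle"}


DLQ_ARN = "arn:aws:sqs:us-east-1:123456789012:my-dlq"  # placeholder ARN
stub = StubSqsClient()
response = start_redrive(stub, DLQ_ARN, max_per_second=10)
```

Passing the client in as an argument keeps the helper testable and leaves credential and Region handling to the caller.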

I wait for the redrive task to complete by monitoring the redrive status in the console. This new section always shows the status of the most recent redrive task.

The message has been moved back to the source queue and successfully processed by my Lambda function. Everything looks fine in my CloudWatch logs.

START RequestId: 637888a3-c98b-5c20-8113-d2a74fd9edd1 Version: $LATEST
the sum is 15
END RequestId: 637888a3-c98b-5c20-8113-d2a74fd9edd1
REPORT RequestId: 637888a3-c98b-5c20-8113-d2a74fd9edd1	Duration: 1.31 ms	Billed Duration: 2 ms	Memory Size: 128 MB	Max Memory Used: 39 MB	Init Duration: 116.90 ms	

Available Today at No Additional Cost
Today you can start leveraging the new DLQ redrive experience to simplify your development and troubleshooting workflows, without any additional cost. This new console experience is available in all AWS Regions where SQS is available, and we're looking forward to hearing your feedback.

Take a look at the DLQ redrive documentation right here.

Alex


