@FifoQueueListener retry does not work when FIFO queue has messages of same message group id #369
Comments
Attaching debug logs:

Started MessageRetriever
print logs for first message attempt 1
c.j.s.b.grouping.GroupingMessageBroker : Error processing message com.jashmore.sqs.processor.MessageProcessingException
c.j.s.r.b.BatchingMessageRetriever : Downloaded 2 messages
print logs for first message attempt 2
c.j.s.b.grouping.GroupingMessageBroker : Error processing message com.jashmore.sqs.processor.MessageProcessingException
c.j.s.r.b.BatchingMessageRetriever : Downloaded 2 messages
print logs for first message attempt 3
c.j.s.b.grouping.GroupingMessageBroker : Error processing message com.jashmore.sqs.processor.MessageProcessingException

On every attempt, the receive count of the 2nd message keeps incrementing without the message being processed, and both messages end up in the DLQ.

Maybe this is an expected scenario, but the outcome depends on when the 2nd message is first read:
- If both messages are consumed in a single poll, the 2nd message goes to the DLQ without being processed.
- If the 2nd message is read in the 2nd retry poll, the 2nd message is attempted twice.
- If the 2nd message is read in a poll after all retries for the 1st message are finished, the 2nd message is retried 3 times.

Can there be consistency across these scenarios, or is this expected/recommended behaviour for an SQS FIFO queue?
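To make the three outcomes above easier to compare, here is a minimal, self-contained sketch that models the receive counts. It is not the library's code: the class `FifoRetrySketch` and the method `attemptsOnSecondMessage` are hypothetical names, and the model rests on several assumptions: every processing attempt fails, m2 is downloaded together with m1 on every poll once it is in the queue, m2 is skipped whenever m1 fails, and a message moves to the DLQ once its receive count has reached the max receive count.

```java
public class FifoRetrySketch {

    /**
     * Simplified model (an assumption, not the library's implementation):
     * returns how many times m2 is actually processed before it reaches the DLQ.
     *
     * @param maxReceiveCount         the DLQ max receive count (3 in this issue)
     * @param firstPollContainingM2   1-based index of the first poll that downloads m2
     */
    static int attemptsOnSecondMessage(int maxReceiveCount, int firstPollContainingM2) {
        int m1Receives = 0;
        int m2Receives = 0;
        int m2Attempts = 0;
        int poll = 0;

        // Phase 1: m1 is still in the queue and fails on every attempt.
        while (m1Receives < maxReceiveCount) {
            poll++;
            m1Receives++;
            if (poll >= firstPollContainingM2) {
                // m2 is downloaded alongside m1 but skipped to preserve FIFO ordering,
                // so its receive count grows without any processing attempt.
                m2Receives++;
            }
        }

        // Phase 2: m1 has been moved to the DLQ, so m2 is now processed (and fails)
        // until its own receive count is exhausted.
        while (m2Receives < maxReceiveCount) {
            m2Receives++;
            m2Attempts++;
        }
        return m2Attempts;
    }

    public static void main(String[] args) {
        int maxReceiveCount = 3;
        for (int firstPoll = 1; firstPoll <= 4; firstPoll++) {
            System.out.println("m2 first downloaded in poll " + firstPoll
                    + " -> attempts on m2: " + attemptsOnSecondMessage(maxReceiveCount, firstPoll));
        }
    }
}
```

Under these assumptions, with a max receive count of 3, the model gives 0 attempts on m2 when it is first downloaded in poll 1, 1 attempt for poll 2, 2 attempts for poll 3, and 3 attempts when it only arrives after m1 has exhausted its retries (poll 4 in the sketch).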
Hey, yeah, so from memory: if the first message fails to be processed, it will not attempt the second message, because that would break the FIFO ordering. E.g. in this scenario:
You can see in this scenario that you have broken the FIFO ordering if you allow the second message to be processed after the first has failed. Are you seeing cases where the first message is placed into the DLQ and the second message is then attempted afterwards?
Hi,
Below are the scenarios (let me know if my understanding of the consumer is correct). First message: m1, second message: m2. Both are part of the same message group.
Scenario 1: (when m1 is successfully sent to the DLQ and then m2 is placed in the queue)
Scenario 2: (when m1 is retried for the 1st time and then m2 is placed in the queue)
Attaching logs for scenario 2 for clarity:

BatchingMessageRetriever : Started MessageRetriever
Exception for m1
BatchingMessageRetriever : Downloaded 2 messages (m1,m2)
Exception for m1
BatchingMessageRetriever : Downloaded 2 messages (m1,m2)
Exception for m1
Exception for m2

So if both m1 and m2 are received for the first time in the same poll, then m2 is never attempted if m1 keeps failing, and both go to the DLQ.
Adding some thought:
Thanks
Hmm, yeah, I have been thinking about this and am unsure what the best approach is, as this is what the SQS FIFO queue allows us to do. If you want to make sure that the second scenario cannot happen and that it follows similar logic to scenario one, then I would recommend setting maximumMessagesInMessageGroup to 1. In this case we would never pick up m1 and m2 at the same time, and it would therefore be practically the same as that first case.
Using @FifoQueueListener with a concurrency of 10 in a Spring Boot application.
The SQS FIFO queue is configured with a visibility timeout of 10 seconds and a DLQ with a max receive count of 3.
First: put 2 messages in the FIFO queue with the same message group id and different deduplication ids.
Second: start the Spring Boot consumer application that uses @FifoQueueListener.
3 attempts are made on the 1st message in the queue because an exception is thrown from the processing method. However, no attempt is made on the 2nd message, or sometimes 3 attempts (APPROXIMATE_RECEIVE_COUNT 1, 2, 3) are made on the 1st message and only one attempt is made on the 2nd message (APPROXIMATE_RECEIVE_COUNT 3).
Finally, both messages end up in the DLQ.
The expectation is that 3 attempts are made on each message before it is put in the DLQ.
Noticed that on application start, both messages move to in-flight (meaning both are read in the 1st poll).
Maybe the poller for @FifoQueueListener keeps increasing the receive count of the 2nd message while attempts on the 1st message are still going on.
Also tried increasing the visibility timeout on the SQS queue to 1 hour and overriding the visibility timeout to 5 seconds in @FifoQueueListener.